Are quaternions good or evil?

W.S. Harwin

(http://www.cybernetia.co.uk/dnb/racequaternions.html)

14/4/2021

\( \def\RR{{\bf R}} \def\bold#1{{\bf my #1}} \renewcommand{\vec}[1]{\underline{{#1}}} \)

Here as he walked by on the 16th of October 1843 Sir William Rowan Hamilton in a flash of genius discovered the fundamental formula for quaternion multiplication $i^2 = j^2 = k^2 = ijk = -1$ and cut it on a stone of this bridge.

- Quaternions were used by James Clerk Maxwell to describe EM theory; in 20 equations.
- Oliver Heaviside is responsible for the 4 equations known today as 'Maxwell's equations'

- `
*I came later to see that, as far as the vector analysis I required was concerned, the quaternion was not only not required, but was a positive evil of no inconsiderable magnitude; and that by its avoidance the establishment of vector analysis was made quite simple and its working also simplified, and that it could be conveniently harmonised with ordinary Cartesian work.*' Oliver Heaviside (1893)

- `
*Quaternions came from Hamilton after his really good work had been done; and, though beautifully ingenious, have been an unmixed evil to those who have touched them in any way, including Clerk Maxwell.*' W. Thompson, Lord Kelvin (1892)

- Needed for robotics, virtual reality, computer graphics etc.
- Each link can have a local coordinate frame
- Vector quantities (forces, positions, velocities and accelerations) can all be considered in a local coordinate frame

- Unit quaternions are ideal for considering vector rotations.
- Coordinate transforms can then separate into translation (vector addition) and rotation (quaternion multiplication)

- Homogeneous transforms
- Can use a single function (matrix multiply) for both translations and rotations
- Possible also with dual quaternions but there is no obvious benefit

If we have a vector $v$ we can introduce an operator $L_{\theta,\vec{u}}(v)$ rotates $v$ around axis defined by a unit vector $\vec{u}$ by an angle $\theta$.

This operator can be interpreted in two ways.

- Rotation of a vector by an angle $\theta$
- The same vector measured with respect to a second coordinate frame that has been rotated by $-\theta$

A rotational transform can be considered as

\[ {}^b\vec{v}=L_{\theta,\vec{u}}({}^r\vec{v}) \]a vector rotation of $\theta$ about $\vec{u}$, where ${}^b\vec{v}$ is the blue vector and ${}^r\vec{v}$ is the red vector (left).

Or it can be considered as

\[ {}^n\vec{v}={}^n_sL_{-\theta,\vec{u}}({}^s\vec{v}) \]that provides the coordinates of the vector in the new frame of reference rotated by $-\theta$ about the $z$ axis (right).

If $L_{-\theta,\vec{u}}$ has an inverse $L^{-1}_{-\theta,\vec{u}}=L_{\theta,\vec{u}}$ then

\[ {}^s\vec{v}={}^n_sL^{-1}_{-\theta,\vec{u}}({}^n\vec{v})={}^s_nL_{\theta,\vec{u}}({}^n\vec{v}) \]If we have a vector

\[ \vec{u} = \begin{bmatrix} \omega _{x} & \omega _{y} & \omega _{z} \end{bmatrix}^T \]We can use a skew operator to form a skew matrix of the form

\[ S(\vec{u})=\tilde{S}=S_u= \begin{bmatrix} 0 & -\omega _{z} & \omega _{y}\\ \omega _{z} & 0 & -\omega _{x}\\ -\omega _{y} & \omega _{x} & 0 \end{bmatrix} \]$\tilde{S}$ or $S_u$ has the property that implements the cross product, i.e. $\vec{u}\times\vec{v}=\tilde{S}_u\vec{v}$ and $\vec{v}\times\vec{u}=-\tilde{S}_u\vec{v}=\tilde{S}_u^T\vec{v}$

It also has the following properties

\begin{align} S^T=-S\\ S^2-\vec{u}\vec{u}^T+I\vec{u}^T\vec{u}=0\\ S^3+S=(\vec{u}^T\vec{u}-1)S^T=(1-\vec{u}^T\vec{u})S \end{align}If $\vec{u}$ is a unit vector then these properties simplify to

\begin{align} S^2-\vec{u}\vec{u}^T+I=0\label{eq:S4}\\ S^3=-S \label{eq:S5} \end{align}Another property if $\vec{u}$ is a unit vector (from equation 4 and 5) of a unit vector skew matrix is that $S\vec{u}\vec{u}^T=0$

Further properties are

- S is singular if it has an odd dimension (Which is sad since it would make it easier to deal with exponential forms)
- (S+I) is invertable
- odd dimension skew matrices have only one real Eigenvalue which is 0.

A property of $S_u$ is that the exponential will generate an orthogonal rotational matrix, usually expressed as

\begin{equation} R=e^{S(\vec{u})\theta}=e^{S_{\vec{u}}\theta} \label{eq:expform} \end{equation}where $\vec{u}$ is a unit vector and $\theta$ represents some rotation about $\vec{u}$.

This can be demonstrated by expanding the exponential as

\[ R=I + S\theta + S^2\frac{\theta^2}{2!} + S^3\frac{\theta^3}{3!} + S^4\frac{\theta^4}{4!} +... \]\[ R=I + S\theta + S^2\frac{\theta^2}{2!} - S\frac{\theta^3}{3!} - S^2\frac{\theta^4}{4!} +... \]\[ R=I + \left(S\theta - S\frac{\theta^3}{3!} +...\right) + \left(S^2\frac{\theta^2}{2!} - S^2\frac{\theta^4}{4!} +...\right) \]\[ R=I + \sin\theta S + (1-\cos\theta)S^2 \]This is sometimes called Euler's formula, possibly because it is an extension of $e^{i\theta}=\cos\theta+i\sin\theta$ but it is more properly called Rodrigues rotation formula.

Rodrigues formula was originally expressed in a vector notation, but using skew matrices is a more elegant approach.

Using the properties of skew matrices above the following variants of Rodrigues rotation formula are possible.

\begin{align} R = e^{S_u\theta} &= I + \sin\theta S_u + (1-\cos\theta)S_u^2\\ &= I + \sin\theta S_u + S_u^2 -\cos\theta S_u^2\\ &= I + \sin\theta S_u+(1-\cos\theta)(\vec{u}\,\vec{u}^T - I)\\ &= I\cos\theta + \sin\theta S_u + (1-\cos\theta)\vec{u}\,\vec{u}^T\\ &= \sin\theta S_u + \vec{u}\,\vec{u}^T + (I-\vec{u}\,\vec{u}^T)\cos\theta \end{align}and evidently

\[ R^T= e^{-S_u\theta}= e^{S^T_u\theta} =I-\sin\theta S_u + (1-\cos\theta) S_u^2 \]if $S_u$ is based on a unit vector and constant and $R=e^{S_u\theta}$ (equation 6) then

\[ \frac{dR}{d\theta}=S_u e^{S_u\theta} = S_u R \]and

\[ \dot{R}=\frac{dR}{dt}=\frac{dR}{d\theta}\frac{d\theta}{dt}=S_u \left(R\frac{d\theta}{dt}\right)=\omega S_u R \]where $\omega=\frac{d\theta}{dt}$

Often the Skew matrix is considered as representing an angular velocity with components in a particular coordinate frame, that is \[\Omega=S_u \omega\] represents rotation about a unit vector $\vec{u}$ at an angular speed of $\omega$.

Quaternions (when used to calculate rotations) are a 4-tuple, i.e. a set of 4 Real numbers.

Advantages for quaternions

- Efficient to calculate
- Good for applying splines and Bezier curves (only 4 parameters to spline)
- Fewer terms to consider when doing calculations
- Easy to interchange with matrices (e.g. via Rodrigues rotation formula)
- Would manage the Dirac belt trick/plate trick ( SU(2) (e.g. quaternions) double-covers SO(3) (the rotations group))
- Extends to affine transforms with dual quaternions

Advantages for matrices

- Efficient notation ${}^av= {}^a_bR\,\,{}^bv$ against ${}^av= {}^a_bq\,\, {}^bv\,\, {}^a_bq^{-1}$
- More widely used in publications, easier to understand
- Possible to interchange with angle axis and quaternions. (axis of the rotation comes via the Eigen vectors, the angle of rotation comes via the matrix trace)
- extends to affine transforms with homogeneous transforms.

Advantages angle axis

- Easy to conceptualise
- Easy to interchange with matrices and quaternions

- Quaternion multiplication is non commutative and is sometimes explicitly represented by $l\otimes q$ and sometimes not.
- Binary operations (multiplication and addition) will be assumed if one of the variables is a quaternion (e.g. a vector pretending to be a quaternion $(0,\vec{v})$, or a real number $(4,0,0,0)$ )
- We will ignore the $ijk$ multiplications and use the scalar/vector representation
- Thus for the tuple $r=(r_0,r_1,r_2,r_3)$ we will represent as $r=(r_0,\vec{r}$) with $\vec{r}=[r_1\ r_2\ r_3]^T$ (note this may annoy some mathematicians)

- Occasionally we may also abuse the $+$ and $-$ signs and write quaternions as $r=r_0+\vec{r}$ this is because quaternions can be written as $r=r_0+r_1i+r_2j+r_3k$.

Given

\[ r = (r_0,\vec{r_v}) \]\[ q = (q_0, \vec{q_v}) \]we get

\begin{align*} q \otimes r = & (q_0 + \vec{q}_v)( r_0 + \vec{r}_v) = (q_0,\vec{q}_v)\otimes ( r_0, \vec{r}_v) \\ = &(q_0r_0 - \vec{q}_v{\bf.}\vec{r}_v), (q_0\vec{r}_v + r_0\vec{q}_v + \vec{q}_v\times\vec{r}_v) \end{align*}where $\times$ is the vector cross product.

which can be expressed as the conjugate over the squared norm.

Any quaternion with a norm of 1. A useful unit quaternion is

\[ q=\left(\cos\left(\frac\theta{2}\right),\vec{u}\sin\left(\frac\theta{2}\right)\right) \]where $u$ is a unit vector.

Rotation of a quaternion representing a vector $v$ about a unit vector $\vec{u}$ through an angle $(\theta)$ is given by $$ q' = rqr^{-1}$$ where

\[ r=\cos(\theta/2) + \sin(\theta/2) \vec{u} \]and the quaternion $q=(0,\vec{v})$ is a vector subset of the quaternion group.

Expanding the generalised rotation in vector notation (and using the conjugate rather than the inverse to saved work)

\[ rqr^* = q_0(r_0^2+\vec{r_v}{\bf .}\vec{r_v}) \,,\, (r_0^2 - \vec{r_v}{\bf .}\vec{r_v})\vec{q_u}+2r_0(\vec{r_v}\times\vec{q_u}) +2(\vec{q_u}{\bf .}\vec{r_v})\vec{r_v} \]if $q_0=0$

\[ rqr^* = (r_0^2 - \vec{r_v}{\bf .}\vec{r_v})\vec{q_u}+2r_0(\vec{r_v}\times\vec{q_u}) +2(\vec{q_u}{\bf .}\vec{r_v})\vec{r_v} \]which is a vector

This can be rewritten (expanding the half angles to full angles) as

\begin{equation} rqr^* = \cos(\theta)\vec{q_u} + \sin(\theta)(\vec{r_v}\times\vec{q_u})+(1-\cos(\theta))(\vec{q_u}{\bf .}\vec{r_v})\vec{r_v} \label{eq:rod99} \end{equation}rewriting the quaternion elements

\[ \displaystyle L_{\theta,\vec{k}}(\vec{v}) = \cos(\theta)\vec{v} + \sin(\theta)(\vec{k}\times\vec{v})+(1-\cos(\theta))(\vec{v}{\bf .}\vec{k})\vec{k} \]The equation 12 is the vector form of Rodrigues rotation formula

$q\otimes r$ can be done as matrix product $Q_L r$ if $Q_L$ is a 'left product' matrix and $r$ is considered as a 4 element vector.

The conjugate is simply the matrix transpose so $q^*\,r$ can be done as $Q_L^T r$

\[ Lmatrix= Q_L= \begin{bmatrix} q_{0} & -q_{1} & -q_{2} & -q_{3}\\ q_{1} & q_{0} & -q_{3} & q_{2}\\ q_{2} & q_{3} & q_{0} & -q_{1}\\ q_{3} & -q_{2} & q_{1} & q_{0} \end{bmatrix} = \begin{bmatrix} q_0 & -\vec{q}_v^T\\ \vec{q}_v & S_{\vec{q}}+I q_0 \end{bmatrix} \]A `right product' matrix is needed and $r\otimes q$ is then done as the calculation $r^T Q_R$ where

\[ Rmatrix= Q_R= \begin{bmatrix} q_0 & \vec{q}_v^T\\ -\vec{q}_v & S_{\vec{q}}+I q_0 \end{bmatrix} \]Note that $Q^T_L\ne Q_R$

It is then possible to calculate

\[ r'=qrq*=Q_L(r^T Q_R^T)^T \]The resulting 4x4 matrix $Q_L Q_R$ embeds the 3x3 rotation matrix in the lower right corner.

Consider a quaternion changing over an incremental period of time from $q(t)$ to $q(t+\Delta t)$. This incremental change is given by

\[ q(t+\Delta t) =q(\Delta t)\,\, q(t) \]If $q(t+\Delta t)$ and $q(t)$ are unit quaternions then $q(\Delta t)$ must be a unit quaternion. Assume that in the incremental time $\Delta t$ the angle of $q(\Delta t)$ changes by some $\Delta\theta$

so define the quaternion differential to be

\[ \frac{d q}{d t}=\dot{q}=\lim_{\Delta t\to 0}\frac{q(t+\Delta t)-q(t)}{\Delta t}\\ =\lim_{\Delta t \to 0}\frac{q(\Delta t)\,\,q(t)-q(t)}{\Delta t}\\ =\lim_{\Delta t \to 0}\frac{(q(\Delta t)\,-\,1)q(t)}{\Delta t} \]If we assume $q(\Delta t)$ is of the form

\[ q(\Delta t)=\left(\cos\left(\Delta\theta/2\right)+\vec{u}\sin\left(\Delta\theta/2\right)\right) \]Then as $\Delta t$ approaches 0, $q(\Delta\theta)$ approaches $(1+\vec{u}\,\Delta\theta/2)$ and hence

\[ \frac{d q}{d t} =\lim_{\Delta \to 0}\frac{(1+\vec{u}\Delta\theta/2\,-\,1)q(t)}{\Delta t} =\lim_{\Delta \to 0}\frac{\vec{u}\Delta\theta/2}{\Delta t}q(t) =\frac12\vec{u}\frac{d\theta}{d t}q(t) \]If $q$ defines a vector or coordinate frame then the differential can be considered as an angular velocity around unit $\vec{u}$ at a speed of $\frac{d\theta}{d t}$ so

\[ \frac{d q}{d t} =\frac12 \vec{\omega} q(t) \]and $\vec\omega$ is the angular velocity vector.

A more complete explanation given in [Jia2020]

Vote now.

- Quaternions use 4 parameters to specify a rotation, vs 9 for rotation matrices, or 4 when expressed in exponential form.
- Easily described as a (half) angle and a rotation axis

- Numerically easy to normalise when compared with rotation matrices, or exponential form.
- important to keep integration errors under control

- Easy to change between quaternions, rotation matrices, and angle axis.
- Fewer calculations to compute a rotation
- perhaps not important given overheads of operating systems and speeds of GPUs and CPUs.

- Longer to write out, rotations need 3 symbols vs 2.
- Are not suitable for every problem.
- e.g. recursive dynamics well established using rotation matrices.

- Don't have a cool exponential form

- Probably neither.
- But not a universal panacea.
- In effect, a good compliment to rotation matrices and angle-axis representations.
- Don't overlook Rodrigues forms

Shuster, M. 1993, "A Survey of Attitude Representations", Journal of the Astronautical Sciences, 41(4):349-517 See equations and discussion in the paper above, p463-464. (http://www.ladispe.polito.it/corsi/Meccatronica/02JHCOR/2011-12/Slides/Shuster_Pub_1993h_J_Repsurv_scan.pdf)

Yan-Bin Jia, Problem Solving Techniques for Applied Computer Science, Quaternions Lecture course Sep 3, 2020 (https://web.cs.iastate.edu/~cs577/) (https://web.cs.iastate.edu/~cs577/handouts/quaternion.pdf)

Quaternions, Interpolation and Animation Erik B. Dam Martin Koch Martin Lillholm (https://web.mit.edu/2.998/www/QuaternionReport1.pdf)

Lee, Byung-Uk (1991). Unit Quaternion Representation of Rotation - Appendix A, Differentiation with Quaternions - Appendix B (PDF) (Ph. D. Thesis). Stanford University (http://home.ewha.ac.kr/~bulee/quaternion.pdf)

Plenty of details at Wolfram Mathworld, Wikipedia, Numberphile (although surprisingly quiet about differentiation)

(https://math.stackexchange.com/questions/1896379/how-to-use-the-quaternion-derivative)

Possibly a good explanation! (https://fgiesen.wordpress.com/2012/08/24/quaternion-differentiation/) google:Differentiation with respect to the rotation quaternion

(https://math.stackexchange.com/questions/1896379/how-to-use-the-quaternion-derivative)