Why are rotation matrices always unitary operators?

Question

Can someone explain why the rotation matrix is a unitary, specifically orthogonal, operator?

Note that a rotation matrix is more than a unitary matrix: it is an orthogonal matrix. Now you may be thinking of representation matrices on a Hilbert space (I'm guessing from the QM tag), in which case the rotation group $SO(3)$ lifts to its universal cover, which is the unitary group $SU(2)$. Why unitary representations then? Because physics is supposed to be rotation invariant. Think about transition amplitudes and what can happen to them under a symmetry. — Michael, Sep 06 '13 at 12:40
@F'x I looked at my solid in a mirror and the volume picked up a minus sign please help — ZachMcDargh, Sep 06 '13 at 14:07
@ Michael Brown, thanks for reminding me that it's orthogonal. Can you explain a little bit more why orthogonality is necessary for rotation invariance? — Lorniper, Sep 06 '13 at 14:53
@Shawniper: Try multiplying a non-orthogonal matrix with a vector. The magnitude of the vector will change. — Abhimanyu Pallavi Sudhir, Sep 06 '13 at 15:22
@DImension10 Abhimanyu PS I know it won't, I want to know why, in an intuitive way. — Lorniper, Sep 06 '13 at 15:26
@Shawniper: What would be the determinant of a non-orthogonal matrix? A rotation matrix would have to have cosine thetas and sine thetas in its entries in such a way that the determinant would be 1, wouldn't it? — Abhimanyu Pallavi Sudhir, Sep 06 '13 at 15:32
I don't get why this was downvoted so much, nor why it was flagged as low-quality. Except that it may be more on-topic on Math.SE, there's nothing wrong with this question. — Abhimanyu Pallavi Sudhir, Sep 06 '13 at 15:33
@ DImension10 Abhimanyu PS: firstly thanks for voting for my question. I pretty much know the cos sin example you used; I just want to be convinced, rather than merely verifying some known examples — Lorniper, Sep 06 '13 at 15:42
@Shawniper: No, I'm not giving you examples. All these rotation matrices have sines and cosines in their entries. By the way, don't put a space after the "@", or it won't ping me. — Abhimanyu Pallavi Sudhir, Sep 07 '13 at 08:06
@Shawniper Do you understand that the determinant of a matrix measures the change in volume that it produces? That is essentially the definition of the determinant. The usual determinant formula in terms of the sums of $\pm$ the product of matrix entries can then be derived by a geometrical argument starting from this definition. So volume preserving transformations must have unit determinant by definition. (I'm glossing over a potential minus sign issue here since we're actually talking about oriented volume elements: reflections have a determinant $-1$.) — Michael, Sep 07 '13 at 12:58
Everyone, please give an answer, rather than filling up the comments. — Larry Harson, Sep 07 '13 at 20:37

score 6 · Accepted Answer · edited Apr 13 '17 at 12:39

You can define and do the geometry several ways but I'd say the reasons are linearity, isometry and handedness (preservation of left/right handedness: this one is not needed to prove orthogonality so it's a bit more than what you asked for, but it is what sets rotations aside from other isometries). Handedness is sometimes rather loftily called chirality.

Intuitively, you need to think of a grid of $x$, $y$ and $z$ co-ordinates being ruled throughout the space taken up by the rotated object and also think of what happens to drawings and 3D sculptures in that space.

After the rotation, all the $x$, $y$ and $z$ gridlines are still orthogonal and not distorted. Distances between all mapped points are the same as what they were before the rotation, and so angles between vectors are left unchanged.

We know a rotation leaves at least one point in space fixed. So let's arbitrarily put our origin at such a point. Then the lack of global distortion in our grid shows that the transformation is linear. I say lack of "global distortion" because some nonlinear transformations (conformal ones) can also have zero local distortion - little drawings are undistorted - but beget distortion in big enough drawings and sculptures.

So, with the origin fixed, our transformation is linear and homogeneous. Our transformation $\mathcal{U}$ can therefore be represented by a matrix $\mathbf{U}$ so that:

$$\mathcal{U}:\mathbb{R}^N\to \mathbb{R}^N:\;X\mapsto \mathbf{U}\,X$$

Now as discussed above, lengths of position vectors stay the same, as do angles between position vectors. This means the inner product $\left<X,\,Y\right> = X^T\,Y = Y^T X$ between any pair of position vectors $X$ and $Y$ is unchanged (recall that $\left<X,\,Y\right> = |X|\,|Y|\cos\phi$, where $\phi$ is the angle between them). Therefore:

$$\left<\mathbf{U}\,X,\,\mathbf{U}\,Y\right> = (\mathbf{U}\,X)^T\,(\mathbf{U}\,Y) = X^T\,\mathbf{U}^T\,\mathbf{U}\,Y = \left<X,\,Y\right> = X^T\,Y$$

or rather:

$$X^T\,\left(\mathbf{U}^T\,\mathbf{U} - \mathbf{I}\right)\,Y = 0;\;\forall X,\,Y\in\mathbb{R}^N$$

It is now not hard to show that we must have $\mathbf{U}^T\,\mathbf{U} = \mathbf{I}$ as an identity: putting the basis vectors $X = e_i$, $Y = e_j$ into the above equation picks out the $(i,j)$ entry of $\mathbf{U}^T\,\mathbf{U} - \mathbf{I}$, which must therefore vanish for every $i$ and $j$. Therefore the matrix must be orthogonal. Here naturally $\mathbf{I}$ is the identity matrix.
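As a quick numerical illustration of this step, here is a minimal sketch assuming NumPy. The names `gram_defect`, `rotation` and `shear`, the test angle and the random vectors are all illustrative choices, not part of the argument above. It checks that a rotation about the $z$ axis satisfies $\mathbf{U}^T\,\mathbf{U} = \mathbf{I}$ and keeps inner products, whereas a shear (linear, but not an isometry) does not.

```python
import numpy as np

def gram_defect(U):
    """Return U^T U - I; its (i, j) entry equals e_i^T (U^T U - I) e_j."""
    return U.T @ U - np.eye(U.shape[0])

theta = 0.7  # arbitrary test angle (an assumption for this example)
c, s = np.cos(theta), np.sin(theta)

# Rotation about the z axis: an isometry fixing the origin.
rotation = np.array([[c,  -s,  0.0],
                     [s,   c,  0.0],
                     [0.0, 0.0, 1.0]])

# A shear: linear and origin-fixing, but it distorts lengths and angles.
shear = np.array([[1.0, 0.5, 0.0],
                  [0.0, 1.0, 0.0],
                  [0.0, 0.0, 1.0]])

# Two arbitrary position vectors.
rng = np.random.default_rng(0)
X, Y = rng.normal(size=3), rng.normal(size=3)

rotation_is_orthogonal = np.allclose(gram_defect(rotation), 0.0)
shear_is_orthogonal = np.allclose(gram_defect(shear), 0.0)
inner_product_kept = np.isclose((rotation @ X) @ (rotation @ Y), X @ Y)

print(rotation_is_orthogonal, shear_is_orthogonal, inner_product_kept)
```

The shear fails the test because its Gram matrix $\mathbf{U}^T\,\mathbf{U}$ has off-diagonal entries: the images of the $x$ and $y$ basis vectors are no longer at right angles.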

This essentially answers your question, but rotations are not the only orthogonal transformations. Reflexions are too; in $\mathbb{R}^3$ the orthogonal matrix $\operatorname{diag}(1,-1,1)$ reflects in the $x$-$z$ plane (it flips the sign of $y$) and it fulfills $\mathbf{U}^T\,\mathbf{U} = \mathbf{I}$. So let's go a little further. Any rotation of angle $\theta_0$ can be thought of as being joined to the identity transformation (rotation through angle of nought) through a continuous path of rotations, all about the same axis and with angles between $0$ and $\theta_0$. It belongs to the identity connected component of the group of all orthogonal transformations. Therefore:

$$\mathbf{U} = \exp(\theta\,\mathbf{H})$$

for some constant matrix $\mathbf{H}$. By imposing the orthogonality condition on the expression for every $\theta$ we get $\mathbf{U}$ orthogonal iff $\mathbf{H} = -\mathbf{H}^T$, i.e. $\mathbf{H}$ is skew-symmetric. This then is the general form of an $N$ dimensional rotation: it is a matrix of the form $\exp(\mathbf{H}_\theta)$ for some skew-symmetric $\mathbf{H}_\theta = \theta\,\mathbf{H}$. In three dimensions, the most general such matrix is:

$$\theta\,\mathbf{H} = \theta\,\left(\begin{array}{ccc}0& -\gamma_z& \gamma_y\\\gamma_z&0&-\gamma_x\\-\gamma_y&\gamma_x&0\end{array}\right) $$

where $\gamma_x^2 + \gamma_y^2 +\gamma_z^2 = 1$ and $(\gamma_x, \gamma_y,\gamma_z)$ is a unit vector defining the axis of rotation. You can prove this by finding the eigenvalues and eigenvectors of $\mathbf{H}$ and showing that this vector is the eigenvector belonging to the eigenvalue $0$: therefore $\exp(\theta\,\mathbf{H})$ has this vector as an eigenvector with eigenvalue $e^0 = 1$, i.e. it is an axis left invariant by the transformation.

Also note that $\det \mathbf{U} = \exp(\operatorname{trace}(\theta\,\mathbf{H})) = 1$, as in the comments. This is the last ingredient, namely the handedness I spoke of at the beginning. A reflexion has a determinant of $-1$ and maps a right handed co-ordinate system into a left handed one and contrariwise.

You can find the wonted expressions for rotation operators using the Rodrigues formula, which is grounded on the $\mathbf{H}$ matrix's characteristic equation. Working through this reasoning in 3D: the three eigenvalues of $\mathbf{H}$ are $0,\, \pm i$, so the characteristic equation is $\lambda^3 + \lambda = 0$ and, by the Cayley-Hamilton theorem:

$$\mathbf{H}^3= -\mathbf{H}$$

which relationship is then used to simplify the exponential's Taylor series:

$$\exp(\theta\mathbf{H}) = \mathbf{I} + \theta \mathbf{H} + \frac{\theta^2}{2!}\mathbf{H}^2 + \cdots$$

leading to:

$$\mathbf{U} = \mathbf{I}+\sin\theta\,\mathbf{H} +(1-\cos\theta)\,\mathbf{H}^2$$

whence can be worked out the wonted formulas for a 3D rotation of angle $\theta$ about an axis defined by the unit vector $(\gamma_x, \gamma_y,\gamma_z)$.
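To make the collapse of the series concrete: from $\mathbf{H}^3 = -\mathbf{H}$ the odd powers are $\mathbf{H}^{2k+1} = (-1)^k\,\mathbf{H}$ and the even powers are $\mathbf{H}^{2k} = (-1)^{k+1}\,\mathbf{H}^2$ for $k\geq 1$, so the odd terms sum to $\sin\theta\,\mathbf{H}$ and the even terms to $(1-\cos\theta)\,\mathbf{H}^2$. The sketch below checks this numerically, assuming NumPy. It assumes $\mathbf{H}$ is taken in the skew-symmetric cross-product form, so that $\mathbf{H}\,X = \gamma\times X$; the helper names `generator`, `rodrigues` and `exp_series`, the axis and the angle are illustrative choices only.

```python
import numpy as np

def generator(axis):
    """Skew-symmetric H with H @ x == np.cross(unit_axis, x) (assumed convention)."""
    gx, gy, gz = np.asarray(axis, dtype=float) / np.linalg.norm(axis)
    return np.array([[0.0, -gz,  gy],
                     [ gz, 0.0, -gx],
                     [-gy,  gx, 0.0]])

def rodrigues(axis, theta):
    """U = I + sin(theta) H + (1 - cos(theta)) H^2."""
    H = generator(axis)
    return np.eye(3) + np.sin(theta) * H + (1.0 - np.cos(theta)) * (H @ H)

def exp_series(A, terms=30):
    """Truncated Taylor series of exp(A); adequate for the small norms used here."""
    result = np.eye(A.shape[0])
    power = np.eye(A.shape[0])
    for k in range(1, terms):
        power = power @ A / k   # power now holds A^k / k!
        result = result + power
    return result

axis = np.array([1.0, 2.0, 2.0])   # arbitrary axis; normalised inside generator
theta = 1.1                        # arbitrary angle in radians
unit_axis = axis / np.linalg.norm(axis)

H = generator(axis)
U = rodrigues(axis, theta)

print(np.allclose(H @ H @ H, -H))               # Cayley-Hamilton relation
print(np.allclose(U, exp_series(theta * H)))    # closed form matches the series
print(np.allclose(U @ unit_axis, unit_axis))    # the axis is left invariant
print(np.linalg.det(U), np.linalg.det(np.diag([1.0, -1.0, 1.0])))
```

The last line contrasts the two kinds of orthogonal matrix: the rotation has determinant $+1$ up to rounding, while the reflexion $\operatorname{diag}(1,-1,1)$ has determinant $-1$.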

In higher dimensions, a real valued skew-symmetric matrix $\mathbf{H}$ has purely imaginary eigenvalues in conjugate pairs $\pm i\,\theta_j$, together with the eigenvalue $0$ (possibly repeated; it must be present when $N$ is odd). It should be mentioned here that $\mathbf{U}$, being orthogonal, is also normal (it commutes with its adjoint, here equal to its transpose), so it can always be diagonalised over $\mathbb{C}$ (it has a strictly diagonal Jordan normal form) and its eigenvectors can be chosen mutually orthogonal. The rotation then has an invariant subspace given by the kernel (nullspace) of $\mathbf{H}$, which is the generalization of the rotation axis in 3D, and one or more linearly independent 2D planes (indeed mutually orthogonal planes, given the normality of $\mathbf{U}$), each spanned by the real and imaginary parts of the eigenvectors corresponding to a pair of eigenvalues $\pm i\,\theta_j$.

So the idea of an "axis" is no longer really useful: in 3D it is useful because the nullspace of a nonzero $\mathbf{H}$ must be precisely one dimensional. Sometimes authors require "rotations" to be transformations that leave all of $\mathbb{R}^N$ fixed aside from precisely one 2D plane, but I don't think this is particularly useful because the composition of two such transformations is not then in general a rotation (unless the plane is the same one for the two composed "rotations"): there is no group of rotations defined in this way. It's easier and more useful simply to talk of orthogonal transformations with unit determinant, i.e. members of the group $SO(N)$.
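A small example in $\mathbb{R}^5$ may help here; it is a sketch assuming NumPy, with illustrative names (`plane_generator`, `exp_series`) and arbitrarily chosen angles. Two single-plane rotations, in the $(x_0,x_1)$ and $(x_2,x_3)$ coordinate planes, are composed. The result is in $SO(5)$ and fixes the remaining coordinate axis, but it moves two planes at once, so it is not a "rotation" in the restrictive single-plane sense.

```python
import numpy as np

def exp_series(A, terms=40):
    """Truncated Taylor series of exp(A); adequate for the small norms used here."""
    result = np.eye(A.shape[0])
    power = np.eye(A.shape[0])
    for k in range(1, terms):
        power = power @ A / k   # power now holds A^k / k!
        result = result + power
    return result

def plane_generator(n, i, j, angle):
    """Skew-symmetric generator of a rotation by `angle` in the (i, j) coordinate plane."""
    H = np.zeros((n, n))
    H[i, j], H[j, i] = -angle, angle
    return H

n = 5
H1 = plane_generator(n, 0, 1, 0.4)   # angles are arbitrary choices
H2 = plane_generator(n, 2, 3, 1.3)
H = H1 + H2
U = exp_series(H)

eigenvalues = np.linalg.eigvals(H)           # expected: 0 and the pairs +/-0.4i, +/-1.3i
nullity = n - np.linalg.matrix_rank(H)       # dimension of the pointwise fixed subspace
moved_dims = np.linalg.matrix_rank(U - np.eye(n))

print(np.sort(eigenvalues.imag), nullity, moved_dims)
print(np.allclose(U.T @ U, np.eye(n)), np.linalg.det(U))
```

Here `moved_dims` comes out as 4 rather than 2, which is the point of the remark above: the composition of two single-plane rotations in different planes leaves only a one dimensional subspace fixed.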

If, as Michael Brown's comment suggests, you are thinking of representations of rotations, then further discussions of the lifting of $SO(3)$ to its universal (in this case double) cover $SU(2)$ can be found in the second section "What the Lie Bracket does not "remember" about the group: Global Topology and the Fundamental Group" in my answer here and especially in the Stillwell references my answer gives.

@Shawniper no, not quite. You'd simply have a member of the group $O(3)$, which includes rotations and reflexions. As I showed above, a rotation matrix has unit determinant because it belongs to the identity connected component of $O(3)$, namely $SO(3)$, whose members can all be written in the form $\exp(\theta\,\mathbf{H})$ where $\mathbf{H}$ is real and skew Hermitian, therefore traceless, so that the determinant of $\exp(\theta\,\mathbf{H})$ is the exponential of this trace, therefore unity ... — Selene Routley, Sep 13 '13 at 00:57
...$O(3)$ has two connected components: $SO(3)$ the group of rotations and the coset which is $SO(3)$ displaced by any reflexion. What this means is that you only need one "prototypical" reflexion (say in the $x-y$ plane): any other orthogonal transformation can be realized as the composition of a rotation and this prototypical reflexion. — Selene Routley, Sep 13 '13 at 00:58
Thanks! I was wrong. However, a unitary transformation can preserve the transition rate in Hilbert space, right? (Hence it can be interpreted as a kind of "orthogonality".) — Lorniper, Sep 24 '13 at 10:03
@Shawniper Certainly a pretty good notion of unitary is as a generalized rotation, but now we have replaced the Hilbert space $\mathbb{R}^N$ with $\mathbb{C}^N$. Indeed most of my discussion above works in the same way if matrix transpose is replaced by Hermitian conjugate. We're still fundamentally concerned with isometry here - preservation of lengths and "angles" (i.e. inner products). The main thing that changes is that $U(N) /SU(N)$ is now a continuous group, the circle group $U(1)$, whereas $O(N) / SO(N) = \mathbb{Z}_2$ is the two element discrete group: "flipped" and "not flipped". — Selene Routley, Sep 24 '13 at 11:45
thanks again...being not familiar with group theory, could you use some plain words to explain "The main thing that changes is that...is the two element discrete group..."? — Lorniper, Sep 24 '13 at 16:43
@Shawniper Sorry about that. All I am really saying is that isometries, which keep shape and length, must also keep hypervolume: so they have unit magnitude determinant $|\Delta|=1$. So for real element matrices - $O(N)$ - the only determinants we can have are $\pm 1$: transformations that keep orientation (rotations) and those that flip it (reflexions): so $O(N)$ splits into two sundered bits - called cosets. Over $\mathbb{C}$ however, we can have any determinant of the form $\Delta = e^{i\theta};\;\theta\in\mathbb{R}$, so you get a whole family of linked cosets that form one ... — Selene Routley, Sep 24 '13 at 23:44
@Shawniper .... connected whole $U(N)$. It becomes meaningless to speak of the "sense" of a transformation in $\mathbb{C}^N$. The language of group theory lets us see symmetry at higher magnification: sighted animals are hard wired to understand symmetry - we grasp it subconsciously, so ironically this makes it harder to understand at a finer level. Group theory does this for us. — Selene Routley, Sep 24 '13 at 23:45
