9

I'm currently using Nadir Jeevanjee's book on group theory for physicists to understand quantum mechanics. I came across these two pages that left me stuck:

Example 4.19 $SU(2)$ and $SO(3)$

In most physics textbooks the relationship between $SO(3)$ and $SU(2)$ is described in terms of the 'infinitesimal generators' of these groups. We will discuss infinitesimal transformations in the next section and make contact with the standard physics presentation then; here we present the relationship in terms of a group homomorphism $\rho:SU(2)\to SO(3)$, defined as follows: consider the vector space (check!) of all $2\times2$ traceless anti-Hermitian matrices, denoted as $\mathfrak{su}(2)$ (for reasons we will explain later). You can check that an arbitrary element $X\in\mathfrak{su}(2)$ can be written as $$ X=\frac12 \begin{pmatrix} -iz & -y -ix \\ y-ix & iz \end{pmatrix},\quad x,y,z\in\mathbb R\tag{4.39} $$ If we take as basis vectors \begin{align} S_x&=-\frac i2\sigma_x=\frac12 \begin{pmatrix} 0 & -i \\ -i & 0 \end{pmatrix} \\ S_y&=-\frac i2\sigma_y=\frac12 \begin{pmatrix} 0 & -1 \\ 1 & 0 \end{pmatrix} \\ S_z&=-\frac i2\sigma_z=\frac12 \begin{pmatrix} -i & 0 \\ 0 & i\end{pmatrix} \\ \end{align} then we have $$X=xS_x + y S_y + z S_z$$ so the column vector corresponding to $X$ in the basis $\mathcal B = \{ S_x, S_y, S_z\}$ is $$[X] = \begin{pmatrix}x \\ y\\ z\end{pmatrix}.$$

Note that $$\det X = \frac14\left(x^2+y^2+z^2\right)=\frac14\left\|[X]\right\|^2$$ so the determinant of $X\in\mathfrak{su}(2)$ is proportional to the norm squared of $[X]\in\mathbb R^3$ with the usual Euclidean metric. Now you will check below that $A\in SU(2)$ acts on $X\in\mathfrak{su}(2)$ by the map $X\mapsto AXA^\dagger$, and that this map is linear. Thus, this map is a linear operator on $\mathfrak{su}(2)$, and can be represented in the basis $\mathcal B$ by a $3\times 3$ matrix which we will call $\rho(A)$, so that $[AXA^\dagger]=\rho(A)[X]$ where $\rho(A)$ acts on $[X]$ by the usual matrix multiplication. Furthermore, $$\left\|\rho(A)[X]\right\|^2 = \left\|[AXA^\dagger]\right\|^2 = 4 \det(AXA^\dagger) = 4 \det X = \left\|[X]\right\|^2 \tag{4.40}$$ so that $\rho(A)$ preserves the norm of $X$. This implies (see Exercise 4.19 below) that $\rho(A)\in O(3)$, and one can in fact show that $\det \rho(A) = 1$, so that $\rho(A)\in SO(3)$. Thus we may construct a map \begin{align} \rho: SU(2) & \to SO(3) \\ A & \mapsto \rho(A) \end{align}

Furthermore, $\rho$ is a homomorphism, since \begin{align} \rho(AB)[X] & = \left[(AB)X(AB)^\dagger\right] = \left[AB X B^\dagger A^\dagger\right] =\rho(A) \left[BXB^\dagger\right] \\ & = \rho(A)\rho(B)[X] \tag{4.41} \end{align} and hence $\rho(AB) = \rho(A)\rho(B)$. Is $\rho$ an isomorphism? One can show that $\rho$ is onto but not one-to-one, and in fact has kernel $K=\{I,-I\}$. From the discussion preceding this example, we then know that $\rho(A)=\rho(-A)\ \forall A\in SU(2)$ (this fact is also clear from the definition of $\rho$), so for every rotation $R\in SO(3)$ there correspond exactly two matrices in $SU(2)$ which map to $R$ under $\rho$. Thus, when trying to implement a rotation $R$ on a spin $1/2$ particle we have two choices for the $SU(2)$ matrix we use, and it is sometimes said that the map $\rho^{-1}$ is double-valued. In mathematical terms one does not usually speak of functions with multiple values, though, so instead we say that $SU(2)$ is the double cover of $SO(3)$, since the map $\rho$ is onto ('cover') and two-to-one ('double').
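(Not part of the book: the homomorphism property (4.41) and the kernel $\{I,-I\}$ can be spot-checked numerically. A minimal sketch, assuming NumPy; `coords`, `rho`, and `su2` are helper names of my own, not the book's.)

```python
import numpy as np

# Basis of su(2): S_k = -(i/2) * sigma_k, as in the quoted passage
Sx = 0.5 * np.array([[0, -1j], [-1j, 0]])
Sy = 0.5 * np.array([[0, -1], [1, 0]], dtype=complex)
Sz = 0.5 * np.array([[-1j, 0], [0, 1j]])

def coords(X):
    """Read off [x, y, z] from X = x*Sx + y*Sy + z*Sz entrywise."""
    return np.array([-2 * X[1, 0].imag, 2 * X[1, 0].real, -2 * X[0, 0].imag])

def rho(A):
    """3x3 matrix of X -> A X A^dagger in the basis {Sx, Sy, Sz}."""
    return np.column_stack([coords(A @ S @ A.conj().T) for S in (Sx, Sy, Sz)])

def su2(p, q):
    """SU(2) element [[P, Q], [-conj(Q), conj(P)]], normalized so |P|^2+|Q|^2 = 1."""
    n = np.sqrt(abs(p) ** 2 + abs(q) ** 2)
    p, q = p / n, q / n
    return np.array([[p, q], [-np.conj(q), np.conj(p)]])

rng = np.random.default_rng(0)
A = su2(*(rng.normal(size=2) + 1j * rng.normal(size=2)))
B = su2(*(rng.normal(size=2) + 1j * rng.normal(size=2)))

assert np.allclose(rho(A @ B), rho(A) @ rho(B))  # homomorphism, eq. (4.41)
assert np.allclose(rho(-A), rho(A))              # A and -A map to the same rotation
```

Since the kernel is exactly $\{I,-I\}$, the second assertion is the numerical face of the two-to-one property.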

I have the following questions:

  1. I have previously been introduced to the process of diagonalizing matrices. What I don't understand is what, in this case, is being diagonalized: $X$ or $A$?

  2. Is the final matrix in $SO(3)$ or $SU(2)$?

  3. Does this diagonalization process correspond to a change of basis and if it does, what's the point of changing bases?

  4. What mapping corresponds to $X$ being represented as a vector in the basis $\{S_x,S_y,S_z\}$?

  5. Also, I just don't understand what the 2-to-1 mapping means.

  6. I also understand that $X$ is a Lie algebra and to find its Lie group you have to exponentiate it: $e^{X}$. I've read in a math book that this process also involves diagonalization, but I just don't understand the whole process or the relationship between Lie algebras and Lie groups. Why is there a need to switch between the two?

Qmechanic
pkjag
  • First, $X$ is not a Lie algebra; it's an element of the Lie algebra su(2). Next, the Lie algebra structure is totally irrelevant here; all that matters is that su(2) is a three-dimensional real vector space. Third, $A$ is an element of $SU(2)$, but it acts (orthogonally) on $su(2)$. This defines a map $\rho:SU(2) \rightarrow SO(3)$. I do not understand what "the final matrix" means, or what "What mapping corresponds to $X$ being represented as..." means. The 2-to-1 mapping is the mapping $\rho$. – WillO Aug 15 '16 at 15:05
  • Where does the matrix you get after the diagonalization process lie? – pkjag Aug 15 '16 at 15:08
  • What diagonalization process? – WillO Aug 15 '16 at 15:09
  • $AXA^{-1}$, and what is being diagonalized here? I understand that $A^{\dagger}=A^{-1}$ for unitary matrices – pkjag Aug 15 '16 at 15:10
  • $X$ and $AXA^{-1}$ lie in $su(2)$, which is a 3-d real vector space. As for "what is being diagonlized", you are the only one who seems interested in diagonalizing something, so you are the only one who can know what you're talking about. – WillO Aug 15 '16 at 15:11
  • Incidentally, you can bypass the $su(2)$ completely by letting $SU(2)$ act on the (3-d real) vector space $V$ of pure-imaginary quaternions. For this, identify $SU(2)$ with all unit quaternions by taking the matrix with top row $(P,Q)$ to the quaternion $P+Qj$. Now $SU(2)$ acts on $V$ by conjugation. Whether you prefer $V$ or $su(2)$ is up to you; all you need is some 3-d vector space for $SU(2)$ to act on. – WillO Aug 15 '16 at 15:24
  • In response to your edit: 1) Neither. 2) What final matrix? 3) What diagonalization process? 4) The mapping from $su(2)$ to ${\mathbb R}^3$ that takes $X$ to its representation in the basis $\langle S_x,S_y,S_z\rangle$. 5) A mapping is 2-to-1 if the inverse image of each point has cardinality 2. 6) $X$ is not a Lie algebra. – WillO Aug 15 '16 at 18:15
  • This might not be a good idea, as the Matrix has been proven to be dangerous. – nelomad Aug 16 '16 at 00:23

4 Answers

11

There is nothing being diagonalized here. The passage you quote does use the notation $$ AXA^\dagger = AXA^{-1} $$ (since $A$ is unitary), which does appear often in diagonalization problems, but the transformation is much broader than that. This type of transformation, $$ A\mapsto BAB^{-1}, $$ is known as a matrix similarity transformation, and two matrices $A$ and $C$ are said to be similar if and only if there exists an invertible matrix $B$ such that $C=BAB^{-1}$.

Diagonalization is the process of taking a matrix $A$ and finding a matrix $C$ which is similar to $A$ and also diagonal.

However, the uses of similarity as a relationship and as a transformation go well beyond diagonalization: in essence, two matrices are similar if and only if they represent the same linear map in two different bases. As such, a wide array of matrix properties are preserved by similarity (well summarized in the Wikipedia link above), which is part of what makes the relationship so useful.
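As a quick illustration of those preserved properties, here is a minimal sketch (assuming NumPy; variable names are mine): a matrix and an arbitrary similarity transform of it share the same trace, determinant, and eigenvalues.

```python
import numpy as np

rng = np.random.default_rng(1)
A = rng.normal(size=(3, 3))           # an arbitrary matrix
B = rng.normal(size=(3, 3))
B += 3 * np.eye(3)                    # nudge B away from singularity
C = B @ A @ np.linalg.inv(B)          # C is similar to A

# Similar matrices share trace, determinant, and eigenvalues
assert np.isclose(np.trace(C), np.trace(A))
assert np.isclose(np.linalg.det(C), np.linalg.det(A))
assert np.allclose(np.sort_complex(np.linalg.eigvals(C)),
                   np.sort_complex(np.linalg.eigvals(A)))
```

Diagonalization is just the special case where $C$ comes out diagonal, which requires choosing $B$ from the eigenvectors of $A$; here $B$ is arbitrary and $C$ is generally not diagonal.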

Emilio Pisanty
5

The Lie algebra structure of $su(2)$ is entirely irrelevant here; all that matters is that $su(2)$ is a three-dimensional real vector space, so start by forgetting about Lie algebras.

An element $A\in SU(2)$ acts on that 3-d vector space by mapping $X$ to $AXA^{-1}$.

Therefore an element $A=\pmatrix{P&Q\cr -\overline{Q}&\overline{P}\cr}\in SU(2)$ can be represented as a $3\times 3$ real matrix $\rho(A)$. It would be a very very good exercise for you to write down an explicit formula for $\rho(A)$ in terms of $P$ and $Q$ --- not that the final result is important, but this will fix in your head exactly what's going on here. Start, of course, by computing how $A$ acts on each of the three known basis vectors for $su(2)$; those are the columns of $\rho(A)$.

Finally, check that $\rho(A)$ is in $SO(3)$, so you've mapped $SU(2)$ to $SO(3)$. It will be obvious that $A$ and $-A$ go to the same place, so the mapping is (at least) two-to-one. You can check further that it's exactly two-to-one.
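A numerical version of this exercise, as a sketch (assuming NumPy; the helper names are mine): conjugate each basis vector of $su(2)$ by a sample $A$, stack the resulting coordinate vectors as columns, and compare with the rotation you expect.

```python
import numpy as np

# Basis of su(2): S_k = -(i/2) * sigma_k
Sx = 0.5 * np.array([[0, -1j], [-1j, 0]])
Sy = 0.5 * np.array([[0, -1], [1, 0]], dtype=complex)
Sz = 0.5 * np.array([[-1j, 0], [0, 1j]])

def coords(X):
    """Read off [x, y, z] from X = x*Sx + y*Sy + z*Sz entrywise."""
    return np.array([-2 * X[1, 0].imag, 2 * X[1, 0].real, -2 * X[0, 0].imag])

def rho(A):
    """Columns of rho(A) are the images of the basis vectors under X -> A X A^dagger."""
    return np.column_stack([coords(A @ S @ A.conj().T) for S in (Sx, Sy, Sz)])

theta = 0.7
A = np.diag([np.exp(-1j * theta / 2), np.exp(1j * theta / 2)])  # an SU(2) element

R = rho(A)
Rz = np.array([[np.cos(theta), -np.sin(theta), 0],
               [np.sin(theta),  np.cos(theta), 0],
               [0, 0, 1]])
assert np.allclose(R, Rz)                 # rho(A) is rotation about z by theta
assert np.isclose(np.linalg.det(R), 1.0)  # determinant +1, so R is in SO(3)
```

This particular $A$ (with $P=e^{-i\theta/2}$, $Q=0$) maps to the rotation about the $z$-axis by $\theta$; repeating the computation with symbolic $P$ and $Q$ gives the explicit formula the exercise asks for.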

Optional next step: To fix in your head the idea that nothing about $su(2)$ matters except for its 3-dimensionality, identify $A$ with the quaternion $P+Qj$ and let it act on the 3-d real vector space of pure-imaginary quaternions via conjugation. This is a different route to the same outcome, and clearly has nothing to do with Lie algebras.
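The quaternion route can also be sketched in code (assuming NumPy; `qmul`, `qconj`, and `rot` are my own names). A unit quaternion $u = P + Qj$ acts on pure-imaginary quaternions by conjugation, and the resulting $3\times3$ matrix is orthogonal with determinant 1, with $u$ and $-u$ giving the same rotation.

```python
import numpy as np

def qmul(a, b):
    """Hamilton product of quaternions given as (w, x, y, z)."""
    w1, x1, y1, z1 = a
    w2, x2, y2, z2 = b
    return np.array([
        w1*w2 - x1*x2 - y1*y2 - z1*z2,
        w1*x2 + x1*w2 + y1*z2 - z1*y2,
        w1*y2 - x1*z2 + y1*w2 + z1*x2,
        w1*z2 + x1*y2 - y1*x2 + z1*w2,
    ])

def qconj(a):
    return np.array([a[0], -a[1], -a[2], -a[3]])

def rot(u):
    """3x3 matrix of v -> u v u^{-1} acting on pure-imaginary quaternions."""
    cols = []
    for e in np.eye(3):
        v = np.concatenate(([0.0], e))   # pure-imaginary quaternion (0, x, y, z)
        cols.append(qmul(qmul(u, v), qconj(u))[1:])
    return np.column_stack(cols)

# Unit quaternion u = P + Q j with P = a+bi, Q = c+di  <->  (a, b, c, d)
u = np.array([1.0, 2.0, 3.0, 4.0])
u /= np.linalg.norm(u)

R = rot(u)
assert np.allclose(R @ R.T, np.eye(3))    # orthogonal
assert np.isclose(np.linalg.det(R), 1.0)  # in SO(3)
assert np.allclose(rot(-u), R)            # u and -u give the same rotation
```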

Finally, I don't understand any of your questions about diagonalization or why you are so eager to diagonalize something. Since you're the one bringing diagonalization to the table, only you can know what you want to diagonalize or why.

WillO
  • I don't really see how this addresses the question as clarified in the comments - this seems more likely to confuse the OP than to help, to be honest. – Emilio Pisanty Aug 15 '16 at 15:48
  • 1
    @EmilioPisanty: The OP is confused about so many things that it's probably hopeless to address them all at one go. But I do think the fundamental problem is that he has no idea what the author is trying to accomplish here --- i.e. to show that $SU(2)$ acts orthogonally on a 3-d real vector space and so maps to $SO(3)$. I think that the exercise of writing down the action explicitly would go a long way toward clarifying (for the OP) what the point of all this is. – WillO Aug 15 '16 at 15:51
  • The quaternion analogue that WillO mentions is sketched in the last paragraph of my Phys.SE answer here. – Qmechanic Aug 15 '16 at 18:28
5
  1. In a nutshell, the first main point is that (up to some conventional constants) there is an isometric Lie algebra isomorphism $X\mapsto [X]$ between

    • the 3-dimensional Lie algebra $(su(2),[\cdot,\cdot],\det(\cdot))$ of traceless anti-Hermitian $2\times 2$ matrices equipped with the determinant as a norm square, and

    • the 3D space $(\mathbb{R}^3, \times, |\cdot|^2)$ equipped with the standard vector cross product and the standard norm square.

    (The Lie algebra structure plays no role in what follows, so it is enough to think of the map $X\mapsto [X]$ as an isometric vector space isomorphism.)

  2. The second crucial point is now that for each group element $A\in SU(2)$, the map $\rho(A):\mathbb{R}^3\to \mathbb{R}^3$ (which Nadir Jeevanjee defines above) is a linear isometry. Therefore it is an orthogonal transformation in 3D space $\mathbb{R}^3$, which can be represented by a $3\times3$ orthogonal matrix (where we use the standard orthonormal basis in $\mathbb{R}^3$). In other words, the map $\rho$ is a map from $SU(2)$ to $O(3)$.

  3. The above can now be tightened up to show that $SU(2)$ is a double cover of $SO(3)$. No diagonalization is used anywhere, cf. OP's questions.

Qmechanic
4

Given a wave function $\psi = \psi(\vec{r})$ as a function of a position vector $\vec{r} = (x,y,z)$ in space, rotations of the position vector ultimately depend on only two parameters $\phi$ and $\theta$, as can be seen by expressing $$\vec{r} = x \hat{i} + y \hat{j} + z \hat{k}$$ in spherical polar coordinates $$\vec{r} = r \sin(\theta) \cos(\phi) \hat{i} + r \sin(\theta) \sin(\phi)\hat{j} + r \cos(\theta)\hat{k}.$$ How do we represent rotations of a 3-dimensional vector in a two-dimensional space? Well, using $$\mathbb{R}^3 = \mathbb{R} \times \mathbb{C}$$ we could consider vectors of the form $$(x,y,z) \mapsto (z,x+iy).$$

In this new space, can we build rotation and reflection matrices? Using the form of these matrices expressed here, and using orthonormal vectors to make up the columns of a rotation matrix, we take the unit vector specifying $\vec{r}$: $$\vec{n}_{\vec{r}} = (n_x,n_y,n_z) \mapsto (n_z,n_x + i n_y)$$ and use vectors like this to build our rotation $$\begin{bmatrix}n_z & - (n_x + i n_y)^* \\ n_x + i n_y & n_z \end{bmatrix}$$ and reflection $$\begin{bmatrix}n_z & n_x - i n_y \\ n_x + i n_y & - n_z \end{bmatrix}$$ matrices in this new space.

As is common when relating a Lie group element $e^{T} \approx I + T$ to the physics convention $e^{-iT'} \approx I - iT'$ (where $T = -iT'$), we insert a factor of $1 = i \cdot (-i)$, so that our reflection matrix becomes $$i \begin{bmatrix}- i n_z & - n_y - in_x \\ n_y - in_x & i n_z \end{bmatrix},$$ giving the $$\begin{bmatrix}- i n_z & - n_y - in_x \\ n_y - in_x & i n_z \end{bmatrix}$$ in your post.

Now, $(n_x,n_y,n_z)$ is a unit vector; if we choose $$(n_x,n_y,n_z) = \hat{i} = (1,0,0)$$ we get, before inserting the factor of $1 = i \cdot (-i)$: $$ (n_x,n_y,n_z) = \hat{i} = (1,0,0) \mapsto \sigma_x = \begin{bmatrix} 0 & 1 \\ 1 & 0 \end{bmatrix}$$ similarly $$ (0,1,0) = \hat{j} \mapsto \sigma_y = \begin{bmatrix} 0 & - i \\ i & 0 \end{bmatrix}$$ and similarly for $\sigma_z$; from this you easily get $- i \sigma_x, - i \sigma_y, - i \sigma_z$. The $\frac{1}{2}$ normalization comes from working out the commutation relations (I think); check that yourself. Notice I used orthonormal column vectors, but I didn't need to.

Thus, if $\vec{r} = (x,y,z)$ is represented by $X = \begin{bmatrix}- i z & -y - ix \\ y - ix & i z \end{bmatrix}$, we can represent a rotation on $\vec{r}$ by another matrix $A$ in this space acting on $X$ by $$X' = AXA^\dagger$$ and we know this represents a rotation because $$\det(X') = \det(AXA^\dagger) = \det(X) = x^2 + y^2 + z^2,$$ so the squared length of the vector $\vec{r}$, which is encoded in the determinant of this matrix, is preserved.
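This determinant bookkeeping is easy to verify numerically. A minimal sketch (assuming NumPy; `X_of` is a hypothetical helper name), using the matrix above, which is $2X$ in the book's normalization (4.39):

```python
import numpy as np

def X_of(r):
    """Map r = (x, y, z) to the matrix [[-iz, -y-ix], [y-ix, iz]]."""
    x, y, z = r
    return np.array([[-1j*z, -y - 1j*x], [y - 1j*x, 1j*z]])

r = np.array([1.0, 2.0, 2.0])
X = X_of(r)
assert np.isclose(np.linalg.det(X).real, r @ r)   # det X = x^2 + y^2 + z^2 = 9

# Conjugating by any A in SU(2) preserves det X, hence the length of r
theta = 1.1
A = np.diag([np.exp(-1j*theta/2), np.exp(1j*theta/2)])  # a sample SU(2) element
Xp = A @ X @ A.conj().T
assert np.isclose(np.linalg.det(Xp).real, r @ r)
```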

So far we have answered 4. in your list and illustrated the idea of 6., which you can read about fully here; let's move on to 5.:

The double cover 2-1 mapping comes from the following: Notice the spherical coordinate representation $$\vec{r} = r \sin(\theta) \cos(\phi) \hat{i} + r \sin(\theta) \sin(\phi)\hat{j} + r \cos(\theta)\hat{k}$$ assumes an orientation when we rotate. In other words, a rotation matrix is specified by two orthonormal basis vectors, and these are oriented in a certain way (think right-hand rule), but there is no reason why we could not have started from the opposite orientation. This vector is the effect of an element of $SO(3)$ on a position vector, but my construction of a 2-D space to represent 3-D rotations allows for both orientations to live in this space, so if $$X' = AXA^+$$ is the effect of a rotation with the above orientation, we can say $$X' = (-A)X(-A)^+$$ is the effect of a rotation with the opposite orientation, but $$X' = (-A)X(-A)^+ = AXA^+$$ so you have two rotations mapping to the same element, double cover!
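The double cover also shows up when exponentiating the Lie algebra, which bears on question 6. A minimal sketch (assuming NumPy; no diagonalization needed here since $\theta S_z$ is already diagonal): rotating by a full $2\pi$ brings the $SO(3)$ rotation back to the identity, but the corresponding $SU(2)$ element comes back to $-I$; only after $4\pi$ does it return to $+I$.

```python
import numpy as np

# Exponentiate the su(2) element theta * Sz directly:
# exp(theta * Sz) = diag(exp(-i theta/2), exp(i theta/2)), a rotation about z.
def A_of(theta):
    return np.diag([np.exp(-1j * theta / 2), np.exp(1j * theta / 2)])

# A full 2*pi rotation returns to the identity in SO(3)...
assert np.allclose(A_of(2 * np.pi), -np.eye(2))   # ...but in SU(2) it is -I
assert np.allclose(A_of(4 * np.pi), np.eye(2))    # only 4*pi returns to +I

# Both A and -A implement the same rotation X -> A X A^dagger:
X = np.array([[0, -1j], [-1j, 0]]) / 2            # Sx in su(2)
A = A_of(0.3)
assert np.allclose(A @ X @ A.conj().T, (-A) @ X @ (-A).conj().T)
```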

Much of this is summarized here, and in this book.

bolbteppa