How do tensor products and direct sums fit into quantum mechanics?

Question

I understand that at times tensor products or direct sums are taken between Hilbert spaces in quantum mechanics. I don't, however, know when this can be done or when it should be done. I would like this meticulously explained. In particular, where does this fit with respect to the axioms of quantum mechanics? I understand that there may be more than one set of axioms for quantum mechanics, so let's adhere to these (that may be faulty):

I. The state of a system is represented by a ray of a Hilbert space $H$, the vectors of which are called kets.

II. Corresponding to each observable $A$ is a Hermitian operator $\hat{A}$ on $H$.

III. When measuring an observable $A$ of a system in a state represented by a ray with normalized representative $\Psi$, the result of the measurement will be one of the eigenvalues $a$ of $\hat{A}$ with probability $\left|\left<\mathbf{a}, \Psi\right>\right|^2$, where $\mathbf{a}$ is the normalized eigenvector corresponding to the eigenvalue $a$. Furthermore, the state of the system immediately after the measurement will be that represented by $\mathbf{a}$.

IV. A ket $\Psi$ representing a system will evolve according to the Schrödinger equation.

Related/possible duplicate: https://physics.stackexchange.com/q/54896/50583 — ACuriousMind, Feb 01 '20 at 13:54
@knzhou I actually don't know what I mean, but I have the notion that to be very general one needs to speak of degrees of freedom. — , Feb 04 '20 at 01:30
To be blunt, both of the bounties you've put up are extremely hard to answer, because you're insisting on an explanation of more fundamental things in terms of less fundamental ones. — knzhou, Feb 04 '20 at 01:54
For example, in your other bounty, you ask for a general description of the Lagrangian in terms of kinetic and potential energy. But the Lagrangian is the more general idea. Many systems described by Lagrangian mechanics have no meaningful notions of kinetic and potential energy at all -- instead these quantities are derived from the Lagrangian in simple cases. — knzhou, Feb 04 '20 at 01:54
And here, you insist on an explanation of quantum axioms in terms of the intuitive classical idea of degrees of freedom. But in reality, the notion of classical degrees of freedom emerges from quantum mechanics. Your notion of being more general is actually a request to be less general. — knzhou, Feb 04 '20 at 01:55
Imagine that in elementary school you learn about geometry using rulers and crayons, then in high school you learn the axioms of Euclidean geometry. The things you learned in elementary school follow from those axioms, not vice versa. What you are asking for is like demanding that the axioms be rephrased in terms of crayons, to be "very general". — knzhou, Feb 04 '20 at 01:58
@knzhou Well I know very little and my questions reflect my naive point of view. — , Feb 04 '20 at 03:20

Chiral Anomaly · Answer 1 · 2020-02-04T00:12:01.937

Tensor product

Writing the Hilbert space as a tensor product $$ \newcommand{\cH}{\mathcal{H}} \cH=\cH_A\otimes \cH_B $$ can be useful when we want to think of $\cH_A$ and $\cH_B$ as two complementary subsystems of the full system. Observables associated with subsystem $A$ act like the identity on the other factor $\cH_B$, and conversely. For example, an observable associated with subsystem $A$ has the form $O_A\otimes 1$. A general observable affects both $\cH_A$ and $\cH_B$. That is, a general observable is a sum of terms of the form $O_A\otimes O_B$ where the operators $O_{A/B}$ act only on $A/B$ respectively.
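As a hedged numerical illustration of the statements above, here is a minimal sketch assuming NumPy and two two-dimensional factors. The Pauli matrices are only example observables chosen for illustration; the names `O_A`, `O_B` and `general` are not from any particular library.

```python
import numpy as np

# Pauli matrices serve as example observables on each two-dimensional factor.
sigma_x = np.array([[0, 1], [1, 0]], dtype=complex)
sigma_z = np.array([[1, 0], [0, -1]], dtype=complex)
identity = np.eye(2, dtype=complex)

# Observable of subsystem A: acts as the identity on the B factor.
O_A = np.kron(sigma_x, identity)
# Observable of subsystem B: acts as the identity on the A factor.
O_B = np.kron(identity, sigma_z)

# Observables that belong to different factors commute with each other.
commutator = O_A @ O_B - O_B @ O_A
commute_across_factors = np.allclose(commutator, 0)

# A general observable is a sum of product terms; here a sum of two of them.
general = np.kron(sigma_x, sigma_z) + 0.5 * np.kron(sigma_z, sigma_x)
is_hermitian = np.allclose(general, general.conj().T)

print(commute_across_factors, is_hermitian)
```

The commutator vanishes because each operator acts nontrivially on only one factor, which is the property that makes the tensor product a convenient bookkeeping device for subsystems.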

If $\cH=\cH_A\otimes \cH_B$, the Hamiltonian in the Schrödinger equation is likewise a sum of terms of the form $O_A\otimes O_B$. Terms of the form $O_A\otimes 1$ and $1\otimes O_B$ describe the dynamics of the $A$ and $B$ subsystems by themselves, and all other terms describe interactions between these subsystems.
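Written out, this decomposition of the Hamiltonian could look as follows. The symbols $H_A$, $H_B$ and $O_A^{(k)}, O_B^{(k)}$ are labels introduced here for illustration only: $$ H = H_A\otimes 1 + 1\otimes H_B + \sum_k O_A^{(k)}\otimes O_B^{(k)}. $$ If the interaction sum is absent, the first two terms commute, so (in units with $\hbar=1$) the time evolution factorizes: $$ e^{-iHt} = e^{-iH_A t}\otimes e^{-iH_B t}, $$ and a product state $\psi_A\otimes\psi_B$ stays a product state. Interaction terms are what generally spoil this factorization.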

Another example is a non-relativistic particle with spin: we can express the Hilbert space as a tensor product $\cH_X\otimes \cH_S$, where observables associated with the particle's location have the form $O_X\otimes 1$ and observables associated with its spin have the form $1\otimes O_S$. In this case, the different parts are usually called different "degrees of freedom" instead of different "subsystems." In the case of a non-relativistic particle, we can further factorize $\cH_X$ into three factors associated with the three dimensions of space. Again, we would normally call these "degrees of freedom" instead of "subsystems."

Very generally, we can define a "subsystem" or "degree of freedom" as a special collection of observables. The tensor product construction isn't needed for this, but it's often useful. If observables associated with different subsystems (or degrees of freedom) commute with each other, then the tensor product is often useful as a systematic way of mathematically delineating the different sets of observables: each set acts nontrivially on only one of the tensor factors.

The "subsystems" and "degrees of freedom" concepts are just vaguely-delineated special cases of a much more general idea: mutually commuting subsets of the set of observables. The same Hilbert space admits many different tensor-product factorizations. Which one is most useful (if any) depends on which operators we want to represent which physical observables -- the decisions we make when we're defining a model. A similar comment applies to the most common definitions/measures of "entanglement," because they refer to a given tensor product factorization.

Learning about the "split property" in quantum field theory reveals some limitations of the tensor product formulation. The split property is mentioned in this related post:

Should it be obvious that independent quantum states are composed by taking the tensor product?

The tensor product also has other uses. For example, for a single particle at rest in 3-d space, we can systematically express the spin-$j$ representation for any $j$ by taking the tensor product of $2j$ copies of the spin-1/2 representation and symmetrizing. We can think of this as a special application of the subsystem idea, because a symmetrized collection of $2j$ spin-1/2 particles has total spin $j$.
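The symmetrization statement can be checked numerically in a small case. The following is a minimal sketch, assuming NumPy, units with $\hbar=1$, and the choice $2j=3$; the helper names `embed` and `permutation_operator` are made up for this illustration.

```python
import itertools
import math

import numpy as np

# Spin-1/2 operators in units with hbar = 1.
sx = np.array([[0, 1], [1, 0]], dtype=complex) / 2
sy = np.array([[0, -1j], [1j, 0]], dtype=complex) / 2
sz = np.array([[1, 0], [0, -1]], dtype=complex) / 2

n = 3          # number of spin-1/2 copies, i.e. 2j = 3
j = n / 2
dim = 2 ** n

def embed(op, site):
    """Operator acting as `op` on factor `site` and as the identity elsewhere."""
    out = np.eye(1, dtype=complex)
    for k in range(n):
        out = np.kron(out, op if k == site else np.eye(2, dtype=complex))
    return out

def permutation_operator(perm):
    """Matrix that permutes the tensor factors of the basis kets according to `perm`."""
    P = np.zeros((dim, dim))
    for bits in itertools.product(range(2), repeat=n):
        # Factor 0 is the most significant bit, matching np.kron's ordering.
        src = sum(b << (n - 1 - k) for k, b in enumerate(bits))
        dst = sum(bits[p] << (n - 1 - k) for k, p in enumerate(perm))
        P[dst, src] = 1.0
    return P

# Total spin components and the total S^2 operator.
S = [sum(embed(s, k) for k in range(n)) for s in (sx, sy, sz)]
S2 = sum(component @ component for component in S)

# Projector onto the totally symmetric subspace: average over all permutations.
symmetrizer = sum(
    permutation_operator(p) for p in itertools.permutations(range(n))
) / math.factorial(n)

# Dimension of the symmetric subspace; should equal 2j + 1.
symmetric_dim = int(round(np.trace(symmetrizer)))
# On the symmetric subspace, S^2 should act as j(j+1) times the identity.
acts_as_spin_j = np.allclose(S2 @ symmetrizer, j * (j + 1) * symmetrizer)

print(symmetric_dim, acts_as_spin_j)
```

The symmetric subspace has dimension $2j+1$ and $S^2$ acts on it as $j(j+1)$, which is what one expects of the spin-$j$ representation.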

Axioms I-IV listed in the OP are the same whether or not $\cH$ is written as a tensor product, because those axioms are independent of what representation we use for the Hilbert space.

Direct sum

Writing the Hilbert space as a direct sum $$ \cH=\cH_1\oplus\cH_2 $$ is useful when we want to focus on a particular subspace of states. Using a non-relativistic single-particle model as an example, $\cH_1$ could consist of states in which a particle's wavefunction has support only within a given region $R$, and $\cH_2$ could consist of states with support in the complement of $R$.

More generally, given any discrete observable (such as the observable that asks "is the particle located in the region $R$ or not?"), we can write $\cH$ as the direct sum of that observable's eigenspaces. A direct-sum decomposition of the Hilbert space corresponds to a block-matrix representation of operators on the Hilbert space.
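To make the block-matrix statement concrete, here is a minimal sketch assuming NumPy and a toy model that is not in the answer: a particle hopping on five lattice sites, with the region $R$ taken to be the first two sites.

```python
import numpy as np

# Toy model (an assumption for illustration): 5 lattice sites, R = sites 0 and 1.
n_sites = 5
in_R = np.array([True, True, False, False, False])

# The yes/no observable "is the particle in R?" is a projector (eigenvalues 1, 0).
P_R = np.diag(in_R.astype(float))

# Nearest-neighbour hopping Hamiltonian on the chain.
H = np.zeros((n_sites, n_sites))
for i in range(n_sites - 1):
    H[i, i + 1] = H[i + 1, i] = -1.0

# Blocks of H with respect to the decomposition into the eigenspaces of P_R.
H_11 = H[np.ix_(in_R, in_R)]      # R -> R
H_12 = H[np.ix_(in_R, ~in_R)]     # complement -> R
H_21 = H[np.ix_(~in_R, in_R)]     # R -> complement
H_22 = H[np.ix_(~in_R, ~in_R)]    # complement -> complement

# Reassembling the blocks recovers H, since R consists of the first two sites.
reassembled = np.block([[H_11, H_12], [H_21, H_22]])

print(H_11.shape, H_22.shape, np.array_equal(reassembled, H))
```

The dimensions of the two summands add ($2+3=5$), in contrast to the tensor product where they multiply. The nonzero off-diagonal blocks show that this particular $H$ moves the particle between $R$ and its complement, so $H$ is not block diagonal even though the projector is.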

More esoterically, the direct sum is also useful for representing mixed states as vector states: every state, whether pure or mixed, can be expressed as a vector state in a sufficiently large Hilbert space, with the understanding that all observables have a block-diagonal form that doesn't mix the different direct summands with each other. This fact is sometimes useful for proving theorems, and this fact can in turn be proven using the GNS Construction.
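A small finite-dimensional instance of this idea can be sketched as follows, assuming NumPy. The two-term mixture, the weights and the observable are arbitrary choices made for illustration, not part of the general construction.

```python
import numpy as np

# A mixed state of a two-level system: a mixture of two pure states.
p1, p2 = 0.25, 0.75
psi1 = np.array([1, 0], dtype=complex)                 # |0>
psi2 = np.array([1, 1], dtype=complex) / np.sqrt(2)    # |+>
rho = p1 * np.outer(psi1, psi1.conj()) + p2 * np.outer(psi2, psi2.conj())

# The same state as a single vector in the direct sum H (+) H.
Psi = np.concatenate([np.sqrt(p1) * psi1, np.sqrt(p2) * psi2])

# Observables act block-diagonally: the same operator on each summand.
O = np.array([[0, 1], [1, 0]], dtype=complex)          # example observable
zeros = np.zeros((2, 2), dtype=complex)
O_big = np.block([[O, zeros], [zeros, O]])

# Expectation values agree: <Psi| O (+) O |Psi> = Tr(rho O).
vector_expectation = np.vdot(Psi, O_big @ Psi).real
mixed_expectation = np.trace(rho @ O).real

print(vector_expectation, mixed_expectation)
```

Because the allowed observables never mix the two summands, the relative phase between the summands of `Psi` is unobservable, which is why the vector reproduces a mixture rather than a coherent superposition.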

Again, axioms I-IV listed in the OP are the same whether or not $\cH$ is written as a direct sum, because those axioms are independent of what representation we use for the Hilbert space.

I think by "subsystem" you're alluding to a subset of degrees of freedom. Would you mind clarifying what you mean by "subsystem" — , Feb 03 '20 at 18:25
@PiKindOfGuy I added a few more paragraphs in the section about tensor products, including another example and a general perspective. Short answer: the conceptual separation of a system into different "subsystems" or "degrees of freedom" corresponds to a mathematical choice of mutually-commuting subsets of the set of observables. Observables can be organized into such subsets in many different ways, which is why it's so hard to pin down a single specific definition of "subsystem" or "degree of freedom." — Chiral Anomaly, Feb 04 '20 at 00:17

score 3 · Answer 2 · edited Jun 29 '23 at 11:38

A very short and very helpful answer, only regarding the physical aspects of the tensor product and direct sum:

The direct sum adds Hilbert spaces in such a way that they stay separate from one another. In your Hamiltonian you will notice this as block matrices placed along the diagonal, each of which acts on its own part of the vector of wave-functions that describes your system.

The tensor product mixes things up. If you take the tensor product of two matrices, you write the second matrix in its entirety into every matrix entry of the first one, multiplied by the number that was previously in that entry. Each matrix has a set of eigenstates. You can now build a basis for this new matrix by "combination" of the former two bases of eigenstates into a new one. By combination I mean that the vectors are built similarly to the matrix: into each entry of an eigenvector of the first matrix you write an eigenvector of the second matrix, multiplied by the number that was previously in that entry. This product basis is in general no longer a basis of eigenvectors of the operators one cares about on the combined system (such as the total spin in the application below).

The latter is now reducible into the former, meaning that you can bring the latter into block-diagonal form to separate your system into several subsystems.
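The "write the second matrix into every entry of the first" recipe is what NumPy's `np.kron` computes. A minimal sketch, with small example matrices and vectors chosen arbitrarily for illustration:

```python
import numpy as np

A = np.array([[1, 2],
              [3, 4]])
B = np.array([[0, 1],
              [1, 0]])

# Kronecker (tensor) product: each entry A[i, j] is replaced by the block A[i, j] * B.
K = np.kron(A, B)

top_left = K[:2, :2]       # block built from A[0, 0]
top_right = K[:2, 2:]      # block built from A[0, 1]
bottom_left = K[2:, :2]    # block built from A[1, 0]
bottom_right = K[2:, 2:]   # block built from A[1, 1]

# Vectors combine the same way: each entry of u is replaced by that entry times v.
u = np.array([1, -1])
v = np.array([2, 5])
w = np.kron(u, v)

print(K)
print(w)
```

The result has dimension $2\times 2=4$: dimensions multiply under the tensor product, whereas they add under the direct sum.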

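Application: You want to describe two spin-$1/2$ particles in one system. Each particle has the states $|\uparrow\rangle, |\downarrow\rangle$, so the combined (tensor product) system is spanned by the four product states $|\uparrow\uparrow\rangle, |\uparrow\downarrow\rangle, |\downarrow\uparrow\rangle, |\downarrow\downarrow\rangle$.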

But as we know, we can separate the problem into a system of spin 1 and spin 0, where the spin $1$ system is $3$ dimensional ($S_z$-eigenvalues: $1,0,-1$) and the spin $0$ system is $1$ dimensional ($S_z$-eigenvalue: $0$). Why do we differentiate $\textit{Spin 1}$ and $\textit{Spin 0}$ systems? Because these are the systems (or collections of vectors) on which $S^2$ has a definite eigenvalue $S(S+1)$, in units with $\hbar=1$: acting on a spin-$1$ vector, $S^2$ gives the eigenvalue $2 = S(S+1) \implies S=1$, and acting on the spin-$0$ vector it gives the eigenvalue $0 = S(S+1) \implies S=0$.
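The spin-1 and spin-0 split can be verified numerically. This is a minimal sketch assuming NumPy and units with $\hbar=1$; the helper `total` and the variable names are chosen for this illustration.

```python
import numpy as np

# Spin-1/2 operators in units with hbar = 1.
sx = np.array([[0, 1], [1, 0]], dtype=complex) / 2
sy = np.array([[0, -1j], [1j, 0]], dtype=complex) / 2
sz = np.array([[1, 0], [0, -1]], dtype=complex) / 2
I2 = np.eye(2, dtype=complex)

def total(op):
    """Total-spin component: op acting on particle 1 plus op acting on particle 2."""
    return np.kron(op, I2) + np.kron(I2, op)

Sx, Sy, Sz = total(sx), total(sy), total(sz)
S2 = Sx @ Sx + Sy @ Sy + Sz @ Sz

# Eigenvalues of S^2 on the 4-dimensional product space, sorted ascending:
# one copy of S(S+1) = 0 (spin 0) and three copies of S(S+1) = 2 (spin 1).
s2_eigenvalues = np.linalg.eigvalsh(S2)

up = np.array([1, 0], dtype=complex)
down = np.array([0, 1], dtype=complex)
singlet = (np.kron(up, down) - np.kron(down, up)) / np.sqrt(2)
triplet_0 = (np.kron(up, down) + np.kron(down, up)) / np.sqrt(2)

print(np.round(s2_eigenvalues, 10))
```

The product states $|\uparrow\downarrow\rangle$ and $|\downarrow\uparrow\rangle$ are not eigenvectors of $S^2$ individually; only their symmetric and antisymmetric combinations are, which is the change of basis that brings $S^2$ into block-diagonal form ($3\oplus 1$).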

If anyone wants to format this answer better, go ahead.

score 1 · Answer 3 · answered Feb 06 '20 at 02:18

You can't really derive when to use the direct sum and when to use the tensor product from the four postulates that you listed, because those postulates describe a single system and presuppose the existence of a Hilbert space $\mathcal{H}$ that describes that system. The nature of the Hilbert space is generally just postulated and can rarely be derived without additional assumptions. For example, there's no way to derive that a given many-particle system is bosonic or fermionic (or neither), which is fundamentally a property of the Hilbert space, without making further assumptions like Lorentz invariance.

However, as discussed in the answers to this duplicate question, the Born rule does play more nicely with tensor products than with direct sums. If you can express a state of a system as a tensor product of two other states, then each of those factor states also counts as a "system" according to the postulates, which fits our intuitive notion of how "systems" should work. I guess that depending on your philosophical perspective on what counts as a "system", this is either a proof or a strong motivation for the tensor product as the proper means of combining "subsystems".

But, to riff off knzhou's comments, it's worth pointing out that these "why" questions are always very difficult to answer in quantum mechanics. That's because many results in quantum mechanics all fit together in a very satisfying way, but it's not necessarily obvious which ones are the most fundamental:

The Born rule
Subsystems of systems themselves obey the same rules
The fact that the tensor product is the appropriate way to combine together subsystems
The fact that physical states naturally correspond to rays rather than vectors in Hilbert space
[Significantly deeper into the weeds,] the fact that the field of scalars is the complex rather than real numbers

Modifying any of these facts tends to immediately break all of the other ones, while still resulting in a logically consistent theory. (The last one is a special case; it can be modified while leaving the others intact, although the resulting theory is arguably less natural.) So it's very hard to answer "why" questions in general.

Steven Sagona · Answer 4 · 2020-02-06T23:10:23.340

It took a lot of struggle to get an idea of what "Hilbert space" and tensor products really mean in quantum mechanics.

Personally, I think explanations that rely on "tensor product formalism" are the absolute worst for understanding. They give zero intuition and turn the learner into a plug-and-chug monkey.

The concept that made things click for me is a careful statement of what the wavefunction represents:

The wavefunction assigns a (complex) value to each POSSIBLE OUTPUT STATE, and the squared modulus of that value is the probability of that output state.

It sounds obvious but this is the key idea: potential output states are assigned probabilities.

Now it's very natural to see what happens in many particle states:

Suppose I have a "quantum coin" described by the state $\Psi = (\sqrt{P_H}|H\rangle+\sqrt{P_T}|T\rangle)$, and I flip two of these coins. What's the output state?

Now it's clear in normal probability that the output possibilities for flipping two coins are:

HH, HT, TH, and TT (4 output states)

And this doubles for every coin we add! Adding one more:

HHH, HHT, HTH, HTT, TTT, THT, THH, TTH (8 output states)

Now in the quantum case each of these possibilities needs to be assigned its own probability amplitude (and has the potential to cause interference!)

Now if we flip two quantum coins completely independently, we identify that there shouldn't be any correlation between the coins, and it should look exactly the same as the classical case.

$$\begin{aligned} P(HH) &= P_H P_H\\ P(HT) &= P_H P_T\\ P(TH) &= P_T P_H\\ P(TT) &= P_T P_T \end{aligned}$$

Now is there an operation, linear in each of its two inputs, that will take two states $\Psi_1 = (\sqrt{P_H}|H\rangle_1+\sqrt{P_T}|T\rangle_1)$ and $\Psi_2 = (\sqrt{P_H}|H\rangle_2+\sqrt{P_T}|T\rangle_2)$ and turn them into the correct combined possibility space that gives independent probabilities? That's the tensor product!!
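Here is a minimal numerical sketch of that statement, assuming NumPy and an arbitrary example bias $P_H = 0.3$ (the number is only for illustration):

```python
import numpy as np

P_H, P_T = 0.3, 0.7

# Single-coin state: amplitudes for |H> and |T>.
coin = np.array([np.sqrt(P_H), np.sqrt(P_T)])

# Two independent coins: the tensor product gives one amplitude for each of
# HH, HT, TH, TT (in that order).
two_coins = np.kron(coin, coin)

# Born rule: probabilities are squared moduli of the amplitudes.
joint_probabilities = np.abs(two_coins) ** 2

# Classical expectation for independent coins.
classical = np.array([P_H * P_H, P_H * P_T, P_T * P_H, P_T * P_T])

print(joint_probabilities)
print(np.allclose(joint_probabilities, classical))
```

The amplitudes multiply under the tensor product, so their squared moduli multiply too, which is exactly the product rule for independent probabilities.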

A tensor product is used to describe states that are independent. And this is exactly why a given (pure) state is entangled if and only if it CANNOT be described by such a tensor product.

If you can't simplify it so that it can be of the form $(a|H\rangle_1 + b|T\rangle_1) \otimes (c|H\rangle_2 + d|T\rangle_2)$

Then you know your state isn't "independent" (and is by definition entangled).
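One way to test whether a given two-coin pure state can be written in that product form is to arrange its four amplitudes in a $2\times 2$ matrix and count the nonzero singular values (the Schmidt rank): a product state has exactly one. This is a minimal sketch assuming NumPy; the function name `schmidt_rank` and the tolerance are choices made for this illustration, and the test applies to pure states only.

```python
import numpy as np

def schmidt_rank(state, dim_a, dim_b, tol=1e-12):
    """Number of nonzero singular values of the amplitude matrix of a pure state."""
    coeffs = np.asarray(state, dtype=complex).reshape(dim_a, dim_b)
    singular_values = np.linalg.svd(coeffs, compute_uv=False)
    return int(np.sum(singular_values > tol))

# A product state (a|H> + b|T>) (x) (c|H> + d|T>) with arbitrary example amplitudes.
a, b = 0.6, 0.8
c, d = 1 / np.sqrt(2), 1j / np.sqrt(2)
product_state = np.kron([a, b], [c, d])

# A correlated state: (|HH> + |TT>) / sqrt(2).
correlated_state = np.array([1, 0, 0, 1]) / np.sqrt(2)

product_rank = schmidt_rank(product_state, 2, 2)
correlated_rank = schmidt_rank(correlated_state, 2, 2)

print(product_rank, correlated_rank)
```

A rank of one means the state factorizes, so the two coins are independent; a rank of two means no choice of $a, b, c, d$ reproduces the state, so it is entangled.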

I think a lot of times this entanglement example is thought of as separate from what these tensor products represent, and I think that this is a mistake - I wasn't able to make any sense of this stuff until eventually finding this line of thinking.

One final note: often people say something like:

"For a state 1 (existing in $\mathcal{H}_1$) and a state 2 (existing in $\mathcal{H}_2$) entanglement exists in the space $\mathcal{H}_1 \otimes \mathcal{H}_2$."

This language is very confusing, but is unfortunately common and is rarely explained. The "Hilbert space" $\mathcal{H}_1 \otimes \mathcal{H}_2$ simply represents the set of probability amplitudes that could be assigned to the combined output states. In our example, with 2 quantum coins, we would have a space of $\mathcal{H}_1\otimes \mathcal{H}_2 \rightarrow (|H\rangle_1 + |T\rangle_1) \otimes (|H\rangle_2+ |T\rangle_2)\rightarrow (|H\rangle_1|H\rangle_2 +|H\rangle_1|T\rangle_2+|T\rangle_1|H\rangle_2+|T\rangle_1|T\rangle_2 )$ In this case, we are using this notation as more of a "trick" to mix our kets together so that we get the four combined kets that span the larger possibility space!

score 0 · Answer 5 · answered Jun 29 '23 at 11:51

It is perhaps worth adding an example of a single particle with spin. The operators and the eigenfunctions are then studied in the product space of orbital degrees of freedom and spin degrees of freedom. E.g., if we start with the Hamiltonian $$ H=\frac{p_x^2}{2m}+V(x) -\frac{e\hbar}{2m}\sigma_z B,$$ then we have eigenfunctions of the spatial and spin parts of the Hamiltonian as $$ \left[\frac{p_x^2}{2m}+V(x)\right]\phi_n(x)=E_n\phi_n(x), \qquad \sigma_z\chi_\pm=\pm\chi_\pm,$$ whereas the eigenfunctions of the full Hamiltonian are defined in the direct (tensor) product space: $$\psi_{n,\pm}(x)=\phi_n(x)\chi_\pm.$$ Operators $\frac{p_x^2}{2m}+V(x)$ and $\sigma_z$ are diagonal in spin and position spaces respectively, and this is usually not spelled out explicitly. However, one may have spin-orbit coupling, which is diagonal in neither.
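Spelled out in tensor product notation, and writing the constant prefactor of the $\sigma_z B$ term as $\mu$ (a shorthand introduced here, not used in the answer), the Hamiltonian above reads $$ H=\left[\frac{p_x^2}{2m}+V(x)\right]\otimes 1 \;-\; \mu B\,\left(1\otimes\sigma_z\right), $$ and acting on the product eigenfunctions gives $$ H\,\psi_{n,\pm} = \left(E_n \mp \mu B\right)\psi_{n,\pm}. $$ Each term acts on one factor only, so the energies simply add. A spin-orbit term would instead be a sum of products $O_X\otimes O_S$ with both factors nontrivial, and its eigenfunctions would in general not be single products $\phi_n\chi_\pm$.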
