Quantum Probability, what makes quantum characteristic functions quantum?

Question

I'm trying to understand how $[Q,P] \neq 0$ leads to the conclusion that no joint probability distribution can be established for $Q$ and $P$.

Classically if we had two random variables $Q$ and $P$ we could write

\begin{align} \phi_{Q,P}(t_Q,t_P) &= E\left[e^{i(t_Q Q + t_P P)}\right]\\ &= \int e^{i(t_Q q + t_P p)} f_{Q,P}(q,p)\,dq\,dp\\ &= \mathcal{FT}\left[f_{Q,P}(q,p)\right](t_Q, t_P) \end{align}

Here $\phi_{Q,P}$ is the characteristic function and $f_{Q, P}$ is the probability density function for $Q$ and $P$. In particular we have that

\begin{align} f_{Q,P}(q,p) = \mathcal{FT}^{-1}\left[\phi_{Q,P}(t_Q,t_P)\right](q, p) \end{align}

If expectation values like $E\left[ Q^n P^m \right]$ are known for all non-negative integers $n, m$ then $\phi_{Q,P}(t_Q, t_P)$ can in principle be calculated and then Fourier transformed to find $f_{Q,P}(q, p)$. That is, knowledge of expectation values is enough to determine a probability function.

Quantum mechanically it is known that this procedure breaks down. There is no way to come up with a probability distribution function for non-commuting observables. My question is where my argument above breaks down in the non-commuting case. Quantum mechanically (at least theoretically) we have access to expectation values of the form $E\left[Q^n P^m\right]$ (and versions of the same with different operator orderings). This means that we can calculate some sort of quantum characteristic function $\phi_{Q, P}(t_Q, t_P)$ for $Q$ and $P$. In principle we should then be able to Fourier transform this characteristic function to get something like a probability distribution for $Q$ and $P$. For some reason, we only get a quasiprobability distribution and not a normal probability distribution. Why not?

I don't know the full answer to this but I have a couple of leads that I will mention.

First, as I mentioned above, $E[QP] \neq E[PQ]$ and $E[Q^nP^m] \neq E[Q^{n-1}P^{m}Q]$ and the like. This means that there is not a unique definition for the characteristic function. Given that there is not a unique characteristic function, it makes sense that there is not a unique probability distribution. The different choices of characteristic function can be related to different quasiprobability distributions such as the Wigner, P, or Q distributions. My question is why NONE of these characteristic functions could ever lead to a valid probability distribution function.

Second, not just any function $\phi$ can be transformed to give a probability distribution function. Probability distribution functions are normalized and non-negative everywhere. It is possible to take a Fourier transform and get something which is not normalized and which is not non-negative everywhere. I believe this is related to Bochner's theorem, but I'm having trouble parsing the theorem because of all of the measure theory. I would really appreciate an answer that explains how we can look at a classical characteristic function and see the properties that guarantee it will Fourier transform to a proper probability distribution function, and then how we can clearly see that characteristic functions of non-commuting operators do not satisfy these properties, so that we know they won't give us proper probability distribution functions.
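
The ordering ambiguity in the first lead can be made explicit with a short worked equation. This is a sketch that assumes the canonical commutator $[Q,P] = i\hbar$ (not stated in the question) and uses the Baker-Campbell-Hausdorff identity $e^{A+B} = e^{A}e^{B}e^{-[A,B]/2}$, valid when $[A,B]$ commutes with both $A$ and $B$:

$$ e^{i(t_Q Q + t_P P)} = e^{it_Q Q}\, e^{it_P P}\, e^{i\hbar t_Q t_P/2} = e^{it_P P}\, e^{it_Q Q}\, e^{-i\hbar t_Q t_P/2} $$

Taking expectation values, the symmetric ordering and the two factorized orderings give candidate characteristic functions that differ by the phases $e^{\pm i\hbar t_Q t_P/2}$, so they cannot all be Fourier transforms of the same function of $(q,p)$.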

lcv · Accepted Answer · 2020-06-05T00:42:27.050

I think the simplest way to understand why we cannot have a joint probability distribution for two incompatible observables $A,B$ (meaning non-commuting) is the following. With a slight abuse of notation, the joint probability is defined as:

$$ P_{A,B}(a,b) := \mathrm{Prob}(A=a,B=b) $$

which means, it's the probability of $A$ having value $a$ and $B$ having value $b$. In quantum mechanics this means that there is an eigenstate of $A$ with eigenvalue $a$ which is at the same time an eigenstate of $B$ with eigenvalue $b$. But if $[A,B]\neq 0$ this is notoriously not possible.

Note that, conversely, if $A$ and $B$ commute it is always possible to find a common eigenbasis and so the above prescription works fine.

This argument, however, does not answer the other part of the question, which is:

Why can I not define a bona fide characteristic function (i.e. which is the Fourier transform of a probability density) in case of non-compatible observables? And perhaps, is there any way to amend this?

As correctly pointed out by the OP, this has to do with Bochner's theorem, which states precisely which requirements a characteristic function has to satisfy. I will state Bochner's theorem in the form needed for our purposes.

Theorem (Bochner) (Univariate case) A continuous function $\chi (t)$ with $\chi(0)=1$ is the Fourier transform of a probability density $P(\omega)$ (more generally, of a probability measure), with $t,\omega \in\mathbb{R}$, if and only if for any $n$-tuple $t_1,t_2,\ldots, t_n$ ($t_k \in \mathbb{R}, \ k=1,2,\ldots,n$) the $n\times n$ matrix with entries $\chi_{i,j}:= \chi(t_i-t_j)$ is non-negative definite (and hermitian).

Note: for the Multivariate $d$-dimensional generalization simply consider the obvious rephrasing with $t,\omega, t_k \in \mathbb{R}^d$.

A simple way to understand Bochner's theorem is the following. A matrix $\chi$ is non-negative definite if and only if it can be written as $\chi = A A^\dagger$.

Let $P(\omega)$ be the Fourier transform of $\chi$. Then

$$ \chi_{i,j} = \int d\omega e^{i(t_i-t_j) \omega} P(\omega) \ \ \ \ \ (0) $$

which we write as

\begin{align} \chi_{i,j} &= \int d\omega e^{it_i \omega} P(\omega) e^{-it_j \omega} \\ & = (A A^\dagger)_{i,j} \end{align}

with

$$ A_{i,\omega} := e^{it_i \omega} \sqrt{P(\omega)} $$

which we can do since $P(\omega)$ is non-negative. So $P(\omega)$ non-negative means that $\chi_{i,j}$ is a non-negative definite matrix. This characterizes characteristic functions in the classical case.
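
The classical statement can be checked numerically. The following is a minimal sketch, not part of the original answer: it assumes a standard normal variable, whose characteristic function is $e^{-t^2/2}$, builds the matrix $\chi(t_i-t_j)$ at a few random points, and inspects its smallest eigenvalue. The helper names `bochner_matrix` and `gaussian_cf` are hypothetical.

```python
import numpy as np

def bochner_matrix(chi, points):
    """Matrix with entries chi(t_i - t_j) for the given sample points."""
    t = np.asarray(points, dtype=float)
    return np.array([[chi(ti - tj) for tj in t] for ti in t])

def gaussian_cf(t):
    # Characteristic function of a standard normal variable (illustrative choice).
    return np.exp(-0.5 * t**2)

rng = np.random.default_rng(0)
points = rng.uniform(-3.0, 3.0, size=6)   # arbitrary sample points t_1..t_6

chi_classical = bochner_matrix(gaussian_cf, points)
# The matrix is real symmetric, so eigvalsh applies.
min_eig_classical = np.linalg.eigvalsh(chi_classical).min()
print(min_eig_classical)
```

Up to floating-point rounding the smallest eigenvalue should not be negative, whichever points are drawn.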

Let us now turn to quantum mechanics and consider the multivariate case, i.e., we have several observables which I call $X_1, X_2, \ldots, X_n$ with spectra in $\omega_1, \ldots, \omega_n$. The conjugate variables are $t_1,\ldots,t_n$, and we use the notation

$$tX:=\sum_{k=1}^n t_k X_k$$

In this case the wannabe characteristic function is

$$ \chi(t):=\mathsf{E}[ e^{itX} ]= \operatorname{Tr} ( e^{itX} \rho ) $$

for some quantum state $\rho$ (a normalized non-negative matrix). We want to check under which conditions $\chi_{i,j}=\chi(t_i-t_j)$ is non-negative definite as a matrix.

If the $X_k$ were mutually commuting operators we would have

$$ e^{i(t_i-t_j) X} = e^{it_i X} e^{-it_j X} \ \ \ \ \ \ (1) $$

and then we could write

\begin{align} \chi_{i,j} &=\operatorname{Tr} \left ( e^{i(t_i-t_j) X} \rho \right) \\ &=\operatorname{Tr} \left ( e^{it_i X} e^{-it_j X} \rho \right ) \\ &=\operatorname{Tr} \left (e^{-it_j X} \sqrt{\rho} \sqrt{\rho} e^{it_i X} \right ) \end{align}

Now define the matrix $A_{j,lq} := \left ( e^{-it_j X} \sqrt{\rho} \right )_{l,q} =: A_{j,\xi}$ where $\xi=(l,q)$. We have

$$ \chi_{i,j} = \sum_{lq} A_{j,lq} \overline{A_{i,lq}} $$

which is of the form $BB^\dagger$ with $B_{i,\xi} := \overline{A_{i,\xi}}$, and proves that $\chi_{i,j}$ is non-negative definite. This construction breaks down if the $X_k$ are not mutually commuting, because the factorization (1) no longer holds.
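
The commuting case can also be tested numerically. This is a sketch under stated assumptions, not the answer's own computation: the two commuting observables $\sigma^z\otimes 1$ and $1\otimes\sigma^z$, the random state, and the helper names `expi` and `chi_matrix` are all illustrative choices.

```python
import numpy as np

def expi(h):
    """exp(i*h) for a Hermitian matrix h, via its eigendecomposition."""
    w, v = np.linalg.eigh(h)
    return (v * np.exp(1j * w)) @ v.conj().T

def chi_matrix(observables, rho, points):
    """Entries Tr(exp(i (t_i - t_j).X) rho) for the vectors t_i in `points`."""
    def chi(t):
        h = sum(tk * xk for tk, xk in zip(t, observables))
        return np.trace(expi(h) @ rho)
    return np.array([[chi(ti - tj) for tj in points] for ti in points])

sz = np.diag([1.0, -1.0])
id2 = np.eye(2)
# Two commuting observables on two qubits (illustrative choice).
commuting = [np.kron(sz, id2), np.kron(id2, sz)]

rng = np.random.default_rng(1)
g = rng.normal(size=(4, 4)) + 1j * rng.normal(size=(4, 4))
rho = g @ g.conj().T
rho /= np.trace(rho).real                 # random density matrix

points = rng.uniform(-2.0, 2.0, size=(5, 2))
chi_commuting = chi_matrix(commuting, rho, points)
min_eig_commuting = np.linalg.eigvalsh(chi_commuting).min()
print(min_eig_commuting)
```

For commuting observables the smallest eigenvalue should be non-negative up to rounding, in line with the $BB^\dagger$ argument.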

Strictly speaking (as pointed out correctly by @AcuriousMind) this does not prove that the matrix $\chi_{i,j}$ is not non-negative definite for non-mutually commuting observables. For that one should find a counterexample, i.e. show that $\chi$ has a negative eigenvalue. However it does show where the argument breaks down.

Added edit

Here I present a counterexample for a single qubit. It can be shown that for $n=2$ the $\chi_{i,j}$ matrix is always non-negative definite. So to look for the first counterexample we must take $n=3$.
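
A short sketch of why $n=2$ cannot fail, assuming only that $e^{itX}$ is unitary and $\rho$ is a state, so that $\chi(0)=1$, $|\chi(t)|\le 1$ and $\chi(-t)=\overline{\chi(t)}$. With $c := \chi(t_1-t_2)$ (a symbol introduced here for brevity),

$$ \chi = \begin{pmatrix} 1 & c \\ \bar{c} & 1 \end{pmatrix}, \qquad \text{with eigenvalues } 1 \pm |c| \geq 0 . $$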

Consider the following problem with incompatible (non-commuting) observables given by $\sigma^x$ and $\sigma^z$. As for the state we pick $\rho = | 0\rangle\langle 0|$. Hence the putative characteristic function is

\begin{align} \chi(t_x,t_z) &:= \langle 0| e^{i (t_x \sigma^x +t_z \sigma^z)}| 0 \rangle \\ & = \cos \left (\sqrt{t_x^2 +t_z^2}\right ) + i\frac{t_z}{\sqrt{t_x^2 +t_z^2}} \sin \left ( \sqrt{t_x^2 +t_z^2}\right ) . \end{align}
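
The closed form can be derived in one step. Writing $r := \sqrt{t_x^2+t_z^2}$ (a symbol introduced here for brevity) and using $(t_x \sigma^x + t_z \sigma^z)^2 = r^2\, 1\!\mathrm{l}$, the exponential series splits into even and odd powers:

$$ e^{i (t_x \sigma^x +t_z \sigma^z)} = \cos (r)\, 1\!\mathrm{l} + i\,\frac{\sin (r)}{r} \left( t_x \sigma^x + t_z \sigma^z \right), \qquad \langle 0|\sigma^x|0\rangle = 0, \quad \langle 0|\sigma^z|0\rangle = 1 . $$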

Now form the matrix $\chi_{i,j}$ for $n=3$. It can be shown that $\chi$ has the form

$$ \chi = 1\!\mathrm{l} + \Gamma $$

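where the matrix $\Gamma$ is hermitian and has zeros on the diagonal (because $\chi(0)=1$). Since $\Gamma$ is traceless, it has a negative eigenvalue unless it vanishes. The eigenvalues of $\chi$ are those of $\Gamma$ shifted by $1$, so $\chi$ fails to be non-negative definite exactly when $\Gamma$ has an eigenvalue smaller than $-1$.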

For simplicity let's call $a_{ij} = t_x^i-t_x^j$ and $b_{ij} = t_z^i-t_z^j$. Now simply pick random $a_{ij}, b_{ij}$:

\begin{align} a_{12} & = 1 \ \ b_{12} = 0.5 \\ a_{13} & = 2 \ \ b_{13} = 1.3 \\ a_{23} & = 0.4 \ \ b_{23} = 0.9 \\ \end{align}

The eigenvalues of $\Gamma$ turn out to be $\{1.499, \ -1.221, \ -0.278\}$. Since $-1.221 < -1$, $\chi$ has the negative eigenvalue $1-1.221=-0.221$ and is not non-negative definite. This shows that the Fourier transform of $\chi(t_x,t_z)$ is not a (joint) probability distribution. $\square$
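
The quoted eigenvalues can be reproduced with a few lines of NumPy. This is a sketch, not the author's original computation. Part (a) builds $\Gamma$ directly from the listed $(a_{ij}, b_{ij})$. One caveat, offered as an observation: differences of three actual points must satisfy $a_{13} = a_{12} + a_{23}$ (and likewise for $b$), which the listed values do not ($1 + 0.4 \neq 2$). Part (b) therefore repeats the test starting from three explicit points, a hypothetical choice that is not in the original answer.

```python
import numpy as np

def chi(tx, tz):
    """<0| exp(i (tx*sigma_x + tz*sigma_z)) |0>, using the closed form above."""
    r = np.hypot(tx, tz)
    if r == 0.0:
        return 1.0 + 0.0j
    return np.cos(r) + 1j * (tz / r) * np.sin(r)

# (a) Reproduce Gamma from the quoted differences (a_ij, b_ij), i < j.
diffs = {(0, 1): (1.0, 0.5), (0, 2): (2.0, 1.3), (1, 2): (0.4, 0.9)}
gamma = np.zeros((3, 3), dtype=complex)
for (i, j), (a, b) in diffs.items():
    gamma[i, j] = chi(a, b)
    gamma[j, i] = np.conj(gamma[i, j])    # chi(-t) = conj(chi(t))
gamma_eigs = np.linalg.eigvalsh(gamma)    # returned in ascending order

# (b) Same test from three explicit points (t_x, t_z); hypothetical choice.
points = np.array([[0.0, 0.0], [1.0, 0.0], [0.0, 1.0]])
chi_pts = np.array([[chi(*(p - q)) for q in points] for p in points])
chi_pts_min = np.linalg.eigvalsh(chi_pts).min()

print(gamma_eigs)
print(chi_pts_min)
```

In part (b) the state $|0\rangle$ is an eigenstate of $\sigma^z$, so $|\chi(0,t_z)|=1$ and the $3\times 3$ determinant reduces to minus a squared modulus. That makes a negative eigenvalue easy to confirm by hand for these points.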

@lcv your statement of Bochner's theorem is very helpful but you lose me a little bit after it. My main question is notational. You state Bochner's theorem in terms of $\chi(t)$ and $P(\omega)$. Are $\omega$ and $t$ thought to be vectors here? I think this is needed to address the multivariate quantum case. If so then are $t_i$ vectors throughout the post? I guess they are, if you can confirm this I think I can work through other details on my own. — Jagerber48, Feb 02 '20 at 16:17
I suppose for $X = (Q, P)$ (as a vector) the vectors $t_1 = (1,0)$ and $t_2 = (0, 1)$ should be able to give a counter-example. I'll try to work this out. — Jagerber48, Feb 02 '20 at 16:18
@jgerber In Eq. (0) $t_i$ were supposed to be scalar for simplicity. But everything carries through if you interpret them as vector. $t\omega$ is then the standard scalar product. Yes, if you pick particular operators one should be able to work out a counter-example. I don't have a simple argument for the general case now. — lcv, Feb 02 '20 at 17:20
I added a few details. Note that the role of $n$ is different in different sections according to context. Hopefully it does not cause confusion. — lcv, Feb 02 '20 at 17:29
Isn’t the Husimi $Q$-function a direct counterexample to the main point of this answer? — knzhou, Feb 02 '20 at 17:42
You may define the anti-normal ordered characteristic function $\chi(t) = \text{tr}(\rho e^{-t a^\dagger} e^{t^* a})$. This is the characteristic function of the $Q$-function, which is nonnegative and normalizable. — knzhou, Feb 02 '20 at 21:10
@knzhou the issue here is to be able to define a multivariate probability distribution that can be used to reconstruct the quantum state. In your line of thought there is an even simpler example, $\chi(t) =\operatorname{tr} ( e^{i t Q} \rho )$. In your example the operators are not even observables. — lcv, Feb 02 '20 at 23:50
@knzhou the point is not that characteristic functions cannot be defined. But they cannot be defined for non commuting observables. This is a well known fact. — lcv, Feb 03 '20 at 02:52

score 2 · Answer 2 · answered Feb 02 '20 at 02:35

This is an issue of physical interpretation, not mathematics. For example, the Husimi $Q$-function is a probability distribution in the formal sense that it is a non-singular, normalizable, non-negative function.

We don't call it a probability distribution, however, because quantum measurement doesn't work like it does classically. The quantity $Q(q, p)$ does not physically represent "the probability of the particle having position $q$ and momentum $p$", fundamentally because $[q, p] = i$, so you can't measure both simultaneously.

answered Feb 02 '20 at 02:35 by knzhou

The Husimi function $Q(\alpha)$ is the probability of finding the system in the coherent state $| \alpha\rangle$. It should also be said that knowledge of $Q$ is not enough to know the state. – lcv Feb 02 '20 at 15:51
What is $Q(q,p)$? The Husimi Q-function is a function on "coherent" or "optical" phase space, not on ordinary phase space. I know how to express the Wigner function as a function on coherent phase space, but I don't know how to express $Q$ and $P$ as functions on ordinary phase space. – ACuriousMind Feb 02 '20 at 20:01
@ACuriousMind You just substitute $\alpha = (q + i p)/\sqrt{2}$. I don't know if this is "rigorous" to a purist, but it's what people who use these functions (e.g. for visualizing quantum states) actually do. – knzhou Feb 02 '20 at 21:11
You are right that the positive semidefiniteness does not make the Husimi a bonafide PD. The marginal "probabilities" found this way are simply wrong, and for a reason: in this correspondence, there is a Husimi star implied in symbol products which fails to integrate out, unlike in Weyl ordering (Wigner function). But you did hit the nail on the head: in all such quasi-p-distributions, two different points in phase space fail to represent mutually exclusive alternatives, violating Kolmogorov's 3rd axiom... – Cosmas Zachos Feb 08 '20 at 15:28

score 0 · Answer 3 · answered Feb 02 '20 at 02:16

The expectation value $E[\mathrm{e}^{\mathrm{i}(t_QQ+ t_PP)}]$ simply does not exist in quantum mechanics - the operator inside the brackets is not a self-adjoint operator, hence not an observable, hence it does not possess an "expectation value".

answered Feb 02 '20 at 02:16 by ACuriousMind

Ok, but I can still calculate $\langle e^{i(t_Q Q + t_P P)} \rangle = \langle \psi| e^{i(t_Q Q + t_P P)}|\psi\rangle$. If the operator in the brackets were self-adjoint, $\langle H \rangle$ you would then say it is an observable so that we can let $E[H] = \langle H \rangle$. Let's just say this is my definition for $E\left[e^{i(t_Q Q + t_P P)}\right]$. Why can I not interpret this as an expectation value? Interpreted this way it leads to something we could call a characteristic function or a quasi-characteristic function. Why is it not a normal characteristic function? – Jagerber48 Feb 02 '20 at 02:21
And so what if the expectation value of a random variable gives a complex number? Let's expand our scope in the classical case to allow complex valued random variables. – Jagerber48 Feb 02 '20 at 02:21
@AcuriousMind you could say exactly the same thing in the classical case as that expectation value is complex in that case (and no measurable quantity is). – lcv Feb 02 '20 at 05:34
@jgerber When you say "Why can I not interpret this as an expectation value?", then what do you mean by that? What does it mean for you to say that something is an expectation value? – ACuriousMind Feb 02 '20 at 09:52
