10

In Quantum Mechanics, as far as I know, if a system is described by a Hilbert space $\mathcal{H}$, each physical quantity is associated with some hermitian operator $A : U\subset \mathcal{H}\to \mathcal{H}$ such that the set of eigenvalues of this operator is the set of possible values the quantity may assume.

If $\left|\psi\right\rangle\in \mathcal{H}$ is an eigenvector of $A$ with eigenvalue $\lambda$ and if a particle is in the state $\left|\psi\right\rangle$, then when we measure the physical quantity corresponding to $A$ we are certain to get $\lambda$.

In all of the above I'm talking about the general setting of a general state space. When dealing with the space of wavefunctions for a particle in one dimension, books usually simply say the following: any physical quantity of interest can be written in terms of momentum and position. In that case, to find the operator associated with some physical quantity, we write it in terms of position and momentum and substitute the operators that correspond to position and momentum into the formula.

For that, one defines the position operator $\hat{x} : U\subset \mathcal{H}\to \mathcal{H}$ by $\hat{x}\psi(x)=x\psi(x)$ and the momentum operator $\hat{p}: U\subset \mathcal{H}\to \mathcal{H}$ by $\hat{p}\psi(x)=-i\hbar \partial_x \psi(x)$.

All of that leaves me with the following doubts:

  1. How can one make rigorous this idea that "we simply pick the quantity of interest, write it in terms of position and momentum, and substitute the operators"? I know that this works, but how can one justify that?

  2. More than that, why should the position operator be multiplication by the position coordinate, and why should the momentum operator be the derivative multiplied by $-i\hbar$?

  3. Finally, in the more general setting of an abstract state space of kets, which are not necessarily functions, how does one define the operators of interest? How does one find out what the operator corresponding to some physical quantity should be?

I know that one doesn't really need to answer this. When defining an operator all that matters is that things work as expected. But I want to fully understand how to arrive at the operator for a certain physical quantity.

Gold
  • Question 2 really belongs as a separate question, and one that I'm sure has been answered before, e.g http://physics.stackexchange.com/q/73483/ and http://physics.stackexchange.com/q/77457/ – By Symmetry Sep 04 '15 at 00:55
  • 2
    "I know that this works, but how can one justify that?" Not to poo-poo the possibility that there might be deeper mathematical structure at work here, but ... this is science "it works" is the ultimate justification. If someone shows you a deeper mathematical structure, the reason you should use it is because "it works", not because it is math or even because it is rigorous. – dmckee --- ex-moderator kitten Sep 04 '15 at 01:00

4 Answers

14

The search for a quantum mechanical theory can be carried out in a mathematically systematic fashion, and it starts from observables. So the answer to the OP's last question is that the process is inverted: the relevant observables are given first (and, as we will discuss below, they are justified by observations); then you find, as a consequence, the space of states on which these observables are measured. This leads to the investigation of very general mathematical structures, i.e. noncommutative probability and especially the theory of operator algebras. These structures have a huge importance in pure mathematics, and their study led to powerful results that, when applied to quantum physics (i.e. when using quantum physical systems as a model for these structures), led to a much deeper understanding of how quantum mechanical systems behave, and to the justification/prediction of many important experimental facts. It is, in my opinion, quite unfair to diminish such contributions to knowledge (the ultimate goal of physics, rather than just the mere observation) as something that in the end just works and provides numbers that agree with experiments.$^{\dagger}$

In the following I will describe how such a search for a quantum theory works (without too much detail).

It starts with observation. You have a physical quantum system, on which you are able to perform measurements that could describe with sufficient accuracy some physical quantities. Since the system is quantum mechanical, you will experience indeterminacy problems with your measurements, and in general you would only be able to determine the average value that an observable takes in a given state of the system.

Then you identify a set of observables for the system. This set $A$ should be as complete as possible, in order to cover all the effective or theoretically allowed measurements. For example, if the system consists of a particle, you may think of measuring its (average) position, and in many cases experimentally perform such a measurement. Operationally, you may think of a device that alerts you if the particle is in a given region $\Omega\subseteq \mathbb{R}^3$ of space. We may call this observable $\mathbb{1}_{\Omega}(x)$, and therefore $A\supset \{\mathbb{1}_{\Omega}(x),\Omega\subseteq \mathbb{R}^3\}$. The same thing may be done with a device that alerts you if the particle has a given momentum, again contained in $\Omega\subseteq \mathbb{R}^3$; and therefore $A\supset \{\mathbb{1}_{\Omega}(x), \mathbb{1}_{\Omega}(p),\Omega\subseteq \mathbb{R}^3\}$. So it is plausible that your set of observables has an infinite number of objects (e.g. to take into account that, in principle, you could measure whether the particle is in any region of space); it is also plausible that your observables may be rescaled, summed, and multiplied (the last operation may also be non-commutative, as observations suggest). Finally, it is useful to extend the set of observables to "complex" ones (that is a mathematical convenience), and therefore to allow multiplication by complex numbers and a "complex conjugation" of observables. Every one of these operations has to satisfy suitable straightforward properties.

This set of observables with the properties above forms a mathematical structure. This structure is called a $*$-algebra. The last notion that is useful to introduce is that of the "magnitude" of an observable. This would be the supremum value that an observable can reach when measured on states. Mathematically, the magnitude has the form of a norm $\lVert\cdot\rVert: A\to \mathbb{R}_+$, with given properties. Now completing the $*$-algebra $A$ with respect to the norm $\lVert\cdot\rVert$ we obtain a Banach $*$-algebra $\mathfrak{A}$. This is the object we choose to be the set of observables of the system, provided it satisfies some additional assumptions on the norm (I do not want to give too many details); and it is called a $C^*$-algebra. At this stage, we have made mostly natural assumptions that agree with experimental evidence (e.g. manipulation of observables, the concept of magnitude of an observable, the properties of magnitude itself...).

Given the set of observables of the system, i.e. a $C^*$-algebra $\mathfrak{A}$, we identify the space of states. The states should be described mathematically as objects that characterize the evaluation of observables, and therefore the behavior of the system. In other words, given any observable $a\in\mathfrak{A}$, we need a state $\omega$ that maps $a$ to a (possibly complex, but for "real" observables it would be real) number, to be interpreted as the average value of the observable in the state. This means that $\omega$ is a functional on $\mathfrak{A}$; and since it should behave in a continuous manner, it would be an element of the topological dual $\mathfrak{A}'$ of $\mathfrak{A}$. In addition, the state has to characterize evaluations that are correct in a probabilistic sense, therefore its norm in the dual space has to be one (i.e., absolute certainty has probability one). Finally, it has to preserve positivity: if we have an observable that has only positive admissible values, the evaluation must always be positive. Therefore we denote the space of states as $$S_{\mathfrak{A}}=\{\omega\in\mathfrak{A}', \lVert\omega\rVert_{\mathfrak{A}'}=1, \forall (\mathfrak{A}\ni a\geq 0),\omega(a)\geq 0\}\; .$$

First mathematical result: to the algebra of observables is always associated an algebra of bounded operators in a Hilbert space. This result, called the GNS construction, says that any given $C^*$-algebra $\mathfrak{A}$ is isomorphic to an algebra of bounded operators in a Hilbert space $\mathscr{H}$. In mathematical terms, we always have an isometric $*$-homomorphism $\pi:\mathfrak{A}\to \mathscr{L}(\mathscr{H})$ (an isomorphism onto its image), such that (suitable) states are associated to "kets", i.e. to vectors $\Omega$ of the Hilbert space; and the corresponding evaluation takes the form $\omega(a)=\langle\Omega, \pi(a)\Omega\rangle$.
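To make the formula $\omega(a)=\langle\Omega, \pi(a)\Omega\rangle$ concrete, here is a minimal finite-dimensional sketch. It assumes a toy algebra (the $2\times 2$ complex matrices acting on $\mathbb{C}^2$, with $\pi$ the identity map), which is far simpler than the algebras discussed above; the helper names `matmul`, `adjoint` and `vector_state` are hypothetical and chosen only for this illustration. It checks normalization, $\omega(\mathbb{1})=1$, and positivity, $\omega(a^*a)\geq 0$.

```python
# Toy model (assumption): the C*-algebra of 2x2 complex matrices acting on C^2.

def matmul(a, b):
    """Product of two 2x2 matrices given as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def adjoint(a):
    """Conjugate transpose of a 2x2 matrix."""
    return [[a[j][i].conjugate() for j in range(2)] for i in range(2)]

def vector_state(omega_vec):
    """Return the functional a -> <Omega, a Omega> for a unit vector Omega."""
    def omega(a):
        a_omega = [sum(a[i][k] * omega_vec[k] for k in range(2)) for i in range(2)]
        return sum(omega_vec[i].conjugate() * a_omega[i] for i in range(2))
    return omega

identity = [[1 + 0j, 0j], [0j, 1 + 0j]]
sigma_x = [[0j, 1 + 0j], [1 + 0j, 0j]]
sigma_z = [[1 + 0j, 0j], [0j, -1 + 0j]]

# Unit vector Omega = (e_0 + e_1) / sqrt(2)
s = 2 ** -0.5
omega = vector_state([complex(s), complex(s)])

norm_check = omega(identity)      # normalization: omega(1) should be 1
mean_sigma_x = omega(sigma_x)     # average of sigma_x in this state
mean_sigma_z = omega(sigma_z)     # average of sigma_z in this state

# Positivity: omega(a* a) is real and non-negative for any a in the algebra.
a = [[1 + 2j, 0.5j], [-1 + 0j, 3 + 0j]]
positivity_check = omega(matmul(adjoint(a), a))

print(norm_check, mean_sigma_x, mean_sigma_z, positivity_check)
```

In this toy case the state has a sharp value for `sigma_x` and only an average for `sigma_z`, which mirrors the remark above that a state in general only fixes average values.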

A second mathematical result: any irreducible representation of the algebra of the canonical commutation relations of QM$^\ddagger$ (in their exponentiated form) is unitarily equivalent to the one where the space is $L^2(\mathbb{R}^3)$, $x$ is the position operator, and $-i\hbar\nabla$ is the momentum operator. This is the so-called Stone-von Neumann theorem, and it provides an answer to the OP's point 2.
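For concreteness, the "exponentiated form" mentioned above is usually written as the Weyl relations. The following is a sketch in one dimension; the names $U(s)$ and $V(t)$ and the sign convention are assumptions made here for illustration, and other conventions differ by signs or factors of $\hbar$.

$$\begin{aligned}
U(s) &= e^{i s \hat{x}}, \qquad V(t) = e^{i t \hat{p}}, \qquad s,t\in\mathbb{R},\\
U(s)\,V(t) &= e^{-i\hbar s t}\, V(t)\,U(s),\\
(U(s)\psi)(x) &= e^{i s x}\,\psi(x), \qquad (V(t)\psi)(x) = \psi(x+\hbar t) \quad \text{on } L^2(\mathbb{R}).
\end{aligned}$$

The last line is the standard representation: applying $U(s)V(t)$ to $\psi$ gives $e^{isx}\psi(x+\hbar t)$, while $V(t)U(s)$ gives $e^{is(x+\hbar t)}\psi(x+\hbar t)$, and the two differ exactly by the phase $e^{-i\hbar s t}$. Since $U(s)$ and $V(t)$ are unitary, hence bounded, this form avoids the domain problems of the unbounded operators $\hat{x}$ and $\hat{p}$.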


$\dagger$: This long preamble is partly a response to anna v and dmckee.

$\ddagger$: The algebra of the CCR of QM is based on a finite dimensional symplectic space (the simplest example being the classical phase space of a free particle, roughly speaking $\mathbb{R}^3\times\mathbb{R}^3$); if the symplectic space is infinite dimensional, the result does not hold anymore.

yuggib
  • +1 This answer must be taken as an example. This is quantum mechanics. – user91126 Sep 04 '15 at 09:46
  • Your last statement is not very clear to me. If the rep. is irreducible then it is unitarily equivalent to the standard rep in $L^2$. If it is not, it is however a direct sum (not a direct integral) of such representations as proved by Mackey. – Valter Moretti Sep 04 '15 at 14:10
  • @ValterMoretti Yes, probably I did not state it in the clearest of forms: I did not want to be too precise mathematically (to avoid cumbersome definitions and notations); but maybe that did not help. If it is the "finite dimensional" bit that for you is not clear, I meant that the CCR has to be based on a finite dimensional symplectic space (quantum mechanics); otherwise the Stone-von Neumann theorem does not hold. – yuggib Sep 04 '15 at 14:35
  • Yes, I agree on the finite dimensionality of the symplectic space. My point regards the hypothesis of irreducibility. It is part of the hypotheses of the SvN theorem, and not of the thesis. It seemed to me not so clear in your answer. – Valter Moretti Sep 04 '15 at 16:22
  • @ValterMoretti Ok, I see your point thanks. I edited in the hope that the formulation is now clearer. – yuggib Sep 04 '15 at 16:29
  • How do weak measurements fit in? We can do them in the lab after all. And why tell people they can only observe average values when they know they can see individual values with various frequencies? You can argue that averages suffice, but telling someone that they can't do something in the lab that they very well can do isn't a way to engender trust or understanding. – Timaeus Sep 04 '15 at 20:05
  • @Timaeus "Weak measurements" are a quite disputed subject (surely not commonly accepted quantum mechanics), and as I see it, they are not different from other measurements. I did not describe above the measurement process at all (and I did not want to, for that is another problem not so related with the problem at hand). – yuggib Sep 04 '15 at 20:24
  • @Timaeus Concerning the probabilistic interpretation, this is usual in both quantum as well as classical (statistical) mechanics. Evaluating an observable in a state is a probabilistic notion, and gives in general the average of each possible different outcome allowed by the state, weighted by the respective probability; obviously you may have states where only a single value is allowed for a certain observable, and therefore the averaging process is superfluous. But in quantum mechanics, you have observables, such as position or momentum, for which no state allows a single possible outcome – yuggib Sep 04 '15 at 20:29
  • @Timaeus And I never said you can't do measurements with "individual" values in the lab; simply there are indeterminacy problems in quantum mechanics, concerning some relevant observables, that are clearly not present in classical mechanics. I don't see that as something worrying or not worth of trust. – yuggib Sep 04 '15 at 20:35
  • My comment above was too short to contain much nuance. I didn't mean to suggest that seeking deeper structure was either a waste of time or second class work. Indeed, it is very often the door to new understanding. But the applicability of any particular piece of beautiful and deep mathematics to physics stems from continuing to capture the behavior of real systems rather than from its value as mathematics. All of which you are clearly deeply conversant with, but which is sometimes lost in the day-to-day buzz of instruction in school. – dmckee --- ex-moderator kitten Sep 04 '15 at 23:33
  • @dmckee I just wanted to give another point of view; from someone in that community that works at the crossing between mathematics and physics. No offence was taken from your comments ;-) – yuggib Sep 05 '15 at 07:29
1

Question 1

I'm not sure one can make this idea "rigorous". This is essentially the process of "quantization of a classical theory". It's actually an experimental result: if we take classical expressions defining relationships between classically measurable quantities and replace them by observables, and if, further, we replace Poisson brackets in the classical Hamilton equations for the evolution of these quantities by Lie brackets, we arrive at a theory that foretells experimental results well: the procedure is essentially experimentally observed to work well. How do we motivate this arbitrary recipe? This procedure is essentially due to Dirac, who noticed the striking likeness between the general form of Hamilton's equation and the equation for the evolution of an observable in the Heisenberg picture. I actually find that one of the most helpful things to do in getting one's head around these at first arcane procedures is to study the history of QM, and I would commend to you:

Anthony Duncan, Michel Janssen, "From canonical transformations to transformation theory, 1926–1927: The road to Jordan’s Neue Begründung", Studies in History and Philosophy of Science Part B - Studies in History and Philosophy of Modern Physics, 40, #4, pp. 352-362 (2009)

It should also be said that researchers such as Weyl, Wigner, Groenewold and Moyal found a way to reformulate quantum mechanics so that both the classical Hamilton equations and the quantum observable evolution equation in the Heisenberg picture belong to a unified whole where the Poisson bracket of the classical theory becomes the limiting form of a more general Moyal bracket, which is otherwise proportional to a Lie bracket. See my answer here for more details.
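The likeness Dirac noticed can be written side by side. As a sketch, for a classical quantity $f(q,p)$ and a quantum observable $\hat{A}$ with no explicit time dependence (the symbols $f$, $H$, $\hat{A}$, $\hat{H}$ are notation assumed here):

$$\begin{aligned}
\frac{\mathrm{d}f}{\mathrm{d}t} &= \{f,\,H\} && \text{(Hamilton)},\\
\frac{\mathrm{d}\hat{A}}{\mathrm{d}t} &= \frac{1}{i\,\hbar}\,[\hat{A},\,\hat{H}] && \text{(Heisenberg)},\\
\{q,\,p\} = 1 \quad &\longleftrightarrow \quad [\hat{q},\,\hat{p}] = i\,\hbar\,\mathrm{id}.
\end{aligned}$$

So the recipe amounts to the replacement $\{\cdot,\cdot\}\to\frac{1}{i\hbar}[\cdot,\cdot]$. For general functions of $q$ and $p$ this replacement holds only up to operator-ordering ambiguities, which is one reason the procedure resists being made fully rigorous.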

Question 2

One can motivate this essentially from de Broglie's hypothesis and then see it as the unique possibility, in a certain sense, through the Stone-von Neumann theorem. Assume at the outset we can transform our separable Hilbert state space $\mathbf{L}^2(\mathbb{R})$ to co-ordinates wherein the position observable is simply the multiplication operator $(f:\mathbb{R}\to\mathbb{C}) \mapsto (x\,f:\mathbb{R}\to\mathbb{C})$. Assume then de Broglie's hypothesis that the momentum of a delocalized plane wave $e^{i\,k\,x}$ has to be $\hbar\,k$. So, for a general state, we Fourier transform from position co-ordinates to see what delocalized, pure sinusoidal momentum states make up our state, assign $\hbar\,k$ to each of the constituents, sum them up, and then Fourier transform the superposition back to find that our momentum observable in position co-ordinates is $\hat{p}=-i\,\hbar\,\mathrm{d}_x$, or $\hat{p}=-i\,\hbar\,\nabla$ in 3D. Witness immediately that we have the canonical commutation relationship $[\hat{x},\,\hat{p}] = i\,\hbar\,\mathrm{id}$, which then implies the Heisenberg uncertainty relationship between measurements made by these two observables.
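Written out, the Fourier argument runs as follows. This is only a sketch: the symbol $\tilde{\psi}(k)$ for the Fourier transform and the $1/\sqrt{2\pi}$ normalization are conventions assumed here, and $\psi$ is taken to be smooth and rapidly decaying so that differentiating under the integral is allowed.

$$\begin{aligned}
\psi(x) &= \frac{1}{\sqrt{2\pi}}\int_{\mathbb{R}} \tilde{\psi}(k)\,e^{i\,k\,x}\,\mathrm{d}k,\\
(\hat{p}\,\psi)(x) &= \frac{1}{\sqrt{2\pi}}\int_{\mathbb{R}} \hbar\,k\,\tilde{\psi}(k)\,e^{i\,k\,x}\,\mathrm{d}k
= -i\,\hbar\,\mathrm{d}_x\,\frac{1}{\sqrt{2\pi}}\int_{\mathbb{R}} \tilde{\psi}(k)\,e^{i\,k\,x}\,\mathrm{d}k
= -i\,\hbar\,\mathrm{d}_x\psi(x),\\
([\hat{x},\,\hat{p}]\,\psi)(x) &= x\,\big(-i\,\hbar\,\mathrm{d}_x\psi(x)\big) + i\,\hbar\,\mathrm{d}_x\big(x\,\psi(x)\big) = i\,\hbar\,\psi(x).
\end{aligned}$$

The middle line uses $-i\,\hbar\,\mathrm{d}_x e^{i\,k\,x} = \hbar\,k\,e^{i\,k\,x}$, i.e. each plane wave is an eigenfunction of $-i\,\hbar\,\mathrm{d}_x$ with the de Broglie momentum as its eigenvalue.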

The Stone-von Neumann theorem then asserts that any "reasonable"[1] unitary representation of two operators that fulfil the canonical commutation relationship is unique up to a unitary transformation.

What this means is that if we have $[\hat{x},\,\hat{p}] = i\,\hbar\,\mathrm{id}$ in a theory involving two arbitrary observables we can always find co-ordinates wherein $\hat{x} \,f = x\,f$ and $\hat{p}\,f = -i\,\hbar\,\mathrm{d}_x\,f$.

Question 3

This one seems to be over my head, so I'll defer to an expert. In the meanwhile, apply the Meatloaf principle: 2 out of 3 ain't bad.


[1]: The exponentiated versions of the pair must also fulfil the Weyl relations. See the Wiki article I cited for details.

  • Just a small technical remark: there are (explicitly defined) operators $X$ and $P$, essentially self-adjoint on a common domain, such that $[X,P]=i\hbar$ and that are not unitarily equivalent to the canonical observables. This is because their exponentiated versions do not satisfy the Weyl relations. Every pair of self-adjoint operators whose exponentiated versions satisfy the Weyl relations is indeed unitarily equivalent to the canonical observables (Stone-von Neumann). – yuggib Sep 04 '15 at 11:50
  • @yuggib Thanks heaps. Noted and changed. Have never quite gotten through a proof of the SvN with full understanding, probably the reason why that bit (Weyl requirement) didn't stick. – Selene Routley Sep 04 '15 at 12:39
  • You're welcome ;-) One is tempted to think that the requirement on generators would suffice, but there are explicit counterexamples so one has to be careful. The counterexample that involves Riemann surfaces given by Reed and Simon is nice! – yuggib Sep 04 '15 at 13:21
  • It is sufficient that $X^2+P^2$ is essentially selfadjoint. Under this further hypothesis, the CCR can be lifted to a strongly continuous unitary rep. of the Weyl-Heisenberg group and Stone-von Neumann holds. Conversely, if you have a strongly continuous rep. of the Weyl-Heisenberg group, its generators satisfy the CCR on the Garding domain, where $X^2+P^2$, $X$ and $P$ are essentially selfadjoint. Another dense invariant domain where these facts are true is the set of analytic vectors of the representation. – Valter Moretti Sep 04 '15 at 14:05
0

Re question 2):

Let $\lbrace,\rbrace$ be the classical Poisson bracket. Then for (ordinary classical) observables $u,v,w,x$ we have $$\lbrace uv,wx\rbrace = \lbrace u,wx\rbrace v +u \lbrace v,wx\rbrace =\lbrace uv,w\rbrace x + w \lbrace uv,x\rbrace$$

If we drop the assumption of commutativity but still expect the above to hold, it follows from some algebraic manipulation that $$\lbrace u,w\rbrace [v,x]=[u,w]\lbrace v,x\rbrace$$ where $[u,v]$ is the Lie bracket $uv-vu$.
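The algebraic manipulation can be spelled out. As a sketch, assume the product rule holds in each slot with the operator order preserved, $\lbrace ab,c\rbrace=\lbrace a,c\rbrace b+a\lbrace b,c\rbrace$ and $\lbrace a,bc\rbrace=\lbrace a,b\rbrace c+b\lbrace a,c\rbrace$, and expand $\lbrace uv,wx\rbrace$ in the two possible orders:

$$\begin{aligned}
\lbrace uv,wx\rbrace &= \lbrace u,w\rbrace\, x v + w\lbrace u,x\rbrace v + u\lbrace v,w\rbrace x + u w\lbrace v,x\rbrace,\\
\lbrace uv,wx\rbrace &= \lbrace u,w\rbrace\, v x + u\lbrace v,w\rbrace x + w\lbrace u,x\rbrace v + w u\lbrace v,x\rbrace.
\end{aligned}$$

The two middle terms agree in both lines. Equating the rest gives $\lbrace u,w\rbrace (xv-vx) = (wu-uw)\lbrace v,x\rbrace$, and multiplying both sides by $-1$ yields $\lbrace u,w\rbrace [v,x]=[u,w]\lbrace v,x\rbrace$.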

Therefore in the quantum case, we expect the Lie bracket to be proportional to the classical Poisson bracket. The Poisson bracket of (classical) position and momentum (in the same direction) is $\lbrace q,p\rbrace=1$. Therefore we want the Lie bracket of (quantum) position and momentum to be proportional to $1$, i.e. a constant. Since position and momentum are to be hermitian, and the Lie bracket of two hermitian operators is anti-hermitian, that constant has to be purely imaginary, i.e. of the form $i\hbar$ for some real number $\hbar$.

Thus we want to define $q$ and $p$ in such a way that $[q,p]=i\hbar$.

A simple way to accomplish this is to let $q$ be multiplication by $x$ and let $p$ be $-i\hbar d/dx$. This is not the only possible choice, but it works.
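As a quick sanity check of this choice, here is a small sketch that applies $q$ and $p$ to polynomials and verifies $[q,p]f=i\hbar f$. The assumptions are loud ones: polynomials are not square-integrable, so this only tests the algebra and not the Hilbert-space setting; units with $\hbar=1$ are used; and the function names are hypothetical. Polynomials are stored as lists of complex coefficients, lowest degree first.

```python
HBAR = 1.0  # assumption: units in which hbar = 1

def q(f):
    """Multiply the polynomial f by x (shift coefficients up one degree)."""
    return [0j] + list(f)

def p(f):
    """Apply -i*hbar*d/dx to the polynomial f."""
    derivative = [-1j * HBAR * n * c for n, c in enumerate(f)][1:]
    return derivative or [0j]

def sub(f, g):
    """Coefficient-wise difference f - g, padding the shorter list with zeros."""
    n = max(len(f), len(g))
    f = list(f) + [0j] * (n - len(f))
    g = list(g) + [0j] * (n - len(g))
    return [a - b for a, b in zip(f, g)]

def commutator_qp(f):
    """Return (q p - p q) f."""
    return sub(q(p(f)), p(q(f)))

f = [2 + 0j, -1j, 0j, 3 + 0j]        # f(x) = 2 - i x + 3 x^3
lhs = commutator_qp(f)               # [q, p] f
rhs = [1j * HBAR * c for c in f]     # i * hbar * f

print(lhs)
print(rhs)
```

Note that no pair of finite matrices can satisfy $[q,p]=i\hbar$, since the trace of a commutator vanishes while the trace of $i\hbar$ times the identity does not; this is why the check is done on functions rather than on matrices.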

WillO
-1

Finally, in the more general setting of an abstract state space of kets, which are not necessarily functions, how does one define the operators of interest? How does one find out what the operator corresponding to some physical quantity should be?

Physicists have found the theoretical format of quantum mechanics by trial and error, and mainly, as described in WillO's answer, by analogies with the mathematics of classical mechanics. dmckee stresses in the comments that in physics it is what "works" that leads the way to mathematics, and that is true.

Once one enters the realm of mathematics it is well known that many different formalisms can describe the same thing. (Think of all those expansions in series from the infinity of complete sets that could be devised). As far as physics is concerned the usefulness of the bra and ket formalism and the creation and annihilation operators existing as an operator field in space, rests on its correspondence with the Feynman diagrams which have simplified the calculations which are necessary to get numbers to compare with experiment. Physics is about observables/measurements.

So the operators of interest are the good old ones used to construct, by analogy with classical physics, the quantum mechanical equations (Schrödinger, Dirac, Klein-Gordon) which will give the basic wavefunction represented in the bras and kets. Then the Feynman rules for devising the integral for the process under consideration complete the necessary mathematical format for getting numbers to compare with observables.

There are people who believe that mathematics forms nature, and not that mathematics is a tool to describe nature. This view belongs to philosophy and not to physics; physics is not about beliefs. It started with the "ideas" of Plato and the Pythagorean "music of the spheres", but it is not physics.

anna v