What's wrong with the square root version of the Klein-Gordon equation?

Question

The Wikipedia article has a derivation of the Klein-Gordon equation. It gets to this step:

$$\sqrt{\textbf{p}^2 c^2 + m^2 c^4} = E$$

and inserts the QM operators to get

$$\left( \sqrt{ (-i \hbar \nabla)^2 c^2 + m^2 c^4 } \right) \psi = i\hbar\frac{\partial}{\partial t} \psi$$

The article then says

This, however, is a cumbersome expression to work with because the differential operator cannot be evaluated while under the square root sign. In addition, this equation, as it stands, is nonlocal.

To fix this, the first equation is squared instead to get

$$\textbf{p}^2 c^2 + m^2 c^4 = E^2$$

after which the QM operators are inserted and the expression is simplified to get

$$-\hbar^2 c^2 \nabla^2 \psi + m^2 c^4 \psi = -\hbar^2 \frac{\partial^2}{\partial t^2} \psi$$

A couple things I don't understand. First, are the solutions to this differential equation not exactly the same as the solutions to the first differential equation? Both sides of the starting equation were squared, so it seems to me that regardless of the particular form of the resulting differential equation, both of them should have the exact same set of solutions.

Secondly, why is the first differential equation cumbersome to work with? It seems like it would in fact be easier to work with, since the operator under the square root could be expanded in terms of a Taylor series and then you have an equation that is first order in time.

And finally, can someone elaborate on what nonlocal means? The linked article on the Wikipedia page didn't entirely help me understand it.

Squaring the two sides of an equation is not an equivalence operation, not even for simple algebraic equations (that's high school math) because $x=1$ has one solution, but $x^2=1^2$ has two. That an algebraic function of a differential operator can not be evaluated in some sense is nonsense and Wikipedia shouldn't say that. The problem is that the evaluation of that root would lead to an infinite series of differential operators which, under certain circumstances, can be expressed as a more general operator of the integral type... and that is where the non-locality will probably appear. — CuriousOne, Jan 02 '15 at 04:42
As you pointed out, since squaring two sides of an equation is not an equivalence operation, in what sense then is it valid to do that with the above? — Nick, Jan 02 '15 at 04:55
In the sense that physics is not math. Nature doesn't care what kind of mental process you use to guess the correct equation, as long as the equation has solutions that are somewhat correct. The derivation of the Klein-Gordon equation is simply guessing by taking something that makes no sense as an equation and kneading it into the shape of something that makes a tad more sense. That's all there really is behind this "mystery". — CuriousOne, Jan 02 '15 at 04:58
So essentially, this whole derivation involving squaring is just a heuristic to come up with an equation that is then checked with experiment to see how well it matches up? Is that why the Dirac equation is an "alternative" relativistic form of the Schrodinger equation? — Nick, Jan 02 '15 at 05:01
You got it. All equations of motion are the result of heavy guesswork. They are not arbitrary, of course, after all, we want nice properties like energy conservation, causality, invariance under symmetry groups, existence of a ground state, stability... the list of necessary technical criteria is probably quite extensive. I don't think that it's correct to say that Dirac is a relativistic version of Schroedinger. It's one relativistic equation among many others and it fits historically into the time frame, but we still keep inventing new relativistic equations... and none of them are perfect. — CuriousOne, Jan 02 '15 at 05:07
And the most "perfect" version is just QFT which doesn't even really have a Schrodinger equation, right? — Nick, Jan 02 '15 at 05:10
Depends what you mean by "Schroedinger equation". Theoretically one can package QFT into something that formally looks like the Schroedinger equation (but it's not THE Schroedinger equation for single particles) and to the best of my knowledge that "form" is not that useful for actual calculations. I think the more problematic cases of quantum field theories are the ones that don't even have a simple equation (at least, yet). One may, for instance, have a pretty good handle on perturbative properties of a system without even knowing the explicit Lagrangian. — CuriousOne, Jan 02 '15 at 05:24
@CuriousOne, "they say" the functional Schrodinger equation is useful for certain perturbative calculations, but -- as you hint -- its usefulness may be dubious (see, e.g., Brian Hatfield's Quantum Field Theory of Point Particles and Strings chapter on the functional Schrodinger picture for when it's useful in perturbative calculations). — Alex Nelson, Jan 02 '15 at 05:44
@AlexNelson: I am not a theorist and I really can't say anything useful about how to do calculations in detail. I think there is some agreement that all of the current formulations of QFT are still suffering from severe mathematical problems, but given how successful the theory is, even at this state, it can't be completely ill-defined in my opinion. There has to be some version of it that cleans up all the problems and is essentially identical in its solution space to what we are doing now. — CuriousOne, Jan 02 '15 at 05:49

Alex Nelson · Answer 1 · 2015-01-02T18:06:16.790

Secondly, why is the first differential equation cumbersome to work with? It seems like it would in fact be easier to work with, since the operator under the square root could be expanded in terms of a Taylor series and then you have an equation that is first order in time.

Well, the Taylor series for an operator expression only really makes sense if it converges everywhere (e.g., $\exp(\partial_{x})$ makes sense as a series expression)...modulo technical details.

The square root doesn't have a nice series expression for an arbitrary operator...though it can still be defined for normal operators (via the spectral theorem).
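To make the convergence issue concrete, here is the formal expansion, as a sketch that treats $\mathbf{p}^{2}$ as a number (which is only legitimate mode by mode in momentum space):

$$ \sqrt{\mathbf{p}^{2}c^{2}+m^{2}c^{4}} = mc^{2}\sqrt{1+\frac{\mathbf{p}^{2}}{m^{2}c^{2}}} = mc^{2} + \frac{\mathbf{p}^{2}}{2m} - \frac{\mathbf{p}^{4}}{8m^{3}c^{2}} + \cdots $$

The binomial series for $\sqrt{1+x}$ converges only for $|x|\le 1$, so this expansion is only meaningful for modes with $|\mathbf{p}|\le mc$. A general wave function contains larger momenta, and there the series diverges.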

So what happens with this square root version of the KG equation? We just take the Fourier transform of the expression

$$ \int (k^{2}-m^{2})\,\hat{f}(k)e^{ikx}\,\mathrm{d}^{4}k=0. $$

Then observe that the operator has become multiplication by $(k^{2}-m^{2})$. So, hey, presto, take its square root! We get

$$ \int \sqrt{k^{2}-m^{2}}\;\hat{f}(k)e^{ikx}\,\mathrm{d}^{4}k=0. $$

Then...well, then it's a pain to work with. Why? Because all our lovely tools from linear algebra don't really work too well. My next tool, profanity, doesn't produce many results either :\
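For what it's worth, the "multiply by the square root in Fourier space" recipe is easy to try numerically. The following is only a rough sketch under simplifying assumptions that are mine, not part of the argument above: one spatial dimension, a periodic grid, units with $\hbar = c = 1$, and the purely spatial operator $\sqrt{-\partial_x^2 + m^2}$ rather than the four-dimensional expression. The function name `sqrt_kg_operator` is made up for illustration.

```python
import numpy as np

def sqrt_kg_operator(f, dx, m=1.0):
    """Apply sqrt(-d^2/dx^2 + m^2) to samples f on a periodic grid (hbar = c = 1)."""
    k = 2.0 * np.pi * np.fft.fftfreq(f.size, d=dx)  # angular wavenumbers of the grid
    symbol = np.sqrt(k**2 + m**2)                   # Fourier multiplier of the operator
    return np.fft.ifft(symbol * np.fft.fft(f)).real

# Sample a Gaussian on a box wide enough that it is effectively periodic.
n, length = 1024, 40.0
dx = length / n
x = (np.arange(n) - n // 2) * dx
f = np.exp(-x**2)

# Applying the square root twice should reproduce (-d^2/dx^2 + m^2) f with m = 1.
g = sqrt_kg_operator(sqrt_kg_operator(f, dx), dx)
f_xx = (4.0 * x**2 - 2.0) * f                       # exact second derivative of exp(-x^2)
residual = np.max(np.abs(g - (-f_xx + f)))
print(residual)
```

So the operator is perfectly well defined as a Fourier multiplier. The catch is that it is no longer a finite combination of derivatives, which is where the pseudodifferential machinery comes in.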

Addendum: I thought I ought to add some links on this, because there are people researching it. (The method I sketched amounts to treating the square root of the Klein-Gordon equation using pseudodifferential operators.)

Claus Lämmerzahl, "The pseudodifferential operator square root of the Klein–Gordon equation". J. Math. Phys. 34 (9) (1993), 3918-3932; doi:10.1063/1.530015
J. Sucher, "Relativistic Invariance and the Square‐Root Klein‐Gordon Equation". J. Math. Phys. 4 (1963), 17; doi:10.1063/1.1703882

I'm sure from there, you can follow the references to wherever you want.

And finally, can someone elaborate on what nonlocal means? The linked article on the Wikipedia page didn't entirely help me understand it.

As I understand it (and someone will probably correct me if I am wrong), generically, it means the field at one point depends on its value at other spatially separated points. It borks up our intuitive understanding of cause and effect.

If we have infinitely many derivatives, we get this problem. Why?

Well, consider a special case: the Taylor expansion. We have

$$ f(x+h) = f(x) + hf'(x) +\cdots = \exp(h\partial_{x})f(x) $$

where $\exp(h\partial_{x})=1 + h\partial_{x} + \cdots$ is an expression involving infinitely many derivatives. We then get a relation between values at two distinct points ($x$ and $x+h$).

More generally, we could consider any operator involving infinitely many derivatives, not just $\exp(h\partial/\partial x)$.
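The shift identity can be checked numerically. This is a hedged sketch, assuming a periodic grid and a smooth periodic test function of my own choosing: since $\partial_x$ acts on $e^{ikx}$ as multiplication by $ik$, the operator $\exp(h\partial_x)$ acts as multiplication by $e^{ikh}$ in Fourier space. The helper name `shift_via_exp_derivative` is hypothetical.

```python
import numpy as np

def shift_via_exp_derivative(f, dx, h):
    """Apply exp(h d/dx) on a periodic grid; its Fourier symbol is exp(i k h)."""
    k = 2.0 * np.pi * np.fft.fftfreq(f.size, d=dx)  # angular wavenumbers of the grid
    return np.fft.ifft(np.exp(1j * k * h) * np.fft.fft(f)).real

# Smooth 2*pi-periodic test function sampled on [0, 2*pi).
n = 512
dx = 2.0 * np.pi / n
x = np.arange(n) * dx
h = 0.7                                             # shift, not a multiple of dx

f = np.exp(np.sin(x))
shifted = shift_via_exp_derivative(f, dx, h)

# Compare against the function evaluated directly at the displaced points.
error = np.max(np.abs(shifted - np.exp(np.sin(x + h))))
print(error)
```

The output at $x$ is built from the value of $f$ at $x+h$, so the "infinitely many derivatives" operator really does reach over to a different point.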

Very helpful answer. Your last example -- would that imply that the series expansion (in matrix representation) of the time evolution operator U = exp(-iHt/ħ) is nonlocal as well, if H involves the gradient? — Nick, Jan 02 '15 at 05:00
@Nick, in some sense, I suppose you could think of the unitary time evolution as "nonlocal in time" relating the state at time $t_{0}$ with the state at time $t+t_{0}$. This is not terrible, it's allowed by physics. The problem is when you have nonlocality violate the condition $[\varphi(\mathbf{x}), \varphi(\mathbf{y})]=0$ for spacelike $\mathbf{x}$, $\mathbf{y}$. — Alex Nelson, Jan 02 '15 at 05:10

score 1 · Answer 2 · answered Sep 10 '18 at 15:55

On the left-hand side we have a square root of an operator. It is possible to take the square root of an operator (i.e., the square root of a matrix), and there is a body of theory in linear algebra and spectral theory related to this possibility, but the question is how to interpret this physically, since a matrix has multiple square roots which are themselves matrices.

One possible interpretation is to make a Taylor series expansion, as you rightly say, but we then get a Hamiltonian with derivatives of arbitrarily high order. The two standard approaches are either to square both sides and obtain the Klein-Gordon equation, or to propose a Hamiltonian which is linear in the momentum and whose square equals the relativistic energy-momentum relation: this leads to the Dirac equation. If you take the latter approach, a solution of the equation is no longer a single scalar function but has to have four components.
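As a brief sketch of that second route in the standard notation (the symbols $\boldsymbol{\alpha}$ and $\beta$ are not defined elsewhere on this page), one looks for a Hamiltonian linear in the momentum whose square reproduces the energy-momentum relation:

$$ H = c\,\boldsymbol{\alpha}\cdot\mathbf{p} + \beta mc^{2}, \qquad H^{2} = \mathbf{p}^{2}c^{2} + m^{2}c^{4} $$

which requires

$$ \alpha_{i}\alpha_{j} + \alpha_{j}\alpha_{i} = 2\delta_{ij}, \qquad \alpha_{i}\beta + \beta\alpha_{i} = 0, \qquad \beta^{2} = 1. $$

Ordinary numbers cannot satisfy these anticommutation relations, and in three space dimensions the smallest matrices that do are $4\times 4$, which is why the wave function acquires four components.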
