Why is there a minus in the definition of the Minkowski spacetime interval?

Question

The spacetime interval is defined as follows:

$$\Delta s^2 = -(c\Delta t)^2 + \Delta x^2 + \Delta y^2 + \Delta z^2$$

or in tensor notation:

$$\Delta s^2 = \eta_{\mu\nu} \Delta x^\mu \Delta x^\nu$$

When I first studied introductory special relativity, I didn't even pay much attention to this quantity -- it was mostly time dilation, length contraction, and fancy paradoxes.

However, it has caught my attention now. The book I'm reading simply defines the quantity, and claims that it's invariant.

Now, just from tensor analysis and ignoring special relativity, $\eta_{\mu\nu} \Delta x^\mu \Delta x^\nu$ looks like a contracted product of a doubly covariant tensor with two contravariant vectors, which mathematically shows that it's an invariant. Great!

But what I do not understand is why the spacetime invariant is defined the way it is. Why is it $-(c\Delta t)^2 + \Delta x^2 + \Delta y^2 + \Delta z^2$, and not $(c\Delta t)^2 + \Delta x^2 + \Delta y^2 + \Delta z^2$?

I want the physical motivation behind this formula.

See my answer in https://physics.stackexchange.com/questions/779000/where-does-the-negative-signature-case-come-from-in-the-pythagorean-derivation-o which uses the Bondi k-calculus and radar measurements to motivate the appearance of the minus-sign. — robphy, Sep 05 '23 at 22:25

Turbotanten · Answer 1 · 2017-01-13T12:34:15.643

Leonard Susskind, professor at Stanford University, has an excellent explanation of why the space-time invariant is defined the way it is. I've included a video link from where he begins to talk about the subject, if you'd like to watch it.

He compares space-time to Euclidean geometry, where the Pythagorean theorem says that the squared distance between two points is the sum of the squares of the coordinate differences in your coordinate system, i.e. $c^2= a^2+b^2$. This quantity is invariant: we could rotate our coordinate system and describe the same points in the new coordinate system with primed coordinates, and we would then have the following invariant quantity between our new and old coordinates, $a^{\prime 2}+b^{\prime 2} = a^2 + b^2$. Similarly, in space-time we look for an invariant quantity that all observers in different reference frames will agree upon. If we begin with the Lorentz transformation ($c=1$), we have

$$ x^\prime = \frac{x-vt}{\sqrt{1-v^2}}, \quad t^\prime = \frac{t-vx}{\sqrt{1-v^2}} $$

Let us look for an invariant quantity. We could begin by trying $t^{\prime 2}+x^{\prime 2} = t^2 + x^2$. Substituting $x^\prime$ and $t^\prime$ gives $t^{\prime 2}+x^{\prime 2} = \frac{(1+v^2)(t^2+x^2) - 4vxt}{1-v^2}$, which is not equal to $t^2 + x^2$ in general, so this is not an invariant quantity in space-time. But if we try $t^{\prime 2}-x^{\prime 2}$ and follow the same procedure, the cross terms cancel and we get $t^{\prime 2}-x^{\prime 2} = \frac{(1-v^2)(t^2-x^2)}{1-v^2} = t^2 - x^2$, so this is an invariant quantity!
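The substitution above can also be checked numerically. The following is only a minimal sketch in Python, assuming $c=1$ and a boost along $x$; the function name `boost` and the sample values of `t`, `x` and `v` are illustrative choices, not something taken from Susskind's lecture.

```python
import math

def boost(t, x, v):
    """Lorentz boost along x with c = 1 (hypothetical helper for illustration)."""
    gamma = 1.0 / math.sqrt(1.0 - v * v)
    return gamma * (t - v * x), gamma * (x - v * t)

# Arbitrary illustrative event and relative speed (|v| < 1 in units of c).
t, x, v = 2.0, 1.0, 0.6
t_p, x_p = boost(t, x, v)

# Candidate 1: the Euclidean-style sum of squares.
euclidean_before = t**2 + x**2
euclidean_after = t_p**2 + x_p**2

# Candidate 2: the difference of squares.
minkowski_before = t**2 - x**2
minkowski_after = t_p**2 - x_p**2

print("sum of squares preserved:       ", math.isclose(euclidean_before, euclidean_after))
print("difference of squares preserved:", math.isclose(minkowski_before, minkowski_after))
```

For these sample values the sum of squares changes under the boost while the difference of squares does not. A single numerical check does not replace the algebra, but it is a quick way to catch a sign error.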

Great! My book just defined the space-time interval and then stated that Lorentz transformations are those which keep this interval invariant. Doing it the other way feels more intuitive to me. — Silver, Jan 13 '17 at 11:26
Do you think you could include Susskind's arguments in the post here? Otherwise, this would be considered a "link-only" answer and would likely be deleted. — Kyle Kanos, Jan 13 '17 at 12:16
You can do it either way round. Most students prefer Lorentz transformation first; then when you get deeper in the subject, especially if you go to general relativity, then you begin to fall in love with metrics and geometry, and then you put the interval first. — Andrew Steane, May 05 '19 at 14:12

score 8 · Answer 2 · answered May 05 '19 at 13:00

Here are two different ways to introduce the subject of Special Relativity. Both are good ways, and each can be used to derive the other.

Approach 1: symmetry principles. We assert the Relativity Postulate (physical behaviour is the same relative to any inertial frame, no matter the state of motion of that inertial frame relative to others) and the Speed of Light Postulate (there is a finite maximum speed for signals). From these we can derive the Lorentz transformation and hence which quantities are invariant. The spacetime interval is one such quantity.

Approach 2: geometric assertions about spacetime. We assert that spacetime is a smooth differentiable manifold with a Minkowskian metric $\eta$. The metric is itself a statement of that which is invariant; the Lorentz transformation $\Lambda$ is then defined as that class of transformations which satisfies $$ \Lambda^T \eta \Lambda = \eta $$ (Here I have used matrix notation in which $T$ is a transpose and $\eta$ has components $\eta_{ab}$.)
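As a rough illustration of the condition $\Lambda^T \eta \Lambda = \eta$, here is a sketch in plain Python restricted to the $(t, x)$ plane with $c=1$ and $\eta = \mathrm{diag}(-1, 1)$. The helper names (`matmul`, `transpose`, `boost_matrix`, `rotation_matrix`, `preserves_metric`) and the tolerance are assumptions made for this example only.

```python
import math

def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))]
            for i in range(len(a))]

def transpose(a):
    """Transpose a matrix given as a list of rows."""
    return [list(row) for row in zip(*a)]

def boost_matrix(v):
    """Lorentz boost in the (t, x) plane with c = 1."""
    gamma = 1.0 / math.sqrt(1.0 - v * v)
    return [[gamma, -gamma * v],
            [-gamma * v, gamma]]

def rotation_matrix(theta):
    """Ordinary Euclidean rotation, here applied to the (t, x) plane."""
    return [[math.cos(theta), -math.sin(theta)],
            [math.sin(theta), math.cos(theta)]]

ETA = [[-1.0, 0.0], [0.0, 1.0]]       # Minkowski metric, signature (-, +)
IDENTITY = [[1.0, 0.0], [0.0, 1.0]]   # Euclidean metric, for comparison

def preserves_metric(lam, metric, tol=1e-12):
    """Return True if lam^T * metric * lam equals metric within tol."""
    result = matmul(matmul(transpose(lam), metric), lam)
    return all(abs(result[i][j] - metric[i][j]) <= tol
               for i in range(len(metric))
               for j in range(len(metric[0])))

print(preserves_metric(boost_matrix(0.6), ETA))          # boost vs Minkowski metric
print(preserves_metric(rotation_matrix(0.3), ETA))       # rotation vs Minkowski metric
print(preserves_metric(rotation_matrix(0.3), IDENTITY))  # rotation vs Euclidean metric
```

The boost satisfies the condition for $\eta$, whereas a rotation that mixes $t$ and $x$ preserves only the Euclidean metric. In this sense the choice of metric fixes the class of allowed transformations.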

Your question is closest in spirit to approach 2. The question then becomes, "why the Minkowski metric? Why not some other metric?" The answer goes to the heart of what kind of universe we have. One can argue that if the metric were that of a 4-dimensional Euclidean space, for example, then there wouldn't be any sense in which time is different from space. This would amount to such a different state of affairs that it is hard even to describe it as a physical universe, where there can be conservation laws of the type that allow one to single out and label things by their worldlines. There would be no sense of limits to causality, of past and future. Other metrics you can consider, such as diag(-1,-1,1,1), also give a type of spacetime totally unlike the way things actually are in the universe as we find it.

So in so far as one can talk of a "physical motivation behind this formula" as you ask, it would be "well this is deeply and directly connected to the notion of causality and the causal structure of spacetime. It also expresses the notion that one spatial direction is as good as another in the basic structure of spacetime."

score 5 · Answer 3 · edited Jan 11 '19 at 13:35

Let's define an event $A$: light is emitted from a source at the origin at $t=0$. At time $t$ the light reaches a point $B$ with spatial coordinates $(x, y, z)$. The distance travelled by the light in time $t$ then satisfies $$c^2t^2=x^2+y^2+z^2,$$ that is, $$c^2t^2-x^2-y^2-z^2=0.$$

Consider another frame $X'$ which is moving at a velocity $v$. Observing the same events from this frame in Minkowski spacetime gives $$c^2t'^2=x'^2+y'^2+z'^2,$$ that is, $$c^2t'^2-x'^2-y'^2-z'^2=0.$$
We call this quantity the interval. This shows that if the interval is zero in one frame of reference, it must be zero in all inertial frames of reference, since light has the same speed in all inertial frames (a postulate of special relativity). Thus the intervals between any two events in the $X$ and $X'$ coordinates are proportional to each other. Suppose that, for an observer in the $X$ frame, the $X'$ frame is moving at velocity $v$. Then the proportionality is given by $$c^2t^2-x^2-y^2-z^2=\alpha (c^2t'^2-x'^2-y'^2-z'^2),$$ where $\alpha$ depends only on the magnitude of the velocity; otherwise it would contradict the isotropic nature of space. For an observer in the $X'$ frame, the $X$ frame is moving at a velocity of the same magnitude $v$, so the proportionality is given by $$c^2t'^2-x'^2-y'^2-z'^2=\alpha(c^2t^2-x^2-y^2-z^2).$$ Substituting this into the equation above gives $$c^2t^2-x^2-y^2-z^2=\alpha^2(c^2t^2-x^2-y^2-z^2),$$ so $\alpha^2=1$. Since $\alpha$ must equal $1$ at $v=0$, where the two frames coincide, and must vary continuously with $v$, this gives $\alpha=1$. Thus the interval $$c^2t'^2-x'^2-y'^2-z'^2=c^2t^2-x^2-y^2-z^2$$ is invariant under Lorentz transformations.

The constancy of the speed of light and the isotropic nature of space define the invariant interval.

JMLCarter · Answer 4 · 2017-01-13T11:21:07.420

Because it lets you quickly recognise the potential for a causal relation between the two events. So, writing $s^2$ for your $\Delta s^2$ (that is, dropping the extra $\Delta$ you added before the $s$):

$s^2>0$ (space-like): more space in between than light can cross in the time => no causal relation
$s^2=0$ (light-like): exactly on the "light cone"
$s^2<0$ (time-like): less space in between than light can cross in the time => a causal relation is possible
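The three cases above can be sketched as a small Python function. This is only an illustration, assuming the -+++ sign convention from the question; the names `interval_squared` and `classify`, the default `c=1.0`, and the `tol` parameter are hypothetical choices for this example.

```python
def interval_squared(dt, dx, dy, dz, c=1.0):
    """Squared spacetime interval with the -+++ sign convention."""
    return -(c * dt) ** 2 + dx ** 2 + dy ** 2 + dz ** 2

def classify(dt, dx, dy, dz, c=1.0, tol=0.0):
    """Label the separation between two events by the sign of the interval.

    tol is an assumed tolerance for treating tiny values as zero; it is not
    part of the physics, only a guard against floating-point rounding.
    """
    s2 = interval_squared(dt, dx, dy, dz, c)
    if s2 > tol:
        return "space-like"   # light cannot cross the gap in time: no causal relation
    if s2 < -tol:
        return "time-like"    # light can cross the gap with time to spare
    return "light-like"       # exactly on the light cone

print(classify(1.0, 2.0, 0.0, 0.0))  # more distance than light covers in dt
print(classify(1.0, 1.0, 0.0, 0.0))  # exactly what light covers in dt
print(classify(2.0, 1.0, 0.0, 0.0))  # less distance than light covers in dt
```

With the opposite +--- convention the two inequality tests would swap, but the classification of any given pair of events would stay the same.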

It's the way light works "against space" or rather travelling through space that is being modelled. It's not the same as the magnitude of a vector measurement of distance.
Obviously you could give $(c\Delta t)^2$ the same sign as the distances. The resultant quantity, whilst having some uses, would just be a vector in space-time, it would not have the same (or as much, in my view) physical significance... and crucially definitely does not get to be called "space-time interval".

Further, note that there is a choice about whether to use -+++ or +--- signs for the terms in the equation; the choice of -+++ here is just a matter of convention.

(What's really cool is that it can be shown that s is preserved under the Lorentz transform; proving that causality cannot be affected simply by changing your frame of reference. Neat.)

I don’t see how it could have any uses if you use the physically incorrect sign in Minkowski metric. What use could we get in Euclidean geometry from Pythagoras with the wrong sign? — blanci, Nov 08 '20 at 14:22

score 1 · Answer 5 · answered Nov 08 '20 at 09:44

Answering the original question as to why the space time interval is “defined” with the opposite sign on the time component compared to the space part: as Feynman said, you must be careful with definitions in physics. (Feynman was talking about defining mass as force over acceleration, which would mean Newton could never be proved wrong, as the law would always hold by definition! Of course $F = ma$ needs to be a result or an observation about the physical universe, and it turns out to be only an approximation.)

The space time interval is a physical law and an observation about the universe and subject to direct and indirect measurements and exceedingly well verified by lots of experiments. Better not to think of it as a mysterious definition; it is an incredible discovery about the physical universe. We cannot expect too much in the way of “reasons” or definitions to support a fundamental physical law, though some hints are mentioned in the other answers posted.

Similarly, Pythagoras' theorem in physics is not a definition but an observation about space, and it turns out to apply only approximately to the physical world, as can be checked by measurements. It fails near black holes and very slightly everywhere, though the deviation is unnoticeable except with special techniques (some indirect consequences can be argued to be observed in everyday phenomena).

Personally, I think the spacetime interval can be presented first, simply because even a high schooler familiar with Pythagoras can glean a little from it, whereas the Lorentz transformations can be grasped only after having studied the rotation matrix.

"The space time interval is a physical law and an observation about the universe and subject to direct and indirect measurements and exceedingly well verified by lots of experiments." Hi, off the top of your head can you name a couple of experiments that verified the spacetime interval and how that experiment itself made that verification? — Tivity, Jun 20 '22 at 22:22

Dr Maurya Dinesh · Answer 6 · 2021-04-28T17:43:46.420

Perhaps no one has given the correct answer, so here is the required answer.

The speed of light, or of anything, need not be constant, and there is no need for 4D space (keep these in mind). Einstein never revealed his derivation of the interval; otherwise everyone would have known how simple relativity is. Instead, everything was made mysterious. In one reference frame the interval is (it is now deducible; earlier it was a third postulate, and people are still living in this third-postulate world):

$- (\Delta S)^2 = c^2 (\Delta t)^2 - (\Delta x)^2 - (\Delta y)^2 - (\Delta z)^2$

where $\Delta S = \dfrac{v\Delta \ell}{c}$, $\Delta \ell = \hat{i}\Delta x+\hat{j}\Delta y+\hat{k}\Delta z$, and $v$ is the relative speed of the object; the rest is clear. The above strange definition of the interval is derivable through the total relativistic momentum. Now, for the first time, you will be able to understand it.

Thanks,
Dr. Maurya Dinesh
