
The spacetime interval plays a central role in Special Relativity. It allows us to classify the separation between any two events as spacelike, timelike or lightlike and, more importantly, the Lorentz transformations can be defined as the transformations which keep the spacetime interval fixed.

In that sense, as we know, lengths and time intervals are themselves observer dependent. They are not absolute notions. The spacetime interval, on the other hand, is an absolute notion.

Now, given events $(t_1,x_1,y_1,z_1)$ and $(t_2,x_2,y_2,z_2)$, its definition is:

$$I = -c^2 \Delta t^2+\Delta x^2+\Delta y^2+\Delta z^2,$$

so that it is the difference between the squared spatial separation of the events and the squared distance that light travels in the time between them.
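
To make the definition and the spacelike/timelike/lightlike classification concrete, here is a minimal Python sketch. The function names `interval` and `classify`, the tolerance `tol` and the choice of units with $c=1$ are illustrative assumptions, not part of any standard library.

```python
def interval(event1, event2, c=1.0):
    """Spacetime interval I = -c^2 dt^2 + dx^2 + dy^2 + dz^2 between two events.

    Each event is a tuple (t, x, y, z). The default c=1.0 is an
    illustrative choice of units, not something fixed by the text.
    """
    dt, dx, dy, dz = (b - a for a, b in zip(event1, event2))
    return -(c * dt) ** 2 + dx ** 2 + dy ** 2 + dz ** 2


def classify(event1, event2, c=1.0, tol=1e-12):
    """Label the separation using the sign convention of the formula above."""
    i = interval(event1, event2, c)
    if abs(i) <= tol:  # tol is an assumed numerical tolerance for "I = 0"
        return "lightlike"
    return "timelike" if i < 0 else "spacelike"


origin = (0.0, 0.0, 0.0, 0.0)
print(classify(origin, (2.0, 1.0, 0.0, 0.0)))  # timelike: I = -3
print(classify(origin, (1.0, 1.0, 0.0, 0.0)))  # lightlike: I = 0
print(classify(origin, (1.0, 2.0, 0.0, 0.0)))  # spacelike: I = 3
```

With this sign convention, $I<0$ means light had time to cover more than the spatial separation, so a signal slower than light could connect the two events.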

My problem here is that if we try to construct special relativity following the historical procedure there is a certain gap when introducing the spacetime interval.

We can usually go on like this: we start by reviewing the problems in electrodynamics which motivated the theory, as Einstein himself stated in his paper. After that, we can follow Einstein's procedure and, starting with the postulates, derive the relativity of simultaneity, length contraction and time dilation. From that we are able to get the Lorentz transformations.

The next natural step is to give more mathematical substance to this construction and start labeling events with elements of $\mathbb{R}^4$, so that we finally get to the idea of spacetime. The problem is that at this point one usually just defines this formula for $I$, shows that it is preserved by the Lorentz transformations and shows that it allows us to classify the separation between events.
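
For reference, the invariance check mentioned here is short. A sketch for a boost with velocity $v$ along $x$, assuming the standard form of the transformation with $\gamma = 1/\sqrt{1-v^2/c^2}$ (written out here as an assumption, since it is not spelled out above):

$$\Delta t' = \gamma\left(\Delta t - \frac{v}{c^2}\Delta x\right),\qquad \Delta x' = \gamma\left(\Delta x - v\,\Delta t\right),\qquad \Delta y'=\Delta y,\qquad \Delta z'=\Delta z,$$

$$\begin{aligned} -c^2\Delta t'^2 + \Delta x'^2 &= \gamma^2\left[-c^2\Delta t^2 + 2v\,\Delta t\,\Delta x - \frac{v^2}{c^2}\Delta x^2 + \Delta x^2 - 2v\,\Delta x\,\Delta t + v^2\Delta t^2\right] \\ &= \gamma^2\left(1-\frac{v^2}{c^2}\right)\left(-c^2\Delta t^2 + \Delta x^2\right) \\ &= -c^2\Delta t^2+\Delta x^2 . \end{aligned}$$

The cross terms cancel and the remaining factor $\gamma^2(1-v^2/c^2)$ equals one, so $I'=I$ once the unchanged $\Delta y^2+\Delta z^2$ is added back.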

What I want is to be able to motivate why we introduce the spacetime interval. It is just a certain object with certain properties, but how do we motivate its importance in the context of relativity and how do we motivate its definition?

After all, as far as I know, relativity was originally trying to resolve the inconsistency between Newtonian Mechanics and Maxwell's Electrodynamics. The Lorentz transformations would seem to already do the job. How could one, in this context, motivate the definition of $I$?

Gold
  • Landau discusses this (with spherical electromagnetic waves) in his book. IMHO the best way to motivate $I$ is with the fact that the combination $a_0b_0-\boldsymbol a\cdot\boldsymbol b$ appears a lot in mechanics, electrodynamics, field theory, etc. (nice question by the way, I really want to read the answers) – AccidentalFourierTransform Apr 28 '16 at 15:53
  • Because we're teaching the relativity unit of modern physics out of Tacheuchi, I've been using a geometric motivation, but I think it still needs refining. Moreover, it really requires the decision to go with a geometric description of the physics from the beginning. – dmckee --- ex-moderator kitten Apr 28 '16 at 16:32
  • As far as I know there are two main approaches to special relativity. One, which is the original one, would be to start from the motivation coming from Electrodynamics and use Einstein's postulates to arrive at the Lorentz transformations. The other is the more geometrical one, based on proposing a spacetime $\mathbb{R}^4$ endowed with an inner product $\eta$ of signature $(1,3)$. In the second one, the Lorentz transformations are defined as the ones which preserve this inner product. For me there seems to be a considerable gap between the approaches; what I'm trying to do is fill that gap. – Gold Apr 28 '16 at 17:15
  • How has this question survived so long, when there is no clear answer to "importance" which is completely subjective. "How to motivate" invites an opinion based discussion against the rules of this forum. "Which is the best approach" is another subjective discussion. Why is this forum so random in the application and enforcement of its own rules? – KDP Mar 19 '24 at 15:43

2 Answers


$I$ has a clear physical meaning if $I\lt 0$, which covers a significant part of spacetime, so to say: $$ I = -c^2\Delta t_{\rm proper}^2 $$ where $\Delta t_{\rm proper}$ is the time measured by a clock that moves with constant velocity (without acceleration) and that visits the point $(x_1,y_1,z_1)$ at time $t_1$ and $(x_2,y_2,z_2)$ at time $t_2$.

Even though all inertial observers may observe and describe the clock above, the description is particularly simple in the reference frame of the clock itself, the rest frame. In that frame, $(x'_1,y'_1,z'_1)=(x'_2,y'_2,z'_2)$ and $I=-c^2 (t'_1-t'_2)^2$ where the primes indicate that I had to use different coordinates in that frame.
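
As a quick numerical check of this reading of $I$, here is a minimal Python sketch. The function name `proper_time`, the units with $c=1$ and the sample numbers are assumptions chosen for illustration; the comparison value $\Delta t/\gamma$ is the usual time-dilation formula.

```python
import math


def proper_time(event1, event2, c=1.0):
    """Proper time of an inertial clock passing through both events.

    Uses I = -c^2 * dtau^2, so it is only defined for timelike (I < 0)
    or lightlike (I = 0) separations.
    """
    dt, dx, dy, dz = (b - a for a, b in zip(event1, event2))
    i = -(c * dt) ** 2 + dx ** 2 + dy ** 2 + dz ** 2
    if i > 0:
        raise ValueError("spacelike separation: no clock can connect the events")
    return math.sqrt(-i) / c


# Hypothetical numbers: a clock moving at v = 0.6 c along x for 10 time units,
# described in a frame where it is NOT at rest.
v, dt = 0.6, 10.0
tau = proper_time((0.0, 0.0, 0.0, 0.0), (dt, v * dt, 0.0, 0.0))
gamma = 1.0 / math.sqrt(1.0 - v ** 2)
print(tau, dt / gamma)  # both are 8 up to rounding
```

The observer who sees the clock moving gets the clock's own elapsed time directly from $I$, without transforming to the rest frame first.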

So any inertial observer must be able to calculate the total duration shown by the clock which connects the two events in spacetime. The next question is: why is the formula in a general frame given by the Pythagorean formula with mixed signs?

Well, it's simple: the proper time measured on light-like trajectories must be zero, $I=0$. Why? Because under the transformation to another inertial system, the light only changes its frequency (by the Doppler shift) but it doesn't change the trajectory (world line) of the light given by $\vec r = \vec v\cdot t$ where $|\vec v|=c$.

So Einstein's postulate about the constancy of the speed of light says that if $I=0$ in one inertial system, then $I=0$ in other inertial systems, too. But $I=0$ holds exactly for the light-like trajectories, which satisfy $$ -c^2 \Delta t^2 + \Delta x^2+ \Delta y^2+\Delta z^2 = 0$$ which is equivalent to $|\Delta r|/\Delta t = c$, the right speed, as we mentioned. But that implies that $I$ has to be a function of $-c^2 \Delta t^2 + \Delta x^2+ \Delta y^2+\Delta z^2$ (a function that is zero if the argument is zero), and in the rest frame one may figure out the power and the coefficient $c^2$.

So the invariance of $I$ is basically equivalent to the principle of the constancy of the speed of light, because the latter is the special case of the invariance saying that if $I=0$, then $I'=0$, too. And for nonzero $I$, the invariance of $I$ may be seen by a symmetry argument, e.g. by looking at both observers from the viewpoint of a reference frame "in the middle" where both move with the same speed in opposite directions.
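
One way to spell out the symmetry argument, as a sketch rather than a full proof: assume the transformations are linear, so that the light-cone condition leaves only an overall factor $k$ undetermined, and assume isotropy, so that $k$ can depend only on the relative speed and not on its direction. The symbols $k$, $u$, $I_A$, $I_B$ and $I_M$ (the interval computed by the two observers and by the middle frame) are introduced here for illustration. Since both observers move with the same speed $u$ relative to the middle frame,

$$I_A = k(u)\,I_M, \qquad I_B = k(u)\,I_M \quad\Longrightarrow\quad I_A = I_B .$$

The unknown factor drops out, so the two observers agree on $I$ even when it is nonzero.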

For $I\gt 0$, the interval is spacelike, and spacelike world lines aren't allowed for physical objects. $I$ is then the squared proper spatial distance between the two events, although it's hard to measure directly (clocks can't be used because they cannot move superluminally). But because $I$ is an analytic function of $\Delta x^\mu$, it must be invariant when it's positive, too.

The comments about the "proper time" measured by the clocks are just a special example, tailored to $\Delta x^\mu \Delta x_\mu$. Similar 4-products of vectors are extremely important and natural for other choices of two 4-vectors, e.g. $p^\mu \cdot \Delta x_\mu$ for the phase of a de Broglie-like wave, $p^\mu p_\mu$ for the squared mass of a particle with some momentum, and many others. Because all the vectors such as $p^\mu$ transform like $\Delta x^\mu$, the invariance of the inner 4-products under the Lorentz transformations follows from the same algebra. But the detailed physical interpretation of all these 4-vectors and their inner products depends on which 4-vectors we consider.
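
The statement that the same algebra covers every such product can be checked numerically. Below is a minimal Python sketch, assuming units with $c=1$, the signature $(-,+,+,+)$ used in the question, and a boost along $x$; the helper names `minkowski` and `boost_x` and all sample values are hypothetical.

```python
import math


def minkowski(a, b):
    """Inner product with signature (-,+,+,+); components are (a0, a1, a2, a3)."""
    return -a[0] * b[0] + a[1] * b[1] + a[2] * b[2] + a[3] * b[3]


def boost_x(vec, beta):
    """Lorentz boost along x with velocity beta = v/c, applied to a 4-vector."""
    gamma = 1.0 / math.sqrt(1.0 - beta ** 2)
    a0, a1, a2, a3 = vec
    return (gamma * (a0 - beta * a1), gamma * (a1 - beta * a0), a2, a3)


# Hypothetical values: a displacement (t, x, y, z) and a 4-momentum (E, px, py, pz)
# for a particle of mass m, with E fixed by E^2 = m^2 + |p|^2.
dx = (5.0, 1.0, 2.0, 0.5)
m = 1.0
px, py, pz = 0.3, -0.4, 1.2
p = (math.sqrt(m ** 2 + px ** 2 + py ** 2 + pz ** 2), px, py, pz)

for beta in (0.0, 0.5, -0.9):
    dx_b, p_b = boost_x(dx, beta), boost_x(p, beta)
    print(beta, minkowski(dx_b, dx_b), minkowski(p_b, dx_b), minkowski(p_b, p_b))
# Each column stays the same for every beta (up to rounding); the last one is -m^2.
```

With this signature the squared mass appears as $p^\mu p_\mu = -m^2$ (times $c^2$ in ordinary units); the opposite signature flips the sign but not the invariance.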

Luboš Motl

The reason we need the interval is the metaphysics involved in the relativity of time. Once time is relative, “what is” is relative also, and “being”, which had always been thought of as inherently “absolute” and independent of the frame of reference, is no longer so.

We use being to indicate its distinction from mere appearance. But observation is “here”, Einstein realized, and “what is” is “over there”. And you can’t see what is directly “over there” without a signal. And a signal that is possibly not infinitely fast. In fact, there might be no infinitely fast signals that allow us to see what is “over there”.

That “what is” could be “relative” and at the same time “exist independent of frame of reference” appears to be a problem. It is this contradiction between “relative” and “absolute” that Einstein showed was only apparently a contradiction. If there are no infinitely fast signals there is no contradiction.

Einstein got around the problem by allowing “local” realization of absolute being. Someone can look at a clock locally, and from it, determine the time of a measurement.

This “local” is a problem. Our heads have a certain size, we are not like point particles or the thickness of a plane. We exist not in the limit. We are finite creatures. But Einstein noted we can neglect that when looking at short distances, like when looking at a clock in the room.

He realized that all local observers could see the same thing, whether they are moving or stationary, or better no matter which frame we think of them as being attached to.

All observers in this “local reprieve from relativity” can read instruments attached to any frame, meaning moving or not, and all colocated observers will see the same thing. So absoluteness is preserved.

Not “absolute” in the sense that the same values must appear on two colocated clocks moving relative to each other, but “absolute” in the sense that all will agree on what we see on any particular clock that is attached to any particular frame.

There really are no inconsistent observations locally. This gives an event structure, or point structure to observed physical events that is absolute and independent of frame of reference. He recovered the absoluteness of being that way.

To be very clear, imagine you and a clock are attached to a frame of reference that is moving relative to me and my clock. Imagine we are “colocated”, meaning that at this instant you and I are passing each other. I can see your clock, and I can see mine. You also can see your clock and you also can see mine. We can neglect the travel time of light because we are colocated. Both of us agree on all of what we see. Our observations are not relative to our frames of reference. They are absolute and form the empirical basis of the theory.

But when we assemble these events together with other events not colocated with us, when I assemble reports of what all of the observers attached to my frame of reference see at some time, when I assemble and look at that picture, it shows your meter sticks are shorter than mine, and when you do the same you see that my meter sticks are shorter than yours.

How can my meter stick be longer than yours and yours be longer than mine? Because the pictures we create are assembled from events that are not taken at the same time. You select the front of the stick and then wait and you then select the back of it. No wonder it is shorter. The back just moved toward the front. You say the same about me.
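
In terms of the interval, the same point can be written out. A sketch, assuming a stick of rest length $L_0$ at rest in your frame (primed coordinates) that moves with speed $v$ along $x$ relative to mine, and using the standard boost with $\gamma = 1/\sqrt{1-v^2/c^2}$; the symbols $L$, $L_0$ and $\gamma$ are introduced here for illustration. I mark both ends at the same time in my frame, so $\Delta t = 0$ and $\Delta x = L$:

$$\Delta x' = \gamma\left(\Delta x - v\,\Delta t\right) = \gamma L = L_0, \qquad \Delta t' = \gamma\left(\Delta t - \frac{v}{c^2}\Delta x\right) = -\frac{\gamma v L}{c^2} \neq 0,$$

$$L^2 = -c^2\Delta t'^2 + \Delta x'^2 = L_0^2 - \frac{\gamma^2 v^2 L^2}{c^2} \quad\Longrightarrow\quad L = \frac{L_0}{\gamma}.$$

The two marking events are simultaneous for me but not for you, and yet both of us compute the same interval between them.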

Which events are selected to determine what is, or was, at any time relative to my frame, are the ones that are simultaneous relative to a frame, meaning simultaneous as measured by clocks synchronized by assuming light travels at c relative to that frame. Same with you and me. Either of us can assume that light travels at c relative to either frame. Either of us, in any frame may calculate the correct picture of what things are relative to any other frame. We can first assume light travels at c relative to my frame, then sync clocks and select events from some time as read on the clocks. We then can assume that light travels at c relative to your frame and synchronize clocks and form a different picture of what is now relative to your frame. So what is, is relative to the frame of reference. But what we observe locally is not relative. It is the absolute set of events that really occurred.

This is no problem because all frames share all local measurements. The events are absolute. It is only their subsequent arrangement into pictures of what “is” at some “time” globally that is relative.

It is “being” in the sense of “what is right now”, not in the sense of “what is no longer” or “what is not yet”, that becomes relative.

Being becomes relative, but only in that sense. What remains absolute is the empirical observations made.

The use of separate notions of “what is” relative to each frame, makes the laws of electromagnetism very simple as they predict that light travels at the same speed relative to any frame.

Minkowski had deep insight into this and realized that if we use a four dimensional spacetime collection of events, that these relative definitions of what is could be resolved as projections of an absolute four space onto the different axes of the coordinate systems of the frames rendering time measurements relative but retaining the absolute reality of spacetime.

He was therefore able to see and retain what was absolute. What “is” in the sense of “what is absolutely” is the spacetime interval. “What is” in the sense of “what is occurring at the same time relative to some frame of reference” is relative. Time and space intervals are conceived as a kind of mere projection of the real four vector interval onto the axes of a frame of reference. Time is relative, space is relative, but the interval is absolute. There is no contradiction. As Einstein said in his original paper, the situation is only apparently contradictory.

The axes are created utilizing the fact that light moves at c relative to any frame as an assumption.

If any observer were allowed to see globally without collecting information from distant points via signals, or, equivalently, if they used signals faster than light, or even infinitely fast, relativity would be impossible.

So the “spacetime interval” is so important because it (all 4-intervals, I mean) establishes a real four dimensional spacetime consisting of fixed absolute points, which are the events seen locally, empirically, and still absolutely, by all colocated observers in each frame.

The arrangement of events into simultaneous slices of that spacetime, relative to any frame that defines “simultaneous” using the fact that light moves at c relative to it, is what comes out different or “relative to the frame”.

Every colocated observer in every frame sees the same thing as any other observer in any other frame that is colocated sees. They even see the same content in signals transmitted from other places. And all observers are then free to calculate what “was” or “is” or “will be” at some moment relative to any frame of reference not just their own.

So everyone agrees that relative to some frame, this meter stick is longer than that one moving relative to it, but at the same time, relative to some other frame, the opposite is true. It’s just how you project the real, absolute four space interval onto the axes of the coordinate systems.

Everyone sees the same thing and can calculate lengths relative to any frame they want to.

That spacetime is the “new reality” and no longer can we just say that “what is”, as in “what is now”, is absolute, that is what Minkowski realized. The old reality was absolute time intervals. The new reality is spacetime intervals that can be projected onto time and space axes.

Later Einstein realized how fundamental and useful the notion of considering reality to be independent of the coordinates was. It helped him greatly in General Relativity, where the shape of the underlying spacetime could be used to interpret gravity.

Spacetime is like a crystalline unmoving structure. Proper time only occurs along world-lines.

As Doris Day once sang, “Que sera, sera! What will be will be”. Spacetime is what is and itself has no temporal structure. It’s like a fixed crystal.

Minkowski saw spacetime and got very excited declaring an end to ages of thinking in terms of absolute space and separate absolute time.

He was right!