5

There are many approaches to deriving the Lorentz transformation. The two main ones are, I think:

Method 1. Assume the Minkowski metric $\eta = {\rm diag}(-1,1,1,1)$ and then define Lorentz transformations $\Lambda$ as that set of transformations which satisfy $\Lambda ^T \eta \Lambda = \eta$ (where I am using matrix notation).

Method 2. Quote postulates and reason from them.
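As an illustration of the defining condition in Method 1, here is a minimal sketch that checks $\Lambda^T \eta \Lambda = \eta$ numerically for a standard boost in 1+1 dimensions. The units ($c = 1$), the coordinate ordering $(ct, x)$ and the helper names `boost`, `matmul`, `transpose` and `preserves_metric` are assumptions made for this example only.

```python
import math

# Assumption: 1+1 dimensions, coordinates (ct, x), units with c = 1.
ETA = [[-1.0, 0.0], [0.0, 1.0]]  # Minkowski metric restricted to (ct, x)

def matmul(A, B):
    """Plain nested-list matrix product."""
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]

def transpose(A):
    return [list(row) for row in zip(*A)]

def boost(beta):
    """Standard Lorentz boost with speed beta = v/c, |beta| < 1."""
    gamma = 1.0 / math.sqrt(1.0 - beta * beta)
    return [[gamma, -gamma * beta], [-gamma * beta, gamma]]

def preserves_metric(L, tol=1e-12):
    """True if L^T eta L equals eta to within tol."""
    M = matmul(matmul(transpose(L), ETA), L)
    return all(abs(M[i][j] - ETA[i][j]) < tol
               for i in range(2) for j in range(2))

if __name__ == "__main__":
    print(preserves_metric(boost(0.6)))                 # Lorentz boost
    print(preserves_metric([[1.0, 0.0], [-0.6, 1.0]]))  # Galilean boost
```

The second call uses a Galilean boost for contrast, which does not satisfy the condition.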

My question concerns the latter method. There is a long history of arguments over exactly what has to be assumed. For the sake of my question I shall solve the question of linearity by asserting that I am interested in finding a linear transformation if there is one which satisfies whatever postulates I introduce. (Thus I am not interested in the question whether there may also be non-linear transformations which also satisfy the postulates).

I would like to suggest that in order to obtain the standard Lorentz transformation, it is sufficient to assume linearity and just two further things:

  1. Postulate 1 (The Principle of Relativity). The motions of bodies included in a given space are the same among themselves, whether that space is at rest or moves uniformly forward in a straight line. (This implies that the mathematical form of laws of motion is unchanged from one inertial frame to another.)

  2. Postulate 2. There is a finite maximum speed for signals (where a signal is an influence which can transmit a cause to an effect).

But I am aware that this is disputed because, it is asserted, one must also add a further assumption about the isotropy of space.

For clarity, I will first briefly present the undisputed part of the argument. One considers two frames in relative motion, with aligned axes, such that frame $S'$ proceeds at speed $v$ in the $x$ direction as observed in frame $S$. One first argues that coordinates $t'$ and $x'$ of any given event are functions of $t$ and $x$ alone, and since we have decided to seek a linear transformation, we may write $$ \newcommand\mycolv[1]{\begin{bmatrix}#1\end{bmatrix}} \mycolv{ct'\\x'} = \mycolv{a & b\\d & e} \mycolv{ct\\x} $$ where $a,b,d,e$ are functions of $v$, to be determined. First one reasons that the very meaning of speed $v$ is that the events $(t',0)$ in S' correspond to $(t,vt)$ in S, from which one finds $d = - (v/c) e$. One also asserts, from Postulate 1, that speed just means relative speed so one must equally find that the events $(t,0)$ in S correspond to the events $(t',-vt')$ in S', from which it follows that $d = - (v/c) a$. Then one can invoke Postulate 2 to assert that the events $(ct,ct)$ in S must correspond to $(ct',ct')$ in S', which gives $a + b = d + e$.

Putting all the above together, one finds the transformation has to be of the form $$ \Lambda(v) = a(v) \mycolv{1 & \!-v/c\\-v/c & 1}, $$ where the function $a(v)$ is still to be discovered.
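The three conditions used above can be checked numerically. The sketch below assumes units with $c = 1$ and treats the prefactor `a` as an arbitrary nonzero number, since none of the three conditions constrains it; the function names are illustrative only.

```python
def transform(a, beta, ct, x):
    """Apply a * [[1, -beta], [-beta, 1]] to the event (ct, x); beta = v/c."""
    return a * (ct - beta * x), a * (x - beta * ct)

def check_form(a, beta, ct=2.0, tol=1e-12):
    # 1. The origin of S' (x = v t in S) must sit at x' = 0.
    _, x1 = transform(a, beta, ct, beta * ct)
    # 2. The origin of S (x = 0) must move at -v in S': x' = -beta * ct'.
    ct2, x2 = transform(a, beta, ct, 0.0)
    # 3. A signal at the limiting speed (x = ct) must satisfy x' = ct'.
    ct3, x3 = transform(a, beta, ct, ct)
    return (abs(x1) < tol
            and abs(x2 + beta * ct2) < tol
            and abs(x3 - ct3) < tol)

if __name__ == "__main__":
    # Two different prefactors pass the same checks, so a(v) is still free.
    print(check_form(a=1.0, beta=0.5), check_form(a=3.7, beta=0.5))
```

That two different values of `a` both pass is just the statement that $a(v)$ remains undetermined at this stage.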

It is at this point that the question of isotropy comes in. It is entirely reasonable to propose that $\Lambda(v)^{-1} = \Lambda(-v)$. The question is whether this statement is forced upon us by the Principle of Relativity and the assumption of linearity. In a quite thorough treatment by Selene Routley here: [ https://physics.stackexchange.com/questions/253356/homogeneity-and-isotropy-and-derivation-of-the-lorentz-transformations ] it is asserted that $\Lambda(v)^{-1} = \Lambda(-v)$ is not forced upon us but amounts to a further assumption. My question is to ask: is that correct?

Another way to ask the same question is: is it possible to furnish a linear transformation with $\Lambda(v)^{-1} \ne \Lambda(-v)$ which nevertheless satisfies the Principle of Relativity?
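For the form of $\Lambda(v)$ found above, the condition can be made explicit by a short worked step (added here for concreteness; it is not taken from the cited treatments): $$ \Lambda(v)\,\Lambda(-v) = a(v)\,a(-v) \begin{bmatrix} 1 & -v/c \\ -v/c & 1 \end{bmatrix} \begin{bmatrix} 1 & v/c \\ v/c & 1 \end{bmatrix} = a(v)\,a(-v) \left( 1 - \frac{v^2}{c^2} \right) \begin{bmatrix} 1 & 0 \\ 0 & 1 \end{bmatrix}, $$ so $\Lambda(v)^{-1} = \Lambda(-v)$ holds exactly when $a(v)\,a(-v) = (1 - v^2/c^2)^{-1}$.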

To make the question more pointed still, here is an argument to derive $a(v) = (1-v^2/c^2)^{-1/2}$ (and hence $\Lambda(v)^{-1} = \Lambda(-v)$) from the two postulates above, together with linearity, and no further assumption. The argument is essentially the one proposed by udrv here [ https://physics.stackexchange.com/questions/230320/how-do-the-postulates-of-relativity-relate-lorentz-transforms-to-their-inverses ].

We consider frames S' and S as before, and a further frame S'' which moves at speed $u$ in the $x'$ direction relative to frame S'. We argue from the relativity postulate that S'' is just another inertial frame, so there must be a transformation directly from S to S'', and it must match the composition of the two transformations from S to S' to S'': $$ a(w) \mycolv{1 & -w/c \\ -w/c & 1} = a(u) \mycolv{1 & -u/c \\ -u/c & 1} a(v) \mycolv{1 & -v/c \\ -v/c & 1}. \tag{1} $$ By multiplying out the matrix product one finds $$ a(w) = a(u) a(v) (1 + u v/c^2) \;\;\; {\rm and} \;\;\; w = \frac{u+v}{1+uv/c^2} $$ hence the function $a(v)$ must satisfy $$ a\left( \frac{u+v}{1+uv/c^2} \right) \equiv a(u) a(v) \left( 1 + \frac{uv}{c^2} \right). \tag{2} $$ This is not just an equation but an identity: that is, it is valid for all $u,v$ in the range over which those quantities are defined, i.e. here $0 \le |u| < c$ and $0 \le |v| < c$. Such an identity is sufficient to fix the function uniquely. (To prove this, one way is to write $a(x) = \sum_{i=0}^\infty b_i x^i$ with constant coefficients $b_i$ to be discovered, and then make the claim that equation (2) is sufficient to fix the $b_i$.) One obtains $$ a(v) = \frac{1}{\sqrt{1 - v^2/c^2}} $$ and hence we get the Lorentz transformation, without requiring any assumption about isotropy of space.
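The matrix product in (1) and the identity (2) can be spot-checked numerically. The following sketch assumes units with $c = 1$ and uses illustrative helper names. It only confirms that $a(v) = (1-v^2/c^2)^{-1/2}$ satisfies (2) at sampled points; it does not test the uniqueness claim.

```python
import math

def gamma(beta):
    """Candidate prefactor a(v) = 1/sqrt(1 - v^2/c^2), with beta = v/c."""
    return 1.0 / math.sqrt(1.0 - beta * beta)

def boost(beta):
    g = gamma(beta)
    return [[g, -g * beta], [-g * beta, g]]

def matmul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def add_velocities(u, v):
    """w = (u + v) / (1 + u v / c^2) in units with c = 1."""
    return (u + v) / (1.0 + u * v)

def identity_residual(a, u, v):
    """Left side minus right side of identity (2) for a candidate a."""
    return a(add_velocities(u, v)) - a(u) * a(v) * (1.0 + u * v)

def composition_matches(u, v, tol=1e-12):
    """Check equation (1): boost(u) boost(v) equals boost(w)."""
    lhs = boost(add_velocities(u, v))
    rhs = matmul(boost(u), boost(v))
    return all(abs(lhs[i][j] - rhs[i][j]) < tol
               for i in range(2) for j in range(2))

if __name__ == "__main__":
    print(composition_matches(0.3, 0.5))
    print(abs(identity_residual(gamma, 0.3, 0.5)) < 1e-12)
    # A constant prefactor a = 1 does not satisfy (2):
    print(identity_residual(lambda b: 1.0, 0.3, 0.5))
```

The last line shows a nonzero residual for the constant choice $a = 1$, so (2) does constrain the prefactor.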

It would be helpful if any answer which asserts that isotropy is a further assumption could also explain why equation (1) does not itself follow from the two postulates and linearity.

Andrew Steane
  • 58,183
  • Somewhat related : https://physics.stackexchange.com/questions/564238/can-the-lorentz-transformations-be-derived-this-way . Check out the second Update section of the post. I had derived the gamma factor but by considering faster than light frames, so it was pretty controversial – Ryder Rude Oct 24 '22 at 17:31
  • Now that I think about it, I did not have to assume faster than light frames. In the post, I've used two assumptions only : 1. linearity, and 2. "if $B$'s space or time axis makes a slope $m$ in $A$'s spacetime diagram, then $A$'s space or time axis makes the same slope $m$ in $B$'s spacetime diagram, up to a sign change." The second assumption is justified by the principle of relativity, that both frames' descriptions of each other should be symmetric. These two assumptions are enough to derive Lorentz transformations – Ryder Rude Oct 24 '22 at 18:20
  • @RyderRude your statement about slopes is equivalent to $\Lambda(v)^{-1} = \Lambda(-v)$ I think. So I agree with your reasoning but, as I understand it, some people say you sneaked in an assumption about isotropy which, they assert, does not have to be made. – Andrew Steane Oct 24 '22 at 18:56
  • I think Selene is not too careful in her wording. She called isotropy an assumption and linked the wikipedia article. But the wikipedia article did not call it an assumption : It just says that it must be true (probably because of the Principle of Relativity). The article does not list the principle of relativity as an assumption. But it uses two assumptions which are justified by that principle : Closure (which you also needed to assume), and Isotropy. The principle of relativity subsumes these. – Ryder Rude Oct 25 '22 at 04:22
  • My derivation did not need the closure assumption, because it extended the assumption about slopes to the space axes as well. Maybe closure is more natural than that. At the very least, the principle of relativity guarantees that the absolute value of $v$ remains unchanged as we switch the frame, because a change in absolute value would make the frames asymmetric. The sign change of $v$ is perhaps unaccounted for. Maybe we could derive separately for both cases of the sign. I think the $+v$ case may simply be inconsistent with the other axioms. Even rotations have $-v$. – Ryder Rude Oct 25 '22 at 04:40
  • @AndrewSteane It seems that one could determine $a(v)$ by invoking the condition that $\operatorname{det}\Lambda =1$, which could be seen as a more natural condition for a vector space. Could this be seen as part of the Relativity Principle? (Are Euclidean rotations and Galilean boosts derived analogously?) – robphy Oct 27 '22 at 00:52

3 Answers

1

I think the crucial issue here is the hidden assumptions behind "the speed of light" and "linear." When we say "the" speed of light, we're assuming that the leftward and rightward speeds of light are the same, which means we're assuming they're comparable. If they're comparable, it's only by using the metric or, equivalently, by knowing how the coordinates behave under reflection. And it matters whether "linear" applies just to the Lorentz boosts or also to the behavior under reflection.

So consider the following two possibilities:

(A) The Lorentz transformations have their usual form, but under reflection the negative and positive x axes get rescaled by $\alpha$ and $1/\alpha$.

(B) Under reflection, $x\rightarrow -x$, but the Lorentz transformation is nonlinear. For $x<0$, the Lorentz transformation uses $c\alpha$, while for the positive $x$ axis it uses $c/\alpha$.

A and B give the same physics, because they're equivalent up to a change of coordinates. A has linear Lorentz transformations but B doesn't, but in both cases the full Poincare group is nonlinear.

  • Thanks. I am ruling out your B by assuming linearity. Regarding your A, I have in mind a 3-dimensional space. Can this rescaling on reflection still make sense? (It's not easy to see how it can be consistent with the results of rotation through $180^\circ$). – Andrew Steane Oct 24 '22 at 16:06
0

I submit that isotropy of spacetime is a necessary prerequisite for relativity of inertial motion.

In other words: I submit that granting relativity of inertial motion means that automatically isotropy of spacetime is granted too.


It seems to me there is no way to break these concepts down into any form of sub-units. The very uniformity is the very strength of those concepts.

It seems to me: one either decides that granting relativity of inertial motion implies isotropy of spacetime, or one allows doubt. The concepts are so simple that there is no opportunity for building a reasoning. In order to build a reasoning one would have to break the concepts down into sub-units, but there are no sub-units.



Moving to discussion of axiomatization of physics generally:

In my opinion: when it comes to formulating axioms for a branch of physics, pushing for rigor only has value up to a point.

I subscribe to the view that postulates in physics are very different from postulates in mathematics, so much so that arguably it is misleading to use the same word.

This type of view is articulated well by stackexchange contributor Kevin Zhou, so I give an extensive quote from an answer to a question about derivability in physics:

[...] in physics, you can often run derivations in both directions: you can use X to derive Y, and also Y to derive X. That isn't circular reasoning, because the real support for X (or Y) isn't that it can be derived from Y (or X), but that it is supported by some experimental data D. This two-way derivation then tells you that if you have data D supporting X (or Y), then it also supports Y (or X).

Once you finish putting high school math on a rigorous foundation, undergraduate math generally builds upward. [...]

This isn't the case in physics: undergraduate physics generally builds downward. Every year, you learn a new theory that subsumes everything you previously learned as a special case, which is completely logically independent of those earlier theories.


In mathematics the purpose of a set of axioms is to establish a basis, a bedrock, that you will never have to go back to.

In physics: there may not actually be a bedrock.


Another view that I subscribe to: in any logical system there is great freedom to interchange axiom and theorem without changing the contents of the system.

So: formulating a set of axioms for a particular branch of physics will not necessarily inform you what the most fundamental concepts are. What to designate as axiom or as theorem is a judgement call, it appears.


My view is that in physics the value of an axiomatic approach is that it offers an exploratory tool.

Interchanging theorems and axioms, it seems, has the potential to offer a window to understanding the relations between the various concepts.


About axioms:
I remember reading in the Wikipedia article about Euclid's five postulates that David Hilbert had at one time set out to formulate an exhaustive set of axioms. This led to a set of 20 axioms.

I'm guessing that in physics an exhaustive set of axioms would be so large that you wouldn't be able to see the forest for the trees.

So I would advocate for a sparse set of principles, for the purpose of focus. With an intentionally sparse set of principles implicit assumptions are a feature, not a bug.


My preference is to proceed from relativity of inertial motion, with acknowledgement that granting relativity of inertial motion implies granting isotropy of spacetime.

That is sufficient to narrow down the possibilities to just two: Galilean transformation and Lorentz transformation.

No doubt many demonstrations of that exist; one of them is "Nothing but relativity" by Palash B. Pal.


With the possibilities narrowed down to two (Galilean transformation and Lorentz transformation), the final element needed (the postulate of an intrinsic limit to velocity) hardly feels like a postulate.

The fact that the final element needed is used only to narrow down from two possibilities to one underlines how far reaching the ramifications of the relativity postulate are.

Cleonis
  • 20,795
  • The point about interchanging axioms and theorems is quite correct and can be applied in multiple places in physics, making the use of terms such as "fundamental" quite subjective. – Andrew Steane Oct 27 '22 at 09:24
  • @AndrewSteane One example: for SR my preference is to take the Minkowski metric as starting point. I will have the Lorentz transformations and intrinsic maximum velocity follow logically from the Minkowski metric. When newtonian physics was succeeded by SR several existing concepts carried over, such as recognizing an equivalence class of inertial coordinate systems. I prefer to emphasize the spacetime-geometry property that sets them apart: Newton: euclidean metric, euclidean vector addition; SR: Minkowski metric, Minkowski vector addition. – Cleonis Oct 27 '22 at 20:52
0

OP: There is a long history of arguments over exactly what has to be assumed. For the sake of my question I shall solve the question of linearity by asserting that I am interested in finding a linear transformation if there is one which satisfies whatever postulates I introduce. (Thus I am not interested in the question whether there may also be non-linear transformations which also satisfy the postulates).

Me: It is not necessary to assume linearity. Consider two inertial systems S and S' in Rindler's standard configuration (all axes are parallel and of equal scaling, the origins coincide at t=t'=0). Let a particle be at rest in S at some time t. By the principle of least reason, the particle remains at rest, cf. Buridan's Ass. Hence, its world line in S is straight. By the principle of relativity, its world line in S' is straight, too. Thus, the transformation sought for maps straight lines onto straight lines. For transforming velocities and accelerations, the transformation must be differentiable. The only transformations with these properties are the fractional-linear ones. They have poles, which reverse the direction of t' while that of t is unchanged. Removing the poles yields the affine transformations. For the standard configuration above, they become linear. (They are even 'special linear' because the determinant of the transformation matrix equals 1.)

OP: I would like to suggest that in order to obtain the standard Lorentz transformation, it is sufficient to assume linearity and just two further things:

Postulate 1 (The Principle of Relativity). The motions of bodies included in a given space are the same among themselves, whether that space is at rest or moves uniformly forward in a straight line. (This implies that the mathematical form of laws of motion is unchanged from one inertial frame to another.)

Me: I agree.

OP: Postulate 2. There is a finite maximum speed for signals (where a signal is an influence which can transmit a cause to an effect).

Me: That is not necessary. It is sufficient to assume (before the relativity postulate) the existence of continuous space and time which, (i), can be equipped with arbitrarily many (continuous) Cartesian (3d) and 1d coordinate systems and, (ii), are not influenced by the presence of particles. -- You have implicitly assumed that.

OP: But I am aware that this is disputed because, it is asserted, one must also add a further assumption about the isotropy of space.

Me: When space is empty, homogeneity and isotropy of space and time follow from the principle of least reason: Where should inhomogeneity and anisotropy stem from, and what should determine their magnitude?

Best wishes,

physics@peter-enders.science

  • "Thus, the transformation sought for maps straight lines onto straight lines." Strictly speaking, your argument only shows that a certain subclass of straight lines is mapped to straight lines, namely those which correspond to particles traveling at 'allowed' speeds. I would agree that any bijection that maps lines to lines and 0 to 0 must be linear, but I'm not sure this theorem can be extended to the case under consideration here. – Inzinity Sep 17 '23 at 07:24