In the derivation of Lorentz transformations, the Wikipedia article mentions a couple of times that the linearity comes from the homogeneity of space. I am looking for a thorough explanation of this.
-
"some reference"... do you have a little more background on what your thoughts on this are? Have you asked this before somewhere? – Nikolaj-K Mar 27 '14 at 20:32
-
Possible duplicate: http://physics.stackexchange.com/q/12664/2451 and links therein. – Qmechanic Mar 27 '14 at 20:51
-
I don't think the length preserving property is a result of homogeneity. In fact, the Minkowski metric is a result of Lorentz transformation, which is observed to preserve the metric. There has to be an explanation of first principle of the linearity. – shrinklemma Mar 27 '14 at 21:10
-
@Walt: Isn't that like saying the Euclidean metric follows from ordinary rotations preserving it? – Muphrid Mar 28 '14 at 01:47
-
@Muphrid You cannot exactly say that since the rotations also preserve the Minkowski metric. You may ask the question that given any group of transformations, does there exist a metric that is preserved by these transformations? Furthermore, when does the group of symmetries that preserves the metric coincide with the group of transformations to begin with? – shrinklemma Mar 28 '14 at 03:58
4 Answers
I claim that if the transformation between frames is homogeneous and differentiable, then it is affine. (Homogeneity is not, strictly speaking, sufficient for linearity, since the full transformation between frames is actually a Poincaré transformation, which is affine, not linear.)
For a mathematically precise proof, we need a mathematical definition of homogeneity. To arrive at such a definition, we note that the basic idea is that we can pick our origin wherever we choose, and it won't "affect the measurement results of different observers." In particular, this applies to measurements of the differences between the coordinates of two events. Let's put this in mathematical terms.
Let $L:\mathbb R^4\to\mathbb R^4$ be a transformation. We say that $L$ is homogeneous provided \begin{align} L(x+\epsilon) - L(y+\epsilon) = L(x) - L(y) \end{align} for all $\epsilon\in\mathbb R^4$ and for all $x,y\in\mathbb R^4$.
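As a rough numerical sketch of this definition (not part of the argument itself), one can compare the two sides of the homogeneity condition for a sample affine map and for a map that is clearly not affine. The matrix `A`, the offset `a`, and the helper names `affine`, `squares` and `defect` are all hypothetical choices made for illustration.

```python
import random

# Hypothetical matrix and offset, chosen only for illustration.
A = [[2.0, 0.5, 0.0, 0.0],
     [0.5, 2.0, 0.0, 0.0],
     [0.0, 0.0, 1.0, 0.0],
     [0.0, 0.0, 0.0, 1.0]]
a = [1.0, -3.0, 0.0, 2.0]

def affine(x):
    """Affine map x -> A x + a on R^4."""
    return [sum(A[i][j] * x[j] for j in range(4)) + a[i] for i in range(4)]

def squares(x):
    """A non-affine map: square each coordinate."""
    return [c * c for c in x]

def defect(L, x, y, eps):
    """Largest |component| of [L(x+eps) - L(y+eps)] - [L(x) - L(y)]."""
    xe = [xi + ei for xi, ei in zip(x, eps)]
    ye = [yi + ei for yi, ei in zip(y, eps)]
    lhs = [p - q for p, q in zip(L(xe), L(ye))]
    rhs = [p - q for p, q in zip(L(x), L(y))]
    return max(abs(p - q) for p, q in zip(lhs, rhs))

random.seed(0)
x, y, eps = ([random.uniform(-1, 1) for _ in range(4)] for _ in range(3))

# Zero up to floating-point rounding: the affine map is homogeneous.
affine_defect = defect(affine, x, y, eps)

# Equals 2: the coordinate-squaring map violates the condition.
squares_defect = defect(squares, [1.0, 0.0, 0.0, 0.0],
                        [0.0, 0.0, 0.0, 0.0], [1.0, 0.0, 0.0, 0.0])
```

The check only confirms the easy direction (affine maps are homogeneous); the converse is what the proposition establishes.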
We can now precisely state and prove the desired result. Note that I also assume that the transformation is differentiable. I haven't thought very hard about if or how one can weaken and/or motivate this assumption.
Proposition. If $L$ is homogeneous and differentiable, then $L$ is affine.
Proof. The definition of homogeneity implies that \begin{align} L(x+\epsilon)-L(x) = L(y+\epsilon) - L(y) \tag{1} \end{align} for all $\epsilon, x, y$. Now we note that the derivative $L'(x)$ of $L$ at a point $x$ is a linear operator on $\mathbb R^4$ that satisfies \begin{align} L(x+\epsilon) - L(x) = L'(x)\cdot \epsilon +o(|\epsilon|) \end{align} and plugging this into $(1)$ gives \begin{align} (L'(x)-L'(y))\cdot\epsilon = o(|\epsilon|) \end{align} for all $\epsilon,x,y$, where $|\cdot|$ is the Euclidean norm. Now simply choose $\epsilon = |\epsilon|e_j$ with $|\epsilon|\neq 0$, where $e_0, \dots, e_3$ are the standard, ordered basis elements of $\mathbb R^4$, multiply both sides on the left by $(e_i)^t$, where $^t$ means transpose, divide both sides by $|\epsilon|$, and take the limit $|\epsilon|\to 0$ to show that all matrix elements of $L'(x)-L'(y)$ are zero. It follows immediately that \begin{align} L'(x) = L'(y) \end{align} for all $x, y$. In other words, the derivative of $L$ is constant. It follows pretty much immediately that $L$ is affine, namely that there exists a linear operator $\Lambda$ on $\mathbb R^4$, and a vector $a\in\mathbb R^4$ such that \begin{align} L(x) = \Lambda x + a \end{align} for all $x\in\mathbb R^4$. $\blacksquare$
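The key step of the proof, that the derivative is the same at every point, can be illustrated numerically. The following is a minimal sketch, assuming a hypothetical affine map and a forward-difference estimate of the Jacobian; the function names and the step size `h` are illustrative choices, not anything from the proof.

```python
def affine(x):
    """Hypothetical affine map on R^4: a shear in the (0, 1) plane plus an offset."""
    return [x[0] + 0.5 * x[1] + 1.0, x[1] - 2.0, x[2], x[3] + 4.0]

def squares(x):
    """A non-affine map for contrast: square each coordinate."""
    return [c * c for c in x]

def jacobian(L, x, h=1e-3):
    """Forward-difference estimate of the matrix of L'(x)."""
    base = L(x)
    n = len(x)
    J = [[0.0] * n for _ in range(len(base))]
    for j in range(n):
        shifted = list(x)
        shifted[j] += h              # step along basis vector e_j
        for i, (s, b) in enumerate(zip(L(shifted), base)):
            J[i][j] = (s - b) / h    # matrix element (i, j)
    return J

def max_entry_diff(J1, J2):
    """Largest absolute difference between corresponding matrix entries."""
    return max(abs(p - q) for r1, r2 in zip(J1, J2) for p, q in zip(r1, r2))

p = [0.0, 0.0, 0.0, 0.0]
q = [1.0, -2.0, 3.0, 0.5]

# Same Jacobian at both points (up to rounding) for the affine map.
affine_gap = max_entry_diff(jacobian(affine, p), jacobian(affine, q))

# The Jacobian of the squaring map depends on the point, so the gap is large.
squares_gap = max_entry_diff(jacobian(squares, p), jacobian(squares, q))
```

A constant Jacobian is exactly the statement $L'(x) = L'(y)$; the contrast case shows what fails for a map that is not homogeneous.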

-
@mhodel Glad you liked it! I'd be curious to know if you or perhaps someone else can think of a good way of motivating the differentiability hypothesis. – joshphysics Mar 28 '14 at 02:50
-
@joshphysics Assuming your definition of homogeneity, you don't need differentiability to deduce linearity. Continuity suffices and is not unnatural to assume. But could you explain to me how the homogeneity of space is related to your definition of homogeneity of transformation? I do not exactly understand the homogeneity of space. – shrinklemma Mar 28 '14 at 03:47
-
@joshphysics I agree with Walt's comments, though I don't think I understand the significance of the distinction between differentiability/continuity as well as he does. – mhodel Mar 28 '14 at 04:22
-
@Walt I think the basic idea is that, if space is "homogeneous", it shouldn't matter what coordinate system you use. Specifically of interest to us, it shouldn't matter where you choose your origin to be, so that you have some sort of translational invariance. Therefore the Lagrangian, which should obey the same symmetries as the space it lives in, displays the same translational invariance. As joshphysics put it, L(x+e) - L(y+e) = L(x) - L(y) – mhodel Mar 28 '14 at 04:25
-
@Walt The idea is simply that the coordinate differences between events measured by any observer don't depend on the choice of origin of a given observer. I'm not sure how else to mathematically formulate homogeneity given this intuition. – joshphysics Mar 28 '14 at 05:34
-
@Walt Haha ok well I almost was hoping you were going to continue to press me. I've actually been thinking about it on and off for the past day, and I think with a bit more thought I can come up with a more convincing, detailed motivation for the definition which will probably involve some diagrams. I'll write an addendum when I think I have something. – joshphysics Mar 28 '14 at 16:34
-
@joshphysics: Thanks for the awesome proof! Could you clarify a couple of things? One, with respect to what are we taking the derivatives of $L(x)$ and $L(y)$? Secondly, you mentioned that $|.|$ is the Euclidean norm - is that assumption strictly necessary? If instead of the Euclidean norm, we had the Minkowski or any other arbitrary norm, would we run into trouble with the proof? – Shirish Kulhari Jun 19 '20 at 19:55
-
@ShirishKulhari The argument of $L$ is just a label -- when we say we are taking the derivative of a function, the meaning of that statement doesn't depend on the label of its argument. One could e.g. write $L'$ without evaluating the derivative at a particular point. The equation involving the Euclidean norm is not an assumption per se, it's just a fact that's true according to the standard definition of the derivative of a function defined in many dimensions. In any case, as another of the answers notes, you don't need differentiability in the proof anyway. – joshphysics Jun 19 '20 at 21:59
-
@joshphysics: Thanks so much for getting back on this after so many years! Sorry if I sound dense, but by $L'(x)$ (where $x\in\mathbb{R}^4$) do you mean the Jacobian of $L$ evaluated at $x$? As for the other doubt, the reason why I was confused about the norm is because we're taking $|\epsilon|$, which resides in the same vector space as the spacetime vectors, and that vector space is equipped with the Minkowski metric. That's why I was confused - should $|\epsilon|$ be the Minkowski norm instead of the Euclidean? Again, my bad if I'm being stupid here – Shirish Kulhari Jun 19 '20 at 22:53
-
@ShirishKulhari Not dense at all. Yes sometimes it's called the Jacobian (https://math.stackexchange.com/a/621995/58845). The expression with the Euclidean norm follows from the definition of the derivative as a linear transformation. Note that since elements of Minkowski space are points in $\mathbb R^4$, there's no reason why one can't still use the Euclidean norm to write down mathematical identities. One can simultaneously consider multiple additional mathematical structures (e.g. norms) if one finds them useful. – joshphysics Jun 20 '20 at 04:28
-
@joshphysics: Awesome, thank you! Upvoted the answer since now I understand it better. The proof without differentiability assumption is a bit too advanced for me at this point, so I guess I'll get back to it later. – Shirish Kulhari Jun 20 '20 at 05:57
-
Your definition of homogeneity simply states that L maps parallelograms to parallelograms, which surely is a simple reformulation of the affinity condition. I do not see at all how it follows from the physical "homogeneity of space"; to complete the argument one would need to describe a physical (thought) experiment leading to the preservation of parallelograms. – Kostya_I Jul 25 '22 at 09:02
This answer is essentially the same as JoshPhysics's answer, but adds the following points:
- We use a more "off-the-shelf" mathematical result to get rid of the differentiability assumption JoshPhysics used in his answer; instead, we need only assume that the Lorentz transformation is continuous;
- We show that continuity is the minimum assumption needed in addition to the OP's homogeneity assumption, i.e. the OP's assertion that linearity comes from the homogeneity of spacetime alone is wrong.
JoshPhysics's equation (1) implies:
$$L(X+Y)-L(Y) = L(X)-L(0);\;\forall X,\,Y\in \mathbb{R}^{1+3}\tag{1}$$
Now we define $h:\mathbb{R}^{1+3}\to\mathbb{R}^{1+3}$ by $h(Z)=L(Z)-L(0)$; then it follows from (1) alone that:
$$h(X+Y)=h(X)+h(Y);\quad\forall\,X,\,Y\in\mathbb{R}^{1+3}\tag{2}$$
But this is the famous Cauchy functional equation generalized to $3+1$ dimensions. For one real dimension, the only continuous solution is $h(X)\propto X$; there are other solutions, but they are everywhere discontinuous, as shown in:
E. Hewitt & K. R. Stromberg, "Real and Abstract Analysis" (Graduate Texts in Mathematics), Springer-Verlag, Berlin, 1965. Chapter 1, section 5
It is easy to broaden the Hewitt-Stromberg argument to any number of dimensions, so that, given an assumption of continuity of $L:\mathbb{R}^{1+3}\to\mathbb{R}^{1+3}$, we must have:
$$L(X) = \Lambda\,X + \Delta\tag{3}$$
where $\Lambda$ is a linear operator (a $4\times4$ matrix) and $\Delta$ is a spacetime offset.
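To see the step from the homogeneity condition to the Cauchy equation in a concrete case, here is a small sketch in exact rational arithmetic. It assumes a hypothetical affine $L$ in two dimensions (the names `LAMBDA`, `DELTA`, `add` and the sample points are illustrative only) and checks that $h(Z)=L(Z)-L(0)$ is additive.

```python
from fractions import Fraction as F

# Hypothetical 2x2 matrix and offset; two dimensions keep the sketch short.
LAMBDA = [[F(5, 4), F(3, 4)],
          [F(3, 4), F(5, 4)]]
DELTA = [F(2), F(-7)]

def L(X):
    """Affine map L(X) = LAMBDA X + DELTA."""
    return [sum(LAMBDA[i][j] * X[j] for j in range(2)) + DELTA[i]
            for i in range(2)]

def h(Z):
    """h(Z) = L(Z) - L(0), which removes the offset."""
    L0 = L([F(0), F(0)])
    return [u - v for u, v in zip(L(Z), L0)]

def add(U, V):
    """Componentwise vector sum."""
    return [u + v for u, v in zip(U, V)]

X = [F(1, 3), F(-2, 5)]
Y = [F(7, 2), F(4, 9)]

# Cauchy equation h(X + Y) = h(X) + h(Y), checked exactly (no rounding).
is_additive = h(add(X, Y)) == add(h(X), h(Y))
```

This only illustrates that affine maps give additive $h$; the everywhere discontinuous solutions discussed next cannot be written down explicitly like this, since their construction relies on a Hamel basis.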
Note that we must invoke the continuity assumption; otherwise, following the reasoning in Hewitt and Stromberg, our $h$ function could be one of the everywhere discontinuous Cauchy equation solutions, and we could then, by reversing the step from my equation (1) to (2), construct everywhere discontinuous, nonlinear functions that fulfill JoshPhysics's homogeneity postulate. So unless we explicitly require continuity, we shan't "choose" the right solution of the Cauchy equation. Thus continuity of the transformation, together with homogeneity, is the minimum assumption needed to imply linearity.

Intuitively, this is fairly easy to understand. This isn't a proof, but suppose Bob is traveling at a constant velocity relative to you in such a way that:
3 minutes on Bob's clock equal 15 minutes on your clock (time dilation)
15 meters of Bob's distance is equal to 3 meters of your distance (Lorentz contraction)
Note that I'm making no assumption of linearity. I don't know how long 4 minutes on Bob's clock will be on my clock. I'm only going to use the two specific observations above to show linearity (intuitively).
Suppose Bob starts a 3 minute egg timer (hourglass), and, the moment 3 minutes elapse, he turns it around to measure another 3 minutes.
Since Bob is in an inertial (constant velocity) reference frame, his 3 minutes plus 3 minutes add to 6 minutes.
In your reference frame, the first 3 minutes took 15 minutes (by our observation above) and the second 3 minutes also took 15 minutes, since Bob's velocity relative to us remains constant. Thus, Bob's 6 minutes took 15 + 15 minutes, or 30 minutes.
Of course, you can apply this observation to any amount of time, thus showing linearity.
The argument for distance is similar. If Bob walks 15 meters, pauses (for a length of time that will differ for the two of you), and then walks another 15 meters, he has walked a total of 30 meters, since distances add.
You don't know how long 30 of Bob's meters are for you, but you do know the first 15 meters translates to 3 meters, as does the second 15 meters. Since distance adds for you as well, you now know that 30 meters of Bob's distance equals 6 meters of your distance.
In other words, time and distance add in all inertial reference frames.
Why is this not a proof?
I assume that 3 minutes on Bob's clock is always equal to 15 minutes on your clock, since Bob is traveling at a constant velocity relative to you.
However, it's at least theoretically possible that the speed of Bob's clock depends on his distance from you. Perhaps 3 minutes on Bob's clock equals 15 minutes on your clock the instant he passes you, but, when he's half a light year away, 3 minutes on his clock is now an hour on your clock.
So, this isn't a proof, but if you intuitively accept that the difference in time and distance between two observers depends solely on their relative velocity, this may be helpful.
The following proof only requires the continuity of the Lorentz transformations $L$, but it also requires that the two observers start measuring time at the same instant and at the same point of space, so that $L(0)=0$.
As already mentioned by joshphysics, homogeneity of the space translates into the following property:
\begin{equation} L(y+\varepsilon)-L(x+\varepsilon) = L(y)-L(x)\quad \forall\, x,\,y,\,\varepsilon\,. \end{equation} Let now $\varepsilon = -x $ so that $$ L(y-x) = L(y) - L(x) + L(0)\,, $$ then we have the following $$ L(y+x) = L(y-(-x)) = L(y) - L(-x) + L(0) $$ and $$ L(-x) = L(0-x) = L(0)-L(x)+L(0) = -L(x) + 2\,L(0)\quad .$$ Combining the last two equations we get $$L(y+x) = L(y)+L(x)-L(0)\quad.$$ If we assume that $L(0) = 0$ then $$ (1)\quad\begin{cases}L(y+x) = L(y)+L(x) \\ \\ L(-x) = - L(x)\end{cases}$$ It is easy to check from $(1)$ that $L(z\,y) = z\,L(y)\,$ $\forall z\in\mathbb{Z}\,.$ Consider now $q\in \mathbb{Q}$ and let $a\in\mathbb{Z}\,,$ $b\in\mathbb{N}$ such that $q=\dfrac{a}{b}\,.$ Then $$ L(y) = L\left(\dfrac{b}{b}\,y\right) = b\,L\left(\dfrac{1}{b}y\right) \qquad \Rightarrow \qquad L\left(\dfrac{1}{b}y\right) = \dfrac{1}{b}\,L(y)$$ so that $$L(q\,y) = L\left(\dfrac{a}{b}\,y\right) = a L\left(\dfrac{1}{b}\,y\right) = \dfrac{a}{b}\,L(y) = q\,L(y)\quad.$$ Consider now $\alpha\in\mathbb{R}$, since $\mathbb{Q}$ is dense in $\mathbb{R}$ there exists a sequence $\{q_n\}_{n=0}^\infty$ of rational numbers such that $q_n \rightarrow \alpha $ as $n\rightarrow \infty$. From the continuity of $L$ we have that $$L(\alpha\,y ) = \lim_{n\rightarrow \infty} L(q_n\,y) = \lim_{n\rightarrow \infty}q_n\, L(y) = \alpha\,L(y)\qquad .$$ Finally given any two real numbers $\alpha$ and $\beta$, and given any two events $x$ and $y$, we have linearity of Lorentz transformations: $$ L(\alpha\,y+\beta\,x) = L(\alpha\,y)+L(\beta\,x) = \alpha\,L(y) + \beta\,L(x)\quad. $$ Note that if $L(0)\neq 0$ instead of $(1)$ we have $$ (2)\quad\begin{cases}L(y+x) = L(y)+L(x)-L(0) \\ \\ L(-x) = - L(x)+2\,L(0)\end{cases}$$ Letting $\Lambda(x) = L(x)-L(0)$ we can rewrite $(2)$ as follows $$ (1')\quad\begin{cases}\Lambda(y+x) = \Lambda(y)+\Lambda(x) \\ \\ \Lambda(-x) = - \Lambda(x)\end{cases}$$
Since $\Lambda$ is also continuous, we can repeat the previous steps to show its linearity. In the end, if the Lorentz transformation is merely continuous, the homogeneity of the space implies that it is affine.
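The density step of the argument (passing from rational to real scalars by continuity) can be illustrated numerically. The sketch below assumes a hypothetical continuous additive map on $\mathbb R^2$, takes $\alpha=\sqrt 2$, and builds rational approximations $q_k$ with the standard-library `Fraction.limit_denominator`; it then records how far $L(q_k\,y)$ is from $\alpha\,L(y)$.

```python
from fractions import Fraction
import math

def L(y):
    """Hypothetical continuous additive map on R^2 (chosen for illustration)."""
    return (2.0 * y[0] + y[1], y[0] - 3.0 * y[1])

alpha = math.sqrt(2.0)     # an irrational scalar (as a float approximation)
y = (1.0, 2.0)
target = tuple(alpha * c for c in L(y))   # alpha * L(y)

errors = []
for k in range(1, 13):
    # Rational q_k with denominator at most 10**k, close to alpha.
    q = Fraction(alpha).limit_denominator(10 ** k)
    image = L((float(q) * y[0], float(q) * y[1]))   # L(q_k y)
    errors.append(max(abs(u - v) for u, v in zip(image, target)))

# errors shrinks toward zero as q_k -> alpha, mirroring
# L(alpha y) = lim L(q_k y) = lim q_k L(y) = alpha L(y).
```

Floating-point numbers are themselves rational, so this is only a picture of the limiting argument, not a substitute for it.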