4-Vector Definition

Question

In most places I've looked, I see that 4-vectors are defined as 4-element vectors that transform like the 4-position under a Lorentz transformation. This is typically accompanied by the general rule $$\widetilde{A} \ ^\mu = \Lambda^\mu\ _\nu A^\nu$$

This is strange to me and seems circular. How do you "transform like the 4-position?" I can transform $x^\mu$, but how would that compare to other 4-element vectors? Couldn't I just slap $\Lambda$ on anything and say "oh look it transformed?" Clearly not, but I still don't understand.

Where am I going with this? I'd like to show that if $V^\mu U_\mu$ is Lorentz invariant and $V$ is a 4-vector, then $U$ must be one too. At first glance it seems almost like I'm proving a definition. If the scalar is Lorentz invariant, it's unchanged if I transform both vectors. So am I done just by showing that $V^\mu U_\mu = \widetilde{V} \ ^\mu \widetilde{U} \ _\mu$? This seems too trivial...

Some elaboration and more: From what I know, the norm of a 4-vector is the same in all inertial frames. That is, for the 4-vector $V$ as an example, $V^\mu V_\mu = \widetilde{V} \ ^\mu \widetilde{V} \ _\mu$. Consider $\sigma \equiv V^\mu U_\mu$, where $U$ isn't necessarily a 4-vector. If I state that $\sigma$ is a Lorentz scalar, I should find that $U$ must be a 4-vector as well.

I'm not sure I can give as clear an answer as I'd like to this, but I want to say that I completely feel your pain about the mind-boggling way that physicists talk about vectors and tensors. I didn't understand what was really going on until I picked up just the right math book (Analysis on Manifolds by Munkres) and got a proper description of what a tensor is.
...and FWIW I'm an experimental physicist, not a mathematician. — DanielSank, Sep 11 '18 at 07:53
Especially with proofs. I have a hard time dealing with what I'm not supposed to know considering how vaguely defined most things are. I'm checking out Munkres atm — Captain Morgan, Sep 11 '18 at 08:00
Replace "transform like the 4-position" with "transform with the $\Lambda$ matrices" under a change of coordinates. — gented, Sep 11 '18 at 08:00
This seems more strange to me. Again, can I not just transform anything then? What does it mean to "transform with?" — Captain Morgan, Sep 11 '18 at 08:11
Beware that Munkres only really talks about antisymmetric tensors. That book is not a complete discussion of tensors. I just like it because it emphasizes the algebraic structure of tensors which is much better (IMHO) than the "transforms like" definition used in physics. — DanielSank, Sep 11 '18 at 08:42
Related (but unfortunately with an incorrect accepted answer): Proof that four-potential is a four-vector. — Emilio Pisanty, Sep 11 '18 at 12:26

score 3 · Answer 1 · answered Sep 11 '18 at 12:25

Couldn't I just slap $\Lambda$ on anything and say "oh look it transformed"?

You could, but that's not what that statement is doing. Using the example you chose, $A^\mu$ isn't just some tuple of real numbers. It is a very specific combination $(\phi,\vec A)$ of pre-existing concepts, namely the electromagnetic scalar and vector potentials.

When you say that $A^\mu$ transforms like a vector, you are making a nontrivial statement about the scalar and vector potentials that a moving observer will require to explain a given field configuration, and you're making the affirmative claim that the scalar potential will acquire terms of the form $\gamma\vec v \cdot \vec A$ and vice versa, the same way that a four-position does.

Of course, that needs to be proved separately, with the details of the proof depending on what definition you've chosen for $A^\mu$, but if the object itself carries nontrivial meaning then its transformation will do so too.
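
As a concrete sketch of what that claim looks like, assume SI units with $A^\mu = (\phi/c, \vec A)$ and a boost with velocity $v$ along the $x$-axis, with $\beta = v/c$ and $\gamma = 1/\sqrt{1-\beta^2}$ (these conventions are assumptions chosen here for illustration; the answer does not fix them). The statement "$A^\mu$ transforms like the four-position" then reads:

$$\phi'/c = \gamma\left(\phi/c - \beta A_x\right), \qquad A'_x = \gamma\left(A_x - \beta\,\phi/c\right), \qquad A'_y = A_y, \qquad A'_z = A_z,$$

which has the same form as $ct' = \gamma(ct - \beta x)$, $x' = \gamma(x - \beta\, ct)$, $y' = y$, $z' = z$. The $-\gamma\beta A_x$ term in $\phi'$ is the mixing of $\vec v \cdot \vec A$ into the scalar potential mentioned above.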

Emilio Pisanty


I'm not talking about any specific quantities. $A^\mu$ is commonly used (probably because of the vector potential), but I mean the question in a purely mathematical, arbitrary sense, apart from the relativistic framework. – Captain Morgan Sep 11 '18 at 21:28
Then you haven't provided enough of a definition for $A$ to say anything useful about it. – Emilio Pisanty Sep 11 '18 at 21:29
I believe this is the point I'm coming from. The equation I used (with $A$) is what I've seen in most places as the definition of a 4-vector, without any clarification of what $A$ is. In terms of vector spaces, 4-vectors are a subspace of the general 4-dimensional space because of their properties and operations. What are the defining criteria, though? What defines this subspace? – Captain Morgan Sep 11 '18 at 21:34
That varies across the literature. There are several different, mutually-incompatible, valid approaches. Without a specific claim to respond to, this is extremely hard to answer. – Emilio Pisanty Sep 11 '18 at 21:46

score 3 · Accepted Answer · answered Sep 11 '18 at 23:38

Suppose you have $4$ physical quantities $U_0$, $U_1$, $U_2$, and $U_3$. Given a well defined physical system $S$, these quantities are well defined. If you then perform a Lorentz transform, the system is considered from the point of view of another observer. Interpreted this way, we call this a passive transform. But you can just as well attribute the change due to the Lorentz transform to a change in the system. We call this an active transform. The two interpretations are equivalent, because what the other observer in the passive case sees is going to be assessed by that observer in the same way as how the original observer would have assessed it, had he/she seen the same thing.

Then, since the $U_j$ are well defined functions of the system, any change in the system induced by the Lorentz transform, interpreted in the active way, defines how the functions $U_j$ will change. The way the $U_j$ change is thus well defined; there is no a priori assumption that these quantities will transform like a 4-vector. We could, for example, have chosen $4$ quantities that each transform like a scalar.

The quotient theorem states that if $V^{\mu}U_{\mu}$ transforms like a scalar for any arbitrary four-vector $V^{\mu}$, then $U_{\mu}$ must itself transform like a (covariant) four-vector. The proof starts by writing down just that fact:

$$V'^{\mu}U'_{\mu} = V^{\mu}U_{\mu}$$

And then you insert the Lorentz transform rule for the transform of $V^{\mu}$:

$$\Lambda^{\mu}_{\hphantom{\nu}\nu}V^{\nu}U'_{\mu} = V^{\mu}U_{\mu}$$

Then since this must hold for any arbitrary $4$-vector $V^{\mu}$, we can consider this for the particular case where $V^{\mu}$ is the unit vector pointing in some arbitrary $\rho$-direction, i.e. $V^{\mu} = \delta^{\mu}_{\hphantom{\rho}\rho}$:

$$\Lambda^{\mu}_{\hphantom{\nu}\nu}\delta^{\nu}_{\hphantom{\rho}\rho}U'_{\mu} = \delta^{\mu}_{\hphantom{\rho}\rho}U_{\mu}$$

Simplifying both sides, yields:

$$\Lambda^{\mu}_{\hphantom{\nu}\rho}U'_{\mu} = U_{\rho}$$

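This is the inverse transform. Contracting both sides with the inverse Lorentz matrix, written here as $\Lambda_{\mu}^{\hphantom{\nu}\rho} \equiv (\Lambda^{-1})^{\rho}_{\hphantom{\rho}\mu}$, gives the transform from $U_{\mu}$ to $U'_{\mu}$: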

$$U'_{\mu} = \Lambda_{\mu}^{\hphantom{\nu}\rho}U_{\rho}$$
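
The result can be checked numerically. The following is a minimal sketch in plain Python, assuming a boost along $x$ with $\beta = 0.6$ and arbitrarily chosen components for $V$ and $U$ (the helper names `boost_x`, `transform_vector`, `transform_covector` and `contract` are hypothetical, introduced only for this illustration). It transforms $V^\mu$ with $\Lambda$ and $U_\mu$ with the inverse matrix, and compares the contraction before and after.

```python
import math

def boost_x(beta):
    """Boost along x with speed beta (in units of c), stored as lam[mu][nu] = Lambda^mu_nu."""
    gamma = 1.0 / math.sqrt(1.0 - beta * beta)
    return [
        [gamma, -gamma * beta, 0.0, 0.0],
        [-gamma * beta, gamma, 0.0, 0.0],
        [0.0, 0.0, 1.0, 0.0],
        [0.0, 0.0, 0.0, 1.0],
    ]

def transform_vector(lam, v):
    """V'^mu = Lambda^mu_nu V^nu."""
    return [sum(lam[mu][nu] * v[nu] for nu in range(4)) for mu in range(4)]

def transform_covector(lam_inv, u):
    """U'_mu = (Lambda^{-1})^rho_mu U_rho, i.e. the rule derived above."""
    return [sum(lam_inv[rho][mu] * u[rho] for rho in range(4)) for mu in range(4)]

def contract(v, u):
    """V^mu U_mu, with V holding upper and U holding lower components."""
    return sum(a * b for a, b in zip(v, u))

beta = 0.6                     # assumed boost speed
lam = boost_x(beta)
lam_inv = boost_x(-beta)       # the inverse of a boost is the boost with opposite velocity

V = [2.0, 1.0, -3.0, 0.5]      # arbitrary upper components V^mu
U = [0.7, -1.2, 4.0, 2.5]      # arbitrary lower components U_mu

V_new = transform_vector(lam, V)
U_new = transform_covector(lam_inv, U)

invariant_before = contract(V, U)
invariant_after = contract(V_new, U_new)

# For contrast: transforming U with Lambda itself does not preserve the contraction.
naive_after = contract(V_new, transform_vector(lam, U))

print(invariant_before, invariant_after, naive_after)
```

The first two printed numbers agree up to rounding, while the third does not, which is the numerical counterpart of the statement that $U_\mu$ has to transform with the inverse matrix.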

What is the justification for using $\delta^\mu_{\ \rho}$, a rank 2 tensor, in place of the arbitrary rank 1 tensor $V^\mu$? — Captain Morgan, Sep 13 '18 at 16:59
We keep $\rho$ fixed in the definition of $V^{\mu}$, so it is a rank 1 tensor, as you can easily see by checking the transformation rule. Note that the components of the second rank Kronecker delta stay invariant under Lorentz transforms (also easy to verify), and this is clearly not the way a unit vector will transform. — Count Iblis, Sep 13 '18 at 18:17

OkThen · Answer 3 · 2018-09-11T12:19:24.277

Vectors are elements of linear spaces. And every linear space has a basis. A 4-vector simply means a vector in a 4-dimensional space. The expression

$$ A'^{\mu} = \Lambda^{\mu}_{\phantom{\mu}\nu} A^{\nu} $$

is a change of basis.

For completeness, let me elaborate on this last point. Fix two bases in this 4-dimensional vector space, $e_{\mu}$ and $\tilde{e}_{\alpha}$. This means that every vector $A$ can be written as

$$ A = A^{\mu} e_{\mu} \quad \text{or} \quad A = \tilde{A}^{\alpha}\tilde{e}_{\alpha} $$

The real (or complex) numbers $A^{\mu}$ and $\tilde{A}^{\alpha}$ are called coordinate representations of $A$. They represent the same object $A$, each just written in a different basis.

You can now consider a change of basis, that is, a linear transformation on this vector space that takes the basis $e$ to the basis $\tilde{e}$:

$$ \tilde{e}_{\alpha} = \Lambda^{\phantom{\alpha}\mu}_{\alpha} e_{\mu} $$

so that the following equality holds

$$ A^{\mu} e_{\mu} = \tilde{A}^{\alpha} \tilde{e}_{\alpha} = \tilde{A}^{\alpha}\Lambda^{\phantom{\alpha}\mu}_{\alpha} e_{\mu} $$

implying

$$ A^{\mu} = \Lambda^{\phantom{\alpha}\mu}_{\alpha} \tilde{A}^{\alpha} $$
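
To connect this with the form used in the question, one can invert the last relation. Writing $(\Lambda^{-1})^{\phantom{\mu}\beta}_{\mu}$ for the inverse matrix (notation assumed here, defined by $\Lambda^{\phantom{\alpha}\mu}_{\alpha}(\Lambda^{-1})^{\phantom{\mu}\beta}_{\mu} = \delta^{\beta}_{\alpha}$) and contracting both sides with it gives

$$ \tilde{A}^{\beta} = (\Lambda^{-1})^{\phantom{\mu}\beta}_{\mu} A^{\mu} $$

so the components transform with the inverse of the matrix that transforms the basis vectors, which is what keeps the object $A = A^{\mu} e_{\mu}$ itself unchanged.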

Comment. I think a good reference for this might be Schutz's book on general relativity, chapter 2.

edited Sep 11 '18 at 12:19

answered Sep 11 '18 at 12:10

OkThen

I'm not sure if I agree with all 4-vectors being 4-dimensional vectors. From what I understand, and correct me if I'm wrong, 4-vectors are a specific category of 4-dimensional vectors. – Captain Morgan Sep 11 '18 at 21:31
@CaptainMorgan It's not that 4-vectors are a specific category of 4-dimensional vectors; you can choose the coordinates of any point in $\mathbb{R}^4$ as the components of a 4-vector in that basis, so every 4-dimensional vector can be a 4-vector. What's restricted, in relativity, is the set of changes of basis you're allowed to apply to the vector space; namely, instead of being allowed to use any linear transformations (aka "elements of $GL(4)$"), we are restricted to a specific subgroup of linear transformations (aka "elements of $O(1,3)$," aka "elements of the Lorentz group"). – probably_someone Sep 11 '18 at 23:54
@probably-someone, so with hardware and tools as an analogy to vectors and transformations, it's not a question of having the correct type of hardware, but of having the correct tool instead? Effectively, is what you're saying that I can define any 4-dimensional vector to be a 4-vector, as long as I only perform Lorentz transformations on it? – Captain Morgan Sep 12 '18 at 00:02
No, 4-vectors are not a specific category of 4-dimensional vectors. You are playing with names and probably overthinking this. @probably_someone elaborated very precisely on the role of Lorentz transformations in all of this. – OkThen Sep 12 '18 at 04:07

score 1 · Answer 4 · answered Sep 12 '18 at 00:55

So I know you already accepted an answer but in my opinion this is very important and is not discussed with our undergraduates enough:

The notion of "is a tensor" as we use it in physics is generally syntactic, not semantic.

That means that it is not a physical object which is a four-vector or not a four-vector, rather it is a vector equation which is either covariant or non-covariant, and the easiest way to write it in a covariant way is if all of the constituent entities "are tensors."

Here's what I mean in more detail: technically you have a geometrical space, and the inhabitants of that space are the true, semantic, $[m, n]$-tensors. There are a set of "scalars"¹ and atop that are defined your "vectors"² and atop that you can define coordinate systems³ and covectors and $[m,n]$-tensors in general⁴. That's where the "real" tensors live.

But when I say "this is a tensor" in physics what I mean is that this expression singles out one and exactly one tensor in the geometrical space. If it does, then that physical quantity "is a tensor," and if it does not, then it is not.

This is why we can say "a vector is anything that transforms like a vector." We mean: if you shift from coordinates $C$ to coordinates $C'$ in the geometrical space, we know how its vectors' components mix together. If an assortment of measurable numbers happens to mix together in the same way, then it can be associated with exactly one of these tensors, and in that sense the assortment "is a tensor."

So the easiest example, though it may reach into a course you have not yet had, is a Christoffel symbol. A Christoffel symbol is a part of differential geometry which helps us take derivatives in curved spaces. A symbol like $\Gamma^a_{bc}$ certainly looks like a [1, 2]-tensor. It has numeric components just like one! Why is it famously "not a tensor"?

It's because: there does exist a tensor which has those components in the present coordinate system, and you can calculate what those components of that tensor must be in a transformed coordinate system, but if you derive the Christoffel symbol of that other coordinate system, it will not have those transformed components. So yes, the Christoffel symbol in some coordinate system $A$ happens to coincide with a tensor, but if you shift to a different coordinate system $B$ then you will discover that it was indeed just a coincidence that that particular geometrical entity was your $\Gamma$. The abstract notion of "Christoffel symbol" is defined in a way such that it might be embodied by a number of different tensors depending on the coordinate system, and that is why it is "not a tensor".
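
For reference, the standard transformation law makes the mismatch explicit. In the usual notation (coordinates $x^a$ and $x'^a$, which the answer above does not spell out), the Christoffel symbols of the new coordinate system are

$$\Gamma'^{a}_{bc} = \frac{\partial x'^{a}}{\partial x^{d}}\frac{\partial x^{e}}{\partial x'^{b}}\frac{\partial x^{f}}{\partial x'^{c}}\,\Gamma^{d}_{ef} + \frac{\partial x'^{a}}{\partial x^{d}}\frac{\partial^{2} x^{d}}{\partial x'^{b}\,\partial x'^{c}}.$$

The first term is exactly how a $[1,2]$-tensor would transform; the second, inhomogeneous term is the discrepancy. A simple illustration is the flat plane: in Cartesian coordinates every $\Gamma^a_{bc}$ vanishes, and a tensor with all components zero stays zero in any coordinates, yet in polar coordinates one finds $\Gamma^{r}_{\theta\theta} = -r$ and $\Gamma^{\theta}_{r\theta} = 1/r$.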

Do you see what I mean when I say that it is a syntactic concept? The equation does single out a set of numbers and that set of numbers is some tensor, the problem is that in different coordinate systems the same equation singles out a different entity and hence that expression is not a tensor expression.

So special relativity says when you accelerate towards a clock it appears to tick faster, proportional to both its distance to you and your acceleration. This is the only fundamental fact which special relativity adds to our physics; everything else can be derived from it. We happen to have a 4D Minkowski space where the abstract geometrical entities obey Lorentz transforms preserving a metric $\operatorname{diag}(1, -1, -1, -1)$. And if we work it out, the assembly of components $(ct, x, y, z)$ will, if you control for the fact that the geometric space doesn't know what "units" are, correspond to a single geometric entity in that space: if you transform those coordinates with this rule from special relativity, you will find that the new position and time components match the relativistic components. And thus we say that these components "are a four-vector."
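
The claim that Lorentz transforms preserve the metric $\operatorname{diag}(1, -1, -1, -1)$ can be checked numerically. This is a minimal sketch in plain Python, assuming a boost along $x$ with $\beta = 0.8$ and an arbitrarily chosen event $(ct, x, y, z)$; the names `boost_x`, `interval` and `ETA` are hypothetical helpers introduced only for this illustration.

```python
import math

ETA = [1.0, -1.0, -1.0, -1.0]   # diagonal entries of the metric diag(1, -1, -1, -1)

def boost_x(beta):
    """Boost along x with speed beta (in units of c), stored as lam[mu][nu] = Lambda^mu_nu."""
    gamma = 1.0 / math.sqrt(1.0 - beta * beta)
    return [
        [gamma, -gamma * beta, 0.0, 0.0],
        [-gamma * beta, gamma, 0.0, 0.0],
        [0.0, 0.0, 1.0, 0.0],
        [0.0, 0.0, 0.0, 1.0],
    ]

def interval(x):
    """eta_{mu nu} x^mu x^nu for a diagonal metric."""
    return sum(ETA[mu] * x[mu] * x[mu] for mu in range(4))

lam = boost_x(0.8)               # assumed boost speed
event = [5.0, 3.0, 1.0, -2.0]    # arbitrary components (ct, x, y, z)
event_new = [sum(lam[mu][nu] * event[nu] for nu in range(4)) for mu in range(4)]

s2_before = interval(event)
s2_after = interval(event_new)

# Lambda^T eta Lambda, which should reproduce eta itself.
metric_check = [
    [sum(lam[a][mu] * ETA[a] * lam[a][nu] for a in range(4)) for nu in range(4)]
    for mu in range(4)
]

print(s2_before, s2_after)
print(metric_check)
```

The two intervals agree up to rounding, and `metric_check` comes out as the original diagonal metric, which is the sense in which the components $(ct, x, y, z)$ behave like a single geometric object in Minkowski space.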

¹ You can get special relativity from general relativity in a very boring limit. In general relativity you have some abstract space of "points" $\mathcal M$ and you must define a set of real-valued scalar fields $\mathcal S \subset \mathcal M \to \mathbb R$, which must be "smooth" in the sense that the set is closed under what I call "$k$-functors": functions from $C^\infty(\mathbb R^k, \mathbb R)$ interpreted as acting "pointwise" on the output, e.g. for $k=2$ we'd have $f[s_1, s_2](p) = f\big(s_1(p),~s_2(p)\big)$. This set also defines your topology, hence how the space is connected. Note that this gives closure under pointwise addition and multiplication.
² In general relativity the vector fields are the Leibniz-linear maps $\mathcal V\subset \mathcal S\to\mathcal S$. So if $V$ is a vector field, this "Leibniz-linear" term means that for any $k$-functor $f$, using $\bullet^{(i)}$ to mean "partial derivative of $\bullet$ with respect to its $i^\text{th}$ argument", we would say $$V\Big(f[s_1, s_2, \dots s_k]\Big) = \sum_{i=1}^k f^{(i)}[s_1\dots s_k]\cdot V(s_i).$$
³ You technically need to assume a coordinate system in GR. Formally the axiom says that around any point there exists a neighborhood and $n$ scalar fields $c_{1,2,\dots n}$ such that any scalar field can, within that neighborhood, be written as an $n$-functor $f[c_1,\dots c_n].$ Then one can uniquely identify a vector as a directional derivative with components $v_i = V(c_i).$ Those components are always scalar fields, mind.
⁴ A covector is a linear map from vectors to scalars, $\operatorname{Hom}(\mathcal V \to \mathcal S)$ or however you want to notate it. An $[m, n]$-tensor is a multilinear map from $m$ covectors and $n$ vectors to a scalar. There is an axiom stating that there exists a metric $[0,2]$-tensor and a $[2, 0]$-tensor inverse to it, providing a bijection between the vector space and the covector space and more generally between all $[m,n]$-tensors with the same $m+n$. In addition to this, one needs an axiom that any $[n, 0]$-tensor can be written as some big sum of products of vectors, so that the space of tensors is not substantially more interesting than the products of the spaces of vectors and covectors.

This is a lot to soak in. Thank you for the answer. I need to think about this some more. — Captain Morgan, Sep 12 '18 at 03:15
