2

I am looking back over some old notes and see that I have written

$\bar{p}=\begin{pmatrix}E\\ \vec p\end{pmatrix}$ and $\bar{x}=\begin{pmatrix}t\\ \vec x\end{pmatrix}$

(using Planck units)

And then $\bar{p} \cdot \bar{x}=-Et+ \vec{p} \cdot\vec{x}$

But I can't find an explanation anywhere as to why the first term has a minus sign in front. It looks a lot to me like $(\Delta s)^2 = -(c\Delta t)^2 + (\Delta x)^2$ for Minkowski spacetime...

Would appreciate a (relatively) simple explanation of this- I'm still in high school, so my math knowledge is quite limited!

EDIT: Also, I have seen this post where the dot product is $Et-\vec{p} \cdot \vec{x}$ rather than $-Et+\vec{p} \cdot \vec{x}$ and in fact have actually come across both being used. Even when it comes to distance in spacetime, I have seen both $(\Delta s)^2 = -(c\Delta t)^2 + (\Delta x)^2$ and $(\Delta s)^2 = (c\Delta t)^2 - (\Delta x)^2$. I would also be grateful if someone had an explanation for this.

Qmechanic
  • 201,751
Meep
  • 3,959

4 Answers

3

That's the definition of the dot product in Minkowski space-time.

To be clear, any space-time is endowed with a metric. The standard ${\mathbb R}^3$ that you may be familiar with has the metric $\delta_{ij} = \text{diag}(1,1,1)$, $i,j=1,2,3$. Given two vectors $\vec{v} = (v^1,v^2,v^3)$, or $v^i$ for short, and similarly $w^i$, the dot product is defined as $$ \vec{v} \cdot \vec{w} : = \sum_{i,j=1}^3 \delta_{ij} v^i w^j = v^1 w^1 + v^2 w^2 + v^3 w^3 $$ Note that the metric $\delta_{ij}$ is also the same quantity that defines the distance between points in ${\mathbb R}^3$, i.e. given two points $(x_1,x_2,x_3)$ and $(y_1,y_2,y_3)$, the squared Euclidean distance between them is $$ (\Delta s)^2 = (x_1 - y_1)^2 + (x_2-y_2)^2 + (x_3-y_3)^2 = \sum_{i,j=1}^3\delta_{ij} (\Delta x)^i (\Delta x)^j $$

In Minkowski spacetime, the metric is $\eta_{\mu\nu}= \text{diag}(-1,1,1,1)$ where $\mu,\nu = 0,1,2,3$. The dot product of two vectors $v^\mu$ and $w^\mu$ is now defined similarly: $$ v \cdot w = \sum_{\mu,\nu=0}^3 \eta_{\mu\nu}v^\mu w^\nu = - v^0 w^0 + v^1 w^1 + v^2 w^2 + v^3 w^3 $$ As before, the metric is also the same quantity that defines distances in Minkowski space-time: $$ (\Delta s)^2 = - (\Delta t)^2 + (\Delta x)^2 = \sum_{\mu,\nu=0}^3 \eta_{\mu\nu} (\Delta x)^\mu (\Delta x)^\nu $$
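To make the definition concrete, here is a minimal Python sketch of this dot product in the $(-,+,+,+)$ convention with $c=1$. The function name `minkowski_dot` and the sample components of `p` and `x` are made up for illustration; they do not come from the question.

```python
# Diagonal of the Minkowski metric in the (-, +, +, +) convention, c = 1.
ETA_DIAG = (-1.0, 1.0, 1.0, 1.0)

def minkowski_dot(v, w):
    """Return sum_mu eta_mumu * v^mu * w^mu for two four-vectors.

    Because the metric is diagonal, the double sum over mu and nu
    collapses to a single sum. The time component comes first.
    """
    if len(v) != 4 or len(w) != 4:
        raise ValueError("expected two four-vectors (time component first)")
    return sum(eta * a * b for eta, a, b in zip(ETA_DIAG, v, w))

# Hypothetical sample values: p = (E, px, py, pz), x = (t, x, y, z).
p = (5.0, 3.0, 0.0, 4.0)
x = (2.0, 1.0, 7.0, 1.0)

print(minkowski_dot(p, x))  # -E*t + p.x = -10 + 3 + 0 + 4 = -3.0
print(minkowski_dot(p, p))  # -25 + 9 + 0 + 16 = 0.0
```

The second result is zero because the sample `p` was chosen with $E = |\vec p|$, as for a massless particle.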

Aside:

  1. In some texts, the metric is defined as $\eta_{\mu\nu}= \text{diag}(1,-1,-1,-1)$

  2. A summation convention is often employed wherein, if an index is repeated, once up and once down, then by convention this index is summed over (Einstein's notation). For instance $$ \eta_{\mu\nu} v^\mu w^\nu := \sum_{\mu,\nu=0}^3 \eta_{\mu\nu}v^\mu w^\nu $$ Indices are never supposed to be repeated with both down or both up; for example, $\eta_{\mu\mu}$ doesn't make sense. However, some authors use this sloppy notation as well.

Prahar
  • 25,924
  • Thank you for your reply! So it seems to me that there is no intuitive 'explanation' as such; it just happens to work out? Also, in your first 'aside' point, you mentioned a point that I added to my original question as you were writing your answer! But I'm still not too sure about this, because then does this not mean that $(\Delta s)^2=-(\Delta s)^2$ ? Thank you :) – Meep Nov 15 '14 at 15:39
  • 1
    @21joanna12 - The explanation is that the same quantity (the metric) governs dot products of vectors and the line element and therefore the two quantities must have the same form. Secondly, your last statement is wrong. It is simply the case that various authors use different conventions for their definition of quantities. The line element in one convention is negative of the line element in another convention. You can choose whichever convention you want and stick to it, but you shouldn't try to compare equations written in different conventions. – Prahar Nov 15 '14 at 15:47
2

It is Minkowski spacetime. Both $\bar{p}=\left(\begin{array}{c}E\\{\vec p}\end{array}\right)$ and $\bar{x}=\left(\begin{array}{c}t\\ \vec x\end{array}\right)$ are invariant 4-vectors in spacetime. A 4-vector is composed of a time component and a spatial component (with 3 sub-components related to ordinary 3-space). By invariant we mean that their magnitudes ($\bar{p}\cdot\bar{p}$ or $\bar{x}\cdot\bar{x}$) are independent of the choice of inertial reference frame.

$\bar{p}\cdot\bar{p} = \bar{p}'\cdot\bar{p}' = -E^2 + p_x^2+p_y^2+p_z^2$ OR $E^2 - p_x^2-p_y^2-p_z^2$, depending on the sign convention you want to use.

The basic idea behind the minus sign in the dot product comes from analyzing how light signals would travel between two events. One principle of relativity is that light (electromagnetic radiation) has an invariant speed; all observers measure c.

Imagine two events, 1 and 2. One observer sees them happen at $\vec{x}_1$ and $\vec{x}_2$ at times $t_1$ and $t_2$. If a light signal starts at event 1 at $t_1$ and travels toward event 2, it will be at spatial location $c(t_2-t_1)+\vec{x}_1=c(\Delta t)+\vec{x}_1$ when event 2 happens at time $t_2$. In Minkowski spacetime, the time coordinate is independent of the spatial coordinates, so to find the spacetime spacing between the two events in this reference frame we use the squared Pythagorean length

$(t_2-t_1)^2+(\vec{x_2}-\vec{x_1})^2=(\Delta t)^2+(\Delta\vec{x})^2.$

The spacetime length (squared) of the light signal motion (called a worldline) is

$(t_2 - t_1)^2 + (c \Delta t +\vec{x}_1 - \vec{x}_1)^2 = (\Delta t)^2 + (c\Delta t)^2.$

The square distance between the light-signal and the event spacing is therefore

$(\Delta\vec{x})^2-(c\Delta t)^2$ or $(c\Delta t)^2-(\Delta\vec{x})^2$, depending on which difference you care to take; it doesn't matter. What does matter is that this Minkowski spacing between the light signal travel and the event separation must be the same in all inertial reference frames. And that's where the minus sign originates.

It eventually leads to the definition of the 4-vector "dot product" operation, which says that the time components are multiplied together, the spatial components are dotted together like 3-space vectors, and the difference of these two products is taken.

What's convenient about this definition is that the 4-vector dot product of ANY 2 properly constructed 4-vectors is an invariant quantity.

ASIDE: Some writers write the 4-vector with the spatial components as the top 3 members and the time component as the bottom member.

A nice feature of the 4-vector notation is that the Lorentz transform can be expressed as a matrix multiplication. The Lorentz matrix, $\mathcal{L}$ is $4\times 4$. The Lorentz transform of a 4-vector would be $\bar{p}'=\mathcal{L}\bar{p}$.
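Both points can be checked numerically. The following is a minimal Python sketch, assuming a boost along the $x$ axis with speed $\beta = 0.6$ in units where $c = 1$, the $(-,+,+,+)$ sign convention, and made-up sample components for $\bar p$ and $\bar x$. The helper names `boost_x`, `apply`, `minkowski_dot` and `euclidean_dot` are illustrative only.

```python
import math

def boost_x(beta):
    """4x4 Lorentz matrix for a boost along x with speed beta (c = 1)."""
    gamma = 1.0 / math.sqrt(1.0 - beta * beta)
    return [
        [gamma, -gamma * beta, 0.0, 0.0],
        [-gamma * beta, gamma, 0.0, 0.0],
        [0.0, 0.0, 1.0, 0.0],
        [0.0, 0.0, 0.0, 1.0],
    ]

def apply(matrix, vec):
    """Matrix times column vector: (L v)^mu = sum_nu L^mu_nu v^nu."""
    return [sum(row[k] * vec[k] for k in range(4)) for row in matrix]

def minkowski_dot(v, w):
    """Minus the product of time components plus the ordinary 3D dot product."""
    return -v[0] * w[0] + sum(a * b for a, b in zip(v[1:], w[1:]))

def euclidean_dot(v, w):
    """Plain 4D dot product with all plus signs, for comparison."""
    return sum(a * b for a, b in zip(v, w))

# Hypothetical sample 4-vectors, time component first.
p = [5.0, 3.0, 0.0, 4.0]
x = [2.0, 1.0, 7.0, 1.0]

L = boost_x(0.6)
p_prime, x_prime = apply(L, p), apply(L, x)

# The Minkowski product agrees in both frames (up to rounding) ...
print(minkowski_dot(p, x), minkowski_dot(p_prime, x_prime))
# ... while the all-plus product does not.
print(euclidean_dot(p, x), euclidean_dot(p_prime, x_prime))
```

The comparison with `euclidean_dot` is the point of the exercise: without the relative minus sign, the result depends on the frame.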

Bill N
  • 15,410
1

In Minkowski spacetime you assign four coordinates to your events: $x=(x^0,x^1,x^2,x^3)$ - in this notation $x^0 =ct$ and the other three coordinates are the spatial ones. Suppose these are the coordinates of a point in spacetime reached by a light ray which started at the origin, then you have that:

$(x^0)^2=c^2 t^2 = \sum_{i=1}^3 (x^i)^2 \Rightarrow s^2 = -(x^0)^2 + (x^1)^2 + (x^2)^2 + (x^3)^2 =0$

Because $c$ is exactly the speed of light and therefore $ct$ is the distance travelled by the light ray. Now, if you change your coordinate system to a new one (inertial, of course) in such a way that the origin of this new system is also the origin of the old one, you know that:

$(s')^2 = -(x'^{0})^2 + (x'^{1})^2 + (x'^{2})^2 + (x'^{3})^2 = 0$

where $x'^{\mu}$ are the coordinates in the new system. This is because the light ray also starts at the origin of the new system and its speed is also $c$ (remember, the laws of Physics are the same in the two systems, that's relativity!). The thing now becomes a little bit harder, but the main idea is that this relation also holds for general events, not only for those with $s^2=0$. What I'm saying is that if you have the coordinates of an event $x^{\mu}$ and you define $\eta_{\mu \nu} = \text{diag} \{-1,1,1,1\}$, the quantity:

$$s^2 = \sum_{\mu, \nu = 0}^{3} \eta_{\mu \nu} x^{\mu} x^{\nu} $$

is the same in every inertial system (with the same origin, but this is a minor point). I won't prove this since the details are quite technical and they add nothing but complicated calculations, but it can be done using the fact that every event with $s^2=0$ also has $(s')^2=0$ in a new coordinate system related to the old one by a linear transformation (you want your transformation between inertial systems to be linear to preserve rectilinear motions of free particles). So if you transform your coordinates with a Lorentz transformation so that your event is now at the point $x'^{\mu}$, then:

$$(s')^2 = \sum_{\mu, \nu = 0}^{3} \eta_{\mu \nu} x'^{\mu} x'^{\nu} = s^2 $$
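For the special case of a boost along the $x^1$ axis, the check is short enough to sketch. This assumes the standard form of the boost, with $\beta = v/c$ and $\gamma = 1/\sqrt{1-\beta^2}$ (symbols not used elsewhere in this answer):

$$x'^0 = \gamma\,(x^0 - \beta x^1), \qquad x'^1 = \gamma\,(x^1 - \beta x^0), \qquad x'^2 = x^2, \qquad x'^3 = x^3$$

Squaring and combining the first two with opposite signs gives

$$-(x'^0)^2 + (x'^1)^2 = \gamma^2\left[-(x^0)^2 + 2\beta x^0 x^1 - \beta^2 (x^1)^2 + (x^1)^2 - 2\beta x^0 x^1 + \beta^2 (x^0)^2\right] = \gamma^2 (1-\beta^2)\left[-(x^0)^2 + (x^1)^2\right] = -(x^0)^2 + (x^1)^2$$

The cross terms $\pm 2\beta x^0 x^1$ cancel only because the two squares enter with opposite signs, and the leftover factor $\gamma^2(1-\beta^2)$ equals 1.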

Now you can imagine the reason for that minus sign: it is the only way we have in Minkowski spacetime to create an invariant, a quantity which does not depend on the particular choice of the (inertial) coordinate system. In addition, every four-vector transforms like the coordinates, so in your example with $p=(\frac{E}{c}, \vec{p})$, the quantity $p^2 = -\frac{E^2}{c^2} + \vec{p}^2$ is also independent of the inertial system. This idea generalizes to products of different vectors, in the sense that the dot product of four-vectors like the one you wrote:

$$ x \cdot p = -\frac{E}{c}\,ct + \vec{x} \cdot \vec{p} = -Et + \vec{x} \cdot \vec{p} = \sum_{\mu, \nu = 0}^{3} \eta_{\mu \nu} x^{\mu} p^{\nu} $$

is also the same in every inertial system.

In addition, it should be clear now why we can define $\eta_{\mu \nu}$ with two conventions: $(-,+,+,+)$ or $(+,-,-,-)$. In both cases we can define an inner product of four-vectors which gives us an invariant (the only difference is the sign of this invariant).
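A small Python sketch of the two conventions side by side, with $c = 1$ and made-up sample values (the names `dot`, `MOSTLY_PLUS` and `MOSTLY_MINUS` are illustrative only):

```python
def dot(v, w, signature):
    """Inner product for a diagonal metric given as a tuple of +1/-1 entries."""
    return sum(s * a * b for s, a, b in zip(signature, v, w))

MOSTLY_PLUS = (-1, 1, 1, 1)      # the (-,+,+,+) convention
MOSTLY_MINUS = (1, -1, -1, -1)   # the (+,-,-,-) convention

# Hypothetical sample four-vectors, time component first.
p = (5.0, 3.0, 0.0, 4.0)
x = (2.0, 1.0, 7.0, 1.0)

a = dot(p, x, MOSTLY_PLUS)    # -E*t + p.x
b = dot(p, x, MOSTLY_MINUS)   #  E*t - p.x
print(a, b)                   # -3.0 3.0
```

The two numbers differ only by an overall sign, so either convention gives an invariant. This is also why equations written in different conventions should not be mixed.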

Alex V.
  • 712
1

There are already a few answers that explain the mathematics behind it, but since you've said that your "math knowledge is quite limited" I'll try and break it down into simpler terms.

You're already familiar with a 3D vector dot product, and it seems your confusion arises from dot products of a four-vector. Now what they didn't teach you when they taught you about dot products was that the 2D/3D case of a vector dot product is just a specific example of a process called an inner product. (The Wikipedia page is quite densely mathematical, but possibly worth a read).

These inner products are products of two vectors (say four-vectors) multiplied by what is called the "metric" of the spacetime. The metric tells the inner product how to behave.

So what that means is this - If you have two four vectors $x$ and $y$, then using the metric (traditionally $\eta$ in special relativity), the dot product will be defined as follows:

$$\bar x \cdot \bar y = \sum_{n=1}^4 \sum_{m=1}^4 \eta_{nm}x_n y_m$$

where $n$ and $m$ run over the components of the four-vectors.

$\eta$ here is defined as (where $c = 1$)

$$ \eta = \begin{pmatrix}-1&0&0&0\\0&1&0&0\\0&0&1&0\\0&0&0&1 \end{pmatrix}$$

if $c \neq 1$

$$ \eta = \begin{pmatrix}-c^2&0&0&0\\0&1&0&0\\0&0&1&0\\0&0&0&1 \end{pmatrix}$$

It is because of this form of the metric that you're getting that funny minus sign. The pattern of signs in the metric is called its signature, and you'll see this signature in the dot product of any two four-vectors.
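If it helps to see the double sum spelled out, here is a minimal Python sketch using the $c \neq 1$ form of the metric, with four-vectors written as $(t, x, y, z)$ in SI units. The names `metric` and `inner` and the sample event are made up for illustration, and the indices run from 0 to 3 rather than 1 to 4 because Python counts from zero.

```python
C = 299_792_458.0  # speed of light in m/s

def metric(c):
    """The 4x4 matrix eta for four-vectors written as (t, x, y, z)."""
    return [
        [-c * c, 0.0, 0.0, 0.0],
        [0.0, 1.0, 0.0, 0.0],
        [0.0, 0.0, 1.0, 0.0],
        [0.0, 0.0, 0.0, 1.0],
    ]

def inner(x, y, eta):
    """Double sum over n and m of eta[n][m] * x[n] * y[m]."""
    return sum(eta[n][m] * x[n] * y[m] for n in range(4) for m in range(4))

eta = metric(C)

# Hypothetical event: where a light ray sent from the origin along x is after 2 s.
light_event = [2.0, 2.0 * C, 0.0, 0.0]
print(inner(light_event, light_event, eta))  # zero: the minus sign cancels the spatial part

# Hypothetical event: 1 s later at the same place as the origin.
rest_event = [1.0, 0.0, 0.0, 0.0]
print(inner(rest_event, rest_event, eta))    # equals -C**2
```

Only the diagonal entries of $\eta$ contribute, so the double sum reduces to $-c^2 t_1 t_2 + x_1 x_2 + y_1 y_2 + z_1 z_2$.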

Kitchi
  • 3,709