
I'm having trouble understanding the metric tensor in general relativity. What I've understood so far has come from my course lecture notes used in conjunction with "The Road to Reality" by Roger Penrose.

Problem 1

I know that in special relativity, the matrix

$\eta_{ab}= \left( \begin{array}{cccc} 1 & 0 & 0 & 0\\ 0 & -1 & 0 & 0\\ 0 & 0 & -1 & 0\\ 0 & 0 & 0 & -1\\ \end{array} \right) $

is the metric tensor, but as far as I know "the metric tensor" is just a name for that matrix. I've now learned that in general the metric tensor is the matrix $g_{ab}=\vec{e}_a \cdot \vec{e}_b$ where $\vec{e}_a=\dfrac{d\vec{\sigma}}{dx_a}$, and $\vec{\sigma}(x_1, x_2, \dots )$ is a surface parametrized by the $x_\alpha$. So this would imply that the "surface" in special relativity (I'm guessing that this is what's meant by "spacetime") is 4-dimensional and its vectors $\vec{e}_0, \vec{e}_1, \vec{e}_2, \vec{e}_3$ are orthogonal. But also for $\vec{e}_\alpha\neq \vec{e}_0$, we have

$\vec{e}_\alpha \cdot \vec{e}_\alpha = |\vec{e}_\alpha|^2\cos(0)=-1$, as a vector is parallel to itself. This is my first problem, as the modulus of a vector shouldn't be negative. I'm assuming these vectors $\vec{e}_\alpha$ are in Cartesian coordinates.
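The sign issue can be made concrete numerically. Here is a small numpy sketch (an illustration only, not from any course material): taking the "dot product" with the matrix $\eta_{ab}$ rather than the Euclidean dot product, a spatial basis vector squared really does come out as $-1$.

```python
import numpy as np

# Minkowski metric with signature (+, -, -, -), as in the question.
eta = np.diag([1.0, -1.0, -1.0, -1.0])

# Coordinate basis vectors: e[a] has a 1 in slot a, 0 elsewhere.
e = np.eye(4)

# "Dot product" of a spatial basis vector with itself, taken with
# the metric rather than with the Euclidean dot product:
e1_dot_e1 = e[1] @ eta @ e[1]   # equals eta[1,1] = -1.0
```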


Problem 2

Then if $\vec{v}$ is a vector on this surface, written in the surface coordinates so that $\vec{v}=v_{\alpha}\vec{e}^\alpha$, then $\vec{v}\cdot \vec{v}= (v_{\alpha}\vec{e}^\alpha ) \cdot (v_{\beta}\vec{e}^\beta )=(\vec{e}^\alpha\cdot \vec{e}^\beta)v_\alpha v_\beta=\eta_{\alpha \beta}v_\alpha v_\beta$. This makes sense to me if the identity matrix $I_3$ is the metric tensor for 3 dimensional Cartesian coordinates (which I'm assuming it is), so that for $\vec{v}=(a,b,c)$ the dot product becomes $\vec{v}\cdot \vec{v}=a^2 + b^2 + c^2.$ I'm a bit confused about how the symbol $\cdot$ is used here - in the case of $\vec{e}_\alpha \cdot \vec{e}_\alpha$ it seems to be the standard Cartesian dot product, but in the case of $\vec{v}\cdot \vec{v}$ it's not; merely multiplying corresponding components here would be incorrect.
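To make the two uses of $\cdot$ concrete, here is a short numpy sketch (the vector is an arbitrary choice for illustration) comparing the naive componentwise dot product with the metric contraction $\eta_{\alpha\beta}v^\alpha v^\beta$ — the two give different numbers unless the metric is the identity.

```python
import numpy as np

# Minkowski metric with signature (+, -, -, -).
eta = np.diag([1.0, -1.0, -1.0, -1.0])
v = np.array([2.0, 1.0, 1.0, 1.0])

# Naive Cartesian dot product: sum of squares of components.
naive = v @ v                                  # 4 + 1 + 1 + 1 = 7.0

# Metric contraction eta_{ab} v^a v^b:
minkowski = np.einsum('a,ab,b->', v, eta, v)   # 4 - 1 - 1 - 1 = 1.0
```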


Problem 3

My third problem is that I'm not sure where the equation $ds^2=g_{\alpha\beta}\,dx^\alpha dx^\beta$ comes from. Is this the definition of $ds^2$, and if so, is $ds^2$ invariant in all coordinate systems, as is the case in special relativity? If $\eta_{ab}$ as defined above is called the metric tensor in special relativity, then is this invariant in all coordinates too? By the definition of the metric tensor I can't see why it should be.

Apologies for the lack of clarity, and thanks for any help!

Edit: An example of an exam question I'd like to be able to understand: question (image hosted on dropbox)

user12262
Lammey

2 Answers


Let's start at the beginning:

The setting for relativity - be it special or general - is that spacetime is a manifold $\mathcal{M}$, i.e. something that is locally homeomorphic to Cartesian space $\mathbb{R}^n$ ($n = 4$ in the case of relativity), but not necessarily globally.

Such manifolds possess a tangent space $T_p\mathcal{M}$ at every point, which is where the vectors one usually talks about live. If you choose coordinates $x^i$ on the manifold, then the space of tangent vectors is

$$T_p\mathcal{M} := \left\{\sum_{i=0}^3 c^i \frac{\partial}{\partial x^i} \,\middle|\, c^i \in \mathbb{R} \right\}$$

When we say that a tuple $(c^0,c^1,c^2,c^3)$ is a vector, we mean that it corresponds to the object $c^i\partial_i \in T_p\mathcal{M}$ at some point $p \in \mathcal{M}$.

A metric on $\mathcal{M}$ can be given by specifying a symmetric, non-degenerate bilinear form at each point

$$g_p : T_p\mathcal{M} \times T_p\mathcal{M} \rightarrow \mathbb{R}$$

What you learned "in general" is that the components of the metric are, for chosen basis vectors $\partial_i$ of $T_p\mathcal{M}$, defined by $g_{ij} = g(\partial_i,\partial_j)$. You can now indeed see the metric as a kind of scalar product, setting $X \cdot Y := g(X,Y)$ for two vectors $X,Y$. (This contains the answer to your second problem.) But for non-Riemannian manifolds, i.e. manifolds where the metric is not positive-definite, this is not a scalar product in the sense you may be used to. In particular, it can be zero for a non-zero vector. Vectors for which it is zero are usually called lightlike or null.
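A quick numerical illustration of the last point (the particular vector is an arbitrary example): with the Minkowski metric there are non-zero vectors whose "square" under $g$ vanishes.

```python
import numpy as np

# Minkowski metric with signature (+, -, -, -).
eta = np.diag([1.0, -1.0, -1.0, -1.0])

def g(X, Y):
    """The bilinear form g(X, Y) = eta_{ab} X^a Y^b in this basis."""
    return X @ eta @ Y

# A lightlike (null) vector: non-zero, but g(v, v) = 0.
null = np.array([1.0, 1.0, 0.0, 0.0])
# g(null, null) = 1 - 1 = 0.0
```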

The important thing to take away is that manifolds do not always behave like Cartesian space.

Now, for your third problem, we need the concept of the cotangent space $T_p^*\mathcal{M}$. It is the dual vector space to the tangent space, spanned by the differentials $\mathrm{d}x^i : T_p\mathcal{M} \rightarrow \mathbb{R}$ for a chosen coordinate system, and defined by

$$\mathrm{d}x^i(\partial_j) = \delta^i_j$$

Now, recall that the metric was a map from twice the tangent space to $\mathbb{R}$. As such, we can see it as an element of the tensor product $T_p^*\mathcal{M} \otimes T_p^*\mathcal{M}$, which is the space spanned by elements of the form $\mathrm{d}x^i \otimes \mathrm{d}x^j$. As the metric is an element of this space, it is expandable in this basis:

$$ g = g_{ij}\mathrm{d}x^i\mathrm{d}x^j$$

where the physicist just drops the bothersome $\otimes$ sign. Now, what has this to do with infinitesimal distance? We simply define the length of a path $\gamma : [a,b] \rightarrow \mathcal{M}$ to be (with $\gamma'(t)$ denoting the tangent vector to the path)$[1]$

$$ L[\gamma] := \int_a^b \sqrt{\lvert g(\gamma'(t),\gamma'(t))\rvert}\mathrm{d}t$$

And, by using physicists' sloppy notation, $g(\gamma'(t),\gamma'(t)) = g_{ij} \frac{\mathrm{d}x^i}{\mathrm{d}t}\frac{\mathrm{d}x^j}{\mathrm{d}t}$, if we understand $x^i(t)$ as the $i$-th coordinate of the point $\gamma(t)$, and so:

$$ L[\gamma] = \int_a^b \sqrt{g_{ij} \frac{\mathrm{d}x^i}{\mathrm{d}t}\frac{\mathrm{d}x^j}{\mathrm{d}t}}\mathrm{d}t = \int_a^b \sqrt{g_{ij}\mathrm{d}x^i\mathrm{d}x^j}\frac{\mathrm{d}t}{\mathrm{d}t} = \int_a^b \sqrt{g_{ij}\mathrm{d}x^i\mathrm{d}x^j}$$

Since we call $\mathrm{d}s$ the infinitesimal line element that fulfills $L = \int \mathrm{d}s$, this is suggestive of the notation

$$ \mathrm{d}s^2 = g_{ij}\mathrm{d}x^i\mathrm{d}x^j$$
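The path-length definition can be checked on the simplest possible example (my choice of curve, purely for illustration): the unit circle in the Euclidean plane, where $g_{ij} = \delta_{ij}$, should come out with length $2\pi$. A sympy sketch:

```python
import sympy as sp

t = sp.symbols('t')

# Path gamma(t) = (cos t, sin t) in the Euclidean plane, g_ij = delta_ij.
x = sp.cos(t)
y = sp.sin(t)

# g(gamma'(t), gamma'(t)) = (dx/dt)^2 + (dy/dt)^2, which simplifies to 1.
speed_sq = sp.simplify(sp.diff(x, t)**2 + sp.diff(y, t)**2)

# L[gamma] = integral over [0, 2*pi] of sqrt(g(gamma', gamma')) dt
L = sp.integrate(sp.sqrt(speed_sq), (t, 0, 2*sp.pi))
# L == 2*pi, the circumference of the unit circle.
```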

If we notice that, by the definition of tangent and cotangent vectors through derivatives and differentials as above, things with upper indices transform exactly in the opposite way from things with lower indices (see also my answer here), it is seen that this expression is indeed invariant under arbitrary coordinate transformations.

$[1]$ $\gamma'(t)$ is really a tangent vector in the following sense:

Let $x : \mathcal{M} \rightarrow \mathbb{R}^n$ be a coordinate chart. Consider then: $ x \circ \gamma : [a,b] \rightarrow \mathbb{R}^n$. Since it is an ordinary function between (subsets of) cartesian spaces, it has a derivative

$$(x \circ \gamma)' : [a,b] \rightarrow \mathbb{R}^n$$

Now, $(x \circ \gamma)'^i(t)$ can be thought of as the components of the tangent vector $\gamma'(t) := (x \circ \gamma)'^i(t)\partial_i \in T_{\gamma(t)}\mathcal{M}$. It is a somewhat tedious, but worthwhile exercise to show that this definition of $\gamma'(t)$ is independent of the choice of coordinates $x$.


Your exam question with the surfaces is asking about something different. You are given an embedding of a lower-dimensional submanifold $\mathcal{N}$ into Cartesian space

$$ \sigma: \mathcal{N} \hookrightarrow \mathbb{R}^n $$

and asked to calculate the induced metric on the submanifold from the Cartesian metric

$$\mathrm{d}s^2 = \sum_{i = 1}^n (\mathrm{d}x^i)^2$$

(which is just the identity matrix in component form w.r.t. any orthonormal basis of coordinates in $\mathbb{R}^n$, i.e. the dot product)

Now, how is a metric induced? Let $y : \mathbb{R}^m \rightarrow \mathcal{N}$ be coordinates for the submanifold (you are actually given $\sigma \circ y$ in the question), and $x$ be the coordinates of the Cartesian space. Observe that any morphism of manifolds $\sigma$ induces a morphism of tangent spaces

$$ \mathrm{d}\sigma_p : T_p\mathcal{N} \rightarrow T_{\sigma(p)}\mathbb{R}^n, \frac{\partial}{\partial y^i} \mapsto \sum_j \frac{\partial(\sigma \circ y)^j}{\partial y^i}\frac{\partial}{\partial x^j} $$

called the differential of $\sigma$. As a morphism of vector spaces, it is a linear map given, as a matrix, by the Jacobian $\mathrm{d}\sigma^{ij} := \frac{\partial(\sigma \circ y)^j}{\partial y^i}$ of the morphism of manifolds. Now, inducing a metric means setting

$$ g_\mathcal{N}(\frac{\partial}{\partial y^i},\frac{\partial}{\partial y^j}) := g_\mathrm{Euclidean}(\mathrm{d}\sigma(\frac{\partial}{\partial y^i}),\mathrm{d}\sigma(\frac{\partial}{\partial y^j}))$$

On the right hand side is now the dot product of two ordinary vectors in $\mathbb{R}^n$, and what your exam calls $\vec e_{y^i}$ is my $\mathrm{d}\sigma(\frac{\partial}{\partial y^i})$. If you note that you are given $\sigma \circ y$, then all you need to do is calculate the metric components $g_\mathcal{N}$ as above for every possible combination of $y^i,y^j$ (in 2D, fortunately, there are only four).
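As a worked instance of exactly this recipe (the unit sphere is my choice of embedding, in the spirit of the linked exam question), sympy can carry out the whole computation: differentiate the embedding to get the $\vec e_{y^i}$, then take Euclidean dot products for all four combinations.

```python
import sympy as sp

theta, phi = sp.symbols('theta phi')

# Embedding sigma: (theta, phi) -> R^3 of the unit sphere.
sigma = sp.Matrix([sp.sin(theta)*sp.cos(phi),
                   sp.sin(theta)*sp.sin(phi),
                   sp.cos(theta)])

# e_theta = d sigma / d theta,  e_phi = d sigma / d phi
e_theta = sigma.diff(theta)
e_phi = sigma.diff(phi)

# Induced metric components g_ij = e_i . e_j (Euclidean dot product in R^3).
g = sp.simplify(sp.Matrix([[e_theta.dot(e_theta), e_theta.dot(e_phi)],
                           [e_phi.dot(e_theta),   e_phi.dot(e_phi)]]))

# g is diag(1, sin(theta)^2): the round metric
# ds^2 = dtheta^2 + sin^2(theta) dphi^2.
```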

ACuriousMind
  • Thanks for the detailed reply! It's been really helpful so far, but I'm a bit confused by the definition of the tangent space. You call its elements vectors, but the elements are linear combinations of partial derivatives, and a vector like $\gamma '(t)$ isn't just a linear combination of partial derivatives, yet it belongs to the tangent space? – Lammey Jul 20 '14 at 22:43
  • @ACuriousMind: "Let's start at the beginning: The setting for relativity [...] is that spacetime is a manifold $\mathcal M$ [...] that the components of the metric [tensor] are, for chosen basis vectors $\delta_i$ of $T_p \mathcal M$, defined by $g_{ij}=g(\delta_i,\delta_j)$. [...] it can be zero. Vectors for which it is zero are usually called lightlike." -- Is there some reason (having to do with participants, observations, physics) to associate the case of "zero" component values with "light"? If so, are there corresponding reasons concerning cases of other component values? – user12262 Jul 20 '14 at 22:49
  • @James Machin: I've added responses to your questions to my answer. One could, in fact, also define the tangent space via the equivalence classes of the $(x \circ \gamma)'$ of all curves $\gamma$, as you can see on the Wikipedia page of the tangent space. – ACuriousMind Jul 20 '14 at 23:05
  • @user12262: Such vectors are called lightlike because if $\gamma$ is the worldline of something travelling at the speed of light, then $g(\gamma'(t),\gamma'(t)) = 0$ at every point along the path. Similarly, the nomenclature space-like and time-like arises from the corresponding vectors being tangent to worldlines of FTL travellers and sub-light travellers, respectively. – ACuriousMind Jul 20 '14 at 23:09
  • @ACuriousMind: "[1] [...] an ordinary function between (subsets of) cartesian spaces, [...] has a derivative" -- Surely not every function $$\mathbb R \rightarrow \mathbb R^n$$ has a derivative. However, perhaps functions such as $$(x \circ \gamma) : [a, b] \rightarrow \mathbb R^n,$$ i.e. as considered in the answer, may have additional (stronger) properties which may (or may not) imply the existence of a derivative. – user12262 Jul 20 '14 at 23:21
  • @ACuriousMind: "if $\gamma$ is the worldline of something travelling at the speed of light, then $g(\gamma'(t),\gamma'(t))=0$ at every point along the path." -- How so? What do you mean by "speed" (i.e. a notion which apparently has not been mentioned in your answer, as it presently stands)? ... – user12262 Jul 20 '14 at 23:26
  • @user12262: True enough, but I did not want to overload an answer to (someone who seems to me like) a beginner with such technicalities. The manifold is assumed to be a smooth manifold, so the charts and the paths are also smooth. As for the speed, I've not mentioned it in my answer as it is not necessary for this question to discuss the reasons for the space-/light-/time-like nomenclature, and it would need rehashing the idea of inertial frames and whatnot. – ACuriousMind Jul 20 '14 at 23:35
  • @ACuriousMind: -- Well, I've certainly expressed the reasons for my dissatisfaction; and a fitting quote about how to do better. Also, I'll wait another day for the OP (James Machin) to edit the question title (at least) before suggesting the corresponding edit(s) myself. – user12262 Jul 20 '14 at 23:41
  • I've never seen that definition of vector before. Actually , I can't get my head around it at all! – Lammey Jul 20 '14 at 23:46
  • @James Machin: Are you familiar with the abstract axiomatic definition of a vector space? One gets used to these things by working with them so long until one can recite them in one's sleep. Also, be aware that my explanation takes a pretty abstract view - there are ways to introduce these things more "intuitively", but I'm not good at them and they sweep the mathematical structure under the rug, which you're gonna need sooner or later. – ACuriousMind Jul 20 '14 at 23:53
  • Also, for visualising things about manifolds, I often find it helpful to think of the good old sphere, which is as good a manifold as it gets. Tangent vectors are really what the name says - they are tangent to the sphere, which you can "see" when you think about how you would interpret an infinitesimal change in $x^i$ - to which the $\partial_i$ correspond - geometrically. It's difficult to convey such intuition over words, and I am not presuming I'm doing a good job at that, either. Perhaps someone else will provide an answer that suits you better :) – ACuriousMind Jul 20 '14 at 23:56
  • I think I understand the idea of a vector space, where you abstract the definition of a vector to an element of group which obeys the properties in the wikipedia link, and so I can see that the tangent vector space which you defined above satisfies these properties (well I haven't checked but I assume they do!). But a tangent vector to a curve $\gamma (t)$ is also a vector in the standard sense i.e. if $\gamma (t)=(t,2t,3t^2)$ then $\gamma '(t)=(1,2,6t)$, I guess my problem is that I can't see how $\gamma' (t)$ corresponds to a partial derivative. I should add thanks for all the help so far! – Lammey Jul 21 '14 at 00:10
  • @James Machin: Ohhh, that! Well, I didn't mean to imply that there is anything intrinsic about $\gamma'$ that says it's a partial derivative. It's just that $\mathcal{M}$ is not a vector space, so $\gamma'$ needs a space to live, and so you send it to the tangent space. If you call the space $\gamma'$ natively lives in $V$, then it's naturally isomorphic to $\mathbb{R}^n$, which is naturally isomorphic to $T_p\mathcal{M}$. The map I give in the footnote is the explicit description of that isomorphism. – ACuriousMind Jul 21 '14 at 00:16
  • @ACuriousMind excellent answer; very clear and pretty much self-contained! +10 if I could – Danu Jul 21 '14 at 00:34
  • I guess what I'm finding tricky is connecting this back to my course; after all I have to be able to answer the exam questions! From looking at the questions, we are given a surface like $\vec{\sigma} (\theta, \phi )$ and then calculate the metric matrix by the definition $g_{ab}=\vec{e}_\theta \cdot \vec{e}_\phi$, where $\cdot$ is the dot product as in the sum of the product of the components, and $\vec{e}_{\theta}:=\frac{d\vec{\sigma}}{d\theta}$ I'm trying to understand how your definition of the metric turns into that. Sorry I'm not being very clear but most of these concepts are beyond me! – Lammey Jul 21 '14 at 01:38
  • @James Machin: From only the information you have given, I'd say that's a pretty stupid exam question (at least as a preparation for SR/GR). How is $\vec \sigma$ given? Is it a function $\mathbb{R}^2 \rightarrow \mathbb{R}^n, (\theta,\phi) \mapsto \vec\sigma(\theta,\phi)$? If yes, then what you are seeking is the "induced metric on a submanifold" and I can add that to my answer. – ACuriousMind Jul 21 '14 at 02:31
  • @ACuriousMind I'll add a link to my question to give an example question – Lammey Jul 21 '14 at 02:43
  • @James Machin: "I'm trying to understand how your [ACuriousMind's] definition of the metric [tensor]" -- i.e.$$L[\gamma]:=\int_{\gamma}\mathrm d s$$together with$$\gamma : [a,b]\rightarrow \mathcal M$$and$$L[\gamma]:=\int_a^b \sqrt{g_{jk}~\mathrm d x^j~\mathrm d x^k}$$ "turns into that" -- i.e.$$L[\gamma]=\int_a^b~\mathrm d~t~\gamma'~\lim_{\Gamma \rightarrow \mathcal X} {\frac{\int_a^{\gamma^{-1}[\mathcal \Gamma]} \mathrm d s-\int_a^{\gamma^{-1}[\mathcal X]} \mathrm d s }{\gamma^{-1}[ \Gamma]-\gamma^{-1}[ \mathcal X ]} }.$$ -- Me, too. (But that's a bit beyond "exam questions" ...) – user12262 Jul 21 '14 at 06:20
  • @James Machin: I've added a part discussing inducing a metric from an embedding as you are given. – ACuriousMind Jul 21 '14 at 13:12
  • @user12262: I have not actually defined the path length metric here, as that requires the introduction of geodesics and the exponential map. It seems to me that you are dissatisfied that I have not given a complete course in (pseudo-)Riemannian differential geometry, and that's true. I've only introduced the bits and ends that are needed to understand the question at hand, and which are most likely to show up in typical GR scenarios. – ACuriousMind Jul 21 '14 at 13:17
  • @ACuriousMind: "It seems to me that you're dissatisfied that I have not given a complete course in (pseudo-)Riemannian [DG ...]" -- Not at all; your intro ("In the beginning ... we have a manifold $\mathcal M$") is as complete as can be expected. I'm dissatisfied because that's not the beginning of physics, nor of (G)TR in particular. (But James Machin first has a course to pass, to earn the leisure of such considerations ...) "path length metric [...] requires [...] geodesics and the exponential map." -- Such overhead seems superfluous for certain curves $\lambda$ with $L[\lambda]=0$. – user12262 Jul 21 '14 at 15:58

This is my first problem, as the modulus of a vector shouldn't be negative.

First, while there are many useful properties of introductory linear algebra you should keep in mind with GR, thinking in Cartesian terms with positive definite matrices simply has to go. Vectors in relativity can very much have negative norm.

Even though it's not often done in the literature, it might be pedagogically helpful to write the magnitude of $\vec{x}$ as $\lVert \vec{x} \rVert$ rather than $\lvert \vec{x} \rvert$, the latter being too reminiscent of the absolute value function.


I'm a bit confused about how the symbol $\cdot$ is used here...

This is another issue with notation. Once upon a time, in a non-physics setting, I was taught that two vectors $\vec{x}$ and $\vec{y}$ living in an inner product space could have their inner product computed, $(\vec{x}, \vec{y})$. In the case of a very special inner product, the dot product, we could calculate the value by adding the pairwise products of the vectors' components, and we call this result $\vec{x} \cdot \vec{y}$.

In relativity, though, we never use this Cartesian dot product.1 Thus, we choose to make this symbol mean "apply the metric to the vectors": $\vec{x} \cdot \vec{y} = g(\vec{x}, \vec{y})$.

If you look at the components of $\vec{x}$ and $\vec{y}$, the linearity of $g$ means it can be expressed as a matrix with components $g_{\mu\nu}$, where $g(\vec{x}, \vec{y})$ is understood to be computed as "matrix-multiply the row vector of components $x^\mu$ with the matrix with components $g_{\mu\nu}$ with the column vector of components $y^\nu$." Using implicit Einstein summation notation, we can write this as $x^\mu g_{\mu\nu} y^\nu$, or better yet $g_{\mu\nu} x^\mu y^\nu$.

Actually, because we have a metric, we have a natural dual space to our vector space. For any $\vec{x}$, there exists a unique linear map $\tilde{x}$ on the vector space such that $\tilde{x}(\vec{y}) = g(\vec{x}, \vec{y})$ for all vectors $\vec{y}$. Thus "$\vec{x} \cdot \vec{y}$" can be interpreted as the usual "sum the results of componentwise multiplication" as long as it is understood that we are taking the components of the dual vector $\tilde{x}$ together with those of the normal vector $\vec{y}$ (and that the basis for the dual space is the dual to the one used for the original vector space).

If the metric is Cartesian, then the matrix representation is the identity matrix, and so the first interpretation of the notation reduces to the standard don't-think-too-hard dot product. In the language of dual spaces, a Cartesian metric induces a trivial map from vectors to their duals: the components remain the same. Thus one doesn't even keep track of whether we were taking components from $\vec{x}$ or $\tilde{x}$.
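Numerically, the two interpretations are easy to compare (the vectors below are arbitrary illustrative choices): lowering the index with the metric gives the components of $\tilde{x}$, and the componentwise product with $\vec{y}$ then reproduces $g(\vec{x}, \vec{y})$.

```python
import numpy as np

# Minkowski metric with signature (+, -, -, -).
eta = np.diag([1.0, -1.0, -1.0, -1.0])
x = np.array([3.0, 1.0, 2.0, 0.0])   # components x^mu
y = np.array([1.0, 1.0, 0.0, 1.0])   # components y^nu

# Components of the dual vector x~_nu = g_{mu nu} x^mu ("lowering the index"):
x_dual = eta @ x                      # [3, -1, -2, 0]

# Both expressions are the same number g(x, y):
a = x @ eta @ y                       # row vector * matrix * column vector
b = x_dual @ y                        # componentwise product with dual components
```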


My third problem is that I'm not sure where the equation $ds^2 = g_{\alpha\beta} dx^\alpha dx^\beta$ comes from.

There is a lot of deep differential geometry behind this statement, and I'll only give the briefest glimpse at it. For each index $\mu$, $x^\mu$ is a scalar field on your spacetime manifold. The exterior derivative operator $\mathrm{d}$ turns scalars into dual vectors (among other things) and is in this special case really just the familiar gradient operator.

Consider a point in spacetime. At this point, your coordinates induce directional derivatives $\partial/\partial x^\mu = \partial_\mu$, and these can be taken as a basis for vectors at that point. The dual space in fact has as its corresponding basis the gradients $\mathrm{d}x^\mu$.2

By the definition of the dual basis, we know that $\mathrm{d}x^\mu(\partial_\nu) = \delta^\mu_\nu$. Consider a vector $\vec{V} = V^\alpha \partial_\alpha$. We know $\mathrm{d}x^\mu(\vec{V}) = V^\alpha \mathrm{d}x^\mu(\partial_\alpha) = V^\alpha \delta^\mu_\alpha = V^\mu$. (The first equality comes from the linearity of $\mathrm{d}x^\mu$; see also footnote 2.) Thus in any fixed basis, $$ g(\vec{V}, \vec{W}) = g_{\alpha\beta} V^\alpha W^\beta = g_{\alpha\beta} \mathrm{d}x^\alpha(\vec{V}) \mathrm{d}x^\beta(\vec{W}) = g_{\alpha\beta} \mathrm{d}x^\alpha \otimes \mathrm{d}x^\beta (\vec{V}, \vec{W}). $$

The single, indivisible symbol "$ds^2$" is just shorthand for the tensor $g_{\alpha\beta} \mathrm{d}x^\alpha \otimes \mathrm{d}x^\beta$ (often written without the explicit product symbol). So, in a very roundabout way, that is its definition. And it is as coordinate invariant as the inner product $g$.


1Note some texts will try to use it, by throwing in factors of $i$ in various places to get the desired negative signs when two $i$'s multiply. This is bad practice, and fails miserably when going from SR to GR.

2Possible point of confusion: $\mu$ in this paragraph is just indexing different coordinates, not components of vectors or dual vectors. Moreover, arrows and tildes have been suppressed. Thus $\partial_\mu$ is a full vector for any $\mu$, with components $\partial_\mu^\alpha$ indexed by $\alpha$. Similarly, $\mathrm{d}x^\mu$ is a dual vector, and its components in some understood basis would be $\mathrm{d}x^\mu_\alpha$.

  • Thanks a lot for the detailed answer! Your response to parts 1 and 2 really helped. As for the third part, I have the same problem there as I did with the answer below - I can't understand how partial derivatives can be a basis for vectors! Tangent vectors aren't linear combinations of partial derivatives as far as I am aware! Doesn't a basis for tangent vectors need to consist of vectors? I don't understand how finite combinations of infinitesimal entities can create a vector. – Lammey Jul 21 '14 at 00:59
  • It's common in differential geometry to identify vectors with directional derivatives. It's kind of a definition of last resort: if your manifold were a vector space, you could take $\partial \vec x/\partial x^\alpha$ and get a vector because $\vec x$ is an element of a vector space. When positions are no longer elements of vector spaces, that notion breaks down. The directional derivatives themselves still obey the vector space structure, though. – Muphrid Jul 21 '14 at 03:40