2

In the linearization of GR, when $g_{\mu\nu} = \eta_{\mu\nu} + h_{\mu\nu}$, and $|h| \ll 1$, it is said (for example here) that 'to linear order the “$\Gamma\Gamma$” terms go away' in the formula for the Ricci tensor.

It is like saying that, if a function is small, the square of its first derivative (in this case, the $\Gamma$'s are derivatives of $h$) can be neglected compared to its second derivative (the remaining terms of the Ricci tensor).

I can imagine examples where this is reasonable (see below), but not a general proof. The references that I read don't even bother to discuss the question, so maybe it is obvious and I am missing something.

Let's take an example with a function of one variable: $h(x) = A \sin(kx)$. If $A \ll 1$ then $|h| \ll 1$. But suppose that $k = \frac{1}{A}$. Then the derivative $\dot h(x) = Ak\cos(kx) = \cos(kx)$ is not so small.

However, it is true that $\ddot h(x) = -Ak^2 \sin(kx) = -k \sin(kx)$ is much bigger in magnitude than $\dot h$, since $k = 1/A \gg 1$. So in this example, the square of the first derivative can indeed be neglected compared to the second derivative.

But this is not a proof, only an example.
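
A quick numerical check of this example (a small Python sketch; the value of $A$ is arbitrary):

```python
import numpy as np

# The example above: h(x) = A*sin(kx) with k = 1/A, over one full period 2*pi*A
A = 1e-4
k = 1.0 / A
x = np.linspace(0.0, 2 * np.pi * A, 100001)

h = A * np.sin(k * x)
hdot = np.cos(k * x)           # A*k = 1, so h' is O(1): not small
hddot = -np.sin(k * x) / A     # A*k^2 = 1/A, so h'' is O(1/A): large

print(np.max(np.abs(h)))       # ~1e-4: h is small
print(np.max(hdot**2))         # ~1:    (h')^2 is not small
print(np.max(np.abs(hddot)))   # ~1e4:  but h'' is much larger than (h')^2
```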

DanielC
    Would it help to write $g_{\mu\nu}=\eta_{\mu\nu}+\epsilon h_{\mu\nu}$ and consider the expansion to be in powers of $\epsilon$? – Ghoster Mar 06 '24 at 00:12
  • This is basically the definition of linearization via e.g., Gateaux or Frechet derivatives. – whpowell96 Mar 06 '24 at 19:14

4 Answers

5

In linear GR, $\Gamma$ (I'm omitting the indices) is always proportional to the perturbation $h$. Therefore, $\Gamma^2$ is always proportional to $h^2$, and thus it is a quadratic term. Since we are keeping only the linear terms, we drop $\Gamma^2$.
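
One can see this explicitly in a toy model. Below is a minimal sympy sketch (the one-dimensional metric component $g_{xx} = 1 + \epsilon h$ is my own illustration): the Christoffel symbol built from it is $O(\epsilon)$, so its square is $O(\epsilon^2)$ and drops out at linear order.

```python
import sympy as sp

eps, x = sp.symbols('epsilon x')
h = sp.Function('h')(x)

# Toy metric component: flat background plus a small perturbation
g_xx = 1 + eps * h
Gamma = sp.Rational(1, 2) * (1 / g_xx) * sp.diff(g_xx, x)   # Gamma^x_xx

print(sp.series(Gamma, eps, 0, 3))      # leading term is O(eps)
print(sp.series(Gamma**2, eps, 0, 3))   # the "GammaGamma" piece is O(eps^2)
```

Note that $\Gamma$ comes out proportional to $\epsilon\, h'$, not to $\epsilon\, h$: what matters for dropping $\Gamma\Gamma$ is the power of $\epsilon$, not the pointwise size of the derivative.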

  • If $\Gamma$ were proportional to $h$, this would follow. But it is proportional to the derivatives of $h$. – Claudio Saspinski Mar 06 '24 at 09:58
  • @ClaudioSaspinski Which are roughly of the size of $h$. Otherwise, it would be impossible to have $|h| \ll 1$ everywhere. – Níckolas Alves Mar 06 '24 at 11:25
  • I included an example. The derivatives of $h$ are not necessarily of the size of $h$. – Claudio Saspinski Mar 06 '24 at 12:56
  • @NíckolasAlves As far as mathematics is concerned, that is not true at all. The derivative of a function can be arbitrarily larger than the function itself. See https://math.stackexchange.com/q/4756869. So this claim needs its own justification. – Vincent Thacker Mar 06 '24 at 13:46
  • Those are good points. I have to think about it more. But I believe one can argue that the derivatives must be small as well (at least for the partial derivatives of tensors), perhaps by following Ghoster's argument in the comments – Níckolas Alves Mar 06 '24 at 14:10
  • The key is that no matter how much larger $h'$ is compared to $h$ (as long as it is finite), $\epsilon^2 h'^2$ goes to zero faster than $\epsilon h$ or $\epsilon h'$ as $\epsilon\to 0$. We don't care about how $h$ and $h'$ compare pointwise. All that matters for linearization is how various terms scale when $h$ is sent to zero uniformly, i.e., by multiplying by $\epsilon$ and sending $\epsilon\to 0$. – whpowell96 Mar 06 '24 at 19:10
  • @whpowell96 I don't understand how your first sentence works. The functions $\sin 1/x^2$ and $\sin e^x$ are counterexamples. – Vincent Thacker Mar 06 '24 at 20:32
  • That comes down to what sort of functions are allowed to be chosen for perturbations, which is a choice of function space and topology. Obviously if you choose perturbations with singularities then bad things can happen, but that is the purpose of clearly defining the domain of the function you are linearizing. The point I am making is that if $h$, $h'$, and required nonlinearities have finite norm in some space, then $|\epsilon^2 h'^2|$ goes to zero faster than $|\epsilon h'|$ as $\epsilon\to0$. Usually finite norm translates to something physically reasonable, like finite energy. – whpowell96 Mar 06 '24 at 23:50
  • The point I'm making is that "$h\to0$" when doing linearization actually means "$h\to 0$ in some reasonable function space topology on which your problem is well-defined," which rules out a lot of pathological examples if your function space topology corresponds to anything physically reasonable in most cases. – whpowell96 Mar 06 '24 at 23:55
  • @whpowell96 So it's basically what I said in my answer (the last sentence)? – Vincent Thacker Mar 07 '24 at 01:24
  • Yes, but the reason why is that the space of perturbations is restricted for the thing you are linearizing to even be well-defined – whpowell96 Mar 07 '24 at 02:09
2

You are correct that it does not follow mathematically. The derivative of a function can be arbitrarily larger than the function itself even in the infinitesimal limit.

But the relative sizes of $h$ and $h'$ are not what's being compared here because as $h \to 0$ in the infinitesimal limit, terms with two factors of $h$ (or its derivatives) will be negligible compared to the linear terms.

In addition, in many cases, physical considerations impose upper bounds on their magnitudes.
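
To illustrate this numerically (a sketch reusing the question's example $h(x) = A\sin(x/A)$, whose first derivative is not pointwise small): once the whole perturbation is rescaled by $\epsilon$, the dropped quadratic term $(\epsilon h')^2$ shrinks faster than the kept linear term $\epsilon h''$ as $\epsilon \to 0$.

```python
import numpy as np

A = 1e-3
x = np.linspace(0.0, 2 * np.pi * A, 10001)
hp = np.cos(x / A)          # h'  is O(1):   not small compared to h
hpp = -np.sin(x / A) / A    # h'' is O(1/A): large

for eps in (1e-1, 1e-2, 1e-3):
    linear = np.max(np.abs(eps * hpp))
    quadratic = np.max((eps * hp) ** 2)
    print(f"eps={eps:.0e}  linear~{linear:.2e}  "
          f"quadratic~{quadratic:.2e}  ratio={quadratic / linear:.2e}")
# the ratio shrinks like eps*A: the quadratic term dies one power of eps faster
```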

  • A function being small does not imply the derivative is small (i.e., $\lim\limits_{x\to x_0}g(x)=0$ does not imply $\lim\limits_{x\to x_0}g'(x)=0$, and in fact the limit could be $\infty$). Sure, this is correct. But this is not relevant to the question asked. The answer by Nickolas Alves, and Ghoster’s comments, are correct. Here we are expanding in powers of $\epsilon$, not $x=(x^1,\dots, x^n)$. More precisely, one can consider a smooth family of semi-Riemannian metrics $\epsilon\mapsto g(\epsilon)$. – peek-a-boo Mar 08 '24 at 02:06
  • Then, one can consider all the associated tensors: the full Riemann curvature $\text{Riem}(\epsilon):=\text{Riem}[g(\epsilon)]$, the Ricci tensor $\text{Ric}(\epsilon):=\text{Ric}[g(\epsilon)]$, the Christoffel symbols relative to a fixed coordinate chart $\Gamma(\epsilon):= \Gamma[g(\epsilon)]$, etc., and for each of these quantities one can consider the derivative $\frac{d}{d\epsilon}\bigg|_{\epsilon=0}$. These give tensor fields of the corresponding rank. And indeed, for a ‘product’, the quadratic terms drop out because for first derivatives in $\epsilon$, the $\epsilon^2$ term is irrelevant. – peek-a-boo Mar 08 '24 at 02:09
  • So, your third paragraph is wrong. It’s not an additional postulate, it’s a simple consequence of “expanding relative to $\epsilon$”, which is our parameter, as opposed to the coordinates $x$. Specifically for oscillators, I’ve written this up much more carefully and explicitly with the language of (Frechet and Gateaux/directional) derivatives here. Edit: sorry to keep bombarding your comments, Nickolas’ answer as written is a little off, but can be ‘easily’ reworded to be correct. – peek-a-boo Mar 08 '24 at 02:13
  • @peek-a-boo I still don't see how it works on all continuous functions. See my comments on the above answer. Sure it works on most, but the functions $\sin 1/x^2$ and $\sin e^x$ are counterexamples. Their derivatives span all real numbers in magnitude. For every $\epsilon$ there are points where the derivative exceeds $1/\epsilon^2$. – Vincent Thacker Mar 08 '24 at 08:08
  • like I said in my first comment, what you’re writing is 100% true, but is not the game we’re playing. We are not comparing pointwise values of functions at different ranges of $x$-values (there, of course, smallness of functions doesn’t imply smallness of derivatives). What we are doing is considering two-parameter families of functions $F(x,\epsilon)$, with $x\in M$ (any manifold) and say $\epsilon\in I$ (an interval around the origin, or more generally a Lie group), and we’re expanding about $\epsilon=0$. – peek-a-boo Mar 08 '24 at 12:51
  • So, $F(x,\epsilon)=F(x,0)+\epsilon\cdot\frac{\partial F}{\partial \epsilon}\bigg|_{(x,0)} + o(|F(x,\epsilon)-F(x,0)|)$. As you see here, the error depends on the point $x$ (which is what allows your statement to be true as well). But, we’re fixing $x$ and one by one we’re doing this expansion. This gives us the ‘variation function’ $x\mapsto \frac{\partial F}{\partial \epsilon}\bigg|_{(x,0)}$. Notice I’m not a-priori comparing the size of this function to $F(\cdot, 0)$. So perhaps you might wish to think in terms of ‘calculate $\frac{\partial}{\partial\epsilon}|_{0}$’ – peek-a-boo Mar 08 '24 at 12:59
  • For example, if $F(x,\epsilon)=\sin(\epsilon e^x)$, then $F$ is a nice bounded function, but the variation in the sense of $\epsilon$-derivative above is $\delta F (x)= e^x$ (since $\cos(0)=1$). So clearly this is an unbounded function. But once again I’ve never made the claim (and physicists don’t mean this either even if they give a contrary impression) that this derivative is smaller than the original function. Instead what is meant is that for fixed $x$ (or varying in a fixed compact set) one expands in powers of $\epsilon$ and notices that a prefactor of $\epsilon$ is the important thing. – peek-a-boo Mar 08 '24 at 13:07
  • @peek-a-boo I understand what you're saying. But what I don't get is how this solves the original issue, which is that the ${h'}^2$ term is not necessarily much smaller than the $h$ term in the full expanded expression (before dropping terms). Carroll's book makes no mention of this $\epsilon$, nor does the Wikipedia article. Instead they simply let $g = \eta + h$. From what I see, by introducing the parameter $\epsilon$ you are basically redefining your criteria to be "drop any terms that contain more than one $h$ or its derivatives", instead of the magnitude. – Vincent Thacker Mar 08 '24 at 17:07
  • @peek-a-boo Since your first derivative with respect to $\epsilon$, by construction, contains precisely the terms with just one $h$ (because every $h$ is guaranteed to come with exactly one factor of $\epsilon$), it seems to me that the whole message is "it is linear because it contains only terms with a single $h$, not because the other terms are smaller", i.e. it is a redundant statement. – Vincent Thacker Mar 08 '24 at 17:28
1

In Chapter 7 of Wald's General Relativity, he formalizes the notion of perturbations roughly as follows:

  1. We assume that there exists a one-parameter family of metrics $g_{ab}(\lambda)$, where $\lambda$ is a real parameter. We also assume that this one-parameter family is smooth in $\lambda$ and in the coordinates, i.e., we can take all of the derivatives we want and they're well-behaved.
  2. We assume that $g_{ab}(\lambda)$ satisfies the Einstein field equations for all values of $\lambda$.
  3. We assume that $g_{ab}(0) = \eta_{ab}$ (or, more generally, we assume that $g_{ab}(0)$ is some fixed background metric that we want to look at the perturbations of.)

Under these assumptions, the (first-order) metric perturbation is then $$ h_{ab} = \left. \frac{d g_{ab}(\lambda)}{d\lambda} \right|_{\lambda = 0} $$ or, if you prefer, $$ g_{ab}(\lambda) = \eta_{ab} + \lambda h_{ab} + \mathcal{O}(\lambda^2). $$ In other words, $\lambda$ effectively acts as a "small parameter" controlling the size of the perturbation.

You can then calculate things like the linearized Ricci tensor by taking similar derivatives: $$ R^{\text{(lin)}}_{ab} = \left. \frac{d R_{ab}}{d\lambda} \right|_{\lambda = 0}, $$ or, if you prefer, $$ R_{ab} = \lambda R^{\text{(lin)}}_{ab} + \mathcal{O}(\lambda^2). $$ Because of the construction of the Ricci tensor, $R^{\text{(lin)}}_{ab}$ will only ever contain terms that are linear in $h_{ab}$ by definition.
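
One can check this construction in a toy model. The following sympy sketch (the 2D metric family is my own choice, not Wald's) computes the full Ricci tensor of $g(\lambda)$ and takes $d/d\lambda$ at $\lambda = 0$; the result is linear in $h$, and the $\Gamma\Gamma$ products, being $O(\lambda^2)$, contribute nothing to it.

```python
import sympy as sp

lam, t, x = sp.symbols('lambda t x')
coords = [t, x]
h = sp.Function('h')(t, x)

# One-parameter family of 2D metrics with g(0) = eta
g = sp.diag(-1, 1) + lam * sp.Matrix([[0, 0], [0, h]])
ginv = g.inv()

def Gamma(a, b, c):
    """Christoffel symbol Gamma^a_{bc} of the full metric g(lambda)."""
    return sp.Rational(1, 2) * sum(
        ginv[a, d] * (sp.diff(g[d, b], coords[c]) + sp.diff(g[d, c], coords[b])
                      - sp.diff(g[b, c], coords[d]))
        for d in range(2))

def ricci(b, c):
    """R_{bc}: two dGamma terms plus two GammaGamma terms."""
    return (sum(sp.diff(Gamma(a, b, c), coords[a]) for a in range(2))
            - sum(sp.diff(Gamma(a, b, a), coords[c]) for a in range(2))
            + sum(Gamma(a, a, s) * Gamma(s, b, c)
                  for a in range(2) for s in range(2))
            - sum(Gamma(a, c, s) * Gamma(s, b, a)
                  for a in range(2) for s in range(2)))

# Wald's definition of the linearized Ricci tensor: d/dlambda at lambda = 0
R_lin_xx = sp.simplify(sp.diff(ricci(1, 1), lam).subs(lam, 0))
print(R_lin_xx)   # linear in h; every GammaGamma product differentiates to zero
```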

So to finally answer your questions:

  1. Why are we sure that the quadratic terms don't get large enough that we have to include them? By definition the quadratic terms are lumped into the $\mathcal{O}(\lambda^2)$ terms that we've swept under the rug. If we find ourselves in a situation where the $\mathcal{O}(\lambda^2)$ terms are comparable to the $\mathcal{O}(\lambda)$ terms, that just means that we're outside the range of $\lambda$ where the linearized approximation is a good approximation.

  2. But what if there's something weird going on, where our family of solutions has derivatives that get bigger as the perturbation itself gets smaller? Such families of solutions might exist, but they don't meet the conditions (1) and (3) above, and so this technique wouldn't find them.

0

I think that I have found a good argument for a function of one variable. Below are the definitions of the derivatives, with the limits omitted.

$$\dot h(x) = \frac{h(x+\Delta x) - h(x)}{\Delta x}\implies (\dot h(x))^2 = \frac{(h(x+\Delta x) - h(x))^2}{(\Delta x)^2}$$

$$\ddot h(x) = \frac{\frac{h(x+\Delta x) - h(x)}{\Delta x} - \frac{h(x) - h(x-\Delta x)}{\Delta x}}{\Delta x} = \frac{h(x+\Delta x)+h(x-\Delta x) - 2h(x)}{(\Delta x)^2}$$

As the denominators are equal, and the numerator of $(\dot h(x))^2$ has only terms quadratic in $h$, we can say that $(\dot h(x))^2 \ll \ddot h(x)$ when $h(x) \ll 1$.
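
A quick numeric check of these difference quotients on the example from earlier in the question (a sketch; it tests only that example, not the general claim):

```python
import numpy as np

A, dx = 1e-4, 1e-7          # wavelength ~ A; step dx much smaller than it
x = np.linspace(0.0, 2 * np.pi * A, 101)
h = lambda y: A * np.sin(y / A)

d1 = (h(x + dx) - h(x)) / dx                       # first difference quotient
d2 = (h(x + dx) + h(x - dx) - 2 * h(x)) / dx**2    # second difference quotient

print(np.max(d1**2))        # ~1:   the square of the first derivative
print(np.max(np.abs(d2)))   # ~1e4: the second derivative dominates, as claimed
```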