Derivative of the Lagrangian with respect to the metric tensor

Question

I'm trying to calculate the derivative of the Lagrangian $$\mathcal{L}=\frac{1}{2}\partial_\mu\phi\,\partial^\mu\phi-\frac{1}{2}m^2\phi^2$$ with respect to the metric tensor $g_{\mu\nu}$, with the convention $(+, -, -, -)$, in order to obtain $$T^{\mu\nu}=-g^{\mu\nu}\mathcal{L}-2\frac{\delta\mathcal{L}}{\delta g_{\mu\nu}}.$$ First I tried to do it as follows:

$$\frac{\delta\mathcal{L}}{\delta g_{\mu\nu}}=\frac{1}{2}\frac{\delta\left(g_{\alpha\beta}\partial^\alpha\phi\partial^\beta\phi\right)}{\delta g_{\mu\nu}}=\frac{1}{2}\frac{\delta g_{\alpha\beta}}{\delta g_{\mu\nu}}\partial^\alpha\phi\partial^\beta\phi=\frac{1}{4}\left(\delta_\alpha^\mu\delta_\beta^\nu+\delta_\alpha^\nu\delta_\beta^\mu\right)\partial^\alpha\phi\partial^\beta\phi=\frac{1}{2}\partial^\mu\phi\partial^\nu\phi.$$

Here I have used the formula

$$\frac{\delta g_{\alpha\beta}}{\delta g_{\mu\nu}}=\frac{1}{2}\left(\delta_\alpha^\mu\delta_\beta^\nu+\delta_\alpha^\nu\delta_\beta^\mu\right)$$

that my professor derived in class.

But, if I write $\partial_\mu\phi\partial^\mu\phi=g^{\alpha\beta}\partial_\alpha\phi\partial_\beta\phi$, I would use the expression for the derivative of the inverse metric:

$$\frac{\delta g^{\alpha\beta}}{\delta g_{\mu\nu}}=-\frac{1}{2}\left(g^{\alpha\mu}g^{\beta\nu}+g^{\alpha\nu}g^{\beta\mu}\right)$$

This formula comes from the fact that $g^{\alpha\beta}g_{\beta\rho}=\delta^{\alpha}_{\rho}$, so

$$\frac{\delta g^{\alpha\beta}}{\delta g_{\mu\nu}}g_{\beta\rho}+g^{\alpha\beta}\frac{\delta g_{\beta\rho}}{\delta g_{\mu\nu}}=0$$

which leads to

$$\frac{\delta g^{\alpha\beta}}{\delta g_{\mu\nu}}g_{\beta\rho}=-\frac{1}{2}g^{\alpha\beta}\left(\delta^\mu_\beta\delta^\nu_\rho+\delta^\mu_\rho\delta^\nu_\beta\right)=-\frac{1}{2}\left(g^{\alpha\mu}\delta^{\nu}_{\rho}+g^{\alpha\nu}\delta^{\mu}_\rho\right)$$

multiplying by $g^{\rho\varphi}$, we get

$$\frac{\delta g^{\alpha\varphi}}{\delta g_{\mu\nu}}=-\frac{1}{2}\left(g^{\alpha\mu}g^{\varphi\nu}+g^{\alpha\nu}g^{\varphi\mu}\right)$$

and then, applying that, we would get

$$\frac{\delta\mathcal{L}}{\delta g_{\mu\nu}}=-\frac{1}{2}\partial^\mu\phi\partial^\nu\phi$$
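
Spelled out, and assuming $\partial_\alpha\phi$ is held fixed while the metric is varied (the mass term contains no metric), the intermediate step is

$$\frac{\delta\mathcal{L}}{\delta g_{\mu\nu}}=\frac{1}{2}\frac{\delta g^{\alpha\beta}}{\delta g_{\mu\nu}}\partial_\alpha\phi\,\partial_\beta\phi=-\frac{1}{4}\left(g^{\alpha\mu}g^{\beta\nu}+g^{\alpha\nu}g^{\beta\mu}\right)\partial_\alpha\phi\,\partial_\beta\phi=-\frac{1}{2}\partial^\mu\phi\,\partial^\nu\phi.$$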

I would like to understand what's causing this discrepancy. I suppose this question may be a duplicate, but it's still not clear to me from the answers I've read how to fix the sign error. This is my first time studying GR and I would be grateful for any responses.

Your formula for the inverse metric doesn't make sense to me. Can you explain it? — Valac, Jun 19 '22 at 18:35
Of course. We write the product of the metric and its inverse as the identity matrix, differentiate both sides with respect to the metric with lowered indices, and use the formula from the post. The result then follows after some manipulation. — Pedro Huot, Jun 19 '22 at 18:44
Where does the minus sign come from? If you lower $\alpha\beta$ from the formula your professor derived, do you get a minus sign? — Valac, Jun 19 '22 at 18:55
As the derivation of the formula for the inverse metric takes a few lines, I will edit the question to make clear where the minus sign comes from. In addition, did you mean to raise the indices $\alpha\beta$? — Pedro Huot, Jun 19 '22 at 19:03
In your first calculation, $\partial^{\alpha}\phi=g^{\alpha\mu}\partial_{\mu}\phi$ by definition, so you have to take this into account as well when doing the variation with respect to the metric. If you do this, you'll get the correct answer $\frac{\delta \mathcal{L}}{\delta g_{\mu\nu}}=-\frac{1}{2}\partial^{\mu}\phi\partial^{\nu}\phi$. — peek-a-boo, Jun 19 '22 at 19:38
Related: https://physics.stackexchange.com/q/228185/2451 , https://physics.stackexchange.com/q/149066/2451 — Qmechanic, Jun 19 '22 at 19:40
So, when calculating variations with respect to the metric, we have to lower the indices of $\partial^\mu$ because we define $\partial^\mu$ in terms of $\partial_\mu$? — Pedro Huot, Jun 19 '22 at 19:46
Yes, $\partial^{\mu}$ is defined in terms of $\partial_{\mu}$, so we need to make the presence of the metric explicit here if you want to vary; it's really a chain rule issue. Maybe I'll elaborate in an answer. — peek-a-boo, Jun 19 '22 at 19:48

score 6 · Accepted Answer · answered Jun 19 '22 at 20:44

This is just elaborating a little more on the 'behind the scenes', since OP's confusions seem to be resolved in the comments already.

Consider the following simplified situation. Let $f_1,f_2:\Bbb{R}^2\to\Bbb{R}$ be two functions defined as $f_1(x,y)=x^2y^3$ and $f_2(x,y)=xy^2$. These are clearly two different functions. Consider now two curves, $\gamma_1,\gamma_2:\Bbb{R}\to\Bbb{R}^2$ defined as $\gamma_1(t)=(t,t)$ and $\gamma_2(t)=(t,t^2)$. Then, you can easily verify that the composed maps are equal: for all $t\in\Bbb{R}$, we have $(f_1\circ\gamma_1)(t)=(f_2\circ\gamma_2)(t)=t^5$. On the other hand, let us calculate their partial derivatives with respect to $x$ along these curves: \begin{align} \frac{\partial f_1}{\partial x}\bigg|_{\gamma_1(t)}=2t^4,\quad\text{but}\quad\frac{\partial f_2}{\partial x}\bigg|_{\gamma_2(t)}=t^4. \end{align} This shouldn't be surprising: we started off with two different functions $f_1,f_2$, and we just happened to find two curves $\gamma_1,\gamma_2$ such that $f_1\circ\gamma_1=f_2\circ\gamma_2$. There's no reason to expect that this implies $\frac{\partial f_1}{\partial x}\circ \gamma_1= \frac{\partial f_2}{\partial x}\circ \gamma_2$, and in fact, as shown above, this equality is false.
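
As a quick check (an added sketch using only the $f_i$ and $\gamma_i$ defined above), compare the total derivatives along the curves, which pick up the $y$-dependence through the chain rule:

\begin{align}
\frac{d}{dt}(f_1\circ\gamma_1)(t)&=\frac{\partial f_1}{\partial x}\bigg|_{\gamma_1(t)}\cdot 1+\frac{\partial f_1}{\partial y}\bigg|_{\gamma_1(t)}\cdot 1=2t^4+3t^4=5t^4,\\
\frac{d}{dt}(f_2\circ\gamma_2)(t)&=\frac{\partial f_2}{\partial x}\bigg|_{\gamma_2(t)}\cdot 1+\frac{\partial f_2}{\partial y}\bigg|_{\gamma_2(t)}\cdot 2t=t^4+2t^3\cdot 2t=5t^4.
\end{align}

The partial derivatives in $x$ differ, but once the $t$-dependence of the second argument is included, both totals agree with $\frac{d}{dt}t^5=5t^4$.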

How does this relate to the Lagrangian? Fix any smooth manifold $M$, and consider the mappings

$\mathscr{L}_1: \Gamma(T^0_2(M))\times \Gamma(T^1_0(M))\to C^{\infty}(M)$ defined as $\mathscr{L}_1(H,\xi)=H(\xi,\xi)=H_{ab}\xi^a\xi^b.$
$\mathscr{L}_2: \Gamma(T^2_0(M))\times \Gamma(T^0_1(M))\to C^{\infty}(M)$ defined as $\mathscr{L}_2(K,\omega)=K(\omega,\omega)=K^{ab}\omega_a\omega_b.$

In words, $\mathscr{L}_1$ eats a $(0,2)$-tensor field in its first slot, and a vector field (a $(1,0)$ tensor field) in its second slot, and it outputs a smooth function by contracting the tensor field and vector field completely; $\mathscr{L}_2$ does a similar thing (contraction) except it has a different domain. Now, without any doubt, $\mathscr{L}_1$ and $\mathscr{L}_2$ are completely different maps.

Now, let us fix a scalar field $\phi$ on $M$. We now get two induced mappings via composition, denoted $\mathcal{L}_1$ and $\mathcal{L}_2$, defined on the space of metric tensors and taking values in $C^{\infty}(M)$, such that

$\mathcal{L}_1[g]:= \mathscr{L}_1(g,\text{grad}_g(\phi)):=\mathscr{L}_1(g,g^{\sharp}(d\phi))=g_{ab}\partial^a\phi\partial^b\phi$.
$\mathcal{L}_2[g]:= \mathscr{L}_2(g^{``-1"}, d\phi)= g^{ab}\partial_a\phi\partial_b\phi$.

Here, $g^{\sharp}$ denotes the musical isomorphism which converts covector fields into vector fields (the index-raising operation), and $g^{``-1"}$ denotes the 'inverse' metric tensor (I put 'inverse' in quotation marks since a $(0,2)$ tensor field strictly speaking doesn't have an inverse; rather we refer to a corresponding $(2,0)$ tensor).

So you see, the composed functions $\mathcal{L}_1$ and $\mathcal{L}_2$ are equal. However, the variations $\frac{\delta \mathscr{L}_1}{\delta H}\bigg|_{(g,\text{grad}_g\phi)}$ and $\frac{\delta\mathscr{L}_2 }{\delta K}\bigg|_{(g^{``-1"}, d\phi)}$ are not equal.
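
Explicitly, as a minimal sketch that treats the two slots of each $\mathscr{L}_i$ as independent and takes $H$, $K$ symmetric, the two partial variations are

\begin{align}
\frac{\delta\mathscr{L}_1}{\delta H_{ab}}\bigg|_{(g,\text{grad}_g\phi)}=\xi^a\xi^b\Big|_{\xi=\text{grad}_g\phi}=\partial^a\phi\,\partial^b\phi,
\qquad
\frac{\delta\mathscr{L}_2}{\delta K^{ab}}\bigg|_{(g^{``-1"},d\phi)}=\omega_a\omega_b\Big|_{\omega=d\phi}=\partial_a\phi\,\partial_b\phi.
\end{align}

These are not even tensors of the same type. Moreover, for $\mathscr{L}_1$ the second slot still depends on $g$, while for $\mathscr{L}_2$ the first slot is $g^{``-1"}$ rather than $g$, so neither partial variation alone is the variation of $\mathcal{L}_i$ with respect to $g$.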

Hopefully the analogy with the above simple case is clear: $\mathscr{L}_i$ is like $f_i$, the map $g\mapsto (g,\text{grad}_g(\phi))$ is like the curve $\gamma_1$, and the map $g\mapsto (g^{``-1"},d\phi)$ is like the curve $\gamma_2$, and it turns out their compositions are equal: $\mathcal{L}_1=\mathcal{L}_2$. But that doesn't mean the original maps are equal, nor does it imply that the derivatives evaluated along these 'curves' are equal (remember that in variational calculus, we always perform the variation first, and only afterwards evaluate).

So roughly speaking, your first calculation corresponds to $\mathscr{L}_1$, where we view $\partial^a\phi$ as the independent variables, whereas in the latter case we view $\partial_a\phi$ as the independent variables. In going from one to the other, factors of the metric appear. Lastly, in physics, we view the second situation as more 'fundamental', i.e. $\partial_a\phi$ is the basic quantity (after all, the exterior derivative $d\phi$ can be defined without any metric).
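
These metric factors can be tracked explicitly. A sketch, using the two formulas for $\frac{\delta g_{\alpha\beta}}{\delta g_{\mu\nu}}$ and $\frac{\delta g^{\alpha\beta}}{\delta g_{\mu\nu}}$ quoted in the question, and writing $\partial^\alpha\phi=g^{\alpha\rho}\partial_\rho\phi$ with $\partial_\rho\phi$ held fixed:

\begin{align}
\frac{\delta}{\delta g_{\mu\nu}}\left(g_{\alpha\beta}\,g^{\alpha\rho}g^{\beta\sigma}\,\partial_\rho\phi\,\partial_\sigma\phi\right)
&=\frac{\delta g_{\alpha\beta}}{\delta g_{\mu\nu}}\,\partial^\alpha\phi\,\partial^\beta\phi
+2\,g_{\alpha\beta}\,\frac{\delta g^{\alpha\rho}}{\delta g_{\mu\nu}}\,\partial_\rho\phi\,\partial^\beta\phi \\
&=\partial^\mu\phi\,\partial^\nu\phi-2\,\partial^\mu\phi\,\partial^\nu\phi
=-\partial^\mu\phi\,\partial^\nu\phi .
\end{align}

Multiplying by the overall factor $\frac{1}{2}$ gives $\frac{\delta\mathcal{L}}{\delta g_{\mu\nu}}=-\frac{1}{2}\partial^\mu\phi\,\partial^\nu\phi$, in agreement with the second calculation. The first calculation kept only the first term on the right-hand side.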

score 0 · Answer 2 · answered Jun 20 '22 at 09:41

I want to summarize the answer given by @peek-a-boo in a physicist-friendly way:

The identity $g_{\alpha\beta}g^{\beta\delta}=\delta_{\alpha}^{\delta}$ is actually a relation between the metric tensor $\mathbf{g}$ and its inverse $\mathbf{g}^{-1}$, i.e. $\mathbf{g}\cdot\mathbf{g}^{-1}=\mathbf{1}$.

The stress-energy-momentum tensor is usually defined by the functional derivative of the action with respect to the metric tensor, i.e. $$T^{\mu\nu}[\mathbf{g}](x)=-\frac{2}{\sqrt{|g|}}\frac{\delta S[\mathbf{g}]}{\delta g_{\mu\nu}(x)}.$$

If you want to perform the functional derivative with respect to the inverse metric, then you must follow the chain rule, i.e. \begin{align} T_{\mu\nu}[\mathbf{g}^{-1}](x)&=-\frac{2}{\sqrt{|g|}}\frac{\delta S[\mathbf{g}]}{\delta g^{\mu\nu}(x)} \\ &=-\frac{2}{\sqrt{|g|}}\int d^{4}y\frac{\delta g_{\alpha\beta}(y)}{\delta g^{\mu\nu}(x)}\frac{\delta S[\mathbf{g}]}{\delta g_{\alpha\beta}(y)} \\ &=-\frac{2}{\sqrt{|g|}}\int d^{4}y\left[-\frac{1}{2}(g_{\alpha\mu}(x)g_{\beta\nu}(x)+g_{\alpha\nu}(x)g_{\beta\mu}(x))\delta(x-y)\right]\frac{\delta S[\mathbf{g}]}{\delta g_{\alpha\beta}(y)} \\ &=-\frac{1}{2}\left[g_{\alpha\mu}(x)g_{\beta\nu}(x)+g_{\alpha\nu}(x)g_{\beta\mu}(x)\right]T^{\alpha\beta}[\mathbf{g}](x) \\ &=-T_{\mu\nu}[\mathbf{g}](x). \end{align}

So depending on your conventions, there is a minus sign.
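
The kernel $\frac{\delta g_{\alpha\beta}(y)}{\delta g^{\mu\nu}(x)}$ in the third line above is used without derivation. A sketch of where it comes from, by the same argument the question uses for the inverse metric: varying $g_{\alpha\beta}g^{\beta\rho}=\delta_\alpha^\rho$ gives

\begin{align}
\delta g_{\alpha\beta}\,g^{\beta\rho}+g_{\alpha\beta}\,\delta g^{\beta\rho}=0
\quad\Longrightarrow\quad
\delta g_{\alpha\sigma}=-g_{\alpha\beta}\,g_{\sigma\rho}\,\delta g^{\beta\rho},
\end{align}

and symmetrizing in $\mu\nu$ yields $\frac{\delta g_{\alpha\beta}(y)}{\delta g^{\mu\nu}(x)}=-\frac{1}{2}\left(g_{\alpha\mu}g_{\beta\nu}+g_{\alpha\nu}g_{\beta\mu}\right)\delta(x-y)$.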
