12

For a variation of the metric $g_{\mu\nu}$ with respect to $g_{\alpha\beta}$ you might expect the result (at least I did):

\begin{equation} \frac{\delta g_{\mu\nu}}{\delta g_{\alpha\beta}}= \delta_\mu^\alpha\delta_\nu^\beta. \end{equation}

But then, to preserve the fact that $g_{\mu\nu}$ is symmetric under interchange of $\mu$ and $\nu$, we should probably symmetrise the right-hand side like this:

\begin{equation}\frac{\delta g_{\mu\nu}}{\delta g_{\alpha\beta}}= \delta_\mu^\alpha\delta_\nu^\beta + \delta_\mu^\beta\delta_\nu^\alpha.\end{equation}

Is this reasonable/correct? If not, why not?

It seems that I can derive some weird results if this is right (or maybe I'm just making other mistakes).

Qmechanic
  • 201,751
Sam
  • 308

2 Answers

16

Since the metric $g_{\mu\nu}=g_{\nu\mu}$ is symmetric, we must demand that

$$\begin{align} \delta g_{\mu\nu}~=~&\delta g_{\nu\mu}\cr~=~&\frac{1}{2}\left(\delta g_{\mu\nu}+\delta g_{\nu\mu}\right)\cr~=~&\frac{1}{2}\left( \delta_{\mu}^{\alpha}\delta_{\nu}^{\beta} + \delta_{\nu}^{\alpha}\delta_{\mu}^{\beta}\right)\delta g_{\alpha\beta},\end{align}\tag{1}$$

and therefore

$$ \frac{\delta g_{\mu\nu}}{\delta g_{\alpha\beta}} ~=~\frac{1}{2}\left( \delta_{\mu}^{\alpha}\delta_{\nu}^{\beta} + \delta_{\nu}^{\alpha}\delta_{\mu}^{\beta}\right).\tag{2}$$

The price we pay to treat the matrix entries $g_{\alpha\beta}$ as $n^2$ independent variables (as opposed to $\frac{n(n+1)}{2}$ symmetric elements) is that there appears a half in the off-diagonal variations.
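As a quick consistency check of eq. (2) (a worked example added for illustration; the choice $f=g_{12}$ is just a convenient test function), sum over all $n^2$ index pairs in the chain rule:

$$ \delta f ~=~ \frac{\delta g_{12}}{\delta g_{\alpha\beta}}\,\delta g_{\alpha\beta} ~=~ \frac{1}{2}\,\delta g_{12} + \frac{1}{2}\,\delta g_{21} ~=~ \delta g_{12}. $$

The two halves, one from $(\alpha,\beta)=(1,2)$ and one from $(\alpha,\beta)=(2,1)$, add up to the expected unit response, because $\delta g_{21}=\delta g_{12}$.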

Another check of the formalism is that the RHS and LHS of eq. (2) should be idempotents because of the chain rule. For further motivation, see e.g. this Phys.SE post.
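The idempotency check can also be done numerically. The following is a minimal sketch, assuming NumPy and a dimension $n=4$ chosen only for illustration; the helper names `sym_projector` and `compose` are hypothetical. It builds the right-hand side of eq. (2) as a 4-index array and compares it with the unnormalised sum $\delta_\mu^\alpha\delta_\nu^\beta+\delta_\mu^\beta\delta_\nu^\alpha$ proposed in the question.

```python
import numpy as np

def sym_projector(n):
    """P[mu, nu, alpha, beta] = 1/2 (d_mu^alpha d_nu^beta + d_nu^alpha d_mu^beta)."""
    eye = np.eye(n)
    return 0.5 * (np.einsum("ma,nb->mnab", eye, eye)
                  + np.einsum("na,mb->mnab", eye, eye))

def compose(A, B):
    """Contract the last index pair of A with the first index pair of B."""
    return np.einsum("mnab,abrs->mnrs", A, B)

n = 4                      # illustrative dimension (an assumption)
P = sym_projector(n)       # right-hand side of eq. (2)
Q = 2.0 * P                # the question's guess, without the factor 1/2

is_P_idempotent = np.allclose(compose(P, P), P)
is_Q_idempotent = np.allclose(compose(Q, Q), Q)
print(is_P_idempotent, is_Q_idempotent)
```

With the factor $\frac{1}{2}$ the array composes with itself to give itself, while the unnormalised version picks up an extra factor of 2 on each composition, so it cannot be consistent with the chain rule.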

Qmechanic
  • 201,751
  • I also considered this, but consider

    $\frac{\delta g^{12}}{\delta g^{12}} = \frac{1}{2}(\delta^1_1\delta^2_2 + \delta^1_2\delta^2_1) = \frac{1}{2}$

    Would I not expect to get 1?

    – Sam Nov 26 '14 at 23:57
  • Thank you, I had to think about it for a while but it makes sense now :) – Sam Nov 27 '14 at 21:48
  • Hey, Sam. I got linked to this question from a similar one I had: https://physics.stackexchange.com/q/335173/.

    I still don't quite get the issue with the factors of $\frac{1}{2}$. Why did it eventually make sense to you that $\frac{\partial g^{12}}{\partial g^{12}} = \frac{1}{2}$? That just looks so wrong to me.

    – Klein Four May 26 '17 at 20:34
  • 4
    The point is that the two elements $g_{12}=g_{21}$ are equal, so they cannot be varied independently. If you (wrongly) think that $\frac{\partial g_{12}}{\partial g_{12}}$ should be $1$, while $\frac{\partial g_{21}}{\partial g_{12}}$ should be $0$, then recall that they are actually equal: $\frac{\partial g_{12}}{\partial g_{12}}=\frac{\partial g_{21}}{\partial g_{12}}$! – Qmechanic May 26 '17 at 20:50
1

Another approach, adapted here to the symmetric matrices of the OP but equally valid for general symmetric arrays and in many other contexts, is to consider the space $S^2_n$ of $n\times n$ symmetric matrices, viewed as the coordinate space $S^2_n=\mathbb R^{n(n+1)/2}$ with standard coordinates $$(s^{ij})_{i\le j}=(s^{11},s^{12},...,s^{1n},s^{22},s^{23},...,s^{33},s^{34},...).$$ A function $f(s)=f(s^{11},s^{12},...,s^{1n},s^{22},s^{23},...)$ thus does not depend on any of the $s^{ji}$ ($i<j$), because those are not actually coordinates on $S^2_n$.

Also define $T^2_n$ to be the space of $n\times n$ matrices, viewed as $T^2_n=\mathbb R^{n^2}$, with (unrestricted) coordinates $$(x^{ij})_{i,j=1,...,n}=(x^{11},...,x^{1n},x^{21},...).$$

In this approach, $S^2_n$ is not a subspace of $T^2_n$, but we can define two maps, an inclusion $\imath:S^2_n\rightarrow T^2_n$ and a projection/symmetrization $\sigma:T^2_n\rightarrow S^2_n$, defined as follows.

If $s=(s^{ij})_{i\le j}$, then $x^{ij}=\imath(s)^{ij}$ is defined as $$x^{ij}=\left\{\begin{matrix} s^{ij} & i\le j \\ s^{ji} & i>j\end{matrix}\right. ,$$ i.e. it is the natural extension of $(s^{ij})_{i\le j}$ into a symmetric array. The projection/symmetrization $s=\sigma(x)$ is defined by $$ s^{ij}=x^{(ij)}=\frac{1}{2}(x^{ij}+x^{ji}),\ i\le j, $$ i.e. we first symmetrize $x^{ij}$, and then restrict the values of the indices so that $i\le j$.

Clearly $\sigma$ is a left inverse of $\imath$ in the sense that $\sigma\circ\imath=\mathrm{Id}_{S^2_n}$.


We can now consider tangent vectors on $S^2_n$ and $T^2_n$. On the former, a generic tangent vector has the form $$ v=\sum_{i\le j}v^{ij}\frac{\partial}{\partial s^{ij}}|_s \in T_sS^2_n,$$ whereas on the latter we have $$ w=\sum_{i,j}w^{ij}\frac{\partial}{\partial x^{ij}}|_x\in T_xT^2_n. $$

We attempt to relate derivatives with respect to the restricted variables $s^{ij}$ to derivatives with respect to the unrestricted variables $x^{ij}$ by noting that for any vector $v\in T_sS^2_n$ and function $f\in C^\infty(S^2_n)$ we have $$ v(f)=v(f\circ\sigma\circ\imath)=\imath_\ast(v)(\sigma^\ast f), $$ where $\imath_\ast$ is pushforward along the inclusion and $\sigma^\ast$ is pullback along the symmetrization.

We have $$ (\sigma^\ast f)(x)=(f\circ\sigma)(x)=f(\sigma(x)). $$ Calculating the derivative gives $$ \frac{\partial(f\circ\sigma)}{\partial x^{ij}}(x)=\sum_{k\le l}\frac{\partial f}{\partial s^{kl}}(\sigma(x))\frac{\partial\sigma^{kl}}{\partial x^{ij}}(x) = \sum_{k\le l}\frac{\partial f}{\partial s^{kl}}(\sigma(x))\frac{\partial\frac{1}{2}(x^{kl}+x^{lk})}{\partial x^{ij}}(x) \\= \sum_{k\le l}\frac{\partial f}{\partial s^{kl}}(\sigma(x))\frac{1}{2}(\delta^k_i\delta^l_j+\delta^l_i\delta^k_j) , $$ and as such for any tangent vector $w=\sum_{ij}w^{ij}\frac{\partial}{\partial x^{ij}}$ we get $$ w(f\circ \sigma)=\sum_{i,j}w^{ij}\frac{\partial (f\circ\sigma)}{\partial x^{ij}}(x)=\sum_{i,j}\sum_{k\le l} w^{ij}\frac{\partial f}{\partial s^{kl}}(\sigma(x))\frac{1}{2}(\delta^k_i\delta^l_j+\delta^l_i\delta^k_j) \\ =\sum_{k\le l}\frac{1}{2}\left( w^{kl}+w^{lk}\right)\frac{\partial f}{\partial s^{kl}}(\sigma(x)). $$

Let us now assume that there is a tangent vector $v=\sum_{i\le j}v^{ij}\frac{\partial}{\partial s^{ij}}|_s\in T_sS^2_n$ such that $w=\imath_\ast(v)$. It is very easy to check (using for example the definition of the tangent map in terms of curves) that $\imath_\ast(v)$ is just $\sum_{ij}w^{ij}\frac{\partial}{\partial x^{ij}}|_{x=\imath(s)}$, where the $w^{ij}$ are just the symmetric extensions of the $v^{ij}$.

We thus obtain rigorously that $$ v(f)=\sum_{i\le j}v^{ij}\frac{\partial f}{\partial s^{ij}}|_s=\sum_{i,j}v^{ij}\frac{\partial (f\circ\sigma)}{\partial x^{ij}}|_{x=\imath(s)}, $$ where in the last equality the components $v^{ij}$ have been automatically symmetrically extended.


Now let us apply this to the case when $f=s^{kl}$ is a coordinate function, and let us extend the coordinate functions symmetrically so that $s^{kl}=s^{lk}$ for unrestricted values of the indices. Then $$v(s^{kl})=\sum_{i\le j}v^{ij}\frac{\partial s^{kl}}{\partial s^{ij}}=\sum_{i,j}v^{ij}\frac{\partial(s^{kl}\circ\sigma)}{\partial x^{ij}}=\sum_{i,j}v^{ij}\frac{\partial\sigma^{kl}}{\partial x^{ij}}\\=\sum_{ij}v^{ij}\frac{1}{2}(\delta^k_i\delta^l_j+\delta^l_i\delta^k_j).$$ We should look at this as the equality $$ \sum_{ij}v^{ij}\frac{\partial s^{kl}\circ\sigma}{\partial x^{ij}}=\sum_{ij}v^{ij}\frac{1}{2}(\delta^k_i\delta^l_j+\delta^l_i\delta^k_j), $$ which is true for all (even unsymmetric) $v^{ij}$ and we get $$ \frac{\partial s^{kl}\circ\sigma}{\partial x^{ij}}=\frac{1}{2}(\delta^k_i\delta^l_j+\delta^l_i\delta^k_j). $$ This is a result we could have obtained much earlier with much less fluff, buuut....


The point of the fluff is that it does not make sense to calculate $\partial s^{kl}/\partial s^{ij}$ for $i>j$, so we define $$ \frac{\partial s^{kl}}{\partial s^{ij}}\equiv\frac{\partial s^{kl}\circ\sigma}{\partial x^{ij}}=\frac{1}{2}(\delta^k_i\delta^l_j+\delta^l_i\delta^k_j) $$ as a kind of abuse of notation. But this abuse is not dangerous because, as we have seen, for any array $(v^{ij})_{i\le j}$ which we automatically symmetrically extend, we have $$\sum_{i\le j}v^{ij}\frac{\partial f}{\partial s^{ij}}\equiv\sum_{i,j}v^{ij}\frac{\partial (f\circ\sigma)}{\partial x^{ij}}. $$ Therefore, if we use $\partial f/\partial s^{ij}$ with unrestricted indices as a shorthand for the symmetric $\partial (f\circ\sigma)/\partial x^{ij}$, and replace all restricted sums with unrestricted sums, we get the same results. The price of this identification is the weird result $\partial s^{12}/\partial s^{12}=1/2$.
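The identification above can be sanity-checked numerically. The following is a rough sketch, assuming NumPy, $n=3$, and an arbitrary smooth test function `f`; the names `sigma`, `iota`, `grad` and `f` are illustrative choices, not part of the argument. It compares the restricted sum $\sum_{i\le j}v^{ij}\,\partial f/\partial s^{ij}$ with the unrestricted sum $\sum_{i,j}v^{ij}\,\partial(f\circ\sigma)/\partial x^{ij}$ using central finite differences, and evaluates $\partial(s^{12}\circ\sigma)/\partial x^{ij}$.

```python
import numpy as np

def sigma(x):
    """Symmetrise an n x n array and keep only the entries with i <= j."""
    iu = np.triu_indices(x.shape[0])
    return (0.5 * (x + x.T))[iu]

def iota(s, n):
    """Extend the i <= j coordinates to a full symmetric n x n array."""
    x = np.zeros((n, n))
    x[np.triu_indices(n)] = s
    return x + np.triu(x, k=1).T

def grad(func, p, h=1e-6):
    """Central finite-difference gradient of func at p (any array shape)."""
    g = np.zeros_like(p, dtype=float)
    for idx in np.ndindex(p.shape):
        e = np.zeros_like(p, dtype=float)
        e[idx] = h
        g[idx] = (func(p + e) - func(p - e)) / (2 * h)
    return g

def f(s):
    """Arbitrary smooth test function of the n(n+1)/2 coordinates."""
    return np.sum(np.sin(s)) + s[0] * s[1]

n = 3
rng = np.random.default_rng(0)
s = rng.normal(size=n * (n + 1) // 2)   # point in S^2_n
v = rng.normal(size=n * (n + 1) // 2)   # tangent components v^{ij}, i <= j

# Restricted sum over i <= j with derivatives in the s coordinates.
lhs = np.dot(v, grad(f, s))
# Unrestricted sum over all i, j with symmetrically extended v.
rhs = np.sum(iota(v, n) * grad(lambda x: f(sigma(x)), iota(s, n)))

# d(s^{12} o sigma)/dx^{ij}: flat index 1 is the (1,2) entry in 1-based labels.
d_coord = grad(lambda x: sigma(x)[1], iota(s, n))
expected = np.zeros((n, n))
expected[0, 1] = expected[1, 0] = 0.5
print(lhs, rhs)
print(d_coord)
```

The two sums agree up to finite-difference error, and the coordinate derivative has the value $1/2$ in both the $(1,2)$ and $(2,1)$ slots, which is the "weird" factor discussed above.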

Bence Racskó
  • 10,948