Confused with 4-vector notation and 4-derivative

Question

I have a lot of trouble finding out what the rules are for doing algebra and calculus with 4-vectors. This example shall illustrate one of my problems:

The Lagrangian for a real scalar field is

$$\mathcal{L}=\frac{1}{2}\eta^{\mu\nu}\partial_{\mu}\phi\partial_{\nu}\phi-\frac{1}{2}m^2\phi^2.$$ When trying to solve the Euler-Lagrange equation I do not know how to evaluate $(\frac{\partial\mathcal{L}}{\partial(\partial_{\mu}\phi)})$. These are the ideas I have:

1. $$(\frac{\partial\mathcal{L}}{\partial(\partial_{\mu}\phi)})=\frac{1}{2}\eta^{\mu\nu}\partial_{\nu}\phi=\frac{1}{2}\partial^{\mu}\phi.$$

2. The Lagrangian can be written as $\mathcal{L}=\frac{1}{2}\eta^{\mu\nu}\partial_{\mu}\phi\partial_{\nu}\phi-\frac{1}{2}m^2\phi^2=\frac{1}{2}\partial_{\mu}\phi\partial^{\mu}\phi-\frac{1}{2}m^2\phi^2$, so

$$(\frac{\partial\mathcal{L}}{\partial(\partial_{\mu}\phi)})=\frac{1}{2}\partial^{\mu}\phi.$$

3. But the Lagrangian can also be written as $\mathcal{L}=\frac{1}{2}\partial^{\nu}\phi\partial_{\nu}\phi-\frac{1}{2}m^2\phi^2$, so in this case how do I evaluate it? Do I change $(\frac{\partial\mathcal{L}}{\partial(\partial_{\mu}\phi)})$ to $(\frac{\partial\mathcal{L}}{\partial(\partial_{\nu}\phi)})$?

4. We can write the Lagrangian as $\mathcal{L}=\frac{1}{2}\partial_{\mu}\phi\partial^{\mu}\phi-\frac{1}{2}m^2\phi^2=\frac{1}{2}(\partial_{\mu}\phi)^2-\frac{1}{2}m^2\phi^2$, so in this case we have

$$(\frac{\partial\mathcal{L}}{\partial(\partial_{\mu}\phi)})=\frac{1}{2}2(\partial_{\mu}\phi)=\partial_{\mu}\phi.$$

None of these gives the correct result, $\partial^{\mu}\phi$! How do I get the correct answer? What am I doing wrong? Are there any resources that can help me specifically with such problems?

Vincent Thacker · Accepted Answer · 2021-08-22T09:06:25.810

You seem to be having some problems with index notation and the Einstein summation convention, so I recommend brushing up on those.

Firstly, the $\mu$ in $\mathcal{L}$ is a dummy index, while the $\mu$ in $\partial/\partial\left(\partial_\mu\phi\right)$ is a live (free) index. You cannot write both of them as $\mu$ or else you will run into problems. For example, if you are going to use $\mu$ in $\partial/\partial\left(\partial_\mu\phi\right)$, you should change the dummy indices in $\mathcal{L}$ to something like $$\mathcal{L}=\frac{1}{2}\eta^{\rho\lambda}\partial_{\rho}\phi\partial_{\lambda}\phi-\frac{1}{2}m^2\phi^2.$$

Secondly, $\partial^\mu \phi$ is not independent of $\partial_\mu\phi$, so you cannot treat it as a constant in your second point. In addition, in your fourth point, you cannot possibly have $\partial_\mu\phi \partial^\mu\phi = \left(\partial_\mu\phi\right)^2$ as you have a dummy index on the left but a live index on the right.
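To make the dependence concrete, here is a short worked step (a sketch, assuming the flat metric $\eta^{\nu\lambda}$ is constant). Raising the index with the metric shows how $\partial^\nu\phi$ is built from the lower-index components:

$$\partial^\nu\phi=\eta^{\nu\lambda}\partial_\lambda\phi\quad\Longrightarrow\quad\frac{\partial(\partial^\nu\phi)}{\partial(\partial_\mu\phi)}=\eta^{\nu\lambda}\delta^\mu_\lambda=\eta^{\nu\mu}.$$

So in $\frac{1}{2}\partial_\nu\phi\,\partial^\nu\phi$ both factors contribute to the derivative, which is where the factor of 2 missing from your first two attempts comes from.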

Lastly, the derivative of one component with respect to another component of the same object is a Kronecker delta. For example, $$\frac{\partial v_1}{\partial v_\mu} = \delta^\mu_1$$ since the components are independent variables. Then you can apply this to the expanded expression $\partial_\mu\phi \partial^\mu\phi = -\partial_0\phi \partial_0\phi+\partial_1\phi \partial_1\phi+\partial_2\phi \partial_2\phi+\partial_3\phi \partial_3\phi$ (written here in the $(-,+,+,+)$ signature; in the $(+,-,-,-)$ signature every sign flips). Alternatively, if you are confident enough, you could also apply the product rule directly to $\eta^{\rho\lambda}\partial_{\rho}\phi\partial_{\lambda}\phi$.
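For illustration, the component route might look as follows. This is only a sketch: it takes the signs of the expansion above at face value, and the label $K\equiv\frac{1}{2}\partial_\rho\phi\,\partial^\rho\phi$ for the kinetic term is introduced here purely for convenience. Differentiating with respect to one component at a time gives

$$\frac{\partial K}{\partial(\partial_0\phi)}=\frac{\partial}{\partial(\partial_0\phi)}\,\frac{1}{2}\left(-\partial_0\phi\,\partial_0\phi+\partial_1\phi\,\partial_1\phi+\partial_2\phi\,\partial_2\phi+\partial_3\phi\,\partial_3\phi\right)=-\partial_0\phi=\partial^0\phi,$$

$$\frac{\partial K}{\partial(\partial_1\phi)}=\partial_1\phi=\partial^1\phi,$$

and likewise for the components 2 and 3, so all four results are summarised by $\partial K/\partial(\partial_\mu\phi)=\partial^\mu\phi$. The mass term contains no derivatives of $\phi$ and does not contribute. Inserting this into the Euler-Lagrange equation $\partial_\mu\left(\frac{\partial\mathcal{L}}{\partial(\partial_\mu\phi)}\right)-\frac{\partial\mathcal{L}}{\partial\phi}=0$ should then give

$$\partial_\mu\partial^\mu\phi+m^2\phi=0.$$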

Hope this clears up (at least some of) your confusion.

Explicitly, by the product rule $$\tfrac{\partial}{\partial(\partial_\mu\phi)}(\tfrac12\eta^{\rho\lambda}\partial_\rho\phi\partial_\lambda\phi)=\tfrac12\eta^{\rho\lambda}(\delta_\rho^\mu\partial_\lambda\phi+\partial_\rho\phi\delta_\lambda^\mu)=\tfrac12(\eta^{\mu\lambda}\partial_\lambda\phi+\eta^{\rho\mu}\partial_\rho\phi)=\partial^\mu\phi.$$ — J.G., Aug 22 '21 at 09:00

Your answer and the comment by J.G. helped a lot. I can now see how to get the correct answer. However, I've got one more question. I know what $\partial_{\mu}x^{\mu}$ is: as you said, $\mu$ here is a dummy index, so I have to expand it. But what is $\partial_{\mu}\phi$? Is it the 4-vector $(\partial_{t}\phi,\partial_{x}\phi,\partial_{y}\phi,\partial_{z}\phi)$ or just a representation for the 4 possible partial derivatives? My textbook defines $\partial_{\mu}$ as the operator $(\partial_{t},\partial_{x},\partial_{y},\partial_{z})$. — Shiki Ryougi, Aug 22 '21 at 09:40

@ColourfulSpacetime It is a representation for the 4 possible partial derivatives, one in each coordinate direction. The collection of partial derivatives forms the components of the covector field (or one-form) $\text{d}\phi = \partial_\mu\phi \text{d}x^\mu$. Note that this only works for scalars; for tensors you'll need to use the covariant derivative. — Vincent Thacker, Aug 22 '21 at 09:47
