
In considering the (special) relativistic EM field, I understand that assuming a Lagrangian density of the form

$$\mathcal{L} =-\frac{\epsilon_0}{4}F_{\mu\nu}F^{\mu\nu} + \frac{1}{c}j_\mu A^\mu$$

and following the Euler-Lagrange equations recovers Maxwell's equations.

Does there exist a first-principles derivation of this Lagrangian? A reference or explanation would be greatly appreciated!

my2cts
  • 24,097
mcamac
  • 423

7 Answers

Abstract

In the following we'll prove that a compatible Lagrangian density for the electromagnetic field in the presence of charges and currents is \begin{equation} \mathcal{L}_{em}\:=\:\epsilon_{0}\cdot\dfrac{\left|\!\left|\mathbf{E}\right|\!\right|^{2}-c^{2}\left|\!\left|\mathbf{B}\right|\!\right|^{2}}{2}-\rho \phi + \mathbf{j} \boldsymbol{\cdot} \mathbf{A} \tag{045} \end{equation} that is, the Euler-Lagrange equations produced from this Lagrangian are the Maxwell equations for the electromagnetic field.

This Lagrangian density is derived by a trial and error procedure, not by guessing.

1. Introduction

Maxwell's differential equations of the electromagnetic field in the presence of charges and currents are \begin{align} \boldsymbol{\nabla} \boldsymbol{\times} \mathbf{E} & = -\frac{\partial \mathbf{B}}{\partial t} \tag{001a}\\ \boldsymbol{\nabla} \boldsymbol{\times} \mathbf{B} & = \mu_{0}\mathbf{j}+\frac{1}{c^{2}}\frac{\partial \mathbf{E}}{\partial t} \tag{001b}\\ \boldsymbol{\nabla\cdot} \mathbf{E} & = \frac{\rho}{\epsilon_{0}} \tag{001c}\\ \boldsymbol{\nabla\cdot}\mathbf{B}& = 0 \tag{001d} \end{align} where $\: \mathbf{E} =$ electric field intensity vector, $\:\mathbf{B}=$ magnetic-flux density vector, $\:\rho=$ electric charge density, $\:\mathbf{j} =$ electric current density vector. All quantities are functions of the three space coordinates $\:\left( x_{1},x_{2},x_{3}\right) \equiv \left( x,y,z\right)\:$ and time $\:t \equiv x_{4}\:$.

From equation (001d) the magnetic-flux vector $\:\mathbf{B}\:$ may be expressed as the curl of a vector potential $\:\mathbf{A}\:$ \begin{equation} \mathbf{B}=\boldsymbol{\nabla} \boldsymbol{\times} \mathbf{A} \tag{002} \end{equation} and from (002) equation (001a) yields \begin{equation} \boldsymbol{\nabla} \boldsymbol{\times}\left(\mathbf{E}+\frac{\partial \mathbf{A}}{\partial t}\right) =\boldsymbol{0} \tag{003} \end{equation} so the parentheses term may be expressed as the gradient of a scalar function \begin{equation*} \mathbf{E}+\frac{\partial \mathbf{A}}{\partial t} =-\boldsymbol{\nabla}\phi \end{equation*} that is \begin{equation} \mathbf{E} =-\boldsymbol{\nabla}\phi -\frac{\partial \mathbf{A}}{\partial t} \tag{004} \end{equation} So the six scalar variables, the components of vectors $\:\mathbf{E}\:$ and$\:\mathbf{B}\:$, can be expressed as functions of 4 scalar variables, the scalar potential $\:\phi\:$ and three components of vector potential $\:\mathbf{A}$.
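
As a quick consistency check (a sketch that uses only the standard identities $\:\boldsymbol{\nabla\cdot}\left(\boldsymbol{\nabla}\boldsymbol{\times}\mathbf{A}\right)=0\:$ and $\:\boldsymbol{\nabla}\boldsymbol{\times}\boldsymbol{\nabla}\phi=\boldsymbol{0}$), the potentials of equations (002) and (004) satisfy the two homogeneous equations (001d) and (001a) identically: \begin{align*} \boldsymbol{\nabla\cdot}\mathbf{B} & = \boldsymbol{\nabla\cdot}\left(\boldsymbol{\nabla}\boldsymbol{\times}\mathbf{A}\right)=0 \\ \boldsymbol{\nabla}\boldsymbol{\times}\mathbf{E} & = -\boldsymbol{\nabla}\boldsymbol{\times}\left(\boldsymbol{\nabla}\phi\right)-\frac{\partial }{\partial t}\left(\boldsymbol{\nabla}\boldsymbol{\times}\mathbf{A}\right)=-\frac{\partial \mathbf{B}}{\partial t} \end{align*} so only the two inhomogeneous equations (001b) and (001c) remain as conditions on $\:\phi\:$ and $\:\mathbf{A}$.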

Inserting the expressions of $\:\mathbf{E}\:$ and $\:\mathbf{B}\:$, equations (004) and (002) respectively, in equations (001b) and (001c) we have
\begin{equation} \boldsymbol{\nabla} \boldsymbol{\times} \left(\boldsymbol{\nabla} \boldsymbol{\times} \mathbf{A}\right) =\mu_{0}\mathbf{j}+\frac{1}{c^{2}}\frac{\partial }{\partial t}\left(-\boldsymbol{\nabla}\phi -\frac{\partial \mathbf{A}}{\partial t}\right) \tag{005} \end{equation}
and \begin{equation} \boxed{\: -\nabla^{2}\phi-\frac{\partial }{\partial t}\left(\boldsymbol{\nabla\cdot}\mathbf{A}\right) =\frac{\rho}{\epsilon_{0}} \:\vphantom{\dfrac{\dfrac{a}{b}}{\dfrac{a}{b}}}} \tag{006} \end{equation} Given that \begin{equation} \boldsymbol{\nabla} \boldsymbol{\times} \left( \boldsymbol{\nabla} \boldsymbol{\times} \mathbf{A}\right) =\boldsymbol{\nabla}\left(\boldsymbol{\nabla\cdot} \mathbf{A}\right)- \nabla^{2}\mathbf{A} \tag{007} \end{equation} equation (005) yields \begin{equation} \boxed{\: \frac{1}{c^{2}}\frac{\partial^{2}\mathbf{A}}{\partial t^{2}}-\nabla^{2}\mathbf{A}+ \boldsymbol{\nabla}\left(\boldsymbol{\nabla\cdot} \mathbf{A}+\frac{1}{c^{2}}\frac{\partial \phi}{\partial t}\right) =\mu_{0}\mathbf{j} \:\vphantom{\dfrac{\dfrac{a}{b}}{\dfrac{a}{b}}}} \tag{008} \end{equation}
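
The step from (005) to (008) can be spelled out as follows, using only the identity (007) on the left-hand side and then moving all potential terms to the left: \begin{align*} \boldsymbol{\nabla}\left(\boldsymbol{\nabla\cdot}\mathbf{A}\right)-\nabla^{2}\mathbf{A} & = \mu_{0}\mathbf{j}-\frac{1}{c^{2}}\boldsymbol{\nabla}\left(\frac{\partial \phi}{\partial t}\right)-\frac{1}{c^{2}}\frac{\partial^{2}\mathbf{A}}{\partial t^{2}} \\ \frac{1}{c^{2}}\frac{\partial^{2}\mathbf{A}}{\partial t^{2}}-\nabla^{2}\mathbf{A}+\boldsymbol{\nabla}\left(\boldsymbol{\nabla\cdot}\mathbf{A}\right)+\boldsymbol{\nabla}\left(\frac{1}{c^{2}}\frac{\partial \phi}{\partial t}\right) & = \mu_{0}\mathbf{j} \end{align*} Combining the two gradient terms gives the boxed form (008).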

2. The Euler-Lagrange equations of EM Field

Now, our main task is to find a Lagrangian density $\:\mathcal{L}\:$, a function of the four ''field coordinates'' and their first-order derivatives
\begin{equation} \mathcal{L}=\mathcal{L}\left(\eta_{\jmath}, \overset{\centerdot}{\eta}_{\jmath}, \boldsymbol{\nabla}\eta_{\jmath}\right) \qquad \left(\jmath=1,2,3,4\right) \tag{009} \end{equation} such that the four scalar electromagnetic field equations (006) and (008) are derived from the Lagrange equations \begin{equation} \frac{\partial }{\partial t}\left[\frac{\partial \mathcal{L}}{\partial \left(\dfrac{\partial \eta_{\jmath}}{\partial t}\right)}\right]+\sum_{k=1}^{k=3}\frac{\partial }{\partial x_{k}}\left[\frac{\partial \mathcal{L}}{\partial \left(\dfrac{\partial \eta_{\jmath}}{\partial x_{k}}\right)}\right]- \frac{\partial \mathcal{L}}{\partial \eta_{\jmath}}=0\:, \quad \left(\jmath=1,2,3,4\right) \tag{010} \end{equation} simplified in notation to \begin{equation} \boxed{\: \dfrac{\partial }{\partial t}\left(\dfrac{\partial \mathcal{L}}{\partial \overset{\centerdot}{\eta}_{\jmath}}\right) + \boldsymbol{\nabla\cdot} \left[\dfrac{\partial \mathcal{L}}{\partial \left(\boldsymbol{\nabla}\eta_{\jmath}\right)}\right]- \frac{\partial \mathcal{L}}{\partial \eta_{\jmath}}=0, \quad \left(\jmath=1,2,3,4\right) \:\vphantom{\dfrac{\dfrac{a}{b}}{\dfrac{a}{b}}}} \tag{011} \end{equation}

Here the Lagrangian density $\:\mathcal{L}\:$ is a function of

  1. the four ''field coordinates''

\begin{align} \eta_{1}&=\mathrm{A}_1\left( x_{1},x_{2},x_{3},t\right) \tag{012.1}\\ \eta_{2}&=\mathrm{A}_2\left( x_{1},x_{2},x_{3},t\right) \tag{012.2}\\ \eta_{3}&=\mathrm{A}_3\left( x_{1},x_{2},x_{3},t\right) \tag{012.3}\\ \eta_{4}&=\:\;\phi \left( x_{1},x_{2},x_{3},t\right) \tag{012.4} \end{align}

  2. their time derivatives

\begin{align} \overset{\centerdot}{\eta}_{1} & \equiv \dfrac{\partial \eta_{1}}{\partial t} =\dfrac{\partial \mathrm{A}_{1}}{\partial t}\equiv\overset{\centerdot}{\mathrm{A}}_{1} \tag{013.1}\\ \overset{\centerdot}{\eta}_{2} & \equiv \dfrac{\partial \eta_{2}}{\partial t} =\dfrac{\partial \mathrm{A}_{2}}{\partial t}\equiv \overset{\centerdot}{\mathrm{A}}_{2} \tag{013.2}\\ \overset{\centerdot}{\eta}_{3} & \equiv \dfrac{\partial \eta_{3}}{\partial t} =\dfrac{\partial \mathrm{A}_{3}}{\partial t}\equiv\overset{\centerdot}{\mathrm{A}}_{3} \tag{013.3}\\ \overset{\centerdot}{\eta}_{4} & \equiv \dfrac{\partial \eta_{4}}{\partial t} =\dfrac{\partial \phi}{\partial t}\equiv\,\overset{\centerdot}{\!\!\phi} \tag{013.4} \end{align}

and

  3. their gradients

\begin{equation} \begin{array}{cccc} \boldsymbol{\nabla}\eta_{1}=\boldsymbol{\nabla}\mathrm{A}_1 \:,\: & \boldsymbol{\nabla}\eta_{2}=\boldsymbol{\nabla}\mathrm{A}_{2} \:,\: & \boldsymbol{\nabla}\eta_{3}=\boldsymbol{\nabla}\mathrm{A}_3 \:,\: & \boldsymbol{\nabla}\eta_{4}=\boldsymbol{\nabla}\phi \end{array} \tag{014} \end{equation}

We express equations (006) and (008) in forms that are similar to the Lagrange equations (011) \begin{equation} \boxed{\: \dfrac{\partial }{\partial t}\left(\boldsymbol{\nabla\cdot}\mathbf{A}\right)+\boldsymbol{\nabla\cdot}\left(\boldsymbol{\nabla}\phi \right) -\left(-\frac{\rho}{\epsilon_{0}}\right) =0 \:\vphantom{\dfrac{\dfrac{a}{b}}{\dfrac{a}{b}}}} \tag{015} \end{equation} and \begin{equation} \boxed{\: \dfrac{\partial}{\partial t}\left(\frac{\partial \mathrm{A}_{k}}{\partial t}+\frac{\partial \phi}{\partial x_{k}}\right)+\boldsymbol{\nabla\cdot} \left[c^{2}\left(\frac{\partial \mathbf{A}}{\partial x_{k}}- \boldsymbol{\nabla}\mathrm{A}_{k}\right)\right] -\frac{\mathrm{j}_{k}}{\epsilon_{0}}=0 \:\vphantom{\dfrac{\dfrac{a}{b}}{\dfrac{a}{b}}}} \tag{016} \end{equation} The Lagrange equation (011) for $\:\jmath=4\:$, that is for $\:\eta_{4}=\phi \:$, is \begin{equation} \frac{\partial }{\partial t}\left(\frac{\partial \mathcal{L}}{\partial\:\, \overset{\centerdot}{\!\!\phi}}\right) + \boldsymbol{\nabla\cdot} \left[\frac{\partial \mathcal{L}}{\partial \left(\boldsymbol{\nabla}\phi\right)}\right]- \frac{\partial \mathcal{L}}{\partial \phi}=0 \tag{017} \end{equation}
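
For reference, equation (016) is $\:c^{2}\:$ times the $k$-th component of equation (008), a step that relies on $\:\mu_{0}\epsilon_{0}c^{2}=1$. Written out component-wise: \begin{equation*} \frac{\partial^{2}\mathrm{A}_{k}}{\partial t^{2}}-c^{2}\nabla^{2}\mathrm{A}_{k}+c^{2}\frac{\partial }{\partial x_{k}}\left(\boldsymbol{\nabla\cdot}\mathbf{A}\right)+\frac{\partial }{\partial x_{k}}\left(\frac{\partial \phi}{\partial t}\right)=c^{2}\mu_{0}\,\mathrm{j}_{k}=\frac{\mathrm{j}_{k}}{\epsilon_{0}} \end{equation*} The spatial terms are then written as a single divergence with the help of \begin{equation*} \boldsymbol{\nabla\cdot}\left(\frac{\partial \mathbf{A}}{\partial x_{k}}\right)=\frac{\partial }{\partial x_{k}}\left(\boldsymbol{\nabla\cdot}\mathbf{A}\right)\:, \qquad \boldsymbol{\nabla\cdot}\left(\boldsymbol{\nabla}\mathrm{A}_{k}\right)=\nabla^{2}\mathrm{A}_{k} \end{equation*} and the two time-derivative terms are grouped under one $\:\partial/\partial t$, which gives (016).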

Comparing equations (015) and (017), we note that the first could be derived from the second if \begin{equation} \dfrac{\partial \mathcal{L}}{\partial \:\,\overset{\centerdot}{\!\!\phi}}=\boldsymbol{\nabla\cdot}\mathbf{A}\:, \qquad \dfrac{\partial \mathcal{L}}{\partial \left(\boldsymbol{\nabla}\phi\right)}=\boldsymbol{\nabla}\phi \:, \qquad \frac{\partial \mathcal{L}}{\partial \phi}=-\frac{\rho}{\epsilon_{0}} \tag{018} \end{equation}
so that the Lagrangian density $\:\mathcal{L}\:$ must contain respectively the terms \begin{equation} \mathcal{L}_{\boldsymbol{\alpha_{1}}}\equiv\left(\boldsymbol{\nabla\cdot} \mathbf{A}\right)\:\,\overset{\centerdot}{\!\!\phi}\:, \qquad \mathcal{L}_{\boldsymbol{\alpha_{2}}}\equiv\tfrac{1}{2}\Vert \boldsymbol{\nabla}\phi \Vert^{2}\:, \qquad \mathcal{L}_{\boldsymbol{\alpha_{3}}}\equiv-\frac{\rho \phi}{\epsilon_{0}} \tag{019} \end{equation} and consequently their sum \begin{equation} \mathcal{L}_{\boldsymbol{\alpha}}=\mathcal{L}_{\boldsymbol{\alpha_{1}}}+\mathcal{L}_{\boldsymbol{\alpha_{2}}} +\mathcal{L}_{\boldsymbol{\alpha_{3}}}=\left(\boldsymbol{\nabla\cdot} \mathbf{A}\right)\:\,\overset{\centerdot}{\!\!\phi}+\tfrac{1}{2}\Vert \boldsymbol{\nabla}\phi \Vert^{2}-\frac{\rho \phi}{\epsilon_{0}} \tag{020} \end{equation}

We suppose that an appropriate Lagrangian density $\:\mathcal{L}\:$ would be of the form \begin{equation} \mathcal{L}=\mathcal{L}_{\boldsymbol{\alpha}}+\mathcal{L}_{\boldsymbol{\beta}} \tag{021} \end{equation} and since $\:\mathcal{L}_{\boldsymbol{\alpha}}\:$ produces equation (015), we expect that $\:\mathcal{L}_{\boldsymbol{\beta}}\:$, to be determined, will produce equations (016). This expectation would be justified if equations (015) and (016) were decoupled, for example if the first contained $\:\phi $-terms only and the second $\:\mathbf{A} $-terms only. But this is not the case here: $\:\mathcal{L}_{\boldsymbol{\alpha}}\:$, since it contains $\:\mathbf{A} $-terms, would contribute to the Lagrange equations that should yield (016), and likewise $\:\mathcal{L}_{\boldsymbol{\beta}}\:$ would contribute to the Lagrange equation that should yield (015), so each part may spoil the equation the other was built to produce. Nevertheless we follow a trial and error procedure, which will lead to the right answer, as we'll see in the following.

Now, the Lagrange equations (011) for $\:\jmath=k=1,2,3\:$, that is for $\:\eta_{k}=\mathrm{A}_{k} \:$, are \begin{equation} \frac{\partial }{\partial t}\left(\dfrac{\partial \mathcal{L}}{\partial \overset{\centerdot}{\mathrm{A}}_{k}}\right) +\boldsymbol{\nabla\cdot} \left[\dfrac{\partial \mathcal{L}}{\partial \left(\boldsymbol{\nabla}\mathrm{A}_{k}\right)}\right]- \frac{\partial \mathcal{L}}{\partial \mathrm{A}_{k}}=0 \tag{022} \end{equation}

Comparing equations (016) and (022), we note that the first could be derived from the second if \begin{equation} \dfrac{\partial \mathcal{L}}{\partial \overset{\centerdot}{\mathrm{A}}_{k}}= \overset{\centerdot}{\mathrm{A}}_{k}+\frac{\partial \phi}{\partial x_{k}}\:, \quad \dfrac{\partial \mathcal{L}}{\partial \left(\boldsymbol{\nabla}\mathrm{A}_{k}\right)}=c^{2}\left(\frac{\partial \mathbf{A}}{\partial x_{k}}- \boldsymbol{\nabla}\mathrm{A}_{k}\right)\:, \quad \frac{\partial \mathcal{L}}{\partial \mathrm{A}_{k}}=\frac{\mathrm{j}_{k}}{\epsilon_{0}} \tag{023} \end{equation}

From the first of equations (023) the $\:\mathcal{L}_{\boldsymbol{\beta}}\:$ part of the Lagrange density $\:\mathcal{L}\:$ must contain the terms \begin{equation} \frac{1}{2}\left\Vert \overset{\centerdot}{\mathrm{A}}_{k}\right\Vert^{2}+\frac{\partial \phi}{\partial x_{k}}\overset{\centerdot}{\mathrm{A}}_{k}\:, \quad k=1,2,3 \tag{024} \end{equation} and so their sum with respect to $\:k\:$ \begin{equation} \mathcal{L}_{\boldsymbol{\beta_{1}}}\equiv \tfrac{1}{2}\left\Vert \mathbf{\dot{A}}\right\Vert^{2}+\boldsymbol{\nabla}\phi \boldsymbol{\cdot} \mathbf{\dot{A}} \tag{025} \end{equation}

From the 2nd of equations (023) the $\:\mathcal{L}_{\boldsymbol{\beta}}\:$ part of the Lagrange density $\:\mathcal{L}\:$ must contain the terms \begin{equation} \tfrac{1}{2}c^{2}\left[\frac{\partial \mathbf{A}}{\partial x_{k}} \boldsymbol{\cdot} \boldsymbol{\nabla}\mathrm{A}_{k} -\Vert \boldsymbol{\nabla}\mathrm{A}_{k}\Vert^{2}\right] \:, \quad k=1,2,3 \tag{026} \end{equation} and so their sum with respect to $\:k\:$ \begin{equation} \mathcal{L}_{\boldsymbol{\beta_{2}}}\equiv\tfrac{1}{2}c^{2}\sum^{k=3}_{k=1}\left[ \frac{\partial \mathbf{A}}{\partial x_{k}}\boldsymbol{\cdot} \boldsymbol{\nabla}\mathrm{A}_{k}-\Vert\boldsymbol{\nabla}\mathrm{A}_{k}\Vert^{2}\right] \tag{027} \end{equation} From the 3rd of equations (023) the $\:\mathcal{L}_{\boldsymbol{\beta}}\:$ part of the Lagrange density $\:\mathcal{L}\:$ must contain the terms \begin{equation} \frac{\mathrm{j}_{k}\mathrm{A}_{k}}{\epsilon_{0}} \:, \quad k=1,2,3 \tag{028} \end{equation} and so their sum with respect to $\:k\:$ \begin{equation} \mathcal{L}_{\boldsymbol{\beta_{3}}}\equiv \frac{\mathbf{j} \boldsymbol{\cdot} \mathbf{A}}{\epsilon_{0}} \tag{029} \end{equation}
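
That the sum (027), rather than a single term (026), reproduces the second of equations (023) can be checked in index notation. With the shorthand $\:\partial_{j}\equiv\partial/\partial x_{j}\:$ (introduced here only for this check), the density (027) reads \begin{equation*} \mathcal{L}_{\boldsymbol{\beta_{2}}}=\tfrac{1}{2}c^{2}\sum_{k=1}^{3}\sum_{j=1}^{3}\left[\left(\partial_{k}\mathrm{A}_{j}\right)\left(\partial_{j}\mathrm{A}_{k}\right)-\left(\partial_{j}\mathrm{A}_{k}\right)^{2}\right] \end{equation*} and its derivative with respect to $\:\partial_{j}\mathrm{A}_{k}\:$ is \begin{equation*} \frac{\partial \mathcal{L}_{\boldsymbol{\beta_{2}}}}{\partial \left(\partial_{j}\mathrm{A}_{k}\right)}=c^{2}\left(\partial_{k}\mathrm{A}_{j}-\partial_{j}\mathrm{A}_{k}\right) \end{equation*} The factor $\:\tfrac{1}{2}\:$ is cancelled because the mixed product $\:\left(\partial_{k}\mathrm{A}_{j}\right)\left(\partial_{j}\mathrm{A}_{k}\right)\:$ occurs twice in the double sum, once for each ordering of the index pair, and the squared term contributes a factor 2 on differentiation. The result is the $j$-th component of $\:c^{2}\left(\partial \mathbf{A}/\partial x_{k}-\boldsymbol{\nabla}\mathrm{A}_{k}\right)$.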

From equations (025), (027) and (029) the $\:\mathcal{L}_{\boldsymbol{\beta}}\:$ part of the Lagrange density $\:\mathcal{L}\:$ is \begin{align} \mathcal{L}_{\boldsymbol{\beta}} & = \mathcal{L}_{\boldsymbol{\beta_{1}}}+\mathcal{L}_{\boldsymbol{\beta_{2}}} +\mathcal{L}_{\boldsymbol{\beta_{3}}} \tag{030}\\ & = \tfrac{1}{2}\left\Vert \mathbf{\dot{A}}\right\Vert^{2}+\boldsymbol{\nabla}\phi \boldsymbol{\cdot}\mathbf{\dot{A}}+\tfrac{1}{2}c^{2}\sum^{k=3}_{k=1}\left[ \frac{\partial \mathbf{A}}{\partial x_{k}} \boldsymbol{\cdot} \boldsymbol{\nabla}\mathrm{A}_{k}-\Vert\boldsymbol{\nabla}\mathrm{A}_{k}\Vert^{2}\right]+\frac{\mathbf{j} \boldsymbol{\cdot} \mathbf{A}}{\epsilon_{0}} \nonumber \end{align}

Finally, from the expressions (020) and (030) for the densities $\:\mathcal{L}_{\boldsymbol{\alpha}},\mathcal{L}_{\boldsymbol{\beta}}\:$ the Lagrange density $\:\mathcal{L}=\mathcal{L}_{\boldsymbol{\alpha}}+\mathcal{L}_{\boldsymbol{\beta}}\:$ is \begin{align} \mathcal{L}& = \mathcal{L}_{\boldsymbol{\alpha}} + \mathcal{L}_{\boldsymbol{\beta}} \tag{031}\\ & = \left( \boldsymbol{\nabla\cdot} \mathbf{A}\right)\:\,\overset{\centerdot}{\!\!\phi}+\tfrac{1}{2}\Vert \boldsymbol{\nabla}\phi \Vert^{2}-\frac{\rho \phi}{\epsilon_{0}} \nonumber\\ & + \tfrac{1}{2}\left\Vert \mathbf{\dot{A}}\right\Vert^{2}+\boldsymbol{\nabla}\phi \boldsymbol{\cdot} \mathbf{\dot{A}}+\tfrac{1}{2}c^{2}\sum^{k=3}_{k=1}\left[ \frac{\partial \mathbf{A}}{\partial x_{k}} \boldsymbol{\cdot} \boldsymbol{\nabla}\mathrm{A}_{k}-\Vert\boldsymbol{\nabla}\mathrm{A}_{k}\Vert^{2}\right]+\frac{\mathbf{j}\boldsymbol{\cdot}\mathbf{A}}{\epsilon_{0}} \nonumber\\ & \text{(this is a wrong Lagrange density)} \nonumber \end{align}

3. Error-Trial-Final Success

Insertion of this Lagrange density expression in the Lagrange equation with respect to $\:\phi \:$, that is equation (017), doesn't yield equation (006) but
\begin{equation} -\nabla^{2}\phi-\frac{\partial }{\partial t}\left(2 \boldsymbol{\nabla\cdot} \mathbf{A}\right) =\frac{\rho}{\epsilon_{0}}\:, \quad (\textbf{wrong}) \tag{032} \end{equation} The appearance of an extra $\:\left( \boldsymbol{\nabla\cdot} \mathbf{A}\right) \:$ is due to the term $\:\left( \boldsymbol{\nabla}\phi \boldsymbol{\cdot} \mathbf{\dot{A}}\right) \:$ of $\:\mathcal{L}_{\boldsymbol{\beta}}\:$ and that's why the Lagrange density given by equation (031) is not an appropriate one.
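
The computation behind (032) can be sketched as follows. From the density (031), \begin{equation*} \dfrac{\partial \mathcal{L}}{\partial \:\,\overset{\centerdot}{\!\!\phi}}=\boldsymbol{\nabla\cdot}\mathbf{A}\:, \qquad \dfrac{\partial \mathcal{L}}{\partial \left(\boldsymbol{\nabla}\phi\right)}=\boldsymbol{\nabla}\phi+\mathbf{\dot{A}}\:, \qquad \frac{\partial \mathcal{L}}{\partial \phi}=-\frac{\rho}{\epsilon_{0}} \end{equation*} and substituting these in (017) gives \begin{equation*} \frac{\partial }{\partial t}\left(\boldsymbol{\nabla\cdot}\mathbf{A}\right)+\boldsymbol{\nabla\cdot}\left(\boldsymbol{\nabla}\phi+\mathbf{\dot{A}}\right)+\frac{\rho}{\epsilon_{0}}=0 \quad \Longrightarrow \quad -\nabla^{2}\phi-2\frac{\partial }{\partial t}\left(\boldsymbol{\nabla\cdot}\mathbf{A}\right)=\frac{\rho}{\epsilon_{0}} \end{equation*} The divergence of $\:\mathbf{\dot{A}}\:$ therefore enters twice, once through $\:\partial \mathcal{L}/\partial\:\,\overset{\centerdot}{\!\!\phi}\:$ and once through $\:\partial \mathcal{L}/\partial \left(\boldsymbol{\nabla}\phi\right)$.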

In order to resolve this problem we must look at (015), that is (006), from a different point of view as follows \begin{equation} \boldsymbol{\nabla\cdot}\left(\boldsymbol{\nabla}\phi + \mathbf{\dot{A}}\right) -\left(-\frac{\rho}{\epsilon_{0}}\right) =0 \tag{033} \end{equation}

Comparing equations (033) and (017), we note that the first could be derived from the second if in place of (018) we have
\begin{equation} \dfrac{\partial \mathcal{L}}{\partial \:\,\overset{\centerdot}{\!\!\phi}}=0\:, \qquad \dfrac{\partial \mathcal{L}}{\partial \left(\boldsymbol{\nabla}\phi\right)}=\boldsymbol{\nabla}\phi + \mathbf{\dot{A}} \:, \qquad \frac{\partial \mathcal{L}}{\partial \phi}=-\frac{\rho}{\epsilon_{0}} \tag{034} \end{equation}
so in place of (019) and (020) we have, respectively, the equations \begin{equation} \mathcal{L}^{\prime}_{\boldsymbol{\alpha_{1}}}\equiv 0\:, \quad \mathcal{L}^{\prime}_{\boldsymbol{\alpha_{2}}}\equiv\tfrac{1}{2}\Vert \boldsymbol{\nabla}\phi \Vert^{2} +\boldsymbol{\nabla}\phi \boldsymbol{\cdot} \mathbf{\dot{A}}\:, \quad \mathcal{L}^{\prime}_{\boldsymbol{\alpha_{3}}}=\mathcal{L}_{\boldsymbol{\alpha_{3}}}\equiv-\frac{\rho \phi}{\epsilon_{0}} \tag{035} \end{equation} \begin{equation} \mathcal{L}^{\prime}_{\boldsymbol{\alpha}}=\mathcal{L}^{\prime}_{\boldsymbol{\alpha_{1}}}+\mathcal{L}^{\prime}_{\boldsymbol{\alpha_{2}}} +\mathcal{L}^{\prime}_{\boldsymbol{\alpha_{3}}}=\tfrac{1}{2}\Vert \boldsymbol{\nabla}\phi \Vert^{2}+\boldsymbol{\nabla}\phi \boldsymbol{\cdot} \mathbf{\dot{A}}-\frac{\rho \phi}{\epsilon_{0}} \tag{036} \end{equation} Now, it's necessary to omit from $\:\mathcal{L}_{\boldsymbol{\beta_{1}}}\:$, equation (025), the second term $\:\left( \boldsymbol{\nabla}\phi \boldsymbol{\cdot} \mathbf{\dot{A}}\right) \:$ since it already appears in $\:\mathcal{L}^{\prime}_{\boldsymbol{\alpha_{2}}} \:$, see the second of equations (035) above.

So we have in place of (025) \begin{equation} \mathcal{L}^{\prime}_{\boldsymbol{\beta_{1}}}\equiv \tfrac{1}{2}\left\Vert \mathbf{\dot{A}}\right\Vert^{2} \tag{037} \end{equation} while $\:\mathcal{L}_{\boldsymbol{\beta_{2}}},\mathcal{L}_{\boldsymbol{\beta_{3}}}\:$ remain unchanged as in equations (027) and (029) \begin{align} \mathcal{L}^{\prime}_{\boldsymbol{\beta_{2}}} &=\mathcal{L}_{\boldsymbol{\beta_{2}}}\equiv\tfrac{1}{2}c^{2}\sum^{k=3}_{k=1}\left[ \frac{\partial \mathbf{A}}{\partial x_{k}}\boldsymbol{\cdot}\boldsymbol{\nabla}\mathrm{A}_{k}-\Vert\boldsymbol{\nabla}\mathrm{A}_{k}\Vert^{2}\right] \tag{038} \\ \mathcal{L}^{\prime}_{\boldsymbol{\beta_{3}}} &=\mathcal{L}_{\boldsymbol{\beta_{3}}}\equiv \frac{\mathbf{j} \boldsymbol{\cdot} \mathbf{A}}{\epsilon_{0}} \tag{039} \end{align}

In place of (030) \begin{align} \mathcal{L}^{\prime}_{\boldsymbol{\beta}} & = \mathcal{L}^{\prime}_{\boldsymbol{\beta_{1}}}+\mathcal{L}^{\prime}_{\boldsymbol{\beta_{2}}} +\mathcal{L}^{\prime}_{\boldsymbol{\beta_{3}}} \tag{040} \\ & = \tfrac{1}{2}\left\Vert \mathbf{\dot{A}}\right\Vert^{2}+\tfrac{1}{2}c^{2}\sum^{k=3}_{k=1}\left[ \frac{\partial \mathbf{A}}{\partial x_{k}} \boldsymbol{\cdot} \boldsymbol{\nabla}\mathrm{A}_{k}-\Vert\boldsymbol{\nabla}\mathrm{A}_{k}\Vert^{2}\right]+\frac{\mathbf{j} \boldsymbol{\cdot} \mathbf{A}}{\epsilon_{0}} \nonumber \end{align} and finally for the new Lagrangian density we have in place of (031)

\begin{align} \mathcal{L}^{\prime}& = \mathcal{L}^{\prime}_{\boldsymbol{\alpha}} + \mathcal{L}^{\prime}_{\boldsymbol{\beta}} \tag{041} \\ & = \tfrac{1}{2}\Vert \boldsymbol{\nabla}\phi \Vert^{2} +\boldsymbol{\nabla}\phi \boldsymbol{\cdot} \mathbf{\dot{A}} -\frac{\rho \phi}{\epsilon_{0}} \nonumber\\ & + \tfrac{1}{2}\left\Vert \mathbf{\dot{A}}\right\Vert^{2}+\tfrac{1}{2}c^{2}\sum^{k=3}_{k=1}\left[ \frac{\partial \mathbf{A}}{\partial x_{k}} \boldsymbol{\cdot} \boldsymbol{\nabla}\mathrm{A}_{k} -\Vert\boldsymbol{\nabla}\mathrm{A}_{k}\Vert^{2}\right]+\frac{\mathbf{j} \boldsymbol{\cdot} \mathbf{A}}{\epsilon_{0}} \nonumber \end{align}

Density $\:\mathcal{L}^{\prime}\:$ of (041) is obtained from density $\:\mathcal{L}\:$ of (031) if we omit the term $\:\left( \boldsymbol{\nabla\cdot}\mathbf{A}\right)\:\:\overset{\centerdot}{\!\!\phi}\:$. So $\:\mathcal{L}^{\prime}\:$ is independent of $\:\:\:\overset{\centerdot}{\!\!\phi}$.
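
A direct check that $\:\mathcal{L}^{\prime}\:$ of (041) reproduces both field equations can be sketched as follows. For $\:\phi\:$ the required derivatives are \begin{equation*} \dfrac{\partial \mathcal{L}^{\prime}}{\partial \:\,\overset{\centerdot}{\!\!\phi}}=0\:, \qquad \dfrac{\partial \mathcal{L}^{\prime}}{\partial \left(\boldsymbol{\nabla}\phi\right)}=\boldsymbol{\nabla}\phi+\mathbf{\dot{A}}\:, \qquad \frac{\partial \mathcal{L}^{\prime}}{\partial \phi}=-\frac{\rho}{\epsilon_{0}} \end{equation*} so that (017) gives $\:\boldsymbol{\nabla\cdot}\left(\boldsymbol{\nabla}\phi+\mathbf{\dot{A}}\right)+\rho/\epsilon_{0}=0$, which is (033), that is (006). For $\:\mathrm{A}_{k}\:$ the required derivatives are \begin{equation*} \dfrac{\partial \mathcal{L}^{\prime}}{\partial \overset{\centerdot}{\mathrm{A}}_{k}}=\overset{\centerdot}{\mathrm{A}}_{k}+\frac{\partial \phi}{\partial x_{k}}\:, \qquad \dfrac{\partial \mathcal{L}^{\prime}}{\partial \left(\boldsymbol{\nabla}\mathrm{A}_{k}\right)}=c^{2}\left(\frac{\partial \mathbf{A}}{\partial x_{k}}-\boldsymbol{\nabla}\mathrm{A}_{k}\right)\:, \qquad \frac{\partial \mathcal{L}^{\prime}}{\partial \mathrm{A}_{k}}=\frac{\mathrm{j}_{k}}{\epsilon_{0}} \end{equation*} which are exactly the conditions (023), so that (022) gives (016), that is (008). The single cross term $\:\boldsymbol{\nabla}\phi\boldsymbol{\cdot}\mathbf{\dot{A}}\:$ supplies both the $\:\mathbf{\dot{A}}\:$ in the first set and the $\:\partial \phi/\partial x_{k}\:$ in the second.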

In the following equations the brace over the first three terms groups the part of the density $\:\mathcal{L}^{\prime}\:$ that contributes to producing the electromagnetic equation (006) from the Lagrange equation with respect to $\:\phi \:$, equation (017), while the brace under the last four terms groups the part of the density $\:\mathcal{L}^{\prime}\:$ that contributes to producing the electromagnetic equations (008) from the Lagrange equations with respect to $\:\mathrm{A}_{1},\mathrm{A}_{2},\mathrm{A}_{3} \:$, equation (022).

\begin{equation*} \mathcal{L}^{\prime}=\overbrace{\tfrac{1}{2}\Vert \boldsymbol{\nabla}\phi \Vert^{2}-\frac{\rho \phi}{\epsilon_{0}}+\boldsymbol{\nabla}\phi \boldsymbol{\cdot} \mathbf{\dot{A}}}^{\text{with respect to }\phi}+\tfrac{1}{2}\left\Vert \mathbf{\dot{A}}\right\Vert^{2}+\tfrac{1}{2}c^{2}\sum^{k=3}_{k=1}\left[\frac{\partial \mathbf{A}}{\partial x_{k}} \boldsymbol{\cdot} \boldsymbol{\nabla}\mathrm{A}_{k}-\Vert \boldsymbol{\nabla}\mathrm{A}_{k}\Vert^{2}\right]+\frac{\mathbf{j} \boldsymbol{\cdot} \mathbf{A}}{\epsilon_{0}} \end{equation*}

\begin{equation*} \mathcal{L}^{\prime}=\tfrac{1}{2}\Vert \boldsymbol{\nabla}\phi \Vert^{2}-\frac{\rho \phi}{\epsilon_{0}}+\underbrace{\boldsymbol{\nabla}\phi\boldsymbol{\cdot} \mathbf{\dot{A}}+\tfrac{1}{2}\left\Vert \mathbf{\dot{A}}\right\Vert^{2}+\tfrac{1}{2}c^{2}\sum^{k=3}_{k=1}\left[\frac{\partial \mathbf{A}}{\partial x_{k}} \boldsymbol{\cdot} \boldsymbol{\nabla}\mathrm{A}_{k}-\Vert\boldsymbol{\nabla}\mathrm{A}_{k}\Vert^{2}\right]+\frac{\mathbf{j}\boldsymbol{\cdot} \mathbf{A}}{\epsilon_{0}}}_{\text{with respect to }\mathbf{A}} \end{equation*}

Note the common term $\:\left( \boldsymbol{\nabla}\phi \boldsymbol{\cdot} \mathbf{\dot{A}}\right)$.

Reordering the terms in the expression (041) of the density $\:\mathcal{L}^{\prime}\:$ we have \begin{equation} \mathcal{L}^{\prime}=\underbrace{\tfrac{1}{2}\left\Vert \mathbf{\dot{A}}\right\Vert^{2}+\tfrac{1}{2}\Vert \boldsymbol{\nabla}\phi \Vert^{2}+\boldsymbol{\nabla}\phi \boldsymbol{\cdot} \mathbf{\dot{A}}}_{\tfrac{1}{2}\left\Vert - \boldsymbol{\nabla}\phi -\frac{\partial \mathbf{A}}{\partial t}\right\Vert^{2}}-\tfrac{1}{2}c^{2}\underbrace{\sum^{k=3}_{k=1}\left[\Vert \boldsymbol{\nabla}\mathrm{A}_{k}\Vert^{2}-\frac{\partial \mathbf{A}}{\partial x_{k}}\boldsymbol{\cdot} \boldsymbol{\nabla}\mathrm{A}_{k}\right]}_{\left\Vert \boldsymbol{\nabla} \boldsymbol{\times} \mathbf{A}\right\Vert^{2}}+\frac{1}{\epsilon_{0}}\left( -\rho \phi + \mathbf{j}\boldsymbol{\cdot} \mathbf{A}\right) \tag{042} \end{equation}

that is \begin{equation} \mathcal{L}^{\prime}=\tfrac{1}{2}\left|\!\left|- \boldsymbol{\nabla}\phi -\frac{\partial \mathbf{A}}{\partial t}\right|\!\right|^{2}-\tfrac{1}{2}c^{2}\left|\!\left| \boldsymbol{\nabla} \boldsymbol{\times} \mathbf{A}\right|\!\right|^{2}+\frac{1}{\epsilon_{0}}\left( -\rho \phi + \mathbf{j} \boldsymbol{\cdot} \mathbf{A}\right) \tag{043} \end{equation} or \begin{equation} \mathcal{L}^{\prime}=\frac{\left|\!\left|\mathbf{E}\right|\!\right|^{2}-c^{2}\left|\!\left|\mathbf{B}\right|\!\right|^{2}}{2}+\frac{1}{\epsilon_{0}}\left( -\rho \phi + \mathbf{j}\boldsymbol{\cdot}\mathbf{A}\right) \tag{044} \end{equation}

Now, if the density $\:\mathcal{L}^{\prime}\:$ must have dimensions of energy per unit volume we define $\:\mathcal{L}_{em}=\epsilon_{0}\mathcal{L}^{\prime} \:$ so \begin{equation} \boxed{\:\:\: \mathcal{L}_{em}\:=\:\epsilon_{0}\cdot\dfrac{\left|\!\left|\mathbf{E}\right|\!\right|^{2}-c^{2}\left|\!\left|\mathbf{B}\right|\!\right|^{2}}{2}-\rho \phi + \mathbf{j} \boldsymbol{\cdot} \mathbf{A} \:\:\:\vphantom{\dfrac{\dfrac{a}{b}}{\dfrac{a}{b}}}} \tag{045} \end{equation} having in mind that \begin{align} \left\Vert\mathbf{E}\right\Vert^{2} & = \left\Vert - \boldsymbol{\nabla}\phi -\dfrac{\partial \mathbf{A}}{\partial t}\right\Vert^{2} = \left\Vert \mathbf{\dot{A}}\right\Vert^{2}+\Vert \boldsymbol{\nabla}\phi \Vert^{2}+2\left(\boldsymbol{\nabla}\phi \boldsymbol{\cdot} \mathbf{\dot{A}}\right) \tag{046a}\\ & \nonumber\\ \left\Vert\mathbf{B}\right\Vert^{2} & = \left\Vert\boldsymbol{\nabla} \boldsymbol{\times} \mathbf{A}\right\Vert^{2}=\sum^{k=3}_{k=1}\left[\Vert \boldsymbol{\nabla}\mathrm{A}_{k}\Vert^{2}-\dfrac{\partial \mathbf{A}}{\partial x_{k}}\boldsymbol{\cdot} \boldsymbol{\nabla}\mathrm{A}_{k}\right] \tag{046b} \end{align}
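
The identity (046b) can be verified with the Levi-Civita symbol $\:\varepsilon_{ijk}\:$ and the standard contraction $\:\sum_{i}\varepsilon_{ijk}\varepsilon_{ilm}=\delta_{jl}\delta_{km}-\delta_{jm}\delta_{kl}\:$ (the shorthand $\:\partial_{j}\equiv\partial/\partial x_{j}\:$ is introduced here only for this check): \begin{align*} \left\Vert\boldsymbol{\nabla}\boldsymbol{\times}\mathbf{A}\right\Vert^{2} & = \sum_{i,j,k,l,m}\varepsilon_{ijk}\,\varepsilon_{ilm}\left(\partial_{j}\mathrm{A}_{k}\right)\left(\partial_{l}\mathrm{A}_{m}\right) \\ & = \sum_{j,k,l,m}\left(\delta_{jl}\delta_{km}-\delta_{jm}\delta_{kl}\right)\left(\partial_{j}\mathrm{A}_{k}\right)\left(\partial_{l}\mathrm{A}_{m}\right) \\ & = \sum_{k}\sum_{j}\left[\left(\partial_{j}\mathrm{A}_{k}\right)^{2}-\left(\partial_{k}\mathrm{A}_{j}\right)\left(\partial_{j}\mathrm{A}_{k}\right)\right] \end{align*} For fixed $\:k\:$ the sum over $\:j\:$ of the first term is $\:\Vert\boldsymbol{\nabla}\mathrm{A}_{k}\Vert^{2}\:$ and that of the second is $\:\dfrac{\partial \mathbf{A}}{\partial x_{k}}\boldsymbol{\cdot}\boldsymbol{\nabla}\mathrm{A}_{k}$, which is (046b).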

The scalar $\:\left(\left|\!\left|\mathbf{E}\right|\!\right|^{2}-c^{2}\left|\!\left|\mathbf{B}\right|\!\right|^{2}\right)\:$ is one of the two Lorentz invariants of the field (1) (the other is $\:\mathbf{E}\boldsymbol{\cdot}\mathbf{B}$), essentially equal to a constant times $\:\mathcal{E}_{\mu\nu}\mathcal{E}^{\mu\nu}\:$, where $\:\mathcal{E}^{\mu\nu}\:$ is the antisymmetric field tensor (1).

On the other hand, the scalar $\: \left(-\rho \phi + \mathbf{j} \boldsymbol{\cdot} \mathbf{A}\right)\:$ is, up to an overall sign that depends on the metric signature, the inner product $\:J_{\mu}A^{\mu}\:$ in Minkowski space of two 4-vectors: the 4-current density $\:J^{\mu}=\left(c\rho,\mathbf{j}\right)\:$ and the 4-potential $\:A^{\mu}=\left(\phi/c,\mathbf{A}\right)\:$, and so it is a Lorentz invariant scalar too.
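
To make the inner product explicit, assume the metric signature $\:\left(+,-,-,-\right)\:$ (an assumption made here for illustration; the text above does not fix it). Then $\:J_{\mu}=\left(c\rho,-\mathbf{j}\right)\:$ and \begin{equation*} J_{\mu}A^{\mu}=c\rho\cdot\frac{\phi}{c}-\mathbf{j}\boldsymbol{\cdot}\mathbf{A}=\rho\phi-\mathbf{j}\boldsymbol{\cdot}\mathbf{A} \quad \Longrightarrow \quad -\rho\phi+\mathbf{j}\boldsymbol{\cdot}\mathbf{A}=-J_{\mu}A^{\mu} \end{equation*} With the opposite signature $\:\left(-,+,+,+\right)\:$ the same components give $\:J_{\mu}A^{\mu}=-\rho\phi+\mathbf{j}\boldsymbol{\cdot}\mathbf{A}\:$ directly. In either case the interaction term differs from $\:J_{\mu}A^{\mu}\:$ by at most a sign, so it is a Lorentz scalar.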

So, the Lagrange density $\:\mathcal{L}_{em}\:$ in equation (045) is Lorentz invariant.


(1) Following W. Rindler in "Introduction to Special Relativity" Ed.1982, this tensor is derived in equation (38.15) \begin{equation} \mathcal{E}_{\mu\nu}= \begin{bmatrix} \hphantom{-} 0 & \hphantom{-}E_{1} & \hphantom{-}E_{2} & \hphantom{-}E_{3} \\ -E_{1} & \hphantom{-} 0 & -cB_{3} & \hphantom{-}cB_{2} \\ -E_{2} & \hphantom{-}cB_{3} & \hphantom{-}0 & -cB_{1} \\ -E_{3} & -cB_{2} & \hphantom{-}cB_{1} & \hphantom{-} 0 \end{bmatrix} \quad \text{so} \quad \mathcal{E}^{\mu\nu}= \begin{bmatrix} 0 & -E_{1} & -E_{2} & -E_{3} \\ E_{1} & \hphantom{-}0 & -cB_{3} & \hphantom{-}cB_{2} \\ E_{2} & \hphantom{-}cB_{3}& \hphantom{-}0 & -cB_{1} \\ E_{3} & -cB_{2} & \hphantom{-}cB_{1} & \hphantom{-}0 \end{bmatrix} \tag{38.15} \end{equation} which by making the (duality) replacements $\:\mathbf{E}\to -c\mathbf{B}\:$ and $\:c\mathbf{B}\to \mathbf{E}\:$ yields \begin{equation} \mathcal{B}_{\mu\nu}= \begin{bmatrix} 0 & -cB_{1} & -cB_{2} & - cB_{3} \\ cB_{1} & \hphantom{-}0 & -E_{3} &\hphantom{-} E_{2}\\ cB_{2} & \hphantom{-}E_{3} & \hphantom{-}0 & -E_{1} \\ cB_{3} & -E_{2} & \hphantom{-}E_{1}& \hphantom{-} 0 \end{bmatrix} \quad \text{so} \quad \mathcal{B}^{\mu\nu}= \begin{bmatrix} \hphantom{-}0 & \hphantom{-}cB_{1} & \hphantom{-}cB_{2} & \hphantom{-}cB_{3} \\ -cB_{1} & \hphantom{-} 0 & -E_{3} & \hphantom{-}E_{2} \\ -cB_{2} & \hphantom{-}E_{3} & \hphantom{-}0 & -E_{1} \\ -cB_{3} & -E_{2} & \hphantom{-}E_{1} & \hphantom{-} 0 \end{bmatrix} \tag{39.05} \end{equation} The two invariants of $\:\mathcal{E}^{\mu\nu}\:$, immediately recognizable as such from their mode of formation, can be expressed as follows: \begin{align} X & =\dfrac{1}{2}\mathcal{E}_{\mu\nu}\mathcal{E}^{\mu\nu}=-\dfrac{1}{2}\mathcal{B}_{\mu\nu}\mathcal{B}^{\mu\nu}=c^{2}\left|\!\left|\mathbf{B}\right|\!\right|^{2}-\left|\!\left|\mathbf{E}\right|\!\right|^{2} \tag{39.06}\\ Y & =\dfrac{1}{4}\mathcal{B}_{\mu\nu}\mathcal{E}^{\mu\nu}=c\mathbf{B}\boldsymbol{\cdot}\mathbf{E} \tag{39.07} \end{align}
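
The value of $\:X\:$ in (39.06) can be checked by summing the products of corresponding entries of the two matrices in (38.15). The six entries in the first row and first column contribute $\:-E_{i}^{2}\:$ each (two per component), and the six off-diagonal spatial entries contribute $\:c^{2}B_{i}^{2}\:$ each (again two per component): \begin{equation*} \mathcal{E}_{\mu\nu}\mathcal{E}^{\mu\nu}=-2\left|\!\left|\mathbf{E}\right|\!\right|^{2}+2c^{2}\left|\!\left|\mathbf{B}\right|\!\right|^{2} \quad \Longrightarrow \quad X=c^{2}\left|\!\left|\mathbf{B}\right|\!\right|^{2}-\left|\!\left|\mathbf{E}\right|\!\right|^{2} \end{equation*} In terms of this tensor the field part of (045) is therefore \begin{equation*} \epsilon_{0}\cdot\dfrac{\left|\!\left|\mathbf{E}\right|\!\right|^{2}-c^{2}\left|\!\left|\mathbf{B}\right|\!\right|^{2}}{2}=-\frac{\epsilon_{0}}{2}X=-\frac{\epsilon_{0}}{4}\mathcal{E}_{\mu\nu}\mathcal{E}^{\mu\nu} \end{equation*}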

Frobenius
  • 15,613
  • What's the difference between "trial and error" and "guessing"? – tparker Jul 09 '17 at 07:30
  • Beautiful latex – Kenshin Jul 11 '17 at 12:03
  • A professor, named Bahman Zohuri, of the Department of Electrical and Computer Engineering, University of New Mexico published on 30 January 2019 a paper in pdf format titled "Deriving the Lagrangian Density of an Electromagnetic Field" with an exact copy-paste of this answer herein. It's a pity that in his references not even a word about PhysicsStackExchange and my answer. Link : Deriving the Lagrangian Density of an Electromagnetic Field – Frobenius Jun 04 '19 at 16:03
  • It is hard to believe that a professor from a "reputed" university plagiarised your work without even mentioning it. What is more astonishing is that a "reputed" publisher published it, disgusting to say the least. – renormalizedQuanta Jul 30 '20 at 05:12
  • @renormalizedQuanta : ...indeed, it's hard to believe... – Frobenius Jul 30 '20 at 12:25
  • This answer is great! (+++1) – SG8 Apr 29 '21 at 12:01
  • @Frobenius, About that copy-paste, I think it would be better if someone contacted Springer and reported that plagiarism. The prestigious journals often accuse the professors of plagiarism. I'm pretty sure that this is not the first time he has done this and it will not be the last time. – SG8 Apr 29 '21 at 12:13
  • At the moment I thought that you should publish this I saw that a professor had (ab)used this! – Deschele Schilder May 15 '21 at 10:44
  • @Deschele Schilder : ...what can I say about this. As you see, I posted this comment on Jun 4 '19 and got no reply. I'm sure that the professor from a "reputed" university and the "reputed" publisher know about this. By the way it's always a pleasure to see comments like this here A mathematically illogical argument in the derivation of Hamilton's equation in Goldstein. – Frobenius May 15 '21 at 10:53
  • In equation (016), how does the $\mu_{0}\textbf{j}$ go to $\frac{j_k}{\epsilon_0}$? – moboDawn_φ Jan 13 '22 at 15:46
  • @moboDawn_φ : Welcome to PSE. Because $$c^2 \cdot (008)=(016)\implies c^2\mu_0\mathbf{j}=\dfrac{\mathbf{j}}{\epsilon_0}$$ – Frobenius Jan 13 '22 at 23:22
  • Thank you for this amazing answer and work ! – abu_bua Jan 30 '22 at 23:39
  • @abu_bua : Welcome. I suggest to take a look also here Why treat complex scalar field and its complex conjugate as two different fields? for the Lagrangian Density of the Schroedinger equation. – Frobenius Jan 31 '22 at 00:13
  • @Frobenius have you reported to the editor/publisher? – Ziyuan Feb 22 '22 at 23:11
  • @ziyuang : Thanks for your attention. No, I haven't reported it. I don't think it's worth doing. – Frobenius Feb 22 '22 at 23:45
  • @Frobenius Good answer! One comment though! The Maxwell's equations that you wrote are not Maxwell's equations in empty space. They are in presence of charges and currents ($\rho,\vec{j} \neq 0$). – Solidification Aug 13 '22 at 11:49
  • @Solidification : Thank you. Although belatedly I edited the answer according to your comment. – Frobenius Feb 20 '23 at 04:10
  • If anyone wants to report the plagiarism of this answer by Bahman Zohuri to researchgate, the necessary information is here: https://www.researchgate.net/ip-policy – hft Apr 06 '23 at 02:39
  • And here: Copyright Agent; ResearchGate GmbH; Chausseestr. 20; Berlin, Germany 10115; Email: copyright@researchgate.net; Telephone: +49 (0) 30 200051-100 – hft Apr 06 '23 at 02:42
  • @hft : Many Thanks for the information. – Frobenius Apr 06 '23 at 02:49
  • the article/book has been "withdrawn", see https://link.springer.com/book/10.1007/978-3-319-91023-9 including the specific chapter https://link.springer.com/chapter/10.1007/978-3-319-91023-9_5 – hyportnex Apr 09 '23 at 21:43
  • @hyportnex : Many Thanks for the information. – Frobenius Apr 09 '23 at 21:49
  • The book is still available on Amazon. I have reported the book as plagiarism. If you would like to give the book a one-star review and state in your review that the book is plagiarized from Stack Exchange, go here: https://www.amazon.com/Scalar-Wave-Driven-Energy-Applications/dp/3319910221/ – hft Apr 13 '23 at 20:29
  • Bahman Zohuri does not appear to be employed by University of New Mexico ECE anymore, but he does still appear on their website: http://ece.unm.edu/news/2018/05/ece-profs-shine-at-hodgin-hall.html – hft Apr 13 '23 at 20:32
  • He is currently listed as an Adjunct Professor at Golden Gate University Edward S. Ageno School of Business. Note that while the circumstantial evidence seems very damning, we can not entirely rule out identity theft or other possibilities. I have not found a verified professional CV listing this plagiarized textbook as a publication. If you would like to contact the Dean of the GGU School of business about this to get more info, his linkedin is here: https://www.linkedin.com/in/bruce-magid-5208ba188/ – hft Apr 13 '23 at 21:01
  • @hft : Thanks again for your valuable comments. I have the mentioned book "Scalar Wave Driven Energy Applications" in my e-library since 25/12/2020. I consider it worthless to deal with this plagiarism anymore. Let's continue to give our time dealing with our favorite subjects here "αεί διδασκόμενοι". – Frobenius Apr 13 '23 at 21:17
  • Yep. I agree. It's not worth the time, but nevertheless every time I remember that this guy stole from you, it pisses me off. But yeah, it is best to just let it go. – hft Apr 13 '23 at 21:34
10

Ultimately, the reasoning must be that (as you stated) the Lagrangian is constructed so that its Euler-Lagrange equations are Maxwell's equations. So in a sense you have to guess the Lagrangian that produces them, as is done, for example, by (emerita) professor Susan Lea in these notes to a graduate electrodynamics course.

However, you can get some guidance from the fact that we need to construct a Lagrangian for a massless, non-self-interacting field. So we need a gauge- and Lorentz-invariant combination of the 4-vector potential that has only a kinetic term (quadratic in derivatives of the fields). You are then not left with many options apart from $F^{\mu\nu}F_{\mu\nu}$. The source term is then trivial to add in if needed.

Cleonis
  • 20,795
Mistake Ink
  • 1,105
  • 1
    What about $\epsilon_{\mu\nu\sigma\tau} F^{\mu\nu} F^{\sigma\tau}$? – Fabian Aug 15 '12 at 21:18
  • Well, I said "not left with many options" and indeed the combination you write down would also fit my criteria, as would $\det (F)$, but one may as well try the simplest option first and that turns out to be correct. I note that the combination you give is a pseudoscalar. Can anyone think if there is a reason why that would not be allowed? – Mistake Ink Aug 16 '12 at 13:22
  • 3
    @MistakeInk - $\epsilon_{\mu \nu \rho \sigma } F^{\mu \nu }F^{\rho \sigma }$ is a fine candidate term for the Lagrangian; it's just that it's a total derivative, so it doesn't affect the classical EOM and vanishes in perturbation theory. It still does have some consequences though - see http://en.wikipedia.org/wiki/CP_violation#Strong_CP_problem. As for $\det(F)$, I don't think this term is renormalizable, since it's equal to $e^{\mathrm{tr} \log F}$, which you could expand about some background field value and get arbitrarily high powers of the field strength. – DJBunk Aug 31 '12 at 21:03
  • You do however get terms of the form $ \log tr (k^2+{F^{\mu \nu}}^2)$ when calculating the effective action in the presence of a background field gauge field. See Chap 16 of Peskin. – DJBunk Aug 31 '12 at 21:05
  • 1
    @DJBunk You are correct that $\text{det}(F^\mu_\nu)$ is nonrenormalizable so it wouldn't work, but it's only quartic in $F$ and does not contain arbitrarily high powers. It's also not Lorentz-invariant. – tparker Jul 09 '17 at 07:27
  • @DJBunk A nonzero $\epsilon F F$ term has actually been realized in real experimental systems - 3D topological insulators. It doesn't affect the bulk physics, but does give rise to topologically protected gapless edge modes on the system's boundary. – tparker Jul 09 '17 at 07:46
2

You can find the answer from the book "Differential Geometry and Lie Groups for Physicists" by Marian Fecko.

In geometrical language, an action of a field $F\in\Omega^{p}(M)$ on some $n$-dimensional Riemannian manifold $(M,g)$ should be understood as an 'inner product' $$\int_{M}F\wedge\ast_{g}F,$$ where $p<n$, and $\ast_{g}$ is the Hodge star operator, i.e. $$\ast_{g}:\Omega^{p}(M)\rightarrow\Omega^{n-p}(M)$$ so that the action is diffeomorphism invariant and its density is a scalar.

For example, the action of a free scalar field takes this form: $$S[\phi]=\int_{M}d\phi\wedge\ast_{g}d\phi+\frac{m}{2}\int_{M}\phi\wedge\ast_{g}\phi=\int_{M}\sqrt{|g|}d^{n}x\left\{\partial_{\mu}\phi\partial^{\mu}\phi+\frac{m\phi^{2}}{2}\right\}.$$ When the 'worldsheet' manifold is one-dimensional, the only possible fields are 1-forms. One can consider an action of the following form $$\int_{\mathbb{R}}A_{\mu}\frac{dx^{\mu}}{ds}ds$$ where $A=A_{s}ds=A_{\mu}\frac{dx^{\mu}}{ds}ds$ is a 1-form on the worldline, whose Hodge star dual field is not defined.
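
For the electromagnetic case asked about, one takes $p=2$ and $F=dA$ in four dimensions. The following is only a sketch of how the geometric action reduces to the familiar density, assuming the common convention $\alpha\wedge\ast_{g}\alpha=\frac{1}{p!}\alpha_{\mu_{1}\cdots\mu_{p}}\alpha^{\mu_{1}\cdots\mu_{p}}\sqrt{|g|}\,d^{n}x$ for a $p$-form $\alpha$ (sign and normalization conventions for $\ast_{g}$ differ between texts):

$$\int_{M}F\wedge\ast_{g}F=\frac{1}{2}\int_{M}\sqrt{|g|}\,d^{4}x\;F_{\mu\nu}F^{\mu\nu},\qquad F_{\mu\nu}=\partial_{\mu}A_{\nu}-\partial_{\nu}A_{\mu}.$$

Under that assumption, the Maxwell term $-\frac{1}{4}F_{\mu\nu}F^{\mu\nu}$ is this inner product up to a constant, convention-dependent factor.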

Xiaoyi Jing
  • 1,068
2

I'm almost 100% sure the Lagrangian is an assumption of the theory. It cannot be derived. I don't have any references for this claim. I just know that from every course I've been taught and every book I've read, the Lagrangian (assuming it is being used at all) is where you start. It is the "first principle" in this case.

mcFreid
  • 2,607
  • Thanks - I guess I'd only ever seen Lagrangians from mechanics, where they are naturally of the form $L=T-V$ and thus what I was calling "derivable". – mcamac Aug 15 '12 at 18:39
  • 2
    I don't see how that Lagrangian is "derivable" either. Of course, we write it as such so that the Euler-Lagrange equations give us the classical equations of motion. But, I wouldn't consider that a derivation. Really it's just substituting one assumption for another. – mcFreid Aug 15 '12 at 19:21
1

You can use the symmetries of E&M to show that there's essentially only one reasonable candidate that needs to be checked:

The action we seek should be Lorentz invariant, gauge invariant, parity and time-reversal invariant, and no more than second order in derivatives. The only candidate is [the Maxwell action]. [Srednicki QFT pg. 334.]

tparker
  • 47,418
1

This is an old question, but I'll mention another approach I'm surprised hasn't come up. (Add factors of $c$ or $\mu_0$ or whatever to taste.)

We'll dream up the equivalent of Newton's second law for electromagnetism by replacing position with $A^\mu$, force with $j^\mu$, and time derivatives with something more Lorentz-invariant. In other words, $\ddot{A}^\mu=j^\mu$ isn't OK. Absorbing a factor into the definition of one vector or the other, the most general option is $\partial_\mu(\partial^\mu A^\nu+k\partial^\nu A^\mu)=j^\nu$, and charge conservation $\partial_\nu j^\nu=0$ requires $k=-1$. So, in familiar notation, $\partial_\mu F^{\mu\nu}=j^\nu$.
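
The step from charge conservation to $k=-1$ can be spelled out. A short worked version, using only the fact that partial derivatives commute and writing $\Box\equiv\partial_\mu\partial^\mu$ (a symbol introduced here for brevity):

$$\partial_\nu j^\nu=\partial_\nu\partial_\mu\left(\partial^\mu A^\nu+k\,\partial^\nu A^\mu\right)=\Box\,\partial_\nu A^\nu+k\,\Box\,\partial_\mu A^\mu=(1+k)\,\Box\left(\partial_\nu A^\nu\right).$$

Requiring this to vanish for arbitrary $A^\mu$ forces $k=-1$, which leaves $\partial_\mu\left(\partial^\mu A^\nu-\partial^\nu A^\mu\right)=j^\nu$.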

Similarly, we want to update $\frac12m\dot{x}^2-V(x)$ to an appropriate Lagrangian density. Since our target equation of motion is invariant under $\delta A_\nu=\partial_\nu\chi$, we'll need to construct the derivative-squared kinetic term out of $F_{\mu\nu}$, so it has to be a multiple of $F_{\mu\nu}F^{\mu\nu}$. This product includes $2\partial_0A_i\partial^0A^i=-2\sum_i(\partial_0A_i)^2$, so to get $\frac12\dot{A}\cdot\dot{A}$ we need a factor of $-\frac14$. We can now verify $-A_\nu j^\nu$ has the correct overall factor in the potential term for $\partial_\mu F^{\mu\nu}=j^\nu$ to result.
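
As a check on the sign and the factor, the product can be written out in terms of $\mathbf{E}$ and $\mathbf{B}$. The sketch below assumes units with $c=1$, signature $(+,-,-,-)$, and the identifications $F_{0i}=E_i$ and $F_{ij}=-\epsilon_{ijk}B_k$; none of these conventions are fixed by the text above:

$$F_{\mu\nu}F^{\mu\nu}=2F_{0i}F^{0i}+F_{ij}F^{ij}=-2|\mathbf{E}|^2+2|\mathbf{B}|^2\quad\Longrightarrow\quad-\tfrac14F_{\mu\nu}F^{\mu\nu}=\tfrac12\left(|\mathbf{E}|^2-|\mathbf{B}|^2\right).$$

With $A_0=0$ one has $\mathbf{E}=-\dot{\mathbf{A}}$, so the electric term is $\frac12\dot{\mathbf{A}}\cdot\dot{\mathbf{A}}$, the field-theory counterpart of $\frac12m\dot{x}^2$.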

J.G.
  • 24,837
0

Do there exist first principles for deriving a Lagrangian density? Yes. Our current understanding of the electromagnetic field is that its dynamics possess symmetry under "gauge transforms" and symmetry under a set of geometric transforms. The first, alone, suffices to give you the framework of "Maxwell's Equations", while the second fills in a missing piece to the puzzle: the "constitutive relations".

The field is described, fundamentally, by a gauge potential, which in component form is given by a one-form $A = A_μ dx^μ$. I'm using the Einstein summation convention here and throughout. For coordinates $\left(x^0, x^1, x^2, x^3\right) = (t, x, y, z)$ it consists of the "electric potential" $φ = -A_0$ and "magnetic potential" $\mathbf{A} = \left(A_x, A_y, A_z\right) = \left(A_1, A_2, A_3\right)$.

The gauge transform is given, in infinitesimal form, by: $$Δφ = \frac{∂χ}{∂t}, \hspace 1em Δ\mathbf{A} = -∇χ,$$ where $$∇ = \left(\frac{∂}{∂x}, \frac{∂}{∂y}, \frac{∂}{∂z}\right),$$ or in component form by $ΔA_μ = -∂_μχ$, where $$∂_0 = \frac{∂}{∂t}, \hspace 1em \left(∂_1, ∂_2, ∂_3\right) = ∇.$$ Below, we'll also use $\mathbf{r} = (x, y, z)$.

That is also the finite form of the gauge transform, since we could just as well write $$φ → φ + \frac{∂χ}{∂t}, \hspace 1em \mathbf{A} → \mathbf{A} - ∇χ.$$

Many different gauges can be set with this. For instance, even $φ = 0$ is possible, just by setting $χ(\mathbf{r},t) = -\int^t_0 φ(\mathbf{r},T) dT$. Then $$φ(\mathbf{r},t) → φ(\mathbf{r},t) - φ(\mathbf{r},t) = 0, \hspace 1em \mathbf{A}(\mathbf{r},t) → \mathbf{A}(\mathbf{r},t) + \int^t_0 ∇φ(\mathbf{r},T) dT.$$

I'm going to do more than answer your question, and raise the stakes. Suppose we also have another field $q(\mathbf{r},t)$ that participates in this gauge transform with the infinitesimal form of it given by $Δq = χq$. What does the Lagrangian density look like, with it included?

Gauge Invariance And Utiyama's Theorem
Our first question is: what is the most general Lagrangian density $\mathcal{L}\left(A_μ, v_{μν}, q, v_μ\right)$ that is a function of the fields $A_μ$, $q$ and their gradients - which I denote here as $v_{μν} = ∂_μA_ν$ and $v_μ = ∂_μq$? Define the following derivatives: $$\mathcal{L}^μ = \frac{∂\mathcal{L}}{∂A_μ}, \hspace 1em \mathfrak{f} = \frac{∂\mathcal{L}}{∂q}, \hspace 1em \mathcal{L}^{μν} = \frac{∂\mathcal{L}}{∂v_{μν}}, \hspace 1em \mathfrak{p}^μ = \frac{∂\mathcal{L}}{∂v_μ},$$ and assume that $\mathcal{L}$ has no dependencies on any other field nor any explicit dependency on the coordinates.

We assume that the gauge transforms are transparent with respect to the gradient, $$Δv_μ = Δ\left(∂_μq\right) = ∂_μ\left(Δq\right), \hspace 1em Δv_{μν} = Δ\left(∂_μA_ν\right) = ∂_μ\left(ΔA_ν\right),$$ so that $$Δv_μ = ∂_μχ q + χ ∂_μq = ∂_μχ q + χ v_μ, \hspace 1em Δv_{μν} = -∂_μ∂_νχ.$$ Then, in infinitesimal form, we have the following action under the gauge transform: $$\begin{align} Δ\mathcal{L} &= \mathcal{L}^μ ΔA_μ + \mathcal{L}^{μν} Δv_{μν} + \mathfrak{f} Δq + \mathfrak{p}^μ Δv_μ \\ &= -\mathcal{L}^μ ∂_μχ - \mathcal{L}^{μν} ∂_μ∂_νχ + \mathfrak{f} χ q + \mathfrak{p}^μ \left(∂_μχ q + χ v_μ\right) \\ &= \left(\mathfrak{f} q + \mathfrak{p}^μ v_μ\right) χ + \left(\mathfrak{p}^μ q - \mathcal{L}^μ\right) ∂_μχ - \mathcal{L}^{μν} ∂_μ∂_νχ. \end{align}$$

Now, the first (and most important) key point is that $∂_μ∂_νχ = ∂_ν∂_μχ$. We're assuming the gauge transform functions $χ$ are well-behaved (continuous at least to their second order differentials or $C^2$). Therefore, we can rewrite $$\mathcal{L}^{μν} ∂_μ∂_νχ = \frac{1}{2}\left(\mathcal{L}^{μν} + \mathcal{L}^{νμ}\right) ∂_μ∂_νχ.$$ Other than the $C^2$ restriction on $χ$, the function is assumed to be arbitrary, and the gauge transform for all such functions should give $Δ\mathcal{L} = 0$. Therefore, we may equate the coefficients for each order of differential separately to zero, resulting in: $$\mathfrak{f} q + \mathfrak{p}^μ v_μ = 0, \hspace 1em \mathfrak{p}^μ q - \mathcal{L}^μ = 0, \hspace 1em \mathcal{L}^{μν} + \mathcal{L}^{νμ} = 0.$$ We may then write: $$\mathfrak{f} = -\frac{\mathcal{L}^μ v_μ}{q^2}, \hspace 1em \mathfrak{p}^μ = \frac{\mathcal{L}^μ}{q}, \hspace 1em \mathcal{L}^{μν} = -\mathcal{L}^{νμ}.$$ Thus, the total differential for $\mathcal{L}$ (which we'll write as a variational $Δ\mathcal{L}$, this time understanding that $Δ$ now denotes a generic variation, not just a gauge transform) reduces to: $$\begin{align} Δ\mathcal{L} &= \mathcal{L}^μ ΔA_μ + \mathcal{L}^{μν} Δv_{μν} - \mathcal{L}^μ \frac{v_μ Δq}{q^2} + \mathcal{L}^μ \frac{Δv_μ}{q} \\ &= \mathcal{L}^μ \left(ΔA_μ - v_μ \frac{Δq}{q^2} + \frac{Δv_μ}{q}\right) + \frac{1}{2} \mathcal{L}^{μν} \left(Δv_{μν} - Δv_{νμ}\right) \\ &= \mathcal{L}^μ Δ\left(\frac{v_μ}{q} + A_μ\right) + \frac{1}{2} \mathcal{L}^{μν} Δ\left(v_{μν} - v_{νμ}\right), \end{align}$$ where we used the newly-established anti-symmetry of $\mathcal{L}^{μν}$ to write $$\mathcal{L}^{μν} Δv_{μν} = \frac{1}{2} \mathcal{L}^{μν} \left(Δv_{μν} - Δv_{νμ}\right).$$

In this way, we show that the functional dependence of $\mathcal{L}\left(A_μ, v_{μν}, q, v_μ\right)$ on the fields and their gradients reduces to a dependence $\mathcal{L}\left(F_{μν}, a_μ\right)$ only on their combinations: $$F_{μν} ≡ v_{μν} - v_{νμ}, \hspace 1em a_μ ≡ \frac{v_μ}{q} + A_μ,$$ which, expressed in terms of the fields and their gradients, are: $$F_{μν} = ∂_μA_ν - ∂_νA_μ, \hspace 1em a_μ = \frac{∂_μq}{q} + A_μ = ∂_μ\ln q + A_μ.$$ This also carries with it the strong suggestion to rewrite $q$, instead, as an exponential.
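
The gauge invariance of the combination $a_μ$ can also be checked directly from the transforms already given, $ΔA_μ = -∂_μχ$, $Δq = χq$ and $Δv_μ = ∂_μχ\,q + χ\,v_μ$. A short worked version:

$$Δa_μ = \frac{Δv_μ}{q} - \frac{v_μ\,Δq}{q^2} + ΔA_μ = \frac{∂_μχ\,q + χ\,v_μ}{q} - \frac{v_μ\,χ\,q}{q^2} - ∂_μχ = 0.$$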

All of this is an instance of Utiyama's Theorem. First: the gauge fields $A_μ$ may enter the dynamics only in the form of the gauge field strength $F_{μν}$, with no dependence on $A_μ$, other than through their involvement in the gradients of other fields. Second: the gradients for the other fields that are involved in the gauge transform may only occur in certain combinations that involve the gauge field, itself - the "gauge covariant" derivatives. The example chosen would have been more illustrative if using $\ln q$, instead, since its gauge transform is: $Δ\ln q = χ$. Then, it would follow that $Δ\left(∂_μ\ln q\right) = ∂_μ\left(Δ\ln q\right) = ∂_μχ$, and $Δa_μ = 0$. Similarly, we confirm that under gauge transform $ΔF_{μν} = ∂_μ∂_νχ - ∂_ν∂_μχ = 0$. So, this is a gauge covariant derivative applied to the gauge field, itself.

The Utiyama Theorem, in its various incarnations, is the primary structure building tool for Lagrangian densities. That's the most direct answer I can give to your question.

The Emergence Of Maxwell's Equations And The Dynamics
Now that we've established that the only combinations of the gradients of the gauge field that may enter the dynamics, through the Lagrangian, are the anti-symmetric ones, we may then proceed to define the Maxwell field strengths: $$\mathbf{B} ≡ \left(B^x, B^y, B^z\right) = \left(F_{23}, F_{31}, F_{12}\right), \hspace 1em \mathbf{E} ≡ \left(E_x, E_y, E_z\right) = \left(F_{10}, F_{20}, F_{30}\right),$$ respectively for the magnetic induction and electric field.

Then, from the field-potential relations $F_{μν} = ∂_μA_ν - ∂_νA_μ$, we obtain the following: $$\mathbf{B} = ∇×\mathbf{A}, \hspace 1em \mathbf{E} = -∇φ - \frac{∂\mathbf{A}}{∂t}.$$ From this, in turn are derived the identities: $$∂_μF_{νρ} + ∂_νF_{ρμ} + ∂_ρF_{μν} = 0,$$ which, expressed in terms of the Maxwell vectors, become: $$∇·\mathbf{B} = 0, \hspace 1em ∇×\mathbf{E} + \frac{∂\mathbf{B}}{∂t} = \mathbf{0}.$$

The derivatives with respect to the gauge potentials and field strengths we will now rewrite as: $$\mathcal{L}^{μν} = -\mathfrak{G}^{μν}, \hspace 1em \mathcal{L}^μ = \mathfrak{J}^μ.$$ These are the tensor densities, respectively, for the response fields and sources. Through them, the other Maxwell vectors and scalars arise: $$\begin{align} \mathbf{D} &= \left(D^x, D^y, D^z\right) ≡ \left(\mathfrak{G}^{01}, \mathfrak{G}^{02}, \mathfrak{G}^{03}\right), & ρ &≡ \mathfrak{J}^0,\\ \mathbf{H} &= \left(H_x, H_y, H_z\right) ≡ \left(\mathfrak{G}^{23}, \mathfrak{G}^{31}, \mathfrak{G}^{12}\right), & \mathbf{J} &= \left(J^x, J^y, J^z\right) ≡ \left(\mathfrak{J}^1, \mathfrak{J}^2, \mathfrak{J}^3\right), \end{align}$$ for the displacement field $\mathbf{D}$, magnetic force $\mathbf{H}$, charge density $ρ$ and current density $\mathbf{J}$.

Ignoring the dependency on other fields, integration by parts yields the following as the total variation for the Lagrangian density: $$\begin{align} Δ\mathcal{L} &= \mathfrak{J}^μ ΔA_μ - \frac{1}{2} \mathfrak{G}^{μν} ΔF_{μν} \\ &= \mathfrak{J}^μ ΔA_μ + \mathfrak{G}^{μν} ∂_νΔA_μ \\ &= \left(\mathfrak{J}^μ - ∂_ν\mathfrak{G}^{μν}\right)ΔA_μ + ∂_ν\left(\mathfrak{G}^{μν} ΔA_μ\right). \end{align}$$ When applying the action principle, the divergence of the boundary term $\mathfrak{G}^{μν} ΔA_μ$ drops out (it plays a role in determining the symplectic structure of the field's dynamics), and the remaining term gives rise to the Euler-Lagrange equations: $$∂_ν\mathfrak{G}^{μν} = \mathfrak{J}^μ.$$ In terms of the Maxwell vectors and scalars, this reduces to the other set of Maxwell equations: $$∇·\mathbf{D} = ρ, \hspace 1em ∇×\mathbf{H} - \frac{∂\mathbf{D}}{∂t} = \mathbf{J},$$ and derived from this, in turn, is the continuity equation $∂_μ\mathfrak{J}^μ = 0$, or: $$∇·\mathbf{J} + \frac{∂ρ}{∂t} = 0.$$
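
To see how the component equation unpacks into vector form, here is a worked sketch. It writes $\mathfrak{G}^{μν}$ for the anti-symmetric response-field density and $\mathfrak{J}^μ$ for the source density, with $\mathbf{D} = \left(\mathfrak{G}^{01}, \mathfrak{G}^{02}, \mathfrak{G}^{03}\right)$, $\mathbf{H} = \left(\mathfrak{G}^{23}, \mathfrak{G}^{31}, \mathfrak{G}^{12}\right)$, $ρ = \mathfrak{J}^0$ and $\mathbf{J} = \left(\mathfrak{J}^1, \mathfrak{J}^2, \mathfrak{J}^3\right)$, and starts from $∂_ν\mathfrak{G}^{μν} = \mathfrak{J}^μ$:

$$\begin{align} μ = 0: &\quad ∂_1\mathfrak{G}^{01} + ∂_2\mathfrak{G}^{02} + ∂_3\mathfrak{G}^{03} = ∇·\mathbf{D} = ρ, \\ μ = 1: &\quad ∂_0\mathfrak{G}^{10} + ∂_2\mathfrak{G}^{12} + ∂_3\mathfrak{G}^{13} = -\frac{∂D^x}{∂t} + \frac{∂H_z}{∂y} - \frac{∂H_y}{∂z} = J^x. \end{align}$$

The $μ = 2, 3$ components follow by cyclic permutation of $(x, y, z)$, and together the three spatial components give $∇×\mathbf{H} - ∂\mathbf{D}/∂t = \mathbf{J}$.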

A similar analysis on the other field yields the Euler-Lagrange equation $∂_μ\mathfrak{p}^μ = \mathfrak{f}$ and an additional divergence $∂_μ\left(\mathfrak{p}^μΔq\right)$, out of which comes the symplectic structure for the additional field $q$.

The significance of their having a non-trivial gauge transform is that this yields a set of constitutive laws for the respective derivatives, which includes a contribution to the electromagnetic current: $$\mathfrak{J}^μ = \mathfrak{p}^μ q, \hspace 1em \mathfrak{f} = -\mathfrak{p}^μ ∂_μ\ln q.$$ I didn't actually choose a very good example for the extra field, since its Euler-Lagrange equation is essentially just the continuity equation for the electromagnetic source fields, in disguise.

The Lagrangian density doesn't play any direct role in any of what's just been laid out. This is Lagrangian-independent and provides an enveloping framework for all Lagrangian densities that are gauge invariant functions of the fields and their first derivatives.

Instead, the role that the Lagrangian density does play is to provide a set of constitutive relations that connect the fields that arise in dynamics to the fields that the Lagrangian density is a function of. For the Maxwell field, they may be expressed directly as: $$\mathbf{D} = \frac{∂\mathcal{L}}{∂\mathbf{E}}, \hspace 1em \mathbf{H} = -\frac{∂\mathcal{L}}{∂\mathbf{B}}, \hspace 1em ρ = -\frac{∂\mathcal{L}}{∂φ}, \hspace 1em \mathbf{J} = \frac{∂\mathcal{L}}{∂\mathbf{A}}.$$

So, $\mathcal{L}$ is a generating function for constitutive relations.
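
The "generating function" reading can be tried out numerically. The sketch below is only an illustration: it assumes the Maxwell-Lorentz density $\frac{1}{2}\left(ε_0|\mathbf{E}|^2 - |\mathbf{B}|^2/μ_0\right)$ as the example Lagrangian, uses arbitrary unit-free sample numbers for `EPS0` and `MU0`, and the helper names `lagrangian` and `gradient` are hypothetical. It recovers $\mathbf{D}$ and $\mathbf{H}$ as central-difference derivatives with respect to $\mathbf{E}$ and $\mathbf{B}$.

```python
# Sketch: constitutive relations D = dL/dE and H = -dL/dB by central differences.
# EPS0 and MU0 are arbitrary illustrative numbers, not physical constants.
EPS0 = 2.0
MU0 = 0.5


def lagrangian(E, B):
    """Example density L = (EPS0 |E|^2 - |B|^2 / MU0) / 2 (an assumed choice)."""
    e2 = sum(c * c for c in E)
    b2 = sum(c * c for c in B)
    return 0.5 * (EPS0 * e2 - b2 / MU0)


def gradient(f, v, h=1e-6):
    """Central-difference gradient of f at the vector v with step h."""
    grad = []
    for i in range(len(v)):
        up = list(v)
        dn = list(v)
        up[i] += h
        dn[i] -= h
        grad.append((f(up) - f(dn)) / (2.0 * h))
    return grad


E = [3.0, -1.0, 2.0]   # arbitrary sample field values
B = [0.5, 2.0, -1.5]

# D = dL/dE at fixed B, and H = -dL/dB at fixed E.
D = gradient(lambda e: lagrangian(e, B), E)
H = [-g for g in gradient(lambda b: lagrangian(E, b), B)]

print("D =", D)   # should be close to EPS0 * E
print("H =", H)   # should be close to B / MU0
```

Swapping in a different `lagrangian` (for instance one with an added multiple of $\mathbf{E}·\mathbf{B}$) changes the resulting relations without touching the rest of the sketch.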

There's nothing up to this point that said anything about whether the underlying geometry was relativistic, non-relativistic, or whether it was a geometry where the absolute speed is 0 (the Carrollian universe) or in which all speeds are absolute (the universe of the "static group") or whether there's even any time dimensions at all, as opposed to a universe where all four dimensions are spacelike (i.e. "Euclideanization"). Everything laid out up to this point applies across the board to them all - unchanged.

In particular, there are no $c$'s anywhere, at this point, either express or implied; yet. Instead, that is a matter of geometry.

Invariance Under Geometric Transforms: Constitutive Relations
In fact, there is nothing that said what the coordinates $(t,x,y,z)$ actually had to be. They could be strange mixtures of space and time coordinates, or they could include angular coordinates. All the expressions above - including the component forms - apply as is. The equations have the full set of coordinate transforms as their symmetries. They are covariant with respect to all coordinate transforms.

Since only anti-symmetric combinations of the second derivatives enter into play, then all the fields can be combined into differential forms. This is close to how Maxwell originally represented them in his pre-1870 writings, and in his treatise - except he didn't make (full) use of the anti-commuting Grassmann algebra of differential forms.

We will write wedge products as ordinary products, e.g. $dxdy = dx∧dy = -dy∧dx = -dydx$. Then, define the following: $$d\mathbf{r} = (dx, dy, dz), \hspace 1em d\mathbf{S} = (dydz, dzdx, dxdy), \hspace 1em dV = dxdydz.$$ Then for the Maxwell fields, we have: $$A = A_μ dx^μ = \mathbf{A}·d\mathbf{r} - φ dt, \hspace 1em F = \frac{1}{2} F_{μν} dx^μ dx^ν = \mathbf{B}·d\mathbf{S} + \mathbf{E}·d\mathbf{r}dt.$$ For the extra field $q$, we may also write $v = v_μ dx^μ$ and $a = a_μ dx^μ$. Then we have $$v = \frac{dq}{q} = d\ln q, \hspace 1em a = v + A.$$

The fields used in the dynamics are densities, so they contract with the coordinate 4-form $$d^4x ≡ dx^0 dx^1 dx^2 dx^3 = dt dV$$ as $$G ≡ \frac{1}{2} \mathfrak{G}^{μν} ∂_ν ˩ ∂_μ ˩ d^4x, \hspace 1em J ≡ \mathfrak{J}^μ ∂_μ ˩ d^4x.$$ The contraction operator is defined recursively by $$∂_μ ˩ \left(dx^ν (⋯)\right) = δ^ν_μ (⋯) - dx^ν \left(∂_μ ˩ (⋯)\right), \hspace 1em ∂_μ ˩ g = 0,$$ where $g$ denotes a scalar function. In particular, $$∂_0 ˩ d^4x = dV, \hspace 1em ∇ ˩ d^4x = -dtd\mathbf{S} = -d\mathbf{S}dt.$$ In terms of the Maxwell fields, $G$ and $J$ may be written as: $$G = \mathbf{D}·d\mathbf{S} - \mathbf{H}·d\mathbf{r}dt, \hspace 1em J = ρ dV - \mathbf{J}·d\mathbf{S}dt.$$ For the extra field, we can write $p = \mathfrak{p}^μ ∂_μ ˩ d^4x$ and $f = \mathfrak{f} d^4 x$.

Then, for the fields and their derivatives, we have: $$dA = F, \hspace 1em d\ln q + A = a \hspace 1em ⇒ \hspace 1em dF = 0, \hspace 1em da = F.$$ On the dynamics side, we have: $$dG = J, \hspace 1em dp = f, \hspace 1em ⇒ \hspace 1em dJ = 0, \hspace 1em df ≡ 0.$$ The last of these equations is trivial, since $f$ is a 4-form and all 4-forms in 4D have zero exterior derivative. The constitutive laws obtained up to this point, from gauge invariance are: $$J = qp, \hspace 1em f = -\frac{dq}{q} p = -(d\ln q) p.$$

All of these objects are geometric invariants. The transform for their components is inherited from the transforms on the coordinates and their differentials under that requirement. So, the second requirement we impose on the Lagrangian density is that it be invariant under a distinguished set of coordinate transforms.

The transforms, specified here in infinitesimal form, should include those for spatial translations $\boldsymbol{\varepsilon}$, time translations $τ$, rotations $\boldsymbol{\omega}$ and boosts $\boldsymbol{\upsilon}$, with their actions on the coordinates given by: $$Δ\mathbf{r} = \boldsymbol{\omega}×\mathbf{r} - β\boldsymbol{\upsilon}t + \boldsymbol{\varepsilon}, \hspace 1em Δt = -α\boldsymbol{\upsilon}·\mathbf{r} + τ.$$ The boosts are specified in general form with coefficients $(α,β) ≠ (0,0)$ (we're excluding the above-mentioned case of the "static group"). If $αβ ≥ 0$, then the finite forms of a boost in the $x$ direction with a finite velocity $v$ can be specified as: $$(x, t) → (γ(x - βvt), γ(t - αvx)), \hspace 1em γ = \frac{1}{\sqrt{1 - αβv^2}}.$$ This includes the Galilean transforms if $α = 0$, $β ≠ 0$, where the absolute speed is infinity; the Carrollian transforms if $α ≠ 0$, $β = 0$, where the absolute speed is zero; and the Lorentz transforms if $αβ > 0$, with a finite, non-zero absolute speed $c = \sqrt{β/α}$. It also includes 4D Euclidean transforms if $αβ < 0$, except that the finite forms listed above are no longer the most general, since $(x, t) → (-x, -t)$ is also included. For $αβ > 0$, boosts are constrained to $|βv| < c$; and in general they are constrained to $αβv^2 < 1$, which (however) is vacuous, except for the Lorentzian case.
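
As a quick numerical check of the finite boost formula, the sketch below applies it for one sample pair $(α, β)$ from each family and confirms that the quadratic form $βt^2 - αx^2$ is unchanged. The function names `boost` and `interval` and all sample numbers are illustrative choices, not part of the derivation; the Lorentzian sample uses $β = 1$, so that the boost parameter $v$ coincides with the coordinate velocity $βv$.

```python
import math


def boost(x, t, v, alpha, beta):
    """Finite boost along x: (x, t) -> (gamma (x - beta v t), gamma (t - alpha v x))."""
    gamma = 1.0 / math.sqrt(1.0 - alpha * beta * v * v)
    return gamma * (x - beta * v * t), gamma * (t - alpha * v * x)


def interval(x, t, alpha, beta):
    """Quadratic form beta t^2 - alpha x^2 that the boost should preserve."""
    return beta * t * t - alpha * x * x


# One sample (alpha, beta) pair per family; the values are arbitrary.
cases = {
    "Galilean": (0.0, 1.0),
    "Carrollian": (1.0, 0.0),
    "Lorentzian": (0.25, 1.0),   # c = sqrt(beta / alpha) = 2
    "Euclidean": (-1.0, 1.0),
}

x, t, v = 1.5, 0.7, 0.3
drift = {}
for name, (alpha, beta) in cases.items():
    x2, t2 = boost(x, t, v, alpha, beta)
    # Difference between the quadratic form after and before the boost.
    drift[name] = interval(x2, t2, alpha, beta) - interval(x, t, alpha, beta)

for name, value in drift.items():
    print(f"{name:10s} change in beta t^2 - alpha x^2: {value:.3e}")
```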

Using $Δd = dΔ$, for the coordinate differentials, we may write down the transforms: $$Δ(d\mathbf{r}) = \boldsymbol{\omega}×d\mathbf{r} - β\boldsymbol{\upsilon}dt, \hspace 1em Δ(dt) = -α\boldsymbol{\upsilon}·d\mathbf{r}.$$ Imposing the following invariance requirement $$Δ\left(d\mathbf{r}·∇ + dt \frac{∂}{∂t}\right) = 0,$$ we then also have: $$Δ\left(∇\right) = \boldsymbol{\omega}×∇ + α\boldsymbol{\upsilon}\frac{∂}{∂t}, \hspace 1em Δ\left(\frac{∂}{∂t}\right) = β\boldsymbol{\upsilon}·∇.$$ Under these transforms, including the one already specified, the following are the three geometric invariants: $$d\mathbf{r}·∇ + dt \frac{∂}{∂t}, \hspace 1em βdt^2 - α|d\mathbf{r}|^2, \hspace 1em β|∇|^2 - α{\left(\frac{∂}{∂t}\right)}^2,$$ and the transforms are exactly those which have these as their invariants.

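Finally, for the higher order coordinate differentials, the following may be derived: $$\begin{align} Δ(d\mathbf{S}) &= \boldsymbol{\omega}×d\mathbf{S} + β\boldsymbol{\upsilon}×d\mathbf{r}dt, & Δ(d\mathbf{r}dt) &= \boldsymbol{\omega}×d\mathbf{r}dt - α\boldsymbol{\upsilon}×d\mathbf{S}, \\ Δ(dV) &= -β\boldsymbol{\upsilon}·d\mathbf{S}dt, & Δ(d\mathbf{S}dt) &= \boldsymbol{\omega}×d\mathbf{S}dt - α\boldsymbol{\upsilon}dV. \end{align}$$ So, imposing the requirements $ΔA = 0$, $ΔF = 0$, $ΔG = 0$ and $ΔJ = 0$ leads to the following transforms: $$\begin{gathered} Δφ = -β\boldsymbol{\upsilon}·\mathbf{A}, \hspace 1em Δ\mathbf{A} = \boldsymbol{\omega}×\mathbf{A} - αφ\boldsymbol{\upsilon}, \hspace 1em Δ\mathbf{B} = \boldsymbol{\omega}×\mathbf{B} - α\boldsymbol{\upsilon}×\mathbf{E}, \hspace 1em Δ\mathbf{E} = \boldsymbol{\omega}×\mathbf{E} + β\boldsymbol{\upsilon}×\mathbf{B}, \\ Δρ = -α\boldsymbol{\upsilon}·\mathbf{J}, \hspace 1em Δ\mathbf{J} = \boldsymbol{\omega}×\mathbf{J} - βρ\boldsymbol{\upsilon}, \hspace 1em Δ\mathbf{D} = \boldsymbol{\omega}×\mathbf{D} + α\boldsymbol{\upsilon}×\mathbf{H}, \hspace 1em Δ\mathbf{H} = \boldsymbol{\omega}×\mathbf{H} - β\boldsymbol{\upsilon}×\mathbf{D}. \end{gathered}$$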

For the other field, a set of transforms similar to those for $A$ and $J$ can be derived, respectively for $a$ and $p$, while the scalar $q$ and 4-form $f$ are both invariant.

Translation invariance is trivial, since everything is expressed in terms of differential forms. There are no occurrences of $\boldsymbol{\varepsilon}$ or $τ$ anywhere above. Invariance under rotations is ensured by restricting to only the scalars and scalar combinations of the vectors. For the electromagnetic fields, gauge invariance ensures that only $\mathbf{E}$ and $\mathbf{B}$ may enter directly into the Lagrangian density. Their only independent scalar combinations are: $$½|\mathbf{E}|^2, \hspace 1em \mathbf{E}·\mathbf{B}, \hspace 1em ½|\mathbf{B}|^2.$$ Therefore, the Lagrangian density reduces to a function of the form $$\mathcal{L}\left(½|\mathbf{E}|^2, \mathbf{E}·\mathbf{B}, ½|\mathbf{B}|^2, ⋯\right).$$ Other scalar invariants in $(⋯)$ exist for the components of the field $a$, similar to those for $\mathbf{A}$ and $φ$ (which are $½|\mathbf{A}|^2$ and $φ$ itself), as well as those involving mixtures of the extra field with the electromagnetic field. Here, we will just focus on the first three.

Write their differential coefficients as: $$Δ\mathcal{L} = ε Δ\left(½|\mathbf{E}|^2\right) + θ Δ\left(\mathbf{E}·\mathbf{B}\right) - \frac{1}{μ} Δ\left(½|\mathbf{B}|^2\right) + ⋯$$ Then, this leads immediately to the following constitutive laws: $$\mathbf{D} = ε\mathbf{E} + θ\mathbf{B} + ⋯, \hspace 1em \mathbf{H} = \frac{\mathbf{B}}{μ} - θ\mathbf{E} + ⋯, \hspace 1em ⋯,$$ with this form adopted for backward-compatibility (which, however, tacitly assumes that $\mathcal{L}$ has a non-zero derivative with respect to $½|\mathbf{B}|^2$; i.e. that $1/μ ≠ 0$).

This is, thus, the general form for Lagrangian densities that are translation-invariant and isotropic.

A Lagrangian density reduces to a function of a set of invariant combinations. The differential coefficients with respect to those invariants together comprise the corresponding constitutive coefficients. Each of them is, in general, a function of all the invariants, satisfying relations such as $$\frac{∂ε}{∂\left(\mathbf{E}·\mathbf{B}\right)} = \frac{∂θ}{∂\left(½|\mathbf{E}|^2\right)}.$$

The Lagrangian density, therefore, generates a set of constitutive relations that are all cut from the same cloth. The only Lagrangian-dependent differences are those encapsulated within the constitutive coefficients. All of the particulars of the dynamics are contained in them.

There is mixing between the first and third of the scalar invariants, under boosts, and the only invariant combination obtained from the two is: $$½\left(β|\mathbf{B}|^2 - α|\mathbf{E}|^2\right).$$ So, when we include boost invariance, the Lagrangian density reduces further to a function of the form $$\mathcal{L}\left(½\left(β|\mathbf{B}|^2 - α|\mathbf{E}|^2\right), \mathbf{E}·\mathbf{B}, ⋯\right).$$ This leads to relations between the constitutive coefficients; with the one of importance here being: $$βεμ = α.$$
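
The boost invariance of the two combinations can be checked with exact arithmetic. The sketch below assumes the boost parts of the infinitesimal field transforms, $Δ\mathbf{E} = β\boldsymbol{\upsilon}×\mathbf{B}$ and $Δ\mathbf{B} = -α\boldsymbol{\upsilon}×\mathbf{E}$ (rotations omitted), and evaluates the first-order changes of $\mathbf{E}·\mathbf{B}$ and $½\left(β|\mathbf{B}|^2 - α|\mathbf{E}|^2\right)$ for a few $(α, β)$ pairs. The helper names and the sample vectors are hypothetical.

```python
from fractions import Fraction


def cross(a, b):
    """Cross product of two 3-vectors."""
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])


def dot(a, b):
    """Dot product of two 3-vectors."""
    return sum(p * q for p, q in zip(a, b))


def first_order_changes(E, B, u, alpha, beta):
    """First-order changes of E.B and (beta |B|^2 - alpha |E|^2) / 2
    under dE = beta u x B and dB = -alpha u x E."""
    dE = tuple(beta * c for c in cross(u, B))
    dB = tuple(-alpha * c for c in cross(u, E))
    d_e_dot_b = dot(dE, B) + dot(E, dB)
    d_mixed = beta * dot(B, dB) - alpha * dot(E, dE)
    return d_e_dot_b, d_mixed


# Arbitrary rational sample vectors, so the arithmetic is exact.
E = (Fraction(1), Fraction(-2), Fraction(3))
B = (Fraction(4), Fraction(1, 2), Fraction(-1))
u = (Fraction(2), Fraction(5), Fraction(-7))

# Galilean, Carrollian, Lorentzian and Euclidean sample coefficients.
pairs = [(0, 1), (1, 0), (1, 9), (-1, 1)]
results = [first_order_changes(E, B, u, Fraction(a), Fraction(b)) for a, b in pairs]

for (a, b), (d1, d2) in zip(pairs, results):
    print(f"alpha={a}, beta={b}: d(E.B)={d1}, d(mixed invariant)={d2}")
```

Both changes vanish identically because each reduces to a scalar triple product with a repeated or antisymmetrically paired vector.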

Thus, for boost-invariant Lagrangian densities, we can say the following: in the Galilean case, $εμ = 0$ (so at least one of $\mathbf{D}$ or $\mathbf{B}$ must be $\mathbf{0}$); in the Carrollian case, we would require $εμ = ∞$ (thus forcing at least one of $\mathbf{E}$ or $\mathbf{H}$ to be $\mathbf{0}$). In the 4D Euclidean case, we have $εμ < 0$, while in the Lorentzian case, we have $εμ = α/β = 1/c^2$.

For null fields $$β|\mathbf{B}|^2 = α|\mathbf{E}|^2, \hspace 1em \mathbf{E}·\mathbf{B} = 0,$$ the constitutive coefficients reduce to the "null field form": $ε = ε_0$, $μ = μ_0$, $θ = θ_0$, which may each be functions of the other invariants not listed. If $θ_0$ is independent of the other invariants, and constant, then without loss of generality we can set $θ_0 = 0$ by just redefining $\mathbf{D}$ and $\mathbf{H}$ respectively as $\mathbf{D} - θ_0\mathbf{B}$ and $\mathbf{H} + θ_0\mathbf{E}$. This will not affect the Maxwell equations. For the Lorentzian case, we can also assume that $ε_0 > 0$ and $μ_0 > 0$ by flipping signs, changing $\mathbf{D}$, $\mathbf{H}$, $\mathbf{J}$ and $ρ$ respectively to $-\mathbf{D}$, $-\mathbf{H}$, $-\mathbf{J}$ and $-ρ$.

In that case, the constitutive relations reduce to the familiar form $$\mathbf{D} = ε_0\mathbf{E} + ⋯, \hspace 1em \mathbf{H} = \frac{\mathbf{B}}{μ_0} + ⋯,$$ where focus is taken off the possible additions from other fields, with $ε_0 > 0$, $μ_0 > 0$ and $ε_0 μ_0 = (1/c)^2$.

By comparison, the Maxwell-Lorentz density has the form $$\mathcal{L} = ½ \left(ε_0 |\mathbf{E}|^2 - \frac{|\mathbf{B}|^2}{μ_0}\right).$$ In large measure, the constitutive law obtained from it is independent of the Lagrangian density, since it is already present in the other Lagrangian densities - at least for null fields or fields closely approximating them. Only the cross-terms arising from invariants that mixed the electromagnetic fields with other fields will have an impact on that conclusion.

This undercuts much of the justification for the Maxwell-Lorentz Lagrangian density, as opposed to other alternatives.

NinjaDarth
  • 1,944