Is the equation $[\nabla_{\mu},\nabla_{\nu}]=F_{\mu\nu}$ correct? If yes, how does it have to be interpreted?

Question

It seems like simply using the equation \begin{equation} \nabla_{\mu}=\partial_{\mu}+A_{\mu} \end{equation} isn't enough: One obtains \begin{equation} [\nabla_{\mu},\nabla_{\nu}]=\underbrace{[\partial_{\mu},\partial_{\nu}]}_{=0}+\underbrace{\partial_{\mu}A_{\nu}-\partial_{\nu}A_{\mu}+[A_{\mu},A_{\nu}]}_{=F_{\mu\nu}}+A_{\mu}\partial_{\nu}-A_{\nu}\partial_{\mu} \end{equation} and I don't see why $A_{\mu}\partial_{\nu}-A_{\nu}\partial_{\mu}=0$.

Thus, it seems like this naive approach doesn't work and we need to be more rigorous:

Consider the following setting (please see the section "Notation" for more details): A principal $G$-bundle $P\to M$ and a representation $\rho\colon G\to\mathrm{GL}(V)$, $\rho_{*}\colon g\to\mathrm{End}(V)$. Let $E\to M$ be the associated vector bundle, $A\in\Omega^1(P,g)$ a connection $1$-form and $\nabla\colon \Gamma(M,E)\to\Omega^1(M,E)$ the induced covariant derivative.

Obviously, the equation \begin{equation} \mathrm{End}(C^{\infty}(U,V))\ni[\nabla_{\mu},\nabla_{\nu}]=F_{\mu\nu}:=(s^*F)(\partial_{\mu},\partial_{\nu})\in C^{\infty}(U,g) \end{equation} doesn't make sense, as two totally different objects are equated. My guess would be that in this equation, $F_{\mu\nu}$ has to be interpreted as the linear operator $\widetilde{F_{\mu\nu}}\colon C^{\infty}(U,V)\to C^{\infty}(U,V)$ defined by \begin{equation} (\widetilde{F_{\mu\nu}}(\phi))(m)=(\rho_{*}\circ F_{\mu\nu})(m)\phi(m)\in V. \end{equation} What do you think?

If you like this question you may also enjoy reading this related Phys.SE post. — Qmechanic, Mar 26 '21 at 10:39

oliver · Answer 1 · 2021-03-19T17:16:08.980

7

Not that much notation, but probably not less true either: $$[\partial_\mu+A_\mu,\partial_\nu+A_\nu]\psi=(\partial_\mu+A_\mu)(\partial_\nu+A_\nu)\psi-(\partial_\nu+A_\nu)(\partial_\mu+A_\mu)\psi=$$ $$=\partial_\mu\partial_\nu\psi+A_\mu\partial_\nu\psi+\partial_\mu (A_\nu\psi)+A_\mu A_\nu\psi-\partial_\nu\partial_\mu\psi-A_\nu\partial_\mu\psi-\partial_\nu (A_\mu\psi)-A_\nu A_\mu\psi=$$ $$=A_\mu\partial_\nu\psi+(\partial_\mu A_\nu)\psi+ A_\nu\partial_\mu\psi-A_\nu\partial_\mu\psi-(\partial_\nu A_\mu)\psi- A_\mu\partial_\nu\psi=$$ $$=(\partial_\mu A_\nu-\partial_\nu A_\mu)\psi=F_{\mu\nu}\psi$$ Here the third line uses the product rule $\partial_\mu(A_\nu\psi)=(\partial_\mu A_\nu)\psi+A_\nu\partial_\mu\psi$, the second derivatives cancel because partial derivatives commute, and $A_\mu A_\nu\psi-A_\nu A_\mu\psi=0$ because the components $A_\mu$ commute in electromagnetism. Since $\psi$ is arbitrary, $$[\partial_\mu+A_\mu,\partial_\nu+A_\nu]=F_{\mu\nu}$$ The interpretation is that if you parallel-transport the field $\psi$ along a closed loop in the electromagnetic 4-potential/connection $A_\mu$, then $\psi$ does not return to its original value but differs by a phase factor (more generally, by a gauge group element for SU(2) or SU(3) gauge theories) that depends on the electromagnetic field tensor, which can be interpreted as a kind of curvature (just like the Riemann tensor measures the curvature of spacetime in gravitation).

In lattice gauge theory, these are called Wilson loops.
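To make the loop interpretation concrete, here is a minimal numerical sketch, assuming a U(1) connection with constant field strength in two dimensions. The gauge choice $A=\tfrac{i}{2}B(-y\,dx+x\,dy)$, the value of `B`, the rectangle and all function names are illustrative assumptions, not part of the answer. The sign convention follows $\nabla_\mu=\partial_\mu+A_\mu$, so parallel transport along a curve $\gamma$ solves $\dot\psi=-A(\dot\gamma)\psi$.

```python
import numpy as np

B = 0.8  # hypothetical constant field strength, F_xy = i*B


def gauge_potential(point):
    """Components (A_x, A_y) of the assumed potential A = (i*B/2) * (-y dx + x dy)."""
    x, y = point
    return 0.5j * B * np.array([-y, x])


def holonomy(vertices, steps_per_edge=200):
    """Parallel-transport phase around the closed polygon through `vertices`.

    Each small step multiplies psi by exp(-A . dx), with A evaluated at the
    midpoint of the step (exact here because A is linear in x and y).
    """
    phase = 1.0 + 0.0j
    closed = list(vertices) + [vertices[0]]
    for start, end in zip(closed[:-1], closed[1:]):
        start = np.asarray(start, dtype=float)
        end = np.asarray(end, dtype=float)
        step = (end - start) / steps_per_edge
        for k in range(steps_per_edge):
            midpoint = start + (k + 0.5) * step
            phase *= np.exp(-gauge_potential(midpoint) @ step)
    return phase


# Counterclockwise rectangle with hypothetical side lengths.
width, height = 1.5, 0.5
loop = [(0.0, 0.0), (width, 0.0), (width, height), (0.0, height)]

wilson = holonomy(loop)
expected = np.exp(-1j * B * width * height)  # exp(-F_xy * enclosed area)
print(wilson, expected)
```

In this sketch the transported field comes back multiplied by a pure phase that depends only on the field strength times the enclosed area, not on the gauge potential itself.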

I guess, someone else will find a more bundle-ish version of this explanation.

PS: I have silently assumed you are talking about electromagnetism. For other gauge theories the derivation is almost the same, only that the commutator of $A_\mu$ with $A_\nu$ does not vanish (and consequently enters into the field tensor) like it does for EM.
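The calculation can also be checked symbolically. The following is a sketch using SymPy, assuming two coordinates and generic $2\times 2$ matrix-valued components $A_0$, $A_1$ (so the non-abelian commutator term is included). The names `generic_matrix`, `nabla` and `field_strength` are hypothetical helpers introduced only for this illustration.

```python
import sympy as sp

x, y = sp.symbols("x y", real=True)
coords = (x, y)


def generic_matrix(name):
    """2x2 matrix whose entries are arbitrary smooth functions of (x, y)."""
    return sp.Matrix(2, 2, lambda i, j: sp.Function(f"{name}{i}{j}")(x, y))


# Hypothetical gauge potential components A_0, A_1 and a test field psi.
A = [generic_matrix("a"), generic_matrix("b")]
psi = sp.Matrix(2, 1, lambda i, _: sp.Function(f"psi{i}")(x, y))


def nabla(mu, field):
    """Covariant derivative (partial_mu + A_mu) applied to a column vector."""
    return field.diff(coords[mu]) + A[mu] * field


def field_strength(mu, nu):
    """F_{mu nu} = partial_mu A_nu - partial_nu A_mu + [A_mu, A_nu]."""
    return (A[nu].diff(coords[mu]) - A[mu].diff(coords[nu])
            + A[mu] * A[nu] - A[nu] * A[mu])


# The commutator must be applied to a test field so the product rule is used.
commutator = nabla(0, nabla(1, psi)) - nabla(1, nabla(0, psi))
residual = (commutator - field_strength(0, 1) * psi).applyfunc(sp.expand)
print(residual)
```

The residual vanishes identically: the terms of the form $A_\mu\partial_\nu\psi$ produced by the product rule cancel against those coming from the cross terms, which is exactly the step that is lost when the test field $\psi$ is dropped.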

edited Mar 19 '21 at 17:16

answered Mar 19 '21 at 16:55


1

It seems like you are assuming $A_{\mu}\partial_{\nu}=A_{\nu}\partial_{\mu}$: After the fourth equal sign you add $A_{\nu}\partial_{\mu}$ and subtract $A_{\mu}\partial_{\nu}$. Otherwise, you would have ended up with the equation $[\nabla_{\mu},\nabla_{\nu}]=F_{\mu\nu}+A_{\mu}\partial_{\nu}-A_{\nu}\partial_{\mu}$ like I did (see my question). – Filippo Mar 19 '21 at 17:37
2

@Filippo: not at all, after the fourth equal sign, both $A_\mu\partial_\nu$ as well as $A_\nu\partial_\mu$ appear each once with a "+" and once with a "-" sign. Hence they both cancel. – oliver Mar 19 '21 at 17:52
Here's how I came to my conclusion. The pink terms are the ones I was talking about. – Filippo Mar 19 '21 at 18:03
@Filippo: what about cancelling the red term with the second pink term, and the first pink term with the green term? – oliver Mar 19 '21 at 18:11
Well, that's what you do to obtain the desired result, isn't it? $\textbf{After}$ adding the pink terms. – Filippo Mar 19 '21 at 18:14
1

What do you mean by "After adding the pink terms"? I'm just using algebra and the rules of differentiation. No assumptions. Nothing to add. – oliver Mar 19 '21 at 18:19
The picture I linked to shows that the pink terms don't appear before the fourth equal sign. – Filippo Mar 19 '21 at 18:21
1

@Filippo: you are using highly complicated notation from differential geometry and don't know how to apply the product rule of differentiation...? – oliver Mar 19 '21 at 18:23
Can you explain, please? – Filippo Mar 19 '21 at 18:26
@Filippo You haven't properly applied the product rule to the orange term or the blue term $\partial_\nu (A_\mu \psi) \neq (\partial_\nu A_\mu) \psi $ – Jbag1212 Mar 19 '21 at 19:32
@Jbag1212 I have the feeling that there has been a misconception: The picture doesn't show my calculations, it's a screenshot of oliver's answer. – Filippo Mar 19 '21 at 19:36
2

@Filippo What is the issue with the linked picture? The pink terms come from applying the product rule to the corresponding blue and orange terms. They then cancel with green and red term. In other words, they were not "added" they came about from properly applying the product rule – Jbag1212 Mar 19 '21 at 19:46
@Jbag1212 Now I understand, thank you! – Filippo Mar 19 '21 at 19:49
1

@oliver I thought that the $\psi$ was just a decoration and that the calculation could be performed without it. Thus, I didn't get that you were using the product rule. This is one of the situations where naively doing the calculation without the $\psi$ leads to a wrong result. Thank you for not only doing the calculation, but also talking about the physical interpretation! – Filippo Mar 19 '21 at 20:21
1

@Filippo: I see that you are a little tending to the mathematical side of things (do you happen to be a mathematician?), and from this point of view it is understandable that you probably expected to get a declaration of $\psi$ as a function. For me as a physicist it is just natural to assume that $\psi$ in the context of a field theory is a function. After all, physicists are often a little lazy with notation, but it helps getting quicker to the point, at least if those with whom one communicates speak the same kind of "street language". – oliver Mar 19 '21 at 20:28
1

@oliver "Street language" :D I am currently a physics student (maybe I'll change to mathematics after my bachelor degree), but indeed, I have a tendency towards rigorous mathematics and I'm not very good at understanding the street language of physicists ^^ – Filippo Mar 20 '21 at 15:44
1

@Filippo: I was once like you :-) But I noticed that I had to decide for one side because it is exhausting to express everything in two slightly different languages, and most of all, making the tools and using them at the same time. And so I decided for the evil side ;-) But there are many physicists more capable than me, who see nothing special in mastering physics and mathematics at the same time. – oliver Mar 21 '21 at 06:41

score 3 · Answer 2 · answered Mar 19 '21 at 17:25

The field strength $F$ is $\mathfrak{g}$-valued. In that case if $X^a$ is a basis of the Lie algebra we can write $$F=\dfrac{1}{2}F_{\mu\nu}^a dx^\mu\wedge dx^\nu \otimes X^a.$$

In particular, given any representation $R:G\to {\rm GL}(V)$ of $G$ on the vector space $V$, we have the derived representation $dR:\mathfrak{g}\to {\rm End}(V)$ of the Lie algebra, and therefore we have the representative of $F$ in this representation.

Denoting $dR(X^a)=T^a$ the generators of the representation $R$ we have that $F$ is represented by $$F_R=\dfrac{1}{2}F_{\mu\nu}^a dx^\mu\wedge dx^\nu \otimes T^a.$$

In particular $(F_R)_{\mu\nu}:= F_{\mu\nu}^a T^a$ is a linear operator on $V$ and can act on any $V$-valued object.

When we write $[\nabla_\mu,\nabla_\nu]=F_{\mu\nu}$ what we mean by $F_{\mu\nu}$ is really $(F_R)_{\mu\nu}$, because a particular representation is being understood. After all, the covariant derivative is induced from the principal connection on each associated vector bundle, and these bundles are constructed from representations of $G$! The thing is that people abuse notation and leave the representation implicit.

In that case $[\nabla_\mu,\nabla_\nu]$ is a map that can act on sections of the associated bundle, and so is $F_{\mu\nu}$, for the reasons I have outlined above. You are not equating objects of different nature here.
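As an illustration of how the same Lie-algebra-valued components become different operators in different representations, here is a small NumPy sketch, assuming $\mathfrak{g}=\mathfrak{su}(2)$ with basis convention $[X^a,X^b]=\epsilon^{abc}X^c$. The component values and the helper names `represent` and `commutator` are made up for the example.

```python
import numpy as np

# Levi-Civita symbol, i.e. the structure constants in the assumed convention.
eps = np.zeros((3, 3, 3))
for a, b, c in [(0, 1, 2), (1, 2, 0), (2, 0, 1)]:
    eps[a, b, c] = 1.0
    eps[b, a, c] = -1.0

pauli = [
    np.array([[0, 1], [1, 0]], dtype=complex),
    np.array([[0, -1j], [1j, 0]], dtype=complex),
    np.array([[1, 0], [0, -1]], dtype=complex),
]

# Generators T^a = dR(X^a) in two representations.
T_fund = [-0.5j * s for s in pauli]      # V = C^2 (fundamental)
T_adj = [-eps[a] for a in range(3)]      # V = R^3 (adjoint), (T^a)_{bc} = -eps_{abc}


def represent(components, generators):
    """Map Lie algebra components F^a to the operator F^a T^a on V."""
    return sum(c * T for c, T in zip(components, generators))


def commutator(A, B):
    return A @ B - B @ A


# Hypothetical components F^a_{mu nu} at one point, for fixed mu, nu.
F_components = np.array([0.3, -1.2, 0.7])
F_fund = represent(F_components, T_fund)  # 2x2 matrix acting on C^2
F_adj = represent(F_components, T_adj)    # 3x3 matrix acting on R^3

# dR is a Lie algebra homomorphism: with this basis, [u, v] has components u x v.
u, v = np.array([1.0, 2.0, -0.5]), np.array([0.4, -1.0, 3.0])
hom_fund = np.allclose(represent(np.cross(u, v), T_fund),
                       commutator(represent(u, T_fund), represent(v, T_fund)))
hom_adj = np.allclose(represent(np.cross(u, v), T_adj),
                      commutator(represent(u, T_adj), represent(v, T_adj)))
print(F_fund.shape, F_adj.shape, hom_fund, hom_adj)
```

The components $F^a_{\mu\nu}$ are the same in both cases; only the generators $T^a$, and hence the space the operator acts on, change with the representation.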

Filippo · Answer 3 · 2021-03-26T10:31:11.073

Here's what should be the mathematically rigorous statement and proof:

Let $\rho_*\colon g\to\mathrm{End}(V)$ be the Lie algebra homomorphism induced by the Lie group homomorphism $\rho\colon G\to\mathrm{GL}(V)$. Consider the left action \begin{equation*} \mathrm{End}(V)\times V\ni(A,v)\mapsto A\cdot v:=A(v)\in V. \end{equation*} If $A\in C^{\infty}(U,g)$, then the map \begin{align} \rho_*A\colon C^{\infty}(U,V)&\to C^{\infty}(U,V)\\ \phi&\mapsto(\rho_*\circ A)\cdot\phi \end{align} is $C^{\infty}(U)$-linear. Moreover, $\rho_*(A+B)=\rho_*A+\rho_*B$ and $[\rho_*A,\rho_*B]=\rho_*[A,B]$ for all $A,B\in C^{\infty}(U,g)$.

If $x\colon U\to\mathbf{R}^n$ is a chart, each vector field \begin{equation*} \partial_\mu=\frac{\partial}{\partial x^\mu}\in\Gamma(U,TM) \end{equation*} induces an endomorphism \begin{align*} \partial_{\mu}\colon C^{\infty}(U,V)&\to C^{\infty}(U,V)\\ \phi&\mapsto\mathrm{d}\phi(\partial_{\mu})=\partial_{\mu}(\phi\circ x^{-1})\circ x \end{align*} and \begin{equation} \nabla_{\mu}=\partial_{\mu}+\rho_*A_{\mu}. \end{equation} Corollary: \begin{equation} [\nabla_\mu,\nabla_\nu]=\rho_*F_{\mu\nu} \end{equation}

Proof:

The equation \begin{equation*}\tag{1} \partial_{\mu}\circ(\rho_*A_{\nu})=\rho_*(\partial_{\mu}A_{\nu})+\rho_*A_{\nu}\circ\partial_{\mu} \end{equation*} is equivalent to $[\partial_{\mu},\rho_*A_{\nu}]=\rho_*(\partial_{\mu}A_{\nu})$ and therefore implies \begin{equation*} [\partial_{\mu},\rho_*A_{\nu}]-[\partial_{\nu},\rho_*A_{\mu}]=\rho_*(\partial_{\mu}A_{\nu})-\rho_*(\partial_{\nu}A_{\mu}). \end{equation*} Thus, using $[\rho_*A_{\mu},\partial_{\nu}]=-[\partial_{\nu},\rho_*A_{\mu}]$, we obtain \begin{align*} [\nabla_\mu,\nabla_\nu]&=[\partial_{\mu}+\rho_*A_{\mu},\partial_{\nu}+\rho_*A_{\nu}]\\&=[\partial_\mu,\partial_\nu]+[\partial_{\mu},\rho_*A_{\nu}]-[\partial_{\nu},\rho_*A_{\mu}]+[\rho_*A_\mu,\rho_*A_\nu]\\&=\rho_*(\partial_{\mu}A_{\nu})-\rho_*(\partial_{\nu}A_{\mu})+\rho_*[A_{\mu},A_{\nu}]\\&=\rho_*(\partial_{\mu}A_{\nu}-\partial_{\nu}A_{\mu}+[A_{\mu},A_{\nu}])=\rho_*F_{\mu\nu}, \end{align*} where the last step uses the structure equation $F_{\mu\nu}=\partial_{\mu}A_{\nu}-\partial_{\nu}A_{\mu}+[A_{\mu},A_{\nu}]$.

Addendum - proof of equation $(1)$

More explicitly, equation $(1)$ means \begin{equation*} \partial_{\mu}(\rho_*(A_{\nu})\phi)=\rho_*(\partial_{\mu}A_{\nu})\phi+\rho_*(A_{\nu})\partial_{\mu}\phi \end{equation*} for all $\phi\in C^{\infty}(U,V)$ ("product rule"). Even more explicitly: \begin{equation*} \partial_{\mu}((\rho_*A_{\nu}\cdot\phi)\circ x^{-1})\circ x=\rho_*(\partial_{\mu}(A_{\nu}\circ x^{-1})\circ x)\cdot\phi+\rho_*A_{\nu}\cdot(\partial_{\mu}(\phi\circ x^{-1})\circ x)\in C^{\infty}(U,V) \end{equation*} If we define $A_\nu:=A_\nu\circ x^{-1}\in C^{\infty}(x(U),g)$ (notice the abuse of notation), this is equivalent to \begin{equation*} \partial_{\mu}(\rho_*A_{\nu}\cdot\phi)=\rho_*\partial_{\mu}A_{\nu}\cdot\phi+\rho_*A_{\nu}\cdot\partial_{\mu}\phi\in C^{\infty}(x(U),V) \end{equation*} for all $\phi\in C^{\infty}(x(U),V)$.

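Since the partial derivative $\partial_{\mu}$ at a point is the ordinary derivative of the restriction to a line in the $\mu$-th coordinate direction, i.e. of a function defined on an open interval, it suffices to prove the equation for $n=1$: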

Let $I\subset\mathbf{R}$ be an open interval, $V$ and $W$ normed vector spaces, $O\colon W\to\mathrm{End}(V)$ a linear and continuous function and $f\colon I\to W$, $g\colon I\to V$ differentiable functions. If $Ow\in\mathrm{End}(V)$ is continuous for all $w\in W$, then $Of\cdot g\colon I\to V$ is differentiable and \begin{equation*} (Of\cdot g)'=Of'\cdot g+Of\cdot g'. \end{equation*} Proof: Let $x\in I$. To simplify the notation, we define \begin{equation*} \delta F:=F(x+\delta)-F(x),\quad F(x)=:F \end{equation*} for all functions $F\colon I\to X$. We want to prove that for every $\epsilon>0$ there exists an $r>0$ s.t. \begin{equation*} |\delta(Of\cdot g)-\delta\cdot(Of'\cdot g+Of\cdot g')|<|\delta|\epsilon \end{equation*} for all $\delta\in(-r,r)$. This follows from the following facts:

For every $\epsilon>0$ there exists an $r>0$ s.t. $|\delta f-\delta\cdot f'|<|\delta|\epsilon$ and $|\delta g-\delta\cdot g'|<|\delta|\epsilon$ for all $\delta\in(-r,r)$.
$\delta(Of\cdot g)=O\delta f\cdot g+Of\cdot\delta g+O\delta f\cdot\delta g$
$O\delta f\cdot\delta g=O(\delta f-\delta\cdot f'+\delta\cdot f')\cdot(\delta g-\delta\cdot g'+\delta\cdot g') =O(\delta f-\delta\cdot f')\cdot(\delta g-\delta\cdot g')\\+O(\delta f-\delta\cdot f')\cdot(\delta\cdot g')+O(\delta\cdot f')\cdot(\delta g-\delta\cdot g')+O(\delta\cdot f')\cdot(\delta\cdot g')$
Triangle inequality
$|(O(w))(v)|\leq|O||w||v|$ for all $(v,w)\in V\times W$.
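As a sanity check of the lemma (not a replacement for the estimate above), the identity can be verified symbolically in a finite-dimensional special case. The sketch below assumes $W=\mathbf{R}^3$, $V=\mathbf{R}^2$ and a hypothetical linear map $O(w)=\sum_a w_a M_a$ built from three fixed matrices; these choices are made up for the example.

```python
import sympy as sp

t = sp.symbols("t", real=True)

# Fixed matrices M_a defining the assumed linear map O: R^3 -> End(R^2).
basis = [
    sp.Matrix([[0, 1], [0, 0]]),
    sp.Matrix([[0, 0], [1, 0]]),
    sp.Matrix([[1, 0], [0, -1]]),
]


def O(w):
    """Linear map w -> sum_a w_a * M_a."""
    return sum((w[a] * basis[a] for a in range(3)), sp.zeros(2, 2))


# Arbitrary differentiable curves f: I -> R^3 and g: I -> R^2.
f = sp.Matrix(3, 1, lambda a, _: sp.Function(f"f{a}")(t))
g = sp.Matrix(2, 1, lambda i, _: sp.Function(f"g{i}")(t))

lhs = (O(f) * g).diff(t)                        # (Of . g)'
rhs = O(f.diff(t)) * g + O(f) * g.diff(t)       # Of' . g + Of . g'
difference = (lhs - rhs).applyfunc(sp.expand)
print(difference)
```

The difference is the zero vector for arbitrary `f` and `g`, in line with the product rule $(Of\cdot g)'=Of'\cdot g+Of\cdot g'$.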
