The following functional derivative holds: \begin{align} \frac{\delta q(t)}{\delta q(t')} ~=~ \delta(t-t') \end{align} and \begin{align} \frac{\delta \dot{q}(t)}{\delta q(t')} ~=~ \delta'(t-t') \end{align} where $'$ is $d/dt$.

Question: What is \begin{align} \frac{\delta q(t)}{\delta \dot{q}(t')}? \end{align} I'm asking this because in QFT, the lecturer defined the canonical momentum field conjugate to $\phi$ by \begin{align} \pi(x,t) ~:=~ \frac{\delta L(t)}{\delta \dot{\phi}(x,t)}, \end{align} where $L$ is the Lagrangian, a functional of the field: $L[\phi,\partial_\mu \phi] = \int d^d x \,\mathcal{L}(\phi,\partial_\mu \phi)$.

I know I should get \begin{align} \pi(x,t) ~=~ \frac{\partial \mathcal{L}(x,t)}{\partial \dot{\phi}(x,t)}. \end{align} (Note it's now a partial derivative of the Lagrangian density.) But when I carry out the variation I get: \begin{align} \delta L = \int d^dx \left[ \frac{\partial \mathcal{L}}{\partial \phi} \delta \phi + \frac{\partial \mathcal{L}}{\partial (\partial_\mu \phi)} \delta\partial_\mu \phi \right]. \end{align} So somehow we ignore the first term $\int d^dx \frac{\partial \mathcal{L}}{\partial \phi} \delta \phi$! Why is that?

It can't be that we are treating $\delta \phi$ and $\delta \dot{\phi}$ as independent because if I were to take the functional derivative w.r.t. $\phi(x')$, I would have to move the dot from $\delta \dot{\phi}$ over to $\frac{\partial \mathcal{L}}{\partial \dot{\phi}}$ which will give me

$$\int d^dx (\frac{\partial \mathcal{L}}{\partial \phi} - \partial_\mu\frac{\partial \mathcal{L}}{\partial \partial_\mu \phi}) \delta \phi$$

i.e. the functional derivative gives the Euler-Lagrange equations.

So how do I take the functional derivative of a functional w.r.t. the derivative of a function?

Qmechanic
nervxxx

2 Answers

Contrary to your claim near the end of your question, I claim that the time-derivative of the field is being treated as an "independent" argument of the Lagrangian. I'll try to convince you of this by showing you how this independence leads to everything working out the way you think it should. Some of the key points are at the end, so please read all the way through before you succumb to skepticism.

For the sake of simplicity, let's assume from the start that we are considering a classical theory of fields $\phi:\mathbb R^2\to\mathbb R$. Let $\mathcal F$ denote the set of admissible fields in this theory. We denote the first field argument with $t$ and the second argument with $x$, so we write $\phi(t,x)$ as usual.

OK, now let's turn to the Lagrangian. To describe this correctly, imagine holding the $x$ argument of a field in our theory fixed; this yields a real-valued function of a single real variable, $\phi(\cdot, x):\mathbb R\to\mathbb R$. Suppose that $\mathcal G$ denotes the set of such functions. Then the Lagrangian can be defined as a functional $L:\mathcal F\times\mathcal F\to\mathcal G$. In other words, it takes in two functions that map $\mathbb R^2\to\mathbb R$ and outputs a function that maps $\mathbb R\to\mathbb R$. We label the first argument suggestively by $\phi$ and the second argument suggestively by $\dot \phi$, but in principle, one can evaluate $L$ on whatever fields $\phi$ and $\psi$ one chooses and write, for example, $L[\phi, \psi]$. I claim that the definitions of the relevant functional derivatives are as follows: \begin{align} \frac{\delta L}{\delta \phi(t,x)}[\phi,\dot\phi](t) &= \lim_{\epsilon\to 0}\frac{L[\phi+\epsilon\Delta_x,\dot\phi](t)-L[\phi,\dot\phi](t)}{\epsilon} \\ \frac{\delta L}{\delta \dot\phi(t,x)}[\phi,\dot\phi](t) &= \lim_{\epsilon\to 0}\frac{L[\phi,\dot\phi+\epsilon\Delta_x](t)-L[\phi,\dot\phi](t)}{\epsilon} \end{align} where I am using the notation \begin{align} \Delta_{x}(t,x') = \delta(x'-x). \end{align} Notice that this is essentially like taking partial derivatives because we vary the arguments of $L$ independently.

Now, suppose that we have a theory described by a Lagrangian density that is a local function of the field and its first derivatives. Then the Lagrangian density is defined as a function $\mathscr L:\mathbb R^3\to\mathbb R$, and, because we anticipate that we will be putting the values of the field and its derivatives into the arguments of the Lagrangian density, we label its three arguments with the symbols $\phi, \dot \phi, \phi'$. The symbols $\dot\phi$ and $\phi'$ are supposed to suggestively indicate that the arguments of the Lagrangian density are meant to be evaluated on the values of a field and its time and space derivative. This is, of course, a bit of an abuse of notation since $\phi$ is usually reserved as a symbol for the field, a function $\mathbb R^2\to\mathbb R$, not for the values of the field. But as long as we keep this abuse of notation in mind, we shouldn't get confused. Then we have \begin{align} L[\phi, \dot\phi](t) = \int dx' \,\mathscr L(\phi(t,x'), \dot\phi(t,x'), \phi'(t,x')) \end{align} Now let's apply the definitions of the functional derivatives above and see what we get. For one thing, we have \begin{align} \frac{\delta L}{\delta \dot\phi(t,x)}[\phi,\dot\phi](t) &= \lim_{\epsilon\to0}\frac{\int dx'\,\mathscr{L}(\phi(t,x'),\dot{\phi}(t,x')+\epsilon\delta(x'-x),\partial_{x'}\phi(t,x'))-\int dx'\,\mathscr{L}}{\epsilon} \\ &= \int dx'\,\frac{\partial\mathscr L}{\partial \dot\phi}(\phi(t,x'), \dot\phi(t,x'), \phi'(t,x'))\,\delta(x'-x) \\ &= \frac{\partial\mathscr L}{\partial \dot\phi}(\phi(t,x),\dot\phi(t,x),\phi'(t,x)) \end{align} which is exactly what you said you should get in your question. 
Similarly, I'll leave it to you to show that the definition above yields \begin{align} \frac{\delta L}{\delta \phi(t,x)}[\phi,\dot\phi](t) &= \frac{\partial\mathscr L}{\partial \phi}(\phi(t,x),\dot\phi(t,x),\phi'(t,x)) \\ &\hspace{2cm}-\frac{\partial}{\partial x}\left[\frac{\partial\mathscr L}{\partial \phi'}(\phi(t,x),\dot\phi(t,x),\phi'(t,x))\right] \end{align} or, if we relax the notation a bit since we know what we're doing now, we can summarize this as \begin{align} \frac{\delta L}{\delta \dot\phi} = \frac{\partial\mathscr L}{\partial\dot\phi}, \qquad \frac{\delta L}{\delta \phi} = \frac{\partial \mathscr L}{\partial \phi} - \frac{\partial}{\partial x}\frac{\partial \mathscr L}{\partial \phi'} \end{align} Now, suppose that we want to obtain the Euler-Lagrange equations. For this, we define the action for our theory as a function $S:\mathcal F\to\mathbb R$ as follows: \begin{align} S[\phi]=\int dt \,L[\phi, \dot\phi](t) \end{align} Notice that here, the symbol $\dot\phi$ does denote the partial time-derivative of the field $\phi$, namely $\dot\phi = \partial_t\phi$. The key point here is that even though the arguments of the Lagrangian are independent, we always have the freedom to evaluate the arguments on a field and its derivative which are certainly not independent. In particular, this means that if we vary the action, then in the integral on the right hand side, we can perform the sort of integration by parts that you were worried we wouldn't be able to do. 
In fact, if you vary the action, then you'll find (up to boundary terms, which vanish for variations $\delta\phi$ that vanish on the boundary) that \begin{align} \delta S[\phi] &= \int dt\,dx\,\left[\frac{\delta L}{\delta\phi} - \frac{\partial}{\partial t}\frac{\delta L}{\delta\dot\phi}\right]\delta\phi \end{align} so setting the variation to zero, and substituting the results derived above from the claimed definitions of the partial variational derivatives, we obtain the standard Euler-Lagrange equations \begin{align} \frac{\partial \mathscr L}{\partial \phi} -\frac{\partial}{\partial t}\frac{\partial \mathscr L}{\partial \dot\phi} - \frac{\partial}{\partial x}\frac{\partial \mathscr L}{\partial \phi'}=0. \end{align}
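The two functional-derivative formulas above can be checked numerically. The following is only a minimal sketch under assumptions that are not part of the answer: a free-field density $\mathscr L = \tfrac12\dot\phi^2 - \tfrac12\phi'^2 - \tfrac12 M^2\phi^2$, a periodic spatial grid, and the hypothetical helper names `density`, `lagrangian` and `functional_derivative`. The delta function $\Delta_x$ is replaced by a bump of height $1/dx$ at a single grid site, and the field and "velocity" profiles are deliberately chosen independently of each other.

```python
import math

M = 1.0  # mass parameter (illustrative value, an assumption of this sketch)

def density(phi, v, dphi):
    """Free-field density: v^2/2 - phi'^2/2 - M^2 phi^2/2 (assumed example)."""
    return 0.5 * v**2 - 0.5 * dphi**2 - 0.5 * M**2 * phi**2

def lagrangian(phi, v, dx):
    """L[phi, v] at one instant: sum_i dx * density on a periodic grid.
    phi and v are independent lists; v need not be a time derivative of phi."""
    n = len(phi)
    return sum(
        dx * density(phi[i], v[i], (phi[(i + 1) % n] - phi[i]) / dx)
        for i in range(n)
    )

def functional_derivative(phi, v, dx, j, wrt, eps=1e-4):
    """Central-difference version of the limit definition: bump ONE argument
    by eps * (discrete delta at site j) while holding the other fixed."""
    plus = {"phi": list(phi), "v": list(v)}
    minus = {"phi": list(phi), "v": list(v)}
    plus[wrt][j] += eps / dx   # eps * Delta_x, with delta(x' - x_j) -> 1/dx
    minus[wrt][j] -= eps / dx
    return (lagrangian(plus["phi"], plus["v"], dx)
            - lagrangian(minus["phi"], minus["v"], dx)) / (2 * eps)

n = 200
dx = 2 * math.pi / n
xs = [i * dx for i in range(n)]
phi = [math.sin(x) for x in xs]      # field profile
v = [math.cos(2 * x) for x in xs]    # independent "velocity" profile
j = 37                               # grid site at which we differentiate

dL_dv = functional_derivative(phi, v, dx, j, "v")
dL_dphi = functional_derivative(phi, v, dx, j, "phi")
print(dL_dv, v[j])                        # compare with dL/d(phidot) = phidot
print(dL_dphi, -M**2 * phi[j] - phi[j])   # compare with -M^2 phi + phi'' (phi'' = -phi here)
```

The first pair of printed numbers should agree to rounding error, the second up to the discretization error of the finite-difference Laplacian. Note that the velocity derivative never touches `phi`, which is the discrete counterpart of treating the two arguments of $L$ as independent.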

joshphysics
  • @dj_mummy Well what you have written in terms of $\epsilon$ derivatives is the standard way of defining variations, but once the definitions are made, it is often a lot more convenient to write shorthand symbols. I would certainly agree with complaints that the $\delta$ notation often leads to confusion and mathematical nonsense, but for those who know the precise definitions, it's useful notation in my opinion. – joshphysics Oct 12 '13 at 09:42
  • @joshphysics Hi Josh, thanks for your answer. I eventually figured it out on my own (after I posted, which is typical -.-), but anyway what you have done is right, but I don't think you explained well why the variation in $\phi$ and $\dot{\phi}$ can be thought of as independent, when in QM (not QFT) the variation in $q$ and $\dot{q}$ are clearly not independent. The resolution is that we are defining the canonical conjugate momentum to the field at one fixed point in time only: $\pi(x,t_0)$. Now $L = L[\phi(x,t_0), \dot{\phi}(x,t_0), \partial_i \phi(x,t_0)]$ so in this snapshot of time, – nervxxx Oct 12 '13 at 14:18
  • we can vary the fields $\delta \phi$ at some other fixed point in time, but that also gives me the freedom to vary how fast the field moves at that other fixed point in time, so $\delta \phi(\vec{x}',t')$ and $\delta \dot{\phi}(\vec{x}',t')$ are independent variations. Note that $\delta \partial_i \phi$ is not independent of $\delta \phi$ - because given a new $\phi(\vec{x},t_0)$ at $t_0$, the change in the spatial gradients are known, which leads to the integration by parts and the Euler-Lagrange eqns in space coordinates only. Eventually they lead to the same variations as you wrote down – nervxxx Oct 12 '13 at 14:23
  • and to make $\pi$ evolve in time, in QFT we evolve operators the usual way by acting on it with the unitary time evolution operator: $\pi(x,t) = U(t)^\dagger \pi(x,0) U(t)$. – nervxxx Oct 12 '13 at 14:24
  • @nervxxx Sorry my answer wasn't more insightful. A couple of things: it's not clear how quantum is relevant here. These issues are present in classical mechanics and classical field theory. In the case of classical mechanics, $q$ and $\dot q$ are treated as being "independent" in the Lagrangian (think phase space), but the notion of functional derivatives of $L$ is no longer particularly relevant since it's not the integral of a density. We can, however, take functional derivatives of the action which is commonly viewed as a functional only of paths, and not their derivatives, – joshphysics Oct 12 '13 at 22:36
  • so in that case, I agree that it would be rather artificial to augment the action by a $\dot q$ argument and take independent functional derivatives with respect to $q$ and $\dot q$. – joshphysics Oct 12 '13 at 22:37
  • Can you give some reference –  Jul 18 '16 at 12:00
  • @bgr95 I'm sorry to say that I don't know of a good reference for this. If you find one, please let me know. – joshphysics Jul 18 '16 at 13:57

This answer can be viewed as a supplement to joshphysics' correct answer, possibly stressing slightly different things and using slightly different words.

Before defining functional/variational derivatives in the Lagrangian formalism, it is crucial to understand exactly which variables are independent of each other and which are not. In other words, which variables can we freely vary and which can we not?

This is simplest to understand in point mechanics (PM), see e.g. this Phys.SE post. Here we shall focus on $n+1$ dimensional field theory (FT) with $n$ spatial dimensions and one temporal dimension.

Let us for simplicity assume that there is only one field $q$ (which we for semantical reasons will call a position field). The field $q$ is then a function $q:\mathbb{R}^{n}\times[t_i,t_f]\to \mathbb{R}$. There is also a velocity field $v:\mathbb{R}^{n}\times[t_i,t_f]\to \mathbb{R}$.

I) Let there be given an arbitrary but fixed instant of time $t_0\in [t_i,t_f]$. The (instantaneous) Lagrangian is a local functional

$$\begin{align}L&[q(\cdot,t_0),v(\cdot,t_0);t_0]\cr ~=~&\int \!d^nx~{\cal L}\left(q(x,t_0),\partial q(x,t_0),\partial^2q(x,t_0), \ldots,\partial^Nq(x,t_0);\right. \cr &\left. v(x,t_0),\partial v(x,t_0),\partial^2 v(x,t_0), \ldots,\partial^{N-1} v(x,t_0); x,t_0\right),\end{align}\tag{1} $$

where $\partial$ denotes spatial (as opposed to temporal) derivative. Here $N$ is finite for a local FT, and $N\leq 1$ for a relativistic FT. The Lagrangian density ${\cal L}$ is a function of the variables listed in eq (1).

The (instantaneous) Lagrangian (1) is a functional of both the instantaneous position $q(\cdot,t_0)$ and the instantaneous velocity $v(\cdot,t_0)$ at the instant $t_0$. Here $q(\cdot,t_0)$ and $v(\cdot,t_0)$ are independent variables. More precisely, they are independent (spatially distributed) profiles, or in other words, independent functions $\mathbb{R}^n\to \mathbb{R}$ over the $x$-space. The (instantaneous) Lagrangian (1) can in principle also depend explicitly on $t_0$. Note that the (instantaneous) Lagrangian (1) does not depend on the past $t<t_0$ nor the future $t>t_0$.

Thus it makes sense to define equal-time functional differentiations as

$$\frac{\delta q(x,t_0)}{\delta q(x^{\prime},t_0)} ~=~\delta^n(x-x^{\prime}), \qquad \frac{\delta v(x,t_0)}{\delta q(x^{\prime},t_0)}~=~0,$$ $$ \frac{\delta v(x,t_0)}{\delta v(x^{\prime},t_0)} ~=~\delta^n(x-x^{\prime}),\qquad \frac{\delta q(x,t_0)}{\delta v(x^{\prime},t_0)}~=~0. \tag{2} $$

And it makes sense to define canonical momentum as

$$p(x,t_0) ~:=~\frac{\delta L[q(\cdot,t_0),v(\cdot,t_0);t_0]}{\delta v(x,t_0)}, \tag{3} $$

where it is implicitly understood that the position $q$ is kept fixed in the velocity differentiation (3). In the $N\leq 2$ case, the field-theoretic momentum definition (3) becomes

$$\begin{align}&p(x,t_0)~=~\cr &\left(\frac{\partial }{\partial v(x,t_0)}- \sum_{i=1}^n\frac{d}{dx^i}\frac{\partial }{\partial (\partial_iv(x,t_0))}\right)\cr &{\cal L}\left(q(x,t_0),\partial q(x,t_0),\partial^2q(x,t_0); v(x,t_0),\partial v(x,t_0);x,t_0\right) .\end{align} \tag{4} $$

In the $N\leq 1$ case, the field-theoretic momentum definition (3) becomes simply a partial derivative

$$p(x,t_0) ~=~\frac{\partial{\cal L}\left(q(x,t_0),\partial q(x,t_0); v(x,t_0);x,t_0\right) }{\partial v(x,t_0)} .\tag{5} $$
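As an illustration of how eq. (4) reduces to eq. (5), consider the following worked example. The density and the constants $\kappa$ and $m$ are hypothetical, chosen only to make the extra term visible; they are not taken from any particular theory:

$$ {\cal L} ~=~ \frac{1}{2}v^2 + \frac{\kappa}{2}\sum_{i=1}^n(\partial_i v)^2 - \frac{1}{2}\sum_{i=1}^n(\partial_i q)^2 - \frac{m^2}{2}q^2 . $$

Since $\partial{\cal L}/\partial v = v$ and $\partial{\cal L}/\partial(\partial_i v) = \kappa\,\partial_i v$, eq. (4) gives

$$ p(x,t_0) ~=~ v(x,t_0) - \kappa\sum_{i=1}^n \partial_i\partial_i v(x,t_0) . $$

For $\kappa=0$ the density no longer depends on $\partial v$ (the $N\leq 1$ case), and the result collapses to the plain partial derivative of eq. (5), $p(x,t_0) = v(x,t_0)$.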

II) Finally let us integrate over time $t\in[t_i,t_f]$. The action functional reads:

$$ S[q]~:=~\int_{t_i}^{t_f} \! dt~ \left. L[q(\cdot,t),v(\cdot,t);t]\right|_{v=\dot{q}}.\tag{6}$$

Here the time derivative $v=\dot{q}$ does depend on the function $q:\mathbb{R}^{n}\times[t_i,t_f]\to \mathbb{R}$.

$$\frac{\delta q(x,t)}{\delta q(x^{\prime},t^{\prime})} ~=~\delta^n(x-x^{\prime})\delta(t-t^{\prime}), \tag{7} $$

$$\frac{\delta \dot{q}(x,t)}{\delta q(x^{\prime},t^{\prime})} ~=~\delta^n(x-x^{\prime})\frac{d}{dt}\delta(t-t^{\prime}) ~\equiv~\delta^n(x-x^{\prime})\delta^{\prime}(t-t^{\prime}). \tag{8} $$

In particular, it does not make sense to vary independently w.r.t. the velocity in the action (6) while keeping the position fixed.
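The sign in eq. (8) can be checked by smearing with a test function. The sketch below suppresses the $x$-dependence for brevity; the symbols $f$ (a smooth test function vanishing at $t_i$ and $t_f$) and $\epsilon$ are introduced here for illustration only. From the definition of the functional derivative,

$$ \int \! dt^{\prime}~ f(t^{\prime})\,\frac{\delta \dot{q}(t)}{\delta q(t^{\prime})} ~=~ \left.\frac{d}{d\epsilon}\,\frac{d}{dt}\bigl[q(t)+\epsilon f(t)\bigr]\right|_{\epsilon=0} ~=~ \dot{f}(t), $$

while the right-hand side of eq. (8) gives

$$ \int \! dt^{\prime}~ f(t^{\prime})\,\frac{d}{dt}\delta(t-t^{\prime}) ~=~ \frac{d}{dt}\int \! dt^{\prime}~ f(t^{\prime})\,\delta(t-t^{\prime}) ~=~ \dot{f}(t). $$

The two sides agree with a plus sign, provided the prime on $\delta^{\prime}(t-t^{\prime})$ means $d/dt$. If the derivative is instead taken w.r.t. $t^{\prime}$, then $\frac{d}{dt^{\prime}}\delta(t-t^{\prime}) = -\frac{d}{dt}\delta(t-t^{\prime})$ and a minus sign appears in front.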

See also this related Phys.SE post.

Qmechanic
  • Thanks for the edit of my post, but you deleted a minus sign in the variation of qdot against q. I have included it again. – nervxxx Oct 13 '13 at 06:38
  • @nervxxx: The minus sign in your second eq. should not be there: $\frac{\delta }{\delta q(t')}\dot{q}(t) =\frac{\delta }{\delta q(t')}\frac{d}{dt}q(t) =\frac{d}{dt}\frac{\delta }{\delta q(t')}q(t)=\frac{d}{dt}\delta(t-t') \equiv \delta'(t-t').$ This calculation can also be done more rigorously using test functions and two integrations by part. – Qmechanic Oct 13 '13 at 06:57
  • It is there. You pick up a minus sign when you do an integration by parts to transfer the derivative from the qdot to the delta function. – nervxxx Oct 13 '13 at 07:11
  • Yes, but there are two integrations by part (forth and back). – Qmechanic Oct 13 '13 at 07:39
  • What's wrong with this argument then: $\dot{q}(t) = \int \dot{q}(t') \delta(t-t') dt' = -\int q(t') \dot{\delta}(t-t') dt'$. Varying, $\delta\dot{q}(t) = - \int \delta q(t') \dot{\delta}(t-t') dt'$, so the functional derivative of $\dot{q}(t)$ w.r.t. $q(t')$ is whatever's in front of $\delta q(t')$ under the integral, which is $-\dot{\delta}(t-t')$. I can see where this goes wrong, when applying it to the Lagrangian - I don't get the EL equations, but I have the result $-\dot{\delta}(t-t')$ written twice in my notes from a class with string theorist Alexander Polyakov! :O – nervxxx Oct 13 '13 at 07:54
  • That calculation goes as follows: $\dot{q}(t) = \int \! dt^{\prime} ~\delta(t-t^{\prime}) \dot{q}(t^{\prime}) = \int \! dt^{\prime} ~ \delta(t-t^{\prime}) \frac{d}{dt^{\prime}}q(t^{\prime}) = -\int \! dt^{\prime} ~ q(t^{\prime})\frac{d}{dt^{\prime}} \delta(t-t^{\prime}) = \int \! dt^{\prime} ~ q(t^{\prime})\frac{d}{dt} \delta(t-t^{\prime}) \equiv \int \! dt^{\prime} ~q(t^{\prime})\delta^{\prime}(t-t^{\prime})$. – Qmechanic Oct 13 '13 at 08:09
  • Alright thanks. It appears we are both right. I just wasn't careful enough in thinking what my dot meant. My dot's a $d/dt'$... – nervxxx Oct 13 '13 at 08:27