Calculus of variations: meaning of infinitesimal variation $\delta$ and action minimum

Question

So I am studying classical mechanics through the MIT 8.223 notes, and encountered the derivation of the Euler-Lagrange equation. There is a part I don't quite understand, which concerns the actual meaning of the $\delta$ symbol. We define the action $S[q(t)]$ as the integral from $t_1$ to $t_2$ of $L(q,\dot q,t)$:

$$S[q(t)] = \int_{t_1}^{t_2}L(q,\dot q,t) dt.$$ We also define a new, slightly perturbed function $q(t) + \delta q(t)$, and the variation of the action $\delta S$ as the difference between the action evaluated at the perturbed and at the initial function, respectively (the Lagrangian function is the same for both):

$$\delta S = S[q+\delta q]-S[q] = \int_{t_1}^{t_2}L(q + \delta q,\dot q + \delta \dot q,t) dt - \int_{t_1}^{t_2}L(q,\dot q,t) dt. $$ It is then said that:

$$ \delta S = \delta \int_{t_1}^{t_2}L(q,\dot q,t) dt = \int_{t_1}^{t_2} \delta L(q,\dot q,t) dt. $$

Then, by using the chain rule: $$\int_{t_1}^{t_2} \delta L(q,\dot q,t) dt = \int_{t_1}^{t_2} \left(\frac{\partial L}{\partial q} \delta q + \frac{\partial L}{\partial \dot q} \delta \dot q\right) dt.$$ The derivation goes on, but this is enough to pose my question. I follow everything up to the definition of $\delta S$. Up to that point $\delta$ appears in only two places: in $\delta q$, which is a slight perturbation of the original function (still a function of $t$, so we can even take derivatives of it), and in $\delta S$, which has the straightforward definition given above: the difference between the functional evaluated at the perturbed and at the original function.

The thing I don't get is the use of $\delta$ afterwards: it is brought inside the integral as if it were a new kind of derivative, and it even acts on $L$. However, this use of $\delta$ hasn't been defined. So what is this "operator" exactly, and why can it be used both to define the perturbations of the action and of the generalized coordinate and to operate on functions?
Another, shorter question: why is $\delta S = 0$? It might seem weird, but to me it seems that it should be greater than zero if we are looking for a minimum. We said that $\delta S$ is the difference between the action evaluated at the perturbed and at the unperturbed function, and the action at the original function is a minimum, so the action at any other function is greater than that value. Shouldn't that make $\delta S$ greater than zero?

Just to your last point, $\delta S=0$ in the same way that $\frac{df}{dx}=0$ at a minimum of $f(x)$. At a stationary point in the action you would expect a small deviation in the function to produce zero change in the action. — Charlie, Jul 29 '20 at 17:14

Vicky · Accepted Answer · 2020-08-03T03:19:28.453

Regarding your question about $\delta$ and the $t$-dependence of $q$: first of all, $\delta$ means variation, which is different from differentiation. In other words,

$$ \delta L(\{x_i\}) = \sum_j \frac{\partial L}{\partial x_j}\delta x_j $$

where $\delta x_j$ is a variation of $x_j$, not in time but in its functional form. E.g., if $x_j^{(1)}(t) = x_j(0) + 5t$ and $x_j^{(0)}(t) = x_j(0) + 5(1 - 0.00001)t$, then $\delta x_j$ could be $\delta x_j = x_j^{(1)} - x_j^{(0)} = 0.00005t$. We have not changed $t$ but the function that $x_j$ can be (its form): the thing you've been calling the trajectory since high school.

Now you can understand that $\delta L \neq \frac{dL}{dx}$ or equivalent things. $\delta$ denotes the change of $S$ or $L$ when you change the trajectory your body is following, not when you change the time.

Secondly, $\delta S = 0$ is not imposed to get a minimum but to get a stationary point (i.e. a maximum, minimum or saddle point), because all first-order variations vanish there. You set it equal to zero because you know, since Euler and Lagrange, that the Euler-Lagrange equations give you the classical trajectory of the body under study. As far as I know (but I could be wrong), it wasn't until Feynman that we knew that classically $\delta S = 0$ implies a minimum. But that comes from the path-integral formulation of quantum mechanics, which is a topic for another question. Nevertheless, for completeness I'll give you a little insight. In quantum mechanics, the probability $P$ of a process goes as

$$ P \sim e^{-S/\hbar} $$

So only the smallest actions give relevant contributions to $P$ (in QM more than one trajectory counts, so your classical approximation, your classical trajectory, will be the one at the minimum: the smallest of the smallest, having the highest $P$).

$\delta S = 0$ does not imply a minimum at all, neither classically nor quantum-mechanically. Classical trajectories correspond to stationary configurations. There is a whole theory (Jacobi) discussing when a trajectory actually corresponds to a minimum. — Smerdjakov, Jul 29 '20 at 17:45
That's what I wrote. We know it's a minimum because of the path integral: the greater the $S$, the smaller the exponential. — Vicky, Jul 29 '20 at 17:46
There are many cases where a saddle configuration, for example, is used in a path integral. I do not think we know it is a minimum at all, because it need not be. But I might be wrong of course. — Smerdjakov, Jul 29 '20 at 17:48
First, we know that $\delta S = 0$, so min, max or saddle point. Second, we know the path integral is exponential-dependent: $P \sim \exp(-S/\hbar)$. Both things together allow you to realise that the classical solution must be a min. It cannot be a max or saddle point because the trajectories with higher $S$ are exponentially suppressed, and since classically you have only one solution, only one trajectory, it has to be the most probable: the one with the smallest action. That's the way I think of it, maybe I'm wrong. — Vicky, Jul 30 '20 at 00:52
I see your point and I do need to study it in detail. I was pretty sure the classical action need not be a minimum though, https://aapt.scitation.org/doi/10.1119/1.2710480. But as said, thanks for your hint, I will study it in detail. — Smerdjakov, Jul 30 '20 at 07:52

d_b · Answer 2 · 2020-07-29T21:15:00.310

I address question 1 only.

The standard notation is indeed unfortunate. First of all, let's dispense with the "$\delta x$" notation. The $\delta$ in $\delta S$ and in "$\delta x$" mean completely different things. As I'll explain shortly, we can think of the $\delta$ in $\delta S$ as an operation applied to the action $S$, but "$\delta x$" is one inseparable symbol meant to stand for an infinitesimal variation in the path. It is not $\delta$ applied to $x$. So let's instead write this infinitesimal variation as $\epsilon$.

Now, given an action functional $S(x)$, $\delta S$ stands for the derivative of $S$ with respect to variations in the path $x$. Specifically, \begin{align} S(x+\epsilon) - S(x) = \delta S + R, \end{align} where $\delta S$ is a linear function of $\epsilon$, and $R$ is $O(\epsilon^2)$.
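
To make the split into a linear part and a quadratic remainder concrete, here is a minimal numerical sketch. Everything specific in it is an assumption chosen for illustration and not part of the argument above: the unit-mass harmonic-oscillator Lagrangian, the path $x(t) = t^2$ on $[0,1]$ (deliberately not a solution, so that $\delta S \neq 0$), and the variation $\epsilon(t) = s\,\sin(\pi t)$ with a small scale factor $s$.

```python
import math

def lagrangian(x, v):
    # Assumed example: unit-mass harmonic oscillator, L = v^2/2 - x^2/2.
    return 0.5 * v * v - 0.5 * x * x

def dL_dx(x, v):
    return -x

def dL_dv(x, v):
    return v

def integrate(f, t0, t1, n=2000):
    # Midpoint rule on n equal subintervals.
    h = (t1 - t0) / n
    return h * sum(f(t0 + (i + 0.5) * h) for i in range(n))

def path(t):
    # Assumed reference path x(t) = t^2; returns (x, dx/dt).
    return t * t, 2.0 * t

def bump(t):
    # Shape of the variation, sin(pi t); it vanishes at t = 0 and t = 1.
    return math.sin(math.pi * t), math.pi * math.cos(math.pi * t)

def action(s):
    # S(x + epsilon) with epsilon = s * bump.
    def f(t):
        x, v = path(t)
        e, ed = bump(t)
        return lagrangian(x + s * e, v + s * ed)
    return integrate(f, 0.0, 1.0)

def first_variation(s):
    # delta S: the part of S(x + epsilon) - S(x) that is linear in epsilon.
    def f(t):
        x, v = path(t)
        e, ed = bump(t)
        return dL_dx(x, v) * s * e + dL_dv(x, v) * s * ed
    return integrate(f, 0.0, 1.0)

def remainder(s):
    # R = S(x + epsilon) - S(x) - delta S.
    return action(s) - action(0.0) - first_variation(s)

for s in (0.1, 0.05, 0.025):
    print(s, first_variation(s), remainder(s))
```

Halving $s$ should halve the printed $\delta S$ and quarter the printed $R$, which is what "linear in $\epsilon$" and "$O(\epsilon^2)$" mean in practice.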

Computing this following the usual steps, we find (assuming we choose $\epsilon(t_i) = \epsilon(t_f) = 0$) \begin{equation} \delta S = \int_{t_i}^{t_f}dt \left(\frac{\partial L}{\partial x} -\frac{d}{dt}\frac{\partial L}{\partial\dot{x}}\right) \epsilon \end{equation} Then a further unfortunate choice is often made, namely to denote the integrand in this expression as "$\delta L$", so that "$\delta S = \int \delta L\, dt$". Again, this is a definition of the inseparable symbol "$\delta L$", and not an operation applied to the Lagrangian.
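
For readers who want the "usual steps" spelled out, here is a sketch in the same notation. It assumes only that $L$ is smooth enough to Taylor-expand to first order and that the variation vanishes at both endpoints: \begin{align} S(x+\epsilon) - S(x) &= \int_{t_i}^{t_f}dt \left[L(x+\epsilon,\dot{x}+\dot{\epsilon},t) - L(x,\dot{x},t)\right] \\ &= \int_{t_i}^{t_f}dt \left(\frac{\partial L}{\partial x}\,\epsilon + \frac{\partial L}{\partial\dot{x}}\,\dot{\epsilon}\right) + O(\epsilon^2) \\ &= \int_{t_i}^{t_f}dt \left(\frac{\partial L}{\partial x} - \frac{d}{dt}\frac{\partial L}{\partial\dot{x}}\right)\epsilon + \left[\frac{\partial L}{\partial\dot{x}}\,\epsilon\right]_{t_i}^{t_f} + O(\epsilon^2). \end{align} The last line is an integration by parts of the $\dot{\epsilon}$ term. The boundary term drops out because $\epsilon$ vanishes at $t_i$ and $t_f$, and what remains at first order in $\epsilon$ is $\delta S$.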

References: Arnold, Mathematical Methods of Classical Mechanics, Section 12; José and Saletan, Classical Dynamics, Section 3.1

score 0 · Answer 3 · answered Jul 29 '20 at 20:46

To understand the derivation, you shouldn't seek a mathematically precise definition of the $\delta$ as an operator. Throughout the derivation it has different mathematical meanings, but the physical meaning is consistent: that of a small change.

We make a small change to $q(t)$ and call that $\delta q(t)$. Then we look at how everything else changes to first order, and denote that small change by a $\delta$. So we have $\delta S$, $\delta L$, $\delta \dot{q}$, etc.

The only new operator here is really the $\delta$ on the $S$, which is something like the $\nabla$ operator but applied to functionals. Everywhere else that the $\delta$ appears it is more like the typical $d$ of usual calculus.

And the fact that $\delta \leftrightarrow \nabla$ on $S$ answers your second question. To find a minimum for a function on vectors we would solve $\nabla f = 0$. On functionals we solve $\delta S = 0$. Yes, this doesn't mean that the point actually is a minimum: it could be a maximum, or saddle point. That is just an unfortunate mis-naming of the 'Principle of Least Action'; it should really be called the 'Principle of Stationary Action'.
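
As a rough numerical illustration of a stationary path that is not a minimum, here is a small Python sketch. The setup is an assumption chosen for illustration and does not come from the question: a unit-mass harmonic oscillator with $L = \tfrac12\dot q^2 - \tfrac12\omega^2 q^2$, the trivial solution $q(t) = 0$ between $q(0) = 0$ and $q(T) = 0$ (whose action is zero), and trial paths $q(t) = a\,\sin(n\pi t/T)$ around it.

```python
import math

OMEGA = 1.0  # assumed oscillator frequency (unit mass)

def action_of_mode(a, n, T, steps=4000):
    """Action of the trial path q(t) = a*sin(n*pi*t/T) on [0, T]
    for L = qdot^2/2 - OMEGA^2 * q^2/2, via the midpoint rule."""
    k = n * math.pi / T
    h = T / steps
    total = 0.0
    for i in range(steps):
        t = (i + 0.5) * h
        q = a * math.sin(k * t)
        qdot = a * k * math.cos(k * t)
        total += (0.5 * qdot * qdot - 0.5 * OMEGA ** 2 * q * q) * h
    return total

# The action is even in a, so there is no term linear in a:
# q = 0 is a stationary point in every one of these directions.
for T in (2.0, 4.0):          # T below and above pi/OMEGA
    for n in (1, 2):
        print(T, n, action_of_mode(0.1, n, T))
```

For the short interval every printed action is positive, so $q = 0$ looks like a minimum. For the longer interval the $n = 1$ direction lowers the action while the $n = 2$ direction raises it, so the same solution is a saddle point, even though $\delta S = 0$ in both cases.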

score 0 · Answer 4 · answered Aug 02 '20 at 18:22

To discuss the derivation of the Euler-Lagrange equation I must first discuss the following lemma:

(To my knowledge this lemma doesn't have a name of its own, possibly it is regarded as trivially evident. In another physics.stackexchange answer I have proposed the name Jacob's lemma, after Jacob Bernoulli.)

To present this lemma let me go back to the problem that inspired the development of calculus of variations: the brachistochrone.

The solution of the brachistochrone problem is a function that minimizes the time to travel from start to end. Take the solution of the problem, and divide it in two sections. Each subsection of the solution has the same property as the global solution: it is minimal. You can continue subdividing indefinitely; the property of being minimal carries over indefinitely, so it extends to infinitesimally short subdivisions. This connects variational and differential calculus.

The above reasoning is a proof of existence:
If you can state a problem in a variational form (fixed start and end points, varying in between), and the solution is an extremum (minimum or maximum), then the solution of that problem can also be found with a differential equation.

I have used the brachistochrone problem as an example; this reasoning generalizes to all cases, and the extremum can be either a maximum or a minimum.

The Euler-Lagrange equation
With the above in place I can turn to the Euler-Lagrange equation. The Euler-Lagrange equation (a differential equation) accepts any problem stated in variational form, and transforms it to a problem stated in terms of differential calculus.

I recommend the derivation of the Euler-Lagrange equation by Preetum Nakkiran. Preetum Nakkiran points out that since the equation expresses a local condition it should be possible to derive it using local reasoning only.

This derivation with local reasoning only has the following advantage: all of the steps have an intuitive meaning.

The derivation that you encountered in your learning material, with global variation of the trial trajectory, is unnecessarily elaborate.

Classical mechanics

In terms of Lagrangian mechanics the true trajectory is the one trajectory that among the range of all trial trajectories has an extremum of the action.

The diagram (an animated GIF) shows a sequence of 7 frames, each shown for 3 seconds.
The sequence demonstrates the case of uniform acceleration.

Black curve: the trial trajectory
Red curve: kinetic energy
Green curve: minus potential energy

Note that in order to demonstrate the concept of Action the curve for the potential energy is upside down; it's the minus potential energy.

As the trial trajectory is varied: when the trial trajectory hits the true trajectory the red curve and the green curve are parallel everywhere. That is, this method uses the work-energy theorem to identify the true trajectory.

The lower-right quadrant shows the two integrals that together make up the action of classical Lagrangian mechanics
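
The same idea can be tried numerically. The following is a minimal sketch under assumptions of my own choosing, not a reproduction of the animation: a mass in uniform gravity with $L = \tfrac12 m v^2 - m g x$, endpoints $x(0) = x(T) = 0$, and a one-parameter family of parabolic trial trajectories $x(t) = a\,t\,(T - t)$. The values of `M`, `G` and `T` are illustrative.

```python
M, G, T = 1.0, 9.81, 1.0  # assumed mass, gravitational acceleration, duration

def action(a, steps=2000):
    """Action of the trial trajectory x(t) = a*t*(T - t), midpoint rule."""
    h = T / steps
    total = 0.0
    for i in range(steps):
        t = (i + 0.5) * h
        x = a * t * (T - t)
        v = a * (T - 2.0 * t)
        kinetic = 0.5 * M * v * v
        minus_potential = -M * G * x
        total += (kinetic + minus_potential) * h
    return total

def stationary_amplitude():
    # Within this family S(a) is a quadratic polynomial c2*a^2 + c1*a + c0,
    # so three samples fix its coefficients; the vertex is where dS/da = 0.
    s0, s1, s2 = action(0.0), action(1.0), action(2.0)
    c2 = (s2 - 2.0 * s1 + s0) / 2.0
    c1 = s1 - s0 - c2
    return -c1 / (2.0 * c2)

a_star = stationary_amplitude()
print("stationary amplitude:", a_star, " compare G/2 =", G / 2.0)
print("action there and at nearby amplitudes:",
      action(a_star - 1.0), action(a_star), action(a_star + 1.0))
```

The amplitude at which the action is stationary should come out close to $g/2$, which is the amplitude of the trajectory with uniform acceleration $\ddot x = -g$ through the same endpoints. Within this particular family the stationary point happens to be a minimum.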
