57

The Euler-Lagrange equation gives the equations of motion of a system with Lagrangian $L$. Let $q^\alpha$ represent the generalized coordinates of a configuration manifold, and let $t$ represent time. The Lagrangian is a function of the state of a particle, i.e. the particle's position $q^\alpha$ and velocity $\dot q^\alpha$. The Euler-Lagrange equation is

$$ \frac{d}{dt} \frac{\partial L}{\partial \dot q^\alpha } = \frac{\partial L}{\partial q^\alpha}$$

Why is this a law of physics and not a simple triviality for any function $L$ on the variables $q^\alpha$ and $\dot q^\alpha$? The following "proof" of the Lagrange Equation uses no physics, and seems to suggest that the Lagrange Equation is simply a mathematical fact that works for every function.

$$\begin{align} \frac{d}{dt} \frac{\partial L}{\partial \dot q^\alpha} & = \frac{\partial}{\partial \dot q^\alpha} \frac{dL}{dt} &\text{commutativity of derivatives} \\ \ \\ &= \frac{\partial \dot L}{\partial \dot q^\alpha} \\ \ \\ &= \frac{\partial L}{\partial q^\alpha} & \text{cancellation of dots} \end{align}$$

This can't be right, or else nobody would give a hoot about this equation and it would be totally useless to solve any problem. What is wrong with the logical reasoning above?

Qmechanic
  • 201,751
Trevor Kafka
  • 1,816
  • You can require the EL equation for any functional. However, your thesis that it is a general identity is wrong. Where did you get the ideas for step 1 and 3 ? – my2cts Sep 16 '18 at 11:42
  • Several comments which were answering the question have been deleted. Please keep in mind that comments are meant to be used for requesting clarifications or suggesting improvements to the parent post, not for answering the question. – David Z Sep 17 '18 at 22:35
  • Actually, deriving the EL equations involves one of the most important principles of physics, known as the variational principle or Hamilton's principle. This principle, from which the EL equations are derived, states that the path the object will follow, if you let it move freely, is the one for which the variation of the action $S$ vanishes. As a formula, $\delta S = 0$ – Matteo Campagnoli Sep 19 '18 at 14:19

7 Answers

117

Ah, what a tricky mistake you've made there. The problem is that you've simply confused some notions in multivariable calculus. Don't feel bad though; this is generally very poorly explained. Both steps 1 and 3 above are incorrect. Rest assured, the Euler-Lagrange equation is not trivial.

Let's first take a step back. The Lagrangian for a particle moving in one dimension in an external potential energy $V(q)$ is $$ L(q, \dot q) = \frac{1}{2}m \dot q^2 - V(q). $$ This is how most people write it. However, this is very confusing, because clearly $q$ and $\dot q$ are not independent variables. Once $q$ is specified for all times, $\dot q$ is also specified for all times.

A better way to write the above Lagrangian might be $$ L(a, b) = \frac{1}{2}m b^2 - V(a). $$ Here we've exposed the Lagrangian for what it really is: a function that takes in two numbers and outputs a real number. Likewise, we can clearly see that $$ \frac{\partial L}{\partial a} = -V'(a) \hspace{1cm} \frac{\partial L}{\partial b} = m b. $$ Usually, most people write this as $$ \frac{\partial L}{\partial q} = -V'(q) \hspace{1cm} \frac{\partial L}{\partial \dot q} = m \dot q. $$ However, $q$ and $\dot q$ must be understood as independent variables in order to do this correctly. Just as $a$ and $b$ were independent variables, $q$ and $\dot q$ are too when they're being put into the Lagrangian. In other words, we could put any two numbers into $L$; we just decided to put in $q$ and $\dot q$.

Furthermore, let's look at the total time derivative $\frac{d}{dt}$. How should we understand the following expression? $$ \frac{d}{dt} L(q(t), \dot q(t)) $$ Both $q$ and $\dot q$ are functions of time. Therefore, $L(q(t), \dot q(t))$ depends on time simply because $q(t)$ and $\dot q(t)$ do. Therefore, in order to evaluate the above expression, we need to use the chain rule in multivariable calculus. $$ \frac{d}{dt} L(q(t), \dot q(t)) = \frac{dq}{dt} \frac{\partial L}{\partial a}(q(t), \dot q(t)) + \frac{d \dot q}{dt} \frac{\partial L}{\partial b}(q(t), \dot q(t)) = \dot q(t) \frac{\partial L}{\partial a}(q(t), \dot q(t)) + \ddot q(t) \frac{\partial L}{\partial b}(q(t), \dot q(t)) $$

In the above expression, I once again used $a$ and $b$ in order to make my point clearer. We need to take partial derivatives of $L$ assuming $a$ and $b$ are independent variables. AFTER differentiating, we THEN evaluate $\partial L / \partial a$ and $\partial L / \partial b$ by plugging $(q, \dot q)$ into the $(a,b)$ slots. This is just like how in single variable calculus, if you have $$ f(x) = x^2 $$ and you want to find $f'(3)$, you first differentiate $f(x)$ while keeping $x$ an unspecified variable, and THEN plug in $x = 3$.

In your first step, the derivatives DON'T commute because $t$ and $q$ are not independent. ($q$ depends on $t$.) Yes, partial derivatives commute, but ONLY if the variables are independent. In your third step, you can't "cancel the dots" because $L$ depends on two inputs. If $L$ only depended on $q$, then yes, you could "cancel the dots" (as this is equivalent to the chain rule in single variable calculus), but it doesn't, so you can't.

EDIT: You can see for yourself that the Euler-Lagrange equation is not identically $0$. If you take the Lagrangian $L(q, \dot q)$ I've written above and plug it into the Euler-Lagrange equation, you get $$ m \ddot q(t) + V'(q(t)) = 0. $$ This is not the same as $0 = 0$. It is a condition that a path $q(t)$ would have to satisfy in order to extremize the action. If it were $0 = 0$, then all paths would extremize the action.
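As a rough numerical illustration of this point, here is a minimal Python sketch. It assumes a harmonic potential $V(q) = \frac{1}{2} k q^2$ and made-up values for $m$ and $k$; the function and variable names are hypothetical, chosen only for this example. It evaluates the left-hand side $m \ddot q + V'(q)$ along a path that solves the equation of motion and along an arbitrary path that does not.

```python
import math

# Illustrative values (assumptions, not taken from the question).
m, k = 2.0, 8.0
omega = math.sqrt(k / m)

def dV(q):
    """V'(q) for the assumed potential V(q) = k q^2 / 2."""
    return k * q

def el_residual(q, qddot, t):
    """Value of m*q''(t) + V'(q(t)) for a path given by q and its second derivative."""
    return m * qddot(t) + dV(q(t))

# A path that solves the equation of motion: q(t) = cos(omega t).
q_true = lambda t: math.cos(omega * t)
qdd_true = lambda t: -omega**2 * math.cos(omega * t)

# An arbitrary path that does not: q(t) = t^2.
q_bad = lambda t: t**2
qdd_bad = lambda t: 2.0

times = [0.0, 0.3, 1.0, 2.5]
res_true = [el_residual(q_true, qdd_true, t) for t in times]
res_bad = [el_residual(q_bad, qdd_bad, t) for t in times]

print("residual on the solution:", res_true)
print("residual on q = t^2:     ", res_bad)
```

The residual vanishes (up to rounding) only on the first path, so the equation really does single out some paths and reject others.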

EDIT: As Arthur points out, this is also a good time to discuss the difference between $dL / dt$ and $\partial L / \partial t$. If we have a time dependent Lagrangian, $$ L(q, \dot q, t) $$ then $L$ can depend on $t$ explicitly, as opposed to just through $q$ and $\dot q$. So, for example, whereas the Lagrangian for a particle in a constant gravitational field $g$ is $$ L(a,b) = \frac{1}{2} mb^2 - m g a, $$ if we allow $L$ to depend on $t$ explicitly, we could have the gravitational field get stronger as time goes on: $$ L(a,b,t) = \frac{1}{2} mb^2 - m ( C t )a. $$ ($C$ is a constant such that $Ct$ has the same units as $g$.)

The quantity $$ \frac{\partial}{\partial t} L(a, b, t) $$ should be understood as differentiating the "$t$-slot" of $L$. In the above example, we would have $$ \frac{\partial}{\partial t} L(a,b,t) = - m C a. $$ The quantity $$ \frac{d}{d t} L(q(t), \dot q(t), t) $$ should be understood as the full time derivative of $L$ due to the fact that $q$ and $\dot q$ also depend on $t$. For the above example, \begin{align*} \frac{d}{d t} L(q(t), \dot q(t), t) &= \dot q(t) \frac{\partial L}{\partial a}(q(t), \dot q(t),t) + \ddot q(t) \frac{\partial L}{\partial b}(q(t), \dot q(t),t) + \frac{\partial L}{\partial t} (q(t), \dot q(t), t) \\ &= (\dot q) (-mC t ) + \ddot q(t) (m \dot q(t)) - mC q(t) \end{align*}
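If you want to convince yourself of the difference numerically, the following Python sketch might help. It takes the time-dependent Lagrangian above with made-up values for $m$ and $C$, picks the path $q(t) = \sin t$ purely for illustration, and compares the chain-rule expression for $dL/dt$ with a finite-difference derivative of $t \mapsto L(q(t), \dot q(t), t)$. All function names are hypothetical.

```python
import math

# Illustrative constants (assumptions for this sketch).
m, C = 1.5, 0.7

def L(a, b, t):
    """L(a, b, t) = m b^2 / 2 - m (C t) a, a plain function of three numbers."""
    return 0.5 * m * b**2 - m * C * t * a

# Partial derivatives with respect to each slot, computed by hand.
def dL_da(a, b, t): return -m * C * t
def dL_db(a, b, t): return m * b
def dL_dt_partial(a, b, t): return -m * C * a

# An arbitrary illustrative path q(t) = sin t and its derivatives.
q = math.sin
qdot = math.cos
def qddot(t): return -math.sin(t)

def total_derivative(t):
    """dL/dt along the path, via the multivariable chain rule."""
    a, b = q(t), qdot(t)
    return (qdot(t) * dL_da(a, b, t)
            + qddot(t) * dL_db(a, b, t)
            + dL_dt_partial(a, b, t))

def numerical_total_derivative(t, h=1e-5):
    """Central difference of the single-variable function t -> L(q(t), q'(t), t)."""
    f = lambda s: L(q(s), qdot(s), s)
    return (f(t + h) - f(t - h)) / (2 * h)

t0 = 0.8
print("chain rule dL/dt :", total_derivative(t0))
print("numerical dL/dt  :", numerical_total_derivative(t0))
print("partial dL/dt    :", dL_dt_partial(q(t0), qdot(t0), t0))
```

The first two numbers agree to within the finite-difference error, while the partial derivative is a different number because it ignores the dependence on $t$ that enters through $q(t)$ and $\dot q(t)$.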

user1379857
  • 11,439
  • 13
    Thanks for a comprehensive answer. We need more of those on the Stack Exchange Network. – Peter Mortensen Sep 16 '18 at 23:33
  • 14
    The Euler-Lagrange equations were the first time I properly appreciated the difference between $\partial$ and $\mathrm d$ in derivatives. For instance, given a function $L(t, q, \dot q)$, the expression $\frac{\partial L}{\partial t}$ means "Differentiate the multivariable function $L(t, q, \dot q)$ with respect to the first variable", while $\frac{\mathrm dL}{\mathrm dt}$ means "Differentiate the single variable function $L(t, q(t), \dot q(t))$ with respect to the variable $t$". – Arthur Sep 17 '18 at 07:58
  • 4
    +1 I'm a big fan of writing $L(a,b)$ to emphasize that $L$ just depends on two variables. But maybe it would help to emphasize a tiny detail: that when calculating the action (and hence finding the E-L equations), we plug in functions of time $q(t)$ and $\dot{q}(t)$ for $a$ and $b$. Otherwise it seems like the Lagrangian is a function of time and yet, at the same time, it isn't. – Javier Sep 17 '18 at 15:34
21
  1. The commutator $$\left[\frac{\partial}{\partial \dot{q}^j},\frac{\mathrm d}{\mathrm d t}\right]~\stackrel{(2)}{=}~\frac{\partial}{\partial q^j}\tag{1}$$ of a velocity derivative $\frac{\partial}{\partial \dot{q}^j}$ with the total time derivative $$\frac{\mathrm d}{\mathrm d t} ~=~\frac{\partial}{\partial t} +\dot{q}^j\frac{\partial}{\partial q^j} +\ddot{q}^j\frac{\partial}{\partial \dot{q}^j} +\dddot{q}^j\frac{\partial}{\partial \ddot{q}^j} +\ldots \tag{2}$$ is not zero. See also e.g. this related Math.SE post & this related Phys.SE post.

  2. The cancellation of dots $$\frac{\partial \dot{L}}{\partial \dot{q}^j}~=~\frac{\partial L}{\partial q^j}\tag{3}$$ works for functions $L(q,t)$ that don't depend on velocities $\dot{q}^k$. But a Lagrangian typically depends on velocities. See also this related Phys.SE post.

  3. Note the following algebraic Poincare lemma: $$L\text{ satisfies the Euler-Lagrange (EL) eqs. identically }$$ $$\quad\Updownarrow\quad\tag{4}$$ $$L\text{ is a total time derivative}$$ (modulo possible topological obstructions). For details, see e.g. this & this Phys.SE posts.
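As a sketch of where the commutator (1) comes from, take a function $f(q,\dot q,t)$ with no dependence on $\ddot q$ or higher derivatives (an assumption made here only to keep the sum in (2) short). Treating $q$, $\dot q$, $\ddot q$ as independent slots,

$$\begin{align} \frac{\partial}{\partial \dot{q}^j}\frac{\mathrm d f}{\mathrm d t} &= \frac{\partial}{\partial \dot{q}^j}\left(\frac{\partial f}{\partial t} +\dot{q}^k\frac{\partial f}{\partial q^k} +\ddot{q}^k\frac{\partial f}{\partial \dot{q}^k}\right) \\ &= \frac{\partial f}{\partial q^j} + \left(\frac{\partial}{\partial t} +\dot{q}^k\frac{\partial}{\partial q^k} +\ddot{q}^k\frac{\partial}{\partial \dot{q}^k}\right)\frac{\partial f}{\partial \dot{q}^j} \\ &= \frac{\partial f}{\partial q^j} + \frac{\mathrm d}{\mathrm d t}\frac{\partial f}{\partial \dot{q}^j}. \end{align}$$

The extra term $\partial f/\partial q^j$ appears because $\partial/\partial \dot q^j$ also hits the explicit factor $\dot q^k$ in (2). For instance, with $L=\frac{1}{2}m\dot q^2 - V(q)$ one finds $\frac{\partial}{\partial \dot q}\frac{\mathrm d L}{\mathrm d t} = m\ddot q - V'(q)$, whereas $\frac{\mathrm d}{\mathrm d t}\frac{\partial L}{\partial \dot q} = m\ddot q$; the two differ by exactly $\frac{\partial L}{\partial q} = -V'(q)$.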

Qmechanic
  • 201,751
  • I liked your answer the best; it was short, concise, and directly to the point. I wish I could have written it. – Michael Sep 16 '18 at 22:55
8

So, in principle one can choose essentially $\it{any}$ Lagrangian $\mathcal{L}$ with suitably chosen coordinates (and possibly constraints), and apply variational calculus to it via the Euler-Lagrange equations. The equations of motion that this produces may or may not correspond to an understandable model of reality. There are lots of Lagrangians that (seemingly) don't correspond to reality. The Lagrangians that produce physical models have usually been found by guess-and-check and consultation with experiment/observation.

why is this a fundamental law of physics and not a simple triviality of ANY function L on the variables $q$ and $\dot{q}$?

The Euler-Lagrange formalism is not a "fundamental law of physics." Rather, it is a differential equation (or a set of them; ordinary for particle mechanics, partial for field theories) whose solutions make a particular functional stationary, meaning the solutions obey the principle of stationary action. This mathematical concept was actually generalized in control theory by Pontryagin's maximum principle. The laws of physics are derivable through the Euler-Lagrange method, but the method is not fundamental, similar to how the particular geometry chosen is not fundamental (par. 17) for deriving physical laws. Physicists use math to model reality, so of course we're going to use the things that work! For instance, Einstein derived his field equations heuristically, but Hilbert derived them (around the same time) from the action principle by guessing the correct $\mathcal{L}$. But nowadays, almost everyone who works with general relativity or modified gravity starts from $\mathcal{L}$ and uses the action principle (except in cosmology, where they typically start from the metric itself).

It is not entirely surprising that since we are natural creatures which evolved to understand patterns of our environment, the tools we create - especially the abstract ones like math - might have some correspondence with reality. Eugene Wigner wrote a very nice essay about this topic, called "The Unreasonable Effectiveness of Mathematics in the Natural Sciences," in which he argues that it is obvious that math works so well at modeling reality, but it's not at all obvious why this works.

"Why" questions are very difficult to answer, and this one is especially difficult. Some Lagrangians work at producing physical models, and some don't, and maybe the E-L equations work as a filter for figuring that out since it can be used to make testable predictions.

@AccidentalFourierTransform already clarified your mathematical errors, so I will not.

  • I'm not following your argument that the equations aren't a law of physics. With typical definitions, are the equations not perfectly equivalent to Newton's 2nd law, which is unambiguously a law? Not wanting to get into a debate about the definition of "physical law", but some clarification here could be useful. – aquirdturtle Sep 16 '18 at 21:43
  • Of course, as you said, the very definition of "physical law" is up for debate, but going by the wiki article I linked, it says "Physical laws are typically conclusions based on repeated scientific experiments and observations over many years and which have become accepted universally within the scientific community." The E-L equations are a mathematical formalism - specifically the PDE you solve to extremize an action. The Action Principle is not a law, but it is a theoretical principle that produces very useful models, similar to other principles, i.e. the principle of relativity. This help? – Daddy Kropotkin Sep 16 '18 at 22:32
  • So, to drive the point home, Newton's Laws of Motion are empirically verified, whereas the E-L equations are a method by which to derive those Laws of Motion. Newton's Laws can be taken as axioms, or they can be derived, but we call them "laws" because they are empirical. – Daddy Kropotkin Sep 16 '18 at 22:33
3

Your question: ''Why is the Lagrange equation not a triviality? What is wrong with my calculation?''.

First some notation. Using the unambiguous notation from SICM, the Lagrange equations are:

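$$\mathrm{D}((\partial_2 L) ∘ Γ[q]) − (\partial_1 L) ∘ Γ[q] = 0$$ (where $\mathrm{D}$ is the total derivative (it corresponds to the time derivative), and $Γ[q] = (t, q, \mathrm{D}q, ...)$ is the functional that provides the time, the path, and its derivative(s); slots are counted from zero, so $\partial_1$ and $\partial_2$ act on the position and velocity slots of $L$.)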

(If you wonder what is wrong with traditional notation, then I recommend reading the preface of SICM which addresses this, but basically it is exactly such confusions as this question is about.)

Trying to rewrite your calculation using the unambiguous notation from SICM immediately reveals some problems:

Impossible to simply commute derivatives: Neither $$\mathrm{D}((\partial_2 L) ∘ Γ[q]) \neq \partial_2((\mathrm{D} L) ∘ Γ[q])$$ nor $$\mathrm{D}((\partial_2 L) ∘ Γ[q]) \neq \partial_2\mathrm{D} (L ∘ Γ[q])$$ make any sense.

Impossible to cancel dots: $$\partial_2\mathrm{D} (L ∘ Γ[q]) \neq \partial_1 (L ∘ Γ[q])$$ both left and right look pretty non-sensical.

Then you need to do $$\partial_1 (L ∘ Γ[q]) = (\partial_1 L) ∘ Γ[q]$$ to reconstruct a sane expression.

Thus no step in your proof is warranted.
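To make the notation a bit more concrete, here is a rough Python sketch (not SICM's own Scheme code). It assumes the SICM convention that $L$ takes the tuple $(t, q, v)$ with slots counted from zero, and it approximates every derivative by a central difference. The names `D`, `partial`, `Gamma`, `compose` and `lagrange_residual` are hypothetical helpers written for this illustration, and the harmonic-oscillator Lagrangian with $m = 1$, $k = 4$ is just an example.

```python
import math

def D(f, h=1e-4):
    """Derivative of a one-argument function, by central difference."""
    return lambda t: (f(t + h) - f(t - h)) / (2 * h)

def partial(i, F, h=1e-4):
    """Derivative of F(t, q, v) with respect to slot i (slots counted from 0)."""
    def dF(*args):
        up, dn = list(args), list(args)
        up[i] += h
        dn[i] -= h
        return (F(*up) - F(*dn)) / (2 * h)
    return dF

def Gamma(q):
    """Gamma[q](t) = (t, q(t), Dq(t)): the tuple fed into L along the path q."""
    return lambda t: (t, q(t), D(q)(t))

def compose(F, g):
    """(F o g)(t) = F(*g(t)) for a tuple-valued g."""
    return lambda t: F(*g(t))

def lagrange_residual(L, q):
    """D((partial_2 L) o Gamma[q]) - (partial_1 L) o Gamma[q], as a function of t."""
    lhs = D(compose(partial(2, L), Gamma(q)))
    rhs = compose(partial(1, L), Gamma(q))
    return lambda t: lhs(t) - rhs(t)

# Example Lagrangian (an assumption for illustration): harmonic oscillator.
m, k = 1.0, 4.0
def L_osc(t, q, v):
    return 0.5 * m * v**2 - 0.5 * k * q**2

# q(t) = cos(2t) solves m q'' + k q = 0; q(t) = t^2 does not.
r_solution = lagrange_residual(L_osc, lambda t: math.cos(2.0 * t))(0.3)
r_other = lagrange_residual(L_osc, lambda t: t**2)(0.3)
print(r_solution, r_other)
```

Note that `partial` only ever acts on `L` itself, a function of plain numbers, while `D` only ever acts on functions of time. Expressions like "$\partial_2$ of something that has already been composed with $Γ[q]$" cannot even be written down here, which is the point made above.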

hkBst
  • 138
  • 1
    This notation seems a bit confusing. It doesn't suggest the variables explicitly (1 and 2 could be anything). Is the ∘ functional composition? Does the $\Gamma [q]$ represent an inner product? As far as I can tell, its main advantage is ease of use with a particular programming language, but it might be a little unfamiliar. The potential for confusion with the Christoffel symbols or the gamma function exists, too. Could you explain it a little more fully in your answer? – Obie 2.0 Sep 17 '18 at 07:31
  • 1
    @Obie2.0, I recommend the preface of SICM to explain the rationale for the different notation, but I will try to explain a little better in my answer. – hkBst Sep 17 '18 at 07:49
  • +1, this notation makes it so much better! – R.. GitHub STOP HELPING ICE Sep 18 '18 at 04:19
1

That's an interesting sequence of symbolic manipulations!

It's because of the lack of rigour that it's easy to fall into these pitfalls, and physics texts typically don't go into where they are or why and how to avoid them. It's a skill that one picks up by doing problems, going through the theory and reading around.

Similar problems are associated with the path integral, which has no rigorous definition. The variational calculus, however, can be made rigorous, although this is difficult. It's typically not touched upon in an undergraduate mathematics course, where they will rigorously define calculus for one real variable, for one complex variable and for many real variables - either calculus on a manifold or, more typically, multi-variable calculus, which is calculus in a (finite-dimensional) vector space.

To make the mathematics of this rigorous requires the apparatus of jet bundles. You can find an exposition in Saunders' Jet Bundles and Michor's Natural Operations. It takes quite some development.

Mozibur Ullah
  • 12,994
1

N. Steinle already gave a great answer on the question

why is this a fundamental law of physics and not a simple triviality of ANY function L

but I would like to point out an additional tidbit regarding the part

.. seems to suggest that the Lagrange Equation is simply a mathematical fact that works for every function.

While the Lagrange equations mathematically really only describe a function/process that is an extremal value of some Lagrangian (or also some energy or action potential), the important part is that the converse is not as simple.

It seems to be "a fundamental law of physics" that many processes that we observe in nature even have a Lagrangian, an energy potential. This is actually not trivial: not every multidimensional function has such a potential, and it is a statement about the symmetry of these processes.

Aganju
  • 621
1

This is not about "what's wrong" but about how you could figure out what's wrong (or at least find something that's wrong in your attempted proof). Take a nice simple Lagrangian, like that for a free particle in one dimension: $L=\frac m2(\dot q)^2$ (where $q$ represents distance). And take some motion that is not correct in that physical situation, like uniform acceleration $q=at^2$, where $a$ is a non-zero constant. From $L$, you get the Euler-Lagrange equation $\frac d{dt}(m\dot q)=0$ (because $\partial L/\partial\dot q=m\dot q$ and $\partial L/\partial q=0$), i.e., you get conservation of momentum. On the other hand, from $q=at^2$, you get $\frac d{dt}(m\dot q)=m\ddot q=2ma$ (assuming the mass $m$ is constant). So the Euler-Lagrange equation is violated. That already shows that the Euler-Lagrange equation cannot be "simply a mathematical fact that works for every function." But you can get more information by plugging this particular $L$ and this particular $q(t)$ into your attempted proof, to see exactly which of your equations in that proof fail.
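As a sketch of what that plug-in check might look like (treating $q$, $\dot q$, $\ddot q$ as independent slots when taking partial derivatives, and only afterwards substituting $q=at^2$): with $L=\frac m2(\dot q)^2$ one has $\dot L = m\dot q\ddot q$, so

$$\begin{align} \frac{d}{dt}\frac{\partial L}{\partial \dot q} &= m\ddot q = 2ma, \\ \frac{\partial \dot L}{\partial \dot q} &= m\ddot q = 2ma, \\ \frac{\partial L}{\partial q} &= 0. \end{align}$$

For this particular $L$, the "commutativity" step happens to hold (because $L$ has no $q$-dependence), while the "cancellation of dots" step would claim $2ma = 0$, which fails for $a \neq 0$. A Lagrangian that does depend on $q$ would be needed to see the first step fail as well.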

  • Hi Andreas! Wouldn't this be more suited as a comment? I am not entirely sure, just a suggestion. –  Sep 16 '18 at 20:08
  • 1
    @DvijMankad I agree with you so much that I began writing this as a comment, but I think commenting requires having earned 100 reputation on this site (not by having been active on other stackexchange sites). So if anyone with the authority to do so moves this to a comment, I won't mind at all. – Andreas Blass Sep 16 '18 at 20:14
  • Oh, I see! I don't think that could be done. I think it can work perfectly well as a ''supplement answer'' since the body of the text clearly mentions what it sets out to do. Welcome to Physics.SE! :-) –  Sep 16 '18 at 20:23