The dynamical variables in Lagrangian formalism

Question

Goldstein's Classical Mechanics states that, in the Lagrangian formalism, the independent dynamical variables are $q$ and $t$. That is why we represent the state of a system in the Lagrangian formalism by a point in configuration space. But throughout the calculations we also treat $\dot{q}$ as an independent variable, for example when working with the Euler-Lagrange equations. Goldstein also mentions that $\dot{q}$ is treated as an independent variable mathematically, but that otherwise it is not. How can a mathematically independent quantity not be regarded as independent when describing the dynamics of the system, for instance when specifying its state?

This question (v2) is essentially a duplicate of Calculus of variations — how does it make sense to vary the position and the velocity independently? — Qmechanic, Oct 31 '17 at 05:39

J. Murray · Accepted Answer · 2020-05-25T12:22:56.233

We do not treat $\dot q$ as an independent variable in the derivation of the Euler-Lagrange equations. The rough answer is that $q$ and $\dot q$ are independent as inputs to the Lagrangian, but become linked once we specify a path through configuration space - I expand on this in points 5 and 6.

I'll be quite formal in what follows, but perhaps the formality will be somewhat enlightening. First, a few preliminaries:

1: The state of an $N$-dimensional system consists of a point $$q\equiv(q_1,q_2,\ldots,q_N)\in \bf{Q}$$ where $\bf{Q}$ is called the configuration space corresponding to the system, and $q_i$ is the $i^{th}$ generalized coordinate.

2: A curve $\gamma$ through the configuration space is a map $$ \gamma : \mathbb{R} \rightarrow \bf{Q}$$ $$ t \mapsto \gamma(t)=\big(q_1(t),q_2(t),\ldots,q_N(t)\big)\equiv q_\gamma(t)$$ The curve is therefore parameterized by $t$, which we call the time. This describes how the state of the system evolves. Note that we will demand that $\gamma$ be at least twice differentiable.

3: At every point along $\gamma$, there exists a unique tangent vector $V_\gamma(t)$ given as follows: $$ V_\gamma : \mathbb{R} \rightarrow \bf{T_qQ}$$ $$ t \mapsto V_\gamma(t)=\big(\dot q_1(t),\dot q_2(t),\ldots,\dot q_N(t)\big)\equiv \dot q_\gamma(t)$$ $\mathbf{T_qQ}$ is called the tangent space to $\bf{Q}$ at the point $q$. I won't bother defining this rigorously, but the intuitive notion of a tangent space should be familiar if you're picking up Goldstein.

4: The disjoint union of all of the tangent spaces of $\bf{Q}$ is called the tangent bundle to $\bf{Q}$, and is denoted $\bf{TQ}$: $$ \bf{TQ} = \underset{q\in\bf{Q}}{\sqcup}\bf{T_qQ}$$ If $(q,v)$ is an element of the tangent bundle $\bf{TQ}$, then that means that $v$ is a tangent vector to some curve passing through the point $q$.

5: The Lagrangian is a function which takes three (or two, depending on your point of view) inputs - a point $(q,v) \in\bf{TQ}$, and a real number $t\in \mathbb{R}$ - and maps them to a real number: $$ L : \bf{TQ} \times \mathbb{R} \rightarrow \mathbb{R}$$ $$ (q,v, t) \mapsto L(q,v, t)$$ A crucial point is that $q$ does not determine $v$ - as far as $L$ is concerned, $q$ is just some point in $\bf{Q}$ and $v$ is the tangent vector to one of the infinity of curves that passes through $q$.

6: The action functional $S$ maps a curve $\gamma$ to a real number in the following way: $$S[\gamma] = \int L\big(q_\gamma(t),\dot q_\gamma(t), t\big) dt $$ To reiterate the above point, the Lagrangian has three slots - one for a point in configuration space, one for a tangent vector, and one for a real number. As far as $L$ is concerned, these three slots are independent, so we can take partial derivatives at our leisure.

When we execute the action functional, we walk along the curve $\gamma$. At each $t$, we feed $\gamma(t)\equiv q_\gamma(t)$ into the first slot, $V_\gamma(t) \equiv \dot q_\gamma(t)$ into the second slot, and $t$ into the third slot. But it can't be emphasized enough that the Lagrangian itself has no idea that the three inputs have anything whatsoever to do with one another.
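The distinction between the Lagrangian and the action can be sketched numerically. The snippet below is only an illustration under assumptions that are not part of the answer: the harmonic-oscillator Lagrangian with $m=k=1$, the curve $q(t)=\cos t$, and the helper names `lagrangian`, `partial` and `action` are all hypothetical choices.

```python
import math

def lagrangian(q, v, t, m=1.0, k=1.0):
    """Assumed example: harmonic-oscillator Lagrangian with three independent slots."""
    return 0.5 * m * v**2 - 0.5 * k * q**2

def partial(f, slot, args, h=1e-6):
    """Central-difference partial derivative of f in one slot, other slots held fixed."""
    lo = list(args)
    hi = list(args)
    lo[slot] -= h
    hi[slot] += h
    return (f(*hi) - f(*lo)) / (2 * h)

def action(q_of_t, v_of_t, t_a, t_b, steps=2000):
    """Midpoint-rule approximation of S = integral of L(q(t), q'(t), t) dt."""
    dt = (t_b - t_a) / steps
    total = 0.0
    for i in range(steps):
        t = t_a + (i + 0.5) * dt
        total += lagrangian(q_of_t(t), v_of_t(t), t) * dt
    return total

# L accepts any triple (q, v, t); nothing forces v to be the derivative of anything.
dL_dq = partial(lagrangian, 0, (2.0, -3.0, 0.0))  # analytically -k*q
dL_dv = partial(lagrangian, 1, (2.0, -3.0, 0.0))  # analytically  m*v

# Only the action ties the slots together, by choosing a curve and its tangent.
S = action(math.cos, lambda t: -math.sin(t), 0.0, 1.0)
```

The partial derivatives are evaluated at an arbitrary pair $(q,v)=(2,-3)$ that belongs to no particular curve, whereas `action` is the only place where the second slot is filled with the derivative of what goes into the first.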

Now that that's out of the way, we can get down to business. We seek some $\gamma$ for which the action functional is stationary. Intuitively, we think "take the derivative and set it to zero," but at this stage it's not really clear how to take a derivative with respect to a curve.

Instead, we'll do the following. Denote the correct (but unknown) curve $\gamma_c$. Then a general curve $\gamma$ can be written as the "sum" of $\gamma_c$ and some "error" $\eta$ which vanishes at the endpoints of the integral, and where the sum is defined component-wise. In other words, at some time $t$,

$$q_\gamma(t) = q_c(t)+\epsilon\eta(t) \equiv \big(q_{c1}(t)+\epsilon\cdot\eta_1(t),q_{c2}(t)+\epsilon\cdot\eta_2(t),\ldots,q_{cN}(t)+\epsilon\cdot\eta_N(t)\big)$$

while the tangent vector (also called the generalized velocity) becomes $$\dot q_\gamma(t) = \dot q_c(t)+\epsilon\cdot \eta'(t)\equiv \big(\dot q_{c 1}(t)+\epsilon\cdot\eta_1'(t),\dot q_{c 2}(t)+\epsilon\cdot\eta_2'(t),\ldots,\dot q_{cN}(t)+\epsilon\cdot\eta_N'(t)\big)$$

where $\epsilon\in \mathbb{R}$. Rather than worry about the details of functional derivatives, we can seek a path $\gamma$ which makes the action integral stationary with respect to changes in $\epsilon$: $$ \frac{dS[\gamma]}{d\epsilon} = 0$$

The action functional becomes $$S[\gamma] = \int_A^B L\big(q_\gamma(t),\dot q_\gamma(t),t\big) dt=\int_A^B L\big(q_c(t)+\epsilon\cdot\eta(t),\dot q_c(t)+\epsilon\cdot\eta'(t),t\big) dt$$

Differentiating with respect to $\epsilon$ gives $$\frac{dS[\gamma]}{d\epsilon} = \int_A^B\sum_{i=1}^N\left[ \frac{\partial L}{\partial q_{\gamma i}}\eta_i(t) + \frac{\partial L}{\partial \dot q_{\gamma i}} \eta_i'(t) \right]dt$$

We now recognize that

$$ \frac{\partial L}{\partial \dot q_{\gamma i}} \eta_i' (t) = \left[\frac{\partial L}{\partial \dot q_{\gamma i}} \eta_i (t)\right]' - \left(\frac{d}{dt}\frac{\partial L}{\partial \dot q_{\gamma i}}\right) \eta_i (t)$$

and since the boundary term vanishes at the endpoints, we find that

$$\frac{dS[\gamma]}{d\epsilon} = \int_A^B\sum_{i=1}^N\left[\frac{\partial L}{\partial q_{\gamma i}} - \frac{d}{dt}\frac{\partial L}{\partial \dot q_{\gamma i}}\right]\eta_i(t) dt $$

Because this quantity must vanish for any independent set of choices of $\eta_i$, it follows that the integrand must vanish everywhere, and so

$$\frac{d}{dt}\frac{\partial L}{\partial \dot q_{\gamma i}} = \frac{\partial L}{\partial q_{\gamma i}}$$

This gives us the Euler-Lagrange equations which allow us to solve for the proper path in terms of the generalized coordinates $q_{\gamma i}$.
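As a rough numerical check of the stationarity condition $dS[\gamma]/d\epsilon = 0$, one can evaluate the action on the varied curve $q_c+\epsilon\eta$ and differentiate with respect to $\epsilon$ by finite differences. The sketch below assumes the Lagrangian $L=\tfrac12 v^2-\tfrac12 q^2$, the interval $[0,1]$ and the perturbation $\eta(t)=\sin(\pi t)$; these choices and the function names are illustrative only.

```python
import math

def lagrangian(q, v, t):
    # Assumed example: unit mass and unit spring constant.
    return 0.5 * v**2 - 0.5 * q**2

def action(q, v, eta, deta, eps, t_a=0.0, t_b=1.0, steps=2000):
    """Midpoint-rule action of the varied curve q + eps*eta."""
    dt = (t_b - t_a) / steps
    total = 0.0
    for i in range(steps):
        t = t_a + (i + 0.5) * dt
        total += lagrangian(q(t) + eps * eta(t), v(t) + eps * deta(t), t) * dt
    return total

def dS_deps(q, v, eta, deta, h=1e-4):
    """Central-difference estimate of dS/d(eps) at eps = 0."""
    return (action(q, v, eta, deta, h) - action(q, v, eta, deta, -h)) / (2 * h)

# Perturbation that vanishes at both endpoints t = 0 and t = 1.
eta = lambda t: math.sin(math.pi * t)
deta = lambda t: math.pi * math.cos(math.pi * t)

# q(t) = cos(t) satisfies q'' = -q, the Euler-Lagrange equation for this L.
on_shell = dS_deps(math.cos, lambda t: -math.sin(t), eta, deta)

# A straight line with the same endpoints does not satisfy q'' = -q.
slope = math.cos(1.0) - 1.0
off_shell = dS_deps(lambda t: 1.0 + slope * t, lambda t: slope, eta, deta)
```

Up to discretization error, `on_shell` vanishes while `off_shell` does not, which is the content of the statement that the Euler-Lagrange equations single out the stationary curve. Note that in both cases the velocity slot is filled with the derivative of the chosen curve; the independence of the slots is used only when taking the partial derivatives of $L$.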

The specification of a curve, which links the generalized coordinates to the generalized velocities, happens at the level of the action, not at the level of the Lagrangian. As far as $L$ is concerned, $q(t)$ and $\dot q(t)$ have nothing to do with one another and can be chosen completely independently. That's the difference between feeding $L$ the number $q(t)$ as opposed to the function $q$.

Thanks for the answer. Can you suggest a book to read all this material from? — quirkyquark, Oct 31 '17 at 10:51
It's a bit difficult, as I really butchered the idea of tangent spaces to keep from going too deep into differential geometry. I more or less hopped back and forth between the standard elementary treatment of Lagrangian mechanics and the manifold viewpoint, while pretty much anything you find will be either one or the other. The closest thing I can find is "Global Formulations of Lagrangian and Hamiltonian Dynamics on Manifolds" by Lee, Leok, and McClamroch. A copy may be available online if you ask the Almighty Google. — J. Murray, Oct 31 '17 at 14:08
+1 If we want to be formal, I think the "If $\eta$ is small, we can linearize" is a bit inconvenient. But it can be made more rigorous without sacrificing much readability by considering $\gamma = \gamma_c + \epsilon \eta$ where $\epsilon$ is a real number and finding $dS[\gamma]/d\epsilon=0$. — JiK, Oct 31 '17 at 19:22
@JiK You're absolutely right - I cleaned up some notation and made that edit. Thanks. — J. Murray, Nov 01 '17 at 00:03
Instead of using $q$ and $\dot q$ which leads to the supposed confusion, can't we start with something else like $ \mathcal{L} (a,b,t)$ and then arrive at some relation between $a$ and $b$, like $b=\frac{da}{dt}$. Would it make sense to do something like this? — Tachyon209, May 25 '20 at 11:52
@Tachyon209 In principle, yes we could do that. We would need to impose $b= \frac{da}{dt}$ in the derivation of the EL equations to obtain the fact that $\delta b = \frac{d}{dt} \delta a$, but that is no problem. Ultimately, the issue is that once this confusion has been eliminated, the $q,\dot q$ notation is vastly more convenient to put into operational use. It's certainly how I write down Lagrangians, at least. — J. Murray, May 25 '20 at 12:19
@J.Murray do you really mean “independent” in “ We do not treat $\dot{q}$ as an independent”? — ZeroTheHero, May 25 '20 at 12:37
@ZeroTheHero Yes, I do. Do you have a different phrasing in mind? — J. Murray, May 25 '20 at 13:14
Sorry @J.Murray but can you clarify what $\delta b$ and $\delta a$ are and how did you get the relation $\delta b =\frac{d}{dt} \delta a$ ? I mean, how did you get the infinitesimal $\delta$ here? Also, do you know of any resources which might try this method of taking arbitrary $a$ and $b$ because writing $q$ and $\dot q$ assumes beforehand that $\dot q$ is the derivative of $q$. — Tachyon209, May 25 '20 at 13:22
@Tachyon209 I am referring to the derivation of the Euler-Lagrange equations, using the symbol $a$ where I've used $q$, and the symbol $b$ where I've used $\dot q$. — J. Murray, May 25 '20 at 13:26
But I guess you didn't use $\delta$ infinitesimals in your derivation. Again, I'm really sorry if I am wrong, but aren't they different from $d$ infinitesimals? — Tachyon209, May 25 '20 at 13:28
@Tachyon209 At this point, I'd need to re-do the derivation using different notation, and I can't do that in a comment. A derivation of the EL equations using that notation can be found e.g. here — J. Murray, May 25 '20 at 13:30
Yes, Yes. Thank you. I was just confused as to whether you were referring to something else. Sorry for the misunderstanding. — Tachyon209, May 25 '20 at 13:32
@J.Murray It seems it is treated as independent and only becomes the actual velocity (as in tangent to the true path) and thus dependent once you have a solution for that path. Whether or not $\dot{q}$ is really independent of $q$ is a usual sticking point in teaching this stuff. — ZeroTheHero, May 25 '20 at 13:36
I actually just tried to rederive it with $a$ and $b$ rather than $q$ and $\dot q$ to avoid confusion before feeding the Lagrangian into the Euler-Lagrange equation, but it seems like even J. Murray's answer assumes that $\dot q$ is indeed $\frac{d}{dt} q$ to derive the Euler-Lagrange equation. So, honestly, I still think you'll have to relate $q$ and $\dot q$ beforehand. I just started my derivation with $a(t)$ and $b(t)$, hoping to get a path with some particular $a(t)$ and $b(t)$ satisfying $b=\frac{da}{dt}$ which extremizes the action, but it seems like you'll have to assume that condition. — Tachyon209, May 25 '20 at 13:59
@Tachyon209 Yes, of course. The Lagrangian is a function of three independent inputs; when we formulate the action, however, we choose a path $q$, and then we plug $q(t)$ into the first slot, $\dot q(t)$ into the second slot, and $t$ into the third slot. We then integrate over $t$ to obtain the action. — J. Murray, May 25 '20 at 14:25
@ZeroTheHero Yes. Given some function $L$ which takes two inputs, we can linearize it to get a new function which also takes two inputs. The fact that we subsequently choose to plug non-independent quantities into that linearized function is irrelevant to the linearization procedure. — J. Murray, May 25 '20 at 14:29

Valter Moretti · Answer 2 · 2017-10-31T18:34:38.543

The general Lagrangian formalism is developed in a manifold $j^1(E)$ with the structure of a jet bundle constructed out of a fiber bundle $E \to \mathbb R$.

In other words $E$ is locally the product of $Q$ and $\mathbb R$, where $Q$ is a manifold where configurations of the system are described at every time $t \in \mathbb R$.

$E$ is covered by local coordinate patches $t, q^1,\ldots, q^n$ where $t$ is the temporal coordinate over the basis $\mathbb R$ of the fiber bundle $E \to \mathbb R$ and $q^1,\ldots, q^n$ cover the fibers $Q_t$ (diffeomorphic to $Q$).

The first jet extension $j^1(E)$ over $\mathbb R$ enlarges each fiber $Q_t$ by adding a further factor $\mathbb R^n$ covered by jet coordinates $\dot{q}^1,\ldots, \dot{q}^n$, which are independent of the $q^1,\ldots, q^n$ but are identified with $\frac{dq^1}{dt}, \cdots, \frac{dq^n}{dt}$ as soon as a motion $t \mapsto (t, q^1(t), \ldots, q^n(t))$ is given. In other words, $(t, q^1, \ldots, q^n, \dot{q}^1,\ldots, \dot{q}^n)$ fix the kinetic state of the system at time $t$. Here the configuration and the kinetic state are completely independent. The fibers of $j^1(E)$ are therefore $2n$-dimensional manifolds $A_t$, the space of kinetic states at time $t$, diffeomorphic to a canonical fiber $A$ covered by local coordinates $q^1, \ldots, q^n, \dot{q}^1,\ldots, \dot{q}^n$.

In view of this structure, changing local coordinates and passing to $t', q^{'1},\ldots, q^{'n}, \dot{q}^{'1},\ldots, \dot{q}^{'n}$, the relations are $$t' = t+c\tag{1}$$ $$q^{'k} = q^{'k}(t, q^1,\ldots, q^n)\tag{2}$$ $$\dot{q}^{'k} = \frac{\partial q^{'k}}{\partial t} + \sum_{j=1}^n \frac{\partial q^{'k}}{\partial q^j} \dot{q}^j\tag{3}$$ and the inverse relations have the same structure.

You see that the third equation is compatible with the interpretation of $\dot{q}$ as time derivative of $q$. This interpretation is only formal because that derivative cannot be computed when a point $a\in A_t$ is given: to compute the said derivative we would need a curve (a section) passing through $a$, not only $a$ itself.
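As a concrete illustration of (3), consider a one-dimensional system and the time-dependent change of coordinates $q' = q - ut$ with constant $u$ (a Galilean boost, chosen here only as an example): $$\frac{\partial q'}{\partial t} = -u\:,\qquad \frac{\partial q'}{\partial q} = 1 \quad\Longrightarrow\quad \dot{q}' = \dot{q} - u\:.$$ The rule is purely algebraic: it assigns a value of $\dot{q}'$ to every triple $(t, q,\dot{q})$ without any curve being specified, and it agrees with the chain rule as soon as a motion $t\mapsto q(t)$ is given.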

The Euler-Lagrange equations are first-order equations induced by a scalar function ${\cal L} : j^1(E) \to \mathbb R$ which, in every local chart, determine a section $t \mapsto \gamma(t) \in j^1(E)$, in coordinates $$t \mapsto (t, q(t), \dot{q}(t))\:, $$ that solves, for $k=1,\ldots, n$, $$\frac{d}{dt} \frac{\partial {\cal L}}{\partial \dot{q}^k}- \frac{\partial \cal L}{\partial q^k}=0\:,$$ $$\frac{dq^k}{dt} = \dot{q}^k(t)\:.$$ You see that $\dot{q}$ turns out to be the time derivative of $q$ only along the solutions of the E-L equations; otherwise $q$ and $\dot{q}$ are independent variables.
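For instance, assuming the one-dimensional Lagrangian ${\cal L} = \tfrac12 m\dot{q}^2 - \tfrac12 k q^2$ (chosen here only as an illustration), the pair of equations reads $$m\frac{d\dot{q}}{dt} + kq = 0\:,\qquad \frac{dq}{dt} = \dot{q}\:,$$ a first-order system for the curve $t\mapsto(t,q(t),\dot{q}(t))$ in $j^1(E)$. A generic curve in $j^1(E)$, such as $t \mapsto (t, 0, 1)$, is perfectly admissible as a curve of kinetic states but violates the second equation, since there $\frac{dq}{dt} = 0 \neq 1 = \dot{q}$.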

ADDED COMMENT. Why jet bundles?

The overall idea is finding a mathematical structure that encodes the idea that

$q$ and $\dot{q}$ are independent variables and they become dependent ($\dot{q}$ is the time derivative of $q$) along every solution of the equations of motion.

The first idea is modeling the space of kinetic states on the tangent bundle of the configuration space $TQ$ where $Q$ is covered by Lagrangian coordinate patches $q^1,\ldots q^n$. Here $\dot{q}^1, \ldots, \dot{q}^n$ are the components of tangent vectors at $q^1,\ldots q^n$ (interpreted as tangent vectors to curves through that point parametrized by means of the time coordinate).

This is nice but, this way, transformations of coordinates explicitly depending on time are mathematically unnatural but physically necessary (think of Lagrangian coordinates at rest with two different reference frames one inertial and the other not inertial).

A way out is using as spacetime of kinetic states the Cartesian product $A = \mathbb R \times TQ$, where $\mathbb R$ is the temporal axis and viewing admissible coordinates on $A$ as coordinates $(t,q^1,\ldots, q^n, \dot{q}^1, \ldots, \dot{q}^n)$ where $t\in \mathbb R$ and $q^1,\ldots, q^n$ are coordinates on $Q$ and $\dot{q}^1, \ldots, \dot{q}^n$ are coordinates on each fiber of $TQ$. The coordinate $t$, in classical physics is required to coincide with the absolute time and thus it is fixed just up to an additive constant. This explains why we restricted the possible changes of temporal coordinate to the elementary (1).

This picture can be implemented already at the level of space of configurations, defining the spacetime of configurations as $E: =\mathbb R \times Q$.

In practice this construction is effective, but it suffers from the ideological drawback that every coordinate change (1)-(3) may use a different realization of $E$ (and $A$) as a Cartesian product, as is evident from the transformation rules (2) (and (3)), whereas no natural choice exists in general.

So we should look for a structure that looks like a Cartesian product (at least locally) but its Cartesian decomposition is not canonical and it admits an adapted atlas of local charts whose transformation rules are stated in (1)-(3).

The first step to remove a fixed Cartesian product structure is, restricting to (1) and (2) only, assuming from scratch that the spacetime of configurations is not $\mathbb R \times Q$ but a manifold which locally looks like that product without fixing any particular choice of this decomposition.

This structure exists and is well known in mathematics: it is a fiber bundle $E \to \mathbb R$ with canonical fiber diffeomorphic to $Q$. The atlas of local coordinates adapted to the bundle structure (with preferred global coordinate defined up to an additive constant on the basis $\mathbb R$) is made of local charts $t, q^1,\ldots, q^n$ transforming exactly as in (1)-(2).

It remains to further extend this structure to encompass the kinetic information. The manifold $A= j^1(E)$ is a very good candidate. It is nothing but $E$ with the addition of $n = \dim (Q)$ coordinates $\dot{q}^1, \ldots \dot{q}^n$ to each fiber for every natural coordinate patch $t, q^1,\ldots, q^n$, with the requirement that (3) holds under changes of coordinates. This is because, in the definition of jet bundle, the added dot coordinates must be interpreted as the components of tangent vectors of sections in $E$, i.e., the components of all possible tangent vectors to curves $\mathbb R \ni t \mapsto (q^1(t), \ldots, q^n(t))$ passing through each point of $E$.

Can you, please, explain to me why jet bundles are unavoidable in the geometrization of classical field theory, but in principle, one can do without them in particle mechanics? At least, that is what I know. Perhaps I am wrong. — DanielC, Oct 31 '17 at 13:05
@DanielC They are not unavoidable! They are just useful tools to model some basic physical ideas. I tried to answer your question within the final note I added to my answer. — Valter Moretti, Oct 31 '17 at 14:14

Qmechanic · Answer 3 · 2018-05-13T15:01:18.570

Part of OP's question seems to be a matter of semantics: If a Lagrangian $$L(q^1,\ldots, q^n, v^1,\ldots, v^n,t)\tag{1}$$ has $n$ independent generalized position variables $q^1,\ldots, q^n$, i.e. the configuration space is $n$-dimensional, then the system is said to have $n$ degrees of freedom (DOF), cf. e.g. this Phys.SE post.

This definition of DOF is used despite the fact that the Lagrange equations are $n$ coupled 2nd-order ODEs and hence the full solution has $2n$ integration constants, i.e. the number $n$ of DOF is defined as half the number of integration constants!

Another issue is that the generalized velocities $v^1, \ldots, v^n$ are independent variables in the Lagrangian (1), but they are dependent variables in the action $$S[q^1,\ldots, q^n; t_i,t_f]~:=~ \int_{t_i}^{t_f}\!\mathrm{d}t~ L(q^1,\ldots, q^n, \dot{q}^1,\ldots, \dot{q}^n,t).\tag{2}$$ This is e.g. explained in this Phys.SE post.
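To illustrate the counting of DOF versus integration constants with a simple example (the harmonic oscillator, assumed here purely for illustration): the Lagrangian $L(q,v,t) = \tfrac12 m v^2 - \tfrac12 k q^2$ has $n=1$ DOF, its Lagrange equation is $m\ddot{q} = -kq$, and the general solution $$q(t)~=~A\cos(\omega t)+B\sin(\omega t),\qquad \omega~=~\sqrt{k/m},$$ carries $2n=2$ integration constants $A$ and $B$, which can be fixed e.g. by the initial position and initial velocity.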
