Nested Tori: Lagrangian mechanics

Showing posts with label Lagrangian mechanics. Show all posts

Wednesday, August 26, 2015

Hamiltonian Mechanics is Awesome (Double Pendulum, Part 4)

In this post, we work toward discretizing the equations for (and simulating) the double pendulum, considered in Parts 1, 2, and 3 of this series. It turns out that one very important qualitative behavior that we want is for the total energy to be conserved (one simulation I ran of this, directly discretizing the equations for $\ddot{\boldsymbol\theta}$, eventually ran amok over time, getting faster and faster, clearly something that would not occur in a real system). Thus, one place to look is symplectic methods, which we've remarked on before. However, at least with the symplectic methods I've learned, they seem more conceptually geared toward solving differential equations involving angular position and momentum, rather than angular position and velocity. This is not strictly true, as I have seen symplectic methods at work with position and velocity. However, this might be an artifact of the relation between velocity and momentum often being very trivial (e.g., just multiplying by some mass). In any case, the position-and-momentum viewpoint is more than important enough to be worth diving into now. This is referred to as Hamiltonian mechanics. We look at $H = T+V$ instead of $L = T-V$ and consider our variables as generalized position and momentum instead of generalized position and velocity (more generally, Hamiltonian mechanics can be derived from Lagrangian mechanics via the Legendre transform).

In fact, most basic physics courses tend to focus on the sum of kinetic and potential energies, saying energy is "converted" between these two forms. They just never used the word "Hamiltonian." It's not clear, at first, what the difference between the two approaches is, especially with simple examples where it's pretty trivial to go from one to the other. In mathematical or veteran Nested Tori reader terms, the difference is that of being equations on the tangent bundle vs. the cotangent bundle (called phase space). Momentum takes a life of its own in more complicated systems (and in fact, in quantum mechanics, it's not clear what something can even be the velocity of, so something called "momentum" crops up as its own fundamental quantity without reference to the velocity of anything).

So, what is this mystical momentum for our system? We've already calculated something which we called "angular momentum" for the double pendulum: $\mathbf{p} = \partial L/\partial \dot{\boldsymbol \theta}$, the partial "gradient" vector with respect to the angular velocities. This is what it is generally. (As a check, we note for a single particle of mass $m$ and kinetic energy $\frac 1 2 m v^2$, $\partial L/\partial \dot{\mathbf{x}}=\partial L/\partial {\mathbf{v}}$ is indeed the usual $m\mathbf{v}$.) For our example, we derived
\[
\mathbf{p} = \frac{\partial L}{\partial \dot{\boldsymbol{ \theta}}} = \begin{pmatrix} (m_1 +m_2) \ell_1^2 & m_2\ell_1\ell_2\cos(\theta_2-\theta_1) \\ m_2 \ell_1 \ell_2 \cos(\theta_2 -\theta_1) & m_2 \ell_2^2\end{pmatrix} \begin{pmatrix}\dot \theta_1 \\ \dot\theta_2 \end{pmatrix} = I\dot{\boldsymbol{\theta}}
\]
That's almost something like the case $\mathbf{p} = m\mathbf{v}$ for a single particle, with the rotational analogues of mass and velocity, though the "mass" here is much more complicated (depending on the positions). In particular, as a vector, it doesn't have to point in the same direction as the velocity, unlike in the particle case (that misalignment is also the cause of washing machines thumping with an unbalanced load). With this in mind, we rewrite the kinetic energy in terms of $\mathbf{p}$ as follows: assuming the matrix $I$ is invertible, we write $\dot{\boldsymbol\theta} = I^{-1} \mathbf{p}$, substitute in $T = \frac 1 2 \dot{\boldsymbol{\theta}}^t I \dot{\boldsymbol{\theta}}$, and write $\mathbf{q} = \boldsymbol{\theta}$ (the usual notation), to get
\[
H(\mathbf{q},\mathbf{p}) = T+V= \frac{1}{2} (\mathbf{p}^t I^{-1}) I (I^{-1} \mathbf{p}) + V(\mathbf q) =\frac{1}{2} \mathbf{p}^t I^{-1} \mathbf{p} + V(\mathbf q).
\]

Hamilton's Equations

Now that we have derived $H$, the Hamiltonian, in order to relate it to our actual equations of motion, we have Hamilton's equations:
\[
\dot{\mathbf{q}} = \frac{\partial H}{\partial \mathbf{p}}
\] \[
\dot{\mathbf{p}} = -\frac{\partial H}{\partial \mathbf{q}}
\]
which are very nice: $\mathbf{p}$ and $\mathbf{q}$ are almost "dual" to each other in some sense. There's just that extra minus sign; this can be remembered by recalling that $H$ includes potential energy and thus differentiating with respect to $\mathbf{q}$ gives the gradient. But force is conventionally given as the negative of the gradient of the potential in physics. We thus find, plugging in the above:
\[
\dot{\mathbf{q}} = I^{-1} \mathbf{p}
\]
and
\[
\dot{\mathbf{p}} = -\frac 1 2 \mathbf{p}^t \frac{\partial I^{-1}}{\partial \mathbf{q}} \mathbf{p} -\frac{\partial V}{\partial \mathbf{q}}.
\]
The part that really complicates things and makes this different from the simpler examples is that the moment of inertia $I$ is dependent on $\mathbf{q}$, so we can't ignore that term when differentiating with respect to $\mathbf{q}$ (it is often just the $-\partial V/\partial\mathbf{q}$ term by itself). How do we calculate that derivative of $I^{-1}$? We step back down to components:
\[
\dot{p}_i = -\frac 1 2 \mathbf{p}^t \frac{\partial I^{-1}}{\partial \theta_i} \mathbf{p} -\frac{\partial V}{\partial \theta_i}= +\frac 1 2 \mathbf{p}^t I^{-1}\frac{\partial I}{\partial \theta_i}I^{-1} \mathbf{p} -\frac{\partial V}{\partial \theta_i}
\] \[
= \frac{1}{2} \dot{\mathbf q}^t \frac{\partial I}{\partial \theta_i}\dot{\mathbf q}-\frac{\partial V}{\partial \theta_i}
\]
where we have employed the identity that the derivative of the inverse of a matrix is always multiplication, by the inverse on both sides, of the derivative of the matrix (and a sign reversal). Let's write out what all of this is. We note that $\partial I/\partial \theta_1 = - \partial I/\partial \theta_2$, because the only way the angles in appear in $I$ is the difference $\theta_2 - \theta_1$. Thus, we have
\[
\dot {\mathbf q} = I^{-1} \mathbf{p},
\]
and using that as an intermediate variable (because to write it all out explicitly is a real mess, and we will need to use it when discretizing it, anyway):
\[
\dot{p}_1 = \frac{1}{2}\dot{\mathbf{q}}^t \begin{pmatrix} 0 & m_2\ell_1\ell_2 \sin(\theta_2-\theta_1) \\ m_2\ell_1\ell_2 \sin(\theta_2-\theta_1) &0\end{pmatrix} \dot{\mathbf{q}} + (m_1 + m_2) g\ell_1 \sin \theta_1
\] \[
\dot{p}_2 = \frac{1}{2}\dot{\mathbf{q}}^t \begin{pmatrix} 0 & -m_2\ell_1\ell_2 \sin(\theta_2-\theta_1) \\ -m_2\ell_1\ell_2 \sin(\theta_2-\theta_1) &0\end{pmatrix} \dot{\mathbf{q}} + m_2 g\ell_2 \sin \theta_2.
\]
In the final part, we discretize these equations. That'll have to wait for next time.

Monday, August 24, 2015

The Euler-Lagrange equations for the Double Pendulum (Config Spaces, Part 3)

In this post, continuing the explorations of the double pendulum (see Part 1 and Part 2) we concentrate on deriving its equation of motion (the Euler-Lagrange equation). These differential equations are the heart of Lagrangian mechanics, and indeed really what one tries to get to when applying the methods (it's essentially a way of getting Newton's 2nd Law for complicated systems). And why it is awesome. This is, like the previous part, going to be a more technical than visual, but we'll get back to that in Part 4.

Recall from Part 2 that for the double pendulum,
\[
L(\theta,\dot \theta) = T - V = \frac 1 2 (m_1+m_2) \ell_1^2 \dot{\theta}_1^2 +\frac 1 2 m_2\left( 2 \dot{\theta}_1 \dot{\theta}_2\ell_1 \ell_2 \cos(\theta_2 - \theta_1) + \ell_2^2 \dot{\theta}_2^2\right)\] \[+ (m_1+m_2) g \ell_1 \cos \theta_1 + m_2 g \ell_2 \cos \theta_2,
\]
where, $\theta_1$, $\theta_2$ are the angles from vertical of the two pendulums, $\dot \theta_1$, $\dot \theta_2$ are their angular speeds, $\ell_1$, $\ell_2$ are the length of the rods, and $m_1$, $m_2$ are their masses. This looks very imposing, so let's remark a little on the structure of that equation. In simpler situations, one often sees different squared speed terms, but here we have the interaction of the two parts of the system, in terms with products like $\dot \theta_1 \dot \theta_2$, as well as the $\cos(\theta_2 - \theta_1)$. This distinguishes the system from two completely isolated (uncoupled) pendulums; the Lagrangian already captures the aspect that energy gets transferred back and forth between the two pendulums.

The equations of motion are derived from considering an optimization problem, namely, what paths $\gamma$ through state space connecting two states optimizes the action integral
\[
\int_a^b L(\gamma(t),\dot\gamma(t)) dt.
\]
In a process quite similar to what's done in first-year calculus, we "take a derivative and set it to zero", which derives the equations
\[
\frac{\partial L}{\partial \theta_i} - \frac d {dt} \frac{\partial L}{\partial \dot \theta_i} = 0
\]
where $\partial L/\partial \dot \theta_i$ means the partial derivative with respect to $\dot \theta_i$ as if it were an independent variable. In fact, one way is to start off with a formula involving two independent variables $\theta$ and $\omega$; only after solving for the extremal path do we actually find $\omega = \dot \theta$ (this is a generalization called the Hamilton-Pontryagin principle).

$\partial L/\partial \dot {\boldsymbol \theta}$, angular momentum, and the moment of inertia

Let's now do the computation $\frac{\partial L}{\partial \dot \theta_1}$ (this quantity actually is the angular momentum of the first pendulum, $p_1$)—treating everything except $\dot \theta_1$ as a constant:
\[
p_1 = \frac{\partial L}{\partial \dot \theta_1} = (m_1 + m_2) \ell_1^2 \dot \theta_1 + m_2 \ell_1\ell_2
\cos(\theta_2-\theta_1)\dot \theta_2 .
\]
Note that unlike the more basic examples of angular momentum, this isn't simply dependent only on the (angular) velocity $\dot \theta_1$, but also on $\dot \theta_2$ and nonlinearly on $\theta_1$ and $\theta_2$. Similarly,
\[
p_2 = \frac{\partial L}{\partial \dot \theta_2} = m_2 \ell_2^2 \dot \theta_2 + m_2\ell_1\ell_2 \cos(\theta_2 - \theta_1) \dot \theta_1.
\]
This is more compactly expressed as
\[
\frac{\partial L}{\partial \dot{\boldsymbol{\theta}}}=\begin{pmatrix}p_1\\p_2\end{pmatrix} = \begin{pmatrix} (m_1 +m_2) \ell_1^2 & m_2\ell_1\ell_2\cos(\theta_2-\theta_1) \\ m_2 \ell_1 \ell_2 \cos(\theta_2 -\theta_1) & m_2 \ell_2^2\end{pmatrix} \begin{pmatrix}\dot \theta_1 \\ \dot\theta_2 \end{pmatrix}.
\]
This helps keep the unruliness under one umbrella. The square matrix in the above will be important, so let's call it $I$ (it is something like a moment of inertia for our system—but it's more complicated, principally because there is an interaction between its two component parts... It's not even a rigid body!). It should also be noted that the kinetic energy is $T = \frac 1 2 \dot{\boldsymbol{\theta}}^t I \dot{\boldsymbol \theta}$, so it still is quadratic in $\dot {\boldsymbol \theta}$, just like good ol' $\frac{1}{2} mv^2$.

The forces, $\partial L/\partial \boldsymbol{\theta}$

Due to the nonlinear dependence of the Lagrangian on the angular variables, the derivatives are trickier. But we forge on, because sticking to just the simplest examples won't help us gain as much insight into how things are used in practice. We have
\[
\frac{\partial L}{\partial \theta_1} = m_2 \ell_1 \ell_2 \dot \theta_1 \dot \theta_2 \sin(\theta_2-\theta_1) - (m_1 + m_2) g \ell_1 \sin \theta_1
\]

and
\[
\frac{\partial L}{\partial \theta_2} = -m_2 \ell_1 \ell_2 \dot \theta_1 \dot \theta_2 \sin(\theta_2-\theta_1) - m_2 g \ell_2 \sin \theta_2.
\]

Putting it together

Now we take the time derivatives of the momenta we derived above and subtract and set to zero to derive the Euler-Lagrange equations. That's straightfoward calculus; but different from simpler examples in that, of course, the chain rule must be applied to the $\cos(\theta_2 - \theta_1)$, so there will be an extra $\dot \theta_1$ or $\dot \theta_2$ term:

\[
\frac{d}{dt} \frac{\partial L}{\partial \dot{\boldsymbol{\theta}}} =  \begin{pmatrix} 0 & -m_2\ell_1\ell_2\sin(\theta_2-\theta_1)(\dot \theta_2 - \dot \theta_1) \\ -m_2 \ell_1 \ell_2 \sin(\theta_2 -\theta_1)(\dot \theta_2 - \dot \theta_1) & 0\end{pmatrix} \begin{pmatrix}\dot \theta_1 \\ \dot\theta_2 \end{pmatrix}\]
\[+\begin{pmatrix} (m_1 +m_2) \ell_1^2 & m_2\ell_1\ell_2\cos(\theta_2-\theta_1) \\ m_2 \ell_1 \ell_2 \cos(\theta_2 -\theta_1) & m_2 \ell_2^2\end{pmatrix} \begin{pmatrix}\ddot \theta_1 \\ \ddot\theta_2 \end{pmatrix}.
\]
The nice thing about it is that when it is subtracted from $\partial L/\partial \boldsymbol \theta$, the cross terms with the $\dot \theta_1 \dot\theta_2$ cancel. Also, the matrix multiplying the $\ddot{\theta}_i$'s is none other than our moment-of-inertia matrix $I$. This finally gives us

\[
0=\frac{\partial L}{\partial \boldsymbol\theta} - \frac{d}{dt} \frac{\partial L}{\partial \dot{\boldsymbol\theta}} =\begin{pmatrix}-(m_1+m_2) g \ell_1 \sin \theta_1 - m_2 \ell_1\ell_2 \sin(\theta_2-\theta_1)\dot{\theta}_2^2 \\ -m_2 g \ell_2 \sin \theta_2 +m_2 \ell_1\ell_2 \sin(\theta_2-\theta_1)\dot{\theta}_1^2  \end{pmatrix} -I \begin{pmatrix} \ddot \theta_1 \\ \ddot \theta_2 \end{pmatrix}
\]
as the Euler-Lagrange equations. As the final equation of motion, we bring the $\ddot {\boldsymbol \theta}$ term to the other side:
\[
I \begin{pmatrix} \ddot \theta_1 \\ \ddot \theta_2 \end{pmatrix} = \begin{pmatrix}-(m_1+m_2) g \ell_1 \sin \theta_1 - m_2 \ell_1\ell_2 \sin(\theta_2-\theta_1)\dot{\theta}_2^2 \\ -m_2 g \ell_2 \sin \theta_2 +m_2 \ell_1\ell_2 \sin(\theta_2-\theta_1)\dot{\theta}_1^2  \end{pmatrix}
\]
(keep in mind that $I$ varies with $\theta_1$ and $\theta_2$, however). We discretize and solve this next time. For now, simply take note of the interesting nonlinear interactions: in the positions as the sine of the difference, and quadratic nonlinearity in the derivatives (note the reversal: the $\ddot\theta_1$ gets a $\dot \theta_2^2$ and vice versa).

Friday, August 21, 2015

Lagrangian Mechanics is Awesome (Double Pendulum, Part 2)

In this post, we give some calculational details of the double pendulum (introduced in Part 1). This derivation is available in several physics books at the undergraduate (upper division) level, usually fleshed out using Lagrangian mechanics, another relevant subject that is highly useful to conceptualizing these types of problems by putting them in a setting using notions of state spaces. More cool-looking demos (and also a streamlined derivation) are available in David Eberly's excellent book, Game Physics. As a warning, this will be considerably more technical than visual; I post this because I feel that some insight into the workings of these things helps us see where geometry can be useful in contexts that are not just literally pictures. Nevertheless, here's the picture for the upcoming analysis:

$\ell_i$ are the length of the pendulum rods, $m_i$ are the masses, and $\theta_i$ are their angular displacements from vertical.

The use of Lagrangian mechanics spurs on the development of thinking in terms of state spaces by considering different kinds of coordinates that may be more convenient for the problem at hand. It is easier to think about the angular positions $\theta_1$ and $\theta_2$ of a pendulum than it is to derive it using x- and y-coordinates directly (and for our problem, physical constraints in the x- and y-coordinates, here made by assuming that the rods are rigid, make it somewhat less straightforward to deal with). Of course, we might use the x- and y-coordinates in parts of our derivation, but it's not so easy to solve for them.

Lagrangian mechanics also has us think of things in terms of energy, a quantity whose properties crop up in math a lot. Our goal in this post is to derive the Lagrangian L of the system. L is the difference between total kinetic energy T and total potential energy V. The total kinetic energy of our system should be the sum of that of the two particles,
\[
T = \frac 1 2 m_1 v_1^2 + \frac 1 2 m_2 v_2^2,
\]
and the potential energy (assuming that our pendulums are subject to downward acceleration g) depends only on the height of the two pendulum bobs:
\[
V = m_1 g y_1 + m_2 g y_2.
\]
However, if we want to express this in terms of the angular coordinates, we find, via the usual trigonometric arguments, that $y_1 = -\ell_1\cos \theta_1$ and since $y_2$ includes the height of the first pendulum, $y_2 = -\ell_1 \cos\theta_1 - \ell_2 \cos \theta_2$. In total, this means
\[
V = -(m_1+m_2) g \ell_1 \cos \theta_1 - m_2 g \ell_2 \cos \theta_2.
\]
Now the kinetic energy is a little trickier. Here, we assume that the first pendulum bob has linear velocity $\mathbf{v}_1$, and speed $|\mathbf{v}_1| = v_1$. Now, if we let the second pendulum bob have linear velocity $\mathbf{v}_{1,2}$ relative to the first, then the total velocity of the second bob is $\mathbf{v}_2 = \mathbf{v}_1 + \mathbf{v}_{1,2}$. The reason that $\mathbf{v}_{1,2}$ is useful is that it is calculated in exactly the same way as the first pendulum (because the second pendulum is, well, a pendulum).

So the total (squared) speed of the second pendulum bob is $v_2^2= v_1^2 + 2 \mathbf{v}_1 \cdot \mathbf{v}_{1,2} + v_{1,2}^2$.

Now, to figure out $\mathbf{v}_1 \cdot \mathbf{v}_{1,2}$, we use the geometric (length and angle) formulation of the dot product:
\[
\mathbf{v}_1 \cdot \mathbf{v}_{1,2} = v_1 v_{1,2} \cos \varphi
\]
where $\varphi$ is the angle between $\mathbf{v}_1$ and $\mathbf{v}_{1,2}$. But the angle between $\mathbf{v}_1$ and $\mathbf{v}_{1,2}$, since both of them are perpendicular to their respective rods, must be the same as the angle between the rods: $\varphi=\theta_2 - \theta_1$. Finally, to express the magnitudes $v_1$ and $v_{1,2}$, we use the relations $v_1 = \dot{\theta}_1 \ell_1$ and $v_{1,2} = \dot{\theta}_2 \ell_2$. This gives
\[
v_2^2 = \ell_1^2 \dot{\theta}_1^2 + 2 \dot{\theta}_1 \dot{\theta}_2\ell_1 \ell_2 \cos(\theta_2 - \theta_1) + \ell_2^2 \dot{\theta}_2^2
\]

This finally gives
\[
T =\frac 1 2 (m_1+m_2) \ell_1^2 \dot{\theta}_1^2 +\frac 1 2 m_2\left( 2 \dot{\theta}_1 \dot{\theta}_2\ell_1 \ell_2 \cos(\theta_2 - \theta_1) + \ell_2^2 \dot{\theta}_2^2\right),
\]
or,
\[
L = T-V = \frac 1 2 (m_1+m_2) \ell_1^2 \dot{\theta}_1^2 +\frac 1 2 m_2\left( 2 \dot{\theta}_1 \dot{\theta}_2\ell_1 \ell_2 \cos(\theta_2 - \theta_1) + \ell_2^2 \dot{\theta}_2^2\right)\]\[+ (m_1+m_2) g \ell_1 \cos \theta_1 + m_2 g \ell_2 \cos \theta_2.
\]
(If you're reading this on a mobile device, please turn it to Landscape mode to see the full equations.) Actually solving these equations for $\theta_1$ and $\theta_2$ will have to wait for another entry!

As a final note, kudos to MathJax which allows rendering of math formulas using $\LaTeX$!