## Wednesday, August 26, 2015

### Hamiltonian Mechanics is Awesome (Double Pendulum, Part 4)

In this post, we work toward discretizing the equations for (and simulating) the double pendulum, considered in Parts 1, 2, and 3 of this series. It turns out that one very important qualitative behavior that we want is for the total energy to be conserved (one simulation I ran of this, directly discretizing the equations for $\ddot{\boldsymbol\theta}$, eventually ran amok over time, getting faster and faster, clearly something that would not occur in a real system). Thus, one place to look is symplectic methods, which we've remarked on before. However, at least with the symplectic methods I've learned, they seem more conceptually geared toward solving differential equations involving angular position and momentum, rather than angular position and velocity. This is not strictly true, as I have seen symplectic methods at work with position and velocity. However, this might be an artifact of the relation between velocity and momentum often being very trivial (e.g., just multiplying by some mass). In any case, the position-and-momentum viewpoint is more than important enough to be worth diving into now. This is referred to as Hamiltonian mechanics. We look at $H = T+V$ instead of $L = T-V$ and consider our variables as generalized position and momentum instead of generalized position and velocity (more generally, Hamiltonian mechanics can be derived from Lagrangian mechanics via the Legendre transform).

In fact, most basic physics courses tend to focus on the sum of kinetic and potential energies, saying energy is "converted" between these two forms. They just never used the word "Hamiltonian." It's not clear, at first, what the difference between the two approaches is, especially with simple examples where it's pretty trivial to go from one to the other. In mathematical or veteran Nested Tori reader terms, the difference is that of being equations on the tangent bundle vs. the cotangent bundle (called phase space). Momentum takes a life of its own in more complicated systems (and in fact, in quantum mechanics, it's not clear what something can even be the velocity of, so something called "momentum" crops up as its own fundamental quantity without reference to the velocity of anything).

So, what is this mystical momentum for our system? We've already calculated something which we called "angular momentum" for the double pendulum: $\mathbf{p} = \partial L/\partial \dot{\boldsymbol \theta}$, the partial "gradient" vector with respect to the angular velocities. This is what it is generally. (As a check, we note for a single particle of mass $m$ and kinetic energy $\frac 1 2 m v^2$, $\partial L/\partial \dot{\mathbf{x}}=\partial L/\partial {\mathbf{v}}$ is indeed the usual $m\mathbf{v}$.) For our example, we derived
$\mathbf{p} = \frac{\partial L}{\partial \dot{\boldsymbol{ \theta}}} = \begin{pmatrix} (m_1 +m_2) \ell_1^2 & m_2\ell_1\ell_2\cos(\theta_2-\theta_1) \\ m_2 \ell_1 \ell_2 \cos(\theta_2 -\theta_1) & m_2 \ell_2^2\end{pmatrix} \begin{pmatrix}\dot \theta_1 \\ \dot\theta_2 \end{pmatrix} = I\dot{\boldsymbol{\theta}}$
That's almost something like the case $\mathbf{p} = m\mathbf{v}$ for a single particle, with the rotational analogues of mass and velocity, though the "mass" here is much more complicated (depending on the positions). In particular, as a vector, it doesn't have to point in the same direction as the velocity, unlike in the particle case (that misalignment is also the cause of washing machines thumping with an unbalanced load). With this in mind, we rewrite the kinetic energy in terms of $\mathbf{p}$ as follows: assuming the matrix $I$ is invertible, we write $\dot{\boldsymbol\theta} = I^{-1} \mathbf{p}$, substitute in $T = \frac 1 2 \dot{\boldsymbol{\theta}}^t I \dot{\boldsymbol{\theta}}$, and write $\mathbf{q} = \boldsymbol{\theta}$ (the usual notation), to get
$H(\mathbf{q},\mathbf{p}) = T+V= \frac{1}{2} (\mathbf{p}^t I^{-1}) I (I^{-1} \mathbf{p}) + V(\mathbf q) =\frac{1}{2} \mathbf{p}^t I^{-1} \mathbf{p} + V(\mathbf q).$

#### Hamilton's Equations

Now that we have derived $H$, the Hamiltonian, in order to relate it to our actual equations of motion, we have Hamilton's equations:
$\dot{\mathbf{q}} = \frac{\partial H}{\partial \mathbf{p}}$ $\dot{\mathbf{p}} = -\frac{\partial H}{\partial \mathbf{q}}$
which are very nice: $\mathbf{p}$ and $\mathbf{q}$ are almost "dual" to each other in some sense. There's just that extra minus sign; this can be remembered by recalling that $H$ includes potential energy and thus differentiating with respect to $\mathbf{q}$ gives the gradient. But force is conventionally given as the negative of the gradient of the potential in physics. We thus find, plugging in the above:
$\dot{\mathbf{q}} = I^{-1} \mathbf{p}$
and
$\dot{\mathbf{p}} = -\frac 1 2 \mathbf{p}^t \frac{\partial I^{-1}}{\partial \mathbf{q}} \mathbf{p} -\frac{\partial V}{\partial \mathbf{q}}.$
The part that really complicates things and makes this different from the simpler examples is that the moment of inertia $I$ is dependent on $\mathbf{q}$, so we can't ignore that term when differentiating with respect to $\mathbf{q}$ (it is often just the $-\partial V/\partial\mathbf{q}$ term by itself). How do we calculate that derivative of $I^{-1}$? We step back down to components:
$\dot{p}_i = -\frac 1 2 \mathbf{p}^t \frac{\partial I^{-1}}{\partial \theta_i} \mathbf{p} -\frac{\partial V}{\partial \theta_i}= +\frac 1 2 \mathbf{p}^t I^{-1}\frac{\partial I}{\partial \theta_i}I^{-1} \mathbf{p} -\frac{\partial V}{\partial \theta_i}$ $= \frac{1}{2} \dot{\mathbf q}^t \frac{\partial I}{\partial \theta_i}\dot{\mathbf q}-\frac{\partial V}{\partial \theta_i}$
where we have employed the identity that the derivative of the inverse of a matrix is always multiplication, by the inverse on both sides, of the derivative of the matrix (and a sign reversal). Let's write out what all of this is. We note that $\partial I/\partial \theta_1 = - \partial I/\partial \theta_2$, because the only way the angles in appear in $I$ is the difference $\theta_2 - \theta_1$. Thus, we have
$\dot {\mathbf q} = I^{-1} \mathbf{p},$
and using that as an intermediate variable (because to write it all out explicitly is a real mess, and we will need to use it when discretizing it, anyway):
$\dot{p}_1 = \frac{1}{2}\dot{\mathbf{q}}^t \begin{pmatrix} 0 & m_2\ell_1\ell_2 \sin(\theta_2-\theta_1) \\ m_2\ell_1\ell_2 \sin(\theta_2-\theta_1) &0\end{pmatrix} \dot{\mathbf{q}} + (m_1 + m_2) g\ell_1 \sin \theta_1$ $\dot{p}_2 = \frac{1}{2}\dot{\mathbf{q}}^t \begin{pmatrix} 0 & -m_2\ell_1\ell_2 \sin(\theta_2-\theta_1) \\ -m_2\ell_1\ell_2 \sin(\theta_2-\theta_1) &0\end{pmatrix} \dot{\mathbf{q}} + m_2 g\ell_2 \sin \theta_2.$
In the final part, we discretize these equations. That'll have to wait for next time.