Sep 21, 2022

An explanation of constraints in classical mechanics and the derivation of the Navier-Stokes equations, treating pressure as a Lagrange multiplier.

Back to homepage https://principiaphysicaegeneralis.com/

Introduction

One of the reasons behind Lagrange’s original formulation of what is now called classical mechanics was the elimination of so-called constraint forces (like the tension of a rope) from the description of a physical system, which makes the equations simpler. However, there is a way (also first suggested by Lagrange) to re-introduce these constraint forces into the formalism of classical mechanics, gaining more information in the process.
In this article we are going to explore the Lagrange multiplier description of constraints and use it to derive Schrodinger’s time-independent equation and the Navier-Stokes equations for incompressible fluids.

Holonomic constraints

The definitions

The constraints we are concerned with here are holonomic. That means each of them can be expressed as the demand that a certain function $\phi(q)$ of the coordinates is zero at all times during the evolution of the system.
Geometrically, if the original system has $N$ degrees of freedom, then it evolves in an $N$-dimensional configuration space. Every constraint we introduce removes one degree of freedom, so it limits the evolution to the $(N-1)$-dimensional surface $\phi(q)=0$ inside that space. The constraint is enforced by some force that, by definition, has no other effect on the system: it does no work along the allowed motions, so it has to be orthogonal to the constraint surface, i.e. parallel to $\nabla\phi$.
To see all of the above remarks in action we will consider the simple problem of a pendulum of mass $m$ that has a constant length.

The simple pendulum

Assuming that the pendulum has a constant length $L$ and the only force acting on it is gravity, we need only one variable, the angle $\theta$ measured from the downward vertical, to fully describe the system. The configuration space is a circle since $\theta + 2\pi=\theta$. The Lagrangian is as usual (the last form follows after dividing by the constant $mL$, which does not change the equations of motion): $$ \mathcal{L}=T-V=\frac{1}{2}mL^2\dot{\theta}^2 + mgL\cos\theta \;\rightarrow\; \frac{1}{2}L\dot{\theta}^2 + g\cos\theta \;\;\; (2.1) $$

and from the Euler-Lagrange equations we get the usual pendulum equation $$ L\ddot{\theta}=-g\sin\theta \;\;\; (2.2) $$

But what if we wanted to know the tension force that keeps the pendulum at a constant length? We begin with the unconstrained system, which has two degrees of freedom ($r,\theta$). The configuration space is now a half-infinite cylinder, since again $\theta + 2\pi=\theta$ and $0<r<\infty$. As discussed above, we introduce the constraint $\phi(r,\theta)\equiv r-L = 0$, which again limits our configuration space to a circle. Instead of substituting $r=L$ in $\mathcal{L}$, we add a new term: the constraint multiplied by an undetermined function of time $\lambda(t)$, called a Lagrange multiplier $$\mathcal{L}=\frac{1}{2}m(r^2\dot{\theta}^2 + \dot{r}^2) + mgr\cos\theta + \lambda \phi(r) \;\;\; (2.3) $$

Now, not only do we keep treating $r$ as a variable, but we also treat the multiplier $\lambda$ as a variable. We have three E-L equations: $$\lambda: \;\;\;r-L=0 \;\;\; (2.3a)$$ $$\theta: \;\;\;\frac{d}{dt}\left(mr^2\dot{\theta}\right)=-mgr\sin\theta \;\;\; (2.3b)$$ $$r: \;\;\; m\ddot{r}=mr\dot{\theta}^2+mg\cos\theta+ \lambda \;\;\; (2.3c)$$

We can see that (2.3a) is the constraint, so substituting $r=L$ in (2.3b) gives us the pendulum equation (2.2) again. What we have gained is (2.3c). Since $r$ is constant, $\ddot{r}=0$, so $$\lambda = -mL\dot{\theta}^2 -mg\cos\theta \;\;\; (2.4)$$

Comparing (2.3c) with Newton’s Second Law $m\ddot{r}=F_r$ we can see that $\lambda$, or in general $\lambda \nabla\phi$, is the force acting on the system to enforce the constraint $\phi$. Here it is the tension of the rope; it is negative whenever the rope is taut, meaning that it pulls the mass towards the pivot. As predicted, the force is purely radial, so it is orthogonal to the circle describing the constrained motion in the configuration space.
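
As a numerical sanity check of (2.2) and (2.4), the following minimal sketch integrates the pendulum equation with a classic fourth-order Runge-Kutta step and evaluates the multiplier at the bottom of the swing. The function name `simulate_tension`, the step size `dt` and the default parameter values are illustrative assumptions, not part of the derivation above. For a pendulum released from rest at $\theta_0$, energy conservation predicts $\lambda=-mg(3-2\cos\theta_0)$ at $\theta=0$.

```python
import math

def simulate_tension(theta0, length=1.0, mass=1.0, g=9.81, dt=1e-4):
    """Release the pendulum from rest at theta0 (0 < theta0 < pi), integrate
    L*theta'' = -g*sin(theta) until it first reaches the bottom, and return
    the multiplier lambda = -m*L*theta'^2 - m*g*cos(theta) at that point."""
    def accel(theta):
        return -(g / length) * math.sin(theta)

    theta, omega = theta0, 0.0
    while theta > 0.0:
        # One RK4 step for the system theta' = omega, omega' = accel(theta).
        k1t, k1w = omega, accel(theta)
        k2t, k2w = omega + 0.5 * dt * k1w, accel(theta + 0.5 * dt * k1t)
        k3t, k3w = omega + 0.5 * dt * k2w, accel(theta + 0.5 * dt * k2t)
        k4t, k4w = omega + dt * k3w, accel(theta + dt * k3t)
        theta += dt * (k1t + 2 * k2t + 2 * k3t + k4t) / 6.0
        omega += dt * (k1w + 2 * k2w + 2 * k3w + k4w) / 6.0
    return -mass * length * omega ** 2 - mass * g * math.cos(theta)

def predicted_tension(theta0, mass=1.0, g=9.81):
    """Energy-conservation prediction for lambda at the bottom of the swing."""
    return -mass * g * (3.0 - 2.0 * math.cos(theta0))
```

Comparing `simulate_tension(1.0)` with `predicted_tension(1.0)` should give agreement up to the integration error. The result is negative, which is the inward-pointing rope tension.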

Schrodinger’s equation

Lagrange multipliers, and constraints in general, can also be used to derive the equations describing a system. The first example we are going to look at is a derivation of Schrodinger’s time-independent equation. We define the Hamiltonian operator $\hat{\mathcal{H}}$ by the relation $$\langle E\rangle=\int \psi^*\hat{\mathcal{H}}\psi\;d^3x \;\;\; (2.5)$$

where $\langle E\rangle$ is the mean value of the energy, $\psi$ the wave function of the system and $\psi^* $ its complex conjugate. Similar to the principle of least action, we demand that $\langle E\rangle$ is stationary, so we identify the action with $\langle E\rangle$.
It is well known that the probability of finding a particle in a volume $V$ is given by the integral of $|\psi|^2=\psi^*\psi$ over $V$. So, if we take the integral over all of space we expect the probability to be 1, since the particle must be somewhere. Therefore our constraint is: $$ \phi \equiv \int \psi^* \psi\;d^3x - 1 = 0 \;\;\; (2.6) $$

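Adding the constraint with a multiplier (we choose the sign $-\lambda\phi$ so that $\lambda$ comes out as the energy), the action can now be written as $$\mathcal{S}= \int (\psi^* \hat{\mathcal{H}}\psi - \lambda\psi^* \psi)\;d^3x +\lambda \;\;\; (2.7)$$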

If we treat $\psi$ and $\psi^*$ as independent variables and apply our usual variational methods for getting the equations of motion ¹ we get the equation (and its complex conjugate) $$\hat{\mathcal{H}}\psi=\lambda\psi \;\;\; (2.8)$$

This is Schrodinger’s time-independent equation. It tells us that the wave functions $\psi$ that describe any time-independent system are the eigenfunctions of the Hamiltonian operator, with energy equal to the eigenvalue $\lambda$: multiplying (2.8) by $\psi^*$ and integrating, (2.5) and (2.6) give $\langle E\rangle=\lambda$. $\hat{\mathcal{H}}$ itself is specific to the particular problem one is investigating, so there is no general formula.
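
To make the constrained variational problem concrete, here is a rough numerical sketch for one specific, assumed Hamiltonian: a particle in a one-dimensional box of unit width, $\hat{\mathcal{H}}=-\frac{1}{2}\frac{d^2}{dx^2}$ in units where $\hbar=m=1$, discretized on a grid. The code lowers $\langle E\rangle$ by gradient steps and re-imposes the normalization (2.6) after each step; this projected gradient descent is an illustrative technique chosen here, not the method of the text. All function names, the grid size and the step size are assumptions.

```python
import math

def apply_hamiltonian(psi, h):
    """Apply H = -(1/2) d^2/dx^2 by finite differences, with psi = 0 at the walls."""
    n = len(psi)
    out = []
    for i in range(n):
        left = psi[i - 1] if i > 0 else 0.0
        right = psi[i + 1] if i < n - 1 else 0.0
        out.append(-(left - 2.0 * psi[i] + right) / (2.0 * h * h))
    return out

def inner(a, b, h):
    """Discrete version of the integral of a*b over the box."""
    return sum(x * y for x, y in zip(a, b)) * h

def normalized(psi, h):
    """Rescale psi so that the discrete constraint (2.6) holds."""
    norm = math.sqrt(inner(psi, psi, h))
    return [p / norm for p in psi]

def ground_state(n=20, steps=3000):
    """Minimize <E> subject to normalization; return (lambda, residual)."""
    h = 1.0 / (n + 1)          # grid spacing for n interior points
    tau = 0.5 * h * h          # step size small enough for stability
    psi = normalized([1.0] * n, h)
    for _ in range(steps):
        h_psi = apply_hamiltonian(psi, h)
        lam = inner(psi, h_psi, h)                      # current <E>, as in (2.5)
        # Step against the gradient of <E> - lambda * (norm - 1), then renormalize.
        psi = normalized([p - tau * (q - lam * p) for p, q in zip(psi, h_psi)], h)
    h_psi = apply_hamiltonian(psi, h)
    lam = inner(psi, h_psi, h)
    residual = max(abs(q - lam * p) for p, q in zip(psi, h_psi))
    return lam, residual
```

At convergence the residual of $\hat{\mathcal{H}}\psi-\lambda\psi$ should be at rounding level, so the stationary point of the constrained problem is indeed an eigenfunction. The returned multiplier is the lowest eigenvalue of the discretized operator, which approaches $\pi^2/2$ as the grid is refined.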

Continuous Mechanics and Pressure

Continuous formalism

To describe continuous media we use fields instead of a finite set of variables. That means that the coordinates $q(t)$ are replaced by field variables $\phi(x,t)$, which are functions of position and time (in this subsection $\phi$ denotes a generic field rather than a constraint). The Lagrangian too is replaced by a Lagrangian density $\mathcal{L}$, so the action is given by $$\mathcal{S}=\int_{t_1}^{t_2} \int_V \mathcal{L}\; d^3x\;dt\;\;\; (3.1)$$

The Euler-Lagrange equations are now given by $$ -\frac{\delta\mathcal{L}}{\delta\phi} \equiv \frac{\partial}{\partial t} \frac{\partial \mathcal{L}}{\partial(\partial\phi / \partial t)} + \frac{\partial}{\partial x^i} \frac{\partial \mathcal{L}}{\partial(\partial\phi / \partial x^i)} - \frac{\partial \mathcal{L}}{\partial \phi} = 0 \;\;\; (3.2) $$

where $\frac{\delta\mathcal{L}}{\delta\phi}$ is called the functional derivative of $\mathcal{L}$, $ x^i $ are the three coordinates of space, and the repeated index $i$ is summed over.

Lagrangian description of fluids

There are two ways to describe a fluid in motion. One of them, introduced by Lagrange, is to give a “label” $\vec{\alpha}$ to every element of the fluid and then describe its current position as a function of time and the label: $\vec{q}(\vec{\alpha},t)=(q^1,q^2,q^3)$. Usually we choose the label $\vec{\alpha}$ to be the initial position of the specific element.
An important matrix is the deformation matrix with elements $\frac{\partial q^i}{\partial \alpha^j}$ and determinant $J$ : $$ J= \frac{1}{6} \epsilon_{kjl}\epsilon^{imn} \frac{\partial q^k}{\partial\alpha^i}\frac{\partial q^j}{\partial\alpha^m}\frac{\partial q^l}{\partial\alpha^n} \;\;\; (3.3) $$

To understand the physical meaning of $J$, consider an elementary volume of fluid that at $t=0$ occupies the label-space volume $d^3\alpha$. At any later time, if $d^3x$ is the volume occupied by the same fluid, then by the change-of-variables formula $$ d^3x=Jd^3\alpha \;\;\; (3.4) $$

So $J$ defines how volume changes through time (when the time evolution is described by the function $\vec{q}(\vec{\alpha},t)$). Using conservation of mass we get $$\rho (\vec{q}(\vec{\alpha},t)) d^3x=\rho_o(\vec{\alpha}) d^3\alpha $$

so using (3.4) we get $$\rho_o=\rho J \;\;\; (3.5) $$

The final definition we need is of the cofactor $A^i_k$ of the element $\frac{\partial q^k}{\partial \alpha^i}$ of the deformation matrix. It is defined by² $$\frac{\partial q^k}{\partial \alpha^j} \frac{A^i_k}{J}=\delta^i_j \;\;\; (3.6) $$ so that $A^i_k/J$ is the inverse of the deformation matrix.

The explicit expression for $A^i_k$ is: $$A^i_k=\frac{1}{2} \epsilon_{kjl}\epsilon^{imn} \frac{\partial q^j}{\partial \alpha^m} \frac{\partial q^l}{\partial \alpha^n} \;\;\; (3.7) $$
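
The index expressions (3.3), (3.6) and (3.7) can be checked numerically on a concrete deformation matrix. The sketch below evaluates the Levi-Civita sums literally; the names `eps`, `jacobian`, `cofactor` and `check_inverse`, and the two sample matrices, are illustrative assumptions. Here `F[k][i]` stands for $\partial q^k/\partial\alpha^i$.

```python
R = range(3)

def eps(i, j, k):
    """Levi-Civita symbol for indices taken from {0, 1, 2}."""
    return (i - j) * (j - k) * (k - i) // 2

def jacobian(F):
    """Determinant J written as the double Levi-Civita sum of (3.3)."""
    return sum(eps(k, j, l) * eps(i, m, n) * F[k][i] * F[j][m] * F[l][n]
               for k in R for j in R for l in R
               for i in R for m in R for n in R) / 6.0

def cofactor(F):
    """Cofactor A[i][k] from (3.7): i is the label index, k the position index."""
    return [[sum(eps(k, j, l) * eps(i, m, n) * F[j][m] * F[l][n]
                 for j in R for l in R for m in R for n in R) / 2.0
             for k in R] for i in R]

def check_inverse(F):
    """Left-hand side of (3.6) as a matrix indexed [i][j]; should be the identity."""
    J, A = jacobian(F), cofactor(F)
    return [[sum(F[k][j] * A[i][k] for k in R) / J for j in R] for i in R]

# A generic deformation and a simple shear (the shear preserves volume).
F_generic = [[2.0, 1.0, 0.0], [0.0, 1.0, 3.0], [1.0, 0.0, 1.0]]
F_shear = [[1.0, 0.7, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
```

For `F_shear` the determinant equals one, which is exactly the kind of deformation allowed for an incompressible fluid, while `check_inverse` returns the Kronecker delta for any invertible matrix.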

Eulerian description of fluids

The second way to describe a fluid was introduced by Euler and treats the velocity of the fluid $\vec{u}(\vec{x},t)$ as a field. All elements are now identical (without a label) and $\vec{u}(\vec{x},t)$ is the velocity of the element that happens to be at point $\vec{x}$ at time $t$.
To connect the two descriptions we assume that the map $\vec{\alpha}\mapsto\vec{q}(\vec{\alpha},t)$ is one-to-one and invertible at every time, so we can “identify” the element at point $\vec{x}$ at time $t$ by its label $\vec{\alpha}=\vec{q}^{-1}(\vec{x},t)$. Therefore $$\vec{u}(\vec{x},t) = \dot{\vec{q}}\;(\vec{q}^{-1}(\vec{x},t),t )\;\;\; (3.8) $$

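Notice now that the acceleration of a fluid element, i.e. the time derivative of the velocity field taken along the path $\vec{x}=\vec{q}(\vec{\alpha},t)$ of the element, is, using the chain rule, $$\ddot{\vec{q}}=\frac{d}{dt}\vec{u}= \frac{\partial \vec{u}}{\partial t} + \vec{u}\cdot\vec{\nabla}\vec{u} \;\;\; (3.9) $$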
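
A quick way to see (3.9) at work is a rigid rotation in the plane, chosen here purely as an illustrative example. In the Lagrangian description each element moves on a circle, $\vec{q}(\vec{\alpha},t)=R(\omega t)\vec{\alpha}$, while in the Eulerian description the velocity field $\vec{u}=(-\omega y,\omega x)$ does not depend on time at all. The sketch below compares $\ddot{\vec{q}}$ with the right-hand side of (3.9) using finite differences; the function names and the value of `OMEGA` are assumptions.

```python
import math

OMEGA = 2.0  # angular velocity of the illustrative rigid rotation

def q(alpha, t):
    """Lagrangian description: position at time t of the element labelled alpha."""
    c, s = math.cos(OMEGA * t), math.sin(OMEGA * t)
    return (alpha[0] * c - alpha[1] * s, alpha[0] * s + alpha[1] * c)

def u(x, t):
    """Eulerian description: velocity of whichever element sits at x at time t."""
    return (-OMEGA * x[1], OMEGA * x[0])

def lagrangian_acceleration(alpha, t, h=1e-4):
    """Second time derivative of q at fixed label, by central differences."""
    qm, q0, qp = q(alpha, t - h), q(alpha, t), q(alpha, t + h)
    return tuple((qp[i] - 2.0 * q0[i] + qm[i]) / h ** 2 for i in range(2))

def eulerian_acceleration(x, t, h=1e-5):
    """Right-hand side of (3.9): du/dt at fixed x plus the advective term."""
    vel = u(x, t)
    result = []
    for i in range(2):
        du_dt = (u(x, t + h)[i] - u(x, t - h)[i]) / (2.0 * h)
        advect = 0.0
        for j in range(2):
            xp, xm = list(x), list(x)
            xp[j] += h
            xm[j] -= h
            advect += vel[j] * (u(xp, t)[i] - u(xm, t)[i]) / (2.0 * h)
        result.append(du_dt + advect)
    return tuple(result)
```

Evaluating both functions for the same element, with `x = q(alpha, t)`, should give the centripetal acceleration $-\omega^2\vec{x}$ in either description. In the Eulerian one it comes entirely from the advective term $\vec{u}\cdot\vec{\nabla}\vec{u}$, because $\partial\vec{u}/\partial t=0$ for this flow.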

Navier-Stokes equations

First we are going to construct the Lagrangian of a free fluid, using the Lagrangian description. The kinetic energy is $$T[\dot{\vec{q}}]=\int \frac{1}{2}\rho_o (\vec{\alpha})|\dot{\vec{q}}|^2\; d^3\alpha $$

Since we are dealing with incompressible fluids, we want the volume of every fluid element to stay constant in time, so according to (3.4) we demand that $J=1$. Using our constraint language, we introduce the constraint $\phi \equiv J-1=0$. It has to hold for every element separately, so the multiplier is now a field $\lambda(\vec{\alpha},t)$ and the constraint term goes inside the integral. Incompressible fluids have no internal potential energy, so the Lagrangian is $$\mathcal{L}=\int \left[\frac{1}{2}\rho_o (\vec{\alpha})|\dot{\vec{q}}|^2 + \lambda(J-1)\right] d^3\alpha \;\;\; (3.10) $$

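The Euler-Lagrange equations (3.2), with $\vec{q}(\vec{\alpha},t)$ as the field and the labels $\alpha^j$ in the role of the space coordinates, give $$\rho_o\ddot{q}_i = -A^j_i \frac{\partial \lambda}{\partial \alpha^j} \;\;\; (3.11) $$

where we used $\partial J/\partial(\partial q^i/\partial\alpha^j)=A^j_i$ and the identity $\partial A^j_i/\partial\alpha^j=0$. By (3.6), $A^j_i/J$ is the inverse of the deformation matrix, i.e. $\partial\alpha^j/\partial q^i$, so the chain rule gives $A^j_i\,\partial\lambda/\partial\alpha^j = J\,\partial\lambda/\partial q^i$. Using (3.5) we then get $$\rho\ddot{q}_i = -\frac{\partial \lambda}{\partial q^i} \;\;\; $$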

using vector notation and going over to Euler’s description we get $$\rho(\frac{\partial \vec{u}}{\partial t} + \vec{u}\cdot\vec{\nabla}\vec{u})=-\vec{\nabla}\lambda \;\;\; (3.12) $$

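In other derivations of (3.12), $-\vec{\nabla}P$ appears in place of $-\vec{\nabla}\lambda$, where $P$ is the pressure. Therefore we can identify the multiplier with the pressure: the pressure gradient is the constraint force that keeps a fluid from changing its density (i.e. that makes it incompressible). To get the full expression we just need to add an external potential $\Phi$ (per unit mass, e.g. gravity) and the internal friction term with viscosity coefficient $\eta$: $$\rho\left(\frac{\partial \vec{u}}{\partial t} + \vec{u}\cdot\vec{\nabla}\vec{u}\right)=-\vec{\nabla}P - \rho\vec{\nabla}\Phi + \eta\nabla^2\vec{u} \;\;\; (3.13) $$ Together with the incompressibility condition $\vec{\nabla}\cdot\vec{u}=0$, which is the Eulerian counterpart of $J=1$, these are the Navier-Stokes equations for an incompressible fluid.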
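
As a small illustration of how the terms of the Navier-Stokes equation balance, consider steady flow between two parallel plates driven by a constant pressure drop (plane Poiseuille flow), a standard textbook solution that is not derived in this article. The velocity points along $x$ and depends only on $y$, so $\vec{\nabla}\cdot\vec{u}=0$ and $\vec{u}\cdot\vec{\nabla}\vec{u}=0$ hold automatically, and without an external potential the $x$ component reduces to $0=-\partial P/\partial x+\eta\,\partial^2 u/\partial y^2$. The names `gap`, `pressure_drop` and `eta` and their default values are assumptions made for this sketch.

```python
def poiseuille_velocity(y, gap=1.0, pressure_drop=2.0, eta=0.5):
    """Parabolic profile u(y) that vanishes at the walls y = 0 and y = gap.
    pressure_drop is -dP/dx, the pressure decrease per unit length."""
    return pressure_drop / (2.0 * eta) * y * (gap - y)

def x_momentum_residual(y, gap=1.0, pressure_drop=2.0, eta=0.5, h=1e-3):
    """Residual of 0 = -dP/dx + eta * d^2u/dy^2, using a central second difference."""
    def u(s):
        return poiseuille_velocity(s, gap, pressure_drop, eta)
    d2u_dy2 = (u(y + h) - 2.0 * u(y) + u(y - h)) / h ** 2
    return pressure_drop + eta * d2u_dy2

# Residuals at a few heights between the plates; each should be close to zero.
residuals = [x_momentum_residual(y) for y in (0.1, 0.25, 0.5, 0.9)]
```

The residuals vanish up to rounding, so in this flow the pressure gradient is balanced entirely by the viscous term.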

Conclusion and (Re)sources

Conclusion

We have seen the Lagrange multiplier formalism for constrained systems and how it can be applied not only to mechanical problems like the pendulum, to calculate the constraint forces, but also to fundamental matters like the theory of fluids, to see physical forces in a different light.
Although we mentioned the geometric interpretation of constraints, there is a whole lot more depth to it. If one is interested, one should look into d’Alembert’s principle and the principle of virtual displacements found in classical mechanics textbooks.
The second article in this series is going to go into constraints in the Hamiltonian formalism of mechanics and how it’s used to quantize fields.

Sources and Further Reading

The motivation for this two-parter came from the constraints section of Theocharis Apostolatos’s book on classical mechanics. (I don’t believe there is an English translation of the book, which is a great pity.)
A fully mathematical and geometric approach to the subject of constraints and Lagrange multipliers is presented in Arnold’s “Mathematical Methods of Classical Mechanics”.
A deeper dive into constraints and an exploration of the principle of virtual displacements can be found in Lanczos’s “The Variational Principles of Mechanics”.
The derivation of Schrodinger’s equation from constraints is mentioned in Landau and Lifshitz’s “Quantum Mechanics (Non-Relativistic Theory)”.
The derivation of the pressure Lagrange multiplier and the basic theory of fluids is taken from the article “Lagrangian and Dirac constraints for the ideal incompressible fluid and magnetohydrodynamics” by P.J. Morrison. As one can tell from the title, the article also touches on the Hamiltonian/Dirac formalism and goes into magnetohydrodynamics, neither of which concerns us here.

¹ See the last article on principles.
² $\delta^i_j$ is the Kronecker delta and we assume the summation convention.