7

I'm studying mechanics from Goldstein's book (Classical Mechanics) and Spivak's book (Physics for Mathematicians), and I'm unsure about the physical intuition behind the inertia tensor. In both books, the inertia tensor appears naturally when computing the angular momentum $L$ of a rigid body which, for simplicity, is only rotating.

The inertia tensor is then defined as the linear operator $I : \mathbb{R}^3 \to \mathbb{R}^3$ given by

$$I(\phi) = \sum_{i} m_i b_i \times (\phi \times b_i),$$

where $b_i\in \mathbb{R}^3$ are the initial positions of the particles of the body, and $m_i$ their masses. With this definition, it is shown that

$$L = I(\omega),$$

where $\omega$ is the angular velocity of the rigid body. All of that, from the mathematical point of view, is fine.
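For concreteness, here is a minimal sketch of the definition above in plain Python. The three masses and positions are made-up illustrative values, not taken from either book, and the helper names (`cross`, `inertia_apply`, `inertia_matrix`) are hypothetical.

```python
# Sketch: evaluate I(phi) = sum_i m_i * b_i x (phi x b_i) for point masses.
# The masses and positions below are arbitrary illustrative values.

def cross(u, v):
    """Cross product of two 3-vectors given as tuples."""
    return (u[1] * v[2] - u[2] * v[1],
            u[2] * v[0] - u[0] * v[2],
            u[0] * v[1] - u[1] * v[0])

def inertia_apply(masses, positions, phi):
    """Return I(phi) by summing m_i * b_i x (phi x b_i) over the particles."""
    total = [0.0, 0.0, 0.0]
    for m, b in zip(masses, positions):
        term = cross(b, cross(phi, b))
        for k in range(3):
            total[k] += m * term[k]
    return tuple(total)

def inertia_matrix(masses, positions):
    """Matrix of I in the standard basis: column k is I(e_k)."""
    basis = [(1.0, 0.0, 0.0), (0.0, 1.0, 0.0), (0.0, 0.0, 1.0)]
    cols = [inertia_apply(masses, positions, e) for e in basis]
    # Transpose so that matrix[j][k] is the j-th component of I(e_k).
    return [[cols[k][j] for k in range(3)] for j in range(3)]

masses = [1.0, 2.0, 3.0]
positions = [(1.0, 0.0, 0.0), (0.0, 1.0, 1.0), (1.0, 2.0, 0.0)]
omega = (0.0, 0.0, 1.0)  # unit angular velocity about the z axis

L = inertia_apply(masses, positions, omega)
I = inertia_matrix(masses, positions)
print("L =", L)
for row in I:
    print(row)
```

With these particular values, $L$ comes out as $(0, -2, 18)$ for $\omega$ along $z$, so $L$ is not parallel to $\omega$, which is exactly what a scalar "rotational mass" could not reproduce.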

Now, what is the physical intuition behind this? The linear operator $I$ allows one to relate, in a linear way, the angular velocity and the angular momentum. This looks much like the way mass linearly relates velocity and momentum. But in the latter case, mass is a scalar, while $I$ is a linear transformation.

What is, then, the best way to physically understand the inertia tensor?

Qmechanic
  • 201,751
Gold
  • 35,872

5 Answers

2

You are right in saying that $I$ allows one to relate angular velocity and angular momentum in a linear way. It is just not as simple as the momentum and velocity case. An intuition for why things get complicated is that $L = r \times p$ involves a cross product, which makes it very sensitive to the choice of a specific orthonormal basis (with a fixed origin), whereas $p = mv$ involves a scalar mass that is independent of your choice of coordinates.

To explain the inertia tensor, I guess we could start with simpler cases where sufficient symmetry is present (for example a sphere in 3D or a circular pancake in 2D). There, $L = I(\omega)$ reduces to $L = I\omega$, where $L$ and $\omega$ are vectors and $I$ is just a scalar. An intuition for this reduction is that symmetry makes $I$ resemble $m$ more. As I mentioned earlier, mass is always independent of coordinate choice, but $I$ is only independent of the coordinate choices that PRESERVE the symmetry. Therefore, spheres and circular pancakes are pretty easy to deal with and no inertia tensor is necessary.

But for a general, extended, rigid body in 3D, the lack of symmetry breaks the simple proportionality between $L$ and $\omega$. Suppose you have an orthonormal basis whose origin is a corner of a cube and whose axes line up with the edges of the cube. Basically, when the cube is rotated around the $z$-axis, all the parts of the cube are also instantaneously rotating about the other directions (if you draw a diagram, it becomes clear). Therefore $\omega_z$ affects $L_y$ and $L_x$. I don't see an intuitive explanation of the quantitative details. But this simple cube example shows that $L_x, L_y, L_z$ must each be a linear combination of $\omega_x, \omega_y, \omega_z$. And the mathematical expression that quantifies this must be a matrix.
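To make the cube example quantitative, here is a rough sketch. As a simplifying assumption it replaces the solid cube by eight unit point masses at the corners of a unit cube, and it also evaluates the same spin about the cube's centre for comparison (that second case is my addition). The function names are hypothetical.

```python
from itertools import product

def cross(u, v):
    """Cross product of two 3-vectors given as tuples."""
    return (u[1] * v[2] - u[2] * v[1],
            u[2] * v[0] - u[0] * v[2],
            u[0] * v[1] - u[1] * v[0])

def angular_momentum(masses, positions, omega):
    """L = sum_i m_i * b_i x (omega x b_i) for a body that is only rotating."""
    total = [0.0, 0.0, 0.0]
    for m, b in zip(masses, positions):
        term = cross(b, cross(omega, b))
        for k in range(3):
            total[k] += m * term[k]
    return tuple(total)

# Assumption: eight unit point masses at the corners of a unit cube
# stand in for the solid cube discussed in the text.
corners = [tuple(float(c) for c in p) for p in product((0, 1), repeat=3)]
masses = [1.0] * len(corners)
omega_z = (0.0, 0.0, 1.0)

# Origin at a corner, axes along the edges.
L_corner = angular_momentum(masses, corners, omega_z)

# Same body, origin moved to the centre of the cube.
centered = [(x - 0.5, y - 0.5, z - 0.5) for (x, y, z) in corners]
L_center = angular_momentum(masses, centered, omega_z)

print("origin at corner:", L_corner)
print("origin at centre:", L_center)
```

With the origin at the corner, spinning about $z$ gives $L = (-2, -2, 8)$, so $\omega_z$ really does feed $L_x$ and $L_y$. About the centre the off-axis parts cancel and $L = (0, 0, 4)$ is parallel to $\omega$.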

*A mathematical sidetrack: this matrix itself is not a tensor, but rather a REPRESENTATION of a tensor that maps angular velocity vectors to angular momentum DUAL vectors. In abstract index notation, $L_\alpha = I_{\alpha\beta}\omega^\beta$. You will see a lot of similar notation in E&M, relativity, etc.

Zhengyan Shi
  • 2,957
2

So, there are two distinct things that we're talking about here. It looks like you don't mind some advanced notation so I'll try to use that to illustrate the mathematical side of the physics I'm talking about.

Rigidity and the axis of rotation

One of the things that we're talking about is that the object is rigid, meaning that it's composed of a bunch of particles whose distances are fixed. The mathematical way to say this is that its configuration at any time must be related to its initial configuration by an isometry, and the group of isometries is the semidirect product $T(3) \rtimes O(3).$ In fact, the isometry needs to be continuously connected to the identity, so in normal infinite Euclidean space we can specialize to $T(3) \rtimes SO(3),$ translations plus rotations. Just to work out the basic idea, let Greek indices be coordinates and Latin indices be particles, so that we can use Einstein summation on Greek indices. I will try to use some explicit metric tensors $g_{\alpha\beta}$ to denote the dot product, to keep Latin and Greek indices visibly separate. The particles have position vectors $r_n^\alpha$ but the distances between those particles are constant, so $$\frac{d}{dt} \left[g_{\alpha\beta} ~(r_m^\alpha - r_n^\alpha)~(r_m^\beta - r_n^\beta)\right] = 2 g_{\alpha\beta} ~(r_m^\alpha - r_n^\alpha)~(\dot r_m^\beta - \dot r_n^\beta) = 0.$$ Given an orientation (a totally antisymmetric $[0,\;n]$ tensor $\epsilon$ on our $\mathbb R^n$ space) we can describe the rotation with an $[n-2,\;0]$ tensor $\Omega$ as $$\dot r^\beta_m = \chi^\beta + g^{\beta\gamma}\epsilon_{\gamma\delta\Lambda}~\Omega^\Lambda~r_m^\delta.$$ This functional form causes the above term to be $\epsilon_{\alpha\delta\Lambda}~\Omega^\Lambda~R_{mn}^\alpha~R_{mn}^\delta = 0$ due to the antisymmetry of the $\epsilon$ term, where the exact form of $R^\alpha_{mn} = r^\alpha_m - r^\alpha_n$ doesn't matter, just as $\Omega^\Lambda$ doesn't matter, in deriving that $0$. It's purely from antisymmetry.
In 2D, $\Omega^\Lambda$ is a scalar; in 3D it is a vector; in higher dimensions it is a tensor, but in each case it turns into this antisymmetric matrix $\Omega_{\gamma\delta} = \epsilon_{\gamma\delta\Lambda} ~\Omega^\Lambda.$ The fact that the rotations can all be represented by these antisymmetric matrices is going to be very useful in a moment. I am not sure whether in higher dimensions other terms also pop out; my thinking was just "you either need velocity to vanish directly or to be perpendicular to the position."

Angular momentum

The other is that, because the formulas for the kinetic and potential energies do not depend on orientation (a continuous symmetry), there is a conserved Noether current associated with that symmetry: angular momentum. In this case our Lagrangian has the form $$\frac12 \sum_n m_n g_{\alpha\beta} \dot r_n^\alpha \dot r_n^\beta - \sum_{mn} U_{mn}\left[g_{\alpha\beta} ~ (r_m^\alpha - r_n^\alpha)~(r_m^\beta - r_n^\beta)\right],$$ for some set of strong potentials $U_{mn}$ which force the body to remain rigid. Since those are rotationally symmetric, and the kinetic term is rotationally symmetric, Noether's theorem says that for any symmetry-obeying infinitesimal displacement $\delta r^\alpha$ we pick up a conserved quantity $$Q = \sum_n \frac{\partial L}{\partial \dot r_n^\alpha} ~\delta r_n^\alpha.$$ Again, the rotation always turns into an antisymmetric matrix, $\delta r_n^\alpha = g^{\alpha\mu}~\delta\phi_{\mu\nu}~r_n^{\nu},$ so the above becomes: $$Q = \sum_n m_n g_{\alpha\beta}~\dot r_n^\beta ~g^{\alpha\mu}~\delta\phi_{\mu\nu}~r_n^{\nu}= \delta\phi_{\mu\nu}~\sum_n m_n~\dot r_n^\mu ~r_n^{\nu} = \delta\phi_{\mu\nu}~Q^{\mu\nu}.$$ This conserved quantity therefore has a $[2,\;0]$ tensor character, as we can choose any axis for this rotation. Moreover, any symmetric part of this tensor will get nuked by the antisymmetry of the rotation, so without loss of generality we can antisymmetrize it, too. The usual notation for this is to write $Q^{[\mu\nu]}$ with square brackets, $$Q^{[\mu\nu]} = \frac12 (Q^{\mu\nu}-Q^{\nu\mu})=\sum_n m_n ~\dot r_n^{[\mu} ~r_n^{\nu]}.$$

Bringing them together

We've seen two fundamentally different expressions here: one is the angular velocity tensor $\Omega_{\alpha\beta}$, which stems from the rigidity of the system; the other is the angular momentum tensor $Q^{\mu\nu},$ which stems from the symmetry in the equations of motion. They're obviously not defined the same, but they both turn out to be antisymmetric. What is the relation between them? That's easy: substitute the rotational term from the $\Omega$ expression for $\dot r_n^\beta$ into the same $\dot r$ term in the kinetic energy to find:$$Q^{[\mu\nu]} = -\left[\sum_n m_n~r_n^{[\nu}~g^{\mu]\gamma}~r_n^{\delta} \right]~\Omega_{\gamma\delta}.$$Here we see that a $[4,\;0]$ tensor is linearly relating them and it has $\nu$-$\delta$ and $\mu$-$\gamma$ symmetry but $\nu$-$\mu$ antisymmetry. So they always stand in a direct relationship via this moment-of-inertia tensor, and that's basically because particle velocities about the center of mass both contribute directly to the angular momentum and are directly determined by a rotation.

Of course in $\mathbb R^3$ it happens to be easier to work with the angular momentum and angular velocity vectors; we write $Q^{\mu\nu}$ as $Q_\lambda~\epsilon^{\lambda\mu\nu}$ and notice that for the most common orientation (where $\epsilon_{123} = 1$) we have $\epsilon^{\alpha\beta\gamma}\epsilon_{\beta\gamma\delta} = 2\delta^\alpha_\delta,$ so that $$Q_\lambda = \frac12 \left[\sum_n m_n~\epsilon_{\lambda\mu\nu}~g^{\mu\gamma}~r_n^\nu ~r_n^\delta~\epsilon_{\gamma\delta\kappa} \right] \Omega^\kappa.$$So in $\mathbb R^3$ we find a direct $[0,\;2]$-tensorial relationship between the same quantities, because both of the angular-momentum matrices are secretly angular-momentum vectors.

I'm not entirely sure I've done all of the details properly here, but that's the general story. The two concepts are different, in part because one contains "mass" ideas that are different in different directions and the other does not; but they turn out to be linearly related through the terms $\dot r_n^\alpha$. They are secretly antisymmetric matrices with a linear relation, but they can be down-converted in $\mathbb R^2$ to scalars or in $\mathbb R^3$ to vectors or in $\mathbb R^4$ to pairs of vectors in $\mathbb R^3.$

CR Drost
  • 37,682
2

To get intuition, I recommend starting not with the definition, but with the problem you're trying to solve. Namely, given a rigid body $B$, with a certain angular velocity $\vec{\omega}$ about its center of mass, what is its angular momentum $\vec{L}$? Define the "moment of inertia" to be the property of $B$ that determines the value of $\vec{L}$ given $\vec{\omega}$.

Defined this way, the "moment of inertia" could in principle be some horribly complicated function that depends in some nonlinear way on the shape of the rigid body. After all, there are infinitely many possible input vectors $\vec{\omega}$, and a priori, they could give rise to values of $\vec{L}$ according to some crazy calculation that requires you to keep explicit track of the exact position of every particle in the rigid body.

What saves us from this nightmare is that the output vector $\vec{L}$ depends linearly on the input vector $\vec{\omega}$. How can we see that this is true? Well, it's pretty intuitive that if you double $\vec{\omega}$ then you'll double $\vec{L}$, so that's a good start. What's less immediately obvious is that if $\vec{L}_1$ is the angular momentum corresponding to a certain angular velocity $\vec{\omega}_1$, and $\vec{L}_2$ is the angular momentum corresponding to a certain angular velocity $\vec{\omega}_2$, then the angular momentum corresponding to $\vec{\omega}_1 + \vec{\omega}_2$ is $\vec{L}_1 + \vec{L}_2$. But intuitively, you can convince yourself of this if you imagine the special case where $\vec{\omega}_1$ points along the $x$-axis and $\vec{\omega}_2$ points along the $y$-axis, and you calculate the angular momentum of an infinitesimal element of the rigid body at $(x,y)$ rotating first according to $\vec{\omega}_1$, then according to $\vec{\omega}_2$, and finally according to $\vec{\omega}_1 + \vec{\omega}_2$. And if linearity holds at the infinitesimal level, then it must hold for the whole rigid body, because integration is linear.
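The exercise with a single element can be checked numerically. The sketch below uses one point mass as a stand-in for the infinitesimal element, and assumes the usual rigid-rotation formulas $\vec v = \vec\omega \times \vec r$ and $\vec L = m\,\vec r \times \vec v$. The mass and position are arbitrary illustrative values, and the function names are hypothetical.

```python
def cross(u, v):
    """Cross product of two 3-vectors given as tuples."""
    return (u[1] * v[2] - u[2] * v[1],
            u[2] * v[0] - u[0] * v[2],
            u[0] * v[1] - u[1] * v[0])

def element_angular_momentum(m, r, omega):
    """L = m * r x (omega x r) for one mass element rotating rigidly."""
    v = cross(omega, r)          # velocity of the element
    rxv = cross(r, v)
    return tuple(m * c for c in rxv)

m = 2.0                          # illustrative mass
r = (1.0, 2.0, 3.0)              # illustrative position of the element
omega_1 = (1.0, 0.0, 0.0)        # rotation about the x axis
omega_2 = (0.0, 1.0, 0.0)        # rotation about the y axis
omega_sum = tuple(a + b for a, b in zip(omega_1, omega_2))

L_1 = element_angular_momentum(m, r, omega_1)
L_2 = element_angular_momentum(m, r, omega_2)
L_sum = element_angular_momentum(m, r, omega_sum)
L_double = element_angular_momentum(m, r, tuple(2.0 * c for c in omega_1))

print("L(w1)      =", L_1)
print("L(w2)      =", L_2)
print("L(w1 + w2) =", L_sum)
print("additive:", L_sum == tuple(a + b for a, b in zip(L_1, L_2)))
print("homogeneous:", L_double == tuple(2.0 * c for c in L_1))
```

Both checks come out true for this element, and since the cross product is linear in each argument the same holds for any other choice of $m$, $\vec r$, $\vec\omega_1$, $\vec\omega_2$ (up to floating-point rounding for non-integer inputs).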

Once you've established that $\vec{L}$ depends linearly on $\vec{\omega}$, you're basically done; that's what a rank 2 tensor is—something that gives you an output vector from an input vector in a linear way. So the moment of inertia isn't an incomprehensibly complicated function of the rigid body after all; because it's a rank 2 tensor in 3 dimensions, it's determined by (at most) 9 numbers. These numbers capture the body's resistance to torque in three independent directions, and linearity lets us derive its resistance to torque in any direction.

By the way, this way of thinking also helps one see that the moment of inertia is some physical property of the object that enjoys an existence independent of our choice of coordinate system. The angular velocity and the angular momentum are physical, and the relationship between them depends just on the physical structure of the rigid body. Thinking of the moment of inertia tensor as a coordinate-independent linear relationship between coordinate-independent quantities (vectors) lets us derive the transformation law when we change coordinates (as opposed to taking the transformation law as the definition of a tensor, which I have always found to be very confusing.)

One more remark. The aforementioned exercise involving an infinitesimal element of the rigid body helps us see why the moment of inertia tensor isn't described by just a single number. That is, $\vec{L}$ isn't just a scalar multiple of $\vec{\omega}$, because the same angular speed about two different axes will give rise to different magnitudes of the angular momentum, depending on how far the mass is from the axis of rotation. So we need more than one number. Now, as you are no doubt aware, the moment of inertia tensor is actually a symmetric tensor, so you don't actually need 9 numbers to describe it; 6 numbers suffice. This symmetry doesn't follow just from the general fact that $\vec{L}$ depends on $\vec{\omega}$ in a linear way; it is a special fact about the mechanics of rigid bodies. But I think the main obstacle to intuition is grasping what a rank 2 tensor means physically, and that is what I have focused on here.

0

The inertia tensor is the object which tells us how angular velocity is converted into kinetic energy or angular momentum, and therefore it plays a role similar to the one mass plays in rectilinear motion. To physically understand why this conversion factor is just a number in one case but a tensor in the other, we just have to note that both quantities represent the total inertia of the system.

Due to isotropy of space, the inertia of the particle in rectilinear motion is completely determined by just one parameter, mass. In rotational motion however, different rotation axes of the same body show in general different inertia and a single scalar will not be enough to describe how angular velocity is converted into kinetic energy. To completely describe the body inertia with respect to a given point we need in general six parameters, three to fix the orientation of the coordinate axes and three to quantify the inertia with respect to each of these axes.

With six numbers to be specified, the inertia of the body requires at least a symmetric second-rank tensor to represent it. If the body has particular symmetries, the number of independent parameters is reduced. For example, consider a homogeneous sphere centered at the origin, which is fixed. With respect to that point, every orientation of the axes is equivalent, so we do not need any parameters to fix the coordinate system. Moreover, rotations about each of the three axes are also equivalent, so the inertia must be the same. Hence, the inertia of a homogeneous sphere is described by a single scalar, and the inertia tensor is a multiple of the identity tensor.
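A small numerical sketch of this reduction, under stated assumptions: instead of a homogeneous sphere it uses six equal point masses at $\pm a$ on each axis (a discrete stand-in with the same equivalence of the three axes), and compares it with a two-mass dumbbell along $z$. It relies on the standard component form $I_{jk} = \sum_i m_i\,(|b_i|^2\delta_{jk} - b_{i,j}\,b_{i,k})$, which follows from expanding the double cross product. The names are hypothetical.

```python
def inertia_matrix(masses, positions):
    """I_jk = sum_i m_i * (|b_i|^2 * delta_jk - b_i[j] * b_i[k])."""
    I = [[0.0] * 3 for _ in range(3)]
    for m, b in zip(masses, positions):
        r2 = b[0] ** 2 + b[1] ** 2 + b[2] ** 2
        for j in range(3):
            for k in range(3):
                delta = 1.0 if j == k else 0.0
                I[j][k] += m * (r2 * delta - b[j] * b[k])
    return I

a = 1.0  # illustrative distance of each mass from the origin

# Assumption: six unit masses at +/-a on each axis stand in for the sphere.
symmetric_body = [(a, 0.0, 0.0), (-a, 0.0, 0.0),
                  (0.0, a, 0.0), (0.0, -a, 0.0),
                  (0.0, 0.0, a), (0.0, 0.0, -a)]
I_sym = inertia_matrix([1.0] * 6, symmetric_body)

# A less symmetric body: two unit masses on the z axis.
dumbbell = [(0.0, 0.0, a), (0.0, 0.0, -a)]
I_rod = inertia_matrix([1.0] * 2, dumbbell)

print("symmetric body:", I_sym)
print("dumbbell:      ", I_rod)
```

The symmetric arrangement gives $4$ times the identity, so a single scalar describes it. The dumbbell gives $\mathrm{diag}(2, 2, 0)$: the same angular speed about $x$ and about $z$ produces different angular momenta, and one number is no longer enough.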

Diracology
  • 17,700
0

I ran into the same problem until I recently understood the meaning behind the indices.

Although there are a lot of definitions of a tensor, we are left to decide which one is palatable for the kind of context we are in.

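$I_{xy}$ tells you how much angular momentum about the $x$ axis the body picks up per unit angular velocity about the $y$ axis, as in $L_x = I_{xx}\omega_x + I_{xy}\omega_y + I_{xz}\omega_z$. Equivalently, for a body starting from rest, it is the torque about the $x$ axis needed per unit angular acceleration about the $y$ axis.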