Congruence transformations of matrices

Question

From the book Analytical Mechanics by Fowles and Cassiday I am studying classical coupled harmonic oscillators. These are systems governed by a system of linear second-order differential equations of the form $\mathbf{M} \ddot{\mathbf{q}}+\mathbf{K}\mathbf{q} = 0$. Here you want to solve for $\mathbf{q}$ as a function of time $t$, and $\mathbf{M},\mathbf{K}$ are square matrices. Plugging in the trial solution $\mathbf{q} = \mathbf{a} \cos (\omega t - \delta)$ for undetermined $\mathbf{a}, \omega, \delta$ gives the system of equations $(\mathbf{K}-\omega^2\mathbf{M})\mathbf{a}\cos(\omega t-\delta) = 0$.

To find non-trivial solutions you want to find the roots $\omega^2_1, \dots, \omega^2_k$ of $\det(\mathbf{K}-\omega^2 \mathbf{M})$, viewed as a polynomial in $\omega^2$, and then calculate $\ker(\mathbf{K}-\omega^2_i \mathbf{M})$ for $i=1,\dots, k$.

Now suppose the kernels $\ker(\mathbf{K}-\omega_i^2\mathbf{M}), i=1,\dots,k$ span the whole linear space, thus you have a basis of "eigenvectors" $\mathbf{a}_1, \dots, \mathbf{a}_n$ (I use quotes because strictly speaking they are not eigenvectors). Then you can make a basis transform matrix $\mathbf{A}$ with the vectors $\mathbf{a}_i$ as columns.
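As a rough illustration of the procedure just described, here is a minimal Python sketch for the $2\times 2$ case only. The matrices `K` and `M` (two unit masses coupled by three identical unit springs) and the helper names `secular_roots_2x2` and `kernel_vector_2x2` are assumptions made for this example; they are not taken from the book.

```python
import math

def secular_roots_2x2(K, M):
    """Real roots w2 of det(K - w2*M) = 0 for 2x2 K, M (assumes det(M) != 0)."""
    # Expanding the determinant gives a*w2**2 + b*w2 + c
    a = M[0][0] * M[1][1] - M[0][1] * M[1][0]
    b = -(K[0][0] * M[1][1] + K[1][1] * M[0][0]
          - K[0][1] * M[1][0] - K[1][0] * M[0][1])
    c = K[0][0] * K[1][1] - K[0][1] * K[1][0]
    root = math.sqrt(b * b - 4 * a * c)  # raises ValueError if the roots are complex
    return sorted([(-b - root) / (2 * a), (-b + root) / (2 * a)])

def kernel_vector_2x2(K, M, w2):
    """One vector spanning ker(K - w2*M), assuming that matrix is singular but nonzero."""
    B = [[K[i][j] - w2 * M[i][j] for j in range(2)] for i in range(2)]
    row = B[0] if any(B[0]) else B[1]
    return [-row[1], row[0]]  # orthogonal to a nonzero row of a rank-one matrix

# Illustrative (assumed) system: two unit masses, three unit springs
K = [[2, -1], [-1, 2]]
M = [[1, 0], [0, 1]]
w2_values = secular_roots_2x2(K, M)
modes = [kernel_vector_2x2(K, M, w2) for w2 in w2_values]
```

For this choice the sketch gives $\omega^2 = 1$ and $\omega^2 = 3$ with kernel vectors $(1,1)$ and $(1,-1)$. The exact comparison with zero in `kernel_vector_2x2` is fragile for general floating-point input, so this is only meant to make the steps concrete.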

1) The book then asserts that the congruence transformations $\mathbf{A}^T \mathbf{K} \mathbf{A}$ and $\mathbf{A}^T \mathbf{M} \mathbf{A}$ are diagonal matrices. Why is this the case?

Edit: a counterexample is given by taking $\mathbf{M} = \mathbf{K} = \begin{bmatrix} 1 & 1 \\ 1 & -1 \end{bmatrix}$, so that $\omega^2 = 1$ is the only root of the determinant equation and one may take $\mathbf{A} = \mathbf{I}_2$. Then the congruence transformations are just the matrices themselves: $\mathbf{A}^T \mathbf{K}\mathbf{A} = \mathbf{K}$ and $\mathbf{A}^T\mathbf{M}\mathbf{A} = \mathbf{M}$.
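
To spell out this check (a short worked computation using only the matrices above): since $\mathbf{K}=\mathbf{M}$,

$$\det(\mathbf{K}-\omega^2\mathbf{M}) = \det\big((1-\omega^2)\,\mathbf{M}\big) = (1-\omega^2)^2\det\mathbf{M} = -2\,(1-\omega^2)^2 ,$$

so $\omega^2=1$ is a double root, $\mathbf{K}-\mathbf{M}=\mathbf{0}$, and its kernel is all of $\mathbb{R}^2$. Any basis can then serve as the columns of $\mathbf{A}$; with the choice $\mathbf{A}=\mathbf{I}_2$ the off-diagonal entries of $\mathbf{A}^T\mathbf{K}\mathbf{A}=\mathbf{K}$ equal $1$, so the result is not diagonal.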

So the followup question is: which assumptions on $\mathbf{M}$ and $\mathbf{K}$ must be added for this assertion to hold?

2) What is the intuition behind such a congruence transformation? For a similarity transformation from a matrix $\mathbf{B}$ to $\mathbf{D}=\mathbf{P}^{-1} \mathbf{B} \mathbf{P}$ I can interpret this intuitively as: going from the basis $\mathbf{P}\mathbf{e}_1, \dots, \mathbf{P}\mathbf{e}_n$ to the basis $\mathbf{e}_1, \dots, \mathbf{e}_n$. Is there a similar interpretation possible for congruence transformations as well?

Related : Eigenvalue equation for kinetic and potential energy. — Frobenius, Aug 31 '21 at 11:27

Cosmas Zachos · Answer 1 · 2020-01-24T16:48:46.123

I don't have your book, and I would be reluctant to shadow-box and misread it by virtually reverse-engineering it... The admittedly confusing point in the principal axis transformation you are considering is treated meticulously and nicely in Goldstein's Classical Mechanics book, Ch 10-2. You are basically right that arbitrary $\mathbf{M}$ and $\mathbf{K}$ will falsify your statement. Anticipating the stuff below, you are dealing with a sort of orthogonality in a non-Cartesian space, and the rampant generalization is hardly worth the fuss.

My counterexample would be to use hermitean Pauli matrices. So, blindly take a nasty "mass" matrix, $$\mathbf{M} = \sigma_2= \begin{bmatrix} 0 & -i \\ i & 0 \end{bmatrix}, $$ (which will lead to imaginary $\omega^2$s!) and a symmetric real potential one, $ \mathbf{K}= \sigma_1 =\begin{bmatrix} 0 & 1 \\ 1 & 0 \end{bmatrix}.$ Your equation of motion $\ddot{\mathbf{q}}= - \mathbf{M}^{-1} \mathbf{K}\mathbf{q}= i\sigma_3 \mathbf{q}$ is readily solved by $$ e^{\pm \sqrt{i} t} \begin{bmatrix} 1 \\ 0 \end{bmatrix} , ~~~\hbox {and } ~~~ e^{\pm \sqrt{-i}t} \begin{bmatrix} 0 \\ 1 \end{bmatrix}, $$ so your modal matrix $\mathbf{A}= I= \mathbf{A}^T$, pretty dismal for diagonalizing anything at all. (You would have found the same modal matrix from your determinant.)
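For readers who want to verify the Pauli-matrix algebra, here is a small plain-Python sketch. The helper names `matmul` and `is_diagonal` are made up for this illustration; the matrices are the ones quoted above, and the plain transpose is used because the question is about $\mathbf{A}^T$.

```python
def matmul(X, Y):
    """Plain-Python product of two nested-list matrices."""
    return [[sum(X[i][k] * Y[k][j] for k in range(len(Y)))
             for j in range(len(Y[0]))] for i in range(len(X))]

def is_diagonal(X):
    """True if every off-diagonal entry of the square matrix X is zero."""
    return all(X[i][j] == 0 for i in range(len(X))
               for j in range(len(X)) if i != j)

sigma1 = [[0, 1], [1, 0]]      # K in the counterexample
sigma2 = [[0, -1j], [1j, 0]]   # M in the counterexample
identity = [[1, 0], [0, 1]]

# sigma2 squares to the identity, so M^{-1} = sigma2
m_squared = matmul(sigma2, sigma2)
# -M^{-1} K, the matrix on the right-hand side of the equation of motion
rhs = [[-x for x in row] for row in matmul(sigma2, sigma1)]
# With A = I the congruence A^T K A is just K itself
congruent_K = matmul(identity, matmul(sigma1, identity))
```

Here `rhs` comes out as $i\sigma_3$, which is diagonal, so the unit vectors are indeed the modes, while `congruent_K` is $\sigma_1$ and is not diagonal.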

However, there are conditions on $\mathbf{M}$, just as on $\mathbf{K}$. It is normally real, symmetric, and positive definite, and will lead to$^\dagger$ real $\omega^2$. So you may first diagonalize it by an orthogonal transformation, and then absorb the positive eigenvalues of the resulting diagonal matrix in a redefinition/rescaling of the coordinates by their square root. As a result, the new $\mathbf{M}=I $ and usual real, symmetric $\mathbf{K}$'s devolve to real, symmetric ones.
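
One way to write this reduction out explicitly is sketched here; the symbols $\mathbf{O}$, $\mathbf{D}$, $\mathbf{S}$, $\mathbf{x}$ and $\mathbf{K}'$ are notation assumed for the sketch. For real, symmetric, positive definite $\mathbf{M}$,

$$\mathbf{M} = \mathbf{O}\mathbf{D}\mathbf{O}^T, \qquad \mathbf{D} = \operatorname{diag}(m_1,\dots,m_n),\ m_i>0, \qquad \mathbf{S}\equiv \mathbf{O}\mathbf{D}^{-1/2}, \qquad \mathbf{q}\equiv\mathbf{S}\mathbf{x},$$

$$\mathbf{S}^T\big(\mathbf{M}\ddot{\mathbf{q}}+\mathbf{K}\mathbf{q}\big) = \underbrace{\mathbf{S}^T\mathbf{M}\mathbf{S}}_{I}\,\ddot{\mathbf{x}} + \underbrace{\mathbf{S}^T\mathbf{K}\mathbf{S}}_{\mathbf{K}'}\,\mathbf{x} = 0 ,$$

with $\mathbf{O}$ orthogonal. The matrix $\mathbf{K}'$ is again real and symmetric whenever $\mathbf{K}$ is.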

But now your eigenvalue equation has devolved to $\mathbf{K}\mathbf{q} = \omega^2 \mathbf{q}$, with real eigenvalue, whose secular equation has devolved to $\det ( \mathbf{K} -\omega^2 I )=0$, while your modal matrix $\mathbf{A}= \mathbf{R}$ is just an orthogonal rotation, $ \mathbf{R}^T= \mathbf{A}^{-1}$, and it diagonalizes $\mathbf{K}$, leaving the identity mass matrix alone.

Now, the respectable crowd think of $\mathbf{M}$ as some sort of effective metric of the space of normal modes, but, as indicated, for real symmetric $\mathbf{M}$ and $\mathbf{K}$, the former with positive non-zero eigenvalues, seat-of-the-pants types may think of the congruence as a composition of rotations and a bland rescaling of coordinates, just a wrinkle on a humdrum diagonalization problem.
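
In energy language the same point can be sketched as follows. This assumes, as is standard for small oscillations, that the equation of motion comes from quadratic kinetic and potential energies; $\boldsymbol{\eta}$ is an assumed name for the new coordinates.

$$T=\tfrac12\,\dot{\mathbf{q}}^T\mathbf{M}\,\dot{\mathbf{q}},\quad V=\tfrac12\,\mathbf{q}^T\mathbf{K}\,\mathbf{q},\qquad \mathbf{q}=\mathbf{A}\boldsymbol{\eta}\ \implies\ T=\tfrac12\,\dot{\boldsymbol{\eta}}^T\big(\mathbf{A}^T\mathbf{M}\mathbf{A}\big)\dot{\boldsymbol{\eta}},\quad V=\tfrac12\,\boldsymbol{\eta}^T\big(\mathbf{A}^T\mathbf{K}\mathbf{A}\big)\boldsymbol{\eta} .$$

So a congruence is what a change of basis does to the matrix of a quadratic form, whereas a similarity $\mathbf{A}^{-1}\mathbf{B}\mathbf{A}$ is what the same change of basis does to the matrix of a linear map. The two coincide only when $\mathbf{A}^T=\mathbf{A}^{-1}$.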

Here is the simplest illustration I could think of. Take $$ \mathbf{K} =\begin{bmatrix} 0 & 1 \\ 1 & 0 \end{bmatrix}, ~~~~~\hbox{but} ~~~~ \mathbf{M} =\begin{bmatrix} 4 & 0 \\ 0 & 1 \end{bmatrix} .$$ The mass matrix is not invariant under rotations, so we could rotate both matrices by something to make it non-diagonal, but suppose you did the converse already.

Then start from the deconstruction I outlined. Rescale $\mathbf{q} \equiv \mathbf{S} \mathbf{x} $ with $\mathbf{S} = \mathbf{S}^T = \operatorname{diag}(1/2, 1)$, so that $$ \mathbf{S}\mathbf{K}\mathbf{S}\mathbf{x}= \omega^2 \mathbf{x} $$ is now a bona fide eigenvalue equation! (It so happens that the lhs matrix is $\mathbf{K} /2$ here.)

The eigenvectors for the symmetric $\mathbf{S}\mathbf{K}\mathbf{S}$ are the usual ones for $\sigma_1$, $$ \frac{1}{\sqrt{2}} \begin{bmatrix} 1 \\ \mp 1 \end{bmatrix} , $$ mutually orthogonal, so the modal matrix is now orthogonal, and diagonalizes this transformed potential matrix, while leaving the identity $\mathbf{S}\mathbf{M}\mathbf{S}=I$ alone, so also diagonal. Essentially trivial. How does this present in the congruence language of your question?

Solving the same system ab initio, but now without benefit of the above rotation and rescaling, yields null vectors $$ \mathbf{a}_{1,2}= \frac{1}{\sqrt{5}} \begin{bmatrix} 1 \\ \mp 2 \end{bmatrix} , $$ with real $\omega^2$ and an invertible modal matrix $$ \mathbf{A}=\frac{1}{\sqrt{5}} \begin{bmatrix} 1 & 1\\ -2 & 2 \end{bmatrix} , $$ which is definitely not orthogonal ($\propto \mathbf{S}\mathbf{R}$); but of course diagonalizes both $\mathbf{K}$ and $\mathbf{M}$ (rather, it leaves the latter diagonal) for evident reasons, considering the simple deconstruction above. A true equivalence relation. A change of basis to normal modes, $$\mathbf{A}\mathbf{e}_i=\mathbf{a}_i .$$
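The claim that this non-orthogonal $\mathbf{A}$ diagonalizes both matrices by congruence can be checked with exact rational arithmetic. The sketch below is only a verification aid; the helper names and the rescaled matrix `A0` $=\sqrt{5}\,\mathbf{A}$ (used to avoid square roots) are assumptions of this example.

```python
from fractions import Fraction as F

def matmul(X, Y):
    """Plain-Python product of two nested-list matrices."""
    return [[sum(X[i][k] * Y[k][j] for k in range(len(Y)))
             for j in range(len(Y[0]))] for i in range(len(X))]

def transpose(X):
    return [list(col) for col in zip(*X)]

def congruence(A, B):
    """Return A^T B A."""
    return matmul(transpose(A), matmul(B, A))

K = [[F(0), F(1)], [F(1), F(0)]]
M = [[F(4), F(0)], [F(0), F(1)]]
A0 = [[F(1), F(1)], [F(-2), F(2)]]   # sqrt(5) * A, kept rational

# A = A0 / sqrt(5), so A^T B A = (A0^T B A0) / 5
K_modal = [[x / 5 for x in row] for row in congruence(A0, K)]
M_modal = [[x / 5 for x in row] for row in congruence(A0, M)]

# Each omega_i^2 is the ratio of the corresponding diagonal entries
omega2 = [K_modal[i][i] / M_modal[i][i] for i in range(2)]
```

This gives $\mathbf{A}^T\mathbf{K}\mathbf{A}=\operatorname{diag}(-4/5,\,4/5)$ and $\mathbf{A}^T\mathbf{M}\mathbf{A}=\operatorname{diag}(8/5,\,8/5)$, hence $\omega^2=\mp 1/2$, matching the eigenvalues of $\mathbf{S}\mathbf{K}\mathbf{S}=\mathbf{K}/2$. One $\omega^2$ is negative because this particular $\mathbf{K}$ is not positive definite; the values are nevertheless real.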

Armed with that intuition, you might proceed to pick a formally acceptable path to the book's statements, probably along the lines of the footnote.

$\dagger$ Consider $$ \mathbf{a}_i^* \cdot (\mathbf{K}-\omega_i^2\mathbf{M})\mathbf{a}_i = 0 \implies \omega_i^2= \frac{\mathbf{a}_i^* \cdot \mathbf{K} \mathbf{a}_i}{\mathbf{a}_i^* \cdot \mathbf{M}\mathbf{a}_i} $$
without implied summation over mode indices $i$. So all $\omega_i^2$ are real.
You may likewise show the null vectors $\mathbf{a}_i$ are mutually orthogonal w.r.t. a metric $\mathbf{M}$, and orthonormalize them s.t. $ \mathbf{a}_i^* \cdot \mathbf{M}\mathbf{a}_j=\delta_{ij}$, as effectively done less formally above.
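
A sketch of that orthogonality argument, assuming real symmetric $\mathbf{K}$ and $\mathbf{M}$ and real null vectors: from $\mathbf{K}\mathbf{a}_i=\omega_i^2\mathbf{M}\mathbf{a}_i$ and $\mathbf{K}\mathbf{a}_j=\omega_j^2\mathbf{M}\mathbf{a}_j$,

$$\mathbf{a}_j^T\mathbf{K}\mathbf{a}_i=\omega_i^2\,\mathbf{a}_j^T\mathbf{M}\mathbf{a}_i, \qquad \mathbf{a}_j^T\mathbf{K}\mathbf{a}_i=\big(\mathbf{a}_i^T\mathbf{K}\mathbf{a}_j\big)^T=\omega_j^2\,\mathbf{a}_j^T\mathbf{M}\mathbf{a}_i \quad\implies\quad (\omega_i^2-\omega_j^2)\,\mathbf{a}_j^T\mathbf{M}\mathbf{a}_i=0 .$$

So for $\omega_i^2\neq\omega_j^2$ both $\mathbf{a}_j^T\mathbf{M}\mathbf{a}_i$ and $\mathbf{a}_j^T\mathbf{K}\mathbf{a}_i$ vanish, which are exactly the off-diagonal entries of $\mathbf{A}^T\mathbf{M}\mathbf{A}$ and $\mathbf{A}^T\mathbf{K}\mathbf{A}$. For a repeated root the null vectors are not forced to be $\mathbf{M}$-orthogonal, but with positive definite $\mathbf{M}$ they can be chosen so within each kernel (Gram-Schmidt with respect to $\mathbf{M}$).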

Eli · Answer 2 · 2020-01-25T17:28:55.753

Why $A^T\,M\,A$ and $A^T\,K\,A$ are diagonal matrices:

we want to solve this vector differential equation

$$\,M\,\vec{\ddot q}+K\,\vec{q}=0\tag 1$$ or $$\vec{\ddot q}+M^{-1}\,K\,\vec{q}=0\tag 2$$

to solve equation (2) we make this Ansatz:

$\vec{q}=\Re(\vec{a}\,e^{i\omega\,t})$

thus equation (2) becomes

$$\underbrace{(-\omega^2\,I+M^{-1}\,K)}_{E }\,\vec{a}=0\tag 3$$

with $\det(E)=0$ you get the eigenvalues $\omega_i^2$ and, for each $\omega_i^2$, the eigenvectors $\vec{a}_i$

where $\vec{a}_i^T\,\vec{a}_j=1 \quad \text{for } i=j$ and $\vec{a}_i^T\,\vec{a}_j=0 \quad \text{for } i\ne j$ (this assumes the eigenvectors can be chosen orthonormal, which is guaranteed when $M^{-1}\,K$ is symmetric)

the transformation matrix $A$ is built from the eigenvectors $\vec{a}_i$

$$A=\left[\vec{a}_1\,,\vec{a}_2\,,\ldots\,,\vec{a}_n\right]$$

thus: $$A^T\,M^{-1}\,K\,A=\Lambda$$ where $\Lambda$ is $n\times n$ diagonal matrix

$$\Lambda=\text{diagonal}\left[\omega_1^2\,,\omega_2^2\,,\ldots\,,\omega_n^2\right]$$

we can transform $\vec{q}$ with the matrix $A$ and get: $\vec{q}=A\,\vec{q}_m$; thus equation (1), multiplied from the left by $A^T$, becomes

$$A^T\,M\,A\,\vec{\ddot q}_m+A^T\,K\,A\,\vec{q}_m=0\tag 4$$

or: $$\vec{\ddot q}_m+\left(A^T\,M\,A\right)^ {-1}\,\left(A^T\,K\,A\right)\vec{q}_m=0\tag 5$$

with (using $A^{-1}=A^T$, which follows from the orthonormality of the $\vec{a}_i$ assumed above):

$$\underbrace{\left(A^TM\,A\right)^{-1}}_{Q_1} \underbrace{\left(A^T\,K\,A\right)}_{Q_2}= A^TM^{-1}AA^TKA=A^T\,M^{-1}KA=\Lambda$$

because $\Lambda$ is a diagonal matrix, $Q_1$ and $Q_2$ must be diagonal matrices, thus

$A^T\,M\,A$ and $A^T\,K\,A$ are diagonal matrices. q.e.d

Example:

$$M=K= \left[ \begin {array}{cc} 1&1\\ 1&-1\end {array} \right] $$

$$M^{-1}K=\begin{bmatrix} 1 &0 \\ 0 & 1 \\ \end{bmatrix}$$

thus the eigenvalues are : $\omega_1^2=\omega_2^2=1$

because the eigenvalues are equal and $M^{-1}K=I$, every nonzero vector is an eigenvector, so the eigenvectors have to be chosen; one choice, with columns orthogonal with respect to $M$, gives the transformation matrix $A=[\vec{a}_1\,,\vec{a}_2]$

$$A=\left[ \begin {array}{cc} 1&0\\ 1&1\end {array} \right] $$

$$A^TMA=A^TKA=\begin{bmatrix} 2 &0 \\ 0 & -1 \\ \end{bmatrix}$$
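The matrix product in this example is easy to check by machine. The plain-Python sketch below is only a verification aid and the helper names are made up for it.

```python
def matmul(X, Y):
    """Plain-Python product of two nested-list matrices."""
    return [[sum(X[i][k] * Y[k][j] for k in range(len(Y)))
             for j in range(len(Y[0]))] for i in range(len(X))]

def transpose(X):
    return [list(col) for col in zip(*X)]

M = [[1, 1], [1, -1]]   # M = K in the example
A = [[1, 0], [1, 1]]    # columns are a_1 = (1, 1) and a_2 = (0, 1)

# Congruence transformation A^T M A (identical to A^T K A since K = M)
congruent_M = matmul(transpose(A), matmul(M, A))
```

The result is `[[2, 0], [0, -1]]`, in agreement with the matrix quoted above.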

and the solution is the real part of this equation:

$$\vec{q}(t)=(c_1\vec{a}_1+c_2\vec{a}_2)e^{i\,t}$$

where $c_1$ and $c_2$ are complex constants.

with $c_1=c_{1R}+i\,c_{1I}\quad,c_2=c_{2R}+i\,c_{2I}$

you get the solution

$$q_1(t)=c_{1R}\cos(t)-c_{1I}\sin(t)$$ $$q_2(t)=(c_{1R}+c_{2R})\cos(t)-(c_{1I}+c_{2I})\sin(t)$$

you have four constants for four initial conditions
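
Written out for the solution above (a short worked step, with $q_k(0)$ and $\dot q_k(0)$ denoting the initial positions and velocities):

$$q_1(0)=c_{1R},\qquad \dot q_1(0)=-c_{1I},\qquad q_2(0)=c_{1R}+c_{2R},\qquad \dot q_2(0)=-(c_{1I}+c_{2I}),$$

so that

$$c_{1R}=q_1(0),\qquad c_{1I}=-\dot q_1(0),\qquad c_{2R}=q_2(0)-q_1(0),\qquad c_{2I}=\dot q_1(0)-\dot q_2(0).$$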
