I am studying classical coupled harmonic oscillators from the book Analytical Mechanics by Fowles and Cassiday. These systems are governed by a system of linear second-order differential equations of the form $\mathbf{M} \ddot{\mathbf{q}}+\mathbf{K}\mathbf{q} = 0$. Here you want to solve for $\mathbf{q}$ as a function of time $t$, and $\mathbf{M},\mathbf{K}$ are square matrices. Plugging in the trial solution $\mathbf{q} = \mathbf{a} \cos (\omega t - \delta)$ for undetermined $\mathbf{a}, \omega, \delta$ gives the system of equations $(\mathbf{K}-\omega^2\mathbf{M})\mathbf{a}\cos(\omega t-\delta) = 0$.
To find non-trivial solutions, you find the roots $\omega^2_1, \dots, \omega^2_k$ of $\det(\mathbf{K}-\omega^2 \mathbf{M})$, viewed as a polynomial in $\omega^2$, and then calculate $\ker(\mathbf{K}-\omega^2_i \mathbf{M})$ for $i=1,\dots, k$.
Now suppose the kernels $\ker(\mathbf{K}-\omega_i^2\mathbf{M}), i=1,\dots,k$ together span the whole linear space, so that you have a basis of "eigenvectors" $\mathbf{a}_1, \dots, \mathbf{a}_n$ (I use quotes because strictly speaking they are not eigenvectors). Then you can form a change-of-basis matrix $\mathbf{A}$ with the vectors $\mathbf{a}_i$ as columns.
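To make the procedure above concrete, here is a minimal numerical sketch. The matrices are a hypothetical example (two unit masses connected by three unit springs between fixed walls), not taken from the book, and the sketch assumes $\mathbf{M}$ is invertible so that the determinant condition can be rewritten as an ordinary eigenvalue problem for $\mathbf{M}^{-1}\mathbf{K}$.

```python
import numpy as np

# Hypothetical example system (an assumption, not from the book):
# two unit masses coupled by three unit springs between fixed walls.
M = np.array([[1.0, 0.0],
              [0.0, 1.0]])
K = np.array([[2.0, -1.0],
              [-1.0, 2.0]])

# Assuming M is invertible, det(K - w^2 M) = 0 is equivalent to
# w^2 being an eigenvalue of M^{-1} K.
omega_sq, A = np.linalg.eig(np.linalg.solve(M, K))

# Sort the roots omega_i^2 in increasing order, and reorder the columns
# of A (the vectors a_i) to match.
order = np.argsort(omega_sq)
omega_sq = omega_sq[order]
A = A[:, order]

# Each column a_i should lie in ker(K - omega_i^2 M); the residual norms
# measure how well that holds numerically.
residuals = [np.linalg.norm((K - w2 * M) @ A[:, i])
             for i, w2 in enumerate(omega_sq)]

print("omega^2:", omega_sq)
print("kernel residuals:", residuals)
```

For this particular example the roots are $\omega^2 = 1$ and $\omega^2 = 3$, and the columns of `A` play the role of $\mathbf{a}_1, \mathbf{a}_2$.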
1) The book then asserts that the congruence transformations $\mathbf{A}^T \mathbf{K} \mathbf{A}$ and $\mathbf{A}^T \mathbf{M} \mathbf{A}$ are diagonal matrices. Why is this the case?
Edit: a counterexample is given by taking $\mathbf{M} = \mathbf{K} = \begin{bmatrix} 1 & 1 \\ 1 & -1 \end{bmatrix}$, so that $\omega^2 = 1$ is the only root of the determinant equation. Since $\mathbf{K}-\mathbf{M} = 0$, the kernel is the whole space and one may choose $\mathbf{A} = \mathbf{I}_2$. Then the congruence transformations are just the matrices themselves: $\mathbf{A}^T \mathbf{K}\mathbf{A} = \mathbf{K}$ and $\mathbf{A}^T\mathbf{M}\mathbf{A} = \mathbf{M}$, which are not diagonal.
So the follow-up question is: which assumptions on $\mathbf{M}$ and $\mathbf{K}$ must be added for this assertion to hold?
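The counterexample from the edit can be checked numerically. The sketch below uses NumPy; the helper `is_diagonal` is a hypothetical name introduced only for this illustration.

```python
import numpy as np

# The matrices from the counterexample: M = K, symmetric but indefinite.
M = np.array([[1.0, 1.0],
              [1.0, -1.0]])
K = M.copy()

# M is invertible here (det = -2), so the roots of det(K - w^2 M)
# are the eigenvalues of M^{-1} K, which is the identity matrix.
omega_sq = np.linalg.eigvals(np.linalg.solve(M, K))

# With omega^2 = 1 the matrix K - omega^2 M vanishes, so every vector is
# in the kernel and A = I_2 is an admissible choice of basis.
A = np.eye(2)
kernel_residual = np.linalg.norm((K - 1.0 * M) @ A)


def is_diagonal(B, tol=1e-12):
    """Hypothetical helper: True if all off-diagonal entries are ~0."""
    return np.allclose(B, np.diag(np.diag(B)), atol=tol)


K_congruent = A.T @ K @ A
M_congruent = A.T @ M @ A

print("omega^2:", omega_sq)
print("kernel residual:", kernel_residual)
print("A^T K A diagonal?", is_diagonal(K_congruent))
print("A^T M A diagonal?", is_diagonal(M_congruent))
```

Note that this check only concerns the particular choice $\mathbf{A} = \mathbf{I}_2$ made in the edit.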
2) What is the intuition behind such a congruence transformation? A similarity transformation from a matrix $\mathbf{B}$ to $\mathbf{D}=\mathbf{P}^{-1} \mathbf{B} \mathbf{P}$ I can interpret intuitively as going from the basis $\mathbf{P}\mathbf{e}_1, \dots, \mathbf{P}\mathbf{e}_n$ to the basis $\mathbf{e}_1, \dots, \mathbf{e}_n$. Is a similar interpretation possible for congruence transformations as well?