A question about how the physical $W^{\pm}_{\mu}, Z_{\mu}, A_{\mu} $ masses arise:
We have the covariant derivative:
$$ D_{\mu} = \partial_{\mu} - \frac{ig}{2}\begin{pmatrix} W^3_{\mu} & \sqrt{2}\, W^+_{\mu} \\ \sqrt{2}\, W^-_{\mu} & -W^3_{\mu} \end{pmatrix} - \frac{ig'}{2} \mathcal{I}B_{\mu}$$
When the symmetry is broken by the Higgs VEV $\langle \phi \rangle = \begin{pmatrix} 0 \\ u \end{pmatrix}$:
$$ SU(2)_W \times U(1)_Y \rightarrow U(1) $$
and we expand around the vacuum, the kinetic term $(D_{\mu} \phi)^{\dagger}(D^{\mu} \phi)$ gives
$$ \mathcal{L}_{\text{kinetic}} = \frac{g^2}{2}u^2 W^+_{\mu}W^{-\mu} + u^2\left(-\frac{g}{2} W^3_{\mu} + \frac{g'}{2}B_{\mu}\right)\left(-\frac{g}{2}W^{3\mu} + \frac{g'}{2}B^{\mu}\right) + \dots $$
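As a sanity check on this expansion, here is a minimal numerical sketch using only the Python standard library. It rests on simplifying assumptions that are mine: a single Lorentz component is used in place of the contracted index, `Wp` is a complex number standing in for $W^+$ with $W^-$ as its complex conjugate, and the function names `kinetic_on_vev` and `quoted_mass_terms` are purely illustrative. The comparison only holds if the neutral term carries an overall factor of $u^2$, like the charged one.

```python
import math
import random


def kinetic_on_vev(Wp, W3, B, g, gp, u):
    """(D phi)^dagger (D phi) evaluated on phi = (0, u), one Lorentz component.

    Wp is a complex stand-in for W^+ (W^- is its conjugate); W3 and B are real.
    The derivative of the constant VEV vanishes, so only the gauge part acts.
    """
    # Gauge matrix times (0, u) picks out the second column of the matrix in D_mu.
    upper = -0.5j * g * math.sqrt(2) * Wp * u
    lower = -0.5j * (-g * W3 + gp * B) * u
    return abs(upper) ** 2 + abs(lower) ** 2


def quoted_mass_terms(Wp, W3, B, g, gp, u):
    """Charged plus neutral mass terms, with u^2 multiplying both."""
    Wm = Wp.conjugate()
    charged = (g**2 * u**2 / 2 * Wp * Wm).real
    neutral = u**2 * (-g / 2 * W3 + gp / 2 * B) ** 2
    return charged + neutral


random.seed(1)  # arbitrary seed, only for reproducibility
Wp = complex(random.uniform(-1, 1), random.uniform(-1, 1))
W3, B, g, gp, u = (random.uniform(0.1, 2.0) for _ in range(5))
lhs = kinetic_on_vev(Wp, W3, B, g, gp, u)
rhs = quoted_mass_terms(Wp, W3, B, g, gp, u)
print(math.isclose(lhs, rhs))
```

This only tests the algebra at one random point, so it is a consistency check rather than a derivation.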
When we identify $$W^{\pm}_{\mu} = \frac{1}{\sqrt{2}} (W^1_{\mu} \mp i W^2_{\mu}) $$
we see that the first term gives a term $\propto (W^1_{\mu})^2 + (W^2_{\mu})^2$, which has the form of a mass term.
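To spell out that step, assuming the convention $W^{\pm}_{\mu} = \frac{1}{\sqrt{2}}(W^1_{\mu} \mp i W^2_{\mu})$ implied by the off-diagonal entries of the matrix in $D_{\mu}$, the cross terms cancel:
$$ W^+_{\mu}W^{-\mu} = \frac{1}{2}\left(W^1_{\mu} - iW^2_{\mu}\right)\left(W^{1\mu} + iW^{2\mu}\right) = \frac{1}{2}\left(W^1_{\mu}W^{1\mu} + W^2_{\mu}W^{2\mu}\right) $$
so that
$$ \frac{g^2u^2}{2} W^+_{\mu}W^{-\mu} = \frac{g^2u^2}{4}\left(W^1_{\mu}W^{1\mu} + W^2_{\mu}W^{2\mu}\right) $$
with the same coefficient in front of $W^1$ and $W^2$.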
My question is: why do we call the charged $W^{\pm}$ bosons the mass eigenstates, when my naive calculation suggests that $W^1, W^2$ are mass eigenstates as well?
In addition, why does the mass term in the Lagrangian for the charged W bosons take the form:
$$ \frac{g^2u^2}{2} W^+_{\mu} W^{-\mu}$$ instead of separate mass terms for $W^+$ and $W^-$? Some textbooks mention 'canonical normalization', but I can't find any detailed explanation.
I looked at this previous answer: "Electroweak interaction: From $W^{1}_{\mu},W^{2}_{\mu},W^{3}_{\mu},B_{\mu}$ to $W^{\pm},Z_{\mu},A_{\mu}$", but that author seems to have had the opposite problem from mine.