I remember overthinking equations like \begin{equation} \mathbf{1}=\int dx\ |x\rangle\langle x|\tag{1} \end{equation} and \begin{equation} X=\int dx\ |x\rangle\langle x|x\tag{2} \end{equation} when I took my introductory QM course ($\mathbf{1}$ is the identity on $L^2(\mathbb{R})$ and $|x\rangle$ is "defined" as the Dirac delta function centered at $x\in\mathbb{R}$, i.e. the "function" that vanishes everywhere except at the point $x$).
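Surprisingly, this formalism is self-consistent if one uses the "axiom" $\langle a|x\rangle=\delta(x-a)$ and "linearity", e.g. \begin{equation} \begin{aligned} \langle a|X|\psi\rangle&=\langle a|X\left(\int dx\ |x\rangle\langle x|\right)|\psi\rangle=\int dx\ \langle a|X|x\rangle\langle x|\psi\rangle \\ &=\int dx\ x\,\langle a|x\rangle\langle x|\psi\rangle=\int dx\ x\,\delta(x-a)\,\psi(x)=a\,\psi(a), \end{aligned} \end{equation} where the first step inserts $(1)$, the third uses $X|x\rangle=x|x\rangle$ (which follows from $(2)$ and the "axiom"), and $\langle x|\psi\rangle=\psi(x)$.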
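The same bookkeeping can be checked numerically on a grid. This is only a rough sketch under my own assumptions, not a rigorous statement: the integral is replaced by a Riemann sum with spacing `dx`, and $|x_i\rangle$ is modelled as a vector with the single entry `1/dx` at grid point `i`, so that $\langle x_a|x_i\rangle=\delta_{ai}/dx$ plays the role of $\delta(x-a)$. The helper names (`make_grid`, `inner`, `ket`, `apply_resolution`) are made up for this illustration.

```python
import math


def make_grid(x_min, x_max, n):
    """Uniform grid of n points on [x_min, x_max] and its spacing dx."""
    dx = (x_max - x_min) / (n - 1)
    xs = [x_min + i * dx for i in range(n)]
    return xs, dx


def inner(f, g, dx):
    """Riemann-sum stand-in for <f|g> = integral of conj(f(x)) * g(x) dx."""
    return dx * sum(fj.conjugate() * gj for fj, gj in zip(f, g))


def ket(i, n, dx):
    """Discretized |x_i>: a 'delta function' of height 1/dx at grid point i."""
    v = [0.0] * n
    v[i] = 1.0 / dx
    return v


def apply_resolution(psi, xs, dx, weight):
    """Compute sum_i dx * weight(x_i) * |x_i><x_i|psi>.

    weight(x) = 1 mimics equation (1), weight(x) = x mimics equation (2).
    """
    n = len(xs)
    out = [0.0] * n
    for i, x in enumerate(xs):
        k = ket(i, n, dx)
        amplitude = inner(k, psi, dx)  # <x_i|psi>, equals psi[i] up to rounding
        for j in range(n):
            out[j] += dx * weight(x) * amplitude * k[j]
    return out


xs, dx = make_grid(-5.0, 5.0, 101)
psi = [math.exp(-x * x / 2.0) for x in xs]  # sample wave function (unnormalized)

id_psi = apply_resolution(psi, xs, dx, lambda x: 1.0)  # discrete version of (1)
x_psi = apply_resolution(psi, xs, dx, lambda x: x)     # discrete version of (2)

a_idx = 60
a = xs[a_idx]
lhs = inner(ket(a_idx, len(xs), dx), x_psi, dx)  # discrete <a|X|psi>
rhs = a * psi[a_idx]                             # a * psi(a)

print("max |1 psi - psi|   :", max(abs(u - v) for u, v in zip(id_psi, psi)))
print("max |X psi - x psi| :", max(abs(u - x * v) for u, x, v in zip(x_psi, xs, psi)))
print("|<a|X|psi> - a psi(a)| :", abs(lhs - rhs))
```

On the grid everything is an ordinary finite sum, so all three differences should vanish up to floating-point rounding. The check says nothing about the limit $dx\to 0$, where the vectors `ket(i, n, dx)` stop being normalizable.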
I am now wondering why the nonsense above is commonly presented in introductory QM courses. Is there more to the story? Here are some ideas:
- Equations $(1)$ and $(2)$ remind me a bit of the spectral theorem - is there a connection?
- I've had a very brief introduction to rigorous distribution theory (I've learned how to solve inhomogeneous ODEs using distributions) and I'd say $\langle x|$ can be regarded as a distribution/linear map $\mathcal{L}^2(\mathbb{R})\ni f\mapsto f(x)\in\mathbb{R}$.$^1$ But apart from that, I don't see how distribution theory helps to make $(1)$ and $(2)$ meaningful... Maybe someone who knows more about distributions does. :)
$^1$ $\mathcal{L}^2$ is the set of square-integrable functions, $L^2$ is the set of equivalence classes.