As we know, translations in Minkowski space form a group, which may be represented by unitary operators $U(a)$ satisfying:
$$ U(a)U^\dagger(a) = 1 \tag{1}$$ $$ U(a)U(b) = U(a+b) \tag{2}$$ $$ U(0) = 1 \tag{3}$$
The $U$ satisfying these conditions is $$ U(a) = e^{i a_\mu G^\mu} \tag{4}$$ where the $G^\mu$ are Hermitian operators (matrices), called the generators of the group.
In QFT, for a scalar field, we have $$ U(a)^{-1}\phi(x)U(a) = \phi(x-a) \tag{5}$$ or, in infinitesimal form, $$ [\phi(x), G^\mu] = i \partial^\mu \phi(x) \tag{6}$$
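The step from (5) to (6) can be sketched by expanding both sides of (5) to first order in $a_\mu$, using (4) and $U(a)^{-1} = U^\dagger(a) = e^{-i a_\mu G^\mu}$ from (1):
$$ U(a)^{-1}\phi(x)U(a) = (1 - i a_\mu G^\mu)\,\phi(x)\,(1 + i a_\mu G^\mu) + O(a^2) = \phi(x) + i a_\mu [\phi(x), G^\mu] + O(a^2) $$
$$ \phi(x-a) = \phi(x) - a_\mu \partial^\mu \phi(x) + O(a^2) $$
Comparing the terms linear in $a_\mu$ gives $i[\phi(x), G^\mu] = -\partial^\mu \phi(x)$, which is (6).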
On the other hand, the momentum operator is defined from the canonical energy-momentum tensor $T^{\mu\nu}$ as $$ P^\mu \equiv \int d^3 x T^{0\mu} = \int d^3 x \left(\frac{\partial \mathcal{L}}{\partial(\partial_0\phi)}\frac{\partial \phi}{\partial x_\mu} - g^{0\mu} \mathcal{L} \right) \tag{7}$$
In QFT textbooks, the $P^\mu$ are taken as the generators of $U(a)$, $$ U(a) = e^{i a_\mu P^\mu} \tag{8}$$
but without a rigorous proof. In particular, whether the $P^\mu$ really are the generators may depend on the particular Lagrangian from which $P^\mu$ is constructed.
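To make the concern concrete, here is a sketch of the check for the spatial components. It rests on assumptions not stated above: $\mathcal{L}$ depends only on $\phi$ and its first derivatives, the conjugate field is $\pi \equiv \partial\mathcal{L}/\partial(\partial_0\phi)$, and the equal-time commutators take the canonical form $[\phi(t,\mathbf{x}), \pi(t,\mathbf{y})] = i\delta^3(\mathbf{x}-\mathbf{y})$ and $[\phi(t,\mathbf{x}), \phi(t,\mathbf{y})] = 0$. Since $g^{0i} = 0$, (7) reduces to $P^i = \int d^3 y\, \pi(y)\,\partial^i\phi(y)$, and then
$$ [\phi(x), P^i] = \int d^3 y\, [\phi(x), \pi(y)]\,\partial^i\phi(y) = i\int d^3 y\, \delta^3(\mathbf{x}-\mathbf{y})\,\partial^i\phi(y) = i\,\partial^i\phi(x) $$
which has the form of (6). For $\mu = 0$, $P^0$ is the Hamiltonian $H$, and (6) becomes the Heisenberg equation of motion $[\phi(x), H] = i\,\partial^0\phi(x)$. This only covers the assumed case; it says nothing about Lagrangians with constraints, higher derivatives, or operator-ordering ambiguities.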
So, under what conditions can the $P^\mu$ be taken as the generators of the translation group?