In QFT we sometimes have to deal with Lagrangians that contain derivative interaction terms of the form $$ \left(\partial_\mu \partial ^{\mu}\right)^a \phi^b$$ or something like
$$ \partial_{\mu} \phi \partial ^{\mu} \phi$$ as in the field-strength counterterm that appears in renormalization.
As I understand it, when computing, say, scattering amplitudes, we treat these terms like ordinary vertices, the only difference being that (in momentum space) every derivative $\partial^{\mu}$ contributes a factor of $p^{\mu}$ (up to factors of $\pm i$ that depend on conventions).
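As a concrete instance of this rule, here is a minimal sketch for a two-point derivative term. The coefficient name $\delta_Z$ and the plane-wave convention $\phi \sim e^{-ip\cdot x}$ for an incoming particle are assumptions made for the illustration; other conventions shift the factors of $\pm i$.

$$ \mathcal L_{int} = \frac{\delta_Z}{2}\,\partial_\mu \phi\, \partial^\mu \phi, \qquad \partial_\mu \to -i p_\mu \ \text{(incoming)}, \quad \partial_\mu \to +i p_\mu \ \text{(outgoing)} $$
$$ \text{vertex} = i \cdot \frac{\delta_Z}{2} \cdot 2 \cdot (-i p_\mu)(+i p^\mu) = i\,\delta_Z\, p^2 $$

The factor of 2 counts the two ways of attaching the external legs to the two fields in the term.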
This seems fine, but I cannot reconcile this procedure with how one initially derives the rules for computing scattering amplitudes.
1) Canonical quantization
The scattering amplitude is given by
$$\langle p_{fin} |T\exp{\left(-i\int\mathrm{d}^4x\, \mathcal H\right)} | p_{in}\rangle$$ where $\mathcal H$ is the Hamiltonian density associated with the field. Then one decomposes $$\mathcal H = \mathcal H_0 + \mathcal H_{int}$$
where $\mathcal H_{0}$ is the Hamiltonian density of the free field (only the quadratic terms).
It's useful to work in the interaction picture, where the operators evolve with $\mathcal H_0$ and the states evolve with $\mathcal H_{int}$. We then have to compute $$ \langle\tilde{p}_{fin} | T\exp{\left(-i\int \mathrm{d}^4x\, \mathcal H_{int}\right)}| \tilde{p}_{in}\rangle$$
But we can replace $\mathcal H_{int}$ by $-\mathcal L_{int}$ only if $\mathcal H_{int}$ contains no derivative terms (only then is $\mathcal L_{int} = - \mathcal H_{int}$).
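To make the mismatch concrete, here is a minimal worked example, assuming a single real scalar field and a constant coefficient $\delta$ (both chosen here purely for illustration):

$$ \mathcal L = \frac{1}{2}\partial_\mu\phi\,\partial^\mu\phi - \frac12 m^2\phi^2 + \frac{\delta}{2}\partial_\mu\phi\,\partial^\mu\phi, \qquad \pi = \frac{\partial \mathcal L}{\partial\dot\phi} = (1+\delta)\,\dot\phi $$
$$ \mathcal H = \pi\dot\phi - \mathcal L = \frac{\pi^2}{2(1+\delta)} + \frac{1+\delta}{2}(\nabla\phi)^2 + \frac12 m^2\phi^2 $$
$$ \mathcal H_{int} = \mathcal H - \mathcal H_0 = -\frac{\delta}{2(1+\delta)}\,\pi^2 + \frac{\delta}{2}(\nabla\phi)^2, \qquad -\mathcal L_{int} = -\frac{\delta}{2}\,\dot\phi^2 + \frac{\delta}{2}(\nabla\phi)^2 $$

Here $\mathcal H_0 = \frac12\pi^2 + \frac12(\nabla\phi)^2 + \frac12 m^2\phi^2$. In the interaction picture $\pi = \dot\phi$, so the two expressions agree at first order in $\delta$ but differ at order $\delta^2$.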
2) Functional quantization
In this approach too, when computing the amplitude by inserting complete sets of states between the final and initial states, one assumes a Hamiltonian of the form $$\mathcal H_{kin}\left(\partial \phi \right) + \mathcal H_{int}\left(\phi \right) $$
before substituting the Lagrangian.
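For reference, the step in question can be sketched as follows, assuming a Hamiltonian density that is quadratic in the conjugate momentum $\pi$ with a field-independent coefficient (the potential $V(\phi)$ and the normalization constant $\mathcal N$ are notation introduced here):

$$ \mathcal H = \frac12\pi^2 + \frac12(\nabla\phi)^2 + V(\phi), \qquad \pi\dot\phi - \frac12\pi^2 = -\frac12\left(\pi - \dot\phi\right)^2 + \frac12\dot\phi^2 $$
$$ \int \mathcal D\pi\, \exp\left[i\int \mathrm d^4x\,\left(\pi\dot\phi - \mathcal H\right)\right] = \mathcal N \exp\left[i\int \mathrm d^4x\,\left(\frac12\partial_\mu\phi\,\partial^\mu\phi - V(\phi)\right)\right] $$

The Gaussian integral over $\pi$ gives a constant only because the coefficient of $\pi^2$ does not depend on $\phi$.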
So how is it possible that using the naive Feynman rules for interaction terms containing derivatives gives the correct amplitudes?