In the specific case of the Hamiltonian, the non-locality arises because the time evolution depends on values of the field that are arbitrarily far away. In the one-dimensional case we have
\begin{equation}
i \frac{\partial}{\partial t}\, \psi ~=~ \tilde{H}\,\psi ~=~ \sqrt{m^2+\mathbf{\tilde{p}}^2_x}\ \psi ~=~ \sqrt{m^2-\partial_x^2}\ \psi ~=~ \frac{m}{x} K_1\left(mx\right) * \psi
\end{equation}
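(We use units with $\hbar=c=1$.) In the last term, $*$ denotes a convolution, in this case with a kernel built from the modified Bessel function $K_1$. Because this kernel does not vanish at any finite distance, the time derivative of $\psi$ at $x$ depends instantaneously on $\psi$ at arbitrarily distant points, which clearly violates the speed of light restriction.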
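As a rough numerical illustration of this point, here is a minimal sketch in plain Python. It applies $\sqrt{m^2-\partial_x^2}$ in Fourier space to a smooth bump that is exactly zero for $|x|\ge 1$ and evaluates the result at $x=3$, well outside the support. For comparison it does the same with the local operator $m^2-\partial_x^2$, and with a direct convolution. Everything here is my own choice for the illustration and not part of the derivation above: the test function, the value $m=1$, the grid sizes, and all function names. In particular, the kernel normalisation $-\frac{m}{\pi|x|}K_1(m|x|)$ for $x\neq 0$ is assumed from the standard Fourier transform identity $\int e^{ikx}/\sqrt{k^2+m^2}\,dk = 2K_0(m|x|)$; the equation above leaves sign and constant factors implicit.

```python
import math

M = 1.0  # mass in units with hbar = c = 1 (illustrative value)

def psi(x):
    """Smooth test bump that is exactly zero for |x| >= 1."""
    return math.exp(-1.0 / (1.0 - x * x)) if abs(x) < 1.0 else 0.0

# Sample psi on its support [-1, 1]; it vanishes at both ends,
# so a plain Riemann sum coincides with the trapezoid rule.
NY = 400
DY = 2.0 / NY
YS = [-1.0 + j * DY for j in range(1, NY)]
PSI = [psi(y) for y in YS]

# psi is even, so its Fourier transform is a real cosine transform.
DK, NK = 0.1, 6000  # k runs from 0 to 600
KS = [i * DK for i in range(NK + 1)]
PSI_HAT = [DY * sum(p * math.cos(k * y) for p, y in zip(PSI, YS)) for k in KS]

def apply_symbol(symbol, x):
    """Evaluate (1/pi) * int_0^inf symbol(k) psi_hat(k) cos(k x) dk."""
    total = 0.5 * symbol(0.0) * PSI_HAT[0]  # half weight at k = 0
    for k, ph in zip(KS[1:], PSI_HAT[1:]):
        total += symbol(k) * ph * math.cos(k * x)
    return total * DK / math.pi

def bessel_k1(z, h=0.05, t_max=8.0):
    """K_1(z) = int_0^inf exp(-z cosh t) cosh t dt, by the trapezoid rule."""
    total = 0.5 * math.exp(-z)  # half weight at t = 0
    for n in range(1, int(t_max / h) + 1):
        c = math.cosh(n * h)
        total += math.exp(-z * c) * c
    return total * h

def convolve_with_k1(x):
    """Convolution with the assumed kernel -(m / (pi |x-y|)) K_1(m |x-y|).

    Only meaningful for x outside the support of psi, where the kernel is regular.
    """
    return -DY * sum(
        p * M / (math.pi * abs(x - y)) * bessel_k1(M * abs(x - y))
        for p, y in zip(PSI, YS)
    )

X_OUT = 3.0  # a point where psi itself is exactly zero
sqrt_value = apply_symbol(lambda k: math.sqrt(M * M + k * k), X_OUT)
local_value = apply_symbol(lambda k: M * M + k * k, X_OUT)
kernel_value = convolve_with_k1(X_OUT)

print("psi(x_out)                    =", psi(X_OUT))
print("(m^2 - d_x^2) psi at x_out    =", local_value)
print("sqrt(m^2 - d_x^2) psi at x_out =", sqrt_value)
print("K_1 convolution at x_out       =", kernel_value)
```

The local operator returns zero at $x=3$ up to numerical noise, while the square root operator does not, and its value there should agree with the Bessel convolution. So $\partial_t\psi$ is nonzero at a point the field has not yet had time to reach.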
See also my Stack Exchange answer here: $\nabla$ and non-locality in simple relativistic model of quantum mechanics
Now, in the general case, the value of $\psi(x)$ will depend on $\psi(y)$ at other locations in the past, and it will also depend on other fields such as $A^\mu(y)$ at other locations in the past.
Mathematically, these dependencies stem from "Taylor-expanded series" of differential operators, but as long as they do not violate the speed of light restriction this is perfectly fine.
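To make the "Taylor-expanded series" remark concrete, here are two standard textbook identities, written out only as an illustration (the shift $a$ is a symbol I introduce for the example). A series in derivatives of unbounded order can reach to another point, as the shift operator shows:
\begin{equation}
e^{a\,\partial_x}\,\psi(x) ~=~ \sum_{n=0}^{\infty} \frac{a^n}{n!}\,\partial_x^n\,\psi(x) ~=~ \psi(x+a)
\end{equation}
The square root in the Hamiltonian is formally of the same type:
\begin{equation}
\sqrt{m^2-\partial_x^2} ~=~ m\left(1 - \frac{\partial_x^2}{2m^2} - \frac{\partial_x^4}{8m^4} - \dots\right)
\end{equation}
Any finite truncation of such a series is a local operator. It is only the full infinite series that couples $\psi(x)$ to values at other points.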
Hans