The Wikipedia article on the Klein-Gordon equation gives a derivation that starts from the relativistic energy-momentum relation. It gets to this step:
$$\sqrt{\textbf{p}^2 c^2 + m^2 c^4} = E$$
and inserts the QM operators to get
$$\left( \sqrt{ (-i \hbar \nabla)^2 c^2 + m^2 c^4 } \right) \psi = i\hbar\frac{\partial}{\partial t} \psi$$
The article then says:

> This, however, is a cumbersome expression to work with because the differential operator cannot be evaluated while under the square root sign. In addition, this equation, as it stands, is nonlocal.
To fix this, the first equation is squared instead to get
$$\textbf{p}^2 c^2 + m^2 c^4 = E^2$$
after which the QM operators are inserted and the expression is simplified to get
$$-\hbar^2 c^2 \nabla^2 \psi + m^2 c^4 \psi = -\hbar^2 \frac{\partial^2}{\partial t^2} \psi$$
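For reference, the substitutions I am assuming here are the standard ones, $E \to i\hbar\,\partial/\partial t$ and $\textbf{p} \to -i\hbar\nabla$, so the squared operators would be

$$E^2 \to \left(i\hbar\frac{\partial}{\partial t}\right)^2 = -\hbar^2\frac{\partial^2}{\partial t^2}, \qquad \textbf{p}^2 \to (-i\hbar\nabla)^2 = -\hbar^2\nabla^2$$

which, acting on $\psi$, reproduces the equation above term by term.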
There are a couple of things I don't understand. First, are the solutions to this differential equation not exactly the same as the solutions to the square-root equation? Both sides of the starting relation were squared, so it seems to me that, regardless of the particular form of the resulting differential equation, the two should have exactly the same set of solutions.
Second, why is the square-root equation cumbersome to work with? It seems like it would in fact be easier to work with, since the operator under the square root could be expanded as a Taylor series, leaving an equation that is first order in time.
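To make concrete what I mean, here is a sketch of the expansion I have in mind, assuming it is legitimate to treat $\nabla^2$ formally like a small number and expand in powers of $\hbar^2\nabla^2/(m^2c^2)$ (I have not checked when, if ever, this converges):

$$\sqrt{m^2c^4 - \hbar^2c^2\nabla^2} = mc^2\sqrt{1 - \frac{\hbar^2\nabla^2}{m^2c^2}} = mc^2 - \frac{\hbar^2}{2m}\nabla^2 - \frac{\hbar^4}{8m^3c^2}\nabla^4 - \cdots$$

The first two terms look like the rest energy plus the usual Schrödinger kinetic term, and the remaining terms involve ever higher spatial derivatives.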
Finally, can someone elaborate on what "nonlocal" means in this context? The article linked from the Wikipedia page didn't entirely help me understand it.