In section IV.1 (page 241) of Einstein Gravity in a Nutshell, Zee writes down the actions for electromagnetism and gravity in an intuitive, patchwork way.
He starts from the relativistic action for a free particle in Minkowski spacetime,
\begin{equation} S = - m \int \sqrt{-\eta_{\mu\nu} dx^\mu dx^\nu} = - m \int \sqrt{dt^2 - d\vec{x}^2} \tag{1}\label{1} \end{equation}
To add an interaction $V(x)$, we can take inspiration from the standard nonrelativistic action with a potential,
\begin{equation} S = \int dt \left( \frac{1}{2} m \left( \frac{d\vec{x}}{dt} \right)^2 - V(x) \right) \tag{2}\label{2} \end{equation}
He claims that $V(x)$ could be placed either outside or inside the square root in $\eqref{1}$, so that,
$\underline{\rm{Electromagnetism \; Action}}$ \begin{equation} S = - \int \left( m\sqrt{-\eta_{\mu\nu} dx^\mu dx^\nu} + V(x) dt \right) \tag{3}\label{3} \end{equation}
$\underline{\rm{Gravity \; Action}}$ \begin{equation} S = -m \int \sqrt{ \left( 1 + \frac{2 V}{m} \right) dt^2 - d\vec{x}^2} \tag{4}\label{4} \end{equation}
He then says (and shows with a calculation, which I will not copy here) that in the nonrelativistic limit both actions approach the appropriate Newtonian form, and so are correct in that limit. He later states that these actions are not Lorentz invariant and so need to be improved, and he writes the correct forms,
$\underline{\rm{Electromagnetism \; Action \; Improved}}$ \begin{equation} S = - \int \left( m\sqrt{-\eta_{\mu\nu} dx^\mu dx^\nu} - A_\mu(x) dx^\mu \right) \tag{5}\label{5} \end{equation}
$\underline{\rm{Gravity \; Action \; Improved}}$ \begin{equation} S = -m \int \sqrt{-g_{\mu\nu}(x) dx^\mu dx^\nu} \tag{6}\label{6} \end{equation}
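For reference, here is my own short version of the nonrelativistic check (a sketch, not Zee's wording). I am assuming units with $c = 1$, $\vec{v} = d\vec{x}/dt$ with $|\vec{v}| \ll 1$, and $|V| \ll m$, and I keep only first-order terms in $\vec{v}^2$ and $V/m$. Expanding the square roots in $\eqref{3}$ and $\eqref{4}$ gives
\begin{equation*} S_{(3)} = - \int dt \left( m\sqrt{1 - \vec{v}^2} + V \right) \approx \int dt \left( -m + \frac{1}{2} m \vec{v}^2 - V \right) \end{equation*}
\begin{equation*} S_{(4)} = - m \int dt \sqrt{1 + \frac{2V}{m} - \vec{v}^2} \approx \int dt \left( -m + \frac{1}{2} m \vec{v}^2 - V \right) \end{equation*}
so both agree with $\eqref{2}$ up to the constant $-m$, which does not affect the equations of motion. As far as I can tell, the unimproved actions are also special cases of the improved ones: $\eqref{5}$ reduces to $\eqref{3}$ for $A_0 = -V$, $A_i = 0$, and $\eqref{6}$ reduces to $\eqref{4}$ for $g_{00} = -\left(1 + \frac{2V}{m}\right)$, $g_{ij} = \delta_{ij}$, $g_{0i} = 0$.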
My question is about the unimproved actions. $\eqref{3}$ kind of makes sense, since it is comparable to $\eqref{2}$ where $V(x)$ sits outside the kinetic term, but I still don't understand why it should look that way. As for $\eqref{4}$, I do not understand at all why it should take that form. Overall, I do not see the reasoning behind placing $V(x)$ inside or outside the square root, and Zee does not explain the why or the how of the process.
Any help in understanding this? I know the actions can be derived in other ways, but I'm interested in the thought process of Zee and how he derives it using some patchwork intuition so please refrain from answering using some other derivations to justify the actions mentioned.