We usually use Lagrangians that depend on at most the first derivatives of the field, $\mathcal{L} = \mathcal{L}(\phi, \partial_\mu \phi)$. General relativity should be no different, since the Einstein field equations are only second order in the metric. Naively, however, one would think that the Lagrangian in the Einstein-Hilbert action contains second derivatives of the metric, because of the presence of the Ricci scalar. Why is this not an issue?
$$ \mathcal{L}_{EH}(g_{\mu\nu}, g_{\mu\nu,\alpha}, g_{\mu\nu,\alpha\beta}) = \sqrt{-g} R $$
If we perform the variation, it turns out not to matter. Writing the Lagrangian as $\sqrt{-g}\, g^{\mu\nu} R_{\mu\nu}$, the variation contains three terms (the first two come from varying $\sqrt{-g}$ and $g^{\mu\nu}$, neither of which contains derivatives of the metric), hence
$$ \delta S = \int d^4x \left[ - \frac{1}{2} \sqrt{-g} g_{\mu\nu} R \delta g^{\mu\nu} + \sqrt{-g} R_{\mu\nu} \delta g^{\mu\nu} + \sqrt{-g} g^{\mu\nu} \delta R_{\mu\nu} \right] $$
The first two terms combine into the Einstein tensor, so the only place a problem could arise is the last term. However, this term turns out to be a total derivative, which we can ignore. Interestingly, the proof of this uses the Palatini identity, which does not seem to hinge on expressing the Christoffel symbols through derivatives of the metric, but only on their general properties as a connection.
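For concreteness, here is a sketch of those two steps, assuming the Levi-Civita connection and the usual sign conventions (the vector $V^\rho$ is only a label introduced here). The first two terms give
$$ \sqrt{-g} \left( R_{\mu\nu} - \frac{1}{2} g_{\mu\nu} R \right) \delta g^{\mu\nu} = \sqrt{-g}\, G_{\mu\nu}\, \delta g^{\mu\nu} $$
while the Palatini identity reads
$$ \delta R_{\mu\nu} = \nabla_\rho \delta\Gamma^\rho_{\mu\nu} - \nabla_\nu \delta\Gamma^\rho_{\rho\mu} $$
Using metric compatibility to move $g^{\mu\nu}$ inside the covariant derivatives, the last term becomes
$$ \sqrt{-g}\, g^{\mu\nu} \delta R_{\mu\nu} = \sqrt{-g}\, \nabla_\rho V^\rho = \partial_\rho \left( \sqrt{-g}\, V^\rho \right), \qquad V^\rho = g^{\mu\nu} \delta\Gamma^\rho_{\mu\nu} - g^{\mu\rho} \delta\Gamma^\sigma_{\sigma\mu} $$
which is a pure boundary term, provided the variation of the metric and its first derivatives vanish at the boundary.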
So it seems we got lucky here, but is there a deeper reason this worked out?