First off, an answer to the question "why do timelike geodesics extremize proper time".
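A geodesic is, by definition, a path whose length is stationary under small variations that keep the endpoints fixed; on a Riemannian manifold this is locally the shortest path between two points. The quantity being extremized is the proper distance: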
\begin{equation}
\Delta s = \int ds
\end{equation}
where $ds^{2}$ is the infinitesimal interval defined by the metric of your manifold. We can equivalently define $c^{2} d\tau ^{2} = -ds^{2}$ and thus, for a timelike interval ($ds^{2} < 0$ in the $(-,+,+,+)$ signature), we may define proper time as:
\begin{equation}
\Delta \tau = \int d\tau
\end{equation}
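Because of the relative sign between $ds^{2}$ and $d\tau^{2}$, making the "distance" stationary along a timelike path makes the proper time stationary as well. For timelike geodesics the extremum is in fact a local maximum: any nearby timelike path between the same two events accumulates less proper time.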
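As a quick illustration in flat spacetime (the symbols $T$ and $v$ are introduced here only for the example), take two events at the same spatial position, separated by coordinate time $T$. Using $d\tau = dt \sqrt{1 - v^{2}/c^{2}}$, the observer who stays at rest (the geodesic) and the observer who travels out and back at constant speed $v$ measure:
\begin{equation}
\Delta \tau_{\text{rest}} = T , \qquad \Delta \tau_{\text{moving}} = \int_{0}^{T} dt \, \sqrt{1 - \frac{v^{2}}{c^{2}}} = T \sqrt{1 - \frac{v^{2}}{c^{2}}} < T
\end{equation}
so the straight worldline is the one with the largest proper time, which is just the twin paradox.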
Now to the second question, "why do free particles move along geodesics".
The relativistic Lagrangian for a free particle of mass $m$ is:
\begin{equation}
L = -mc^{2} \frac{d\tau}{dt}
\end{equation}
where the overall minus sign is the usual convention and does not affect which paths are stationary. The action is then:
\begin{equation}
S = \int dt \, L = -mc^{2} \int d\tau
\end{equation}
Therefore the requirement that free particles extremize proper time, and hence move along geodesics, is equivalent to the principle of stationary (least) action, $\delta S = 0$.
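To make this explicit, here is a sketch of the variation. The metric components $g_{\mu\nu}$, the curve parameter $\lambda$ and the Christoffel symbols $\Gamma^{\mu}_{\alpha\beta}$ are notation introduced for this sketch, with signature $(-,+,+,+)$ and factors of $c$ restored. Since an overall constant does not change the stationarity condition, it is enough to vary the proper time itself:
\begin{equation}
c \, \Delta \tau = \int d\lambda \, \sqrt{ -g_{\mu\nu} \frac{dx^{\mu}}{d\lambda} \frac{dx^{\nu}}{d\lambda} }
\end{equation}
Writing the Euler-Lagrange equations for this integrand and then choosing $\lambda = \tau$ gives the geodesic equation:
\begin{equation}
\frac{d^{2} x^{\mu}}{d\tau^{2}} + \Gamma^{\mu}_{\alpha\beta} \frac{dx^{\alpha}}{d\tau} \frac{dx^{\beta}}{d\tau} = 0
\end{equation}
In flat spacetime with Cartesian coordinates $\Gamma^{\mu}_{\alpha\beta} = 0$ and this reduces to motion in a straight line at constant velocity. As a sanity check on the Lagrangian, there $d\tau / dt = \sqrt{1 - v^{2}/c^{2}}$, so:
\begin{equation}
mc^{2} \frac{d\tau}{dt} = mc^{2} - \frac{1}{2} m v^{2} + \mathcal{O}\left( \frac{v^{4}}{c^{2}} \right)
\end{equation}
and with the conventional overall minus sign one recovers the Newtonian free-particle Lagrangian $\frac{1}{2} m v^{2}$ up to the constant $-mc^{2}$.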
The question could then become "why do particles (or anything, for that matter) obey dynamics for which the principle of least action holds", to which no one really has a definitive answer. One possible idea is that classical mechanics emerges from an underlying quantum description through the path integral formulation. Transition amplitudes are given by path integrals weighted by the action of the system:
\begin{equation}
\langle \, x_{f} , t_{f} \, | \, x_{i}, t_{i} \, \rangle = \int \mathcal D x(t) \, e^{\frac{iS[x(t)]}{\hbar} }
\end{equation}
We could then treat this quantity as a partition function: after a Wick rotation $t \rightarrow \tau = it$ (here $\tau$ denotes Euclidean time, not proper time), it takes the form:
\begin{equation}
Z = \int \mathcal D x(\tau) \, e^{\frac{-I[x(\tau)]}{\hbar} }
\end{equation}
where $I[x(\tau)]$ is the Euclidean action obtained from the Wick rotation, which is real and, for typical systems, bounded from below. This now has a form analogous to the partition function in statistical mechanics. Just as configurations in statistical mechanics are weighted by the Boltzmann factor of their energy, in quantum mechanics paths are weighted by their action when computing probabilities and expectation values. Paths that make the action $S$ stationary correspond to minima of the Euclidean action $I$ and therefore carry the highest statistical weight, so in the limit $\hbar \rightarrow 0$ they dominate the integral.
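As a concrete illustration, consider a nonrelativistic particle in a potential $V(x)$ (this choice of system, and the symbol $x_{\text{cl}}$ for the classical path, are assumptions made for the example). With $t = -i\tau$ one has $dt = -i \, d\tau$ and $(dx/dt)^{2} = -(dx/d\tau)^{2}$, so:
\begin{equation}
\frac{iS}{\hbar} = \frac{i}{\hbar} \int dt \left[ \frac{m}{2} \left( \frac{dx}{dt} \right)^{2} - V(x) \right] = -\frac{1}{\hbar} \int d\tau \left[ \frac{m}{2} \left( \frac{dx}{d\tau} \right)^{2} + V(x) \right] = -\frac{I}{\hbar}
\end{equation}
The Euclidean action $I$ is a sum of non-negative kinetic and potential terms whenever $V \geq 0$. For small $\hbar$ the integral can then be estimated by a saddle-point approximation around the path that minimizes $I$:
\begin{equation}
Z \approx e^{-I[x_{\text{cl}}]/\hbar} \times (\text{fluctuation factor})
\end{equation}
which is the sense in which the classical path, the one satisfying $\delta S = 0$, is singled out.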