Why is it so that the entire path must have an extremum, but not a minimum?
I mean, this isn't an exact proof, but: there is at least one path which satisfies the Euler-Lagrange equations since they are second-order in time and therefore admit two boundary conditions, which are generally sufficient to fix the start and end points. Exceptions would only really involve cases where your laws of motion forbid any path from going from point A to point B and you've asked for such a path anyway, like if you tried to use a space that is not connected (topologically speaking) and then put the points in disconnected parts.
Also, what do they mean by saying that only the extremum condition is used?
More precisely: let $\mathcal P$ denote the smooth functions from $\mathbb R$ ("time") to $\mathbb R^n$ ("particle coordinates"), which we'll call "paths"; mathematicians would probably write $\mathcal P = C^\infty(\mathbb R, \mathbb R^n)$. (Occasionally we might want a weaker constraint, such as twice-differentiable paths.)
An action principle is a function $S_T :: \mathcal P \to \mathbb R$ mapping each path to a real number, which we call the action of that path. The $T$ here is an optional time domain, useful for those action principles which can be represented with a Lagrangian $L :: (\mathbb R^n, \mathbb R^n, \mathbb R)\to\mathbb R$ as:$$S_T[p] = S_T\big[t \mapsto p(t)\big] = \int_T d\tau~L\big(p(\tau),~p'(\tau),~\tau\big).$$Notice that the Lagrangian does not know that its arguments are time-dependent or derivatives of each other or any such thing; the Lagrangian just sees some $(\vec a, \vec b, c)$ and maps it to some number. This helps you keep things straight when you start to wonder why, say, the Euler-Lagrange equations contain a total time derivative but only partial derivatives with respect to position and velocity: the sequence goes "take partial derivatives of $L(\vec a, \vec b, c),$ then substitute in $\vec a = p(t),$ $\vec b = p'(t)$ and $c = t,$ then take the total time derivative."
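If it helps to see the "Lagrangian only sees three numbers" point concretely, here is a minimal numerical sketch. Everything specific in it is my own assumption for illustration: a 1D harmonic oscillator with $m = k = 1$, the time domain $T = [0, 1]$, a midpoint-rule integral, and a central-difference derivative standing in for $p'$.

```python
import math

def lagrangian(a, b, c):
    """L(a, b, c) for an assumed 1D harmonic oscillator with m = k = 1.
    It just maps three numbers to a number; it has no idea that b
    will later be the time derivative of a."""
    return 0.5 * b**2 - 0.5 * a**2

def action(path, t0, t1, n=2000, h=1e-5):
    """Approximate S_T[path] on T = [t0, t1] with the midpoint rule.
    The substitution a = p(tau), b = p'(tau), c = tau happens here,
    not inside the Lagrangian."""
    dt = (t1 - t0) / n
    total = 0.0
    for i in range(n):
        tau = t0 + (i + 0.5) * dt
        position = path(tau)
        velocity = (path(tau + h) - path(tau - h)) / (2 * h)  # central difference
        total += lagrangian(position, velocity, tau) * dt
    return total

# The action of the path p(t) = sin(t) over T = [0, 1].
s_value = action(math.sin, 0.0, 1.0)
print(s_value)
```

For this particular path the integral can be done by hand, $\int_0^1 \tfrac12(\cos^2\tau - \sin^2\tau)\,d\tau = \tfrac14 \sin 2$, so the printed number should land close to that.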
We introduce a path-perturbation $p(t) \mapsto p(t) + \epsilon~q(t)$, where $q$ vanishes at both endpoints of $T$ (schematically, $q[T] = (\vec 0, \vec 0)$) so that the start and end points stay fixed, and then look for the path $p$ whose action is an extremum relative to such perturbations,$$S_T\big[t\mapsto p(t) + \epsilon~q(t)\big] = S_T[p] + O(\epsilon^2).$$We are therefore trying to study the paths for which the term linear in $\epsilon$ vanishes for every such $q$, and this gives the Euler-Lagrange equations for the action principle, $(\nabla_a L)_{a=p,b=p',c=t} - \frac{d}{dt}\big[(\nabla_b L)_{a=p,b=p',c=t}\big]= 0.$
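A quick numerical sanity check of the "term linear in $\epsilon$ vanishes" statement, as a sketch under assumptions of my own choosing: the harmonic-oscillator Lagrangian $L = \tfrac12 b^2 - \tfrac12 a^2$, the domain $T = [0, 1]$, and the perturbation $q(t) = \sin(\pi t)$, which vanishes at both endpoints. The path $\sin t$ solves the resulting Euler-Lagrange equation $p'' = -p$; the straight line with the same endpoints does not.

```python
import math

def lagrangian(a, b, c):
    # Assumed example: 1D harmonic oscillator with m = k = 1.
    return 0.5 * b**2 - 0.5 * a**2

def action(path, t0=0.0, t1=1.0, n=2000, h=1e-5):
    # Midpoint-rule approximation of the action integral over [t0, t1].
    dt = (t1 - t0) / n
    total = 0.0
    for i in range(n):
        tau = t0 + (i + 0.5) * dt
        velocity = (path(tau + h) - path(tau - h)) / (2 * h)
        total += lagrangian(path(tau), velocity, tau) * dt
    return total

def q(t):
    return math.sin(math.pi * t)   # vanishes at t = 0 and t = 1

def true_path(t):
    return math.sin(t)             # solves p'' = -p

def straight_line(t):
    return t * math.sin(1.0)       # same endpoints, but not a solution

def action_change(p, eps):
    # S[p + eps*q] - S[p]
    return action(lambda t: p(t) + eps * q(t)) - action(p)

for eps in (0.1, 0.01):
    print(eps,
          action_change(true_path, eps) / eps**2,    # stays roughly constant
          action_change(straight_line, eps) / eps)   # does not go to zero
```

For the true path the change divided by $\epsilon^2$ should sit near $(\pi^2 - 1)/4$ for both values of $\epsilon$, which is what "no linear term" looks like numerically. For the straight line the change divided by $\epsilon$ tends to a nonzero constant, so that path has a linear term and is not extremal.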
We are therefore only using the extremum condition $\delta S = 0$, not the additional minimum condition that, for this path, the next term is $\epsilon^2 F[q]$ with $F[q]$ strictly positive for every nonzero $q.$ The principle is often called "least action" but that is a misnomer, because the action could also be a maximum or a saddle point or whatever, as long as its "derivative" (its first-order response to small path-perturbations) vanishes.
That's all they mean by saying "only the extremum condition is used."
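To see that an extremal path really can fail to be a minimum, here is a small sketch using an example of my own choosing, not anything from the question: the harmonic oscillator $L = \tfrac12 b^2 - \tfrac12 a^2$ and the path $p(t) = 0$ on $[0, t_1]$, which satisfies the Euler-Lagrange equation $p'' = -p$ and has zero action. Perturbing by $q_k(t) = \sin(k\pi t/t_1)$, the action is purely the $\epsilon^2$ term, and the code estimates its coefficient.

```python
import math

def lagrangian(a, b, c):
    # Assumed example: 1D harmonic oscillator with m = k = 1.
    return 0.5 * b**2 - 0.5 * a**2

def action(path, t1, n=4000, h=1e-5):
    # Midpoint-rule approximation of the action integral over [0, t1].
    dt = t1 / n
    total = 0.0
    for i in range(n):
        tau = (i + 0.5) * dt
        velocity = (path(tau + h) - path(tau - h)) / (2 * h)
        total += lagrangian(path(tau), velocity, tau) * dt
    return total

def second_order_coefficient(k, t1, eps=1e-2):
    # The unperturbed path is p(t) = 0 with S[p] = 0, so
    # S[p + eps*q_k] / eps**2 estimates the coefficient of eps**2.
    perturbed = lambda t: eps * math.sin(k * math.pi * t / t1)
    return action(perturbed, t1) / eps**2

for t1 in (1.0, 4.0):
    print(t1, second_order_coefficient(1, t1), second_order_coefficient(3, t1))
```

Doing the integral by hand gives the coefficient $(t_1/4)\big((k\pi/t_1)^2 - 1\big)$. For the short interval $t_1 = 1$ both perturbations raise the action, but for $t_1 = 4 > \pi$ the $k = 1$ perturbation lowers it while $k = 3$ raises it, so the same extremal path is a saddle point rather than a minimum.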
[Of course, we then take this abstract mathematical idea (extremal inputs to an action principle) and imbue it with physical significance. The Lagrangian is typically kinetic minus potential energy; the extremal path is the path the system actually takes from the given start point to the given end point; the Euler-Lagrange equations are therefore "equations of motion" for the system through its phase space. But they are making a statement at the mathematics-level, not at the physics-level.]