Consider a map $$S \ni\phi \mapsto F[\phi] \in \mathbb R$$ defined on a class $S$ of smooth functions $\phi$ defined on the compact set $\Omega \subset \mathbb R^n$ obtained by taking the closure of an open set with regular boundary $\partial \Omega$. Thus the map $F$ associates a real number $F[\phi]$ to each function $\phi\in S$.
We say that the functional derivative of the functional $F$ exists at $\phi_0$ and is the function on $\Omega$ denoted by
$$\frac{\delta F}{\delta \phi}|_{\phi_0}$$
if
$$\frac{d}{d\alpha}|_{\alpha=0} F[\phi_0 + \alpha\eta] = \int_\Omega \frac{\delta F}{\delta \phi}|_{\phi_0}(x) \eta(x) d^nx \tag{1}$$
for every smooth function $\eta$ such that $\phi_0 + \alpha \eta \in S$ for $\alpha$ in a neighborhood of $0$ (depending on $\eta$ and $\phi_0$).
This definition should be compared with its elementary finite-dimensional analog
$$\frac{d}{d\alpha}|_{\alpha=0} f({\bf x}_0 + \alpha {\bf h}) = \sum_{k=1}^n \frac{\partial f}{\partial x_k}|_{{\bf x}_0} h_k \tag{2}$$
valid for a differentiable function $f : \mathbb R^n \to \mathbb R$.
Here (1) can be viewed as the infinite-dimensional version of (2): the number of variables in (2) tends to infinity and the sum is replaced by an integral, because the discrete index $k$ becomes the continuous variable $x$.
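To make the passage from (2) to (1) concrete, here is a minimal numerical sketch in Python. It assumes an illustrative functional that is not taken from the text, $F[\phi] = \int_0^1 \phi(x)^3 dx$, whose functional derivative is $3\phi(x)^2$, and replaces it by a Riemann sum over $N$ cells of width $h$. The names `F_h`, `partial`, `phi0` and `eta` are hypothetical, and the grid size and test functions are arbitrary choices.

```python
import math

def F_h(phi, h):
    """Riemann-sum discretization of F[phi] = integral over [0, 1] of phi(x)**3."""
    return h * sum(p ** 3 for p in phi)

def partial(phi, h, k, eps=1e-4):
    """Central finite difference for the partial derivative of F_h in phi[k]."""
    up = list(phi)
    dn = list(phi)
    up[k] += eps
    dn[k] -= eps
    return (F_h(up, h) - F_h(dn, h)) / (2 * eps)

N = 200                                   # number of grid cells (arbitrary choice)
h = 1.0 / N                               # cell width, the discrete stand-in for dx
xs = [(i + 0.5) * h for i in range(N)]    # cell midpoints x_k
phi0 = [math.sin(math.pi * x) for x in xs]
eta = [x * (1.0 - x) for x in xs]

k = N // 4
grad_over_h = partial(phi0, h, k) / h     # (1/h) * dF_h/dphi_k
exact = 3.0 * phi0[k] ** 2                # delta F / delta phi evaluated at x_k

# Finite-dimensional formula (2) versus the Riemann sum of the integral in (1).
directional = sum(partial(phi0, h, j) * eta[j] for j in range(N))
riemann = h * sum(3.0 * phi0[j] ** 2 * eta[j] for j in range(N))

print(grad_over_h, exact)
print(directional, riemann)
```

In this discretized picture the factor $1/h$ is what converts the sum $\sum_k \frac{\partial F_h}{\partial \phi_k}\eta_k$ of (2) into the Riemann sum $\sum_k h\, \frac{\delta F}{\delta \phi}(x_k)\,\eta_k$ approximating the integral in (1).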
Let us consider the particular case, with $\Omega \subset \mathbb R^n$ as above,
$$F[\phi] := \int_\Omega {\cal F}(\phi(x), \nabla \phi(x)) d^nx\:, $$
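where ${\cal F}(u, y_1, \ldots, y_n)$ is a smooth function of $n+1$ real variables, evaluated at $u=\phi(x)$ and $y_k = \frac{\partial \phi}{\partial x_k}(x)$, and the class $S$
consists of smooth functions $\phi$ taking a prescribed value (a given function) on the boundary of $\Omega$.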
With these hypotheses, exchanging the integral with the derivative (by Lebesgue's dominated convergence theorem), integrating by parts, and observing that
$\eta(x) =0$ if $x \in \partial \Omega$ (this is needed in order to have $\phi_0 + \alpha \eta \in S$), so that the boundary term vanishes,
we eventually have that
$$\frac{d}{d\alpha}|_{\alpha=0} F[\phi_0 + \alpha\eta] =
\frac{d}{d\alpha}|_{\alpha=0} \int_\Omega {\cal F}\left(\phi_0(x) + \alpha \eta(x), \nabla \phi_0(x) + \alpha \nabla \eta(x)\right) d^nx =
\int_\Omega
\left.\left[\frac{\partial {\cal F}}{\partial \phi}- \sum_{k=1}^n \frac{\partial}{\partial x_k}\frac{\partial {\cal F}}{\partial \frac{\partial \phi}{\partial x_k}}\right]\right|_{\phi=\phi_0} \eta(x) d^nx\:.$$
In other words,
$$ \frac{\delta F}{\delta \phi}|_{\phi_0}=\left.\left[\frac{\partial {\cal F}}{\partial \phi}- \sum_{k=1}^n \frac{\partial}{\partial x_k}\frac{\partial {\cal F}}{\partial \frac{\partial \phi}{\partial x_k}}\right]\right|_{\phi=\phi_0}\:.$$
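As a sanity check of this formula, the following sketch verifies (1) numerically in one dimension ($n=1$, $\Omega=[0,1]$). Everything specific here is an assumption chosen for illustration and does not come from the text: the density ${\cal F}(u,y) = \frac{1}{2}y^2 + \frac{1}{2}u^2$, for which the formula gives $\frac{\delta F}{\delta \phi}|_{\phi_0} = \phi_0 - \phi_0''$, the function $\phi_0(x) = x(1-x)$, the variation $\eta(x) = \sin(\pi x)$ vanishing on the boundary, and the helper names `integrate`, `density` and `F`.

```python
import math

def integrate(g, a=0.0, b=1.0, n=2000):
    """Composite midpoint rule for the integral of g over [a, b]."""
    h = (b - a) / n
    return h * sum(g(a + (i + 0.5) * h) for i in range(n))

def density(u, y):
    """Illustrative integrand F(u, y) = y**2/2 + u**2/2 (an assumption)."""
    return 0.5 * y * y + 0.5 * u * u

def phi0(x):
    return x * (1.0 - x)

def dphi0(x):
    return 1.0 - 2.0 * x

def d2phi0(x):
    return -2.0

def eta(x):
    # Vanishes at x = 0 and x = 1, so phi0 + alpha*eta keeps the boundary values.
    return math.sin(math.pi * x)

def deta(x):
    return math.pi * math.cos(math.pi * x)

def F(alpha):
    """F[phi0 + alpha*eta], evaluated by quadrature."""
    return integrate(lambda x: density(phi0(x) + alpha * eta(x),
                                       dphi0(x) + alpha * deta(x)))

# Left-hand side of (1): central difference in alpha at alpha = 0.
a = 1e-4
lhs = (F(a) - F(-a)) / (2 * a)

# Right-hand side of (1), using dF/du - d/dx (dF/dy) = phi0 - phi0''.
rhs = integrate(lambda x: (phi0(x) - d2phi0(x)) * eta(x))

print(lhs, rhs)
```

For these particular choices both sides can also be computed by hand and equal $4/\pi + 4/\pi^3$, so the two printed numbers should agree with each other and with that value up to quadrature error.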
The extension to the case of $m$ components of $\phi$ (called, for instance, $\phi$ and $\pi$ if $m=2$) is immediate: the same formula holds separately for each component.