1) As you know,
$$
\tag 1 \theta\epsilon^{\mu \nu \alpha \beta}F_{\mu \nu}F_{\alpha \beta} = \theta\partial_{\mu}K^{\mu},
$$
where $K^{\mu}$ is the so-called Chern-Simons current.
In terms of Feynman diagrams, the term $(1)$ defines a vertex $V_{A}$ with two photons (or with two, three or four non-abelian gauge bosons), where the subscript $A$ stands for all indices. Because $(1)$ is a total derivative, this vertex is the momentum-conserving delta function multiplied by its own argument, the difference between the total ingoing and outgoing momenta:
$$
V_{A} \sim \delta (\sum_{i}p_{i} - \sum_{f}p_{f})\times (\sum_{i}p_{i} - \sum_{f}p_{f})\times \text{constant tensor}
$$
This vanishes identically, so the term $\int F\wedge F$ does not affect perturbation theory.
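A minimal numerical sketch of this statement for the abelian two-photon case. It assumes that the vertex coming from $(1)$ has the structure $\epsilon^{\mu \nu \alpha \beta}k_{1\mu}e_{1\nu}k_{2\alpha}e_{2\beta}$ with both momenta ingoing (overall constants and factors of $i$ are dropped); the function names are illustrative only. Momentum conservation forces $k_{2} = -k_{1}$, and the antisymmetry of $\epsilon$ then kills the vertex:

```python
import itertools
import numpy as np


def levi_civita_4():
    """Rank-4 Levi-Civita symbol with eps[0, 1, 2, 3] = +1."""
    eps = np.zeros((4, 4, 4, 4))
    for perm in itertools.permutations(range(4)):
        inversions = sum(
            1 for i in range(4) for j in range(i + 1, 4) if perm[i] > perm[j]
        )
        eps[perm] = -1.0 if inversions % 2 else 1.0
    return eps


EPS = levi_civita_4()


def two_photon_vertex(k1, e1, k2, e2):
    """eps^{mnab} k1_m e1_n k2_a e2_b, up to an overall constant (assumed form)."""
    return np.einsum("mnab,m,n,a,b->", EPS, k1, e1, k2, e2)


k = np.array([1.0, 0.3, -0.2, 0.5])   # arbitrary ingoing momentum
e1 = np.array([0.0, 1.0, 0.0, 0.0])   # arbitrary polarization vectors
e2 = np.array([0.0, 0.0, 1.0, 0.0])

# Momentum conservation (both photons ingoing): k2 = -k1, so the vertex vanishes.
conserved_value = two_photon_vertex(k, e1, -k, e2)
# Without the delta-function constraint the same contraction is generically nonzero.
generic_value = two_photon_vertex(k, e1, np.array([0.0, 0.0, 0.0, 1.0]), e2)
```

Only the contraction structure matters here, so the metric signature and the normalization of the vertex play no role in the check.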
2) Since $\epsilon_{\mu \nu \alpha \beta}$ is a pseudotensor while $F$ is a tensor (here I take into account only the gluon field strength tensor $F \equiv G_{\mu \nu}^{a}$, since the effects of the corresponding electroweak and electromagnetic terms can be absorbed by chiral rotations of the fermion fields), under the discrete Lorentz transformations $T$ (with $T_{\mu \nu}x^{\nu} = (-x_{0}, \mathbf x)_{\mu}$) and $P$ (with $P_{\mu \nu}x^{\nu} = (x_{0}, -\mathbf x)_{\mu}$) we have
$$
\hat{T}\epsilon_{\mu \nu \alpha \beta}G^{\mu \nu}_{a}G^{\alpha \beta}_{a}\hat{T} = -\epsilon_{\mu \nu \alpha \beta}G_{a}^{\mu \nu}G_{a}^{\alpha \beta}, \quad \hat{P}\epsilon_{\mu \nu \alpha \beta}G_{a}^{\mu \nu}G_{a}^{\alpha \beta}\hat{P} = -\epsilon_{\mu \nu \alpha \beta}G_{a}^{\mu \nu}G_{a}^{\alpha \beta}
$$
I.e., the term $\theta \epsilon_{\mu \nu \alpha \beta}G^{\mu \nu}_{a}G^{\alpha \beta}_{a}$ is a pseudoscalar.
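For orientation, here is the abelian analogue written in terms of the electric and magnetic fields. The conventions $F_{0i} = E_{i}$, $F_{ij} = -\epsilon_{ijk}B_{k}$, $\epsilon^{0123} = +1$ are an assumption made for this sketch (the overall sign and factor depend on them); the same decomposition applies to each colour component $G^{a}_{\mu \nu}$ with chromoelectric and chromomagnetic fields:
$$
\epsilon^{\mu \nu \alpha \beta}F_{\mu \nu}F_{\alpha \beta} = 4\epsilon^{0ijk}F_{0i}F_{jk} = -4\epsilon_{ijk}\epsilon_{jkl}E_{i}B_{l} = -8\,\mathbf E \cdot \mathbf B
$$
Under $P$ one has $\mathbf E \to -\mathbf E$, $\mathbf B \to \mathbf B$, and under $T$ one has $\mathbf E \to \mathbf E$, $\mathbf B \to -\mathbf B$, so $\mathbf E \cdot \mathbf B$ changes sign in both cases.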
In particular, this term breaks the $CP$ invariance of the theory. What is the consequence of this breaking? If you look at the QCD Lagrangian in the zero-temperature confined phase (which is computed non-perturbatively), for example at the nuclear interaction of protons $p$, neutrons $n$ and pions $\pi$, this term shows up as a pseudoscalar (i.e., also $CP$-breaking) $\pi NN$ coupling:
$$
\tag 2 L_{\pi , p, n} = \bar{\Psi}\pi^{a}t_{a}(i\gamma_{5}g_{\pi NN} + \bar{g}_{\pi NN})\Psi , \quad \Psi = \begin{pmatrix} p\\ n \end{pmatrix}
$$
where $\bar{g}_{\pi NN} \sim \theta$. By adding the interaction of the nucleons with the EM field, you can compute the diagram that defines the neutron electric dipole moment and see that it is nonzero and proportional to the $CP$-violating coupling. How can this be understood qualitatively? It is not hard to see.
The interaction Hamiltonian of an uncharged particle with spin $\mathbf S$ in the EM field $\mathbf E, \mathbf B$ is
$$
\tag 3 H = -\mu \left(\mathbf B \cdot \frac{\mathbf S}{|\mathbf S|}\right) - d\left( \mathbf E \cdot \frac{\mathbf S}{|\mathbf S|}\right)
$$
Since the magnetic field $\mathbf B$ is a pseudovector, the electric field $\mathbf E$ is a vector, and the spin $\mathbf S$ is a pseudovector, under the $P$ transformation we have
$$
\hat{P}\left(\mathbf B \cdot \frac{\mathbf S}{|\mathbf S|}\right)\hat{P} = \left(\mathbf B \cdot \frac{\mathbf S}{|\mathbf S|}\right),
$$
$$
\hat{P}\left(\mathbf E \cdot \frac{\mathbf S}{|\mathbf S|}\right)\hat{P} = -\left(\mathbf E \cdot \frac{\mathbf S}{|\mathbf S|}\right)
$$
So the presence of a nonzero dipole moment $d$ in $(3)$ directly violates $P$ invariance (and, because $\mathbf E \cdot \mathbf S$ is also odd under $T$, $CP$ invariance as well, by the $CPT$ theorem),
$$
\hat{P}H\hat{P} \neq H
$$
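The transformation rules can be checked on numbers. This is only a sketch: the field values, $\mu$ and $d$ are arbitrary illustrative inputs, and the time-reversal rule ($\mathbf E$ even, $\mathbf B$ and $\mathbf S$ odd) is an addition that is not spelled out in the text above.

```python
import numpy as np


def dipole_hamiltonian(B, E, S, mu, d):
    """H = -mu (B . S/|S|) - d (E . S/|S|), as in (3)."""
    s_hat = S / np.linalg.norm(S)
    return -mu * np.dot(B, s_hat) - d * np.dot(E, s_hat)


def parity(B, E, S):
    """P: E is a vector (flips sign); B and S are pseudovectors (unchanged)."""
    return B, -E, S


def time_reversal(B, E, S):
    """T: E is even; B and S are odd."""
    return -B, E, -S


B = np.array([0.0, 0.0, 1.0])
E = np.array([0.0, 0.0, 2.0])
S = np.array([0.0, 0.0, 0.5])
mu, d = 1.5, 0.1  # illustrative values in arbitrary units

H = dipole_hamiltonian(B, E, S, mu, d)
H_P = dipole_hamiltonian(*parity(B, E, S), mu, d)
H_T = dipole_hamiltonian(*time_reversal(B, E, S), mu, d)

# The magnetic term is invariant; only the d-term changes sign.
shift_P = H_P - H  # equals 2 d (E . S/|S|)
shift_T = H_T - H  # the same shift
```

Setting `d = 0` makes both shifts vanish, which is the statement that only the electric dipole term is responsible for the violation.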
Formally, it is computed in the way described above: the gluon $\theta$ term $(1)$ generates the pseudoscalar nuclear coupling $(2)$, and this coupling in turn generates the vertex between the neutron and the EM field that defines the dipole moment.
The $\theta$ term has no effect in classical physics: being a total derivative, it does not change the equations of motion, and it affects the physics only through quantum loops.
3) When non-perturbative effects arise
Formally, the information about all non-perturbative effects is contained in the path integral of the theory, which is equal to the Green's function in the Heisenberg representation. It is lost, however, when we try to use a completely perturbative approach at all scales of the theory.
Note that there are three important facts connected with the expansion in "tiny" couplings (QED, QCD, etc.).
The first is that in interacting theories the couplings run, i.e. they differ from scale to scale. For example, the QED coupling is small up to very large energies, but there are scales where it grows quickly (near the so-called Landau pole), so that perturbation theory becomes inapplicable. Another example: the QCD coupling grows quickly as the scale decreases towards $\sim \Lambda_{QCD}$, below which perturbation theory becomes inapplicable.
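Both behaviours can be seen already in the textbook one-loop running formulas. The sketch below is an illustration under stated assumptions: $\Lambda_{QCD} = 0.2$ GeV, $n_{f} = 3$ and a single electron loop in QED are choices made here, not values taken from the text, and the function names are hypothetical.

```python
import math


def alpha_s_one_loop(q_gev, lambda_qcd_gev=0.2, n_f=3):
    """One-loop QCD coupling: 12 pi / ((33 - 2 n_f) ln(Q^2 / Lambda^2))."""
    if q_gev <= lambda_qcd_gev:
        raise ValueError("one-loop formula is only meaningful for Q > Lambda_QCD")
    return 12.0 * math.pi / ((33 - 2 * n_f) * math.log(q_gev**2 / lambda_qcd_gev**2))


def alpha_qed_one_loop(log_q_over_me, alpha_0=1 / 137.035999):
    """One-loop QED coupling (electron loop only) as a function of ln(Q / m_e)."""
    return alpha_0 / (1.0 - (2.0 * alpha_0 / (3.0 * math.pi)) * log_q_over_me)


def qed_landau_pole_log(alpha_0=1 / 137.035999):
    """Value of ln(Q_pole / m_e) at which the one-loop QED coupling diverges."""
    return 3.0 * math.pi / (2.0 * alpha_0)


# QCD: the coupling grows as Q decreases towards Lambda_QCD.
qcd_values = [alpha_s_one_loop(q) for q in (100.0, 10.0, 1.0, 0.21)]
# QED: the pole sits at an astronomically large scale, ln(Q/m_e) of order several hundred.
landau_log = qed_landau_pole_log()
```

The logarithm is returned instead of the pole scale itself because the scale is far beyond any physically meaningful energy; the point is only that the growth exists and is extremely slow.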
The second is that the effective expansion parameter changes in the infrared region of momenta. Two examples:
Electron-proton interaction. The electron-proton scattering matrix element in the momentum representation must have a pole at $p_{0}^{\text{pole}} = m_{P} + m_{e} - 13.6 \text{ eV}$. But no single term of perturbation theory generates such a pole; we need to take into account the sum of all diagrams near the given pole. The origin of the pole is not hard to understand if we consider a diagram in which the electron and proton momenta are small in the CM frame, $|\mathbf p| \ll m_{e}$, while the intermediate state of the scattering is characterized by slightly different momenta. It can be shown that such a diagram has the effective coupling constant $\frac{e^{2}m_{e}}{q^{2}}$, which is large for $q^{2} < e^{2}m_{e}$. This scale is the scale of the bound state. In general, every bound state is invisible in perturbation theory, since the renormalization constant $Z$ vanishes for it.
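As a quick check of the numbers involved, the $13.6$ eV in the pole position is the non-relativistic Coulomb ground-state binding energy $\alpha^{2}\mu/2$, with $\mu$ the reduced mass. The sketch below uses rounded values of the constants, and the function names are illustrative.

```python
ALPHA = 1 / 137.035999   # fine-structure constant
M_E_EV = 510_998.95      # electron mass in eV (rounded)
M_P_EV = 938_272_088.0   # proton mass in eV (rounded)


def ground_state_binding_ev(alpha=ALPHA, m_e=M_E_EV, m_p=M_P_EV):
    """Non-relativistic Coulomb ground state: E_b = alpha^2 * mu / 2, mu = reduced mass."""
    reduced_mass = m_e * m_p / (m_e + m_p)
    return 0.5 * alpha**2 * reduced_mass


def bound_state_pole_ev(alpha=ALPHA, m_e=M_E_EV, m_p=M_P_EV):
    """Position of the pole in the total CM energy: m_p + m_e - E_b."""
    return m_p + m_e - ground_state_binding_ev(alpha, m_e, m_p)


def bohr_momentum_ev(alpha=ALPHA, m_e=M_E_EV):
    """Typical bound-state momentum alpha * m_e, far below m_e."""
    return alpha * m_e


binding = ground_state_binding_ev()
pole = bound_state_pole_ev()
momentum = bohr_momentum_ev()
```

The typical momentum $\alpha m_{e}$ is a few keV, which is the regime $|\mathbf p| \ll m_{e}$ discussed above.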
EW phase transition. When we compute the effective action of the EW theory near the phase transition point, we cannot use a perturbative approach based on a naive expansion in $\sim \frac{m(\langle H \rangle)}{T}$, since the thermal distribution function of bosons is large at small energies (in the infrared region). Contrary to the naive expectation, the effective coupling constant is $\sim \frac{T}{m(\langle H \rangle )}$.
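The infrared enhancement can be made explicit with the Bose-Einstein distribution. This is a schematic estimate, with $g$ denoting a generic coupling that is not defined in the text above:
$$
n_{B}(E) = \frac{1}{e^{E/T} - 1} \simeq \frac{T}{E} \quad \text{for } E \ll T ,
$$
so each additional loop of a light boson with $E \sim m(\langle H \rangle)$ costs roughly $g^{2}n_{B}(m) \sim g^{2}\frac{T}{m(\langle H \rangle)}$ rather than $g^{2}$.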
The third arises when we try to sum all the diagrams of perturbation theory. Since the perturbation series (in the coupling constant) has zero radius of convergence (the expansion terms grow as $n!$), we need techniques such as Borel resummation to restore the full result. However, in some cases it is inapplicable. The first important case is that of the classical solutions of the equations of motion of the theory called instantons. They are stationary points of the quantity $\int D\varphi \oint dg g^{-n-1}e^{-S[g, \varphi]}$, where $n$ is an integer. The corresponding Borel transform usually has a pole on the negative real axis (as in QCD). The second case is the infrared renormalon, which arises in the operator expansion of, for example, all the diagrams in QCD that contain two 4-currents. The corresponding set of diagrams generates a pole on the positive real axis, which makes the use of perturbation theory completely inapplicable.
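A toy illustration of factorial growth and Borel resummation. The series $\sum_{n}(-1)^{n}n!\,g^{n}$ is an assumption chosen for this sketch, not a series from any of the theories above; its Borel transform $1/(1+t)$ has its only pole on the negative real axis, so the Borel integral $\int_{0}^{\infty}dt\,e^{-t}/(1+gt)$ is well defined.

```python
import math


def partial_sum(g, n_terms):
    """Partial sum of the toy asymptotic series sum_n (-1)^n n! g^n."""
    return sum((-1) ** n * math.factorial(n) * g**n for n in range(n_terms))


def borel_sum(g, t_max=60.0, steps=60_000):
    """Borel sum int_0^inf e^{-t} / (1 + g t) dt via Simpson's rule on [0, t_max]."""
    if steps % 2:
        steps += 1  # Simpson's rule needs an even number of intervals
    h = t_max / steps

    def integrand(t):
        return math.exp(-t) / (1.0 + g * t)

    total = integrand(0.0) + integrand(t_max)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * integrand(i * h)
    return total * h / 3.0


g = 0.1
best_truncation = partial_sum(g, 10)  # terms are smallest around n ~ 1/g
late_truncation = partial_sum(g, 60)  # factorial growth has taken over
resummed = borel_sum(g)
```

Truncating near the smallest term reproduces the resummed value to a few parts in $10^{4}$, while adding more terms makes the partial sums blow up. A pole on the positive real axis would sit on the integration contour and make the integral ambiguous, which is the obstruction mentioned for the renormalon case.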
How to calculate non-perturbative effects
Because there are many cases in which perturbation theory fails, there also exist many approaches for obtaining the corresponding non-perturbative results.
- If you want to calculate the poles associated with bound states, you have to solve the EOM for the one-particle state of one of the constituent particles, in which the effects of the other constituents (i.e., their operators in the EOM) are replaced by an external potential (for example, in the electron-proton problem the effect of the proton is the central Coulomb potential). Then, for this range of energies, you may introduce an elementary hydrogen-atom field into the Lagrangian by a Hubbard-Stratonovich transformation of the path integral (an example is given in Weinberg's QFT Vol. 1, paragraph 14).
- Another famous exactly solvable case is spontaneous symmetry breaking (as a result of which $p^{2} = 0$ poles arise), for which you have to do the following:
  - Determine the unbroken group;
  - Extract the Goldstone degrees of freedom (whose number coincides with the number of broken generators of the symmetry group) by using them to parametrize the coordinates of the elements of the broken group;
  - By explicit calculation, construct invariant forms from these elements, which determine the Lagrangian of the effective degrees of freedom after symmetry breaking.

  The resulting theory contains elementary fields which correspond to bound states of the underlying theory.
- If you want to calculate the effects of solutions of the classical EOM in the quantum theory (you have to take these solutions into account because otherwise the cluster decomposition principle for the S-matrix fails, which is well explained in Weinberg's QFT Vol. 2), you have to find the homotopy group of the symmetry group of the given theory, then find non-trivial representatives of the homotopy classes of that group (i.e., solutions $\varphi_{\text{classical}}$ of the EOM which correspond to different values of the Maurer-Cartan invariant), then find the number of collective degrees of freedom of the solution, and finally, with these results in hand, calculate the coefficient multiplying $e^{-S[\varphi_{\text{classical}}]}$ in the path integral (an example is given in Weinberg's QFT Vol. 2, paragraph 23.5).
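To see why the factor $e^{-S[\varphi_{\text{classical}}]}$ can never be produced by perturbation theory, here is a small sketch. It assumes the standard normalization in which a Yang-Mills instanton of winding number $n$ has action $8\pi^{2}|n|/g^{2}$, with $g$ the gauge coupling (a symbol not used in the text above); the prefactor coming from the collective coordinates is omitted, and the function names are illustrative.

```python
import math


def instanton_action(g, winding=1):
    """Classical action 8 pi^2 |n| / g^2 of a self-dual configuration with winding n."""
    return 8.0 * math.pi**2 * abs(winding) / g**2


def instanton_weight(g, winding=1):
    """Semiclassical suppression factor exp(-S); the collective-coordinate prefactor is omitted."""
    return math.exp(-instanton_action(g, winding))


def weight_over_power(g, n):
    """Ratio exp(-S(g)) / g^(2n): tends to zero as g -> 0 for every fixed n."""
    return instanton_weight(g) / g ** (2 * n)


action_at_unit_coupling = instanton_action(1.0)
ratio_strong = weight_over_power(1.0, 10)
ratio_weak = weight_over_power(0.5, 10)
```

Since the weight vanishes faster than any power of $g^{2}$ as $g \to 0$, all of its Taylor coefficients at $g = 0$ are zero, so it is invisible at every finite order of the coupling expansion.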