As the OP has pointed out, when we compute the magnetic field generated by a current loop or other localized current configuration, treating the loop as a pure dipole and dropping all higher multipole moments is valid only at distances much greater than the length scale of the current configuration.
When we are treating the current loop or other localized current configuration as the object that the magnetic field acts on, the condition for approximating it as a pure dipole is different. To discover what that condition is, it is more convenient to work with the potential energy $U$, which can be expressed in terms of the vector potential $A$:
$$ U = - \int J \cdot A \, \mathrm{d}^3 r = - \int J_i(A_i(0) + r_j \partial_j A_i(0) + \frac{1}{2} r_j r_k \partial_j \partial_k A_i(0) + \ldots ) \, \mathrm{d}^3 r
$$
where we are expanding $A$ as a Taylor series around the origin, which is assumed to lie within the localized current configuration.
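It can be shown that the first term on the right-hand side, $-A_i(0) \int J_i \, \mathrm{d}^3 r$, vanishes (for a steady, localized current $\int J_i \, \mathrm{d}^3 r = 0$), and that the second term is equal to $-\mu \cdot B(0)$, with $\mu = \frac{1}{2} \int r \times J \, \mathrm{d}^3 r$ the magnetic dipole moment of the localized current configuration (this uses the fact that $\int r_j J_i \, \mathrm{d}^3 r$ is antisymmetric in $i$ and $j$ for such a current). This is done here.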
Notice that the subsequent terms in the Taylor series involve the product of $J$ with $\ell$ factors of $r$ (i.e. something similar to a higher multipole moment, where $\ell = 2$ corresponds to the quadrupole moment and so on) and the $\ell$-fold derivatives of $A$ at the origin, i.e. something similar to the $(\ell-1)$-fold derivatives of $B$ at the origin.
This leads to the following observation: when calculating the potential energy of a localized current configuration in an external magnetic field, we can neglect moments higher than the dipole moment provided that each additional factor of $r\partial$ makes the resulting term much smaller than the previous one. If the configuration has size $a$ and the external field varies over a length scale $L$, each such factor contributes roughly $a/L$, so the condition is $a \ll L$; another way of saying this is that each component of $B$ doesn't vary appreciably across the localized current configuration.
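To make this concrete, here is a rough numerical sketch (not part of the derivation above). It assumes a specific setup chosen purely for illustration: the external field is that of an infinite straight wire along the $z$-axis, and the current configuration is a circular loop of radius `a` in the $xz$-plane centred a distance `d` from the wire, so `d` plays the role of the length scale over which $B$ varies. The function names and parameters are my own. It compares $U = -I \oint A \cdot \mathrm{d}l$ (the line-current version of $-\int J \cdot A \, \mathrm{d}^3 r$) with the dipole estimate $-\mu \cdot B(0)$.

```python
import math

MU0 = 4e-7 * math.pi  # vacuum permeability in SI units (T*m/A)

def wire_A_z(x, y, i_wire):
    """z-component of the vector potential of an infinite straight wire on the z-axis."""
    rho = math.hypot(x, y)
    return -MU0 * i_wire / (2 * math.pi) * math.log(rho)

def exact_energy(i_loop, a, d, i_wire, n=2000):
    """U = -I * (closed line integral of A . dl) around a circular loop.

    The loop has radius a, lies in the xz-plane and is centred at (d, 0, 0).
    Requires a < d so that the loop does not cross the wire.
    """
    dt = 2 * math.pi / n
    total = 0.0
    for k in range(n):
        t = (k + 0.5) * dt            # midpoint rule on the loop parameter
        x = d + a * math.cos(t)
        dz = a * math.cos(t) * dt     # only dz matters because A points along z
        total += wire_A_z(x, 0.0, i_wire) * dz
    return -i_loop * total

def dipole_energy(i_loop, a, d, i_wire):
    """Dipole estimate -mu . B(0), with B evaluated at the loop centre."""
    # Traversing the loop from +x towards +z makes its normal point along -y,
    # while the wire's field at the loop centre points along +y.
    mu_y = -i_loop * math.pi * a**2
    b_y = MU0 * i_wire / (2 * math.pi * d)
    return -mu_y * b_y

for ratio in (0.5, 0.1, 0.01):
    d = 1.0
    a = ratio * d
    u_exact = exact_energy(1.0, a, d, 1.0)
    u_dip = dipole_energy(1.0, a, d, 1.0)
    rel_err = abs(u_exact - u_dip) / abs(u_exact)
    print(f"a/d = {ratio:5.2f}  relative error of dipole estimate = {rel_err:.2e}")
```

For this particular geometry the exact energy is known in closed form, $\mu_0 I I_w \left(d - \sqrt{d^2 - a^2}\right)$, so the relative error of the dipole estimate should shrink like $(a/d)^2/4$ for small $a/d$. The correction enters at second order here because the term linear in the field gradient cancels by symmetry for a loop centred on the expansion point.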
In the special case of a uniform external magnetic field, all derivatives of $B$ vanish, so the dipole approximation for the potential energy is exact, and likewise for the torque (which is just the derivative of the potential energy with respect to an angle, with the sign flipped).
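As a quick sanity check of the uniform-field statement, here is a small sketch under assumptions of my own choosing: a circular loop of radius `a` tilted by an angle `theta` about the $x$-axis, a uniform field $B = B_0 \hat{z}$, and the gauge $A = \frac{1}{2} B \times r$ (any other gauge would give the same closed-loop integral). The function names are illustrative only. The energy from $-I \oint A \cdot \mathrm{d}l$ should agree with $-\mu \cdot B$ for any loop size, and a finite-difference derivative with respect to `theta` should reproduce the torque.

```python
import math

def loop_energy(i_loop, a, b0, theta, n=720):
    """U = -I * (closed line integral of A . dl) in a uniform field B = b0 * z_hat.

    Uses A = (B x r) / 2 = (b0 / 2) * (-y, x, 0).
    The loop (radius a) starts in the xy-plane and is tilted by theta about the x-axis.
    """
    dt = 2 * math.pi / n
    total = 0.0
    for k in range(n):
        t = (k + 0.5) * dt
        x = a * math.cos(t)
        y = a * math.sin(t) * math.cos(theta)
        dx = -a * math.sin(t) * dt
        dy = a * math.cos(t) * math.cos(theta) * dt
        total += 0.5 * b0 * (-y * dx + x * dy)   # A . dl, since A_z = 0
    return -i_loop * total

def dipole_energy(i_loop, a, b0, theta):
    """-mu . B with mu = I * pi * a^2 * n_hat and n_hat . z_hat = cos(theta)."""
    return -i_loop * math.pi * a**2 * b0 * math.cos(theta)

def torque_about_x(i_loop, a, b0, theta, h=1e-5):
    """Torque = -dU/dtheta, estimated with a central finite difference."""
    u_plus = loop_energy(i_loop, a, b0, theta + h)
    u_minus = loop_energy(i_loop, a, b0, theta - h)
    return -(u_plus - u_minus) / (2 * h)

for a in (0.1, 1.0, 50.0):
    theta = 0.7
    diff = loop_energy(1.0, a, 1.0, theta) - dipole_energy(1.0, a, 1.0, theta)
    print(f"a = {a:5.1f}  U_exact - U_dipole = {diff:.1e}")
```

The difference should stay at the level of floating-point rounding however large the loop is, in contrast to the non-uniform case, and `torque_about_x` should match $-\mu B_0 \sin\theta$, the $x$-component of $\mu \times B$ for this geometry.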