It depends on what you mean by "EM waves are transverse". If you mean "the $\mathbf E$ and $\mathbf B$ fields are always orthogonal to the direction of propagation", where you're willing to take a case-by-case approach to the direction of propagation (or you want to use e.g. the time-averaged Poynting vector or some similar measure for the direction of propagation) then the assertion is false.
For most working physicists, however, that's a bit of a wonky understanding of what 'transverse' means, particularly for fields with a complex spatial dependence where you do not have a well-defined direction of propagation. In those cases, as AccidentalFourierTransform points out, the correct understanding of the phrase is purely in terms of the scalar Maxwell equations in vacuum:
$$
\nabla\cdot\mathbf E = 0 = \nabla\cdot\mathbf B.
$$
This gives you your transversality constraint: it's phrased in a wonky differential-equation language, but at each point it essentially gives you a direction in which each field cannot point, based on its local coordinate dependence.
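One way to unpack that constraint is to go to plane-wave components. This is only a sketch: it assumes the field is well-behaved enough to be written as a Fourier integral, it suppresses the time dependence, and $\tilde{\mathbf E}(\mathbf k)$ is a symbol introduced here for the plane-wave amplitudes.
$$
\mathbf E(\mathbf r) = \int \tilde{\mathbf E}(\mathbf k)\, e^{i\mathbf k\cdot\mathbf r}\,\mathrm d^3\mathbf k
\quad\implies\quad
\nabla\cdot\mathbf E(\mathbf r) = \int i\,\mathbf k\cdot\tilde{\mathbf E}(\mathbf k)\, e^{i\mathbf k\cdot\mathbf r}\,\mathrm d^3\mathbf k = 0,
$$
which requires $\mathbf k\cdot\tilde{\mathbf E}(\mathbf k)=0$ for every $\mathbf k$, and similarly for $\mathbf B$. So each plane-wave component is transverse to its own $\mathbf k$, but nothing in the constraint singles out one direction for the field as a whole.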
On the other hand, this can give some weird results if you want to stick to some more intuitive understanding of the "propagation direction" of the field. The clearest example of this is in a waveguide, which has a well-defined "direction" in the axis of the waveguide, but for which the electric or magnetic fields can have nonzero components in that direction. These components are typically small (and they vanish in the limit where the wavelength is much smaller than the waveguide width) but they are nonzero, and they can be important for applications. (For an example of these components in action (and/or shameless plug), see this paper.)
In essence, the way this works is that a confined waveguide mode, such as
$$
\mathbf E(\mathbf r,t) = E_0\hat{\mathbf e}_y \sin(x/L)\cos(kz-\omega t),
$$
is only confined because it is a standing wave in the $x$ direction, and this is best seen as the superposition of two travelling waves:
$$
\mathbf E(\mathbf r,t) = \mathrm{Re}\mathopen{}\left(
E_0\hat{\mathbf e}_y \frac{1}{2i}\left(e^{i(x/L+kz-\omega t)}-e^{i(-x/L+kz-\omega t)}\right)
\right)\mathclose{}.
$$
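If you want to convince yourself of that identity numerically, here is a minimal sketch in plain Python. The function names and the default values of `L`, `k`, `omega` and `E0` are arbitrary choices for illustration, not anything fixed by the physics; the identity holds for any values.

```python
import cmath
import math


def standing_wave(x, z, t, L=1.0, k=5.0, omega=2.0, E0=1.0):
    """y component of the confined mode, written as a standing wave in x."""
    return E0 * math.sin(x / L) * math.cos(k * z - omega * t)


def two_plane_waves(x, z, t, L=1.0, k=5.0, omega=2.0, E0=1.0):
    """Same component, as the real part of two travelling plane waves."""
    wave_plus = cmath.exp(1j * (x / L + k * z - omega * t))
    wave_minus = cmath.exp(1j * (-x / L + k * z - omega * t))
    return (E0 * (wave_plus - wave_minus) / 2j).real


def max_mismatch(points):
    """Largest absolute difference between the two forms over (x, z, t) points."""
    return max(abs(standing_wave(*p) - two_plane_waves(*p)) for p in points)
```

Calling `max_mismatch` on a handful of `(x, z, t)` tuples should return something at the level of floating-point rounding.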
Here both waves are transverse, and (assuming $kL\gg 1$) they both have $\mathbf k$ vectors, $\mathbf k_\pm = \pm\hat{\mathbf e}_x/L + k\hat{\mathbf e}_z$, close to but not quite on the waveguide axis $\hat{\mathbf e}_z$. Thus, the electric and magnetic fields for each component are orthogonal to the propagation direction of each plane wave, but since there is more than one such propagation direction, the global field gets a bit confused about what that term actually means.
The specific example above is a transverse electric (TE) mode, with the electric field orthogonal to the waveguide axis, so it may look fully transverse at first, but with a bit of thought you can see that the corresponding magnetic field needs to point in the $x,z$ plane, and therefore it's a bit more complicated. Finding this magnetic field is easy, since we can just plop in the corresponding field for each plane-wave component (which points along $\mathbf k_\pm\times\hat{\mathbf e}_y$), but now you need to account for the fact that the two magnetic components are not parallel:
\begin{align}
\mathbf B(\mathbf r,t)& = \mathrm{Re}\mathopen{}\left(
B_0 \frac{1}{2i}\left(
\frac{-kL\hat{\mathbf e}_x+\hat{\mathbf e}_z}{\sqrt{1+k^2L^2}} e^{i(x/L+kz-\omega t)}
-\frac{-kL\hat{\mathbf e}_x-\hat{\mathbf e}_z}{\sqrt{1+k^2L^2}} e^{i(-x/L+kz-\omega t)}\right)
\right)\mathclose{}
\\& =
-B_0\frac{kL\hat{\mathbf e}_x}{\sqrt{1+k^2L^2}}\sin(x/L)\cos(kz-\omega t)
+\frac{B_0\hat{\mathbf e}_z}{\sqrt{1+k^2L^2}}\cos(x/L)\sin(kz-\omega t)
.
\end{align}
In particular, note that now you have a nonzero $\hat{\mathbf e}_z$ component (though again this is small when $kL\gg 1$, i.e. when the wavelength is much smaller than the waveguide width). You can also check directly that this field satisfies $\nabla\cdot\mathbf B=0$. Does the $\hat{\mathbf e}_z$ component make the magnetic field stop being "transverse"? That's up to what you want to make of the term, really.
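As a sanity check on this construction, here is a rough numerical sketch in plain Python. It builds $\mathbf B$ for each plane wave along $\mathbf k\times\hat{\mathbf e}_y$, adds the two contributions, and estimates $\nabla\cdot\mathbf B$ by finite differences. The function names, the default parameter values, the step size `h`, and the use of the vacuum dispersion relation $\omega = c\sqrt{k^2+1/L^2}$ are all assumptions made for this illustration.

```python
import cmath
import math


def plane_wave_b_direction(kx, kz):
    """Unit vector along k x e_y for k = (kx, 0, kz): the B direction of one plane wave."""
    norm = math.hypot(kx, kz)
    return (-kz / norm, 0.0, kx / norm)


def b_field(x, z, t, L=1.0, k=5.0, B0=1.0, c=1.0):
    """B(x, z, t) as the real part of the sum of the two plane-wave contributions."""
    omega = c * math.hypot(1.0 / L, k)  # vacuum dispersion, assumed for this sketch
    total = [0j, 0j, 0j]
    for sign in (+1, -1):
        direction = plane_wave_b_direction(sign / L, k)
        # The two waves enter with opposite signs, as in the (1/2i)(e^{+} - e^{-}) combination.
        amplitude = sign * B0 / 2j * cmath.exp(1j * (sign * x / L + k * z - omega * t))
        for i in range(3):
            total[i] += direction[i] * amplitude
    return tuple(component.real for component in total)


def divergence_b(x, z, t, h=1e-5, **params):
    """Central-difference estimate of dBx/dx + dBz/dz (nothing depends on y)."""
    dbx_dx = (b_field(x + h, z, t, **params)[0] - b_field(x - h, z, t, **params)[0]) / (2 * h)
    dbz_dz = (b_field(x, z + h, t, **params)[2] - b_field(x, z - h, t, **params)[2]) / (2 * h)
    return dbx_dx + dbz_dz
```

For any sample point, `divergence_b` should come out at the level of the finite-difference error, while `b_field(...)[2]` is generally nonzero: the field is transverse in the $\nabla\cdot\mathbf B=0$ sense even though it has a component along the waveguide axis.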
While this is the simplest example, it should be clear that the situation is pretty generic any time that vector optics comes into play. Thus, you have equivalent effects in a tight Gaussian focus (example), general waveguides (example), spherical waves (cf. Jackson 3rd ed. §9.7), and so on. These admit a similar explanation to the above in terms of a continuous superposition of plane waves (i.e. they can be decomposed as Fourier transforms of a bunch of transverse plane waves with $\mathbf k\cdot\mathbf E=0 = \mathbf k \cdot\mathbf B$), but any time a point gets significant contributions from multiple different directions of $\mathbf k$ vectors, you will have some trouble establishing a unique 'direction of propagation'.