I find the "intuitive number" argument very misleading. First, what is $N$? The state of the electrostatic field around a charged point particle is a coherent state, which is a superposition of states with different numbers of photons, so it isn't really possible to say that a definite number of photons is "spreading out" from a central source. Second, as you point out, there is no good reason why the intuitive argument would fail for massive photons. Third, in the particle model with finite $N$, you would expect the law to break down far enough from the source: the photons would be spread very thin over a large spherical area, so the "rate" of photons arriving in a given patch of that area would drop and you would expect to see "shot noise" in the force. This has never been observed. Finally, the photons that appear in this kind of "static" force are virtual particles, not real particles. They aren't really "propagating particles" but more like a colorful set of words for mathematical terms in a perturbative expansion, and those terms don't need to obey things like continuity equations that hold on shell (i.e., when the classical equations of motion are satisfied).
The straightforward explanation for the difference between the $1/r^2$ law and Yukawa suppression is simply that these force laws follow from the Green's functions (aka propagators) of the massless and massive Poisson equations, respectively.
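To spell this out, here is a sketch in units with $\hbar = c = 1$, where $m$ denotes the mass of the exchanged particle. The symbols $G_0$ and $G_m$ and the sign and normalization conventions are one common choice that I am assuming here, not the only one.
\begin{equation}
\nabla^2 G_0 = -\delta^3(\vec{x}) \implies G_0(r) = \frac{1}{4\pi r}
\end{equation}
\begin{equation}
(\nabla^2 - m^2)\, G_m = -\delta^3(\vec{x}) \implies G_m(r) = \frac{e^{-mr}}{4\pi r}
\end{equation}
Differentiating with respect to $r$ gives a force proportional to $1/r^2$ in the massless case and to $(1 + mr)\, e^{-mr}/r^2$ in the massive case, which reduces to the inverse square law as $m \to 0$.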
If you want a deeper explanation of why massive and massless particles behave differently, in my view the $1/r^2$ law is really a consequence of Gauss's law
\begin{equation}
\nabla \cdot \vec{E} = \frac{\rho}{\epsilon_0} \implies E \sim \frac{1}{r^2}\ \ {\rm (point\ charge)}
\end{equation}
and in fact Gauss's law is deeply tied to the masslessness of the photon. In the Hamiltonian formulation of electrodynamics, Gauss's law is a first-class constraint associated with the $U(1)$ gauge invariance of the photon. In the massive case the gauge invariance is broken, so this first-class constraint becomes two second-class constraints, and Gauss's law no longer holds in its original form. In turn, nothing forces the field to fall off as $1/r^2$, which allows the Yukawa solution to exist.
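A quick numerical sketch of what "no longer holds in its original form" means in practice. I am assuming a static potential $\phi = q\, e^{-mr}/(4\pi\epsilon_0 r)$, which satisfies $(\nabla^2 - m^2)\phi = -\rho/\epsilon_0$ for a point charge, so that $\nabla \cdot \vec{E} = \rho/\epsilon_0 - m^2 \phi$. The function names, the units ($q = \epsilon_0 = 1$, $m$ as an inverse length) and the sample radii are all illustrative choices of mine. The flux of $\vec{E}$ through a sphere is independent of radius only when $m = 0$:

```python
import math


def radial_field(r, m=0.0, q=1.0, eps0=1.0):
    """Radial field E_r = -d(phi)/dr for phi = q * exp(-m r) / (4 pi eps0 r).

    Assumed units: m is an inverse length (hbar = c = 1); m = 0 is Coulomb.
    """
    return q * math.exp(-m * r) * (1.0 + m * r) / (4.0 * math.pi * eps0 * r**2)


def flux_through_sphere(r, m=0.0, q=1.0, eps0=1.0):
    """Flux of E through a sphere of radius r centred on the charge."""
    return 4.0 * math.pi * r**2 * radial_field(r, m, q, eps0)


# Illustrative radii, in the same (arbitrary) length units as 1/m.
radii = [0.5, 1.0, 2.0, 5.0, 10.0]

for r in radii:
    massless = flux_through_sphere(r, m=0.0)
    massive = flux_through_sphere(r, m=1.0)
    print(f"r = {r:5.1f}   flux (m=0) = {massless:.6f}   flux (m=1) = {massive:.6f}")
```

In the massless column the flux stays at $q/\epsilon_0$ for every radius, which is Gauss's law. In the massive column it falls off as $(1 + mr)\, e^{-mr}$, because the $m^2 \phi$ term acts like a distributed screening charge that no surface integral can ignore.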
You may wonder about scalar fields, which do not have gauge invariance but still show $1/r^2$ behavior when massless and Yukawa suppression when massive. In fact there is also a symmetry explanation here. A massless scalar field has a shift symmetry under $\phi \rightarrow \phi + c$, which guarantees that the equations of motion can be written in the form of a total divergence. This total divergence plays the role of Gauss's law. A massive scalar breaks the shift symmetry and correspondingly has a different behavior.
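As a sketch of the static case, take a point source of strength $g$ in units with $\hbar = c = 1$. The symbol $g$ and the sign conventions are my assumptions for illustration.
\begin{equation}
\nabla \cdot (\nabla \phi) = -g\, \delta^3(\vec{x}) \implies |\nabla \phi| = \frac{g}{4\pi r^2}\ \ {\rm (massless)}
\end{equation}
\begin{equation}
\nabla \cdot (\nabla \phi) = m^2 \phi - g\, \delta^3(\vec{x}) \implies \phi = \frac{g\, e^{-mr}}{4\pi r}\ \ {\rm (massive)}
\end{equation}
In the massless case the left-hand side is a pure divergence, so the flux of $\nabla \phi$ through any sphere around the source has the same magnitude $g$, which forces the $1/r^2$ falloff. In the massive case the $m^2 \phi$ term is not a total divergence and changes under $\phi \rightarrow \phi + c$, so the flux is no longer conserved from one sphere to the next.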
A thought struck me while walking. I think this question comes from a similar place as the aether theory of the late 1800s: an attempt to explain features of (what we now call) relativistic field theory with mechanical models. In your case you want to explain the inverse square law with particles spreading out in space; in the aether case people wanted to explain the medium in which electromagnetic waves are supposedly traveling. I think the modern response to both the "intuitive particle" argument and the aether idea is the same: we have found that mechanical models are both unnecessary and often misleading in describing relativistic field theories. What really matters are the equations satisfied by the fields. While these equations can be used to describe mechanical systems (for example, Gauss's law can be used to describe some systems in fluid mechanics), the laws themselves are more general than the specific systems they can be applied to, and there is no need for a mechanical model to underlie these equations when they are applied in field theory. The structure matters more than the bricks out of which it is built.