Why does the temperature remain constant during a change in the state of matter?
It doesn't exactly, as we see below, but it almost does under typical conditions where something called heterogeneous nucleation (that is, easy formation of molecular-scale clusters of the new phase) dominates. This is the case for water boiling right around 100°C at sea level in a standard kettle, for instance.
Two effects arise simultaneously.
First, thermodynamically, one phase becomes the more stable phase at equilibrium.
Second, kinetically, creation of the new phase proceeds in an enormously fast process that moves quickly toward that equilibrium.
Consequently, essentially any overheating is immediately absorbed by bond breaking; conversely, any undercooling is immediately offset by energy released through bond formation. The process is fast—like exponential-function-applied-to-a-squared-term fast, as derived below.
Both factors (thermodynamic and kinetic) are essential. Without the first, the driving is absent to form regions of the new phase. Without the second, the formation process is slow or unpredictable (as in the case of room-temperature diamond decomposing to form more stable graphite, say, or a cup of microwaved water in a smooth, clean container that suddenly erupts partially into steam, or a bottle of ice from the freezer that quickly freezes when tapped).
Everything above repeats the standard explanation; the point of this answer is to present quantitative models, which are less often surveyed.
To address a point in the body of your question:
My counterargument: Intermolecular forces between molecules are either intact or broken. There is no in-between. Therefore, the change from intact to broken is instantaneous.
I don't think this is a great argument, as bonds are forming and breaking all the time in real materials at finite temperatures. Liquids have some fraction of broken bonds compared to the solid state, and the bonds themselves are fluctuating constantly. Vacancies—atomic-scale defects—are themodynamically required at any temperature. This certainly seems like an "in-between" to me.
In any case, the conceptual view can be shored up by replacing forces and bonds with the macroscale Gibbs free energy $G = H - TS$, where $H$ is the enthalpy (a measure of bonding, which Nature prefers), $T$ is the temperature, and $S$ is the entropy (a measure of broad possibilities, which Nature also prefers). Note that close bonding and broad dispersion act in opposition here. Note also that the entropy-containing term includes the temperature, predicting a temperature dependence that ends up being pivotal.
The phase with the lowest Gibbs free energy is always the equilibrium state, as mediated by the tradeoff between enhanced bonding and enhanced entropy. In addition, the higher-entropy phase is always the higher-temperature stable phase. For the typical single-component system discussed here, two phases can have the same $G$ at only one temperature: the phase-change temperature. This is part of why we would see a sudden transition at this temperature.
We can visualize this behavior in the following image. The lowest curve is the most stable. On the left side, the curves are more shallow because entropy $S$ (the negative slope) decreases toward zero with decreasing $T$. But the larger-entropy phases always win out with increasing temperature because of the $-TS$ term. (If the liquid curve were a little higher, we'd have a material such as carbon dioxide that doesn't exhibit an equilibrium liquid phase—at least not at atmospheric pressure—and sublimates directly from the solid to the gas state upon heating.)
Now let's get into why the temperature seems to stay constant during a phase change. It doesn't, in fact, but it is true that a only a very slight amount of undercooling or overheating can be sufficient to trigger a complete rapid transformation from the less stable phase to the more stable phase. The following derivation follows Porter & Easterling, Phase Transformations in Metals and Alloys, but does not assume any particular material class.
Consider slight overheating past one of the transition points in the diagram above. The volumetric driving force for the phase transformation is the reduction in Gibbs free energy, $\Delta G=\Delta H - T\Delta S$, where $\Delta$ compares the state values between the two phases. Close to the intersection, $\Delta H$ is just the latent heat $L$ of the transformation. In addition, we know that $\Delta G =0$ at the transformation temperature $T_t$, so by linear approximation $$\Delta G = L\left(1-\frac{T}{T_t}\right)=\frac{L\Delta T}{T_t}.$$ Thus, the driving force $|\Delta G|$ for transformation is modeled as increasing linearly with increasing overheating or undercooling $\Delta T$.
Now consider the benefits and penalties of nucleating a new phase in the form of a spherical molecular cluster with radius $r$. The benefit is the reduction in Gibbs free energy $\frac{4}{3}\pi r^3\Delta G$. The penalty is the new interface area $4\pi r^2\gamma$, where $\gamma$ is the surface tension of the new phase with respect to the old phase.
Clusters form and dissipate constantly from random particle motion; for just a few molecules, $r^3\ll r^2$ (a sloppy comparison, but apply the relevant coefficients for dimensional consistency). If the overheating/underheating is low, there's no energetic driving force for further growth, and the cluster is likely to just come apart. But a larger temperature excursion alters this balance substantially.
A cluster becomes stable (at radius $r^\star$) when a size increase starts to produce a net energy decrease, i.e., when a maximum exists and the derivative is zero: $$\frac{d}{dr}\left(4\pi r^2\gamma-\frac{4}{3}\pi r^3\Delta G\right)=0.$$ Solving this equation, we find that at the stable radius, the activation energy or energy peak that needs to be overcome is $$\Delta G^\star=\frac{16\pi\gamma^3}{3(\Delta G)^2}.$$ The corresponding expected volumetric nucleation rate $N$ from Arrhenius theory is then $$N=f\exp\left(-\frac{\Delta G^\star}{kT}\right)=f\exp\left(-\frac{16\pi\gamma^3}{3(\Delta G)^2kT}\right)=f\exp\left(-\frac{A\gamma^3}{(\Delta T)^2}\right),$$ where $k$ is Boltzmann's constant and where $f$ is a characteristic (very high) vibration frequency of atomic activity in the material; Porter & Easterling use $f\approx 10^{11}\,\mathrm{Hz}$, for example. To simplify the discussion, take $A$ to be a constant that incorporates everything except two parameters of interest: the surface tension difference $\gamma$ and the overheating/undercooling $\Delta T$.
The surface tension $\gamma$ would be important if every cluster interface routinely separated only the old phase from the new phase. However, $\gamma$ can be much lower if the cluster initiates on a particle of dirt or a crack in a container wall that has a high energy anyway because of unsatisfied bonds. The former case is called homogenous nucleation, and the latter—which we often see in our dirty, defect-filled world—is called heterogenous nucleation.
For small $A\gamma^3$, consider the nucleation rate $N\sim \exp\left(-\frac{A\gamma^3}{(\Delta T)^2}\right)$. We see a tremendously rapid increase in nucleation for slight overheating/undercooling because we're applying the exponential function to a squared temperature difference:
The kinetics of the process are so strong that essentially any overheating is immediately shunted into breaking bonds to form the now-favored higher-temperature phase. (Conversely, any undercooling past the phase transition temperature is immediately inundated with energy released from bond formation to provide the now-favored lower-temperature phase.)
This is ultimately why phase transitions seem to occur entirely at a single well-defined temperature. That's not quite the case, as shown above, but it can be a reasonably good approximation.
Doesn't temperature remain constant during a state change only because other factors like volume or pressure alter?
– Robbie Goodwin Feb 21 '21 at 00:40