Don't worry, I did research in surface plasmons and even then I was more than a year into it before I truly understood, on an intuitive level, how the light gets a 'kick' from the grating. You are correct that it is diffraction at a 90-degree angle to the normal, but there is an easier way to think about it.
You say you've never taken a formal course in optics, so I'll talk a little bit about diffraction gratings in general. You might have come across one before and know that when a beam of light hits it, it is diffracted into several different beams. Transmissive diffraction gratings are what one usually encounters in high school physics, so I'll illustrate one below:

The numbers at the end of each beam are known as the order $\nu$ of that beam. The grating equation is $d(\sin\theta_i + \sin\theta_o) = \nu\lambda$, where $d$ is the distance between lines of the grating, $\lambda$ is the wavelength of the light, $\theta_i$ is the angle of incidence, and $\theta_o$ is the angle of the outgoing beam. In the above illustration, $\theta_i$ is zero.
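If it helps to put numbers on it, here is a small sketch that just solves the grating equation for each order (the 1 µm period, 633 nm wavelength, and normal incidence are values I picked for illustration, not anything from your setup):

```python
import numpy as np

# Illustrative numbers (not from the discussion above): a 1 micron period
# grating and 633 nm light at normal incidence.
d = 1.0e-6           # grating period (m)
wavelength = 633e-9  # wavelength (m)
theta_i = 0.0        # angle of incidence (rad)

# Grating equation: d*(sin(theta_i) + sin(theta_o)) = nu*wavelength
# => sin(theta_o) = nu*wavelength/d - sin(theta_i)
for nu in range(-3, 4):
    s = nu * wavelength / d - np.sin(theta_i)
    if abs(s) <= 1.0:
        # A real outgoing beam exists for this order.
        print(f"order {nu:+d}: theta_o = {np.degrees(np.arcsin(s)):6.1f} deg")
    else:
        # |sin(theta_o)| > 1: no propagating beam at this order.
        print(f"order {nu:+d}: no propagating beam (|sin(theta_o)| > 1)")
```

Orders for which $|\sin\theta_o|$ would exceed 1 don't leave the grating as propagating beams, and pushing an order right up to $|\sin\theta_o| = 1$ is the grazing situation that matters below.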
Next we consider a reflective grating (for example, a piece of metal with 1D periodic grooves), as in the following illustration:

The same mathematics governs this situation as well. You'll notice the $\nu=+2$ order coming very close to grazing the grating surface; adjusting the angle of incidence a little would make it do so. In that case, it would have the required wave vector to launch a surface plasmon, which is the phase-matching condition that you started out with. You get $\beta = k\sin\theta \pm \nu g$ when you convert the grating equation to wave vectors (reciprocal space), which I'll sketch below since it only takes a couple of lines.
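Here is that sketch, glossing over the sign conventions for the angles and orders. Write $k = 2\pi/\lambda$ for the free-space wave number and $g = 2\pi/d$ for the grating (reciprocal lattice) vector. Multiplying the grating equation through by $2\pi/(\lambda d)$ gives

$$k\sin\theta_i + k\sin\theta_o = \nu\,\frac{2\pi}{d} = \nu g,$$

so each diffracted order's in-plane wave vector is the incident in-plane wave vector $k\sin\theta$ shifted by an integer number of grating vectors, i.e. $k\sin\theta \pm \nu g$. Requiring that shifted wave vector to match the surface plasmon propagation constant $\beta$ gives the phase-matching condition

$$\beta = k\sin\theta \pm \nu g.$$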
I suppose you could technically say that the light got a momentum 'kick' from the $\nu=-2$ order being launched in the opposite direction, but thinking of it as the light getting a 'kick' is really misleading in my experience.