The only trick here is getting used to how discrete sums are turned into integrals.
Suppose you let energy be a function of momentum $p$ and position $q$. Then you can rewrite the discrete quantum partition function as
$Z_{quantum}=\sum_{p,q}e^{- \beta E(p,q)},$
where the sum is over each of the $N$ positions and $N$ momenta, and the only challenge is how to find the appropriate constants for the continuum limit.
This is easiest if you take the system to be in a box of length $L$ and volume $V=L^3$. For position, you want to normalize to the size of the box, i.e.
$$\sum_q \rightarrow \frac1V\int d^3q$$
for each particle.
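For the momenta, work with the wave number $k$ (with $p=\hbar k$). Notice that the allowed wave numbers in a box with periodic boundary conditions are $k=2\pi n/L$ in each direction, so neighbouring values are spaced $2\pi/L$ apart. This tells you that the correspondence here is
$\sum_{k} \rightarrow \frac{V}{(2\pi)^3}\int d^3k$ for each particle.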
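Here is a quick numerical check of that correspondence in one dimension, where it reads $\sum_k \rightarrow \frac{L}{2\pi}\int dk$. This is only a sketch under my own assumptions: units with $\hbar=m=\beta=1$, a free-particle energy $E=k^2/2$, and function names (`discrete_sum`, `continuum_integral`) that are purely illustrative.

```python
import math

def discrete_sum(L, n_max=2000):
    """Sum exp(-k^2/2) over the allowed wave numbers k = 2*pi*n/L, n = -n_max..n_max."""
    total = 0.0
    for n in range(-n_max, n_max + 1):
        k = 2.0 * math.pi * n / L
        total += math.exp(-0.5 * k * k)
    return total

def continuum_integral(L):
    """(L / (2*pi)) times the integral of exp(-k^2/2) over all k, which is sqrt(2*pi)."""
    return L / (2.0 * math.pi) * math.sqrt(2.0 * math.pi)

# Compare the two for a small, a medium and a large box.
for L in (1.0, 5.0, 20.0):
    print(L, discrete_sum(L), continuum_integral(L))
```

For a large box the two numbers agree to high accuracy, while for a box that is small compared with the thermal wavelength the level spacing is too coarse and the replacement fails.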
Put it all together and you get
$$Z_{quantum}=\sum_{p,q}e^{- \beta E(p,q)} \rightarrow \left(\frac{V}{(2\pi)^3}\frac1V\right)^N \int \int e^{- \beta E(p,q)}\, d^{3N}q\, d^{3N}k,$$
which, when you substitute $p=\hbar k$ for each of the $3N$ components of $k$ and collect factors, gives you the standard expression.
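Spelling out that last step, using nothing beyond the factors already written above: the prefactor collapses to $(2\pi)^{-3N}$, and each of the $3N$ substitutions $k=p/\hbar$ contributes a factor of $1/\hbar$, so with $h=2\pi\hbar$

$$\left(\frac{V}{(2\pi)^3}\frac1V\right)^N \int\int e^{-\beta E(p,q)}\, d^{3N}q\, d^{3N}k = \frac{1}{(2\pi\hbar)^{3N}}\int\int e^{-\beta E(p,q)}\, d^{3N}q\, d^{3N}p = \frac{1}{h^{3N}}\int\int e^{-\beta E(p,q)}\, d^{3N}q\, d^{3N}p.$$

Any $N!$ for indistinguishable particles is a separate factor and is not included in this expression.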
Note that no special classical approximation was taken here. In fact, classical statistical mechanics is, at least in my view, a misnomer, since you need all sorts of things (the discretization of phase space, Planck's constant, the occasional $N!$ factor to avoid the Gibbs paradox, etc.) that make no sense without quantum physics. When using this to derive something like the ideal gas law, the only real classical assumption you make is that Fermi or Bose statistics can be neglected. (I'll note that this claim seems to be quite disputed in the comments, so I will give the disclaimer that it hinges on my personal and somewhat arbitrary view of what counts as a 'classical' limit and what does not.)
edit: a bit more on the first continuum limit...
Let's take a 1-d discrete system with M sites. Then $\sum_q e^{-\beta E}$ is better written as
$\sum_{i=1}^M e^{-\beta E_i}$, which sums the exponential of energy on each site.
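Suppose that the distance between the sites is $a$. Then $L=Ma$. Furthermore, if you average over the sites (which is what normalizing to the size of the box amounts to), then since $1/M=a/L$,
$\frac1M\sum_{i=1}^M =\frac1L\sum_{i=1}^M a$
You can probably guess what you want to do now: take $a\to0$ while increasing the number of sites such that $L$ is constant. At this point we can rename $a$ as $dx$ and replace our sum with an integral over it, for the identification $\sum_q \rightarrow \frac1L\int dx$ (with the sum understood as the site average),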
which when extended to three dimensions and N particles gives the above result.
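To see the 1-d scaling argument at work numerically, here is a minimal sketch. I am reading the sum over sites as an average over the $M$ sites, and I am assuming a made-up site energy $E(x)=(x-L/2)^2$ with $\beta=1$ and $L=4$; none of these choices come from the argument itself, they are just there to have something concrete to evaluate.

```python
import math

def site_average(M, L=4.0, beta=1.0):
    """(1/M) * sum over M sites of exp(-beta * E_i), with sites at the cell midpoints."""
    a = L / M  # lattice spacing, so that L = M * a
    total = 0.0
    for i in range(M):
        x = (i + 0.5) * a
        energy = (x - 0.5 * L) ** 2  # hypothetical site energy, chosen for illustration
        total += math.exp(-beta * energy)
    return total / M

def continuum_average(L=4.0, beta=1.0):
    """(1/L) * integral from 0 to L of exp(-beta * (x - L/2)^2) dx, in closed form."""
    return math.sqrt(math.pi / beta) * math.erf(math.sqrt(beta) * L / 2.0) / L

# Shrink the spacing a = L/M at fixed L and watch the sum approach the integral.
for M in (4, 16, 64, 1000):
    print(M, site_average(M), continuum_average())
```

As $M$ grows at fixed $L$, the site average settles onto $\frac1L\int dx\, e^{-\beta E(x)}$, which is the identification made above.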
I certainly won't pretend this is rigorous, but at the same time I think that if you think along these lines you should be able to convince yourself that it couldn't be anything else. Scaling arguments like this come up all over the place, both in statistical mechanics and in other areas of physics.
edit2: As Peter rightly points out in the comments, one cannot expand a Hamiltonian simultaneously in the basis of x and p, making it unclear how this classical correspondence should be carried out.
The limit that we are taking is clear enough, I think. In real quantum mechanics, due to noncommutativity, a state cannot be thought of as occupying a point in phase space, but rather as a distribution spread over a region of it. In our limit we are assuming that these phase space volumes are small enough to be taken as points; this is another restatement of the continuum limit above.
However, one might reasonably ask for a prescription for how to expand the wavefunction in a basis that treats position and momentum equally, to take this limit. This can be done. The tool used is the Wigner function:
$W_n(x,p)=\frac{1}{\pi\hbar} \int_{-\infty}^{\infty} \psi_n^*(x+y) \psi_n(x-y)e^{2ipy/\hbar} dy $
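The expectation value of an operator $\hat{A}$ in this formalism is
$\langle \hat{A}\rangle=\int A(x,p) W(x,p)\, dx\, dp$, where $A(x,p)$ is the phase-space function (the Weyl symbol) corresponding to $\hat{A}$.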
So if we think of the partition function as $Z_{quantum}=\mathrm{tr}(e^{- \beta \hat{H}(p,q)})$
with this formalism in mind and take the limit as before, I think this provides a plausible way to think of the relationship between the classical and quantum partition functions.
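As a sanity check on the Wigner function, here is a small numerical sketch for one case where the answer is known in closed form. The assumptions are mine: units with $\hbar=m=\omega=1$, the harmonic-oscillator ground state $\psi(x)=\pi^{-1/4}e^{-x^2/2}$, and the normalization $1/(\pi\hbar)$, for which $\int W\,dx\,dp=1$. With those choices the integral should come out as $W(x,p)=e^{-(x^2+p^2)}/\pi$.

```python
import math

HBAR = 1.0  # assumed units: hbar = m = omega = 1

def psi(x):
    """Harmonic-oscillator ground state in these units (real-valued)."""
    return math.pi ** -0.25 * math.exp(-0.5 * x * x)

def wigner(x, p, y_max=10.0, steps=4000):
    """Trapezoid-rule estimate of (1/(pi*hbar)) * int psi(x+y) psi(x-y) exp(2ipy/hbar) dy.

    psi is real, so psi(x+y)*psi(x-y) is even in y and only the cosine part survives.
    """
    dy = 2.0 * y_max / steps
    total = 0.0
    for j in range(steps + 1):
        y = -y_max + j * dy
        weight = 0.5 if j in (0, steps) else 1.0  # trapezoid end-point weights
        total += weight * psi(x + y) * psi(x - y) * math.cos(2.0 * p * y / HBAR)
    return total * dy / (math.pi * HBAR)

def wigner_exact(x, p):
    """Closed form for the ground state: a Gaussian in both x and p."""
    return math.exp(-(x * x + p * p)) / math.pi

print(wigner(0.3, -0.5), wigner_exact(0.3, -0.5))
```

The result is a smooth, positive blob of area of order $\hbar$ in phase space, which is the "phase space volume" that gets shrunk to a point in the limit discussed above.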