My second answer, which probably isn't what you're looking for, but just in case it is:
I just realized that you might be looking for a mathematically rigorous proof that the entropy of the thermal distribution:
$$ H_\beta = - \sum_i p_i \log p_i \ \ \ \ \mathrm{where} \ \ \ \ p_i = e^{-\beta E_i}/Z $$
is increasing in temperature $T = \beta^{-1}$. Here's a sketch of one.
Lemma 1
For a given average energy $E_\mathrm{ave} = \sum_i p_i E_i$, the thermal distribution with energy $E_\mathrm{ave}$ maximizes the entropy over all distributions with that energy.
Proof sketch:
Use Lagrange multipliers.
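In case it helps, here is a sketch of how the multiplier calculation might go. The multiplier names $\alpha$ and $\beta$ are my own labels for the normalization and energy constraints, and I'm assuming a finite set of energy levels.
$$ \mathcal{L} = -\sum_i p_i \log p_i - \alpha\Big(\sum_i p_i - 1\Big) - \beta\Big(\sum_i p_i E_i - E_\mathrm{ave}\Big) $$
$$ \frac{\partial \mathcal{L}}{\partial p_i} = -\log p_i - 1 - \alpha - \beta E_i = 0 \ \ \ \ \Longrightarrow \ \ \ \ p_i = e^{-1-\alpha}\, e^{-\beta E_i} = e^{-\beta E_i}/Z $$
where normalization fixes $e^{-1-\alpha} = 1/Z$ with $Z = \sum_i e^{-\beta E_i}$. Because the entropy is strictly concave in the $p_i$ and both constraints are linear, this stationary point should be the unique maximum.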
Lemma 2
For the thermal distribution at any energy $E_\mathrm{ave}$ with $0 < \beta < \infty$, you can find a probability distribution with slightly higher average energy $E_\mathrm{ave}+\,\epsilon$ and greater entropy.
Proof sketch:
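Find two energies $E_j < E_k$ and move some probability mass from $p_j$ to $p_k$. This raises the average energy, and since $p_j > p_k$ when $\beta > 0$, it also makes the distribution slightly more uniform, which raises the entropy.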
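A first-order version of this step, as a sketch: move a small amount $\delta > 0$ (my notation) from level $j$ to level $k$, so that $p_j \to p_j - \delta$ and $p_k \to p_k + \delta$. Then
$$ \Delta E_\mathrm{ave} = \delta\,(E_k - E_j) > 0 $$
$$ \frac{dH}{d\delta}\bigg|_{\delta = 0} = \log\frac{p_j}{p_k} = \beta\,(E_k - E_j) > 0 $$
so for small enough $\delta$ the entropy goes up by roughly $\beta\,\Delta E_\mathrm{ave}$. The second line uses $p_i = e^{-\beta E_i}/Z$, and the sign relies on $0 < \beta < \infty$, as the lemma assumes.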
Lemma 3
When we increase the temperature $\beta^{-1}$, we increase the average energy.
Proof sketch:
Show that for any two positive temperatures $\beta^{-1} < {\tilde{\beta}}^{-1}$, there is an energy $E_m$ such that in the thermal distributions, if $E_i < E_m$, then $p_i > \tilde{p}_i$ and if $E_i>E_m$, then $p_i < \tilde{p}_i$.
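This can be shown by straightforward calculation. It means that heating the system moves probability mass from levels below $E_m$ to levels above it, which raises the average energy.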
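One way the calculation might go, assuming at least two distinct energy levels and writing $\tilde{Z}$ for the partition function at $\tilde{\beta}$ (my notation):
$$ \frac{\tilde{p}_i}{p_i} = \frac{Z}{\tilde{Z}}\, e^{(\beta - \tilde{\beta}) E_i} $$
Since $\beta > \tilde{\beta}$, this ratio is strictly increasing in $E_i$. Both distributions sum to 1, so the ratio cannot be above 1 everywhere or below 1 everywhere; it crosses 1 at
$$ E_m = \frac{\log(\tilde{Z}/Z)}{\beta - \tilde{\beta}} $$
Then, using $\sum_i (\tilde{p}_i - p_i) = 0$,
$$ \tilde{E}_\mathrm{ave} - E_\mathrm{ave} = \sum_i (\tilde{p}_i - p_i)\, E_i = \sum_i (\tilde{p}_i - p_i)(E_i - E_m) > 0 $$
because every term in the last sum is non-negative and at least one is strictly positive.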
Now we can prove the theorem. Start with the thermal distribution at average energy $E_1$. By Lemma 2, we can increase the average energy to $E_2 = E_1+\epsilon\,$ and find a distribution with higher entropy. But by Lemma 1, the thermal distribution at average energy $E_2$ has entropy at least as high as this distribution. Thus, we've increased both the energy and the entropy of the thermal distribution. And by Lemma 3, average energy is increasing in temperature, so the thermal distribution at average energy $E_2$ also has higher temperature.
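If you'd like a numerical sanity check rather than a proof, here is a minimal sketch in Python. It assumes NumPy is available; the energy levels are made up for illustration, and the function names are my own. It checks that both the average energy and the entropy go up as the temperature goes up.

```python
import numpy as np

def thermal_distribution(energies, beta):
    """Return p_i = exp(-beta * E_i) / Z for a finite list of energy levels."""
    energies = np.asarray(energies, dtype=float)
    # Shifting by the minimum energy leaves p_i unchanged but avoids overflow.
    weights = np.exp(-beta * (energies - energies.min()))
    return weights / weights.sum()

def entropy(p):
    """Shannon entropy -sum_i p_i log p_i, skipping any zero entries."""
    p = p[p > 0]
    return float(-np.sum(p * np.log(p)))

def average_energy(energies, p):
    """Average energy sum_i p_i E_i."""
    return float(np.dot(p, np.asarray(energies, dtype=float)))

# Hypothetical spectrum, chosen only for illustration.
energies = [0.0, 0.5, 1.3, 2.0, 3.7]
temperatures = np.linspace(0.1, 10.0, 50)

entropies = []
avg_energies = []
for T in temperatures:
    p = thermal_distribution(energies, 1.0 / T)
    entropies.append(entropy(p))
    avg_energies.append(average_energy(energies, p))

# Both sequences should be strictly increasing in temperature.
entropy_increasing = bool(np.all(np.diff(entropies) > 0))
energy_increasing = bool(np.all(np.diff(avg_energies) > 0))
print(entropy_increasing, energy_increasing)
```

This only tests one spectrum on a grid of temperatures, so it illustrates the theorem but doesn't replace the argument above.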