It is well known that the Boltzmann distribution can be derived by applying the maximum entropy principle at thermal equilibrium. In this post, I work in the reverse direction: I first assume some conditions which guarantee the Boltzmann distribution, and then use the Boltzmann distribution to recover the Gibbs entropy. I succeed in reconstructing the Gibbs entropy, but some questions remain unsolved at the end of the post.
Starting Point: The Conditions which Guarantee the Boltzmann Distribution
Given a large heat reservoir and a closed system we are interested in, we assume that after the system has been in contact with the reservoir for a long time,
- The probability $p_i$ of the system being in some state $i$ of energy $E_i$ reaches equilibrium for any $i$.
- There is some "universal" function $f$ such that for any $p_i$ and $p_j$, we have ${p_i \over p_j}=f(E_i-E_j)$. The word "universal" means $f$ works for any system which we put in contact with the reservoir.
- $f$ is continuous.
The above conditions are enough for us to deduce that $f$ is an exponential function.
Proof:
Since for any state $i$, $j$ and $k$, $f(E_i-E_k)={p_i \over p_k}=\big({p_i \over p_j}\big)\big({p_j \over p_k}\big)=f(E_i-E_j)f(E_j-E_k)$, we have $f(x)f(y)=f(x+y)$ for any $x$ and $y$.
Let $n$ be a positive integer. Applying the relation repeatedly gives $f(n)=f(1)^n$ and $f(1)=f({1 \over n})^n$, so $f({1 \over n})=f(1)^{1 \over n}$. Setting $x=y=0$ gives $f(0)^2=f(0)$, so $f(0)$ is $0$ or $1$; we take $f(0)=1$ (this is reasonable since two states with the same energy have the same probability), and then $f(-x)=1/f(x)$ extends the result to negative integers. Combining these, for any rational number ${m \over n} \in \mathbb{Q}$ we have $f({m \over n})=f(1)^{m \over n}$. Since $f$ is continuous, the identity extends from $\mathbb{Q}$ to $\mathbb{R}$: for any $x \in \mathbb{R}$, $f(x)=f(1)^x$. Note that $f(1)>0$ because $f$ is a ratio of probabilities, so we may write $f(x)=e^{-\beta x}$ where $\beta$ is some constant. We take $\beta>0$ since it matches our experience that states with lower energies are more likely to be occupied by the system.
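The functional equation and its exponential solution can be checked numerically. Below is a minimal sketch (the value of $\beta$ and the grid of test points are arbitrary illustrative choices, not from the post): an exponential $f(x)=e^{-\beta x}$ satisfies $f(x)f(y)=f(x+y)$ for every pair of points tested.

```python
import math

# Illustrative inverse temperature (arbitrary choice for this sketch).
beta = 1.7

def f(x):
    """Candidate universal function f(x) = exp(-beta * x)."""
    return math.exp(-beta * x)

# Check the functional equation f(x)f(y) = f(x+y) on a grid of points.
for x in [-2.0, -0.5, 0.0, 1.3]:
    for y in [-1.1, 0.0, 0.7, 2.4]:
        assert math.isclose(f(x) * f(y), f(x + y))

# f(0) = 1, matching the convention chosen in the proof.
assert math.isclose(f(0.0), 1.0)
```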
Remark:
We have derived that $f$ is an exponential function, and the probability distribution of the system is the Boltzmann distribution, $p_i={e^{-\beta E_i} \over Z}$ where $Z=\sum_{i}{e^{-\beta E_i}}$.
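As a quick sanity check of this remark, here is a sketch that builds Boltzmann probabilities for an illustrative spectrum (the energies and $\beta$ are made-up values) and verifies normalization and the ratio law ${p_i \over p_j}=e^{-\beta(E_i-E_j)}$:

```python
import math

beta = 2.0                  # illustrative inverse temperature
E = [0.0, 0.5, 1.2, 3.0]    # illustrative energy levels

# Boltzmann weights and partition function Z = sum_i exp(-beta * E_i).
weights = [math.exp(-beta * e) for e in E]
Z = sum(weights)
p = [w / Z for w in weights]

# Probabilities are normalized.
assert math.isclose(sum(p), 1.0)

# Ratios depend only on the energy difference: p_i/p_j = exp(-beta*(E_i - E_j)).
for i in range(len(E)):
    for j in range(len(E)):
        assert math.isclose(p[i] / p[j], math.exp(-beta * (E[i] - E[j])))
```

With $\beta>0$, the lowest-energy state carries the largest probability, matching the sign convention chosen above.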
Derive the Gibbs Entropy
Suppose we have some function $S(\tilde{p}_1,\tilde{p}_2,\tilde{p}_3,\cdots)$ where $\tilde{p}_i$ is the probability of state $i$ being occupied by the system. Note that we do not require $\tilde{p}_i$ to equal $p_i$; that is, the system may not be at thermal equilibrium here. To construct the Gibbs entropy, we require the function $S$ to have the following property:
Given a fixed quantity $E$ where $E=\sum_{i}{p_iE_i}=\sum_{i}{\tilde{p}_iE_i}$, $S$ must be maximized if and only if $\tilde{p}_i=p_i$ for any $i$.
The property means that, given that the expectation value of the system energy is conserved, the function we construct should be maximized when the probabilities $\tilde{p}_i$ match those at thermal equilibrium. To explore how the maximum is achieved, we take the differential of $S$ $$\delta S = \sum_{i}{{\partial S \over \partial \tilde{p}_i}\delta \tilde{p}_i}$$ and the differential of $E=\sum_{i}{\tilde{p}_iE_i}$ $$0=\sum_{i}{\delta \tilde{p}_i E_i}$$ Since $E_i=-{1 \over \beta}(\log{p_i}+\log{Z})$, and $\sum_{i}{\delta \tilde{p}_i}=0$ (the probabilities add up to a constant, $1$), we have $$\sum_{i}{\log{p_i} \delta \tilde{p}_i} = 0$$ It follows that if ${\partial S \over \partial \tilde{p}_i}\big|_{\tilde{p}_i=p_i}=-k\log{p_i}+c$ where $k>0$ and $c$ are some constants, then $\delta S = 0$ when $\tilde{p}_i=p_i$, so $S$ is stationary there; for this stationary point to be a maximum, $S$ should also be concave. One choice satisfying both requirements is $$S(\tilde{p}_1,\tilde{p}_2,\tilde{p}_3,\cdots) = S_G(\tilde{p}_1,\tilde{p}_2,\tilde{p}_3,\cdots) = -k\sum_{i}{\tilde{p}_i \log{\tilde{p}_i}}$$ Indeed, ${\partial S_G \over \partial \tilde{p}_i}=-k(\log{\tilde{p}_i}+1)$, which at $\tilde{p}_i=p_i$ has the required form with $c=-k$, and $S_G$ is strictly concave. So we have reconstructed the Gibbs entropy $S_G(\tilde{p}_1,\tilde{p}_2,\tilde{p}_3,\cdots) = -k\sum_{i}{\tilde{p}_i \log{\tilde{p}_i}}$.
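The maximization claim can be probed numerically. The sketch below (with illustrative energies, $\beta=1$, and $k=1$, all made-up choices) perturbs the Boltzmann distribution along a direction that preserves both normalization and mean energy, and checks that the Gibbs entropy drops away from equilibrium:

```python
import math

beta, E = 1.0, [0.0, 1.0, 2.0]   # illustrative inverse temperature and spectrum

weights = [math.exp(-beta * e) for e in E]
Z = sum(weights)
p = [w / Z for w in weights]     # equilibrium (Boltzmann) probabilities

def gibbs_entropy(q):
    """S_G with k = 1: -sum_i q_i log q_i."""
    return -sum(qi * math.log(qi) for qi in q)

# The direction (1, -2, 1) is orthogonal to (1, 1, 1) and to E = (0, 1, 2),
# so moving along it keeps sum(p~) = 1 and sum(p~_i E_i) fixed.
d = [1.0, -2.0, 1.0]
S_eq = gibbs_entropy(p)
for eps in [-0.02, -0.01, 0.01, 0.02]:
    q = [pi + eps * di for pi, di in zip(p, d)]
    assert math.isclose(sum(q), 1.0)
    assert math.isclose(sum(qi * e for qi, e in zip(q, E)),
                        sum(pi * e for pi, e in zip(p, E)))
    assert gibbs_entropy(q) < S_eq   # entropy decreases away from equilibrium
```

The orthogonal direction is specific to this three-level spectrum; for a general spectrum one would pick any vector in the null space of the two constraints.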
However, here come the questions:
1. Must the function $S$ be some function of the Gibbs entropy $S_G$ (so that $S=S(S_G)$)?
In our derivation, $S_G$ is just one workable choice for $S$; there are other possibilities. For example, $S=e^{S_G}S_G$ works as well. My instinct tells me the answer should be affirmative. Otherwise, it would imply that at thermal equilibrium there is some quantity physically different from the Gibbs entropy $S_G$ that is also guaranteed to be maximized, and whatever that quantity is, it would be interesting and surprising if it exists.
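The example $S=e^{S_G}S_G$ can be checked with the same perturbation family as before: since $g(s)=e^{s}s$ has $g'(s)=e^{s}(s+1)>0$ for $s\ge 0$, any such increasing function of $S_G$ shares its maximizer. A sketch (energies and $\beta$ again illustrative):

```python
import math

beta, E = 1.0, [0.0, 1.0, 2.0]   # illustrative spectrum
weights = [math.exp(-beta * e) for e in E]
Z = sum(weights)
p = [w / Z for w in weights]     # Boltzmann probabilities

def S_G(q):
    return -sum(qi * math.log(qi) for qi in q)

def S(q):
    """The alternative functional S = exp(S_G) * S_G."""
    s = S_G(q)
    return math.exp(s) * s

# Direction preserving normalization and mean energy for this spectrum.
d = [1.0, -2.0, 1.0]
for eps in [-0.02, -0.01, 0.01, 0.02]:
    q = [pi + eps * di for pi, di in zip(p, d)]
    assert S(q) < S(p)   # the alternative S also peaks at equilibrium
```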
2. Is it possible to reasonably justify the existence of $f$ without using the concept of most probable state and entropy?
At thermal equilibrium, we assume the entropy is maximized since it corresponds to the system being in its most probable macrostate. That justification is reasonable in the probabilistic sense. However, is there any argument justifying the existence of $f$ without invoking the concepts of the most probable state and entropy? (If we use them, the existence of $f$, as the Boltzmann distribution, follows immediately.)