
It is well known that we can derive the Boltzmann distribution by applying the maximum entropy principle at thermal equilibrium. In this post, I am going to work in reverse: I first assume some conditions which guarantee the Boltzmann distribution, and then use the Boltzmann distribution to recover the Gibbs entropy. I am able to reconstruct the Gibbs entropy, but some questions remain unsolved at the end of the post.

Starting Point: The Conditions which Guarantee the Boltzmann Distribution

Given some large heat reservoir and some closed system we are interested in, we assume that, after the system has been in contact with the reservoir for a long time,

  1. The probability $p_i$ of the system being in some state $i$ of energy $E_i$ reaches equilibrium for any $i$.
  2. There is some "universal" function $f$ such that for any $p_i$ and $p_j$, we have ${p_i \over p_j}=f(E_i-E_j)$. The word "universal" means $f$ works for any system which we put in contact with the reservoir.
  3. $f$ is continuous.

The above conditions are enough to deduce that $f$ is an exponential function.

Proof:

Since for any state $i$, $j$ and $k$, $f(E_i-E_k)={p_i \over p_k}=\big({p_i \over p_j}\big)\big({p_j \over p_k}\big)=f(E_i-E_j)f(E_j-E_k)$, we have $f(x)f(y)=f(x+y)$ for any $x$ and $y$.

Let $n$ be some non-zero integer. From the relation $f(x)f(y)=f(x+y)$ we can easily see that $f(n)=f(1)^n$ and $f({1 \over n})=f(1)^{1 \over n}$. For $n=0$, setting $x=y=0$ gives $f(0)^2=f(0)$, so $f(0)$ is either $0$ or $1$; we take $f(0)=1$ rather than $0$ (this is reasonable, since two states with the same energy have the same probability and $f(0)=p_i/p_i=1$). Therefore, we can generalize the result to any rational number ${m \over n} \in \mathbb{Q}$ and obtain $f({m \over n})=f(1)^{m \over n}$. Since $f$ is continuous, the identity extends from $\mathbb{Q}$ to $\mathbb{R}$: for any $x \in \mathbb{R}$, $f(x)=f(1)^x$. Writing $f(1)=e^{-\beta}$, we have $f(x)=e^{-\beta x}$ for some constant $\beta$. We take $\beta>0$ since this matches our experience that states with lower energy are more likely to be occupied by the system.

Remark:

We have derived that $f$ is an exponential function, so the probability distribution of the system is the Boltzmann distribution, $p_i={e^{-\beta E_i} \over Z}$ where $Z=\sum_{i}{e^{-\beta E_i}}$.
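
As a quick sanity check, here is a minimal numerical sketch showing that Boltzmann probabilities indeed satisfy conditions 1-3 with an exponential $f$. The energies and the value of $\beta$ below are arbitrary illustrative choices, not anything specified in the post.

```python
# Minimal sketch: check that p_i = exp(-beta*E_i)/Z satisfies p_i/p_j = f(E_i - E_j)
# with f(x) = exp(-beta*x), i.e. the ratio depends only on the energy difference.
import numpy as np

beta = 1.7                                  # arbitrary positive inverse temperature
E = np.array([0.3, 1.1, 2.4, 2.4, 5.0])     # arbitrary state energies (two degenerate states)

weights = np.exp(-beta * E)
Z = weights.sum()                           # partition function
p = weights / Z                             # Boltzmann probabilities

f = lambda x: np.exp(-beta * x)             # the "universal" exponential function

for i in range(len(E)):
    for j in range(len(E)):
        assert np.isclose(p[i] / p[j], f(E[i] - E[j]))

print(p)   # the two degenerate states get equal probability, consistent with f(0) = 1
```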

Deriving the Gibbs Entropy

Suppose we have some function $S(\tilde{p}_1,\tilde{p}_2,\tilde{p}_3,\cdots)$ where $\tilde{p}_i$ is the probability of some state $i$ being occupied by the system. To be clear, we do not require $\tilde{p}_i$ to be the same as $p_i$; that is, the system may not be at thermal equilibrium here. To construct the Gibbs entropy, we require the function $S$ to have the following property:

Given a fixed quantity $E$ where $E=\sum_{i}{p_iE_i}=\sum_{i}{\tilde{p}_iE_i}$, $S$ must be maximized if and only if $\tilde{p}_i=p_i$ for any $i$.

The property means that, with the expectation value of the system energy held fixed, we want the constructed function to be maximized when the probabilities $\tilde{p}_i$ match those at thermal equilibrium. To explore how the maximum is achieved, we take the differential of $S$, $$\delta S = \sum_{i}{{\partial S \over \partial \tilde{p}_i}\delta \tilde{p}_i}$$ and the differential of $E=\sum_{i}{\tilde{p}_iE_i}$, $$0=\sum_{i}{\delta \tilde{p}_i E_i}$$ Since $E_i=-{1 \over \beta}(\log{p_i}+\log{Z})$ and $\sum_{i}{\delta \tilde{p}_i}=0$ (the probabilities add up to a constant, $1$), this constraint becomes $$\sum_{i}{\log{p_i}\, \delta \tilde{p}_i} = 0$$ It follows that if ${\partial S \over \partial \tilde{p}_i}\big|_{\tilde{p}_i=p_i}=-k\log{p_i}+c$ where $k>0$ and $c$ are constants, then $\delta S=0$ for every allowed variation at $\tilde{p}_i=p_i$, so $S$ is stationary at the equilibrium distribution; for a concave choice of $S$ with $k>0$, this stationary point is the maximum. Therefore, one possible choice of $S$ is $$S(\tilde{p}_1,\tilde{p}_2,\tilde{p}_3,\cdots) = S_G(\tilde{p}_1,\tilde{p}_2,\tilde{p}_3,\cdots) = -k\sum_{i}{\tilde{p}_i \log{\tilde{p}_i}}$$ which is concave and hence maximized at $\tilde{p}_i=p_i$. So we have reconstructed the Gibbs entropy $S_G(\tilde{p}_1,\tilde{p}_2,\tilde{p}_3,\cdots) = -k\sum_{i}{\tilde{p}_i \log{\tilde{p}_i}}$.
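
To make this concrete, here is a minimal numerical sketch for a three-level system (the energies and $\beta$ are arbitrary illustrative choices): among all distributions $\tilde{p}$ with the same normalization and the same mean energy, the functional $-\sum_i \tilde{p}_i \log \tilde{p}_i$ is largest exactly at the Boltzmann distribution.

```python
# Minimal sketch: perturb the Boltzmann distribution of a three-level system along the
# direction that preserves both normalization and mean energy, and watch the Gibbs
# functional -sum(q * log q); it peaks at the unperturbed (t = 0) distribution.
import numpy as np

beta = 0.8
E = np.array([0.0, 1.0, 3.0])               # arbitrary energies of a three-state system

p = np.exp(-beta * E)
p /= p.sum()                                # equilibrium (Boltzmann) probabilities

# Directions v with sum(v) = 0 and sum(v*E) = 0 leave both constraints intact; for three
# states the null space of the constraint matrix [[1,1,1],[E_0,E_1,E_2]] is one-dimensional.
A = np.vstack([np.ones(3), E])
v = np.linalg.svd(A)[2][-1]                 # unit vector spanning that null space

def gibbs(q):
    return -np.sum(q * np.log(q))           # Gibbs entropy with k = 1

for k in range(-5, 6):
    t = 0.01 * k                            # perturbation strength
    q = p + t * v                           # still normalized, same mean energy
    assert np.all(q > 0)
    print(f"t = {t:+.2f}   S/k = {gibbs(q):.6f}")
# The printed S/k is largest at t = 0, i.e. when the trial distribution equals p.
```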

However, this leaves me with two questions:

1. Must the function $S$ be some function of the Gibbs entropy $S_G$ (so that $S=S(S_G)$)?

In our derivation, $S_G$ is just one workable choice for $S$; there are other possible functions. For example, $S=e^{S_G}S_G$ works as well. My instinct tells me that any valid $S$ must be a function of $S_G$. Otherwise, at thermal equilibrium there would be some quantity physically different from the Gibbs entropy $S_G$ that is also guaranteed to be maximized, and whatever that is, it would be interesting and surprising if it did exist.
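
As a small follow-up to the sketch above (same assumed three-level system), one can check numerically that the alternative functional $S=e^{S_G}S_G$ is maximized by the same Boltzmann distribution, since it is an increasing function of $S_G$ for $S_G\ge 0$:

```python
# Minimal sketch: the alternative functional exp(S_G) * S_G is an increasing function of
# S_G (for S_G >= 0), so it peaks at the same constrained maximum as S_G itself.
import numpy as np

beta, E = 0.8, np.array([0.0, 1.0, 3.0])    # same illustrative three-level system as above
p = np.exp(-beta * E)
p /= p.sum()

A = np.vstack([np.ones(3), E])
v = np.linalg.svd(A)[2][-1]                 # constraint-preserving perturbation direction

S_G = lambda q: -np.sum(q * np.log(q))      # Gibbs entropy (k = 1)
S_alt = lambda q: np.exp(S_G(q)) * S_G(q)   # the alternative functional from the text

ts = [0.01 * k for k in range(-5, 6)]
vals = [S_alt(p + t * v) for t in ts]
print("maximized at t =", ts[int(np.argmax(vals))])   # prints 0.0: the Boltzmann point
```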

2. Is it possible to reasonably justify the existence of $f$ without using the concepts of the most probable state and entropy?

At thermal equilibrium, we assume the entropy is maximized since this corresponds to the system being in its most probable state. That justification is reasonable in the probabilistic sense. However, is there any argument justifying the existence of $f$ without using the concepts of most probable state and entropy? (If we do use them, the existence of $f$, and hence the Boltzmann distribution, follows immediately.)

Andy Chen
  • Do any of the answers here help: https://physics.stackexchange.com/questions/717558/what-is-the-intuition-of-the-expected-value-of-the-logarithm-and-entropy – hft Jul 22 '22 at 00:58
  • Thank you, but I have seen this question, and I am quite aware of the concept of entropy in terms of the most probable state or information. My point here is that we usually work out thermal equilibrium through the maximum entropy principle, and it gives us Boltzmann statistics. However, is it possible to work it out the other way around? My answer is yes: with some assumptions which guarantee Boltzmann statistics, we can recover a quantity which is always maximized at the equilibrium probabilities. With no surprise, this is entropy. Then, is it possible to justify our assumptions without entropy? – Andy Chen Jul 22 '22 at 02:37
  • Thermodynamically, I think not. But I don't know for sure. I have to think about it. – hft Jul 22 '22 at 02:58
  • It seems that the OP confuses Gibbs entropy with Boltzmann entropy (see How to derive Shannon Entropy from Clausius Theorem?). In this sense it is not clear whether it asks how to derive Boltzmann entropy from Boltzmann statistics or how to prove the equivalence between the Gibbs and the Boltzmann entropies. – Roger V. Jul 25 '22 at 07:38
  • Thank you, but I am not confused about Gibbs entropy and Boltzmann entropy. I know for sure that Boltzmann entropy is a special case of Gibbs entropy. What I am doing here is to find a way to derive the Gibbs entropy from Boltzmann distribution. – Andy Chen Jul 25 '22 at 08:04
  • As pointed out at the beginning of my post, we commonly know that the Boltzmann distribution of a system can be derived from the Gibbs entropy via the maximum entropy principle, and I wonder whether there is a way to do it in reverse. That is the motivation for what I am doing in my post. – Andy Chen Jul 25 '22 at 08:04
  • @AndyChen I think that you haven't looked at my answers that I linked. – Roger V. Jul 25 '22 at 10:23
  • Yes, I have, and I am even skeptical about your answer that "Gibbs entropy, defined via the Clausius inequality,...". According to Wikipedia (https://en.wikipedia.org/wiki/Entropy_(statistical_thermodynamics)), Gibbs entropy is defined as $-\sum_{j}{p_j\log{p_j}}$, and its equivalence to the Clausius entropy $S_C={U-F \over T}$ is shown in these papers (1. https://aapt.scitation.org/doi/10.1119/1.1971557 2. https://aip.scitation.org/doi/abs/10.1063/1.5111333). – Andy Chen Jul 25 '22 at 12:57
  • Also, as I said, I am not here asking how to derive the Boltzmann entropy from the Gibbs entropy or showing their equivalence. What I am doing is to find some way (also a non-standard-textbook way) to "derive the Gibbs entropy from Boltzmann distribution", and I am able to do it but with some questions generated in the derivation. However, I am truly confused about why you keep thinking I am asking about the Boltzmann entropy. Can you please elaborate your reasoning? – Andy Chen Jul 25 '22 at 13:23
  • What Wikipedia defines as Gibbs entropy is clearly the Shannon entropy, but this is a matter of terminology. My understanding is that you want to start with a Boltzmann distribution and prove that maximizing entropy gives the same values of the quantities that you would obtain by directly employing the distribution. In this respect the second answer that I linked above (in a separate comment) seems to be closely related to your question. – Roger V. Jul 26 '22 at 08:33
  • A caveat: Boltzmann distribution is for canonical ensemble (system coupled to a bath), whereas the entropy maximization is used for closed systems (microcanonical ensemble). – Roger V. Jul 26 '22 at 08:35
  • Another point to make sure that we are on the same page: Boltzmann distribution maximizes Shannon entropy. Essentially, you want the conditions on all the functions that are maximized by Boltzmann distribution? – Roger V. Jul 26 '22 at 12:20
  • Related: https://physics.stackexchange.com/q/263197/247642 – Roger V. Dec 06 '22 at 16:02

1 Answer


Just as there is an infinite number of functions maximized at some point $p$, there is an infinite number of functionals maximized by the Boltzmann distribution. However, only one of these may be called the thermodynamic entropy, and to identify it we need to connect the Boltzmann distribution to thermodynamics.

  1. Start by postulating the connection between $Z$ and free energy $A$ of the canonical ensemble:

$$\frac{A}{kT} = - \ln Z \Rightarrow \frac{\bar E - T S}{k T} = -\ln Z \Rightarrow \boxed{ \frac{S}{k} = \ln Z + \beta \bar E } \tag{1} $$

  2. Use the Boltzmann distribution to write

$$ \ln p_i = -\beta E_i - \ln Z \Rightarrow \beta E_i = -\ln p_i - \ln Z \tag{2} $$

  3. Calculate the mean $\beta \bar E$ over the Boltzmann distribution:

$$ \beta\bar E = \sum_i p_i \beta E_i = \sum_i p_i\left( -\ln p_i - \ln Z \right) = -\sum_i p_i\ln p_i -\ln Z \tag{3} $$

  4. Substitute (3) into (1) to obtain the equality between the thermodynamic entropy and the Gibbs/Shannon functional (a short numerical check of this identity follows the list): $$ \boxed{\frac{S}{k} = -\sum_i p_i\ln p_i} \tag{4}$$

  5. Show that out of all probability distributions $\tilde p_i$ with fixed mean energy $\bar E$, the distribution that maximizes the functional in (4) is the Boltzmann distribution.

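Here is that numerical check: a minimal sketch (with arbitrary illustrative energy levels and $\beta$, nothing taken from the derivation above) confirming that $\ln Z + \beta \bar E$ from (1) agrees with the Gibbs/Shannon functional in (4) when the probabilities are Boltzmann.

```python
# Minimal sketch: for p_i = exp(-beta*E_i)/Z, the thermodynamic entropy S/k = ln Z + beta*<E>
# from eq. (1) coincides with the Gibbs/Shannon functional -sum(p_i * ln p_i) from eq. (4).
import numpy as np

beta = 2.3
E = np.random.default_rng(0).uniform(0.0, 5.0, size=8)   # random illustrative energy levels

w = np.exp(-beta * E)
Z = w.sum()                                 # partition function
p = w / Z                                   # Boltzmann distribution, eq. (2)

E_mean = np.sum(p * E)                      # mean energy over the distribution, eq. (3)
S_thermo = np.log(Z) + beta * E_mean        # eq. (1): S/k = ln Z + beta * <E>
S_gibbs = -np.sum(p * np.log(p))            # eq. (4): Gibbs/Shannon functional

print(S_thermo, S_gibbs, np.isclose(S_thermo, S_gibbs))   # the last value is True
```
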
Conclusions

If the equilibrium distribution is given by the Boltzmann distribution, then:

  • thermodynamic entropy is equal to the Gibbs/Shannon entropy functional evaluated on the Boltzmann distribution; and

  • the Boltzmann distribution maximizes the Gibbs/Shannon functional over all distributions with the same mean energy.

Themis