
Every now and again I hear something about Tsallis entropy, $$ S_q(\{p_i\}) = \frac{1}{q-1}\left( 1- \sum_i p_i^q \right), \tag{1} $$ and I decided to finally get around to investigating it. I haven't got very deep into the literature (I've just lightly skimmed Wikipedia and a few introductory texts), but I'm completely confused about the motivation for its use in statistical physics.

As an entropy-like measure applied to probability distributions, the Tsallis entropy has the property that, for two independent random variables $A$ and $B$, $$ S_q(A, B) = S_q(A) + S_q(B) + (1-q)S_q(A)S_q(B).\tag{2} $$ In the limit as $q$ tends to $1$ the Tsallis entropy becomes the usual Gibbs-Shannon entropy $H$, and we recover the relation $$H(A,B) = H(A) + H(B)\tag{3}$$ for independent $A$ and $B$.
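
As a quick numerical sanity check of Equation $2$ and of the $q \to 1$ limit, here is a minimal sketch. The function name `tsallis_entropy`, the example distributions and the value of $q$ are my own illustrative choices and are not taken from any library or paper.

```python
import numpy as np

def tsallis_entropy(p, q):
    """Tsallis entropy of Equation 1; falls back to Gibbs-Shannon at q = 1."""
    p = np.asarray(p, dtype=float).ravel()
    p = p[p > 0]  # zero-probability states contribute nothing
    if np.isclose(q, 1.0):
        return float(-np.sum(p * np.log(p)))
    return float((1.0 - np.sum(p ** q)) / (q - 1.0))

# Two arbitrary example distributions (assumed purely for illustration).
p_a = np.array([0.5, 0.3, 0.2])
p_b = np.array([0.6, 0.4])
p_ab = np.outer(p_a, p_b)  # joint distribution of independent A and B

q = 1.7
s_a = tsallis_entropy(p_a, q)
s_b = tsallis_entropy(p_b, q)
s_ab = tsallis_entropy(p_ab, q)

# Right-hand side of Equation 2.
pseudo_additive = s_a + s_b + (1.0 - q) * s_a * s_b
print(s_ab, pseudo_additive)

# Near q = 1 the Tsallis entropy should approach the Gibbs-Shannon entropy.
shannon_a = tsallis_entropy(p_a, 1.0)
near_one_a = tsallis_entropy(p_a, 1.0 + 1e-6)
print(shannon_a, near_one_a)
```

The two printed pairs should agree to within floating-point error (first pair) and to within a correction of order $q-1$ (second pair).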

As a mathematical property this is perfectly fine, but the motivation for its use in physics seems completely weird, unless I've fundamentally misunderstood it. From what I've read, the argument seems to be that for strongly interacting systems such as gravitationally-bound ones, we can no longer assume the entropy is extensive (fair enough so far) and so therefore we need an entropy measure that behaves non-extensively for independent sub-systems, as in Equation $2$ above, for an appropriate value of $q$.

The reason this seems weird is the assumption of independence of the two sub-systems. Surely the very reason we can't assume the entropy is extensive is that the sub-systems are strongly coupled, and therefore not independent.

The usual Boltzmann-Gibbs statistical mechanics seems well equipped to deal with such a situation. Consider a system composed of two sub-systems, $A$ and $B$. If sub-system $A$ is in state $i$ and $B$ is in state $j$, let the energy of the system be given by $E_{ij} = E^{(A)}_i + E^{(B)}_j + E^{(\text{interaction})}_{ij}$. For a canonical ensemble we then have $$ p_{ij} = \frac{1}{Z} e^{-\beta E_{ij}} = \frac{1}{Z} e^{-\beta \left(E^{(A)}_i + E^{(B)}_j + E^{(\text{interaction})}_{ij}\right)}. $$ If the values of $E^{(\text{interaction})}_{ij}$ are small compared to those of $E^{(A)}_i$ and $E^{(B)}_j$ then this approximately factorises into $p_{ij} = p_ip_j$, with $p_i$ and $p_j$ also being given by Boltzmann distributions, calculated for $A$ and $B$ independently. However, if $E^{(\text{interaction})}_{ij}$ is large then we can't factorise $p_{ij}$ in this way and we can no longer consider the joint distribution to be the product of two independent distributions.
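
To spell out the factorisation step, consider the idealised limit in which the interaction term vanishes exactly (an assumption I am making only for illustration; $Z_A$, $Z_B$, $p^{(A)}_i$ and $p^{(B)}_j$ are my labels for the single-sub-system partition functions and Boltzmann distributions, written $p_i$ and $p_j$ above). The partition function then splits into a product, $$ Z = \sum_{i,j} e^{-\beta \left(E^{(A)}_i + E^{(B)}_j\right)} = \left(\sum_i e^{-\beta E^{(A)}_i}\right)\left(\sum_j e^{-\beta E^{(B)}_j}\right) = Z_A Z_B, $$ so that $$ p_{ij} = \frac{e^{-\beta E^{(A)}_i}}{Z_A}\,\frac{e^{-\beta E^{(B)}_j}}{Z_B} = p^{(A)}_i\, p^{(B)}_j. $$ If $E^{(\text{interaction})}_{ij}$ is not itself a sum of a part depending only on $i$ and a part depending only on $j$, the factor $e^{-\beta E^{(\text{interaction})}_{ij}}$ stops the double sum from splitting in this way, and the joint distribution is no longer a product.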

Anyone familiar with information theory will know that equation $3$ does not hold for non-independent random variables. The more general relation is $$ H(A,B) = H(A) + H(B) - I(A;B), $$ where $I(A;B)$ is the mutual information, a symmetric measure of the correlation between two variables, which is always non-negative and becomes zero only when $A$ and $B$ are independent. The thermodynamic entropy of a physical system is just the Gibbs-Shannon entropy of a Gibbs ensemble, so if $A$ and $B$ are interpreted as strongly interacting sub-systems then the usual Boltzmann-Gibbs statistical mechanics already tells us that the entropy is not extensive, and the mutual information gets a physical interpretation as the degree of non-extensivity of the thermodynamic entropy.
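
The following sketch illustrates this numerically for a toy canonical ensemble. Everything in it is assumed for illustration: the energy levels, the interaction matrix `e_int`, the value of `beta` and the helper names are arbitrary choices of mine, not taken from any reference.

```python
import numpy as np

def shannon_entropy(p):
    """Gibbs-Shannon entropy (natural log) of a probability array."""
    p = np.asarray(p, dtype=float).ravel()
    p = p[p > 0]
    return float(-np.sum(p * np.log(p)))

def canonical_joint(e_a, e_b, e_int, beta=1.0):
    """Joint Boltzmann distribution p_ij for E_ij = E_i^A + E_j^B + E_ij^int."""
    energy = e_a[:, None] + e_b[None, :] + e_int
    weights = np.exp(-beta * energy)
    return weights / weights.sum()

def mutual_information(p_ab):
    """I(A;B) = sum_ij p_ij log(p_ij / (p_i p_j)), from its definition."""
    p_a = p_ab.sum(axis=1, keepdims=True)
    p_b = p_ab.sum(axis=0, keepdims=True)
    mask = p_ab > 0
    return float(np.sum(p_ab[mask] * np.log(p_ab[mask] / (p_a * p_b)[mask])))

# Toy energy levels and a non-additive interaction term (arbitrary values).
e_a = np.array([0.0, 1.0, 2.0])
e_b = np.array([0.0, 0.5])
e_int = np.array([[0.0, 2.0], [2.0, 0.0], [0.0, 2.0]])

p_free = canonical_joint(e_a, e_b, np.zeros((3, 2)))
p_coupled = canonical_joint(e_a, e_b, e_int)

i_free = mutual_information(p_free)
i_coupled = mutual_information(p_coupled)

# Check H(A,B) = H(A) + H(B) - I(A;B) for the coupled ensemble.
h_ab = shannon_entropy(p_coupled)
h_a = shannon_entropy(p_coupled.sum(axis=1))
h_b = shannon_entropy(p_coupled.sum(axis=0))
print(i_free, i_coupled, h_ab, h_a + h_b - i_coupled)
```

Without the interaction term the mutual information vanishes (up to rounding) and the entropy is additive; with it, the Gibbs-Shannon entropy of the joint ensemble falls short of $H(A) + H(B)$ by exactly $I(A;B)$.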

This seems to leave no room for special "non-extensive" modifications to the entropy formula such as Equation $1$. The Tsallis entropy is non-extensive for independent sub-systems, but it seems the cases where we need a non-extensive entropy are exactly the cases where the sub-systems are not independent, and therefore the Gibbs-Shannon entropy is already non-extensive.

After that long explanation, my questions are: (i) Is the above characterisation of the motivation for Tsallis entropy correct, or are there cases where the parts of a system can be statistically independent and yet we still need a non-extensive entropy? (ii) What is the current consensus on the validity of Tsallis entropy-based approaches to statistical mechanics? I know that it's been the subject of debate in the past, but Wikipedia seems to imply that this is now settled and the idea is now widely accepted. I'd like to know how true this is. Finally, (iii) can the argument I sketched above be found in the literature? I had a quick look at some dissenting opinions about Tsallis entropy, but surprisingly I didn't immediately see the point about mutual information and the non-extensivity of Gibbs-Shannon entropy.

(I'm aware that there's also a more pragmatic justification for using the Tsallis entropy, which is that maximising it tends to lead to "long-tailed" power-law type distributions. I'm less interested in that justification for the sake of this question. Also, I'm aware there are some similar questions on the site already [1,2], but these don't cover the non-extensivity argument I'm concerned with here; the answers only deal with the Rényi entropy.)

N. Virgo
  • A review paper by Tsallis might interest you – Trimok Nov 13 '13 at 10:39
  • I have the same question in my mind, and your arguments make perfect sense to me. It seems the central problem is whether it is possible to make a concrete connection between three things: the composition rule $S_q(A, B) = S_q(A) + S_q(B) + (1-q)S_q(A)S_q(B)$, the interaction factor $e^{-\beta E^{(\text{interaction})}_{ij}}$, and the mutual information $I(A;B)$. I am very curious about that. –  May 05 '14 at 22:02
  • I had similar concerns with Tsallis as you do. So we decided to write something up. See S. Presse, K. Ghosh, J. Lee, K. Dill, "Nonadditive Entropies Yield Probability Distributions with Biases not Warranted by the Data", PRL, 111, 180604, 2013. –  Jan 19 '14 at 05:07
  • First of all, one should be careful when talking about extensivity and additivity - these are two different properties! See for example the nice explanation by H. Touchette: http://arxiv.org/pdf/cond-mat/0201134.pdf The main idea (in the context we are dealing with here) behind Tsallis entropy is that it can become extensive for highly correlated systems. You can read more in this article by C. Tsallis et al. (for quick checking: section 2.2): http://arxiv.org/pdf/cond-mat/0309093v1.pdf – nalewkoz May 28 '15 at 23:07
  • @nalewkoz thank you, your comment was helpful. I'll keep Touchette's distinction in mind in future. From the Tsallis article I got that the motivation is not so much "we need an entropy that fails to be additive for independent systems" as "we need an entropy that is additive for systems that are correlated in some particular way." That does make more sense, but I still don't understand the motivation fully. For the Gibbs-Shannon entropy I know multiple arguments for why one should expect that particular function to be maximised, but for the Tsallis entropy such an argument still seems to be missing. – N. Virgo Jun 13 '15 at 05:11
  • (I suspect such an argument will not come from Tsallis himself. The paper you linked to has this text: "If we knew how to deduce [the Gibbs-Shannon entropy] from first principles for those systems ... whose microscopic dynamics ultimately leads to ergodicity, we could try to generalize along that path. But this procedure is still unknown, the [Shannon formula] being adopted, as we already mentioned, at the level of a postulate." (Emphasis added.) This is just completely wrong - the first derivation is due to Boltzmann, with finer points added by Gibbs, Shannon and Jaynes.) – N. Virgo Jun 13 '15 at 05:13
  • This is a wonderful question, and in my opinion it has not been fully recognized by Tsallis enthusiasts. Although Tsallis statistics is being pushed as a generalization of Boltzmann statistics, this is very far from being a scientific consensus, in spite of what Wikipedia may state. Maybe if all students studied only Tsallis statistics and did not learn anything else, then it might happen (consensus is not the truth). – Marco Antonio Ridenti Dec 01 '20 at 01:46
  • The informational Shannon-Jaynes entropy can describe any system correctly, with the right constraints. The composition law may not be valid for one particle events, but it will always be for the all particle micro states (6N dof) if the system is memoryless. Tsallis statistics (and other entropies) are just shortcuts to fit complicated one particle distribution in correlated systems. This paper (https://iopscience.iop.org/article/10.1238/Physica.Regular.071a00443) shows that Tsallis statistics is not a fundamental concept, but a derived one from incomplete knowledge of the system. – Marco Antonio Ridenti Dec 01 '20 at 01:46

2 Answers


(i) Is the above characterisation of the motivation for Tsallis entropy correct, or are there cases where the parts of a system can be statistically independent and yet we still need a non-extensive entropy?

The one example I can think of that fits this description is a collisionless plasma (well, at least weakly collisional), like the solar wind.

Over scales larger than the Debye length, the system behaves in a collective manner but the collisionless nature of the gas keeps it from reaching equilibrium. Further, even though electromagnetic fields produce long-range interactions, the "parts" of the system (e.g., Debye spheres) can still be statistically independent. This allows a collisionless plasma to behave according to a non-extensive kinetic theory.

(ii) What is the current consensus on the validity of Tsallis entropy-based approaches to statistical mechanics? I know that it's been the subject of debate in the past, but Wikipedia seems to imply that this is now settled and the idea is now widely accepted. I'd like to know how true this is.

I think the validity of Tsallis entropy is generally accepted, at least in space plasma physics [e.g., see Livadiotis, 2015]. The support for a non-Maxwell-Boltzmann theory arose because of the continual observation of velocity distributions (e.g., Maxwellian) that had power-law tails and the lack of observations of Maxwellians. Initial attempts to model these distributions included superpositions of modified Lorentzian distributions (e.g., similar to Cauchy distributions) with Maxwellians [e.g., Feldman et al., 1983; Thomsen et al., 1983]. Later studies [e.g., Maksimovic et al., 1997] resurrected an old form called a kappa distribution, which was originally derived by Vasyliunas [1968]. Eventually, Leubner [2002] showed the connection between the kappa distribution and the Tsallis distribution when $\kappa = -1/\left( q - 1 \right)$, where $q$ is the entropic parameter from Tsallis statistics. (Note that the kappa distribution is a member of the family of modified Lorentzian distributions.)

More recently, a great deal of work has been published that starts to solidify the relationship between kappa distributions, Tsallis statistics and fundamental thermodynamics, and that attempts to merge the more traditional statistical mechanics with non-extensive statistical mechanics [e.g., Livadiotis, 2015; Treumann and Baumjohann, 2014, 2016].

While there is still some hesitation by some in the community, the fact that nearly all particle velocity distributions observed to date in collisionless space plasmas can be modeled by kappa distributions more accurately than Maxwellians is strong support for Tsallis statistics.

Finally, (iii) can the argument I sketched above be found in the literature? I had a quick look at some dissenting opinions about Tsallis entropy, but surprisingly I didn't immediately see the point about mutual information and the non-extensivity of Gibbs-Shannon entropy.

The long-range interactions and the collisionless nature of some plasmas cause these systems to be continually in a state of non-equilibrium. This type of system requires a non-extensive formalism, as Leubner [2002] states:

Any extensive formalism fails whenever a physical system includes long-range forces or long-range memory. In particular, this situation is usually found in astrophysical environments and plasma physics where, for example, the range of interactions is comparable to the size of the system considered. A generalized entropy is required to possess the usual properties of positivity, equiprobability, concavity and irreversibility but suitably extending the standard additivity to nonextensivity...

References

  • Feldman, W.C., et al., "Electron Velocity Distributions Near the Earth's Bow Shock," Journal of Geophysical Research 88(A1), pp. 96--110, doi:10.1029/JA088iA01p00096, 1983.
  • Leubner, M.P. "A Nonextensive Entropy Approach to Kappa-Distributions," Astrophysics and Space Science 282(3), pp. 573--579, doi:10.1023/A:1020990413487, 2002.
  • Livadiotis, G. "Introduction to special section on Origins and Properties of Kappa Distributions: Statistical Background and Properties of Kappa Distributions in Space Plasmas," Journal of Geophysical Research: Space Physics 120(3), pp. 1607--1619, doi:10.1002/2014JA020825, 2015.
  • Maksimovic, M., et al., "Ulysses electron distributions fitted with Kappa functions," Geophysical Research Letters 24(9), pp. 1151--1154, doi:10.1029/97GL00992, 1997.
  • Thomsen, M.F., et al., "Stability of Electron Distributions Within the Earth's Bow Shock," Journal of Geophysical Research 88(A4), pp. 3035--3045, doi:10.1029/JA088iA04p03035, 1983.
  • Treumann, R.A. and W. Baumjohann "Beyond Gibbs-Boltzmann-Shannon: general entropies—the Gibbs-Lorentzian example," Frontiers in Physics 2(49), pp. 1--5, doi:10.3389/fphy.2014.00049, 2014.
  • Treumann, R.A. and W. Baumjohann "Generalised partition functions: inferences on phase space distributions," Annales Geophysicae 34(6), pp. 557--564, doi:10.5194/angeo-34-557-2016, 2016.
  • Vasyliunas, V.M. "A survey of low-energy electrons in the evening sector of the magnetosphere with OGO 1 and OGO 3," Journal of Geophysical Research 73(9), pp. 2839--2884, doi:10.1029/JA073i009p02839, 1968.
  • Thanks for this answer - it's helpful, +1 - but for the most part it doesn't really address the argument in my question. Rather, it's mostly concerned with what I already mentioned in the footnote at the end of my question, i.e. the post-hoc justification based on good matches to empirical distributions. – N. Virgo Jul 30 '17 at 02:36
  • In particular, the quote from Leubner is a good example of exactly what I was talking about in the question. It asserts, correctly, that a nonextensive formalism is required when there are long-range interactions, but then jumps, incorrectly, to the conclusion that this necessitates a non-extensive entropy function. The usual Boltzmann-Gibbs-Shannon entropy will already be nonextensive when there are long-range interactions, so on the face of it a special non-extensive entropy function doesn't seem necessary. – N. Virgo Jul 30 '17 at 02:39
  • Don't get me wrong - the fit to empirical data is remarkable and really does suggest there is something in all of this. I'm critical of the theoretical justification because I want to understand the concept on a deeper level, not because I want to criticise it arbitrarily. (I asked the question 5 years ago but I've recently become interested in it again, so your answer is well timed.) – N. Virgo Jul 30 '17 at 02:50
  • Ah, I see what you are getting at. Yes, I actually found the discussions in many of the original works somewhat confusing and/or lacking too. I was surprised, for instance, that when trying to motivate the need for non-extensive entropy in many of the papers they did not even explain what normal/extensive entropy entailed and why it failed in collisionless plasmas. Okay, I will have to think about this more... – honeste_vivere Jul 30 '17 at 18:28
  • In principle, for a collisionless plasma, the traditional approach would be to use some sort of kinetic equation, such as the Vlasov equation or the Fokker-Planck equation. However, these may not capture the long-range correlations in detail, and an even more general kinetic equation would be needed, which would probably be very hard to solve. In some way, Tsallis statistics encapsulates such difficulties and gives a good fit with the right choice of q. – Marco Antonio Ridenti Dec 01 '20 at 01:09
  • Just another note: in the case of collisional plasma such as weakly ionized plasmas, for which I have some experience, the kinetic Boltzmann equation must be solved to give the correct electron energy distribution function EEDF. Depending on the cross section, these EEDFs may be full of ripples. It can in no way be described by Tsallis statistics. Maybe for this reason, Tsallis statistics went more or less unnoticed in our community. – Marco Antonio Ridenti Dec 01 '20 at 01:27

I do not think you can use your expression from the usual Boltzmann-Gibbs statistical mechanics when your system is a self-gravitating one, such as a star, unless you use an infinite number of interactions instead of just two. In that case you need to use the continuous expression. And the results they found on stellar polytropes are quite interesting. I remember a few years ago when they were lobbying in favor of Tsallis for the Nobel prize! (I think that was too much, but that is just an anecdote.)