
Given a finite Markov chain, how do I find the topological entropy $h_T$? Furthermore, I should compare it with the Shannon entropy $h_S$ and show that $h_T\leq h_S$. Is this a general fact?

This actually comes from an exercise, the specific Markov chain used there is defined by: $$\Omega=\{1,2,3\}$$ with transition probabilities $$p_{1\rightarrow 1}=1-q \quad p_{1\rightarrow 2}=p_{1\rightarrow 3}=\frac{q}{2}$$ $$p_{2\rightarrow 2}=1-q\quad p_{2\rightarrow 3}=q$$ $$p_{3\rightarrow 1}=1$$

Drebin J.

1 Answer


I strongly recommend two references:

  • Young's paper (e-print), an illuminating conceptual review of entropy in dynamical systems;
  • Pekoske's Master's thesis, for its examples with step-by-step calculations.
    (Beware of the typo on page 36, though: $h_\mu(\sigma)\approx .075489$ should read $h_\mu(\sigma)\approx 0.32451$.)

Topological Entropy $h_T$

how do I find the topological entropy $h_T$?

From a compatible topological Markov shift.

The given transition probabilities correspond to the (left) transition matrix: $$ P = \begin{pmatrix} 1-q & 0 & 1\\ q/2 & 1-q & 0\\ q/2 & q & 0 \end{pmatrix}. $$

We obtain a topological Markov shift (i.e., a subshift of finite type) from the original Markov chain by considering the adjacency matrix $A$ compatible with $P$ (see the posts here and this talk): $$ A = \begin{pmatrix} 1 & 0 & 1\\ 1 & 1 & 0\\ 1 & 1 & 0 \end{pmatrix}. $$

The topological entropy of the topological Markov shift is given (see also this book (e-print)) by the logarithm of the spectral radius (i.e., largest eigenvalue modulus, $|\lambda|$) of the adjacency matrix: $$h_T = \log_2|\lambda|.$$

For $A$ above we have $\lambda=2$ and thus $$h_T = 1.$$
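
As a quick numerical cross-check (a sketch in Python with NumPy; the variable names are mine, not part of the exercise):

```python
import numpy as np

# Adjacency matrix compatible with P (1 where the transition probability is nonzero)
A = np.array([[1, 0, 1],
              [1, 1, 0],
              [1, 1, 0]])

# Spectral radius = largest eigenvalue modulus of A
spectral_radius = max(abs(np.linalg.eigvals(A)))

h_T = np.log2(spectral_radius)
print(h_T)  # ≈ 1.0
```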

Kolmogorov-Sinai (KS) Entropy $h_{KS}$

If $\pi$ is the steady state of the Markov chain given by $P$, then

$$ h_{KS} = - \sum_{i,j} \pi_j P_{ij} \log_2 P_{ij}, $$

where the sum is weighted by $\pi_j$ because the column index $j$ labels the source state in the left transition matrix $P$.

The steady state vector $\pi$ of the Markov chain is the eigenvector associated with the unit eigenvalue of $P$, normalized so that $\sum\pi_i=1$ (see these examples). For $P$ above we have

$$ \pi = \frac{1}{2q+3}\begin{pmatrix} 2\\ 1\\ 2q \end{pmatrix}. $$

That yields, e.g.:

$$ h_{KS}(q=1/2) \approx 1,$$ $$ h_{KS}(q=1/5) \approx 0.75464.$$
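
Again as a cross-check only (a Python/NumPy sketch with my own helper names): the steady state can be read off from the eigenvector of $P$ for eigenvalue $1$ and plugged into the formula above.

```python
import numpy as np

def stationary_distribution(P):
    """Steady state pi: eigenvector of the left transition matrix P for
    eigenvalue 1, normalized so that its entries sum to 1."""
    eigvals, eigvecs = np.linalg.eig(P)
    k = np.argmin(abs(eigvals - 1.0))   # index of the eigenvalue closest to 1
    pi = np.real(eigvecs[:, k])
    return pi / pi.sum()

def kolmogorov_sinai_entropy(P):
    """h_KS = -sum_{i,j} pi_j P_ij log2 P_ij, with 0 log 0 taken as 0."""
    pi = stationary_distribution(P)
    with np.errstate(divide='ignore', invalid='ignore'):
        terms = np.where(P > 0, P * np.log2(P), 0.0)
    return -np.sum(terms * pi[np.newaxis, :])   # weight each column j by pi_j

for q in (1/2, 1/5):
    P = np.array([[1 - q, 0,     1],
                  [q / 2, 1 - q, 0],
                  [q / 2, q,     0]])
    print(q, kolmogorov_sinai_entropy(P))   # ≈ 1.0 and ≈ 0.75464
```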

Shannon Entropy $h_S$

show that $h_T\le h_S$. Is this a general fact?

Yes, I think so.

The Shannon entropy $h_S$ depends on the chosen partition and can be made arbitrarily large by refining it, so I'd expect the strict inequality (1) $h_T < h_S$ to be the typical case. Also, it's possible to have (i) $h_S=h_{KS}$ (e.g., for a Bernoulli shift), and by the variational principle we have (ii) $h_T \ge h_{KS}$: from (i) and (ii) we see that the equality (2) $h_T = h_S$ can also hold; finally, combining (1) and (2), we get $h_T\le h_S$.

In terms of probabilities $p_i$, we can write

$$ h_S = - \sum_{i} p_{i} \log_2 p_{i}. $$

Calculating the Shannon entropy using $P$'s steady state probabilities, $\pi_i$, we obtain, e.g.:

$$ h_{S}(q=1/2) \approx 1.5.$$ $$ h_{S}(q=1/5) \approx 1.3328.$$
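
And the corresponding numerical check (same caveat: a Python/NumPy sketch, using the closed form for $\pi$ given above):

```python
import numpy as np

def shannon_entropy(p):
    """Shannon entropy (in bits) of a probability vector p."""
    p = np.asarray(p, dtype=float)
    p = p[p > 0]
    return -np.sum(p * np.log2(p))

for q in (1/2, 1/5):
    pi = np.array([2, 1, 2 * q]) / (2 * q + 3)   # steady state in closed form (see above)
    print(q, shannon_entropy(pi))                # ≈ 1.5 and ≈ 1.3328
```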

stafusa
  • Nice answer! Can I ask what's the prescription for writing the transition matrix? (and why?) Naively, I'd write down $p_{ij} = p_{i\to j}$, which is triangular with largest eigenvalue 1, thus giving zero entropy – John Donne May 06 '18 at 18:45
  • @JohnDonne, Thanks. The largest eigenvalue you need for $h_T$ is that of the adjacency matrix $A$, not of the transition matrix $P$. About how to write $P$, it's a matter of convention: the one you mention is the right transition matrix, the most usual in mathematics (see here), and then the stochastic vectors are rows; for dealing with matrices I'm more used to column vectors, which I'd guess are more common in physics, and that corresponds to a left transition matrix $P_{ij}=p_{i\leftarrow j}$. – stafusa May 06 '18 at 20:52
  • Yes, of course, my mistake. I copied down the wrong transition matrix. Thanks! – John Donne May 06 '18 at 21:38