Regarding the Kullback information, I'm trying to prove that:

$$K[p_1,p_2] = \int dx \ p_1(x,t) \ \ln\frac{p_1(x,t)}{p_2(x,t)}$$

is non-negative, where $p_1(x,t)$ and $p_2(x,t)$ are any two probability densities. To this end, I've introduced the function:

$$g(x) = x-1-\ln(x)$$

for which $g(x) \geq 0$ holds for all $x > 0$ (since $g$ is convex with $g'(1) = 0$ and $g(1) = 0$). I now want to rewrite the functional $K[p_1,p_2]$ as follows:

$$\int dx \ p_1(x,t) \ \ln[R(x,t)] = \int dx \ p_1(x,t) \ g\left[\frac{1}{R(x,t)}\right]$$

If I insert the function $R(x,t) = \frac{p_1(x,t)}{p_2(x,t)}$ into the right-hand side of the equation, I get:

$$\int dx \ p_1(x,t) \ g\left[\frac{1}{R(x,t)}\right] = \int dx \ p_1 \left(\frac{p_2}{p_1}-1-\ln\frac{p_2}{p_1}\right) = \int dx \left(p_2-p_1+p_1 \ \ln\frac{p_1}{p_2}\right)$$

and this is the point where I'm not sure how to proceed in order to obtain the equality of the two sides.

The only case in which the left- and right-hand sides appear to be equal is $p_1(x,t) = p_2(x,t)$, where both sides are zero.
So I wanted to ask if anyone could give me a hint as to what I'm missing here.

1 Answer

The introduction of $g$ is a bit strange as the additional linear part needlessly complicates the calculations. The classic proof is to use the convexity of $-\ln$:

$$ K = \int p_1 \left(-\ln \frac{p_2}{p_1}\right) \geq -\ln \left( \int p_1 \frac{p_2}{p_1}\right) = -\ln \left( \int p_2 \right) = -\ln 1 = 0 $$

The positivity and normalization of $p_1$ are used for Jensen's inequality, the positivity of both distributions is used so that the $\ln$ is well defined, and the normalization of $p_2$ is used for calculating the lower bound.
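
As a quick numerical sanity check of the inequality (just a sketch with two arbitrary, hypothetical densities, not part of the argument above), one can discretize two normalized densities on a grid and verify that $K$ comes out non-negative:

```python
import numpy as np

# Hypothetical example densities: two Gaussians with different means and
# variances; any positive, normalized p1 and p2 would do.
x = np.linspace(-10.0, 10.0, 4001)
dx = x[1] - x[0]

p1 = np.exp(-0.5 * x**2)
p1 /= p1.sum() * dx                      # normalize numerically
p2 = np.exp(-0.5 * (x - 1.0)**2 / 4.0)
p2 /= p2.sum() * dx

# K[p1, p2] = integral of p1 * ln(p1/p2) dx, approximated by a Riemann sum
K = np.sum(p1 * np.log(p1 / p2)) * dx
print(K)  # non-negative; (numerically) zero only when p1 == p2
```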

By the way, the arguments are pretty general and can be applied to other divergences built from convex functions.
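
For instance, with one common convention for such divergences, any convex $f$ with $f(1) = 0$ gives, by the same Jensen step,

$$ D_f[p_1,p_2] = \int dx \ p_1 \, f\!\left(\frac{p_2}{p_1}\right) \geq f\!\left(\int dx \ p_2\right) = f(1) = 0, $$

and the choice $f(u) = -\ln u$ recovers $K[p_1,p_2]$.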

Hope this helps.

LPZ