6

Given a probability p \in (0,1) and a parameter \alpha \in (0,1), is there an entropy-based proof which yields a good upper bound for the sum \sum_{\ell = 0}^{\alpha n} \binom{n}{\ell}p^\ell(1-p)^{n-\ell} when n is large?

When p = 1/2, there is a very simple proof (for example, see section 3.1 of this paper) which upper bounds the above quantity by 2^{(H(\alpha) - 1)n} for \alpha \le 1/2, where H(\cdot) denotes the binary entropy function.
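For concreteness, here is a quick numerical check of that bound (just an illustration; the values n = 200, \alpha = 0.25 are arbitrary):

```python
# Illustrative check of the p = 1/2 entropy bound: the exact tail sum
# sum_{l <= alpha*n} C(n, l) * 2^{-n} should be at most 2^{(H(alpha) - 1) n}
# (this requires alpha <= 1/2).
from math import comb, log2

def binary_entropy(x):
    """Binary entropy H(x) in bits, for x in (0, 1)."""
    return -x * log2(x) - (1 - x) * log2(1 - x)

n, alpha = 200, 0.25          # arbitrary illustrative values
k = round(alpha * n)          # alpha * n is assumed (close to) an integer here
tail = sum(comb(n, l) for l in range(k + 1)) * 2.0 ** (-n)
bound = 2.0 ** ((binary_entropy(alpha) - 1) * n)
print(tail, bound, tail <= bound)
```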

Is there a proof using similar techniques which gives a bound for the more general sum above (which can be interpreted as the CDF of a binomial distribution with parameters n and p, evaluated at \alpha n)?

I'd also be interested in other proofs for bounds on the above sum. The appropriate bound has already been noted in this answer, but that answer doesn't sketch out a proof establishing the result.

Naysh
  • 455

3 Answers

10

By the Chernoff–Hoeffding theorem, the sum in question is \le\exp(-nD(a||p)) for a\le p, where a:=\alpha and D(a||p):=a\ln\frac ap+(1-a)\ln\frac{1-a}{1-p}, the Kullback–Leibler divergence between the distributions of Bernoulli random variables with parameters a and p. In particular, this gives your bound for p=1/2.
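To spell out how this recovers the p=1/2 case (with H denoting binary entropy in bits): D(a||1/2)=a\ln(2a)+(1-a)\ln(2(1-a))=\ln 2+a\ln a+(1-a)\ln(1-a)=(\ln 2)\,(1-H(a)), so \exp(-nD(a||1/2))=2^{-(1-H(a))n}=2^{(H(a)-1)n}.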

On the other hand, if a>p then, by the law of large numbers, the sum in question converges to 1 as n\to\infty. This convergence is actually exponentially fast -- because, if a>p, then 1-(\text{your sum}) is \le\exp(-nD(a||p)) and D(a||p)>0.
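As a rough numerical illustration of the a>p regime (a sketch with arbitrary parameters, not part of the argument):

```python
# Illustrative check: for a > p, the complement 1 - Pr[X <= a*n] of the sum
# should be at most exp(-n * D(a || p)), where D is the Kullback-Leibler divergence.
from math import comb, log, exp

def kl(a, p):
    """D(a || p) between Bernoulli(a) and Bernoulli(p)."""
    return a * log(a / p) + (1 - a) * log((1 - a) / (1 - p))

n, p, a = 500, 0.4, 0.5       # arbitrary values with a > p
k = round(a * n)
lower_tail = sum(comb(n, l) * p**l * (1 - p)**(n - l) for l in range(k + 1))
print(1 - lower_tail, exp(-n * kl(a, p)))   # complement vs. the Chernoff-Hoeffding bound
```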

Iosif Pinelis
  • 116,648
7

Yes, if \alpha<p (if \alpha>p, the sum is almost 1). To see this, write \sum_{\ell = 0}^{\alpha n} \binom{n}{\ell}p^\ell(1-p)^{n-\ell}\leqslant t^{-\alpha n}(pt+(1-p))^n for every t\in (0,1], since t^{\ell-\alpha n}\geqslant 1 whenever \ell\leqslant \alpha n. Choose the positive t=t_0 for which the RHS is as small as possible: setting the logarithmic derivative equal to 0 gives -\alpha/t_0+p/(pt_0+1-p)=0, i.e. -\alpha p t_0-\alpha(1-p)+pt_0=0, i.e. pt_0(1-\alpha)=\alpha(1-p), so t_0=\frac{\alpha(1-p)}{p(1-\alpha)}. We see that if \alpha\leqslant p, this t_0 is indeed in (0,1], and substituting it gives the upper bound \theta^n, where \theta=\frac{p^\alpha(1-p)^{1-\alpha}}{\alpha^\alpha (1-\alpha)^{1-\alpha}}.
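(Taking logarithms, -\ln\theta=\alpha\ln\frac{\alpha}{p}+(1-\alpha)\ln\frac{1-\alpha}{1-p}=D(\alpha||p), so \theta^n is exactly the Chernoff–Hoeffding bound e^{-nD(\alpha||p)} appearing in the other answer.)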

Fedor Petrov
  • 102,548
4

As you said, the sum is \Pr[X \leq \alpha n] where X is drawn from a Binomial distribution with n trials having p probability of success. Bounds on this sum (for \alpha < p) are called "tail bounds", "concentration inequalities", etc. These bounds are proven for many settings, especially sums of independent random variables, of which Binomials are the nicest special case.

A typical proof approach, which some of us call the Chernoff method, looks like this: if X = \sum_{i=1}^n Y_i where the Y_i are i.i.d. Bernoulli(p) random variables, then

\begin{align} \Pr[X \leq k] &= \Pr[e^{-X} \geq e^{-k}] \\ &\leq e^{k} \mathbb{E} e^{-X} & \text{Markov's inequality} \\ &= e^{k} \prod_{i=1}^n \mathbb{E} e^{-Y_i} & \text{Independence} \\ &= e^{k} \left(pe^{-1} + (1-p)\right)^n \\ &\dots \end{align} etc. I omitted a detail -- we scale both X and k by some factor \lambda > 0, which we choose later -- and stopped the analysis early, but that's how many proofs start.
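To make the omitted \lambda-optimization concrete, here is a small sketch (illustrative parameters, a grid search rather than the analytic optimum) of the resulting lower-tail bound \Pr[X\le k]\le \min_{\lambda>0} e^{\lambda k}(pe^{-\lambda}+1-p)^n, compared against the exact probability:

```python
# Sketch of the Chernoff method for the lower tail of Binomial(n, p):
#   Pr[X <= k] <= exp(lambda*k) * (p*exp(-lambda) + 1 - p)**n   for every lambda > 0,
# minimized here by a simple grid search and compared with the exact tail probability.
from math import comb, exp, log

def exact_lower_tail(n, p, k):
    """Exact Pr[X <= k] for X ~ Binomial(n, p)."""
    return sum(comb(n, l) * p**l * (1 - p)**(n - l) for l in range(k + 1))

def chernoff_lower_tail(n, p, k, grid=10_000):
    """Best bound over a grid of lambda in (0, 10]; work in log space to avoid overflow."""
    best_log = 0.0                     # lambda -> 0 gives the trivial bound 1
    for i in range(1, grid + 1):
        lam = 10.0 * i / grid
        log_bound = lam * k + n * log(p * exp(-lam) + 1 - p)
        best_log = min(best_log, log_bound)
    return exp(best_log)

n, p, alpha = 300, 0.6, 0.4            # illustrative values with alpha < p
k = round(alpha * n)
print(exact_lower_tail(n, p, k))       # exact probability
print(chernoff_lower_tail(n, p, k))    # Chernoff bound; should be >= the exact value
```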

Starting points:

usul
  • 4,429