Suppose we have a probability per time $\lambda$ that something (e.g. a nuclear decay, a step of a random walk) happens. It is a known result that the probability that $n$ events happen in a time interval of length $T$ is given by the Poisson distribution $$P(n) = \frac{e^{-\lambda T} (\lambda T)^n}{n!} \, .$$ How do we prove this?
-
Ha - I saw this question and I thought "weird, I would have expected Daniel to know the answer to this". Then I scrolled down... – Floris Dec 04 '17 at 23:13
-
@Floris well it took me a while to get help on the integral... – DanielSank Dec 04 '17 at 23:25
-
I wrote an alternative that doesn't explicitly need integrals – Floris Dec 04 '17 at 23:54
-
@Floris yes, and I like the explicit relation to the binomial distribution, but I'm not sure that approach directly answers this question. Perhaps you can argue that many little time slices are like many low-probability attempts... – DanielSank Dec 04 '17 at 23:56
-
Yes - that's exactly what I argue. In the limit of infinitely many time slices with the product $pN$ remaining constant, you are effectively making the transition from discrete to continuous - without ever appearing to integrate. – Floris Dec 04 '17 at 23:57
-
@Floris your post doesn't actually make that argument... – DanielSank Dec 04 '17 at 23:58
-
True. It didn't. Now it does. – Floris Dec 05 '17 at 00:03
3 Answers
Probability distribution of time until next event
First we calculate the probability density for the time at which the first event happens. Divide $t$ into $N$ small intervals, each of length $dt = t/N$. Defining $\lambda$ as the probability per unit time that the event occurs, the probability that no event occurs within any one short time interval is approximately $(1 - \lambda \, dt)$. Therefore, the probability that no event happens in any of the intervals, but the event then does happen within an interval of length $dt$ right at the end, is $$\left( \prod_{i=1}^N \left( 1 - \lambda \, dt \right) \right) \lambda \, dt = \left( 1 - \frac{\lambda t}{N} \right)^N \lambda \, dt \stackrel{N\rightarrow \infty}{=} \lambda \, dt \exp \left( - \lambda t \right) \, .$$ In other words, given a starting time $0$, the probability density that no event has happened up to time $t$, but one then happens right at $t$, is $\lambda \exp(-\lambda t)$.
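As a quick sanity check of the limit taken above, here is a short Python sketch (my addition, with illustrative values of $\lambda$ and $t$) verifying that $(1 - \lambda t/N)^N$ approaches $e^{-\lambda t}$ as $N$ grows:

```python
# Numerical check that (1 - lam*t/N)**N -> exp(-lam*t) as N -> infinity.
# The rate lam and time t below are illustrative, not from the post.
import math

lam, t = 2.0, 1.5   # rate (1/time) and elapsed time
for N in (10, 100, 10_000, 1_000_000):
    approx = (1 - lam * t / N) ** N
    print(f"N={N:>9}: {approx:.6f}  vs  exp(-lam*t) = {math.exp(-lam * t):.6f}")
```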
Probability of multiple events
Now we ask for the probability of getting $n$ events in the time interval $T$. Suppose the first event happens at $t_1$, the second event happens at $t_2$, etc. We therefore have a series of intervals $$\{[0, t_1], [t_1, t_2], \ldots, [t_n, T] \} $$ with an event happening at the end of each interval except the last, in which nothing happens between $t_n$ and $T$. The probability density that our events occur in this way is $$P(t_1, t_2, \ldots t_n) = \lambda e^{-\lambda t_1} \, \lambda e^{-\lambda (t_2 - t_1)} \cdots \lambda e^{-\lambda (t_n - t_{n-1})} \, e^{-\lambda(T - t_n)} = \lambda^n e^{-\lambda T} \, ,$$ where the final factor carries no $\lambda$ because no event occurs after $t_n$. Of course, any arrangement of $\{t_1, t_2, \ldots t_n \}$ such that $t_1 < t_2 < \ldots < t_n$ counts as an $n$-event arrangement, so we have to add up the probabilities of all these possible arrangements, i.e. the probability of $n$ events is \begin{align} P(n \text{ events}) &= \int_0^T dt_1 \int_{t_1}^T dt_2 \cdots \int_{t_{n-1}}^T dt_n P(t_1, t_2, \ldots t_n) \\ &= \lambda^n \exp(-\lambda T) \int_0^T dt_1 \int_{t_1}^T dt_2 \cdots \int_{t_{n-1}}^T dt_n \, . \end{align} The multiple integral is the volume of a right simplex and has value $T^n/n!$; one way to see this is that the cube $[0,T]^n$ splits into $n!$ congruent simplices, one for each ordering of the $t_i$. The final result is $$P(n\text{ events}) = \frac{(\lambda T)^n \exp(-\lambda T)}{n!} \, ,$$ which is the Poisson distribution with mean $\lambda T$.
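The whole derivation can also be checked by simulation. The sketch below (not part of the original answer; $\lambda$, $T$, and the trial count are illustrative) draws exponential waiting times with density $\lambda e^{-\lambda t}$, counts how many events land in $[0, T]$, and compares the empirical frequencies with the Poisson formula:

```python
# Simulate a Poisson process via exponential inter-arrival times and
# compare the empirical count distribution with the Poisson pmf.
import numpy as np
from math import exp, factorial

rng = np.random.default_rng(0)
lam, T, trials = 2.0, 3.0, 100_000   # illustrative rate, window, sample size

counts = np.empty(trials, dtype=int)
for i in range(trials):
    t, n = 0.0, 0
    while True:
        t += rng.exponential(1.0 / lam)   # waiting time density lam*exp(-lam*t)
        if t > T:
            break
        n += 1
    counts[i] = n

for n in range(8):
    simulated = np.mean(counts == n)
    poisson = exp(-lam * T) * (lam * T) ** n / factorial(n)
    print(f"n={n}: simulated {simulated:.4f}, Poisson {poisson:.4f}")
```

The two columns should agree up to Monte Carlo noise of order $1/\sqrt{\text{trials}}$.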
-
In your first derivation, should $\lambda = 1$ produce a probability density of 1 everywhere? If the probability density of the event happening at $t=t'$, and not before, is $\lambda e^{-\lambda t'} dt$, then $\lambda = 1 \implies PDF(t)=e^{-t} dt$, and for any $t'$ we get $ p(t') = \int_{0}^{t'} PDF(t) = -e^{-t'} + 1$, not $1$. [mod edit to fix MathJax - DZ] – D. W. Dec 05 '17 at 00:38
-
@D.W. I don't understand this comment. $\lambda$ has dimensions of 1/time, so it can't have value 1. Could you clarify? – DanielSank Dec 05 '17 at 19:14
-
Forgive me, I'm not criticising your answer, or suggesting you're wrong; I'm just trying to understand your derivation for myself. Isn't $\lambda$ the probability per unit time that something happens? What if something is guaranteed to happen? Wouldn't that correspond to a value of $\lambda = 1$ (the probability per unit time that the event happens is 1)? – D. W. Dec 05 '17 at 19:17
-
@D.W. I know you're not criticizing. No problem. In fact, the case you're describing where the event always happens even for infinitesimally small times would correspond to $\lambda \to \infty$. You can plug that into the various equations at any point and see how the results turn out. Does that help? – DanielSank Dec 05 '17 at 19:36
-
Yes, thank you. I'm still a little confused as to how an event having a probability one of happening at all times corresponds to a probability per unit time that approaches infinity, but that's for me to ponder. – D. W. Dec 05 '17 at 19:43
-
@D.W. Well, think about what "probability per time" means. It means that, for small enough times $dt$, the probability that a thing happens is $\lambda dt$. For any fixed value of $\lambda$, we can make $dt$ small enough such that $\lambda dt \to 0$. Therefore, the only way to make sure something always happens for arbitrarily small $dt$ is to make $\lambda$ really big, i.e. infinite. – DanielSank Dec 05 '17 at 19:51
-
Excellent answer, but could you modify it to explain what lambda is before you use it? I'm still a little confused at the equation that it first appears in. – user1717828 Apr 20 '18 at 00:33
-
@user1717828 see edited first paragraph, and please let me know via comment if it's ok now. – DanielSank Apr 20 '18 at 01:10
-
I've accepted my own answer because it had the highest number of votes. The other answers are excellent as well so check them out. I especially like leonbloy's answer. – DanielSank Jun 26 '22 at 16:53
The Poisson distribution describes the probability of a certain number ($n$) of unlikely events ($p\ll 1$) happening given $N$ opportunities.
This is like tossing a very unfair coin $N$ times, with probability $p$ that the coin turns up heads on each toss. The number of heads then follows the binomial distribution:
$$P(n|p,N) = ~^{N}C_n~p^n (1-p)^{N-n} =\frac{N!}{(N-n)!~n!} p^n (1-p)^{N-n}$$
Now it remains to prove that when $N\rightarrow \infty$ and $p\rightarrow 0$ while $Np\rightarrow \lambda T$, the above converges to the known result. In essence, I argue that when you make the number of opportunities tend to infinity, you go from a discrete to a continuous approach; but as long as you are careful with your infinities, the result should still be valid.
First, we find an approximation for $(1-p)^{N-n}$. Taking the log, we get
$$\log\left((1-p)^{N-n}\right) = (N-n)\log(1-p)\approx (N-n)\cdot (-p)$$
Since $N\gg n$, we obtain $(1-p)^{N-n}\approx e^{-Np}$.
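A quick numeric check of this step (my addition; the values of $N$, $n$, and $p$ are illustrative):

```python
# Check that (1 - p)**(N - n) is close to exp(-N*p) when p << 1 and n << N.
import math

N, n, p = 100_000, 5, 3.0 / 100_000   # chosen so that N*p = 3
print((1 - p) ** (N - n))   # exact factor
print(math.exp(-N * p))     # approximation e^{-Np}
```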
Next, we approximate the $~^N C_n$ term using Stirling's approximation, $\log N! \approx N\log N - N$, and noting that $n\ll N$.
Then
$$\begin{align} \log\left(\frac{N!}{(N-n)!}\right) &= N\log N - N - (N-n)\log(N-n) + (N-n) \\ &=N\log N - (N-n)\log(N-n) - n\\ &= N \log N -(N-n)\left(\log(N)+\log\left(1-\frac{n}{N}\right)\right) - n\\ &\approx N\log N -(N-n)\left(\log(N)-\frac{n}{N}\right) - n\\ &= n\log N + n - \frac{n^2}{N} - n\\ &\approx n\log N\end{align}$$
It follows that $\frac{N!}{(N-n)!\, n!} \approx \frac{N^n}{n!}$.
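Again, a small check of this approximation (my addition, with illustrative $N$ and $n$):

```python
# For n << N, the falling factorial N!/(N-n)! is close to N**n,
# so the binomial coefficient is approximately N**n / n!.
import math

N, n = 100_000, 5
falling = math.perm(N, n)   # exactly N!/(N-n)!
print(falling / N**n)       # ratio should be close to 1
```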
Finally, we note that $pN = \lambda T$, and we get
$$P(n|N,p) \approx \frac{N^n p^n e^{-Np}}{n!} \, .$$
Substituting $Np = \lambda T$, this rearranges to
$$P(n) = \frac{(\lambda T)^n e^{-\lambda T}}{n!} \, ,$$
which is the result we set out to prove.
I made use of this article to remind me of some of the steps in this.
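To see the whole limit at work, here is a short sketch (my addition; the values of $\lambda T$ and $n$ are illustrative) that holds $Np = \lambda T$ fixed and watches the binomial pmf approach the Poisson pmf as $N$ grows:

```python
# Binomial(N, p) pmf with N*p = lamT held fixed, compared to Poisson(lamT).
import math

lamT, n = 3.0, 4   # illustrative mean and event count
poisson = math.exp(-lamT) * lamT**n / math.factorial(n)
for N in (10, 100, 10_000, 1_000_000):
    p = lamT / N
    binom = math.comb(N, n) * p**n * (1 - p) ** (N - n)
    print(f"N={N:>9}: binomial {binom:.6f}  vs  Poisson {poisson:.6f}")
```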

-
Minor nitpick: it's actually Stirling's approximation, not Sterling's – Selene Routley Dec 05 '17 at 08:06
-
Actually, don't worry if you think that's bad, I spelt Max Planck's name for years the same as the piece of wood or the title of the short film by Eric Sykes. And I know German pretty well (or well enough for much of my communications for my work). I was 50 before I was corrected. That's not the worst of it. I had spent several month-long or longer visits to the Max Planck Institute before I caught on. – Selene Routley Dec 05 '17 at 08:12
-
I like this answer. It would be nice to explain why $pN = \lambda T$ is the right way to get from the limit of the discrete binomial distribution to the continuous case. Otherwise it feels a bit hand-wavy. – DanielSank Aug 25 '20 at 02:54
Let $A^n_t$ be the event: exactly $n$ point events happened over a time interval $t$. Then, for small $\Delta t$
$$\begin{align}P(A^n_{t+\Delta t}) &= P( A^n_t \cap A^0_{\Delta t }) + P(A^{n-1}_t \cap A^1_{\Delta t }) \\ &= P(A^n_t) P (A^0_{\Delta t }) + P(A^{n-1}_t )P( A^1_{\Delta t })\\ \end{align} $$ where we've used independence of occurrence in disjoint time intervals (and neglected the probability of two or more events in $\Delta t$, which vanishes faster than $\Delta t$). Now, defining $p_n(t) \equiv P( A^n_t)$ and $\lambda = \lim_{\Delta t\to 0} p_1(\Delta t)/\Delta t$, for small $\Delta t$ we get
$$ p_n(t+\Delta t)=p_n(t)(1 - \lambda \, \Delta t) + p_{n-1}(t) \lambda \, \Delta t $$
which, in the limit $\Delta t \to 0$, leads to the differential equations:
$$p'_n(t)= \begin{cases} -\lambda ( p_n(t) - p_{n-1}(t) ) & n>0\\ -\lambda p_n(t) & n=0 \end{cases}$$
with the initial conditions $$p_n(0)=\begin{cases}0 & n>0\\1 & n=0 \, .\end{cases}$$
The solution, which can be verified by direct substitution (or built up one $n$ at a time with an integrating factor), is the Poisson distribution:
$$p_n(t)=\frac{(\lambda t)^n \exp(-\lambda t)}{n!} \, .$$
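One can also check this numerically by truncating the system at some $n_{\max}$ and integrating; the sketch below (my addition, using scipy and illustrative values of $\lambda$ and $t$) does exactly that and compares with the Poisson pmf:

```python
# Integrate the truncated system p'_0 = -lam*p_0, p'_n = -lam*(p_n - p_{n-1})
# from p_n(0) = delta_{n,0}, then compare with the Poisson pmf.
import math
import numpy as np
from scipy.integrate import solve_ivp

lam, t_end, nmax = 2.0, 3.0, 30   # illustrative rate, time, and truncation

def rhs(t, p):
    dp = np.empty_like(p)
    dp[0] = -lam * p[0]
    dp[1:] = -lam * (p[1:] - p[:-1])
    return dp

p0 = np.zeros(nmax + 1)
p0[0] = 1.0   # all probability starts in n = 0
sol = solve_ivp(rhs, (0.0, t_end), p0, rtol=1e-10, atol=1e-12)

for n in range(6):
    exact = math.exp(-lam * t_end) * (lam * t_end) ** n / math.factorial(n)
    print(f"n={n}: ODE {sol.y[n, -1]:.6f}  vs  Poisson {exact:.6f}")
```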

-
I've learnt two new ways to prove this today, aside from Floris's answer, which was the one I already knew :). Most elegant! – Selene Routley Dec 05 '17 at 07:53