
$\newcommand\R{\mathbb R}\newcommand\la\lambda$It is well known and easy to see that a rotationally invariant product of two probability measures on $\R$ has to be a product of Gaussian (or Dirac) measures; see e.g. this answer.

This appears to make the conjecture below somewhat plausible.

Let $\mu$ be any probability measure on $\R^2$ with a finite nonzero covariance matrix. Let $\mu_1:=\mu$. For each natural $n$, consider the following three-step procedure:

  1. let $$\la_n:=\mu_n^{(1)}\otimes\mu_n^{(2)},$$ where $\mu_n^{(1)}$ and $\mu_n^{(2)}$ are the marginals of $\mu_n$;
  2. let $$\nu_n:=\frac1{2\pi}\int_0^{2\pi}\la_n R_t\,dt,$$ where $\la_n R_t$ is the pushforward measure obtained from $\la_n$ by the rotation about the origin through angle $-t$;
  3. let $\mu_{n+1}$ be obtained by rescaling the probability measure $\nu_n$ so that the covariance matrix of $\mu_{n+1}$ is the identity matrix.

Conjecture: $\mu_n$ converges weakly (as $n\to\infty$) to the standard Gaussian measure on $\R^2$.

Is this conjecture true?

Comment: Perhaps Step 3, the rescaling, is not essential. Of course, if we have the convergence to a (nondegenerate) Gaussian measure without Step 3, then we have such a convergence with Step 3 as well.


We can restate the problem (without rescaling) analytically as follows. Let $f_n$ denote the characteristic function of $\nu_n$, so that $$f_n(u,v)=\int_{\R^2}\nu_n(dx\times dy)e^{i(ux+vy)} \\ =\int_{\R^2}\nu_n(dx\times dy)\cos(ux+vy)\quad \text{(by symmetry)}$$ for all real $u$ and $v$. By the rotational invariance of $\nu_n$, $$f_n(u,v)=g_n\big(\sqrt{u^2+v^2}\big)$$ for some function $g_n\colon[0,\infty)\to\R$ and all real $u$ and $v$. Then for all natural $n$ and all real $r\ge0$ $$g_{n+1}(r)=\frac2\pi\int_0^{\pi/2} dt\,g_n(r\cos t)g_n(r\sin t). \tag{1}$$ We want to show that $g_n(r)\to e^{-c^2 r^2/2}$ for some $c\in(0,\infty)$ and all real $r\ge0$.

So, analytically, the problem may be viewed as one of stability of a (nonlinear) integral equation or as one of solving such an integral equation by iterations.


Using substitutions $g_n(r)=h_n(r^2)$ and $r^2=s$, we can rewrite (1) as $$h_{n+1}(s)=\frac1\pi\int_0^s du\,\frac{h_n(s-u)}{\sqrt{s-u}}\frac{h_n(u)}{\sqrt{u}}$$ and then as $$\pi H_{n+1}(s)\sqrt s=(H_n*H_n)(s)=\int_0^s du\,H_n(s-u)H_n(u)$$ for all natural $n$ and all real $s\ge0$, where $H_n(u):=h_n(u)/\sqrt u$. We want to show that $H_n(u)\to e^{-c^2 u/2}/\sqrt u$ for some $c\in(0,\infty)$ and all real $u>0$.
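As a consistency check, every function of the form $H(u)=e^{-cu}/\sqrt u$ is indeed a fixed point of this rescaled convolution: substituting $u=sv$ and using the Beta integral $B(\tfrac12,\tfrac12)=\pi$, $$ (H*H)(s)=\int_0^s du\,\frac{e^{-c(s-u)}}{\sqrt{s-u}}\,\frac{e^{-cu}}{\sqrt u} =e^{-cs}\int_0^1\frac{dv}{\sqrt{v(1-v)}}=\pi e^{-cs}=\pi H(s)\sqrt s\,. $$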

Iosif Pinelis
  • Related https://mathoverflow.net/questions/191791/gaussian-distributions-as-fixed-points-in-some-distribution-space/191825#191825 – Abdelmalek Abdesselam May 31 '21 at 14:46
  • @AbdelmalekAbdesselam : Thank you for this reference. Both settings indeed involve iterations. I think that other setting is a case of the central limit theorem for iid summands and $2^k$ summands, for natural $k$. Here the setting involves a different kind of iterations. But presumably/hopefully in this setting too one has normality in the limit. – Iosif Pinelis May 31 '21 at 15:03
  • Of course the iterations are different, but the methods used for one might work for the other. It would take me some time to work it out, but the renormalization group inspired strategy I would use for your questions is: 1) write $g$ as the Gaussian times $(1+h)$ and write the iteration for the function $h$, 2) write the linearization of the transformation at the fixed point $h=0$, 3) see if you can diagonalize it explicitly using suitable orthogonal polynomials for the integral over $t$. Ultimately, one would need a Lyapunov function like some kind of entropy. Does Stein's method help? – Abdelmalek Abdesselam May 31 '21 at 15:11
  • @AbdelmalekAbdesselam : Thank you for your suggestions. I see your point better now, and will have these suggestions in mind. – Iosif Pinelis May 31 '21 at 15:50
  • I just did a quick computation which gives support to your conjecture. Functions $h_k(r)=r^k$ are eigenfunctions of the linearization (no need for orthogonal polynomials like Hermite etc.) with eigenvalues $c_k=(4/\pi)\times W_k$ in terms of Wallis integrals in the notations of https://en.wikipedia.org/wiki/Wallis%27_integrals The only expanding/relevant directions are for $k=0,1$, while $k=2$ is neutral/marginal. Finally for $k>2$, the corresponding directions are contracting. This is exactly the same as in the RG link I mentioned. One can mimic Koralov-Sinai and make this into a proof... – Abdelmalek Abdesselam May 31 '21 at 16:05
  • ...in the near Gaussian case. Given that the situation is simpler than for the RG, I would expect it would not require as much ingenuity to find a Lyapunov function that would allow a proof in the global (far from Gaussian) case. BTW a better link than above is https://mathoverflow.net/questions/182752/central-limit-theorem-via-maximal-entropy – Abdelmalek Abdesselam May 31 '21 at 16:05
  • @AbdelmalekAbdesselam : Thank you for these further ideas. Meanwhile, I have done some simple rewriting of the iteration equation, now almost in a convolution form. – Iosif Pinelis May 31 '21 at 16:22

1 Answer


In the rewritten form it doesn't require any ingenuity at all. For brevity, introduce the notation $$ (Th)(s)=\frac 1\pi\int_0^s\frac{h(s-u)h(u)}{\sqrt{u(s-u)}}\,du=\frac 1\pi\int_0^1\frac{h(s(1-v))h(sv)}{\sqrt{v(1-v)}}\,dv\,. $$ Suppose that $h_0$ is any real-valued Lipschitz function on $[0,+\infty)$ (the finiteness of the second moment guarantees the Lipschitz property for your $h_0$ and, if you really want to discuss it, I can show how this particular condition can be dropped) such that $\|h_0\|_\infty=h_0(0)=1$ and $h_0'(0)$ exists and equals $-c<0$ (the latter two conditions are essential). Then the iterations $h_{n+1}=Th_n$ converge to $e^{-cs}$ uniformly on compact subsets of $[0,+\infty)$.

The proof consists of several easy observations:

  1. $\|Th\|_\infty\le\|h\|_\infty^2$ and, thereby, $\|h_n\|_\infty\le 1$ for all $n$.

  2. If $h$ is bounded and $L$-Lipschitz, then $Th$ is $L\|h\|_\infty$-Lipschitz.

Indeed, $$ Th(s)-Th(S)=\frac1\pi\int_0^1\frac{dv}{\sqrt{v(1-v)}}[h(s(1-v))h(sv)-h(S(1-v))h(Sv)] $$ and $$ |h(s(1-v))h(sv)-h(S(1-v))h(Sv)| \\ \le\|h\|_\infty[|h(s(1-v))-h(S(1-v))|+|h(sv)-h(Sv)|] \\ \le\|h\|_\infty L[|s-S|(1-v+v)]=\|h\|_\infty L|s-S|\,. $$ Thus all $h_n$ are Lipschitz with the same Lipschitz constant $L$ as $h_0$.

  3. If $A\in\mathbb R$ and $h(s)\ge e^{-As}$ on $[0,s_0]$, then $(Th)(s)\ge e^{-As}$ on $[0,s_0]$.

  3'. If $a\in\mathbb R$ and $0\le h(s)\le e^{-as}$ on $[0,s_0]$, then $(Th)(s)\le e^{-as}$ on $[0,s_0]$.

(All exponential functions $s\mapsto e^{-As}$ are fixed points of $T$, and $T$ is monotone as long as $h$ stays non-negative.)

  4. Now choose any $0<b<a<c<A<B$ and choose $s_0>0$ so that $e^{-As}\le h_0(s)\le e^{-as}$ on $[0,s_0]$ (such an $s_0$ exists by the condition on the derivative at zero).

Consider the largest $S_n$ such that $e^{-Bs}\le h_n(s)\le e^{-bs}$ on $[0,S_n]$. We have $S_0\ge s_0$ and $S_{n+1}\ge S_n$. We want to improve the latter trivial inequality to some quantitative advance $S_{n+1}\ge S_n+\delta(S_n)$, where $\delta>0$ is separated from $0$ on any compact subinterval of $[s_0,+\infty)$. That is quite easy: $$ h_{n+1}(S_n)=(Th_n)(S_n)=\frac1\pi\int_0^{S_n}\frac{h_n(S_n-u)h_n(u)}{\sqrt{u(S_n-u)}}\,du\ge \\ \frac1\pi\int_0^{S_n}\frac{e^{-B(S_n-u)}e^{-Bu}}{\sqrt{u(S_n-u)}}\,du +\frac1\pi\int_0^{s_0}\frac{e^{-B(S_n-u)}[e^{-Au}-e^{-Bu}]}{\sqrt{u(S_n-u)}}\,du \\ \ge e^{-BS_n}+\frac{e^{-BS_n}}{\pi S_n}\int_0^{s_0}[e^{-Au}-e^{-Bu}]\,du=e^{-BS_n}+\Delta(S_n)\,. $$ Then, by the Lipschitz property of both $h_{n+1}$ and $e^{-Bs}$, the inequality $h_{n+1}(s)\ge e^{-Bs}$ persists on $[S_n,S_n+\delta(S_n)]$ with $\delta(S_n)=\frac{\Delta(S_n)}{B+L}$, say. The extension of the upper bound is similar.

The outcome is that the double inequality $e^{-Bs}\le h_n(s)\le e^{-bs}$ propagates from $[0,s_0]$ to the entire half-line $[0,+\infty)$. Since $0<b<a<c<A<B$ were arbitrary, we conclude that $h_n$ tend to $e^{-cs}$ uniformly on compact intervals, finishing the story.
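This convergence is easy to watch numerically. Below is a sketch of such a check (my own illustration, with ad hoc grid sizes): iterate $T$ starting from the Lipschitz function $h_0(s)=1/(1+s)$, which satisfies $\|h_0\|_\infty=h_0(0)=1$ and $h_0'(0)=-1$, so the claimed limit is $e^{-s}$. The substitution $v=\sin^2\theta$ removes the endpoint singularities of the integrand.

```python
import numpy as np

# Iterate T from h0(s) = 1/(1+s), so c = -h0'(0) = 1, and compare with
# the claimed limit e^{-s}.  The substitution v = sin^2(theta) turns
#   (Th)(s) = (1/pi) Int_0^1 h(s(1-v)) h(sv) / sqrt(v(1-v)) dv
# into the singularity-free form
#   (Th)(s) = (2/pi) Int_0^{pi/2} h(s cos^2 th) h(s sin^2 th) dth.

s = np.linspace(0.0, 10.0, 1001)
theta = np.linspace(0.0, np.pi / 2, 201)
dth = theta[1] - theta[0]
cos2, sin2 = np.cos(theta) ** 2, np.sin(theta) ** 2

def T(h_vals):
    """One application of T, using linear interpolation in s."""
    h = lambda x: np.interp(x, s, h_vals)
    vals = h(np.outer(s, cos2)) * h(np.outer(s, sin2))
    integral = dth * (vals[:, :-1] + vals[:, 1:]).sum(axis=1) / 2
    return (2 / np.pi) * integral

h_vals = 1.0 / (1.0 + s)       # Lipschitz, h0(0) = 1, h0'(0) = -1
for _ in range(40):
    h_vals = T(h_vals)

# (Th)(s) only uses h on [0, s], so the error on a compact interval is
# unaffected by the slower convergence in the tail; check on [0, 3].
err = np.max(np.abs(h_vals - np.exp(-s))[s <= 3.0])
print(f"max deviation from exp(-s) on [0, 3]: {err:.2e}")
```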

fedja
  • Thank you for your answer. The solution is indeed easy when it's done. :-) I have fixed some typos. – Iosif Pinelis Jan 17 '23 at 02:09
  • Do you want to try the problem at https://mathoverflow.net/q/438390/36721 ? – Iosif Pinelis Jan 17 '23 at 02:11
  • @IosifPinelis Yes, if you or somebody else translates it into the language I can understand (I know neither what copula is, nor what the Frechet-Hoeffding upper bound refers to and google search is definitely not my favorite thing) – fedja Jan 17 '23 at 02:16
  • A copula is the joint cdf of two uniformly distributed random variables. The Frechet--Hoeffding upper bound is, I think, the copula $C(u,v)=\min(u,v)$ for $u,v$ in $[0,1]$. – Iosif Pinelis Jan 17 '23 at 03:32
  • I meant to say, a copula is the joint cdf of two random variables each uniformly distributed on $[0,1]$. – Iosif Pinelis Jan 17 '23 at 04:08
  • @IosifPinelis Then either I misunderstand something, or it is trivial. Let me know which one is the case (I posted an answer in that thread). – fedja Jan 17 '23 at 13:50
  • In my view, neither is the case. :-) I made a corresponding comment on your "copula" answer there. – Iosif Pinelis Jan 17 '23 at 16:16
  • @IosifPinelis I'm inclined to stick to my claim, but thank you very much for fixing the stupid typos anyway! Anything else you want me to look at? BTW, you have one habit that I do mind: when you retype somebody else's question and then answer it yourself, you don't create any distinction between this situation and the situation when you ask something you are interested in, so finding the actual questions of yours in that stream is quite an exercise. Some relevant note in the question title would certainly help ;-) – fedja Jan 17 '23 at 17:58
  • "you don't create any distinction between this situation and the situation when you ask something you are interested in" -- I am not sure what you mean here. Whenever I ask a question (even if previously asked by someone else and then deleted), I am interested in it, to some degree. Also, whenever I ask a question and can answer it, I would answer it. Also, when I revive a question previously asked by someone else, I always try to make the origin of the question quite clear. So, I am not sure what you mean by "you don't create any distinction". – Iosif Pinelis Jan 17 '23 at 18:21
  • @IosifPinelis I mean that when I click on your profile and use "all questions" option to get the multipage list of question titles alone (not the actual posts, there you do make a clear distinction but I need to click on the titles one by one to get there) I cannot tell at a glance which is which. – fedja Jan 17 '23 at 20:44
  • I see now. I will try to remember this and, when suitable, indicate the kind of source in the title. – Iosif Pinelis Jan 17 '23 at 21:08