426

While playing around with Mathematica I noticed that most polynomials with real coefficients seem to have most complex zeroes very near the unit circle. For instance, if we plot all the roots of a polynomial of degree 300 with coefficients chosen randomly from the interval $[27, 42]$, we get something like this:

[Figure: plot of the roots in the complex plane]

The Mathematica code to produce the picture was:

randomPoly[n_, x_, {a_, b_}] := 
  x^Range[0, n] . Table[RandomReal[{a, b}], {n + 1}];
Graphics[Point[{Re[x], Im[x]}] /. 
  NSolve[randomPoly[300, x, {27, 42}], x], Axes -> True]

If I try other intervals and other degrees, the picture is always mostly the same: almost all roots are close to the unit circle.
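To put a number on "close to the unit circle", here is a minimal sketch that reuses the same construction and reports the fraction of roots whose modulus lies within a tolerance of $1$. The helper names `rootModuli` and `fractionNearCircle` and the tolerance `0.05` are made up for illustration.

```mathematica
(* Same construction as randomPoly above, repeated so the snippet runs on its own. *)
randomPoly[n_, x_, {a_, b_}] :=
  x^Range[0, n] . Table[RandomReal[{a, b}], {n + 1}];

(* Hypothetical helper: moduli of all roots of one random sample. *)
rootModuli[n_, {a_, b_}] := Module[{x},
  Abs[x /. NSolve[randomPoly[n, x, {a, b}] == 0, x]]];

(* Hypothetical helper: fraction of roots with 1 - eps < |z| < 1 + eps. *)
fractionNearCircle[n_, {a_, b_}, eps_] :=
  Module[{r = rootModuli[n, {a, b}]},
   N[Count[r, m_ /; Abs[m - 1] < eps]/Length[r]]];

fractionNearCircle[300, {27, 42}, 0.05]
```

Varying the degree, the interval and the tolerance gives a rough quantitative version of the picture.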

Question: why does this happen?

Andrej Bauer
  • 31
    That's a great question. Have you taken a look at the pictures created by Dan Christensen (http://jdc.math.uwo.ca/roots/) and the stuff John Baez has written about it (http://math.ucr.edu/home/baez/roots/)? See also http://johncarlosbaez.wordpress.com/2011/12/11/the-beauty-of-roots/ and Jordan Ellenberg here: http://quomodocumque.wordpress.com/2010/01/09/what-do-roots-of-random-polynomials-look-like/ – Tom Leinster Oct 02 '14 at 21:24
  • @TomLeinster: yeah I know about those. I am considering computing one more degree than Dan, so up to 25. It would take about 3 days using several computers. – Andrej Bauer Oct 02 '14 at 21:56
  • 4
    For polynomials of the form $x^2+bx+c$, the absolute values of the non-real roots depend only on $c$, so that as $b$ varies, we get lots of roots all on a circle. Is this the same phenomenon writ small, and does it shed any light? – Steven Landsburg Oct 02 '14 at 22:05
  • @Steven, the absolute values $\xi, \eta$ of the two roots (regardless of whether they are real or not) travel on the positive-quadrant branch of the hyperbola $\{(\xi, \eta) : \xi\cdot\eta = |c|\}$ when $b$ varies. (Your comment was strange to me.) – Włodzimierz Holsztyński Oct 03 '14 at 01:59
  • @WłodzimierzHolsztyński: Well, yes, but when the roots are non-real, they "travel" in the degenerate sense that they don't actually travel at all. – Steven Landsburg Oct 03 '14 at 02:02
  • 1
    Related: http://mathoverflow.net/questions/139804/distribution-of-roots-of-complex-polynomials . I thought I had seen other discussions of this phenomenon on here as well.... – usul Oct 03 '14 at 08:00
  • 12
    "The zeros of random polynomials cluster uniformly near the unit circle" http://www-old.newton.ac.uk/preprints/NI04017.pdf – Benjamin Dickman Oct 03 '14 at 08:42
  • 3
    Related: http://math.stackexchange.com/q/206890/622 – Asaf Karagila Oct 03 '14 at 09:40
  • 32
    I suspect your constraints on the coefficients are similar to taking the $100$th root of a relatively small number: $\sqrt[100]{42}\approx 1.038$. Try multiplying the coefficients of $x^i$ by $2^{100-i}$ to see if it makes a difference. – Henry Oct 04 '14 at 13:53
  • 4
    George Lowther proves that the roots "becomes concentrated on the unit circle" in his answer to the MO question, "Distribution of roots of complex polynomials" (as linked by usul). – Joseph O'Rourke Oct 04 '14 at 14:10
  • 4
    Since the roots and coefficients of a polynomial are Fourier transforms of each other, another way to rephrase the question that I find interesting is to ask why the Fourier transform of a random real vector is another random vector most of whose components have unit magnitude. I don't have an answer as to why but maybe this will help someone find another intuitive explanation. – user541686 Oct 05 '14 at 09:47
  • 7
    If we fix the degree at $n$ and the coefficients are i.i.d. random variables, then the distribution is invariant under $p(z) \mapsto z^np(1/z)$, which means that the distribution of zeros is invariant under $z \mapsto 1/z$. I wonder if there is an equally simple way to see that there has to be some rotational symmetry. – François G. Dorais Oct 05 '14 at 20:54
  • 35
    I'm sorry, but with all due respect for the other contributions to this site of the OP and most answerers, this is definitely not a great question. It is not precise, does not admit any definitive answer, and is kind of trivial: Henry's comment already demystifies it completely, and Joseph Van Name shows that it's just a very simple exercise in complex analysis. Yet another example of the stack exchange system gone crazy. – Joël Oct 06 '14 at 12:51
  • 5
    I know very little about the math going on in the background, but a colleague demystified this for me by saying that basically it is a special feature of the basis you have chosen, namely the monomials. If you choose a different basis or even just weight each monomial by a factor, the roots will tend to congregate on a different set. Basically, although a) any polynomial can arise and b) you chose them randomly, they aren't as generic as you think; they've in fact been chosen in a special way. – Spencer Oct 06 '14 at 14:46
  • 4
    @Joël: While I don't disagree with the craziness, I disagree with the reasoning. While seemingly plausible, Henry's comment is arguably incorrect. While Joseph's answer is enlightening, it does not explain the observed rotational symmetry. There is a lot more to this question than meets the eye... – François G. Dorais Oct 07 '14 at 01:11
  • 25
    @Joël: I myself wouldn't rate this question as highly as it is, I think we're all victims of the scale-free networks. If I had to guess what makes it interesting: it is easy to understand (and thereby very likely not "research level") and it sparks ideas in people's minds because it looks like it is within reach of a good coffee break discussion. – Andrej Bauer Oct 07 '14 at 06:51
  • 1
    @Francois Dorais. I added to my answer some remarks that do explain why there should be rotational symmetry. Unfortunately it is not very rigorous at this point since the denominator could be near zero close to the contour. But I would agree that there is definitely more than meets the eye (just see Terry Tao's comments on my answer). – Joseph Van Name Oct 12 '14 at 00:22
  • 2
    I have a theorem for this in my habilitation from 1993 (Interpolation Points and Zeros of Polynomials in Approximation Theory, R. Grothmann, Habil. at KU Eichstätt). It is not a central result there and I never published this corollary anywhere else, I must admit. Such things seem to get rediscovered often. – Rene Oct 12 '14 at 17:32
  • 2
    a quick answer to the title's question is: if z has modulus 1, all its powers have modulus 1 too, and a random linear combination of them (with coefficients iid chosen in [a,b]) has more chances to vanish. For $n$ large compared to |a|+|b|, $|z|\neq 1$ puts a strong constraint on the possible vanishing linear combinations of the powers of z. – Pietro Majer Oct 13 '14 at 07:56
  • 1
    I think it is not difficult to make the above argument into an elementary rigorous estimate of the set of coefficients $(a_0,\dots,a_n)\in[a,b]^{n+1}$ of polynomials that have a zero $1-\epsilon <|z|< 1+\epsilon,$ and with a bit more care, with $\alpha<\mathrm{arg}z<\beta$, to explain quantitatively the uniform concentration on the unit circle. – Pietro Majer Oct 13 '14 at 08:07
  • 2
    @PietroMajer: please post an answer if you have a nice one. – Andrej Bauer Oct 13 '14 at 08:20

16 Answers

201

Let me give an informal explanation using what little I know about complex analysis.

Suppose that $p(z)=a_{0}+\dotsm+a_{n}z^{n}$ is a polynomial with random complex coefficients and suppose that $p(z)=a_{n}(z-c_{1})\cdots(z-c_{n})$. Then take note that

$$\frac{p'(z)}{p(z)}=\frac{d}{dz}\log(p(z))=\frac{d}{dz}\bigl(\log a_{n}+\log(z-c_{1})+\dotsm+\log(z-c_{n})\bigr)= \frac{1}{z-c_{1}}+\dotsm+\frac{1}{z-c_{n}}. $$

Now assume that $\gamma$ is a circle larger than the unit circle. Then

$$\oint_{\gamma}\frac{p'(z)}{p(z)}dz=\oint_{\gamma}\frac{na_{n}z^{n-1}+(n-1)a_{n-1}z^{n-2}+\dotsm+a_{1}}{a_{n}z^{n}+\dotsm+a_{0}}dz\approx\oint_{\gamma}\frac{n}{z}dz=2\pi in.$$

However, by the residue theorem,

$$\oint_{\gamma}\frac{p'(z)}{p(z)}dz=\oint_{\gamma}\Bigl(\frac{1}{z-c_{1}}+\dotsm+\frac{1}{z-c_{n}}\Bigr)dz=2\pi i|\{k\in\{1,\ldots,n\}|c_{k}\,\,\textrm{is within the contour}\,\,\gamma\}|.$$

Combining these two evaluations of the integral, we conclude that $$2\pi i n\approx 2\pi i|\{k\in\{1,\ldots,n\}|c_{k}\,\,\textrm{is within the contour}\,\,\gamma\}|.$$ Therefore there are approximately $n$ zeros of $p(z)$ within $\gamma$, so most of the zeroes of $p(z)$ are within $\gamma$, so very few zeroes can have absolute value significantly greater than $1$. By a similar argument, very few zeroes can have absolute value significantly less than $1$. We conclude that most zeroes lie near the unit circle.
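The approximation $p'/p\approx n/z$ can be made slightly more explicit. This is only a sketch: the coefficient bound $M$ and the constant $c$ below are assumptions introduced for illustration. Since $np(z)-zp'(z)=\sum_{k=0}^{n-1}(n-k)a_{k}z^{k}$, we have

$$\frac{p'(z)}{p(z)}=\frac{n}{z}-\frac{1}{z\,p(z)}\sum_{k=0}^{n-1}(n-k)a_{k}z^{k}.$$

If $|a_{k}|\le M$ for all $k$ and $|z|=r>1$, then

$$\Bigl|\sum_{k=0}^{n-1}(n-k)a_{k}z^{k}\Bigr|\le M\sum_{j=1}^{n}j\,r^{n-j}\le\frac{Mr^{n+1}}{(r-1)^{2}}.$$

So if, in addition, $|p(z)|\ge c\,r^{n}$ on the contour, then

$$\Bigl|\frac{p'(z)}{p(z)}-\frac{n}{z}\Bigr|\le\frac{M}{c\,(r-1)^{2}},$$

which does not grow with $n$. The error in the contour integral is then at most $2\pi r M/(c(r-1)^{2})$, while the main term is $2\pi i n$. The lower bound on $|p|$ is the step that is not automatic: it can fail if $p$ happens to be unexpectedly small somewhere on the contour.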

$\textbf{Added Oct 11,2014}$

A modified argument can help explain why the zeroes tend to be uniformly distributed around the circle as well. Suppose that $\theta\in[0,2\pi]$ and $\gamma_{\theta}$ is the pizza-slice-shaped contour defined by $$\gamma_{\theta}:=\gamma_{1,\theta}+\gamma_{2,\theta}+\gamma_{3,\theta}$$ where

$$\gamma_{1,\theta}=[0,1+\epsilon]\times\{0\},$$

$$\gamma_{2,\theta}=\{re^{i\theta}\mid r\in[0,1+\epsilon]\},$$

$$\gamma_{3,\theta}=\{(1+\epsilon)e^{ix}\mid x\in[0,\theta]\}.$$

Then $$\oint_{\gamma_{\theta}}\frac{p'(z)}{p(z)}dz= \int_{\gamma_{1,\theta}}\frac{p'(z)}{p(z)}dz+\int_{\gamma_{2,\theta}}\frac{p'(z)}{p(z)}dz+\int_{\gamma_{3,\theta}}\frac{p'(z)}{p(z)}dz$$

$$\approx O(1)+O(1)+\int_{\gamma_{3,\theta}}\frac{p'(z)}{p(z)}dz$$

$$= O(1)+O(1)+\int_{\gamma_{3,\theta}}\frac{na_{n}z^{n-1}+(n-1)a_{n-1}z^{n-2}+\dotsm+a_{1}}{a_{n}z^{n}+\dotsm+a_{0}}dz $$

$$\approx O(1)+O(1)+\int_{\gamma_{3,\theta}}\frac{n}{z}dz= O(1)+n i\theta.$$

Dividing by $2\pi i$, there should therefore be approximately $\frac{n\theta}{2\pi}$ zeroes inside the pizza slice $\gamma_{\theta}$.
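The sector count can be compared with $n\theta/(2\pi)$ numerically. The following is a minimal sketch, not part of the argument above: `sectorCounts` is a made-up helper, the polynomial construction and the interval $[27,42]$ are borrowed from the question, and the roots are found with `NSolve` rather than by evaluating the contour integral.

```mathematica
(* Random polynomial with i.i.d. coefficients uniform on [a, b], as in the question. *)
randomPoly[n_, x_, {a_, b_}] :=
  x^Range[0, n] . Table[RandomReal[{a, b}], {n + 1}];

(* Hypothetical helper: for each angle th, count the roots with argument in [0, th]
   and list the count next to the heuristic prediction n th/(2 Pi). *)
sectorCounts[n_, {a_, b_}, thetas_List] := Module[{x, args},
  (* Arg returns values in (-Pi, Pi]; Mod maps them to [0, 2 Pi). *)
  args = Mod[Arg[x /. NSolve[randomPoly[n, x, {a, b}] == 0, x]], 2 Pi];
  Table[{th, Count[args, t_ /; t <= th], N[n th/(2 Pi)]}, {th, thetas}]];

sectorCounts[300, {27, 42}, {Pi/4, Pi/2, Pi, 3 Pi/2}] // TableForm
```

Each row lists the angle, the observed count and the predicted count. The heuristic allows $O(1)$ errors from the straight edges, so only rough agreement should be expected.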

  • 7
    A nice argument, but really the same as the heuristic in my argument, I think... – Igor Rivin Oct 03 '14 at 01:58
  • 31
    A slight variant of this argument: Jensen's formula tells us that the zeroes of $p$ occur when $\log |p|$ fails to be harmonic. But $\log |p(z)|$ is typically $O(1)$ for $|z| \leq 1-\varepsilon$ and typically $n \log |z| + O(1)$ for $|z| \geq 1+\varepsilon$, so the main opportunity for non-harmonicity is near the unit circle. This formulation of the argument has the advantage of extending to other models of random polynomials than Kac polynomials, e.g. Weyl polynomials. It can also be pushed to give local universality: see my paper with Van at http://arxiv.org/abs/1307.4357 – Terry Tao Oct 03 '14 at 15:49
  • 34
    I am accepting this argument because it doesn't make me read papers, and instead just explains what's going on directly. Thanks for all the other answers, too! – Andrej Bauer Oct 03 '14 at 16:27
  • 4
    I don't understand this argument. Where is the assumption of randomness used? It must be in the approximation that the $p'/p \approx n/z$ -- that obviously holds for $\gamma$ large, but that doesn't support the conclusion that "very few zeros can have absolute value significantly greater than 1". What am I missing? – Aaron Bergman Oct 10 '14 at 01:56
  • 10
    Basically, the randomness is needed to ensure that the denominator $a_n z^n + \dots + a_0$ does not inconveniently end up being unexpectedly small on the contour of integration, which would make the error in the approximation unpleasantly large. (To ensure that this unexpected smallness does not occur requires a bit of work, and is part of what is now known as Littlewood-Offord theory, introduced by Littlewood and Offord to study almost exactly the problem discussed here, namely to understand the distribution of roots of random polynomials.) – Terry Tao Oct 10 '14 at 05:48
  • 11
    ... for the purposes of rigorous argument rather than heuristics, it turns out that the Jensen formula argument sketched in my previous comment is more robust (basically because $\log |z|$ diverges at zero far more slowly than $\frac{1}{z}$) and is easier to make fully rigorous, as is done in the paper linked to in my previous comment. – Terry Tao Oct 10 '14 at 05:52
  • @Terry Tao. I have added a similar argument for why there should be rotational symmetry as well. Of course, this informal argument still suffers from the pesky fact that the denominator could be near zero. It seems like one could use the Poisson-Jensen Formula (for nearly arbitrary domains instead of simply the unit circle) for the pizza slice shaped domains to deduce rotational symmetry using $\log(|p(z)|)$ instead of $p'(z)/p(z)$. Is the usage of the Poisson-Jensen Formula for pizza slices the best way to formally rigorously deduce rotational symmetry or should different domains be used? – Joseph Van Name Oct 12 '14 at 00:20
  • 3
    I'm afraid I'm still being dense here. Can someone say what the heuristic is that, for a random polynomial $p$, $p'/p \approx n/z$ on a contour close to the unit circle? – Aaron Bergman Oct 12 '14 at 02:55
  • 1
    @AaronBergman For $|z|>1$, the top order terms of $p$ and $p'$ dominate. And $np$ and $zp'$ almost agree to top order. – Terry Tao Feb 02 '15 at 19:43
118

A complete derivation can be found in the classical paper of Shepp and Vanderbei:

Larry A. Shepp and Robert J. Vanderbei: The complex zeros of random polynomials, Trans. Amer. Math. Soc. 347 (1995), 4365-4384

But the heuristic explanation is that for small modulus the higher-order terms contribute very little to the polynomial and so can be thrown away (so the polynomial can be viewed as one of much lower degree, which has far fewer roots), and for large modulus one can use the same reasoning with $z\rightarrow 1/z.$
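One way to put numbers on this heuristic (a sketch only; the coefficient bound $M$, the radius $\rho$ and the cutoff $m$ are notation introduced here, not taken from the paper): if $|a_k|\le M$ for all $k$ and $|z|\le\rho<1$, then the terms of degree above $m$ satisfy

$$\Bigl|\sum_{k=m+1}^{n}a_{k}z^{k}\Bigr|\le M\sum_{k>m}\rho^{k}=\frac{M\rho^{m+1}}{1-\rho},$$

which is below any given $\delta>0$ once $m+1\ge\log\bigl(M/(\delta(1-\rho))\bigr)/\log(1/\rho)$, no matter how large $n$ is. So on the disc $|z|\le\rho$ the polynomial stays within $\delta$ of a polynomial whose degree does not depend on $n$. The geometric series converges exactly when $\rho<1$, which is where the radius $1$ comes from when the coefficients are all of comparable size. If instead the coefficient of $z^k$ is $a_kR^{-k}$, then the polynomial is $\sum_k a_k(z/R)^k$ and every root is scaled by $R$, so the same argument singles out the circle of radius $R$.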

EDIT

For a general distribution of coefficients, see this (underappreciated, in my opinion) paper: Distribution of roots of random real generalized polynomials

Igor Rivin
  • 12
    Is there a bit more to the heuristic? As stated, I do not see how it implies that the unit circle is special. – Andrej Bauer Oct 02 '14 at 21:11
  • Intuitively, when you choose "random coefficients" you avoid too large and too small. – Alexandre Eremenko Oct 02 '14 at 21:41
  • 7
    Sure, but why wouldn't that give me a circle of radius 2, rather than 1? – Andrej Bauer Oct 02 '14 at 21:56
  • Just a name, my classmate Eric Kostlan worked on roots of random polynomials. See if I can find anything specific.. – Will Jagy Oct 02 '14 at 23:04
  • I see Eric's paper with Edelman is a reference for the Transactions article. I wrote to Eric, maybe he will have something to add. – Will Jagy Oct 02 '14 at 23:15
  • @AndrejBauer If you read the heuristic argument, you will see why $1$ is different from $2.$ – Igor Rivin Oct 02 '14 at 23:40
  • 2
    @WillJagy Kostlan has done some really nice stuff. – Igor Rivin Oct 02 '14 at 23:40
  • Igor, glad you think so. If he does not put anything on MO within a few days, I will forward your opinion. – Will Jagy Oct 02 '14 at 23:43
  • That paper is very valuable but may not be the "complete derivation" since it is for a specific sample space (normally distributed real coefficients.) Here "random" may mean something else. However the conclusion is about the same, and the heuristic is highly relevant. – Aaron Meyerowitz Oct 03 '14 at 05:39
  • 1
    @AaronMeyerowitz Your wish is my command :), see the edit. – Igor Rivin Oct 03 '14 at 11:11
  • 15
    Modulus 1 is special because it is the positive fixpoint of $z \mapsto 1/z$. If I understand the heuristics correctly it is in a way wrong to use random coefficients from the same interval $I$ for all powers of $z$. Rather one should use random numbers $r_k \in I$ and sum over $r_k^{n-k}\,z^k$. – Karl Fabian Oct 03 '14 at 11:47
  • Here's a different way of stating the heuristic. Say the coefficients $a_i$ are IID with mean 1 and variance $\sigma^2$. Then for fixed $z$, $P(z)$ has a (complex) variance given by a geometric sum. For large $n$ and $|z|\gtrsim1$ this variance has real and imaginary parts that grow approximately like $|z|^{2n}$, and therefore the distribution gets too wide, and the probability of $|P|<\epsilon$ falls like $|z|^{-2n}$. This drops off so fast that the prob. of having $|z|$ significantly greater than 1 is small. But the problem is isomorphic under circle inversion, so the same holds for $|z|<1$. –  Oct 03 '14 at 20:41
  • 13
    Igor, Eric Kostlan said the late post by Phantom Hoover is what he would have said, later added "Its even easier, if you are willing to be non-rigorous. If you generate random polynomials in any of a number of "natural ways", the middle coefficients tend to grow fast. For example, one model which gives roots equi-distributed on the Riemann sphere gives the i-th coefficient a variance of (n choose i). So forcing all the coefficients to have the same variance is sort of like forcing the middle coefficients to be zero. So roughly speaking, this starts to look like x^n +- 1." – Will Jagy Oct 04 '14 at 20:04
84

I think the following geometric argument is interesting and maybe sufficient to answer "why" at an intuitive level (?).

When we take the powers of $x$ in the complex plane, the absolute value scales geometrically ($|x^n| =|x|^n$) and the argument (angle with the x-axis) scales linearly ($\arg x^n = n \arg x$). So the powers of $x$ look like this:

[Figure: powers of x in the complex plane]

If $x$ is a root of our random polynomial $$ p(x) = a_nx^n + \dots + a_1 x + a_0, $$ then each of these vectors (including the $x^0$ vector not drawn) is multiplied by a random coefficient, and the sum is equal to the zero vector. I'm just thinking of i.i.d. positive bounded coefficients for this response.

The key point is that this weighted sum of the vectors in any particular direction must cancel out to zero if $x$ is a root of the polynomial, yet each time $x^k$ goes "around the circle" the sizes $|x^k|$ of the vectors are geometrically larger, unless $|x|$ is very close to $1$. Intuitively, some randomness in the coefficients will not be enough to cancel out large growth of $|x^k|$ because the vectors must sum to zero in every direction simultaneously.

For concreteness, choose the direction of the positive $x$-axis. Then the condition that $x$ be a root implies that, letting $\theta = \arg x$ be the angle of $x$ with the $x$-axis, \begin{align*} 0 &= \sum_k a_k Re(x^k) \\ &= \sum_k a_k |x|^k \cos (k \theta) . \end{align*} Heuristically, since $\cos(k \theta)$ is an oscillating term in $\theta$ and the $a_k$ are independently random, $|x|$ must be very close to one or else the large-$k$ terms "unbalance" the sum. And this condition must hold in all directions, not just the positive $x$-axis.

I have drawn the case where $|x| > 1$, but the $|x| < 1$ case is exactly the same.

(Edit: Also possibly interesting, in light of François' simulations: this suggests that if the coefficients are all positive, or more likely to be positive, and the degree $n$ is relatively small, then we should see few roots with argument (angle to the $x$-axis) close to $0$. In this case there is not enough oscillation to get cancellation: the powers of $x$ don't go "around the circle", and neither are they cancelled by negative coefficients.)

usul
60

I think I will post this, since the question remains popular and there are differing views on what it means. I wrote to Eric Kostlan, a classmate who has published on this sort of thing; he supported the answer by Phantom Hoover, saying "I was going to answer, but someone beat me to it", and later sent an informal version:

Eric Kostlan:

It's even easier, if you are willing to be non-rigorous. If you generate random polynomials in any of a number of "natural ways", the middle coefficients tend to grow fast. For example, one model which gives roots equi-distributed on the Riemann sphere gives the $i$-th coefficient a variance of $\binom{n}{i}$. So forcing all the coefficients to have the same variance is sort of like forcing the middle coefficients to be zero. So roughly speaking, this starts to look like $x^n \pm 1$.
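A quick way to see the contrast between the two models is to compare how many roots land near the unit circle in each. This is a rough sketch under assumptions of mine: the coefficients are taken to be complex Gaussians (the quote only specifies the variances), and the helper names `rootsOf`, `gaussians` and `inAnnulus`, the degree 50 and the annulus $0.8<|z|<1.25$ are arbitrary illustrative choices.

```mathematica
(* Hypothetical helper: roots of the polynomial with coefficient list c = {c0, c1, ..., cn}. *)
rootsOf[c_List] := Module[{x},
  x /. NSolve[x^Range[0, Length[c] - 1] . c == 0, x]];

(* n + 1 independent complex Gaussians; the choice of Gaussians is an assumption. *)
gaussians[n_] := RandomVariate[NormalDistribution[0, 1], n + 1] +
   I RandomVariate[NormalDistribution[0, 1], n + 1];

deg = 50;
(* Model 1: every coefficient has the same variance. *)
equalVariance = rootsOf[gaussians[deg]];
(* Model 2: the i-th coefficient has variance proportional to Binomial[deg, i]. *)
binomialVariance = rootsOf[Sqrt[Binomial[deg, Range[0, deg]]] gaussians[deg]];

(* Fraction of roots in the annulus 0.8 < |z| < 1.25. *)
inAnnulus[roots_] := N[Count[Abs[roots], m_ /; 0.8 < m < 1.25]/Length[roots]];
{inAnnulus[equalVariance], inAnnulus[binomialVariance]}
```

For points uniformly distributed on the Riemann sphere, the fraction with $|z|<r$ after stereographic projection is $r^2/(1+r^2)$, so about $0.22$ of them would fall in this annulus. That gives a reference value for the second number.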

Here is one of Kostlan's papers in this area: https://www.ams.org/journals/bull/1995-32-01/S0273-0979-1995-00571-9/

Here is a different one, along the top of the page there is an option to download. https://doi.org/10.1016/0024-3795(92)90386-O

Will Jagy
  • That's a very good explanation too. – Andrej Bauer Oct 05 '14 at 19:28
  • @AndrejBauer, glad you saw this. – Will Jagy Oct 05 '14 at 19:32
  • The pdf of the Science Direct article is in fact free at the link that you gave. Articles in many Springer & Elsevier journals are freely available four years after publication (and the same holds for all AMS journals). – Lucia Oct 05 '14 at 22:34
  • @Lucia, thanks, i will edit that in. – Will Jagy Oct 05 '14 at 23:15
  • 4
    This also nicely explains the (apparently relatively unremarked) fact that judging by the OP's illustration, the roots are not just approximately on the unit circle, but approximately uniformly distributed on it. – Ian Morris Oct 07 '14 at 18:23
  • 5
    @IanMorris Indeed, the roots will be uniformly distributed on the unit circle, as per the result of Erdos and Turan that I described in my answer (so sorry, but this fact was not unremarked). But as you say, the heuristic of Kostlan gives a very nice, albeit rough, explanation of why it should be true. The Erdos-Turan theorem, which is not at all easy to prove, gives a precise quantitative formulation. – Joe Silverman Oct 10 '14 at 01:56
47

The following papers might be helpful:

Shmerling, E and Hochberg, K.J., Asymptotic Behavior Of Roots Of Random Polynomial Equations, Proceedings Of The American Mathematical Society, Volume 130 (2002), Number 9, Pages 2761-2770.

Erdős, P. and Turán, P., On the distribution of roots of polynomials, Ann. Math. 51 (1950), 105-119.

In particular, the Erdős–Turán paper contains the following beautiful result, which is a quantitative version of the observation that the angles of the roots of a random polynomial tend to be equidistributed on the unit circle. (The paper may well discuss the magnitudes of the roots, too, but this is the result that I know from that paper.)

Theorem (Erdős, Turán) Let $F(x)=\sum_{k=0}^d a_kx^k\in\mathbb{C}[x]$ with $a_0a_d\ne0$, and let $$ N(F;\alpha,\beta) = \#\bigl\{ \text{roots $r\in\mathbb{C}$ of $F$ with $\alpha\le\operatorname{arg}(r)\le\beta$}\bigr\}. $$ Then for all $0\le \alpha<\beta\le2\pi$, $$ \left| \frac{N(F;\alpha,\beta)}{d} - \frac{\beta-\alpha}{2\pi}\right| \le \frac{16}{\sqrt{d}} \cdot \left[ \log \left( \frac{|a_0|+\cdots+|a_d|}{\sqrt{|a_0a_d|}} \right) \right]^{1/2}. $$
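To get a feel for the size of this bound, here is the arithmetic for the polynomials in the question (degree $d=300$, coefficients in $[27,42]$). It is only a sketch that plugs those numbers into the inequality as stated above, with $\log$ the natural logarithm. Since every $a_k$ lies in $[27,42]$,

$$193.5=\frac{301\cdot 27}{42}\le\frac{|a_0|+\cdots+|a_d|}{\sqrt{|a_0a_d|}}\le\frac{301\cdot 42}{27}\approx 468.2,$$

so the right-hand side lies between

$$\frac{16}{\sqrt{300}}\sqrt{\log 193.5}\approx 2.12\quad\text{and}\quad\frac{16}{\sqrt{300}}\sqrt{\log 468.2}\approx 2.29.$$

Both values exceed $1$, while the left-hand side is never more than $1$, so at degree $300$ the inequality says nothing. Repeating the computation with $d=10^4$ gives a right-hand side between roughly $0.47$ and $0.50$. With the constant $16$, the theorem becomes informative for these coefficients only at degrees in the thousands, although the $1/\sqrt{d}$ decay does give equidistribution of the angles in the limit.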

Joe Silverman
30

I was a bit skeptical of some of the explanations, so I ran my own experiments to see how varying the parameters affected the distribution of zeros. Note that I was only interested in the case where coefficients are independent and identically distributed. Computations were done using PARI/GP instead of Mathematica.

I first essentially repeated Andrej's experiment, with degree 100 and sampling uniformly from the unit side square centered at the origin, with the expected result: [Figure: degree 100, center]

I then decided to sample uniformly from the unit side square with a corner at the origin and I saw something different: [Figure: degree 100, corner]

To make sure I was seeing what I thought I was seeing, I reduced the degree to 10: [Figure: degree 10, corner]

Of course, the reason is as tros443 explained, if we normalize $p(z) = a_nz^n + \cdots + a_1z + a_0$ to obtain a monic polynomial $\bar{p}(z) = z^n + \cdots + (a_1/a_n)z + (a_0/a_n),$ the normalized coefficients $a_i/a_n$ are independent identically distributed random variables with mean $1$, so the expected value of $\bar{p}(z)$ is $z^n + \cdots + z + 1$.

However at degree 10, sampling from the unit side square centered at the origin does not show this pattern at all: [Figure: degree 10, center]

The difference becomes clear when looking at roots of degree 1 polynomials, i.e. at the distribution of $-a_0/a_1$ in both cases: [Figures: degree 1, center; degree 1, corner]

In the case of the centered square, the distribution of $a_i/a_n$ has mean $0$, so the expected value of $\bar{p}(z)$ is $z^n$ rather than $z^n + \cdots + z + 1$. Note that the distribution is also very diffuse.

As I remarked in a comment to the question, the fact that the coefficients of $p(z)$ are independent identically distributed random variables implies that the distribution of zeros is invariant under $z \mapsto 1/z$, which is enough to guess that the roots will concentrate on the unit circle. However, to answer my own question, there is no reason to believe the distribution of zeros will be rotationally symmetric unless the distribution of the coefficients has mean 0.
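For anyone who wants to redo these experiments, here is a minimal Mathematica sketch of the setup described above. It is not the PARI/GP code used for the pictures in this answer; the helper names `squareCoefficients` and `plotRoots` are made up, and "unit side square with a corner at the origin" is read as $[0,1]\times[0,1]$.

```mathematica
(* Hypothetical helper: n + 1 i.i.d. complex coefficients, uniform on the
   unit-side square whose lower-left corner is the complex number `corner`. *)
squareCoefficients[n_, corner_] :=
  corner + RandomReal[{0, 1}, n + 1] + I RandomReal[{0, 1}, n + 1];

(* Hypothetical helper: plot the roots of the polynomial with coefficient list c. *)
plotRoots[c_List, label_] := Module[{x, roots},
  roots = x /. NSolve[x^Range[0, Length[c] - 1] . c == 0, x];
  ListPlot[{Re[#], Im[#]} & /@ roots,
   AspectRatio -> Automatic, PlotRange -> All, PlotLabel -> label]];

(* Centered square: corner at -1/2 - I/2.  Corner square: corner at 0. *)
GraphicsGrid[{
  {plotRoots[squareCoefficients[100, -1/2 - I/2], "degree 100, center"],
   plotRoots[squareCoefficients[100, 0], "degree 100, corner"]},
  {plotRoots[squareCoefficients[10, -1/2 - I/2], "degree 10, center"],
   plotRoots[squareCoefficients[10, 0], "degree 10, corner"]}}]
```

A single sample at degree 10 shows only ten roots, so overlaying the roots of many samples is probably needed to see the low-degree patterns clearly.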

28

Here's a simplistic intuitive explanation. First, consider the polynomial

$$x^n+x^{n-1}+\ldots+x^2+x+1$$

i.e. the polynomial of degree $n$ where all coefficients are $1$. Then all of its roots have absolute value equal to $1$. To see this, multiply it by $x-1$. This will add a new root (namely $1$):

$$(x-1)(x^n+x^{n-1}+\ldots+x^2+x+1)$$

This is equal to $x^{n+1}-1$. But $x^{n+1}-1$ has only roots with absolute value equal to $1$ (because $x^{n+1}=1$ implies that $1=|1|=|x^{n+1}|=|x|^{n+1}$).

Now let's move on to polynomials of the form

$$a_nx^n+a_{n-1}x^{n-1}+\ldots+a_2x^2+a_1x+a_0$$

where the $a_i$'s were randomly selected from a bounded interval with uniform probability. Then the $a_i$'s will be roughly equal, especially when $n$ is relatively large. That is, if $\bar a$ is the average of $a_i$'s, then by multiplying the polynomial by $\bar a^{-1}$ we will get a polynomial where the coefficients will tend to be close to $1$. Thus that polynomial will behave like the first polynomial.

Edit

The above explanation also gives you an idea why the roots are relatively evenly distributed across the unit circle.
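To spell out where the roots of the all-ones polynomial sit (this only makes the first step above explicit): the roots of $x^{n+1}-1$ are $e^{2\pi i k/(n+1)}$ for $k=0,1,\ldots,n$, and dividing by $x-1$ removes the root with $k=0$, so

$$x^n+x^{n-1}+\ldots+x+1=\prod_{k=1}^{n}\bigl(x-e^{2\pi i k/(n+1)}\bigr).$$

The $n$ roots are therefore equally spaced on the unit circle at angular steps of $2\pi/(n+1)$, with a single gap at $x=1$. To the extent that the perturbation heuristic applies, the roots of the random polynomial would be expected near these points.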

tros443
  • 5
    This is a nice intuitive explanation, but there is a gap that I have hard time filling heuristically. It is not at all clear that if the coefficients of a polynomial are equal on average, their zeroes should be where they were if the coefficients were actually equal. Is there a simple explanation for this? – Joonas Ilmavirta Oct 05 '14 at 15:38
  • 2
    Well, the roots of a polynomial are a continuous function of its coefficients. So polynomials with coefficients that are roughly equal have roots that are roughly equal. – tros443 Oct 05 '14 at 15:53
  • 1
    Continuity only seems to explain it for small perturbations of the coefficients. I know that this is only supposed to be a heuristic justification, so it is not an enormous issue. – Joonas Ilmavirta Oct 05 '14 at 16:21
  • 2
    "Then the ai's will be roughly equal," Why? – Martin Brandenburg Oct 06 '14 at 14:32
  • As seen in my answer, this doesn't seem to explain rotational symmetry at all. In fact, it suggests the opposite. – François G. Dorais Oct 07 '14 at 00:29
  • I think one would need to show that random polynomials tend to be near-cyclotomic. – asmeurer Oct 09 '14 at 15:48
24

This is NOT an answer; I just decided to look at what happens if one turns this question backwards, i.e. if we want polynomials whose roots are randomly distributed in various senses (say, normally around zero with various standard deviations, or uniformly in a square around zero, or uniformly on $\mathbb C\mathrm P^1$ with respect to the standard metric of the Riemann sphere).

(And still later) ...trying to reconcile the pictures with what Will Sawin said, I've finally figured out that something is indeed wrong, and it has to do with precision. (A good additional motivation to clarify what's going on were upvotes starting to turn into downvotes :D )

When I increased precision, by replacing everywhere

RandomReal[{?,??}]

with

RandomReal[{?,??},WorkingPrecision->1000]

the outcome changed considerably. Typical pictures with coefficients now look like this (I hope this time precision artifacts do not distort the picture):

[Figure: coefficients computed with the increased working precision]

In most cases now there seems to be present what Will Sawin mentioned in his first comment (that rotating around zero the set of roots by an angle $\alpha$ results in rotating the $n$th coefficient by $n\alpha$).

I still do not understand why lower precision gave the pictures I produced before, and why there still is some symmetry present in most cases, but anyway this is what I've currently got.

One can still say that the placement of coefficients is far from random in any sense - there still is rotational symmetry, while absolute values of coefficients seem to form an "almost" log-concave unimodal sequence (ascending followed by descending):

[Figure: absolute values of the coefficients]

I've decided to leave the rest of the message intact too.

(Added later)
As Will Sawin points out in a comment below, there is something suspicious about what follows; let me add the Mathematica code used to produce the pictures below (for the $\mathbb C\mathrm P^1$ version), maybe somebody will find some error...

RandomCP1Point[] := Module[{u=RandomReal[{-1,1}]}, 
    Exp[RandomReal[{0,2Pi}]I] Sqrt[1-u^2]/u
]
RandomCP1RootCoefficients[degree_] := Module[{L,x},
    L=CoefficientList[Times@@(x-Table[RandomCP1Point[],{degree}]),x];
    L/Max[Abs[L]]
]

Example:

ListPlot[Map[{Re[#],Im[#]}&,RandomCP1RootCoefficients[500]], 
    PlotRange->All, AspectRatio->Automatic
]

The results seem to be qualitatively indistinguishable, and the picture is quite interesting, I think. I have no idea why the coefficients tend to lie on a smooth curve, but the fact is that the density close to zero is higher than away from it.

Here are some typical results (roots are arbitrary, not complex conjugate pairs, hence coefficients are complex, not real; they are normalized by dividing through the overall maximum modulus).

Each picture contains all coefficients of a polynomial whose roots are randomly chosen in one of the above senses. The first four have 300 points, the last has 1000. In each case the numbering of the coefficients goes along the curve, with the lowest and highest coefficients near the origin.

[Five figures: coefficient plots; the first four with 300 points, the last with 1000]

One more example of degree 1000 with labels for coefficient numbers:

[Figure: degree 1000 example with coefficient numbers labeled]

  • I would assume that each picture represents a single coefficient: would you mind labelling them? Also, how does this depend on the degree? Finally, why not posting this as a new question? – Marco Golla Oct 03 '14 at 08:45
  • 1
    @MarcoGolla Sorry, I should explain better. Each picture represents all coefficients of a separate polynomial of degree 300 whose 300 roots are randomly distributed in various senses. I will add the explanation. – მამუკა ჯიბლაძე Oct 03 '14 at 09:15
  • 1
    ...as for posting a new question - well, I have not thought about it long enough yet :D – მამუკა ჯიბლაძე Oct 03 '14 at 09:23
  • @MarcoGolla Oh I realize now, you meant to label the points to see which coefficients are they. I'll explain that too - sorry, too lazy to replot everything... – მამუკა ჯიბლაძე Oct 03 '14 at 09:35
  • Actually, I thought that each plot showed the distribution of a single coefficient for a fixed degree, so your answer was accurate. However, after your explanation your latest interpretation of my question did cross my mind. It would indeed be nice to see that too (if not with labels, at least with colours). – Marco Golla Oct 03 '14 at 09:57
  • 1
    @MarcoGolla I see. Anyway, as I said, numbering goes along the curve, with first and last coefficients nearest to zero; at least this last fact is more or less understandable, since these two are the sum and the product respectively, of random complex numbers with zero mean and with absolute values which are mostly small... – მამუკა ჯიბლაძე Oct 03 '14 at 10:42
  • 1
    I'm confused by this. Two of your distributions for roots are centrally symmetric around the origin. This gives a symmetry of the distribution for coefficients - for $\alpha$ a complex number of unit norm, you can multiply the $n$th coefficient by $\alpha^n$. This is problematic - if you apply this symmetry to a smooth function, you get a function that is not smooth at all. So why are your functions smooth? – Will Sawin Oct 04 '14 at 07:48
  • 2
    A function $\mathbb Z \to \mathbb C$ appears smooth if applying the difference operator to it destroys most of its mass (say, $L^2$ norm), and applying the difference operator more times destroys even more. Taking the Fourier transform (on the unit circle), we get that the function should have most of its $L^2$ norm concentrated near $1$. For the coefficients of a polynomial, the Fourier transform is just evaluating the polynomial on the unit circle. So if you choose random polynomials with random roots that are often near the unit circle but always far from $1$, you should get smoothness. – Will Sawin Oct 04 '14 at 07:52
  • @WillSawin Maybe precision problems interfere somehow. Maximum modulus of the coefficients becomes huge, of magnitude $10^{100}$ or so. Still I do not see how exactly possible rounding could give smoothing. – მამუკა ჯიბლაძე Oct 04 '14 at 08:49
  • @მამუკაჯიბლაძე I played with your code with degree 100 polynomials and found many polynomials that look like if you applied the symmetry I describe they would become smooth. So maybe the random functions on the unit circle you get this way tend to have all their mass concentrated somewhere - this seems plausible to me, and would explain my observations for degree $100$. But already at degree $200$ I see this almost never. Could some slight bias in random number generation be compounding? I'm not sure. – Will Sawin Oct 04 '14 at 11:20
  • @WillSawin Yes I agree this is even more suspicious that these smooth curves appear at high degrees, where errors might accumulate. Except I don't understand how accumulation of errors might result in more regular behavior... – მამუკა ჯიბლაძე Oct 04 '14 at 23:45
  • It's interesting that you chose a random magnitude-phase pair rather than a random real-imaginary pair. The points are therefore not uniformly random over the complex plane right? – user541686 Oct 05 '14 at 09:51
  • @მამუკაჯიბლაძე I mean if hypothetically there were a bias in the random number generator such that some angles were less likely, then after taking the product of a large number of linear factors, the value at points on the unit circle near those angles would be much larger than the value elsewhere, causing concentration of mass near those points. – Will Sawin Oct 05 '14 at 10:14
  • @WillSawin It was indeed a precision artifact, although I don't understand the mechanism. Increasing precision seems to amplify the rotation issue you mentioned, although it seems that the behavior of the absolute values remains the same. I've updated the text. And thank you very much for noticing it! – მამუკა ჯიბლაძე Oct 07 '14 at 16:47
  • 1
    Fascinating! The (almost) unimodality you noticed made me think of what I have observed for many classical polynomials, as the degree grows, see here: http://mathoverflow.net/questions/125811/why-are-all-these-families-of-polynomials-finally-log-concave so I wouldn't be surprised if (almost) log-concavity of the absolute coefficient values is also one of the features of polynomials with many random zeros. May you have a look at that? (Well, your diagram looks quite different from a Gaussian distribution, so maybe the log-concavity is an illusion.) – Wolfgang Oct 07 '14 at 17:18
  • @Wolfgang Thanks for pointing this out! David Speyer's answer there seems to be relevant here too. As for (almost) log-concavity - it is certainly not an illusion - the plot is logarithmic! – მამუკა ჯიბლაძე Oct 07 '14 at 17:37
  • 1
    You should expect the absolute value of the $k$th coefficient to be close to the square root of the expectation of the squared absolute value of the $k$th coefficient, which is $\sqrt{\binom{n}{k}}$. This is probably the easiest way to figure out the shape of the curve. – Will Sawin Oct 07 '14 at 18:34
  • @Mehrdad Sorry, I only noticed your comment now. Frankly speaking, I don't know how to go about uniformly distributing points over an infinite area. For any given $R$, absolute values should be greater than $R$ with probability 1... – მამუკა ჯიბლაძე Oct 08 '14 at 05:53
  • This is really strange. My experiments with MATLAB (just with double precision floating points) show strange effects even when I calculate the coefficients of a polynomial which has $N$ roots precisely (up to eps) at the roots of unity when $N$ is as small as 100. I obtain a polynomial with very large coefficients for the middle exponents, and comparably small coefficients close to $x^0$ and $x^N$. – Dirk Oct 08 '14 at 10:53
  • @Dirk The same happens in Mathematica. For N=100, some very large coefficients appear until 36 digit precision. At 36 digits, still middle coefficients are of magnitude 0.1; increasing precision makes them gradually smaller, e.g. with 50 digits they are around $10^{-16}$. – მამუკა ჯიბლაძე Oct 09 '14 at 08:57
  • Shouldn't the distribution of each coefficient be pretty easy to find from your base distribution? Take, for instance, polynomials of degree 2: $(x-x_0)(x-x_1)=x^2-(x_0+x_1)x+x_0 x_1$ so then you get, for each coefficient, depending only on the degree, a different expected distribution. And it'll always be one as if you draw $n$ values from your distribution and build various products and sums from them, in a somewhat transparent pattern. – kram1032 Oct 09 '16 at 10:16
  • @kram1032 It is certainly straightforward that the coefficients are just the elementary symmetric functions of the roots, and there must be ways to derive distributions of the former from distributions of the latter. Do you know any relevant references where one could find explicit expressions for, say, roots uniformly distributed on the Riemann sphere? – მამუკა ჯიბლაძე Oct 09 '16 at 10:25
  • @მამუკაჯიბლაძე nope, but it should be easy enough to do, right? Generate $n$ uniform random variables on a sphere, re-interpret them as lying on a Riemann sphere, multiply out $\prod_{i=0}^{n-1}(x-x_i)$ and there you go. Or am I missing something? If you want analytic results rather than numeric ones then, admittedly, it won't be as straightforward, except for the coefficients of $x^{n-1}$ and $1$: those will just straight up be the sum or product over all the roots, respectively. – kram1032 Oct 09 '16 at 10:33
  • @kram1032 well numerically it is basically what I did in my experiments (except there was lot of confusion initially, because of precision issues). Regarding analytic conclusions, there are some in comments above by Will Sawin: expectations of absolute values seem to follow his expression quite well, but what happens with arg is less clear to me. As he indicates, rotating all roots by $\alpha$ will rotate the $k$th coefficient by $\alpha^k$, and I don't understand consequences of this fact when the distribution of roots is rotation-invariant one. – მამუკა ჯიბლაძე Oct 09 '16 at 10:44
17

This is barely more than a string of remarks but a bit long.

There are some great references given here. I think a specific question, in the spirit of the question, not answered by them is

"Under (some model of) random polynomials, what is the expected range of the non-real roots"

See some very minor computational results at the end.

For the (very) particular example giving the illustration (coefficients uniformly chosen from $[27,42]$) we could divide through by $34.5$ and rephrase (for that one example) the question as :

Setting all $a_i=1,$ the roots of $\sum_0^{300}a_iz^i$ are equally distributed on the unit circle (with a gap at $1$). Q: It seems to be about the same with $a_i$ uniformly distributed in $[1-0.22,1+0.22].$ Why?
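
The rescaling and the baseline case can be checked in a few lines. The following is only a sketch: the helper `horner` and all variable names are made up for illustration. It confirms that dividing by the midpoint $34.5$ gives a relative half-width of about $0.217$, and that the all-ones polynomial of degree $300$ vanishes at every $301$st root of unity except $1$.

```python
import cmath

def horner(coeffs, z):
    """Evaluate sum(coeffs[i] * z**i), coefficients given in ascending order."""
    acc = 0
    for c in reversed(coeffs):
        acc = acc * z + c
    return acc

# Rescaling: coefficients in [27, 42] divided by the midpoint of the interval.
lo, hi = 27.0, 42.0
midpoint = (lo + hi) / 2                 # 34.5
half_width = (hi - lo) / 2 / midpoint    # relative half-width, about 0.217

# All-ones polynomial of degree n: its zeros are the (n+1)-th roots of unity except 1.
n = 300
ones = [1.0] * (n + 1)
unit_roots = [cmath.exp(2j * cmath.pi * k / (n + 1)) for k in range(1, n + 1)]
max_residual = max(abs(horner(ones, z)) for z in unit_roots)

# At z = 1 the polynomial equals n + 1, which is the "gap at 1".
value_at_one = horner(ones, 1.0)

print(midpoint, round(half_width, 4), max_residual, value_at_one)
```

The residuals are at rounding-error level, while the value at $z=1$ is $301$.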

I can think of things to say about that but won't, as the whole point is that the phenomenon of "Very likely that roots very close to the unit circle and very nearly uniformly distributed in argument" is very robust.

I find it believable and the heuristic convincing. How the random polynomials are chosen, and what the three instances of "very" mean is important for a more precise answer.

The mentioned article by Shepp and Vanderbei is great, but it concerns random polynomials with coefficients from the normal distribution. In the example, at least, the coefficients are uniformly distributed on the real (or integer) interval $[27,42]$; with such positive coefficients the roots avoid a very small sector around the positive $x$-axis. Using real coefficients slightly favors the real axis. (There is at least one real root for $n$ odd, and something like $O(\ln{n})$ real roots are expected under certain assumptions.)

The mentioned paper by Schmerling and Hochberg seems quite satisfying; it just assumes that each coefficient $a_k$ comes from a distribution with a finite mean $\mu_k$ and standard deviation $\sigma_k$ which grow sub-exponentially (see the paper for a precise statement). One conclusion is that the proportion of the roots which have $1-\delta \lt |z| \lt 1+\delta$ goes to $1$ in the limit.

I suggested a specific question involving all the non-real roots. I'll report that I generated $100$ random polynomials $1+\sum_1^{99}a_kx^k+x^{100}$ with the $a_k$ uniformly distributed in $[0,1].$ There were $4$ real roots once, $2$ real roots $47$ times and no real roots the other $52$ times. The smallest value of $|z|$ seen among the non-real roots of any of the polynomials was $0.8365$, and the minimum was above $0.8967$ for $75$ of the polynomials. The maximum $|z|$ was $1.2386$, and for $75$ of them it was below $1.1199.$ The reasoning for setting $a_0=1$ was that allowing $a_0$ to be very small may let in one very small root. Perhaps dropping the one smallest and one largest $|z|$ would avoid the need for that.

  • 4
    The other answers are quite enlightening, but it was the rephrasing in this answer that made me go "Oh! Of course!". If you pull all the coefficients of an $m$-degree polynomial from a reasonably tight probability distribution, then the coefficients will all be pretty close to each other, so the polynomial will be close to a factor of $x^{m+1}-1$. Of course it's natural to think about how to make this more precise, but to me, this is the insight that answers the original "Why should I expect this?" question. – Steven Landsburg Oct 05 '14 at 00:37
  • Thanks! But it may be enough to have the constant term and highest coefficient not too small relative to the rest. Then almost all the non-real roots should be near the unit circle (usually). That is vague enough to be safe. Experiment! Take $x^{200}+a_{199}x^{199}+\cdots+a_1x+ 1$ where the (real) $a_i$ are randomly chosen with $|a_i|$ in $[0.9,1.1].$ As you say, typically the non-real roots have $|1-|x||<0.4$ and for all but a few outliers $0.1$ is enough. Add $cx^{100}$ and as $|c|$ grows half of the $|x|$ are around $c^{0.01}$ and half around $c^{-0.01}.$ Set a few $a_i$ to $0$, etc. – Aaron Meyerowitz Oct 05 '14 at 08:43
15

It is interesting to consider deterministic sequences of polynomials, whose degree tends to infinity. I have two examples of proven asymptotics of the zeros.

  • Let $\theta$ be a Pisot number, with minimal polynomial $P\in{\mathbb Z}[X]$. Assume that it is a cluster point in the set of Pisot numbers. Then there exists $A\in{\mathbb Z}[X]$ such that $|A|<|P|$ over the unit circle $\mathbb T$. Define $P_n(X)=X^nP(X)-A(X)\in{\mathbb Z}[X]$. This polynomial has only one root $\theta_n$ of modulus $\ge1$ ; it is again a Pisot number, whose limit is $\theta$. The other roots of $P_n$ tend to $\mathbb T$ and the empirical measure tends to the uniform measure of the unit circle.
  • In 1992, I considered the (deterministic) sequence of polynomials $$q_r(X)=\prod_{j=0}^{r-1}(X+1+\frac{j}r)-\prod_{j=0}^{r-1}(1+\frac{j}r)$$ I proved that the roots concentrate, as $r\rightarrow+\infty$, along the transcendental curve $\Gamma$ of equation $$\left|\frac{(z+2)^{z+2}}{(z+1)^{z+1}}\right|=4.$$ The distribution density is $\frac1{2\pi}\rho(s)ds$ where $s\mapsto\gamma(s)$ is the arc-length parametrization of $\Gamma$ and $$\rho(s)=\Im\left(\gamma'(s)\log\frac{\gamma(s)+2}{\gamma(s)+1}\right)+\frac{\partial(P\circ\gamma)}{\partial s}\,$$ and $P$ is the solution of the non-homogeneous Neumann problem $\Delta P=0$ inside $\Gamma$ with $$\frac{\partial P}{\partial\nu}=\Re\left(\gamma'(s)\log\frac{\gamma(s)+2}{\gamma(s)+1}\right).$$ Notice that $\Gamma$ is quite close to a circle, although it is not exactly one.
Denis Serre
  • 51,599
  • A theorem of Jentzsch-Szegö gives a result in a similar spirit. – ACL Oct 05 '14 at 22:24
  • @ACL. Jentzsch-Szegö is about power series and it tells about zeroes of partial sums accumulating along the boundary of the convergence disk. Thus it is a very different spirit. – Denis Serre Oct 06 '14 at 13:28
  • I don't agree, for two reasons. 1) Partial sums furnish nice sequence of polynomials. 2) There is a generalization of Jentzsch-Szegö where the limit behaviour of the zeroes can be more general (see the book of Andrievskii-Blatt, or a paper of mine, http://dx.doi.org/10.1142/S1793042111004691, where I discuss a generalization to Riemann surfaces of arbitrary genus). – ACL Oct 06 '14 at 17:44
15

I think the reason this is happening is that you're selecting your coefficients from a uniform distribution on the same interval. If you try running your experiment from the other direction -- that is, you look at the coefficients of a polynomial of degree $n$ with roots randomly selected from some interval $[a,b]$ (with expected value $m=\frac{a+b}{2}$) -- then you'd expect the coefficients to be about the same as the coefficients of $(x-m)^n$. But then the $i^\text{th}$ term has expected value ${n \choose i}(-m)^{n-i} x^i$, which produces very different distributions for different values of $i$. The exception to this is if all your roots have magnitude close to $1$, in which case the $m^{n-i}$ part doesn't vary much and so you'd get coefficients more in line with the distribution you're picking from.

There are details I haven't worked out properly but I'm pretty sure this is the basic reason for what you're seeing.
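
To get a feel for how different the coefficient sizes are, here is a small sketch that tabulates $\log_{10}$ of $\binom{n}{i}|m|^{n-i}$. The function name and the sample values $n=300$, $m=34.5$ and $m=1$ are illustrative choices, not taken from the argument above.

```python
from math import comb, log10

def log10_coefficient_sizes(n, m):
    """log10 of C(n, i) * |m|**(n - i) for i = 0..n.

    This is the magnitude of the x**i coefficient of (x - m)**n.
    """
    return [log10(comb(n, i)) + (n - i) * log10(abs(m)) for i in range(n + 1)]

n = 300
spreads = {}
for m in (1.0, 34.5):   # illustrative means: magnitude 1 versus far from 1
    sizes = log10_coefficient_sizes(n, m)
    spreads[m] = max(sizes) - min(sizes)
    print(f"m = {m}: coefficient magnitudes span about {spreads[m]:.0f} orders of magnitude")
```

With $|m|=1$ only the binomial factor contributes to the spread; with $|m|$ far from $1$ the factor $|m|^{n-i}$ adds hundreds of orders of magnitude on top of it.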

  • I think this argument explains why I am not getting a uniform distribution of the roots, but does not actually explain why they are all so close to the unit circle. – Andrej Bauer Oct 05 '14 at 20:23
  • @Andrej I noticed the slight tendency of the roots to cluster near 1 and to disperse more near -1, and I'm curious if anyone can provide a statistical explanation to explain this effect. It does seem that there is a strong tendency to be nearly uniform, and this is sort of a second-order thing. – Ryan Reich Oct 06 '14 at 17:13
  • If your distribution of coefficients or the zeroes is symmetric with respect to the imaginary axis then the results should likewise be symmetric. I would suspect that something is amiss with your simulations. – Andrej Bauer Oct 06 '14 at 19:12
14

The question that I will try to answer is:

Why do zeros of $\sum_{j=0}^n a_j z^j$ often accumulate on the unit circle when $a_j$ are iid but otherwise have a very general distribution?

My answer focuses on the role of the basis $\{z^j,\, j=0,\ldots, n\}$ for polynomials of degree at most $n$ and I will assume throughout that $a_j\sim N(0,1)$ are iid. The reason that zeros concentrate on $S^1$ is that the monomials $z^j$ are an orthonormal basis (just the usual Fourier basis) for $L^2(\delta_{S^1},\mathbb C),$ where $\delta_{S^1}$ is the uniform probability measure on $S^1.$ As explained below, the empirical measure of zeros of $\sum_{j=0}^n a_j z^j$ will converge to the same limit as the empirical distribution of $n$ electrons that repel one another but are constrained to lie on the support of $\delta_{S^1}.$

If we consider instead the so-called $SU(2)$ ensemble: $$p_n(z)=\sum_{j=0}^n a_j \sqrt{\frac{n!}{j!(n-j)!}}~z^j$$ for $a_j\sim N(0,1),$ then the empirical measure for the zeros of $p_n$ converges almost surely to the uniform measure $\delta_{S^2}$ on the Riemann sphere. This is a special case of Shiffman-Zelditch (Number Variance of Random Zeros on Complex Manifolds, Geom. Funct. Anal. 18 (2008), 1422-1475) and Zeitouni-Zelditch (Large deviations of empirical zero point measures on Riemann surfaces, I: $g = 0$, IMRN Vol. 2010, No. 20, 3939-3992).

The reason is that the basis $\{\sqrt{\frac{n!}{j!(n-j)!}}~z^j, \, j=0,\ldots, n\}$ for polynomials of degree at most $n$ is an orthonormal basis for the $SU(2)-$invariant inner product on the space of polynomials of degree at most $n:$

$$\langle f,g\rangle=\int_{\mathbb C} \frac{f(z)\overline{g(z)}}{\left(1+|z|^2\right)^{n+2}}\frac{i}{2\pi}dz\wedge d\overline{z}.$$

The $SU(2)-$invariance of the inner product shows immediately that the covariance kernel of $p_n$ is $SU(2)-$invariant and hence so is the average distribution of zeros. There are also more refined results concerning weak almost sure convergence, CLTs for linear statistics, and large deviations due to Shiffman-Zelditch (Equilibrium Distribution of Zeros of Random Polynomials Int. Math. Res. Not. 2003, 25-49.) with generalizations by Bloom-Shiffman (Zeros of Random Polynomials on $\mathbb C^m$ Math. Res. Lett. 14 (2007), 469-479) that captures this behavior.

Their results work in all dimensions. I will state only a very special case in complex dimension $1$ since that seems most relevant here. Modulo some technical assumptions, these theorems say the following. Suppose we have:

  1. A compact smooth simply connected domain $\Omega\subseteq \mathbb C$
  2. A (pluri)-subharmonic function $\phi$ on $\Omega$ with logarithmic growth at infinity
  3. Any reasonable probability measure $\mu$ whose support is contained in $\Omega$
  4. The measure $\mu$ and function $\phi$ satisfy the (rather weak) Bernstein-Markov condition.

Define $$p_n^{\mu,\phi}(z)=\sum_{j=0}^n a_j \phi_j(z),$$ where $a_j\sim N(0,1)$ are iid and $\phi_j$ are an orthonormal basis for the polynomials of degree at most $n$ with respect to the inner product coming from $L^2( e^{-n\phi(z)}d\mu(z),\mathbb C).$ Then the empirical distribution of zeros converges weakly almost surely to the equilibrium measure $\nu(\Omega, \mu, \phi),$ which is the unique minimizer of the weighted logarithmic energy $$E(\nu):=\int_{supp(\mu)}\int_{supp(\mu)} \log\left[|z-w| e^{-\phi(z)/2-\phi(w)/2}\right]d\nu(z)d\nu(w).$$

Put another way, the zeros of the random polynomial $p_n^{\mu, \phi}$ tend to be distributed precisely like electrons that are confined to stay on the support of $\mu$ and are subject to the external potential $e^{-\phi}.$ In this way, the orthonormal basis $\phi_j$ remembers the "geometry" of the domain $\Omega,$ the measure $\mu$ and the weight function $\phi,$ which can be thought of as a Hermitian metric.

In the above setup, taking $\mu=\delta_{S^1}$ and $\phi=0,$ we recover the original "Kac" polynomials $\sum_{j=0}^n a_j z^j.$ Taking $\mu=\delta_{S^2}$ and $\phi=\log(1+|z|^2)$ recovers the $SU(2)$ polynomials.

11

I offer another point of view from the angle of the companion matrix of the polynomial. It may also give a very vague intuition about the observed uniformity of the distribution along the unit circle.

Consider the $N\times N$ Jordan block $$J = \begin{bmatrix} 0 & 1 & & & \\ & 0 & 1 & &\\ & & &\ddots & 1\\ & & & & 0 \end{bmatrix}$$ which is the companion matrix of the polynomial $z^N$ and has all its eigenvalues equal to zero.

Now consider a slight perturbation of this matrix in a single component in the last row, i.e. $$J_\delta = \begin{bmatrix} 0 & 1 & & & \\ & 0 & 1 & &\\ & & &\ddots & 1\\ & & \delta & & 0 \end{bmatrix}.$$ If this $\delta$ sits in the $k$th entry of the last row, the eigenvalues of $J_\delta$ are the solutions of $$ z^{k-1}(z^{N-k+1} - \delta) = 0 $$ i.e. we have $k-1$ eigenvalues equal to zero and the others are the $(N-k+1)$th roots of unity times $\delta^{1/(N-k+1)}$. That means that a small perturbation in the last row throws a lot of eigenvalues from the origin towards the unit circle, and the further left the perturbation is, the more eigenvalues leave the origin and the closer they move to the unit circle. In the extreme case of $k=1$ we have no eigenvalue zero anymore but all $N$th roots of unity times $\sqrt[N]{\delta}$ as eigenvalues.

Roughly, one may say that all perturbations in the last row tend to spread a number of eigenvalues equally distributed around a scaled unit circle. Moreover, the lower left corner is the most sensitive position.
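
How strong this effect is can be seen from the formula $\delta^{1/(N-k+1)}$ alone. The sketch below uses hypothetical values $N=300$, $\delta=10^{-10}$ and a made-up function name. It lists the nonzero eigenvalues for a few positions $k$ and checks that they satisfy $z^{N-k+1}=\delta$.

```python
import cmath

def perturbed_eigenvalues(N, k, delta):
    """Nonzero roots of z**(k-1) * (z**(N-k+1) - delta), for real delta > 0."""
    m = N - k + 1
    radius = delta ** (1.0 / m)
    return [radius * cmath.exp(2j * cmath.pi * j / m) for j in range(m)]

N, delta = 300, 1e-10   # hypothetical size and perturbation
radii = {}
for k in (1, 151, 291):
    eigs = perturbed_eigenvalues(N, k, delta)
    m = N - k + 1
    # Each listed eigenvalue solves z**m = delta up to rounding.
    assert all(abs(z ** m - delta) <= 1e-9 * delta for z in eigs)
    radii[k] = abs(eigs[0])
    print(f"k = {k}: {len(eigs)} eigenvalues at modulus {radii[k]:.4f}")
```

Even with a perturbation as small as $10^{-10}$ in the lower left corner ($k=1$), all $300$ eigenvalues land at modulus about $0.93$, whereas the same perturbation at $k=291$ moves only $10$ eigenvalues, and only out to modulus $0.1$.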

Now it gets more shaky, but if we consider a perturbation in every entry of the last row and each is comparable in size, the perturbation in the lower left corner has the largest effect…

(Inspired by a talk by Sjøstrand on distributions of eigenvalues of small random perturbations of large Jordan blocks, see also the book "Spectra and Pseudospectra" by Trefethen and Embree.)

Dirk
  • 12,325
6

I have an answer that is, in content, similar to that of tros443, but with some additional detail. First, I have to make an assumption on the coefficients based on the nature of your "random" polynomial:

Suppose that the numbers $a_0, \dots, a_n$ are chosen uniformly from an interval $[m, M]$ where $M - m \ll m$.

Note that the roots of the polynomial

$$p(x) = \sum_{i = 0}^n a_i x^i$$

are independent of the scaling of $p$, so if we divide by $(m + M)/2$ we may replace this with the following assumption:

Let $\epsilon > 0$ and let the numbers $a_0, \dots, a_n$ be chosen uniformly from the ball $B_\epsilon(1)$ of radius $\epsilon$ around $1$.

In this formulation, it is acceptable for the coefficients to be properly complex. Now, for fixed $a_n = 1$, the transformation $T$ from roots $\{r_1, \dots, r_n\}$ to coefficients $\{a_0, \dots, a_{n - 1}\}$ is given (up to sign) by the elementary symmetric polynomials. We have the following property:

There exists a function $\delta \colon \mathbb{R}_{> 0} \to \mathbb{R}_{> 0}$ such that for all $\epsilon > 0$, we have $T^{-1}(B_\epsilon(1)^n) \subset B_{\delta(\epsilon)}(T^{-1}(\{1\}^n))$ and $\lim_{\epsilon \to 0} \delta(\epsilon) = 0$.

By $B_r(S)$ for a finite set $S \subset \mathbb{C}^n$, we mean the union of the $B_r(s)$ for each $s \in S$. This property follows from the invertibility of $DT_s$ for each $s \in T^{-1}(\{1\}^n)$, where the latter set contains all permutations of the set of roots of

$$c_n(x) = \sum_{i = 0}^n x^i,$$

which are of course the $(n + 1)$'th roots of unity other than $1$; in particular, they are uniformly distributed around the unit circle (apart from the gap at $1$). They are also distinct for each $n$, so $DT_s$ is indeed invertible for each $s$ above. (Proof: the columns of $DT_s$ are, up to sign, the coefficients of the $c_n(x)/(x - \zeta_i)$, where the $\zeta_i$ are the roots of $c_n$, and these are linearly independent polynomials since the roots are distinct.)

The above observation then says that "roots of a polynomial whose coefficients are randomly chosen near 1, will be near the roots of unity". Here, of course, the second "near" is a relative term depending on the leading coefficient and the quantification of the first "near", but it does reproduce your observations that not only are the roots approximately on the circle, but they are, actually, approximately uniform on the circle.

Ryan Reich
  • 7,173
4

One can explore things visually/experimentally and make good discoveries, but theory is needed to justify them. Here, however, are more experiments and speculation. This is somewhat the same thing looked at in different ways, but I think each adds something.

We see that if the (real) coefficients of $p(x)=\sum_0^n a_ix^i$ are drawn randomly from a positive interval then we can normalize to get the same roots with coefficients from $[1-\delta,1+\delta]$ and if $n$ is large enough and $\delta$ small enough relative to each other (whatever that means) the roots will be near the unit circle and almost equally distributed.


Optional example for illustration and checking: The polynomial $\sum_0^{299}z^n$ has roots $z_m=r_me^{i\theta_m}$ for $r_m=1$ and $\theta_m=\frac{2\pi m}{300}$, $1 \le m \le 299.$ I generated a single random polynomial $f(x)=1+\sum_1^{298} a_n x^n+x^{299}$ with the $a_n$ random and uniformly selected from $[0.8,1.2].$ The $299$ roots (actually, half of them) are shown below. Much can be seen but specifically: The roots, in order of increasing argument, are $r_me^{i \theta_m}$ where in all cases $0.978 \lt r_m \lt 1.036$ and $|300\,\theta_m-2\pi m| \lt 0.78.$ Here are the most extreme deviations in argument (in the upper half): $[123, -.7783]$, $[73, .7107]$, $[61, -.7036]$, $[100, -.5640]$, $[56, .5493]$, $[67, -.5482]$, $[102, -.5382]$, $[72, .5213]$, $[117, .5156]$, $[97, .5099]$, $[43, .4866]$, $[86, .4827]$, $[87, -.4749]$

[Figure: the roots of $f$ in the upper half-plane]


Before going on:

  • better to say the unit circle except the neighborhood of $1$. Though we could fix that by multiplying through by $x-1$ and discussing polynomials $x^{n+1}+\sum_1^{n}a_ix^i+a_0$ with $a_0$ as before but the $a_i$ in $[-2\delta,2\delta]$ (but denser near $0$.) Then the roots really are almost equally distributed.
  • maybe it is better to look at complex coefficients with magnitude in $[1-\delta,1+\delta]$ or $[0,1]$ or $\{{0,1\}}$
  • or real coefficients of that form or from one of the sets $\{{-1,1\}},\{{0,1\}},\{{-1,0,1\}}.$

Actually these all relate to each other. Of course small perturbations of coefficients should move roots only a bit. But does that explain how little these moved?


If we are allowed to cook the coefficients to really move a root, it seems best to move $-1$ by picking $n$ odd and making the coefficients alternately $1+\delta$ and $1-\delta$ (starting with $a_0=1+\delta$). Then $-1$ is no longer a root: $p(-1)=(n+1)\delta.$ But $p'(-1) =\frac{n+1}{2} -\frac{n^2+n}{2}\delta$ is so large that we shouldn't have to go far. In fact calculation shows that the root is roughly $-(1+2\delta).$ More precisely, it is exactly $-(1+2\delta+2\delta^2+\cdots)=-\frac{1+\delta}{1-\delta}$ and the other roots seem unchanged. Of course (in hindsight) we should just factor to see that $$p(x)=((1-\delta)x+(1+\delta))\frac{x^{n+1}-1}{x^2-1}.$$
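
A quick numerical check of the factorization is easy to write. This sketch assumes an odd degree $n$, so that $\sum_0^n x^i$ has an even number of terms and vanishes at $-1$, with $a_i=1+\delta$ for even $i$ and $a_i=1-\delta$ for odd $i$. The values $n=11$, $\delta=0.05$ and the helper names are arbitrary illustrations.

```python
import cmath

def horner(coeffs, x):
    """Evaluate sum(coeffs[i] * x**i), coefficients given in ascending order."""
    acc = 0
    for c in reversed(coeffs):
        acc = acc * x + c
    return acc

n, delta = 11, 0.05   # hypothetical values; n is odd
coeffs = [1 + delta if i % 2 == 0 else 1 - delta for i in range(n + 1)]

# The root that started at -1 should now sit at -(1 + delta) / (1 - delta).
moved_root = -(1 + delta) / (1 - delta)
residual_at_moved_root = abs(horner(coeffs, moved_root))

# The (n+1)-th roots of unity other than 1 and -1 should remain roots.
fixed_roots = [cmath.exp(2j * cmath.pi * k / (n + 1))
               for k in range(1, n + 1) if 2 * k != n + 1]
max_residual_fixed = max(abs(horner(coeffs, z)) for z in fixed_roots)

print(moved_root, residual_at_moved_root, max_residual_fixed)
```

Both residuals come out at rounding-error level, and the moved root differs from $-(1+2\delta)$ only at order $\delta^2$.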

So a root can move more than $2 \delta.$ Is that tight? Seems reasonable, but I'm not going to check. We saw only a fraction of that above. For random coefficients in our model, $p(-1)=\sum(a_i-1)(-1)^i$ is the sum of $n+1$ values uniformly drawn from $[-\delta,\delta]$, so it is not that large in relation to $p'(-1)$. So an actual root is likely not far away at all. As far as it goes, that reasoning is valid for any of the other roots of $\sum x^i$. The root $-1$ is special, but only because we are using real positive coefficients. Arbitrary complex coefficients near the unit circle should resolve that.


If the cyclotomic roots don't move "much" then they can't end up "too close" together. But perhaps it is also good to just consider if roots (want to be) separated from each other. We could try to add a new root very near an old one or move two roots until they touch or are near. I'll leave it to you to check that, if the other roots are fixed, we get coefficients almost as large as $2.$ Can one do better moving all the roots?


What is the effect of each coefficient? If we change just one interior coefficient $a_k$ to something extremely huge, then there will be seen to be about $n-k$ "big" roots near equally spaced on the circle of radius $a_k^{1/(n-k)}$ and about $k$ "small" roots near the circle of radius $a_k^{-1/k}.$ We can see why. Also, for $k$ not too near either extreme, and $a_k$ merely kind of huge, we would still have all those roots quite close to the unit circle. It is far from obvious how that carries over allowing all the coefficients moving a moderate amount (randomly). Perhaps it could be said that, as long as things like $(\frac{a_{n-k}}{a_n})^{1/k}$ and $(\frac{a_k}{a_0})^{-1/k}$ are all near $1$, there should be many pullings and pushings, none very large, that usually cancel out. That is very vague and unsupported but it works for me as a motivation. With the right deliberate choices we could shove a particular root or small set of roots. That actually seems like the case above for $x=-1.$
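
The two radii come from a dominant-balance heuristic, sketched here under the assumption that all other coefficients equal $1$: for the big roots $x^n + a_kx^k \approx 0$, and for the small roots $a_kx^k + 1 \approx 0$. The function name and sample values below are illustrative only.

```python
def balance_radii(n, k, a_k):
    """Heuristic radii when one interior coefficient a_k > 0 dominates.

    big:   |x|**(n - k) = a_k    (balance x**n against a_k * x**k)
    small: |x|**k = 1 / a_k      (balance a_k * x**k against the constant 1)
    """
    big = a_k ** (1.0 / (n - k))
    small = a_k ** (-1.0 / k)
    return big, small

n, k = 200, 100   # hypothetical degree and position of the large coefficient
for a_k in (1e2, 1e6, 1e12):
    big, small = balance_radii(n, k, a_k)
    print(f"a_k = {a_k:.0e}: big roots near {big:.4f}, small roots near {small:.4f}")
```

Even for $a_k = 10^{6}$ in the middle of a degree-$200$ polynomial the two radii are only about $1.15$ and $0.87$, which illustrates why a "merely kind of huge" coefficient still leaves the roots fairly close to the unit circle.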


Here is an idea in perhaps more detail than it deserves. If we trace $q(x)=\frac{x^{n+1}-1}{x-1}$ as $x$ moves around the unit circle, we get a path not too hard to describe that touches the origin at the roots and then goes somewhat far away until coming back for the next one. When we perturb the coefficients and look at the position $p(x)$ on the new path, it will differ from the position on the old path by $\sum (a_i-1)x^i$, where these coefficients are random and distributed in $[-\delta,\delta]$. So the $x$ that were (near) roots of $q(x)$ will (usually) be near roots of $p(x)$, and for those $x$ with $|q(x)|$ not that close to $0$, $|p(x)|$ will presumably stay away from $0$ as well. It is possible to cook things to get a big deviation in one or a few places, but that is unlikely to happen at random.

0

Old interesting question with many excellent answers. Posting this since I didn't see it mentioned:

To gain intuition, you can also ask the reverse question: what needs to happen to get many large roots (e.g. norm >2) (or very small roots e.g. norm <.5, but not both)?

Assume that the coefficient for the largest power is fixed to be 1.

Assume that you have many large roots but not many small roots. Their product would be a very large number. You are bounding the coefficients to a range that is relatively very small.

What if we have both many large and many small roots? e.g. $(x-2i \cos(\theta)-2\sin(\theta))^{100}(x-.5i \cos(\rho)-.5\sin(\rho))^{100}$

First, to get real coefficients we need $100 \theta + 100 \rho\in Z$. Not a very likely event. We would also need similar conditions on $\rho$ and their relationship. Looking at the middle coefficients, these would need to cancel each other out, and the only way that can happen is if they are roughly the roots of the same number, which means the norm of the roots can not be very far from each other.