45

A couple of years ago, I came up with the following question, to which I have no answer to this day. I have asked a few people about this, most of my teachers and some friends, but no one had ever heard of the question before, and no one knew the answer.

I hope this is an original question, but seeing how natural it is, I doubt this is the first time someone has asked it.

First, some motivation. Take P any nonzero complex polynomial. It is an easy and classical exercise to show that the roots of its derivative P lie in the convex hull of its own roots (I know this as the Gauss-Lucas property). To show this, you simply write P=ari=1(Xαi)mi where the αi (i=1,,r) are the different roots of P, and mi the corresponding multiplicities, and evaluate PP=imiXαi on a root β of P which is not also a root of P. You'll end up with an expression of β as a convex combination of α1,,αr. It is worth mentioning that all the convex coefficients are >0, so the new root cannot lie on the edge of the convex hull of P's roots.

Now fix P a certain nonzero complex polynomial, and consider Π, its primitive (antiderivative) that vanishes at 0: Π(0)=0 and Π=P. For each complex ω, write Πω=Πω, so that you get all the primitives of P. Also, define for any polynomial Q, Conv(Q), the convex hull of Q's roots.

MAIN QUESTION: describe Hull(P)=ωCConv(Πω).

By the property cited above, Hull(P) is a convex compact subset of the complex plane that contains Conv(P), but I strongly suspect that it is in general larger.

Here are some easy observations:

  1. replacing P (resp. Π) by λP (resp. λΠ) will not change the result, and considering P(aX+b) will change Hull(P) accordingly. Hence we can suppose both P and Π to be monic. The fact that Π is no longer a primitive of P is of no consequence.

  2. the intersection defining Hull(P) can be taken for ω ranging in a compact subset of C: as |ω|, the roots of Πω will tend to become close to the (deg(P)+1)-th roots of ω, so for large enough ω, their convex hull will always contain, say, Conv(Π).

  3. Hull(P) can be explicitly calculated in the following cases: P=Xn, P of degree 1 or 2. There are only 2 kinds of degree 2 polynomials: two simple roots or a double root. Using zaz+b, one only has to consider P=X2 and P=X(X1). The first one yields {0}, which equals Conv(X2), the second one gives [0,1]=Conv(X(X1)).

Also, if Π is a real polynomial of odd degree n+1 that has all its roots real and simple, say λ1<μ1<λ2<<μn<λn+1, where I have also placed P's roots μ1,,μn, and if you further assume that Π(μ2j)Π(μn)Π(μ1)Π(μ2j+1) for all suitable j (a condition that is best understood with a picture), then Hull(P)=Conv(P)=[μ1,μn]: just vary ω between [Π(μn),Π(μ1)]; the resulting polynomial Πω is always split over the real numbers and you get

[μ1,μn]=Conv(P)Hull(P)Conv(ΠΠ(μ1))Conv(ΠΠ(μn))==[μ1,][,μn]=[μ1,μn]

  1. The equation Πω(z)=Π(z)ω=0 defines a Riemann surface, but I don't see how that could be of any use.

Computing Hull(P) for the next most simple polynomial P=X31 has proven a challenge, and I can only conjecture what it might be.

Computing Hull(X31) requires factorizing degree 4 polynomials, so one naturally tries to look for good values of ω, the ω that allow for easy factorization of Πω=X44Xω---for instance, the ω that produces a double root. All that remains to be done afterwards is to factor a quadratic polynomial. The problem is symmetric, and you can focus on the case where 1 is the double root (i.e., ω=3). Plugging in the result in the intersection, and rotating twice, you obtain the following superset of Hull(X31): a hexagon that is the intersection of three similar isoceles triangles with their main vertex located on the three third roots of unity 1,j,j2

QUESTION: is this hexagon equal to Hull(X31)?

Here's why I think this might be.

Consider the question of how the convex hulls of the roots of Πω vary as ω varies. When ω0 is such that all roots of Πω0 are simple, then the inverse function theorem shows that the roots of Πω with ω in a small neighborhood of ω0 vary holomorphically linearly in ωω0: z(ω)z(ω0)ωω0. If however ω0 is such that Πω0 has a multiple root z0 of multiplicity m>1, then a small variation of ω about ω0 will split the multiple root z0 into m distinct roots of Πω that will spread out roughly as z0+c(ωω0)1m, where c is some nonzero coefficient. This means that for small variations, these roots will move at much higher velocities than the simple roots, and they will constitute the major contribution to the variation of Conv(Πω); also, they spread out evenly, and (at least if the multiplicity is greater or equal to 3) they will tend to increase the convex hull around z0. Thus it seems not too unreasonable to conjecture that the convex hull Conv(Πω) has what one can only describe as critical points at the ω0 that produce roots with multiplicities. I'm fairly certain there is a sort of calculus on convex sets that would allow one to make this statement precise, but I don't know see what it could be.

Back to X31: explicit calculations suggest that up to second order, the double root 1 of X44X+3h for |h|<<1 splits in half nicely (here ω=3+h), and the convex hull will continue to contain the aforementioned hexagon.

QUESTION (Conjecture): is it true that Hull(P)=ωMRConv(Πω), where MR is the set of all ω0 such that Πω0 has a multiple root, i.e., the set of all Π(αi) where the αi are the roots of P?

All previous examples of calculations agree with this, and I have tried as best I can to justify this guess heuristically.

Are you aware of a solution? Is this a classical problem? Is anybody brave enough to make a computer program that would compute some intersections of convex hulls obtained from the roots to see if my conjecture is valid?

  • 12
    In other words: If P is a polynomial, what points lie in the convex hull of the roots of every antiderivative of P, besides the roots of P and their convex hull? It's a good question, and your partial results are interesting too, but your write-up is somehow long and hard to read. – Greg Kuperberg Mar 18 '11 at 09:23
  • 5
    Another way to word the question: what is the intersection of the convex hull of level sets z|Q(z)=ω for a polynomial Q? Here P is the derivative of Q. By chance, I've discussed this question a bit with Tan Lei; she made some nice movies of how the convex hulls of level sets vary with ω. (Also, it's fun to look at their diagrams interactively manipulated in Mathematica). If I get my thoughts organized I'll post an answer. – Bill Thurston Mar 18 '11 at 12:13
  • 3
    It's a nice question. A suggestion for readability : replace Π (which looks like a product) with a roman letter. – François Brunault Mar 18 '11 at 12:17
  • 1
    This reminds me Hildebrandt's Theorem (it does not answer the question, however): If P is the characteristic polynomial of a matrix A, the intersection of the numerical ranges W(B) where B runs over the matrices conjugated to A equals the convex hull of the roots of P. – Denis Serre Mar 18 '11 at 15:17

4 Answers4

18

First, a counterexample to your conjecture. Let Π=x4+x3+4x2+4x=x(x+1)(x2+4), so P=4x3+3x2+8x+4. The critical values of ω are 1.06638,3.89455+2.87687i,3.894552.87687i, and by inspection (using Mathematica) we see that for each of these values of ω, Conv(Πω) contains a neighborhood of 0.

Now for a calculus on convex sets. Every convex set is the intersection of a set of halfplanes. Call a halfplane in this collection essential if removing all of the halfplanes in an open set of halfplanes (in the halfplane topology) containing it from our set of halfplanes makes the intersection of the halfplanes in our set bigger. We wish to find a characterization of the essential halfplanes of Hull(P).

First of all, I claim that any essential halfplane of Hull(P) occurs as an essential halfplane of Conv(Πω) for some ω. This follows from continuity - for any open set around our essential halfplane there is some ω, take the limit of a subsequence of these ωs...

Now, suppose that the halfplane Re(x)0 occurs as an essential halfplane of some Conv(Πω), i.e. there are at least two roots of Πω with real part 0, and the rest of the roots have negative real part. If the number of roots on the line Re(x)=0 (counted with multiplicity) is two, then by holomorphicity we can always find a direction to move ω so that either both roots move to the left, or both roots stay on the line Re(x)=0 and move towards eachother. If we can ever make both roots move to the left, then clearly the halfplane Re(x)0 is not an essential halfplane of Hull(P), otherwise we keep pushing the roots towards eachother until either they run into eachother or until a third root hits the line Re(x)=0. In any case, we see that if a halfplane is essential for Hull(P), then there is some ω such that the halfplane is essential for Conv(Πω) and such that at least three roots (counted with multiplicity) of Πω are on the boundary of the halfplane, or two of the roots are equal and Πω has no other roots.

So if we let L be the set of ωs such that three of the roots of Πω lie on a line, we get that Hull(P)=ωLConv(Πω) if degP2.

Edit: Actually, I think there is a problem with this. It's conceivable that two roots are on the line Re(x)=0 and have derivatives (with respect to ω) pointed in opposite directions, such that we can't simply push them towards eachother. For instance, the map from one root to the other root could, locally, look like the fractional linear transform sending the left halfplane to a circle contained in the right halfplane and tangent to the line Re(x)=0 at the other root. So, we may need to enlarge the set L to contain also those ωs for which the ratio of the derivatives of two of the roots (with respect to ω) is a negative real number.

Edit 2: It turns out that this isn't an issue. Call the two roots on the line Re(x)=0 r1 and r2. Suppose that locally, r1(ϵ)=ϵ, r2(ϵ)=imϵ+aϵk+O(ϵk+1), a0, m>0. Note that if we had r2(ϵ)=imϵ, then the intersection of the halfplanes corresponding to r1(ϵ),r2(ϵ) and r1(ϵ),r2(ϵ) would be contained in the halfplane Re(x)0, and the intersection of their boundaries would be located at i/(m+1). Now if k is even, then the correction term shifts the intersection of the boundaries by aϵk/(m+1)+O(ϵk+1), so if we choose ϵ small such that aϵk is real and negative, then we see that the halfplane Re(x)0 is not essential. If k is odd, then if we choose ϵ small such that Re(ϵ)<0 and aϵk is a positive real times i, then r2(ϵ) is shifted up and r2(ϵ) is shifted down, so the intersection of the boundaries will be shifted to the left (draw a picture), so again the halfplane Re(x)0 is not essential.

zeb
  • 8,533
  • Hi Zeb, thank you for your answer. I should use mathematica to convince myself, but I see now that problems may arise when 3 roots are aligned as is the case for 2i,0,2i: they force Hull(P) to lie inside the closed halp plane (z)0, yet all the convex envelopes I considered may very well contain 0 in their inside.

    I understand that you can say any essential halfplane H of Hull(P) arises as one of the essential half planes of Conv(πω): there must half planes Hω close to H, otherwise you'd have an angular point

    – Olivier Bégassat Mar 18 '11 at 18:02
  • where there is no essential half plane, or else all essential half planes of the Πω would contour a sort of disk with positive radius. And therefore there are always essential half planes for certain ω that are close to H – Olivier Bégassat Mar 18 '11 at 18:08
  • if the derivatives with respect to ω of the two simple roots you counsider are such that their ratio is in mathbbCR, then you can indeed push the half plane to the left for a suitable choice of perturbation ω+h, if they are opposites you should be able to push them towards eachother. I don't really understand the problem you mention in your edit. – Olivier Bégassat Mar 18 '11 at 18:33
  • The problem with pushing the roots towards eachother is that the higher order derivatives might screw things up in such a way that we can neither move both roots to the left nor push them directly towards eachother... I found a workaround, though (my second edit). – zeb Mar 18 '11 at 19:39
  • I still wonder if you can at least show that only finitely many such root aligning values of ω need to be considered. Take the example of X31 and it's renormalized primitive X44X. Take ω that gives a multiple root (such as ω=3). Take r<<1 and let θ vary, the two roots for 3+r.eiθ (that split out of the double root 1) will turn around 1, performing a semicircle as the exponential completes a complete circle. Continuity and the near staticness of the 2 other roots show that there will be many values of ω that produce alignment. – Olivier Bégassat Mar 19 '11 at 00:25
  • in fact, there should be 2 distinct 3 root alignment situations for each value of r, corresponding to alignment with the 2 remaining roots. From what I could tell from my calculations involving second order approximation, these alignment situations will not affect the hexagon I described in my question. This also shows that triple alignment is not rare at all. Are there better alignment situations than others? Can we replace the really large intersection over root aligning values (RAV) of ω by a smaller (finite?) intersection? – Olivier Bégassat Mar 19 '11 at 00:32
  • The set of ωs that give root alignment should usually be a one-dimensional subset of the complex plane. As we vary the value of ω along this set, we get a one dimensional family of hyperplanes. My intuition is that there are two cases: either the hyperplanes are moving in the "wrong" direction (think of the set of hyperplanes intersecting a fixed circle at a point on its boundary) and are not essential, or the hyperplanes are moving in the "right" direction (think of the set of hyperplanes containing a fixed circle and tangent to its boundary) and probably are essential. – zeb Mar 19 '11 at 01:06
  • clarification: for the wrong direction, the hyperplanes intersect the fixed circle ONLY at single points on its boundary – zeb Mar 19 '11 at 01:07
15

I. First I want to share some computer experiments of H.H. Rugh. The following image supports the positive answer of the

QUESTION: is this hexagon equal to Hull(X31)?

triangle (as a new user I was not allowed to use image tags).

A scilab program testing this problem can be found at roots-dancing. In particular an example z63z3+z similar to the starting example of Zeb, showing that Hull(P) can be strictly smaller then vConv(fv) for v ranging the critical values of f (shown in blue polygons). See smaller.


II. Here is a proof (communication of Rugh) of the same statement of Zeb (with f=Π):

If we let L be the set of v's such that three of the roots of fv lie on a line, we get Hull(P)=vLConv(fv) if degf3.

The underlying idea is very similar to that of Zeb.

Statements: Let f be a polynomial of degree at least 3. Assume that a0 and b0 are two distinct simple roots of f(z)v0. Then for v in a small neighborhood of v0, there are two simple roots a(v), b(v) of f(z)v with a(v0)=a0 and b(v0)=b0. In this case,

  1. If for some fixed complex number t0,1, the holomorphic function vta(v)+(1t)b(v) is a constant c, then c is a critical point of f and f has a rotational symmetry about c.

  2. If the segment [a0,b0] is a boundary edge of the polygon Conv(fv0) and no other point in the line through a0,b0 is mapped to v0, then

2.1 for v sufficiently close to v0, the segment [a(v),b(v)] is a boundary edge of the polygon Conv(fv), and for any t]0,1[, the map vta(v)+(1t)b(v) is an open mapping.

2.2. The line through a0,b0 is outside vConv(fv).

Proof. 1. Replacing f(z) by f(zc) if necessary, we may assume c=0. Note that ab(f(a)) is defined and holomorphic in a neighborhood of a0, satisfying that b(f(a0))=b0 and f(b(f(a)))=f(a). It follows that ta+(1t)b(f(a))0 so b(f(a))=tat1. Therefore f(a)=f(tat1) in a neighborhood of a, thus in the entire complex plane. Comparing the coefficients we conclude that tt1 is a root of unity and f(0)=tt1f(0). Using tt11 we get f(0)=0. An example is z6+3z45z2.

2.1. The condition means that all the other points in f1(v0) are contained in one of the open half planes delimited by the line through a0,b0. This is clearly an open condition.

Now if ta(v)+(1t)b(v)c, by Point 1 c must be a critical point and a center of symmetry and Conv(fv0) would have been symmetric with respect to c. This is not possible by our assumption that all the other points of f1(v0) (and there is at least one due to the assumption on the degree of f and on the simplicity of a0,b0 as roots of fv0) are on one side of the line through a0,b0.

2.2. We may look at the open set W=vOuter(fv) where Outer(fv) is the complement of Conv(fv). We may assume a0,b0 are on the imaginary axis and all the other points of f1(v0) are on the left half plane. We know already iR[a0,b0]Outer(fv0)W.

Fix some t[0,1] and let z0=ta0+(1t)b0. We may assume z0=0. Now vz(v)=ta(v)+(1t)b(v) is open. Choose a path v(s) such that z(v(s)) is negative real. Then the segment [a(v(s)),b(v(s))] passes through z(v(s)) and remains almost vertical for sufficiently small s, so has z0 on its right side. By 2.1 we may conclude z0Outer(fv(s))W. qed.


III. Finally I want to share some numerical experiments (with the help of Jos Leys) illustrating a refinement (communication by Thurston) of Gauss-Lucas property.

Consider a polynomial f as a branched covering of the complex plane. Denote by C the convex hull of the critical points of f. It is called {\em the critical convex} of f. The following statements are equivalent: (1) For any vC, we have Conv(fv)C.

(2) The map f is surjective on any closed half plane H intersecting C.

(1)(2). Assume f(H)v. Then f1(v) is contained in CH which is an open half plane, in particular convex. Then Conv(f|v) is also contained in CH. So Conv(f|v)C, contradicting (1).

(2)(1). Assume that Conv(f|v) does not contain the entire set C. Then there is a closed half plane H intersecting C but disjoint from Conv(f|v). Then f(H)v, that is, f is not surjective on H, contradicting (2).

Now the refinement (I'll leave the proof to Thurston if somebody requires) is that if one takes a supporting line L of C there is a region on the outer half space of L on which f is a bijection onto C (injective in the interior and bijective on the union of the interior with half of the boundary arc). In fact this region is bounded by two geodesic rays of the conformal metric |f(z)||dz| tangent to L at a critical point (if the critical point is simple, otherwise the region is even smaller). This means that we use the euclidean length in the range to measure tangent vectors in the domain. Geodesics in this metric are the pullbacks by f of straight lines.

The movie bijectivity is made by Jos Leys to illustrate this result.

12

This problem has been considered before:

The notes to chapter 4 of

Rahman/Schmeißer: Analytic Theory of Polynomials, Oxford University Press, 2002

mention this problem, state that conv(P) is a proper subset of hull(P) in general, and give two references:

1) J. L. Walsh: The location of Critical Points of Analytic and Harmonic Functions, AMS Colloquium Publications Volume 34, 1950, p. 72

2) E. Chamberlin and J. Wolfe: Note on a converse of Lucas's theorem, Proceedings of the American Mathematical Society 5, 1954, pp. 203 - 205

I had a look at both of them, the paper of Chamberlin and Wolfe can be obtained online via the AMS for free. The relevant paragraph in Walsh's book is 3.5.1 (starts on page 71). I state the 4 theorems given there for convenience:

1) conv(P) = hull(P) if P is of degree 1 or 2

2) conv(P) = hull(P) if P is of degree 3 and its zeros are collinear

3) There exists a P of degree 3, such that conv(P) is a proper subset of hull(P); example P(z)=z3+1.

4) There exists a P of degree 4 with real zeros, such that conv(P) is a proper subset of hull(P); example P(z)=(z21)2.

Chamberlin and Wolfe prove, that

1) Any point on the boundary of hull(P) is on the boundary of conv(Πω) for some ω.

2) If a side of any conv(Πω) contains a point of hull(P) and only two zeros of Πω , counting multiplicities, then P is of degree 1. (Here a side is equal to conv(Πω), if the zeros of Πω are collinear.)

3) Vertices of conv(P) need not lie on the boundary of hull(P), example Π(z)=z2(z+1)(z22az+1+a2), where a is positive and sufficiently small.

4) Even if P is cubic, hull(P) need not be determined by its primitives with multiple zeros, example P(z) = 4z3+9/2z2+2z+3/2.

While I have corrected what I considered a minor typo in the example Π(z)=z2(z+1)(z22az+1+a2), I did no thorough proofreading.

Walsh gives no special references to further work in section 3.5.1 and the only reference provided by Chamberlin and Wolfe is to Walsh's book cited above.

  • The link to the AMS paper is link . Morris Marden gives result 2) of Chamberlin and Wolfe as exercise 8 to paragraph 6 of the 1966 edition of his "Geometry of polynomials" on page 24 with a reference to Chamberlin and Wolfe. – thomashennecke Aug 11 '15 at 12:02
9

We can characterize those P for which Hull(P)=Conv(P).

First suppose that the roots of P do not lie on a line. We prove that Hull(P)=Conv(P) if and only if there is an antiderivative Q of P for which Q=A2B, and the roots of B lie in the convex hull of A. The "if" is immediate and left to the reader.

Note that a root β of P can lie on the boundary of Conv(Πω) only if β is a root of Πω, or all the roots of Πω lie on a line. This latter case of course is impossible if the roots of P do not lie on a line.

Let β be a root of P. We claim that for every neighborhood N of Π(β) there is a neighborhood M of β such that Conv(Πy)M for every yCN. This certainly holds for any given M when y lies outside a large compact subset of C, so we can think of y ranging over a compact set. For each yΠ(β), there is a ball of positive radius around β in Conv(Πy), and the size of the maximal such ball varies continuously, so the claim follows.

Now, suppose that β and γ are adjacent vertices (extreme points) of Conv(P). Suppose γ is not a root of ΠΠ(β). Then γ lies in the interior of Conv(ΠΠ(β)), and we can find γ in the interior of Conv(ΠΠ(β)) so that ¯γβConv(P)=β. Then ¯γβConv(Πy) for all y in a suitable neighborhood N of Π(β). On the other hand, MConv(Πy) for a suitable neighborhood M of β and all yN. Therefore ¯γβMHull(P), and hence Conv(P)Hull(P).

So if Conv(P)=Hull(P), then Π(β)=Π(γ) for all adjacent vertices β,γ of Conv(P). Letting Q be Π(β) for any extreme point β of Conv(P), we see that every extreme point of Conv(P) is a root of Q, and hence a double root of Q. Moveover, if Conv(P)=Hull(P), then Conv(P)Conv(Q), so Q=A2B, where the roots of B lie in the convex hull of the roots of A, the extreme points of Conv(P).

Now suppose that the roots of P lie on a line. We can assume that P is real (with positive leading term) and all of the roots of P are real as well. Let β and γ be the least and greatest root of P, respectively. Then, by a similar analysis, Hull(P)=Conv(P) if and only if

  1. Πy has all real roots for some value of y,
  2. The roots of ΠΠ(β) have real part at least β, and
  3. The roots of ΠΠ(γ) have real part at most γ.

You can easily verify that 1, 2, and 3 hold when P has degree 3. I would be tempted to conjecture that 2 and 3 always hold (when the roots of P are real).