74

I am very interested in proofs that become shorter and simpler by going to higher dimension in $\mathbb R^n$, or higher cardinality. By "higher" I mean that the proof is using higher dimension or cardinality than the actual theorem.

Specific examples for that:

  1. The proof of the 2-dimensional Brouwer Fixed Point Theorem given by by Aigner and Ziegler in "Proofs from the BOOK" (based on the Lemma of Sperner). The striking feature is that the main proof argument is set up and run in $\mathbb R^3$, and this 3-dimensional set-up turns the proof particularly short and simple.

  2. The proof about natural number Goodstein sequences that uses ordinal numbers to bound from above.

  3. The proof of the Finite Ramsey Theorem using the Infinite Ramsey Theorem.

In fact, I would also be interested in an example where the theorem is e.g. about curves, lattice grids, or planar graphs $-$ and where the proof becomes strikingly simple when the object is embedded e.g. in a torus, sphere, or any other manifold.

Are you aware of proofs that use such techniques?

Claus
  • 6,777
  • 8
    Do you count Buffon's noodle as an example of this? https://en.wikipedia.org/wiki/Buffon%27s_noodle – amakelov May 21 '20 at 06:46
  • 14
    Desargue’s Theorem is a famous example. There are two senses in which a move to something larger gives an easier proof for $\mathbb R^2$: moving to $\mathbb R^3$ and adding points at infinity. – Monroe Eskew May 21 '20 at 06:54
  • 4
    Perhaps also: Showing Borel determinacy using a measurable cardinal (which implies analytic determinacy by a simpler argument). – Monroe Eskew May 21 '20 at 06:56
  • 1
    I‘m not sure if this is exactly what you are asking for, but in the area of mass partitions there are a number of proofs that use some lifting into higher dimensions. For example to prove that an (open) necklace with beads of k colors can be fairly distributed among 2 robbers using k cuts, you can just lift the necklace to the moment curve in k-dimensional space and apply the Ham-Sandwich Theorem. – Patrick Schnider May 21 '20 at 07:10
  • 1
    Here is a nice questions: given 3 disjoint circles of different radii in the plane, show that the 3 meeting points of the 3 pairs of bi-tangent lines to each pair of circles are co-linear. – Uri Bader May 21 '20 at 08:19
  • @UriBader this is an interesting one! Are you aware of a proof taking the "higher order" approach? Maybe some kind of projection from $\mathbb R^3$ into the plane? – Claus May 21 '20 at 08:33
  • 6
    When proving existence of an algebraic closure of a field $K$, one issue which prevents naive applications of Zorn's Lemma is the lack of an ambient structure which would contain all algebraic extensions. An elegant way to get around that is to fix some set $S$ containing $K$ and of cardinality larger than $K$, and consider all algebraic extensions contained in $S$. Then an application of Zorn's Lemma gives an algebraic closure. – Wojowu May 21 '20 at 08:35
  • 1
    @Claus, yes, of course, this is why I put it here. This is not research level, hence it is just a comment . I will not provide the solution, not to take all the fun out, but the main hint is already given by the context. – Uri Bader May 21 '20 at 08:39
  • 42
    There is the nice way of calculating $\int_\mathbb{R} e^{-x^2}dx$ by going planar and using polar coordinates... – Uri Bader May 21 '20 at 08:47
  • @UriBader thanks a lot for your nice examples! Very much appreciated – Claus May 21 '20 at 09:00
  • @amakelov, thanks a lot for your comment! Just to confirm my understanding, you mean taking the noodle as the limit of piecewise linear needles? It is actually an interesting view, I think I should specify better in my question – Claus May 21 '20 at 09:20
  • 1
    @Claus - with the Buffon noodle, we take the original problem referring to a unit-length, straight needle, and we generalize in two directions: 1) to arbitrary length, and 2) to a curved needle ("noodle"). Neither of these is formally generalizing to higher dimension or higher cardinality - we just allow the needle to be longer (so, a finer notion than cardinality is being generalized here) and, if you choose to look at it this way, we give it "one more dimension" of wiggle room. I was just wondering what your point of view on this example is - does it fit your intended criteria? :) – amakelov May 21 '20 at 10:32
  • @amakelov thanks a lot, this makes it clear. Thanks to your comment, I have now updated my question, as in this post I am most interested in the case where the proof uses higher dimension or cardinality than the actual theorem. I should have specified this earlier, my question was not in the direction of generalizing the theorem itself. Thanks again amakelov for helping me clarify! – Claus May 21 '20 at 11:32
  • 1
    Somewhat related question: https://mathoverflow.net/questions/336008/legendary-extra-parameters-to-simplify-a-counting-problem – Sam Hopkins May 21 '20 at 14:39
  • 3
    Another simple example may be reducing higher order differential equations over $\mathbb{R}$ to first order differential equations over $\mathbb{R}^n$ by the standard trick of introducing extra variables, each begin the derivative of the previous one – Andrea Ferretti May 21 '20 at 15:04
  • 1
    Another somewhat related question: https://mathoverflow.net/questions/106705/2d-problems-which-are-easier-to-solve-in-3d/ – Oliver Nash May 21 '20 at 15:13
  • @SamHopkins thank you for providing this link - great. Very much appreciated! – Claus May 21 '20 at 17:42
  • 1
    @OliverNash thanks a lot for your link! Your own example in this thread is great. Thanks again! – Claus May 21 '20 at 17:45
  • 1
    I believe (I'm not a professor) that an example could be A Generalization which is Easier to Prove than a Special Case, by R. A. Rosenbaum, The American Mathematical Monthly, Vol. 70, No. 4 (April, 1963), p. 427. I can to read it from my account of JSTOR. Isn't required a response for my comment, good day. – user142929 May 22 '20 at 10:48
  • 1
    Another somewhat related question: https://mathoverflow.net/questions/271608/17-camels-trick/272143 – Sam Hopkins May 22 '20 at 11:51
  • 1
    Numberphile has a good video that explains something that only applies to a circle by using higher dimensions: https://www.youtube.com/watch?v=6_yU9eJ0NxA – Fabian Röling May 22 '20 at 15:42
  • @SamHopkins Sam, great link, thanks a lot – Claus May 22 '20 at 16:43
  • 1
    This video is a relatively simple, but nice example. – K.defaoite May 22 '20 at 17:06
  • 1
    My professor in Model Theory said that none of the proofs work for Finite Model Theory because you always run out of elements to choose from. – Pål GD May 22 '20 at 21:08
  • 2
    There are entire compendia of real integrals performed using the residue theorem ... – Michael Engelhardt May 23 '20 at 02:18

29 Answers29

58

Whitney's theorem is an example of this. To prove the weak version (i.e. embedding a manifold $M^n$ in $\mathbb{R}^{2n +1}$), you start by using a partition of unity to embed $M^n$ into $\mathbb{R}^{N}$ where $N$ is very large. This is relatively easy to do when $M^n$ is compact and takes a little bit of thought otherwise, but is significantly easier than trying to get an embedding in a lower dimension from scratch. You can then use transversality arguments to show that a generic projection map preserves the embedding of $M^n$ to cut down $N$ until you get to $\mathbb{R}^{2n +1}$.

To get the strong version of the theorem (embedding $M^n$ in $\mathbb{R}^{2n}$), there is another insight needed, which is using Whitney's trick to get rid of double points. As such, it's really the weak version where the high-dimensionality approach is used.

LSpice
  • 11,423
Gabe K
  • 5,374
49

Tarski's plank theorem (1932).

A plank of width $w$ in ${\bf R}^n$ is the closed region between two parallel hyperplanes at distance $w$ from each other.

Q: Can a unit disc in ${\bf R}^2$ be covered with a sequence of planks of total width less than $2$?

Note that a single plank of width $2$ suffices, and can be divided into any number of parallel planks of total width $2$. But it seems conceivable that one might reduce the total width using planks that are not parallel, even if they must overlap. We show that this is impossible by going from a circle in ${\bf R}^2$ to a sphere in ${\bf R}^3$.

A: No. If the total width is $W$ then a unit ball in ${\bf R}^3$ is covered by planks of total width $W$. But by a classical theorem of Archimedes, a plank of width $w$ in ${\bf R}^3$ meets the unit sphere $S$ in a subset of area at most $2\pi w$, with equality if and only if both of the plank's bounding planes intersect $S$. Therefore the planks cover at most $2\pi W$ of the area of $S$. Since $S$ has area $4\pi$, we deduce that $W \geq (4\pi) / (2\pi) = 2$. QED

  • 10
    When I was a grad student, I learned this proof from Paul Chernoff. His formulation of the argument began something like "Imagine that the disk is a manhole and that a bubble of noxious gases is emerging from it." – Andreas Blass May 22 '20 at 02:46
  • 2
    @NoamD.Elkies this is a beautiful example, and striking. Thanks a lot for sharing. AndreasBlass, nice comment, – Claus May 22 '20 at 04:11
  • 5
    A rather obvious comment is that the analogue of Archimedes theorem does not hold in $\mathbb{R}^2$, so moving to $\mathbb{R}^3$ seems necessary here. – Tony Huynh May 23 '20 at 04:13
  • 4
    This argument also shows that if the planks cover each point of the disk $m$ times, the sum of their widths is at least $2m$. This generalization is no longer true for general Bang's theorem (which claims that if planks cover a convex compact set, the sum of their widths is not less than the width of the set). – Fedor Petrov May 23 '20 at 10:31
  • 1
    @Fedor Petrov I see: an equilateral triangle of width $w$ is doubly covered by three planks of width $w/2$ ! Neat. Is that kind of thing possible also for a convex compact set that's also centrally symmetric, in ${\bf R}^2$ or some higher dimension? – Noam D. Elkies May 24 '20 at 23:58
42

(Uri Bader points this out in comments, but it should really be an answer.)

A classic example is the calculation of the one-dimensional integral $\int_{-\infty}^\infty e^{-x^2}dx$ by squaring it, considering that as a two-dimensional integral, and changing to polar coordinates.

29

Hindman's theorem comes to mind: If $\mathbb N$ is partitioned into finitely many pieces, then there is an infinite set $A$ such that not only $A$, but also all sums of finite subsets of $A$ are contained in the same piece of the partition. (This is a slightly simplified statement: see the link for the full version.)

The statement of the theorem does not mention any uncountable sets.

The theorem can be proved in a "purely combinatorial" way, without mentioning any uncountable sets or appealing to topology or algebra, and in fact this was how the original proof by Hindman went. But this proof is very complicated and hard to follow. (In Hindman's own words, "That proof is really only good for punishing graduate students.")

There is a much nicer proof that uses topological dynamics, specifically the shift map on the Cantor space (which has size $2^{\aleph_0}$). This proof can be found in the last chapter of the Graham-Rothschild-Spencer book Ramsey Theory.

But the nicest proof of Hindman's theorem uses an algebraic structure on the compact Hausdorff space $\beta \mathbb N$ (which has size $2^{2^{\aleph_0}}$) to construct a special kind of non-principal ultrafilter (which has size $2^{\aleph_0}$), and then uses the existence of this special ultrafilter to prove the theorem in a few lines.

Will Brian
  • 17,444
  • thanks a lot for this cool example, great. It is completely striking that you go to such a high cardinality to get a simpler proof for a "countable result". And by the way cool quote from Hindman. – Claus May 21 '20 at 17:50
  • 2
    And Hindman's theorem was preceded by Folkman's theorem wis the same except that $A$ instead of being an infinite set is an arbitrarily large finite set. I don't know the original proof of Folkman's theorem, so I'm asking: is the easiest way to prove Folkman's nowadays to prove the stronger Hindman's theorem? – bof May 22 '20 at 02:55
  • 2
    @bof the other contender is to prove the more general finite statement, Rado's theorem. Which is easiest probably depends on the reader's background in Ramsey theory or topology. – Ben Barber May 22 '20 at 08:48
  • 3
    The interesting thing is that the progressively nicer proofs require stronger axioms: Hindman's original proof could be formulated in a weak fragment of second-order arithmetic, whereas the elegant ultrafilter proof requires the ultrafilter lemma (which can be proved in ZFC but not in ZF). – Adam P. Goucher May 22 '20 at 10:20
24

Borel determinacy is a good example. First, it is a fact about reals that actually requires using much larger sets to prove. This was shown by Harvey Friedman, and there is a recorded recent online talk of Menachem Magidor explaining Friedman’s argument. But this is an example where the use of larger cardinalities doesn’t just give an easier proof, but a proof at all. A nice presentation of the proof of Borel determinacy in ZF can be found in the textbook by Kechris.

However, Tony Martin’s argument for Borel determinacy was preceded by his argument for analytic determinacy (analytic being a larger class than Borel), using a measurable cardinal. The argument is simpler than the one for Borel determinacy in ZFC. Analytic determinacy is actually equivalent to the existence of $x^\sharp$ for every real $x$, which I would describe as countable “shadows” of a measurable.

Monroe Eskew
  • 18,133
18

The Max-Cut problem for a graph asks for a subset $S$ of vertices such that the number of edges between $S$ and the complement of $S$ is maximized. This problem is NP-hard. In fact, Håstad showed that even getting within 5.8% of optimal is NP-hard.

However, Goemans and Williamson showed that you can get within 12.2% of optimal in polynomial time by using high-dimensional optimization. We replace the edges with repulsive springs, confine the vertices to a unit sphere, and cut across a random hyperplane. For the unit sphere in 1 dimension (i.e., two points), this process just yields a restatement of the original problem. However, when the dimension of ambient space is equal to the number of vertices, then the quadratic program becomes semidefinite, and relaxation can be done quickly.

S. Carnahan
  • 45,116
  • thanks a lot for this example and the links, it looks like a great one and I will follow the links to get a good understanding! – Claus May 21 '20 at 17:55
17

There is one example from algebra which I'm pretty fond of, namely a proof that every field $K$ (which I assume to be infinite for simplicity) has an algebraic closure.

One issue which prevents naive applications of Zorn's Lemma is the lack of a natural ambient structure which would contain all algebraic extensions. An elegant way to get around that is to fix some set $S$ containing $K$ and of cardinality larger than $K$, and consider all algebraic extensions contained in $S$. Since under the axiom of choice all algebraic extensions of $K$ have the same cardinality, any maximal such extension has to be algebraically closed, so an application of Zorn's Lemma gives existence of an algebraic closure.

Wojowu
  • 27,379
  • thanks for this great algebra example. Very much appreciated! – Claus May 21 '20 at 09:34
  • 5
    To avoid putting field structures on abstract sets, you could prove this theorem by using a large polynomial ring over $K$. The use of Zorn’s Lemma is then hidden in the appeal to a suitable maximal ideal in that polynomial ring. See https://kconrad.math.uconn.edu/blurbs/galoistheory/algclosureshorter.pdf. – KConrad May 22 '20 at 17:28
17

Here is an example from projective geometry. Desargues's theorem states that for two triangles, if the lines connecting their corresponding vertices are concurrent, then the intersection of each pair of corresponding edges are collinear. (Wikipedia states if and only if, but IIRC the converse is an easy corollary.)

In a non-degenerate 3D case (where the triangles are not in the same plane), if we call the triangles $ABC$ and $abc$, then the proof goes something like this: $Aa$ and $Bb$ intersect, so $AaBb$ are on the same plane, and thus $AB$ and $ab$ intersect. Similarly, $BC$ and $bc$ intersect, and $CA$ and $ca$ intersect. These intersections must lie of the intersection of the planes containing $ABC$ and $abc$, which is a line.

A 2D case, which is just a degenerate 3D case, can be viewed as projecting a 3D case onto the plane. Proofs without using 3D geometry often rely on calculations.

Timothy Chow
  • 78,129
Max Xiong
  • 101
  • great example! Thanks for adding this – Claus May 21 '20 at 20:57
  • 2
    Also mentioned by Monroe Eskew in a comment. – Timothy Chow May 22 '20 at 02:26
  • 7
    It is not a coincidence that proofs that do not rely on the 3D structure rely on calculations. One can give a combinatorial definition of a projective plane as two sets $P$ of points and $L$ of lines and a relation $R \subset P \times L$ that denotes which points lie on which lines, satisfying suitable axioms. A central question in the theory is whether such a structure is the projective plane over a skewfield $K$. One can try to costruct $K$ from the geometry: after all, $K$ should be in bijective correspondence with a line minus one point, and one can try to define the operations on $K$... – Andrea Ferretti May 22 '20 at 07:49
  • 7
    ...using intersections. It turns out that this strategy works (hence the projective plane really is $\mathbb{P}^2(K)$) if and only if the Desargues theorem holds. Since there are nontrivial examples of combinatorial projective planes, Desargues theorem is equivalent to being able to introduce coordinates. This is why you see proofs using computations. The interesting thing is that one can define also a combinatorial $n$-space, but, due to your proof, once $n \geq 3$ you only get projective spaces over a skewfield. Hence the only combinatorially interesting case is $n = 2$. – Andrea Ferretti May 22 '20 at 07:52
  • @AndreaFerretti Andrea this is giving great extra perspective, thanks for that! – Claus May 22 '20 at 17:00
15

Liapunov's convexity theorem (1940). Let $\mu_1,\dots,\mu_n$ be finite, atomless, signed measures on a sigma-algebra $\mathcal F$. Then the set $$\big\{\big(\mu_1(A),\mu_2(A),\dots,\mu_n(A)\big) \in \mathbb R^n: A \in \mathcal F\big\}$$ is closed and convex.

Liapounoff, A., Sur les fonctions-vecteurs complètement additives., Bull. Acad. Sci. URSS, Sér. math. 4, 465-478 (1940). ZBL66.0219.02.

In 1966, Lindenstrauss provided a shorter proof. This proof goes into an infinite-dimensional Banach space $X$, then uses the fact that a linear map $X \to \mathbb R^n$ cannot be injective.

Lindenstrauss, Joram, A short proof of Liapounoff’s convexity theorem, J. Math. Mech. 15, 971-972 (1966). ZBL0152.24403.

David White
  • 29,779
Gerald Edgar
  • 40,238
  • 1
    thanks a lot for a great example. This is exactly the kind of simpler proof I am interested in (and I was not aware of this result before) – Claus May 21 '20 at 10:06
14

I will gladly expand my above comment into an answer. I know this example from Matouseks great book „Using the Borsuk-Ulam theorem“.

Consider the Necklace Splitting Problem: two thieves have stolen a necklace (which is open) with beads made from k different gems. They want to divide the necklace in a fair way between them, i.e. in such a way that each thieve gets half of the stones of each type of gem. They further want to do so cutting the necklace as few times as possible.

The Necklace theorem now asserts that they can divide the necklace between them using at most k cuts. A possible proof is the following: place the necklace onto the moment curve in k-dimensional space. By the Ham Sandwich theorem there is a hyperplane splitting each type of gem in half. It can be shown that any hyperplane cuts the moment curve in at most k places, so the Ham Sandwich cut in k dimensions translates to a solution to the original one-dimensional problem.

As a side remark, a combination of lifting and Ham Sandwich cuts can be used to show a number of results about mass partitions. For example, lifting masses in 2D to the unit paraboloid in 3D, the Ham Sandwich theorem shows that there is always a circle (where a line is also a circle, just with infinite radius) which simultaneously bisects 3 masses. Another lifting can be used to show the existence of bisections by algebraic curves of fixed degree, which is the so called „Polynomial Ham Sandwich Theorem“.

As a second side remark, the lifting to the unit paraboloid in 3D can also be used to show that the Lawson flip algorithm to find a Delaunay triangulation terminates, see page 86 of these lecture notes. There is also a neat argument involving lifting in another chapter which is about counting embracing triangles (the lifting comes in at page 152, the fourth page of the chapter).

13

The following is copied from an answer to another question on this site.

Here's an example in planar euclidean geometry. Consider an equilateral triangle of side $a$ and a general point in the plane distant $b$, $c$, and $d$ from the respective vertices. Then

$3(a^4 + b^4 + c^4 + d^4) = (a^2 + b^2 + c^2 + d^2)^2$.

This is an an awful slog to get by planar trigonometry. Even harder to do by trig in three dimensions is the corresponding result for the regular tetrahedron. However, it's easy to get the $(n - 1)$-dimensional result for a regular $(n - 1)$-dimensional simplex of side $d_0$, with vertex distances $d_1 ,..., d_n$ :

$n(d_0^4 +\cdots+ d_n^4) = (d_0^2 + \cdots + d_n^2)^2$.

You can do this by embedding the euclidean $(n - 1)$-dimensional space as the hyperplane of points $(x_1 ,..., x_n)$ in euclidean $n$-space such that $x_1 + \cdots+x_n = d_0/\surd2$. The vertices of the simplex can then be represented as the points $(d_0/\surd2)(1, 0 ,..., 0), ... , (d_0/\surd2)(0 ,..., 0, 1)$ in the hyperplane, and the result drops out in a few lines.

John Bentin
  • 2,427
  • thank you very much for a great example. I can see how cumbersome it must be to prove it in the plane – Claus May 21 '20 at 21:04
10

This one comes to my mind (feel free to edit as I'm missing references, and might be wrong in details)

Let $n$ be a nonnegative integer. Let $F,F'$ be homeomorphic compact subsets of $\mathbf{R}^n$. Then $\mathbf{R}^n-F$ and $\mathbf{R}^n-F'$ have the same number of connected components [to be safe, say same finite number, or both $=\infty$].

This typically applies to Jordan's theorem on closed loops in the plane, and more generally to a topological $(n-1)$-sphere in $\mathbf{R}^n$: the complement has 2 connected components.

The proof, as I remember, consists in proving that a homeomorphism $F\to F'$ can be extended to a self-homeomorphism of $\mathbf{R}^{2n}$ ($\mathbf{R}^n$ being embedded in the standard way as the first $n$ coordinates). And then relating $H^0(\mathbf{R}^n-F)$ (whose dimension, finite or $\infty$, is the number of connected components) to the de Rham cohomology $H^n(\mathbf{R}^{2n}-F)$.

YCor
  • 60,149
  • 5
    This beautiful trick is due to Victor Klee (1955). You can find references in https://mathoverflow.net/q/271846/6451 – BS. May 21 '20 at 21:48
  • @BS. Great link! Thanks a lot – Claus May 22 '20 at 09:10
  • @YCor I wanted to thank you for adding a great example. Just noticed that I erroneously deleted my previous comment – Claus May 22 '20 at 09:15
  • 2
    The same argument also shows that the other Betti numbers of the complement match as well. – ACL May 26 '20 at 19:56
  • A silly question: are they weak homotopy equivalent? –  Jan 29 '21 at 22:24
10

Perhaps I am just partially blind and someone has already said this, but the thing that springs to mind for me is to show that there no percolation (ie no infinite component) at criticality for $\mathbb Z^d$ with $d \ge 3$.

To learn more about this, there is a survey Sixty Years of Percolation by Hugo Duminil-Copin from IHÉS. He is one of the top people in the field. (In fact, he's one of the top young mathematicians in the world -- if he wins a Fields Medal, you heard the prediction here first!) He can the talk at ICM 2018. The final sentence of the abstract reads as follows:

This review is not intended to probabilists...: the target audience is mathematicians of all kinds.

Regarding the history, it is outlined in (the second half of) Section 1.2. Let me summarise a little -- the complete history is not given there. All references which I mention below can be found in the text of Duminil-Copin linked above.

It was originally proved by Hara and Slade for $d \ge 19$ using lace expansion. To quote the reference, "Each few years, more delicate uses of the lace-expansion enable [us] to reduce the dimension". the The current best is $d \ge 11$, due to Fitzner and van der Hofstad. (I think along the way vdH brought $d \ge 19$ down to something like $d \ge 14$.)

For some details on why higher dimensions may be easier, see Section 3.2. Roughly, it is to do with crossing probabilities of simple random walks in $\mathbb Z^d$. It is well-known that $d \ge 3$ implies transience (so return to the origin finitely many times). However, I think you need $d \ge 5$ in order to say that two independent walks intersect only finitely many times. I forget the exact details. For very large $d$, a SRW on $\mathbb Z^d$ looks, at first, a little bit like a SRW on a $d$-regular tree (up until time $o(\sqrt d)$?).


It's an interesting history, showing how these tools originally really on work for sufficiently high dimensions. Unfortunately, $d \ge 3$ is still rather out of reach with current techniques...

Barsky, Grimmett and Newman showed, in 1991, that the analogous claim is true (for all $d \ge 3$) not for $\mathbb Z^d$ but for $\mathbb N \times \mathbb Z^{d-1}$. One would surely think the main conjecture is within touching distance given this. Amazingly, ~30 years later, basically no improvement for small $d$ has been obtained!

Sam OT
  • 560
  • 1
    I found a nice reference: I have included that, and added some of my own comments :) – Sam OT May 22 '20 at 08:47
  • this is a great link, and the text you added gives great insights. Thanks a lot! – Claus May 22 '20 at 09:13
  • 1
    Glad to help! It was interesting for me reading up on it again -- I do discrete probability, so have some idea of the history, but I don't do percolation really. Was good to refresh the memory! (I've also added a note at the very end just now) – Sam OT May 22 '20 at 11:01
  • thanks for a great add, and a comment, because you make this good remark that 30 yrs later still no progress. In a similar direction, same thing for the Collatz Conjecture. This is even harder to believe – Claus May 22 '20 at 11:14
  • 1
    I was asked this in an interview one time! After a few preliminary questions on the algorithm, I was asked if it terminates. Needless to say, I couldn't give a proof! The interviewer quickly said that it's a famous open question – Sam OT May 22 '20 at 11:26
9

The Cauchy problem for the wave equation $$\partial_t^2u=c^2\Delta_xu$$ is not too difficult to solve explicitly in $3$ space dimensions, by the method of spherical means. This yields a close formula for the fundamental solution.

It is much more difficult, if not impossible, to carry out the calculation directly in space dimension $2$. Actually, the explicit solution of the Cauchy problem and the fundamental solution are obtained by extending the initial data to ${\mathbb R}^3$ by $u_j(x_1,x_2)\mapsto v_j(x_1,x_2,x_3):=u_j(x_1,x_2)$ (here $j=0,1$ for the data of $u$ and $\partial_tu$ at initial time). This is called the descent method.

Denis Serre
  • 51,599
8

Not a single theorem per se, but in dynamical systems, it is often very useful to translate questions about properties of a continuous system $\dot{x}=f(x)$ or discrete-time system $x_{k+1}=f(x_k)$, where $x\in\mathbb{R}^n$, to questions about probability distributions or densities over $\mathbb{R}^n$. This is done by studying the associated Perron-Frobenius/Transfer operator that describes the density time evolution.

Arguably, questions such as existence and properties of invariant sets of $f$ are better handled in this infinite dimensional setting. The key point is that the infinite dimensional operators are linear, even if $f$ itself is nonlinear. This brings the spectral theory of linear operators into play.

8

In 3D-graphics, 3D-points are translated into 4D-points using a technique refereed to as "homogeneous coordinates". Then 3D-perspective transformations and coordinate translations (which are non-linear in 3D), become linear in 4D. This allows one to concatenate all the successive transformations into a single linear transformation. This really enables the lightening fast 3D graphics you see today although it was discovered and used quite early on. I remember being amazed when I learned it.

There are correspondingly a host of perspective geometry theorems that are enabled as a result of this linearity in 4D, like the ability to clip at the end of the pipeline to get the same results of clipping at the beginning, etc.

Mike Wise
  • 101
  • thanks a lot, striking! I have to look it up. Do you have a good link to that? – Claus May 22 '20 at 16:56
  • 1
    @Claus: It's very simple, and there is totally no need for "homogenous coordinates". Put the 3d plane in 4d space with one of the coordinates being 1. A translation in that 3d plane can be done via a linear shear in 4d. So everything transform is represented by a matrix operator on 4d vectors. This means that any sequence of transforms is a product of matrices. Using standard knowledge from algorithms and data structures, joining or splitting transform sequences can be done in $O(\log(n))$ time where $n$ is their total size. – user21820 May 23 '20 at 13:28
7

Differential equations of order $n$ in $\mathbb{R}$, like $\frac{d^n}{dt^n}x(t) = F\left(t, x(t), \frac{d}{dt}x(t), \dots, \frac{d^{n-1}}{dt^{n-1}}x(t)\right)$ can be transformed into first order differential equations in $\mathbb{R}^n$.

This is done by defining $x_1(t) = x(t), x_2(t) = x_1'(t), \dots, x_n(t) = x_{n-1}'(t)$ and observing that the equation reduces to $x_n'(t) = F(t, x_1(t), \dots, x_n(t))$, which - together with the relations defining $x_i(t)$ - is a differential equations of the first order in the vector $(x_1(t), \dots, x_n(t))$.

Andrea Ferretti
  • 14,454
  • 13
  • 78
  • 111
  • thanks again for a great example from analysis/differential equations. Great add, thanks a lot! – Claus May 22 '20 at 08:35
7

Not a theorem but a cool result regardless: Given a convex $n$ sided polygon in 2D, give an algorithm to find the largest circle that can fit inside it.

I am not aware of any non messy or particularly efficient approaches in 2D but if you consider the planes in 3D that contain each side length of the polygon and forming a 45 degree angle with the plane of the polygon, then the problem can be solved by finding the point with the largest third coordinate that is underneath all these planes. This can be done very efficiently with a linear program.

7

The symmetry of the octahedron projects to Simpson's Rule.

Recall that Simpson's Rule is the approximation $$ \int_a^b f(x) \, dx \approx \frac{b-a}{6} \Bigl(f(a) + 4 f\bigl(\frac{a+b}{2}\bigr) + f(b) \Bigr) \,, $$ which is exact for $f$ polynomial with $\deg f \leq 3$, and thus true within $2\epsilon(b-a)$ for a function that is approximated within $\epsilon$ by a cubic polynomial. All intervals $[a,b]$ are equivalent under affine-linear change of variable, and these changes of variable preserve Simpson's Rule, so it is enough to consider the special case of the interval $|x| \leq 1$, which is $$ \int_{-1}^1 f(x) \, dx \approx \frac13 \bigl(f(-1) + 4 f(0) + f(1) \bigr). $$

Now let $V = \{ \pm e_1, \pm e_2, \pm e_3 \}$ be the set of six vertices of the standard octahedron inscribed in the sphere $S^2 \subset {\bf R}^3$; and let $G$ be the group of symmetries of $V$, which are the $2^3 3! = 48$ signed permutation matrices of order $3$. Any polynomial function $F: {\bf R}^3 \to {\bf R}$ that is invariant under $G$ and has $\deg F \leq 3$ is a linear combination of $1$ and $x^2+y^2+z^2$. It follows the average of $F$ over $S^2$ equals the average $\frac16 \sum_{i=1}^3 (F(e_i) + F(-e_i))$ of $F$ over $V$. That is, $V$ is a spherical 3-design.

Now apply this to $F$ of the form $F(x_1,x_2,x_3) = f(x_1)$ with $f \in {\bf R}[x]$ of degree at most $3$. For any function $G: S^2 \to {\bf R}$ of the form $G(x_1,x_2,x_3) = g(x_1)$, the average of $G$ over $S^2$ equals the average of $g$ over $[-1,1]$, essentially by the same theorem of Archimedes that I cited in an earlier answer to the same MO question (Tarski's plank theorem of 1932). Of the six points in $V$, four have $x_1=0$ and one each has $x_1 = 1$ or $-1$, so we have recovered Simpson's Rule.

I learned this from Greg Kuperberg; see his article

Numerical cubature from Archimedes' hat-box theorem, SIAM J. Numer. Anal. 44 (2006), 908-935 (arXiv:math/0405366).

"Cubature" is quadrature in higher dimensions (NB "quadrature" = "squaring" as in "squaring the circle"). The article gives many other quadrature and cubature formulas that can be obtained this way by projection from symmetrical designs in higher dimension. For starters, rotating $V$ so two faces are perpendicular to the $x$-axis, or replacing $V$ by the cube $(\pm 1, \pm 1, \pm 1)/\sqrt 3$ (which has the same symmetries, and is thus also a $3$-design), gives the quadrature rule $$ \int_{-1}^1 f(x) \, dx \approx f(-1/\sqrt3) + f(1/\sqrt3) $$ which is again exact for polynomials of degree at most $3$.

Remark: This is arguably a better example than the Tarski plank theorem, not just because we must raise the dimension by $2$ but also because here we make essential use of symmetries of ${\bf R}^3$: for Tarski we could have integrated $dx \, dy \left/ \sqrt{1-x^2-y^2} \right.$ instead of invoking the third dimension.

  • 1
    On my first reading of this, I was confused by "Any polynomial function $F$ ... that is invariant under $G$ ....Now apply this to $F$ of the form $F(x_1,x_2,x_3)=f(x_1)$ ..." since $f(x_1)$ isn't invariant under $G$. Fortunately, it's explained on the first page of the Kuperberg's cited paper, and once it's explained it's obvious. OK, it should have been obvious anyway, but maybe mention it for the sake of other readers who are as sleepy as I am. – Andreas Blass May 24 '20 at 02:36
  • Yes I should rewrite that -- all the more so since there are two very different $G$'s ! But I need to go to sleep now too . . . – Noam D. Elkies May 24 '20 at 03:14
  • @NoamD.Elkies thank you for providing another great example here. It fits my question perfectly and great to have an example from numerics as well – Claus May 24 '20 at 04:28
6

The aperiodic Penrose tiling can be generated as a cross section of a regular tiling in 5 dimensions, which is periodic! See this answer for more details.

Ivan Meir
  • 4,782
  • thanks for a great answer. This is truly a fascinating example as well. Thank you for the link with the details. – Claus May 25 '20 at 12:39
  • How do we need five dimensions? The quasilattice on which the nodes are based becomes periodic in only four. – Oscar Lanzi Aug 28 '22 at 19:35
6

The Toepliz problem, also known as "inscribed square problem" or "square peg problem", asks whether every Jordan curve in the plane contains the vertices of a square.

Vaughan's proof of the rectangular peg problem embeds the plane (and the curve) in $\mathbb{R}^3$ and works towards a contradiction. The proof is really beautiful, and there's a 3blue1brown video that fleshes it out.

Vaughan's idea has then been furthered developed by Hugelmeyer, who embeds the plane in $\mathbb{R}^4$ instead. His proof is really clever and works for smooth curves. Just last week Greene and Lobb posted a symplectic refinement of Hugelmeyer's ideas, leading to a much stronger statement about aspect ratios. Let me also advertise Matschke's work and his survey on the Toepliz problem.

Marco Golla
  • 10,402
  • Thanks a lot for this beautiful example! I like this proof a lot and the video is great as well. Thanks for the other links as well including the update from last week. Great example! – Claus May 25 '20 at 12:33
5

An example in the spirit (in fact a generalization) of the answer by Sam T is the triviality of the $\phi^4$ quantum field theories from lattice approximations. In dimension 5 or more this was done a long time ago by Aizenman and Fröhlich. In dimension 4 this is a brand new result by Aizenman and Duminil-Copin. The reason I said generalization above is that this is part of a general phenomenon in statistical mechanics due to the notion of upper critical dimension. See this review by Gordon Slade for a general mathematical introduction.

4

The famous result by Bang is that if a convex compact set $K\subset \mathbb{R}^n$ is covered by a finite number of open planks, then the sum of their widths is greater than a width of $K$. [The closed, correspondingly open, plank with normal $\theta$ of width $h\geqslant 0$ is the set of points lying between, correspondingly strictly between, two planes at distance $h$, both orthogonal to a unit vector $\theta$. The width $w(K)$ of $K$ is defined as the minimum of widths of closed planks containing $K$.]

If $n=2$ and $K$ is a unit disc, there is a short proof using the lifting to the third dimension, also mentioned in the answer by Noam Elkies : considering $K$ as a section of the unit ball in $\mathbb{R}^3$, for any plank $S$ of width $h$ its lifting $S\times \mathbb{R}=\{(s,x)\in \mathbb{R}^3: s\in S, x\in \mathbb{R}\}$ intersects the unit sphere by a set of area (at most) $2\pi h$ (this fact belongs to Archimedes himself). Since the whole unit sphere, which has area $4\pi$, must be covered by the liftings of our planks, we immediately get the desired lower bound 2 for the sum of their widths, it is strict for open planks.

Now about the general case, we again use the lifting but differently.

We use the following

Lemma. If $K\subset \mathbb{R}^n$ is a convex compact set and $f\in \mathbb{R}^n$, $\|f\|\leqslant w(K)=:h$, then

a) $K\cap (K+f)\ne \emptyset$;

b) $w(K\cap (K+f)) \geqslant h-\|f\|$.

Proof. a) Assume the contrary. Then by Hahn -- Banach $K$ and $K+f$ may be separated by a plane $\langle x,\theta\rangle=c$. That is, $\langle x,\theta\rangle< c<\langle x+f,\theta\rangle$ for any $x\in K$. Thus $K$ may be covered by an open plank of width $\langle f,\theta\rangle \leqslant \|f\|\leqslant h$, a contradiction.

b) Denote $g=f\cdot \frac{h}{\|f\|}$ (if $f\ne 0$, the case $f=0$ is trivial). Then $\|g\|=h$ and by a) there exists a point $a\in K\cap (K+g)$. We have by convexity $$\frac{h-\|f\|}h(K-a)\subset K-a,\\ \frac{h-\|f\|}h(K+g-a)\subset K+g-a,$$ that is equivalent to $a+\frac{h-\|f\|}h(K-a)\subset K\cap (K+f)$. Therefore $w(K\cap (K+f))\geqslant w(a+\frac{h-\|f\|}h(K-a))=h-\|f\|$.

Now assume that $\sum h_i\leqslant h=w(K)$ and the open planks $S_i=\{x:|\langle x-x_0,\theta_i\rangle|< \frac{h_i}2 \}$, $i=1,\ldots,N$, cover $K$. In other words, we assume that there exists a point, called $x_0$, which belongs to all the middle planes of the planks ($x_0$ may belong to $K$ or not).

The $2^N$ sets $K\pm \frac{h_1}2 \theta_1 \pm \frac{h_2}2 \theta_2\pm \ldots \pm \frac{h_N}2 \theta_N$ have a non-empty intersection: this follows from applying Lemma $N$ times (we start with $w((K-\frac{h_1}2\theta_1)\cap (K+ \frac{h_1}2\theta_1))=w(K\cap (K+h_1\theta_1))\geqslant h- h_1$ and proceed naturally, using the obvious inclusions like $(A\cap B)+x\subset (A+x)\cap (B+x)$.)

So, for certain $p\in \mathbb{R}^n$, the set $\Omega=\{p\pm \frac{h_1}2 \theta_1 \pm \frac{h_2}2 \theta_2\pm \ldots \pm \frac{h_N}2 \theta_N\}$ is contained in $K$. Choose the point $q\in \Omega$ on the maximal distance from $x_0$. We should have $|\langle q-x_0,\theta_i\rangle| <h_i/2$ for some $i$, and this implies (easily seen from the picture) that both points $q+h_i\theta_i$, $q-h_i\theta_i$ are further from $x_0$ than $q$. But one of these two points belongs to $\Omega$, a contradiction.

Now a general case. Assume that $K$ is covered by $N$ planks. If the normals of our planks are linearly independent, there middle planes have a common point and we are done. If $N\leqslant n$, we may move our planks a bit so that their normals become linearly independent and they still cover $K$. Finally, if $N>n$, we lift $K$ to a cylinder $C:=K\cdot [0,M]^{N-n}\subset \mathbb{R}^N$ (where $M$ is so large that $w(C)=w(K)$, $M=h$ is enough) and lift the planks $S_i$ to $S_i\times \mathbb{R}^{N-n}$. The problem is reduced to the case which is already done.

Fedor Petrov
  • 102,548
  • thanks a lot for providing! And great that you give the proof for general $n$ to it here! – Claus May 22 '20 at 17:07
  • 1
    Oh yes it is. I searched for Bang before posting, but not for Tarski, my bad. – Fedor Petrov May 22 '20 at 17:18
  • 1
    Thanks @FedorPetrov. It seems that this argument has inspired other results around here as well, like this one by fedja: https://mathoverflow.net/questions/352720/reference-to-a-conjecture-on-unit-vectors-in-euclidean-space – Sandeep Silwal May 22 '20 at 19:02
  • 2
    @SandeepSilwal earlier it inspired fedja also to solve the coefficient problem for Fourier series (“The Bang solution of the coefficient problem”, St. Petersburg Math. J., 9:2 (1998), 407-419) – Fedor Petrov May 22 '20 at 19:54
  • Amazing, I always learn something from fedja. – Sandeep Silwal May 22 '20 at 19:57
3

Given that you asked about planar graph : In graph theory, there is the Heawood conjecture proven in 1968 by Ringel and Youngs:

If a graph $G$ has genius $g>0$ then $$ \chi(G)\leq \left\lfloor \frac{7+\sqrt{1+48g}}{2}\right\rfloor$$

Note that the case $g=0$ (not included in this theorem) would be the four color theorem for planar graph! It is a pretty surprising result to get a relatively simple theorem for any genius $g>0$ but not $g=0$.

3

For a graph $G$, its multi-dimensional characteristic polynomials $\Phi_G=\det(I_x-A)$ where $A$ is adjacency and $I_x=diag\{x_1,...,x_n\}$. It's definition depends on the labeling of the vertices but it is multi-affine real stable. One has $\Phi_{G-v_j}=\frac{\partial \Phi_G}{\partial x_j}$ which gives an intuitive reason why its contraction $\phi_{G-v_j}(x)$ interlaces $\phi_G(x),$ since $f'$ always interlace real rooted $f$.

Also the derivativ formula $\frac{d}{dt} \Phi_G(x_1(t),...,x_n(t))=\sum_{j=1}^n\frac{\partial{\Phi_G}}{{\partial x_j}} \frac{d x_j}{dt},$ implies the formula $ \frac{ d \phi_G(x)}{dx}=\sum_{j=1}^n \phi_{G-v_j} (x),$ if we identify $x_j=x=t$.

Chua KS
  • 487
  • thanks a lot for your example! Can i ask you, please let me know, what's the name of this theorem? – Claus May 24 '20 at 19:05
  • This continued to hold if one replace determinant by permanent, in fact by the immanent attached to any irreducible character of $S_n$. If $\lambda$ is a partition of $n$ and $\Phi_G^\lambda:=imma_\lambda(I_x-A)$, then $ \frac{\partial \Phi^\lambda_G}{\partial x_n}=\sum_{\lambda_1' < \lambda} \Phi^{\lambda_1}{G-v_n}(x_1,...,x{n-1}),$, where $\lambda_1$ range over all partition of $n-1$ whose Young diagram is derive from that of $\lambda$ by removing one border square which remains a valid Young diagram. – CHUAKS Jun 08 '21 at 03:35
  • This is a pure algebraic identity and it holds for all complex matrix $A$ not necessarily adjacency of a graph or Hermitian. – CHUAKS Jun 10 '21 at 01:58
3

There are several examples related to the mathematics of quantum field theory in which using higher-dimensional thinking in physics led to mathematical theorems answering questions which might not have even been asked without said higher-dimensional thinking. (Those proofs don't necessarily use the same higher-dimensional methods, so maybe this isn't exactly what you're looking for. If so, I'm happy to remove this answer.)

One common way to study quantum field theories is "compactification" from a higher-dimensional theory, by saying, e.g., that your $n$-dimensional QFT on a manifold $M$ is the same thing as an $(n+2)$-dimensional QFT on $M\times T^2$. (It doesn't have to be $T^2$). This often explains mysterious properties of the $n$-dimensional QFT in terms of clearer information coming from the $(n+2)$-dimensional QFT (in one example, an $\mathrm{SL}_2(\mathbb Z)$-symmetry on the original theory arising from the mapping class group of that $T^2$ in the $(n+2)$-dimensional theory). Generally, these QFTs aren't mathematically well-defined, but their study still leads to mathematically rigorous questions, and this perspective can help answer them.

Mirror symmetry is a great example. One of its avatars is a collection of conjectures (some of which are now theorems) about six-dimensional Calabi-Yau manifolds, associating to such a manifold $X$ a "mirror" $X^\vee$, another Calabi-Yau $6$-manifold, and equating certain data on $X$ with other data on $X^\vee$. These conjectures arose in physics, where physicists suspected an equivalence between one kind of string theory on $\mathbb R^{1,3}\times X$ and another kind of string theory on $\mathbb R^{1,3}\times X^\vee$. Without that insight, it is very unlikely anyone would've thought to ask the questions leading to mirror symmetry, let alone answer them.

(There's no shortage of other examples, such as the study of anomalous QFTs as boundary theories of invertible theories in one dimension higher, or the use of Theory $\mathfrak X$ to study lower-dimensional mathematical objects…)

Arun Debray
  • 6,766
  • thanks a lot for your contribution based on quantum field theory! I think it is great to have such an example. My thinking, it is not a case where the proof gets simple because it goes "higher" dimension than the theorem, but it is still a great example in that the higher-dim case informs lower-dim results – Claus May 24 '20 at 19:10
3

This answer is quite different in spirit from my other answer, so I've factored it out.

The $n$th bordism group $\Omega_n$ is the abelian monoid of diffeomorphism classes of closed smooth $n$-manifolds under disjoint union, modulo those which bound compact $(n+1)$-manifolds (this is in fact a finitely generated abelian group). There are many variations on this, such as requiring everything to be oriented, or have spin structures, or so on.

One can imagine computing these groups directly using topological or geometric methods, and this works up to dimension 3 or so (e.g. this MO question and its answers, or this paper of Stipsicz), but eventually these methods aren't powerful enough.

Thom and Pontrjagin discovered a very different approach which requires higher-dimensional methods: use the Whitney theorem to embed your manifold $M$ in $S^N$ for some $N$ large enough. The normal bundle of $M$ is classified by a map from the normal bundle to the universal rank-$(N-n)$ vector bundle $V_{N-n}\to B\mathrm O_{N-n}$. One can extend this to a map from $S^N$ to something called the Thom space $T_{N-n}$ of $V_{N-n}\to B\mathrm O_{N-n}$, constructed by adding a basepoint at infinity in a suitable sense. One checks that homotopy classes of maps $S^N\to T_{N-n}$ are in bijection with $\Omega_n$, and now computing bordism groups amounts to computing homotopy groups of this Thom space.

Computing homotopy groups is not easy, but this method scales in $n$ much more nicely than more direct approaches, and Thom completely solved this problem for all $n$. (Many variants of this problem are also completely solved, thanks to work of Wall, Anderson-Brown-Peterson, Milnor, and many more.)

Arun Debray
  • 6,766
3

Let $P$ be a convex polytope in $\mathbb{R}^d$ with vertices $v_1,\dots,v_n\in \mathbb{Z}^d$. A nice trick that helps to visualize, understand and prove that the number of lattice points in dilations $tP$ $(t\in\mathbb{N})$ is a polynomial in $t$, called the Ehrhart polynomial of $P$, is to add one dimension a consider the cone over $P$: $$\mathrm{cone}(P)=\{r_1(v_1,1)+\cdots+r_n(v_n,1)\mid r_1,\cdots,r_n\ge0\}\subset\mathbb{R}^{d+1}.$$ Then the dilated polytope $tP\subset\mathbb{R}^d$ corresponds to the intersection of $\mathrm{cone}(P)$ with the hyperplane $\{(x_1,\dots,x_{d+1})\in\mathbb{R}^{d+1}\mid x_{d+1}=t\}$. This allows to work with some generating functions associated to polyhedra which simplify for cones.

efs
  • 3,099
2

Hopefully, you will agree that infinite dimension is higher dimension.

A fruitful approach to solve a nonlinear problem on a finite dimensional space is to convert it into a linear problem on an infinite dimensional space. There are literally hundreds of examples in that vein. Let me give just five of them.

  • Hurwitz's proof of the isoperimetric inequality in the plane using Fourier series. Convert a geometrical problem in the plane into a problem about complex valued functions.

  • More generally the resolution of the standard pdes (wave, heat) using Fourier series. See a function of two variables with finite dimensional range as a function of one variable with value in an infinite dimensional function space.

  • The theory of Schwartz's distributions. Functions as linear functionals on functions. The fact that every functions become differentiable simplify a lot of computations in mathematical physics. The book of Laurent Schwartz "Mathematics For The Physical Sciences" is full of examples.

  • Koopmanism in dynamical systems. Replace the action of a transformation $T : X \rightarrow X$ on a finite dimensional manifold by the action of the linear operator $f \rightarrow f \circ T$ on a well chose functional space, for example $L^2(X,\mu)$ if $T$ preserves some measure $\mu$. Von Neumann used that method to prove what is now known as the Von Neumann ergodic theorem.

  • Another application of Koopmanism to the isomorphism problem in dynamical systems : show that two rotations on the circle are conjugate by a measurable transformation preserving the Lebesgue measure if and only if their angles are equal or opposite. This is easily done by looking at the spectrum of the Koopman operator, which is an invariant of measurable isomorphism.

coudy
  • 18,537
  • 5
  • 74
  • 134