
Let $\lbrace x_i\rbrace_{i=1}^\infty$ be a sequence of distinct numbers in $(0,1)$. For any $n$, after deleting $x_1,\dots,x_n$ from $[0,1]$ we get $n+1$ subintervals. Let $a_n$ be the maximum length of these subintervals. Is there a sharp lower bound for $\limsup_n na_n$?

Harald Hanche-Olsen
dimo
  • If you delete 1 point you have 2 intervals, so there seems to be a slight glitch in the formulation. But, other than that, the intended answer is presumably 1 for the lower bound. This does not, however, seem like a question related to current research in math, which is what this site is for. General questions on math are welcome at http://math.stackexchange.com –  Jul 29 '13 at 07:38
  • Hmm. I voted to close, but on second thought maybe I was too hasty. The question seems more subtle than it appears! (I edited the question to avoid a minor wrongness pointed out by @quid.) – Harald Hanche-Olsen Jul 29 '13 at 08:23
  • To amplify on my previous comment: Note it says lim sup, not lim inf. If it were the latter, it would be easy. – Harald Hanche-Olsen Jul 29 '13 at 08:33
  • Random thoughts: You can easily achieve the value $2$, by listing all the numbers $j\cdot2^{-k}$ in the order $1/2$, $1/4$, $3/4$, $1/8$, $3/8$, … (a small numerical sketch of this ordering follows the comments). On the other hand, if you get carried away and have $(n+1)a_n=1$ for some $n$, meaning we are looking at $n+1$ intervals of length $1/(n+1)$, then it takes $n+1$ steps to reduce the size of the maximum interval, with the result that $2na_{2n}\approx2$. I think one can do better; I would venture a wild guess that the golden ratio is the answer. – Harald Hanche-Olsen Jul 29 '13 at 08:50
  • The minor misformulation made me a bit quick in my judgement; I partially retract the comment (though the question as presented is still not a good question, lacking any context and motivation, or preliminary thoughts). –  Jul 29 '13 at 09:04
  • I am pretty sure the answer is known; this amounts to planning in advance a strategy to guess a number when one can only ask questions of the form "is it greater than $x$?". The bound asked in the OP is the best possible (over strategies) least favorable (over number to guess) rate of approximation one gets. – Benoît Kloeckner Jul 29 '13 at 09:12
  • That said, the formal answer to the question is certainly yes: there is a sharp lower bound, $\inf_{\{x_n\}} \limsup_n n a_n$. The question really is: what are known upper and lower bounds for this number? For now, we have the obvious lower bound $1$ and the upper bound $2$ given by Harald Hanche-Olsen. – Benoît Kloeckner Jul 29 '13 at 09:16
  • @dimo: I know, but sometimes starting with bounds is the best way to go. The $\ln$ you expect makes me think you should look in the direction of information theory; but again this should be well-known. What are your motivations? Where did you look? – Benoît Kloeckner Jul 29 '13 at 10:39
  • Did you look in the literature, using the reformulation I gave? Also, you still did not give us any motivation. – Benoît Kloeckner Jul 29 '13 at 10:58
  • There is a lot of literature on sequences with 'good' distribution properties. I do not know this well and I did not find anything matching right away. But 'low discrepancy sequences' could be a good keyword to find related literature. –  Jul 29 '13 at 11:00
  • This is sort of related to the work of van Aardenne-Ehrenfest and others on discrepancy, see, e.g., http://mathworld.wolfram.com/DiscrepancyTheorem.html – Gerry Myerson Jul 29 '13 at 12:48
  • I am not entirely sure I understand the formulation. Specifically, where is it guaranteed that $a_n\to 0$ as $n\to\infty$? Consider $x_i= 1/2 + 1/2^{i+1}$ for $i\in\mathbb{N}$. I am under the impression that for any $n\geq 0$, it follows that $a_n\geq 1/2$ and hence the $\limsup$ in question blows up as $n\to\infty$. It seems that you need to at least assume some sort of denseness property of $\{x_i\}_{i\in\mathbb{N}}$ in the interval $[0,1]$ to say anything about the $\limsup$ in question. – JCM Jul 29 '13 at 14:13
  • @JCM You are both right and wrong. Indeed, the lim sup can be infinite as you say, but the question asks for the infimum of all possible values of the lim sup, the infimum taken over all possible sequences. – Harald Hanche-Olsen Jul 29 '13 at 14:31
  • Please do not delete all your comments. (To be clear, it is too late now anyway, but for the future.) –  Jul 29 '13 at 17:28
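As a quick illustration of the dyadic ordering in Harald Hanche-Olsen's comment above, here is a small numerical sketch (not part of the original thread; the helper name `dyadic_points` is ad hoc) that tracks $na_n$ and shows it getting close to $2$:

```python
# Sketch only: the ordering 1/2, 1/4, 3/4, 1/8, 3/8, ... from the comment above.
# After n cuts, a_n is the largest remaining gap in [0, 1]; for this ordering
# n*a_n peaks just before a dyadic level is completed and approaches 2.
from bisect import insort

def dyadic_points(levels):
    """Yield j/2**k for k = 1..levels and odd j < 2**k."""
    for k in range(1, levels + 1):
        for j in range(1, 2 ** k, 2):
            yield j / 2 ** k

cuts = [0.0, 1.0]
worst = 0.0
for n, x in enumerate(dyadic_points(11), start=1):
    insort(cuts, x)
    a_n = max(b - a for a, b in zip(cuts, cuts[1:]))
    if n > 200:  # look only at the tail to estimate the lim sup
        worst = max(worst, n * a_n)

print(worst)  # close to 2
```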

3 Answers


Here is how to achieve $1/\ln 2$. It is a modification of Harald's construction. We take the sequence $1/2$, $1/4$, $3/4$, etc. and apply the transformation $x \to \ln (1+x)/\ln 2$.

So the sequence is $\ln(3/2)/\ln(2), \ln(5/4)/\ln (2),\ln(7/4)/\ln(2),\dots$

The general term $x_n$, for $n=2^a+b$ with $0\le b<2^a$, is

$$\frac{1}{\ln 2}\ln \left( \frac{2^{a+1} + 2 b+1}{2^{a+1}}\right)$$

or, equivalently

$$\frac{\ln( 2^{a+1}+2 b + 1)}{\ln(2)} - a-1$$

Then after $n = 2^a + b$ steps, with $b<2^a$, we have just added the cut at the image of the dyadic point $(2 b+1)/2^{a+1}$, and the largest remaining interval is the image of the leftmost uncut dyadic interval $[(b+1)/2^{a},(b+2)/2^{a}]$ (the images of the uncut intervals further to the right are shorter because the map $x\mapsto\ln(1+x)/\ln 2$ has decreasing derivative, and the images of the already-halved intervals are shorter still), namely

$$\left[\frac{\ln( 2^{a}+ b + 1)}{\ln(2)} - a, \frac{\ln( 2^{a}+ b + 2)}{\ln(2)} - a\right] $$

whose length is:

$$\frac{ \ln \left(1 + \frac{1}{2^{a}+b+1}\right)}{\ln 2}$$

If we multiply by $2^a+b+1\approx n$, this tends to $1/\ln 2$; hence $\limsup_n na_n = 1/\ln 2$ for this sequence.
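As a sanity check (my own sketch, not part of the answer), one can apply $x\mapsto\log_2(1+x)$ to the dyadic ordering numerically and watch $na_n$:

```python
# Sketch: apply x -> log2(1 + x) to the dyadic ordering 1/2, 1/4, 3/4, ...
# and track n * a_n; it should approach 1/ln 2 = 1.4427... from below.
import math
from bisect import insort

cuts = [0.0, 1.0]
n = 0
worst = 0.0
for k in range(1, 12):               # dyadic levels
    for j in range(1, 2 ** k, 2):    # odd numerators at level k
        n += 1
        insort(cuts, math.log2(1 + j / 2 ** k))
        a_n = max(b - a for a, b in zip(cuts, cuts[1:]))
        if n > 200:
            worst = max(worst, n * a_n)

print(worst, 1 / math.log(2))        # both close to 1.4427
```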

To show this is optimal, form a binary tree where the vertices are intervals that appear at any point in a sequence. Each interval at some point splits into two intervals, which gives the tree structure. Label each vertex with the step at which it splits. For any $\beta>\limsup n a_n$, for all but finitely many vertices, the length of the interval labeled $n$ is at most $\beta/n$. The sum of the lengths of the intervals in each row is $1$, so the sum over the first $k$ rows is $k$.

Thus $k$ is at most the sum over $2^k-1$ distinct numbers $n$ of $\beta/n$ plus a constant coming from the finitely many vertices where the length is not at most $\beta/n$. So $k$ is at most the sum over the first $2^k-1$ numbers of $\beta/n$ plus a constant, which is at most $\beta k \ln 2$ plus a constant. So $\beta\geq 1/\ln 2$.
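Explicitly (this is just the standard bound $\sum_{n=1}^{N}1/n\leq 1+\ln N$, with a constant $C$ absorbing the finitely many exceptional vertices):

$$k \;\leq\; C+\sum_{n=1}^{2^k-1}\frac{\beta}{n} \;\leq\; C+\beta\left(1+\ln(2^k)\right) \;=\; C+\beta+\beta k\ln 2,$$

and dividing by $k$ and letting $k\to\infty$ gives $1\leq\beta\ln 2$.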

So $\limsup n a_n \geq 1/\ln 2$.

Will Sawin

Added: As John Bentin mentions, it is in fact known that the result I mention below for the dispersion, and thus the $1/\log 2$ of the OP (mentioned in now-deleted comments), is optimal.

This appears originally in Niederreiter, On a measure of denseness for sequences. In: Topics in classical number theory, North-Holland, Amsterdam, 1984.

A more easily accessible source is the book by Niederreiter, 'Random Number Generation and Quasi-Monte Carlo Methods' (SIAM, 1992). There it is Theorem 6.7.

That there cannot be a better sequence is due to Niederreiter, but the example given there showing it is optimal is attributed to Ruzsa. The example there is $x_1 = 1$ and then $$x_n = \left \{ \frac{\log (2n-3)}{ \log 2} \right \}.$$

The above-mentioned book is available as a scan from Niederreiter's webspace (note: the file is about 10 MB).
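Just to illustrate Ruzsa's formula above (my snippet, not from the original answer): for $n\geq 2$ the terms are the fractional parts of $\log_2$ of the odd numbers $1,3,5,\dots$

```python
# First few terms of Ruzsa's sequence for n = 2..7: the fractional parts of
# log2 of the odd numbers 1, 3, 5, 7, 9, 11.
import math
print([round(math.log2(2 * n - 3) % 1.0, 3) for n in range(2, 8)])
# [0.0, 0.585, 0.322, 0.807, 0.17, 0.459]
```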


In a comment I mentioned the keyword 'low-discrepancy sequence', which indeed is related, but the better keyword, and really what is asked for here, is a 'low-dispersion sequence'.

The dispersion of a finite point set $P=\{x_1, \dots , x_n\}$ in some ambient space $S$ is defined as the supremum over all $x \in S$ of $\min_i d(x,x_i)$, where in the paper I quote $d$ is the distance induced by the infinity-norm (but since we are in one dimension this does not matter much).

Then, for a sequence, one considers the dispersion of its initial segments. Thus the limit superior of $n$ times the dispersion of the first $n$ elements of the sequence is just half of what is asked here (actually, one might have to be slightly careful to include 0 and 1 to be safe and thus multiply by $n+2$, but asymptotically this does not change anything).
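In one dimension the dispersion is easy to compute directly from the definition; here is a small sketch (my own code, taking $S=[0,1]$): interior gaps contribute half their length (the worst $x$ is the midpoint), boundary gaps their full length.

```python
# Dispersion of a finite point set in S = [0, 1]:
#   sup over x in [0, 1] of min_i |x - x_i|.
def dispersion(points):
    p = sorted(points)
    interior = max((b - a for a, b in zip(p, p[1:])), default=0.0) / 2
    return max(p[0], 1.0 - p[-1], interior)

# For a sequence with limsup n * a_n = 1/log 2, the quantity n * dispersion
# tends to roughly half of that, i.e. 1/log 4, matching the value quoted below.
```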

This notion of dispersion, it appears, was introduced by Niederreiter. In his paper 'Low-discrepancy and low-dispersion sequences' (J. Number Theory, 1987), see at the very end, he reports that the then-best construction of a low-dispersion sequence yields $1/\log 4$, which indeed is just half of the $1/\log 2$ that the OP reports. (The construction is not in this paper of Niederreiter, but he cites it; I should add that the result is for arbitrary dimension, and the one-dimensional case might or might not be older.)

There is various related work, so one might find more with an extended search.

  • Another source is Theorem 6.7 in Harald Niederreiter's book Random Number Generation and Quasi-Monte Carlo Methods (SIAM 1992). The logarithmic result is due to Ruzsa and is indeed optimal. – John Bentin Jul 29 '13 at 15:58
  • Thank you for this information! For better visibility I will include it with link and details in the answer. –  Jul 29 '13 at 16:48

Edit: here is a simple proof that $\alpha := \inf \limsup n a_n \geq 1/\ln 2$. Note that this edit is made after solutions have been given in other answers.

Let us consider the sequence of lengths of the subintervals at some step $n$, in decreasing order: $\ell_0\geq\ell_1\geq\dots\geq\ell_n$, so that $\ell_0=a_n$. Assume that we look at one of the worst steps, so that $k a_k\leq n a_n$ for all $k>n$. Then $j$ steps later, at least one subinterval of length at least $\ell_j$ remains, since not all of the $j+1$ longest ones can have been subdivided. That implies $$ \ell_j \leq \ell_0\frac{n}{n+j} = a_n\frac{1}{1+j/n}$$

Moreover, the sum of the $\ell_\ast$ must be $1$, so that $$1=\sum \ell_j \leq a_n\sum_j \frac{1}{1+j/n} \sim a_n \cdot n\int_0^1\frac{\mathrm{d}t}{1+t}= na_n \ln 2$$ This proves $\limsup n a_n \geq 1/\ln 2$.

We also get a hint on how to realize this: at each step, divide the longest interval in a way that makes the distribution of lengths closest to the profile $t\mapsto \frac1{1+t}$ over $[0,1]$.


This is not a complete answer, but gives non-matching upper and lower bounds for $\alpha := \inf \limsup_n n a_n$ (where the infimum is over $\{x_n\}$ and the limit superior over $n$):

$$\frac12+\frac1{\sqrt{2}} \simeq 1.207 \leq \alpha \leq \frac{1+\sqrt{5}}2 \simeq 1.618$$

For the lower bound: let $\{x_n\}$ be fixed and assume that for big enough $n$, all pieces have length at most $a/n$ (where $a\geq1$). Let $\psi$ be a number less than $1$ to be optimized later on, let me ignore roundoff errors (which are negligible), and look at the $n\psi$-th largest piece at step $n$. Denote its length by $\ell$: the total length of the pieces is $1$ and is also at most $a/n\cdot n\psi+\ell n(1-\psi)$, so that $$\ell \geq \frac{1-a\psi}{1-\psi}\cdot\frac1n.$$ Since there are at least $\psi n$ pieces of size at least $\ell$, at a step of the order of $n+\psi n$ there will still be one such piece remaining. Therefore we have $$\frac a{n+\psi n} \geq \frac{1-a\psi}{1-\psi}\cdot\frac1n$$ from which we get $$a\geq \frac{1+\psi}{1+\psi^2}.$$ Optimizing in $\psi$ gives the desired lower bound.
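Spelling out the optimization (my computation): maximizing $f(\psi)=\frac{1+\psi}{1+\psi^2}$ over $\psi\in(0,1)$ gives

$$f'(\psi)=\frac{1-2\psi-\psi^2}{(1+\psi^2)^2}=0 \iff \psi=\sqrt2-1, \qquad f(\sqrt2-1)=\frac{\sqrt2}{4-2\sqrt2}=\frac12+\frac1{\sqrt2}.$$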

For the upper bound: we seek a way to achieve $\limsup n a_n=\phi$ where $\phi=\frac{1+\sqrt{5}}2$ is the golden ratio.

Fix $\lambda \leq 1/2$, and construct $\{x_n\}$ inductively so as to cut at each step the largest interval in two parts of lengths in the ratio $\lambda:1-\lambda$. To simplify the analysis, then choose $\lambda$ so that $(1-\lambda)^2=\lambda$ (i.e. take $\lambda=\frac{3-\sqrt{5}}2$). With this choice, we get that at each step $n$ all pieces are of lengths $\lambda^k$ or $\lambda^{k-1}(1-\lambda)$, for some $k$ that depends only on $n$.

Rather than express $k$ as a function of $n$, let us consider the worst case: there are $n$ pieces of length $\lambda^k$ and only one of length $\lambda^{k-1}(1-\lambda)$. Then, as the sum of the lengths is $1$, writing $\ell=\lambda^k$ and noticing $\phi=\frac{1-\lambda}{\lambda}$, we can compute $\ell=\frac{1}{n+\phi}$, so that (for those $n$ where we are in the worst-case scenario): $$n a_n = \frac{n\phi}{n+\phi}\to\phi.$$
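A small numerical check of this construction (mine, not part of the answer), keeping only the multiset of interval lengths in a heap:

```python
# Sketch: repeatedly cut the longest interval in the ratio lam : (1 - lam),
# with lam = (3 - sqrt 5)/2, and track n * a_n; it approaches the golden ratio.
import heapq, math

lam = (3 - math.sqrt(5)) / 2
phi = (1 + math.sqrt(5)) / 2

heap = [-1.0]                        # max-heap of lengths via negation
worst = 0.0
for n in range(1, 5000):
    longest = -heapq.heappop(heap)
    heapq.heappush(heap, -longest * lam)
    heapq.heappush(heap, -longest * (1 - lam))
    a_n = -heap[0]
    if n > 500:
        worst = max(worst, n * a_n)

print(worst, phi)                    # both about 1.618
```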