20

Example: How can you guess a polynomial $p$ if you know that $p(2) = 11$? It is simple: just write 11 in binary, 1011, and read off the coefficients: $p(x) = x^3+x+1$. Of course, this polynomial is not unique, because $2x^k$ and $x^{k+1}$ take the same value at $x=2$, so for example $2x^2+x+1$ and $4x+x+1$ also satisfy the condition, but their coefficients have greater absolute values!
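
For illustration, this digit-reading trick might be sketched as follows (`digits_poly` is a name chosen here, not from any library):

```python
def digits_poly(y, base):
    """Coefficients (low degree first) of the polynomial obtained by
    reading off the base-`base` digits of y; evaluating the result at
    `base` recovers y, and every coefficient lies in [0, base-1]."""
    coeffs = []
    while y:
        y, r = divmod(y, base)
        coeffs.append(r)
    return coeffs

# p(2) = 11: binary 1011 gives p(x) = x^3 + x + 1
```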

Question 1: Assume we want to find $q(x)$ with integer coefficients, given its values at some set of primes, $q(p_i)=y_i$, such that $q(x)$ has the smallest possible coefficients. How should we do it? Any suggestion/algorithm/software is welcome. ("Smallest coefficients" means: the least maximum of the moduli of the coefficients.)

Question 2: Can one help to guess the polynomial $p$ such that $p(3) = 221157$ and $p(5) = 31511625$ with the smallest possible integer coefficients (least maximum of the moduli of the coefficients)? Does it exist?

(That example comes from question MO404817 on the count of $3\times 3$ anticommuting matrices $AB+BA=0$ over $\mathbb{F}_p$.)

(The degree of the polynomial seems to be 10 or 11, and it seems to be divisible by $x^3$. I have run a brute-force search bounding the absolute values of the coefficients by 3, but no polynomial satisfying these conditions was found. I will increase the bound on the coefficients and run the search again, but the execution time grows too quickly as the bound increases, so brute force may not be a good choice.)

Question 3: Do conditions like $q(p_i)=y_i$ imply some bounds on the coefficients? E.g., can we show that the coefficients must exceed some bound?

markvs
  • 1,802

5 Answers

26

You can certainly do better than brute force by considering modular constraints. If the solution is $p(x) = \sum_i a_i x^i$, then $p(x) - \sum_{j=0}^{n-1} a_j x^j$ is divisible by $x^n$ and $$\frac{p(x) - \sum_{j=0}^{n-1} a_j x^j}{x^n} \equiv a_n \pmod x$$

Solving for $a_0$ in each of the given bases and using the Chinese remainder theorem gives an equivalence class for $a_0$; for each possible value of $a_0$ you can expand similarly for $a_1$; and traversing this tree in order of increasing cost of the coefficients gives a directed search. This works in principle for any cost function which increases when any coefficient increases in absolute value.

This Python code implements the idea and finds two polynomials with sum of absolute values of 29:

$$-2x^3 + 4x^4 + 3x^5 + 11x^6 + x^7 + 5x^8 + 3x^{10} \\ -2x^3 + 4x^4 + 3x^5 - 4x^6 + 9x^7 + 4x^8 + 3x^{10}$$

and one polynomial with maximum absolute value of 7:

$$-2x^3 + 4x^4 + 3x^5 - 4x^6 - 6x^7 - 3x^8 + 7x^9 + 2x^{10}$$

in a small fraction of a second.
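
The gist itself is not reproduced here, but the branching search described above might be sketched along the following lines (a hedged reconstruction with invented names, not the code linked in the answer):

```python
import heapq
from math import lcm

def crt_search(xs, ys, cost=lambda cs: sum(map(abs, cs)), reps=2, limit=10**6):
    """Best-first search for integer coefficients a_0, a_1, ... with
    sum_k a_k * x_i**k == y_i for all i.  Each new coefficient is pinned
    to one residue class mod M = lcm(xs) by the Chinese remainder
    theorem; we branch over 2*reps+1 representatives of that class and
    always expand the cheapest partial solution first, so the first
    complete solution popped is optimal for any cost that can only grow
    as coefficients are appended."""
    M = lcm(*xs)
    heap = [(0, (), tuple(ys))]
    pops = 0
    while heap and pops < limit:
        c, coeffs, res = heapq.heappop(heap)
        pops += 1
        if all(r == 0 for r in res):
            return list(coeffs)  # all residuals exhausted: done
        # CRT: the residue class of the next coefficient modulo M
        a0 = next(a for a in range(M)
                  if all((a - r) % x == 0 for x, r in zip(xs, res)))
        for k in range(-reps, reps + 1):
            a = a0 + k * M
            nxt = tuple((r - a) // x for x, r in zip(xs, res))  # exact division
            heapq.heappush(heap, (cost(coeffs + (a,)), coeffs + (a,), nxt))
    return None
```

For the toy data $p(2)=11$ this returns a coefficient vector with sum of absolute values 3 (e.g. $1+2+8$), and for the question's data it finds a sum-29 solution.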


Some follow-up questions in comments brought me to the realisation that if we're trying to interpolate $\{ (x_i, y_i) \}$ with the $x_i$ coprime, there is at most one polynomial with coefficients in the range $[-\lfloor \tfrac{(\operatorname{lcm} x_i) - 1}2 \rfloor, \lfloor \tfrac{\operatorname{lcm} x_i}2 \rfloor]$, because the tree collapses to a chain. This gives the following algorithm for finding such a polynomial, if it exists:

M := lcm(x_i)
while any y_i is non-zero:
    find a_0 by Chinese remainder theorem
    if a_0 > floor(M / 2):
        a_0 -= M
    output a_0
    update y_i := (y_i - a_0) / x_i
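
A direct Python transcription of this pseudocode might look as follows (a sketch under the stated coprimality assumption; the step cap guarding against non-terminating cycles is an addition here):

```python
from math import lcm

def interpolate_balanced(xs, ys, max_steps=200):
    """Run the chain algorithm: each coefficient is forced, being the
    unique representative of its CRT residue class lying in the
    balanced range around 0 modulo M = lcm(xs).  Returns the
    coefficient list (low degree first), or None if the iteration
    enters a cycle instead of terminating."""
    M = lcm(*xs)
    ys = list(ys)
    coeffs = []
    for _ in range(max_steps):
        if all(y == 0 for y in ys):
            return coeffs
        # CRT: the unique a_0 mod M with a_0 ≡ y_i (mod x_i) for all i
        a0 = next(a for a in range(M)
                  if all((a - y) % x == 0 for x, y in zip(xs, ys)))
        if a0 > M // 2:
            a0 -= M  # balanced representative
        coeffs.append(a0)
        ys = [(y - a0) // x for x, y in zip(xs, ys)]  # exact division
    return None  # stuck in a cycle, e.g. (2, 1) for x_i = {3, 5}
```

For the data of the question this recovers the unique degree-10 polynomial with coefficients in $[-7,7]$; starting instead from $(y_1,y_2)=(2,1)$ it falls into a loop and returns None.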

In the long term, the initial values of the $y_i$ are reduced to negligibility by the repeated division by $x_i$, so eventually each $y_i$ will be reduced to a range which is bounded by $\frac{x_i}{x_i - 1} M$. This means that for a given set of $x_i$ it's possible to compute a finite directed graph to see whether existence is guaranteed.

In the particular case that the $x_i$ are $\{3, 5\}$ there are three cycles, all of them loops: $(0,0) \to (0,0)$ is the terminating loop which indicates that a solution exists, but there are also loops $(2, 1) \to (2, 1)$ and $(-2, -1) \to (-2, -1)$.

Peter Taylor
  • 6,571
  • 2
    That's a really attractive piece of Python code – J.J. Green Sep 25 '21 at 12:43
  • 2
    You are minimizing sum of absolute values; OP asks to minimize largest absolute value. – Gerry Myerson Sep 25 '21 at 13:00
  • @RolandBacher @PeterTaylor it is great that the last coefficients in all the proposed examples are the same: $(-2, 4, 3)$. The first "$-2$" can be guessed by looking at the 3-adic and 5-adic decompositions: we have "1" for $p=3$ and "3" for $p=5$, so "$-2$" is the only way to satisfy both. – Alexander Chervov Sep 25 '21 at 14:13
  • 1
    @GerryMyerson, I misunderstood the original text. The approach should work for a wide class of cost functions including any $L^k$ norm; there may be a risk with the $L^\infty$ norm of building arbitrarily long chains of small norm but in practice it works for the example, as the updated code demonstrates. – Peter Taylor Sep 25 '21 at 14:27
  • Thank you very much! It is really very helpful, and so is the Python code! By the way, the new (third) polynomial is the same as in Roland Bacher's answer. – Alexander Chervov Sep 25 '21 at 14:27
  • @AlexanderChervov, yes, it seems that Roland Bacher's answer was indeed optimal. – Peter Taylor Sep 25 '21 at 14:27
  • Does your method imply that there is NO polynomial of degree 10 with coefficients less than 7 (in absolute value)? (Since the maximal coefficient of the one you found is 7.) – Alexander Chervov Sep 25 '21 at 14:32
  • 2
    @AlexanderChervov, it's stronger than that: this is the only polynomial with coefficients less than 8. – Peter Taylor Sep 25 '21 at 14:34
  • Wow! How can this be seen? – Alexander Chervov Sep 25 '21 at 14:37
  • 2
    Working mod 15, every equivalence class contains only one number in the range $[-7, 7]$, so the tree is pruned to just a chain. – Peter Taylor Sep 25 '21 at 14:40
  • Can we give a bound for the coefficients as input to your algorithm? So if we take coefficients in $[-8,8]$, I guess it will again be the only solution? Since your other examples contain a coefficient of 9 or more. – Alexander Chervov Sep 25 '21 at 15:23
  • No, if you allow $[-8, 7]$ there's an infinite family: $-2x^3 + 4x^4 + 3x^5 - 4x^6 - 6x^7 - 3x^8 -8x^9 -5x^{10} - 8x^{11} (\sum_{i=0}^j x^i) + 7x^{12+j} - x^{13+j}$ for $-1 \le j \in \mathbb{Z}$. – Peter Taylor Sep 25 '21 at 15:31
  • I mean degree = 10. These examples seem to have higher degree. – Alexander Chervov Sep 25 '21 at 15:39
  • 1
    At degree 10 there's one solution with max abs coeff 7, one with 9, eleven with 11, eleven with 12, twenty-six with 13, ... – Peter Taylor Sep 25 '21 at 15:43
  • Thank you very much ! – Alexander Chervov Sep 25 '21 at 15:49
  • Gist updated to demonstrate answers to these additional questions; you can run it online at https://tio.run/#python3 but a direct link embedding the code is too long for a comment. – Peter Taylor Sep 25 '21 at 15:59
  • 1
    Please take a look: https://mathoverflow.net/q/404817/10446 – Alexander Chervov Sep 25 '21 at 20:46
  • Very interesting new follow-up part of the answer. So the bound depends only on the $x_i$ and does not depend on the $y_i$. Can you estimate the degree? – Alexander Chervov Sep 26 '21 at 17:53
  • @MaxAlekseyev, no. Change range(m) to range(-M*x//(x-1), M*x//(x-1)+1) and change ((r[i]-c)//x[i])%x[i] to (r[i]-c)//x[i]. – Peter Taylor Sep 26 '21 at 20:01
  • Sorry, the range should use M//2 rather than M. – Peter Taylor Sep 26 '21 at 20:10
  • @MaxAlekseyev, it's the evolution of $y[i]$ as we recover more coefficients. – Peter Taylor Sep 26 '21 at 20:50
  • @AlexanderChervov, it seems that (for $\{3, 5\}$) the degree should be at most $5 + \max \log_{x_i} y_i$. Possibly this can be tightened slightly. Also, (again for $\{3, 5\}$) half of the vertices in the graph end in a solution, so I think it's a reasonable heuristic that the probability of a solution is very close to one half. – Peter Taylor Sep 26 '21 at 23:08
  • Remark: if we have just one prime, then your suggestion corresponds to the naive argument which I mentioned at the beginning: if $q(p) = y$, we can just take the $p$-adic decomposition and UNIQUELY get a polynomial with coefficients $0 \le \cdot < p$ (or use negatives, $-p/2 < \cdot < p/2$), and the degree is $\log_p(y)$. Quite nice. – Alexander Chervov Sep 27 '21 at 08:22
  • @PeterTaylor: Thanks for the corrections, but I have a concern about the bound for $y_i$. We have a guarantee for $y_i$ to decrease while $\frac{y_i-a_0}{x_i} \leq \frac{y_i+M/2}{x_i}< y_i$. That is, if $y_i > \big\lfloor \frac{M}{2(x_i-1)}\big\rfloor =: U$, it will decrease, and sooner or later it will end up in the interval $[-U,U]$. Where does $\frac{x_iM}{x_i-1}$ come from? – Max Alekseyev Sep 27 '21 at 23:33
  • And here is an updated graph https://sagecell.sagemath.org/?q=wmwcvc – Max Alekseyev Sep 27 '21 at 23:40
  • @MaxAlekseyev, you're right, the bound can be tightened to $\frac{M-1}{x_i-1}$. – Peter Taylor Sep 28 '21 at 08:41
  • @PeterTaylor: It looks like you lost factor of 2 in the denominator. Anyway, $\big\lfloor \frac{M}{2(x_i-1)}\big\rfloor$, which I stated above, is more tight. – Max Alekseyev Sep 28 '21 at 13:16
  • @MaxAlekseyev, it's a bound for the range. The factor of 2 comes from bounding instead the absolute value assuming that the representatives of the equivalence classes are chosen symmetrically around zero. – Peter Taylor Sep 28 '21 at 14:07
  • (and the floor is only valid if $M$ is odd). – Peter Taylor Sep 28 '21 at 14:15
  • Thanks again! Please, when you have some free time, can you check some issues I have with the code: https://gist.github.com/pjt33/c942040a9b8fa9bacc7f5d1bccc7ac2b#gistcomment-3915246 – Alexander Chervov Oct 03 '21 at 18:01
16

Perhaps not the best possible polynomial, but a very good polynomial can generally be computed by the LLL algorithm: let $I$ be an integral polynomial taking the given values at the given points, and let $\Lambda$ be the lattice (in $\mathbb Z^{d+1}$ for a suitably high value of the degree $d$; coordinates are of course coefficients) of integral polynomials with roots at the prescribed set of points. We search for a short vector in $I+\Lambda$. Adding $I$ to a basis of $\Lambda$ and running the LLL algorithm should give a good polynomial among a basis of short vectors most of the time. (It may sometimes happen that it returns only short vectors of the form $kI+P_0$ with $P_0$ in $\Lambda$ and with $k \neq \pm 1$. Changing the value of $d$ and rerunning LLL perhaps solves the issue.)
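
The lattice can be written down explicitly: for points $x_1,\dots,x_n$, the integral polynomials of degree at most $d$ vanishing at all $x_i$ have $\mathbb Z$-basis $x^k\prod_i(x-x_i)$ for $k=0,\dots,d-n$. Here is a Python sketch of that construction only (the LLL reduction itself would need a library such as fpylll or Sage; `linear_interpolant` is a helper invented here that covers just the two-point case with integral slope):

```python
def mul_linear(poly, p):
    """Multiply a polynomial (coefficients low degree first) by (x - p)."""
    res = [0] * (len(poly) + 1)
    for i, c in enumerate(poly):
        res[i + 1] += c
        res[i] -= p * c
    return res

def poly_eval(poly, x):
    return sum(c * x**i for i, c in enumerate(poly))

def lattice_basis(points, d):
    """Z-basis of the lattice of integer polynomials of degree <= d
    vanishing at all the given points: x^k * prod(x - p), k = 0..d-n,
    each padded to a coefficient vector of length d+1."""
    prod = [1]
    for p in points:
        prod = mul_linear(prod, p)
    n = len(points)
    return [[0] * k + prod + [0] * (d - n - k) for k in range(d - n + 1)]

def linear_interpolant(x1, y1, x2, y2):
    """Particular integral interpolant a + b*x through two points,
    valid when (y2 - y1) is divisible by (x2 - x1)."""
    b, r = divmod(y2 - y1, x2 - x1)
    assert r == 0, "no integral linear interpolant for this data"
    return [y1 - b * x1, b]
```

Feeding the interpolant together with the basis rows to an LLL routine then searches for a short vector in $I+\Lambda$, as described above.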

Added details (after some experimentation): the problem with results giving small multiples of the required evaluations can be avoided by adjoining an additional very large coordinate to the coefficients of $I$ and by setting this coordinate to $0$ for all elements of $\Lambda$. The Maple implementation of LLL returned $$2x^{10}+7x^9-3x^8-6x^7-4x^6+3x^5+4x^4-2x^3$$ for the data of the OP. (I worked with degree $40$.) Unfortunately not a solution with coefficients in $\{-3,\ldots,3\}$, but not very far off.

(Messy maple code for the OP example:

with(IntegerRelations);
d:=50;
ii:=10^8+t*interp([3,5],[221157,31511625],t);
li:={[seq(coeff(t*ii,t^i),i=1..d+2)]};
for i from 2 by 1 to d do
    po:=t*(-t^i+interp([3,5],[3^i,5^i],t)):
    li:=li union {[seq(coeff(t*po,t^i),i=1..d+2)]};
od:
u:=LLL([seq(li[i],i=1..nops(li))]);
vector(nops(u),i->sum('u[i,j]*3^(j-2)','j'=2..d+2)/221157);
po:=sum('u[nops(u),j]*x^(j-2)','j'=2..d+2);

)

Added after a comment from Max Alekseyev: his comment is correct ( https://www.goodreads.com/quotes/123557-god-s-final-message-to-his-creation-we-apologize-for-the ). An optimal polynomial (with minimal sum of squared coefficients) is (up to sign) given by a closest lattice point in $I+\Lambda$ to the orthogonal projection of the origin onto the affine hyperplane $I+\mathbb R\otimes_{\mathbb Z}\Lambda$. The trick of adding a large last coordinate to the coordinates of $I$ (or of multiplying $I$ by a large constant $c$ as in Alekseyev's answer) moves this affine hyperplane away from the hyperplane containing the codimension-one lattice $\Lambda$ and ensures that the last generator of an LLL basis is in $I+\Lambda$ (up to sign) and close to the line $\Lambda^\perp$ perpendicular to $\Lambda$ in the ambient space spanned by $I$ and $\Lambda$.

Roland Bacher
  • 17,432
10

Given $p(x_i)=y_i$ for $i\in\{1,2,\dots,n\}$, finding a polynomial of fixed degree $d$ can be posed as finding a closest vector to $(cy_1,\dots,cy_n,\underbrace{0,\dots,0}_{d+1})^T$ in the lattice spanned by the $(d+1)$ column vectors $$\begin{bmatrix} c\cdot V_{d+1}(x_1,\dots,x_n)\\ I_{d+1} \end{bmatrix}, $$ where $V_{d+1}$ is the Vandermonde matrix (of size $n\times (d+1)$), $I_{d+1}$ is the identity matrix, and $c$ is a large constant. By the choice of $c$ we can guarantee that a closest vector will have the form $(cy_1,\dots,cy_n,a_0,a_1,\dots,a_d)^T$, and these $a_i$ will be small and deliver us a required polynomial $p(x) = \sum_{i=0}^d a_i x^i$.
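
A small Python sketch of the instance construction (just the basis and target; actually solving CVP needs an external library, and the names below are chosen for illustration):

```python
def cvp_instance(xs, ys, d, c):
    """Basis columns and target vector for the CVP formulation:
    column j is (c*x_1^j, ..., c*x_n^j, e_j), where e_j is the j-th
    standard basis vector of length d+1, and the target is
    (c*y_1, ..., c*y_n, 0, ..., 0)."""
    cols = [[c * x**j for x in xs] + [int(i == j) for i in range(d + 1)]
            for j in range(d + 1)]
    target = [c * y for y in ys] + [0] * (d + 1)
    return cols, target
```

If $a_0,\dots,a_d$ are the coefficients of a polynomial with $p(x_i)=y_i$, then $\sum_j a_j \cdot (\text{column } j)$ matches the target exactly in the first $n$ coordinates, so the distance to the target is just the $\ell^2$ norm of the coefficient vector.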

As for how to solve CVP in practice, see Package for the Closest Vector Problem (CVP)?

Max Alekseyev
  • 30,425
5

Here is an online SageMath implementation of @PeterTaylor's idea. It uses recursively enumerable sets and MapReduce, and is automatically parallelized by Sage (when available).

As an example, it computes all polynomials $p$ of degree at most 12 with coefficients bounded by 10 in absolute value such that $p(3)=221157$ and $p(5)=31511625$.

It also confirms (when run with max_degree = +oo) that there is only one polynomial with the coefficients bounded by $7$ in absolute value.

Max Alekseyev
  • 30,425
  • Other answers show two different polynomials with coefficients bounded by 7. I did not check them. Somebody must have made a mistake (it could be me). – user132647 Sep 26 '21 at 05:50
  • 2
    @user132647, I gave it in little-endian order and Roland gave it in big-endian order, but it's the same polynomial, and there's a simple proof of uniqueness in the comments on my answer. – Peter Taylor Sep 26 '21 at 06:43
  • Oof, now I feel stupid... Thanks for pointing out my mistake! – user132647 Sep 26 '21 at 06:52
3

In the $\ell^{\infty}$ norm for the coefficients, the problem of finding a polynomial $p(x)=\sum_{k=0}^d c_kx^k$ satisfying $p(x_i)=y_i$ for $i\in\{1,2,\dots,n\}$ can also be naturally posed as an integer linear program for finding integer $c_k$ and the best bound $b$: $$\begin{cases} \textrm{minimize}\ b\\ -b \leq c_k\leq b & \text{for}\ k\in\{0,1,\dots,d\}\\ \sum_{k=0}^d c_k x_i^k = y_i & \text{for}\ i\in\{1,2,\dots,n\} \end{cases} $$

Here is a sample implementation of this approach in Sage.
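
To make the objective concrete, here is a tiny pure-Python brute force over the same feasible set (only workable for very small degree and bound; the real work is of course done by the ILP solver):

```python
from itertools import product

def best_bound_bruteforce(xs, ys, d, b_max=5):
    """Smallest b such that some integer coefficients c_0..c_d with
    |c_k| <= b satisfy sum_k c_k * x_i**k == y_i for all i; returns
    (b, coefficients) or None if no b <= b_max works."""
    for b in range(b_max + 1):
        for cs in product(range(-b, b + 1), repeat=d + 1):
            if all(sum(c * x**k for k, c in enumerate(cs)) == y
                   for x, y in zip(xs, ys)):
                return b, list(cs)
    return None
```

For the toy data $p(2)=11$ with $d=3$ it finds the best bound $b=1$ (e.g. $x^3+x+1$).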

Max Alekseyev
  • 30,425
  • 2
  • For $d=10$, best_poly([3,5],[221157,31511625],10) gives $2z^{10} + 7z^9 - 3z^8 - 6z^7 - 4z^6 + 3z^5 + 4z^4 - 2z^3 + 2z - 6$ (Best bound: 7), while for $d=20$ best_poly([3,5],[221157,31511625],20) gives $z^{11} + z^{10} - z^9 + z^8 - z^7 + z^6 + z^5 + z^4 + z^3 + z^2 + z + 1$ (Best bound: 1). – Pablo H Sep 28 '21 at 13:56
  • @PabloH strange – Alexander Chervov Sep 28 '21 at 14:07
  • @PabloH [ 221980., 56972656.] — the second polynomial seems to give different values than required. – Alexander Chervov Sep 28 '21 at 14:17
  • 1
  • @PabloH: That is an issue with the ILP solver (by default SageCell uses GLPK), which sometimes does too much relaxation and inadvertently violates some of the given constraints. This can be remedied by tightening solver parameters or switching to an exact solver (which may be time-costly). – Max Alekseyev Sep 28 '21 at 17:59
  • 2
    @PabloH: I've checked with a locally installed Gurobi solver, and with degree=20 it produces the same polynomial as with degree=10. – Max Alekseyev Sep 28 '21 at 18:06
  • By the way, does this mean that LLL can be applied to some class of ILP problems? – Alexander Chervov Sep 28 '21 at 19:03
  • @AlexanderChervov: Yes and no. In this context, LLL solves the problem in the $\ell^2$ norm, while ILP does so in the $\ell^\infty$ norm; LLL solves the problem approximately, ILP exactly. – Max Alekseyev Sep 28 '21 at 19:48