1. Introduction
Sidon asked (see [Reference Erdős7, Reference Sidon20]) whether there exists a set
$A\subseteq \mathbb{N}$
such that
$A + A = \mathbb{N}$
(i.e.
$A$
is an additive basis of order
$2$
) and for all
$\varepsilon \gt 0$
,
$$\lim _{N\to \infty }\frac {|\{(a,a')\in A^2 \, : \, a+a' = N\}|}{N^{\varepsilon }} = 0.$$
Erdős [Reference Erdős7] answered Sidon’s question by showing that there exists an additive basis of order
$2$
which, in fact, satisfies the stronger bound
$$\limsup _{N\to \infty }\frac {|\{(a,a')\in A^2 \, : \, a+a' = N\}|}{\log {N}} \lt \infty .$$
It is a major open problem whether there exists an additive basis of order
$2$
for which the factor of
$\log {N}$
in the denominator can be replaced by an absolute constant; Erdős and Turán [Reference Erdős and Turán6] famously conjectured that this is impossible.
Erdős’s proof of the existence of
$A$
is randomized; in modern notation, one simply includes the number
$n$
in the set
$A$
with probability proportional to
$C(\log n)^{1/2}n^{-1/2}$
. Kolountzakis [Reference Kolountzakis12] derandomized (a variation of) Erdős’s proof in the sense that one can deterministically generate the elements of
$A\cap \{0,\ldots , N\}$
in time
$N^{O(1)}$
. We remark that a number of variants of the original result of Erdős have been developed including results of Erdős and Tetali [Reference Erdős and Tetali11] which prove the analogous result for higher order additive bases and results of Vu [Reference Van21] regarding economical versions of Waring’s theorem.
The focus of this work is on ‘explicit’ constructions. Erdős several times [Reference Erdős7–Reference Erdős9] asked for an explicit set
$A$
which affirmatively answers Sidon’s question and, in fact, offered
$\$100$
for a solution [Reference Erdős9]. We note that if one takes
$A$
to be the set of squares, then
$A+A$
contains all primes which are
$1\,\mathrm{mod}\,4$
and by the divisor bound
$A+A$
has multiplicities bounded by
$N^{o(1)}$
. Therefore, if one is willing to assume strong number-theoretic conjectures, one can take
$A$
to be the set of numbers
$n$
which are within
$O((\log n)^{O(1)})$
of a square. The purpose of this note is to present an explicit construction unconditionally.
Given a set
$A$
which is either a subset of
$\mathbb{N}$
or
$\mathbb{Z}/q\mathbb{Z}$
for some
$q$
, we denote by
$\sigma _A(n)$
the number of representations
$n = a+a'$
or
$n \equiv a+a' \,\mathrm{mod}\,q$
, where
$a, a' \in A$
.
Theorem 1.1. There is an explicit set
$A \subset \mathbb{N}$
and absolute constants
$C, c \gt 0$
such that for every
$n \in \mathbb{N}$
, we have
$1 \le \sigma _A(n) \le C n^{c/\log \log n}$
.
Our starting point is an explicit construction of a set
$A_p \subseteq \mathbb{Z}/(p^2 \mathbb{Z})$
due to Ruzsa [Reference Ruzsa18] (for a prime
$p \equiv 3,5 \,\mathrm{mod}\,8$
) such that
$A_p + A_p = \mathbb{Z}/(p^2 \mathbb{Z})$
and
$\sigma _{A_p}(r) = O(1)$
for all
$r \in \mathbb{Z}/(p^2 \mathbb{Z})$
. Given this set and a sequence of squares of primes
$\mathbf{b} = (p_1^2,p_2^2,\ldots )$
,
$A$
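is given by
$$A = \big \{ \overline {a_k \ldots a_1}^{\mathbf b} \, : \, k\ge 1,\ a_i \in A_{p_i} \text{ for } 1\le i\le k-1,\ a_k\in \{0,\ldots , p_k^2-1\}\big \}.$$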
Here the overline denotes a generalized base expansion in base
$\mathbf b$
, that is
$$\overline {a_k \ldots a_1}^{\mathbf b} = \sum _{i=1}^{k} a_i \prod _{j=1}^{i-1} b_j,$$
where
$b_j = p_j^2$
. In words, the set is given by specifying that all but the leading digit of the generalized base expansion lie in the specially constructed sets
$A_p$
. By working upward from the smallest digit, one can use the property that
$A_{p} + A_p$
covers all residues in
$\mathbb{Z}/(p^2 \mathbb{Z})$
to see that all natural numbers are represented. The fact that no number is represented too many times is similarly derived using that
$A_p + A_p$
is ‘flat’, in particular that the multiplicities are bounded by
$p^{o(1)}$
. We remark that generalized bases have been utilized in a variety of questions related to Sidon sets, including works of Ruzsa [Reference Ruzsa19], Cilleruelo, Kiss, Ruzsa and Vinuesa [Reference Cilleruelo, Kiss, Ruzsa and Vinuesa4], and Pilatte [Reference Pilatte17].
The construction above is clearly ‘explicit’ in the traditional mathematical sense (the sets
$A_p$
are given by the union of
$3$
parabolas over
$\mathbb{F}_p\times \mathbb{F}_p$
and then projecting; see Lemma 2.2). Furthermore, we can also examine the word ‘explicit’ from the computational complexity perspective. In analogy with the long and established line of work on explicit Ramsey graphs [Reference Barak, Rao, Shaltiel and Wigderson2, Reference Cohen5, Reference Li15, Reference Li16], we say that a set
$A \subset \mathbb{N}$
is explicit if one may test membership
$n\in A$
in
$(\log n)^{O(1)}$
-time, that is, polynomial in the number of digits. We note that for the purpose of simply obtaining an upper bound of
$N^{o(1)}$
in Theorem 1.1, one can actually choose the primes
$p$
sufficiently small (e.g. of size
$\log \log \log N$
say) and find a suitable set
$A_p$
by brute force enumeration. However, Ruzsa’s [Reference Ruzsa18] construction is ‘strongly explicit’ (i.e. membership can be tested in time
$O((\log p)^{O(1)})$
).
The current bottleneck in Theorem 1.1 (given Ruzsa’s construction) is finding the smallest prime in an interval
$[N,2N]$
. Under strong number-theoretic conjectures (e.g. Cramér’s conjecture), finding such a prime would take time
$O((\log N)^{O(1)})$
due to the AKS primality testing algorithm [Reference Agrawal, Kayal and Saxena1]. Assuming this to be the case (or under a more traditional ‘mathematical’ definition of explicit), we can choose the primes more carefully to obtain an improved upper bound of
$\exp (O((\log N)^{1/2}))$
(see Section 2.2). The limiting feature of our construction now is that in the top block, one is forced to allow ‘all possibilities’. We believe obtaining an explicit construction achieving
$\sigma _A(N) \leq \exp ((\log N)^{\varepsilon })$
or better would be of substantial interest.
Notation
Throughout this paper we let
$[N] = \{0,\ldots , N-1\}$
and
$\mathbb{N} = \{0,1,2,\ldots \}$
. We let
$\lfloor x\rfloor$
denote the largest integer less than or equal to
$x$
. We use standard asymptotic notation, for example,
$f \lesssim g$
if
$|f(n)| \le C |g(n)|$
for some constant
$C$
and all large enough
$n$
. We usually denote by
$c, C$
absolute constants which may change from line to line.
2. Proof of Theorem 1.1
We formally define the notion of a generalized base.
Definition 2.1. Let
$\mathbf{b} = (b_1,b_2,\ldots )$
be an infinite sequence of integers such that
$b_i\ge 2$
for all $i$. Given any integer
$x\in \mathbb{N}$
there exists a representation
$x = \overline {a_n\ldots a_1}^{\mathbf{b}}$
with
$0\le a_i\le b_i - 1$
such that
$$x = \sum _{i=1}^{n} a_i \prod _{j=1}^{i-1} b_j.$$
Here an empty product (when
$i=1$
) is treated as
$1$
.
Remark. If one requires
$a_n \neq 0$
(i.e. forbids leading zeros), the representation is unique. When
$b_j = g$
for all
$j$
, we recover precisely the base-
$g$
expansion.
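As an illustration of Definition 2.1, the following minimal Python sketch converts between a natural number and its digits in a generalized base. The function names `to_digits` and `from_digits` are hypothetical and not part of the paper; digits are stored least significant first, and a finite list of bases is assumed to be supplied by the caller.

```python
def to_digits(x, bases):
    """Return [a_1, ..., a_n] (least significant first) with
    x = sum_i a_i * b_1 * ... * b_{i-1} and 0 <= a_i < b_i.
    The expansion has no leading zero, except that 0 is returned as [0]."""
    if x < 0:
        raise ValueError("x must be a natural number")
    digits = []
    for b in bases:
        x, r = divmod(x, b)   # peel off the digit for the current base
        digits.append(r)
        if x == 0:
            return digits
    raise ValueError("not enough bases supplied for this x")


def from_digits(digits, bases):
    """Inverse of to_digits: evaluate the generalized base expansion."""
    value, weight = 0, 1      # weight is the empty product 1 when i = 1
    for a, b in zip(digits, bases):
        value += a * weight
        weight *= b
    return value
```

For instance, with the (illustrative) bases 9, 25, 121 the number 1000 has digits 1, 11, 4, since 1 + 11 · 9 + 4 · 225 = 1000; with all bases equal to 10 one recovers the usual decimal digits.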
A crucial piece in our construction is an ‘economical’ modular additive basis of order
$2$
over
$\mathbb{Z}/(p^2\mathbb{Z})$
due to Ruzsa [Reference Ruzsa18, Theorem 1]; the precise constant
$M$
in the result below has been studied in [Reference Chen3].
Lemma 2.2. There exists an absolute constant
$M\ge 1$
such that the following holds. Consider a prime
$p$
such that
$p \equiv 3,5 \,\mathrm{mod}\,8$
. There exists a set
$A_p\subseteq \mathbb{Z}/(p^2 \mathbb{Z})$
such that for all
$r\in \mathbb{Z}/(p^2\mathbb{Z})$
, we have
$1\le \sigma _{A_p}(r)\le M$
. Furthermore, given
$p$
and
$x\in \mathbb{Z}/(p^2 \mathbb{Z})$
, one can check whether
$x\in A_p$
in time
$O((\log p)^{O(1)})$
.
For completeness, and especially in order to discuss the second part of the statement, we present the proof of Lemma 2.2 in Section 2.1.
We next need the following basic fact about deterministically finding primes, which is immediate via (say) the Sieve of Eratosthenes. Using a more sophisticated algorithm of Lagarias and Odlyzko [Reference Lagarias and Odlyzko14], one may obtain a run time of
$O(N^{1/2+o(1)})$
in the statement below; runtimes of the form
$O(N^{o(1)})$
remain a major open problem.
Lemma 2.3. Let
$N \ge C_{2.3}$
. Then one may produce the smallest prime
$p\in [N,2N]$
such that
$p \equiv 3\,\mathrm{mod}\,8$
in time
$O(N^{1+o(1)})$
.
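The sieve-based search behind Lemma 2.3 can be sketched as follows. This is only an illustration under the assumption that a plain Sieve of Eratosthenes up to 2N is acceptable; the function name `least_prime_3_mod_8` is hypothetical. It returns `None` when the interval contains no suitable prime, which can happen for small N (the lemma only claims existence for N above a constant).

```python
def least_prime_3_mod_8(N):
    """Smallest prime p in [N, 2N] with p % 8 == 3, or None if there is none.
    Uses a Sieve of Eratosthenes on [0, 2N]; assumes N >= 1."""
    if N < 1:
        raise ValueError("N must be at least 1")
    limit = 2 * N
    is_prime = bytearray([1]) * (limit + 1)
    is_prime[0:2] = b"\x00\x00"          # 0 and 1 are not prime
    i = 2
    while i * i <= limit:
        if is_prime[i]:
            # cross out i*i, i*i + i, ... in one slice assignment
            is_prime[i * i::i] = bytearray(len(range(i * i, limit + 1, i)))
        i += 1
    for p in range(N, limit + 1):
        if is_prime[p] and p % 8 == 3:
            return p
    return None
```

The running time is dominated by the sieve, which is near-linear in N, in line with the bound stated in the lemma.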
We now are in position to give the proof of Theorem 1.1.
Proof of Theorem 1.1. Let
$f \, : \, \mathbb{N} \rightarrow \mathbb{N}$
be an arbitrary monotone increasing function such that
$f(k) \ge C_0$
for some large constant
$C_0$
and all
$k$
. Let
$p_1 \lt p_2 \lt \ldots$
be a sequence of primes such that
$p_k$
is the least prime in
$[f(k), 2f(k))$
with
$p_k \equiv 3 \,\mathrm{mod}\,8$
.
Define
${\mathbf b} = (b_1, b_2, \ldots )$
by setting
$b_{k} = p_k^2$
for all
$k \ge 1$
. We are going to define our set
$A$
in terms of its expansion in the generalized base
$\mathbf b$
. Namely, for each
$k\ge 1$
let
$A_k \subset \{0, 1, \ldots , p_k^2-1\}$
be the set from Lemma 2.2 (where we lift elements
$\,\mathrm{mod}\,p_k^2$
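to their integer representatives) and consider the set
$$A = \bigcup _{k\ge 1} \big \{ \overline {a_k a_{k-1}\ldots a_1}^{\mathbf b} \, : \, a_i \in A_i \text{ for } 1\le i\le k-1,\ a_k\in \{0,\ldots , b_k-1\}\big \}.$$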
We begin by showing that
$A + A = \mathbb{N}$
. For any
$n \in \mathbb{N}$
, we (uniquely) write
$n = \overline {n_k \ldots n_1}^{\mathbf b}$
for some
$k \geq 1$
; we will construct the representation
$n = a+a'$
for
$a, a' \in A$
digit by digit. First, since
$A_1$
is an order
$2$
additive basis mod
$b_1$
, there exist
$a_1, a_1'\in A_1$
such that
$n_1 \equiv a_1 +a'_1 \,\mathrm{mod}\,b_1$
. Let
$c_1 = \big \lfloor \frac {a_1+a'_1}{b_1}\big \rfloor \in \{0,1\}$
be the carry bit. Next, there exist
$a_2, a'_2\in A_2$
such that
$n_2 - c_1 \equiv a_2 + a_2' \,\mathrm{mod}\,b_2$
. As before, define the carry bit
$c_2$
and continue in the same fashion to produce sequences of digits
$a_1, \ldots , a_{k-1}$
and
$a'_1, \ldots , a'_{k-1}$
and a carry bit
$c_{k-1} \in \{0,1\}$
. Finally, let
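$a_k = n_k - c_{k-1} \in \{0, \ldots , b_k-1\}$
and consider the elements
$$a = \overline {a_k a_{k-1}\ldots a_1}^{\mathbf b} \quad \text{and}\quad a' = \overline {a'_{k-1}\ldots a'_1}^{\mathbf b}.$$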
By construction, we have
$n = a+a'$
and
$a, a' \in A$
.
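The digit-by-digit argument above can be turned into a short procedure. The sketch below is illustrative only: it assumes the caller supplies a finite list of primes that are 3 mod 8 (for example the hypothetical choice 11, 19, 43, not the sequence from the proof), it builds each digit set by brute force from the construction recalled in Section 2.1, and it propagates the incoming carry through every digit. The names `ruzsa_set` and `represent` are not from the paper.

```python
def ruzsa_set(p):
    """Residues mod p^2 of B_p + {-p, 0, p} (brute force, see Section 2.1)."""
    B = {z + 2 * p * ((k * z * z) % p) for k in (3, 4, 6) for z in range(p)}
    return {(b + s) % (p * p) for b in B for s in (-p, 0, p)}


def represent(n, primes):
    """Return (a, a2) with a + a2 == n, where all digits of a and a2 below
    the top digit of n lie in the corresponding sets A_k."""
    bases = [p * p for p in primes]
    digits, x = [], n
    for b in bases:                      # digits n_1, ..., n_k of n
        x, r = divmod(x, b)
        digits.append(r)
        if x == 0:
            break
    else:
        raise ValueError("n is too large for the supplied primes")
    a = a2 = 0
    weight, carry = 1, 0
    for p, b, n_i in zip(primes, bases, digits[:-1]):
        A_i = ruzsa_set(p)
        target = (n_i - carry) % b       # need d + d2 = target (mod b)
        d = next(u for u in sorted(A_i) if (target - u) % b in A_i)
        d2 = (target - d) % b
        carry = (d + d2 + carry) // b    # carry is always 0 or 1
        a += d * weight
        a2 += d2 * weight
        weight *= b
    a += (digits[-1] - carry) * weight   # top digit of a is unrestricted
    return a, a2
```

Calling `represent(n, [11, 19, 43])` for any n below 121 · 361 · 1849 returns a pair summing to n; the search for the digit pair always succeeds because each digit set is an additive basis modulo the corresponding base.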
Next, we bound the number of possible representations
$n = a + a'$
with
$a,a' \in A$
. Write
$n = \overline {n_k\ldots n_1}^{\mathbf b}$
,
$a = \overline {a_{\ell }\ldots a_1}^{\mathbf b}$
, and
$a' = \overline {a'_{\ell '}\ldots a'_1}^{\mathbf b}$
, where
$\ell , \ell ' \le k$
are the digit lengths of
$a$
and
$a'$
. We may assume that
$\ell \le \ell '$
(this costs us a factor of 2 in the number of representations). Since
$a, a' \in A$
, we have
$a_i \in A_i$
for
$i \le \ell -1$
and
$a'_i\in A_i$
for
$i \le \ell '-1$
but the top digits
$a_\ell$
(respectively,
$a'_{\ell '}$
) can be arbitrary elements of
$\{0,\ldots , b_{\ell }-1\}$
(respectively,
$\{0,\ldots , b_{\ell '}-1\}$
). By Lemma 2.2, we can choose
$a_1, a_1'$
such that
$n_1 \equiv a_1+a_1' \,\mathrm{mod}\,b_1$
in at most
$M$
ways. Given a choice of
$a_1, a_1'$
, there are at most
$M$
pairs
$a_2, a_2'$
with
$n_2-c_1 \equiv a_2+a_2' \,\mathrm{mod}\,b_2$
, where
$c_1 = \big \lfloor \frac {a_1+a_1'}{b_1}\big \rfloor \in \{0,1\}$
is the carry. Continuing in this fashion for
$j=1, \ldots , \ell -1$
, we get that there are at most
$M^{\ell -1}$
ways to fix the first
$\ell -1$
digits
$a_1,\ldots , a_{\ell -1}$
and
$a_1',\ldots , a_{\ell -1}'$
. We can fix
$a_{\ell }$
and
$a'_{\ell }$
in at most
$b_\ell$
ways. Given this choice, the digits
$a'_{\ell +1}, \ldots , a'_{\ell '}$
are uniquely determined. Putting this together, we obtain the following upper bound on the number of representations
$n=a+a'$
:
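$$\sigma _A(n) \le 2\sum _{\ell =1}^{k} M^{\ell -1} b_{\ell } \le 8 M^{k} f(k)^2, \tag{2.2}$$
where we used
$b_\ell \le b_k = p_k^2 \le (2f(k))^2$
and
$M\ge 2$
(which we may assume by enlarging $M$). On the other hand,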
$b_j = p_{j}^2\ge f(j)^2$
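and so
$$n \ge \prod _{j=1}^{k-1} b_j \ge \prod _{j=1}^{k-1} f(j)^2 \ge f(\lfloor k/2\rfloor )^{k/2}. \tag{2.3}$$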
Hence,
$k \le \frac {2\log n}{\log f(\lfloor k/2\rfloor )}$
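and substituting this in Eq. (2.2), we obtain the bound
$$\sigma _A(n) \le 8 f(k)^2 \exp \bigg (\frac {2\log M \cdot \log n}{\log f(\lfloor k/2\rfloor )}\bigg ).$$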
Note that the right hand side is
$n^{o(1)}$
for any sufficiently slowly growing function
$f$
. Owing to the computational considerations in the next paragraph, we take
$f(k) = k$
which leads to
$k \lesssim \frac {\log n}{\log \log n}$
and
$\sigma _A(n) \lesssim n^{c/\log \log n}$
.
Finally, we quickly verify that testing membership
$a \in A$
can be done in time
$O((\log a)^{O(1)})$
. Indeed, given
$a \in \mathbb{N}$
, we can compute all primes
$p_k$
for
$k \le c\log a$
in time
$(\log a)^{O(1)}$
(Lemma 2.3), compute the base
$\mathbf b$
expansion
$a = \overline {a_k\ldots a_1}^{\mathbf b}$
in time
$(\log a)^{O(1)}$
, and check that
$a_j \in A_j$
for
$j=1, \ldots , k-1$
in time
$O(k (\log f(k))^{O(1)})$
using Lemma 2.2.
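The membership test just described can be sketched as follows. As a simplifying assumption, the digit sets are tabulated by brute force and cached, so this version does not achieve the stated polylogarithmic running time; replacing the set lookup by the arithmetic test of Section 2.1 would. The prime list is passed in by the caller, and the names `ruzsa_set` and `in_A` are hypothetical.

```python
from functools import lru_cache


@lru_cache(maxsize=None)
def ruzsa_set(p):
    """Brute-force table of A_p as residues mod p^2 (see Section 2.1)."""
    B = {z + 2 * p * ((k * z * z) % p) for k in (3, 4, 6) for z in range(p)}
    return frozenset((b + s) % (p * p) for b in B for s in (-p, 0, p))


def in_A(x, primes):
    """Test x in A: expand x in the base (p_1^2, p_2^2, ...) and require
    every digit except the leading one to lie in the matching set A_k."""
    digits = []
    for p in primes:
        x, r = divmod(x, p * p)
        digits.append(r)
        if x == 0:
            break
    else:
        raise ValueError("x is too large for the supplied primes")
    # zip pairs digit a_j with prime p_j; the leading digit is skipped
    return all(d in ruzsa_set(p) for d, p in zip(digits[:-1], primes))
```

With the illustrative primes 11 and 19, every x below 121 is accepted (it consists of a single, unrestricted digit), while 121 + d is accepted exactly when d lies in the digit set for 11.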
2.1 Modular construction and computational details
We record the proof of Lemma 2.2, following Ruzsa [Reference Ruzsa18]. For
$n \in \mathbb{Z}, p \in \mathbb{N}$
we write
$(n \,\mathrm{mod}\,p)$
for the unique
$n' \in \{0, 1, \ldots , p-1\}$
congruent to
$n$
modulo
$p$
. The following is exactly [Reference Ruzsa18, Lemma 3.1].
Lemma 2.4. Let
$p\equiv 3,5\,\mathrm{mod}\,8$
. Define
$B_p \subseteq \{0,\ldots , 2p^2\}$
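by
$$B_p = \bigcup _{k\in \{3,4,6\}} \big \{ z + 2p\,(kz^2 \,\mathrm{mod}\,p) \, : \, z\in \{0,\ldots , p-1\}\big \}.$$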
We have
$\sup _{n\in \mathbb{Z}}\sigma _{B_p}(n) \le 18$
and furthermore, for all
$0\le n\lt p^2$
, at least one of the six numbers
$$n-p,\quad n,\quad n+p,\quad n+p^2-p,\quad n+p^2,\quad n+p^2+p$$
appears in the set
$B_p + B_p$
.
Given Lemma 2.4, we prove Lemma 2.2.
Proof of Lemma 2.2.
Let
$B_p' = B_p + \{-p, 0, p\}$
(viewed as a subset of
$\mathbb{Z}$
) and set
$$A_p = \{\, x \,\mathrm{mod}\,p^2 \, : \, x\in B_p' \,\} \subseteq \mathbb{Z}/(p^2\mathbb{Z}).$$
Applying Lemma 2.4, we immediately have:
• $B_p' + B_p' \subseteq [-2p,5p^2]$.
• For all $0\le n\lt p^2$, one of $n$ or $n+p^2$ appears in $B_p' + B_p'$.
• We have that $\sup _{n\in \mathbb{Z}}\sigma _{B_p'}(n) \le 9 \cdot 18 = 162$.
Noting that
$n\equiv n + p^2 \,\mathrm{mod}\,p^2$
, it follows that
$A_p + A_p = \mathbb{Z}/(p^2\mathbb{Z})$
. Furthermore we have that
$\sup _{r\in \mathbb{Z}/(p^2\mathbb{Z})}\sigma _{A_p}(r) \le 6 \cdot 9 \cdot 18 = 972$
; this is immediate as
$B_p' + B_p' \subseteq [-2p,5p^2]$
and there are at most
$6$
representatives in this interval for a given residue modulo
$p^2$
.
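For small primes the conclusion of Lemma 2.2 can be checked by direct enumeration. The sketch below is a sanity check rather than part of the proof; it assumes that representations are counted as ordered pairs, and the helper names `build_sets` and `sigma_mod` are hypothetical.

```python
from collections import Counter


def build_sets(p):
    """Return (B_p, B_p', A_p) following the construction in this section."""
    B = {z + 2 * p * ((k * z * z) % p) for k in (3, 4, 6) for z in range(p)}
    B_prime = {b + s for b in B for s in (-p, 0, p)}     # subset of Z
    A_p = {b % (p * p) for b in B_prime}                 # residues mod p^2
    return B, B_prime, A_p


def sigma_mod(A, q):
    """List of sigma_A(r) for r = 0, ..., q-1, counting ordered pairs
    (a, a') in A x A with a + a' = r (mod q)."""
    counts = Counter((a + a2) % q for a in A for a2 in A)
    return [counts[r] for r in range(q)]


def check_lemma(p, M=6 * 9 * 18):
    """True if 1 <= sigma_{A_p}(r) <= M for every residue r mod p^2."""
    _, _, A_p = build_sets(p)
    sig = sigma_mod(A_p, p * p)
    return min(sig) >= 1 and max(sig) <= M
```

Running `check_lemma(p)` for primes such as 5, 11, 13 and 19 exercises both the covering property and the multiplicity bound; for primes this small the multiplicity bound is far from tight, since the set has at most 9p elements.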
We now discuss the time complexity of testing membership in
$A_p$
. Given
$p$
, and
$x\in \mathbb{Z}/(p^2 \mathbb{Z})$
, we consider the unique representative
$x' \in \{0,\ldots , p^2-1\}$
. Noting that
$B_p'\subseteq [-p, 2p^2 + p]$
, it suffices by construction to test whether at least one of
$x' - p^2, x', x'+p^2, x'+2p^2$
is in
$B_p'$
. This is equivalent to checking whether at least one of
$x' + \{-p^2,0,p^2,2p^2\} + \{-p,0,p\}$
is in
$B_p$
; in particular, one of at most
$12$
distinct given elements is in
$B_p$
.
To test whether
$y\in \{0,\ldots ,2p^2\}$
is in
$B_p$
amounts to testing whether
$y = z + 2p (3z^2 \,\mathrm{mod}\,p)$
,
$y = z + 2p (4z^2 \,\mathrm{mod}\,p)$
, or
$y = z + 2p (6z^2 \,\mathrm{mod}\,p)$
for an integer
$z\in \{0,\ldots , p-1\}$
. Given
$y$
, the ‘candidate’
$z$
is forced to be the unique number in
$\{0,\ldots ,p-1\}$
congruent to
$y \,\mathrm{mod}\,p$
and we can then simply compute
$(3z^2 \,\mathrm{mod}\,p)$
,
$(4z^2 \,\mathrm{mod}\,p)$
, and
$(6z^2 \,\mathrm{mod}\,p)$
. This procedure clearly takes time
$O((\log p)^{O(1)})$
.
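The test described in the last two paragraphs uses only a constant number of modular multiplications and comparisons. A minimal sketch, with hypothetical function names `in_B` and `in_A_p`, might look as follows.

```python
def in_B(y, p):
    """Test y in B_p: the candidate z is forced to be y mod p, and y must
    equal z + 2p * (k z^2 mod p) for some k in {3, 4, 6}."""
    if not 0 <= y <= 2 * p * p:
        return False
    z = y % p
    return any(y == z + 2 * p * ((k * z * z) % p) for k in (3, 4, 6))


def in_A_p(x, p):
    """Test x in A_p, where x is any integer representing a residue mod p^2.
    Checks the (at most 12) candidates x' + {-p^2, 0, p^2, 2p^2} + {-p, 0, p}."""
    x = x % (p * p)                      # canonical representative x'
    return any(
        in_B(x + shift + step, p)
        for shift in (-p * p, 0, p * p, 2 * p * p)
        for step in (-p, 0, p)
    )
```

Each call performs at most 12 evaluations of `in_B`, and each of those costs a few arithmetic operations on numbers of size O(p^2), which is consistent with the claimed running time.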
2.2 Assuming deterministic polynomial time algorithms for locating primes
For the remainder of this section we will operate under the following assumption.
Assumption 2.5. There exists a deterministic algorithm which outputs the least prime which is
$3 \,\mathrm{mod}\,8$
in the interval
$[N,2N]$
in time
$O((\log N)^{O(1)})$
.
To obtain a better upper bound on
$\sigma _A(n)$
, we take the function
$f$
in the proof of Theorem 1.1 to be
$f(k) = \exp (c k)$
. It follows from (2.2) and (2.3) that
$\sigma _A(n) \lesssim \exp (Ck)$
and
$n \gtrsim \exp (c k^2)$
, thus giving the bound
$\sigma _A(n) \lesssim \exp (C \sqrt { \log n})$
. To test membership, we need to construct primes
$p$
of order at most
$\exp (ck) \approx \exp (c\sqrt {\log n})$
which can be done in
$(\log n)^{O(1)}$
-time under Assumption 2.5.
Acknowledgements
V.J. is supported by NSF CAREER award DMS-2237646. H.P. is supported by a Clay Research Fellowship and a Stanford Science Fellowship. M.S. is supported by NSF Graduate Research Fellowship Programme DGE-2141064. D.Z. is supported by the Jane Street Graduate Fellowship. We thank Zach Hunter and Sándor Kiss for carefully reading the manuscript and suggesting improvements and references.