1. Introduction
Turán-type questions are some of the most well-studied problems in combinatorics. They typically ask how ‘dense’ an object should be in order to guarantee that it contains a certain small substructure. In the setting of graphs, this question asks how many edges an $n$-vertex graph should contain in order to force the appearance of some fixed graph $H$. For example, a central open problem in this area asks, given a bipartite graph $H$, to determine the smallest $T=T_{H}(\varepsilon )$ so that for every $n\geq T$, every $n$-vertex graph with $\varepsilon\!\left(\begin{smallmatrix}n\\ 2\end{smallmatrix}\right)$ edges contains a copy of $H$ (see [Reference Bukh3] for recent progress). A closely related question which has also attracted a lot of attention is the supersaturation problem, introduced by Erdős and Simonovits [Reference Erdős and Simonovits5] in the 80s. In the setting of Turán’s problem for bipartite $H$, the supersaturation question asks to determine the largest $T^*_{H}(\varepsilon )$ so that every $n$-vertex graph with $\varepsilon\! \left(\begin{smallmatrix}n\\ 2\end{smallmatrix}\right)$ edges contains at least $(T^*_{H}(\varepsilon )-o_n(1)) \cdot n^h$ labelled copies of $H$, where $h$ is the number of vertices of $H$ and $o_n(1)$ denotes a quantity tending to $0$ as $n \rightarrow \infty$. One of the central conjectures in this area, due to Sidorenko, suggests that $T^*_{H}(\varepsilon ) = \varepsilon ^{m}$, where $m$ is the number of edges of $H$ (see [Reference Conlon, Kim, Lee and Lee4] for recent progress).
We now describe two problems in additive number theory, which are analogous to the graph problems described above. We say that a homogeneous linear equation $\sum ^k_{i=1}a_ix_i=0$ is invariant if $\sum _ia_i=0$. All equations we consider here will be invariant and homogeneous. Given a fixed linear equation $E$, the Turán problem for $E$ asks to determine the smallest $R=R_{E}(\varepsilon )$ so that for every $n\geq R$, every $S\subseteq [n]\,:\!=\,\{1,\ldots,n\}$ of size $\varepsilon n$ contains a solution to $E$ in distinct integers. For example, when $E$ is the equation $x+y=2z$ we get the Erdős–Turán–Roth problem on sets avoiding $3$-term arithmetic progressions (see [Reference Kelley and Meka7] for recent progress). Continuing the analogy with the previous paragraph, we can now ask to determine the largest $R^*_{E}(\varepsilon )$ so that every $S \subseteq [n]$ of size $\varepsilon n$ contains at least $(R^*_{E}(\varepsilon )-o_n(1)) \cdot n^{k-1}$ solutions to $E$, where $k$ is the number of variables in $E$. We now turn to discuss two aspects which make the arithmetic problems more challenging than the graph problems.
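To make the quantity counted by $R^*_{E}(\varepsilon )$ concrete, here is a minimal brute-force sketch that counts solutions of an invariant equation inside a set $S$. The function name `count_solutions` and the convention of passing the coefficients $a_1,\ldots,a_k$ as a tuple are assumptions made for illustration only; the loop is exponential in $k$ and is meant for tiny examples.

```python
from itertools import product

def count_solutions(coeffs, S, distinct=False):
    """Count tuples (x_1, ..., x_k) in S^k with sum_i a_i * x_i == 0.

    coeffs   -- the coefficients a_1, ..., a_k of the equation
    S        -- an iterable of integers
    distinct -- if True, only count tuples with pairwise distinct entries
    """
    elements = list(S)
    count = 0
    for xs in product(elements, repeat=len(coeffs)):
        if distinct and len(set(xs)) < len(xs):
            continue
        if sum(a * x for a, x in zip(coeffs, xs)) == 0:
            count += 1
    return count

# x + y = 2z, i.e. coefficients (1, 1, -2): 3-term arithmetic progressions
ap_in_123 = count_solutions((1, 1, -2), {1, 2, 3}, distinct=True)   # (1,3,2) and (3,1,2)
ap_in_124 = count_solutions((1, 1, -2), {1, 2, 4}, distinct=True)   # no 3-term progression
```

Dividing the returned count by $n^{k-1}$ gives the normalised quantity that $R^*_{E}(\varepsilon )$ bounds from below.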
Let us say that $E$ is sparse if there is $C=C(E)$ so that $R_E(\varepsilon ) \leq \varepsilon ^{-C}$. The first aspect which makes the arithmetic landscape more varied is that while in the case of graphs it is well known (and easy) that for every bipartite $H$ we have $T_{H}(\varepsilon )=\text{poly}(1/\varepsilon )$, this is no longer the case in the arithmetic setting. Indeed, while Sidon’s equation $x+y=z+w$ is sparse, a well-known construction of Behrend [Reference Behrend1] shows that $x+y=2z$ is not sparse. The problem of determining which equations $E$ are sparse is a wide open problem due to Ruzsa; see [Reference Ruzsa9].
Our main goal in this paper is to study another aspect which differentiates the arithmetic and graph-theoretic problems. While it is easy to translate a bound for $T_{H}(\varepsilon )$ into a bound for $T^*_{H}(\varepsilon )$ (in particular, establishing that $T^*_{H}(\varepsilon ) \geq \text{poly}(\varepsilon )$ for all bipartite $H$), it is not clear if one can analogously transform a bound for $R_{E}(\varepsilon )$ into a bound for $R^*_{E}(\varepsilon )$. The first reason is that while we can average over all subsets of vertices of graphs, we can only average over “structured” subsets of $[n]$. This makes it hard to establish a black-box reduction/transformation between $R_{E}(\varepsilon )$ and $R^*_{E}(\varepsilon )$. The second complication is that, as we mentioned above, we do not know which equations are sparse. This makes it hard to directly relate these two quantities. Following [Reference Girão, Hurley, Illingworth and Michel6], we say that $E$ is abundant if $R^*_{E}(\varepsilon ) \geq \varepsilon ^C$ for some $C=C(E)$. Clearly, if $E$ is abundant then it is also sparse. Girão et al. [Reference Girão, Hurley, Illingworth and Michel6] asked if the converse also holds, that is, if one can transform a polynomial bound for $R_{E}(\varepsilon )$ into a polynomial bound for $R^*_{E}(\varepsilon )$. Our aim in this note is to prove the following.
Theorem 1.1.
If an invariant equation $E$ in four variables is sparse, then it is also abundant. More precisely, if $R_{E}(\varepsilon ) \leq \varepsilon ^{-C}$ then $R^*_{E}(\varepsilon ) \geq \frac 12\varepsilon ^{8C}$ for all small enough $\varepsilon$.
Given the above discussion, it is natural to extend the problem raised in [Reference Girão, Hurley, Illingworth and Michel6] to all equations $E$.
Problem 1.2.
Is it true that for every invariant equation $E$ there is $c=c(E)$, so that for all small enough $\varepsilon$
$$R^*_{E}(\varepsilon ) \geq 1/R_{E}(\varepsilon ^{c})\,?$$
It is interesting to note that Varnavides [Reference Varnavides11] (implicitly) gave a positive answer to Problem 1.2 when $E$ is the equation $x+y=2z$. In fact, essentially the same argument gives a positive answer to this problem for all $E$ in three variables. Hence, Problem 1.2 can be considered as a generalisation of Varnavides’s theorem. Problem 1.2 was also implicitly studied previously in [Reference Bloom2,Reference Kosciuszko8]. In particular, Kosciuszko [Reference Kosciuszko8], extending earlier work of Schoen and Sisask [Reference Schoen and Sisask10], gave direct lower bounds for $R^*_{E}(\varepsilon )$ which, thanks to [Reference Kelley and Meka7], are quasi-polynomially related to those of $R_{E}(\varepsilon )$.
The proof of Theorem 1.1 is given in the next section. For the sake of completeness, and as a preparation for the proof of Theorem 1.1, we start the next section with a proof that Problem 1.2 has a positive answer for equations in three variables. We should point out that a somewhat unusual aspect of the proof of Theorem 1.1 is that it uses a Behrend-type [Reference Behrend1] geometric argument in order to find solutions, rather than avoid them.
2. Proofs
In the first subsection below, we give a concise proof of Varnavides’s theorem, namely, of the fact that Problem 1.2 has a positive answer for equations with three variables. In the second subsection, we prove Theorem 1.1.
2.1 Proof of Varnavides’s theorem
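Note that for every equation $E$, there is a constant $C$ such that for every prime $p\geq Cn$ every solution of $E$ with integers $x_i \in [n]$ is also a solution over $\mathbb{F}_p$, and vice versa. Since we can always find a prime $Cn \leq p \leq 2Cn$, this means that we can assume that $n$ itself is prime and count solutions over $\mathbb{F}_n$. So let $S$ be a subset of $\mathbb{F}_n$ of size $\varepsilon n$ and let $R=R_E(\varepsilon /2)$. For $b=(b_0,b_1) \in (\mathbb{F}_n)^2$ and $x \in [R]$ let $f_b(x)=b_0+b_1x$ and $f_{b}([R])=\{x \in [R]\,{:}\, f_{b}(x) \in S\}$. Pick $b$ uniformly at random from $(\mathbb{F}_n)^2$ and note that for any $x \in [R]$ the integer $f_b(x)$ is uniformly distributed in $\mathbb{F}_n$. Hence,
$$\mathbb{E}\,|\,f_{b}([R])| = \varepsilon R.$$
It is also easy to see that for every $x \neq y$ the random variables $f_b(x)$ and $f_b(y)$ are pairwise independent. Hence,
$$\mathrm{Var}\,|\,f_{b}([R])| \leq \varepsilon R.$$
Therefore, by Chebyshev’s Inequality we have
$$\mathbb{P}\Big[\,\big|\,|\,f_{b}([R])| - \varepsilon R\,\big| \geq \varepsilon R/2\,\Big] \leq \frac{4}{\varepsilon R} \leq \frac 12.$$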
In other words, at least $n^2/2$ choices of $b$ are such that $|\,f_{b}([R])| \geq \frac{\varepsilon }{2}R$. By our choice of $R$, this means that $f_b([R])$ contains three distinct integers $x_1,x_2,x_3$ which satisfy $E$ and such that $f_{b}(x_i) \in S$. Note that if $x_1,x_2,x_3$ satisfy $E$, then so do $f_b(x_1),f_b(x_2),f_b(x_3)$, since $E$ is invariant. Let us denote the triple $(f_b(x_1),f_b(x_2),f_b(x_3))$ by $s_b$. We have thus obtained $n^2/2$ solutions $s_b$ of $E$ in $S$. To conclude the proof, we just need to estimate the number of times we have double counted each solution $s_b$. Observe that for every choice of $s_1,s_2,s_3 \in S$ and distinct $x_1,x_2,x_3 \in [R]$, there is at most one choice of $b=(b_0,b_1) \in (\mathbb{F}_n)^2$ for which $f_b(x_i)=s_i$ for every $1 \leq i \leq 3$. Since $[R]$ contains at most $R^2$ solutions of $E$ this means that for every solution $s_1,s_2,s_3 \in S$, there are at most $R^2$ choices of $b$ for which $s_b=(s_1,s_2,s_3)$. We conclude that $S$ contains at least $n^2/2R^2$ distinct solutions, thus completing the proof.
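The averaging step of this proof can be checked exhaustively for a small prime. The sketch below assumes the random map is the affine map $f_b(x)=b_0+b_1x$ over $\mathbb{F}_n$ and enumerates all $n^2$ choices of $b$; the function name `hit_count_stats` and the toy parameters are hypothetical. For $R<n$ the mean of $|\,f_b([R])|$ comes out as exactly $\varepsilon R$ and, by pairwise independence, the variance as exactly $\varepsilon (1-\varepsilon )R \leq \varepsilon R$.

```python
from fractions import Fraction

def hit_count_stats(n, R, S):
    """Enumerate all b = (b0, b1) in F_n^2 and return the exact mean and
    variance of |{x in [R] : b0 + b1*x mod n lies in S}|, plus the raw counts.

    Assumes n is prime and R < n, so that distinct x, y in [R] stay
    distinct modulo n.
    """
    targets = {s % n for s in S}
    counts = [
        sum((b0 + b1 * x) % n in targets for x in range(1, R + 1))
        for b0 in range(n)
        for b1 in range(n)
    ]
    total = n * n
    mean = Fraction(sum(counts), total)
    variance = Fraction(sum(c * c for c in counts), total) - mean * mean
    return mean, variance, counts

# Toy instance: n = 11, S = {0, 1, 2} (so eps = 3/11), R = 4.
mean, variance, counts = hit_count_stats(11, 4, {0, 1, 2})
eps = Fraction(3, 11)
good_fraction = Fraction(sum(c >= eps * 4 / 2 for c in counts), len(counts))
```

Here `good_fraction` is the proportion of $b$ with $|\,f_b([R])| \geq \varepsilon R/2$; Chebyshev's inequality only gives a lower bound on it, so the exact value for a toy instance is typically much larger.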
2.2 Proof of Theorem 1.1
As in the proof above, we assume that $n$ is a prime and count the number of solutions of the equation $E:\,\sum ^4_{i=1}a_ix_i=0$ over $\mathbb{F}_n$. Let $S$ be a subset of $\mathbb{F}_n$ of size $\varepsilon n$, let $d$ and $t$ be integers to be chosen later, and let $X$ be some subset of $[t]^d$ to be chosen later as well. For every $b=(b_0,\ldots,b_d)\in (\mathbb{F}_n)^{d+1}$ and $x=(x_1,\ldots,x_d) \in X$, we use $f_b(x)$ to denote $b_0+\sum ^d_{i=1}b_ix_i$ and set $f_b(X)=\{x \in X\,{:}\,\, f_b(x) \in S\}$. We call $b$ good if $|\,f_b(X)| \geq \varepsilon |X|/2$. We claim that at least half of all possible choices of $b$ are good. To see this, pick $b$ uniformly at random from $(\mathbb{F}_n)^{d+1}$, and note that for any $x \in X$ the integer $f_b(x)$ is uniformly distributed in $\mathbb{F}_n$. Hence,
$$\mathbb{E}\,|\,f_b(X)| = \varepsilon |X|.$$
It is also easy to see that for every $x \neq y \in X$ the random variables $f_b(x)$ and $f_b(y)$ are pairwise independent. Hence,
$$\mathrm{Var}\,|\,f_b(X)| \leq \varepsilon |X|.$$
Therefore, by Chebyshev’s Inequality we have
$$\mathbb{P}\Big[\,\big|\,|\,f_b(X)| - \varepsilon |X|\,\big| \geq \varepsilon |X|/2\,\Big] \leq \frac{4}{\varepsilon |X|} \leq \frac 12, \tag{1}$$
where the last inequality holds whenever $|X| \geq 8/\varepsilon$, implying that at least half of the $b$’s are good. To finish the proof, we need to make sure that every such choice of a good $b$ will “define” a solution $s_b$ in a way that $s_b$ will not be identical to too many other $s_{b'}$. This will be achieved by a careful choice of $d$, $t$, and $X$.
We first choose $X$ to be the largest subset of $[t]^d$ containing no three points on one line. We claim that
$$|X| \geq t^{d-2}/d. \tag{2}$$
Indeed, for an integer $r$ let $B_r$ be the points $x \in [t]^d$ satisfying $\sum ^d_{i=1}x^2_i=r$. Then every point of $[t]^d$ lies on one such $B_r$, where $1 \leq r \leq dt^2$. Hence, at least one such $B_r$ contains at least $t^d/(dt^2)=t^{d-2}/d$ of the points of $[t]^d$. Furthermore, since each set $B_r$ is a subset of a sphere, it does not contain three points on one line.
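The sphere argument above can be illustrated on a small grid. The following sketch groups the points of $[t]^d$ by their squared norm, picks the most populated sphere, and checks by brute force that it contains no three collinear points. The helper names `largest_sphere_slice` and `has_three_collinear` are hypothetical, and the check is cubic in the size of the set, so it is only meant for tiny $t$ and $d$.

```python
from collections import defaultdict
from itertools import combinations, product

def largest_sphere_slice(t, d):
    """Return the largest set of points of [t]^d sharing one value of sum x_i^2."""
    shells = defaultdict(list)
    for x in product(range(1, t + 1), repeat=d):
        shells[sum(c * c for c in x)].append(x)
    return max(shells.values(), key=len)

def has_three_collinear(points):
    """Brute-force test: do some three distinct points lie on one line?"""
    for p, q, r in combinations(points, 3):
        u = [qi - pi for pi, qi in zip(p, q)]
        v = [ri - pi for pi, ri in zip(p, r)]
        # p, q, r are collinear iff u and v are parallel,
        # i.e. every 2x2 minor of the matrix with rows u, v vanishes.
        if all(u[i] * v[j] == u[j] * v[i]
               for i in range(len(u)) for j in range(i + 1, len(u))):
            return True
    return False

t, d = 4, 3
X = largest_sphere_slice(t, d)
# Pigeonhole bound from the text: |X| >= t^d / (d t^2)
meets_bound = len(X) * d * t * t >= t ** d
line_free = not has_three_collinear(X)
```

Note that the proof takes $X$ to be the largest line-free subset of $[t]^d$; the sphere slice is only used as a witness for the lower bound on $|X|$.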
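We now turn to choose $t$ and $d$. Let $C$ be such that $R_E(\varepsilon ) \leq (1/\varepsilon )^C$. Set $a=\sum ^4_{i=1}|a_i|$ and pick $t$ and $d$ satisfying
$$\left (\frac{2dt^2a^d}{\varepsilon }\right )^{C} \leq t^d \leq (1/\varepsilon )^{2C}. \tag{3}$$
Taking $t=2^{\sqrt{\log 1/\varepsilon }}$ and $d=2C\sqrt{\log 1/\varepsilon }$ satisfies the above for all small enough $\varepsilon$. Note that by (3) and our choice of $C$, we have $R_E\left (\frac{\varepsilon }{2dt^2a^d}\right ) \leq t^d$.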
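To get a feeling for how small $\varepsilon$ must be for this choice of parameters, the sketch below compares $\log _2(t^d)$ with $C\log _2(2dt^2a^d/\varepsilon )$, which is an upper bound on $\log _2 R_E\left (\frac{\varepsilon }{2dt^2a^d}\right )$ when $R_E(\varepsilon ) \leq (1/\varepsilon )^C$. It assumes that $\log$ is taken to base $2$ (so that $t^d=(1/\varepsilon )^{2C}$) and treats $d$ as a real number, ignoring rounding; the function name `log2_gap` and the parametrisation $\varepsilon =2^{-m}$ are hypothetical.

```python
import math

def log2_gap(m, C, a):
    """For eps = 2**(-m), t = 2**sqrt(m) and d = 2*C*sqrt(m), return
    log2(t^d) - C * log2(2 * d * t^2 * a^d / eps).

    A non-negative value means (2 d t^2 a^d / eps)^C <= t^d.
    Assumes base-2 logarithms and a real-valued d (no rounding).
    """
    root = math.sqrt(m)            # sqrt(log2(1/eps)) = log2(t)
    d = 2 * C * root
    log2_t_pow_d = d * root        # equals 2*C*m = log2((1/eps)^(2C))
    log2_bound = C * (1 + math.log2(d) + 2 * root + d * math.log2(a) + m)
    return log2_t_pow_d - log2_bound

# With C = 1 and a = 4 (e.g. coefficients 1, 1, -1, -1):
gap_small_eps = log2_gap(49, 1, 4)   # eps = 2^-49
gap_large_eps = log2_gap(16, 1, 4)   # eps = 2^-16
```

In this toy setting the inequality already holds at $\varepsilon =2^{-49}$ but fails at $\varepsilon =2^{-16}$, which illustrates why the statement is only claimed for small enough $\varepsilon$.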
Let us call a collection of four vectors $x^1,x^2,x^3,x^4 \in X$ helpful if they are distinct, and they satisfy $E$ in each coordinate, that is, for every $1 \leq i \leq d$ we have $\sum ^4_{j=1}a_jx^j_i=0$. We claim that for every good $b$, there are helpful $x^1,x^2,x^3,x^4 \in f_b(X)$. To see this let $M$ denote the integer $(at)^d$ and note that (2) along with the fact that $b$ is good implies that
$$|\,f_b(X)| \geq \frac{\varepsilon }{2}|X| \geq \frac{\varepsilon t^{d-2}}{2d} = \frac{\varepsilon }{2dt^2a^d}\cdot M. \tag{4}$$
Now think of every $x \in X$ as representing an integer $p(x) \in [M]$ written in base $at$. So we can also think of $f_b(X)$ as a subset of $[M]$ of density at least $\frac{\varepsilon }{2dt^2a^d}$. By (3), we have
$$M \geq t^d \geq R_E\left (\frac{\varepsilon }{2dt^2a^d}\right ),$$
implying that there are distinct $x^1,x^2,x^3,x^4 \in f_b(X)$ for which $\sum ^4_{j=1}a_j\cdot p(x^j)=0$. But note that since the entries of $x^1,x^2,x^3,x^4$ are from $[t]$, there is no carry when evaluating $\sum ^4_{j=1}a_j\cdot p(x^j)$ in base $at$, implying that $\sum ^4_{j=1}a_jx^j_i=0$ in each coordinate. Finally, the fact that $\sum _ja_j=0$ and that $\sum ^4_{j=1}a_jx^j_i=0$ for each $1 \leq i \leq d$ allows us to deduce that
$$\sum ^4_{j=1}a_jf_b(x^j) = b_0\sum ^4_{j=1}a_j + \sum ^d_{i=1}b_i\sum ^4_{j=1}a_jx^j_i = 0,$$
which means that $f_b(x^1),f_b(x^2),f_b(x^3),f_b(x^4)$ forms a solution of $E$.
So for every good $b$, let $s_b$ be (some choice of) $f_b(x^1),f_b(x^2),f_b(x^3),f_b(x^4) \in S$ as defined above. We know from (1) that at least half of all choices of $b$ are good, so we have thus obtained $\frac 12n^{d+1}$ solutions $s_b$ of $E$ in $S$. To finish the proof, we need to bound the number of times we have counted the same solution in $S$, that is, the number of $b$ for which $s_b$ can equal a certain $4$-tuple in $S$. Fix $s_1,s_2,s_3,s_4 \in S$ and recall that $s_b=(s_1,s_2,s_3,s_4)$ only if there is a helpful $x^1,x^2,x^3,x^4$ (as defined just before equation (4)) such that $f_b(x^j)=s_j$ for every $1 \leq j \leq 4$. We claim that for every helpful $x^1,x^2,x^3,x^4$, there are at most $n^{d-2}$ choices of $b$ for which $f_b(x^j)=s_j$ for every $j$. Indeed recall that by our choice of $X$ the vectors $x^1,x^2,x^3$ are distinct and do not lie on one line. Hence, they are affine independent over $\mathbb{R}$. But since the entries of $x^1,x^2,x^3$ belong to $[t]$ and $t \leq 1/\varepsilon$, we see that for large enough $n$ the vectors $x^1,x^2,x^3$ are also affine independent over $\mathbb{F}_n$. This means that the system of three linear equations
$$b_0+\sum ^d_{i=1}b_ix^j_i = s_j, \qquad 1 \leq j \leq 3,$$
(in the $d+1$ unknowns $b_0,\ldots,b_d$) has only $n^{d-2}$ solutions, implying the desired bound on the number of choices of $b$. Since $|X| \leq t^d \leq (1/\varepsilon )^{2C}$ by (3), we see that $X$ contains at most $(1/\varepsilon )^{8C}$ $4$-tuples. Altogether this means that for every $s_1,s_2,s_3,s_4 \in S$, there are at most $(1/\varepsilon )^{8C}n^{d-2}$ choices of $b$ for which $s_b=(s_1,s_2,s_3,s_4)$. Since we have previously deduced that $S$ contains at least $\frac 12n^{d+1}$ solutions $s_b$, we get that $S$ contains at least $\frac 12\varepsilon ^{8C}n^3$ distinct solutions, as needed.
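The "no carry" step of the proof above can be verified exhaustively for small parameters. The sketch below encodes each vector of $[t]^d$ as an integer, taking the base to be $a\cdot t$ with $a=\sum _j|a_j|$ (an assumption of this sketch; any base exceeding $a(t-1)$ would do for an invariant equation), and checks that $\sum _ja_j\cdot p(x^j)=0$ holds exactly when the equation holds in every coordinate. The function names are hypothetical.

```python
from itertools import product

def encode(x, base):
    """Read the vector x = (x_1, ..., x_d) as the integer sum_i x_i * base^(i-1)."""
    return sum(xi * base ** i for i, xi in enumerate(x))

def solves_coordinatewise(coeffs, vectors):
    """True iff sum_j a_j * x^j_i == 0 for every coordinate i."""
    d = len(vectors[0])
    return all(sum(a * v[i] for a, v in zip(coeffs, vectors)) == 0
               for i in range(d))

def no_carry_holds(coeffs, t, d):
    """Check, over all tuples of vectors in [t]^d, that the encoded equation
    holds iff the equation holds in each coordinate (base a*t, a = sum |a_j|)."""
    base = sum(abs(c) for c in coeffs) * t
    points = list(product(range(1, t + 1), repeat=d))
    for vectors in product(points, repeat=len(coeffs)):
        encoded_zero = sum(c * encode(v, base)
                           for c, v in zip(coeffs, vectors)) == 0
        if encoded_zero != solves_coordinatewise(coeffs, vectors):
            return False
    return True

sidon_ok = no_carry_holds((1, 1, -1, -1), t=2, d=2)
other_ok = no_carry_holds((1, 2, -4, 1), t=3, d=2)
```

The reason this works is that, for an invariant equation, each coordinate sum $\sum _ja_jx^j_i=\sum _ja_j(x^j_i-1)$ has absolute value at most $a(t-1)$, which is smaller than the base, so the digits of the encoded sum cannot interact.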
I would like to thank Yuval Wigderson and an anonymous referee for helpful comments and suggestions.