1. Introduction
The Bombieri–Vinogradov theorem [Bom65, Vin65] famously states that for any $x > 2$, $A > 0$, and $B$ sufficiently large in terms of $A$, one has

$$\sum_{q \leqslant x^{1/2}/(\log x)^{B}} \max_{(a, q) = 1} \left| \pi(x; q, a) - \frac{\pi(x)}{\varphi(q)} \right| \ll_A \frac{x}{(\log x)^{A}}, \tag{1.1}$$

where $\pi(x)$ denotes the number of primes up to x, and $\pi(x; q, a)$ denotes the number of these primes which are congruent to a modulo q. Informally, (1.1) states that the primes are well distributed in arithmetic progressions when averaging over moduli almost as large as $x^{1/2}$. Without the sum over q, the Siegel–Walfisz theorem only controls the summand uniformly in the (much smaller) range $q \leqslant (\log x)^A$, and the Generalized Riemann Hypothesis would improve this to $q \leqslant x^{1/2}/(\log x)^B$. Thus (1.1) provides an unconditional substitute for GRH when some averaging over q is available; this is very often the case in sieve theory, where results like (1.1) have led to multiple major breakthroughs (including, e.g., the existence of infinitely many bounded gaps between primes [Zha14, Pol14a, May15, Pol14b]).
We say that the primes have exponent of distribution $\alpha < 1$ iff the analogue of (1.1) holds true when summing over all moduli $q \leqslant x^\alpha$. The Elliott–Halberstam conjecture [EH68] asserts that $\alpha = 1 - \varepsilon$ works for any $\varepsilon > 0$ (the implied constant depending on $\varepsilon$), but it remains open whether (1.1) holds for any $\alpha > 1/2$. Quite remarkably, it is possible to go beyond this square-root barrier if one slightly weakens the left-hand side of (1.1), by fixing the residue a, assuming various factorization properties of the moduli q, and/or replacing the absolute values with suitable weights. On this front, we mention the pioneering work of Fouvry [Fou84, Fou87, Fou85, Fou82] and Fouvry and Iwaniec [FI80, FI83], a series of three papers by Bombieri, Friedlander, and Iwaniec [BFI86, BFI87, BFI89], the main estimate in Zhang’s work on bounded gaps [Zha14], and three recent papers of Maynard [May25a, May25b, May25c]; in particular, in [May25b], Maynard achieved exponents of distribution as large as $3/5-\varepsilon$ assuming well-factorable weights.
In this paper, we are concerned with the case of y-smooth (or y-friable) numbers rather than primes; the objects of study here are the sets

$$S(x, y) := \{ n \leqslant x : p \mid n \Rightarrow p \leqslant y \textrm{ for all primes } p \},$$

defined by two parameters $x, y \geqslant 2$, where y will grow like $x^{o(1)}$. To state our main result, we denote

$$\Psi(x, y) := \# S(x, y), \qquad \Psi_q(x, y) := \# \{ n \in S(x, y) : (n, q) = 1 \}, \qquad \Psi(x, y; a, q) := \# \{ n \in S(x, y) : n \equiv a \ (\textrm{mod } q) \}.$$
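As an illustrative aside (not part of the paper's argument), the counting functions $\Psi(x, y)$, $\Psi_q(x, y)$, and $\Psi(x, y; a, q)$ can be computed by brute force for tiny parameters; a minimal sketch:

```python
from math import gcd

def largest_prime_factor(n):
    """Largest prime factor of n (returns 1 for n = 1), by trial division."""
    lpf, p = 1, 2
    while p * p <= n:
        while n % p == 0:
            lpf, n = p, n // p
        p += 1
    return max(lpf, n) if n > 1 else lpf

def Psi(x, y):
    """Psi(x, y) = #S(x, y), the number of y-smooth integers n <= x."""
    return sum(1 for n in range(1, x + 1) if largest_prime_factor(n) <= y)

def Psi_q(x, y, q):
    """Number of y-smooth n <= x coprime to q."""
    return sum(1 for n in range(1, x + 1)
               if largest_prime_factor(n) <= y and gcd(n, q) == 1)

def Psi_ap(x, y, a, q):
    """Number of y-smooth n <= x with n = a (mod q)."""
    return sum(1 for n in range(1, x + 1)
               if largest_prime_factor(n) <= y and n % q == a % q)
```

For instance, `Psi(100, 5)` counts the 34 integers of the form $2^a 3^b 5^c$ up to 100, and summing `Psi_ap` over all residues modulo q recovers `Psi`.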
Theorem 1.1 (Smooth numbers in APs with moduli beyond $x^{3/5}$). Let $a \in \mathbf{Z} \setminus \{0\}$ and $A, \varepsilon > 0$. Then there exists $C = C(A, \varepsilon) > 0$ such that, in the range $x > 2$, $(\log x)^C \leqslant y \leqslant x^{1/C}$, one has

$$\sum_{\substack{q \leqslant x^{66/107 - \varepsilon} \\ (q, a) = 1}} \left| \Psi(x, y; a, q) - \frac{\Psi_q(x, y)}{\varphi(q)} \right| \ll_{a, A, \varepsilon} \frac{\Psi(x, y)}{(\log x)^{A}}.$$
A similar result with an exponent of $1/2 - \varepsilon$ and uniformity in a (as in (1.1)) is due to Granville [Gra93a, Theorem 2]; see also [Wol73a, Wol73b, FT91, Gra93b]. Virtually all results of this type that go beyond the $x^{1/2}$-barrier rely on equidistribution estimates for convolutions of sequences, but unless a long smooth sequence is involved, the exponents have been limited to $3/5$ or less. Bombieri, Friedlander, and Iwaniec proved a triple convolution estimate handling moduli up to $x^{3/5}$ for sequences of convenient lengths [BFI86, Theorem 4], and Fouvry and Tenenbaum used this result (along with the flexible factorization properties of smooth numbers) to prove an analogue of Theorem 1.1 for $q \leqslant x^{3/5-\varepsilon}$, and with a right-hand side of $x/(\log x)^A$ [FT96, Théorème 2]. Motivated by an application to the Titchmarsh divisor problem for smooth numbers, Drappeau improved the bound to $\Psi(x, y)/(\log x)^A$ in the same range $q \leqslant x^{3/5-\varepsilon}$ [Dra15, Théorème 1]. Unfortunately, the BFI estimates and subsequent arguments seem limited to this range of moduli.
Maynard recently introduced a different arrangement of exponential sums [May25a, Chapter 18], which would, in principle, allow for a triple convolution estimate with moduli up to $x^{5/8}$, if the Selberg eigenvalue conjecture for Maass forms [Sel65, Sar95, Iwa85, IS85, Iwa90, LRS95] held true; but his unconditional estimates were still limited below $x^{3/5}$ [May25a, Proposition 8.3]. We introduce a further variation of Maynard’s argument which eliminates certain coefficient dependencies, allowing one to use more efficient estimates for sums of Kloosterman sums in some ranges. More precisely, we rely on an optimized estimate of Deshouillers–Iwaniec type (see Theorem 3.10), which averages over exceptional Maass forms (and their levels) more carefully, ultimately allowing us to go beyond $3/5 = 0.6$ unconditionally. Our exponent of $66/107 \approx 0.617$ uses the best progress towards Selberg’s conjecture, due to Kim and Sarnak [Kim03, Appendix 2] (based on the automorphy of symmetric fourth-power L-functions).
Notation 1.2 (Exceptional eigenvalues). For $q \in \mathbf{Z}_+$, define $\theta_q := \sup_\lambda \sqrt{\max(0,1 - 4\lambda)}$, where $\lambda$ runs over all eigenvalues of the hyperbolic Laplacian for the Hecke congruence subgroup $\Gamma_0(q)$ (such $\lambda$ is called exceptional iff $\lambda < 1/4$). Also, let $\theta_{\max} := \sup_{q \geqslant 1} \theta_q$.
Conjecture 1.3 (Selberg [Sel65]). One has $\theta_{\max} = 0$, i.e., there are no exceptional eigenvalues.
Theorem A (Kim–Sarnak [Kim03]). One has $\theta_{\max} \leqslant 7/32$.
Remark 1.4. We warn the reader of another common normalization for the $\theta$-parameters, which differs by a factor of 2 (resulting in a bound of $7/64$ in Theorem A); our normalization follows [DI82, May25a]. We give more details on the role of exceptional eigenvalues in our work in § 10.
We now state a more general version of our main result from Theorem 1.1, which makes the dependency on $\theta_{\max}$ explicit, gives a refined bound on the right-hand side (following [Dra15]), and allows for some small uniformity in the residue parameter a.
Theorem 1.5 (Conditional exponent of distribution). For any $\varepsilon > 0$, there exist $C, \delta > 0$ such that the following holds. Let $x > 2$, $(\log x)^C \leqslant y \leqslant x^{1/C}$, and denote $u := (\log x)/(\log y)$, $H(u) := \exp (u \log^{-2} (u+1))$. Then with an exponent of

$$\alpha := \frac{5 - 4\theta_{\max}}{8 - 6\theta_{\max}} - \varepsilon, \tag{1.2}$$

one has

$$\sum_{\substack{q \leqslant x^{\alpha} \\ (q, a_1 a_2) = 1}} \left| \Psi(x, y; a_1 \overline{a_2}, q) - \frac{\Psi_q(x, y)}{\varphi(q)} \right| \ll_{\varepsilon, A} \Psi(x, y) \big( H(u)^{-\delta} + (\log x)^{-A} \big) \tag{1.3}$$
for all $a_1, a_2 \in \mathbf{Z}$ with $1 \leqslant |a_1|, |a_2| \leqslant x^\delta$, and all $A \geqslant 0$. The implicit constant is effective if $A < 1$.
Remark 1.6. In (1.3), $\overline{a_2}$ denotes a multiplicative inverse of $a_2$ modulo q; so the residue $a_1\overline{a_2}$ corresponds to congruences of the form $a_2 n \equiv a_1 \ (\textrm{mod } q)$. The right-hand side of (1.3) is the same as in Drappeau’s result [Dra15, Théorème 1], and ultimately comes from a result of Harper (see Lemma 3.2).
In particular, Conjecture 1.3 would imply an exponent of distribution of $5/8 - o(1)$, while Theorem A leads to the unconditional exponent of $66/107 - o(1)$ from Theorem 1.1. As in previous approaches, our main technical result leading to Theorem 1.5 is a triple convolution estimate, given in Theorem 4.2; this improves on [BFI86, Theorem 4], [Dra15, Théorème 3], [DGS17, Lemma 2.3], and [May25a, Proposition 8.3]. We expect all such approaches to face a significant barrier at the exponent $2/3 = 0.\overline{6}$ (see the remark after (4.4)), so we may view the exponents of $66/107 \approx 0.617$ and $5/8 = 0.625$ as progress towards this limit.
In fact, Theorem 4.2 is already in a suitable form to improve the analogous results of Drappeau, Granville, and Shao about smooth-supported multiplicative functions [DGS17]. More precisely, using Theorem 4.2 instead of [DGS17, Lemma 2.3], one can improve the exponent of $3/5$ in [DGS17, Theorem 1.2] to the same value as in (1.2). We state a particular case of this result below, borrowing the notation

from [DGS17]; we also say that an arithmetic function f satisfies the Siegel–Walfisz criterion if and only if

Theorem 1.7 (Smooth-supported multiplicative functions in APs). For any $\varepsilon, A > 0$, there exists $\delta > 0$ such that the following holds. Let $x > 2$, $x^\delta \geqslant y \geqslant \exp(\sqrt{\log x} \log \log x)$, and f be a 1-bounded completely multiplicative function supported on y-smooth integers, satisfying the Siegel–Walfisz criterion. Then for $\alpha := ({5-4\theta_{\max}})/({8-6\theta_{\max}}) - \varepsilon$, and all $a_1, a_2 \in \mathbf{Z}$ with $1 \leqslant |a_1|, |a_2| \leqslant x^{\delta}$,

Remark 1.8. The improvement from $3/5-\varepsilon$ to the exponent in (1.2) follows through in most applications of Drappeau’s result [Dra15, Théorème 1], such as [Dra15, Corollaire 1]. Following [dLBD20, §§ 2 and 4], our triple convolution estimate also implies a version of Theorem 1.5 restricted to smooth moduli, which can be used to deduce refined upper bounds for the number of smooth values assumed by a factorable quadratic polynomial. For instance, one should obtain

for $(\log x)^{O_\varepsilon(1)} \leqslant y \leqslant x$, where $u = (\log x)/(\log y)$ and $\varrho(u)$ is the Dickman function (satisfying $\Psi(x, y) = x \varrho(u) e^{O_\varepsilon(u)}$ in this range [dLBD20, Hil86]).
2. Overview of key ideas
Let us give a very rough sketch of our argument, for a simpler case of Theorem 1.1. Consider the residue $a = 1$, the smoothness parameter $y = x^{1/\sqrt{\log \log x}}$, and a sum over moduli just above the $3/5$ threshold, say

for some small $\sigma > 0$ (we switch the variable q to r, following [Dra15, DGS17]). Using the factorization properties of smooth numbers, it suffices to prove a triple convolution estimate roughly of the form

where $(\rho_r)$, $(\alpha_m)$, $(\beta_n)$, and $(\gamma_\ell)$ are arbitrary 1-bounded complex sequences, but we are free to choose the parameters $M, N, L \gg 1$ subject to $MNL \asymp x$. We pick

for some small $\delta = o(1)$; thus $M, N \approx x^{2/5-\sigma}$ and $L \approx x^{1/5+2\sigma}$. Note additionally that $NL = x^{\delta} R$.
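As a sanity check on this bookkeeping (an illustrative aside, taking $R = x^{3/5+\sigma}$ as above and dropping the $x^{\pm\delta}$ factors), the exponents of $M$, $N$, $L$ indeed sum to that of $x$, and $NL$ matches $R$:

```python
from fractions import Fraction

sigma = Fraction(1, 100)  # a hypothetical small value of sigma

# Exponents of x in the parameter choices of this section (ignoring x^delta factors).
M_exp = N_exp = Fraction(2, 5) - sigma
L_exp = Fraction(1, 5) + 2 * sigma
R_exp = Fraction(3, 5) + sigma   # moduli range just above the 3/5 threshold

MNL_exp = M_exp + N_exp + L_exp  # exponent of M*N*L, should equal 1
NL_exp = N_exp + L_exp           # exponent of N*L, should equal that of R
```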
2.1 First steps and limitations of previous approaches
Following previous works [BFI86, Dra15, May25a] based on Linnik’s dispersion method [Lin63], we apply Cauchy and Schwarz in the r, m variables, expand the square, and Fourier complete the resulting sums in m to sums over h. Ignoring GCD constraints, the key resulting exponential sum is a smoothed variant of

where $(u_k)$ is the convolution of the original sequences $(\beta_n)$, $(\gamma_\ell)$. We then flip moduli in the exponential via Bézout’s identity, and substitute $t := (k - n\ell)/r$; this leads to the sum

Following Maynard [May25a], we apply Cauchy and Schwarz in the t, n, k variables (keeping the congruence modulo t inside), and expand the square to reach the sum

which we ultimately need to bound by $\ll_A RNL^2 (\log x)^{-A}$; note that the trivial bound is larger by about $(R/M)^2$, due to the sum over $h_1, h_2$ introduced by Poisson summation. We bound the contribution of the diagonal terms (with $h_1\ell_2 = h_2\ell_1$) by

which is acceptable since $N/M = x^{-\delta}$. We then introduce Kloosterman sums $S(i, j; k)$ by completing the sum in n to a sum over j, and find acceptable contributions from the zeroth Fourier coefficient ($j = 0$), as well as from the terms with $\ell_1 = \ell_2$; here, it suffices to use the Ramanujan bound, respectively an estimate of Deshouillers and Iwaniec [DI82, Theorem 9]. It ultimately remains to bound a variant of

by $\ll_A N^2 L^4 (\log x)^{-A}$. Inserting 1-bounded coefficients $\xi_{h_1, h_2}$ (also depending on $t, \ell_1, \ell_2$), and letting $a_d := \sum_{h_1 \ell_2 - h_2 \ell_1 = d} \xi_{h_1, h_2}$, the sum over $h_1, h_2$ in (2.7) roughly reduces to

Such sums can be bounded using the spectral theory of automorphic forms, specifically through the aforementioned work of Deshouillers and Iwaniec [DI82] (based on the Kuznetsov trace formula and the Weil bound); the relevant level of the congruence group in Notation 1.2 is $Q = \ell_1\ell_2$. Indeed, Maynard [May25a] uses [DI82, Theorem 9] to bound (a smoothed variant of) the sum in (2.8) by

and consequently the sum in (2.7) by

Unfortunately, this falls short of the desired bound of $N^2 L^4 (\log x)^{-A} \approx x^{8/5 + 6\sigma}$, unless

This is (barely) impossible with the currently best-known bound of $\theta_{\max}/2 \leqslant 7/64 \approx 0.109$.
2.2 Improved exponential sum manipulations for specific ranges
Starting from the work of Deshouillers and Iwaniec [DI82], better bounds (in the $\theta$-aspect) for sums like (2.8) have been available when one additionally averages over the level $\ell_1\ell_2$, and at least one of the sequences of coefficients is independent of the level. Indeed, Drappeau’s triple convolution estimate [Dra15] and prior works rely on [DI82, Theorem 12], which gives such a result for incomplete Kloosterman sums.
Following Maynard’s argument (which is in turn based on Bombieri–Friedlander–Iwaniec’s work in [BFI87, § 10]), we prefer to complete our Kloosterman sums and bound the contribution from the zeroth Fourier coefficient by hand, and separate into terms with $\ell_1 = \ell_2$ and $\ell_1 \neq \ell_2$, all before invoking Deshouillers–Iwaniec-style bounds. We then aim to apply an optimized bound for sums of complete Kloosterman sums with averaging over the level (given in Theorem 3.10), which improves [DI82, Theorem 11] by making the dependency on the $\theta_{\max}$ parameter explicit. But for this strategy to work out, we would need:
- (1) the range of $(\ell_1, \ell_2)$ in (2.7) to be (discretely) dense inside $[L, 2L]^2$; and
- (2) crucially, the coefficients $e(j \overline{\ell_1} / t)$ to not depend on $\ell_1, \ell_2$.
While (2) is obviously false in our case, it is only barely false for the specific ranges in (2.2), due to the smallness of the parameter $t \sim x^\delta$. In particular, losing a factor of at most $x^{O(\delta)} = x^{o(1)}$ in (2.7), we may fix t and the values of $\ell_1$ and $\ell_2 \ (\textrm{mod } t)$, turning (2) into a true statement at the expense of (1). The number of pairs $(\ell_1, \ell_2)$ now becomes $\asymp L^2/t^2$, which ends up costing us another acceptable factor of $x^{o(1)}$. Overall, it remains to bound a sum of the form

for some fixed $\omega \in \mathbf{R}/\mathbf{Z}$ (independent of $\ell_1, \ell_2$). Using Theorem 3.10, we obtain a bound like (2.9) where the factor depending on $\theta_{\max}$ is

rather than $x^{\theta_{\max}/2}$. Thus instead of (2.10), we now reach the desired bound provided that

which is possible since $(1/5) \cdot (7/32) = 0.04375 < 0.1$. In fact, this handles all values

reaching the exponent of distribution in (1.2). Plugging in Kim–Sarnak’s bound of $\theta_{\max} \leqslant 7/32$ (Theorem A) yields the unconditional exponent of $66/107 \approx 0.617$ from Theorem 1.1.
Remark 2.1. It is likely that optimized Deshouillers–Iwaniec-style bounds like Theorem 10.3 could also improve Drappeau’s argument [Dra15], leading to a triple convolution estimate with different ranges than in our Theorem 4.2. In terms of the final exponent of distribution of smooth numbers, all such methods currently seem limited below $66/107$ unconditionally (and $5/8$ conditionally).
2.3 Completing the argument
To increase the range of uniformity in y, we adapt Drappeau’s version of the dispersion method [Dra15]: we aim for a triple convolution estimate with a power saving in Theorem 4.2, after separating the contribution of small-conductor Dirichlet characters

from (2.1); this can be handled via Lemma 3.2. As a result, the two simpler dispersion sums $\mathcal{S}_2$, $\mathcal{S}_3$ and their main terms involve Dirichlet characters (see Propositions 5.1 and 6.1), which ultimately bring in the classical Gauss sum bound (Lemma 3.7) and the multiplicative large sieve (Lemma 3.4). The difficulties in working with a general residue $a_1 \overline{a_2}$ for $a_1, a_2 \ll x^\delta$, and in obtaining power savings throughout the computations in § 2.1, are quite tedious but purely technical (following [Dra15]).
We also adapt a ‘deamplification’ argument of Maynard [May25a], which introduces an artificial sum over $e \sim E = x^{o(1)}$ into the dispersion sums (by averaging over the residue of $n\ell \ (\textrm{mod } e)$ before applying Cauchy and Schwarz); for instance, the sum in (2.3) becomes

Keeping e inside the second application of Cauchy and Schwarz, this essentially reduces the contribution of the diagonal terms in (2.6) by a factor of E, allowing us to cover wider ranges of sequence lengths in Theorem 4.2 (including the case $M = N$). This is generally convenient, and critical when one has less control over the sizes of the sequence lengths (which is the case in applications to the primes, but not to smooth numbers).
Figure 1 gives a visual summary of our formal argument, outlining the logical dependencies between our main lemmas, propositions, and theorems.

Figure 1 Structure of argument (arrows show logical implications).
3. Notation and preliminaries
3.1 Sets, sums, estimates, and congruences
We use the standard asymptotic notation in analytic number theory, with $f = O(g)$ (or $f \ll g$) meaning that there exists some constant $C > 0$ such that $|f| \leqslant Cg$ globally. We write $f \asymp g$ when $f \ll g$ and $g \ll f$, and indicate that the implied constants may depend on a parameter $\varepsilon$ by placing it in the subscript (e.g., $f = O_\varepsilon(g)$, $f \ll_\varepsilon g$, and $f \asymp_\varepsilon g$). When $g \geqslant 0$, we also say that $f = o(g) = o_{x \to \infty}(g)$ if and only if $f(x)/g(x) \to 0$ as $x \to \infty$. Given $q \in [1, \infty]$, we write $\|f\|_q$ for the $L^q$ norm of a measurable function $f : \mathbf{R} \to \mathbf{C}$ (using the Lebesgue measure), and $\|a_n\|_q$ for the $\ell^q$ norm of a complex sequence $(a_n)$.
We denote by $\mathbf{Z}_+, \mathbf{Z}, \mathbf{R}, \mathbf{C}$, and $\mathbf{H}$ the sets of positive integers, integers, real numbers, complex numbers, and complex numbers with positive imaginary part, and set $\textrm{e}(x) := \exp(2 \pi i x)$ for $x \in \mathbf{R}$ (or $x \in \mathbf{R}/\mathbf{Z}$). We write $\mathbf{Z}/n\mathbf{Z}$ and $(\mathbf{Z}/n\mathbf{Z})^\times$ for the additive and multiplicative groups modulo a positive integer n, and denote the inverse of $c \in (\mathbf{Z}/n\mathbf{Z})^\times$ by $\overline{c}$. We may abuse notation slightly by identifying integers a, b, c with their residue classes modulo n where this is appropriate (e.g., in congruences $a \equiv b \overline{c} \ (\textrm{mod } \pm n)$, $x \equiv b\overline{c}/n \ (\textrm{mod } 1)$, or in exponentials $\textrm{e} (b \overline{c}/n)$); the following simple lemma is an example of this.
Lemma 3.1 (Bézout’s identity). For any relatively prime integers a, b, one has

$$\frac{\overline{a}}{b} + \frac{\overline{b}}{a} \equiv \frac{1}{ab} \ (\textrm{mod } 1).$$
Proof. Note that, here, $\overline{a}$ and $\overline{b}$ denote the inverses of a and b modulo b and a, respectively, so we have $a \overline{a} \equiv 1 \ (\textrm{mod } b)$ and $b \overline{b} \equiv 1 \ (\textrm{mod } a)$. The conclusion follows from the Chinese remainder theorem, once we multiply the congruence by ab and verify it modulo a and b separately.
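As an illustrative numerical check (not part of the argument), the congruence $a\overline{a} + b\overline{b} \equiv 1 \ (\textrm{mod } ab)$ obtained after multiplying through by ab can be verified directly for small coprime pairs:

```python
from math import gcd

def check_bezout_reciprocity(bound):
    """Verify a*inv(a mod b) + b*inv(b mod a) == 1 (mod a*b) for coprime 1 < a, b <= bound."""
    for a in range(2, bound + 1):
        for b in range(2, bound + 1):
            if gcd(a, b) == 1:
                a_bar = pow(a, -1, b)  # inverse of a modulo b (Python 3.8+)
                b_bar = pow(b, -1, a)  # inverse of b modulo a
                assert (a * a_bar + b * b_bar) % (a * b) == 1
    return True
```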
Given $N > 0$, we write $n \sim N$ for the statement that $N < n \leqslant 2N$, usually in the subscripts of sums. Given a statement S, we write $\mathbb{1}_S$ for its truth value (e.g., $\mathbb{1}_{2 \mid n}$ equals 1 when n is even and 0 otherwise); we may use the same notation for the indicator function of a set S (i.e., $\mathbb{1}_S(x) = \mathbb{1}_{x \in S}$).
Given $a_1, \ldots, a_k \in \mathbf{Z}$, we write $(a_1, \ldots, a_k)$ (if not all $a_i$ are 0) and $[a_1, \ldots, a_k]$ (if none of the $a_i$ are 0) for their greatest common divisor and lowest common multiple, among the positive integers. Given $a \in \mathbf{Z} \setminus \{0\}$, we write $\textrm{rad}(a)$ for the largest square-free positive integer dividing a; for $b \in \mathbf{Z}$, we also write $a \mid b^\infty$ if and only if $\textrm{rad}(a) \mid b$ (i.e., a divides a large enough power of b), and $(a, b^\infty)$ for the greatest divisor of a whose prime factors divide b. If $x > 0$ and $m \in \mathbf{Z} \setminus \{0\}$, sums like $\sum_{n \leqslant x}$, $\sum_{n \sim x}$, $\sum_{d \mid m}$, $\sum_{d \mid m^\infty}$, $\sum_{(a, m) = 1}$, $\sum_{(a, m^\infty) = 1}$ and $\sum_{ab = m}$ are understood to range over all positive integers n, d, a, b with the respective properties.
We also keep the notations specific to smooth numbers from the introduction, for S(x, y), $\Psi(x, y) = \# S(x, y)$, $\Psi_q(x, y)$, $\Psi(x, y; a, q)$, and H(u).
3.2 Multiplicative number theory
We denote by $\mu$, $\tau$, and $\varphi$ the Möbius function, the divisor-counting function ($\tau(n) := \sum_{d \mid n} 1$), and the Euler totient function ($\varphi(n) := \sum_{1 \leqslant a \leqslant n} \mathbb{1}_{(a, n) = 1}$). We may use various classical bounds involving these functions implicitly, including the divisor bound $\tau(n) \ll_\varepsilon n^\varepsilon$ (valid for all $\varepsilon > 0$), the lower bound $\varphi(n) \gg n/(\log \log n)$, and the upper bounds

(The latter follows from the former, using that $\varphi(ab) \gg \varphi(a) \varphi(b)$ for positive integers a, b.)
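In fact $\varphi(ab) \geqslant \varphi(a)\varphi(b)$ holds exactly, since $\varphi(ab) = \varphi(a)\varphi(b) \cdot d/\varphi(d)$ for $d = (a, b)$ and $d \geqslant \varphi(d)$; a brute-force check (illustrative only):

```python
from math import gcd

def phi(n):
    """Euler's totient function, by direct count."""
    return sum(1 for a in range(1, n + 1) if gcd(a, n) == 1)

def check_phi_supermultiplicative(bound):
    """Verify phi(a*b) >= phi(a) * phi(b) for all 1 <= a, b <= bound."""
    return all(phi(a * b) >= phi(a) * phi(b)
               for a in range(1, bound + 1) for b in range(1, bound + 1))
```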
We write $\chi \ (\textrm{mod } q)$ to indicate that $\chi$ is a Dirichlet character with period q (of which there are $\varphi(q)$), and denote by $\textrm{cond}(\chi)$ the conductor of $\chi$ (which divides q; this is the smallest positive integer d such that there exists a Dirichlet character $\chi' \ (\textrm{mod } d)$ with $\chi(n) = \chi'(n) \mathbb{1}_{(n, q) = 1}$ for all $n \in \mathbf{Z}$); we say that $\chi \ (\textrm{mod } q)$ is primitive when $\textrm{cond}(\chi) = q$. We will require a couple of results involving Dirichlet characters, the first being essentially due to Harper.
Lemma 3.2 (Contribution of small-conductor characters). There exist constants $\varepsilon, \delta, C > 0$ such that, for $(\log x)^C \leqslant y \leqslant x$, $Q \leqslant x$ and $A > 0$, one has

with $u := (\log x)/(\log y)$, $H(u) := \exp(u \log^{-2}(u+1))$. The implicit constant is effective if $A < 1$.
Proof. This is the same as [Dra15, Lemme 5], and follows from the work of Harper in [Har12, § 3.3].
Remark 3.3. The condition $\textrm{cond}(\chi) > 1$ in Lemma 3.2 leaves out the trivial character $\chi_0$.
The second result is the classical multiplicative large sieve, as stated in [Dra15, Lemme 6].
Lemma 3.4 (Multiplicative large sieve). For $Q, M, N \geqslant 1$ and any sequence $(a_n)$ of complex numbers, one has

Proof. See, for example, [IK21, Theorem 7.13].
3.3 Fourier analysis
Given an integrable function $f : \mathbf{R} \to \mathbf{C}$, we write

$$\hat{f}(\xi) := \int_{\mathbf{R}} f(t)\, \textrm{e}(-\xi t)\, \textrm{d}t$$

for its Fourier transform. We will need the truncated version of Poisson summation stated below.
Lemma 3.5 (Truncated Poisson/Fourier completion). Let $C > 0$, $x > 1$, $1 < M \ll x$, and $\Phi : \mathbf{R} \to \mathbf{R}$ be a smooth function supported in $[1/10, 10]$ such that $\|\Phi^{(j)}\|_\infty \ll_j (\log x)^{jC}$ for $j \geqslant 0$. Then for all positive integers $q \ll x$, any $a \in \mathbf{Z}/q\mathbf{Z}$, and any $\varepsilon > 0$, $H \geqslant x^{\varepsilon} qM^{-1}$, one has

Proof. This is the same as [May25a, Lemma 13.4] (see also [Dra15, Lemme 2]), following directly from the Poisson summation formula.
While Lemma 3.5 will introduce exponential sums into our estimates, we will need an additional corollary (and generalization) of it to obtain sums of complete Kloosterman sums, defined by

$$S(m, n; c) := \sum_{\substack{x \ (\textrm{mod } c) \\ (x, c) = 1}} \textrm{e}\left( \frac{m x + n \overline{x}}{c} \right).$$
The following is the same as [May25a, Lemma 13.5], and can be quickly deduced from Lemma 3.5.
Lemma 3.6 (Kloosterman completion). Let $C, x, M, \Phi$ be as in Lemma 3.5. Then for all positive integers $c, q \ll x$ with $(c, q) = 1$, any $a \in \mathbf{Z}/q\mathbf{Z}$, $n \in \mathbf{Z}/c\mathbf{Z}$, and any $\varepsilon > 0$, $H \geqslant x^\varepsilon cq M^{-1}$, one has

Proof. Rewrite the left-hand side as

and apply Lemma 3.5 to expand the inner summation, for the unique residue class $r \in \mathbf{Z}/cq\mathbf{Z}$ which is congruent to $a \ (\textrm{mod } q)$ and to $b \ (\textrm{mod } c)$ (invoking the Chinese remainder theorem). Noting that

by Lemma 3.1, then swapping sums and taking out the factor depending on a, the conclusion follows.
3.4 Bounds for exponential sums
The simpler two of the three dispersion sums arising in our computations (see (5.6)) will be estimated using the classical bounds for Gauss and Kloosterman sums.
Lemma 3.7 (Gauss sum bound). For any $a \in \mathbf{Z}$, $q \in \mathbf{Z}_+$, and Dirichlet character $\chi \ (\textrm{mod } q)$, one has

Proof. This follows from [IK21, Lemmas 3.2 and 3.1], and is also used in [Dra15, § 3.2].
Lemma 3.8 (Weil and Ramanujan bounds). For $c \in \mathbf{Z}_+$ and $m, n \in \mathbf{Z}$ (or $\mathbf{Z}/c\mathbf{Z}$), one has

$$|S(m, n; c)| \leqslant \tau(c)\, (m, n, c)^{1/2}\, c^{1/2}.$$
For $m = 0$, we have in fact $|S(0, n; c)| \leqslant (n, c)$.
Proof. The first (Weil) bound is [IK21, Corollary 11.12], while the second (Ramanujan) bound can be deduced by Möbius inversion.
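These bounds are easy to test numerically for small moduli; the sketch below (illustrative only) computes $S(m, n; c)$ from the definition and checks it against the Weil bound $\tau(c)(m, n, c)^{1/2} c^{1/2}$ and, for $m = 0$, the Ramanujan bound $(n, c)$:

```python
from cmath import exp, pi
from math import gcd, sqrt

def kloosterman(m, n, c):
    """The complete Kloosterman sum S(m, n; c), computed from its definition."""
    total = 0.0
    for x in range(1, c + 1):
        if gcd(x, c) == 1:
            x_bar = pow(x, -1, c)  # multiplicative inverse of x mod c
            total += exp(2j * pi * (m * x + n * x_bar) / c)
    return total

def tau(c):
    """Divisor-counting function."""
    return sum(1 for d in range(1, c + 1) if c % d == 0)

def weil_bound_holds(m, n, c):
    """Check |S(m, n; c)| <= tau(c) * (m, n, c)^(1/2) * c^(1/2), with float slack."""
    g = gcd(gcd(m, n), c)
    return abs(kloosterman(m, n, c)) <= tau(c) * sqrt(g) * sqrt(c) + 1e-6
```

Note that $S(0, 0; c) = \varphi(c)$, which the first test below uses as a consistency check.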
Lemma 3.9 (Incomplete Weil bound). Let $x > 1$, $1 < M \ll x$, and let $n, c, k, \ell \ll x$ be positive integers. Then for any $\varepsilon > 0$, one has

Proof. This follows immediately from [Dra15, (2.5)] and the divisor bound (see also [Fou82, Lemme 4] and [May25a, Lemma 16.1]). It can be proven by expanding $m\varphi(m)^{-1} = \sum_{v \mid m^\infty} v^{-1}$ and $\mathbb{1}_{(m, k) = 1} = \sum_{d \mid k} \mu(d) \mathbb{1}_{d \mid m}$, changing variables $m \gets m [\ell, \textrm{rad}(v), d]$, completing sums via a result like Lemma 3.6, and finally applying Lemma 3.8 for the terms $h = 0$ and $h \neq 0$ separately.
To estimate the first dispersion sum, we will crucially need the following bound for sums of Kloosterman sums, which is an optimization of [DI82, Theorem 11] (see also [BFI87, Lemma 5]).
Theorem 3.10 (The DI-type Kloosterman bound). Let $1 \ll M, N, R, S, C \ll x^{O(1)}$, $(a_{m,r,s})$ be a complex sequence supported in $m \sim M, r \sim R, s \sim S$, and $\omega \in \mathbf{R}/\mathbf{Z}$. Also, let g(t) be a smooth function supported on $t \asymp 1$, with bounded derivatives $\|g^{(j)}\|_\infty \ll_{j} 1$ for $j \geqslant 0$. Then, for any $\eta > 0$, one has

where we recall that $\theta_{\max} \leqslant 7/32$ by Theorem A. (The ‘$\pm n$’ notation indicates that either consistent choice of sign is allowable.)
Theorem 3.10 makes use of the spectral theory of automorphic forms, and follows from a variation of the landmark arguments of Deshouillers and Iwaniec (all of the necessary ingredients being already present in [Reference Deshouillers and IwaniecDI82]). We leave its proof, which requires much additional notation, to § 10.
4. The triple convolution estimate
Here, we state our main technical result, Theorem 4.2, which concerns the distribution in arithmetic progressions of convolutions of three bounded sequences (we point the reader to similar results in [BFI86, Theorem 4], [Dra15, Théorème 3], [DGS17, Lemma 2.3], and [May25a, Proposition 8.3]). We then deduce Theorem 1.5 from Theorem 4.2.
Remark 4.1. One can apply the most efficient convolution estimates directly to the setting of smooth numbers (and smooth-supported multiplicative functions), since these can essentially be factorized into any number of factors of pre-specified sizes. By contrast, in the case of primes, combinatorial decompositions of the von Mangoldt function produce more types of convolution sums, requiring different estimates for different ranges (typically organized into ‘type I’ and ‘type II’ information).
To achieve a power saving in Theorem 4.2, appropriate for the application to smooth numbers, one needs a better approximation for indicator functions of the form $\mathbb{1}_{k \equiv 1 \ \textrm{mod } r}$ than $({1}/{\varphi(r)})\mathbb{1}_{(k, r) = 1}$ (given $r \in \mathbf{Z}_+$ and $k \ \textrm{mod } r$). Drappeau [Dra15, DGS17] noticed that since

one can instead consider the partial sum

and work with the error term

One should then expect to obtain better bounds for $\mathcal{E}_D(k; r)$ when D is moderately large (i.e., a small power of x) than when $D = 1$ (and $\omega_1(k; r) = 1$). We also note the crude bound

which may be used implicitly in our proofs.
Theorem 4.2 (Triple convolution estimate). For all sufficiently small $\varepsilon > 0$, there exists $\delta > 0$ such that the following holds. Let $a_1, a_2$ be coprime nonzero integers, and let $M, N, L, R, x > 2$ satisfy

for $\theta = \theta_{\max}$. Then for any 1-bounded complex sequences $(\alpha_m)$, $(\beta_n)$, $(\gamma_\ell)$, one has

for all $1 \leqslant D \leqslant x^\varepsilon$.
Remark 4.3. Error terms of the form $O_\varepsilon(x^{1-\delta})$ are dominated by the right-hand side of (4.3), and will be available throughout most of our proof. If $x^{2\delta} \leqslant D \leqslant x^\varepsilon$, then the right-hand side of (4.3) becomes $x^{1-\delta} (\log x)^4$, i.e., a power saving; having an explicit dependence on the conductor bound D is required for the application to smooth-supported multiplicative functions, as in [DGS17].
Remark 4.4. If one is free to choose the parameters M, N, and L subject only to the constraints in (4.2) and $MNL \asymp x$, then in order to maximize the range R, it is optimal to pick (up to $x^{o(1)}$ factors)

This improves on the conditions from Drappeau’s triple convolution estimate [Dra15, (3.2)], which can handle moduli up to $R \approx x^{3/5}$.
Remark 4.5. Although our (conditional) results hit a barrier at
$R = x^{5/8-o(1)}$
, a more essential limitation of triple convolution estimates proven via the dispersion method lies at
$R \leqslant x^{2/3 - o(1)}$
, corresponding to the case of three equal parameters
$M = N = L \asymp x^{1/3}$
in (4.4). Indeed, the diagonal terms in the first Cauchy–Schwarz step require
$R < NL$
, and already our bounds for the second dispersion sum will use
$NL < x^{2/3}$
(moreover, it is natural to Fourier complete in the largest variable m, leaving
$NL \leqslant x^{2/3}$
). We note again that Theorem 4.2 allows for the case of two equal parameters
$M = N \approx x/R$
, for any
$x^{1/2-o(1)} \ll R \ll x^{(5-4\theta)/(8-6\theta)-o(1)}$
; this is possible due to Maynard’s deamplification argument [Reference MaynardMay25a]. In particular, the choice
$M = N = x^{2/5}$
and
$L = x^{1/5}$
(a limiting case in Drappeau’s work [Reference DrappeauDra15, Théorème 3]) is now admissible (this is analogous to the infamous case of convolving five sequences of equal sizes).
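The heuristic behind this barrier can be recorded in one chain of constraints, restating the remark above in symbols:

```latex
% Diagonal terms of the first Cauchy--Schwarz step require  R < NL.
% Fourier completion in the largest variable m suggests  M \geqslant x^{1/3},
% i.e. NL = x/M \leqslant x^{2/3}.  Hence
R \;<\; NL \;\leqslant\; x^{2/3},
% with the limit approached at the balanced choice M = N = L \asymp x^{1/3}.
```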
Given Theorem 4.2 and Lemma 3.2, deducing Theorem 1.5 is now a routine modification of Drappeau’s argument in [Reference DrappeauDra15, §3.7] (we follow the same reasoning, using Theorem 4.2 instead of [Reference DrappeauDra15, Théorème 3], and with the choice of parameters in (4.4)).
Proof of Theorem 1.5 assuming Theorem 4.2. Let
$\varepsilon > 0$
be sufficiently small, let C be the maximum of
$\varepsilon^{-1}(1-\varepsilon)^{-1}$
and the constant given by Lemma 3.2, let
$\delta$
be the minimum of
$\varepsilon/100$
and the
$\delta$
values of Lemma 3.2 and Theorem 4.2, and let
$(\log x)^C \leqslant y \leqslant x^{1/C}$
,
$D := x^{\varepsilon/10}$
,
$\theta := \theta_{\max}$
. It suffices to show (up to a rescaling of
$\varepsilon$
at the end) that (1.3) holds for the range of moduli

and we note that error terms of the form
$O_\varepsilon(x^{1-\delta})$
are acceptable in (1.3) (up to slightly modifying the value of
$\delta$
), due to the inequality
$x^{1-\delta} \ll \Psi(x, y) y^{-\delta/2}$
(as in [Reference DrappeauDra15, §3.7]). We may of course assume that
$a_1$
and
$a_2$
are relatively prime, after dividing out any common factors.
We split the left-hand side of (1.3) into

The second sum is at most

which is appropriately bounded by Lemma 3.2 and the triangle inequality. It remains to bound the first sum.
Recall that the range
$n \in S(x, y)$
means
$n \leqslant x$
and
$P^+(n) \leqslant y$
, where
$P^+(n)$
denotes the greatest prime factor of n. We bound the contribution of
$n \leqslant x^{1-\varepsilon}$
trivially, as in [Reference DrappeauDra15, §3.7]

and put the other n values into
$O(\log x)$
dyadic ranges
$n \sim X$
, with
$x^{1-\varepsilon} \leqslant X \ll x$
. We also extend the range of r in these sums to
$r \leqslant X^{{\alpha} + 10\varepsilon}$
, noting that
$x^{\alpha} \leqslant x^{(1-\varepsilon)({\alpha} + 10\varepsilon)}$
. Putting r into
$O(\log x)$
dyadic ranges
$r \sim R$
, it remains to bound sums of the form

for
$R \ll X^{{\alpha}+10\varepsilon}$
. The contribution of the Bombieri–Vinogradov range
$R \ll X^{(1/2) - (\varepsilon/10)}$
is handled by classical methods (e.g., using [Reference Iwaniec and KowalskiIK21, Theorem 17.4]; see [Reference Drappeau, Granville and ShaoDGS17, Proof of Proposition 2.4] and [Reference DrappeauDra15, Proof of Proposition 2]). For any R in the remaining range
$X^{(1/2) - (\varepsilon/10)} \leqslant R \ll X^{{\alpha}+10\varepsilon}$
, we set

and factorize smooth numbers as in [Reference DrappeauDra15, Lemme 7] (or [Reference Fouvry and TenenbaumFT96]) to rewrite the sum in (4.5) as

where
$P^-(n)$
denotes the smallest prime factor of n.
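The factorization step relies on the standard fact that a $y$-smooth integer $n \geqslant Z$ has a divisor in $[Z, yZ)$: multiplying prime factors of $n$ one at a time, the running product first reaches $Z$ at a step where it grows by a factor of at most $y$. A minimal illustration of this property (hypothetical helper names; this is not the statement of [Reference DrappeauDra15, Lemme 7]):

```python
def prime_factors(n):
    """Prime factors of n with multiplicity, by trial division."""
    out, p = [], 2
    while p * p <= n:
        while n % p == 0:
            out.append(p)
            n //= p
        p += 1
    if n > 1:
        out.append(n)
    return out

def smooth_divisor_in_window(n, Z):
    """For n >= Z, return a divisor d of n with Z <= d < Z * P^+(n).

    Greedy: grow a product of prime factors until it first reaches Z;
    the last prime multiplied is at most P^+(n), so d < Z * P^+(n).
    """
    d = 1
    for p in sorted(prime_factors(n)):
        if d >= Z:
            break
        d *= p
    return d

# Example: n = 2^5 * 3^4 * 7 = 18144 is 7-smooth, and we seek a divisor in [100, 700).
n, Z = 18144, 100
d = smooth_divisor_in_window(n, Z)
assert n % d == 0 and Z <= d < Z * 7
```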
We also put
$m, n, \ell$
into
$O((\log y)^3)$
dyadic ranges
$m \sim M$
,
$n \sim N$
,
$\ell \sim L$
, with
$M \in [M_0, yM_0]$
,
$N \in [y^{-2}N_0, N_0]$
,
$L \in [L_0, yL_0]$
, and
$MNL \asymp X$
. Recalling that
$y \leqslant x^{\varepsilon(1-\varepsilon)} \leqslant X^\varepsilon$
, it is easily checked that for such M, N, L and
$X^{(1/2)-(\varepsilon/10)} \leqslant R \ll X^{{\alpha}+10\varepsilon} = X^{(5-4\theta)/(8-6\theta) - 990\varepsilon}$
, and small enough
$\varepsilon$
, the conditions in (4.2) are satisfied (with respect to X instead of x).
At this point, our sums are almost in the right form to apply the triple convolution estimate in Theorem 4.2, except for a few joint constraints on the variables
$m, n, \ell$
(these are
$P^+(m) \leqslant P^-(\ell)$
,
$P^+(n) \leqslant P^-(m)$
, respectively,
$mn\ell \sim X$
). The last step of analytically separating these constraints is identical to that in [Reference DrappeauDra15, §3.7], except that in the end we apply Theorem 4.2 instead of [Reference DrappeauDra15, Théorème 3]. Overall, the contribution of the range
$X^{(1/2)-(\varepsilon/10)} \leqslant R \ll X^{{\alpha} + 10\varepsilon}$
is
$O_\varepsilon ( (\log x)^{O(1)} x^{1-\delta} )$
, which is acceptable; this completes our proof.
We only briefly note that the result for smooth-supported multiplicative functions in Theorem 1.7 follows by an analogous modification to the arguments in [Reference Drappeau, Granville and ShaoDGS17], using the parameters in (4.6), and Theorem 4.2 instead of [Reference Drappeau, Granville and ShaoDGS17, Lemma 2.3]. The main additional difficulty in [Reference Drappeau, Granville and ShaoDGS17] lies in the contribution of the small-conductor characters, since Lemma 3.2 is no longer applicable; as a replacement, Drappeau, Granville, and Shao developed a large sieve inequality for smooth-supported sequences [Reference Drappeau, Granville and ShaoDGS17, Theorem 5.1]. (We also point the reader to the follow-up work of Shparlinski in [Reference ShparlinskiShp18].)
5. Dispersion and deamplification
Our goal for the rest of this paper is to prove Theorem 4.2, proceeding by Linnik’s dispersion method. For the reader following the outline in § 2.1, the exponential sum from (2.3) will ultimately arise in the first dispersion sum, after Poisson summation (see Proposition 8.2).
Assume the set-up of Theorem 4.2. We may take x larger than an absolute constant, since the conclusion of Theorem 4.2 is trivial otherwise, and
$(\alpha_m)$
,
$(\beta_n)$
, and
$(\gamma_\ell)$
to be supported on
$m \sim M$
,
$n \sim N$
, and
$\ell \sim L$
, without loss of generality. We first combine the sequences
$\beta_n$
and
$\gamma_\ell$
into one sequence

supported in (K, 4K] where
$K := NL$
,
$|u_k| \leqslant \tau(k) \ll_\varepsilon x^{\varepsilon/2}$
, and
$\sum_k |u_k| \ll K$
. Denoting the left-hand side of (4.3) by
$\Delta = \Delta_D(M,N,L,R)$
, we can introduce coefficients
$\rho_r$
of absolute value 1, supported in (R, 2R], to rewrite

where we recall that
$\omega_D$
was defined in (4.1). Normally, at this point we would apply Cauchy and Schwarz in the r, m variables, but we first perform a ‘deamplification’ step (following Maynard [Reference MaynardMay25a] with minor modifications), as anticipated in § 2.3. The idea is to split the inner sum according to the residue class of
$k \ (\textrm{mod } re)$
for some
$e \geqslant 1$
, and then to average over a convenient set of
$e \sim E$
; this artificially introduces a new parameter E (to be chosen later), which will help reduce the contribution of a certain diagonal sum by a small power of x (at the expense of increasing the corresponding off-diagonal terms, which already had a power-saving bound). For now, we require that

which is compatible with (4.2) since
$R \ll x^{-5\varepsilon} NL$
. For multiple reasons of convenience throughout our proof, we will actually average over the set
$\mathcal{E} := \{e \sim E : e \ \textrm{prime}\}.$
Proposition 5.1 (Dispersion set-up with deamplification). Let
$\Phi$
be a smooth function satisfying

and
$\|\Phi^{(j)}\|_\infty \ll_j 1$
for
$j \geqslant 0$
. Then, for any
$\varepsilon > 0$
and
$1 \gg \delta > 0$
, under the parameter conditions in (4.2) and (5.2), one has

where

Proof. For a fixed prime
$e \sim E$
, we wish to eliminate the contribution to
$\Delta$
of the terms k with
$(e, k) \neq 1$
, i.e.,
$e \mid k$
. This contribution is

which is
$\ll x^{1-\varepsilon}$
since
$R \ll x^{(2/3)-11\varepsilon}$
by (4.2). It follows that, for any
$e \sim E$
,

Now for fixed m and e we have

and there are precisely
$\varphi(re)/\varphi(r)$
choices of b in the summation; thus

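The counting step above can be sketched as follows (a sketch of the residue-splitting identity, assuming for simplicity that $e \nmid r$, with $c$ standing for the relevant residue class modulo r):

```latex
% Splitting one residue class mod r into residue classes mod re:
\mathbb{1}_{k \equiv c \ (\mathrm{mod}\ r)} \, \mathbb{1}_{(k, e) = 1}
  \;=\; \sum_{\substack{b \ (\mathrm{mod}\ re) \\ b \equiv c \ (\mathrm{mod}\ r) \\ (b, e) = 1}}
        \mathbb{1}_{k \equiv b \ (\mathrm{mod}\ re)},
% and for prime e \nmid r the number of admissible residues b is
\#\{b\} \;=\; \varphi(e) \;=\; e - 1 \;=\; \frac{\varphi(re)}{\varphi(r)}.
```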
(For a complete deamplification set-up, one could also try to split the term
$\omega_D(mk\overline{a_1}a_2; r)$
according to the residue b of
$k \ (\textrm{mod } re)$
, but we do not need to do this in our proof.) We then average over e in the set
$\mathcal{E}$
from (5.3), which has size
$|\mathcal{E}| \asymp_\varepsilon E/\log E$
(recalling that
$E \gg x^{\varepsilon}$
and
$|a_2| \leqslant x^\delta$
). Thus up to an error of
$O_\varepsilon(x^{1-\varepsilon})$
, we can rewrite
$\Delta$
as

We now apply Cauchy and Schwarz in the e,r,m,b variables, allowing us to eliminate the
$\rho_r$
and
$\alpha_m$
coefficients; using that
$\varphi(re) \leqslant \varphi(r) e$
, this gives

where

Anticipating a later application of Poisson summation, we bound the indicator functions
$\mathbb{1}_{m \sim M}$
and
$\mathbb{1}_{r \sim R}$
from above by
$\Phi(m/M)$
and
$\Phi(r/R)$
. Then we expand the square and perform the b-summation to obtain

Combining (5.7) with (5.8) and recalling that
$M \asymp x/K$
, we recover (5.5).
Since the error term of
$O_\varepsilon(x^{1-\varepsilon})$
in Proposition 5.1 is admissible for Theorem 4.2, it remains to estimate the dispersion sums
$\mathcal{S}_1, \mathcal{S}_2, \mathcal{S}_3$
.
6. The main terms
We note that, except for the coefficients
$\Phi({m}/{M})$
, only the residue of m modulo r matters in the inner summations from (5.6). Thus if we define

which can be estimated via the truncated Poisson summation in Lemma 3.5, we can rewrite

where

Intuitively, these main terms reflect what would happen if, in the summations from (5.6), the variable m (weighted by
$\Phi({m}/{M})$
) were uniformly distributed modulo r. Thus for
$j \in \{1, 2, 3\}$
,
$\widehat{\Phi}(0)X_j$
is essentially the best approximation to
$\mathcal{S}_j$
which does not depend on M. We now bound the contribution to (5.5) of
$X_1 - 2\textrm{Re} X_2 + X_3$
, using the multiplicative large sieve as in [Reference DrappeauDra15, Reference Drappeau, Granville and ShaoDGS17].
Proposition 6.1 (Contribution of main terms). With the notation above, one has

Proof. In analogy with (5.8), we can write

where the first equality shows that
$X_1 - 2\textrm{Re} X_2 + X_3 \geqslant 0$
. Note that

where
$S = S(r,e,\varepsilon) = S_1 \cup S_2$
and

Since all the characters in S are primitive, any distinct
$\chi_1, \chi_2 \in S$
must induce different characters modulo re. Thus
$\chi_1 \overline{\chi_2} \mathbb{1}_{(re, \cdot) = 1}$
is not the principal character modulo re, so it must have average 0. But then

From (6.4), (6.5), and the fact that all characters
$\chi \in S_1$
also have
$\textrm{cond}(\chi) > D$
(due to
$D \leqslant x^\varepsilon < E \leqslant e \mid \textrm{cond}(\chi)$
), we conclude that

Now letting
$Q := RE$
, substituting q for re, using that q has
$O(\log q)$
different prime factors, and decomposing
$\mathbb{1}_{(a, b) = 1} = \sum_{d \mid (a, b)} \mu(d)$
to get rid of the coprimality restriction, we can bound the sum above by

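The Möbius step above is the classical identity $\sum_{d \mid n} \mu(d) = \mathbb{1}_{n=1}$ applied with $n = (a, b)$; a quick numerical sanity check (illustration only, not part of the argument):

```python
from math import gcd

def mobius(n):
    """Möbius function via trial-division factorization."""
    result, p = 1, 2
    while p * p <= n:
        if n % p == 0:
            n //= p
            if n % p == 0:      # squared prime factor
                return 0
            result = -result
        p += 1
    if n > 1:
        result = -result
    return result

def divisors(n):
    return [d for d in range(1, n + 1) if n % d == 0]

# Check 1_{(a,b)=1} == sum_{d | (a,b)} mu(d) on a small range.
for a in range(1, 40):
    for b in range(1, 40):
        lhs = 1 if gcd(a, b) == 1 else 0
        rhs = sum(mobius(d) for d in divisors(gcd(a, b)))
        assert lhs == rhs
```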
Noting that

we further have

Finally, applying the multiplicative large sieve from Lemma 3.4 as in [Reference Drappeau, Granville and ShaoDGS17, (2.6)], we conclude that

Using the condition
$RE \ll x^{-\varepsilon}K$
from (5.2) to bound
$Q = RE \ll K/D$
, we conclude that

as we wanted.
To bridge Propositions 5.1 and 6.1, it remains to compare the dispersion sums
$\mathcal{S}_j$
with their main terms
$\widehat{\Phi}(0) X_j$
; we make the following claim.
Proposition 6.2 (Estimates for dispersion sums). For all sufficiently small
$\varepsilon > 0$
, there exists
$\delta > 0$
such that, with the notation in (6.2), the following hold.
- (i) Assuming the ranges in (4.2), there exists a choice of E satisfying (5.2) such that
(6.6)\begin{equation} \mathcal{S}_1 - \widehat{\Phi}(0)X_1 \ll_\varepsilon x^{-2\delta} \frac{K^2}{R}.\end{equation}
- (ii) Assuming both (4.2) and (5.2), one has
(6.7)\begin{equation} \mathcal{S}_2 - \widehat{\Phi}(0)X_2 \ll_\varepsilon x^{-2\delta} \frac{K^2}{R},\end{equation}
(6.8)\begin{equation} \mathcal{S}_3 - \widehat{\Phi}(0)X_3 \ll_\varepsilon x^{-2\delta} \frac{K^2}{R}.\end{equation}
Proof of Theorem 4.2 assuming Proposition 6.2. Since Theorem 4.2 assumes (4.2), we can pick E as in Proposition 6.2(i), subject to (5.2). Then, by combining Propositions 5.1, 6.1 and 6.2, we obtain

The conclusion of Theorem 4.2 follows after replacing
$\delta$
with
$\min(\delta, \varepsilon)$
.
Our remaining task is to prove Proposition 6.2; the truncated Poisson expansion of the coefficients
$\mathcal{E}_r(c)$
from (6.2) will ultimately reduce our problem to that of bounding various exponential sums. We note that we have not chosen the value of
$\delta$
in terms of
$\varepsilon$
yet; the condition
$\delta \leqslant \varepsilon/2$
will suffice for estimating
$\mathcal{S}_2$
and
$\mathcal{S}_3$
, but more will be needed for the (much more involved) study of
$\mathcal{S}_1$
.
7. The second and third dispersion sums
Here, we prove Proposition 6.2(ii), adapting Drappeau’s arguments in [Reference DrappeauDra15, §§ 3.2 and 3.3]. We assume all the parameter conditions in (4.2) and (5.2).
Proof of (6.8), estimating
$\mathcal{S}_3$
. Recall from (6.2) that

where
$\mathcal{E}_r(c)$
is given by (6.1). Expanding
$\mathcal{E}_r(c)$
according to Lemma 3.5 with
$H := x^\varepsilon R M^{-1}$
, we obtain

(In such manipulations, we warn the reader of the potential confusion between the integer variable
$e \in \mathcal{E}$
and the function
$\textrm{e}(\cdot)$
; the difference should be clear from context.)
The inner sum (over c) in (7.1) is a Gauss sum, which we can bound using Lemma 3.7 for the Dirichlet character
$\chi_1 \overline{\chi_2} \mathbb{1}_{(\cdot, r) = 1} \ (\textrm{mod } r)$
(whose conductor divides r and is at most equal to
$D^2 \leqslant x^{2\varepsilon}$
). This yields

which leads to

Since
$x/M \asymp K \ll x^{(2/3)-6\varepsilon}$
by (4.2), we have
$M \gg x^{1/3+6\varepsilon} \gg x^{7\varepsilon}$
for small enough
$\varepsilon$
, and in particular
$\mathcal{S}_3 - \widehat{\Phi}(0)X_3\ll_\varepsilon x^{-\varepsilon} K^2/R$
, proving the easiest third of Proposition 6.2.
Proof of (6.7), estimating
$\mathcal{S}_2$
. Recall from (6.2) that

Applying Lemma 3.5 with
$H := x^\varepsilon R M^{-1}$
to expand
$\mathcal{E}_r(a_1\overline{a_2 k_2})$
(as given in (6.1)), we obtain

where we used that
$\varphi(re) \gg \varphi(r) e$
as before (since e is prime). The error term is acceptable, so let us focus on the main term on the right-hand side (denote this by
$Y_2$
). By Lemma 3.1, we have

so that

At this point we decompose

aiming to apply the exponential sum bound in Lemma 3.9. Fixing
$e, a_i, k_i, h$
, this lets us rewrite the sum over r on the right-hand side of (7.2) as

where

and

Note that u extends to a differentiable function of a real variable
$\xi$
, supported in
$[R/2, 3R]$
, and with derivative bounds

in this region (we used that
$H = x^\varepsilon R M^{-1}$
,
$MK \asymp x$
, and the very crude bound
$|a_1| \ll x^{1+\varepsilon}$
). So we may use integration by parts to estimate
$Z_2$
; letting

which can be bounded via Lemma 3.9 (with
$n = a_1 h$
,
$c = a_2k_2$
,
$m = r$
, and
$k = a_1k_1$
), we obtain

uniformly in
$\ell \geqslant 1$
. Returning to (7.2) and summing over h and
$k_2$
, the GCD terms contribute at most
$O_\varepsilon(x^{\varepsilon})$
on average (since
$(a_1, a_2) = 1$
). Thus

By (4.2), we have
$K^{3/2} \ll x^{1-9\varepsilon}$
and
$R \ll x^{1-11\varepsilon}$
, so we get a final bound of
$Y_2 \ll_\varepsilon x^{-\varepsilon} K^2 / R$
. This completes our proof of Proposition 6.2(ii).
8. The first dispersion sum
Finally, we work towards establishing Proposition 6.2(i) (for a suitable choice of
$\delta$
in terms of
$\varepsilon$
); the first part of this section is very similar to [Reference DrappeauDra15, §3.4]. Recall from (6.2) that

where
$\mathcal{E}_r(c)$
is given by (6.1). We wish to bound this by
$O_\varepsilon(x^{-2\delta} K^2 / R)$
, as in (6.6).
We still aim to apply Poisson summation for the sums
$\mathcal{E}_r(c)$
, and reduce our problem to bounding certain exponential sums. But due to issues that would arise later in manipulating these exponential sums, we first need to eliminate the contribution of certain ‘bad’ pairs
$(k_1, k_2)$
, in terms of a small parameter
$\eta$
(to be chosen later in terms of
$\varepsilon$
, as an intermediary step to choosing
$\delta$
).
Proposition 8.1 (Eliminating bad index pairs). For
$\varepsilon \geqslant \eta > 0$
, under the parameter conditions in (4.2) and (5.2), one has

where

Proof. We eliminate the contribution of several sets of pairs
$(k_1, k_2)$
, putting absolute values on all the coefficients involved; thus, it does not matter if some of the ‘eliminated sets’ have nonempty intersections. First, we consider the almost-diagonal pairs with
$|k_1 - k_2| \leqslant K/x^\eta$
; using that
$\sum_{c \in (\mathbf{Z}/r\mathbf{Z})^\times} |\mathcal{E}_r(c)| \leqslant ({1}/{M}) \sum_m \Phi({m}/{M}) + \widehat{\Phi}(0) \ll 1$
and
$x^{\eta} RE \ll K$
(which follows from (5.2) and
$\eta \leqslant \varepsilon$
), these contribute to
$\mathcal{S}_1 - \widehat{\Phi}(0)X_1$
at most

Then, we consider those pairs with
$v := (k_1, k_2) > x^{\eta/2}$
. Their contribution to
$\mathcal{S}_1 - \widehat{\Phi}(0)X_1$
is at most

Using that
$(v, re) = 1$
, we can bound one inner sum over k by
$\ll_\eta x^{\eta/8}(K(vRE)^{-1} + 1) \ll x^{-3\eta/8}K(RE)^{-1}$
(recall that
$x^{\eta} RE \ll K$
by (5.2)). This yields a total contribution of

which is also acceptable. Keeping the notation
$v = (k_1, k_2)$
, which we may now assume is at most
$x^{\eta/2}$
, note that

and let us consider those pairs
$(k_1, k_2)$
with
$d_1 > x^\eta$
. Using that
$x^{\eta/2} RE \ll K$
and swapping sums, these contribute at most

Considering the cases
$a_2 m k_1 = a_1$
and
$a_2 m k_1 - a_1 \neq 0$
separately, and using
$R \ll x^{1-\eta/2} \asymp KM x^{-\eta/2}$
(by (4.2)), this is further bounded by

Now since the number of distinct prime factors of a positive integer b is
$O(\log b / \log \log b)$
, for
$b \ll x$
we have the majorization

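The quoted bound $\omega(b) = O(\log b / \log \log b)$ holds because the smallest integer with $k$ distinct prime factors is the primorial $2 \cdot 3 \cdots p_k$, which grows superexponentially in $k$. A small numerical illustration (hypothetical helper names, not part of the proof):

```python
import math

def first_primes(k):
    """First k primes, by trial division against the primes found so far."""
    primes = []
    n = 2
    while len(primes) < k:
        if all(n % p for p in primes):
            primes.append(n)
        n += 1
    return primes

# The smallest b with omega(b) = k is the primorial p_1 * ... * p_k,
# so omega(b) stays within a small constant of log b / log log b.
for k in [5, 10, 15]:
    b = math.prod(first_primes(k))
    ratio = k / (math.log(b) / math.log(math.log(b)))
    assert ratio < 2
```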
Using this, we find that the previous sum contributes an acceptable
$O_\eta(x^{-\eta/4} K^2/R)$
.
The contribution of the pairs with
$d_1 > x^\eta$
to
$\widehat{\Phi}(0)X_1$
is simpler and similarly bounded, and the contribution of the pairs with
$d_2 := (k_2, (a_2k_1)^\infty) = (k_2, (a_2v)^\infty) > x^\eta$
is bounded symmetrically. All that is left is to eliminate the contribution of the pairs with large values of
$(k_1 - k_2, (a_2k_1k_2)^\infty)$
; since
$(k_1 - k_2, k_1k_2) = (k_1 - k_2, v^2)$
, we have

Using that
$Rx^{\eta/2} \ll K$
, the pairs with
$d_\Delta > x^\eta$
(as well as
$v \leqslant x^{\eta/2}$
and
$|k_1 - k_2| > K/x^\eta$
) contribute to
$\mathcal{S}_1$
at most

Bounding the inner sum by
$\tau(k)^2 \ll_\eta x^{\eta/9}$
, this further becomes

Using the majorization from (8.2), this is again
$O_\eta(x^{-\eta/4} K^2/R)$
.he contribution of the pairs with
$d_\Delta > x^\eta$
to
$\widehat{\Phi}(0) X_1$
is simpler and similarly bounded by
$O_\eta(x^{-\eta/4} K^2/R)$
. Having eliminated the absolute contribution of all pairs in
$\mathcal{K}(\eta)$
at least once, while incurring only admissible errors, we can conclude our proof of Proposition 8.1.
We can now apply Poisson summation to prove the following.
Proposition 8.2 (Reduction to exponential sum). For
$\varepsilon \geqslant \eta > 0$
and
$H := x^\eta R M^{-1}$
, under the parameter conditions in (4.2) and (5.2), one has

Proof. Rewrite the sum in Proposition 8.1 as

and apply Lemma 3.5 to expand
$\mathcal{E}_r(a_1 \overline{a_2 k_1})$
. The resulting main term is precisely the sum in Proposition 8.2, while the error terms are acceptable.
Remark 8.3. The trivial bound for the right-hand side of Proposition 8.2 is H times worse than for the right-hand side of Proposition 8.1, due to the additional sum over h. This is relevant because H is a nontrivial power of x for the choice of parameters in (4.4) (where
$H \approx R M^{-1} \approx R^2/x$
), since we are working with moduli r well beyond the
$\sqrt{x}$
barrier. This is why we needed to eliminate the bad pairs
$(k_1, k_2)$
(via Proposition 8.1) before applying Poisson summation.
We now go through a series of fairly technical manipulations, following [Reference DrappeauDra15] and [Reference MaynardMay25a], to reduce the sum in Proposition 8.2 to a variation of the exponential sum considered by Bombieri, Friedlander, and Iwaniec in [Reference Bombieri, Friedlander and IwaniecBFI87, §10]; the goal is to prove the remaining dispersion estimate for
$\mathcal{S}_1$
in Proposition 6.2. We do this in two steps (after the statements of Propositions 8.4 and 8.6); first, we assume the following exponential sum bound, which can be compared with Drappeau’s [Reference DrappeauDra15, Proposition 1].
Proposition 8.4 (Improved Drappeau-style exponential sum bound). For all sufficiently small
$\varepsilon > 0$
and all
$\eta \in (0, 1)$
, under the parameter conditions in (4.2), there exists E satisfying (5.2) (with
$K := NL$
) such that the following holds. For any nonzero integer
$a \ll x^{O(\eta)}$
and positive integers
$b, d_a, d_1, d_2, d_\Delta, v, \delta_1, \delta_2 \ll x^{O(\eta)}$
with
$d_2 = \delta_1 \delta_2$
and

one has

where
$H := x^\eta R M^{-1}$
, and
$|u'_k| \leqslant \tau(d_1 k)$
,
$|\beta'_n| \leqslant 1$
,
$|\gamma'_\ell| \leqslant 1$
are sequences supported in
$k \asymp K/d_1$
,
$n \sim N/\delta_1$
, and
$\ell \sim L/\delta_2$
.
Proof of Proposition 6.2(i), assuming Proposition 8.4. Let
$\varepsilon \in (0, 1)$
be sufficiently small, and let us pick E as in Proposition 8.4. By Proposition 8.2, it remains to establish the bound

for some choice of
$0 < 8\delta \leqslant \eta \leqslant \varepsilon$
in terms of
$\varepsilon$
(since
$\delta \leqslant \eta/8$
, the error term of
$x^{-\eta/4} K^2 / R$
from Proposition 8.2 is acceptable). For now, let us fix
$\delta$
and
$\eta$
such that
$8\delta \leqslant \eta$
; we will give explicit choices at the end of this proof.
By the definition of
$\mathcal{K}(\eta)$
from (8.1), we may consider the
$x^{\eta}$
-bounded variables

Noting that
$d_1$
,
$d_2$
, and
$d_\Delta$
all divide
$(a_2 v)^\infty$
, we may then expand

where

Changing variables
$k_i \mapsto d_i k_i$
and adjusting coprimality conditions accordingly (e.g., we now have
$(k_1, a_2d_2k_2) = 1$
as well as
$(k_1, d_1) = 1$
, and
$(d_1k_1d_2k_2, re) = 1$
), we get

Let us denote

for convenience. At this point we record that, since
$d_1d_2d_\Delta \mid (a_2v)^\infty$
,
$v \mid d_1$
, and
$a_1, a_2 \ll x^{\delta} \leqslant x^\eta$
by (4.2), we have

as needed in Proposition 8.4 (in particular, b will act as a bookkeeper for the prime factors of
$d_1, d_2, d_\Delta$
,
$a_2$
inside coprimality constraints). Recalling that we chose
$\mathcal{E} = \{e \sim E : e \text{ prime}\}$
in (5.3), we can ensure that
$(e, a_2d_1) = (e, b) = 1$
for
$e \in \mathcal{E}$
by enforcing
$\delta < 4\varepsilon$
(since then
$|a_2| \leqslant x^\delta \leqslant x^{4\varepsilon} \leqslant E$
). Writing
$d_1k_1 - d_2k_2 = red_\Delta t$
, where
$(t, a_2d_1d_2k_1k_2) = 1$
(which is further absorbed by the conditions
$(t, b) = (k_1, k_2) = (k_1 k_2, b) = 1$
), we further get

We can also get rid of the restriction
$(r, a) = 1$
using Möbius inversion, by writing
$\mathbb{1}_{(r, a) = 1} = \sum_{d_a \mid a} \mu(d_a) \mathbb{1}_{d_a \mid r}$
and expanding

where

Finally, using the definition of
$(u_k)$
from (5.1) and the fact that
$(d_2, k_2) \mid (b, k_2) = 1$
, we can expand

and thus

where

We can now apply Proposition 8.4 for the sequences

supported in
$k \asymp K/d_1$
,
$n \sim N/\delta_1$
, and
$\ell \sim L/\delta_2$
, respectively, to get

Putting together (8.4) to (8.7), we obtain

where the maximum includes all applicable restrictions on the tuple
$(d_1,d_2,d_\Delta,v,d_a,\delta_1)$
(which takes at most
$O_\eta(x^{O(\eta)})$
values).
Now let
$C > 0$
denote the absolute constant in the final exponent of
$O(\eta)$
from (8.8), and let us pick

Then we have
$0 < 8\delta \leqslant \eta \leqslant \varepsilon$
as desired, and the bound in (8.8) implies

completing our proof.
Finally, we prove Proposition 8.4 assuming the following BFI-style bound, the proof of which is left to the later sections. This should be compared with Maynard’s [Reference MaynardMay25a, Lemma 18.5].
Proposition 8.6 (Improved BFI/Maynard exponential sum bound). For all sufficiently small
$\varepsilon > 0$
and all
$\eta \in (0, 1)$
, the following holds. Under the conditions in (4.2), there exists E satisfying (5.2), such that for any positive integers b, d with
$b \ll x^{O(\eta)}$
and
$d \ll x^{O(1)}$
, and for any parameters
$K' \ll NL x^{O(\eta)}$
,
$N' \asymp N x^{O(\eta)}$
,
$L' \asymp L x^{O(\eta)}$
,
$T' \ll NL(RE)^{-1} x^{O(\eta)}$
, and
$H' \ll R M^{-1} x^{O(\eta)}$
, one has

for any 1-bounded complex coefficients
$\beta(e, h, \ell)$
(independent of k, n, t).
Proof of Proposition 8.4, assuming Proposition 8.6. Let us denote the sum in Proposition 8.4 by
$\mathcal{D}_4$
, and assume without loss of generality that
$(d_a, b) = 1$
(since otherwise
$\mathcal{D}_4$
vanishes). We choose E as in Proposition 8.6, and take a closer look at the exponential: by iterating Lemma 3.1, since b, k, r are pairwise coprime we have

and thus, using that
$re d_\Delta t \equiv -d_2 n \ell \ (\textrm{mod } k)$
and
$d_1d_2d_\Delta \mid b^\infty$
(so in particular
$(d_2, k) = 1$
),

Since
$|a| \ll x^{O(\eta)}$
,
$h \ll x^\eta R M^{-1}$
,
$kr \gg KR/x^{O(\eta)}$
and
$KM \asymp x$
, we obtain

and thus

Recalling that
$K \gg RE$
(due to (5.2)),
$H = x^\eta RM^{-1}$
,
$MK \asymp x$
, and
$KR \ll x^{2-\varepsilon}$
(again by (4.2)), the error term gives an acceptable contribution of

We now change variables in the main term by replacing the r-summation with a summation over

noting that the condition
$red_\Delta |t| > K/x^\eta$
implies
$|t| > K/(RE x^{O(\eta)})$
. We also put
$|t|$
into dyadic intervals
$|t| \sim T$
to obtain

where, after adjusting coprimality conditions as explained below,

(We inserted the condition
$(kn\ell, t) = 1$
; this must happen since
$t \mid d_1 k - d_2 n \ell$
and
$(t, d_1 d_2) \mid (t, b^\infty) = 1$
; if a prime divides both t and one of k and
$n\ell$
, then it must also divide the other, contradicting
$(k, n\ell) = 1$
. Moreover, the conditions in the sum over
$k, n, \ell$
are enough to imply
$(kn\ell, r) = 1$
, since
$(d_1 k - d_2 n\ell, k) = (d_2 n\ell, k) = (d_2, k) \mid (b^\infty, k) = 1$
and similarly
$(d_1 k-d_2 n\ell, n\ell) = 1$
.)
We now aim to simplify the term
$ah \overline{rk}/b$
from the exponential, by fixing all relevant residues modulo b. With this goal, we denote the residues of e, t modulo b by
$\widehat{e}, \widehat{t}$
, and those of
$k, n, \ell$
modulo
$d_\Delta b$
by
$\widehat{k}, \widehat{n}, \widehat{\ell}$
. Since
$(d_1 k - d_2n\ell)/d_\Delta = ret$
is coprime with b, we must have
$d_1 \widehat{k} - d_2 \widehat{n} \widehat{\ell} \in d_\Delta (\mathbf{Z}/b\mathbf{Z})^\times$
$= \{d_\Delta (n + b \mathbf{Z}) : (n, b) = 1\} \subset \mathbf{Z}/d_\Delta b \mathbf{Z}$
. This allows us to expand
$\mathcal{D}_5$
as

with

where
$\widehat{r} = \widehat{r}(\widehat{e}, \widehat{t}, \widehat{k}, \widehat{n}, \widehat{\ell}) \in (\mathbf{Z}/b\mathbf{Z})^\times$
is the unique residue mod b such that
$d_\Delta \widehat{r} \widehat{e} \widehat{t} = d_1 \widehat{k} - d_2 \widehat{n} \widehat{\ell} \in d_\Delta (\mathbf{Z}/b\mathbf{Z})^\times$
(this
$\widehat{r}$
is the residue of r mod b, and it is fixed inside each
$\mathcal{D}_6$
). Denoting
$y(h) := \textrm{e}(- ah\overline{\widehat{r} \widehat{k}}/b )$
and suppressing the congruences to
$\widehat{e}, \widehat{t}, \widehat{k}, \widehat{n}, \widehat{\ell}$
through the notation
$\sum^*$
, we obtain

We now remove some of the dependencies between the variables t, k, n and
$e, \ell, h$
, as in the proof of [Reference MaynardMay25a, Lemma 18.4]. Consider the function

where
$\alpha := e d_\Delta t R / (d_1 k - d_2 n \ell)$
; note that
$\Psi$
is smooth in
$e, \ell, h$
, and nonzero only if
$\alpha \asymp 1$
. Since
$Mh/R \ll x^\eta$
and
$\Phi$
,
$\widehat{\Phi}$
have bounded derivatives, the chain rule and the bounds
$d_2 n \ell \asymp K$
,
$|d_2 n \ell - d_1 k| > K/x^\eta$
imply

We thus have

and then by partial summation, (8.12) implies that

where, after removing the residue constraints in the outer sums over t, k, n for an upper bound and putting
$|h|$
in a dyadic interval,

According to the desired bound in Proposition 8.4, and in light of (8.10), (8.11) and (8.13), it remains to show that

for all
$\eta \in (0, 1)$
. Now let
$\mathcal{I}(t, k, n)$
be the subinterval of
$[L/\delta_2, 2L/\delta_2]$
(which is the support of
$\gamma'_\ell$
) containing those
$\ell$
values such that

As in the proof of [Reference MaynardMay25a, Lemma 18.4], we can remove the constraint
$\ell \in \mathcal{I}(t, k, n)$
using the identity

for some coefficients
$c(t, k, n, \omega) \ll 1$
, and the
$L^1$
bound
$\int_0^{1/2} \min(L, 2\omega^{-1}) d\omega \ll \log x$
. Together with the divisor bound
$|u'_k| \ll_\eta x^\eta$
, this shows that

where

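The $L^1$ bound quoted above can be verified by a direct computation (assuming, say, $L \geqslant 4$):

```latex
\int_0^{1/2} \min\!\Big(L, \frac{2}{\omega}\Big)\, d\omega
  \;=\; \int_0^{2/L} L \, d\omega \;+\; \int_{2/L}^{1/2} \frac{2}{\omega}\, d\omega
  \;=\; 2 \;+\; 2\log\frac{L}{4}
  \;\ll\; \log L \;\ll\; \log x.
```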
We denote
$d_i' := d_i/v$
for
$i \in \{1, 2, \Delta\}$
, so that the exponential term and the congruence in the summation over
$\ell$
may be rewritten as

where
$(d_1', d_2') = (d_1', d_\Delta') = (d_2', d_\Delta') = 1$
. At this point, it also makes sense to denote


to bound (by dropping some divisibility constraints on n’, t’)

(To verify the new coprimality constraints, recall that
$(d_a, b) = 1$
and
$d_1' d_2' d_\Delta' \mid b^\infty$
.) We may replace the restriction that
$(e, n') = 1$
with
$(e, d_1') = 1$
, since each follows from the other and the congruence
$n'\ell \equiv d_1' k \ (\textrm{mod } e)$
, where
$(\ell k, e) = 1$
. Moreover, by inserting 1-bounded coefficients
$\beta(e, h', \ell)$
, we can get rid of the coefficients
$\gamma'_\ell\, \textrm{e}(\ell\omega) y(h'd_a/a)$
, the residue constraints (modulo b) in the summations over e and
$\ell$
, as well as of the constraints that e is a prime and
$e \leqslant E'$
, and that
$a/d_a \mid h'$
. Overall, this yields

Finally, we insert a factor of
$\sqrt{|t'|/T'}$
into the sum, and apply Cauchy and Schwarz in the outer variables k, n’, t’ to bound

Conjugating if necessary when
$h' < 0$
or
$t' < 0$
, Proposition 8.6 implies that

for all
$\eta \in (0, 1)$
. Putting things together, we conclude that

as we required in (8.14).
9. Bombieri–Friedlander–Iwaniec-style estimates
In this section, we establish Proposition 8.6, thus completing the proof of Proposition 6.2 and Theorem 4.2. We build on Maynard’s work in [Reference MaynardMay25a, Chapter 18] (in a slightly more general setting, and using Theorem 3.10 instead of [Reference Deshouillers and IwaniecDI82, Theorem 9]), which is in turn based on Bombieri–Friedlander–Iwaniec’s work in [Reference Bombieri, Friedlander and IwaniecBFI87, §10]. To aid future research, we shall consider a general sum

where b, d are given positive integers with
$b \ll x^{O(\eta)}$
and
$d \ll x$
,
$\beta(e, h, \ell)$
are arbitrary 1-bounded coefficients, and the parameters K, N, T, E, H, L are almost arbitrary.
Remark 9.1. The trivial bound for
$\mathcal{B}$
is
$KN\left(TEH(({L}/{ET}) + 1)\right)^2 \ll KN(HL)^2 + KN(TEH)^2$
, but we need more than a power saving over this (note that the desired bound in (8.9) is of the order of
$KNL^2$
, since we need to make up for the factors of H introduced during Poisson summation). So the relative sizes of K, N, T, E, H, L (as given by Proposition 8.6 and (4.2)) will ultimately be crucial, although we only take them into account after proving a general bound in Proposition 9.5.
After expanding the square inside
$\mathcal{B}$
, we reach a more complicated version of the sum anticipated in (2.5). The ‘diagonal terms’ with
$h_1 e_1 \ell_2 = h_2 e_2 \ell_1$
bring a contribution of roughly
$O(KNTHL + KEHT^2L)$
, similarly to (2.6); our deamplification set-up will be helpful here. In the off-diagonal terms, we complete Kloosterman sums via Lemma 3.6, and the principal frequency will contribute
$O(NH^2L^2 + NH^2TL)$
. The remaining terms are ultimately separated into
$\mathcal{B}_=$
and
$\mathcal{B}_{\neq}$
(the latter corresponding to (2.7)), depending on whether
$\ell_1 = \ell_2$
or
$\ell_1 \neq \ell_2$
.
Lemma 9.2 (Splitting the BFI-style sum). For
$\eta \in (0, 1)$
,
$1 \ll K, N, T, E, H, L \ll x$
, and any positive integers b, d with
$b \ll x^{O(\eta)}$
and
$d \ll x$
, one has

where


and
$g_0(t)$
runs over smooth functions supported on
$t \asymp 1$
, satisfying
$\|g_0^{(j)}\|_\infty \ll_j 1$
for each
$j \geqslant 0$
(with fixed implicit constants). Here,
$\mu = \mu(\ell_1, \ell_2, t, e_0, e_1', e_2', d)$
is the unique solution
$\ (\textrm{mod } te_0e_1'e_2')$
to the congruences
$\mu \equiv d\overline{\ell_1} \ (\textrm{mod } te_0e_1')$
and
$\mu \equiv d\overline{\ell_2} \ (\textrm{mod } te_0e_2')$
.
Proof. This is essentially the same as the proof of [Reference MaynardMay25a, Lemma 18.5], but in a slightly more general setting (the main difference being the additional parameters b, d). We first replace the indicator functions of
$k \sim K$
and
$n \sim N$
with smooth majorants, using a suitable smooth compactly supported function
$f_0$
(we choose this as in the proof of [Reference MaynardMay25a, Lemma 18.5]). Expanding out the square in
$\mathcal{B}$
and swapping sums, then using that
$(n, et) = 1$
to deduce a congruence between the resulting variables
$\ell_1$
and
$\ell_2$
(indeed, if a prime p divided both n and et, then it would divide dk, but
$(et, dk) = 1$
), we obtain

Let
$\mathcal{B}_1$
denote the contribution of the ‘diagonal’ terms with
$h_1 e_1 \ell_2 = h_2 e_2 \ell_1$
, and
$\mathcal{B}_2$
contain the other terms; thus we have
$\mathcal{B} \leqslant \mathcal{B}_1 + \mathcal{B}_2$
. As in (2.6), we first bound
$\mathcal{B}_1$
trivially (using the divisor bound), by

recovering the first two terms in the desired bound. Next, we consider
$\mathcal{B}_2$
, containing the terms with
$h_1 e_1 \ell_2 \neq h_2 e_2 \ell_1$
. We let
$e_0 := (e_1, e_2)$
,
$e_1' := e_1/e_0$
and
$e_2' := e_2/e_0$
and put
$e_0$
in dyadic ranges
$e_0 \sim E_0$
to write

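The dyadic decomposition invoked here (and repeatedly below) is the standard splitting of a summation range into $O(\log x)$ blocks $e_0 \sim E_0$. A minimal generic sketch, using the (hypothetical) half-open convention $E_0 \leqslant e_0 < 2E_0$ with $E_0$ a power of two:

```python
def dyadic_blocks(X: int):
    """Split the integers 1..X into O(log X) half-open dyadic blocks
    [E0, 2*E0) with E0 = 2^j, i.e. the ranges e0 ~ E0."""
    E0, blocks = 1, []
    while E0 <= X:
        blocks.append(range(E0, min(2 * E0, X + 1)))
        E0 *= 2
    return blocks

# every integer in [1, X] lies in exactly one block
assert sorted(e for b in dyadic_blocks(100) for e in b) == list(range(1, 101))
assert len(dyadic_blocks(100)) == 7   # blocks start at 1, 2, 4, ..., 64
```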
Note that the inner sum over n can be rewritten as

where
$r_0 := te_0 (h_1e_1'\ell_2 - h_2e_2'\ell_1)\overline{b \ell_1 \ell_2}$
(defined mod k), and
$\mu = \mu(\ell_1, \ell_2, t, e_0, e_1', e_2', d)$
is the unique solution (mod
$te_0e_1'e_2'$
) to the congruences
$\mu \equiv d\overline{\ell_1} \ (\textrm{mod } te_0e_1')$
and
$\mu \equiv d\overline{\ell_2} \ (\textrm{mod } te_0e_2')$
; the latter is well-defined by the Chinese remainder theorem, since
$(te_0e_1', te_0e_2') = te_0(e_1', e_2') = te_0$
,
$[te_0e_1', te_0e_2'] = te_0e_1'e_2'$
, and
$d\overline{\ell_1} \equiv d\overline{\ell_2} \ (\textrm{mod } te_0)$
. Crucially, note that
$\mu$
does not depend on k.
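As a sanity check on this Chinese-remainder construction, here is a minimal numeric verification with small hypothetical values of $t, e_0, e_1', e_2', d, \ell_1, \ell_2$, chosen so that $(e_1', e_2') = 1$ and $d\overline{\ell_1} \equiv d\overline{\ell_2} \ (\textrm{mod } te_0)$ (the brute-force merging loop is purely illustrative):

```python
from math import gcd

def crt(r1, m1, r2, m2):
    """Merge x = r1 (mod m1) and x = r2 (mod m2); the moduli may share a
    factor g = gcd(m1, m2), in which case r1 = r2 (mod g) is required,
    and the solution is unique mod lcm(m1, m2)."""
    g = gcd(m1, m2)
    assert (r1 - r2) % g == 0, "congruences are inconsistent"
    lcm = m1 // g * m2
    x = r1 % m1
    while x % m2 != r2 % m2:   # brute-force stepping; fine for tiny moduli
        x += m1
    return x % lcm

# hypothetical small parameters with (e1', e2') = 1 and (l_i, t*e0*ei') = 1
t, e0, e1p, e2p, d = 3, 2, 5, 7, 4
l1, l2 = 11, 17                 # l1 = l2 (mod t*e0), so d/l1 = d/l2 (mod t*e0)
m1, m2 = t * e0 * e1p, t * e0 * e2p   # 30 and 42; gcd = t*e0 = 6, lcm = 210
r1 = d * pow(l1, -1, m1) % m1   # d * inverse(l1) mod t*e0*e1'
r2 = d * pow(l2, -1, m2) % m2   # d * inverse(l2) mod t*e0*e2'
mu = crt(r1, m1, r2, m2)
assert mu % m1 == r1 and mu % m2 == r2
```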
We can thus complete Kloosterman sums using Lemma 3.6, with

giving us

where

We now plug this bound into our estimate for
$\mathcal{B}_2$
, isolate the contribution of
$j = 0$
into
$\mathcal{B}_3$
, and let
$\mathcal{B}_4$
contain the terms with
$j \neq 0$
. This yields

where


We bound
$\mathcal{B}_3$
trivially using the Ramanujan bound (Lemma 3.8)

giving the third and fourth terms in the desired bound. We finally turn to estimating
$\mathcal{B}_4$
, and start by removing the coprimality constraint
$(k, te_0) = 1$
, via Möbius inversion. We write
$\mathbb{1}_{(k, te_0) = 1} = \sum_{s \mid (k, te_0)} \mu(s)$
and
$k = k' s$
, and put j, s into dyadic ranges
$j \sim J, s \sim S$
to obtain

where

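The Möbius-inversion identity used to remove the coprimality constraint, $\mathbb{1}_{(k, te_0) = 1} = \sum_{s \mid (k, te_0)} \mu(s)$, can be checked numerically on a small grid:

```python
from math import gcd

def mobius(n: int) -> int:
    """Moebius function via trial division (fine for small n)."""
    if n == 1:
        return 1
    res, p, m = 1, 2, n
    while p * p <= m:
        if m % p == 0:
            m //= p
            if m % p == 0:
                return 0      # square factor: mu vanishes
            res = -res
        p += 1
    if m > 1:
        res = -res
    return res

# check 1_{(k, q) = 1} = sum of mu(s) over s | (k, q)
for q in range(1, 40):
    for k in range(1, 40):
        g = gcd(k, q)
        s_sum = sum(mobius(s) for s in range(1, g + 1) if g % s == 0)
        assert s_sum == (1 if g == 1 else 0)
```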
We now wish to separate the j, k' variables in
$\mathcal{B}_6$
from the others, in the factors of
$f_0$
,
$\widehat{f_0}$
, and the exponential term; note that s,
$q = te_0e_1'e_2'$
and
$\mu = \mu(\ell_1, \ell_2, t, e_0, e_1', e_2', d)$
do not depend on j and k’. As in the proof of [Reference MaynardMay25a, Lemma 18.5], we make use of the special choice of the smooth function
$f_0(t) := \int_0^\infty \psi_0(y) \psi_0(t/y)\, dy/y$
(which is a multiplicative convolution of a bounded smooth function
$\psi_0$
supported on
$[1/2, 5/2]$
with itself) to find that

where
$U \asymp 1/S$
, and also

where
$V \asymp S/K$
and
$W \asymp NVE_0/(STE^2) \asymp NE_0/(KTE^2)$
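To illustrate the multiplicative self-convolution $f_0(t) = \int_0^\infty \psi_0(y)\, \psi_0(t/y)\, dy/y$, here is a hypothetical numeric model, taking $\psi_0$ to be a standard smooth bump on $[1/2, 5/2]$ (one admissible choice); since both factors are supported on $[1/2, 5/2]$, the convolution $f_0$ is supported on $[1/4, 25/4]$:

```python
import math

def psi0(y: float) -> float:
    """A standard smooth bump supported on [1/2, 5/2] (hypothetical choice)."""
    u = y - 1.5
    if abs(u) >= 1.0:
        return 0.0
    return math.exp(-1.0 / (1.0 - u * u))

def f0(t: float, steps: int = 2000) -> float:
    """Midpoint-rule approximation to the multiplicative self-convolution
    f0(t) = int_0^inf psi0(y) psi0(t/y) dy/y."""
    a, b = 0.5, 2.5                  # supp(psi0)
    h = (b - a) / steps
    total = 0.0
    for i in range(steps):
        y = a + (i + 0.5) * h
        total += psi0(y) * psi0(t / y) / y * h
    return total

# f0 is supported on [1/4, 25/4] = supp(psi0) * supp(psi0)
assert f0(1.0) > 0.0
assert f0(0.1) == 0.0 and f0(7.0) == 0.0
```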
. Plugging this into our expression for
$\mathcal{B}_6$
, taking the integrals over u, v, w outside the absolute value by the triangle inequality, and swapping them with the sum over
$h_1, h_2$
, we get

where

and the smooth function

is supported on
$t \asymp 1$
. Combining this with (9.5) to (9.7), moving the integrals in u, v, w to the front and taking an
$L^\infty$
bound, we find that

where
$TW \asymp {NE_0}/{KE^2}$
. Letting
$\mathcal{B}_=$
be the contribution of the terms
$\ell_1 = \ell_2$
and
$\mathcal{B}_{\neq}$
contain the terms with
$\ell_1 \neq \ell_2$
, and combining this with (9.1) to (9.4), we recover the desired bound for
$\mathcal{B}$
(note that when
$\ell_1 = \ell_2 = \ell$
, one can take
$\mu = d\overline{\ell}$
).
Lemma 9.3 (Contribution of
$\ell_1 = \ell_2$
). With the notation of Lemma 9.2, assuming that
$EHT \ll x^{O(\eta)} KNL$
, one has

Proof of Lemma 9.3 assuming Theorem 3.10. Here, we adapt the proof of [Reference MaynardMay25a, Lemma 18.7], using Theorem 3.10 instead of [Reference Deshouillers and IwaniecDI82, Theorem 9]. To do this, we need to eliminate the dependency of the inner exponential coefficients on
$\ell$
, so we write

where

We now denote

split into dyadic ranges
$m \sim M_=$
,
$r \sim R_=$
, and change variables from
$\ell$
to r to obtain

Crucially, once the variables
$t, e_0, e_1', e_2', \widehat{\ell}$
are fixed,
$\omega$
does not depend on
$r, s, m, h_1, h_2, k'$
, or j. Finally, we remove the absolute values by inserting 1-bounded coefficients
$\xi_{h_1,h_2}$
(also depending on
$t, e_0, e_1', e_2', \widehat{\ell}, r, s, m$
), and denote

to get

with

At this point we apply our Deshouillers–Iwaniec-style bound from Theorem 3.10, finding that

where, by Cauchy and Schwarz,

Plugging these bounds into (9.8), we obtain

Using
$M_= \ll HE/E_0$
and
$R_= \asymp x^{O(\eta)} LE^2/E_0^2$
, and recalling that
$J \ll (KTE^2 x^\eta)/ (NE_0)$
(from Lemma 9.2), we conclude that

where we lower-bounded
$S \gg 1$
in the factor raised to
$\theta_{\max}$
. Since
$1 \ll S \ll TE_0$
(from Lemma 9.2), our bound becomes

This expression is nonincreasing in
$E_0$
, even after extracting a factor of
$E_0^{-1}$
(since
$\theta_{\max} < 3/4$
); thus lower-bounding
$E_0 \gg 1$
we obtain

We can simplify this further using our assumption that
$EHT \ll x^{O(\eta)} KNL$
, which implies

allowing us to discard the term of
$HKT^2E^3/N$
in the second line. Thus

After slightly rearranging factors, this yields the desired bound.
Lemma 9.4 (Contribution of
$\ell_1 \neq \ell_2$
). With the notation of Lemma 9.2, assuming that
$EHT \ll x^{O(\eta)} KNL$
, one has

Proof of Lemma 9.4 assuming Theorem 3.10. Here, we follow the proof of [Reference MaynardMay25a, Lemma 18.6], using Theorem 3.10 instead of [Reference Deshouillers and IwaniecDI82, Theorem 9]. As in Lemma 9.3, we need to eliminate the dependency of the inner exponential coefficients on
$\ell_1$
and
$\ell_2$
, so we write

where

This is essentially the exponential sum anticipated in (2.7).
We then let
$\ell_0 := (\ell_1, \ell_2)$
,
$\ell_1' := \ell_1/\ell_0$
,
$\ell_2' := \ell_2/\ell_0$
, and put the variables

and
$\ell_0$
into dyadic ranges
$m \sim M_{\neq}$
,
$r \sim R_{\neq}$
,
$\ell_0 \sim L_0$
to obtain

where

As before, once the variables
$t, e_0, e_1', e_2', \widehat{\mu}$
are fixed,
$\omega$
does not depend on
$r, s, m, h_1, h_2, k'$
. We remove the absolute values by inserting 1-bounded coefficients
$\xi_{h_1,h_2}$
(also depending on
$t, e_0, e_1', e_2', \widehat{\mu}$
and r, s, m), and denote

to obtain

where

This is roughly the sum of Kloosterman sums anticipated in (2.11). By Theorem 3.10, we have

where, by Cauchy and Schwarz,

Plugging these bounds into (9.10), we find that

Recalling that
$J \ll (K T E^2 x^\eta)/(NE_0)$
(from Lemma 9.2),
$M_{\neq} \ll HEL/(E_0 L_0)$
and
$R_{\neq} \asymp x^{O(\eta)} L^2 E^2 / (L_0 E_0^2)$
(from (9.9)), this yields

Note that this expression is nonincreasing in the GCD parameter
$L_0$
, since
$\theta_{\max} \leqslant 1/2$
; thus lower-bounding
$L_0 \gg 1$
, and then using that
$1 \ll S \ll TE_0$
(from Lemma 9.2), we get

Finally, this expression is nonincreasing in the
$E_0$
parameter even after extracting a factor of
$E_0^{-1}$
, so lower-bounding
$E_0 \gg 1$
yields

Due to our assumption that
$EHT \ll x^{O(\eta)} KNL$
, we have

so we may ignore the term of
$HE^3LKT^2/N$
on the second line to obtain

Rearranging factors (and combining this with (9.9)), we obtain the desired bound.
Combining our results so far, we obtain the following general estimate.
Proposition 9.5 (The BFI-style bound with general parameters). For
$\eta \in (0, 1)$
,
$K, N, T, E, H, L \ll x$
, and any positive integers
$b \ll x^{O(\eta)}$
and
$d \ll x$
, assuming that
$EHT \ll x^{O(\eta)} KNL$
, one has

Proof of Proposition 9.5 assuming Theorem 3.10. This follows by putting together Lemmas 9.2 to 9.4 and squaring (the second line comes from Lemma 9.4, and the third line from Lemma 9.3).
Finally, we use Proposition 9.5 and the conditions from (4.2) to prove Proposition 8.6.
Proof of Proposition 8.6 assuming Theorem 3.10. Let
$\theta := \theta_{\max}$
; we will soon pick a value for E such that (5.2) holds. We can assume without loss of generality that
$K', N', T', E, H', L' \gg 1$
, since otherwise the sum in Proposition 8.6 is void. We now apply the bound in Proposition 9.5 (which is increasing in all six parameters) for the parameters K', N', T', E, H', L' from Proposition 8.6, noting that

Plugging in the bounds
$K' \ll NL x^{O(\eta)}$
,
$N' \asymp N x^{O(\eta)}$
,
$L' \asymp L x^{O(\eta)}$
,
$T' \ll NL(RE)^{-1} x^{O(\eta)}$
,
$H' \ll RNL x^{O(\eta)-1}$
, we obtain

Simplifying terms and dividing both sides by
$N^4 L^6$
, we further get

and we wish to show that the right-hand side is
$\ll x^{O(\eta) - \varepsilon}$
. To handle the term of
$N^4 L^2 x^{-2} E^{-2}$
, we require that
$N^2 L \ll x^{1-\varepsilon} E$
; thus we pick

For (5.2) to hold, we also need to have
$E \ll x^{-\varepsilon} NL R^{-1}$
, so we impose the restrictions

(which are part of (4.2)). The fact that
$NR \ll x$
simplifies our expression a bit; combined with the fact that
$E \ll x^{-\varepsilon} N L R^{-1} \ll NL R x^{-1}$
(due to
$x^{1-\varepsilon} \ll R^2$
from (4.2)), this shows that

Moreover, since
$x^{(1-\varepsilon)/2} \ll R$
by (4.2), we have
$NR \ll x^{1-2\varepsilon} \ll R^2$
, so
$N \ll R$
, which implies

Overall, it remains to bound the expression

by
$O(x^{-\varepsilon})$
.
Using
$x^{(1-\varepsilon)/2} \ll R \ll NL \ll x^{2/3 - 6\varepsilon}$
(from (4.2)) and
$E \geqslant x^{4\varepsilon}$
, the first term is admissible since

The (square root of the) second term is similarly bounded:

For the third term in (9.12), we use our choice of E from (9.11) to obtain

Since
$NL \ll x^{2/3}$
by (4.2), we can ignore the 1-term in the last parenthesis. For the terms above to be admissible, we require the restrictions

(which are part of (4.2)). Finally, using that
$1 \ll E \ll x^{-\varepsilon} NLR^{-1}$
(from (5.2)), we crudely bound the fourth and last term in (9.12) by

which is at most
$O(x^{-\varepsilon})$
by the last condition in the first line of (4.2). This completes our proof.
10. Deshouillers–Iwaniec-style estimates
The seminal work [Reference Deshouillers and IwaniecDI82] of Deshouillers and Iwaniec on sums of Kloosterman sums makes repeated use of the Kuznetsov trace formula [Reference KuznetsovKuz80, Reference MotohashiMot97], which is in turn based on the spectral decomposition of
$L^2(\Gamma_0(q) \backslash \mathbf{H})$
with respect to the hyperbolic Laplacian (where q is a positive integer and
$\Gamma_0(q)$
is its associated Hecke congruence subgroup). Here, we prove Theorem 3.10, which is an optimization of [Reference Deshouillers and IwaniecDI82, Theorem 11] in the
$\theta$
-aspect, using the same technology. We note that such optimizations of Deshouillers–Iwaniec bounds (specifically of [Reference Deshouillers and IwaniecDI82, Theorem 12]) have also been used in [Reference Drappeau, Pratt and RadziwiłłDPR23].
We will use all of the notation (and normalization) from [Reference Deshouillers and IwaniecDI82], with the exception of making some dependencies on the level q explicit. In particular, we consider an orthonormal basis of Maass cusp forms
$(u_{j,q})_{j \geqslant 1}$
such that
$u_{j,q}$
has eigenvalue
$\lambda_{j,q}$
(which increases to
$\infty$
as
$j \to \infty$
), and Fourier coefficients
$\rho_{j,\mathfrak{a}}(n)$
when expanding around the cusp
$\mathfrak{a}$
of
$\Gamma_0(q)$
, via an implicit scaling matrix
$\sigma_\mathfrak{a} \in \textrm{PSL}_2(\mathbf{R})$
. We denote

whenever
$\mathfrak{a}$
is equivalent to
$u/w$
, for some relatively prime
$u, w \in \mathbf{Z}_+$
such that
$w \mid q$
; in particular, one has
$\mu(\infty) = q^{-1}$
. We also write

where
$\kappa_{j,q}$
is chosen such that either
$\kappa_{j,q} \geqslant 0$
(when
$\lambda_{j,q} \geqslant 1/4$
), or
$i\kappa_{j,q} > 0$
(when
$\lambda_{j,q}$
is exceptional). Recall from Notation 1.2 that
$\theta_q := \max_{\lambda_{j,q} < 1/4} \theta_{j,q}$
(with
$\theta_q := 0$
if there are no exceptional eigenvalues), and that
$\theta_{\max} := \sup_q \theta_q$
. Also, recall that all exceptional eigenvalues lie in the interval
$[3/16, 1/4)$
by [Reference Deshouillers and IwaniecDI82, Theorem 4] (in fact, the best currently known lower bound is
$975/4096$
, due to Kim and Sarnak [Reference KimKim03, Appendix 2]; this is equivalent to Theorem A).
The contribution of the exceptional Maass forms to the spectral side of the Kuznetsov trace formula would vanish if Selberg’s eigenvalue conjecture (Conjecture 1.3) were true, but would be dominating in most applications otherwise. To deduce better bounds for the geometric side (which consists of weighted sums of Kloosterman sums), Deshouillers and Iwaniec [Reference Deshouillers and IwaniecDI82] proved a series of large sieve inequalities for the Fourier coefficients of Maass cusp forms, which temper this exceptional contribution in bilinear sums. Remarkably, these results make further use of the Kuznetsov formula, applying it back and forth and ultimately reducing to the Weil bound.
Lemma 10.1 (Large sieve inequalities from [Reference Deshouillers and IwaniecDI82]). Given
$\varepsilon > 0$
,
$q \in \mathbf{Z}_+$
,
$N \gg 1$
, a complex sequence
$(a_n)_{n \sim N}$
, a cusp
$\mathfrak{a}$
of
$\Gamma_0(q)$
, and an associated scaling matrix
$\sigma_{\mathfrak{a}}$
, one has

for any
$0 < X \ll \max(1, \mu(\mathfrak{a})^{-1}N^{-1})$
. Moreover, if
$(\mathfrak{a}, \sigma_{\mathfrak{a}}) = (\infty, \textrm{Id})$
, then given
$Q \gg 1$
and
$\alpha \in \mathbf{R}/\mathbf{Z}$
, one has

in the larger range
$0 < X \ll \max(N, Q^2 N^{-1})$
.
Proof. The bounds in (10.1) and (10.2) follow immediately from [Reference Deshouillers and IwaniecDI82, Theorems 5 and 7], respectively. We note that changing the choice of the scaling matrix
$\sigma_{\mathfrak{a}}$
results in multiplying the Fourier coefficients
$\rho_{j,\mathfrak{a}}(n)$
by an exponential phase
$\textrm{e}(n\omega)$
; thus in (10.2), using an arbitrary value of
$\omega$
is equivalent to using an arbitrary (but consistent) choice of the scaling matrix
$\sigma_\infty$
.
We also remark that the proof of [Reference Deshouillers and IwaniecDI82, Theorem 7] from [Reference Deshouillers and IwaniecDI82, §8.3] only considers the case
$\omega = 0$
(and
$\sigma_\infty = \textrm{Id}$
), but the same proof extends to any
$\omega \in \mathbf{R}/\mathbf{Z}$
(or equivalently, to any valid scaling matrix
$\sigma_\infty$
); this was already noted, for instance, in [Reference Bombieri, Friedlander and IwaniecBFI87, Lemma 5]. Ultimately, this is because the proof of [Reference Deshouillers and IwaniecDI82, Theorem 14] also extends to sums with additional weights of
$\textrm{e}(m\omega_1)\, \textrm{e}(n\omega_2)$
.
Remark 10.2. The large sieve inequalities in [Reference Deshouillers and IwaniecDI82] are stated for general values of X on the left-hand sides (resulting in right-hand sides that depend on X), and are equivalent to those given in Lemma 10.1. Indeed, to recover large sieve inequalities with an arbitrary
$X > 0$
on the left-hand sides, it suffices to multiply the right-hand sides by
$(1 + (X/X_0)^{\theta_q})$
, where
$X_0$
is the best allowable value in Lemma 10.1.
We find the versions stated above easier to apply optimally in the
$\theta$
-aspect, and also easier to compare, by contrasting the maximal permitted values of X (recalling that
$\mu(\infty) = q^{-1}$
).
We now adapt the proof of [Reference Deshouillers and IwaniecDI82, Theorem 11], making the dependence on
$\theta_{\max}$
explicit.
Theorem 10.3 ([Reference Deshouillers and IwaniecDI82]-type multilinear Kloosterman bound). Let
$C, M, N, R, S \gg 1$
,
$(b_{n,r,s})$
be a complex sequence, and
$\omega \in \mathbf{R}/\mathbf{Z}$
. Then given a five-variable smooth function
$g(t_1, \ldots, t_5)$
with compact support in
$t_1 \asymp 1$
, and bounded derivatives
$\|({\partial^{\sum j_i}}/{\prod (\partial t_i)^{j_i}}) g\|_\infty \ll_{j_1,\ldots,j_5} 1$
, one has

Proof. We follow the proof of [Reference Deshouillers and IwaniecDI82, Theorem 11] in [Reference Deshouillers and IwaniecDI82, §9.1], reducing to the case of smooth functions of the form
$({CS\sqrt{R}}/{cs\sqrt{r}}) f ({4\pi \sqrt{mn}}/{cs\sqrt{r}})$
(up to using slightly different values of
$\omega$
and
$b_{n,r,s}$
); here, f(t) is a smooth function supported in
$t \asymp X^{-1}$
, for
$X := CS\sqrt{R}/\sqrt{MN}$
. After applying the Kuznetsov formula, we bound the contribution of the exceptional spectrum more carefully; as in [Reference Deshouillers and IwaniecDI82, §9.1], this is given by

where

Using the bounds
$\textrm{ch}(\pi \kappa_{j,rs}) \asymp 1$
and

(see [Reference Deshouillers and IwaniecDI82, (7.1)]), and denoting

for some
$X_1, X_2 \geqslant 1$
to be chosen shortly, we obtain

by Cauchy and Schwarz. Recall that
$\mu(1/s) = \mu(\infty) = (rs)^{-1}$
since
$(r, s) = 1$
; thus, using the divisor bound and Lemma 10.1, we conclude that

for
$X_1 = \max(M, R^2S^2M^{-1})$
(coming from (10.2)), and
$X_2 = \max(1, RSN^{-1})$
(from (10.1)), which gives the desired bound up to minor rearrangements. As in [Reference Deshouillers and IwaniecDI82, (9.4)], the non-exceptional spectrum contributes a similar amount of

and putting these together completes our proof.
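For reference, the maximal allowable values of $X_1$ and $X_2$ above match the ranges in Lemma 10.1 under the natural substitutions used here (with $\mu(1/s)^{-1} = rs \asymp RS$):

```latex
X_1 = \max(M, (RS)^2 M^{-1})
\quad\text{from } 0 < X \ll \max(N, Q^2 N^{-1}) \text{ in (10.2), with } N \mapsto M,\ Q \mapsto RS;
\\
X_2 = \max(1, RS \cdot N^{-1})
\quad\text{from } 0 < X \ll \max(1, \mu(\mathfrak{a})^{-1} N^{-1}) \text{ in (10.1)}.
```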
Finally, Theorem 3.10 follows almost immediately from Theorem 10.3.
Proof of Theorem 3.10. We swap the m and n variables, and pick the second term in each maximum from (10.3) for an upper bound, resulting in a
$\theta$
-factor of

We also rewrite the last fraction in (10.3) as

and we use the lower bound
$CS\sqrt{R} + \sqrt{MN} \geqslant CS\sqrt{R}$
in the final term.
To reduce to a smooth function depending only on c, we can take

for some smooth compactly supported functions
$g_i$
, where
$g_2, g_3, g_4, g_5$
are equal to 1 on [1, 2].
Acknowledgements
The author wishes to thank his advisor, James Maynard, for his kind support and guidance, as well as Sary Drappeau, Lasse Grimmelt, Régis de la Bretèche, and the referees, for many helpful comments and suggestions.
Conflicts of interest
None.
Financial support
The author is supported by EPSRC.
Journal information
Compositio Mathematica is owned by the Foundation Compositio Mathematica and published by the London Mathematical Society in partnership with Cambridge University Press. All surplus income from the publication of Compositio Mathematica is returned to mathematics and higher education through the charitable activities of the Foundation, the London Mathematical Society and Cambridge University Press.