
Asymptotic identities for additive convolutions of sums of divisors

Published online by Cambridge University Press:  01 April 2022

ROBERT J. LEMKE OLIVER
Affiliation:
Department of Mathematics, Tufts University, 503 Boston Ave, Medford, MA 02155, U.S.A. e-mail: Robert.Lemke_Oliver@tufts.edu
SUNROSE T. SHRESTHA
Affiliation:
Department of Mathematics and Computer Science, Wesleyan University, 45 Wyllys Ave, Middletown, CT 06459, U.S.A. e-mail: sunrose.shrestha@gmail.com
FRANK THORNE
Affiliation:
Department of Mathematics, University of South Carolina, 1523 Greene St, Columbia, SC 29201, U.S.A. e-mail: thorne@math.sc.edu

Abstract

In a 1916 paper, Ramanujan studied the additive convolution $S_{a, b}(n)$ of sum-of-divisors functions $\sigma_a(n)$ and $\sigma_b(n)$ , and proved an asymptotic formula for it when a and b are positive odd integers. He also conjectured that his asymptotic formula should hold for all positive real a and b. Ramanujan’s conjecture was subsequently proved by Ingham, and then by Halberstam with a power saving error term.

In this paper, we give a new proof of Ramanujan’s conjecture that obtains lower order terms in the asymptotics for most ranges of the parameters. We also describe a connection to a counting problem in geometric topology that was studied in the second author’s thesis and which served as our initial motivation in studying this sum.

Type
Research Article
Copyright
© The Author(s), 2022. Published by Cambridge University Press on behalf of Cambridge Philosophical Society

1. Introduction

For any integer a, let $\sigma_a(n)$ denote the sum of the ath powers of the divisors of n, that is,

\begin{equation*}\sigma_a(n) = \sum_{d\mid n} d^a.\end{equation*}

While the particular value of $\sigma_a(n)$ depends crucially on the divisibility properties of n, there are nevertheless many beautiful identities dating back to a 1916 paper of Ramanujan [ Reference Ramanujan18 ] relating additive convolutions of some of these functions to others. For positive integers a and b, let

\begin{equation*}S_{a,b}(n) \,:\!=\, \sum_{k=1}^{n-1} \sigma_a(k) \sigma_b(n-k).\end{equation*}

Perhaps the most well-known identity is

\begin{equation*}S_{3,3}(n) = \frac{1}{120}\sigma_7(n) - \frac{1}{120}\sigma_3(n)\end{equation*}

but Ramanujan establishes eight other exact identities of this type. For positive odd integers a and b, he also establishes the asymptotic identity

(1·1) \begin{equation} S_{a,b}(n) = \frac{\Gamma(a+1)\Gamma(b+1)}{\Gamma(a+b+2)} \frac{\zeta(a+1)\zeta(b+1)}{\zeta(a+b+2)} \sigma_{a+b+1}(n) - \frac{1}{2}\zeta({-}a)\sigma_b(n)+ O\left(n^{\frac{2}{3}\left(a+b+1\right)}\right)\end{equation}
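The definitions of $\sigma_a(n)$ and $S_{a,b}(n)$ are easy to test by brute force. The following minimal Python sketch is an illustration only; the function names `sigma`, `S` and `check_S33` are our own choices. It computes both quantities directly from their definitions and checks the exact identity $120\, S_{3,3}(n) = \sigma_7(n) - \sigma_3(n)$ for small n.

```python
def sigma(a, n):
    """Return sigma_a(n), the sum of the a-th powers of the positive divisors of n."""
    return sum(d ** a for d in range(1, n + 1) if n % d == 0)


def S(a, b, n):
    """Return S_{a,b}(n) = sum_{k=1}^{n-1} sigma_a(k) * sigma_b(n-k) by brute force."""
    return sum(sigma(a, k) * sigma(b, n - k) for k in range(1, n))


def check_S33(limit):
    """Check 120*S_{3,3}(n) == sigma_7(n) - sigma_3(n) for every 1 <= n <= limit."""
    return all(120 * S(3, 3, n) == sigma(7, n) - sigma(3, n)
               for n in range(1, limit + 1))


print(check_S33(40))
```

The brute-force evaluation costs roughly $n^2$ operations per value of n, which is adequate for spot checks but not for large-scale experiments.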

At the top of the second page of his paper, however, Ramanujan remarks, “It seems very likely that (the main part of the asymptotic in (1·1)) is true for all positive (real) values of a and b, but this I am at present unable to prove.” This less well known conjecture of Ramanujan was established in 1927 by Ingham [ Reference Ingham9 ], and then with a power saving error term in 1957 by Halberstam [ Reference Halberstam6 ]. Halberstam later [ Reference Halberstam7 ] proved that if both parameters are small, in that they satisfy $a+b<1$ , then there is a secondary term given by a different expression in this asymptotic formula. This formula does not, however, recover the secondary term in Ramanujan’s formula (1·1), both owing to its different formulation and to the requirement that $a+b<1$ .

In this paper we give another proof of the asymptotic in (1·1), improving upon the result by establishing lower-order terms in the asymptotic for many ranges of the parameters that recover Ramanujan’s secondary term. We begin with the following theorem on what is typically the largest of these lower order terms.

Theorem 1·1. If a and b are positive real numbers with $b>a \geqslant 1$ , then

\begin{align*}S_{a,b}(n)&=\frac{\Gamma(a+1)\Gamma(b+1)}{\Gamma(a+b+2)}\frac{\zeta(a+1)\zeta(b+1)}{\zeta(a+b+2)}\sigma_{a+b+1}(n)+\frac{\zeta(1-a)\zeta(b+1)}{(b+1)\zeta(b-a+2)}n^a\sigma_{b-a+1}(n)\\& \quad\quad+ O(n^b) + O\left(n^{\frac{a+b}{2}+1+\epsilon}\right).\end{align*}

Notice that when a is an odd integer $\geqslant 3$ , the secondary term in Theorem 1·1, which is $O\left(n^{b+1}\right)$ , actually vanishes, so Theorem 1·1 is consistent with (1·1) (which requires both parameters to be odd integers) but does not quite recover it. In fact, our proof shows that there are typically many lower order terms in the asymptotic formula for $S_{a,b}(n)$ , of orders $O\left(n^{b+1-m}\right)$ for non-negative integers $0 \leqslant m < ({b-a})/{2} + {1}/{4}$ . All of these terms but that of order $O\left(n^b\right)$ vanish if the smaller parameter a is an odd integer, and it is in fact this term that recovers Ramanujan’s secondary term.
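The vanishing of the secondary term for odd $a \geqslant 3$ comes from the trivial zeros of the Riemann zeta function at the negative even integers. As a hedged illustration, the sketch below evaluates $\zeta({-}n)$ exactly through the classical formula $\zeta({-}n) = ({-}1)^n B_{n+1}/(n+1)$ with the convention $B_1 = -1/2$. The helper names `bernoulli` and `zeta_neg` are ours, and the Bernoulli recursion is the standard one rather than anything specific to this paper.

```python
from fractions import Fraction
from math import comb


def bernoulli(m):
    """Return the list [B_0, ..., B_m] of Bernoulli numbers, with B_1 = -1/2."""
    B = [Fraction(1)]
    for k in range(1, m + 1):
        # The relation sum_{j=0}^{k} C(k+1, j) * B_j = 0 determines B_k.
        B.append(-sum(comb(k + 1, j) * B[j] for j in range(k)) / (k + 1))
    return B


def zeta_neg(n):
    """Return zeta(-n) exactly for an integer n >= 0."""
    return (-1) ** n * bernoulli(n + 1)[n + 1] / (n + 1)


# zeta(1 - a) vanishes for odd a >= 3, so the n^a * sigma_{b-a+1}(n) term drops out.
print([zeta_neg(a - 1) for a in (3, 5, 7)])
# The coefficient -(1/2) * zeta(-a) of sigma_b(n) is nonzero for odd a.
print([-Fraction(1, 2) * zeta_neg(a) for a in (1, 3, 5)])
```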

Theorem 1·2. Let a and b be positive real numbers. If $b-a > 3/2$ , then

\begin{align*}S_{a,b}(n)&= \frac{\Gamma(a+1)\Gamma(b+1)}{\Gamma(a+b+2)}\frac{\zeta(a+1)\zeta(b+1)}{\zeta(a+b+2)}\sigma_{a+b+1}(n) \\[4pt]\notag &\quad + \frac{\zeta(1-a)\zeta(b+1)}{(b+1)\zeta(b-a+2)}n^a\sigma_{b-a+1}(n)+ \sum_{0 \leqslant m < \frac{b-a}{2}-\frac{3}{4}} \mathrm{Res}({-}m) + O_{a,b,\epsilon}\left(n^{\frac{a+b}{2}+\frac{3}{4}+\epsilon}\right),\end{align*}

where $\mathrm{Res}({-}m)$ is given explicitly by (4·7). It satisfies $\mathrm{Res}({-}m) \ll n^{b-m}$ in general, and if a is an odd integer, then $\mathrm{Res}(0) = -({1}/{2})\zeta({-}a)\sigma_b(n)$ and $\mathrm{Res}({-}m)=0$ for each $m \geqslant 1$ .

In particular, when $a\geqslant 3$ is an odd integer and $b > a+3/2$ , Theorem 1·2 implies

\begin{align*}S_{a,b}(n) = & \frac{\Gamma(a+1)\Gamma(b+1)}{\Gamma(a+b+2)}\frac{\zeta(a+1)\zeta(b+1)}{\zeta(a+b+2)}\sigma_{a+b+1}(n) -\frac{1}{2}\zeta({-}a)\sigma_b(n)\\[5pt]& +O_{a,b,\epsilon}\left(n^{\frac{a+b}{2}+\frac{3}{4}+\epsilon}\right),\end{align*}

recovering Ramanujan’s formula (1·1) but without requiring b to be an odd integer. Thus, Theorem 1·2 recovers and expands on the asymptotic formula for $S_{a,b}(n)$ available from the theory of modular forms. We note that when b is also an odd integer, it was conjectured by Ramanujan and proved by Deligne that the error term is of the form $O_{a,b,\epsilon}\left(n^{\frac{a+b}{2}+\frac{1}{2}+\epsilon}\right)$ . This improved error term is available only when b is an odd integer, however; we discuss possible improvements to the error term when b is not an odd integer in the final section of this paper.

The core of the paper is Section 4, where we state and prove a theorem subsuming Theorems 1·1 and 1·2. We first present in Section 3 a simple elementary proof of Ramanujan’s conjecture (with power saving error term) along similar lines as Halberstam [ Reference Halberstam6 ].

Also in this paper, in Section 2 we describe a problem in geometric topology which initially motivated our interest in this problem. In brief, the additive convolution $S_{1,2}(n)$ appears while counting primitive ramified degree n covers of the square torus (or in other words, square-tiled surfaces with n squares) with two ramification points. These surfaces can be classified according to their horizontal cylinder configurations. There are exactly four such configurations, and knowing the asymptotic for $S_{1,2}(n)$ , which already is difficult to find in the literature, enables us to compute asymptotic proportions of two of these four horizontal cylinder configurations.

2. Motivation from Geometric Topology

Our initial interest in studying additive convolutions of the kind $S_{a,b}$ arose from a counting problem in geometric topology. In order to describe succinctly where the additive convolution appears we begin with a brief exposition on translation surfaces and their moduli spaces.

2.1. Translation surfaces and their moduli spaces

A translation surface is a closed orientable surface obtained from the union of finitely many Euclidean polygons $\{\Delta_1, \dots, \Delta_n\}$ such that:

  (i) the embedding of the polygons in $\mathbb{R}^2$ is fixed only up to translation;

  (ii) the boundary of every polygon is oriented counterclockwise; and

  (iii) for every $1 \leqslant j \leqslant n$ and for every oriented side $s_j$ of $\Delta_j$ , there exist $1 \leqslant \ell \leqslant n$ and an oriented side $s_\ell$ of $\Delta_\ell$ so that $s_j$ and $s_\ell$ are parallel, of equal length and of opposite orientation. The sides $s_j$ and $s_\ell$ are glued together by a parallel translation.

A few key things follow from the definition.

  (i) The total angle around a vertex is $2 \pi (k+1)$ for some non-negative integer k. When $k > 0$ , we call the point a cone point.

  (ii) We distinguish between two polygonal presentations when one is obtained from the other by a nontrivial rotation. However, two presentations are considered equivalent when they are “cut, parallel transport, and paste” equivalent; see Figure 1. Hence, translation surfaces come with a well defined vertical direction.

    Fig. 1. On the left, the two translation surfaces differ by a nontrivial rotation, so are not considered equivalent. On the right, the two translation surfaces are cut and paste equivalent. We omit the orientation on the edges mentioned in the definition while representing the surfaces using polygons.

Some basic examples of translation surfaces include an axis parallel square with opposite sides identified to give a square torus and a regular octagon with opposite sides identified. One can also take two regular n-gons with n odd and identify opposite corresponding sides to form a translation surface. Consider Figure 2 for an example with $n=5$ . In general, the polygons need not be regular.

Fig. 2. A translation surface formed by two pentagons whose opposite corresponding sides are glued. This surface has genus 2 and lives in the stratum $\mathcal{H}(1,1)$ .

Translation surfaces also admit an alternate definition via complex analysis. Viewing the polygons as embedded in $\mathbb{C}$ , a translation surface has a complex structure with transition functions given by translations. The globally defined 1-form dz on $\mathbb{C}$ then induces a globally defined 1-form $\omega$ with zeroes exactly at the cone points. Hence, from the polygonal definition of a translation surface we obtain a pair $(X, \omega)$ where X is a Riemann surface and $\omega$ is a holomorphic 1-form. On the other hand, given such a pair $(X, \omega)$ one can also recover the polygonal definition using a geodesic triangulation of X satisfying the appropriate properties outlined in the polygonal definition. Therefore, a translation surface can also be thought of as a pair $(X, \omega)$ of a Riemann surface X equipped with a holomorphic 1-form $\omega$ . See [ Reference Masur14 ] for a more precise formulation of the equivalence of these two definitions of translation surfaces.

The genus of a translation surface is given by the classical Gauss–Bonnet theorem which relates the Euler characteristic of a surface with the total curvature. Since translation surfaces are built out of Euclidean polygons, they are flat everywhere except the cone points, and the Gauss–Bonnet theorem takes on a simpler form. Hence, a surface of genus g with m cone points of angles $2 \pi(\alpha_1+1), \dots, 2 \pi (\alpha_m +1)$ satisfies the relation

\begin{equation*}\displaystyle 2g - 2 = \sum_{i=1}^m \alpha_i.\end{equation*}

The angle data around the cone points can be recorded in a vector $\alpha = (\alpha_1, \dots, \alpha_m)$ , where m is the number of cone points and $2\pi(\alpha_i+1)$ are the cone angles defined as above. The collection of translation surfaces sharing the same angle data is called a stratum and is denoted $\mathcal{H}(\alpha)$ .

For any $\alpha$ that is an integer partition of an even number, $\mathcal{H}(\alpha)$ can be given the structure of a complex orbifold. The main idea is that given $(X, \omega) \in \mathcal{H}(\alpha_1, \dots, \alpha_m)$ , we can fix a basis $\rho_1, \dots, \rho_{2g+m-1}$ for the first homology $H_1\left(X, \left\{P_1, \dots, P_m\right\};\,\mathbb{Z}\right)$ relative to the cone points. We can then get a map

(2·1) \begin{equation} \mathcal{H}(\alpha) \rightarrow \mathbb{C}^{2g+m-1} \text{ given by } (X, \omega) \longrightarrow \left(\int_{\rho_1} \omega, \dots, \int_{\rho_{2g+m-1}} \omega\right).\end{equation}

These are called period coordinates for $\mathcal{H}(\alpha)$ . The period coordinates serve as local coordinates via which it can be shown, as in [ Reference Masur13 , Reference Veech22 , Reference Veech23 ], that the strata are complex orbifolds of dimension $2g+m-1$ , where g is the genus of the translation surface with cone point data $(\alpha_1, \dots, \alpha_m)$ . Kontsevich and Zorich [ Reference Kontsevich and Zorich10 ] classified the connected components of $\mathcal{H}(\alpha)$ for all $\alpha$ . In particular, any $\mathcal{H}(\alpha)$ can have at most 3 connected components. Moreover, any stratum admits an $\mathrm{SL}_2(\mathbb{R})$ action: given a translation surface built out of polygons $\{\Delta_i\}$ , its image under $A \in \mathrm{SL}_2(\mathbb{R})$ is simply the translation surface $\{A \cdot \Delta_i\}$ where A acts on the polygons linearly.
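The bookkeeping relating the angle data $\alpha$ to the genus and to the dimension of the stratum is simple enough to encode directly. The sketch below is illustrative only: the function names are ours, and it assumes at least one cone point ($m \geqslant 1$) so that the dimension formula $2g+m-1$ applies as stated.

```python
def stratum_genus(alpha):
    """Genus g determined by 2g - 2 = sum(alpha) for angle data alpha."""
    total = sum(alpha)
    if total % 2 != 0:
        raise ValueError("alpha must be a partition of an even number")
    return total // 2 + 1


def stratum_dimension(alpha):
    """Complex dimension 2g + m - 1 of H(alpha), assuming m = len(alpha) >= 1."""
    if len(alpha) == 0:
        raise ValueError("this sketch assumes at least one cone point")
    return 2 * stratum_genus(alpha) + len(alpha) - 1


for alpha in [(2,), (1, 1), (4,), (1, 1, 1, 1)]:
    print(alpha, stratum_genus(alpha), stratum_dimension(alpha))
```

For instance, both $\mathcal{H}(2)$ and $\mathcal{H}(1,1)$ consist of genus 2 surfaces, of complex dimensions 4 and 5 respectively.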

2.2. Volume in $\mathcal{H}(\alpha)$

The period coordinates can also be used to define a volume form on $\mathcal{H}(\alpha)$ . Consider the linear volume form on $\mathbb{C}^{2g+m-1}$ , normalised so that the fundamental domain of the integer lattice $(\mathbb{Z}+i\mathbb{Z})^{2g+m-1}$ has volume 1. The pullback of this volume form under the period map gives what is popularly called the Masur–Veech volume form on $\mathcal{H}(\alpha)$ . Furthermore, this induces a volume form on $\mathcal{H}_1(\alpha)$ , the set of translation surfaces in $\mathcal{H}(\alpha)$ of area 1 (i.e. collections of surfaces with total Euclidean area of the polygons 1). The measure of $\mathcal{H}_1(\alpha)$ with respect to this induced volume form has been shown to be finite for any $\alpha$ , independently by Masur [ Reference Masur13 ] and Veech [ Reference Veech22 ].

Twenty years later, Eskin and Okounkov [ Reference Eskin and Okounkov4 ] computed the volume of these strata, $\mathcal{H}_1(\alpha)$ . They counted a particular type of translation surface called square-tiled surfaces (STSs), which are exactly those translation surfaces in which the polygons are axis parallel Euclidean unit squares. Alternatively, they are exactly those translation surfaces $(X , \omega)$ such that their image under the period map (2·1) is in $(\mathbb{Z} + i \mathbb{Z})^{2g+m-1}$ . In this manner, STSs have a lattice-like structure in the space of translation surfaces and can be thought of as “integer points” of strata. Topologically, STSs are also thought of as branched covers of the standard square-torus with branching over exactly one point.

The idea of the volume computation is motivated by the following simple case. To compute the surface area of a body in $\mathbb{R}^n$ , one can consider a large dilate of the body by $R > 1$ , and count the integer points inside. Asymptotically, the number of such integer points would be $c\cdot R^n$ since $\mathbb{R}^n$ is n-dimensional. The surface area of the body is then given by

\begin{equation*} \frac{d \left(c\cdot R^n\right)}{dR}\bigg|_{R = 1} = cn.\end{equation*}

To compute the volume of $\mathcal{H}_1(\alpha)$ , one applies the same technique. Applying a homothety to the codimension 1 subset $\mathcal{H}_1(\alpha)$ by n, we get the set of translation surfaces of area n. The integer points within this dilated region in $\mathcal{H}(\alpha)$ are STSs with at most n squares. The asymptotics of this count then yields the volume of $\mathcal{H}_1(\alpha)$ .
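The lattice-point heuristic can be tried out in the simplest possible setting. The following sketch uses the unit disk in $\mathbb{R}^2$, which is our own illustrative choice and not an example from the volume computations cited above. It counts integer points in the dilate of radius R, estimates the constant c from the count, and returns $c \cdot n$ with $n = 2$ as an estimate of the length $2\pi$ of the unit circle.

```python
import math


def lattice_points_in_disk(R):
    """Count integer points (x, y) with x^2 + y^2 <= R^2, for a positive integer R."""
    count = 0
    for x in range(-R, R + 1):
        y_max = math.isqrt(R * R - x * x)  # largest y with x^2 + y^2 <= R^2
        count += 2 * y_max + 1
    return count


def circle_length_estimate(R):
    """Estimate the length of the unit circle as c * n, where count ~ c * R^n and n = 2."""
    n = 2
    c = lattice_points_in_disk(R) / R ** n
    return c * n


print(circle_length_estimate(1000), 2 * math.pi)
```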

2.3. Connections to Number Theory

Using the volume computation heuristic described above, Zorich [ Reference Zorich25 ] computed the volume of the first few strata by hands-on counting and obtained

\begin{equation*}\mathrm{vol}(\mathcal{H}_1(\emptyset)) = 2\cdot\zeta(2); \qquad \mathrm{vol}(\mathcal{H}_1(2)) = \frac{3}{4}\cdot \zeta(4); \qquad \mathrm{vol}(\mathcal{H}_1(1,1)) = \frac{2}{3}\cdot\zeta(4).\end{equation*}

In general, Eskin and Okounkov [ Reference Eskin and Okounkov4 ] showed that the volume of $\mathcal{H}_1(\alpha)$ is given by

\begin{equation*}\mathrm{vol}\left(\mathcal{H}_1(\alpha)\right) = \frac{(|\alpha|+1) \lim_{D \rightarrow \infty} D^{-|\alpha|-1} \sum_{d =1}^D \mathcal{C}_d(\alpha)}{\dim \mathcal{H}(\alpha)},\end{equation*}

where $|\alpha| = \sum \alpha_i$ , and the $\mathcal{C}_d$ are the coefficients of a certain generating function $ \mathcal{C}(\alpha) = \sum_{d =1}^\infty \mathcal{C}_d(\alpha) q^d $ which they proved to be a quasimodular form, i.e. a polynomial in the Eisenstein series $G_k(q)$ for $k = 2, 4, 6$ . Consequently, they showed that

\begin{equation*} \frac{\mathrm{vol}(\mathcal{H}_1(\alpha))}{\pi^{2g}} \in \mathbb{Q}\end{equation*}

for any stratum $\mathcal{H}(\alpha)$ of genus g translation surfaces.

Since Eskin and Okounkov’s volume computations, various counting problems have received much attention in the study of STSs, including the enumeration of primitive square-tiled surfaces, i.e. those STSs whose covering of the square torus does not factor through another STS. In some ways this problem is analogous to counting primitive vectors in $\mathbb{Z}^n$ .

In 2006, Hubert and Lelievre [ Reference Hubert and Lelievre8 ] and McMullen [ Reference McMullen15 ] proved that primitive n-square STSs in $\mathcal{H}(2)$ partition into at most two orbits under the linear action of $\mathrm{SL}_2(\mathbb{Z})$ (induced by the linear action of $\mathrm{SL}_2(\mathbb{R})$ ). Subsequently, Lelievre and Royer [ Reference Lelievre and Royer12 ] obtained orbit-wise counting of primitive n-square STSs for odd n in $\mathcal{H}(2)$ . In the computation, they obtained and used closed forms of sums of the type

\begin{equation*}S_{1,1}^k(n) = \sum_{\substack{(a,b) \in \mathbb{N}^2 \\ ka +b = n}} \sigma_1(a)\sigma_1(b).\end{equation*}

Note that $S_{1,1}^1 = S_{1,1}$ as defined above, the convolution of $\sigma_1$ with itself. For $k=2, 4$ and $n\geqslant 1$ , they obtained

\begin{gather*}S_{1,1}^2(n) = \frac{1}{12}\sigma_3(n) + \frac{1}{3}\sigma_3\left(\frac{n}{2}\right) - \frac{1}{8}n \sigma_1(n) - \frac{1}{4}n\sigma_1\left(\frac{n}{2}\right) + \frac{1}{24}\sigma_1(n) + \frac{1}{24}\sigma_1\left(\frac{n}{2}\right),\\\end{gather*}

\begin{align*}&S_{1,1}^4(n) =\\[5pt]& \quad \frac{1}{48}\sigma_3(n) + \frac{1}{16}\sigma_3\left(\frac{n}{2}\right) + \frac{1}{3}\sigma_3\left(\frac{n}{4}\right) - \frac{1}{16}n \sigma_1(n) - \frac{1}{4}n\sigma_1\left(\frac{n}{4}\right) + \frac{1}{24}\sigma_1(n) +\frac{1}{24}\sigma_1\left(\frac{n}{4}\right).\end{align*}

They were able to express these sums as linear combinations of sums of powers of divisors using the fact that the spaces of quasimodular forms on congruence subgroups such as $M_4[\Gamma_0(4)]$ and $M_2[\Gamma_0(2)]$ are finite dimensional. Notably, however, since the generating functions for $\sigma_a$ for a even are odd weight Eisenstein series, the analysis of the convolution $S_{a,b}$ for even a resists the theory of quasimodular forms, and hence we use alternate methods to understand the asymptotics of such sums.
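The closed forms for $S_{1,1}^2(n)$ and $S_{1,1}^4(n)$ quoted above can be spot-checked against the defining sums. The sketch below is an illustration with hypothetical helper names; it uses exact rational arithmetic and the usual convention that $\sigma_a(x) = 0$ when x is not a positive integer.

```python
from fractions import Fraction as F


def sigma(a, x):
    """sigma_a(x) for a positive integer x, and 0 when x is not a positive integer."""
    x = F(x)
    if x.denominator != 1 or x <= 0:
        return 0
    n = x.numerator
    return sum(d ** a for d in range(1, n + 1) if n % d == 0)


def S11(k, n):
    """Brute force: sum of sigma_1(a) * sigma_1(b) over positive integers with k*a + b = n."""
    return sum(sigma(1, a) * sigma(1, n - k * a)
               for a in range(1, n) if n - k * a >= 1)


def closed_form_k2(n):
    return (F(1, 12) * sigma(3, n) + F(1, 3) * sigma(3, F(n, 2))
            - F(1, 8) * n * sigma(1, n) - F(1, 4) * n * sigma(1, F(n, 2))
            + F(1, 24) * sigma(1, n) + F(1, 24) * sigma(1, F(n, 2)))


def closed_form_k4(n):
    return (F(1, 48) * sigma(3, n) + F(1, 16) * sigma(3, F(n, 2))
            + F(1, 3) * sigma(3, F(n, 4))
            - F(1, 16) * n * sigma(1, n) - F(1, 4) * n * sigma(1, F(n, 4))
            + F(1, 24) * sigma(1, n) + F(1, 24) * sigma(1, F(n, 4)))


print(all(S11(2, n) == closed_form_k2(n) for n in range(1, 41)))
print(all(S11(4, n) == closed_form_k4(n) for n in range(1, 41)))
```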

We now describe the specific problem in the enumeration of STSs that motivated us to study $S_{a,b}$ for even a.

Every STS can be viewed as a union of horizontal square-tiled cylinders glued together. One way to analyse an STS in a given stratum is to categorise its horizontal cylinder decomposition type, popularly termed its cylinder diagram, which describes how many horizontal cylinders make up the surface and in what ways they are glued together.

In particular, STSs in $\mathcal{H}(1,1)$ (translation surfaces of genus two with two cone points) partition into exactly 4 cylinder diagrams. Figure 3 shows prototypical examples of surfaces in the 4 cylinder diagrams named A, B, C and D in $\mathcal{H}(1,1)$ .

Fig. 3. Examples of STSs in the four cylinder diagrams of $\mathcal{H}(1,1)$ , here named A, B, C, D. In each surface, collections of edges with the same label are glued via translation. For instance, in A, the 3 edges labelled p are glued to the 3 edges labelled p via translation to form a horizontal cylinder. Hence, diagram A is characterised by having exactly one (maximal) horizontal cylinder. Similarly, diagram D consists of STSs in $\mathcal{H}(1,1)$ with exactly three horizontal cylinders while diagrams B and C consist of those with two horizontal cylinders but different gluing patterns. Adding squares to vary the parameters p, q, r, j, k, l, m gives surfaces with different numbers of squares in each of these cylinder diagrams.

The counting problem in question is to enumerate, given a fixed n, the number of primitive STSs in $\mathcal{H}(1,1)$ in each of the four cylinder diagrams and find the individual asymptotic densities of each of them. For example, let the number of primitive n-square surfaces in $\mathcal{H}(1,1)$ with diagram D be D(n). The second author proved in [ Reference Shrestha21 ] that

\begin{equation*}D(n) = \frac{1}{6}n(n-1)J_2(n) - \bigl((\mu \cdot \sigma_2) * (S_{1,2})\bigr)(n),\end{equation*}

where $J_k(n) \,:\!=\, n^k \prod_{p | n} \left(1 - \dfrac{1}{p^k}\right)$ is the Jordan totient function of order k, $\mu$ is the Möbius function and $*$ is Dirichlet convolution. Using Theorem 3·1, the second author proved that surfaces with diagram D have asymptotic density $1- \dfrac{\zeta(2)\zeta(3)}{2\zeta(5)} \approx 0.047$ . For similar formulae and asymptotic densities concerning the other diagrams A, B and C, see [ Reference Shrestha21 , theorem 1·1].
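The numerical value of the density constant is easy to reproduce. The sketch below is a rough numerical check, not a rigorous evaluation: the helper `zeta` is our own, and it approximates $\zeta(s)$ for real $s > 1$ by a partial sum with a standard Euler–Maclaurin tail correction.

```python
def zeta(s, terms=10000):
    """Approximate zeta(s) for real s > 1: partial sum plus a tail correction."""
    partial = sum(k ** (-s) for k in range(1, terms + 1))
    # Tail sum_{k > N} k^{-s} is approximately N^{1-s}/(s-1) - N^{-s}/2.
    tail = terms ** (1 - s) / (s - 1) - 0.5 * terms ** (-s)
    return partial + tail


def density_D():
    """Asymptotic density 1 - zeta(2)*zeta(3) / (2*zeta(5)) of diagram D."""
    return 1 - zeta(2) * zeta(3) / (2 * zeta(5))


print(round(density_D(), 3))
```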

An analogous problem for the other genus two stratum $\mathcal{H}(2)$ was solved by Zmiaikou [ Reference Zmiaikou24 ]. Complete results for strata of genus 3 and above are not known, although the density of one-cylinder surfaces (not necessarily primitive) has been computed by Delecroix–Goujard–Zograf–Zorich [ Reference Delecroix, Goujard, Zograf and Zorich2 ].

3. Proof of Theorem 3·1

For the reader’s convenience, we begin with a short proof of Ramanujan’s conjecture, along similar lines to Halberstam [ Reference Halberstam6 ]:

Theorem 3·1. For any positive real numbers a and b, as $n\to\infty$ there holds

(3·1) \begin{equation}S_{a,b}(n) = \frac{\Gamma(a+1)\Gamma(b+1)}{\Gamma(a+b+2)} \frac{\zeta(a+1)\zeta(b+1)}{\zeta(a+b+2)} \sigma_{a+b+1}(n) + O(n^{a+b + \beta} (\log n)^{\kappa}),\end{equation}

where

\begin{equation*}\beta = \begin{cases}0 & \text{ if $a, b > 1$}, \\1 - a & \text{ if $a \leqslant 1$, $b \geqslant 1$}, \\1 - b & \text{ if $a \geqslant 1$, $b \leqslant 1$}, \\1 - \frac{ab}{b + a - ab} + \epsilon & \text{ if $a, b < 1$}, \end{cases} \end{equation*}

and $\kappa$ is $2$ if $a = b = 1$ , $1$ if $a = 1$ and $b \neq 1$ or vice versa, and zero otherwise.

The theorem also holds if a and b are complex numbers with positive real part, in which case replace a and b by their real parts everywhere in the error terms and inequalities.
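To get a feel for Theorem 3·1, one can compare $S_{a,b}(n)$ with the main term numerically. The sketch below is an illustration under our own choices: the function names are hypothetical, $\zeta$ is approximated by a partial sum with a tail correction, and the exponents $\beta$ and $\kappa$ are returned with the $\epsilon$ in the last case omitted.

```python
import math


def sigma_table(a, N):
    """Return the list [0, sigma_a(1), ..., sigma_a(N)] computed by a divisor sieve."""
    table = [0] * (N + 1)
    for d in range(1, N + 1):
        power = d ** a
        for multiple in range(d, N + 1, d):
            table[multiple] += power
    return table


def zeta(s, terms=10000):
    """Approximate zeta(s) for real s > 1: partial sum plus a tail correction."""
    partial = sum(k ** (-s) for k in range(1, terms + 1))
    return partial + terms ** (1 - s) / (s - 1) - 0.5 * terms ** (-s)


def S(a, b, n):
    """S_{a,b}(n) from the definition."""
    sa, sb = sigma_table(a, n), sigma_table(b, n)
    return sum(sa[k] * sb[n - k] for k in range(1, n))


def main_term(a, b, n):
    """Main term of (3.1): Gamma and zeta factors times sigma_{a+b+1}(n)."""
    gamma_factor = math.gamma(a + 1) * math.gamma(b + 1) / math.gamma(a + b + 2)
    zeta_factor = zeta(a + 1) * zeta(b + 1) / zeta(a + b + 2)
    return gamma_factor * zeta_factor * sigma_table(a + b + 1, n)[n]


def error_exponents(a, b):
    """Return (beta, kappa) from Theorem 3.1, ignoring the epsilon when a, b < 1."""
    if a > 1 and b > 1:
        beta = 0
    elif a <= 1 and b >= 1:
        beta = 1 - a
    elif a >= 1 and b <= 1:
        beta = 1 - b
    else:
        beta = 1 - a * b / (a + b - a * b)
    kappa = (a == 1) + (b == 1)
    return beta, kappa


n = 1000
print(S(1, 2, n) / main_term(1, 2, n), error_exponents(1, 2))
```

For $a = 1$ and $b = 2$ the relative error permitted by the theorem is of size $(\log n)/n$, so the ratio printed above should be close to 1 already for moderate n.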

We begin with two lemmas.

Lemma 3·2. For any integer n and residue class $k \,\left(\mathrm{mod}\,{m}\right)$ , we have

(3·2) \begin{equation}\sum_{\substack{j = 1 \\ j \equiv k \,\left(\mathrm{mod}\,{m}\right)}}^{n - 1} j^a (n - j)^b = \frac{n^{a + b + 1}}{m} \frac{\Gamma(a+1)\Gamma(b+1)}{\Gamma(a+b+2)}+ O_{a, b}\left(n^{a + b} \right).\end{equation}

Proof. (Sketch) As in [ Reference Halberstam6 ], we rewrite the sum in (3·2) as $n^{a + b} \sum_{j = 0}^{r-1} f\left(\alpha_0 + j\alpha\right)$ , where $f(t) \,:\!=\, t^a (1 - t)^b$ and $\alpha \,:\!=\, m/n$ , for some $\alpha_0$ and r satisfying $0 \leqslant \alpha_0 < \alpha$ and $\left| r - \alpha^{-1} \right| < 1$ . After a change of variables, we recognise this as a Riemann sum approximation to the integral defining the beta function, yielding the result.
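Lemma 3·2 can be checked numerically by comparing the sum over a residue class with the beta-function main term. This is only a sketch: the function names and the sample parameters are our own, chosen for illustration.

```python
import math


def residue_class_sum(n, k, m, a, b):
    """Sum of j^a * (n-j)^b over 1 <= j <= n-1 with j congruent to k modulo m."""
    return sum(j ** a * (n - j) ** b for j in range(1, n) if (j - k) % m == 0)


def beta_main_term(n, m, a, b):
    """Main term n^{a+b+1}/m * Gamma(a+1)*Gamma(b+1)/Gamma(a+b+2) of (3.2)."""
    beta = math.gamma(a + 1) * math.gamma(b + 1) / math.gamma(a + b + 2)
    return n ** (a + b + 1) / m * beta


def relative_error(n, k, m, a, b):
    """Relative difference between the sum and its main term."""
    main = beta_main_term(n, m, a, b)
    return abs(residue_class_sum(n, k, m, a, b) - main) / main


# The lemma predicts a relative error of order m/n for fixed a and b.
print(relative_error(20000, 3, 7, 0.5, 1.5))
```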

Lemma 3·3. We have, as a formal identity of Dirichlet series,

\begin{equation*}\sum_{n = 1}^{\infty} \sum_{\substack{m = 1 \\ (m,\, n) = 1}}^{\infty}n^{-r} m^{-s} =\frac{\zeta(r) \zeta(s)}{\zeta(r + s)}.\end{equation*}

Proof. This follows by rewriting the left-hand side as

\begin{align*}\sum_{d = 1}^{\infty} \mu(d) \sum_{u = 1}^{\infty} \sum_{v = 1}^{\infty}(du)^{-r} (dv)^{-s}=\sum_{d = 1}^{\infty} \mu(d) d^{-r- s} \sum_{u = 1}^{\infty} \sum_{v = 1}^{\infty}u^{-r} v^{-s}.\end{align*}
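Lemma 3·3 can be illustrated numerically in a case where the right-hand side is known in closed form. For $r = s = 2$ one has $\zeta(2)^2/\zeta(4) = 5/2$. The sketch below, with a hypothetical function name and an arbitrary truncation point N, evaluates the truncated coprime double sum, whose deficit from $5/2$ is of order $1/N$.

```python
from math import gcd


def coprime_double_sum(r, s, N):
    """Sum of n^{-r} * m^{-s} over coprime pairs with 1 <= n, m <= N."""
    return sum(n ** (-r) * m ** (-s)
               for n in range(1, N + 1)
               for m in range(1, N + 1)
               if gcd(n, m) == 1)


print(coprime_double_sum(2, 2, 400))
```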

Proof of Theorem 3·1. We rewrite $S_{a, b}(n)$ in the form

(3·3) \begin{equation}S_{a, b}(n) = \sum_{k=1}^{n-1} \sigma_a(k) \sigma_b(n-k) =\sum_{d=1}^{n - 1} d^{-a}\sum_{e=1}^{n - 1} e^{-b}\sum_{\substack{k = 1 \\ d \mid k \\ e \mid n - k}}^{n - 1} k^a (n - k)^b.\end{equation}

If $(d, e) \nmid n$ then the inner sum vanishes. Otherwise, the divisibility conditions are equivalent to demanding that $k \equiv k_0 \,\left(\mathrm{mod}\,{({de}/{(d, e)})}\right)$ for some $k_0$ , and by Lemma 3·2 the inner sum equals

\begin{align*}&n^{a + b + 1} \left( \frac{de}{(d, e)} \right)^{-1} \frac{\Gamma(a+1)\Gamma(b+1)}{\Gamma(a+b+2)} +O\left(n^{a + b} \right),\end{align*}

so that

(3·4) \begin{equation}S_{a, b}(n) = \frac{\Gamma(a+1)\Gamma(b+1)}{\Gamma(a+b+2)} n^{a + b + 1}\sum_{\substack{d, e=1 \\ (d, e) \mid n}}^{n - 1} d^{-a} e^{-b}\left( \frac{(d, e)}{de} + O\left(n^{-1}\right) \right).\end{equation}
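The structure of the inner sum in (3·3) can be confirmed by direct enumeration: the set of admissible k is empty when $(d, e) \nmid n$ and otherwise forms an arithmetic progression with common difference $de/(d,e)$. The following sketch, with helper names of our own choosing, checks this for small parameters.

```python
from math import gcd


def admissible_k(n, d, e):
    """List the k in [1, n-1] with d | k and e | (n - k)."""
    return [k for k in range(1, n) if k % d == 0 and (n - k) % e == 0]


def check_residue_structure(n, d, e):
    """Check emptiness when gcd(d, e) does not divide n, and spacing lcm(d, e) otherwise."""
    ks = admissible_k(n, d, e)
    g = gcd(d, e)
    if n % g != 0:
        return ks == []
    lcm = d * e // g
    # Consecutive admissible values differ by exactly lcm(d, e).
    return all(k2 - k1 == lcm for k1, k2 in zip(ks, ks[1:]))


print(admissible_k(10, 2, 4), admissible_k(9, 2, 4))
```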

Assuming for now that $a, b > 1$ , the error term of $O\left(n^{-1}\right)$ above contributes an error bounded by

(3·5) \begin{equation}\ll n^{a + b} \sum_{d, e = 1}^{n - 1} d^{-a} e^{-b}\ll n^{a + b}.\end{equation}

The sum in the main term of (3·4) is equal to

\begin{align*}\sum_{w \mid n} &w^{-a - b - 1}\sum_{\substack{i, j = 1 \\ (i, j) = 1}}^{n/w-1} i^{-a - 1} j^{- b - 1} \\ & = \sum_{w \mid n} w^{-a - b - 1} \left(\sum_{\substack{i, j = 1 \\ (i, j) = 1}}^{\infty} i^{-a - 1} j^{- b - 1} + O\left( \left(\frac{n}{w} \right)^{- \min(a, b)} \right) \right) \\ & = \sum_{w \mid n} w^{-a - b - 1} \sum_{\substack{i, j = 1 \\ (i, j) = 1}}^{\infty} i^{-a - 1} j^{- b - 1} + O\left(n^{- \min(a, b)}\right).\\\end{align*}

By Lemma 3·3 the sum over i and j above is $\zeta(a+1)\zeta(b+1)/\zeta(a+b+2)$ , while the sum over w may be identified as $n^{-a-b-1}\sigma_{a+b+1}(n)$ . Assembling this in (3·4), we obtain Theorem 3·1 with an error of $O\left(n^{a + b}\right)$ in the case that $a,b>1$ .

If $a \leqslant 1$ and $b \geqslant 1$ , then in (3·5) the error term is $\ll n^{a + b + 1 - a} (\log n)^{\kappa}$ , where $\kappa$ is 2 if $a = b = 1$ , 1 if $a = 1$ and $b \neq 1$ or vice versa, and zero otherwise. If $a \geqslant 1$ and $b < 1$ , then the error is similarly $\ll n^{a + b + 1 - b} (\log n)^{\kappa}$ .

If instead $a, b < 1$ , take the sum in (3·5) only through $d \leqslant D$ and $e \leqslant E$ , making an error $\ll n^{a + b} D^{1 - a} E^{1 - b}$ . Rewriting (3·3) in the form

(3·6) \begin{equation}\sum_{k = 1}^{n - 1} O\left(n^{a + b}\right)\left( \sum_{d \mid k} d^{-a} \right)\left( \sum_{e \mid n - k} e^{-b} \right),\end{equation}

the contribution from $d > D$ is $O\left(n^{a + b + 1 + \epsilon} D^{-a}\right)$ , and the contribution from $e > E$ is similarly $O\left(n^{a + b + 1 + \epsilon} E^{-b}\right)$ . We therefore make a total error

\begin{equation*}\ll n^{a + b + 1 + \epsilon} \max\left( n^{-1} D^{1 - a} E^{1 - b}, D^{-a}, E^{-b} \right).\end{equation*}

Equating the parameters by choosing $D = n^{\frac{b}{b+a-ab}}$ and $E = n^{\frac{a}{b+a-ab}}$ , we obtain an error term

\begin{equation*}\ll n^{a + b + 1 + \epsilon - \frac{ab}{b+a-ab}}.\end{equation*}

This yields Theorem 3·1 in the remaining cases.
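The choice of D and E in the case $a, b < 1$ is exactly the one that balances the three error contributions $n^{-1} D^{1-a} E^{1-b}$ , $D^{-a}$ and $E^{-b}$ . This can be confirmed with exact rational arithmetic; the sketch below uses a hypothetical function name and sample values of a and b chosen purely for illustration.

```python
from fractions import Fraction


def balanced_exponents(a, b):
    """Exponents of n in n^{-1} D^{1-a} E^{1-b}, D^{-a}, E^{-b} for the stated D and E."""
    denom = a + b - a * b
    x = b / denom  # D = n^x
    y = a / denom  # E = n^y
    return (-1 + (1 - a) * x + (1 - b) * y, -a * x, -b * y)


a, b = Fraction(1, 2), Fraction(1, 3)
print(balanced_exponents(a, b), -a * b / (a + b - a * b))
```

All three exponents agree with $-ab/(b+a-ab)$ , which is the saving recorded in the final error term.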

4. Main theorem and proof

Again, for notational simplicity we assume that b and a are both real; if not, replace b and a with $\text{Re}(b)$ and $\text{Re}(a)$ in all inequalities and error estimates. We also assume without loss of generality that $b \geqslant a$ (i.e., that $\text{Re}(b) \geqslant \text{Re}(a)$ if these quantities are complex).

To motivate our strategy, in place of $\sum_{k = 1}^{n - 1} \sigma_a(k) \sigma_b(n - k)$ , consider the problem of estimating the simpler sum $\sum_{k = 1}^{n - 1} \sigma_a(k) (n - k)^b$ . The factor $(n - k)^b$ appears to complicate matters, but via the theory of Riesz means and Mellin transforms it may be interpreted as a smoothing factor that helps in evaluating the sum.

In particular, we have the following familiar formula.

Lemma 4·1. We have, for any Dirichlet series $\sum_k a(k) k^{-s}$ and any complex number b with $\text{Re}(b) > 0$ , the formula

(4·1) \begin{equation} \frac{1}{\Gamma(b + 1)} \sum_{k = 1}^n a(k) (n - k)^b = \frac{1}{2 \pi i} \int \left( \sum a(k) k^{-s} \right) \frac{\Gamma(s)}{\Gamma(s + b + 1)} n^{s + b} ds,\end{equation}

where the contour is over any vertical line where the Dirichlet series converges uniformly and absolutely.

Proof. Switching the order of integration and summation, this reduces to the formula

\begin{equation*}\frac{1}{2 \pi i} \int \frac{\Gamma(s)}{\Gamma(s + b + 1)} t^{s} ds = \begin{cases} 0 & \text{ if } 0 < t < 1, \\[5pt] \Gamma(b + 1)^{-1} \cdot (1 - t^{-1})^b & \text{ if } t > 1, \end{cases} \end{equation*}

for which see [ Reference Gradshteyn and Ryzhik5 , 17.43.22]. (It may be proved by shifting the contour infinitely far to the right or left as appropriate, and evaluating the sum of residues in the latter case.)
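The kernel formula used in this proof can be tested numerically when b is a positive integer, since then $\Gamma(s)/\Gamma(s+b+1) = 1/\bigl(s(s+1)\cdots(s+b)\bigr)$ is a rational function and no complex gamma function is needed. The sketch below makes several assumptions of our own: b is an integer at least 2 (so that the integrand decays quickly), the contour is the line $\text{Re}(s) = c$ with $c = 1$, and the integral is truncated at height T and approximated by the trapezoid rule.

```python
import cmath
import math


def kernel_integral(t, b, c=1.0, T=200.0, h=0.01):
    """Approximate (1/(2*pi*i)) * integral over Re(s)=c of t^s / (s(s+1)...(s+b)) ds."""
    log_t = math.log(t)

    def integrand(y):
        s = complex(c, y)
        product = 1
        for j in range(b + 1):
            product *= s + j
        return (cmath.exp(s * log_t) / product).real

    steps = int(T / h)
    total = 0.5 * (integrand(0.0) + integrand(T))
    for i in range(1, steps):
        total += integrand(i * h)
    # The integrand at -y is the complex conjugate of its value at y, so the
    # integral over [-T, T] equals twice the integral of the real part over [0, T].
    return total * h / math.pi


def kernel_exact(t, b):
    """Right-hand side: 0 for 0 < t < 1 and (1 - 1/t)^b / Gamma(b+1) for t > 1."""
    return 0.0 if t < 1 else (1 - 1 / t) ** b / math.gamma(b + 1)


for t in (0.5, 2.0, 4.0):
    print(t, kernel_integral(t, 2), kernel_exact(t, 2))
```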

Our aim will be to first manipulate our sum into something resembling (4·1), where the Dirichlet series $\sum a(k) k^{-s}$ can be expressed in terms of zeta functions and therefore enjoys analytic continuation to $\mathbb{C}$ . As is familiar in various analytic number theory contexts, this will then allow us to shift the integral in (4·1) to the left.

Now, we have

(4·2) \begin{align}S_{a, b}(n) &= \nonumber \sum_{k = 1}^{n - 1} \sigma_a(k) \sigma_b(n - k) \\ \nonumber & = \sum_{k = 1}^{n - 1} \sigma_a(k) \left( \sum_{d \mid n - k} \left( \frac{n -k}{d} \right)^b \right) \\ \nonumber & = \sum_{d \geqslant 1} d^{-b} \sum_{\substack{k = 1 \\ d \mid n - k}}^{n - 1} \sigma_a(k) (n - k)^b \\ & = \Gamma(b + 1) \sum_{d \geqslant 1} d^{-b} \frac{1}{2 \pi i} \int_{(a + 2)} \left( \sum_{\substack{k \\ d \mid n - k}} \sigma_a(k) k^{-s} \right) \frac{\Gamma(s)}{\Gamma(s + b + 1)} n^{b + s} ds, \end{align}

where the integral is taken over the vertical line with $\text{Re}(s) = a + 2$ .

For any real $x>0$ , let $\zeta(s,x)$ be the Hurwitz zeta function, defined for $\text{Re}(s)>1$ by the Dirichlet series

\begin{equation*} \zeta(s,x) \,:\!=\, \sum_{n=0}^\infty \frac{1}{(n+x)^s}.\end{equation*}

We note that

\begin{align*}\sum_{k \equiv n \,\left(\mathrm{mod}\,{d}\right)} \frac{\sigma_a(k)}{k^s} &= \sum_{k_1k_2 \equiv n \,\left(\mathrm{mod}\,{d}\right)} \frac{k_1^a}{(k_1k_2)^s} \\ &= \sum_{\substack{ 1 \leqslant e_1,e_2 \leqslant d \\ e_1 e_2 \equiv n \,\left(\mathrm{mod}\,{d}\right)}} \left(\sum_{m_1 \geqslant 0} \frac{1}{(m_1d + e_1)^{s-a}} \right) \left( \sum_{m_2 \geqslant 0} \frac{1}{(m_2d+e_2)^s}\right) \\ &= \frac{1}{d^{2s-a}} \sum_{\substack{ 1 \leqslant e_1,e_2 \leqslant d \\ e_1 e_2 \equiv n \,\left(\mathrm{mod}\,{d}\right)}} \zeta(s-a, e_1/d) \zeta(s,e_2/d).\end{align*}
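The decomposition into Hurwitz zeta functions can be verified numerically for real s in the region of absolute convergence. The sketch below is a rough check under our own assumptions: the helper names are hypothetical, both sides are truncated, and the Hurwitz zeta function is approximated by a partial sum with a standard Euler–Maclaurin tail correction.

```python
from functools import lru_cache


def sigma_table(a, N):
    """Return the list [0, sigma_a(1), ..., sigma_a(N)] computed by a divisor sieve."""
    table = [0] * (N + 1)
    for d in range(1, N + 1):
        power = d ** a
        for multiple in range(d, N + 1, d):
            table[multiple] += power
    return table


@lru_cache(maxsize=None)
def hurwitz(s, x, terms=20000):
    """Approximate zeta(s, x) for real s > 1 and x > 0."""
    partial = sum((n + x) ** (-s) for n in range(terms))
    edge = terms + x
    # Tail sum_{n >= N} (n+x)^{-s} is approximately (N+x)^{1-s}/(s-1) + (N+x)^{-s}/2.
    return partial + edge ** (1 - s) / (s - 1) + 0.5 * edge ** (-s)


def lhs(n, d, a, s, K=20000):
    """Truncated sum of sigma_a(k) * k^{-s} over 1 <= k <= K with k = n (mod d)."""
    table = sigma_table(a, K)
    return sum(table[k] * k ** (-s) for k in range(1, K + 1) if (k - n) % d == 0)


def rhs(n, d, a, s):
    """d^{-(2s-a)} times the sum of zeta(s-a, e1/d) * zeta(s, e2/d) over e1*e2 = n (mod d)."""
    total = sum(hurwitz(s - a, e1 / d) * hurwitz(s, e2 / d)
                for e1 in range(1, d + 1)
                for e2 in range(1, d + 1)
                if (e1 * e2 - n) % d == 0)
    return total / d ** (2 * s - a)


print(lhs(3, 5, 1, 4), rhs(3, 5, 1, 4))
```

Note that the residue class 0 modulo d is represented by $e_i = d$ , in agreement with the range $1 \leqslant e_1, e_2 \leqslant d$ in the displayed formula.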

Thus, we conclude that

(4·3) \begin{align} &S_{a,b}(n) =\nonumber\\ &\Gamma(b+1) \sum_{d \geqslant 1} d^{a-b} \sum_{\substack{1 \leqslant e_1,e_2 \leqslant d \\ e_1e_2 \equiv n \,\left(\mathrm{mod}\,{d}\right)}} \frac{1}{2\pi i} \int_{(a+2)} \zeta(s-a,e_1/d) \zeta(s,e_2/d) \frac{\Gamma(s)}{\Gamma(s+b+1)}n^{b+s}d^{-2s}\,ds.\end{align}

The main aim of this section is to prove the following theorem, essentially a restatement of Theorems 1·1 and 1·2.

Theorem 4·2. Let a and b be positive real numbers.

(i) If $b-a > 3/2$ , then

\begin{align*}S_{a,b}(n) &= \frac{\Gamma(a+1)\Gamma(b+1)}{\Gamma(a+b+2)}\frac{\zeta(a+1)\zeta(b+1)}{\zeta(a+b+2)}\sigma_{a+b+1}(n) \\[5pt] \notag &\quad + \frac{\zeta(1-a)\zeta(b+1)}{(b+1)\zeta(b-a+2)}n^a\sigma_{b-a+1}(n) + \sum_{0 \leqslant m < \frac{b-a}{2}-\frac{3}{4}} \mathrm{Res}({-}m) + O_\epsilon\left(n^{\frac{a+b}{2}+\frac{3}{4}+\epsilon}\right), \end{align*}

where $\mathrm{Res}({-}m)$ denotes the residue of the integrand of (4·3) at $s=-m$ , and is given explicitly by (4·7). It satisfies $\mathrm{Res}({-}m) \ll n^{b-m}$ in general, and if a is an odd integer, then $\mathrm{Res}(0) = -({1}/{2})\zeta({-}a)\sigma_b(n)$ and $\mathrm{Res}({-}m)=0$ for each $m \geqslant 1$ .

(ii) If $\max\{a,2-a\} < b \leqslant a + {3}/{2}$ , then

\begin{align*}S_{a,b}(n) &= \frac{\Gamma(a+1)\Gamma(b+1)}{\Gamma(a+b+2)}\frac{\zeta(a+1)\zeta(b+1)}{\zeta(a+b+2)}\sigma_{a+b+1}(n) \\[5pt] & \quad\quad + \frac{\zeta(1-a)\zeta(b+1)}{(b+1)\zeta(b-a+2)}n^a\sigma_{b-a+1}(n) + O\left(n^{\frac{a+b}{2}+1+\epsilon}\right).\end{align*}
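As an illustration of part (i), consider the pair $(a,b) = (1,3)$, for which $b - a = 2 > 3/2$ and only $m = 0$ occurs in the sum of residues. The sketch below, with hypothetical helper names `sigma`, `convolution` and `explicit_terms_1_3`, compares a brute-force evaluation of $S_{1,3}(n)$ with the three explicit terms, using exact rational arithmetic. The constants are obtained from $\zeta(2)\zeta(4)/\zeta(6) = 7/4$, $\zeta(0) = -1/2$ and $\zeta({-}1) = -1/12$. The remaining gap is $-\sigma_1(n)/240$, in line with the classical identity for this pair of exponents, and is comfortably within the stated error term $O_\epsilon(n^{11/4+\epsilon})$.

```python
from fractions import Fraction

def sigma(a, k):
    """Sum of the a-th powers of the divisors of k."""
    return sum(d ** a for d in range(1, k + 1) if k % d == 0)

def convolution(a, b, n):
    """S_{a,b}(n) = sum_{k=1}^{n-1} sigma_a(k) * sigma_b(n-k), by brute force."""
    return sum(sigma(a, k) * sigma(b, n - k) for k in range(1, n))

def explicit_terms_1_3(n):
    """The three explicit terms of part (i), specialised to (a, b) = (1, 3)."""
    main = Fraction(7, 80) * sigma(5, n)            # Gamma and zeta factors combine to 7/80
    secondary = Fraction(-1, 8) * n * sigma(3, n)   # zeta(0) = -1/2, divided by b + 1 = 4
    res0 = Fraction(1, 24) * sigma(3, n)            # -(1/2) * zeta(-1) = 1/24
    return main + secondary + res0

for n in (10, 100, 1000):
    gap = convolution(1, 3, n) - explicit_terms_1_3(n)
    print(n, gap, Fraction(-sigma(1, n), 240))
```

For each $n$ the last two printed values coincide, so in this case the explicit terms miss $S_{1,3}(n)$ only by a quantity of size $O(n^{1+\epsilon})$.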

After recalling some analytic facts about the Hurwitz zeta function, we begin by analysing the poles and residues of the integrand. This constitutes an analysis of the main terms provided in Theorem 4·2. We then bound the error terms in Theorem 4·2 by means of the functional equation for Hurwitz zeta functions. This has the net effect of replacing the summation of Hurwitz zeta functions by a Dirichlet series whose coefficients are certain Kloosterman sums. This also implicitly gives another evaluation of the residual terms $\mathrm{Res}({-}m)$ .

Finally, we note that we can obtain the secondary term in a simpler fashion, with no Kloosterman sums, when $a > 1$ and $b > a + 2$ . We explain this in Section 4.4.

4.1. Properties of the Hurwitz zeta function

The following lemma recalls some basic properties of the Hurwitz zeta function. For proofs, see [1].

Lemma 4·3. For any real $x > 0$ and $\text{Re}(s) > 1$ , the Hurwitz zeta function $\zeta(s, x) \,:\!=\, \sum_{n = 0}^{\infty} (n + x)^{-s}$ satisfies the following:

  (i) (Analytic continuation) $\zeta(s, x)$ extends meromorphically to all of $\mathbb{C}$, with a simple pole at $s = 1$ of residue 1 and no other poles;

  (ii) (Functional equation) $\zeta(s, x)$ satisfies a functional equation, which for $x = e/d$ rational and $\text{Re}(s) < 0$ can be written

    (4·4) \begin{equation} \zeta(1-s,e/d) = \frac{\Gamma(s)}{(2\pi)^s} \left( e^{\pi i s/2} \sum_{k \geqslant 1} \frac{e^{-2\pi i ke/d}}{k^s} + e^{-\pi i s/2} \sum_{k \geqslant 1} \frac{e^{2\pi i ke/d}}{k^s}\right);\end{equation}

  (iii) (Evaluation at negative integers) For integer values $k \geqslant 0$, there is the special value

    (4·5) \begin{equation}\zeta({-}k, x) = \frac{-1}{k+1} B_{k+1}(x),\end{equation}
    where $B_{k+1}(x)$ denotes the degree $k+1$ Bernoulli polynomial.

To estimate the values of $\zeta(s, x)$ inside the critical strip, we will use the approximate functional equation, as proved in the following form by Miyagawa [16].

Lemma 4·4. Assume $s = \sigma + i t$ for some $0 < \sigma < 1 $ . Set $T = \sqrt{2 \pi (|t|+1)}$ . Then for any real $x > 0$ ,

\begin{align*}\zeta&(s,x) = \\&\sum_{0 \leqslant k \leqslant T} \frac{1}{(k+x)^s} + \frac{\Gamma(1-s)}{(2\pi)^{1-s}} \left[ e^{\frac{\pi i (1-s)}{2}}\sum_{k \leqslant T} \frac{e({-}kx)}{k^{1-s}} + e^{\frac{-\pi i (1-s)}{2}}\sum_{k \leqslant T} \frac{e(kx)}{k^{1-s}}\right] + O\left(t^{-\frac{\sigma}{2}}\right) \\[5pt]& \quad + O\left(t^{\frac{\sigma-1}{2}}\right).\end{align*}

We also note the following consequence of Stirling’s formula.

Lemma 4·5. For any b, we have

\begin{equation*} \frac{\Gamma(s)}{\Gamma(1 + b + s)} \ll_b (1 + |t|)^{-b - 1}.\end{equation*}

4.2. Analysis of poles and residues

We now proceed with our analysis of the integral (4·3). For each $e_1,e_2$ , the integrand has right-most pole at $s=a+1$ , coming from the factor of $\zeta(s-a,e_1/d)$ , which has a simple pole with residue 1. The sum of the residues is

\begin{align*}\frac{\Gamma(a+1)\Gamma(b+1)}{\Gamma(a+b+2)} \sum_{d \geqslant 1} &\frac{n^{a+b+1}}{d^{a+b+2}} \sum_{\substack{1 \leqslant e_1,e_2 \leqslant d \\ e_1e_2 \equiv n \,\left(\mathrm{mod}\,{d}\right)}} \zeta(a+1,e_2/d) \\[6pt] &= \frac{\Gamma(a+1)\Gamma(b+1)}{\Gamma(a+b+2)} \sum_{d \geqslant 1} \frac{n^{a+b+1}}{d^{b+1}} \sum_{k \geqslant 1} \frac{\#\{e_1 \,\left(\mathrm{mod}\,{d}\right) :\, ke_1 \equiv n \,\left(\mathrm{mod}\,{d}\right)\}}{k^{a+1}}.\end{align*}

We then note that

\begin{equation*}\#\{e_1 \,\left(\mathrm{mod}\,{d}\right) :\, ke_1 \equiv n \,\left(\mathrm{mod}\,{d}\right)\} = \begin{cases} (k,d), & \text{if } (k,d) \mid (d,n), \\[5pt] 0, & \text{otherwise.}\end{cases}\end{equation*}
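This count is elementary, and it can also be confirmed by direct enumeration. The sketch below is only a check over a small, arbitrarily chosen range; the function names `count_solutions` and `predicted_count` are illustrative.

```python
from math import gcd

def count_solutions(k, n, d):
    """Number of e1 modulo d with k * e1 = n (mod d), by direct enumeration."""
    return sum(1 for e1 in range(d) if (k * e1 - n) % d == 0)

def predicted_count(k, n, d):
    """gcd(k, d) when gcd(k, d) divides gcd(d, n), and 0 otherwise."""
    g = gcd(k, d)
    return g if gcd(d, n) % g == 0 else 0

mismatches = [(k, n, d)
              for d in range(1, 25) for k in range(1, 25) for n in range(1, 25)
              if count_solutions(k, n, d) != predicted_count(k, n, d)]
print(mismatches)  # an empty list means the formula held on the whole test range
```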

Thus, write $f \,:\!=\, (k,d)$ , and observe that we may assume $f \mid n$ . So doing, and replacing d and k by fd and fk, respectively, our expression for the residue at $s=a+1$ becomes

\begin{align*}\frac{\Gamma(a+1)\Gamma(b+1)}{\Gamma(a+b+2)} \sum_{f \mid n} \frac{n^{a+b+1}}{f^{a+b+1}} &\sum_{\substack{d,k \\ (d,k)=1}} \frac{1}{d^{b+1} k^{a+1}}\\[6pt] &= \frac{\Gamma(a+1)\Gamma(b+1)}{\Gamma(a+b+2)} \frac{\zeta(a+1)\zeta(b+1)}{\zeta(a+b+2)} \sigma_{a+b+1}(n),\end{align*}

by Lemma 3·3.

Before turning to the residue of the pole at $s=1$ , we note one consequence of the above argument. In particular, for any fixed n and b, in the identity proved above,

(4·6) \begin{equation}\sum_{d \geqslant 1} \frac{n^{a+b+1}}{d^{a+b+2}} \sum_{\substack{1 \leqslant e_1,e_2 \leqslant d \\ e_1e_2 \equiv n \,\left(\mathrm{mod}\,{d}\right)}} \zeta(a+1,e_2/d) = \frac{\zeta(a+1)\zeta(b+1)}{\zeta(a+b+2)} \sigma_{a+b+1}(n),\end{equation}

both sides define analytic functions of $a$ for $a>-b$, $a\neq 0$. Thus, this expression must hold for $-b<a<0$, even though neither $\zeta(a+1,x)$ nor $\zeta(a+1)$ is defined via a convergent Dirichlet series in this region. This will be useful in evaluating the residue at $s=1$, which we now turn to.

Using (4·3) again, the residue at the pole $s=1$ is seen to be

\begin{align*}\frac{\Gamma(b+1)}{\Gamma(b+2)} \sum_{d\geqslant 1} \frac{n^{b+1}}{d^{b-a+2}} \sum_{\substack{1 \leqslant e_1,e_2 \leqslant d \\ e_1e_2 \equiv n \,\left(\mathrm{mod}\,{d}\right)}}& \zeta(1-a,e_1/d) \\&= \frac{n^a}{b+1} \sum_{d \geqslant 1} \frac{n^{b-a+1}}{d^{b-a+2}} \sum_{\substack{1 \leqslant e_1,e_2 \leqslant d \\ e_1e_2 \equiv n \,\left(\mathrm{mod}\,{d}\right)}} \zeta(1-a,e_1/d).\end{align*}

Since we have assumed $a < b$ , it follows that $-a > -b$ , so by the identity (4·6), this evaluates to

\begin{equation*} \frac{n^a}{b+1} \frac{\zeta(1-a)\zeta(b+1)}{\zeta(b-a+2)}\sigma_{b-a+1}(n).\end{equation*}

Finally, we evaluate the residue at $s=-m$ , $m \geqslant 0$ , arising from the gamma function. We do so in general, but we only provide a clean simplification of the term when a is an odd integer. The residues for other values of a do not seem to have a natural multiplicative structure, for example, so we consider the case that a is odd to be the most interesting.

Using (4·3), the residue at $s=-m$ is

(4·7) \begin{equation} ({-}1)^m n^{b-m} \binom{b}{m} \sum_{d \geqslant 1} d^{a-b+2m} \sum_{\substack{1 \leqslant e_1,e_2 \leqslant d \\ e_1e_2 \equiv n \,\left(\mathrm{mod}\,{d}\right)}} \zeta({-}m-a,e_1/d) \zeta({-}m,e_2/d).\end{equation}

When a is an integer, by the special value formula (4·5) the inner summation over $e_1,e_2$ in (4·7) becomes

\begin{equation*} \frac{1}{(m+1)(m+a+1)} \sum_{\substack{ 1 \leqslant e_1,e_2 \leqslant d \\ e_1e_2 \equiv n \,\left(\mathrm{mod}\,{d}\right)}} B_{m+1}(e_2/d) B_{m+a+1}(e_1/d).\end{equation*}

For fixed d, the substitution $(e_1,e_2) \mapsto (d-e_1,d-e_2)$ defines an involution on the set of pairs $(e_1,e_2)$ with $e_1,e_2 \neq d$ . Since $B_{k+1}(1-x) = ({-}1)^{k+1}B_{k+1}(x)$ , if a is odd, it follows for such $e_1,e_2$ that

\begin{equation*} B_{m+1}\Big(\frac{d-e_2}{d}\Big)B_{m+a+1}\Big(\frac{d-e_1}{d}\Big) = - B_{m+1}\Big(\frac{e_2}{d}\Big) B_{m+a+1}\Big(\frac{e_1}{d}\Big).\end{equation*}

Consequently, when a is odd, the sum over $e_1,e_2$ with $e_1,e_2 \neq d$ cancels, and it remains to consider only those pairs where one of $e_1$ and $e_2$ equals d. Given that $e_1$ and $e_2$ are restricted to satisfy the congruence $e_1e_2 \equiv n \,\left(\mathrm{mod}\,{d}\right)$ , such pairs arise only when $d \mid n$ . In this case, the summation over $e_1$ and $e_2$ in (4·7) collapses to

\begin{align*}&\sum_{e_1 = 1}^d \zeta({-}m-a,e_1/d)\zeta({-}m) + \sum_{e_2 =1}^d \zeta({-}m,e_2/d) \zeta({-}m-a) - \zeta({-}m-a)\zeta({-}m) \\ & \quad = d^{-m-a} \zeta({-}m-a)\zeta({-}m) + d^{-m} \zeta({-}m)\zeta({-}m-a) - \zeta({-}m)\zeta({-}m-a).\end{align*}

If $m \geqslant 1$ , then, since a is odd, every term above is 0, and consequently the residue (4·7) is 0 as well. On the other hand, if $m=0$ , then the above expression simplifies to $d^{-a}\zeta(0)\zeta({-}a) = -({d^{-a}}/{2})\zeta({-}a)$ . We then find for $m=0$ that (4·7) evaluates to

\begin{equation*} \frac{-\zeta({-}a)}{2} \sum_{d \mid n} \frac{n^b}{d^b} = -\frac{\zeta({-}a)}{2} \sigma_b(n).\end{equation*}
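The cancellation argument can be checked exactly in the simplest case $a = 1$, $m = 0$. In the sketch below, the Hurwitz values are taken from (4·5) with $B_1(x) = x - 1/2$ and $B_2(x) = x^2 - x + 1/6$; the function names and the test values of $n$ and $d$ are illustrative assumptions. According to the discussion above, the inner sum should vanish when $d \nmid n$ and equal $d^{-1}\zeta(0)\zeta({-}1) = 1/(24d)$ when $d \mid n$.

```python
from fractions import Fraction

def zeta_minus_one(x):
    """zeta(-1, x) = -B_2(x) / 2 with B_2(x) = x^2 - x + 1/6."""
    return -(x * x - x + Fraction(1, 6)) / 2

def zeta_zero(x):
    """zeta(0, x) = -B_1(x) with B_1(x) = x - 1/2."""
    return -(x - Fraction(1, 2))

def inner_sum(n, d):
    """Sum of zeta(-1, e1/d) * zeta(0, e2/d) over 1 <= e1, e2 <= d with e1*e2 = n (mod d)."""
    total = Fraction(0)
    for e1 in range(1, d + 1):
        for e2 in range(1, d + 1):
            if (e1 * e2 - n) % d == 0:
                total += zeta_minus_one(Fraction(e1, d)) * zeta_zero(Fraction(e2, d))
    return total

# Compare a value of n divisible by each d (n = 12) with one that is not (n = 7).
for d in (1, 2, 3, 4, 6):
    print(d, inner_sum(12, d), inner_sum(7, d))
```

Summing $n^b d^{1-b} \cdot 1/(24d)$ over $d \mid n$ then recovers $\sigma_b(n)/24 = -({1}/{2})\zeta({-}1)\sigma_b(n)$, in agreement with the displayed formula for $a = 1$.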

4.3. Error analysis via Kloosterman sums

Applying the functional equation (4·4) for both $\zeta(1-s-a,e_1/d)$ and $\zeta(1-s,e_2/d)$ , we will be led to consider exponential sums of the form

\begin{equation*} S_n(m,k;\,d) \,:\!=\, \sum_{\substack{e_1,e_2 \,\left(\mathrm{mod}\,{d}\right) \\ e_1 e_2 \equiv n \,\left(\mathrm{mod}\,{d}\right)}} e\Big( \frac{me_1+ke_2}{d}\Big),\end{equation*}

where we write $e(x) \,:\!=\, e^{2\pi i x}$ for any real x. By relating these to classical Kloosterman sums we obtain the following strong bound.

Lemma 4·6. With notation as above, we have

\begin{equation*}S_n(m,k;\,d) \ll_\epsilon d^{1/2+\epsilon} (d,k)^{1/2} (d,m)^{1/2}\end{equation*}

for any $\epsilon > 0$ .

Proof. Recall that the classical Kloosterman sums are defined by

\begin{equation*} K(a,b;\,q) \,:\!=\, S_1(a,b;\,q) = \sum_{ xy \equiv 1 \,\left(\mathrm{mod}\,{q}\right)} e\left(\frac{ax+by}{q}\right).\end{equation*}

We begin by proving the identity

\begin{equation*} S_n(m,k;\,d) = \sum_{f \mid (d,n,k)} f\ K(m, kn/f^2;\, d/f).\end{equation*}

For $e_1$ as in the sum defining $S_n(m,k;\,d)$ , let $f = (e_1,d)$ , and note that there are no terms with $f \nmid (d,n)$ . Write $e_1 = e_1^\prime f$ , where $(e_1^\prime, d/f) = 1$ . Let $e_2^\prime$ be such that $e_1^\prime e_2^\prime \equiv 1\,\left(\mathrm{mod}\,{d/f}\right)$ , so that the allowed values of $e_2 \,\left(\mathrm{mod}\,{d}\right)$ are given by $e_2 = e_2^\prime n/f + jd/f$ for $0 \leqslant j \leqslant f-1$ .

Thus, we find

\begin{align*}S_n(m,k;\,d) &= \sum_{f \mid (d,n)} \sum_{e_1^\prime e_2^\prime \equiv 1 \,\left(\mathrm{mod}\,{\frac{d}{f}}\right)} e\left(\frac{m e_1^\prime f + kne_2^\prime/f}{d}\right)\sum_{j=0}^{f-1} e\left(\frac{jk}{f}\right) \\ &= \sum_{f \mid (d,n,k)} f \sum_{e_1^\prime e_2^\prime \equiv 1 \,\left(\mathrm{mod}\,{\frac{d}{f}}\right)} e\left(\frac{me_1^\prime + kne_2^\prime /f^2}{d/f}\right) \\ &= \sum_{f \mid (d,n,k)} f\ K\left(m,kn/f^2;\,d/f\right),\end{align*}

as claimed.

Now apply the Weil bound $|K(a,b;\,q)| \leqslant \tau(q)q^{1/2}\mathrm{gcd}(a,b,q)^{1/2}$ to conclude

\begin{align*}|S_n(m,k;\,d)| &\leqslant \sum_{f \mid (d,n,k)} d^{1/2} f^{1/2} \tau\left(\frac{d}{f}\right) \mathrm{gcd}\left(m,\frac{kn}{f^2},\frac{d}{f}\right)^{1/2} \\[5pt] &\ll_\epsilon d^{1/2+\epsilon} (d,k)^{1/2} (d,m)^{1/2},\end{align*}

as desired.
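The identity relating $S_n(m,k;\,d)$ to classical Kloosterman sums can be tested numerically for small moduli. The following sketch evaluates both sides by brute force; the function names and the test ranges are illustrative assumptions, and the comparison is made in floating point, so agreement is only expected up to rounding.

```python
import cmath
from math import gcd

def e(x):
    """e(x) = exp(2 * pi * i * x)."""
    return cmath.exp(2j * cmath.pi * x)

def twisted_sum(n, m, k, d):
    """S_n(m, k; d): sum of e((m*e1 + k*e2)/d) over e1*e2 = n (mod d)."""
    return sum(e((m * e1 + k * e2) / d)
               for e1 in range(d) for e2 in range(d)
               if (e1 * e2 - n) % d == 0)

def kloosterman(a, b, q):
    """K(a, b; q) = S_1(a, b; q)."""
    return twisted_sum(1, a, b, q)

def via_kloosterman(n, m, k, d):
    """Sum over f dividing gcd(d, n, k) of f * K(m, k*n/f^2; d/f)."""
    g = gcd(gcd(d, n), k)
    return sum(f * kloosterman(m, (k * n) // (f * f), d // f)
               for f in range(1, g + 1) if g % f == 0)

worst = max(abs(twisted_sum(n, m, k, d) - via_kloosterman(n, m, k, d))
            for d in range(1, 11) for n in range(1, 11)
            for m in range(1, 5) for k in range(1, 11))
print(worst)  # expected to be at floating-point rounding level
```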

We first assume that $b > a + 3/2$ . We will shift the contour in (4·3) to the line $\text{Re}(s)=1-\delta$ for some $\delta > 1$ . Using Stirling’s formula, along the line $\text{Re}(s)=1-\delta$ for $\delta>1$ , the integrand in (4·3) is

\begin{equation*} \ll_{a,b,\delta} (1+|t|)^{a-b+2\delta-2}\sum_{d \geqslant 1} \frac{n^{b+1-\delta}}{d^{b-a+2-2\delta}} \sum_{k,m\geqslant 1} \frac{|S_n(m,k;\,d)|+|S_n(m,-k;\,d)|}{m^\delta k^{\delta + a}}.\end{equation*}

The integral (4·3) thus converges absolutely on the line $\text{Re}(s)=1-\delta$ provided that $\delta < ({b-a+1}/{2})$ . This is compatible with the assumption that $\delta>1$ by the assumption $b>a+3/2$ .

Using Lemma 4·6, the integral in (4·3), evaluated on the line $\text{Re}(s)=1-\delta$ , is

\begin{equation*} \ll_{a,b,\delta,\epsilon} \sum_{d \geqslant 1} \frac{n^{b+1-\delta}}{d^{b-a+\frac{3}{2}-2\delta-\epsilon}},\end{equation*}

by the assumption that $\delta > 1$ . Since $b>a+{3}/{2}$ , we take $\delta = ({b-a})/{2} + {1}/{4}-\epsilon$ and conclude the integral is

\begin{equation*} \ll_{a,b,\epsilon} n^{\frac{a+b}{2}+\frac{3}{4} + \epsilon} \sum_{d \geqslant 1} \frac{1}{d^{1+\epsilon}} \ll_{a,b,\epsilon} n^{\frac{a+b}{2}+\frac{3}{4} + \epsilon}.\end{equation*}

Together with the analysis of the poles, this yields the first part of Theorem 4·2.

Now, assume that $b > \max\{a,2-a\}$ . Our goal in this case is to show that the contour in (4·3) may be shifted to the line $\text{Re}(s)=\sigma$ for some $0 < \sigma < 1$ . This is equivalent to obtaining sufficient cancellation in the series

(4·8) \begin{equation}\sum_{d \geqslant 1} d^{a-b-2s} \sum_{\substack{ 1 \leqslant e_1,e_2 \leqslant d \\ e_1e_2 \equiv n\,\left(\mathrm{mod}\,{d}\right)}} \zeta(s-a, e_1/d) \zeta(s, e_2/d)\end{equation}

on the line $\text{Re}(s)=\sigma$ . We shall find it convenient to assume that $\sigma < a$ so that $\zeta(s-a,e_1/d)$ is related to an absolutely convergent Dirichlet series via the functional equation (4·4). For $\zeta(s,e_2/d)$ , we do not have this luxury, so we instead invoke the approximate functional equation of Lemma 4·4.

In principle, in applying the functional equation for $\zeta(s-a,e_1/d)$ and the approximate functional equation for $\zeta(s,e_2/d)$ , we are forced to consider six summations, corresponding to pairing each of the two terms in (4·4) with the three terms in Lemma 4·4. However, the two summations in (4·4) have the same shape as each other, as do the second and third summations in Lemma 4·4. Consequently, it essentially suffices to consider only two types of summation, corresponding to pairing the first term from Lemma 4·4 with a term from (4·4) or pairing one of the latter two terms from Lemma 4·4 with a term from (4·4).

In the first of these two cases, where the first term of Lemma 4·4 for $\zeta(s,e_2/d)$ is paired with one of the terms in (4·4) for $\zeta(s-a,e_1/d)$, we are led to consider series of the form

(4·9) \begin{align}\sum_d \frac{1}{d^{b-a+2s}} &\sum_{\substack{ 1 \leqslant e_1,e_2 \leqslant d \\ e_1e_2 \equiv n\,\left(\mathrm{mod}\,{d}\right)}} \sum_{0 \leqslant k \leqslant T} \sum_{m \geqslant 1} \frac{e\big(\frac{me_1}{d}\big)}{(k+e_2/d)^s m^{1+a-s}}\\[5pt]& = \sum_d \frac{1}{d^{b-a+s}} \sum_{k \leqslant d(T+1)} \sum_{m \geqslant 1} \frac{1}{k^s m^{1+a-s}} \sum_{e_1 k \equiv n \,\left(\mathrm{mod}\,{d}\right)} e\left( \frac{me_1}{d}\right),\nonumber \end{align}

where, as in Lemma 4·4, we have set $T = \sqrt{2\pi(1+|t|)}$. The exponential sum in (4·9) is 0 unless $(d,k) \mid (n,d,m)$, in which case it is of absolute value $(d,k)$. Thus, since we have assumed $\text{Re}(s)=\sigma < a$, (4·9) is bounded by
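The evaluation of the exponential sum in (4·9) used here can be confirmed by enumeration. The sketch below compares the absolute value of the sum with the predicted value over a small range; the function names and ranges are illustrative assumptions, and the comparison is in floating point.

```python
import cmath
from math import gcd

def restricted_sum(n, m, k, d):
    """Sum of e(m*e1/d) over e1 modulo d with e1*k = n (mod d)."""
    return sum(cmath.exp(2j * cmath.pi * m * e1 / d)
               for e1 in range(d) if (e1 * k - n) % d == 0)

def predicted_modulus(n, m, k, d):
    """gcd(d, k) when gcd(d, k) divides gcd(n, d, m), and 0 otherwise."""
    g = gcd(d, k)
    return g if gcd(gcd(n, d), m) % g == 0 else 0

worst_gap = max(abs(abs(restricted_sum(n, m, k, d)) - predicted_modulus(n, m, k, d))
                for d in range(1, 16) for n in range(1, 16)
                for m in range(1, 9) for k in range(1, 16))
print(worst_gap)  # expected to be at floating-point rounding level
```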

(4·10) \begin{align}\sum_{d\geqslant 1} \frac{1}{d^{b-a+\sigma}} \sum_{k \leqslant d(T+1)} \sum_{m \geqslant 1} \frac{(d,k)}{k^\sigma m^{1+a-\sigma}} &\ll \sum_{d \geqslant 1} \frac{1}{d^{b-a+\sigma}} \sum_{f \mid d} f^{1-\sigma} \left(\frac{Td}{f}\right)^{1-\sigma}\\&\ll T^{1-\sigma} \sum_{d \geqslant 1} \frac{1}{d^{b-a+2\sigma-1-\epsilon}} \nonumber\\ &\ll T^{1-\sigma} \nonumber\\ &\ll (1+|t|)^{\frac{1-\sigma}{2}},\nonumber \end{align}

provided that $\sigma > 1 - ({b-a})/{2}.$ Since we have assumed $b>2-a$ , there is some $\sigma < a$ for which this holds. Using Stirling’s formula, the additional factors in (4·4) as applied to $\zeta(s - a, e_1/d)$ coming from the gamma function and exponentials may be bounded by $O\left((1+|t|)^{a-\sigma+\frac{1}{2}}\right)$ . Altogether, the contribution to (4·8) from the first term in the approximate functional equation for $\zeta(s,e_2/d)$ is seen to be $O\left((1+|t|)^{a-\frac{3\sigma}{2}+1}\right)$ .

We now consider the second type of summation, arising from the second and third terms in the approximate functional equation. In particular, we are led to estimate

(4·11) \begin{align}\sum_d \frac{1}{d^{b-a+2s}} \sum_{m \geqslant 1} \sum_{k \leqslant T} \frac{1}{k^{1-s} m^{a+1-s}} & \sum_{\substack{1 \leqslant e_1,e_2 \leqslant d \\ e_1e_2 \equiv n \,\left(\mathrm{mod}\,{d}\right)}} e\left(\frac{\pm me_1\pm ke_2}{d}\right)\\[5pt]& =\sum_d \frac{1}{d^{b-a+2s}} \sum_{m \geqslant 1} \sum_{k \leqslant T} \frac{S_n(\pm m,\pm k;\,d)}{k^{1-s} m^{a+1-s}}.\nonumber \end{align}

We appeal to Lemma 4·6 to conclude that (4·11) is bounded by

(4·12) \begin{align}\sum_{d \geqslant 1} \frac{1}{d^{b-a+2\sigma}} \sum_{m \geqslant 1} \sum_{k \leqslant T} \frac{d^{1/2+\epsilon}(m,d)^{1/2}(k,d)^{1/2}}{k^{1-\sigma} m^{a+1-\sigma}} &\ll \sum_{d \geqslant 1} \frac{1}{d^{b-a+2\sigma-1/2-\epsilon}} \sum_{f \mid d} f^{\sigma-\frac{1}{2}} \left(\frac{T}{f}\right)^{\sigma}\\& \ll T^\sigma \sum_{d \geqslant 1} \frac{1}{d^{b-a+2\sigma-1/2-\epsilon}} \nonumber\\ & \ll T^\sigma \nonumber\\ & \ll (1+|t|)^{\frac{\sigma}{2}}.\nonumber \end{align}

Once again, the additional factors in (4·4) are of size $O\left((1+|t|)^{a-\sigma+\frac{1}{2}}\right)$, while those in Lemma 4·4 are seen to be $O\left((1+|t|)^{\frac{1}{2}-\sigma}\right)$. We thus find that terms arising from the second and third summations in Lemma 4·4 contribute an amount that is $O\left((1+|t|)^{a-\frac{3\sigma}{2}+1}\right)$ to (4·8), matching the contribution from those terms arising from the first summation in Lemma 4·4. The error terms in Lemma 4·4 contribute a smaller amount, and we conclude that on the line $\text{Re}(s)=\sigma$,

(4·13) \begin{equation}\sum_{d \geqslant 1} d^{a-b-2s} \sum_{\substack{ 1 \leqslant e_1,e_2 \leqslant d \\ e_1e_2 \equiv n\,\left(\mathrm{mod}\,{d}\right)}} \zeta(s-a, e_1/d) \zeta(s, e_2/d) \ll (1+|t|)^{a - \frac{3\sigma}{2}+1},\end{equation}

provided that $1 - ({b-a})/{2} < \sigma < a$ .

Thus, estimating the quotient of gamma factors by Lemma 4·5, the integrand in (4·3) is $O_{a,b,\sigma}\left(n^{b+\sigma} (1+|t|)^{a-b-\frac{3\sigma}{2}}\right)$ . The integral therefore converges absolutely on the line $\text{Re}(s) = 1 - ({b-a})/{2}+\epsilon$ for any $\epsilon>0$ . This yields the second part of the theorem when $\max\{a,2-a\} < b\leqslant a + 3/2$ .

4.4. A simpler version of the error analysis

We present an alternative treatment of the error that avoids the complications of the last section, obtaining a weaker error term of $o\left(n^{b + 1}\right)$ for some ranges of the parameters. In particular, we assume that $a > 1$ and $b>a+2$ .

Shift the contour in (4·3) to $\text{Re}(s) = 1 - \epsilon$ for small $\epsilon > 0$ . We have $\zeta(s - a, e_1/d) \ll (1 + |t|)^{a - \frac{1}{2} + \epsilon}$ by the functional equation and Stirling’s formula; we have $\zeta(s, e_2/d) \ll (1 + |t|)^{\epsilon} \cdot \left( {e_2}/{d} \right)^{-1}$ by the convexity bound, with the term $\left( {e_2}/{d}\right)^{-1}$ arising from the first term $(e_2/d)^{-s}$ of $\zeta(s, e_2/d)$ ; and we again use Lemma 4·5 to estimate the quotient of gamma functions.

We conclude that the integrand is

\begin{equation*} \ll \sum_{d \geqslant 1} \frac{n^{b+1-\epsilon}}{d^{b-a-1-2\epsilon}} (1+|t|)^{a-b-\frac{3}{2}+2\epsilon}.\end{equation*}

This yields an error term of $O(n^{b + 1 - \epsilon})$ provided that the sum over d and the integral over t converge. These conditions are satisfied for some $\epsilon>0$ if $b - a > 2$ .

5. Possible improvements

As made clear in the discussion surrounding Lemma 4·6, the error term in Theorem 1·2 is controlled by sums of Kloosterman sums $K(r,s;\,q)$, where q denotes the modulus. The Weil bound implies that $K(r,s;\,q) \ll q^{1/2+\epsilon}$, and this is a key ingredient in the proof. However, it is expected that much greater cancellation holds on average. We expect that if the estimate $K(r,s;\,q) \ll q^{\theta+\epsilon}$ holds on average for some $0 \leqslant \theta \leqslant 1/2$, then the error term in Theorem 1·2 may be improved to $O\left(n^{\frac{a+b}{2}+\frac{1+\theta}{2}+\epsilon}\right)$. Assuming a conjecture of Selberg [19], the value $\theta=0$ is likely admissible, and this would yield a Ramanujan–Deligne quality error term in Theorem 1·2. Using work of Deshouillers and Iwaniec [3] on sums of Kloosterman sums, we speculate it may be possible to improve the error in Theorem 1·2, perhaps to the level $O\left(n^{\frac{a+b}{2}+\frac{7}{12}+\epsilon}\right)$. Alternatively, Shparlinski suggested to us that his work with Zhang [20] on cancellation amongst Kloosterman sums to prime moduli could be readily generalised to the composite case, again leading to possible improvements. We leave these questions for future work.

Finally, as P. Humphries pointed out to us, these questions can also be addressed via the spectral theory of automorphic forms. We refer to Kuznetsov [11] and Motohashi [17] for some related results along these lines, including a treatment by Motohashi of the case $a = b = 0$. Humphries suggested to us that these techniques may be able to address complex a and b in greater generality, and again we leave this question for future work.

Acknowledgements

The authors would like to thank Bruce Berndt, Michael Filaseta, Peter Humphries, Karl Mahlburg, Ken Ono, Ian Petrow, Igor Shparlinski and Matt Young for useful discussions and for pointing us to relevant related works. We would also like to thank an anonymous referee for helpful comments.

RJLO was partially supported by NSF grant DMS-1601398. FT was partially supported by grants from the Simons Foundation (Nos. 563234 and 586594).

References

[1] Apostol, T. M. Introduction to Analytic Number Theory (Springer-Verlag, New York-Heidelberg, 1976). Undergraduate Texts in Mathematics.
[2] Delecroix, V., Goujard, E., Zograf, P. and Zorich, A. Contribution of one-cylinder square-tiled surfaces to Masur–Veech volumes. Astérisque (415, Quelques aspects de la théorie des systèmes dynamiques: un hommage à Jean-Christophe Yoccoz. I) (2020), 223–274.
[3] Deshouillers, J.-M. and Iwaniec, H. Kloosterman sums and Fourier coefficients of cusp forms. Invent. Math. 70(2) (1982/83), 219–288.
[4] Eskin, A. and Okounkov, A. Asymptotics of numbers of branched coverings of a torus and volumes of moduli spaces of holomorphic differentials. Invent. Math. 145 (2001), 59–103.
[5] Gradshteyn, I. S. and Ryzhik, I. M. Table of integrals, series, and products. Fourth edition prepared by Ju. V. Geronimus and M. Ju. Cetlin. Translated from the Russian by Scripta Technica, Inc. Translation edited by Alan Jeffrey (Academic Press, New York-London, 1965).
[6] Halberstam, H. Four asymptotic formulae in the theory of numbers. J. London Math. Soc. 24 (1949), 13–21.
[7] Halberstam, H. An asymptotic formula in the theory of numbers. Trans. Amer. Math. Soc. 84 (1957), 338–351.
[8] Hubert, P. and Lelievre, S. Prime arithmetic Teichmüller discs in $\mathcal{H}(2)$. Israel J. Math. 151(1) (2006), 281–321.
[9] Ingham, A. E. Some asymptotic formulae in the theory of numbers. J. London Math. Soc. 2(3) (1927), 202–208.
[10] Kontsevich, M. and Zorich, A. Connected components of the moduli spaces of abelian differentials with prescribed singularities. Invent. Math. 153 (2003), 631–678.
[11] Kuznetsov, N. V. Convolution of Fourier coefficients of Eisenstein–Maass series. volume 129, pages 43–84 (1983). Automorphic functions and number theory, I.
[12] Lelievre, S. and Royer, E. Orbit countings in $\mathcal{H}(2)$ and quasimodular forms. Internat. Math. Res. Not. 2006(42151) (2006), 1–30.
[13] Masur, H. Interval exchange transformations and measured foliations. Ann. Math. 115 (1982), 169–200.
[14] Masur, H. Ergodic theory of translation surfaces. In B. Hasselblatt and A. Katok, editors, Handbook of dynamical systems, volume 1B, pages 527–547 (Elsevier B. V., 2006).
[15] McMullen, C. T. Teichmüller curves in genus two: discriminant and spin. Math. Ann. 333(1) (2005), 87–130.
[16] Miyagawa, T. Approximate functional equations for the Hurwitz and Lerch zeta-functions. Comment. Math. Univ. St. Pauli 66(1–2) (2017), 15–27.
[17] Motohashi, Y. The binary additive divisor problem. Ann. Sci. École Norm. Sup. (4) 27(5) (1994), 529–572.
[18] Ramanujan, S. On certain arithmetical functions [Trans. Camb. Phil. Soc. 22 (1916), no. 9, 159–184]. In Collected papers of Srinivasa Ramanujan, pages 136–162 (AMS Chelsea Publ., Providence, RI, 2000).
[19] Selberg, A. On the estimation of Fourier coefficients of modular forms. In Proc. Sympos. Pure Math., Vol. VIII, pages 1–15 (Amer. Math. Soc., Providence, R.I., 1965).
[20] Shparlinski, I. E. and Zhang, T. Cancellations amongst Kloosterman sums. Acta Arith. 176(3) (2016), 201–210.
[21] Shrestha, S. T. Counting formulae for square-tiled surfaces in genus two. Ann. Math. Blaise Pascal 27(1) (2020), 83–123.
[22] Veech, W. Gauss measures for transformations on the space of interval exchange maps. Ann. of Math. 115(2) (1982), 201–242.
[23] Veech, W. Moduli spaces of quadratic differentials. J. Anal. Math. 55(1) (1990), 117–171.
[24] Zmiaikou, D. The probability of generating the symmetric group with a commutator condition. Preprint (2012), available at https://arxiv.org/abs/1205.6718.
[25] Zorich, A. Square tiled surfaces and Teichmüller volumes of the moduli spaces of abelian differentials. In Rigidity in Dynamics and Geometry (Cambridge, 2000), pages 459–471 (Springer, Berlin, 2002).

Fig. 1. On the left, the two translation surfaces differ by a nontrivial rotation, so are not considered equivalent. On the right, the two translation surfaces are cut and paste equivalent. We omit the orientation on the edges mentioned in the definition while representing the surfaces using polygons.

Fig. 2. A translation surface formed by two pentagons whose opposite corresponding sides are glued. This surface has genus 2 and lives in the stratum $\mathcal{H}(1,1)$.

Fig. 3. Examples of STSs in the four cylinder diagrams of $\mathcal{H}(1,1)$, here named A, B, C, D. In each surface, collections of edges with the same label are glued via translation. For instance, in A, the 3 edges labelled p are glued to the 3 edges labelled p via translation to form a horizontal cylinder. Hence, diagram A is characterised by having exactly one (maximal) horizontal cylinder. Similarly, diagram D consists of STSs in $\mathcal{H}(1,1)$ with exactly three horizontal cylinders while diagram B and C consist of those with two horizontal cylinders but different gluing pattern. Adding squares to vary the parameters p, q, r, j, k, l, m gives surfaces with different number of squares in each of these cylinder diagrams.