1 Introduction
For $n \in \mathbb {N}$ , let $\Omega (n)$ denote the number of prime factors of n, counted with multiplicity. The study of the asymptotic behavior of $ \Omega (n) $ has a rich history and important applications in number theory. For instance, the classical prime number theorem is equivalent to the statement that the set $ \{ n \in \mathbb {N} : \Omega (n) \text { is even}\} $ has natural density 1/2 [Lan53, vM97]. Recently, a dynamical approach to this question was introduced by Bergelson and Richter [BR21]. They show that given a uniquely ergodic dynamical system $(X, \mu , T)$ , the sequence $\{T^{\Omega (n)} x\}_{n \in \mathbb {N}}$ is uniformly distributed in X for every point $x \in X$ (see §2 for relevant definitions). The precise statement is as follows.
Theorem 1.1. [BR21, Theorem A]
Let $(X, \mu , T)$ be uniquely ergodic. Then,
$$ \begin{align*} \lim_{N \to \infty} \frac{1}{N} \sum_{n=1}^{N} g(T^{\Omega (n)} x) = \int_X g \, d\mu \end{align*} $$
for all $x \in X$ and $g \in C(X)$ .
The purpose of this paper is to continue this dynamical exploration of the properties of $\Omega (n)$ . Relaxing the assumptions of Bergelson and Richter’s theorem, we obtain further results regarding the convergence of ergodic averages along $\Omega (n)$ . In §3.3, we show that pointwise almost-everywhere convergence for $L^1$ functions does not hold in any non-atomic ergodic system.
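Theorem 1.2. Let $(X, \mathcal {B}, \mu , T)$ be a non-atomic ergodic dynamical system. Then there is a set $A \in \mathcal {B}$ such that for almost every (a.e.) $x \in X$ ,
$$ \begin{align*} \limsup_{N \to \infty} \frac{1}{N} \sum_{n=1}^{N} 1_A(T^{\Omega (n)} x) = 1 \quad \text{and} \quad \liminf_{N \to \infty} \frac{1}{N} \sum_{n=1}^{N} 1_A(T^{\Omega (n)} x) = 0, \tag{1} \end{align*} $$
where $1_A$ denotes the indicator function of A.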
In particular, Theorem 1.2 demonstrates that the assumptions in Theorem 1.1 that the system is uniquely ergodic and g is continuous are not only necessary for pointwise convergence to the proper limit, but for pointwise convergence to hold at all. To prove Theorem 1.2, the key idea is to approximate the ergodic averages along $\Omega (n)$ by weighted sums. We show that for all $\varepsilon> 0$ and $N \in \mathbb {N}$ , there are weight functions $w_*(N): \mathbb {N} \to \mathbb {R}$ , supported on large intervals $I_N$ , such that
as N tends to infinity. Leveraging the size and placement of the intervals $I_N$ , we employ a standard argument to demonstrate the failure of pointwise convergence. Moreover, our method shows that there is not just one set $A \in \mathcal {B}$ for which equation (1) holds, but rather there exists a dense $G_\delta $ subset $\mathcal {R} \subseteq \mathcal {B}$ such that equation (1) holds for every $A \in \mathcal {R}$ . Thus, the operators $T^{\Omega (n)}: \mathcal {B} \to L^1(\mu )$ defined by $T^{\Omega (n)}A(x) := 1_A(T^{\Omega (n)} x)$ are shown to have the strong sweeping-out property.
In any ergodic system, the set of generic points has full measure. Generic points are those whose ergodic averages converge to $\int _X f \, d\mu $ for every continuous function f (see §2 for the precise definition). In light of Theorems 1.1 and 1.2, it is natural to ask whether, at generic points, convergence still holds when the ergodic averages are taken along the sequence $\Omega (n)$ . The answer is no: in §3.1, we explicitly construct a symbolic system yielding a counterexample.
Bergelson and Richter show in [BR21] that Theorem 1.1 is a direct generalization of the prime number theorem. In §4, we demonstrate the relationship of Theorem 1.1 to another fundamental result from number theory, the Erdős–Kac theorem. Let $C_c(\mathbb {R})$ denote the set of continuous functions on $\mathbb {R}$ of compact support. An equivalent version of the Erdős–Kac theorem states that for all $F \in C_c(\mathbb {R})$ ,
$$ \begin{align*} \lim_{N \to \infty} \frac{1}{N} \sum_{n=1}^{N} F \Big( \frac{\Omega (n) - \log \log N}{\sqrt{\log \log N}} \Big) = \frac{1}{\sqrt{2\pi}} \int_{-\infty}^{\infty} F(t) e^{-t^2/2} \, dt. \end{align*} $$
Roughly speaking, this says that for large N, the sequence $\{\Omega (n): 1 \leq n \leq N\}$ approaches a normal distribution with mean and variance $\log \log N$ . We have now introduced two sequences describing distinct behaviors of $ \Omega (n) $ : $ \{ F( ({\Omega (n) - \log \log N }){/}{\sqrt {\log \log N}} ) \}_{n = 1}^{N} $ capturing the Erdős–Kac theorem and $ \{ g(T^{\Omega (n)}x) \}_{n \in \mathbb {N}} $ capturing Theorem 1.1. Two sequences $a,b: \mathbb {N} \to \mathbb {C}$ are called asymptotically uncorrelated if
In §4, we show that Theorem 1.1 and the Erdős–Kac theorem exhibit a form of disjointness, in that the sequences capturing their behavior are asymptotically uncorrelated.
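Theorem 1.3. Let $(X, \mu , T)$ be uniquely ergodic and let $F \in C_c(\mathbb {R})$ . Then,
$$ \begin{align*} \lim_{N \to \infty} \frac{1}{N} \sum_{n=1}^{N} F \Big( \frac{\Omega (n) - \log \log N}{\sqrt{\log \log N}} \Big) g(T^{\Omega (n)} x) = \Big( \frac{1}{\sqrt{2\pi}} \int_{-\infty}^{\infty} F(t) e^{-t^2/2} \, dt \Big) \Big( \int_X g \, d\mu \Big) \end{align*} $$
for all $g \in C(X)$ and $x \in X$ .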
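We prove Theorem 1.3 as a corollary of the following more general estimate. For each $ N \in \mathbb {N} $ , let
$$ \begin{align*} \varphi_N(n) := \frac{\Omega (n) - \log \log N}{\sqrt{\log \log N}}. \end{align*} $$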
Then for any bounded arithmetic function $ \mathbf {a}: \mathbb {N} \to \mathbb {C} $ and any $ F \in C_c(\mathbb {R}) $ ,
To prove equation (2), our strategy is to approximate each average by a double average involving dilations by primes. The key observation is that $ F(\varphi _N(n)) $ is asymptotically invariant under dilations by primes, whereas $ \Omega (n) $ is highly sensitive to such dilations. This sensitivity is particularly noticeable in the case that $ \mathbf {a}(n) = (-1)^n $ , so that $ {\mathbf {a}(\Omega (p n)) = - \mathbf {a}(\Omega (n)) }$ . We leverage these contrasting behaviors to obtain the desired invariance in equation (2).
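The contrast between the two behaviors can be made explicit in a few lines. The following is a sketch, using only the complete additivity of $\Omega $ and the normalization $\varphi _N(n) = (\Omega (n) - \log \log N)/\sqrt {\log \log N}$ from Proposition 4.5; it is meant as an illustration and not as a substitute for the argument in §4.

```latex
\begin{align*}
\Omega(pn) &= \Omega(n) + 1 \qquad \text{for every prime } p, \\
\varphi_N(pn) - \varphi_N(n) &= \frac{\Omega(pn) - \Omega(n)}{\sqrt{\log\log N}}
  = \frac{1}{\sqrt{\log\log N}} \xrightarrow[N \to \infty]{} 0, \\
(-1)^{\Omega(pn)} &= -(-1)^{\Omega(n)}.
\end{align*}
```

Since F is continuous with compact support, it is uniformly continuous, so the second line gives $F(\varphi _N(pn)) - F(\varphi _N(n)) \to 0$ uniformly in n, whereas the third line shows that the sign of $(-1)^{\Omega (n)}$ flips under every dilation by a prime.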
Let $\lambda (n) = (-1)^{\Omega (n)}$ denote the classical Liouville function. Another equivalent formulation of the prime number theorem states that
$$ \begin{align*} \lim_{N \to \infty} \frac{1}{N} \sum_{n=1}^{N} \lambda (n) = 0. \end{align*} $$
This formulation of the prime number theorem can be seen as a special case of Theorem 1.1 by choosing $(X, \mu , T)$ to be the uniquely ergodic system given by rotation on two points (see [BR21] or §2 for details). In a similar fashion, we obtain the following corollary of Theorem 1.3.
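Corollary 1.4. Let $F \in C_c(\mathbb {R})$ . Then,
$$ \begin{align*} \lim_{N \to \infty} \frac{1}{N} \sum_{n=1}^{N} F \Big( \frac{\Omega (n) - \log \log N}{\sqrt{\log \log N}} \Big) \lambda (n) = 0. \end{align*} $$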
Corollary 1.4 demonstrates that the behaviors of $\Omega (n)$ captured by the Erdős–Kac theorem and the prime number theorem exhibit disjointness. This can be interpreted as saying that, for large N, the sequence $ \{ \Omega (n) : 1 \leq n \leq N, \Omega (n) \text { is even} \} $ still approaches a normal distribution with mean and variance $\log \log N$ .
2 Background material
2.1 Measure-preserving systems
By a topological dynamical system, we mean a pair $(X,T)$ , where X is a compact metric space and T a homeomorphism of X. A Borel probability measure $\mu $ on X is called T-invariant if $\mu (T^{-1}A) = \mu (A)$ for all measurable sets A. By the Bogolyubov–Krylov theorem (see, for instance, [Wal82, Corollary 6.9.1]), every topological dynamical system has at least one T-invariant measure. When a topological system $(X, T)$ admits only one such measure, $(X, T)$ is called uniquely ergodic.
By a measure-preserving dynamical system, we mean a probability space $(X, \mathcal {B}, \mu )$ , where X is a compact metric space and $\mathcal {B}$ the Borel $\sigma $ -algebra on X, accompanied by a measure-preserving transformation $T: X \to X$ . We often omit the $\sigma $ -algebra $\mathcal {B}$ when there is no ambiguity. A measure-preserving dynamical system is called ergodic if for any $A \in \mathcal {B}$ such that $T^{-1}A = A$ , one has $\mu (A) = 0$ or $\mu (A) = 1$ . Though unique ergodicity was defined above as a topological property, it is easy to verify that the unique invariant measure is indeed ergodic.
One of the most fundamental results in ergodic theory is the Birkhoff pointwise ergodic theorem, which states that for any ergodic system $(X, \mu , T)$ and $f \in L^1(\mu )$ ,
$$ \begin{align*} \lim_{N \to \infty} \frac{1}{N} \sum_{n=1}^{N} f(T^n x) = \int_X f \, d\mu \end{align*} $$
for a.e. $x \in X$ .
A point $x \in X$ is called generic for the measure $\mu $ if
$$ \begin{align*} \lim_{N \to \infty} \frac{1}{N} \sum_{n=1}^{N} f(T^n x) = \int_X f \, d\mu \end{align*} $$
for all $f \in C(X)$ , where $C(X)$ denotes the space of continuous functions on X. Thus, generic points are those for which pointwise convergence holds for every continuous function. When $\mu $ is ergodic, the set of generic points has full measure.
2.2 Symbolic systems
Let $\mathcal {A}$ be a finite set of symbols. Let $\mathcal {A}^{\mathbb {N}}$ denote the set of all infinite sequences with entries in $\mathcal {A}$ . The set $\mathcal {A}^{\mathbb {N}}$ is endowed with the product topology coming from the discrete topology on the alphabet $\mathcal {A}$ . With this topology, $\mathcal {A}^{\mathbb {N}}$ is a compact metrizable space. Denote an element in $\mathcal {A}^{\mathbb {N}}$ by $\mathbf {x} = (\mathbf {x}(i))_{i \in \mathbb {N}}$ . One choice of metric generating this topology is given by
This space carries a natural continuous map $\sigma : \mathcal {A}^{\mathbb {N}} \to \mathcal {A}^{\mathbb {N}}$ , called the left shift, defined by $ (\sigma \mathbf {x})(i) = \mathbf {x}(i+1) $ .
2.3 Background on $\Omega (n)$
Let $\Omega (n)$ denote the number of prime factors of n, counted with multiplicity. One equivalent formulation of the prime number theorem states that asymptotically, $\Omega (n)$ is even exactly half the time [Lan53, vM97]. This statement can be expressed by the classical Liouville function $\lambda (n) = (-1)^{\Omega (n)}$ :
$$ \begin{align*} \lim_{N \to \infty} \frac{1}{N} \sum_{n=1}^{N} \lambda (n) = 0. \end{align*} $$
This formulation involving averages is useful from a dynamical point of view. However, a rephrasing of this statement leads to several naturally stated generalizations. For a set $E \subseteq \mathbb {N}$ , the natural density of E in $\mathbb {N}$ is defined to be
$$ \begin{align*} d(E) := \lim_{N \to \infty} \frac{|E \cap \{1, \ldots , N\}|}{N}, \end{align*} $$
whenever this limit exists.
For $ m \in \mathbb {N} $ , define $E_m := \{n \in \mathbb {N} \,:\, \Omega (n) \equiv 0 \pmod m \}$ . The prime number theorem is equivalent to the statement that $E_2$ has natural density 1/2. In other words, $\Omega (n)$ distributes evenly over residue classes mod 2. The following theorem due to Pillai and Selberg [Pil40, Sel39] states that $\Omega (n)$ distributes evenly over all residue classes.
Theorem 2.1. (Pillai, Selberg)
For all $m \in \mathbb {N}$ and $r \in \{0,\ldots , m-1\}$ , the set $\{ n \in \mathbb{N} \,:\, \Omega(n) \equiv r\ \mod m \}$ has natural density $1/m$ .
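Theorem 2.1 can be explored numerically. The following Python sketch is an illustration only and plays no role in the proofs: it tabulates $\Omega (n)$ with a smallest-prime-factor sieve and reports the proportion of $n \leq N$ in each residue class mod m. The function names and the cutoff $N = 10^5$ are choices made here for illustration.

```python
from collections import Counter
from math import isqrt


def big_omega_table(limit):
    """Return a list t with t[n] = Omega(n) for 1 <= n <= limit (t[0] is unused)."""
    spf = list(range(limit + 1))  # spf[n] will hold the smallest prime factor of n
    for p in range(2, isqrt(limit) + 1):
        if spf[p] == p:  # p is prime
            for multiple in range(p * p, limit + 1, p):
                if spf[multiple] == multiple:
                    spf[multiple] = p
    table = [0] * (limit + 1)
    for n in range(2, limit + 1):
        # Omega is completely additive: Omega(n) = Omega(n / p) + 1 for a prime p | n.
        table[n] = table[n // spf[n]] + 1
    return table


def residue_frequencies(limit, modulus):
    """Proportion of 1 <= n <= limit with Omega(n) congruent to r, for r = 0..modulus-1."""
    table = big_omega_table(limit)
    counts = Counter(table[n] % modulus for n in range(1, limit + 1))
    return [counts[r] / limit for r in range(modulus)]


if __name__ == "__main__":
    for m in (2, 3, 4):
        print(m, [round(freq, 4) for freq in residue_frequencies(10 ** 5, m)])
```

At this range the frequencies are only roughly equal to $1/m$ , which is consistent with the theorem being an asymptotic statement; the experiment says nothing about the rate of convergence.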
Complementing this result is a theorem due to Erdős and Delange. A sequence $\{a(n)\}_{n \in \mathbb {N}} \subseteq \mathbb {R}$ is uniformly distributed mod 1 if
$$ \begin{align*} \lim_{N \to \infty} \frac{1}{N} \sum_{n=1}^{N} f(a(n) \bmod 1) = \int_0^1 f(x) \, dx \end{align*} $$
for all continuous functions $f: [0,1] \to \mathbb {C}$ . Erdős mentions without proof [Erd46, p. 2] and Delange later proves [Del58] the following statement.
Theorem 2.2. (Erdős, Delange)
Let $\alpha \in \mathbb {R}\setminus \mathbb {Q}$ . Then, $\{\Omega (n)\alpha \}_{n \in \mathbb {N}}$ is uniformly distributed mod 1.
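By the Weyl criterion, Theorem 2.2 amounts to the vanishing of the exponential sums $(1/N) \sum _{n \leq N} e^{2\pi i h \alpha \Omega (n)}$ for every non-zero integer h. The sketch below computes the modulus of such a sum for a finite N. The choice $\alpha = \sqrt {2}$ , the cutoff and the function names are assumptions made for illustration, and a finite computation can only suggest the limiting behavior.

```python
import cmath
import math


def big_omega_table(limit):
    """Return t with t[n] = Omega(n), adding 1 along the multiples of each prime power."""
    table = [0] * (limit + 1)
    is_composite = [False] * (limit + 1)
    for p in range(2, limit + 1):
        if is_composite[p]:
            continue
        for multiple in range(p * p, limit + 1, p):
            is_composite[multiple] = True
        power = p
        while power <= limit:
            # Every multiple of p^k gains one prime factor counted with multiplicity.
            for multiple in range(power, limit + 1, power):
                table[multiple] += 1
            power *= p
    return table


def weyl_sum_modulus(table, alpha, h, limit):
    """Modulus of (1/limit) * sum_{n <= limit} exp(2*pi*i*h*alpha*Omega(n))."""
    total = sum(
        cmath.exp(2j * math.pi * h * alpha * table[n]) for n in range(1, limit + 1)
    )
    return abs(total) / limit


if __name__ == "__main__":
    N = 10 ** 5
    omega = big_omega_table(N)
    for h in (1, 2, 3):
        print(h, weyl_sum_modulus(omega, math.sqrt(2), h, N))
```

The values for different h need not be equally small at a fixed N, so the printed numbers should be read as a qualitative check only.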
Bergelson and Richter’s Theorem 1.1 uses dynamical methods to provide a simultaneous generalization of these number-theoretic results [BR21, p. 3]. We review their argument for obtaining the prime number theorem from Theorem 1.1, as we use a similar argument in §4.2 to obtain Corollary 1.4 from Theorem 1.3. Let $X = \{0,1\}$ and define ${T: X \to X}$ by $T(0) = 1$ and $T(1) = 0$ . Let $\mu $ be the Bernoulli measure given by $\mu (\{ 0 \}) = 1/2$ and $\mu (\{ 1 \}) = 1/2$ . This system is commonly referred to as rotation on two points and is uniquely ergodic. Define a continuous function $F: X \to \mathbb {R}$ by $F(0) = 1$ and $F(1) = -1$ . Then,
Finally, one can check that
Hence,
where the last equality follows by Theorem 1.1.
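The reduction above is easy to check by machine. The following Python sketch encodes rotation on two points and the function F exactly as defined in this subsection, and verifies that evaluating F along the orbit $T^{\Omega (n)} 0$ reproduces the Liouville function. The helper names are hypothetical, and the starting point $x = 0$ is a choice made here.

```python
def big_omega(n):
    """Number of prime factors of n counted with multiplicity (trial division)."""
    count, d = 0, 2
    while d * d <= n:
        while n % d == 0:
            n //= d
            count += 1
        d += 1
    return count + (1 if n > 1 else 0)


def rotate(x, times=1):
    """Apply T (which swaps 0 and 1) `times` times to a point x of X = {0, 1}."""
    return (x + times) % 2


def observable(x):
    """The function F with F(0) = 1 and F(1) = -1."""
    return 1 if x == 0 else -1


def liouville(n):
    """The Liouville function (-1)^Omega(n)."""
    return -1 if big_omega(n) % 2 else 1


def average_along_omega(x, limit):
    """(1/limit) * sum_{n <= limit} F(T^{Omega(n)} x)."""
    return sum(observable(rotate(x, big_omega(n))) for n in range(1, limit + 1)) / limit


if __name__ == "__main__":
    assert all(observable(rotate(0, big_omega(n))) == liouville(n) for n in range(1, 2001))
    print(average_along_omega(0, 5000))  # an average of the Liouville function
```

Starting from $x = 1$ instead changes every term by a sign, so the averages converge to the same limit $\int _X F \, d\mu = 0$ in both cases.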
We now state two theorems that give further insight into the statistical properties of $\Omega (n)$ . Hardy and Ramanujan showed that for large $ N $ and almost all $ 1 \leq n \leq N $ , $ \Omega (n) $ falls within a specified interval centered at $ \log \log N $ [HR17, Theorem C].
Theorem 2.3. (Hardy–Ramanujan theorem)
For $C> 0$ , define $g_C: \mathbb {N} \to \mathbb {N}$ by
Then for all $\varepsilon> 0$ , there is some $C \geq 1$ such that
Erdős and Kac later generalized this theorem to show that $\Omega (n)$ actually becomes normally distributed within such intervals [EK40].
Theorem 2.4. (Erdős–Kac theorem)
Define $K_N:\mathbb {Z} \times \mathbb {Z} \to \mathbb {N}$ by
Then,
Thus, the Erdős–Kac theorem states that for large N, the number of prime factors of an integer $n \leq N$ becomes roughly normally distributed with mean and variance $\log \log N$ . Recall that the prime number theorem has an equivalent formulation in terms of averages of the Liouville function, making it well suited for dynamical settings. The Erdős–Kac theorem has a similar formulation. Let $C_c(\mathbb {R})$ denote the set of continuous functions on $\mathbb {R}$ with compact support and let $F \in C_c(\mathbb {R})$ . Theorem 2.4 is equivalent to the statement
One direction of this equivalence follows by a standard argument approximating the indicator function on the interval $[A,B]$ by continuous compactly supported functions. The other follows by the fact that any compactly supported continuous function can be approximated by simple functions of the form
where $1_E(x)$ denotes the indicator function of the set E.
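The averaged form of the Erdős–Kac theorem lends itself to a small experiment. The sketch below compares the empirical average of $F((\Omega (n) - \log \log N)/\sqrt {\log \log N})$ over $n \leq N$ with the Gaussian integral of F. The tent function used for F, the cutoff and the function names are assumptions made for illustration. Since $\log \log N$ grows extremely slowly, only rough agreement should be expected at computationally accessible N.

```python
import math


def big_omega_table(limit):
    """Return t with t[n] = Omega(n) for 1 <= n <= limit."""
    spf = list(range(limit + 1))  # smallest prime factor of each n
    for p in range(2, math.isqrt(limit) + 1):
        if spf[p] == p:
            for multiple in range(p * p, limit + 1, p):
                if spf[multiple] == multiple:
                    spf[multiple] = p
    table = [0] * (limit + 1)
    for n in range(2, limit + 1):
        table[n] = table[n // spf[n]] + 1
    return table


def tent(t):
    """A continuous, compactly supported test function (hypothetical choice of F)."""
    return max(0.0, 1.0 - abs(t))


def erdos_kac_average(test_function, limit):
    """(1/limit) * sum_{n <= limit} F((Omega(n) - loglog(limit)) / sqrt(loglog(limit)))."""
    table = big_omega_table(limit)
    center = math.log(math.log(limit))
    scale = math.sqrt(center)
    return sum(test_function((table[n] - center) / scale) for n in range(1, limit + 1)) / limit


def gaussian_integral(test_function, lo=-8.0, hi=8.0, steps=16000):
    """Midpoint rule for (1/sqrt(2*pi)) * integral of F(t) * exp(-t^2/2) dt over [lo, hi]."""
    width = (hi - lo) / steps
    total = 0.0
    for i in range(steps):
        t = lo + (i + 0.5) * width
        total += test_function(t) * math.exp(-t * t / 2)
    return total * width / math.sqrt(2 * math.pi)


if __name__ == "__main__":
    print("empirical average:", erdos_kac_average(tent, 10 ** 5))
    print("Gaussian integral:", gaussian_integral(tent))
```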
2.4 Mean convergence
We show that mean convergence holds along $\Omega (n)$ .
Theorem 2.5. Suppose $(X, \mu , T)$ is ergodic and let $f \in L^2(\mu )$ . Then,
This statement seems to be well known, but we were unable to find a proof, so one is included here for completeness.
Proof of Theorem 2.5
By a standard argument applying the spectral theorem, it is enough to check that for any $\beta \in (0,1)$ ,
$$ \begin{align*} \lim_{N \to \infty} \frac{1}{N} \sum_{n=1}^{N} e^{2\pi i \beta \Omega (n)} = 0. \end{align*} $$
First, suppose that $\beta \in (0,1) \setminus \mathbb {Q}$ . By Theorem 2.2, the sequence $\{\Omega (n) \beta \}$ is uniformly distributed mod 1. Then the Weyl equidistribution criterion (see, for instance, [Wey16] or [EW11, Lemma 4.17]) implies that
$$ \begin{align*} \lim_{N \to \infty} \frac{1}{N} \sum_{n=1}^{N} e^{2\pi i \beta \Omega (n)} = 0, \end{align*} $$
as desired. Now, suppose that $\beta = {p}/{q}$ , where $p,q \in \mathbb {Z}$ are coprime. By Theorem 2.1, $\Omega (n)$ distributes evenly over residue classes mod q. It is straightforward to check this is equivalent to the statement
$$ \begin{align*} \lim_{N \to \infty} \frac{1}{N} \sum_{n=1}^{N} \zeta ^{\Omega (n)} = 0, \end{align*} $$
where $\zeta $ is a primitive qth root of unity. Since $\gcd (p,q) = 1$ , $e^{2\pi i p/q}$ is a primitive qth root of unity, and we are done.
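The direction of this equivalence that the proof uses can be spelled out as follows. This is a sketch in which $d_N(r)$ is shorthand introduced here, not notation from the text, for the proportion of $n \leq N$ with $\Omega (n) \equiv r \pmod q$ ; Theorem 2.1 gives $d_N(r) \to 1/q$ for each r.

```latex
\begin{align*}
\frac{1}{N}\sum_{n=1}^{N} \zeta^{\Omega(n)}
  &= \sum_{r=0}^{q-1} \zeta^{r}\, d_N(r),
  \qquad d_N(r) := \frac{1}{N}\,\bigl|\{1 \le n \le N : \Omega(n) \equiv r \pmod q\}\bigr|, \\
\lim_{N\to\infty} \frac{1}{N}\sum_{n=1}^{N} \zeta^{\Omega(n)}
  &= \frac{1}{q}\sum_{r=0}^{q-1} \zeta^{r}
  = \frac{1}{q}\cdot\frac{\zeta^{q}-1}{\zeta-1} = 0
  \qquad (\zeta^{q} = 1,\ \zeta \neq 1).
\end{align*}
```

The first line regroups the sum by the residue of $\Omega (n)$ mod q, which is legitimate because $\zeta ^{\Omega (n)}$ depends only on that residue; the second line is the geometric sum over a non-trivial qth root of unity.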
3 Counterexamples to convergence
The condition of unique ergodicity is essential to the proof of Theorem 1.1. In this section, we show that, removing this assumption, convergence need not hold for an arbitrary generic point and pointwise almost-everywhere convergence does not hold in any non-atomic ergodic system.
3.1 Counterexample for generic points
We show that convergence need not hold for generic points.
Proposition 3.1. There exists an ergodic system $(X, \mu , T)$ , a generic point $x \in X$ for the measure $\mu $ , and a continuous function $F \in C(X)$ such that the averages
do not converge.
We explicitly construct a symbolic system, generic point, and continuous function for which the above averages do not converge.
Proof. Let $(X, \sigma )$ be the one-sided shift on the alphabet $\{0,1\}$ and let $\delta _{\mathbf {0}}$ denote the delta mass at $\mathbf {0} = (.00 \ldots ) \in X$ . Notice that $\delta _{\mathbf {0}}$ is $\sigma $ -invariant and trivially ergodic. Define a sequence $\mathbf {a} \in X$ by
We claim that $ \mathbf {a} $ is generic for $\delta _{\mathbf {0}}$ , meaning that for any $f \in C(X)$ ,
Fix $\varepsilon> 0$ . For $N \in \mathbb {N}$ , define
and
Then for each N,
It is immediate from the definition of $B_N$ that
We now consider the sum over $A_N$ . Let $M> 0$ be a bound for $|f|$ . Since f is continuous, there is some $\delta> 0$ such that $ d(\sigma ^n \mathbf {a}, \mathbf {0}) \leq \delta $ implies $|f(\sigma ^n \mathbf {a}) - f(\mathbf {0})| \leq \varepsilon $ . Define ${C_N \subseteq \mathbb {N}}$ by
Notice that $A_N \subseteq C_N$ for all N. Then,
Let $m \in \mathbb {N}$ be the smallest integer such that $2^{-(m+1)} \leq \delta $ . Then $ d( \sigma ^n(\mathbf {a}), \mathbf {0} )> \delta $ when
for some $ k $ . Hence,
Combining equations (4) and (5), we obtain
for large enough N. Combining the estimates from equations (3) and (6), and letting $\varepsilon \to 0$ , this completes the proof of the claim.
Now, define $F: X \to \mathbb {R}$ by $F(\mathbf {x}) = \mathbf {x}(0)$ , so that $\mathbf {a}(n) = F(\sigma ^n \mathbf {a})$ . Since $ \mathbf {a} $ is generic for the measure $\delta _{\mathbf {0}}$ ,
Define a subsequence $\{N_k\}_{k \in \mathbb {N}} \subseteq \mathbb {N}$ by $\log \log N_k = 3^k$ . We first estimate the sum
for fixed k. Let $I_k = [3^k-2^k, 3^k+2^k]$ . It is easy to check that for $n \leq N_k$ , $\log \log n$ lands in the interval $I_k$ when $n \geq N_k^{1/(2^{2^k})}$ . Then,
Since $\mathbf {a}(n) = 1$ on $I_k$ ,
Hence,
We now replace $ \log \log n $ by $ \Omega (n) $ in equation (7). Extending Theorem 2.3, Hardy and Ramanujan showed that the same result holds replacing $ \log \log N $ by $ \log \log n $ [HR17, Theorem C′]. Let $ \varepsilon> 0 $ , and let $ C> 0 $ be the constant guaranteed by this variant of Theorem 2.3. Set
so that $ |G_C(N)| = g_C(N) $ . Define
Notice that for $ 1 \leq n \leq N_k $ , we have $ \sqrt {\log \log n} \leq \sqrt {3^k} $ . Then,
One directly calculates
so that
where the last inequality follows from Theorem 2.3. This indicates that the averages along $\Omega (n)$ either converge to 1 or do not converge at all. However, consider the subsequence $ \{M_k\}_{k \in \mathbb {N}} $ given by $ \log \log M_k = 2(3^k - 2^{k-1}) $ , so that $ \log \log M_k $ lands in the middle of the kth interval of zeros in the definition of $\mathbf {a}$ . By an analogous argument,
Hence, the averages along $\Omega (n)$ do not converge.
Proposition 3.1 tells us that for a given point, convergence of the Birkhoff averages is not enough to guarantee convergence of the Birkhoff averages along $\Omega (n)$ . However, given a stronger assumption on the convergence of the standard Birkhoff averages, convergence along $\Omega (n)$ does hold. In fact, this follows from a more general result on bounded arithmetic functions. Let $ \mathbf {a}: \mathbb {N} \to \mathbb {C} $ be a bounded arithmetic function. We say that the averages of $ \mathbf {a} $ converge uniformly to zero if
See, for instance, [HK09, Definition 2.7].
Proposition 3.2. Suppose $ \mathbf {a} : \mathbb {N} \to \mathbb {C} $ is a bounded arithmetic function whose averages converge uniformly to zero. Then, $ \lim _{N \to \infty } ({1}/{N}) \sum _{n=1}^{N} \mathbf {a}(\Omega (n)) = 0 $ .
Proof. It follows from [Ric21, Theorem 1.1] that for any fixed $k \in \mathbb {N}$ ,
Hence,
assuming the limits exist. Let $ \varepsilon> 0 $ . Since the averages of $ \mathbf {a} $ converge uniformly to zero, there is some $ K_0 \in \mathbb {N} $ such that for $ K \geq K_0 $ ,
Then, for any fixed $ K \geq K_0 $ ,
Hence,
for all $ \varepsilon> 0 $ and we are done.
Corollary 3.3. Let $ (X, \mu , T) $ be ergodic. Suppose $ x \in X $ and $ f \in C(X) $ are such that the averages $ \frac {1}{N} \sum _{n=1}^{N} f(T^n x) $ converge uniformly to zero. Then,
3.2 A transition to weighted sums
From this point on, we denote
Most commonly, we take $m = 2,3$ . To study pointwise convergence without the condition of unique ergodicity, we first introduce a different formulation of the ergodic averages along $\Omega (n)$ . Let $\mathbf {a}: \mathbb {N} \to \mathbb {C}$ be a bounded arithmetic function. We show that there are weight functions $w_k(N)$ such that
Regrouping the terms by the value of $\Omega (n)$ yields an exact formulation for these weights. Let $\pi _k(N)$ denote the number of integers not exceeding N with exactly k prime factors, counted with multiplicity. Then,
so that $w_k(N) = \pi _k(N)/N$ . However, this exact formulation does not give much insight into the shape of these weight functions. Instead, we rely on an estimate of Erdős [Erd48, Theorem II] to show that on large intervals, the weight functions $ w_k (N) $ can be approximated by a Gaussian with mean and variance $\log ^2 N$ .
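The regrouping identity behind the weights $w_k(N) = \pi _k(N)/N$ can be checked directly for small N. The Python sketch below computes the weights by brute force and confirms that the average of $\mathbf {a}(\Omega (n))$ over $n \leq N$ agrees with the weighted sum $\sum _k w_k(N) \mathbf {a}(k)$ . The test function $\mathbf {a}(k) = \cos k$ and the function names are hypothetical choices for illustration.

```python
import math
from collections import Counter


def big_omega(n):
    """Number of prime factors of n counted with multiplicity (trial division)."""
    count, d = 0, 2
    while d * d <= n:
        while n % d == 0:
            n //= d
            count += 1
        d += 1
    return count + (1 if n > 1 else 0)


def weights(limit):
    """Return {k: pi_k(limit) / limit}, where pi_k counts n <= limit with Omega(n) = k."""
    counts = Counter(big_omega(n) for n in range(1, limit + 1))
    return {k: c / limit for k, c in counts.items()}


def average_along_omega(a, limit):
    """(1/limit) * sum_{n <= limit} a(Omega(n))."""
    return sum(a(big_omega(n)) for n in range(1, limit + 1)) / limit


def weighted_sum(a, limit):
    """sum_k w_k(limit) * a(k), the same average regrouped by the value of Omega."""
    return sum(w * a(k) for k, w in weights(limit).items())


if __name__ == "__main__":
    N = 3000
    print(average_along_omega(math.cos, N), weighted_sum(math.cos, N))
    print(sorted(weights(N).items()))
```

Note that $n = 1$ contributes to $k = 0$ , so the weights sum to 1 only when that term is included.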
Lemma 3.4. Let $ \pi _k(N) $ be defined as above. Then there exists $ C> 0$ such that
uniformly for $ k \in [\log ^2 N - C \sqrt {\log ^2 N}, \log ^2 N + C \sqrt {\log ^2 N} ] $ .
Remark 3.5. Lemma 3.4 can be shown using probability theory. Erdős’s estimate can be viewed as approximating $\pi _k(N)/N$ by a Poisson distribution with parameter $\log ^2 N$ . Since $ \log ^2 N $ tends to infinity with N, for large values of N, this Poisson distribution can be approximated by a Gaussian distribution with mean and variance $\log ^2 N$ . However, since we do not take a probabilistic viewpoint in this paper, the computation is included for completeness.
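The second step described in Remark 3.5, passing from a Poisson distribution with a large parameter to a Gaussian with the same mean and variance, can be illustrated numerically. In the sketch below, `lam` is a stand-in for the Poisson parameter of the remark and `width` for the constant C; both names, and the specific values tried, are assumptions made for illustration. The code compares the two densities on the window of integers within `width` standard deviations of the mean.

```python
import math


def poisson_pmf(k, lam):
    """P(X = k) for X Poisson with parameter lam, computed in log space for stability."""
    return math.exp(-lam + k * math.log(lam) - math.lgamma(k + 1))


def gaussian_density(k, lam):
    """Density at k of a Gaussian with mean lam and variance lam."""
    return math.exp(-((k - lam) ** 2) / (2 * lam)) / math.sqrt(2 * math.pi * lam)


def worst_relative_error(lam, width):
    """Largest |Poisson/Gaussian - 1| over integers k with |k - lam| <= width * sqrt(lam)."""
    lo = math.ceil(lam - width * math.sqrt(lam))
    hi = math.floor(lam + width * math.sqrt(lam))
    return max(
        abs(poisson_pmf(k, lam) / gaussian_density(k, lam) - 1) for k in range(lo, hi + 1)
    )


if __name__ == "__main__":
    for lam in (25, 100, 400, 1600):
        print(lam, worst_relative_error(lam, 1.0))
```

In this experiment the worst relative error on the window shrinks as `lam` grows, matching the uniform approximation on such windows asserted in Lemma 3.4.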
Proof of Lemma 3.4
Let $ C> 0$ be that given by Theorem 2.3 and set
By an estimate of Erdős [Erd48, Theorem II],
This estimate is uniform for $ k \in I_N $ . Applying Stirling’s formula,
We now rewrite k in the following form:
for some $A \in \mathbb {R}$ . For such values of k,
By the quadratic approximation $\log (1+\varepsilon ) = \varepsilon - \varepsilon ^2/2 + O(\varepsilon ^3)$ for $|\varepsilon | < 1$ , we obtain
Exponentiating, we obtain the following estimate for $\pi _k(N)/N$ :
Rewriting A in terms of k,
3.3 Failure of pointwise convergence
We conclude this section by demonstrating the failure of pointwise convergence along $\Omega (n)$ in every non-atomic ergodic system. The strategy is to first approximate the ergodic averages using Lemma 3.4. We then use the Rokhlin lemma to construct a set of small measure on which the ergodic averages along $\Omega (n)$ become large. A lemma from functional analysis then implies the failure of pointwise convergence. This lemma is proven in much greater generality in [RW94, Theorem 5.4], but here we state it only for the averaging operators $T_N: \mathcal {B} \to L^1(\mu )$ defined by
where $C> 0$ is a constant to be chosen later.
Lemma 3.6. Let $T_N$ be defined as in equation (8) and let $N_0 \in \mathbb {N}$ . Assume that for all $\varepsilon> 0$ and $N \geq N_0$ , there is a set $A \in \mathcal {B}$ with $\mu (A) < \varepsilon $ and
Then, there is a dense $G_\delta $ subset $\mathcal {R} \subset \mathcal {B}$ such that for $A \in \mathcal {R}$ ,
and
Operators satisfying the conclusion of Lemma 3.6 are said to have the strong sweeping-out property. As we demonstrate in the case of the operators $T_N$ , averaging operators with the strong sweeping-out property fail to converge pointwise.
Proof of Theorem 1.2
We want to find a set $A \in \mathcal {B}$ such that for a.e. $x \in X$ , the averaging operators
satisfy equation (1). For each $N \in \mathbb {N}$ , define
where the constant C is chosen later. Then for all $ A \in \mathcal {B} $ ,
Let $\varepsilon \in (0,1)$ . Choose $C> 0$ satisfying Theorem 2.3. Then,
For $k \in I_N$ , we approximate $\pi _k(N)/N$ using Lemma 3.4. Since this estimate is uniform over $k \in I_N$ , there are $\varepsilon _N: \mathbb {N} \to \mathbb {R}$ such that $ \lim _{N \to \infty } \sup _{k \in I_N}|\varepsilon _N(k)| = 0$ and
Since $|\varepsilon _N(k)|$ tends to zero uniformly in k as N tends to infinity,
and
for all $ A \in \mathcal {B} $ . The upper bound in equation (11) and the lower bound in equation (12) are trivial. Now, fix $N_0 \in \mathbb {N} $ such that for all $M \geq N_0$ ,
Let $N \geq M \geq N_0$ . By the Rokhlin lemma, there is a set $E \in \mathcal {B}$ such that the sets $T^k E$ are pairwise disjoint for $k = 0, \ldots , \lfloor \log ^2 N + C \sqrt {\log ^2 N} \rfloor $ and
Note the upper bound is trivial. Using the disjointness condition, we obtain bounds for the measure of E:
Define $A_N := \bigcup _{k = \lceil \log ^2 N - C\sqrt {\log ^2 N}\rceil }^{\lfloor \log ^2 N + C\sqrt {\log ^2 N}\rfloor } T^k E$ . Using the upper bound given in equation (13),
Hence, the measure of $A_N$ tends to zero as N tends to infinity, so that, for large N, $A_N$ satisfies the first condition of Lemma 3.6. We now construct sets $B_N$ such that $\mu (B_N) \to 1 - \varepsilon $ as $N \to \infty $ and
Let $j \in \{0 , \ldots , N - M\}$ . Then, $M \leq N - j \leq N$ . Define
For $x \in T^{\kappa (j)}E $ , we have $T^k x \in A_N$ for $k = \lceil \log ^2 (N-j) - C \sqrt {\log ^2 (N-j)} \rceil , \ldots ,$ $\lfloor \log ^2 (N-j) + C \sqrt {\log ^2 (N-j)} \rfloor $ so that
where the last inequality holds since $N-j \geq M \geq N_0$ . Then for all $j \in \{0, \ldots , N - M\}$ , each $x \in T^{\kappa (j)}E$ is such that
Define $B_N := \bigcup _{j = 0}^{N-M} T^{\kappa (j)} E$ . Since $0 \leq \kappa (j) \leq \kappa (N-M)$ are all integer valued, $B_N$ is the union of $\kappa (N-M)$ disjoint sets. Using the lower bound in equation (13),
Hence, $ \mu (B_N) $ tends to $ 1- \varepsilon $ . Now, take $N \geq N_0$ large enough that $\mu (A_N) \leq \varepsilon $ and $ \mu (B_N) \geq 1 - 2\varepsilon $ . Set $A = A_N$ . Then A satisfies the hypothesis of Lemma 3.6, and we obtain a dense $G_\delta $ subset $\mathcal {R} \subset \mathcal {B}$ such that for $A \in \mathcal {R}$ and a.e. $x \in X$ ,
Equations (11) and (12) then yield
and
for all $\varepsilon> 0$ .
4 Independence of the Erdős–Kac theorem and Theorem 1.1
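4.1 A logarithmic version of the prime number theorem
Let $ B \subset \mathbb {N} $ be a finite, non-empty subset of the integers. For a function $ f: B \to \mathbb {C} $ , we define the Cesàro averages of f over B by
$$ \begin{align*} \underset{n \in B\,}{\mathbb{E}} f(n) := \frac{1}{|B|} \sum_{n \in B} f(n) \end{align*} $$
and the logarithmic averages of f over B by
$$ \begin{align*} \underset{n \in B\,}{\mathbb{E}^{\log}} f(n) := \frac{\sum_{n \in B} f(n)/n}{\sum_{n \in B} 1/n}. \end{align*} $$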
Define $[N] := \{1, 2, \ldots , N\}$ . Let $\mathbb {P}$ denote the set of primes. For $k \in \mathbb {N}$ , let $\mathbb {P}_k$ denote the set of k-almost primes, the integers with exactly k prime factors, not necessarily distinct. Our averaging set B is often chosen from the aforementioned sets.
Before proving Theorem 1.3, we first present a proof of a logarithmic version of the prime number theorem. This statement is well known and follows directly from the Cesàro version of the prime number theorem. However, we include the following method of proof, as it illustrates a streamlined version of the core ideas that arise in the proof of Theorem 1.3.
Theorem 4.1. (Logarithmic prime number theorem)
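Let $\lambda (n)$ denote the Liouville function. Then,
$$ \begin{align*} \lim_{N \to \infty} \underset{n \in [N]\,}{\mathbb{E}^{\log}} \lambda (n) = 0. \end{align*} $$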
The following standard trick allows us to simplify the argument in the case of logarithmic averages.
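Lemma 4.2. Let $\mathbf {a}: \mathbb {N} \to \mathbb {C}$ be a bounded arithmetic function. Then for any $p \in \mathbb {N}$ ,
$$ \begin{align*} \lim_{N \to \infty} \Big| \underset{n \in [N]\,}{\mathbb{E}^{\log}} \mathbf{a}(n) - \underset{n \in [N/p]\,}{\mathbb{E}^{\log}} \mathbf{a}(n) \Big| = 0. \end{align*} $$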
Intuitively, this is due to the weight of $1/n$ in the definition of the logarithmic average. As N becomes large, the terms between $N/p$ and N are weighted so lightly that they contribute very little to the overall average.
Proof. Let $M> 0$ be a bound for $|\mathbf {a}|$ . For $N \in \mathbb {N}$ , define $A_N := \sum _{n=1}^N 1/n$ . We calculate:
We now use the fact that
where $ \gamma $ denotes the Euler–Mascheroni constant. Then,
so that $ \lim _{N \to \infty } | \underset {n \in [N]\,}{\mathbb {E}^{\log }} \mathbf {a}(n) - \underset {n \in [N/p]\,}{\mathbb {E}^{\log }} \mathbf {a}(n) | = 0 $ .
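The mechanism behind Lemma 4.2 can be seen numerically. The sketch below implements Cesàro and logarithmic averages over $[N]$ and compares the logarithmic averages over $[N]$ and $[N/p]$ . Two assumptions are made here: $[N/p]$ is read as $\{1, \ldots , \lfloor N/p \rfloor \}$ , and the bound $2M(A_N - A_{\lfloor N/p \rfloor })/A_N$ , with $A_N = \sum _{n \leq N} 1/n$ , is an elementary estimate derived for this illustration, which need not coincide with the constants in the proof above. The test sequence is hypothetical.

```python
def harmonic(N):
    """A_N = sum_{n <= N} 1/n."""
    return sum(1.0 / n for n in range(1, N + 1))


def cesaro_average(a, N):
    """(1/N) * sum_{n <= N} a(n)."""
    return sum(a(n) for n in range(1, N + 1)) / N


def log_average(a, N):
    """(sum_{n <= N} a(n)/n) / (sum_{n <= N} 1/n)."""
    return sum(a(n) / n for n in range(1, N + 1)) / harmonic(N)


def gap_and_bound(a, N, p, M):
    """Return (|E^log_[N] a - E^log_[N//p] a|, 2*M*(A_N - A_{N//p}) / A_N) for |a| <= M."""
    gap = abs(log_average(a, N) - log_average(a, N // p))
    bound = 2 * M * (harmonic(N) - harmonic(N // p)) / harmonic(N)
    return gap, bound


if __name__ == "__main__":
    def a(n):
        # A bounded test sequence with |a| <= 1 (hypothetical choice).
        return 1.0 if n % 3 == 0 else -1.0

    for N in (10 ** 3, 10 ** 4, 10 ** 5):
        print(N, gap_and_bound(a, N, 7, 1.0))
```

Since $A_N - A_{\lfloor N/p \rfloor }$ stays close to $\log p$ while $A_N$ grows like $\log N$ , the bound tends to zero, although only at a logarithmic rate.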
We also require the following proposition, which can be thought of as estimating the average number of divisors of an integer $ n $ that come from a specified set.
Proposition 4.3. Let $B \subseteq \mathbb {N}$ be finite and non-empty. Define $\Phi (n,m) = \gcd (n,m) -1$ and let $1_{m|n}$ take value 1 if m divides n and zero otherwise. Then:
(i) $\limsup _{N \to \infty } \underset {n \in [N]\,}{\mathbb {E}} | \underset {m \in B\,}{\mathbb {E}^{\log }} (1- m 1_{m|n}) | \leq ( \underset {m \in B\,}{\mathbb {E}^{\log }} \underset {n \in B\,}{\mathbb {E}^{\log }} \Phi (n,m) )^{1/2}$ ;
(ii) $\limsup _{N \to \infty } \underset {n \in [N]\,}{\mathbb {E}^{\log }} | \underset {m \in B\,}{\mathbb {E}^{\log }} (1- m 1_{m|n}) | \leq ( \underset {m \in B\,}{\mathbb {E}^{\log }} \underset {n \in B\,}{\mathbb {E}^{\log }} \Phi (n,m) )^{1/2}$ .
Proof. A proof of statement (i) can be found in [BR21, Proposition 2.1]. The proof of (ii) is completely analogous, replacing Cesàro averages by logarithmic averages.
In the following, we often choose the set B so that the quantity $ \underset {m \in B\,}{\mathbb {E}^{\log }} \underset {n \in B\,}{\mathbb {E}^{\log }} \Phi (n,m) $ is small. Intuitively, this means that two random elements from B have a high chance of being coprime.
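As an illustration of this quantity, take B to be the primes up to s. For two primes m and n, $\Phi (n,m) = \gcd (n,m) - 1$ vanishes unless $m = n$ , so the double logarithmic average reduces to $\sum _{p \in B} (p-1)/p^2$ divided by $(\sum _{p \in B} 1/p)^2$ , which is at most $1/\sum _{p \in B} 1/p$ . The Python sketch below evaluates the double average by brute force; the function names and cutoffs are choices made here for illustration.

```python
from math import gcd, isqrt


def primes_up_to(s):
    """List of primes p <= s (sieve of Eratosthenes), for s >= 2."""
    sieve = [True] * (s + 1)
    sieve[0:2] = [False, False]
    for p in range(2, isqrt(s) + 1):
        if sieve[p]:
            for multiple in range(p * p, s + 1, p):
                sieve[multiple] = False
    return [n for n in range(2, s + 1) if sieve[n]]


def double_log_average_phi(B):
    """E^log_{m in B} E^log_{n in B} Phi(n, m), with Phi(n, m) = gcd(n, m) - 1."""
    total_weight = sum(1 / m for m in B)
    total = sum((gcd(m, n) - 1) / (m * n) for m in B for n in B)
    return total / total_weight ** 2


if __name__ == "__main__":
    for s in (10, 100, 1000):
        B = primes_up_to(s)
        print(s, double_log_average_phi(B), 1 / sum(1 / p for p in B))
```

Because $\sum _{p \leq s} 1/p$ diverges very slowly, the average decreases slowly in s; making it smaller than a given $\varepsilon $ requires s to be very large.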
Proof of Theorem 4.1
Let $\varepsilon> 0$ . By definition,
Notice that for m and n from $\mathbb {P} \cap [s]$ ,
Then,
so that
Take $s_0 \in \mathbb {N}$ such that for all $s \geq s_0$ ,
Fix $s \geq s_0$ . By Proposition 4.3,
Then by a direct calculation,
Hence,
Since $\lambda (p n) = - \lambda (n)$ for any prime p, this reduces to
Finally, applying Lemma 4.2 to $\lambda (n)$ , we remove the dependence of the inner logarithmic average on p:
Letting $ \varepsilon \to 0 $ , we conclude that $\lim _{N \to \infty } \underset {n \in [N]\,}{\mathbb {E}^{\log }} \lambda (n) = 0$ .
4.2 Proof of Theorem 1.3
A more technical version of this argument can be applied to obtain Theorem 1.3. The main difficulty arises in the last two steps, in which we remove the dependence of the inner average on the primes p. To get around this, we rely on the following technical proposition.
Proposition 4.4. For all $ \varepsilon \in (0,1) $ and $ \rho \in (1, 1+\varepsilon ] $ , there exist finite, non-empty sets $ B_1 $ , $ B_2 \subseteq \mathbb {N}$ with the following properties:
(i) $ B_1 \subset \mathbb {P} $ and $ B_2 \subset \mathbb {P}_2 $ ;
(ii) $ B_1 $ and $ B_2 $ have the same cardinality when restricted to $ \rho $ -adic intervals: $ |B_1 \cap (\rho ^j, \rho ^{j+1}]| = |B_2 \cap (\rho ^j, \rho ^{j+1}]| $ for all $ j \in \mathbb {N} \cup \{0\} $ ;
(iii) $ \underset {m \in B_i\,}{\mathbb {E}^{\log }} \underset {n \in B_i\,}{\mathbb {E}^{\log }} \Phi (m,n) \leq \varepsilon $ for $ i = 1,2$ ;
(iv) for any $\mathbf {a}: \mathbb {N} \to \mathbb {C}$ with $|\mathbf {a}| \leq M$ for some $M> 0$ ,
$$ \begin{align*} \Big| \underset{p \in B_1\,}{\mathbb{E}^{\log}} \underset{n \in [N/p]\,}{\mathbb{E}} \mathbf{a}(n) - \underset{p \in B_2\,}{\mathbb{E}^{\log}} \underset{n \in [N/p]\,}{\mathbb{E}} \mathbf{a}(n) \Big| \leq 3M \varepsilon. \end{align*} $$
Proof. Statements (i)–(iii) can be found in [BR21, Lemma 2.2]. Statement (iv) follows from statement (iii). The proof for arithmetic functions of modulus 1 can be found in [BR21, Lemma 2.3]. The argument for bounded arithmetic functions is completely analogous.
Proposition 4.5. Define $ \varphi _N(n) := ({\Omega (n) - \log \log N })/{\sqrt {\log \log N}}$ . Then for any bounded arithmetic function $ \mathbf {a}: \mathbb {N} \to \mathbb {C} $ and $ F \in C_c(\mathbb {R}) $ ,
Proof. Let $\varepsilon \in (0,1)$ and $\rho \in (1, 1+ \varepsilon ]$ . Let $B_1$ and $B_2$ be finite, non-empty sets satisfying the conditions of Proposition 4.4. Let $M_1$ be a bound for $|F|$ and $M_2$ a bound for $|\mathbf {a}|$ . Then,
where the last inequality follows by Proposition 4.3. Hence,
Replacing $B_1$ by $B_2$ in the above argument, we obtain
Since $B_1$ consists only of primes and $B_2$ consists only of 2-almost primes, equations (14) and (15) yield
and
respectively. Since $\lim _{N \to \infty } |F(\varphi _N(p n)) - F(\varphi _N(n))| = 0$ for each $p \in B_i$ , we can find an $N_0 \in \mathbb {N}$ such that for $N \geq N_0$ , $|F(\varphi _N(p n)) - F(\varphi _N(n))| \leq \varepsilon $ for all $p \in B_1, B_2$ . Then for $i = 1,2$ ,
where $C_1$ and $C_2$ do not depend on N or $\varepsilon $ . Hence, we can remove the dependence on p from the summands of equations (16) and (17), yielding
and
Finally, Proposition 4.4 yields
Combining equations (18), (19), and (20), and letting $N \to \infty $ , $\varepsilon \to 0$ , we are done.
Proof of Theorem 1.3
Fix $F \in C_c(\mathbb {R})$ and $x \in X$ . We first perform a reduction using the condition of unique ergodicity. For $ N \in \mathbb {N} $ , set $\varphi _N(n) = ({\Omega (n) - \log \log N})/{\sqrt {\log \log N}}$ and define the measure $\mu _N$ by
Explicitly, $\mu _N = ({1}/{N}) \sum _{n=1}^N F( \varphi _N(n) ) \, \delta _{T^{\Omega (n)}x}$ , where $\delta _y$ denotes the point mass at y. Now, define
Then the conclusion of the theorem is equivalent to convergence of the sequence $\{ \mu _N \}_{N \in \mathbb {N}}$ to $\mu '$ in the weak-* topology. Notice that if each limit point of $\{\mu _N\}_{N \in \mathbb {N}}$ is T-invariant, then since the system is uniquely ergodic, each limit point is equal to $\mu '$ and we are done. Hence, it remains to show that each limit point is T-invariant. To do this, we show that for all $g \in C(X)$ ,
By definition of the measures $\mu _N$ , we need to show that for all $g \in C(X)$ ,
Fix $g \in C(X)$ . By Proposition 4.5 applied to $\mathbf {a}(n) = g(T^n x)$ , we are done.
Proof of Corollary 1.4
Let $ (X, \mu , T) $ be the uniquely ergodic system given by rotation on two points (see §2.3 for the precise definition). Define $ g: X \to \{-1,1\} $ by $ g(0) = -1 $ and $ g(1) = 1 $ . Then,
By Theorem 1.3 followed by the prime number theorem,
Acknowledgements
The author thanks Bryna Kra and Florian Richter for numerous helpful discussions throughout this project and the anonymous referee for detailed comments and suggestions. The author was partially supported by NSF grant DMS-1502632.