1. Introduction
A special case of a result of Windisch [15]—extended further in [1]—states that the trace of a simple random walk on the discrete d-dimensional torus $(\mathbb{Z} / N \mathbb{Z})^d$, for $d\ge 3$, started from stationarity and run for time $u N^d$ converges, in a local sense, to an interlacement process at level u, as $N \to \infty$. In this paper we will be concerned with a variation on this result, for which our motivation was a heuristic analysis of an algorithm we used to simulate high-dimensional loop-erased random walks and the sandpile height distribution [7]. Let us first describe our main result and then discuss the motivating problem.
Consider a discrete-time lazy simple random walk $(Y_t)_{t \ge 0}$ starting at the origin o on $\mathbb{Z}^d$ . We write $\textbf{P}_o$ for the probability measure governing this walk. We stop the walk at the first time $T_L$ when it exits the large box $({-}L,L)^d$ , where L is an integer. We will take $L=L(N)$ of the form $L = m N$ , where $m = m(N)$ and N is an integer, such that $L^2 \sim A N^d$ for some $A \in (0,\infty)$ , as $N \to \infty$ . We consider the projection of the trajectory $\{ Y_t\, :\,0 \le t \lt T_L \}$ to the N-torus $\mathbb{T}_N = [ -N/2, N/2 )^d \cap \mathbb{Z}^d$ . The projection is given by the map $\varphi_N\, :\,\mathbb{Z}^d \to \mathbb{T}_N$ , where for any $x \in \mathbb{Z}^d$ , $\varphi_N(x)$ is the unique point of $\mathbb{T}_N$ such that $\varphi_N(x) \equiv x$ $\pmod N$ , where congruence $\pmod{N}$ is understood coordinate-wise.
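For orientation, the condition $L^2 \sim A N^d$ pins down the scale of $m(N)$; spelling out the short computation:

```latex
% Scale of m(N) implied by L = mN and L^2 ~ A N^d:
\begin{align*}
  L^2 = m^2 N^2 \sim A N^d
  \quad\Longrightarrow\quad
  m(N) \sim \sqrt{A}\, N^{(d-2)/2},
  \qquad
  L \sim \sqrt{A}\, N^{d/2}.
\end{align*}
```

In particular, for $d = 3$ the box $({-}L,L)^d$ has side length of order $N^{3/2}$, much larger than the period $N$ of the torus over which we project.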
Let $\sigma_1$ denote the exit time from $({-}1,1)^d$ of a standard Brownian motion started at o. We write $\mathbb{E}$ for the expectation associated to this Brownian motion. For any finite set $K \subset \mathbb{Z}^d$, let $\textrm{Cap}(K)$ denote the capacity of K [9]. For any $0 \lt R \lt \infty$ and $x \in \mathbb{Z}^d$, we define $B_R(x) = \{ y \in \mathbb{Z}^d\, :\,|y - x| \lt R \}$, where $| \cdot |$ is the Euclidean norm. Let $\mathcal{K}_R$ denote the collection of all subsets of $B_R(o)$. Given $\textbf{x} \in \mathbb{T}_N$, let $\tau_{\textbf{x},N}\, :\,\mathbb{T}_N \to \mathbb{T}_N$ denote the translation of the torus by $\textbf{x}$. Let $g\, :\, \mathbb{N} \to (0, \infty)$ be any function satisfying $g(N) \to \infty$.
Theorem 1.1. Let $d \ge 3$ . For any $0 \lt R \lt \infty$ , any $K \in \mathcal{K}_R$ , and any $\textbf{x}$ satisfying $\tau_{\textbf{x},N} \varphi_N(B_R(o)) \cap \varphi_N(B_{g(N)}(o)) = \emptyset$ we have
(1) \begin{equation} \textbf{P}_o\left[ \tau_{\textbf{x},N} \varphi_N(K) \cap \varphi_N\big( \{ Y_t\, :\,0 \le t \lt T_L \} \big) = \emptyset \right] = \mathbb{E}\left[ e^{-A d \sigma_1 \textrm{Cap}(K)} \right] + o(1), \qquad N \to \infty. \end{equation}
The error term depends on R and g, but is uniform in K and $\textbf{x}$ .
Note that the trace of the lazy simple random walk stopped at the exit time $T_L$ is the same as the trace of the simple random walk stopped at the analogous exit time. We use the lazy walk for convenience in the proofs.
Our result is close in spirit—although the details differ—to a result of Sznitman [12] that is concerned with a simple random walk on a discrete cylinder. The interlacement process was introduced by Sznitman in [13]. It consists of a one-parameter family $(\mathcal{I}^u)_{u \gt 0}$ of random subsets of $\mathbb{Z}^d$ ($d \ge 3$), where the distribution of $\mathcal{I}^u$ can be characterized by the relation
(2) \begin{equation} \mathbb{P}\big[ \mathcal{I}^u \cap K = \emptyset \big] = e^{-u\, \textrm{Cap}(K)} \qquad \text{for all finite } K \subset \mathbb{Z}^d. \end{equation}
The precise construction of a process satisfying (2) represents $\mathcal{I}^u$ as the trace of a Poisson cloud of bi-infinite random walk trajectories (up to time-shifts), where u is an intensity parameter. We refer to [13] and the books [3, 14] for further details. Comparing (1) and (2), we now formulate precisely what we mean by saying that the stopped trajectory, locally, is described by an interlacement process at the random level $u = A d \sigma_1$.
Let $g^{\prime}\, :\, \mathbb{N} \to (0, \infty)$ be any function satisfying $g^{\prime}(N) \to \infty$ . Note this does not have to be the same function as g(N). Let $\textbf{x}_N$ be an arbitrary sequence satisfying $\tau_{\textbf{x}_N,N} \varphi_N(B_{g^{\prime}(N)}(o)) \cap \varphi_N(B_{g(N)}(o)) = \emptyset$ . Define the sequence of random configurations $\omega_N \subset \mathbb{Z}^d$ by
Define the process $\tilde{\mathcal{I}}$ by requiring that for all finite $K \subset \mathbb{Z}^d$ we have
\begin{equation*} \mathbb{P}\big[ \tilde{\mathcal{I}} \cap K = \emptyset \big] = \mathbb{E}\big[ e^{-A d \sigma_1 \textrm{Cap}(K)} \big]. \end{equation*}
To see that this formula indeed defines a process that is also unique, write the right-hand side as
\begin{equation*} \int_0^\infty e^{-u\, \textrm{Cap}(K)}\, f_{\sigma_1}(u)\, \textrm{d}u, \end{equation*}
where $f_{\sigma_1}$ is the density of $A d \sigma_1$ . Then via the inclusion–exclusion formula, we see that we necessarily have for all finite sets $B \subset K$ the equality
\begin{equation*} \mathbb{P}\big[ \tilde{\mathcal{I}} \cap K = B \big] = \sum_{B^{\prime} \subseteq B} ({-}1)^{|B^{\prime}|}\, \mathbb{P}\big[ \tilde{\mathcal{I}} \cap \big( (K \setminus B) \cup B^{\prime} \big) = \emptyset \big], \end{equation*}
and the right-hand side can be used as the definition of the finite-dimensional marginals of $\tilde{\mathcal{I}}$ . Note that $\tilde{\mathcal{I}}$ lives in a compact space (the space can be identified with $\{ 0, 1\}^{\mathbb{Z}^d}$ with the product topology). Hence the finite-dimensional marginals uniquely determine the distribution of $\tilde{\mathcal{I}}$ , by Kolmogorov’s extension theorem.
Theorem 1.2. Let $d \ge 3$ . Under $\textbf{P}_o$ , the law of the configuration $\omega_N$ converges weakly to the law of $\tilde{\mathcal{I}}$ , as $N \to \infty$ .
Proof of Theorem 1.2 assuming Theorem 1.1. For events of the form $\{ \omega_N \cap K = \emptyset \}$, Theorem 1.1 immediately implies that
For events of the form $\{ \omega_N \cap K = B \}$ , the inclusion–exclusion formula represents $\textbf{P}_o [ \omega_N \cap K = B ]$ as a linear combination of probabilities of the former kind, and hence convergence follows.
Our motivation for studying the question in Theorem 1.1 was a simulation problem that arose in our numerical study of high-dimensional sandpiles [7]. We refer the interested reader to [2, 6, 11] for background on sandpiles. In our simulations we needed to generate loop-erased random walks (LERWs) from the origin o to the boundary of $[{-}L,L]^d$, where $d \ge 5$. The LERW is defined by running a simple random walk from o until it hits the boundary, and erasing all loops from its trajectory chronologically, as they are created. We refer to the book [9] for further background on LERWs (which is not needed to understand the results in this paper). It is known from results of Lawler [8] that in dimensions $d \ge 5$ the LERW visits on the order of $L^2$ vertices, the same as the simple random walk generating it. As the number of vertices visited is much smaller than the volume $c L^d$ of the box, an efficient way to store the path generating the LERW is provided by the well-known method of hashing. We refer to [7] for a discussion of this approach, and only provide a brief summary here. Assign to any $x \in [{-}L,L]^d \cap \mathbb{Z}^d$ an integer value $f(x) \in \{ 0, 1, \dots, C L^2 \}$ that is used to label the information relevant to position x, where C can be a large constant or slowly growing to infinity. Thus f is necessarily highly non-injective. However, we may be able to arrange that with high probability the restriction of f to the simple random walk trajectory is not far from injective, and then memory use can be reduced from order $L^d$ to roughly $O(L^2)$.
A simple possible choice of the hash function f can be to compose the map $\varphi_N\, :\,[{-}L,L]^d \cap \mathbb{Z}^d \to \mathbb{T}_N$ with a linear enumeration of the vertices of $\mathbb{T}_N$ , whose range has the required size. (This is slightly different from what was used in [Reference Járai and Sun7].) The method can be expected to be effective, if the projection $\varphi_N(Y[0,T))$ spreads roughly evenly over the torus $\mathbb{T}_N$ with high probability. Our main theorem establishes a version of such a statement, as the right-hand-side expression in (1) is independent of $\textbf{x}$ .
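As an illustration, the hash function just described can be sketched as the coordinate-wise projection to $\mathbb{T}_N$ followed by a row-major enumeration of the torus. This is a minimal sketch only; the helper names and the row-major enumeration are our illustrative choices, and (as noted above) the scheme actually used in [7] differs slightly.

```python
# Sketch of the hashing scheme described above (illustrative choices,
# not the exact implementation used in the simulations of [7]).

def torus_project(x, N):
    """Coordinate-wise projection of x in Z^d to T_N = [-N/2, N/2)^d."""
    return tuple(((xi + N // 2) % N) - N // 2 for xi in x)

def hash_point(x, N):
    """Compose the projection with a row-major enumeration of T_N.

    The range {0, ..., N^d - 1} has size N^d, which is of order L^2
    when L^2 ~ A N^d, as the discussion above requires."""
    p = torus_project(x, N)
    h = 0
    for coord in p:
        h = h * N + (coord + N // 2)  # shift each coordinate into {0, ..., N-1}
    return h
```

By construction, points that differ by a vector in $N\mathbb{Z}^d$ receive the same label, so the map is highly non-injective on $[{-}L,L]^d$, while being injective on any single copy of the fundamental domain.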
We now make some comments on the proof of Theorem 1.1. We refer to [3, Theorem 3.1] for the strategy of the proof in the case when the walk is run for a fixed time $u N^d$. The argument presented there proceeds by decomposing the walk into stretches of length $\lfloor N^\delta \rfloor$ for some $2 \lt \delta \lt d$, and then estimating the (small) probability in each stretch that $\tau_{\textbf{x},N} \varphi_N(K)$ is hit by the projection. We follow the same outline for the stopped lazy random walk. However, the elegant time-reversal argument given in [3] is not convenient in our setting, and we need to prove a delicate estimate on the probability that $\tau_{\textbf{x},N} \varphi_N(K)$ is hit, conditional on the starting point and endpoint of the stretch. For this, we only want to consider stretches with ‘well-behaved’ starting points and endpoints. We also classify a stretch as a ‘good stretch’ if the total displacement is not too large, and as a ‘bad stretch’ otherwise. We do this in such a way that the expected number of ‘bad stretches’ is small, and summing over the ‘good stretches’ gives us the required behaviour.
Possible generalizations.
(1) It is not essential that we restrict to the simple random walk: any random walk for which the results in Section 2 hold (such as finite-range symmetric walks) would work equally well.
(2) The paper [15] considers several distant sets $K^1, \dots, K^r$, and we believe this would also be possible here, but would lead to further technicalities in the presentation.
(3) It is also not essential that the rescaled domain be $({-}1,1)^d$ , and we believe it could be replaced by any other domain with sufficient regularity of the boundary.
A note on constants. All constants will be positive and finite. Constants denoted by C or c will depend only on the dimension d and may change from line to line. If we need to refer to a constant later, it will be given an index, such as $C_1$ .
We now describe the organization of this paper. In Section 2, we first introduce some basic notation, then recall several useful known results on random walks and state the key propositions required for the proof of the main theorem, Theorem 1.1. Section 3 contains the proof of the main theorem, assuming the key propositions. Finally, in Section 4 we provide the proofs of the propositions stated in Section 2.
2. Preliminaries
2.1. Some notation
We first introduce some notation used in this paper. In Section 1 we defined the discrete torus $\mathbb{T}_N = [{-}N/2, N/2)^d \cap \mathbb{Z}^d$ , $d\geq 3$ , and the canonical projection map $\varphi_N: \mathbb{Z}^d \to \mathbb{T}_N$ . From here on, we will omit the N-dependence and write $\varphi$ and $\tau_{\textbf{x}}$ instead.
We write vertices and subsets of the torus in bold, i.e. $\textbf{x} \in \mathbb{T}_N$ and $\textbf{K} \subset \mathbb{T}_N$ . In order to simplify notation, in the rest of the paper we abbreviate $\textbf{K} = \tau_{\textbf{x}} \varphi(K)$ .
Let $(Y_t)_{t \ge 0}$ be a discrete-time lazy simple random walk on $\mathbb{Z}^d$ ; that is,
\begin{equation*} \textbf{P}\big[ Y_{t+1} = y^{\prime} \mid Y_t = x^{\prime} \big] = \begin{cases} \frac{1}{2}, & y^{\prime} = x^{\prime}, \\ \frac{1}{4d}, & |y^{\prime} - x^{\prime}| = 1, \\ 0, & \text{otherwise}. \end{cases} \end{equation*}
We denote the corresponding lazy random walk on $\mathbb{T}_N$ by $(\textbf{Y}_t)_{t\geq 0} = (\varphi(Y_t))_{t\geq 0}$ . Let $\textbf{P}_{x^{\prime}}$ denote the distribution of the lazy random walk on $\mathbb{Z}^d$ started from $x^{\prime} \in \mathbb{Z}^d$ , and write $\textbf{P}_{\textbf{x}}$ for the distribution of the lazy random walk on $\mathbb{T}_N$ started from $\textbf{x} = \varphi(x^{\prime}) \in \mathbb{T}_N$ . We write $p_t(x^{\prime},y^{\prime}) = \textbf{P}_{x^{\prime}} [ Y_t = y^{\prime} ]$ for the t-step transition probability. Further notation we will use includes the following:
$L = m N$ , where $L^2 \sim A N^d$ as $N \to \infty$ for some constant $A \in (0,\infty)$ ;
$D = ({-}m,m)^d$ , the rescaled box, indicates which copy of the torus the walk is in;
$n = \lfloor N^\delta \rfloor$ for some $2 \lt \delta \lt d$ , long enough for the mixing property on the torus, but short compared to $L^2$ ;
$\textbf{x}_\textbf{0} \in \textbf{K}$ is a fixed point of $\textbf{K}$ ;
we write points in the original lattice $\mathbb{Z}^d$ with a prime, such as $y^{\prime}$ , and decompose a point $y^{\prime}$ as $y N+\textbf{y}$ with y in another lattice isomorphic to $\mathbb{Z}^d$ and $\textbf{y} = \varphi(y^{\prime}) \in \mathbb{T}_N$ ;
$T = \inf \big\{ t \ge 0\, :\,Y_t \not\in ({-}L,L)^d \big\}$ , the first exit time from $({-}L,L)^d$ ;
$S = \inf \big\{ \ell \ge 0\, :\,Y_{n \ell} \not\in ({-}L,L)^d \big\}$, so that $S \cdot n$ is the first multiple of n at which the rescaled point $Y_{n \ell} / N$ lies outside $({-}m,m)^d$.
For simplicity, we omit the dependence on d and N from some of the notation above.
2.2. Some auxiliary results on random walks
In this section, we collect some known results required for the proof of Theorem 1.1. We will rely heavily on the local central limit theorem (LCLT) [9, Chapter 2], with error term, and the martingale maximal inequality [9, Equation (12.12) of Corollary 12.2.7]. We will also use [9, Equation (6.31)], which relates $\textrm{Cap}(K)$ to the probability that a random walk started from the boundary of a large ball with radius n hits the set K before exiting the ball. In estimating some error terms in our arguments, sometimes we will use the Gaussian upper and lower bounds [5]. We also need to derive a lemma related to the mixing property on the torus [10, Theorem 5.6] to show that the starting positions of different stretches are not far from uniform on the torus; see Lemma 2.1.
We recall the LCLT from [9, Chapter 2]. The following is a specialization of [9, Theorem 2.3.11] to lazy simple random walks. The covariance matrix $\Gamma$ and the square root $J^{*}(x)$ of the associated quadratic form are given by
where I is the $(d \times d)$ -unit matrix.
Let $\bar{p}_t(x^{\prime})$ denote the estimate of $p_t(x^{\prime}) = p_t(o,x^{\prime})$ that one obtains from the LCLT for a lazy simple random walk. We have
The lazy simple random walk $(Y_t)_{t\geq 0}$ in $\mathbb{Z}^d$ is aperiodic and irreducible with mean zero, finite second moment, and finite exponential moments. All joint third moments of the components of $Y_1$ vanish.
Theorem 2.1. ([9, Theorem 2.3.11].) For a lazy simple random walk $(Y_t)_{t\geq 0}$ in $\mathbb{Z}^d$, there exists $\rho\gt 0$ such that for all $t\geq 1$ and all $x^{\prime}\in\mathbb{Z}^d$ with $|x^{\prime}|\lt\rho t$,
The martingale maximal inequality in [9, Equation (12.12) of Corollary 12.2.7] is stated as follows. Let $\Big(Y^{(i)}_t\Big)_{t \ge 0}$ denote the ith coordinate of $(Y_t)_{t \ge 0}$ ($1 \le i \le d$). The standard deviation $\sigma$ of $Y^{(i)}_1$ is given by $\sigma^2 = (2d)^{-1}$. For all $t \ge 1$ and all $r \gt 0$ we have
Now we state the result of [9, Equation (6.31)]. Recall that $B_r(o)$ is the discrete ball centred at o with radius r. For any subset $B \subset \mathbb{Z}^d$, let
Let $\partial B_r(o) = \big\{ y^{\prime} \in \mathbb{Z}^d \setminus B_r(o)\, :\,\exists x^{\prime} \in B_r(o)$ such that $|x^{\prime} - y^{\prime}| = 1 \big\}$ . For a given finite set $K \subseteq\mathbb{Z}^d$ , let $H_K$ denote the hitting time
Then we have
Here $\textrm{Cap}(K)$ is the capacity of K; see [9, Section 6.5], where the analogous statement is proved for the simple random walk. Since we consider the lazy random walk, an extra factor of $1/2$ is introduced.
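As background (we only use (4) itself), we recall the standard escape-probability representation of capacity from [9, Section 6.5]; in this display only, the probability refers to the (non-lazy) simple random walk $(S_t)_{t \ge 0}$:

```latex
% Capacity of a finite set K as total escape probability (simple random walk):
\begin{equation*}
  \textrm{Cap}(K) = \sum_{x^{\prime} \in K}
    \textbf{P}_{x^{\prime}}\big[ H^{+}_{K} = \infty \big],
  \qquad
  H^{+}_{K} = \inf\{ t \ge 1 \,:\, S_t \in K \}.
\end{equation*}
```

This is the quantity that will be extracted, via (4), from the hitting probabilities of the distant copies of $\textbf{K}$.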
In estimating some error terms in our arguments, sometimes we will use the Gaussian upper and lower bounds [5]: there exist constants $C = C(d)$ and $c = c(d)$ such that
Recall that the norm $|\cdot|$ refers to the Euclidean norm.
Regarding mixing times, recall that the lazy simple random walk on the N-torus mixes in time of order $N^2$ [10, Theorem 5.6]. With this in mind we derive the following simple lemma.
Recall that $2\lt\delta\lt d$ and $n=\lfloor N^{\delta} \rfloor$ .
Lemma 2.1. There exists $C=C(d)$ such that for any $N \geq 1$ and any $t \ge n$ we have
\begin{equation*} \max_{\textbf{x}, \textbf{y} \in \mathbb{T}_N} \textbf{P}_{\textbf{x}} \big[ \textbf{Y}_t = \textbf{y} \big] \le \frac{C}{N^d}. \end{equation*}
Proof. Using the Gaussian upper bound, the left-hand side can be bounded by
Here we bounded the number of x in $\mathbb{Z}^d$ satisfying $k \sqrt{t}/N \le |x| \lt (k+1) \sqrt{t}/N$ by $C(k+1)^{d-1} t^{d/2}/ N^{d}$ , where $k=0, 1, 2, \dots$ .
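In more detail, writing the probability in question as a sum over the lattice copies $y^{\prime} + xN$, $x \in \mathbb{Z}^d$, of the target point (our reading of the omitted display), the Gaussian upper bound combined with the annulus count gives the following chain; the shift by $y^{\prime} - x^{\prime}$ is absorbed into the constant c, since $t \ge n \gg N^2$, and constants are allowed to change between steps:

```latex
% Annulus decomposition in the proof of Lemma 2.1:
\begin{align*}
  \sum_{x \in \mathbb{Z}^d} p_t\big(x^{\prime}, y^{\prime} + xN\big)
  &\le \sum_{k \ge 0}\;
     \sum_{\substack{x \in \mathbb{Z}^d \\ k\sqrt{t}/N \le |x| \lt (k+1)\sqrt{t}/N}}
     \frac{C}{t^{d/2}}\, e^{-c k^2} \\
  &\le \sum_{k \ge 0} C (k+1)^{d-1} \frac{t^{d/2}}{N^{d}}
     \cdot \frac{C}{t^{d/2}}\, e^{-c k^2}
   = \frac{C}{N^{d}} \sum_{k \ge 0} (k+1)^{d-1} e^{-c k^2}
   \le \frac{C}{N^{d}}.
\end{align*}
```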
2.3. Key propositions
In this section we state some propositions to be used in Section 3 to prove Theorem 1.1. The propositions will be proved in Section 4.
The strategy of the proof is to consider stretches of length n of the walk, and estimate the small probability in each stretch that $\textbf{K}$ is hit by the projection. What makes this strategy work is that we can estimate, conditionally on the starting point and endpoint of a stretch, the probability that $\textbf{K}$ is hit, and this event asymptotically decouples from the number of stretches. The number of stretches will be the random variable S. Since $n S \approx T$ , and T is $\Theta\big(N^d\big)$ in probability, we have that S is $\Theta\big(N^d/n\big)$ . In Lemma 2.2 below we show a somewhat weaker estimate for S (which suffices for our needs).
The main part of the proof will be to show that during a fixed stretch, $\textbf{K}$ is not hit with probability
Heuristically, conditionally on S this results in the probability
and we will conclude by showing that $\left(n/N^d\right ) S$ converges in distribution to a constant multiple of the Brownian exit time $\sigma_1$ .
The factor $n/N^d$ in (6) arises as the expected time spent by the projected walk at a fixed point of the torus during a given stretch. The capacity term arises as we pass from expected time to hitting probability.
For the above approach to work, we need the event that the number of stretches or the endpoints of the stretches are not sufficiently well-behaved to have small probability. First, we will need to restrict to realizations where $\big(\sqrt{\log \log n}\big)^{-1} \left(N^d/n\right) \le S \le \log N \left(N^d/n\right)$, which occurs with high probability as $N \to \infty$ (see Lemma 2.2 below). Second, suppose that the $\ell$th stretch starts at the point $y^{\prime}_{\ell-1}$ and ends at the point $y^{\prime}_{\ell}$; that is, $y^{\prime}_{\ell-1}$ and $y^{\prime}_{\ell}$ are realizations of $Y_{(\ell-1) n}$ and $Y_{\ell n}$. In order to have a good estimate of the probability that $\textbf{K}$ is hit during this stretch, we will need to impose a few conditions on $y^{\prime}_{\ell-1}$ and $y^{\prime}_{\ell}$. One of these is that the displacement $|y^{\prime}_{\ell} - y^{\prime}_{\ell-1}|$ is not too large: we will require that for all stretches, it is at most $f(n) \sqrt{n}$, for a function to be chosen later that increases to infinity with N. We will be able to choose f(n) of the form $f(n) = C_1 \sqrt{\log N}$ in such a way that this restriction holds for all stretches with high probability. A third condition we need to impose, which will also hold with high probability, is that $y^{\prime}_{\ell-1}$ is at least a certain distance $N^\zeta$ from $\varphi^{-1}(\textbf{K})$ for a parameter $0 \lt \zeta \lt 1$ (this will only be required for $\ell \ge 1$, and is not needed for the first stretch starting with $y^{\prime}_0 = o$). The reason we need this is to be able to appeal to (4) to extract the $\textrm{Cap}(\textbf{K})$ contribution, when we know that $\textbf{K}$ is hit from a long distance (we will take $r = N^\zeta$ in (4)). The larger the value of $\zeta$, the better the error bound we get on the approach to $\textrm{Cap}(\textbf{K})$.
On the other hand, $\zeta$ should not be too close to 1, because we want the separation of $y^{\prime}_{\ell-1}$ from $\textbf{K}$ to occur with high enough probability.
The set $\mathcal{G}_{\zeta,C_1}$ defined below represents realizations of S and the sequence $Y_{n}$ , $Y_{2n}$ , $\dots$ , $Y_{Sn}$ satisfying the above restrictions. Proposition 2.1 below implies that these restrictions hold with high probability. First, we will need $2 \lt \delta \lt d$ to satisfy the inequality
This can be satisfied if $d \ge 3$ and $\delta$ is sufficiently close to d, say $\delta = \frac{7}{8} d$ . Since the left-hand side of the left-hand inequality in (7) equals $(\delta/d)(d-2)$ , we can subsequently choose $\zeta$ such that we also have
With the parameter $\zeta$ fixed satisfying the above, we now define
where $f(n) = C_1 \,\sqrt{\log N}$; recall that we write $y^{\prime}_\ell = y_\ell N + \textbf{y}_\ell$ and $y^{\prime}_{\ell-1} = y_{\ell-1} N + \textbf{y}_{\ell-1}$, and we define $y^{\prime}_0 = o$. The time $\tau$ corresponds to a particular value of the exit time S, so that $y_\ell \in D$ for $1 \le \ell \lt \tau$ and $y_{\tau} \notin D$. The parameter $C_1$ will be chosen in the course of the proof. See Figure 1 for a visual illustration of the sequence $y^{\prime}_0, y^{\prime}_1, \dots$ in the definition of $\mathcal{G}_{\zeta,C_1}$.
The next lemma shows that the restriction made on the time-parameter $\tau$ in the definition of $\mathcal{G}_{\zeta,C_1}$ holds with high probability for S.
Lemma 2.2. We have
as $N \to \infty$ .
Proof. By the definitions of S and T, we first notice that
where $Y^{(i)}$ denotes the ith coordinate of the d-dimensional lazy random walk.
We are going to use (3). Setting $t= \left(\sqrt{\log\log n}\right)^{-1} N^d$ and $r\sigma\sqrt{t} = L$, we can evaluate each term in the sum (and similarly for the event involving $-Y^{(i)}_j$)
Recall that $L^2 \sim A N^d$ and $\sigma^2 = 1/2d$ . For the main term in the exponential in (10), we have the upper bound
The big-O term in the exponential in (10) produces an error term because
Coming to the second event $\left\{\frac{S n}{N^d} \gt \log N \right\}$, observe that the central limit theorem applied to $\Big(Y_{k n \lfloor N^d/n \rfloor + t} - Y_{k n \lfloor N^d/n \rfloor}\Big)_{t \ge 0}$ implies that
for some $c \gt 0$ . Hence we have $\textbf{P}_{o}\left[S \gt k\frac{N^d}{n}\right]\leq e^{-c k}$ for all $k \ge 0$ .
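The passage from this uniform exit estimate to the geometric tail is the standard Markov-property iteration: each block of roughly $N^d$ steps gives the walk probability at least $c^{\prime} \gt 0$ of leaving $({-}L,L)^d$, regardless of its current position, so

```latex
% Geometric tail for S via iteration of the block estimate:
\begin{equation*}
  \textbf{P}_o\!\left[ S \gt (k+1)\tfrac{N^d}{n} \right]
  \le (1 - c^{\prime})\,
  \textbf{P}_o\!\left[ S \gt k\tfrac{N^d}{n} \right],
  \qquad\text{hence}\qquad
  \textbf{P}_o\!\left[ S \gt k\tfrac{N^d}{n} \right]
  \le (1 - c^{\prime})^{k} \le e^{-ck}.
\end{equation*}
```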
Applying this with $k=\log N$ , we obtain
as required.
The starting point for the proof of Theorem 1.1 is the following proposition, which decomposes the probability we are interested in into terms involving single stretches of duration n.
Proposition 2.1. For a sufficiently large value of $C_1$ , we have that
Furthermore,
where $o(1) \to 0$ as $N \to \infty$ .
Central to the proof of Theorem 1.1 is the following proposition, which estimates the probability of hitting a copy of $\textbf{K}$ during a ‘good stretch’ where the displacement $|y^{\prime}_\ell - y^{\prime}_{\ell-1}|$ is almost of order $\sqrt{n}$ . This will not hold for all stretches with high probability, but the fraction of stretches for which it fails will vanish asymptotically.
Proposition 2.2. There exists a sufficiently large value of $C_1$ so that the following holds. Let $\big( \tau, (y_\ell, \textbf{y}_\ell)_{\ell=1}^\tau \big) \in \mathcal{G}_{\zeta,C_1}$ . Then for all $2 \le \ell \le \tau$ such that $|y^{\prime}_\ell - y^{\prime}_{\ell-1}| \le 10\sqrt{n}\log\log n$ we have
In addition to the above proposition (that we prove in Section 4.2), we will need a weaker version for the remaining ‘bad stretches’ that have less restriction on the distance $|y^{\prime}_\ell - y^{\prime}_{\ell-1}|$ . This will be needed to estimate error terms arising from the ‘bad stretches’, and it will also be useful in demonstrating some of our proof ideas for other error terms arising later in the paper. It will be proved in Section 4.1.
Proposition 2.3. Let $\big( \tau, (y_\ell, \textbf{y}_\ell)_{\ell=1}^\tau \big) \in \mathcal{G}_{\zeta,C_1}$ . For all $2 \le \ell \le \tau$ we have
and for the first stretch we have
Here the $-O\big(n/N^d\big)$ and $-o(1)$ error terms are negative.
Our final proposition is needed to estimate the number of stretches that are ‘bad’.
Proposition 2.4. We have
as $N \to \infty$ .
3. Proof of the main theorem assuming the key propositions
This section gives the proof of Theorem 1.1.
Proof of Theorem 1.1 assuming Propositions 2.1–2.4.
First, given any $\big(\tau,(y_\ell, \textbf{y}_\ell)_{\ell=1}^\tau\big) \in \mathcal{G}_{\zeta,C_1}$ , we define
Thus we have
We further define
We have by Proposition 2.1 that
By Proposition 2.4, we can replace the summation over elements of $\mathcal{G}_{\zeta,C_1}$ by summation over just elements of $\mathcal{G}^{\prime}_{\zeta,C_1}$ , at the cost of an o(1) term. That is,
Applying Proposition 2.2 for the factors $\ell \in \mathcal{L}$ and Proposition 2.3 for the factors $\ell \in \{ 1 \} \cup \mathcal{L}^{\prime}$ , we get
Note that since the summation is over elements of $\mathcal{G}^{\prime}_{\zeta,C_1}$ only, we have
By (21), we can lower-bound the last product in (20) by
Since the product is also at most 1, it equals $1 + o(1)$ .
Also, by (21), we have
Since $\tau \ge \frac{N^d}{n} \big(\sqrt{\log \log n}\big)^{-1}$ , we have $|\mathcal{L}|=(1+o(1))\tau$ . This implies that the penultimate product in (20) equals
Recall that $S=\inf\big\{\ell\geq 0: Y_{n\ell}\notin ({-}L,L)^d\big\}$ . By summing over $(y_\ell, \textbf{y}_\ell)_{\ell=1}^\tau$ and appealing to (11), we get that (20) equals
where the primed summation denotes restriction to
Since, by Lemma 2.2, S satisfies the bounds on $\tau$ with probability going to 1, the latter expression equals
Let $\Gamma_n$ denote the covariance matrix of $Y_n$, so that $\Gamma_n=\frac{n}{2d}I$. Let $Z_\ell=\sqrt{\frac{2d}{n}}\,Y_{n\ell}$ for $\ell\ge 0$; in particular, $Z_1$ has covariance matrix $\Gamma_{Z} = I$.
Since $L^2\sim A\,N^d$ , the event $\big\{Y_{n\ell} \notin ({-}L,L)^d\big\}$ is the same as
Converting to events in terms of Z we have
Now we can write S as
Let $\sigma_1 = \inf\big\{t\gt 0\,:\,B_t\notin({-}1,1)^d\big\}$ be the exit time of Brownian motion from $({-}1,1)^d$. By Donsker’s theorem [4, Theorem 8.1.5] we have
Then we have that $\frac{n}{N^d} S$ converges in distribution to $c \sigma_1$ , with $c = 2dA$ . This completes the proof.
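For completeness, here is the heuristic identification of the constant c: by Donsker's theorem, the exit time T of the lazy walk from $({-}L,L)^d$ is of order $L^2/\sigma^2$ times the Brownian exit time $\sigma_1$, where $\sigma^2 = (2d)^{-1}$ is the per-coordinate variance; since $nS \approx T$ and $L^2 \sim A N^d$,

```latex
% Identification of the limiting constant c = 2dA (heuristic):
\begin{equation*}
  \frac{n}{N^d}\, S \;\approx\; \frac{T}{N^d}
  \;\approx\; \frac{L^2}{\sigma^{2} N^d}\,\sigma_1
  \;\sim\; \frac{A N^d}{(2d)^{-1} N^d}\,\sigma_1
  \;=\; 2 d A\, \sigma_1 .
\end{equation*}
```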
4. Proofs of the key propositions
4.1. Proof of Proposition 2.3
In the proof of the proposition we will need the following lemma, which bounds the probability of hitting some copy of $\textbf{K}$ in terms of the Green’s function of the random walk. Recall that the Green’s function is defined by
\begin{equation*} G(x^{\prime},y^{\prime}) = \sum_{t=0}^{\infty} p_t(x^{\prime},y^{\prime}), \end{equation*}
and in all dimensions $d \ge 3$ it satisfies the bound [9]
\begin{equation*} G(x^{\prime},y^{\prime}) \le C_G\, |x^{\prime} - y^{\prime}|^{2-d}, \qquad x^{\prime} \neq y^{\prime}, \end{equation*}
for a constant $C_G = C_G(d)$ . For Part (ii) of the lemma recall that $\textbf{K} \cap \varphi(B_{g(N)}(o)) = \emptyset$ . We also define $\textrm{diam}(\textbf{K})$ as the maximum Euclidean distance between two points in $\textbf{K}$ .
Lemma 4.1. Let $d \ge 3$ . Assume that N is large enough so that $N^\zeta \ge \textrm{diam}(\textbf{K})$ .
(i) If $y^{\prime} \in \mathbb{Z}^d$ satisfies $\varphi(y^{\prime}) \not\in B(\textbf{x}_0,N^{\zeta})$, then for all sufficiently small $\varepsilon \gt 0$ we have
(25) \begin{equation} \sum_{t=0}^{N^{2+6\varepsilon}} \sum_{x \in \mathbb{Z}^d} \sum_{x^{\prime} \in \textbf{K}+xN} p_t(y^{\prime},x^{\prime}) \le \frac{C}{N^{\zeta(d-2)}}.\end{equation}
(ii) If $g(N) \le N^\zeta$, then for all sufficiently small $\varepsilon \gt 0$ we have
(26) \begin{equation} \sum_{t=0}^{N^{2+6\varepsilon}} \sum_{x \in \mathbb{Z}^d} \sum_{x^{\prime} \in \textbf{K}+xN} p_t(o,x^{\prime}) \le \frac{C}{g(N)^{(d-2)}}.\end{equation}
(iii) If $y^{\prime} \in \mathbb{Z}^d$ satisfies $\varphi(y^{\prime}) \not\in B(\textbf{x}_0,N^{\zeta})$, then for all sufficiently small $\varepsilon \gt 0$ we have
(27) \begin{equation} \sum_{x \in \mathbb{Z}^d} \sum_{\substack{x^{\prime} \in \textbf{K}+xN \\ |x^{\prime}-y^{\prime}| \le n^{\frac{1}{2}+\varepsilon}}} G(y^{\prime},x^{\prime}) \le \frac{C}{N^{d-\delta-2\delta\varepsilon}}.\end{equation}
Proof. (i) We split the sum according to whether $|y^{\prime}-x^{\prime}| \gt N^{1+4\varepsilon}$ or $\le N^{1+4\varepsilon}$ . In the first case we use (5) and write $r = \lfloor |x^{\prime}-y^{\prime}| \rfloor$ to get
For the remaining terms, we have the upper bound
Let Q(k N) be the cube with radius kN centred at o. Then $y^{\prime} + ( Q(k N)\backslash Q((k-1)N))$ are disjoint annuli for $k = 1, 2, \dots$ . We decompose the sum over $x^{\prime}$ according to which annulus $x^{\prime}$ falls into. For $k \ge 2$ we have
where $C_G$ is the Green’s function constant. The contribution from any copy of $\textbf{K}$ in $y^{\prime} + Q(N)$ will be of order $N^{2-d}$ if its distance from $y^{\prime}$ is at least $N/3$ , say. Note that there is at most one copy of $\textbf{K}$ within distance $N/3$ of $y^{\prime}$ , which may have a distance as small as $N^\zeta$ .
We have to sum over the following values of k:
Since $x^{\prime}\in\varphi^{-1}(\textbf{K})$ and $y^{\prime}\notin \varphi^{-1}\left(B(\textbf{x}_0, N^{\zeta})\right)$ for $\textbf{x}_0\in \textbf{K}$ , the distance between $x^{\prime}$ and $y^{\prime}$ is at least $N^{\zeta}$ . Therefore, we get the upper bound as follows:
Here the last inequality follows from the choice of $\zeta$ , (8), for sufficiently small $\varepsilon \gt 0$ .
(ii) The proof is essentially the same, except for the contribution of the ‘nearest’ copy of $\textbf{K}$ , which is now $C |\textbf{K}| g(N)^{2-d}$ .
(iii) The proof is very similar to that in Part (i). Recall that $n=\lfloor N^{\delta} \rfloor$ . This time we need to sum over $k = 1, \dots, n^{\frac{1}{2}+\varepsilon} / N$ , which results in the bound
\begin{align*} C |\textbf{K}| N^{-\zeta(d-2)} + C|\textbf{K}| N^{2-d} \times N^{\delta + 2 \delta \varepsilon - 2} = C|\textbf{K}| \left[ N^{-\zeta(d-2)} + N^{\delta - d + 2 \delta \varepsilon} \right]\kern-2pt. \end{align*}Here, for $\varepsilon \gt 0$ small enough, the second term dominates thanks to the choice of $\zeta$ ; see (8).
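In more detail, the annulus sum in Part (iii) is evaluated as follows: the kth annulus $y^{\prime} + \big( Q(kN) \setminus Q((k-1)N) \big)$ contains $O\big(k^{d-1}\big)$ copies of $\textbf{K}$, each contributing at most $|\textbf{K}|\, C_G (kN)^{2-d}$ by the Green's function bound, so

```latex
% Annulus sum for Lemma 4.1(iii); the nearest copy is treated separately.
\begin{align*}
  \sum_{k=2}^{n^{1/2+\varepsilon}/N} C k^{d-1}\, |\textbf{K}|\, C_G (kN)^{2-d}
  &= C |\textbf{K}|\, N^{2-d} \sum_{k=2}^{n^{1/2+\varepsilon}/N} k \\
  &\le C |\textbf{K}|\, N^{2-d}\, \frac{n^{1+2\varepsilon}}{N^{2}}
   \le C |\textbf{K}|\, N^{\delta + 2\delta\varepsilon - d},
\end{align*}
```

and adding the nearest-copy contribution $C|\textbf{K}| N^{-\zeta(d-2)}$ recovers the display above.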
Proof of Proposition 2.3. Since
we need to show that
Define $A(x)= \left\{Y_n=y^{\prime}_{\ell},\, Y_t \in x N + \textbf{K} \text { for some } 0 \le t \lt n \right\}$ , so that
We have
We bound this by splitting up the sum into different contributions. Let $\varepsilon \gt 0$ ; this will be chosen sufficiently small in the course of the proof.
Case 1: $n_1, n_2\geq N^{2+6\varepsilon}$ and $|y^{\prime}_{\ell-1}-x^{\prime}|\leq n^{\frac{1}{2}+\varepsilon}_1$ , $|x^{\prime}-y^{\prime}_{\ell}|\leq n^{\frac{1}{2}+\varepsilon}_2$ . By the LCLT we have that
For this note that we have
where the first term tends to 0 and the rest equals
Here we choose $\varepsilon$ so that $-\varepsilon+6\varepsilon^2 \lt 0$ . A similar observation shows the estimate for $p_{n_2}\big(x^{\prime},y^{\prime}_\ell\big)$ .
The way we are going to use (30) is to replace the summation over $x^{\prime}$ by a summation over all $u^{\prime} \in \mathbb{T}_N + xN$ , at the same time inserting a factor $|\textbf{K}|/N^d$ . Hence the contribution of the values of $n_1, n_2$ and x in Case 1 to the right-hand side of (28) is at most
This completes the bound in Case 1. For future use, note that if $\varepsilon_n \to 0$ is any sequence, and we add the restriction $n_1 \le \varepsilon_n n$ to the conditions in Case 1, we obtain the upper bound
Case 2a: $n_1, n_2 \ge N^{2+6\varepsilon}$ but $\big|x^{\prime} - y^{\prime}_{\ell-1}\big| \gt n_1^{\frac{1}{2}+\varepsilon}$ . In this case we bound $p_{n_2}\big(x^{\prime}, y^{\prime}_\ell\big) \le 1$ and have that the contribution of this case to the right-hand side of (28) is at most
where in the first step we used (3) and in the last step we used the Gaussian lower bound (5) for $p_n$. Indeed, the requirement for the Gaussian lower bound is satisfied for sufficiently large N because $\big|y^{\prime}_{\ell-1}-y^{\prime}_{\ell}\big| \le C_1 \sqrt{\log N}\sqrt{n} \le c\,n$. Therefore, we have
Then we have
Case 2b: $n_1, n_2 \ge N^{2+6\varepsilon}$ but $|y^{\prime}_{\ell} - x^{\prime}| \gt n_2^{1/2+\varepsilon}$ . This case can be handled very similarly to Case 2a.
Case 3a: $n_1 \lt N^{2+6\varepsilon}$ and $\big|x^{\prime}-y^{\prime}_{\ell-1}\big|\le N^{\frac{\delta}{2}-\varepsilon}$ . By the LCLT we have
We claim that
We first note that $n_2=n-n_1 = n(1+o(1))$ , then deduce that $n_2^{-d/2}= O\big(n^{-d/2}\big)$ and
Since we have $\big|x^{\prime}-y^{\prime}_{\ell-1}\big|\le N^{\frac{\delta}{2}-\varepsilon}$ in the exponent, as $N\to\infty$ we have
and
These imply that
Thus (33) follows from comparing the LCLT approximations of the two sides.
We now have that the contribution of this case to the right-hand side of (28) is at most
where in the first step we used Lemma 4.1(i) and the last step holds for the value of $\zeta$ we chose; cf. (8).
Case 3b: $n_1 \lt N^{2+6\varepsilon}$ but $\big|x^{\prime}-y^{\prime}_{\ell-1}\big|\gt N^{\frac{\delta}{2}-\varepsilon}$ . Use the Gaussian upper bound (5) to bound $p_{n_1}$ , and bound the sum over all $x^{\prime}\in\mathbb{Z}^d$ of $p_{n_2}$ by 1 using symmetry of $p_{n_2}$ , to get
In the last step we used a Gaussian lower bound for $p_n$ ; cf. (32).
Case 4a: $n_2 \lt N^{2+6\varepsilon}$ and $|y^{\prime}_\ell - x^{\prime}|\le N^{\frac{\delta}{2}-\varepsilon}$ . This case can be handled very similarly to Case 3a.
Case 4b: $n_2 \lt N^{2+6\varepsilon}$ and $|y^{\prime}_\ell - x^{\prime}|\gt N^{\frac{\delta}{2}-\varepsilon}$ . This case can be handled very similarly to Case 3b.
This covers all possible cases and proves statement (14) of the proposition, as required.
The proof of (15) is similar to that of the first part, with only a few modifications. In this part we have to show that
Define $A_0(x)= \left\{Y_n=y^{\prime}_1,\, Y_t\in x N + \textbf{K} \text{ for some } 0\leq t \lt n \right\}$ , so that
We have
We bound the term above by splitting up the sum into the same cases as in the proof of (14). The different cases can be handled very similarly to the first part; the only difference is in Case 3a, when applying the Green’s function bound of Lemma 4.1.
In Case 3a, by the LCLT, we can deduce that
If $g(N) \gt N^\zeta$ , the bound of Lemma 4.1(i) can be used as before. If $g(N) \le N^\zeta$ , by Lemma 4.1(ii), we have that the contribution of this case to the right-hand side of (34) is at most
Here we used that $g(N) \to \infty$ .
Note that Case 4a can be handled in the same way as in the proof of (14), since the distance between $y^{\prime}_1$ and any copy of $\textbf{K}$ is at least $N^\zeta$ .
This covers all possible cases and proves (15), as required.
For future use, we extract a few corollaries of the proof of Proposition 2.3.
Corollary 4.1. Assume that $y^{\prime}, y^{\prime\prime} \in \mathbb{Z}^d$ are points such that $|y^{\prime\prime} - y^{\prime}| \le 2 C_1 \sqrt{\log n} \sqrt{n}$ . Then for all $n/2 \le m \le n$ we have
Proof. In the course of the proof of Proposition 2.3, we established the above with $m = n$ , where $y^{\prime} = y^{\prime}_{\ell-1}$ and $y^{\prime\prime} = y^{\prime}_{\ell}$ , and with $C_1$ in place of $2 C_1$ in the upper bound on the displacement $|y^{\prime\prime} - y^{\prime}|$ . Note that in this case the restriction $|y^{\prime} - x^{\prime}|, |y^{\prime\prime} - x^{\prime}| \gt N^\zeta$ in the summation holds for all x, because of conditions imposed on $y^{\prime}_{\ell-1}$ and $y^{\prime}_{\ell}$ in the definition of $\mathcal{G}_{\zeta,C_1}$ .
The arguments when $n/2 \le m \lt n$ and with the upper bound increased by a factor of 2 are essentially the same. The information that $y^{\prime}_{\ell-1}$ and $y^{\prime}_\ell$ are at least distance $N^\zeta$ from $\varphi^{-1}(\textbf{K})$ was only used in Cases 3a and 4a to handle terms $x^{\prime}$ close to these points. Since in (36) we exclude such $x^{\prime}$ from the summation, the statement follows without restricting the location of $y^{\prime}$ , $y^{\prime\prime}$ .
The following is merely a restatement of what was observed in (31) (with Part (ii) below holding by symmetry).
Corollary 4.2.
-
(i) For $\ell \ge 2$ and any sequence $\varepsilon_n \to 0$ , we have
(37) \begin{equation}\begin{split} \sum_{\substack{n_1+n_2=n \\ n_1 \le \varepsilon_n n \\ n_1, n_2 \ge N^{2+6\varepsilon}}} \ \sum_{x \in \mathbb{Z}^d} \ \sum_{\substack{x^{\prime} \in \textbf{K} +xN\, :\,\\ |y^{\prime}_{\ell-1}-x^{\prime}|\leq n_1^{\frac{1}{2}+\varepsilon} \\ |x^{\prime}-y^{\prime}_{\ell}|\leq n_2^{\frac{1}{2}+\varepsilon}}} p_{n_1}\big(y^{\prime}_{\ell-1},x^{\prime}\big) p_{n_2}\big(x^{\prime},y^{\prime}_{\ell}\big) = o(1) \frac{n}{N^d} p_n\big(y^{\prime}_{\ell-1}, y^{\prime}_\ell\big).\end{split}\end{equation} -
(ii) The same right-hand-side expression is valid if we replace the restriction $n_1 \le \varepsilon_n n$ by $n_2 \le \varepsilon_n n$ .
The following is a restatement of the bounds of Cases 2a and 2b.
Corollary 4.3. For $\ell \ge 2$ we have
-
(i)
(38) \begin{equation}\begin{split} \sum_{\substack{n_1+n_2=n \\ n_1, n_2 \ge N^{2+6\varepsilon}}} \ \sum_{x \in \mathbb{Z}^d} \ \sum_{\substack{x^{\prime} \in \textbf{K} +xN\, :\,\\ |y^{\prime}_{\ell-1}-x^{\prime}| \gt n_1^{\frac{1}{2}+\varepsilon}}} p_{n_1}\big(y^{\prime}_{\ell-1},x^{\prime}\big) p_{n_2}\big(x^{\prime},y^{\prime}_{\ell}\big) = o(1) \frac{n}{N^d} p_n\big(y^{\prime}_{\ell-1}, y^{\prime}_\ell\big),\end{split}\end{equation} -
(ii)
(39) \begin{equation}\begin{split} \sum_{\substack{n_1+n_2=n \\ n_1, n_2 \ge N^{2+6\varepsilon}}} \ \sum_{x \in \mathbb{Z}^d} \ \sum_{\substack{x^{\prime} \in \textbf{K} +xN\, :\,\\ |x^{\prime} - y^{\prime}_\ell| \gt n_2^{\frac{1}{2}+\varepsilon}}} p_{n_1}\big(y^{\prime}_{\ell-1},x^{\prime}\big) p_{n_2}\big(x^{\prime},y^{\prime}_{\ell}\big) = o(1) \frac{n}{N^d} p_n\big(y^{\prime}_{\ell-1}, y^{\prime}_\ell\big).\end{split}\end{equation}
The following is a restatement of the bounds of Cases 3a and 3b combined.
Corollary 4.4. For $\ell \ge 2$ we have
4.2. Proof of Proposition 2.2
In this section we need $C_1$ large enough so that we have
We have
where
The strategy is to estimate the probability via the Bonferroni inequalities:
We are going to use a parameter $A_n$ that we choose as $A_n = 10 \log \log n$ , so that in particular $A_n \to \infty$ .
4.2.1. The main contribution
In this section, we consider only stretches with $|y^{\prime}_\ell - y^{\prime}_{\ell-1}| \le A_n \sqrt{n}$ . We will show that the main contribution in (42) comes from x in the set
We first examine $\textbf{P}_{y^{\prime}_{\ell-1}} [ A(x) ]$ for $x \in G$ . Putting $B_{0,x} = B(\textbf{x}_0 + x N, N^{\zeta})$ , let $n_1$ be the time of the last visit to $\partial B_{0,x}$ before hitting $\textbf{K}+xN$ , let $n_1 + n_2$ be the time of the first hit of $\textbf{K}+xN$ , and let $n_3 = n - n_1 - n_2$ . See Figure 2 for an illustration of this decomposition.
Then we can write
where
We are going to use another parameter $\varepsilon_n$ that will need to go to 0 slowly. We choose it as $\varepsilon_n = (10 \log \log n)^{-1} \to 0$ . The main contribution to (43) will be when $n_1 \ge \varepsilon_n n$ , $n_3 \ge \varepsilon_n n$ , and $n_2 \le N^{2\delta/d} \sim n^{2/d}$ . Therefore, we split the sum over $n_1, n_2, n_3$ in (43) into a main contribution I(x) and an error term II(x). In order to define these, let
Then with
we have
Lemma 4.2. When $x \in G$ and $n_3 \ge \varepsilon_n n$ , we have
with the o(1) term uniform in $x^{\prime}$ and $u^{\prime}$ .
Proof. By the LCLT, we have
We compare the exponents
as $N\to\infty$ .
Lemma 4.3. When $x \in G$ and $n_1 \ge \varepsilon_n n$ , we have
with the o(1) term uniform in $z^{\prime}$ and $u^{\prime}$ .
Proof. The statement will follow if we show the following claim:
For this, observe that by (5) we have
On the other hand, using the Markov property, (5), and the fact that for $x^{\prime} \in \textbf{K}+xN$ we have $\big|y^{\prime}_{\ell-1} - x^{\prime}\big| \ge c N^{\zeta}$ and $|x^{\prime} - z^{\prime}| \ge c N^{\zeta}$ , we get
We note here that the sum over $1 \le m \le n_1/2$ and the sum over $n_1/2 \le m \le n_1-1$ are symmetric. Bounding the sum over $1 \le m \le n_1/2$ gives
In the second sum we can bound the exponential by 1 and get the upper bound
In the first sum, we group terms on dyadic scales k so that $2^k \le N^{2 \zeta}/m \le 2^{k+1}$ , $k = 0, \dots, \lfloor \log_2 N^{2 \zeta}\rfloor +1 $ . This gives the bound
which is also $o\big(n^{-d/2+o(1)}\big)$ .
In order to apply the previous two lemmas to analyse I(x) in (45), we first define a modification of F in (44), where $z^{\prime}$ and $x^{\prime}$ are both replaced by a vertex $u^{\prime} \in \mathbb{T}_N + xN$ . That is, we define
Then Lemmas 4.2 and 4.3 allow us to write, for $x \in G$ , the main term I(x) in (45) as
where the sum over $x^{\prime}$ is removed since
Lemma 4.4. Assume that $n_1, n_3 \ge \varepsilon_n n$ and $n_2 \le N^{2 \delta/d}$ .
-
(i) We have
\begin{align*} p_{n_1+n_3} \big(y^{\prime}_{\ell-1}, y^{\prime}_\ell\big) = (1 + o(1)) p_n\big(y^{\prime}_{\ell-1}, y^{\prime}_\ell\big). \end{align*} -
(ii) We have
(50) \begin{equation} \sum_{x \in G} \sum_{u^{\prime} \in \mathbb{T}_N+xN} p_{n_1} \big(y^{\prime}_{\ell-1}, u^{\prime}\big) \, p_{n_3}\big(u^{\prime}, y^{\prime}_\ell\big) = (1 + o(1)) p_{n_1+n_3} \big(y^{\prime}_{\ell-1}, y^{\prime}_\ell\big).\end{equation}
Proof.
-
(i) When $n_2 \le N^{2 \delta/d} \sim n^{2/d}$ , we have
\begin{align*} n_1 + n_3 = n \left( 1 - O \left( n^{-1+2/d} \right) \right). \end{align*}Hence the exponential term in the LCLT for $p_{n_1+n_3}\big(y^{\prime}_{\ell-1},y^{\prime}_\ell\big)$ is\begin{align*} \exp \left({-}\frac{|y^{\prime}_{\ell} - y^{\prime}_{\ell-1}|^2}{n} \left( 1 + O \big( n^{-1+2/d} \big) \right) \right) = (1 + o(1)) \, \exp \left({-}\frac{\big|y^{\prime}_{\ell} - y^{\prime}_{\ell-1}\big|^2}{n} \right), \end{align*}where we used that $A_n = 10 \log \log n$ , and hence $\big|y^{\prime}_{\ell} - y^{\prime}_{\ell-1}\big|^2 \le A_n^2 n = n\, o\big( n^{1-2/d} \big)$ . -
(ii) If we summed over all $x \in \mathbb{Z}^d$ , we would get exactly $p_{n_1+n_3}\big(y^{\prime}_{\ell-1}, y^{\prime}_\ell\big)$ . Thus the claim amounts to showing that
(51) \begin{equation} \sum_{x \in \mathbb{Z}^d \setminus G} \sum_{u^{\prime} \in \mathbb{T}_N+xN} p_{n_1} \big(y^{\prime}_{\ell-1}, u^{\prime}\big) \, p_{n_3}\big(u^{\prime}, y^{\prime}_\ell\big) = o(1) p_{n_1+n_3} \big(y^{\prime}_{\ell-1}, y^{\prime}_\ell\big).\end{equation}First, note that from the LCLT we have\begin{align*} p_{n_1+n_3} \big(y^{\prime}_{\ell-1}, y^{\prime}_\ell\big) = (1 + o(1)) \overline{p}_{n_1+n_3} \big(y^{\prime}_{\ell-1}, y^{\prime}_\ell\big). \end{align*}In order to estimate the left-hand side of (51), using (5), we can estimate the contribution of $\big\{ x \in \mathbb{Z}^d \setminus G\, :\,\max \big\{ \big|y^{\prime}_{\ell-1} - xN\big|, \big|x N - y^{\prime}_{\ell}\big| \big\} \gt A_n^2 \sqrt{n} \big\}$ as follows. First, we have\begin{align*} p_{n_1+n_3} \big(y^{\prime}_{\ell-1}, y^{\prime}_\ell\big) \ge \frac{c}{n^{d/2}} \exp \big({-}C A_n^2 (1+ o(1)) \big) \ge \frac{c}{n^{d/2}} \exp \big({-}C (\log \log n)^2 \big). \end{align*}Here we used $\big|y^{\prime}_{\ell} - y^{\prime}_{\ell-1}\big|^2 \le A_n^2 n$ and $n_1+n_3 = n(1-o(1))$ .
On the other hand, note that either $n_1 \ge n/3$ or $n_3 \ge n/3$. Without loss of generality, assume that $n_3 \ge n/3$. Then, estimating the contribution to the left-hand side of (51) using (5), and summing in dyadic shells with radii $2^k A_n^2 \sqrt{n}$, $k = 0, 1, 2, \dots$, we get the bound
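One way to organize this dyadic-shell summation, for shells far from $y^{\prime}_\ell$, can be sketched as follows (the constants are absorbed into $c, C$, and the symmetric case with the roles of the two endpoints exchanged is analogous). Using the Gaussian upper bound on $p_{n_3}$ with $n_3 \ge n/3$, and bounding the total mass of $p_{n_1}$ by 1:

```latex
\begin{align*}
\sum_{k \ge 0} \Big( \sup_{|u^{\prime} - y^{\prime}_\ell| \ge 2^k A_n^2 \sqrt{n}} p_{n_3}\big(u^{\prime}, y^{\prime}_\ell\big) \Big)
\sum_{u^{\prime}} p_{n_1}\big(y^{\prime}_{\ell-1}, u^{\prime}\big)
\le \sum_{k \ge 0} \frac{C}{n^{d/2}}\, e^{-c\, 4^k A_n^4}
\le \frac{C}{n^{d/2}}\, e^{-c A_n^4},
\end{align*}
```

which is negligible compared to the lower bound $c\, n^{-d/2} \exp\big({-}C(\log\log n)^2\big)$ above, since $A_n^4 = (10 \log\log n)^4 \gg (\log\log n)^2$.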
The above lemma allows us to write
The next lemma will help us extract the $\textrm{Cap}(K)$ contribution from the right-hand side of (43).
Lemma 4.5. We have
Proof. Performing the sum over $n_2$ allows us to rewrite the expression in the left-hand side of (54) as
Here the $1/2$ before $\textrm{Cap}(K)$ comes from the random walk being lazy; see (4). Using time-reversal for the summand in the last term, we get the expression
The subtracted term in the right-hand side of (55) is at most
Since $\zeta \lt \delta/d$ , this expression is o(1).
From the above lemma we get that the main contribution equals
It remains to estimate all the error terms.
4.2.2. The error terms
Lemma 4.6. We have
Proof. We split the estimates according to which condition is violated in the sum. Recall that in the proof of Proposition 2.3 we chose $\varepsilon \gt 0$ such that $-\varepsilon+6\varepsilon^2 \lt 0$ . Here we make the further restriction that $\varepsilon \lt 2\delta/d - 2\zeta$ .
Case 1: $n_2 \gt N^{2\delta/d}$ . We claim that
Since, in every time interval of duration $N^{2\zeta}$ , the walk has a positive chance of exiting the ball $B_{0,x}$ , we have
By (5) on $p_{n_2}$ , and since $\zeta \lt \delta/d$ and $ N^{2\delta /d} \lt n_2 \lt n$ , we have
Here we lower-bounded $\exp\left({-}C\frac{N^{2\zeta}}{n_2}\right)$ by c. The claim (57) is proved.
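The confinement estimate used above can be sketched as follows (the constants $c, c^{\prime}$ are generic): applying the Markov property at the start of each of the $\lfloor n_2/N^{2\zeta} \rfloor$ consecutive intervals of duration $N^{2\zeta}$, and using that the walk exits $B_{0,x}$ within each interval with probability at least some $c \gt 0$,

```latex
\begin{align*}
\textbf{P}\big[\, Y_t \in B_{0,x} \text{ for all } 0 \le t \le n_2 \,\big]
\le (1-c)^{\lfloor n_2/N^{2\zeta} \rfloor}
\le C \exp\left({-}c^{\prime}\, \frac{n_2}{N^{2\zeta}}\right).
\end{align*}
```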
We also have the bound
We then get (summing over $z^{\prime}$ and $x^{\prime}$ ) that the contribution to $\sum_{x \in \mathbb{Z}^d} II(x)$ from Case 1 is at most
Case 2: $n_2 \le N^{2\delta/d}$ and $n_1 \lt \varepsilon_n n$ . Note that since $n_2 \le N^2 \le \varepsilon_n n$ for large enough N, if we put $n^{\prime}_1 = n_1 + n_2$ and $n^{\prime}_2 = n_3$ , we can upper-bound the contribution of this case by
Now we can make use of the corollaries stated after the proof of Proposition 2.3 as follows.
Case 2(i): $N^{2+6 \varepsilon} \le n^{\prime}_1 \le 2 \varepsilon_n n$ , $\big|y^{\prime}_{\ell-1} - x^{\prime}\big| \le \big(n^{\prime}_1\big)^{\frac{1}{2}+\varepsilon}$ and $\big|x^{\prime} - y^{\prime}_\ell\big| \le \big(n^{\prime}_2\big)^{\frac{1}{2}+\varepsilon}$ . Note that for large enough N we have $n^{\prime}_2 \ge (n - 2 \varepsilon_n n) \ge N^{2+6 \varepsilon}$ . Hence, by Corollary 4.2(i) (with $\varepsilon_n$ there replaced by $2 \varepsilon_n$ ), the contribution of this case is
Case 2(ii): $N^{2+6 \varepsilon} \le n^{\prime}_1 \le 2 \varepsilon_n n$ but either $\big|y^{\prime}_{\ell-1} - x^{\prime}\big| \gt \big(n^{\prime}_1\big)^{\frac{1}{2}+\varepsilon}$ or $\big|x^{\prime} - y^{\prime}_\ell\big| \gt \big(n^{\prime}_2\big)^{\frac{1}{2}+\varepsilon}$ . Again, we have $n^{\prime}_2 \ge N^{2+6 \varepsilon}$ . Hence, neglecting the requirement $n^{\prime}_1 \le 2 \varepsilon_n n$ , Corollary 4.3 immediately implies that the contribution of this case is
Case 2(iii): $n^{\prime}_1 \lt N^{2+6\varepsilon}$ . It follows immediately from Corollary 4.4 that the contribution of this case is
Case 3: $n_2 \le N^{2\delta/d}$ and $n_3 \lt \varepsilon_n n$ . By symmetry, this case can be handled very similarly to Case 2.
Lemma 4.7. We have
Proof. By the same arguments as in Lemma 4.4(ii), we have
For $x \in \mathbb{Z}^d \setminus G$ , let k be the dyadic scale that satisfies
The same bounds hold up to constants for $\big|x^{\prime} - y^{\prime}_{\ell}\big|$ .
Then we have
By symmetry of the right-hand side, it is enough to consider the contribution of $1 \le m \le n/2$ , which is bounded by
Now, summing over $x \in \mathbb{Z}^d \setminus G$ , we have that the number of the copies of the torus at dyadic scale $2^k A_n^2 \sqrt{n}$ is at most $C \frac{1}{N^d} \left( 2^k A_n^2 \sqrt{n} \right)^d$ . Hence
Lemma 4.8. We have
Proof. The summand on the left-hand side is bounded above by
By symmetry it is enough to consider the first term inside the summation. The estimates are again modelled on the proof of Proposition 2.3.
Case 1: $m_1 + m_2 \ge n/2$ and $\big|x^{\prime}_2 - y^{\prime}_{\ell-1}\big| \le 2 C_1 \sqrt{n} \sqrt{\log n}$ . In this case we can use Corollary 4.1 with $y^{\prime} = y^{\prime}_{\ell-1}$ and $y^{\prime\prime} = x^{\prime}_2$ to perform the summation over $x^{\prime}_1$ and $x_1$ and get the upper bound
where we have written $m^{\prime}_1 = m_1 + m_2$ and $m^{\prime}_2 = m_3$ . Using Corollary 4.1 again, this time with $y^{\prime} = y^{\prime}_{\ell-1}$ and $y^{\prime\prime} = y^{\prime}_\ell$ , yields the upper bound
Case 2: $m_1 + m_2 \ge n/2$ and $2 C_1 \sqrt{n} \sqrt{\log n} \lt \big|x^{\prime}_2 - y^{\prime}_{\ell-1}\big| \le n^{\frac{1}{2}+\varepsilon}$ . We are going to use that $\varepsilon \le 1$ , which we can clearly assume. First sum over all $x^{\prime}_1 \in \mathbb{Z}^d$ to get the upper bound
where the primed summation denotes the restriction $2 C_1 \sqrt{n} \sqrt{\log n} \lt \big|x^{\prime}_2 - y^{\prime}_{\ell-1}\big| \le n^{\frac{1}{2}+\varepsilon}$ . The choice of $C_1$ (recall (41)) implies that $p_{m^{\prime}_1}$ is $o\big(1/n^{1+3d/2} N^d\big)$ . By the triangle inequality we also have $\big|y^{\prime}_\ell - x^{\prime}_2\big| \gt C_1 \sqrt{n} \sqrt{\log n}$ . Using the LCLT for $p_{m^{\prime}_2}$ we get that
Substituting this bound and $p_{m^{\prime}_1} = o\big(1/n^{1+3d/2} N^d\big)$ into (60), we get
Case 3: $m_1 + m_2 \ge n/2$ and $\big|x^{\prime}_2 - y^{\prime}_{\ell-1}\big| \gt n^{\frac{1}{2}+\varepsilon}$ . Summing over all $x^{\prime}_1 \in \mathbb{Z}^d$ , we get the transition probability $p_{m_1+m_2}\big(y^{\prime}_{\ell-1}, x^{\prime}_2\big)$ . This is stretched-exponentially small, and hence this case satisfies the required bound.
4.3. Proof of Proposition 2.1
Proof of Proposition 2.1. We start with the proof of the second claim. We denote the error term in (12) by E, which we claim to satisfy $|E| \le E_1 + E_2 + E_3 + E_4$ , with
Since $T\leq Sn$ , we have
By the Markov property, for $\big(\tau, (y_\ell, \textbf{y}_\ell)_{\ell=1}^\tau \big) \in \mathcal{G}_{\zeta,C_1}$ ,
We denote the probability on the right-hand side by $p\big(\tau, (y_\ell, \textbf{y}_\ell)_{\ell=1}^\tau\big)$ . On the event of the right-hand side, since $y_{\tau}\in D^{c}$ , we have $S=\tau$ . We claim that
Let $\mathcal{E}_1, \mathcal{E}_2, \mathcal{E}_4$ be the events in the definitions of $E_1, E_2, E_4$ respectively. Let
Then we have
From the above, the claim (62) follows.
The second claim of Proposition 2.1 then follows, subject to Lemmas 4.9, 4.11, and 4.12 below, which show that $E_j \to 0$ for $j= 2, 3, 4$ . We have already shown $E_1 \to 0$ in Lemma 2.2.
Similarly, the first claim of the proposition follows from Lemmas 2.2, 4.9, and 4.12.
We bound $E_2$ , $E_3$ , and $E_4$ in the following lemmas.
Lemma 4.9. We have $E_2 \le C \log N \frac{N^{d \zeta}}{n}$ for some C. Consequently, $E_2 \to 0$ .
Proof. Since the number of points in $B(\textbf{x}_\textbf{0}, N^{\zeta})$ is $O\big(N^{d\zeta}\big)$ , and since we are considering times after the first stretch, the random walk is well mixed, so by Lemma 2.1 the probability of visiting any point in the torus is $O\big(1/N^d\big)$ . Using a union bound we have
Since $\zeta \lt \delta/d$ , we have
Before we bound the error term $E_3$ , we first introduce the following lemma. Let $T^{(i)}_0 = \inf\Big\{t\ge 0: Y^{(i)}_t = 0 \Big\}$ , where we recall that $Y^{(i)}$ denotes the ith coordinate of the d-dimensional lazy random walk. We will denote by $t_0$ an instance of $T^{(i)}_0$ .
Lemma 4.10. For all $1 \le i \le d$ , for any integer y such that $-t_0 \le y \lt 0$ and $0 \lt t_0 \le n$ we have
Proof. Using the Markov property at time $t_0$ , we get
where the third step is due to the Cauchy–Schwarz inequality and $\textbf{P}_0\left[Y^{(i)}_{n-t_0}\gt 0\right]\geq c_0$ for some $c_0 \gt 0$ , and the second-to-last step is due to $\textbf{E}_0\left[\Big(Y^{(i)}_{n-t_0}\Big)^2\right] = (n-t_0)/2d$ .
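The moment computation in the second-to-last step can be checked directly: the lazy walk stays put with probability $1/2$ and otherwise moves to a uniformly chosen neighbour, so each step changes coordinate $i$ by $\pm 1$ with probability $1/(4d)$ each, and the increments are i.i.d. and centred. Hence

```latex
\begin{align*}
\textbf{E}_0\Big[\big(Y^{(i)}_1\big)^2\Big] = \frac{1}{4d} + \frac{1}{4d} = \frac{1}{2d},
\qquad
\textbf{E}_0\Big[\big(Y^{(i)}_{n-t_0}\big)^2\Big] = \frac{n-t_0}{2d}.
\end{align*}
```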
We can similarly bound the conditional expectation of $\Big(Y^{(i)}_n\Big)^2$ as follows:
Lemma 4.11. We have $E_3 \to 0$ as $N \to \infty$ .
Proof. First we are going to bound the time difference between T and Sn. We are going to consider separately the cases when $Y_T$ is in each face of the cube $({-}L,L)^d$ . Assume that we have $Y^{(i)}_T = L$ for some $1 \le i \le d$ . (The arguments needed are very similar when $Y^{(i)}_T = -L$ for some $1\le i \le d$ , and these will not be given.)
Let us consider the lazy random walk $(Y_t)_{t\ge 0}$ in multiples of n steps. Let
and similarly, let
We let $M_0 = L-Y^{(i)}_{T+s_1}$ and $M_r = L-Y^{(i)}_{T+s_{r+1}}$ for $r\ge 1$ . We have that $(M_r)_{r\ge 0}$ is a martingale. Let $\tilde{S} = \inf \{ r \ge 0: M_r \le 0\}$ , and we are going to bound $\textbf{P} \big[ \tilde{S} \gt N^{\varepsilon_1} \big]$ for some small $\varepsilon_1$ that we will choose in the course of the proof. We are going to adapt an argument in [Reference Levin, Peres and Wilmer10, Proposition 17.19] to this purpose.
Define
where we set $h= \sqrt{n}\sqrt{N^{\varepsilon_1}}$ . Let $(\mathcal{F}_r)_{r \ge 0}$ denote the filtration generated by $(M_r)_{r \ge 0}$ . We have
here, recall that $\sigma^2$ is the variance of $Y^{(i)}_1$ .
We first estimate $\textbf{E} \big[ M_0 \,|\, \tilde{S} \gt 0 \big]$ . Since $0 \le s_1 \lt n$ , by the same argument as in Lemma 4.10 we have that
Next we bound $\textbf{P}\big[M_{T_h}\ge h \,|\, M_0\big]$ . Since $\big(M_{r\wedge T_h}\big)$ is bounded, by the optional stopping theorem, we have
where we denote $\textbf{E}\Big[M_{T_h}\textbf{1}_{\{M_{T_h}\leq 0\}}|M_0\Big]$ by $-m_{-}(h) \leq 0$ , and the last step is due to $M_{T_h} \textbf{1}_{\{M_{T_h} \ge h\}} \ge h \textbf{1}_{\{M_{T_h} \ge h\}}$ . Hence, we have
We bound $m_{-}(h)$ using Lemma 4.10:
Hence, we have
We now estimate $\textbf{P} [ T_h \geq r \,|\, M_0 ]$ . Let $G_r = M^2_r-hM_r-\sigma^2 nr$ . The sequence $(G_r)$ is a martingale by (63). We can bound both the ‘overshoot’ above h and the ‘undershoot’ below 0 by Lemma 4.10. For the ‘undershoot’ below 0 we have
For the ‘overshoot’ above h, write $M_{T_h} =: N_{T_h}+h$ ; then we have
Hence
For $r \lt T_h$ , we have $(M_r - h) M_r \lt 0$ . Therefore, we have
Since $(G_{r\wedge T_h})$ is a martingale,
We conclude that
Letting $r\to\infty$ , by the monotone convergence theorem,
where $h= \sqrt{n}\sqrt{N^{\varepsilon_1}}$ . This gives
Taking expectations of both sides, we have
Combining the above bounds, we get
We now bound the probability that a copy of $\textbf{K}$ is hit between times T and $s_{N^{\varepsilon_1}}$ .
We first show that the probability that the lazy random walk on the torus is in the ball $B(\textbf{x}_0, N^\zeta)$ at time T goes to 0. Indeed, we have
where we have $\zeta\lt\delta/d \lt 1$ , so the last expression goes to 0. Here we used that $\textbf{P}_{o} [ Y_T = y^{\prime} ] \le C/L^{d-1}$ , for example using a half-space Poisson kernel estimate [Reference Lawler and Limic9, Theorem 8.1.2]. As for the number of terms in the sum, we have that there are $C L^{d-1}/N^{d-1}$ copies of the torus within $\ell_{\infty}$ distance $\le N$ of the boundary $\partial ({-}L,L)^d$ . Considering the intersection of the union of balls $\varphi^{-1}(B(\textbf{x}_0, N^{\zeta}))$ and the boundary $\partial ({-}L, L)^d$ , the worst case is that within a single copy of the torus the intersection has size at most $C N^{\zeta(d-1)}$ .
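Putting the three estimates just described together (a sketch, with all constants absorbed into $C$): the Poisson kernel bound $\textbf{P}_o[Y_T = y^{\prime}] \le C/L^{d-1}$, the at most $C L^{d-1}/N^{d-1}$ relevant copies of the torus, and the at most $C N^{\zeta(d-1)}$ boundary points per copy give

```latex
\begin{align*}
\textbf{P}_o\Big[ Y_T \in \varphi^{-1}\big(B(\textbf{x}_0, N^{\zeta})\big) \Big]
\le C\, \frac{L^{d-1}}{N^{d-1}} \cdot N^{\zeta(d-1)} \cdot \frac{C}{L^{d-1}}
= C\, N^{(\zeta-1)(d-1)} \longrightarrow 0,
\end{align*}
```

since $\zeta \lt 1$.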
Condition on the location $y^{\prime}$ of the walk at the exit time T. For $y^{\prime}\notin \varphi^{-1}\left(B(\textbf{x}_0, N^{\zeta})\right)$ , we first bound the probability of hitting $\textbf{K}$ between times 0 and $s_2$ . After time $s_2$ , the random walk is well mixed, and we can apply a simpler union bound.
We thus have the upper bound
The first term is stretched-exponentially small by the martingale maximal inequality (3). The Green’s function term is bounded by Lemma 4.1(iii).
After time $s_2$ , by Lemma 2.1, we have that
Therefore, combining the above upper bounds, we have the required result:
if $\varepsilon$ and $\varepsilon_1$ are sufficiently small.
Lemma 4.12. We have $E_4 \le C e^{-cf(n)^2}\frac{N^d \log N}{n}$ for some C. Consequently, there exists $ C_1$ such that if $f(n)\geq C_1\sqrt{\log N}$ , then $E_4\to 0$ .
Proof. By the martingale maximal inequality (3), we have that
Taking, say, $C_1 \gt \sqrt{d/c}$ implies that if $f(n)\geq C_1\sqrt{\log N}$ , then $e^{-cf(n)^2} \le N^{-c C_1^2}$ with $c C_1^2 \gt d$ , and we have
4.4. Proof of Proposition 2.4
Proof. By the martingale maximal inequality (3) used in the second step, we have
Hence we have
By Markov’s inequality, it follows that
as $N\to\infty$ .
Acknowledgements
We thank two anonymous referees for their constructive criticism.
Funding information
The research of M. Sun was supported by an EPSRC doctoral training grant to the University of Bath with project reference EP/N509589/1/2377430.
Competing interests
There were no competing interests to declare during the preparation or publication process of this article.