
Proper $3$-colorings of $\mathbb {Z}^{2}$ are Bernoulli

Published online by Cambridge University Press:  27 April 2022

GOURAB RAY*
Affiliation:
Department of Mathematics, University of Victoria, Victoria, BC, Canada V8W 2Y2
YINON SPINKA
Affiliation:
Department of Mathematics, University of British Columbia, Vancouver, BC, Canada V6T 1Z2; School of Mathematical Sciences, Tel Aviv University, Tel Aviv 6997801, Israel (e-mail: yinon@math.ubc.ca)

Abstract

We consider the unique measure of maximal entropy for proper 3-colorings of $\mathbb {Z}^{2}$ , or equivalently, the so-called zero-slope Gibbs measure. Our main result is that this measure is Bernoulli, or equivalently, that it can be expressed as the image of a translation-equivariant function of independent and identically distributed random variables placed on $\mathbb {Z}^{2}$ . Along the way, we obtain various estimates on the mixing properties of this measure.

Type
Original Article
Copyright
© The Author(s), 2022. Published by Cambridge University Press

1 Introduction

A (proper) $3$ -coloring of a graph is an assignment of one of three colors, say from $\{0,1,2\}$ , to each of its vertices so that no two adjacent vertices receive the same color. In this paper, we are concerned with 3-colorings of the square lattice $\mathbb {Z}^{2}$ with nearest-neighbor adjacency. Specifically, our goal is to show that a certain natural translation-invariant measure on the space of such 3-colorings is Bernoulli, meaning that it is isomorphic as a measure-preserving dynamical system to an independent and identically distributed (i.i.d.) process on $\mathbb {Z}^{2}$ , that is, there is an invertible measure-preserving map from an i.i.d. process to it, which is defined almost everywhere and commutes with all translations of  $\mathbb {Z}^{2}$ . Alternatively, being Bernoulli is equivalent to being a factor of an i.i.d. process.

A vertex of $\mathbb {Z}^{2}$ is even if the sum of its two coordinates is even. Let D be a finite subset of $\mathbb {Z}^{2}$ and let $\partial D$ denote its internal vertex boundary, namely, the set of vertices in D that have a neighbor outside D. Let $\mu ^{01}_{D}$ be the uniform measure on the set of all 3-colorings of D whose values on $\partial D$ are fixed to be 0 and 1 on even and odd vertices, respectively. We shall show that $\mu ^{01}_{D}$ converges as $D \uparrow \mathbb {Z}^{2}$ (for sufficiently nice D such as boxes) to a translation-invariant measure $\mu $ on 3-colorings of $\mathbb {Z}^{2}$ . It is this measure that we are concerned with here. Our main result is the following theorem.

Theorem 1.1. The measure $\mu $ is Bernoulli.

Let us make several quick remarks. Firstly, the limiting measure $\mu $ does not depend on the specific choice of boundary condition used to define $\mu ^{01}_{D}$ . To be more precise, for $\xi \in \{0,1,2\}^{\partial D}$ , let $\Omega ^{\xi }_{D}$ be the set of all 3-colorings of D which agree with $\xi $ on $\partial D$ . We refer to $\xi $ as a boundary condition. Suppose that $\Omega ^{\xi }_{D}$ is non-empty and let $\mu ^{\xi }_{D}$ denote the uniform measure on $\Omega ^{\xi }_{D}$ . The convergence of $\mu ^{\xi }_{D}$ to $\mu $ holds for a larger class of boundary conditions, namely, those whose oscillations (in terms of the associated height function) are of smaller order than the square root of the logarithm of the in-radius of D; see Remark 4.5.
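To make the objects $\Omega ^{\xi }_{D}$ and $\mu ^{\xi }_{D}$ concrete, the following is a minimal brute-force sketch for very small boxes. The helper names (`box`, `internal_boundary`, `boundary_01`, `colorings`) are hypothetical and chosen only for illustration; the enumeration is exponential in the number of non-boundary vertices and is not how the measures are analysed in the paper.

```python
from itertools import product

def box(n):
    """Vertices of the box Lambda_n = [-n, n]^2 in Z^2."""
    return [(x, y) for x in range(-n, n + 1) for y in range(-n, n + 1)]

def neighbors(v):
    """Nearest neighbors of v in Z^2."""
    x, y = v
    return [(x + 1, y), (x - 1, y), (x, y + 1), (x, y - 1)]

def internal_boundary(D):
    """Vertices of D that have a neighbor outside D."""
    D_set = set(D)
    return [v for v in D if any(u not in D_set for u in neighbors(v))]

def is_proper(coloring):
    """True if no two adjacent vertices of the colored set share a color."""
    return all(coloring[v] != coloring[u]
               for v in coloring for u in neighbors(v) if u in coloring)

def boundary_01(D):
    """Boundary condition: color 0 on even boundary vertices, 1 on odd ones."""
    return {v: (v[0] + v[1]) % 2 for v in internal_boundary(D)}

def colorings(D, xi):
    """Enumerate Omega^xi_D: proper 3-colorings of D that agree with xi."""
    free = [v for v in D if v not in xi]
    for values in product(range(3), repeat=len(free)):
        coloring = dict(xi)
        coloring.update(zip(free, values))
        if is_proper(coloring):
            yield coloring

# The uniform measure mu^01_D is the uniform distribution on these lists.
omega_1 = list(colorings(box(1), boundary_01(box(1))))
omega_2 = list(colorings(box(2), boundary_01(box(2))))
print(len(omega_1), len(omega_2))
```

Sampling from $\mu ^{01}_{D}$ then amounts to picking a uniformly random element of the enumerated list, for instance with `random.choice`.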

Standard arguments imply that $\mu $ is a Markov random field and a uniform Gibbs measure for 3-colorings of $\mathbb {Z}^{2}$ , meaning that, for any finite set $D \subset \mathbb {Z}^{2}$ and any boundary condition $\xi \in \{0,1,2\}^{\partial D}$ such that $\mu (\xi )>0$ , conditioned on $\xi $ , the coloring on D has distribution $\mu ^{\xi }_{D}$ and is independent of the coloring on $\mathbb {Z}^{2} \setminus D$ .

Our last remark concerns the notion of a measure of maximal entropy. We shall not define this notion precisely (and we shall not need it), but simply mention that it roughly means that the restriction of the measure to a large box has Shannon entropy which is nearly as large as possible on a volume scale. The topological entropy of 3-colorings of $\mathbb {Z}^{2}$ has been computed by Lieb [Reference Lieb20] to be $\tfrac 32 \log \tfrac 43$, which means that the number of 3-colorings of an $n$-by-$n$ grid grows like $(4/3)^{(3/2) n^{2} (1+o(1))}$ as ${n \to \infty }$. It is known that $\mu $ is a measure of maximal entropy for 3-colorings of $\mathbb {Z}^{2}$ (see [Reference Galvin, Kahn, Randall and Sorkin11, Theorem 1.2] for an elementary proof). Furthermore, it follows from the results in [Reference Chandgotia, Peled, Sheffield and Tassy4, Reference Sheffield29] (with some minor additional arguments) that there is a unique measure of maximal entropy for 3-colorings of $\mathbb {Z}^{2}$. Thus, our main theorem can be formulated concisely as follows: the unique measure of maximal entropy for 3-colorings of $\mathbb {Z}^{2}$ is Bernoulli.
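As a small numerical illustration of the growth rate quoted above, the sketch below counts proper 3-colorings of an $n$-by-$n$ grid (with free boundary) using a row-by-row transfer computation and compares $\log (\text{count})/n^{2}$ with Lieb's constant $\tfrac 32 \log \tfrac 43$. The function name `count_colorings` is hypothetical, and the convergence is slow, so small grids only indicate the trend.

```python
import math
from itertools import product

def count_colorings(n):
    """Number of proper 3-colorings of the n-by-n grid graph (free boundary)."""
    # Row states: proper 3-colorings of a horizontal path with n vertices.
    rows = [r for r in product(range(3), repeat=n)
            if all(r[i] != r[i + 1] for i in range(n - 1))]
    # Two consecutive rows are compatible if they differ in every column.
    compatible = {r: [s for s in rows if all(a != b for a, b in zip(r, s))]
                  for r in rows}
    counts = {r: 1 for r in rows}  # colorings of the first row
    for _ in range(n - 1):         # add the remaining rows one at a time
        new_counts = {r: 0 for r in rows}
        for r, c in counts.items():
            for s in compatible[r]:
                new_counts[s] += c
        counts = new_counts
    return sum(counts.values())

lieb = 1.5 * math.log(4 / 3)  # entropy per site, as quoted from Lieb
for n in range(1, 7):
    per_site = math.log(count_colorings(n)) / n ** 2
    print(n, count_colorings(n), round(per_site, 4), round(lieb, 4))
```

The per-site values decrease towards the limit as $n$ grows, since boundary effects are of lower order than the volume.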

The proof of Theorem 1.1 relies crucially on the Russo–Seymour–Welsh theory for homomorphisms from $\mathbb {Z}^{2}$ to $\mathbb {Z}$ recently developed in [Reference Chandgotia, Peled, Sheffield and Tassy4, Reference Duminil-Copin, Harel, Laslier, Raoufi and Ray9]. This also allows us to establish a quantitative power-law mixing condition, which we regard as the main probabilistic content of this paper; see Theorem 3.1.

1.1 Related results

Bernoullicity is a type of mixing condition. Several related mixing conditions for translation-invariant random fields on $\mathbb {Z}^{d}$ in increasing order of strength are ergodicity, weak mixing, (strong) mixing, k-fold mixing ( $k \ge 3$ ), Bernoullicity, and finitary factor of an i.i.d. process. There is an additional condition called K, which, for Markov random fields, is equivalent to full tail triviality [Reference den Hollander and Steif7] and is between k-fold mixing and Bernoullicity. Slawny [Reference Slawny31] gave examples of measures which are ergodic but not weakly mixing (for $d \ge 2$ ), measures which are weakly mixing but not mixing (for $d \ge 3$ ), and measures which are mixing but not 3-fold mixing (for $d \ge 3$ ). In two dimensions, Ledrappier [Reference Ledrappier19] gave a simple construction of a zero-entropy Markov random field which is mixing but not 3-fold mixing, and Hoffman [Reference Hoffman15, Reference Hoffman16] constructed Markov random fields which are K but not Bernoulli. Van den Berg and Steif [Reference van den Berg and Steif36] showed the existence of Markov random fields which are Bernoulli but not finitary factors of i.i.d. processes (for $d \ge 2$ ).

The question of determining whether a given random field is Bernoulli is non-trivial and has received much attention in the literature (for example, [Reference den Hollander and Steif6, Reference Häggström, Jonasson and Lyons14, Reference Nam, Sly and Zhang21, Reference Ornstein and Weiss24, Reference Sly and Zhang32]). We mention the following open question of van den Berg and Steif [Reference van den Berg and Steif36, Question 3]: if a translation-invariant Markov random field is the unique Markov random field with its conditional probabilities, is it necessarily Bernoulli? It is remarked there that such a Markov random field is known to be K (equivalently, full tail trivial).

Let us discuss the situation for 3-colorings of $\mathbb {Z}^{d}$ for $d \ge 3$ . In general, the convergence of the finite-volume measures $\mu ^{01}_{D}$ (defined as before) is not known. In sufficiently high dimensions, convergence holds along ‘nice’ domains of a given parity (so-called even or odd domains) [Reference Feldheim and Spinka10]. The limiting measures in this case depend on the parity of the domain [Reference Galvin, Kahn, Randall and Sorkin11, Reference Peled25]. In particular, these measures are not translation-invariant, but rather invariant only with respect to parity-preserving translations (or other parity-preserving automorphisms). This leads to the fact that the translation-invariant measures of maximal entropy are not mixing and thus also not Bernoulli (as they are mixtures of distinct extremal measures). Nevertheless, when restricting to the action of the subgroup of parity-preserving translations, these measures become Bernoulli (in fact, they are quite weak Bernoulli with exponential rate; see the proof of [Reference Feldheim and Spinka10, Lemma 6.4]).

Let us also remark on the situation for q-colorings of $\mathbb {Z}^{d}$ for $q \ge 4$ and $d \ge 2$ . For a fixed number of colors q, in high enough dimensions ( $d \ge Cq^{10}\log ^{2} q$ suffices), the situation is similar to that of $3$ -colorings in that the measures obtained from fixed-color boundary conditions are not translation-invariant, but they are invariant to parity-preserving automorphisms [Reference Peled and Spinka26]. On the other hand, in any given dimension, when the number of colors is large ( $q>4d$ suffices), there is a unique Gibbs measure, the finite-volume measures (with any boundary conditions) converge to this measure, and this measure is translation-invariant and satisfies strong spatial mixing (this all follows from Dobrushin’s uniqueness condition; see, for example, [Reference Peled and Spinka27]). Consequently, this measure is Bernoulli, and in fact, also a finitary factor of an i.i.d. process [Reference Spinka33]. In two dimensions, it is known that q-colorings satisfy strong spatial mixing for any $q \ge 6$ [Reference Achlioptas, Molloy, Moore and Van Bussel1, Reference Goldberg, Jalsenius, Martin and Paterson12], and hence, in this case, the unique Gibbs measure is also a finitary factor of an i.i.d. process [Reference Spinka33]. The strongest mixing properties that hold for four and five colors in two dimensions are still unknown. It would be interesting to determine whether the measure on 3-colorings of $\mathbb {Z}^{2}$ studied here is also a finitary factor of an i.i.d. process. This is a particular instance of a question raised by Steif [Reference Boyle3, Question 18.2]: if a subshift of finite type (in $\mathbb {Z}^{d}$ , $d \ge 2$ ) has a unique measure of maximal entropy which is Bernoulli, must it be a finitary factor of an i.i.d. process?

Proper q-colorings may be viewed as the zero-temperature antiferromagnetic q-state Potts model. The two-state Potts model is known as the Ising model. Ornstein and Weiss [Reference Ornstein and Weiss24] (see also [Reference Adams2]) showed that the plus state of the ferromagnetic Ising model on $\mathbb {Z}^{d}$ at any positive temperature is Bernoulli. Häggström, Jonasson and Lyons [Reference Häggström, Jonasson and Lyons14] extended this to the ferromagnetic Potts model for any $q \ge 2$ . Our result shows this for (a certain Gibbs measure of) the zero-temperature antiferromagnetic three-state Potts model. It is natural to expect that this extends to positive temperature. However, as our proof relies crucially on the height function representation for 3-colorings, which does not extend to positive temperature, we are unable to answer this question.

2 Locally mixing measures

Several conditions related to Bernoullicity have been introduced in the literature, among which are weak Bernoulli, very weak Bernoulli, quite weak Bernoulli and Følner independence. In this section we introduce a new notion, which we call local mixing, and present some general results about it. The results in this section apply to random fields on $\mathbb {Z}^{d}$ in any dimension d. All measures here are probability measures.

Before providing the definition of local mixing, let us give some motivation. Suppose $\mu $ is the law of a random field on $\mathbb {Z}^{d}$ . Fix a vertex v and suppose we want to compare the conditional laws of $\mu $ on v under two different boundary conditions $\xi $ and $\xi ^{\prime }$ outside a big box containing v. For the measures of interest in this paper, the conditional laws could be very far apart in the worst case. However, for the purposes of showing Bernoullicity, it is enough that the conditional laws are close for typical $\xi $ and $\xi ^{\prime }$ . Actually, we also want to make sure that there is a coupling between the conditional measures in the entire box which gives a low probability of disagreement at any particular vertex (depending on its distance to the boundary). Let us mention that we will eventually prove in §4.2 (see also §3.3) that the measure $\mu $ of Theorem 1.1 is locally mixing by a multi-step Markovian exploration process.

It will be useful for us to define the notion of local mixing for a family of measures, rather than just for a single measure. A rate function is any decreasing function $\rho \colon \mathbb {N} \to [0,\infty )$ such that $\rho (k) \to 0$ as $k \to \infty $ . Let $\Lambda _{n}$ denote the box $[-n,n]^{d} \cap \mathbb {Z}^{d}$ .

Definition 2.1. (Locally mixing)

Let $\mathcal {A}$ be finite and let $\mathcal {M}$ be a collection of pairs $(\mu ,D)$ such that $D \subset \mathbb {Z}^{d}$ and $\mu $ is a measure on $\mathcal {A}^{D}$ . We say that $\mathcal {M}$ is locally mixing with rate function $\rho $ if the following assertion holds. Let $n \ge 1$ and let $(\mu ,D),(\mu ^{\prime },D^{\prime }) \in \mathcal {M}$ be such that $\Lambda _{n} \subset D \cap D^{\prime }$ . Then there exists a coupling of $f \sim \mu $ and $f^{\prime } \sim \mu ^{\prime }$ such that:

  (1) $f_{|D \setminus \Lambda _{n}}$ and $f^{\prime }$ are independent;

  (2) $\mathbb {P}(f(v) \neq f^{\prime }(v)) \le \rho (n-k)$ for any $0 \le k \le n$ and $v \in \Lambda _{k}$ .

We say that $\mathcal {M}$ is locally mixing if it is locally mixing with some rate function. We stress that there is no restriction on the sets D, and, in particular, some of them may be infinite. Clearly, if a family $\mathcal {M}$ is locally mixing, then so is any subset of it (with the same rate function).

The notion of local mixing also makes sense for a single measure. We say that a measure $\mu $ on $\mathcal {A}^{\mathbb {Z}^{d}}$ is locally mixing if $\mathcal {M}=\{(\mu ,\mathbb {Z}^{d})\}$ is. The following simple proposition shows that a locally mixing family gives rise to a unique limiting measure, which is itself locally mixing.

Proposition 2.2. Let $\mathcal {M}=\{(\mu _{i},D_{i})\}_{i=1}^{\infty }$ be locally mixing with $D_{i} \uparrow \mathbb {Z}^{d}$ as $i \to \infty $ . Then $\mu _{i}$ converges as $i \to \infty $ to a measure $\mu $ on $\mathcal {A}^{\mathbb {Z}^{d}}$ which is locally mixing with the same rate function.

Proof. Suppose that $\mathcal {M}$ is locally mixing with rate function $\rho $ . To establish the convergence, it suffices to show that, for any $i,j,k,n$ such that $\Lambda _{k} \subset \Lambda _{n} \subset D_{i} \cap D_{j}$ ,

$$ \begin{align*} \operatorname{\mathrm{dist}}_{\mathrm{TV}}((\mu_{i})_{|\Lambda_{k}}, (\mu_{j})_{|\Lambda_{k}}) \le |\Lambda_{k}| \cdot \rho(n-k).\end{align*} $$

Indeed, sampling $f \sim \mu _{i}$ and $f^{\prime } \sim \mu _{j}$ from a coupling as in the definition of local mixing, the inequality follows by a union bound. Let $\mu $ be the limiting measure. It is straightforward from the convergence that $\mathcal {M} \cup \{(\mu ,\mathbb {Z}^{d})\}$ is locally mixing. In particular, $\mu $ is locally mixing.
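For completeness, the union bound invoked in the proof of Proposition 2.2 can be spelled out as follows. This is a routine expansion, using the standard fact that the total variation distance between two laws is at most the probability that samples from any coupling of them disagree, followed by property (2) of Definition 2.1:

$$ \begin{align*} \operatorname{\mathrm{dist}}_{\mathrm{TV}}((\mu_{i})_{|\Lambda_{k}}, (\mu_{j})_{|\Lambda_{k}}) &\le \mathbb{P}(f_{|\Lambda_{k}} \neq f^{\prime}_{|\Lambda_{k}}) \\ &\le \sum_{v \in \Lambda_{k}} \mathbb{P}(f(v) \neq f^{\prime}(v)) \\ &\le |\Lambda_{k}| \cdot \rho(n-k).\end{align*} $$

For fixed k, the right-hand side tends to 0 as $n \to \infty $, and since $D_{i} \uparrow \mathbb {Z}^{d}$, every $\Lambda _{n}$ is contained in $D_{i} \cap D_{j}$ for all sufficiently large i and j. Hence the restrictions $(\mu _{i})_{|\Lambda _{k}}$ form a Cauchy sequence in total variation.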

A key property of local mixing is that it implies two other mixing properties (for translation-invariant measures): full tail triviality (which implies strong mixing and ergodicity) and Bernoullicity. The former (which we do not define here) easily follows from the definition of local mixing. The latter is stated in the following proposition.

Proposition 2.3. Any translation-invariant locally mixing measure on $\mathcal {A}^{\mathbb {Z}^{d}}$ is Bernoulli.

Our strategy for proving Proposition 2.3 is to verify a classic condition called very weak Bernoulli, which is known to be equivalent to Bernoulli. In fact, we verify a stronger condition called Følner independence [Reference Adams2], which we now proceed to define.

We begin by defining the so-called $\bar d$-distance between two measures $\nu $ and $\lambda $ on $\mathcal {A}^{V}$, with $\mathcal {A}$ and V finite. Roughly speaking, the $\bar d$-distance is small if one can couple samples of $\nu $ and $\lambda $ so that they tend to agree on most elements of V. To be precise, the $\bar d$-distance between $\lambda $ and $\nu $ is

$$ \begin{align*} \bar d(\nu, \lambda) := \inf_{\substack{(X,Y)\\X\sim \nu, Y \sim \lambda}} \bigg \{ \frac1{|V|} \sum_{v \in V} \mathbb{P}({X(v) \neq Y(v)}) \bigg \}, \end{align*} $$

where the infimum is taken over couplings of random variables X and Y with distributions $\nu $ and $\lambda $, respectively.
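The quantity inside the infimum can be evaluated directly for any explicit coupling, which always gives an upper bound on the $\bar d$-distance. The following minimal sketch does this for measures on $\mathcal {A}^{V}$ with $|V|=2$, represented as dictionaries from tuples to exact probabilities. All names are hypothetical, and the code does not compute the infimum itself (that would require solving a linear program over couplings).

```python
from fractions import Fraction

def independent_coupling(nu, lam):
    """The product coupling of nu and lam."""
    return {(x, y): p * q for x, p in nu.items() for y, q in lam.items()}

def has_marginals(coupling, nu, lam):
    """Check that the coupling has first marginal nu and second marginal lam."""
    first, second = {}, {}
    for (x, y), p in coupling.items():
        first[x] = first.get(x, 0) + p
        second[y] = second.get(y, 0) + p
    return first == nu and second == lam

def average_disagreement(coupling):
    """(1/|V|) * sum over v of P(X(v) != Y(v)) under the given coupling."""
    total = Fraction(0)
    for (x, y), p in coupling.items():
        total += p * Fraction(sum(a != b for a, b in zip(x, y)), len(x))
    return total

half = Fraction(1, 2)
nu = {(0, 0): half, (1, 1): half}
lam = {(0, 1): half, (1, 0): half}

# Every configuration of nu differs from every configuration of lam in
# exactly one of the two coordinates, so any coupling has cost 1/2.
print(average_disagreement(independent_coupling(nu, lam)))

# Coupling nu with itself: the product coupling is wasteful, while the
# identity coupling shows that the d-bar distance is 0.
identity = {(x, x): p for x, p in nu.items()}
print(average_disagreement(independent_coupling(nu, nu)),
      average_disagreement(identity))
```

In the first example the two measures have disjoint supports, so their total variation distance is 1, while their $\bar d$-distance is only $1/2$; this illustrates that $\bar d$ measures disagreement per site rather than disagreement of whole configurations.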

Consider now a translation-invariant measure $\mu $ on $\mathcal {A}^{\mathbb {Z}^{d}}$ . Loosely speaking, $\mu $ is Følner independent if, for most conditionings outside a large box, the conditional measure inside the box is close in $\bar d$ -distance to the unconditional measure inside the box. Given a set $U \subset \mathbb {Z}^{d}$ , we denote by $\mu |_{U}$ the restriction of $\mu $ to U. Given a finite set $B \subset \mathbb {Z}^{d}$ and a feasible $\xi \in \mathcal {A}^{B}$ (by ‘feasible’ we mean that $\mu (\xi )>0$ ), we denote by $\mu |^{\xi }_{U}$ the restriction of $\mu $ to U when conditioned on $\xi $ . Here and throughout the paper, we sometimes identify a configuration $\xi \in \mathcal {A}^{B}$ with the event $\{ f \in \mathcal {A}^{\mathbb {Z}^{d}} : f_{|B}=\xi \}$ .

Definition 2.4. (Følner independence)

A translation-invariant measure $\mu $ on $\mathcal {A}^{\mathbb {Z}^{d}}$ is Følner independent if for all $\varepsilon>0$ there exists N such that for any $n \ge N$ and finite $S \subset \mathbb {Z}^{d} \setminus \Lambda _{n}$ ,

$$ \begin{align} \bar d (\mu|^{\xi}_{\Lambda_{n}}, \mu|_{{\Lambda_{n}}}) <\varepsilon \tag{2.1} \end{align} $$

for all feasible $\xi \in \mathcal {A}^{S}$ , except for a set of $\mu $ -measure at most $\varepsilon $ .

We mention that the very weak Bernoulli condition is defined similarly, with the only difference being that S is required to be a subset of a certain ‘lexicographical past’ of $\Lambda _{n}$ . We do not give the precise definition here, but content ourselves with the fact that, by its definition, it is weaker than Følner independence (for Markov random fields, the two are in fact equivalent [Reference den Hollander and Steif7]). We rely on the following important theorem, which is due to Ornstein [Reference Ornstein22] and Ornstein and Weiss [Reference Ornstein and Weiss23] in the one-dimensional case. For the general case, we refer to [Reference Conze5, Reference Kammeyer17, Reference Katznelson and Weiss18, Reference Thouvenot34].

Theorem 2.5. [Reference Conze5, Reference Kammeyer17, Reference Katznelson and Weiss18, Reference Thouvenot34]

A translation-invariant ergodic measure on $\mathcal {A}^{\mathbb {Z}^{d}}$ is Bernoulli if and only if it is very weak Bernoulli. In particular, if it is Følner independent, then it is Bernoulli.

Thus, Proposition 2.3 will follow once we establish the simple fact that local mixing implies Følner independence. In fact, as it turns out, the two are actually equivalent. As this latter statement requires more work to establish and as it is not our main concern, we postpone its proof to the end of the paper (§5).

Proposition 2.6. A translation-invariant measure on $\mathcal {A}^{\mathbb {Z}^{d}}$ is locally mixing if and only if it is Følner independent.

Proof of first half of Proposition 2.6 (local mixing implies Følner independence)

Let $\mu $ be a translation-invariant locally mixing measure on $\mathcal {A}^{\mathbb {Z}^{d}}$ . We show that for any $n \ge m \ge 1$ and finite $S \subset \mathbb {Z}^{d} \setminus \Lambda _{n}$ , the set $\mathcal {G} \subset \mathcal {A}^{S}$ of feasible configurations $\xi $ such that

$$ \begin{align} \bar d (\mu|^{\xi}_{\Lambda_{n}}, \mu|_{{\Lambda_{n}}}) \le \varepsilon := \sqrt{\rho(n-m) + \frac{|\Lambda_{n} \setminus \Lambda_{m}|}{|\Lambda_{n}|}} \tag{2.2} \end{align} $$

satisfies $\mu (\mathcal {G}) \ge 1-\varepsilon $. Since $\varepsilon $ can be made arbitrarily small by taking n large enough and $m=n-\lfloor \sqrt {n} \rfloor $, this will establish that $\mu $ is Følner independent. Let $f,f^{\prime } \sim \mu $ be sampled from a coupling as in the definition of local mixing. Let X denote the fraction of vertices $v \in \Lambda _{n}$ such that $f(v) \neq f^{\prime }(v)$. Bounding the probability of disagreement by $\rho (n-m)$ for $v \in \Lambda _{m}$ and by 1 for $v \in \Lambda _{n} \setminus \Lambda _{m}$, we obtain

$$ \begin{align*} \mathbb{E} X = \frac{1}{|\Lambda_{n}|} \sum_{v \in \Lambda_{n}} \mathbb{P}(f(v) \neq f^{\prime}(v)) \le \rho(n-m) + \frac{|\Lambda_{n} \setminus \Lambda_{m}|}{|\Lambda_{n}|} = \varepsilon^{2}.\end{align*} $$

Define $Y := \mathbb {E}[X \mid f_{|S}]$ and note that $\mathbb {E} Y = \mathbb {E} X \le \varepsilon ^{2}$ . Thus, Markov’s inequality yields that $\mathbb {P}(Y \ge \varepsilon ) \le \varepsilon $ . Finally, since $f_{|\mathbb {Z}^{d} \setminus \Lambda _{n}}$ and $f^{\prime }$ are independent, the conditional distribution of $f^{\prime }_{|\Lambda _{n}}$ given $f_{|S}$ is $\mu |_{\Lambda _{n}}$ . It follows that the event $\{Y<\varepsilon \}$ is contained in the event $\{f_{|S} \in \mathcal {G}\}$ , and hence, that $\mu (\mathcal {G}) \ge 1-\varepsilon $ .
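The claim that $\varepsilon $ in (2.2) can be made small is easy to check numerically. The sketch below evaluates $\varepsilon $ for $d=2$ with $m = n - \lfloor \sqrt {n} \rfloor $, using a hypothetical power-law rate $\rho (k) = k^{-1/2}$ chosen purely for illustration (the argument itself only needs $\rho (k) \to 0$).

```python
import math

def box_size(n, d=2):
    """Number of vertices in Lambda_n = [-n, n]^d."""
    return (2 * n + 1) ** d

def power_law(k, C=1.0, alpha=0.5):
    """Illustrative rate function rho(k) = C * k^(-alpha); C, alpha are assumptions."""
    return C * k ** (-alpha)

def epsilon(n, rho, d=2):
    """The quantity epsilon from (2.2) with m = n - floor(sqrt(n))."""
    m = n - math.isqrt(n)
    shell_fraction = 1 - box_size(m, d) / box_size(n, d)  # |Lambda_n \ Lambda_m| / |Lambda_n|
    return math.sqrt(rho(n - m) + shell_fraction)

for n in (10 ** 2, 10 ** 4, 10 ** 6):
    print(n, epsilon(n, power_law))
```

Both terms under the square root vanish as $n \to \infty $: the first because $n-m \to \infty $, the second because the shell $\Lambda _{n} \setminus \Lambda _{m}$ has width of order $\sqrt {n}$, which is negligible compared to n.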

3 Proof outline

Given the results about locally mixing measures discussed in §2, our main result on the Bernoullicity of the 3-coloring measure will follow by showing that the appropriate family of 3-coloring measures is locally mixing.

We extend the definition of $\mu ^{01}_{D}$ to allow for any two fixed boundary colors. Specifically, for distinct $i,j \in \{0,1,2\}$ , let $\mu ^{ij}_{D}$ be the uniform measure on the set of all 3-colorings of D whose values on $\partial D$ are fixed to be i and j on even and odd vertices, respectively. A subset D of $\mathbb {Z}^{2}$ is simply connected if it is connected and its complement $\mathbb {Z}^{2} \setminus D$ is connected. A rate function $\rho $ is a power-law rate function if it satisfies $\rho (n) \le Cn^{-\alpha }$ for some $C,\alpha>0$ and all $n \ge 1$ .

Theorem 3.1. Let $\mathcal {M}$ be the family of all pairs $(\mu ^{ij}_{D},D)$ with $i,j \in \{0,1,2\}$ distinct and $D \subset \mathbb {Z}^{2}$ finite and simply connected. Then $\mathcal {M}$ is locally mixing with a power-law rate.

We remark that a power-law rate is best possible; see Corollary 4.6. The proof of Theorem 3.1 also shows that the larger family consisting of all pairs $(\mu ^{\xi }_{D},D)$ with D finite and simply connected and with $\xi $ having bounded oscillations (in terms of the associated height function) is locally mixing with a power-law rate; see Remark 4.5.

Together with Proposition 2.2, the theorem immediately implies that $\mu ^{ij}_{D}$ converges to a measure $\mu $ on 3-colorings of $\mathbb {Z}^{2}$ as D increases to $\mathbb {Z}^{2}$ along simply connected finite subsets. Furthermore, since the limit does not depend on i and j (as $\mathcal {M}$ contains pairs with every choice of i and j), it follows that $\mu $ is invariant to permutations of the colors. It is also straightforward that $\mu $ is invariant to any automorphism T of $\mathbb {Z}^{2}$ , since $\mu ^{01}_{D} \circ T^{-1}$ is either $\mu ^{01}_{T(D)}$ or $\mu ^{10}_{T(D)}$ , both of which belong to $\mathcal {M}$ .

Proof of Theorem 1.1

Theorem 3.1 and Proposition 2.2 imply that $\mu $ is locally mixing, and Proposition 2.3 implies that $\mu $ is Bernoulli.

The rest of the paper is mostly focused on proving Theorem 3.1. Below we introduce the height function representation for 3-colorings and give preliminary results for it in §3.2. We then proceed to describe the proof strategy in §3.3. The full proof is given in §4.

3.1 The height function

It is well known that $3$ -colorings of $\mathbb {Z}^{2}$ can be represented as homomorphisms from $\mathbb {Z}^{2}$ to $\mathbb {Z}$ . Recall that a homomorphism $\varphi $ from $D \subseteq \mathbb {Z}^{2}$ to $\mathbb {Z}$ is a map such that $|\varphi (x)-\varphi (y)|=1$ whenever $x,y \in D$ are adjacent. We also refer to such homomorphisms as height functions. We always work with height functions which are even on the even sublattice of $\mathbb {Z}^{2}$ .

Suppose that D is simply connected. It can be checked that the mapping $h \mapsto h \bmod 3$ is a bijection from the space of height functions whose value is fixed on some vertex of D to the space of 3-colorings of D whose color is fixed on that same vertex. The inverse mapping can be defined as follows. If $\mathsf c$ is a $3$-coloring, then the gradient of h along the directed edge $e=(u,v)$ is given by $\nabla h (e) =+1$ if $\mathsf c(v) - \mathsf c(u) \in \{1,-2\}$ and $\nabla h (e) =-1$ if $\mathsf c(v) - \mathsf c(u) \in \{-1,2\}$. In particular, given a $3$-coloring of $\mathbb {Z}^{2}$, this mapping defines the gradient of a height function h on $\mathbb {Z}^{2}$. In general, height functions will come with a predefined set of boundary conditions (a precise definition will follow) which will always determine the function from its gradient.

Suppose that D is finite and simply connected. Let $\phi _{D}^{01}$ denote the uniform measure on height functions on D whose values on the boundary $\partial D$ are fixed to be $0$ and 1 on even and odd vertices, respectively. Then $\phi _{D}^{01}$ is pushed forward to $\mu ^{01}_{D}$ by the modulo 3 map.
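The correspondence between height functions and 3-colorings described above can be sketched in code. The example below reduces a height function modulo 3 and then recovers it by integrating the gradient rule outward from a base vertex by breadth-first search. The function names and the test height function $h(x,y)=|x|+|y|$ are illustrative assumptions; the sketch presumes that D is connected and simply connected, so that the result does not depend on the order of exploration.

```python
from collections import deque

def neighbors(v):
    """Nearest neighbors of v in Z^2."""
    x, y = v
    return [(x + 1, y), (x - 1, y), (x, y + 1), (x, y - 1)]

def coloring_from_height(h):
    """The modulo 3 map from height functions to 3-colorings."""
    return {v: value % 3 for v, value in h.items()}

def height_from_coloring(coloring, base, base_height):
    """Integrate the gradient rule from `base`, where the height is prescribed."""
    if base_height % 3 != coloring[base]:
        raise ValueError("base height must reduce to the base color modulo 3")
    h = {base: base_height}
    queue = deque([base])
    while queue:
        u = queue.popleft()
        for v in neighbors(u):
            if v in coloring and v not in h:
                # c(v) - c(u) in {1, -2} gives +1; in {-1, 2} gives -1.
                step = 1 if (coloring[v] - coloring[u]) % 3 == 1 else -1
                h[v] = h[u] + step
                queue.append(v)
    return h

# A sample height function on the box [-3, 3]^2: adjacent values differ by
# exactly 1, and values are even on the even sublattice.
D = [(x, y) for x in range(-3, 4) for y in range(-3, 4)]
h0 = {(x, y): abs(x) + abs(y) for (x, y) in D}
c = coloring_from_height(h0)
h1 = height_from_coloring(c, base=(0, 0), base_height=0)
print(h1 == h0)
```

Fixing the height at a single vertex is what removes the ambiguity: adding a multiple of 3 to a height function does not change the coloring, so without a base value the inverse map would be determined only up to such shifts.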

3.2 Preliminary results

We begin with some notation. For any m and n, denote $\Lambda _{m,n} = ([-m,m] \times [-n,n]) \cap \mathbb {Z}^{2}$, and recall that $\Lambda _{n}=\Lambda _{n,n}$. For $n \ge m$, let $A_{m,n}$ denote the annulus $\Lambda _{n} \setminus \Lambda _{m}$. A path in $\mathbb {Z}^{2}$ is a sequence of vertices $(v_{1},v_{2},\ldots , v_{n})$ such that $v_{i}$ is adjacent to $v_{i+1}$ for $i=1,\ldots , n-1$. A path is a loop if $v_{i} \neq v_{j}$ for all $1\le i \neq j \le n-1$ and $v_{1} = v_{n}$. We also require a notion of diagonal connectivity in $\mathbb {Z}^{2}$: say that u is a $\times $-neighbor of v if u and v are at Euclidean distance $\sqrt {2}$ (that is, they are diagonal neighbors). We define a $\times $-path ($\times $-loop) in a similar way, replacing the $\mathbb {Z}^{2}$ adjacency by $\times $-adjacency. Note that a $\times $-path consists of vertices of a single parity, so that we may talk about even and odd $\times $-paths. A domain is a simply connected finite subset D of $\mathbb {Z}^{2}$ whose boundary $\partial D$ is entirely contained in the even lattice. When D is a domain, we write $\mu ^{0}_{D}$ and $\phi ^{0}_{D}$ for $\mu ^{01}_{D}$ and $\phi ^{01}_{D}$, respectively.
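The notation above translates directly into small helper functions, sketched here with hypothetical names. The example checks that a diagonal loop around the origin is a $\times $-loop of a single parity but not a nearest-neighbor path.

```python
def box(m, n):
    """Lambda_{m,n} = ([-m, m] x [-n, n]) intersected with Z^2."""
    return {(x, y) for x in range(-m, m + 1) for y in range(-n, n + 1)}

def annulus(m, n):
    """A_{m,n} = Lambda_n minus Lambda_m, for n >= m."""
    return box(n, n) - box(m, m)

def is_adjacent(u, v):
    """Nearest-neighbor adjacency in Z^2."""
    return abs(u[0] - v[0]) + abs(u[1] - v[1]) == 1

def is_x_adjacent(u, v):
    """Diagonal adjacency: Euclidean distance sqrt(2)."""
    return abs(u[0] - v[0]) == 1 and abs(u[1] - v[1]) == 1

def is_path(vertices, adjacent=is_adjacent):
    """Consecutive vertices are adjacent with respect to `adjacent`."""
    return all(adjacent(a, b) for a, b in zip(vertices, vertices[1:]))

def is_loop(vertices, adjacent=is_adjacent):
    """A path that returns to its start and is otherwise self-avoiding."""
    return (is_path(vertices, adjacent)
            and vertices[0] == vertices[-1]
            and len(set(vertices[:-1])) == len(vertices) - 1)

def parity(v):
    """0 for even vertices, 1 for odd vertices."""
    return (v[0] + v[1]) % 2

x_loop = [(1, 0), (0, 1), (-1, 0), (0, -1), (1, 0)]
print(is_loop(x_loop, is_x_adjacent), {parity(v) for v in x_loop})
```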

A key tool we need is the recently established Russo–Seymour–Welsh estimate for height functions [Reference Duminil-Copin, Harel, Laslier, Raoufi and Ray9] (a classical estimate in planar percolation-type models), which was used in [Reference Duminil-Copin, Harel, Laslier, Raoufi and Ray9] to show logarithmic variance for the height function. Let $\mathcal {H}_{h\ge k}(\Lambda _{\rho n,n})$ (respectively, $\mathcal {H}_{h=k}^{\times }(\Lambda _{\rho n,n})$ ) be the event that there is a path (respectively, $\times $ -path) with height at least k (respectively, equal to k) joining the left and right boundaries of $\Lambda _{\rho n,n}$ and lying completely inside $\Lambda _{\rho n,n}$ .

Theorem 3.2. (Russo–Seymour–Welsh estimate [Reference Duminil-Copin, Harel, Laslier, Raoufi and Ray9])

For every $\varepsilon ,R,\rho ,k>0$, there exists $c=c(\varepsilon ,R,\rho ,k)>0$ such that for any $n \ge 10k/(\varepsilon \wedge \rho )$ and any domain D with $\Lambda _{\rho n,n} \subset D\subset \Lambda _{Rn}$ such that the distance between $\Lambda _{\rho n,n}$ and $\partial D$ is at least $\varepsilon n$,

$$ \begin{align} c \le\ &\phi_{D}^{0} [\mathcal{H}_{h\ge k}(\Lambda_{\rho n,n})] \le 1- c, \tag{3.1}\\ c \le\ &\phi_{D}^{0} [\mathcal{H}_{h=k}^{\times}(\Lambda_{\rho n,n})] \le 1-c. \tag{3.2} \end{align} $$

There are two key tools used in proving the above theorem, versions of which were also classically used to study planar percolation and random-cluster models [Reference Duminil-Copin8]. The first one is duality of paths, which roughly states that a left-to-right crossing of height $h \ge m$ is blocked by a top-to-bottom crossing of height $h<m$ . However, one needs to be careful with the type of connectivity used, and we will carefully point this out at the relevant place in the proof (rather than stating a general duality lemma, such as [Reference Duminil-Copin, Harel, Laslier, Raoufi and Ray9, Lemma 2.4], which we do not need).

The second key tool is a monotonicity property for the height function, classically known as the Fortuin–Kasteleyn–Ginibre (FKG) inequality and lattice condition. To properly state this, we introduce a general notion of boundary condition. Given a domain D, a boundary condition is a pair $(B,\kappa )$ with $B \subset D$ and $\kappa $ a function that assigns a subset $\kappa _{v} \subset \mathbb {Z}$ to each $v \in B$. Let $\text {Hom}( D,B,\kappa )$ denote the set of homomorphisms h on D such that $h_{v}\in \kappa _{v}$ for every $v\in B$. We say that the boundary condition $(B,\kappa )$ is admissible if $\text {Hom}( D,B,\kappa )$ is non-empty and finite. For an admissible boundary condition $(B,\kappa )$, we let $\phi ^{B, \kappa }_{D}$ denote the uniform measure on $\text {Hom}(D,B,\kappa )$. When $B=\partial D$, we omit B from the notation. A function $F\colon \mathbb {Z}^{D} \to \mathbb {R}$ is increasing if $F(h) \ge F(h^{\prime })$ for any $h, h^{\prime } \in \mathbb {Z}^{D}$ satisfying $h_{v} \ge h^{\prime }_{v}$ for all $v \in D$.

Proposition 3.3. (Monotonicity for h [Reference Duminil-Copin, Harel, Laslier, Raoufi and Ray9])

Consider a domain D and two admissible boundary conditions $(B,\kappa )$ and $(B,\kappa ^{\prime })$ satisfying that for every $v\in B$ , $\kappa _{v}=[a_{v},b_{v}]$ and $\kappa ^{\prime }_{v}=[a^{\prime }_{v},b^{\prime }_{v}]$ with $a_{v}\le a^{\prime }_{v}$ and $b_{v}\le b^{\prime }_{v}$ (the previous integers may be equal to $\pm \infty $ ). Then:

  • for every increasing function F, $\phi ^{B, \kappa ^{\prime }}_{D}[F(h)]\ge \phi ^{B, \kappa }_{D}[F(h)]$ ;

  • for any two increasing functions $F,G$ , $\phi ^ {B, \kappa }_{D}[F(h)G(h)] \ge \phi ^ {B, \kappa }_{D}[F(h)] \phi ^ {B, \kappa }_{D} [G(h)]$ .
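Both statements of Proposition 3.3 can be checked by exhaustive enumeration on a tiny domain. The sketch below uses the diamond $\{(x,y) : |x|+|y| \le 2\}$, whose internal boundary lies in the even lattice, with boundary values fixed to 0 or to 2. The helper names (`diamond`, `homomorphisms`, `expect`) are hypothetical, the enumeration assumes that D is connected and that each $\kappa _{v}$ is a finite set, and a check on one small example is of course only a sanity test, not a proof.

```python
from fractions import Fraction

def neighbors(v):
    """Nearest neighbors of v in Z^2."""
    x, y = v
    return [(x + 1, y), (x - 1, y), (x, y + 1), (x, y - 1)]

def diamond(r):
    """Vertices (x, y) with |x| + |y| <= r."""
    return [(x, y) for x in range(-r, r + 1) for y in range(-r, r + 1)
            if abs(x) + abs(y) <= r]

def internal_boundary(D):
    """Vertices of D that have a neighbor outside D."""
    D_set = set(D)
    return [v for v in D if any(u not in D_set for u in neighbors(v))]

def homomorphisms(D, kappa):
    """Enumerate Hom(D, B, kappa), where B is the set of keys of kappa."""
    D_set = set(D)
    root = next(iter(kappa))
    order, seen = [root], {root}
    for u in order:  # breadth-first order; the list grows while we iterate
        for v in neighbors(u):
            if v in D_set and v not in seen:
                seen.add(v)
                order.append(v)
    results = []

    def extend(i, h):
        if i == len(order):
            results.append(dict(h))
            return
        v = order[i]
        assigned = [h[u] for u in neighbors(v) if u in h]
        # Only the root has no assigned neighbor; it takes its values from kappa.
        candidates = kappa[v] if not assigned else {assigned[0] - 1, assigned[0] + 1}
        for value in candidates:
            if v in kappa and value not in kappa[v]:
                continue
            if any(abs(value - a) != 1 for a in assigned):
                continue
            h[v] = value
            extend(i + 1, h)
            del h[v]

    extend(0, {})
    return results

def expect(F, homs):
    """Expectation of F under the uniform measure on `homs`."""
    return Fraction(sum(F(h) for h in homs), len(homs))

D = diamond(2)
B = internal_boundary(D)
low = homomorphisms(D, {v: {0} for v in B})   # boundary condition kappa
high = homomorphisms(D, {v: {2} for v in B})  # boundary condition kappa'

F = lambda h: h[(0, 0)]  # increasing
G = lambda h: h[(1, 0)]  # increasing

# Comparison between boundary conditions.
print(expect(F, low), expect(F, high))
# FKG inequality under the lower boundary condition.
print(expect(lambda h: F(h) * G(h), low), expect(F, low) * expect(G, low))
```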

The first property is called the comparison between boundary conditions and the second the FKG inequality. We also crucially use monotonicity properties of $|h|$ in addition to those of h. We say that the boundary condition $(B,\kappa )$ is $|h|$ -adapted if there exists a partition $B_{\text {pos}}(\kappa ) \sqcup B_{\text {sym}}(\kappa )$ of B such that:

  • for any $v \in B_{\text {pos}}(\kappa )$ , $\kappa _{v}\subset \mathbb {Z}_{+}:=\{1,2,\ldots \}$ ;

  • for any $w \in B_{\text {sym}}(\kappa )$ , $\kappa _{w} =-\kappa _{w}$ .

For example, if $w \in B_{\text {sym}}(\kappa )$ , $\kappa _{w}$ could be $\{-3,-2,2,3\}$ .

Proposition 3.4. (Monotonicity for $|h|$ [Reference Duminil-Copin, Harel, Laslier, Raoufi and Ray9])

Consider a domain D and two admissible $|h|$ -adapted boundary conditions $(B,\kappa )$ and $(B,\kappa ^{\prime })$ satisfying $B_{\text {pos}}(\kappa ) \subseteq B_{\text {pos}}(\kappa ^{\prime })$ and for every $v\in B$ , $\kappa _{v}\cap (\mathbb {Z}_{+} \cup \{0\})=[a_{v},b_{v}]$ and $\kappa ^{\prime }_{v}\cap (\mathbb {Z}_{+} \cup \{0\})=[a^{\prime }_{v},b^{\prime }_{v}]$ with $a_{v}\le a^{\prime }_{v}$ and $b_{v}\le b^{\prime }_{v}$ . Then:

  • for every increasing function F, $\phi ^{B, \kappa ^{\prime }}_{D}[F(|h|)]\ge \phi ^{B, \kappa }_{D}[F(|h|)]$ ;

  • for any two increasing functions $F,G$ , $\phi ^ {B, \kappa }_{D}[F(|h|)G(|h|)] \ge \phi ^ {B, \kappa }_{D}[F(|h|)] \phi ^ {B, \kappa }_{D} [G(|h|)]$ .

It is these monotonicity properties that make working with the height function representation beneficial for us.

3.3 Proof strategy

Given a height function h, a level loop is a $\times $ -loop on which the height is constant. We claim that the Russo–Seymour–Welsh estimate guarantees that under $\phi ^{0}_{D}$ , for a domain D containing $\Lambda _{n}$ but not containing a much larger box, most points in $\Lambda _{n}$ are surrounded by a level loop of height 0 contained entirely in $\Lambda _{n}$ . Indeed, if we look at exponential scales starting from the boundary inwards, there is a good chance of finding many successive level loops with height increment $\pm 2$ . Using the symmetry of the height function, we can argue that the actual increment is $+2$ or $-2$ with equal chance, and hence the heights along these nested loops constitute a simple random walk. Since a simple random walk is recurrent, there is a high chance of hitting a loop of height zero, thereby yielding the claim.

Now suppose we have two measures on height functions on D, one with boundary condition $\xi $ and one with boundary condition 0 (here we think of $\xi $ as specifying only the absolute values on the boundary). By the FKG inequality for $|h|$ , we can couple samples $h^{\xi }$ and $h^{0}$ from these measures so that inside the outermost level loops of $0$ for $h^{\xi }$ , the two height functions agree. Indeed, the FKG inequality for $|h|$ implies that $|h^{\xi }|$ stochastically dominates $|h^{0}|$ , which means that we can couple $h^{\xi }$ and $h^{0}$ so that $|h^{\xi }|$ dominates $|h^{0}|$ pointwise. However, since 0 is the lowest absolute value, the outermost level loop of height 0 for $h^{\xi }$ must also be a level loop of height 0 for $h^{0}$ , and hence, by the domain Markov property, we can couple them inside to be the same. There are some technical issues with applying this argument as is, but these can be handled using standard exploration procedures.

However, combining the above two arguments is not enough, since when the boundary condition $\xi $ is very high in absolute value, there is a good chance that there is no outermost level loop of height 0, and indeed the height function delocalizes as $\Lambda _{n}$ becomes large [Reference Chandgotia, Peled, Sheffield and Tassy4, Reference Duminil-Copin, Harel, Laslier, Raoufi and Ray9]. Thus, we need to reduce to the case where the boundary condition $\xi $ can be taken to be not too large. For this, we show that with high probability in the Gibbs measure of the coloring, we can find a loop $\mathcal {L}$ of color 0 in the annulus $A_{n,\Delta n}$ for some large $\Delta $ . Now, by the domain Markov property, the law of the coloring inside the domain enclosed by $\mathcal {L}$ can be viewed as a uniform homomorphism height function with boundary conditions 0 on $\mathcal {L}$ . Therefore, with high probability, we have somewhat reduced to the desired case above (the boundary condition is on the loop rather than on the boundary of the box). Note that we need to switch back to the height function representation of the coloring to establish the coupling since we are crucially using the FKG inequality for the absolute value of the height function to that end.

By employing a two-step iteration of the above argument, one may establish the required coupling needed to verify the local mixing condition. However, this would not yield a power-law rate function. The reason for this stems from the fact that, with probability that is polynomial in k, a simple random walk does not return to 0 within k steps, whereas the number of steps k is with high probability only logarithmic in n, so that the former argument would yield a rate function $\rho $ decaying as $\rho (n) \le C(\log n)^{-\alpha }$ . To obtain a power-law rate, we apply the above argument iteratively, switching back and forth between the coloring and height function representations. Such a back-and-forth procedure seems essential for this (see Corollary 4.6). Roughly speaking, we first find the outermost monochromatic loop in the coloring. We then switch to the height function and start looking for a level loop of height 0, allowing ourselves to stop if we do not find such a loop after searching a constant number of scales. If we find such a loop quickly, then we are done. Otherwise, we give up on finding the level loop of height 0 in this iteration, and proceed to the next iteration by going back to the coloring in order to find the next outermost monochromatic loop. Repeating this procedure yields a coupling which establishes local mixing with a power-law rate.

4 Proof of Theorem 3.1

We begin with some results about the level lines of the height function in §4.1, and then proceed to give the proof of Theorem 3.1 in §4.2.

Policy on constants. In the rest of the paper, we employ the following policy on constants. We write $C,c,C^{\prime },c^{\prime }$ for positive absolute constants, whose values may change from line to line. Specifically, the values of $C,C^{\prime }$ may increase and the values of $c,c^{\prime }$ may decrease from line to line.

4.1 Level lines of the height function

In this section we prove some results about the level loops of the height function. Recall that a level loop of a height function is a $\times $ -loop on which the height is constant. We will mostly be focused on level loops in the even lattice, or equivalently, level loops on which the height is even.

Let D be a domain containing the origin and let h be a height function on D sampled from $\phi ^{0}_{D}$ . We inductively define a sequence of nested level loops $\mathcal {L}_{0},\ldots ,\mathcal {L}_{M}$ surrounding the origin as follows. First, we set $\mathcal {L}_{0}$ to be the boundary loop of the domain D. Next, we let $\mathcal {L}_{1}$ be the outermost level loop with $|h| = 2$ surrounding the origin. Now suppose that we have already defined the level loop $\mathcal {L}_{m}$ and that the height on it is $H_{m}\in 2\mathbb {Z}$ . Let $\mathcal {L}_{m+1}$ be the outermost level loop surrounding the origin in the domain enclosed by $\mathcal {L}_{m}$ and having height $H_{m+1}$ satisfying $|H_{m+1}-H_{m}| = 2$ . If no such loop exists, then we set $M=m$ and stop the inductive procedure. We say that the level loops $\mathcal {L}_{0},\ldots ,\mathcal {L}_{M}$ are essential, and note that between $\mathcal {L}_{m}$ and $\mathcal {L}_{m+1}$ there may be many non-essential level loops of height $H_{m}$ surrounding the origin.

Observe that, conditioned on the exploration up to $\mathcal {L}_{m}$ , the height difference $H_{m+1}-H_{m}$ is uniform in $\{-2,2\}$ . Indeed, given the exploration, the map $h \mapsto 2H_{m} - h$ inside the domain enclosed by $\mathcal {L}_{m}$ is an involution which inverts the sign of $H_{m+1}-H_{m}$ . Therefore, conditioned on the entire sequence of essential level loops $\mathcal {L}_{0},\ldots ,\mathcal {L}_{M}$ , the random variables $\{H_{m+1}-H_{m}\}_{m=0}^{M-1}$ are i.i.d. uniform $\pm 2$ . In particular, conditioned on M, the sequence $(\tfrac 12 H_{m})_{m=0}^{M}$ defines an M-step simple symmetric random walk starting from 0.

Our main technical result about the level lines of the height function is the following, which shows that, with high probability, the number of essential level loops in an annulus is linear in the logarithm of the aspect ratio of the outer and inner boundaries of the annulus. More specifically, we provide an upper bound on the number of such loops that intersect the annulus and a lower bound on the number of such loops that are entirely contained in the annulus.

Proposition 4.1. There exist constants $C,c>0$ such that the following assertion holds. Let $k \ge 200$ , let $a \ge 1$ and let D be a domain. Let $N=N(k,a)$ be the number of essential level loops contained in $A_{k,2^{a}k}$ and let $N^{\prime } = N^{\prime }(k,a) \ge N$ be the number of essential level loops intersecting $A_{k,2^{a}k}$ . Then

(4.1) $$ \begin{align} \phi_{D}^{0}(N^{\prime} \ge n) \le e^{-cn \log (\frac na \wedge k)} \quad\text{for any }n \ge Ca, \end{align} $$

and if D contains $\Lambda _{2^{a}k}$ then

(4.2) $$ \begin{align} \phi_{D}^{0}(N \le ca) \le Ce^{-ca}. \end{align} $$

We point out that there is no assumption on the domain in (4.1).

4.1.1 Corollaries to Proposition 4.1

We give four corollaries to Proposition 4.1 here. The first two will be used for the proof of Theorem 3.1, whereas the last two will not and simply provide additional information. The first two focus on level loops whose heights are multiples of 6. Such level loops correspond in the coloring representation to monochromatic loops of color 0 in the even lattice. This conforms with the convention that the height function is even on the even lattice and with the fact that the coloring is the height function modulo 3.

The first corollary shows that the origin is typically surrounded by a level loop of height 0 modulo 6 which lies inside an annulus of a fixed aspect ratio. This will later allow us to relate various boundary conditions which are arbitrarily far away from a given box to zero boundary conditions which are near the boundary of the box.

Corollary 4.2. There exist $C,\alpha>0$ such that the following assertion holds. Let $k \ge 1$ , let $\Delta>1$ and let D be a domain containing $\Lambda _{\Delta k}$ . Then

$$ \begin{align*} \phi^{0}_{D}(\Lambda_{k}\text{ is surrounded by a level loop of height }0\text{ modulo }6\text{ inside }\Lambda_{\Delta k}) \ge 1-C\Delta^{-\alpha}. \end{align*} $$

Moreover, the same bound holds under the measure $\phi ^{\kappa }_{D}$ for any finite simply connected set D containing $\Lambda _{\Delta k}$ and any admissible boundary condition $\kappa $ on $\partial D$ such that $\bigcup _{v \in \partial D} \kappa _{v}$ is contained in some interval of size $10$ .

Proof. Recall that, given M, the sequence $(\tfrac 12 H_{m})_{m=0}^{M}$ defines an M-step simple symmetric random walk. Let N be the number of essential level loops contained in $A_{k,\Delta k}$ . Since an n-step simple symmetric random walk (starting from anywhere) visits $3\mathbb {Z}$ with probability at least $1-Ce^{-cn}$ , it suffices to show that $\phi ^{0}_{D}(N \le c\log \Delta ) \le \Delta ^{-\alpha }$ . This in turn follows from the lower bound (4.2) on the number of essential level loops contained in an annulus.
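The random-walk estimate invoked here can be made fully explicit. The following computation is added as an illustration and is not taken from the source. Let $(S_{j})_{j \ge 0}$ be a simple symmetric random walk started at some $x \notin 3\mathbb {Z}$ . From a site congruent to 1 modulo 3, the walk avoids $3\mathbb {Z}$ only by stepping $+1$ , and from a site congruent to 2 modulo 3 only by stepping $-1$ . Every step of an avoiding path is therefore forced, so that

$$ \begin{align*} \mathbb{P}(S_{1},\ldots,S_{n} \notin 3\mathbb{Z}) = 2^{-n}. \end{align*} $$

In particular, the estimate holds with $C=1$ and $c=\log 2$ .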

Consider now the general case and suppose that $\kappa _{v} \subset [a+1,b-1]$ for all v and some $a,b \in 2\mathbb {Z}$ having $b-a=20$ . Let $\mathcal {D}$ be the largest domain (with respect to inclusion) contained in D and containing $\Lambda _{k}$ . It follows that every $v \in \partial \mathcal {D}$ is at distance at most $1$ from $\partial D$ . In particular, under $\phi ^{\kappa }_{D}$ , the height on $\partial \mathcal {D}$ must be between a and b everywhere. Thus, by the FKG inequality, we can couple $h^{-} \sim \phi ^{a}_{\mathcal {D}}$ , $h^{+} \sim \phi ^{b}_{\mathcal {D}}$ and $h \sim \phi ^{\kappa }_{D}$ so that $h^{-} \le h \le h^{+}$ and $h^{+}-h^{-}=20$ in $\mathcal {D}$ . (The existence of such a three-way coupling is not immediate from the stochastic ordering of $h^{-}$ , h, $h^{+}$ and the distribution equality $h^{+}=h^{-}+20$ , but follows easily using a Markov chain argument as in the proof of [Reference Grimmett13, Theorem 2.1].) In particular, the essential level loops $\mathcal {L}_{1},\ldots ,\mathcal {L}_{M}$ of $h^{-}$ and $h^{+}$ coincide. As before, the number N of essential level loops in $A_{k,\Delta k}$ is at least $c\log \Delta $ with probability at least $1-\Delta ^{-\alpha }$ . Note that if two of these N loops, say $\mathcal {L}_{i}$ and $\mathcal {L}_{j}$ , have height difference at least 26 in $h^{-}$ (equivalently, in $h^{+}$ ), then h must have a level loop of height 0 mod 6 somewhere in between $\mathcal {L}_{i}$ and $\mathcal {L}_{j}$ . Since an n-step simple random walk starting from some x reaches $x\pm 26$ with probability at least $1-Ce^{-cn}$ , the required bound follows.

The next corollary roughly says that if a domain contains a box whose size is of the same order as the in-radius of the domain, then, with constant probability, the first essential loop of height 0 modulo 6 inside the box will have height exactly 0 (not merely 0 modulo 6).

Corollary 4.3. There exist $K,c>0$ such that the following assertion holds. Let $k \ge K$ , let $\Delta \ge 2$ and let D be a domain such that $\Lambda _{k} \subset D \not \supset \Lambda _{\Delta k}$ . Then

$$ \begin{align*} \phi^{0}_{D}(\text{first essential level loop of height }0\text{ mod }6\text{ inside }\Lambda_{k}\text{ has height }0) \ge \frac c{\sqrt{\log\Delta}}. \end{align*} $$

Proof. Let N be the number of essential level loops contained in $\Lambda _{k}$ . Note that the first $N^{\prime }:=M-N$ loops in $\mathcal {L}_{1},\ldots ,\mathcal {L}_{M}$ are not contained in $\Lambda _{k}$ and therefore intersect the annulus $A_{k,\Delta k}$ . Given N and $N^{\prime }$ , and on the event that $N \ge 2$ , since $(\tfrac 12 H_{m})_{m=0}^{M}$ is a simple symmetric random walk starting from 0, the probability that the first essential level loop of height 0 mod 6 inside $\Lambda _{k}$ exists and has height 0 is at least the probability that a simple random walk of length $N^{\prime \prime } := N^{\prime }+1+{\textbf {1}}_{\{N^{\prime }\text { even}\}}$ ends at 0. The latter is $2^{-N^{\prime \prime }} \binom {N^{\prime \prime }}{N^{\prime \prime }/2} \ge c/\sqrt {N^{\prime \prime }}$ . Finally, Proposition 4.1 implies that $\phi ^{0}_{D}(N \ge 2,N^{\prime } \le C\log \Delta ) \ge \frac {1}{2}$ , as long as k is large enough. Combining these estimates yields the corollary.
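The bound $2^{-N^{\prime \prime }} \binom {N^{\prime \prime }}{N^{\prime \prime }/2} \ge c/\sqrt {N^{\prime \prime }}$ can be checked numerically. The Python sketch below is an illustration only: the function names and the explicit constant $c = 1/\sqrt {2}$ are assumptions made here and do not appear in the source. It computes the return probability exactly and tests the inequality in exact rational arithmetic.

```python
from fractions import Fraction
from math import comb


def return_probability(n: int) -> Fraction:
    """Exact probability that an n-step simple symmetric walk from 0 ends at 0."""
    if n % 2 == 1:
        return Fraction(0)  # parity obstruction: odd-length walks cannot end at 0
    return Fraction(comb(n, n // 2), 2 ** n)


def satisfies_sqrt_bound(n: int) -> bool:
    """Check return_probability(n) >= 1/sqrt(2n) by comparing squares exactly."""
    p = return_probability(n)
    return p * p * 2 * n >= 1
```

For instance, `all(satisfies_sqrt_bound(n) for n in range(2, 201, 2))` checks the inequality for every even length up to 200.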

The next corollary shows that small oscillations in the boundary conditions do not have much effect on the distribution of the height function deep inside the domain. This implies that the convergence of $\mu ^{ij}_{D}$ to $\mu $ extends to a wider class of boundary conditions; see Remark 4.5.

Corollary 4.4. For any $\varepsilon>0$ , there exists $\delta>0$ such that the following assertion holds. Let $n \ge k \ge 1$ , let D be a domain containing $\Lambda _{n}$ and let $\tau \in \mathbb {Z}^{\partial D}$ be an admissible boundary condition such that $|\tau _{v}| \le \delta \sqrt {\log ( n/k)}$ for all $v \in \partial D$ . Then

$$ \begin{align*} \operatorname{\mathrm{dist}}_{\mathrm{TV}}((\phi^{\tau}_{D})_{|\Lambda_{k}}, (\phi^{0}_{D})_{|\Lambda_{k}}) \le \varepsilon. \end{align*} $$

Proof. Denote $a := \lceil \log (n/k) \rceil $ and $b := \max |\tau _{v}| \le \delta \sqrt {a}$ . It suffices to show that the total variation distance between the restrictions of $\phi ^{b}_{D}$ and $\phi ^{-b}_{D}$ to $\Lambda _{k}$ is at most $\varepsilon $ , as the two measures $\phi ^{\tau }_{D}$ and $\phi ^{0}_{D}$ are stochastically between these by the FKG inequality. Let $h^{\pm }$ be sampled from $\phi ^{\pm b}_{D}$ , and coupled as follows. First, couple their essential level loops $\mathcal {L}_{1},\ldots ,\mathcal {L}_{M}$ to coincide. Second, couple their heights on these loops, denoted by $H^{\pm }_{m}$ , so that their absolute values increase/decrease together (that is, $H^{+}_{m+1}-H^{+}_{m} = -(H^{-}_{m+1}-H^{-}_{m})$ ) until the first time $M^{\prime }$ where they both reach $H^{\pm }_{M^{\prime }}=0$ , at which point one may couple $h^{+}$ and $h^{-}$ so that they coincide in the domain $\mathcal {D}$ enclosed by $\mathcal {L}_{M^{\prime }}$ . This coupling shows that $\operatorname {\mathrm {dist}}_{\mathrm {TV}}(h^{+}_{|\Lambda _{k}},h^{-}_{|\Lambda _{k}}) \le \mathbb {P}(\mathcal {D}\text { does not contain }\Lambda _{k})$ . Let N be the number of essential level loops surrounding $\Lambda _{k}$ . By (4.2), there exist a universal constant $c>0$ and a constant $a_{0}=a_{0}(\varepsilon )$ such that $\phi _{D}^{0}(N \ge ca) \ge 1-\varepsilon $ if $a \ge a_{0}$ . Standard estimates yield that a simple random walk of length $ca$ starting from $\lfloor \delta \sqrt {a} \rfloor $ hits 0 with probability at least $1-\varepsilon $ if $\delta =\delta (\varepsilon )>0$ is chosen small enough. Thus, $\phi _{D}^{0}(\Lambda _{k} \subset \mathcal {D}) \ge 1-2\varepsilon $ as long as $a \ge a_{0}$ . Finally, note that by decreasing $\delta $ if necessary, we can ensure that $a<a_{0}$ implies that $b=0$ , in which case the desired statement is trivial.
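The "standard estimates" on hitting 0 can be explored with a small exact computation. The Python sketch below is illustrative and not part of the source; the function name and the dynamic-programming approach are choices made here. It propagates the law of a simple symmetric random walk with absorption at 0 and returns the probability of having visited 0 within a given number of steps.

```python
from fractions import Fraction


def hit_zero_probability(start: int, steps: int) -> Fraction:
    """Exact probability that a simple symmetric walk from `start` visits 0 within `steps` steps."""
    if start == 0:
        return Fraction(1)
    alive = {start: Fraction(1)}  # mass of paths that have not yet visited 0
    absorbed = Fraction(0)        # mass of paths that have already visited 0
    for _ in range(steps):
        nxt = {}
        for pos, mass in alive.items():
            for new in (pos - 1, pos + 1):
                if new == 0:
                    absorbed += mass / 2
                else:
                    nxt[new] = nxt.get(new, Fraction(0)) + mass / 2
        alive = nxt
    return absorbed
```

Increasing `steps` while keeping `start` of order the square root of `steps` times a small factor pushes the returned value toward 1, which is the regime used in the proof.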

Remark 4.5. Recall that Theorem 3.1 implies that $\mu ^{ij}_{D}$ converges to $\mu $ as D increases to $\mathbb {Z}^{2}$ along simply connected finite subsets. Say that a feasible boundary condition $\xi \in \{0,1,2\}^{\partial D}$ has oscillation at most m if there is a height function $\tau $ on D such that $\xi $ equals $\tau $ modulo 3 on $\partial D$ and $|\tau |\le m$ on $\partial D$ . Then Corollary 4.4 implies that $\mu ^{\xi }_{D}$ converges to $\mu $ as long as $\xi $ has oscillation at most $o(\sqrt {\log (r_{D})})$ , where $r_{D}$ is the largest n such that $\Lambda _{n} \subset D$ .

We end this section with a result about correlation decay for the height function and for the colorings. While the latter decays as a power law in the in-radius of the domain, the former only decays as a power law in the logarithm of the in-radius. This shows that a power-law rate in Theorem 3.1 is best possible. The argument for the colorings (corresponding to the statement about the height modulo 3) was suggested to us by Ron Peled.

Corollary 4.6. There exist constants $C,c,\alpha ,\beta>0$ such that for any $n \ge 2$ and any domain D having $\Lambda _{n} \subset D \not \supset \Lambda _{n+1}$ ,

$$ \begin{align*} \frac13 + \frac{c}{n^{\alpha}} \le \phi^{0}_{D}(\text{height at origin is }0\text{ modulo }3) \le \frac13 + \frac{C}{n^{\beta}} \end{align*} $$

and

$$ \begin{align*} \frac{c}{(\log n)^{3/2}} \le \phi^{0}_{D}(\text{height at origin is }0) - \phi^{2}_{D}(\text{height at origin is }0) \le \frac{C}{(\log n)^{3/2}}. \end{align*} $$

Proof. Let h be sampled from $\phi ^{0}_{D}$ and denote $X := h(0,0)$ . Since X is symmetric,

$$ \begin{align*} \mathbb{P}(X \in 3\mathbb{Z}) - \frac13 = \frac23\bigg[\mathbb{P}(X \in 3\mathbb{Z}) - \frac12 \mathbb{P}(X \notin 3\mathbb{Z})\bigg] = \frac23 \mathbb{E}\bigg[\cos\bigg(\frac{2\pi X}3\bigg)\bigg].\end{align*} $$

Recall that the essential level loops $\mathcal {L}_{1},\ldots ,\mathcal {L}_{M}$ were defined as the outermost level loops with $\pm 2$ height increments. Define level loops $\mathcal {L}^{\prime }_{1},\ldots ,\mathcal {L}^{\prime }_{N}$ similarly as the outermost level loops with $\pm 1$ increments (so that the essential level loops are a subset of these). Note that, given N, X is distributed as the sum of N independent uniform $\pm 1$ variables. Thus,

$$ \begin{align*} \mathbb{E}[\cos(\theta X) \mid N] = \Re \mathbb{E}[e^{i\theta X} \mid N] = \cos(\theta)^{N}. \end{align*} $$

As N is even (since X is), plugging in $\theta = {2\pi }/3$ , we get that $\mathbb {E}[\cos ({2\pi X}/3) \mid N] = 2^{-N}$ . Thus,

$$ \begin{align*} \mathbb{P}(X \in 3\mathbb{Z}) = \frac13 + \frac23 \mathbb{E}[2^{-N}]. \end{align*} $$
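The conditional form of this identity, namely $\mathbb {P}(X \in 3\mathbb {Z} \mid N) = \tfrac 13 + \tfrac 23 \cos (2\pi /3)^{N}$ , can be verified by direct enumeration. The Python sketch below is an illustration with hypothetical function names. It treats X as a sum of n independent uniform $\pm 1$ variables and compares the exact probability with the closed form.

```python
from fractions import Fraction
from math import comb


def prob_multiple_of_three(n: int) -> Fraction:
    """Exact P(X in 3Z) for X a sum of n independent uniform +-1 variables."""
    # X = 2j - n, where j counts the +1 steps and is Binomial(n, 1/2).
    favourable = sum(comb(n, j) for j in range(n + 1) if (2 * j - n) % 3 == 0)
    return Fraction(favourable, 2 ** n)


def closed_form(n: int) -> Fraction:
    """1/3 + (2/3) * cos(2*pi/3)^n, using cos(2*pi/3) = -1/2."""
    return Fraction(1, 3) + Fraction(2, 3) * Fraction(-1, 2) ** n
```

For even n the closed form reduces to $\tfrac 13 + \tfrac 23 2^{-n}$ , matching the expression in the display above.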

It remains to bound $\mathbb {E}[2^{-N}]$ from above and below. By tuning the constants in the statement, we may assume that n is sufficiently large. To obtain the stated upper bound, it suffices to show that $\mathbb {P}(N \le c\log n) \le n^{-\zeta }$ for some universal constants $c,\zeta>0$ . Since $N \ge M$ , this follows from (4.2). For the lower bound, it suffices to show that $\mathbb {P}(N \le C\log n) \ge \tfrac 14$ for some universal constant $C>0$ . It follows from (4.1) that $\mathbb {P}(M \le C\log n) \ge \tfrac 12$ . To obtain the bound for N, note that, given N, M is distributed as $\max \{m : T_{1}+\cdots +T_{m} \le N\}$ , where $\{T_{i}\}$ are independent copies of $T:=\text {min}\{ k : |\xi _{1}+\cdots +\xi _{k}|=2 \}$ , where $\{\xi _{i}\}$ are i.i.d. uniform $\pm 1$ variables. By Markov’s inequality, $\mathbb {P}(M<cN) \le \mathbb {P}(T_{1}+\cdots +T_{\lceil cN \rceil }>N) \le \tfrac 14$ for some universal constant $c>0$ . Thus, $\mathbb {P}(N \ge (C/c)\log n) \le \tfrac 34$ , as required.
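The Markov inequality step can be unpacked as follows. The explicit constants are illustrative choices and do not appear in the source. The variable T is the exit time of a simple symmetric random walk from the interval $(-2,2)$ started at 0, so that $\mathbb {E}[T] = 2 \cdot 2 = 4$ by the classical gambler's ruin formula. Hence, for any integer $m \ge 1$ ,

$$ \begin{align*} \mathbb{P}(T_{1}+\cdots+T_{m}> N) \le \frac{\mathbb{E}[T_{1}+\cdots+T_{m}]}{N} = \frac{4m}{N}, \end{align*} $$

which is at most $\tfrac 14$ whenever $m \le N/16$ .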

We now turn to the second inequality in the corollary. Observe that the quantity in question is the same as $\mathbb {P}(X=0)-\mathbb {P}(X=2)$ . Conditioning on N, we have

$$ \begin{align*} \mathbb{P}(X=0 \mid N)-\mathbb{P}(X=2 \mid N) &= 2^{-N}\binom{N}{N/2} - 2^{-N}\binom{N}{N/2+1} \\&= \frac{2^{-N}}{N/2+1}\binom{N}{N/2} = \Theta(N^{-3/2}). \end{align*} $$

Using the same bounds on N as before yields the required inequality.
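The binomial identity and the $\Theta (N^{-3/2})$ scaling used in the last display can be checked exactly for small N. The Python sketch below is an illustration only, with hypothetical function names. It computes $\mathbb {P}(X=0 \mid N)-\mathbb {P}(X=2 \mid N)$ both directly and via the closed form.

```python
from fractions import Fraction
from math import comb


def central_gap(n: int) -> Fraction:
    """P(X = 0) - P(X = 2) for X a sum of n (even) independent uniform +-1 variables."""
    half = n // 2
    return Fraction(comb(n, half) - comb(n, half + 1), 2 ** n)


def closed_form_gap(n: int) -> Fraction:
    """Closed form 2^{-n} * binom(n, n/2) / (n/2 + 1) for even n."""
    half = n // 2
    return Fraction(comb(n, half), 2 ** n * (half + 1))
```

Multiplying `central_gap(n)` by `n ** 1.5` gives values that stay between two positive constants as n grows, consistent with the stated order of magnitude.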

4.1.2 Proof of Proposition 4.1

We begin with two lemmas. The first lemma asserts that if we look at the height function inside a domain containing some box $\Lambda $ , then, with high probability, we will find a level loop of absolute height at least 2 surrounding a slightly smaller box. More precisely, the loop will be between the boundary of the domain (which could be far away from $\Lambda $ ) and the boundary of a box whose size is at most a geometric number of scales smaller than $\Lambda $ .

Lemma 4.7. There exists a constant $c>0$ such that the following assertion holds. Let $k \ge 1$ , let D be a domain containing $\Lambda _{k}$ and let $(B,\kappa )$ be $|h|$ -adapted boundary conditions with $B_{\text {sym}} = \partial D$ and $B_{\text {pos}} = \emptyset $ . Let $\mathcal {E}_{i}$ be the event that the annulus $A_{2^{-i}k, 2^{-i+1}k}$ contains a level loop of $|h|\ge 2$ surrounding the origin. Then, for all $n \ge 1$ with $2^{-n}k \ge 100$ ,

$$ \begin{align*} \phi^{\kappa}_{D} (\mathcal{E}_{1} \cup \cdots \cup \mathcal{E}_{n}) \ge 1 - e^{-cn}. \end{align*} $$

Proof. Observe that $\mathcal {E}_{i}$ is measurable with respect to $|h|_{|A_{i}}$ , where $A_{i}$ is the annulus $A_{2^{-i}k, 2^{-i+1}k}$ . It suffices to show that

$$ \begin{align*} \phi^{\kappa}_{D} (\mathcal{E}_{n} \mid {\textbf{1}}_{\mathcal{E}_{1}},\ldots,{\textbf{1}}_{\mathcal{E}_{n-1}}) \ge c \quad\text{almost surely}. \end{align*} $$

In fact, we prove the stronger statement that

$$ \begin{align*} \phi^{\kappa}_{D} (\mathcal{E}_{n} \mid |h|_{|D \setminus D_{n}}) \ge c \quad\text{almost surely}, \end{align*} $$

where $D_{n}$ is the domain $\Lambda ^{\text {e}}_{2^{-n+1}k-1}$ . In words, we first explore the absolute value from the boundary of D inward until we reach the outer boundary of $A_{n}$ and their even neighbors inside $A_{n}$ , and then, regardless of what this exploration reveals, the conditional probability of $\mathcal {E}_{n}$ is at least some universal constant c.

Toward showing this, we first argue that, when sampling from $\phi ^{\kappa }_{D}$ , the conditional distribution of $|h|_{|D_{n}}$ given $|h|_{|(D \setminus D_{n}) \cup \partial D_{n}}$ is almost surely stochastically larger than $\phi ^{0}_{D_{n}}(|h| \in \cdot )$ . Indeed, the domain Markov property and the assumption that the boundary condition has $B_{\text {pos}} = \emptyset $ imply that the former conditional distribution is $\phi ^{\kappa _{n}}_{D_{n}}(|h| \in \cdot )$ , where $\kappa _{n}$ is the boundary condition on $\partial D_{n}$ that equals $\{|h_{v}|,-|h_{v}|\}$ at each $v \in \partial D_{n}$ . The FKG inequality for absolute value (Proposition 3.4) now implies that $\phi ^{\kappa _{n}}_{D_{n}}(|h| \in \cdot )$ stochastically dominates $\phi ^{0}_{D_{n}}(|h| \in \cdot )$ .

Now observe that $\mathcal {E}_{n}$ is increasing in $|h|$ so that the above stochastic domination implies that $\phi ^{\kappa }_{D}(\mathcal {E}_{n} \mid |h|_{|D \setminus D_{n}}) \ge \phi _{D_{n}}^{0}(\mathcal {E}_{n})$ almost surely. Finally, the uniform lower bound on $\phi _{D_{n}}^{0}(\mathcal {E}_{n})$ follows from (3.2) and the standard trick of using the FKG inequality (for $|h|$ ) to glue together crossings of four rectangles into a loop.

Corollary 4.8. There exists a constant $c>0$ such that for all $k \ge 1$ , all domains $D \supset \Lambda _{k}$ and all $n \ge 1$ with $2^{-n}k \ge 100$ ,

$$ \begin{align*} \phi_{D}^{0}(\Lambda_{2^{-n}k}\text{ is surrounded by a level loop of }|h|=2\text{ in }D) \ge 1-e^{-cn}. \end{align*} $$

Proof. Lemma 4.7 implies that with probability at least $1-e^{-cn}$ there is a $\times $ -loop of $|h|\ge 2$ surrounding $\Lambda _{2^{-n}k}$ in D. Since the boundary conditions put height 0 on $\partial D$ , on the former event, there must be a level loop of $|h|=2$ surrounding $\Lambda _{2^{-n}k}$ in D.

The next lemma essentially shows that a level loop of height 2 intersecting a box cannot have long ‘tentacles’ going very far away from the box. A $\times $ -crossing of an annulus inside a domain D is a $\times $ -path in D connecting the inside and the outside boundaries of the annulus.

Lemma 4.9. There exists a constant $c>0$ such that the following assertion holds. Let D be a domain and let $k \ge 100$ . For $i \ge 1$ , let $\mathcal {C}_{i}$ be the event that there exists a $\times $ -crossing of height $2$ of the annulus $A_{2^{i-1} k, 2^{i}k}$ inside D. Then, for all $n \ge 1$ ,

$$ \begin{align*}\phi_{D}^{0}(\mathcal{C}_{1} \cap \cdots \cap \mathcal{C}_{n}) \le e^{-cn}.\end{align*} $$

We remark that no assumption on the location of the domain is made. In particular, it may intersect some of the annuli without containing them.

Proof. Let $A_{i}$ be the slightly thinner even annulus $(\Lambda ^{\text {e}}_{2^{i}k-10} \setminus \Lambda ^{\text {e}}_{2^{i-1}k+10}) \cup \partial \Lambda ^{\text {e}}_{2^{i-1}k+10}$ . With a slight abuse of notation, we redefine $\mathcal {C}_{i}$ to be the event that there exists a $\times $ -crossing of height 2 of this thinner annulus inside D, noting that this increases the event so that it suffices to prove the stated inequality with this new definition.

Denote $D_{i} := A_{i} \cap D$ and $B := \bigcup _{i \ge 1} \partial D_{i}$ , and let $B_{0} := B \setminus \partial D$ be the portions of the boundaries of the even annuli in $D \setminus \partial D$ . Consider the boundary condition $(B,\kappa )$ in which $\kappa $ equals $\{0\}$ on $\partial D$ and $\{2\}$ on $B_{0}$ . We first show that

$$ \begin{align*} \phi^{0}_{D}(\mathcal{C}_{1} \cap \cdots \cap \mathcal{C}_{n}) \le \phi_{D}^{B,\kappa} (\mathcal{C}_{1} \cap \cdots \cap \mathcal{C}_{n}).\end{align*} $$

Let $\mathcal {C}^{\prime }_{i}$ be the event that there exists a $\times $ -crossing of height $0$ of the annulus $A_{i}$ inside D. Note that, by symmetry,

$$ \begin{align*} \phi^{0}_{D}(\mathcal{C}_{1} \cap \cdots \cap \mathcal{C}_{n}) = \phi^{-2}_{D}(\mathcal{C}^{\prime}_{1} \cap \cdots \cap \mathcal{C}^{\prime}_{n}) = \phi^{2}_{D}(\mathcal{C}^{\prime}_{1} \cap \cdots \cap \mathcal{C}^{\prime}_{n}) \end{align*} $$

and, similarly,

$$ \begin{align*} \phi^{B,\kappa}_{D}(\mathcal{C}_{1} \cap \cdots \cap \mathcal{C}_{n}) = \phi^{B,\kappa^{\prime}}_{D}(\mathcal{C}^{\prime}_{1} \cap \cdots \cap \mathcal{C}^{\prime}_{n}),\end{align*} $$

where $\kappa ^{\prime }$ equals $\{2\}$ on $\partial D$ and $\{0\}$ on $B_{0}$ . Thus, it suffices to show that

$$ \begin{align*} \phi^{2}_{D}(\mathcal{C}^{\prime}_{1} \cap \cdots \cap \mathcal{C}^{\prime}_{n}) \le \phi_{D}^{B,\kappa^{\prime}} (\mathcal{C}^{\prime}_{1} \cap \cdots \cap \mathcal{C}^{\prime}_{n}).\end{align*} $$

Since $\mathcal {C}^{\prime }_{i}$ is a decreasing event in the absolute value of h, this follows from the FKG inequality for absolute value (Proposition 3.4), where the constant boundary condition 2 is viewed as the boundary condition $(B,\kappa ^{\prime \prime })$ in which $\kappa ^{\prime \prime }$ equals $\{2\}$ on $\partial D$ and $\mathbb {Z}$ on $B_{0}$ (the latter is equivalent to having no condition on the height on $B_{0}$ , that is, vertices on $B_{0}$ are free to take any value). Here, both boundary conditions have the same partition of B into $B_{\text {pos}}=\partial D$ and $B_{\text {sym}}=B_{0}$ .

Observe that the boundary condition $(B,\kappa )$ decouples the events $\mathcal {C}_{1},\ldots ,\mathcal {C}_{n}$ . Thus,

$$ \begin{align*} \phi^{0}_{D}(\mathcal{C}_{1} \cap \cdots \cap \mathcal{C}_{n}) \le \phi_{D}^{B,\kappa} (\mathcal{C}_{1} \cap \cdots \cap \mathcal{C}_{n}) &= \phi_{D}^{B,\kappa}(\mathcal{C}_{1}) \cdots \phi_{D}^{B,\kappa}(\mathcal{C}_{n})\\ &= \phi_{D_{1}}^{\kappa_{1}}(\mathcal{C}_{1}) \cdots \phi_{D_{n}}^{\kappa_{n}}(\mathcal{C}_{n}),\end{align*} $$

where $\kappa _{i}$ equals $\{0\}$ on $\partial D_{i} \cap \partial D$ and $\{2\}$ on $\partial D_{i} \setminus \partial D$ . It remains to show that

$$ \begin{align*} \phi_{D_{i}}^{\kappa_{i}}(\mathcal{C}_{i}) \le 1-\alpha \end{align*} $$

for some universal constant $\alpha>0$ . The case when $D_{i}=A_{i}$ (that is, when the domain D contains the annulus $A_{i}$ ) is more straightforward than the case $D_{i} \subsetneq A_{i}$ . In order to treat the latter case in a similar way to the former (and simultaneously), it is convenient to work in the entire annulus $A_{i}$ rather than in the portion $D_{i}$ of it. To this end, with a slight abuse of notation, we extend $\kappa _{i}$ to $B_{i}=\partial D_{i} \cup \partial A_{i}$ by defining it to be $\{2\}$ on $\partial A_{i} \setminus \partial D_{i}$ . Then, by the domain Markov property, $\phi _{D_{i}}^{\kappa _{i}}(\mathcal {C}_{i}) = \phi _{A_{i}}^{B_{i},\kappa _{i}}(\mathcal {C}_{i})$ . Now consider the event $\bar {\mathcal {C}}_{i}$ that there exists a $\times $ -crossing of height $2$ of the annulus $A_{i}$ (not necessarily inside D). Clearly, $\mathcal {C}_{i} \subset \bar {\mathcal {C}}_{i}$ so that $\phi _{A_{i}}^{B_{i},\kappa _{i}}(\mathcal {C}_{i}) \le \phi _{A_{i}}^{B_{i},\kappa _{i}}(\bar {\mathcal {C}}_{i})$ . Let $\mathcal {E}_{i}$ be the event that there exists a $*$ -loop of height equal to 0 surrounding the origin inside $A_{i}$ (a $*$ -path is a sequence of vertices in $\mathbb {Z}^{2}$ with consecutive vertices at nearest-neighbor distance equal to 2). By duality, the events $\bar {\mathcal {C}}_{i}$ and $\mathcal {E}_{i}$ are disjoint. Indeed, by [Reference Duminil-Copin, Harel, Laslier, Raoufi and Ray9, Lemma 2.4, first item], a $\times $ -cluster of height at least $2$ is blocked by a $*$ -loop of height at most 0. Since, however, the boundary conditions on $\partial A_{i}$ are at least 0 everywhere (namely, 0 or 2), the latter is equivalent to having a $*$ -loop of height exactly equal to 0 in the annulus. Thus, $\phi _{A_{i}}^{B_{i},\kappa _{i}}(\bar {\mathcal {C}}_{i}) \le 1 - \phi _{A_{i}}^{B_{i},\kappa _{i}}(\mathcal {E}_{i})$ . So far, we have shown that

$$ \begin{align*} \phi_{D_{i}}^{\kappa_{i}}(\mathcal{C}_{i}) \le 1 - \phi_{A_{i}}^{B_{i},\kappa_{i}}(\mathcal{E}_{i}).\end{align*} $$

It remains to show that $\phi _{A_{i}}^{B_{i},\kappa _{i}}(\mathcal {E}_{i})$ is uniformly bounded below. To this end, we first argue that $\phi _{A_{i}}^{B_{i},\kappa _{i}}(\mathcal {E}_{i}) \ge \phi _{A_{i}}^{2}(\mathcal {E}_{i})$ . Indeed, since the event $\mathcal {E}_{i}$ is decreasing in $|h|$ , by viewing the constant two-boundary condition as equaling $\mathbb {Z}$ on $B_{i} \setminus \partial A_{i}$ (that is, taking free boundary conditions there), this follows from the FKG inequality for absolute value. Finally, the estimate $\phi _{A_{i}}^{2}(\mathcal {E}_{i}) \ge \alpha $ follows from (3.2) and the standard trick of using the FKG inequality (again, for $|h|$ ) to glue together crossings of four rectangles into a loop.

We are now ready to prove Proposition 4.1.

Proof of Proposition 4.1

For $j>i \ge 0$ , let $N_{i,j}$ be the number of essential level loops contained in $A_{2^{i}k,2^{j}k}$ and let $N^{\prime }_{i,j} \ge N_{i,j}$ be the number of essential level loops intersecting $A_{2^{i}k,2^{j}k}$ . We are interested in lower-bounding $N=N_{0,a}$ and upper-bounding $N^{\prime }=N^{\prime }_{0,a}$ . For the reader’s convenience, we recall the precise statements in the proof below.

Upper bound. We show the upper bound (4.1) on $N^{\prime }$ , namely, that for some universal constants $C,c>0$ , we have

(4.3) $$ \begin{align} \phi_{D}^{0}(N^{\prime} \ge n) \le e^{-cn \log(\frac na \wedge k)} \quad\text{for any }n \ge Ca. \end{align} $$

The proof follows a strategy similar to that of [Reference Duminil-Copin, Harel, Laslier, Raoufi and Ray9, Proposition 4.8] with some simplifications. Let $D_{m}$ be the domain enclosed by $\mathcal {L}_{m}$ and recall that $H_{m}$ is the height on $\mathcal {L}_{m}$ . Consider the line $(k,2^{a}k] \times \{0\}$ crossing the annulus $A_{k,2^{a}k}$ . Let $\ell _{m} \in (k,2^{a}k]$ denote the distance to the origin of the rightmost intersection point $(\ell _{m},0)$ of $\mathcal {L}_{m}$ with this line. Fix a large r, and let $\mathcal {B}_{m}$ be the event that $\ell _{m}-\ell _{m+1} \le 2^{-r}\ell _{m}$ , or equivalently, $\ell _{m+1} \ge (1-2^{-r})\ell _{m}$ . In particular, on this event, the loops $\mathcal {L}_{m}$ and $\mathcal {L}_{m+1}$ are relatively close to one another. Specifically, if $\mathcal {B}_{m}$ occurs, then the loop $\mathcal {L}_{m+1}$ creates a crossing of $A_{2^{-r}\ell _{m},\ell _{m}} ((\ell _{m},0)) \cap D_{m}$ by a $\times $ -path of height $H_{m+1}$ . Thus, by Lemma 4.9,

$$ \begin{align*} \phi_{D}^{0}(\mathcal{B}_{m} \mid \mathcal{L}_{1},\ldots,\mathcal{L}_{m}) \le e^{-dr},\end{align*} $$

where $d>0$ is a universal constant, as long as $2^{-r}\ell _{m} \ge 100$ .

Since $\mathcal {B}_{m}$ occurs unless $\log \ell _{m} - \log \ell _{m+1}> -\log (1-2^{-r}) \ge 2^{-r}$ , it is straightforward that all but at most $2^{r}a \log 2$ of the essential level loops $\mathcal {L}_{m}$ intersecting $A_{k,2^{a}k}$ must trigger the corresponding event $\mathcal {B}_{m}$ . Thus, by the above bound on the conditional probability of $\mathcal {B}_{m}$ and a union bound on those m for which $\mathcal {B}_{m}$ occurs, we have

$$ \begin{align*} \phi_{D}^{0} (N^{\prime} = n ) \le 2^{n} e^{-dr(n-2^{r}a \log 2)}. \end{align*} $$

Now choosing r to be the minimum between $\log _{2} ( n/{2a})$ and $\log _{2}({k}/{100})$ (so as to ensure that $2^{-r}\ell _{m} \ge 100$ always holds) and summing over n yields the required bound (4.3).
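To make this last step explicit, here is a short sketch of the computation. It assumes (our reading, not spelled out above) that C and k are large enough that the chosen r satisfies $dr \ge 4\log 2$ . Since $2^{r} \le n/(2a)$ , we have $2^{r}a \log 2 \le n/2$ , so that

$$ \begin{align*} \phi_{D}^{0}(N^{\prime} = n) \le 2^{n} e^{-dr(n-2^{r}a \log 2)} \le e^{n\log 2 - drn/2} \le e^{-drn/4}. \end{align*} $$

The chosen r is non-decreasing in n and is comparable to $\log (\frac na \wedge k)$ when both quantities are large, so summing over all values at least n gives a geometric series dominated by a constant multiple of its first term, which is of the form (4.3) after adjusting c.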

Lower bound. We now turn to the lower bound (4.2) on $N $ , namely, that for some universal constants $C,c>0$ , if D contains $\Lambda _{2^{a}k}$ , then

$$ \begin{align*} \phi_{D}^{0}(N \le ca) \le Ce^{-ca}.\end{align*} $$

The main issue to overcome is the case when the boundary of D is far away from the annulus in question. We need to rule out the possibility that the outermost essential level loop which intersects $\Lambda _{2^{a}k}$ is too irregular. We divide the proof into three steps.

Step 1. We first establish a similar bound on the number of essential level loops surrounding $\Lambda _{k}$ . Namely,

(4.4) $$ \begin{align} \phi_{D}^{0}(N_{0,\infty} \le ca) \le e^{-ca}. \end{align} $$

For $0 \le m < N_{0,\infty }$ , let $S_{m}$ denote the largest integer $i \ge 0$ such that $\mathcal {L}_{m}$ surrounds $\Lambda _{2^{i}k}$ . We think of $S_{m}$ as the scale at which we fully discover $\mathcal {L}_{m}$ . Set $S_{m}=-1$ for $m \ge N_{0,\infty }$ . Note that $(S_{m})_{m=0}^{\infty }$ is a decreasing sequence with $S_{0} \ge a$ and $S_{N_{0,\infty }-1} \ge 0$ . By Corollary 4.8, for any $m \ge 0$ , the conditional distribution of $S_{m}-S_{m+1}$ given $\mathcal {L}_{1},\ldots ,\mathcal {L}_{m}$ is almost surely stochastically dominated by a geometric random variable of some universal parameter p. Thus,

$$ \begin{align*} N_{0,\infty}\text{ stochastically dominates }1+\max \{ m \ge 0 : \xi_{1}+\cdots+\xi_{m} \le a\},\end{align*} $$

where $\{\xi _{i}\}$ are independent geometric random variables with parameter p. It now follows from a standard Chernoff bound that there exists a universal constant $c>0$ such that

$$ \begin{align*} \phi_{D}^{0}(N_{0,\infty} \le ca) \le \mathbb{P}(\xi_{1}+\cdots+\xi_{\lfloor ca \rfloor}> a) \le e^{-ca}.\end{align*} $$
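The Chernoff step can be sanity-checked numerically. The following is a minimal sketch, not part of the proof. It assumes the convention $\mathbb {P}(\xi =j)=p(1-p)^{j}$ on $\{0,1,2,\ldots \}$ (the text does not fix a convention), and the function name `geometric_sum_tail` is hypothetical. It computes the tail probability exactly by tracking the law of the partial sums.

```python
def geometric_sum_tail(m, a, p):
    """Exact P(xi_1 + ... + xi_m > a) for i.i.d. xi with P(xi = j) = p * (1 - p)**j, j >= 0.

    Assumed convention (not fixed by the text): the geometric variables start at 0.
    """
    pmf = [p * (1 - p) ** j for j in range(a + 1)]
    # dist[s] = probability that the current partial sum equals s (only s <= a is tracked)
    dist = [1.0] + [0.0] * a
    tail = 0.0
    for _ in range(m):
        new = [0.0] * (a + 1)
        for s, mass in enumerate(dist):
            if mass == 0.0:
                continue
            for j in range(a - s + 1):
                new[s + j] += mass * pmf[j]
            # Partial sums are non-decreasing, so mass that jumps above a never returns.
            tail += mass * (1 - p) ** (a - s + 1)
        dist = new
    return tail


if __name__ == "__main__":
    # With m proportional to a, the tail should shrink rapidly as a grows.
    for a in (20, 40, 80):
        print(a, geometric_sum_tail(a // 10, a, 0.5))
```

The tail is accumulated directly rather than as one minus the retained mass, since the latter loses all precision once the probability drops below floating-point resolution.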

Step 2. Next, we establish the lower bound on N in the case when the domain D does not contain $\Lambda _{2^{a+1}k}$ . In this case, any essential level loop not contained in $\Lambda _{2^{a}k}$ must intersect $A_{2^{a}k,2^{a+1}k}$ , so that $N \ge N_{0,\infty } - N^{\prime }_{a,a+1}$ . Thus,

$$ \begin{align*} \phi_{D}^{0}\bigg(N \le \frac{ca}2\bigg) \le \phi_{D}^{0}(N_{0,\infty} \le ca) + \phi_{D}^{0}\bigg(N^{\prime}_{a,a+1} \ge \frac{ca}2\bigg) \le Ce^{-ca}, \end{align*} $$

where the last inequality follows from (4.4) and (4.3). We note for use in the next step that this inequality can be restated (by applying it with a different choice of k and a, and readjusting the constants appropriately) as, for any $a_{2} \ge a_{1} \ge 0$ and any domain D such that $\Lambda _{2^{a_{2}}k} \subset D \not \supset \Lambda _{2^{a_{2}+1}k}$ ,

(4.5) $$ \begin{align} \phi_{D}^{0}(N_{a_{1},a_{2}} \le c(a_{2}-a_{1})) \le Ce^{-c(a_{2}-a_{1})}. \end{align} $$

Step 3. Finally, we are ready to establish the bound for an arbitrary domain D containing $\Lambda _{2^{a}k}$ . Let I be the smallest integer $i>a/2$ such that there exists an essential level loop surrounding $\Lambda _{2^{a/2}k}$ and intersecting $\Lambda _{2^{i+1}k}$ (recall that we also consider the boundary of the domain as an essential level loop, so that I is always finite). It suffices to show that

$$ \begin{align*} \phi_{D}^{0}(N \le ca,~I=i) \le Ce^{-ci} \quad\text{for all }i>a/2.\end{align*} $$

Fix $i>a/2$ and suppose that the event $\{I=i\}$ occurs. Let $M_{0}$ denote the smallest index $0 \le m \le M$ such that the loop $\mathcal {L}_{m}$ intersects $\Lambda _{2^{i+1}k}$ , and note that this loop necessarily surrounds $\Lambda _{2^{i}k}$ . Let $D_{0}$ be the domain enclosed by $\mathcal {L}_{M_{0}}$ , and note that $D_{0}$ contains $\Lambda _{2^{i}k}$ but not $\Lambda _{2^{i+1}k}$ . Note also that we may explore $\mathcal {L}_{1},\ldots ,\mathcal {L}_{M_{0}}$ from the outside, so that the height function inside $D_{0}$ is conditionally independent by the domain Markov property. We now consider two cases: if $i \le a$ , then $N \ge N_{0,i}$ , so that (4.5) yields the required bound; and if $i>a$ , then $N_{a/2,i}=0$ , so that (4.5) again yields the required bound.

4.2 Proof of Theorem 3.1

Our goal is to show that the family $\mathcal {M}$ of all pairs $(\mu ^{ij}_{D},D)$ , with D finite and simply connected and $i,j \in \{0,1,2\}$ distinct, is locally mixing with a power-law rate function. To this end, we fix $n \ge 1$ , distinct $i,j \in \{0,1,2\}$ , distinct $i^{\prime },j^{\prime } \in \{0,1,2\}$ and two finite and simply connected sets $D,D^{\prime } \subset \mathbb {Z}^{2}$ containing $\Lambda _{n}$ , and we aim to construct a coupling between $f \sim \mu ^{ij}_{D}$ and $f^{\prime } \sim \mu ^{i^{\prime }j^{\prime }}_{D^{\prime }}$ with the desired properties (recall Definition 2.1). We first handle the case when D and $D^{\prime }$ are domains and $i=i^{\prime }=0$ (in which case j and $j^{\prime }$ are irrelevant), and later explain the general case. We further assume without loss of generality that one of the domains, say $D^{\prime }$ , is $\Lambda _{n}^{\text {e}}$ , in which case the other domain necessarily contains it (the required coupling is then easily obtained from two such couplings).

In order not to introduce cumbersome notation, we begin by describing the coupling ‘as seen from’ a single vertex v. The construction will involve an iteration of a three-step procedure in which we alternate between exploring (1) only f, (2) both f and $f^{\prime }$ simultaneously, and (3) only $f^{\prime }$ . At the end of iteration i, we will have defined a sequence $L_{0},L^{\prime }_{0},L_{1},L^{\prime }_{1},\ldots ,L_{i},L^{\prime }_{i}$ of nested even $\times $ -loops surrounding v (some loops may coincide or partially overlap), where $L_{0},\ldots ,L_{i}$ are color-0 loops of f and $L^{\prime }_{0},\ldots ,L^{\prime }_{i}$ are color-0 loops of $f^{\prime }$ , and we will have fully explored f and $f^{\prime }$ outside of $L_{i}$ and $L^{\prime }_{i}$ , respectively, but not at all inside the domains $D_{i}$ and $D^{\prime }_{i}$ enclosed by them; see Figure 1. In particular, the domain Markov property will imply that at the end of iteration i, the conditional distributions of $f_{|D_{i}}$ and $f^{\prime }_{|D^{\prime }_{i}}$ will be $\mu ^{0}_{D_{i}}$ and $\mu ^{0}_{D^{\prime }_{i}}$ , respectively.

Figure 1 The sequence of nested loops $L_{0},L^{\prime }_{0},L_{1},L^{\prime }_{1},\ldots $ surrounding v.

Initially, we set $L_{0}:=\partial D$ and $L^{\prime }_{0}:=\partial D^{\prime }$ , and note that $L_{0}$ surrounds $L^{\prime }_{0}$ by our assumption on the domains. This is all that is done in iteration 0. Suppose that we have completed iteration i and that the conditional distributions of $f_{|D_{i}}$ and $f^{\prime }_{|D^{\prime }_{i}}$ are $\mu ^{0}_{D_{i}}$ and $\mu ^{0}_{D^{\prime }_{i}}$ . Let us now explain how to define $L_{i+1}$ and $L^{\prime }_{i+1}$ . At this point, we switch to the height function representation, noting that we may view $f_{|D_{i}}$ as a height function $h_{i}$ for which $L_{i}$ is a level loop of height 0, and similarly, we may view $f^{\prime }_{|D^{\prime }_{i}}$ as a height function $h^{\prime }_{i}$ for which $L^{\prime }_{i}$ is a level loop of height 0. Thus, $h_{i} \sim \phi ^{0}_{D_{i}}$ and $h^{\prime }_{i}\sim \phi ^{0}_{D^{\prime }_{i}}$ . We now use this representation in order to describe the three-step exploration procedure of f and $f^{\prime }$ in iteration $i+1$ . For readability, we write $h=h_{i}$ and $h^{\prime }=h^{\prime }_{i}$ below.

Step 1. We explore the absolute value $|h|$ of the height function h (independently of $h^{\prime }$ ) up to the loop $L^{\prime }_{i}$ (that is, on $D_{i} \setminus D^{\prime }_{i}$ and on the boundary of $D^{\prime }_{i}$ ), thereby revealing a random $|h|$ -adapted boundary condition for h on the domain $D^{\prime }_{i}$ . By the domain Markov property, the conditional law of h in $D^{\prime }_{i}$ is that of a uniform homomorphism height function in $D^{\prime }_{i}$ with this boundary condition (which has $B_{\text {sym}}=L^{\prime }_{i}$ and $B_{\text {pos}}=\emptyset $ ).

Step 2. Since $h^{\prime }$ is zero on $L^{\prime }_{i}$ , the FKG inequality for absolute value implies that $|h|_{|D^{\prime }_{i}}$ stochastically dominates $|h^{\prime }|_{|D^{\prime }_{i}}$ (given the exploration in the first step). Consequently, we can construct a coupling between $|h|$ and $|h^{\prime }|$ inside $D^{\prime }_{i}$ by simultaneously exploring and revealing the values of both $|h|$ and $|h^{\prime }|$ , vertex by vertex, starting from the boundary of $D^{\prime }_{i}$ inwards, ensuring along the way that $|h| \ge |h^{\prime }|$ (this type of exploration is standard; see, for example, [Reference van den Berg and Maes35]). We explore $|h|$ and $|h^{\prime }|$ in this way until we discover the outermost level loop of height 0 mod 6 for h surrounding v and inside $D^{\prime }_{i}$ ; this loop is $L_{i+1}$ (note that it is a color-0 even $\times $ -loop for f).

Step 3. The domain Markov property implies that, at this point, the conditional distributions of $|h|_{|D_{i+1}}$ and $|h^{\prime }|_{|D_{i+1}}$ are those of the absolute values of uniform homomorphism height functions in $D_{i+1}$ with the corresponding $|h|$ -adapted boundary conditions that have been revealed by the exploration in the previous step. If these boundary conditions happen to be identical, then we jointly sample h and $h^{\prime }$ in the domain $D_{i+1}$ according to a single sample from their common distribution, and we stop the iterative procedure; in this case, we say that the $i+1$ iteration resulted in a successful coupling for v. Otherwise, we continue exploring $|h^{\prime }|$ alone (independently of h) until we discover the outermost level loop of height 0 mod 6 for $h^{\prime }$ surrounding v and inside $D_{i+1}$ ; this loop is $L^{\prime }_{i+1}$ . At this point, we have coupled the absolute values of the height functions up to $L_{i+1}$ and $L^{\prime }_{i+1}$ . Finally, we complete the coupling of h and $h^{\prime }$ by coupling their signs in any manner (for example, independently).

This completes the description of iteration $i+1$ , at the end of which we have explored f up to $L_{i+1}$ and $f^{\prime }$ up to $L^{\prime }_{i+1}$ , so that by the domain Markov property, the conditional distributions of $f_{|D_{i+1}}$ and $f^{\prime }_{|D^{\prime }_{i+1}}$ are indeed $\mu ^{0}_{D_{i+1}}$ and $\mu ^{0}_{D^{\prime }_{i+1}}$ , as claimed. If at some iteration i we cannot find one of the loops we are looking for (that is, $L_{i}$ or $L^{\prime }_{i}$ does not exist), then we stop the iterative procedure and say that the coupling has failed for v. We emphasize that in iteration i, when we discover the loops $L_{i}$ and $L^{\prime }_{i}$ , the heights of h and $h^{\prime }$ on these loops are some (different) multiples of 6, but that in the next iteration we then shift each of the two height functions by the corresponding amount so that the height again becomes 0 for both before we start looking for the loops $L_{i+1}$ and $L^{\prime }_{i+1}$ .

The actual coupling between f and $f^{\prime }$ does not treat v as a distinguished vertex, but rather attempts to couple all vertices in $\Lambda _{n}$ in parallel. Specifically, whenever we tried to find the outermost level loop of height 0 mod 6 surrounding v, we instead find all outermost level loops of height 0 mod 6 (with no specific target vertex). The collection of all such loops is still explorable from the outside. Thus, iteration i generates for us two collections of loops $\{L_{i,j}\}_{j}$ and $\{L^{\prime }_{i,j,j^{\prime }}\}_{j,j^{\prime }}$ (with $L^{\prime }_{i,j,j^{\prime }}$ nested inside $L_{i,j}$ ), and in iteration $i+1$ we recursively repeat this procedure inside each of these loops. This completes the description of the coupling between f and $f^{\prime }$ .

We now turn to show that the constructed coupling has the required properties. We first check that $f_{|D \setminus \Lambda _{n}}$ and $f^{\prime }$ are independent. Indeed, in the first step of the first iteration of the construction, we explore f independently of $f^{\prime }$ up to $\partial D^{\prime }$ and thereby reveal $f_{|(D \setminus D^{\prime }) \cup \partial D^{\prime }}$ . Since $D^{\prime } \setminus \partial D^{\prime } \subset \Lambda _{n}$ and since, conditionally on this exploration, the law of $f^{\prime }$ is still $\mu ^{0}_{D^{\prime }}$ , we see that $f^{\prime }$ is independent of $f_{|D \setminus \Lambda _{n}}$ .

We now turn to the main issue at hand, namely, to show that the constructed coupling has a good chance of successfully coupling any given vertex. To be precise, we need to show that, under this coupling, $\mathbb {P}(f(v) \neq f^{\prime }(v)) \le C(n-k)^{-\alpha }$ for some $C,\alpha>0$ and any $0 \le k \le n$ and $v \in \Lambda _{k}$ . Fix such a k and v. By construction, $f(v)$ and $f^{\prime }(v)$ are equal unless the coupling fails for v. Thus, it suffices to show that

(4.6) $$ \begin{align} \mathbb{P}(\text{coupling fails for }v) \le C(n-k)^{-\alpha}. \end{align} $$

We continue to use the loops $L_{i}$ and $L^{\prime }_{i}$ as defined above with respect to v. Let $\mathcal {F}_{i}$ denote the $\sigma $ -algebra generated by $L_{0},L^{\prime }_{0},\ldots ,L_{i},L^{\prime }_{i}$ and $f_{|(D \setminus D_{i}) \cup L_{i}}$ and $f^{\prime }_{|(D^{\prime } \setminus D^{\prime }_{i}) \cup L^{\prime }_{i}}$ . Note that $\mathcal {F}_{i}$ represents the information revealed at the end of iteration i.

Let us first show that it is unlikely that the coupling fails for v after few iterations. To be precise, we claim that for some constants $C,c,\alpha>0$ , we have

(4.7) $$ \begin{align} \mathbb{P}(\text{coupling fails for }{v}\text{ before iteration }c \log(n-k)) \le C(n-k)^{-\alpha}. \end{align} $$

To this end, let $S_{i}$ denote the largest $j \ge 0$ such that $L^{\prime }_{i}$ surrounds $\Lambda _{2^{j}}(v)$ , and set $S_{i}=-1$ if $L^{\prime }_{i}$ does not exist. Note that $S_{0} \ge S_{1} \ge S_{2} \ge \cdots $ since the loops are nested, and that $S_{0} \ge \lfloor \log _{2}(n-k) \rfloor $ since $\Lambda _{n-k}(v) \subset \Lambda _{n} \subset D^{\prime }$ . Note also that if the coupling fails for v before iteration i, then $S_{i}=-1$ . Thus, it suffices to bound $\mathbb {P}(S_{m}=-1)$ , where $m := \lceil c\log (n-k) \rceil $ . We claim that, for any $i \ge 0$ , conditioned on $\mathcal {F}_{i}$ , the difference $S_{i}-S_{i+1}$ is almost surely stochastically dominated by a random variable T having exponential tails. Indeed, if $S_{i}=s$ and $S_{i}-S_{i+1} \ge 2t+1$ , then $t \le s/2$ and either the annulus $A_{2^{s-t},2^{s}}(v)$ contains no loop of height 0 mod 6 surrounding v for $h_{i}$ or the annulus $A_{2^{s-2t},2^{s-t}}(v)$ contains no loop of height 0 mod 6 surrounding v for $h^{\prime }_{i}$ . Corollary 4.2 implies that, given $\mathcal {F}_{i}$ , each of these events has probability at most $Ce^{-ct}$ for some universal constants $C,c>0$ . Therefore, letting $\{T_{j}\}_{j}$ be independent copies of T,

$$ \begin{align*} \mathbb{P}(S_{m}=-1) \le \mathbb{P}(T_{1}+\cdots+T_{m} \ge \lfloor \log_{2}(n-k) \rfloor) \le C(n-k)^{-\alpha},\end{align*} $$

where the second inequality follows from a standard Chernoff bound for i.i.d. random variables with exponential tails by choosing c small enough.

Now that we know that the coupling does not fail for v before order $\log (n-k)$ iterations, we aim to show that in each such iteration there is a constant probability of a successful coupling for v. It will be helpful to consider two consecutive iterations at a time. Thus, we aim to show that, for some constants $K,c>0$ , for any even $i \ge 0$ , conditioned on $\mathcal {F}_{i}$ , on the event that $L^{\prime }_{i}$ is at distance at least K from v, with probability at least c, the $i+2$ iteration results in a successful coupling for v. This yields that, for all $i \ge 0$ ,

$$ \begin{align*} \mathbb{P}(\text{coupling neither succeeds nor fails for }{v}\text{ before iteration }i) \le C(1-c)^{i}.\end{align*} $$

Together with (4.7), this gives the required bound (4.6). The reason for considering two consecutive iterations is that, given $\mathcal {F}_{i}$ , it may happen that $L^{\prime }_{i}$ is deep inside $L_{i}$ . We use the first of the two iterations to gain some control on this, showing that, regardless of the relative geometry of $L_{i}$ and $L^{\prime }_{i}$ , there is a constant probability that $L^{\prime }_{i+1}$ is not far from $L_{i+1}$ . In the next iteration, conditioning on $\mathcal {F}_{i+1}$ , we may then assume that we are on this good event, in which case we will be able to show that there is a constant probability that $L_{i+2}$ is a color-0 loop for both f and $f^{\prime }$ (in fact, a loop of height 0 for both $h_{i+1}$ and $h^{\prime }_{i+1}$ ), resulting in a successful coupling for v. We now make this precise.

Condition on $\mathcal {F}_{i}$ and consider the loops $L_{i}$ and $L^{\prime }_{i}$ . By what we have shown above about $S_{i}-S_{i+1}$ , with probability at least $\frac {1}{2}$ , we have that

(4.8) $$ \begin{align} \Lambda_{2^{s-a}}(v) \subset D^{\prime}_{i+1} \subset D_{i+1} \subset D^{\prime}_{i} \not\supset \Lambda_{2^{s+1}}(v) \end{align} $$

for some universal constant a, where $s:=S_{i}$ . Now condition on $\mathcal {F}_{i+1}$ and assume that (4.8) occurs. Then Corollary 4.3 implies that $L_{i+2}$ exists and is a level loop of height 0 for $h_{i+1}$ with probability at least c for some universal constant $c>0$ , as long as $2^{s-a}$ is larger than some universal constant $K^{\prime }$ (which is ensured by choosing $K=2^{a+1}K^{\prime }$ ). Now observe that the domination maintained in the second step of the construction of the coupling implies that $L_{i+2}$ must also be a level loop of height 0 for $h^{\prime }_{i+1}$ , implying that iteration $i+2$ resulted in a successful coupling for v. Thus, there is probability at least $c/2$ that iteration $i+2$ results in a successful coupling for v. This finishes the proof that the constructed coupling has the two properties required by the local mixing condition.
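The way (4.7) combines with the geometric bound on unresolved iterations to give (4.6) can be sketched as follows. Write $m := \lceil c\log (n-k) \rceil $ with c as in (4.7), and write $c^{\prime }$ for the constant in the geometric bound (the name $c^{\prime }$ is ours, to keep the two constants apart). The coupling can fail for v only if it fails before iteration m or is still unresolved at iteration m, so for $n-k \ge 1$ ,

$$ \begin{align*} \mathbb{P}(\text{coupling fails for }v) \le C(n-k)^{-\alpha} + C(1-c^{\prime})^{m} \le C(n-k)^{-\alpha} + C(n-k)^{-c\log (1/(1-c^{\prime}))}, \end{align*} $$

which is a power-law bound with exponent $\min \{\alpha ,c\log (1/(1-c^{\prime }))\}$ .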

We have shown above that the family $\mathcal {M}^{\prime } \subset \mathcal {M}$ consisting of all pairs $(\mu ^{0}_{D},D)$ , where D is a domain, is locally mixing with a power-law rate function. It remains to explain that $\mathcal {M}$ is locally mixing with such a rate function. Suppose that $f \sim \mu ^{ij}_{D}$ for general $D,i,j$ . We may still assume as before that $D^{\prime }$ is the domain $\Lambda ^{\text {e}}_{n}$ and that $i^{\prime }=0$ . The above proof applies to this situation as is, with the only difference being that $L_{0}$ is no longer a color-0 loop for f so that when we appeal to Corollary 4.2 in the first iteration (that is, when arguing that $L_{1}$ is discovered quickly), we need to use the full strength of the corollary (the moreover part). In fact, this shows more, namely, that the larger class $\mathcal {M}^{\prime \prime } \supset \mathcal {M}$ consisting of all pairs $(\mu ^{\xi }_{D},D)$ with D finite and simply connected and $\xi $ a feasible boundary condition with bounded oscillation (as in the sense of Remark 4.5) is also locally mixing with a power-law rate function.

5 Følner independence implies local mixing

In this section, we complete the proof of Proposition 2.6, by showing that any translation-invariant measure $\mu $ on $\mathcal {A}^{\mathbb {Z}^{d}}$ that is Følner independent is also locally mixing. In the proof, we will compare $\mu $ to a measure $\nu _{\mathcal {B}}$ consisting of independent ‘blocks’ having the same marginal distributions as $\mu $ . See [Reference Rudolph and Schmidt28, Reference Shields30] for similar notions.

Suppose that $\mu $ satisfies the definition of Følner independence (Definition 2.4) with $\varepsilon =\tilde {\rho }(n)$ for some rate function $\tilde {\rho }$ . We shall show that $\mu $ is locally mixing with rate function $2\rho $ given by $\rho (n) := 4\tilde {\rho }(\lfloor {n}/{12} \rfloor )$ . Thus, we fix $n \ge 1$ and aim to construct a coupling between two samples of $\mu $ with the two properties required by the definition of local mixing (Definition 2.1). To avoid measure-theoretic technicalities, we also fix $N \gg n$ and construct the coupling between two samples of $\mu _{|\Lambda _{N}}$ (with the bound on the probability of disagreement independent of N). Taking any subsequential limit of these couplings as $N \to \infty $ will yield the required coupling. Thus, it suffices to construct a coupling between $f \sim \mu _{|\Lambda _{N}}$ and $f^{\prime } \sim \mu _{|\Lambda _{n}}$ such that $f_{|\Lambda _{N} \setminus \Lambda _{n}}$ and $f^{\prime }$ are independent and $\mathbb {P}(f(v) \neq f^{\prime }(v)) \le 2\rho (k)$ for any $0 \le k \le n$ and $v \in \Lambda _{n-k}$ . In turn, it suffices to construct a measure $\nu $ on $\Lambda _{n}$ and a coupling between $f \sim \mu _{|\Lambda _{N}}$ and $f^{\prime } \sim \nu $ such that $f_{|\Lambda _{N} \setminus \Lambda _{n}}$ and $f^{\prime }$ are independent and $\mathbb {P}(f(v) \neq f^{\prime }(v)) \le \rho (k)$ for any $0 \le k \le n$ and $v \in \Lambda _{n-k}$ (this indeed suffices as one may first sample $f^{\prime } \sim \nu $ and then, conditionally on $f^{\prime }$ , independently sample f and $\tilde f$ from the latter coupling; since $f_{|\Lambda _{N} \setminus \Lambda _{n}}$ is independent of $f^{\prime }$ , it is also independent of $\tilde f$ ; this thus yields the required coupling).
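The gluing step in the last parenthesis amounts to a union bound. As a sketch: sample $f^{\prime } \sim \nu $ first and then, conditionally on $f^{\prime }$ , sample f and $\tilde f$ independently from the coupling with $\nu $ . Then the restriction of $\tilde f$ to $\Lambda _{n}$ has law $\mu _{|\Lambda _{n}}$ , and for any $0 \le k \le n$ and $v \in \Lambda _{n-k}$ ,

$$ \begin{align*} \mathbb{P}(f(v) \neq \tilde f(v)) \le \mathbb{P}(f(v) \neq f^{\prime}(v)) + \mathbb{P}(\tilde f(v) \neq f^{\prime}(v)) \le 2\rho(k). \end{align*} $$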

Throughout the proof, we redefine $\Lambda _{k}$ to be the box $\{-k+1,\ldots ,k\}^{d}$ so that it has side length $2k$ and volume $(2k)^{d}$ . This is merely for notational convenience, so that $\Lambda _{k}$ perfectly tiles $\Lambda _{mk}$ for any integer $m \ge 1$ . The notions of local mixing and Følner independence are clearly unaffected by this change. We also let $\Lambda _{0}$ denote the singleton consisting of the origin.
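The tiling claim is easy to check by brute force for small parameters. The following is a minimal sketch (the helper names `box` and `tiles` are ours): it builds $\Lambda _{k}=\{-k+1,\ldots ,k\}^{d}$ , translates it by the offsets $-(m-1)k+2kj$ for $0 \le j < m$ in each coordinate, and compares the union with $\Lambda _{mk}$ .

```python
from itertools import product


def box(k, d, shift=None):
    """Points of Lambda_k + shift, where Lambda_k = {-k+1, ..., k}^d."""
    if shift is None:
        shift = (0,) * d
    return {
        tuple(c + s for c, s in zip(point, shift))
        for point in product(range(-k + 1, k + 1), repeat=d)
    }


def tiles(k, m, d):
    """Translates of Lambda_k whose union is expected to be Lambda_{mk}."""
    # Per-coordinate offsets placing m boxes of side 2k next to each other.
    offsets = [-(m - 1) * k + 2 * k * j for j in range(m)]
    return [box(k, d, shift) for shift in product(offsets, repeat=d)]


if __name__ == "__main__":
    k, m, d = 2, 3, 2
    pieces = tiles(k, m, d)
    covered = set().union(*pieces)
    # Equal union and equal total volume together mean a perfect tiling.
    print(covered == box(m * k, d), sum(len(t) for t in pieces) == (2 * m * k) ** d)
```

With the symmetric box $\{-k,\ldots ,k\}^{d}$ of side length $2k+1$ , no such tiling exists in general, which is the reason for the change of convention.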

By the choice of $\tilde {\rho }$ , for any $k \ge 0$ , there is a collection of couplings $(\pi _{k}^{\tau })_{\tau \in \mathcal {A}^{\Lambda _{2N} \setminus \Lambda _{k}}}$ between $\mu (f_{|\Lambda _{k}} \in \cdot \mid f_{|\Lambda _{2N} \setminus \Lambda _{k}}=\tau )$ and $\mu |_{\Lambda _{k}}$ such that $({1}/{|\Lambda _{k}|}) \sum _{v \in \Lambda _{k}} \pi _{k}^{\tau }(f(v) \neq f^{\prime }(v)) \le \tilde {\rho }(k)$ for all $\tau $ but a set of $\mu _{|\Lambda _{2N} \setminus \Lambda _{k}}$ -measure at most $\tilde {\rho }(k)$ . By sampling $f_{|\Lambda _{2N} \setminus \Lambda _{k}}$ from $\mu _{|\Lambda _{2N} \setminus \Lambda _{k}}$ and then sampling from $\pi _{k}^{f_{|\Lambda _{2N} \setminus \Lambda _{k}}}$ , this gives a (non-random) coupling $\pi _{k}$ of $f \sim \mu _{|\Lambda _{2N}}$ and $f^{\prime } \sim \mu _{|\Lambda _{k}}$ such that $f_{|\Lambda _{2N} \setminus \Lambda _{k}}$ and $f^{\prime }$ are independent and whose marginal probabilities satisfy

$$ \begin{align*} \frac{1}{|\Lambda_{k}|}\sum_{v \in \Lambda_{k}} \pi_{k}(f(v) \neq f^{\prime}(v)) \le 2\tilde{\rho}(k).\end{align*} $$
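The factor 2 comes from splitting according to whether the sampled boundary condition is exceptional. As a sketch, let G denote the set of $\tau $ for which the averaged bound $\tilde {\rho }(k)$ holds (the symbol G is ours), so that the complement of G has $\mu _{|\Lambda _{2N} \setminus \Lambda _{k}}$ -measure at most $\tilde {\rho }(k)$ . Bounding the disagreement probability by 1 outside G, and writing $\mathbb {E}$ for expectation over $\tau \sim \mu _{|\Lambda _{2N} \setminus \Lambda _{k}}$ , gives

$$ \begin{align*} \frac{1}{|\Lambda_{k}|}\sum_{v \in \Lambda_{k}} \pi_{k}(f(v) \neq f^{\prime}(v)) = \mathbb{E}\bigg[\frac{1}{|\Lambda_{k}|}\sum_{v \in \Lambda_{k}} \pi_{k}^{\tau}(f(v) \neq f^{\prime}(v))\bigg] \le \tilde{\rho}(k) \cdot \mathbb{P}(\tau \in G) + \mathbb{P}(\tau \notin G) \le 2\tilde{\rho}(k). \end{align*} $$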

We aim to construct such a coupling (with $\Lambda _{k}$ replaced by $\Lambda _{n}$ ) in which a similar bound holds term by term, not just on average.

We extend the collection $(\pi _{k}^{\tau })$ to include $\tau $ which are defined on any subset of $\Lambda _{2N} \setminus \Lambda _{k}$ , by averaging over the values on the remaining part outside of $\Lambda _{k}$ . That is, if $S \subsetneq \Lambda _{2N} \setminus \Lambda _{k}$ , then for $\tau \in \mathcal {A}^{S}$ , we define $\pi ^{\tau }_{k}(\cdot ) := \mathbb {E}[\pi _{k}^{\xi _{|\Lambda _{2N} \setminus \Lambda _{k}}}(\cdot )]$ , where $\xi \sim \mu (\cdot \mid \tau )$ . Observe that for any such S, if $\tau \sim \mu _{|S}$ , then for any $v \in \Lambda _{k}$ ,

$$ \begin{align*} \mathbb{E}[\pi^{\tau}_{k}(f(v) \neq f^{\prime}(v))] = \pi_{k}(f(v) \neq f^{\prime}(v)) =: p_{k,v}.\end{align*} $$

Note that the above would not necessarily hold if instead of the above averaging we were to appeal to Følner independence again (which would yield an unrelated $\pi _{k}^{\tau }$ ).

We now also extend the collection $(\pi _{k}^{\tau })$ to allow translates of $\Lambda _{k}$ as follows. Let $B=\Lambda _{k}+b$ be a box centered at b and suppose that $B \subset \Lambda _{n}$ . For a boundary condition $\tau $ defined on a subset S of $\Lambda _{N} \setminus B$ , we define $\pi _{B}^{\tau }$ to be the coupling between ${\mu (f_{|B} \in \cdot \mid f_{|\Lambda _{N} \setminus B}=\tau )}$ and $\mu _{|B}$ obtained by translating B and $\tau $ to the origin, applying the appropriate coupling, and translating back. Specifically, define $\pi _{B}^{\tau }(E) := \mathbb {P}((f_{v-b},f^{\prime }_{v-b})_{v \in B} \in E)$ for any $E \subset \mathcal {A}^{B} \times \mathcal {A}^{B}$ , where $(f,f^{\prime }) \sim \pi _{k}^{\tau ^{\prime }}$ and $\tau ^{\prime } \in \mathcal {A}^{S-b}$ is defined by $\tau ^{\prime }_{v-b}:=\tau _{v}$ for $v \in S$ . Note that this is well defined since $\tau ^{\prime }$ is defined on $S-b$ which is a subset of $\Lambda _{2N} \setminus \Lambda _{k}$ , and that this is a coupling between the two claimed measures by the translation-invariance of $\mu $ .

Let $\mathcal {B}=\{B_{1},\ldots ,B_{\ell }\}$ be a partition of $\Lambda _{n}$ into boxes (of various sizes). We construct a coupling ${\sf P}_{\mathcal {B}}$ of $f \sim \mu _{|\Lambda _{N}}$ and $f^{\prime } \sim \nu _{\mathcal {B}} := \mu |_{B_{1}} \times \cdots \times \mu |_{B_{\ell }}$ as follows. Let $k_{1},\ldots ,k_{\ell }$ be the sizes of the boxes and let $b_{1},\ldots ,b_{\ell }$ be their centers, so that $B_{i}=\Lambda _{k_{i}}+b_{i}$ for all i. Denote $B_{0} := \Lambda _{N} \setminus \Lambda _{n}$ . First, sample $f_{|B_{0}}$ . Next, conditioned on $f_{|B_{0}}$ , sample $(f_{|B_{1}},f^{\prime }_{|B_{1}})$ from $\pi _{B_{1}}^{f_{|B_{0}}}$ . Now suppose we have already sampled f on $B_{0} \cup \cdots \cup B_{i-1}$ and $f^{\prime }$ on $B_{1} \cup \cdots \cup B_{i-1}$ , and conditioned on this, sample $(f_{|B_{i}},f^{\prime }_{|B_{i}})$ from $\pi _{B_{i}}^{f_{|B_{0} \cup \cdots \cup B_{i-1}}}$ . It is straightforward that this procedure defines a pair $(f,f^{\prime })$ such that $f \sim \mu _{|\Lambda _{N}}$ and $f^{\prime } \sim \nu _{\mathcal {B}}$ , and such that $f_{|B_{0}}$ is independent of $f^{\prime }$ . Furthermore,

(5.1) $$ \begin{align} {\sf P}_{\mathcal{B}}(f(v) \neq f^{\prime}(v)) \le p_{k_{i},v-b_{i}} \quad\text{for any }1 \le i \le \ell\text{ and } v \in B_{i}. \end{align} $$

We define a coupling $\sf P$ between $f \sim \mu _{|\Lambda _{N}}$ and $f^{\prime } \sim \nu $ (with $\nu $ defined below) by choosing $\mathcal {B}$ randomly and then applying ${\sf P}_{\mathcal {B}}$ (conditionally independently of $\mathcal {B}$ ). We construct $\mathcal {B}$ as follows (see Figure 2). Let $m:=\lfloor \log _{4} n \rfloor $ and choose a uniformly random $x \in \Lambda _{4^{m}}$ . For every integer i between $0$ and m, and in decreasing order (that is, starting from $i=m$ ), extend $\Lambda _{4^{i}}+x$ to a tiling of $\mathbb {Z}^{d}$ by translates of $\Lambda _{4^{i}}$ , and add to $\mathcal {B}$ those boxes of the tiling that are at distance at least $4^{i}$ from $\Lambda _{n}^{c}$ and disjoint from all boxes already in $\mathcal {B}$ . At the end of this procedure, any vertex of $\Lambda _{n}$ that is not finally covered by a box in $\mathcal {B}$ is added to $\mathcal {B}$ as a singleton. We also order the boxes in $\mathcal {B}$ arbitrarily. This yields a coupling $\sf P$ between $f \sim \mu _{|\Lambda _{N}}$ and $f^{\prime } \sim \nu := \mathbb {E}[\nu _{\mathcal {B}}]$ .

Figure 2 The random partition of $\Lambda _{n}$ into boxes.
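The construction of the random partition can be sketched in code. This is an illustration under stated assumptions rather than the authors' implementation: the function name `random_partition` is hypothetical, the distance to $\Lambda _{n}^{c}$ is taken to be the $\ell ^{\infty }$ distance (the text does not specify the metric), and boxes are recorded as a lower corner together with a side length.

```python
import itertools
import random


def random_partition(n, d=1, rng=random):
    """Random partition of Lambda_n = {-n+1, ..., n}^d into boxes (sketch).

    Returns a list of (lower_corner, side_length) pairs. Assumes n >= 1 and
    measures the distance to the complement of Lambda_n in the sup norm.
    """
    m = 0
    while 4 ** (m + 1) <= n:  # m = floor(log_4 n)
        m += 1
    top = 4 ** m
    x = tuple(rng.randint(-top + 1, top) for _ in range(d))  # uniform in Lambda_{4^m}
    covered, boxes = set(), []
    for i in range(m, -1, -1):  # scales in decreasing order
        r = 4 ** i
        side = 2 * r
        axes = []
        for c in range(d):
            first = x[c] - r + 1  # lower endpoint of Lambda_r + x in coordinate c
            # Lower endpoints on the tiling grid whose box stays at distance >= r
            # from the complement of Lambda_n in this coordinate.
            axes.append([
                lo for lo in range(-n + 1, n + 1)
                if (lo - first) % side == 0
                and lo + n >= r
                and n + 1 - (lo + side - 1) >= r
            ])
        for corner in itertools.product(*axes):
            pts = set(itertools.product(*[range(lo, lo + side) for lo in corner]))
            if covered.isdisjoint(pts):  # keep only boxes disjoint from earlier ones
                covered |= pts
                boxes.append((corner, side))
    # Any vertex left uncovered becomes a singleton.
    for v in itertools.product(range(-n + 1, n + 1), repeat=d):
        if v not in covered:
            covered.add(v)
            boxes.append((v, 1))
    return boxes


if __name__ == "__main__":
    for corner, side in random_partition(10, d=1, rng=random.Random(0)):
        print(corner, side)
```

The disjointness check is needed because, with the convention $\Lambda _{k}=\{-k+1,\ldots ,k\}^{d}$ , the tiling at scale $4^{i}$ through $\Lambda _{4^{i}}+x$ is not a refinement of the tiling at scale $4^{i+1}$ through $\Lambda _{4^{i+1}}+x$ .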

By construction, under this coupling, $f_{|\Lambda _{N}\setminus \Lambda _{n}}$ and $f^{\prime }$ are independent. It remains to show that ${\sf P}(f(v) \neq f^{\prime }(v)) \le \rho (k)$ for any $0 \le k \le n$ and vertex $v \in \Lambda _{n-k}$ . We may assume that $k \ge 12$ as otherwise $\rho (k)>1$ and there is nothing to prove.

Suppose first that $3 \cdot 4^{i} < k < 4^{i+1}$ for some $1 \le i \le m$ . In this case, whatever x happens to be, v always belongs to a box $B_{v} \in \mathcal {B}$ of size $4^{i}$ . Since x is chosen uniformly in $\Lambda _{4^{m}}$ and $4^{m}$ is a multiple of $4^{i}$ , it is easy to see using (5.1) that

$$ \begin{align*} {\sf P}(f(v) \neq f^{\prime}(v)) \le \frac{1}{|\Lambda_{4^{i}}|} \sum_{u \in \Lambda_{4^{i}}} p_{4^{i},u} \le 2\tilde{\rho}(4^{i}) \le 2\tilde{\rho}\bigg(\frac k4\bigg) \le \rho(k).\end{align*} $$

Otherwise, $4^{i} \le k \le 3 \cdot 4^{i}$ for some $1 \le i \le m-1$ . In this case, depending on the value of x, the box $B_{v} \in \mathcal {B}$ to which v belongs has size either $4^{i}$ or $4^{i-1}$ . For each $u \in \Lambda _{4^{i}}$ , let $a_{u}$ denote the number of choices for x such that $B_{v}$ is a box of size $4^{i}$ and $v \in u+4^{i} \mathbb {Z}^{d}$ , and similarly, for each $u \in \Lambda _{4^{i-1}}$ , let $b_{u}$ denote the number of choices for x such that $B_{v}$ is a box of size $4^{i-1}$ and $v \in u+4^{i-1} \mathbb {Z}^{d}$ . Then, using (5.1), we obtain that

$$ \begin{align*} {\sf P}(f(v) \neq f^{\prime}(v)) \le \frac{1}{|\Lambda_{4^{m}}|} \bigg[ \sum_{u \in \Lambda_{4^{i}}} a_{u} p_{4^{i},u} + \sum_{u \in \Lambda_{4^{i-1}}} b_{u} p_{4^{i-1},u} \bigg].\end{align*} $$

It is not hard to see that there exist a and b such that $a_{u} \in \{a,a+1\}$ for all $u \in \Lambda _{4^{i}}$ and $b_{u} \in \{b,b+1\}$ for all $u \in \Lambda _{4^{i-1}}$ . Using the bounds $a_{u} \le a+1$ , $b_{u} \le b+1$ , $a|\Lambda _{4^{i}}|+b|\Lambda _{4^{i-1}}| \le |\Lambda _{4^{m}}|$ and $|\Lambda _{4^{i}}|+|\Lambda _{4^{i-1}}|\le |\Lambda _{4^{m}}|$ yields that

$$ \begin{align*} {\sf P}(f(v) \neq f^{\prime}(v)) \le 4\tilde{\rho}(4^{i-1}) \le 4\tilde{\rho}\bigg(\frac {k}{12}\bigg) \le \rho(k). \end{align*} $$
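For the reader's convenience, the first inequality in the last display can be unpacked as follows. This is a sketch assuming that $\tilde {\rho }$ is non-increasing (as the comparison $\tilde {\rho }(4^{i}) \le \tilde {\rho }(k/4)$ in the previous case also requires) and using the averaged bound $\sum _{u \in \Lambda _{j}} p_{j,u} \le 2\tilde {\rho }(j)|\Lambda _{j}|$ obtained earlier:

$$ \begin{align*} {\sf P}(f(v) \neq f^{\prime}(v)) &\le \frac{(a+1)|\Lambda_{4^{i}}| \cdot 2\tilde{\rho}(4^{i}) + (b+1)|\Lambda_{4^{i-1}}| \cdot 2\tilde{\rho}(4^{i-1})}{|\Lambda_{4^{m}}|} \\ &\le 2\tilde{\rho}(4^{i-1}) \cdot \frac{(a|\Lambda_{4^{i}}|+b|\Lambda_{4^{i-1}}|) + (|\Lambda_{4^{i}}|+|\Lambda_{4^{i-1}}|)}{|\Lambda_{4^{m}}|} \le 4\tilde{\rho}(4^{i-1}). \end{align*} $$

The second inequality in the last display of the proof then uses $k \le 3 \cdot 4^{i} = 12 \cdot 4^{i-1}$ .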

Acknowledgements

We thank Nishant Chandgotia, Tom Meyerovitch and Ron Peled for several fruitful discussions. We also thank Tom for suggesting this question. The first author thanks Benoit Laslier for some stimulating conversations. Research of G.R. was supported in part by NSERC 50311-57400 and University of Victoria start-up 10000-27458. Research of Y.S. was supported in part by NSERC of Canada.

Figure 1 The sequence of nested loops $L_{0},L^{\prime }_{0},L_{1},L^{\prime }_{1},\ldots $ surrounding $v$ .

Figure 2 The random partition of $\Lambda _{n}$ into boxes.