1 Introduction
A number $x\in [0,1)$ with base $10$ expansion $x=0.a_{1}a_{2}a_{3}\ldots \,$ is said to be normal (to base $10$ ) if for any finite string $s=[c_{1},c_{2},\ldots ,c_{k}]$ we have that
$$\lim _{N\rightarrow \infty }\frac{\#\{1\leqslant n\leqslant N:a_{n}=c_{1},a_{n+1}=c_{2},\ldots ,a_{n+k-1}=c_{k}\}}{N}=\frac{1}{10^{k}}.$$
Although almost all real numbers are normal,Footnote 1 we still do not know of a single commonly used mathematical constant, such as $\unicode[STIX]{x1D70B}$ , $e$ , or $\sqrt{2}$ , that is normal.
In his PhD thesis under Lehmer, Wall [Reference WallWal50] proved a series of results on normal numbers which are now considered classical, elementary facts. Among them, Wall proved that if $x$ is normal, then $qx+r$ is normal for any rational numbers $q,r$ with $q\neq 0$ . Chang [Reference ChangCha76] appears to have discovered this result independently, while Doty et al. [Reference Doty, Lutz and NandakumarDLN07] knew of Wall’s result and reproved it by a different method. Aistleitner [Reference AistleitnerAis11] has given the only significant extension of Wall’s result the author is aware of, showing that if $x$ is normal and $y\in \mathbb{R}$ is a number with almost all of its digits equal to $0$ , then $x+ry$ is normal for any rational $r$ . (See also [Reference BugeaudBug12, p. 97].)
Although the definition of normality is easily extended to many other digital systems, questions about which operations preserve normality are still unanswered in most cases. Recently, the present author, with Airey and Mance [Reference Airey, Mance and VandeheyAMV15], studied how rational multiplication and addition act for $Q$ -Cantor series expansions.
However, in this paper, we shall be interested in normality for continued fraction expansions, which we shall abbreviate as CF-normality and define explicitly in a moment. Mendès France first asked the question of which operations preserve CF-normality [Reference MauduitMau00, pp. 17–18]. He actually asked a simpler question, namely if non-zero rational multiplication preserves simple normalityFootnote 2 for continued fractions. Yann Bugeaud extended the question to ask whether non-zero rational multiplication preserved CF-normality [Reference BugeaudBug12, Problem 10.56, p. 222]. Part of the difficulty of proving such a result comes from the fact that rational multiplication and addition are operations that are not very well understood for continued fractions. Research on these topics appears to have come almost completely from a computational side (‘Given a continued fraction expansion $x$ , how can we quickly compute the continued fraction expansion of $qx+r$ ?’). Notable works include Gosper [Reference GosperGos72], Raney [Reference RaneyRan73], and Liardet and Stambul [Reference Liardet and StambulLS98]. On the theoretical side (‘If $x$ has a continued fraction expansion with property $Y$ , does $qx+r$ have property $Z$ ?’), the author is unaware of any significant result.
We recall some standard definitions for continued fractions. The (regular) continued fraction expansion of a number $x\in \mathbb{R}$ is given by
$$x=a_{0}+\cfrac{1}{a_{1}+\cfrac{1}{a_{2}+\cfrac{1}{a_{3}+\ddots }}},\qquad a_{0}\in \mathbb{Z},\ a_{1},a_{2},a_{3},\ldots \in \mathbb{N}.$$
We will denote this expansion by $\langle a_{0};a_{1},a_{2},\ldots \,\rangle$ for typographical simplicity. This expansion is infinite if and only if $x$ is irrational. We will refer to the $n$ th digit of the continued fraction expansion of $x$ by $a_{n}(x)$ or just $a_{n}$ if the choice of $x$ is clear. The Gauss map $T:[0,1)\rightarrow [0,1)$ given by
$$Tx=\frac{1}{x}-\biggl\lfloor \frac{1}{x}\biggr\rfloor \quad \text{for }x\neq 0,\qquad T0=0,$$
acts as a forward shift on the continued fraction digits, ignoring $a_{0}$ . The Gauss measure $\unicode[STIX]{x1D707}$ given by
$$\unicode[STIX]{x1D707}(A)=\frac{1}{\log 2}\int _{A}\frac{dx}{1+x}$$
is a probability measure, preserved by $T$ , and is ergodic with respect to $T$ . Given a string $s=[c_{1},c_{2},\ldots ,c_{k}]$ , we define the cylinder set of $s$ to be
$$C_{s}=\{x\in [0,1):a_{1}(x)=c_{1},a_{2}(x)=c_{2},\ldots ,a_{k}(x)=c_{k}\},$$
and we say this cylinder set has rank $k$ . We shall also need the usual matrix action on real numbers given by
$$\begin{pmatrix}a & b\\ c & d\end{pmatrix}x=\frac{ax+b}{cx+d}.$$
With these definitions in mind, we say that a point $x\in [0,1)$ is CF-normal if for any string $s$ , we have
$$\lim _{n\rightarrow \infty }\frac{\#\{0\leqslant i\leqslant n-1:T^{i}x\in C_{s}\}}{n}=\unicode[STIX]{x1D707}(C_{s}).\qquad (1)$$
Since $T^{i}x\in C_{s}$ if and only if the string $s$ appears in the continued fraction expansion of $x$ starting at the ( $i+1$ )th position, the left-hand side of (1) represents the limiting frequency with which $s$ appears in the continued fraction expansion of $x$ . By the pointwise ergodic theorem, almost all numbers $x\in [0,1)$ are CF-normal. We extend the definition of CF-normal to all $x\in \mathbb{R}$ to say $x$ is CF-normal if $x-a_{0}(x)$ is CF-normal.
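For readers who wish to experiment, the following Python sketch (not part of the original argument) computes continued fraction digits with the Euclidean algorithm and compares the empirical frequency of a few single-digit strings with the Gauss measure of the corresponding rank- $1$ cylinder sets. The sample point is a pseudo-random rational stand-in for a 'typical' number; it is not claimed to be CF-normal.

```python
from fractions import Fraction
from math import log2
import random

def cf_digits(x, max_digits=500):
    """Continued fraction digits a_1, a_2, ... of x in (0, 1), via the Euclidean algorithm."""
    digits = []
    while x != 0 and len(digits) < max_digits:
        a = int(1 / x)          # a_{n+1} = floor(1/x)
        digits.append(a)
        x = 1 / x - a           # this is exactly the Gauss map T(x) = 1/x - floor(1/x)
    return digits

def gauss_measure_digit(a):
    """mu(C_[a]) for the rank-1 cylinder C_[a] = (1/(a+1), 1/a)."""
    return log2((a + 1) ** 2 / (a * (a + 2)))

random.seed(0)
# A pseudo-random rational with roughly 300 decimal digits; merely a stand-in, not a normal number.
x = Fraction(random.randrange(1, 10 ** 300), 10 ** 300)
digits = cf_digits(x)

for a in (1, 2, 3):
    freq = digits.count(a) / len(digits)
    print(f"digit {a}: empirical frequency {freq:.3f}  vs  Gauss measure {gauss_measure_digit(a):.3f}")
```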
Our main result will be the following, which not only answers Bugeaud’s question in the affirmative, but shows that any non-trivial linear fractional transformation preserves CF-normality. The method of proof, which we shall outline in § 1.1, makes use of ideas about extending ergodicity to a skew product (as was done in [Reference Jager and LiardetJL88]) as well as ideas about finite automata acting on the digits of a normal number (as was done in [Reference Becher, Carton and HeiberBCH15]).
Theorem 1.1. Let $M$ be a $2\times 2$ matrix with coefficients in $\mathbb{Z}$ and non-zero determinant. Let $x\in \mathbb{R}$ be CF-normal. Then $Mx$ is also CF-normal.
In particular, if $x=\langle 0;a_{1},a_{2},\ldots \,\rangle$ is CF-normal, $c_{0}\in \mathbb{Z}$ , and $c_{1},c_{2},\ldots ,c_{k}\in \mathbb{N}$ , then
$$\langle c_{0};c_{1},c_{2},\ldots ,c_{k},a_{1},a_{2},a_{3},\ldots \rangle =\begin{pmatrix}c_{0} & 1\\ 1 & 0\end{pmatrix}\begin{pmatrix}c_{1} & 1\\ 1 & 0\end{pmatrix}\cdots \begin{pmatrix}c_{k} & 1\\ 1 & 0\end{pmatrix}\begin{pmatrix}0 & 1\\ 1 & 0\end{pmatrix}x$$
is also CF-normal.
It is easy to see that this preserves normality. However, all of the matrices on the left here have determinant $\pm 1$ , and thus so does their product. In fact, for any matrix with determinant $\pm 1$ , the action of the matrix on an irrational number $x$ will alter the head of the expansion and leave the tail unchanged (see [Reference Borwein, van der Poorten, Shallit and ZudilinBvdPSZ14, Theorem 2.37]), thus preserving CF-normality.
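The following Python sketch gives a purely illustrative check of this tail-preservation phenomenon; the matrix $(\!\begin{smallmatrix}2 & 1\\ 1 & 1\end{smallmatrix}\!)$ (determinant $1$ ) and the sample point are choices made only for this example. For $x=\langle 0;a_{1},a_{2},\ldots \,\rangle$ one computes $(2x+1)/(x+1)=1+1/(1+1/x)=\langle 1;a_{1}+1,a_{2},a_{3},\ldots \,\rangle$ , so only the head of the expansion changes.

```python
from fractions import Fraction

def cf(x, max_digits=200):
    """Canonical continued fraction digits [a0; a1, a2, ...] of a non-negative rational x."""
    a0 = int(x)                  # int() truncates toward zero, which is floor for x >= 0
    digits = [a0]
    x -= a0
    while x != 0 and len(digits) < max_digits:
        x = 1 / x
        a = int(x)
        digits.append(a)
        x -= a
    return digits

# A rational point with a fairly long expansion (an arbitrary illustrative choice).
x = Fraction(123456789123456789, 987654321987654327)
Mx = (2 * x + 1) / (x + 1)       # the action of the determinant-one matrix (2 1; 1 1)

cf_x, cf_Mx = cf(x), cf(Mx)
print(cf_x[:8])
print(cf_Mx[:8])
# Only the head changes: <1; a1 + 1, a2, a3, ...> versus <0; a1, a2, a3, ...>.
print(cf_Mx == [1, cf_x[1] + 1] + cf_x[2:])
```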
We emphasize that Theorem 1.1 works for any matrix that does not have determinant zero, not just those with determinant $\pm 1$ . The reason why we exclude matrices with determinant zero is that in such a case $Mx$ will always be the same rational number, regardless of which $x$ is chosen.
Due to earlier work of Kraaikamp and Nakada [Reference Kraaikamp and NakadaKN00] and the author [Reference VandeheyVan14], we know that normality for regular continued fraction expansions, the kind we are studying in this paper, is equivalent to normality for nearest-integer continued fractions and continued fractions with odd partial quotients. Thus, Theorem 1.1 holds for these expansions as well.
1.1 The idea and outline of the proof
Let us return to the question of normality to base $10$ and give a glimpse into why Wall’s result is true.
Given a number $x$ that is base $10$ normal, how often do we expect to see the digit $7$ appear in the base $10$ expansion of $2x$ ? We should see a $7$ appear in the $n$ th position of $2x$ whenever we see one of the strings $[3,5]$ , $[3,6]$ , $[3,7]$ , $[3,8]$ , $[3,9]$ , $[8,5]$ , $[8,6]$ , $[8,7]$ , $[8,8]$ , or $[8,9]$ appear starting in the $n$ th position of $x$ . We call these strings trigger strings for the string $7$ . But since $x$ is normal, each of these strings appears with limiting frequency $1/100$ and there are $10$ of them, so we expect to see $7$ appear with limiting frequency $1/10$ . In this case, understanding how often trigger strings occur relies on knowing how the sequence $(10^{n}x)$ is distributed modulo $1$ .
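As a quick empirical check of this trigger-string count (illustrative only; the digit string below is pseudo-random, not the expansion of a number known to be normal), the following Python sketch compares the direct count of the digit $7$ in the expansion of $2x$ with the count of trigger strings in the expansion of $x$ .

```python
import random

random.seed(1)
N = 100_000
# Pseudo-random decimal digits standing in for x = 0.a1 a2 a3 ...
a = [random.randrange(10) for _ in range(N)]

# Direct computation: the first N fractional digits of 2x (exact, since x terminates here).
X = int("".join(map(str, a)))                    # x = X / 10**N
two_x_digits = str(2 * X).zfill(N + 1)[-N:]      # fractional digits of 2x
direct = two_x_digits.count("7")

# Trigger-string count: digit n of 2x is 7 iff a_n is 3 or 8 and a_{n+1} >= 5.
triggers = sum(1 for i in range(N - 1) if a[i] in (3, 8) and a[i + 1] >= 5)

print(direct, triggers, direct == triggers)      # the two counts agree exactly
print("frequency of 7 in 2x:", direct / N)        # close to 1/10 for a typical digit string
```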
A slightly harder problem: if $x$ is base $10$ normal, then how often do we expect to see $7$ in the base $10$ expansion of $x/3$ ? Here, to determine what the $n$ th digit of $x/3$ is, we must not only know what the $n$ th digit of $x$ is, we must know something about all of the first $n-1$ digits. In particular, the $n$ th digit of $x/3$ is $7$ if the $n$ th digit of $x$ is $1$ , $2$ , or $3$ , and the sum of the first $n-1$ digits of $x$ is $2$ modulo $3$ . If one could show that each of these options appeared with limiting frequency $1/30$ ,Footnote 3 that would give the desired limiting frequency for the string $7$ in the base $10$ expansion of $x/3$ .
This suggests that to show that division by $3$ preserves normality to base $10$ , we want to understand how the pairs
$$(10^{n}x~\text{mod}~1,\ a_{1}(x)+a_{2}(x)+\cdots +a_{n}(x)~\text{mod}~3),\qquad n=1,2,3,\ldots \qquad (2)$$
distribute in the set $[0,1)\times \{0,1,2\}$ . (Here $a_{i}(x)$ is referring to the $i$ th digit of the base $10$ expansion, not the continued fraction expansion, of $x$ .)
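The long-division computation just described can be simulated directly; the sketch below (again on pseudo-random digits, purely as an illustration) tracks the pair consisting of the current digit and the running remainder modulo $3$ , checks the stated rule for when a $7$ appears in $x/3$ , and estimates the frequencies of the relevant pairs.

```python
import random
from collections import Counter

random.seed(2)
N = 100_000
a = [random.randrange(10) for _ in range(N)]   # digits of a stand-in for a normal number x

sevens_direct = 0
sevens_rule = 0
pair_counts = Counter()
remainder = 0        # the integer a1 a2 ... a_{n-1} modulo 3, i.e. the digit sum modulo 3
for digit in a:
    q, next_remainder = divmod(10 * remainder + digit, 3)   # long division of x by 3
    if q == 7:
        sevens_direct += 1
    # The rule from the text: the n-th digit of x/3 is 7 exactly when a_n is 1, 2 or 3
    # and the sum of the first n-1 digits of x is 2 modulo 3.
    if digit in (1, 2, 3) and remainder == 2:
        sevens_rule += 1
    pair_counts[(digit, remainder)] += 1
    remainder = next_remainder

print(sevens_direct == sevens_rule, "frequency of 7 in x/3:", sevens_direct / N)
print("pair frequencies (each should be near 1/30 = 0.033):")
for key in [(1, 2), (2, 2), (3, 2)]:
    print(key, round(pair_counts[key] / N, 4))
```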
We will run into a similar difficulty with continued fractions. What we will want is some way to examine the tail of the expansion of $Mx$ , and to do this, we want to be able to push the matrix $M$ through the first $n$ digits of the continued fraction expansion for $x$ like so:
with $M_{n}$ belonging to some set ${\mathcal{M}}$ . Our analogy to (2) will be the sequence $(T^{n}x,M_{n})$ and we want to show this distributes nicely in the space $[0,1)\times {\mathcal{M}}$ .
As an explicit example, suppose $M=(\!\begin{smallmatrix}1 & 0\\ 0 & 2\end{smallmatrix}\!)$ . Then
By reinterpreting this in terms of matrices, we get that
If instead, $M=(\!\begin{smallmatrix}2 & 0\\ 0 & 1\end{smallmatrix}\!)$ , then there are more possibilities, depending on the value of $a_{1}(x)$ :
In § 2, we state some of the results of Liardet and Stambul mentioned earlier, which will essentially say that if we started with a ‘nice’ matrix $M$ , then we can in fact choose the matrices $M_{n}$ in (3) to always belong to a particular finite set ${\mathcal{M}}$ ; and in Lemma 2.3, we show that it suffices to prove Theorem 1.1 when $M\in {\mathcal{M}}$ . Theorem 3.1 then says that the sequence $(T^{n}x,M_{n})$ is nicely distributed with respect to some measure, provided $M$ starts in a subset of ${\mathcal{M}}$ with good properties. In § 4, we again show that it suffices to prove Theorem 1.1 when $M$ is in such a subset. Finally, we complete the proof of Theorem 1.1 in § 5, using a new definition of trigger strings as part of a key lemma.
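To make the push-through idea above concrete, here is a rough Python sketch in the spirit of Gosper's continued fraction arithmetic. It is not the Liardet–Stambul transducer used later in the paper; the stopping rule, the helper functions, and the sample input are all choices made for this illustration. The routine feeds the digits of $x$ into the matrix one at a time and emits digits of $Mx$ as soon as they are forced.

```python
from fractions import Fraction

def cf(x, max_digits=200):
    """Canonical continued fraction [a0; a1, a2, ...] of a non-negative rational x."""
    digits = [int(x)]
    x -= digits[0]
    while x != 0 and len(digits) < max_digits:
        x = 1 / x
        digits.append(int(x))
        x -= int(x)
    return digits

def push_matrix_through(M, tail_digits):
    """Continued fraction of (alpha*x + beta)/(gamma*x + delta) for x = <0; a1, a2, ...>,
    where tail_digits = [a1, a2, ...] is finite (so x is rational)."""
    (alpha, beta), (gamma, delta) = M
    # Write x = 1/t with t = <a1; a2, ...>; as a function of t the value is (beta*t + alpha)/(delta*t + gamma).
    A = [[beta, alpha], [delta, gamma]]
    out = []
    for a in tail_digits:
        A = [[A[0][0] * a + A[0][1], A[0][0]],          # ingest a digit: t = a + 1/t'
             [A[1][0] * a + A[1][1], A[1][0]]]
        while A[1][0] > 0 and A[1][0] + A[1][1] > 0:
            q_low = (A[0][0] + A[0][1]) // (A[1][0] + A[1][1])   # floor of the value at t = 1
            q_high = A[0][0] // A[1][0]                          # floor of the value as t -> infinity
            if q_low != q_high:
                break                                            # next digit of Mx not yet determined
            out.append(q_low)                                    # emit a digit of Mx
            A = [A[1], [A[0][0] - q_low * A[1][0], A[0][1] - q_low * A[1][1]]]
    if A[1][0] != 0:
        out.extend(cf(Fraction(A[0][0], A[1][0])))               # flush the exact remaining tail
    return out

x = Fraction(5, 19)                                       # x = <0; 3, 1, 4>
print(push_matrix_through([[1, 0], [0, 2]], cf(x)[1:]))   # digits of x/2
print(cf(x / 2))                                          # direct computation agrees
print(push_matrix_through([[2, 0], [0, 1]], cf(x)[1:]))   # digits of 2x
print(cf(2 * x))
```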
1.2 Notation and definitions
We will denote the continued fraction expansion of $x$ by $\langle a_{0};a_{1},a_{2},\ldots \,\rangle$ and the continued fraction expansion of $Mx$ by $\langle b_{0};b_{1},b_{2},\ldots \,\rangle$ . We will commonly use $n$ to denote an index for $x$ (so we might talk about the $n$ th digit of $x$ ) while we will use $m$ to denote an index for $Mx$ . The length of a string $s=[c_{0};c_{1},\ldots ,c_{n}]$ , denoted by $|s|$ , is $n$ , regardless of the value of $c_{0}$ ; and when $c_{0}=0$ , we will often not write it or the semi-colon, so the string $[c]$ with no semi-colon will always denote the string $[0;c]$ . When we wish to distinguish between strings of digits in $x$ and strings of digits in $Mx$ , we will use $s=[c_{0};c_{1},c_{2},\ldots ,c_{n}]$ for strings in $x$ and $t=[d_{0};d_{1},d_{2},\ldots ,d_{m}]$ for strings in $Mx$ . We will also make reference to a string $r$ that is a substring in the expansion of $Mx$ . We will use $M$ to denote an arbitrary matrix and ${\mathcal{M}}$ to denote a collection of matrices. Some further definitions related to strings and matrices will be given in § 2.
We shall make use of standard asymptotic notation as well. We will say that $f(n)=O(g(n))$ if there exists a constant $C$ (called an implicit constant) such that $|f(n)|\leqslant C\cdot g(n)$ . We will say that $f(n)\asymp g(n)$ (with implicit constant $C$ ) if $f(n)=O(g(n))$ and $g(n)=O(f(n))$ (both with implicit constant $C$ ). If we have two $k\times k$ matrices $K_{1},K_{2}$ , then we say that $K_{1}\asymp K_{2}$ (with implicit constant $C$ ) if $(K_{1})_{i,j}\asymp (K_{2})_{i,j}$ for $1\leqslant i,j\leqslant k$ (uniformly with implicit constant $C$ ). We will say $f(n)=o(g(n))$ if $f(n)/g(n)\rightarrow 0$ as $n\rightarrow \infty$ . If a variable appears in a subscript of a big-O or little-o, this denotes that the implicit constant or rate of decay is dependent on this variable.
We will say a vector or matrix is non-negative (or positive) if all its coordinates are non-negative (or positive). We will call a vector a probability vector if it is non-negative and the sum of its coordinates is $1$ .
2 Matrices and resultant strings
In this section, we will make use of several results of Liardet and Stambul [Reference Liardet and StambulLS98]. Since we rely so heavily on their results, we will also borrow a lot of their notation. We make some critical changes, however, and will point out where our use differs from theirs.
We will define ${\mathcal{S}}$ to be the set of all strings $[c_{0};c_{1},c_{2},\ldots ,c_{k}]$ with $c_{0}\in \mathbb{N}_{{\geqslant}0}$ and $c_{i}\in \mathbb{N}$ for $1\leqslant i\leqslant k$ . We will equate the string $[0;]$ with the empty string, denoted by $\wedge$ . We will let ${\mathcal{S}}^{\ast }$ denote the subset of ${\mathcal{S}}$ of strings with $c_{0}=0$ . Given two strings $s=[c_{0};c_{1},c_{2},\ldots ,c_{k}]$ and $s^{\prime }=[c_{0}^{\prime };c_{1}^{\prime },c_{2}^{\prime },\ldots ,c_{k^{\prime }}^{\prime }]$ both in ${\mathcal{S}}$ , we define the concatenation $s.s^{\prime }\in {\mathcal{S}}$ in the following non-standard way:
$$s.s^{\prime }=[c_{0};c_{1},c_{2},\ldots ,c_{k-1},c_{k}+c_{0}^{\prime },c_{1}^{\prime },c_{2}^{\prime },\ldots ,c_{k^{\prime }}^{\prime }].$$
We also have $s.\wedge =\wedge .s=s$ . The definition of concatenation extends naturally to ${\mathcal{S}}^{\ast }$ .
Given a string $s=[c_{0};c_{1},c_{2},\ldots ,c_{k}]\in {\mathcal{S}}$ , we let
$$\unicode[STIX]{x1D6F1}_{s}=\begin{pmatrix}c_{0} & 1\\ 1 & 0\end{pmatrix}\begin{pmatrix}c_{1} & 1\\ 1 & 0\end{pmatrix}\cdots \begin{pmatrix}c_{k} & 1\\ 1 & 0\end{pmatrix}\begin{pmatrix}0 & 1\\ 1 & 0\end{pmatrix}.$$
(If we let
$$J=\begin{pmatrix}0 & 1\\ 1 & 0\end{pmatrix},$$
then, in Liardet and Stambul’s notation, this would be $\unicode[STIX]{x1D6F1}_{c_{0}c_{1}c_{2}\ldots c_{k}}$ multiplied by $J$ on the right.) We always have that $|\!\det (\unicode[STIX]{x1D6F1}_{s})|=1$ .
Given a string $s\in {\mathcal{S}}$ , we will define $s.x$ for $x\in \mathbb{R}$ by $\unicode[STIX]{x1D6F1}_{s}x$ . By the way that concatenation is defined, we can quickly see that $\unicode[STIX]{x1D6F1}_{s.s^{\prime }}=\unicode[STIX]{x1D6F1}_{s}\unicode[STIX]{x1D6F1}_{s^{\prime }}$ .
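As a quick sanity check of this multiplicative property, the short Python sketch below verifies $\unicode[STIX]{x1D6F1}_{s.s^{\prime }}=\unicode[STIX]{x1D6F1}_{s}\unicode[STIX]{x1D6F1}_{s^{\prime }}$ on random strings. It uses the conventions as written out above, which are reconstructed from Liardet and Stambul's definitions; it is only a numerical illustration, not part of the argument.

```python
import random

def digit_matrix(c):
    return [[c, 1], [1, 0]]

J = [[0, 1], [1, 0]]

def matmul(X, Y):
    return [[sum(X[i][k] * Y[k][j] for k in range(2)) for j in range(2)] for i in range(2)]

def Pi(s):
    """Pi_s for s = [c0; c1, ..., ck], built as the product of digit matrices times J."""
    P = [[1, 0], [0, 1]]
    for c in s:
        P = matmul(P, digit_matrix(c))
    return matmul(P, J)

def concat(s, t):
    """The non-standard concatenation s.t: the leading entry of t is absorbed into the last entry of s."""
    return s[:-1] + [s[-1] + t[0]] + t[1:]

random.seed(3)
for _ in range(100):
    s = [random.randrange(5)] + [random.randrange(1, 10) for _ in range(random.randrange(1, 6))]
    t = [random.randrange(5)] + [random.randrange(1, 10) for _ in range(random.randrange(1, 6))]
    assert Pi(concat(s, t)) == matmul(Pi(s), Pi(t))
print("Pi_{s.s'} = Pi_s Pi_s' verified on 100 random pairs of strings")
```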
We shall let ${\mathcal{E}}$ denote the set of matrices
(This is what Liardet and Stambul refer to as ${\mathcal{E}}_{2}$ . We note that ${\mathcal{E}}$ is invariant under the action of multiplying by $J$ on the left or the right, so that our changes to the definition of $\unicode[STIX]{x1D6F1}_{s}$ from Liardet and Stambul will not have a noticeable impact here.)
Given $D\in \mathbb{N}$ , $D>1$ , let ${\mathcal{M}}_{D}$ denote the subset of ${\mathcal{E}}$ of matrices $M$ satisfying $|\!\det (M)|=D$ .
Lemma 2.1 [Reference Liardet and StambulLS98, Lemma 5].
For any $D\in \mathbb{N}$ , $D>1$ , we have that ${\mathcal{M}}_{D}$ is finite.
Let $D\in \mathbb{N}$ , $D>1$ be fixed. For $a\in \mathbb{N}$ , we wish to define two functions $\unicode[STIX]{x1D719}_{a}:{\mathcal{M}}_{D}\rightarrow {\mathcal{M}}_{D}$ and $\unicode[STIX]{x1D713}_{a}:{\mathcal{M}}_{D}\rightarrow {\mathcal{S}}$ given by the following:
- (i) if $M\unicode[STIX]{x1D6F1}_{[a]}=M^{\prime }$ for some $M^{\prime }\in {\mathcal{M}}_{D}$ , then $\unicode[STIX]{x1D719}_{a}(M)=M^{\prime }$ and $\unicode[STIX]{x1D713}_{a}(M)=\wedge$ ; and
- (ii) if $M\unicode[STIX]{x1D6F1}_{[a]}\not \in {\mathcal{M}}_{D}$ , then according to [Reference Liardet and StambulLS98, Theorem 1], there exists a unique factorization $M\unicode[STIX]{x1D6F1}_{[a]}=\unicode[STIX]{x1D6F1}_{s}M^{\prime }$ with $M^{\prime }\in {\mathcal{M}}_{D}$ and $s\in {\mathcal{S}}$ , and so we let $\unicode[STIX]{x1D719}_{a}(M)=M^{\prime }$ and $\unicode[STIX]{x1D713}_{a}(M)=s$ .
With the functions $\unicode[STIX]{x1D719}_{a}$ and $\unicode[STIX]{x1D713}_{a}$ defined in this way, we have, for any $M\in {\mathcal{M}}_{D}$ and irrational $x=\langle a_{1},a_{2},a_{3},\ldots \,\rangle \in [0,1)$ , that
$$Mx=M\unicode[STIX]{x1D6F1}_{[a_{1}]}(Tx)=\unicode[STIX]{x1D6F1}_{\unicode[STIX]{x1D713}_{a_{1}}(M)}\unicode[STIX]{x1D719}_{a_{1}}(M)(Tx)=\unicode[STIX]{x1D713}_{a_{1}}(M).(\unicode[STIX]{x1D719}_{a_{1}}(M)\,Tx).$$
Iterating this procedure again gives the following:
$$Mx=\unicode[STIX]{x1D713}_{a_{1}}(M).(\unicode[STIX]{x1D713}_{a_{2}}(\unicode[STIX]{x1D719}_{a_{1}}(M)).(\unicode[STIX]{x1D719}_{a_{2}}(\unicode[STIX]{x1D719}_{a_{1}}(M))\,T^{2}x)).$$
In order to continue iterating this procedure without devolving into a typographical nightmare, let us define two new functions $\unicode[STIX]{x1D6F7}_{s}:{\mathcal{M}}_{D}\rightarrow {\mathcal{M}}_{D}$ and $\unicode[STIX]{x1D6F9}_{s}:{\mathcal{M}}_{D}\rightarrow {\mathcal{S}}$ for $s\in {\mathcal{S}}^{\ast }$ .Footnote 4 These functions are defined iteratively beginning with the base cases
and the iterative relations
This function $\unicode[STIX]{x1D6F9}_{s}(M)$ , which Liardet and Stambul refer to as the output of the transducer associated to the string $s$ and the initial state $M$ , we shall instead refer to as the resultant string of the pair $(s,M)$ .
With these definitions, we see that for any $M\in {\mathcal{M}}_{D}$ and any $x=\langle a_{1},a_{2},a_{3}\ldots \,\rangle \in [0,1)$ the following relation holds
$$Mx=\unicode[STIX]{x1D6F9}_{[a_{1},a_{2},\ldots ,a_{n}]}(M).(\unicode[STIX]{x1D6F7}_{[a_{1},a_{2},\ldots ,a_{n}]}(M)\,T^{n}x)\quad \text{for all }n\in \mathbb{N}.\qquad (4)$$
Lemma 2.2. Let $M\in {\mathcal{M}}_{D}$ for an integer $D>1$ and let $s=[c_{1},c_{2},\ldots ,c_{k}]\in {\mathcal{S}}$ . We have that
Proof. The lower inequality is an immediate consequence of [Reference Liardet and StambulLS98, Theorem 2].
The upper inequality starts with the fact that $|s.s^{\prime }|=|s|+|s^{\prime }|$ , and thus we only need to prove that $|\unicode[STIX]{x1D713}_{c}(M)|\ll 1$ for any $c\in \mathbb{N}$ and any $M\in {\mathcal{M}}_{D}$ .
Consider the matrix $M\unicode[STIX]{x1D6F1}_{[c]}$ . If this is in ${\mathcal{M}}_{D}$ , then $\unicode[STIX]{x1D713}_{c}(M)=\wedge$ and there is nothing to prove. Suppose otherwise for the remainder of the proof, that $M\unicode[STIX]{x1D6F1}_{[c]}\not \in {\mathcal{M}}_{D}$ .
Let us define $\unicode[STIX]{x1D6FC},\unicode[STIX]{x1D6FC}^{\prime },\unicode[STIX]{x1D6FD},\unicode[STIX]{x1D6FD}^{\prime },\unicode[STIX]{x1D6FE},\unicode[STIX]{x1D6FE}^{\prime },\unicode[STIX]{x1D6FF},\unicode[STIX]{x1D6FF}^{\prime }$ by
$$M=\begin{pmatrix}\unicode[STIX]{x1D6FC} & \unicode[STIX]{x1D6FD}\\ \unicode[STIX]{x1D6FE} & \unicode[STIX]{x1D6FF}\end{pmatrix}\quad \text{and}\quad M\unicode[STIX]{x1D6F1}_{[c]}=\begin{pmatrix}\unicode[STIX]{x1D6FC}^{\prime } & \unicode[STIX]{x1D6FD}^{\prime }\\ \unicode[STIX]{x1D6FE}^{\prime } & \unicode[STIX]{x1D6FF}^{\prime }\end{pmatrix}.\qquad (5)$$
By [Reference Liardet and StambulLS98, Theorem 1, part (ii)], since the matrices of (5) equal $\unicode[STIX]{x1D6F1}_{\unicode[STIX]{x1D713}_{c}(M)}\unicode[STIX]{x1D719}_{c}(M)$ , we have that if $\unicode[STIX]{x1D6FE}^{\prime }$ (or $\unicode[STIX]{x1D6FF}^{\prime }$ ) is larger than $0$ , then the fraction $\unicode[STIX]{x1D6FC}^{\prime }/\unicode[STIX]{x1D6FE}^{\prime }$ (respectively, $\unicode[STIX]{x1D6FD}^{\prime }/\unicode[STIX]{x1D6FF}^{\prime }$ ) has at least $|\unicode[STIX]{x1D713}_{c}(M)|$ digits beyond the zeroth digit. However, any fraction with at least $n$ digits in its continued fraction expansion has a denominator that is at least $n$ . Therefore $\unicode[STIX]{x1D6FE}^{\prime }$ (respectively, $\unicode[STIX]{x1D6FF}^{\prime }$ ) either is zero or is greater than or equal to $|\unicode[STIX]{x1D713}_{c}(M)|$ .
By Lemma 2.1, the set ${\mathcal{M}}_{D}$ is finite, so the coefficients of $M$ must be bounded from above, let us say by $K$ . If $\unicode[STIX]{x1D6FF}\neq 0$ , then $\unicode[STIX]{x1D6FE}^{\prime }=\unicode[STIX]{x1D6FF}\neq 0$ and thus $|\unicode[STIX]{x1D713}_{c}(M)|\leqslant \unicode[STIX]{x1D6FE}^{\prime }=\unicode[STIX]{x1D6FF}\leqslant K$ . Otherwise, if $\unicode[STIX]{x1D6FF}=0$ , then $\unicode[STIX]{x1D6FF}^{\prime }=\unicode[STIX]{x1D6FE}\neq 0$ , and so $|\unicode[STIX]{x1D713}_{c}(M)|\leqslant \unicode[STIX]{x1D6FF}^{\prime }=\unicode[STIX]{x1D6FE}\leqslant K$ . These two cases complete the proof.◻
We also require the following result.
Lemma 2.3. It suffices to prove Theorem 1.1 in the case where $M\in {\mathcal{M}}_{D}$ for some $D\in \mathbb{N}$ , $D>1$ and where $x\in [0,1)$ .
Proof. By sending
$$x\rightarrow x-a_{0}(x),\qquad M\rightarrow M\begin{pmatrix}1 & a_{0}(x)\\ 0 & 1\end{pmatrix},$$
we may assume without loss of generality that $x\in [0,1)$ .
As noted after the statement of Theorem 1.1, if $M$ is a $2\times 2$ matrix with coefficients in $\mathbb{Z}$ and determinant $\pm 1$ , then $Mx$ is normal if and only if $x$ is normal. We will make frequent use of this fact in the remainder of this proof.
Let $M=(\!\begin{smallmatrix}\unicode[STIX]{x1D6FC} & \unicode[STIX]{x1D6FD}\\ \unicode[STIX]{x1D6FE} & \unicode[STIX]{x1D6FF}\end{smallmatrix}\!)$ have coefficients in $\mathbb{Z}$ and non-zero determinant.
Suppose that either $\unicode[STIX]{x1D6FE}=0$ or $\unicode[STIX]{x1D6FF}=0$ . Since $\det (M)\neq 0$ , we must have that $\unicode[STIX]{x1D6FC}\neq 0$ or $\unicode[STIX]{x1D6FD}\neq 0$ respectively. Thus, there exists some integer $a$ such that the bottom row of
$$M^{\prime }=\begin{pmatrix}1 & 0\\ a & 1\end{pmatrix}M$$
consists only of non-zero integers. Since the determinant of $(\!\begin{smallmatrix}1 & 0\\ a & 1\end{smallmatrix}\!)$ is $1$ , we have that $M^{\prime }x$ is normal if and only if $Mx$ is normal. Also $|\!\det (M^{\prime })|\neq 0$ . Thus, we may assume without loss of generality that $\unicode[STIX]{x1D6FE}\neq 0$ and $\unicode[STIX]{x1D6FF}\neq 0$ .
Now suppose that $\operatorname{sgn}(\unicode[STIX]{x1D6FC})\neq \operatorname{sgn}(\unicode[STIX]{x1D6FE})$ or that $\operatorname{sgn}(\unicode[STIX]{x1D6FD})\neq \operatorname{sgn}(\unicode[STIX]{x1D6FF})$ . Then there exists some integer $a$ such that
$$M^{\prime }=\begin{pmatrix}\unicode[STIX]{x1D6FC}^{\prime } & \unicode[STIX]{x1D6FD}^{\prime }\\ \unicode[STIX]{x1D6FE}^{\prime } & \unicode[STIX]{x1D6FF}^{\prime }\end{pmatrix}=\begin{pmatrix}1 & a\\ 0 & 1\end{pmatrix}M$$
satisfies $\operatorname{sgn}(\unicode[STIX]{x1D6FC}^{\prime })=\operatorname{sgn}(\unicode[STIX]{x1D6FE}^{\prime })$ and $\operatorname{sgn}(\unicode[STIX]{x1D6FD}^{\prime })=\operatorname{sgn}(\unicode[STIX]{x1D6FF}^{\prime })$ . Since $(\!\begin{smallmatrix}1 & a\\ 0 & 1\end{smallmatrix}\!)$ has determinant $1$ , we have that $M^{\prime }x$ is normal if and only if $Mx$ is normal. Thus, we may assume without loss of generality that $\operatorname{sgn}(\unicode[STIX]{x1D6FC})=\operatorname{sgn}(\unicode[STIX]{x1D6FE})$ and $\operatorname{sgn}(\unicode[STIX]{x1D6FD})=\operatorname{sgn}(\unicode[STIX]{x1D6FF})$ .
Now suppose $\operatorname{sgn}(\unicode[STIX]{x1D6FE})=-1$ or $\operatorname{sgn}(\unicode[STIX]{x1D6FF})=-1$ . Then by appropriately multiplying $M$ on the left by a matrix of the form $(\!\begin{smallmatrix}\pm 1 & 0\\ 0 & \pm 1\end{smallmatrix}\!)$ (which again has determinant $\pm 1$ ), we may assume without loss of generality that $\operatorname{sgn}(\unicode[STIX]{x1D6FE})=1$ and $\operatorname{sgn}(\unicode[STIX]{x1D6FF})=1$ . This also preserves all previous assumptions.
Thus, in particular, we may assume that all of the coefficients of $M$ are positive. By [Reference Liardet and StambulLS98, Theorem 1, part (i)], there exists a string $s=[c_{0};c_{1},\ldots ,c_{n}]\in {\mathcal{S}}$ and $M^{\prime }\in {\mathcal{M}}_{D}$ such that $M=\unicode[STIX]{x1D6F1}_{s}M^{\prime }$ . However, since $\unicode[STIX]{x1D6F1}_{s}$ has determinant $\pm 1$ , we have that $Mx$ is normal if and only if $M^{\prime }x$ is normal, and thus we may assume without loss of generality that $M\in {\mathcal{M}}_{D}$ all along.◻
3 Normality on a skew product
By our heuristic argument in § 1.1, we want to show that the sequence
$$(T^{n}x,\unicode[STIX]{x1D6F7}_{[a_{1},a_{2},\ldots ,a_{n}]}(M)),\qquad n=1,2,3,\ldots$$
is nicely distributed in some sense. This will allow us to show that the last term of (4), which gives the tail of the CF expansion of $Mx$ , is likewise nicely distributed.
Let $\unicode[STIX]{x1D6FA}\subset [0,1)$ denote the subset of irrational points and let ${\mathcal{M}}$ denote some finite set, which we will later take to be a set of matrices. We will let $x$ denote elements in $\unicode[STIX]{x1D6FA}$ and $M$ denote elements of ${\mathcal{M}}$ . We will consider cylinder sets of $\unicode[STIX]{x1D6FA}$ to be the intersection of the usual cylinder sets (for the continued fraction expansion) of $[0,1)$ with $\unicode[STIX]{x1D6FA}$ .
We wish to extend the Gauss map $T$ to a transformation $\widetilde{T}$ on a larger domain $\widetilde{\unicode[STIX]{x1D6FA}}=\unicode[STIX]{x1D6FA}\times {\mathcal{M}}$ . For any $(x,M)\in \widetilde{\unicode[STIX]{x1D6FA}}$ , we define the skew product
$$\widetilde{T}(x,M)=(Tx,f_{a_{1}}(M))$$
for some function $f_{a_{1}}:{\mathcal{M}}\rightarrow {\mathcal{M}}$ that is indexed by the first digit of $x$ . Since the second coordinate of $\widetilde{T}(x,M)$ only depends on $M$ and the first CF-digit of $x$ , we see that this second coordinate is constant for all $x$ in the same rank $1$ cylinder. Given a cylinder set $C_{s}$ for $\unicode[STIX]{x1D6FA}$ , we call $C_{s}\times \{M\}$ (for any $M\in {\mathcal{M}}$ ) a cylinder set for $\widetilde{\unicode[STIX]{x1D6FA}}$ . Moreover, we define $\widetilde{\unicode[STIX]{x1D707}}(E\times \{M\})=\unicode[STIX]{x1D707}(E)/|{\mathcal{M}}|$ for any measurable subset $E$ of $\unicode[STIX]{x1D6FA}$ and $M\in {\mathcal{M}}$ .
For easier readability, we will use $(E,M)$ to denote $E\times \{M\}$ for any measurable set $E\subset \unicode[STIX]{x1D6FA}$ , with measurability being determined by Lebesgue measure or, equivalently, the Gauss measure. We will also let $(E,{\mathcal{M}})$ denote $E\times {\mathcal{M}}$ .
We adapt our definition of normality on this space. We will say that $(x,M)\in \widetilde{\unicode[STIX]{x1D6FA}}$ is $\widetilde{T}$ -normal with respect to a measure $\unicode[STIX]{x1D70C}$ on $\widetilde{\unicode[STIX]{x1D6FA}}$ , if for any cylinder set $(C_{s},M^{\prime })$ we have
$$\lim _{n\rightarrow \infty }\frac{\#\{0\leqslant i\leqslant n-1:\widetilde{T}^{i}(x,M)\in (C_{s},M^{\prime })\}}{n}=\unicode[STIX]{x1D70C}((C_{s},M^{\prime })).$$
We say $\widetilde{T}$ is transitive if for any $M_{1},M_{2}\in {\mathcal{M}}$ , there exists a string $s\in {\mathcal{S}}$ of length $n$ such that
$$\widetilde{T}^{n}(C_{s},M_{1})=(\unicode[STIX]{x1D6FA},M_{2}).$$
The goal of this section is to prove the following result, which is similar to a previous result of the author [Reference VandeheyVan14]; however, as this paper contains significant departures (notably not assuming that the functions $f_{a}$ are bijective and hence not being able to assume that $\tilde{\unicode[STIX]{x1D707}}$ is $\widetilde{T}$ -invariant), we present the proof in full.
Theorem 3.1. If $\widetilde{T}$ is transitive, then there exists a probability measure $\unicode[STIX]{x1D70C}$ on $\widetilde{\unicode[STIX]{x1D6FA}}$ that is absolutely continuous with respect to $\widetilde{\unicode[STIX]{x1D707}}$ and such that $\widetilde{T}$ preserves $\unicode[STIX]{x1D70C}$ and is ergodic with respect to $\unicode[STIX]{x1D70C}$ . Moreover, if $x$ is CF-normal, then for any $M\in {\mathcal{M}}$ , the point $(x,M)$ is $\widetilde{T}$ -normal with respect to $\unicode[STIX]{x1D70C}$ .
Since almost all numbers $x\in [0,1)$ are CF-normal, almost all $(x,M)\in \widetilde{\unicode[STIX]{x1D6FA}}$ are $\widetilde{T}$ -normal with respect to $\unicode[STIX]{x1D70C}$ . Therefore, calling these points $(x,M)$ ‘normal’ is reasonable to do.
The proof of Theorem 3.1 breaks into two pieces. Proving that $\unicode[STIX]{x1D70C}$ exists and satisfies the desired properties follows very standard ergodic theoretic techniques (such as Knopp’s lemma). Proving that any CF-normal $x\in \unicode[STIX]{x1D6FA}$ lifts to $\widetilde{T}$ -normal points $(x,M)$ requires much more work. We will need to show that the density of $\unicode[STIX]{x1D70C}$ and the density of $\tilde{\unicode[STIX]{x1D707}}$ are within a constant multiple of one another and then apply the Pyatetskiĭ–Shapiro normality criterion (Lemma 3.3).
Remark 3.2. Although we are using $T$ as the Gauss map here, the only properties of the Gauss map that we use are the fact that all cylinders are full (i.e. $T^{|s|}C_{s}=\unicode[STIX]{x1D6FA}$ ), that the Pyatetskiĭ–Shapiro normality criterion can be applied via the cylinder sets, that the corresponding ergodic, invariant measure $\unicode[STIX]{x1D707}$ is finite, and that $T$ satisfies Renyi’s condition (6). Therefore, any map $T$ which also satisfies these conditions will also satisfy Theorem 3.1.
3.1 Necessary lemmata for Theorem 3.1
In order to simplify the readability of the proof of Theorem 3.1, we will include several lemmas here. All of these results make use of the definitions and assumptions at the start of § 3.
In order to show that $(x,M)$ is $\widetilde{T}$ -normal with respect to $\unicode[STIX]{x1D70C}$ , we will need to make use of the Pyatetskiĭ–Shapiro normality criterion in the following form.
Lemma 3.3. Let $(x,M)\in \widetilde{\unicode[STIX]{x1D6FA}}$ and suppose a measure $\unicode[STIX]{x1D70C}$ exists satisfying the first part of Theorem 3.1. If for any cylinder set $(C_{s},M^{\prime })$ , we have
$$\limsup _{n\rightarrow \infty }\frac{\#\{0\leqslant i\leqslant n-1:\widetilde{T}^{i}(x,M)\in (C_{s},M^{\prime })\}}{n}\leqslant \unicode[STIX]{x1D70E}\cdot \unicode[STIX]{x1D70C}((C_{s},M^{\prime }))$$
for some constant $\unicode[STIX]{x1D70E}$ independent of our choice of cylinder set, then $(x,M)$ is $\widetilde{T}$ -normal with respect to $\unicode[STIX]{x1D70C}$ .
Proof. This is a simple consequence of [Reference Moshchevitin and ShkredovMS03, Theorem 1]. We briefly describe how this follows, using the terminology from their paper.
We let the family $\{C_{m}\}$ denote the family of all cylinder sets on $\widetilde{\unicode[STIX]{x1D6FA}}$ and also let $\unicode[STIX]{x1D711}(t)=\unicode[STIX]{x1D70E}\cdot t$ . Since the set $A_{l}(T,\unicode[STIX]{x1D712}_{I},\unicode[STIX]{x1D6FF})$ is a disjoint union of rank- $l$ cylinder sets, we have that $H_{\unicode[STIX]{x1D711}}(A_{l}(T,\unicode[STIX]{x1D712}_{I},\unicode[STIX]{x1D6FF}))$ equals $\unicode[STIX]{x1D70E}\cdot \unicode[STIX]{x1D707}(A_{l}(T,\unicode[STIX]{x1D712}_{I},\unicode[STIX]{x1D6FF}))$ and thus goes to $0$ as $l\rightarrow \infty$ .◻
A well-known consequence of Renyi’s condition for continued fraction expansions (see [Reference SchweigerSch95, ch. 9]) states that there exists an absolute constant ${\mathcal{C}}>0$ so that for any measurable set $E$ and cylinder $C_{s}$ of rank $n$ , we have that
$$\frac{1}{{\mathcal{C}}}\,\unicode[STIX]{x1D707}(E)\unicode[STIX]{x1D707}(C_{s})\leqslant \unicode[STIX]{x1D707}(T^{-n}E\cap C_{s})\leqslant {\mathcal{C}}\,\unicode[STIX]{x1D707}(E)\unicode[STIX]{x1D707}(C_{s}).\qquad (6)$$
It is clear that one could replace $C_{s}$ by any set that can be expressed as a disjoint union of rank $n$ cylinder sets.
We will also want a similar equality (sans the cylinder set) to hold for $\widetilde{T}$ and $\tilde{\unicode[STIX]{x1D707}}$ , for which we will require the following results.
Lemma 3.4. Let $\{K_{n}\}_{n=1}^{\infty }$ be a sequence of $k\times k$ Markov matrices such that $K_{n_{1}}\asymp K_{n_{2}}$ uniformly for $n_{1},n_{2}\in \mathbb{N}$ . Assume that there exists a power $\ell$ such that $K_{1}^{\ell }$ has all positive coordinates, and also assume that $\{K_{1}\}_{i,i}>0$ for $1\leqslant i\leqslant k$ . Then there exists an integer $n_{0}\in \mathbb{N}$ and a constant $c\in (0,1)$ such that for any $1\times k$ probability vector $\vec{v}$ and any $n\geqslant n_{0}$ , we have that all coordinates of $\vec{v}K_{1}K_{2}K_{3}\ldots K_{n}$ are in the interval $(c,1-c)$ .
Proof. This is a special case of [Reference Saloff-Coste and ZúñigaSCZ11, Proposition 2.13]: the restriction that $K_{n_{1}}\asymp K_{n_{2}}$ uniformly for $n_{1},n_{2}\in \mathbb{N}$ implies that $K_{n+1}K_{n+2}\ldots K_{n+\ell }\asymp K_{1}^{\ell }$ with an implicit constant dependent on $\ell$ . This allows us to replace the ‘uniform irreducibility’ assumption with our simpler condition that there exists a power $\ell$ such that $K_{1}^{\ell }$ has all positive coordinates.◻
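As a small numerical illustration of this lemma (not part of the proof), the following Python/NumPy sketch multiplies a probability vector through a long product of row-stochastic matrices that are all entrywise comparable to a fixed primitive matrix, and reports the range of the resulting coordinates; they stay bounded away from $0$ and $1$ . The matrices are synthetic choices made for the illustration.

```python
import numpy as np

rng = np.random.default_rng(0)

K1 = np.array([[0.5, 0.3, 0.2],
               [0.2, 0.5, 0.3],
               [0.3, 0.2, 0.5]])          # primitive, with a strictly positive diagonal

def comparable_stochastic(K, factor=2.0):
    """A random row-stochastic matrix whose entries are within a bounded factor of those of K."""
    M = K * rng.uniform(1.0 / factor, factor, size=K.shape)
    return M / M.sum(axis=1, keepdims=True)

v = np.array([1.0, 0.0, 0.0])             # an extreme probability vector
lo, hi = 1.0, 0.0
for n in range(1, 2001):
    v = v @ comparable_stochastic(K1)
    if n >= 10:                            # after a short burn-in (the n_0 of the lemma)
        lo, hi = min(lo, v.min()), max(hi, v.max())

print(f"after the burn-in, all coordinates stayed in ({lo:.3f}, {hi:.3f})")
```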
Lemma 3.5. Suppose $\widetilde{T}$ is transitive. Then there exists a constant ${\mathcal{D}}>0$ and an integer $n_{0}\in \mathbb{N}$ such that for any measurable set $E\subset \widetilde{\unicode[STIX]{x1D6FA}}$ and any $n\geqslant n_{0}$ , we have that
$$\frac{1}{{\mathcal{D}}}\,\widetilde{\unicode[STIX]{x1D707}}(E)\leqslant \widetilde{\unicode[STIX]{x1D707}}(\widetilde{T}^{-n}E)\leqslant {\mathcal{D}}\,\widetilde{\unicode[STIX]{x1D707}}(E).\qquad (7)$$
Proof. We will make a few definitions to start. Let ${\mathcal{S}}_{n,i}$ denote the set of cylinders $(C_{s},M)\subset \widetilde{\unicode[STIX]{x1D6FA}}$ such that $|s|=n$ and $\widetilde{T}^{n}(C_{s},M)=(\unicode[STIX]{x1D6FA},M_{i})$ . We let ${\mathcal{A}}_{i,j}$ denote the set of $a\in \mathbb{N}$ such that $f_{a}(M_{i})=M_{j}$ . We also consider a sequence of probability vectors $\vec{v}_{n}=\{v_{n,1},v_{n,2},\ldots ,v_{n,|{\mathcal{M}}|}\}$ , $n\geqslant 0$ , such that
$$v_{n,i}=\widetilde{\unicode[STIX]{x1D707}}(\widetilde{T}^{-n}(\unicode[STIX]{x1D6FA},M_{i})).$$
For example, $\vec{v}_{0}=\{1/|{\mathcal{M}}|,1/|{\mathcal{M}}|,\ldots ,1/|{\mathcal{M}}|\}$ .
Our goal will be to show that there exists a constant $c\in (0,1)$ and integer $n_{0}\in \mathbb{N}$ so that $v_{n,i}\in (c,1-c)$ if $n\geqslant n_{0}$ .
Consider a sequence of $|{\mathcal{M}}|\times |{\mathcal{M}}|$ matrices $(K_{n})_{n=1}^{\infty }$ defined by
It is clear by construction that these matrices are stochastic and that $\vec{v}_{n+1}=\vec{v}_{n}K_{n}$ .
We let $K_{\ell ,n}=K_{\ell +1}K_{\ell +2}\ldots K_{n}$ .
Unfortunately, the matrices $K_{n}$ are not all the same, so $K_{0,n}$ represents a Markov chain that is time-inhomogeneous. However, they are not far from being time-homogeneous. By writing the matrix coefficients in a different way and applying (6), we see that
with the same implicit constant ${\mathcal{C}}$ as in (6). Thus, $K_{n_{1}}\asymp K_{n_{2}}$ for any $n_{1},n_{2}\in \mathbb{N}$ with uniform implicit constant ${\mathcal{C}}^{2}$ . This also implies that for any fixed $L$ , we have that $K_{1}^{L}\asymp K_{nL,(n+1)L}$ uniformly for any $n\in \mathbb{N}$ with implicit constant ${\mathcal{C}}^{2L}$ .
Since $\widetilde{T}$ is assumed to be transitive, we know that for any $M_{i},M_{j}\in {\mathcal{M}}$ , there exists a cylinder $(C_{s},M_{i})$ with $|s|=\ell$ such that $T^{\ell }(C_{s},M_{i})=(\unicode[STIX]{x1D6FA},M_{j})$ . This implies that $(K_{0,\ell })_{i,j}>0$ and thus that $(K_{1}^{\ell })_{i,j}>0$ . In other words, $K_{1}$ is an irreducible matrix. Also, we can find integers $\ell _{1},\ell _{2},\ldots ,\ell _{|{\mathcal{M}}|}\in \mathbb{N}$ so that $(K_{1}^{\ell _{i}})_{i,i}>0$ . Since all terms of $K_{1}$ are non-negative by construction, if $(K_{1}^{\ell _{i}})_{i,i}>0$ then we have $(K_{1}^{m\ell _{i}})_{i,i}>0$ for any $m\in \mathbb{N}$ . Thus, if we let $L=\operatorname{lcm}(\ell _{1},\ell _{2},\ldots ,\ell _{|{\mathcal{M}}|})$ , then we have that $K_{1}^{L}$ is strictly positive along its diagonal.
Suppose $K_{1}^{L}$ is itself irreducible. Then since $K_{1}^{L}$ has non-negative coefficients with a strictly positive diagonal, there is some power of it such that every coefficient is strictly positive (see [Reference MeyerMey00, equation (8.3.5) on p. 672]). We may therefore apply Lemma 3.4 to the sequence of matrices $\{K_{nL,(n+1)L}\}_{n=0}^{\infty }$ . So there exists $c^{\prime }\in (0,1)$ and $n_{0}^{\prime }$ such that $v_{nL,i}\in (c^{\prime },1-c^{\prime })$ for all $i$ and all $n\geqslant n_{0}^{\prime }$ .
Suppose $K_{1}^{L}$ is not irreducible. By [Reference Brualdi and RyserBR91, Theorem 3.4.5], since $K_{1}$ itself is irreducible, there exists a permutation matrix $P$ such that
where the $C_{j}$ are irreducible matrices. In this case, we would apply Lemma 3.4 to each $C_{j}$ and thus show that there exist $c_{j}^{\prime }\in (0,1)$ and $n_{j}^{\prime }$ such that $v_{nL,i}\in (c_{j}^{\prime },1-c_{j}^{\prime })$ for $n\geqslant n_{j}^{\prime }$ and for indexes $i$ corresponding to the matrix $C_{j}$ after undoing the permutation. By taking $c^{\prime }=\min \{c_{j}^{\prime }\}$ and $n_{0}^{\prime }=\max \{n_{j}^{\prime }\}$ , we get the same result as in the previous paragraph.
Regardless of whether $K_{1}^{L}$ is irreducible or not, we have shown that there exists $c^{\prime }\in (0,1)$ and $n_{0}^{\prime }$ such that $v_{nL,i}\in (c^{\prime },1-c^{\prime })$ for all $n\geqslant n_{0}^{\prime }$ .
No column of $K_{1}$ consists of all zeros (otherwise there would be an $M\in {\mathcal{M}}$ that is never visited, contrary to the transitivity of $\widetilde{T}$ ); therefore, the sum of the coefficients in any column vector of $K_{nL,nL+j}$ for $j\leqslant L$ is uniformly bounded from below. We can therefore find a constant $c$ and $n_{0}$ such that $v_{n,i}\in (c,1-c)$ for all $i$ and all $n\geqslant n_{0}$ . In particular, $v_{n,i}\asymp 1$ .
Now we can prove the desired statement (7). It suffices to show the statement is true for $n\geqslant n_{0}$ and for sets of the form $(E,M_{i})$ for some measurable subset $E\subset \unicode[STIX]{x1D6FA}$ and $M_{i}\in {\mathcal{M}}$ . In this case, we have that $\widetilde{T}^{-n}(E,M_{i})$ equals the union of $((T^{-n}E)\cap C_{s},M)$ for $(C_{s},M)\in {\mathcal{S}}_{n,i}$ as defined above. Therefore, by applying (6), we have
as desired. ◻
3.2 Proof of Theorem 3.1
First, we will show that $\widetilde{T}$ is ergodic with respect to $\tilde{\unicode[STIX]{x1D707}}$ (despite $\tilde{\unicode[STIX]{x1D707}}$ not necessarily being $\widetilde{T}$ -invariant).
Suppose we have a $\widetilde{T}$ -invariant subset of $\widetilde{\unicode[STIX]{x1D6FA}}$ called $E$ that has non-zero $\tilde{\unicode[STIX]{x1D707}}$ -measure. We define $E^{c}=\widetilde{\unicode[STIX]{x1D6FA}}\setminus E$ and define $E_{M}$ as the set of $x\in \unicode[STIX]{x1D6FA}$ such that $(x,M)\in E$ , so that $E=\bigcup _{M\in {\mathcal{M}}}(E_{M},M)$ .
We claim that
$$\unicode[STIX]{x1D707}(E_{M})>0\qquad (9)$$
for all $M\in {\mathcal{M}}$ . Since $E$ has positive $\tilde{\unicode[STIX]{x1D707}}$ -measure, there must exist a set $E_{M^{\prime }}$ of positive $\unicode[STIX]{x1D707}$ -measure. By transitivity, for any $M\in {\mathcal{M}}$ , there exists a string $s$ with $|s|=n$ so that $\widetilde{T}^{n}(C_{s},M)=(\unicode[STIX]{x1D6FA},M^{\prime })$ . Thus,
However, by Renyi’s condition, this is within a positive constant multiple of $\unicode[STIX]{x1D707}(C_{s})\unicode[STIX]{x1D707}(E_{M^{\prime }})$ , and in particular is positive. Since $E$ is a $\widetilde{T}$ -invariant set, $\widetilde{T}^{-n}(E_{M^{\prime }},M^{\prime })\subset E$ , and thus (9) holds.
Now we wish to show that $E$ has a substantial intersection with every cylinder set on $\widetilde{\unicode[STIX]{x1D6FA}}$ , in particular, by showing that there exists a constant $\unicode[STIX]{x1D716}>0$ so that for all cylinder sets $(C_{s},M)$ , we have
$$\widetilde{\unicode[STIX]{x1D707}}(E\cap (C_{s},M))\geqslant \unicode[STIX]{x1D716}\,\widetilde{\unicode[STIX]{x1D707}}((C_{s},M)).\qquad (10)$$
Since there are only finitely many elements in ${\mathcal{M}}$ , there must exist $\unicode[STIX]{x1D716}^{\prime }>0$ , such that $\unicode[STIX]{x1D707}(E_{M^{\prime }})\geqslant \unicode[STIX]{x1D716}^{\prime }$ for all $M^{\prime }\in {\mathcal{M}}$ . Let us now fix an arbitrary cylinder $(C_{s},M)$ with $n:=|s|$ , and let $M^{\prime }$ be such that $\widetilde{T}^{n}(C_{s},M)=(\unicode[STIX]{x1D6FA},M^{\prime })$ . By applying (6), we have
Therefore, letting $\unicode[STIX]{x1D716}=\unicode[STIX]{x1D716}^{\prime }/{\mathcal{C}}$ gives (10).
Since the cylinder sets generate the Borel sets on $\widetilde{\unicode[STIX]{x1D6FA}}$ , we can find, for any $\unicode[STIX]{x1D6FF}>0$ , a set $E_{\unicode[STIX]{x1D6FF}}$ such that $\tilde{\unicode[STIX]{x1D707}}(E^{c}\triangle E_{\unicode[STIX]{x1D6FF}})<\unicode[STIX]{x1D6FF}$ and $E_{\unicode[STIX]{x1D6FF}}$ is a disjoint union of a finite number of cylinder sets. Therefore, by applying (10), we have
But $\tilde{\unicode[STIX]{x1D707}}(E\cap E^{c})=0$ and $\unicode[STIX]{x1D6FF}$ was an arbitrary positive number. Thus, either $\tilde{\unicode[STIX]{x1D707}}(E)=0$ or $\tilde{\unicode[STIX]{x1D707}}(E^{c})=0$ . Since we know $E$ has positive measure, this therefore implies that $E$ must have full measure, and $\widetilde{T}$ is ergodic with respect to $\tilde{\unicode[STIX]{x1D707}}$ .
We will now construct a measure $\unicode[STIX]{x1D70C}$ that is absolutely continuous with respect to $\tilde{\unicode[STIX]{x1D707}}$ such that $\widetilde{T}$ is not only ergodic but also invariant with respect to $\unicode[STIX]{x1D70C}$ .
We define a sequence of measures $\unicode[STIX]{x1D70C}_{n}$ on $\widetilde{\unicode[STIX]{x1D6FA}}$ by
$$\unicode[STIX]{x1D70C}_{n}(A)=\int _{\widetilde{\unicode[STIX]{x1D6FA}}}\biggl(\frac{1}{n}\sum _{i=0}^{n-1}1_{A}(\widetilde{T}^{i}(x,M))\biggr)\,d\widetilde{\unicode[STIX]{x1D707}}.\qquad (11)$$
By Lemma 3.5, we can show that
for any measurable set $E$ . Therefore, by a theorem of Ryll–Nardzewski (see [Reference Dunford and SchwartzDS58, p. 683]), the integrand of (11) converges pointwise to an $L_{1}$ function $g_{A}$ almost everywhere, and since the integrand is dominated by $1$ , the dominated convergence theorem shows that $\unicode[STIX]{x1D70C}_{n}(A)$ converges to $\int _{\widetilde{\unicode[STIX]{x1D6FA}}}g_{A}\,d\widetilde{\unicode[STIX]{x1D707}}$ . Therefore, we may define $\unicode[STIX]{x1D70C}(A)=\lim _{n\rightarrow \infty }\unicode[STIX]{x1D70C}_{n}(A)$ . The Vitali–Hahn–Saks theorem [Reference BrooksBro69] shows that $\unicode[STIX]{x1D70C}$ is in fact a probability measure on $\widetilde{\unicode[STIX]{x1D6FA}}$ . Since $\unicode[STIX]{x1D70C}_{n}(\widetilde{T}^{-1}E)=\unicode[STIX]{x1D70C}_{n}(E)+O(1/n)$ for any measurable set $E$ , we have that $\unicode[STIX]{x1D70C}$ is preserved by $\widetilde{T}$ . Likewise, by Lemma 3.5 again, we can see that
and thus the same is true if we replace $\unicode[STIX]{x1D70C}_{n}$ by $\unicode[STIX]{x1D70C}$ .Footnote 5
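The averaging construction above has a simple finite-state analogue, which the following Python/NumPy sketch illustrates: starting from a measure that is not invariant, the Cesàro averages of its images under a (synthetic, illustrative) transition matrix converge to an invariant measure. This is only an analogy for intuition; the paper works with the measures $\widetilde{\unicode[STIX]{x1D707}}(\widetilde{T}^{-i}A)$ on $\widetilde{\unicode[STIX]{x1D6FA}}$ , not with a finite chain.

```python
import numpy as np

P = np.array([[0.0, 1.0, 0.0],
              [0.5, 0.0, 0.5],
              [0.2, 0.8, 0.0]])           # an irreducible transition matrix (a toy stand-in)

mu = np.array([0.6, 0.3, 0.1])            # a starting measure, not invariant under P

def cesaro_average(mu, P, n):
    """(1/n) * sum_{i=0}^{n-1} mu P^i, the finite-state analogue of averaging pullbacks of mu."""
    total = np.zeros_like(mu)
    current = mu.copy()
    for _ in range(n):
        total += current
        current = current @ P
    return total / n

rho = cesaro_average(mu, P, 20000)
print("rho   :", np.round(rho, 4))
print("rho P :", np.round(rho @ P, 4))     # essentially equal to rho, i.e. invariant in the limit
```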
Thus it remains to show that if $x\in \unicode[STIX]{x1D6FA}$ is CF-normal, then $(x,M)$ is $\widetilde{T}$ -normal with respect to $\unicode[STIX]{x1D70C}$ for any $M\in {\mathcal{M}}$ .
So consider a point $x\in \unicode[STIX]{x1D6FA}$ that is CF-normal. Then for every cylinder $C_{s}$ and every $M\in {\mathcal{M}}$ , we have
Thus, in particular, we have for any $M,M^{\prime }\in {\mathcal{M}}$ that
Thus, by Lemma 3.3, the points $(x,M)$ for all $M\in {\mathcal{M}}$ are $\widetilde{T}$ -normal with respect to $\unicode[STIX]{x1D70C}$ .◻
The referee of this paper proposed an alternative method for studying $\unicode[STIX]{x1D70C}$ . We include an outline of this other method in case it proves more useful than the method given above to those wishing to extend these results.
One could consider the map ${\mathcal{T}}$ which acts by
$${\mathcal{T}}(x,y,M)=\biggl(Tx,\frac{1}{a_{1}(x)+y},\unicode[STIX]{x1D719}_{a_{1}(x)}(M)\biggr).$$
Let $X=\overline{\{{\mathcal{T}}^{n}(x,0,M):n\geqslant 0,x\in [0,1),M\in {\mathcal{M}}\}}$ , and let $d\unicode[STIX]{x1D708}$ be $dx\,dy/(1+xy)^{2}$ crossed with the counting measure on ${\mathcal{M}}$ and appropriately normalized to be a probability measure. Then $(X,{\mathcal{T}},\unicode[STIX]{x1D708})$ can be shown to be the natural extension of $(\widetilde{\unicode[STIX]{x1D6FA}},\widetilde{T},\unicode[STIX]{x1D70C})$ via the methods of [Reference Kraaikamp, Schmidt and SteinerKSS12]. Moreover $\unicode[STIX]{x1D70C}$ can be given explicitly by
By picking an appropriate cylinder set in $X$ and applying ${\mathcal{T}}$ enough times, one can show that for each $M$ there exists a ‘horizontal stripe’ $[0,1]\times [y_{1}^{M},y_{2}^{M}]\times \{M\}$ in $X$ . One can then show that, for any set $E\subset E_{M}$ , one has
This provides the crucial relationship between $\unicode[STIX]{x1D707}$ and $\unicode[STIX]{x1D70C}$ that is used in the final part of Theorem 3.1.
4 Building a dynamical system
As we hinted at the start of § 3, we would like to build a dynamical system $\widetilde{T}$ from $\unicode[STIX]{x1D6FA}\times {\mathcal{M}}_{D}$ to itself by $\widetilde{T}(x,M)=(Tx,\unicode[STIX]{x1D719}_{a_{1}}(M))$ , with $\unicode[STIX]{x1D719}_{a}$ defined as it was in § 2. We will use this definition for $\widetilde{T}$ throughout the rest of the paper; however, it may turn out that this system is not transitive and thus Theorem 3.1 may not apply. Thus, we will require the following definition to modify our dynamical system.
We will call a subset ${\mathcal{M}}_{D}^{\prime }\subset {\mathcal{M}}_{D}$ a transitive component if the following conditions are satisfied:
- (i) for any string $s\in {\mathcal{S}}$ , and any $M\in {\mathcal{M}}_{D}^{\prime }$ we have that $\unicode[STIX]{x1D6F7}_{s}(M)\in {\mathcal{M}}_{D}^{\prime }$ ;
- (ii) for any $M,M^{\prime }\in {\mathcal{M}}_{D}^{\prime }$ there exists a string $s\in {\mathcal{S}}$ such that $\unicode[STIX]{x1D6F7}_{s}(M)=M^{\prime }$ .
Note that any two distinct transitive components of ${\mathcal{M}}_{D}$ must have empty intersection.
With this definition the transformation $\widetilde{T}$ given by $\widetilde{T}(x,M)=(Tx,\unicode[STIX]{x1D719}_{a_{1}}(M))$ is transitive on $\unicode[STIX]{x1D6FA}\times {\mathcal{M}}_{D}^{\prime }$ for any transitive component ${\mathcal{M}}_{D}^{\prime }\subset {\mathcal{M}}_{D}$ . Thus, Theorem 3.1 applies for this $\widetilde{T}$ .
Lemma 4.1. There exists at least one transitive component of ${\mathcal{M}}_{D}$ . Moreover, there exists a string $s$ such that $\unicode[STIX]{x1D6F7}_{s}(M)$ is in a transitive component for any $M\in {\mathcal{M}}_{D}$ (although not necessarily always in the same transitive component for each $M$ ).
Proof. Consider a directed graph $G$ whose vertices are matrices $M\in {\mathcal{M}}_{D}$ and which has an edge from $M_{1}$ to $M_{2}$ if there exists a $j\in \mathbb{N}$ such that $\unicode[STIX]{x1D719}_{j}(M_{1})=M_{2}$ . Note this graph has out-degree always at least $1$ . A subgraph $G^{\prime }$ of $G$ is said to be strongly connected if for any $M_{1}$ , $M_{2}$ in $V(G^{\prime })$ , the vertex set of $G^{\prime }$ , there exists a path from $M_{1}$ to $M_{2}$ and vice versa. We can partition $G$ into its strongly connected components, which are the maximal strongly connected subgraphs of $G$ . (Note that if there is a vertex $M$ such that there is no path from $M$ to another vertex and back to itself, then $M$ is its own strongly connected component.) Let us call these components $G_{1},G_{2},\ldots ,G_{n}$ . Note that if $M\in V(G_{i})$ , then $V(G_{i})$ consists of all vertices which are strongly connected to $M$ .
Now let us consider another directed graph ${\mathcal{G}}$ whose vertices are $G_{1},G_{2},\ldots ,G_{n}$ and where there is an edge from $G_{i}$ to $G_{j}$ with $i\neq j$ if there exists an edge from some $M\in G_{i}$ to some $M^{\prime }\in G_{j}$ in the directed graph $G$ . We do not let ${\mathcal{G}}$ contain an edge which goes from a vertex to itself. Note that if there is an edge from $G_{i}$ to $G_{j}$ , then by the strong-connectivity of these components, there is a path from any $M\in G_{i}$ to any $M^{\prime }\in G_{j}$ . We see that ${\mathcal{G}}$ cannot have any cycles, as otherwise it would be possible to find matrices in two different strongly connected components that are strongly connected to one another, contradicting the maximality of these components. Thus, ${\mathcal{G}}$ is acyclic.
Any finite acyclic directed graph must contain at least one sink. We claim that the set of vertices of any sink $G_{i}$ of ${\mathcal{G}}$ is a transitive component for ${\mathcal{M}}_{D}$ . Let us fix a sink $G_{i}$ and let ${\mathcal{M}}^{\prime }$ denote the matrices in $V(G_{i})$ . The first condition for being a transitive component is satisfied because if there existed a string $s\in {\mathcal{S}}$ and $M\in {\mathcal{M}}^{\prime }$ such that $\unicode[STIX]{x1D6F7}_{s}(M)\not \in {\mathcal{M}}^{\prime }$ , then there would be at least one path from $G_{i}$ to another strongly connected component, contradicting the assumption that $G_{i}$ is a sink. For the second condition, this follows from the fact that within any $G_{i}$ there is a path from any vertex to any other vertex and also to itself.
Now consider all of the matrices that do not lie in a transitive component, let us call them $M_{1},M_{2},\ldots ,M_{k}$ . Consider $M_{1}$ and suppose it is in $G_{i}$ . As there must be a path from $G_{i}$ to a sink of ${\mathcal{G}}$ , there exists a string $s_{1}$ such that $\unicode[STIX]{x1D6F7}_{s_{1}}(M_{1})$ is in a transitive component. Now consider $\unicode[STIX]{x1D6F7}_{s_{1}}(M_{2})$ . Regardless of what matrix in ${\mathcal{M}}_{D}$ the matrix $\unicode[STIX]{x1D6F7}_{s_{1}}(M_{2})$ happens to be, there is, by the same argument, a string $s_{2}$ such that $\unicode[STIX]{x1D6F7}_{s_{2}}(\unicode[STIX]{x1D6F7}_{s_{1}}(M_{2}))=\unicode[STIX]{x1D6F7}_{s_{1}.s_{2}}(M_{2})$ is in a transitive component. (The string $s_{2}$ could equal $\wedge$ if $\unicode[STIX]{x1D6F7}_{s_{1}}(M_{2})$ is already in a transitive component.) Likewise there is a string $s_{3}$ such that $\unicode[STIX]{x1D6F7}_{s_{1}.s_{2}.s_{3}}(M_{3})$ is in a transitive component, and so on. The desired string $s$ is simply $s_{1}.s_{2}.s_{3}.\cdots \,.s_{k}$ .◻
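The graph-theoretic argument in this proof is easy to carry out mechanically. The Python sketch below is illustrative only, with made-up vertex labels standing in for the matrices of ${\mathcal{M}}_{D}$ and made-up edges standing in for the maps $\unicode[STIX]{x1D719}_{j}$ ; it computes strongly connected components via mutual reachability and returns the sink components, which play the role of the transitive components.

```python
def transitive_components(vertices, edges):
    """Sink strongly connected components of a finite directed graph.
    vertices: iterable of vertex labels; edges: dict mapping a vertex to the set of its successors."""
    vs = list(vertices)
    reach = {v: {v} | set(edges.get(v, ())) for v in vs}
    changed = True
    while changed:                              # transitive closure of the reachability relation
        changed = False
        for v in vs:
            new = set(reach[v])
            for w in reach[v]:
                new |= reach[w]
            if new != reach[v]:
                reach[v], changed = new, True
    # Strongly connected component of v = set of vertices mutually reachable with v.
    comp = {v: frozenset(w for w in vs if v in reach[w] and w in reach[v]) for v in vs}
    # A component is a sink (no edges leave it); these play the role of transitive components.
    return [c for c in set(comp.values())
            if all(comp[w] == c for v in c for w in edges.get(v, ()))]

# A made-up example: every vertex has out-degree at least 1, as in the proof.
edges = {
    "M1": {"M2"},
    "M2": {"M3", "M4"},
    "M3": {"M4"},
    "M4": {"M5"},
    "M5": {"M4"},        # {M4, M5} is the unique sink component here
}
print(transitive_components(edges.keys(), edges))
```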
The second part of Lemma 4.1 allows us to reduce the cases of Theorem 1.1 yet further. By Lemma 2.3, it suffices to prove Theorem 1.1 in the case where $x\in [0,1)$ is a CF-normal number and $M\in {\mathcal{M}}_{D}$ for some $D\in \mathbb{N}$ , $D>1$ . Let $s$ be a string satisfying Lemma 4.1. If $x$ is CF-normal, its continued fraction expansion contains $s$ . Thus, there exists some $n$ such that $\widetilde{T}^{n}(x,M)\in \unicode[STIX]{x1D6FA}\times {\mathcal{M}}_{D}^{\prime }$ , for some transitive component ${\mathcal{M}}_{D}^{\prime }\subset {\mathcal{M}}_{D}$ , and so
$$Mx=\unicode[STIX]{x1D6F9}_{[a_{1},a_{2},\ldots ,a_{n}]}(M).(\unicode[STIX]{x1D6F7}_{[a_{1},a_{2},\ldots ,a_{n}]}(M)\,T^{n}x)$$
with $\unicode[STIX]{x1D6F7}_{[a_{1},a_{2},\ldots ,a_{n}]}(M)\in {\mathcal{M}}_{D}^{\prime }$ . Since the action of $T$ and the action of string concatenation both preserve CF-normality and CF-non-normality, it suffices to assume that $x\in [0,1)$ is CF-normal and that $M$ is in some transitive component ${\mathcal{M}}_{D}^{\prime }$ .
Lemma 4.2. Let $\widetilde{T}$ and $\unicode[STIX]{x1D70C}$ be the transformation and measure corresponding to some transitive component ${\mathcal{M}}_{D}^{\prime }$ , the latter of whose existence is guaranteed by Theorem 3.1. Let $k\in \mathbb{N}$ , and let $f:\unicode[STIX]{x1D6FA}\times {\mathcal{M}}_{D}^{\prime }\rightarrow \mathbb{R}$ be a bounded function that is constant on rank- $k$ cylinder sets.
For any CF-normal $x\in [0,1)$ and any $M\in {\mathcal{M}}_{D}^{\prime }$ , we have
$$\lim _{n\rightarrow \infty }\frac{1}{n}\sum _{i=0}^{n-1}f(\widetilde{T}^{i}(x,M))=\int _{\widetilde{\unicode[STIX]{x1D6FA}}}f\,d\unicode[STIX]{x1D70C}.$$
Proof. Since there are countably many pairs $(s,M)$ with $|s|=k$ , let us enumerate them as $(s_{i},M_{i})_{i=1}^{\infty }$ .
Since $f$ is constant on rank- $k$ cylinder sets, we may write
$$f=\sum _{i=1}^{\infty }a_{i}1_{(C_{s_{i}},M_{i})},$$
where $a_{i}\in \mathbb{R}$ is uniformly bounded and $1_{E}(\cdot )$ is the standard indicator function of a set $E$ . Let us write $A_{+}=\sup _{i}a_{i}$ and $A_{-}=\inf _{i}a_{i}$ . We then construct functions $f_{j}^{+}$ and $f_{j}^{-}$ , $j\in \mathbb{N}$ , by
By construction we have that $f_{j}^{-}\leqslant f\leqslant f_{j}^{+}$ and, moreover, both $f_{j}^{+}$ and $f_{j}^{-}$ converge pointwise to $f$ (and, hence, by dominated convergence, also converge in norm).
Now let $x\in [0,1)$ be CF-normal and let $M\in {\mathcal{M}}_{D}^{\prime }$ . From the second part of Theorem 3.1, we have, for any cylinder set $(C_{s},M^{\prime })$ , that
$$\lim _{n\rightarrow \infty }\frac{1}{n}\sum _{i=0}^{n-1}1_{(C_{s},M^{\prime })}(\widetilde{T}^{i}(x,M))=\unicode[STIX]{x1D70C}((C_{s},M^{\prime })).$$
This statement also holds if $1_{(C_{s},M^{\prime })}$ is replaced by any function that can be expressed as a finite sum of indicator functions of cylinder sets, such as $f_{j}^{\pm }$ .
Thus, we have, for any $j\geqslant 0$ ,
$$\int _{\widetilde{\unicode[STIX]{x1D6FA}}}f_{j}^{-}\,d\unicode[STIX]{x1D70C}\leqslant \liminf _{n\rightarrow \infty }\frac{1}{n}\sum _{i=0}^{n-1}f(\widetilde{T}^{i}(x,M))\leqslant \limsup _{n\rightarrow \infty }\frac{1}{n}\sum _{i=0}^{n-1}f(\widetilde{T}^{i}(x,M))\leqslant \int _{\widetilde{\unicode[STIX]{x1D6FA}}}f_{j}^{+}\,d\unicode[STIX]{x1D70C}.$$
As both integrals above converge in norm to $\int _{\widetilde{\unicode[STIX]{x1D6FA}}}f\,d\unicode[STIX]{x1D70C}$ as $j$ tends to infinity, this completes the proof.◻
5 Proof of Theorem 1.1
As we have noted in Lemma 2.3 and § 4, it suffices to prove Theorem 1.1 in the case where $x\in [0,1)$ is CF-normal and $M\in {\mathcal{M}}_{D}^{\prime }$ where ${\mathcal{M}}_{D}^{\prime }$ is some transitive component of ${\mathcal{M}}_{D}$ .
Note that Theorem 3.1 applies to $\widetilde{T}$ acting on the set $\unicode[STIX]{x1D6FA}\times {\mathcal{M}}_{D}^{\prime }$ , giving us an ergodic, invariant measure $\unicode[STIX]{x1D70C}$ on this space, and $(x,M)$ is normal with respect to $\widetilde{T}$ and $\unicode[STIX]{x1D70C}$ .
For the first and largest step of the proof, we want to show that for any string $r\in {\mathcal{S}}^{\ast }$ , this string appears in $Mx$ with a limiting frequency that does not depend on the CF-normal number $x$ . (However, we will assume throughout the proof that $M$ is a fixed matrix.) In particular, we want a constant $\unicode[STIX]{x1D70C}_{r}$ such that
for all CF-normal $x\in [0,1)$ .
Let $x=\langle a_{1},a_{2},a_{3},\ldots \,\rangle$ , and let $Mx=\langle b_{0};b_{1},b_{2},\ldots \,\rangle$ so that
We let $\ell (n)$ denote the length of $\unicode[STIX]{x1D6F9}_{[a_{1},a_{2},\ldots ,a_{n}]}(M)$ . The following two lemmas assume that $r$ and ${\mathcal{M}}_{D}^{\prime }$ are fixed and that $x$ is any CF-normal number in $[0,1)$ and $M\in {\mathcal{M}}_{D}^{\prime }$ . We will also let $\widetilde{\unicode[STIX]{x1D6FA}}=\unicode[STIX]{x1D6FA}\times {\mathcal{M}}_{D}^{\prime }$ .
Lemma 5.1. We have for some constant $c_{1}>0$
$$\ell (n)=c_{1}\cdot n(1+o(1)).$$
Proof. It is clear that $|s.s^{\prime }|=|s|+|s^{\prime }|$ , so that $\ell (n)$ is just the sum of the lengths of the corresponding resultant strings
As a result, let us consider the function $g:\mathbb{N}\times {\mathcal{M}}_{D}^{\prime }\rightarrow \mathbb{N}$ which acts by taking the pair $(a,M)$ to the length of the resultant string $\unicode[STIX]{x1D713}_{a}(M)$ . Let $G(x,M)$ equal $g(a,M)$ if $x\in C_{[a]}$ . Note that
By Lemma 2.2, we have that $g(a,M)\ll 1$ , so that $G$ is a bounded function that is constant on all rank-one cylinder sets.
So we have that $\ell (n)=\sum _{i=0}^{n-1}G(\widetilde{T}^{i}(x,M))$ , and by Lemma 4.2, $\ell (n)=n\cdot (\int _{\widetilde{\unicode[STIX]{x1D6FA}}}G\,d\unicode[STIX]{x1D70C})(1+o(1))$ .◻
Lemma 5.2. We have for some constant $c_{r}>0$ ,
Proof. In order to prove this lemma, we will need to define trigger strings for this system in the same way that we loosely defined trigger strings to help us understand base-10 normality in § 1.1. This will require a few ancillary definitions. Throughout these definitions, we shall allow $s,s^{\prime },s^{\ast }\in {\mathcal{S}}^{\ast }$ and $M,M^{\prime },M^{\ast }\in {\mathcal{M}}_{D}^{\prime }$ .
We say that $(s,M)$ is a subpair of $(s^{\prime },M^{\prime })$ if $s^{\prime }$ can be written as $s_{+}.s.s_{-}$ for strings $s_{+},s_{-}\in {\mathcal{S}}^{\ast }$ with $M=\unicode[STIX]{x1D6F7}_{s_{+}}(M^{\prime })$ . In particular, $\widetilde{T}^{|s_{+}|}(C_{s^{\prime }},M^{\prime })\subset (C_{s},M)$ .
We say that $r$ appears nicely in the resultant of $(s,M)$ if $r$ is a substring of $\unicode[STIX]{x1D6F9}_{s}(M)$ starting after the zeroth digit of $\unicode[STIX]{x1D6F9}_{s}(M)$ and ending before the last digit. Thus, by our definition of string concatenation, $r$ also appears as a substring of $\unicode[STIX]{x1D6F9}_{s^{\prime }}(M^{\prime })$ for any pair $(s^{\prime },M^{\prime })$ of which $(s,M)$ is a subpair. In fact, if $s^{\prime }=s_{+}.s.s_{-}$ , as in the previous paragraph, and $r$ appears as a substring of $\unicode[STIX]{x1D6F9}_{s}(M)$ starting at the $n$ th digit, then $r$ appears as a substring of $\unicode[STIX]{x1D6F9}_{s^{\prime }}(M^{\prime })$ starting at the $|\unicode[STIX]{x1D6F9}_{s_{+}}(M^{\prime })|+n$ th digit. We refer to these two copies of $r$ as being in the same relative position.
Finally we will define $(s,M)$ to be a trigger string for $r$ of multiplicity $k$ if there are $k$ copies of $r$ that appear nicely in the resultant of $(s,M)$ each of which does not appear nicely in the same relative position in the resultant of $(s^{\ast },M^{\ast })$ for any pair $(s^{\ast },M^{\ast })$ that is a proper subpair of $(s,M)$ . By the lower inequality in Lemma 2.2, we see that the length of $s$ in a trigger string $(s,M)$ must be bounded, say by $\ell =\ell _{r}$ . In consequence, since $|s|$ is bounded, the upper inequality in Lemma 2.2 implies that the multiplicity $k$ of any trigger string must be bounded as well.
From these definitions, we can see that for each occurrence of $r$ that appears nicely within the resultant of $(s^{\prime },M^{\prime })$ , there exists a unique subpair $(s,M)$ of $(s^{\prime },M^{\prime })$ for which $(s,M)$ is a trigger string for $r$ and for which $r$ appears nicely in the resultant of $(s,M)$ in the same relative position as it did within the resultant of $(s^{\prime },M^{\prime })$ . In particular, the number of occurrences of $r$ that appear nicely within the resultant of $(s^{\prime },M^{\prime })$ should be equal to the sum of the multiplicities of all the subpairs $(s,M)$ of $(s^{\prime },M^{\prime })$ that are trigger strings for $r$ .
The count
can be thought of as counting the number of occurrences of $r$ that appear nicely within the resultant of $([a_{1},a_{2},\ldots ,a_{n}],M)$ , up to $O(1)$ . The $O(1)$ accounts for those appearances of $r$ that start within the first $\ell (n)+1$ digits of $Mx$ but do not appear nicely within the resultant of $([a_{1},a_{2},\ldots ,a_{n}],M)$ .
So let $f(x,M)$ denote the sum of the multiplicities of all trigger strings $(s,M)$ such that $x\in C_{s}$ . Our above work implies that $f(x,M)$ is a bounded function and is constant on cylinder sets of rank $\ell$ . We can then apply Lemma 4.2 to see that
as desired. ◻
From here we are nearly done. First, let $\ell ^{-1}(m):=\max \{n:\ell (n)\leqslant m\}$ . By Lemma 5.1, $\ell ^{-1}(m)=m/c_{1}(1+o(1))$ , and therefore we have that
Thus, we have that
and so (13) follows from Lemmas 5.1 and 5.2.
Now consider the sets
We have shown that for any string $r\in {\mathcal{S}}^{\ast }$ there exists a constant $\unicode[STIX]{x1D70C}_{r}$ such that for all $y\in E_{M}$ , the string $r$ appears in the continued fraction expansion of $y$ with limiting frequency $\unicode[STIX]{x1D70C}_{r}$ , even though we do not know what any of these constants $\unicode[STIX]{x1D70C}_{r}$ equal. On the other hand, for all strings $r\in {\mathcal{S}}^{\ast }$ and all $x\in E$ , the limiting frequency of $r$ in the continued fraction expansion of $x$ is $\unicode[STIX]{x1D707}(C_{r})$ . Thus, either $\unicode[STIX]{x1D70C}_{r}=\unicode[STIX]{x1D707}(C_{r})$ for all $r$ and $E_{M}$ is a subset of $E$ or $\unicode[STIX]{x1D70C}_{r}\neq \unicode[STIX]{x1D707}(C_{r})$ for some $r$ and $E_{M}$ is disjoint from $E$ .
However, $E$ has full Lebesgue measure and $E_{M}$ , being a non-trivial linear fractional transformation of a positive measure set, has positive measure, so $E_{M}$ must be a subset of $E$ , and the theorem is proved.
6 Further questions
In one, admittedly peculiar, sense, the generalization that we have proved of Wall’s result is not the natural generalization to make. What makes rational numbers so nice for any base $b$ is that their base- $b$ expansions are eventually periodic. So one could ask the following.
Suppose $x$ is CF-normal and $q$ and $r$ have eventually periodic continued fraction expansions, that is, they are both quadratic irrationals, with $q\neq 0$ . Must it be true that $qx+r$ is CF-normal as well?
Also, while Theorem 1.1 solves Bugeaud’s problem, it does not solve Mendès France’s problem. CF-normality is a much stronger condition than CF-simple normality, and our proof relies crucially on full CF-normality. So we ask, as Mendès France did: do non-zero rational multiplication and rational addition preserve CF-simple normality?
Acknowledgements
The author would like to thank Justin Moore for asking a thought-provoking question on mathoverflow.net regarding the effect adding $1/2$ has on a continued fraction expansion, Bill Mance for bringing Bugeaud’s question to his attention, and Cor Kraaikamp for pointing the author to the work of Liardet and Stambul. The author would also like to thank the anonymous referee whose many comments greatly improved and simplified this paper.