Hostname: page-component-78c5997874-8bhkd Total loading time: 0 Render date: 2024-11-10T14:18:49.786Z Has data issue: false hasContentIssue false

Substitution Principle and semidirect products

Published online by Cambridge University Press:  15 August 2023

Célia Borlido*
Affiliation:
Centre for Mathematics of the University of Coimbra (CMUC), Coimbra, Portugal
Mai Gehrke
Affiliation:
Laboratoire J. A. Dieudonné, CNRS, Université Côte d’Azur, Nice, France
*
Corresponding author: Célia Borlido; Email: cborlido@mat.uc.pt
Rights & Permissions [Opens in a new window]

Abstract

In the classical theory of regular languages, the concept of recognition by profinite monoids is an important tool. Beyond regularity, Boolean spaces with internal monoids (BiMs) were recently proposed as a generalization. On the other hand, fragments of logic defining regular languages can be studied inductively via the so-called “Substitution Principle.” In this paper, we make the logical underpinnings of this principle explicit and extend it to arbitrary languages using Stone duality. Subsequently, we show how it can be used to obtain topo-algebraic recognizers for classes of languages defined by a wide class of first-order logic fragments. This naturally leads to a notion of semidirect product of BiMs extending the classical such construction for profinite monoids. Our main result is a generalization of Almeida and Weil’s Decomposition Theorem for semidirect products from the profinite setting to that of BiMs. This is a crucial step in a program to extend the profinite methods of regular language theory to the setting of complexity theory.

Type
Paper
Copyright
© The Author(s), 2023. Published by Cambridge University Press

1. Introduction

Profinite monoids have proved to be a powerful tool in the theory of regular languages. Eilenberg and Reiterman Theorems allow for the study of the so-called varieties of regular languages through the study of topo-algebraic properties of suitable profinite monoids. In 2008, Gehrke et al. (Reference Gehrke, Grigorieff, Pin and Aceto2008) proposed a unified approach for the study of regular languages, using Stone duality. In particular, they realized that profinite monoids could be seen as the extended dual spaces of certain Boolean algebras equipped with a residuation structure. More generally, to a Boolean algebra of (possibly nonregular) languages closed under quotients, one can assign a monoid equipped with a uniformity for which the natural biaction of the monoid on itself has uniformly continuous components (Gehrke et al. Reference Gehrke, Grigorieff and Pin2010). A slight variant (a so-called BiM – Boolean space with an internal monoid) was identified in Gehrke et al. (Reference Gehrke, Petrişan and Reggio2016).

Complexity theory and the theory of regular languages are intimately connected through logic. As with classes of regular languages, many computational complexity classes have been given characterizations as model classes of appropriate logic fragments on finite words (Immerman Reference Immerman1999). For example, ${\rm AC}^0={\bf FO}[\text{arb}]$ , ${\rm ACC}^0= ({\bf FO}+{\bf MOD})[{\text{arb}}]$ , and ${\rm TC}^0 = {\bf MAJ}[\text{arb}]$ where arb is the set of all possible (numerical) predicates of all arities on the positions of a word, FO is usual first-order logic, and MOD and MAJ stand for the modular and majority quantifiers, respectively. On the one hand, the presence of arbitrary predicates, and on the other hand, the presence of the majority quantifier is what brings one far beyond the scope of the profinite algebraic theory of regular languages.

Most results in the field of complexity theory are proved using combinatorial and probabilistic, as well as algorithmic methods (Williams Reference Williams2014). However, there are a few connections with the topo-algebraic tools of the theory of regular languages. A famous result of Barrington et al. (Reference Barrington, Compton, Straubing and Thérien1992) states that a regular language belongs to AC $^0$ if and only if its syntactic homomorphism is quasi-aperiodic. Although this result relies on the lower-bound results in circuit complexity of Furst et al. (Reference Furst, Saxe and Sipser1984) and no purely algebraic proof is known, being able to characterize the class of regular languages that are in AC $^0$ gives some hope that the nonregular classes might be amenable to treatment by the generalized topo-algebraic methods.

Indeed, the hope is that one can generalize the tools of algebraic automata theory, and this paper is a key step in this program. In the paper Logic Meets Algebra: the case of regular languages (Tesson and Thérien Reference Tesson and Thérien2007), Tesson and Thérien lay out the theory used to characterize logic classes in the setting of regular languages in terms of their recognizers. Here, we add topology to the picture, using Stone duality, to obtain corresponding tools that apply beyond the setting of regular languages. In particular, we generalize the Substitution Principle and the associated semidirect product construction used in the study of logic on words for regular languages to the general setting. This allows for an inductive topo-algebraic study of fragments of logic, which differs sharply from Lawvere’s point of view of hyperdoctrines in categorical logic (see Lawvere Reference Lawvere2006; Marquès Reference Marquès, Fahrenberg, Gehrke, Santocanale and Winter2021). Apart from being a contribution in extending profinite methods to complexity classes, this is also a key result in unifying semantics and complexity, and the results here and in the original proceedings paper (Borlido et al. Reference Borlido, Czarnetzki, Gehrke and Krebs2017) were key inspiration for this trend (see also Gehrke et al. Reference Gehrke, Jakl and Reggio2020).

In the sequence of papers (Gehrke et al. Reference Gehrke, Petrişan and Reggio2016, Reference Gehrke, Petrişan and Reggio2017, 2021), a semidirect product construction for Boolean spaces with internal monoids (BiMs) corresponding to the addition of a layer of quantifiers, which is closely related in spirit to our results here, has been studied. While related, our results here differ in three main points. For one, those results require the lp-variety (i.e., the input corresponding to the quantifier) to be regular, and only the predicates or formula algebra need not be regular. Secondly, the lp-variety needs to be generated by a finite semiring, while our results apply to any lp-variety and only the corresponding monoid structure is used. Finally, their approach is category-theoretic and does not make the link with the substitution principle of Tesson and Thérien (Reference Tesson and Thérien2007). When compared to Borlido et al. (Reference Borlido, Czarnetzki, Gehrke and Krebs2017), where the link with the substitution principle of Tesson and Thérien (Reference Tesson and Thérien2007) is highlighted, a major difference is that in Borlido et al. (Reference Borlido, Czarnetzki, Gehrke and Krebs2017) we mostly consider varieties of finite algebraic structures (the so-called typed monoids), while here we put our emphasis on their projective limits. There is also a couple of differences in the terminology used that we point out along the text where relevant.

The paper is organized as follows. In Section 2, we introduce the background needed and set up the main notation used throughout the paper. Important concepts here are those of an lp-strain of languages and of a class of sentences. Essentially, an lp-strain of languages is an assignment to any finite alphabet A (i.e., a finite set) of a Boolean algebra of languages (i.e., a Boolean subalgebra of the powerset $\mathcal{P}(A^*)$ , where $A^*$ denotes the A-generated free monoid), which is closed under taking preimages under length-preserving homomorphisms. A class of sentences is the corresponding notion in the setting of logic on words. Section 3 is devoted to languages over profinite alphabets. These are essential to the main results of this paper. Section 4 is an introduction to recognition of Boolean algebras of (not necessarily regular) languages that are closed under quotients. We define what is a Boolean space with an internal monoid (BiM) (Gehrke et al. Reference Gehrke, Grigorieff and Pin2010, Reference Gehrke, Petrişan and Reggio2016) and consider a slight generalization of this notion: that of a BiM-stamp. Informally, a BiM is to a variety of languages what a BiM-stamp is to an lp-variety of languages (i.e., an lp-strain in which we require closure under quotients). In Section 5, we cast the Substitution Principle (Tesson and Thérien Reference Tesson and Thérien2007) in duality-theoretic terms, which allows us to formulate a very general version that applies beyond regularity. We start by considering, as in Tesson and Thérien (Reference Tesson and Thérien2007), the case of finite Boolean algebras in Section 5.1. The key idea in this section is to use letters of a finite alphabet to play the role of atoms of a finite Boolean algebra of formulas. Since any Boolean algebra is the direct limit of its finite subalgebras, its Stone dual space is the projective limit of the sets of atoms of its finite subalgebras. This is why profinite alphabets (i.e., Boolean spaces) come into play in a natural way in Section 5.2, where we study substitution with respect to arbitrary Boolean algebras, and why the concept of lp-strain (and class of sentences) defined over finite but also profinite alphabets is crucial. In Tesson and Thérien (Reference Tesson and Thérien2007), only formulas with at most one free variable are considered since the extension to finite sets of free variables, needed to study predicate logic, is straightforward. For completeness, we prove this in Section 5.3. Finally, in Section 5.4, we illustrate the Substitution Principle in the setting of logic on words, by showing how it can be used to understand the effect of applying a layer of quantifiers to a given Boolean algebra of formulas. The main result of the paper, Theorem 6.9, is presented in Section 6. In this section, we get out of the context of logic and study the closure under quotients of the operation on languages derived in Section 5 via duality. Theorem 6.9 is a generalization of the classical result by Almeida and Weil (Reference Almeida and Weil1995, Theorem 5.1). Finally, in Section 7, we compute topo-algebraic recognizers for certain Boolean algebras of quantified languages, which are built on the recognizers for the corresponding nonquantified languages. More precisely, in Section 7.1, we study the effect of applying a layer of existential quantifiers, in Section 7.2 of modular quantifiers, and in Section 7.3 of majority quantifiers.

2. Preliminaries

In this section, we briefly present the background needed in the rest of the paper. For further reading on duality we refer to Johnstone (Reference Johnstone1986), and for formal language theory and logic on words to Straubing (Reference Straubing1994).

2.1 Discrete duality

This is the most basic duality we use, and it provides a correspondence between powerset Boolean algebras (these are the complete and atomic Boolean algebras) and sets. Given such Boolean algebra $\mathcal{B}$ , its dual is its set of atoms (i.e., the set of nonzero minimal elements), denoted ${\rm At}(\mathcal{B})$ and, given a set X, its dual is the Boolean algebra $\mathcal{P}(X)$ . Clearly going back and forth yields isomorphic objects. If $h\colon\mathcal{A}\to\mathcal{B}$ preserves arbitrary meets and joins, the dual of h, denoted ${\rm At}(h)\colon{\rm At}(\mathcal{B})\,{\to}\,{\rm At}(\mathcal{A})$ , is given by the adjunction:

\begin{equation*} \forall \ a\,{\in}\, \mathcal{A},\ x\,{\in}\, {\rm At}(\mathcal{B}) \quad\left({\rm At}(h)(x)\leq a \iff x\leq h(a)\right).\end{equation*}

For example, if $\iota\colon\mathcal{A}\rightarrowtail\mathcal{P}(X)$ is the inclusion of a finite Boolean subalgebra of a powerset, then ${\rm At}(\iota):X\twoheadrightarrow{\rm At}(\mathcal{A})$ is the quotient map corresponding to the finite partition of X given by the atoms of $\mathcal{A}$ . Conversely, given a function $f{\colon} X{\to} Y$ , the dual is just $f^{-1}{\colon}\mathcal{P}(Y){\to}\mathcal{P}(X)$ .

2.2 Stone duality

Since it will be useful in the next section, we will present Stone duality as a restriction of a dual adjunction between Boolean algebras and topological spaces.

Generally Boolean algebras do not have enough atoms, and we have to consider ultrafilters instead (which may be seen as “searches downwards” for atoms). Given an arbitrary Boolean algebra $\mathcal{B}$ , an ultrafilter of $\mathcal{B}$ is a subset $\gamma$ of $\mathcal{B}$ satisfying:

  • $\gamma$ is an upset, that is, $a\,{\in}\, \gamma$ and $a\leq b$ implies $b\,{\in}\, \gamma$ ;

  • $\gamma$ is closed under finite meets, that is, $a,b\,{\in}\,\gamma$ implies $a\wedge b\,{\in}\, \gamma$ ;

  • ○ for all $a\in\mathcal{B}$ exactly one of a and $\neg a$ is in $\gamma$ .

We will denote the set of ultrafilters of $\mathcal{B}$ by $X_\mathcal{B}$ , and we will consider it as a topological space equipped with the topology generated by the sets $\widehat{a}=\{\gamma\in X_\mathcal{B}\mid a\in\gamma\}$ for $a\in\mathcal{B}$ . The last property in the definition of ultrafilters implies that these basic open sets are also closed (and thus clopen). The resulting spaces are compact, Hausdorff, and have a basis of clopen sets. Such spaces are called Boolean spaces. Given a homomorphism $h\colon\mathcal{A}\to\mathcal{B}$ between Boolean algebras, the inverse image of an ultrafilter is an ultrafilter and thus $h^{-1}$ induces a map $f:X_\mathcal{B}\to X_\mathcal{A}$ . Since $f^{-1}(\widehat {a}) = \widehat {h(a)}$ for every $a \in \mathcal{A}$ , this map is continuous.

Conversely, given any topological space X, its clopen subsets form a Boolean algebra ${\sf Clopen}(X)$ , and given a continuous function $f\colon X\to Y$ , the inverse map $f^{-1}$ restricts and co-restricts to a homomorphism ${\sf Clopen}(Y) \to {\sf Clopen}(X)$ of Boolean algebras.

These assignments define a dual adjunction between Boolean algebras and topological spaces. When seen as an adjunction between the opposite category of Boolean algebras and that of topological spaces, the counit at a Boolean algebra $\mathcal{B}$ , and unit at a topological space X are, respectively, given by:

\[\varepsilon_\mathcal{B}:\mathcal{B} \to {\sf Clopen}(X_\mathcal{B}), \qquad a \mapsto \widehat a\]

and

(1) \begin{equation}\eta_X: X \to X_{{\sf Clopen}(X)}, \qquad x \mapsto \{K \in {\sf Clopen}(X) \mid x \in K\}.\end{equation}

One can show that $\varepsilon_\mathcal{B}$ is always an isomorphism, while $\eta_X$ is an isomorphism if and only if X is a Boolean space. Thus, this adjunction restricts to a duality between Boolean algebras and Boolean spaces: the so-called Stone duality.

2.3 Compactification of topological spaces

We consider two compactifications of topological spaces: the Čech–Stone and the Banaschewski compactifications, the latter being defined only for zero-dimensional spaces. We also relate them with the dual spaces of suitable Boolean algebras.

Let X be a topological space. Then, the Čech–Stone compactification of X is the unique (up to homeomorphism) compact Hausdorff space $\beta (X)$ together with a continuous function $e: X\to \beta (X)$ satisfying the following universal property: every continuous function $f: X \to Z$ into a compact Hausdorff space Z uniquely determines a continuous function $\beta f: \beta (X) \to Z$ making the following diagram commute:

In the case where X is a completely regular $T_1$ space, the map e is in fact an embedding (Willard Reference Willard2004, Section 19). This is for instance the case of discrete and of Boolean spaces. It may be proved that, for a set S, the dual space of $\mathcal{P}(S)$ is the Čech–Stone compactification of the discrete space S. This fact has been extensively used in the topo-algebraic approach to nonregular languages over finite alphabets (see e.g., Gehrke et al. Reference Gehrke, Grigorieff and Pin2010, Reference Gehrke, Petrişan and Reggio2016). However, as already mentioned, in the present work, we also consider languages over profinite alphabets. For that reason, we will be handling dual spaces of Boolean algebras of the form ${\sf Clopen}(X)$ , for spaces X that are not discrete. In the case where X is zero-dimensional, the dual space of ${\sf Clopen}(X)$ is known as the Banaschewski compactification (Banaschewski 1995, Reference Banaschewski1964) of X, and it is denoted by $\beta_0(X)$ . At the level of morphisms, $\beta_0$ assigns to each continuous map the dual of the Boolean algebra homomorphism taking inverse image on clopens. Moreover, for a zero-dimensional space X, the unit (1) of the dual adjunction between Boolean algebras and topological spaces is an embedding $\eta_X: X \rightarrowtail\beta_0(X)$ with dense image (see e.g., (see e.g., Porter and Woods Reference Porter and Woods1988, Section 4.7, Proposition (b)), and it is easy to see that for every continuous function $f: X \to Y$ between zero-dimensional spaces the following diagram commutes:

There are some cases where the Čech–Stone and the Banaschewski compactifications coincide, for example, for discrete spaces as mentioned above. More generally, we have the following:

A space is said to have Čech-dimension zero provided that each finite open cover admits a finite clopen refinement. In particular, every compact zero-dimensional space, and so, every Boolean space, has Čech-dimension zero.

Theorem 2.1. (Banaschewski 1995, Theorem 4). A zero-dimensional space X is of Čech-dimension zero if and only if it is normal and the Čech–Stone and Banaschewski compactifications of X coincide.

This is of interest to us as it shows that $\beta$ is given by duality not only for discrete spaces but more generally for spaces of Čech-dimension zero such as $Y^*$ for Y a profinite alphabet (cf. Lemma 3.5 and Theorem 3.6).

2.4 Projective and direct limits

A projective limit system (also known as an inverse limit system, or a cofiltered diagram) $\mathcal{F}$ of sets assigns to each element i of a directed partially ordered set I, a set $S_i$ , and to each ordered pair $i\geq j$ in I, a map $f_{i,j}:S_i \to S_j$ so that for all $i, j, k \in I$ with $i \ge j \ge k$ we have $f_{i,i} = id_{S_i}$ and $f_{j,k}\circ f_{i,j} = f_{i,k}$ . The projective limit (or inverse limit or cofiltered limit) of $\mathcal{F}$ , denoted $\lim\limits_{\longleftarrow} \mathcal{F}$ , comes equipped with projection maps $\pi_i: \lim\limits_{\longleftarrow} \mathcal{F}\to S_i$ compatible with the system. That is, for $i,j\in I$ with $i\geq j$ , $f_{i,j} \circ \pi_i= \pi_j$ . Further, it satisfies the following universal property: whenever $\{\pi_i':S' \to S_i\}_{i \in I}$ is a family of maps satisfying $f_{i,j} \circ \pi'_i= \pi'_j$ for all $i\geq j$ , there exists a unique map $g: S'\to\lim\limits_{\longleftarrow} \mathcal{F}$ satisfying $\pi_i' = \pi_i \circ g$ , for all $i \in I$ .

The notion dual to projective limit, obtained by reversing the directions of the maps, is that of direct limit (also known as an injective limit, inductive limit, or filtered colimit).

There are several things worth noting about these notions. First, the projective limit of $\mathcal{F}$ may be constructed as follows:

(2) \begin{equation} \lim\limits_{\longleftarrow} \mathcal{F} = \left\{(s_i)_{i \in I} \in \prod_{i \in I}S_i \mid f_{i,j} (s_i) = s_j \text{ whenever $i \ge j$}\right\}.\end{equation}

Second, projective limits of finite sets, called profinite sets, are equivalent to Boolean spaces. If each $S_i$ is finite, then it is a Boolean space in the discrete topology, and the projective limit is a closed subspace of the product and thus again a Boolean space. Conversely, a Boolean space is the projective limit of the projective system of its finite continuous quotients, corresponding via duality to the fact that Boolean algebras are locally finite and, thus, direct limits of their finite subalgebras.

Third, one also has projective systems and projective limits of richer structures than sets, being the connecting maps required to be morphisms of the appropriate kind. In this work, we will use the description of projective limits in the categories of topological spaces and of monoids. A very useful fact is that, in these settings, projective limits are given as for sets with the obvious enriched structure.

2.5 Biactions and semidirect products of monoids

Let S be a set and M a monoid. We denote the identity of M by 1. A biaction of M on S is a family of functions:

\[\lambda_m: S \to S, \qquad \text{and}\qquad \rho_m: S \to S,\]

for each $m \in M$ , that satisfy the following conditions:

  • $\{\lambda_m\}_{m \in M}$ induces a left action of M on S, that is, $\lambda_1$ the identity function on S, and for every $m, m' \in M$ , the equality $\lambda_m \circ \lambda_{m'} = \lambda_{m m'}$ holds;

  • $\{\rho_m\}_{m \in M}$ induces a right action of M on S, that is, $\rho_1$ the identity function on S, and for every $m, m' \in M$ , the equality $\rho_{m'} \circ \rho_{m} = \rho_{mm'}$ holds;

  • $\{\lambda_m\}_{m \in M}$ and $\{\rho_m\}_{m \in M}$ are compatible, that is, for every $m, m' \in M$ , we have $\lambda_m \circ \rho_{m'} = \rho_{m'} \circ \lambda_m$ .

In the case where S is also a monoid, we have the notion of a monoid biaction. In order to improve readability, we shall denote the operation on S additively, although S is not assumed to be commutative. A monoid biaction of M on S is a biaction of M on the underlying set of S, satisfying the following additional properties:

  • $\lambda_{m_1} \circ \rho_{m_2} (s_1 + s_2) = \lambda_{m_1} \circ \rho_{m_2} (s_1 ) + \lambda_{m_1} \circ \rho_{m_2} (s_2)$ , for all $m_1,m_2\in M$ , and $s_1, s_2 \in S$ ;

  • $\lambda_{m_1} \circ \rho_{m_2} (0) = 0$ , for all $m_1,m_2 \in M$ .

Finally, given such a biaction, we may define a new monoid, called the (two-sided) semidirect product of S and M, usually denoted by $S{**}M$ . The monoid $S{**}M$ has underlying set $S \times M$ , and its binary operation is defined by:

\[(s_1, m_1)(s_2, m_2) = (\rho_{m_2}(s_1) + \lambda_{m_1}(s_2),\, m_1m_2).\]

2.6 Formal languages

Let A be a set, which we call an alphabet. A word over A is an element of the A-generated free monoid $A^*$ , and a language over A is a set $L \subseteq A^*$ of words, that is, an element of the powerset $\mathcal{P}(A^*)$ . Whenever we write $w = a_0\dots a_{n-1}$ for a word, we are assuming that each of the $a_i$ ’s is a letter. If $w = a_0\dots a_{n-1}$ is a word, then we say that w has length n, denoted $\left\lvert w\right\rvert = n$ . The fact that $A^*$ is a monoid means in particular that $A^*$ is equipped with a biaction of itself. By discrete duality, it follows that $\mathcal{P}(A^*)$ also is equipped with a biaction of $A^*$ given as follows. The left (respectively, right) quotient of a language L by a word w is the language $w^{-1}L = \{u \in A^* \mid wu \in L\}$ (respectively, $Lw^{-1} = \{u \in A^* \mid uw \in L\}$ ). We say that a set of languages is closed under quotients if it is invariant under this biaction.

In the discrete duality between complete atomic Boolean algebras and sets, the complete Boolean subalgebras closed under quotients of the powerset $\mathcal{P}(M)$ , where M is a monoid, correspond to the monoid quotients of M (Gehrke Reference Gehrke2016, Theorem 3.26) (see also Dekkers Reference Dekkers2008, Theorem 8.13). In particular, given a language $L \subseteq A^*$ , the discrete dual of $\mathcal{B}_L$ , the complete Boolean subalgebra closed under quotients generated by L, is a monoid quotient $\mu_L: A^*\twoheadrightarrow M_L$ . The monoid $M_L$ is called the syntactic monoid of L, and $\mu_L$ is the syntactic homomorphism of L. The quotient map $\mu_L$ is given by the corresponding congruence relation, known as the syntactic congruence of L, which, by the definition of $\mu_L$ , is given by:

\[ u \sim_L v \iff (\forall x,y \in A^*, \ xuy \in L \iff xvy \in L).\]

Notice that it follows that $L = \mu_L^{-1}(\mu_L[L])$ , that is, L is recognized by $M_L$ via $\mu_L$ . More generally, given a monoid M, a language $L \subseteq A^*$ is said to be recognized by M provided there exists a homomorphism $\mu\colon A^* \to M$ and a subset $P \subseteq M$ satisfying $L =\mu^{-1}(P)$ , or equivalently, provided $L = \mu^{-1}(\mu[L])$ . By duality, the set of languages recognized by $\mu$ is a complete Boolean subalgebra closed under quotients which contains L. Since it must contain $\mathcal{B}_L$ , the syntactic homomorphism of L factors through any homomorphism $\mu$ which recognizes L. That is, the syntactic monoid and homomorphism of L can be thought as the “optimal” recognizer of L.

For a set of languages $\mathcal{S} \subseteq \mathcal{P}(A^*)$ , it is then natural to wonder about the existence of an “optimal” recognizer for all the languages of $\mathcal{S}$ . This leads to the notion of syntactic monoid, homomorphism, and congruence of $\mathcal{S}$ , which are defined via discrete duality for the complete Boolean subalgebra closed under quotients generated by $\mathcal{S}$ . Explicitly, the syntactic congruence of $\mathcal{S}$ , denoted $\sim_\mathcal{S}$ , is the intersection of all the syntactic congruences of the languages of $\mathcal{S}$ , the syntactic monoid is the quotient $M_\mathcal{S} = A^* / {\sim_\mathcal{S}}$ , and the syntactic homomorphism is the canonical projection $\mu_\mathcal{S}: A^* \twoheadrightarrow M_\mathcal{S}$ . The homomorphism $\mu_\mathcal{S}$ is “optimal” in the sense that if $\mu: A^* \to M$ is a homomorphism recognizing every language of $\mathcal{S}$ , then it factors through $\mu_\mathcal{S}$ .

Finally, regular languages are those recognized by finite monoids. It is not hard to see that regular languages over an alphabet A form a Boolean algebra closed under quotients. Beyond regularity, the discrete notion of recognition introduced here is not adequate. In particular, any infinite monoid recognizes uncountably many languages. Recognition of Boolean algebras of (not necessarily regular) languages closed under quotients require the introduction of topology. This will be addressed in Section 4.

So far, we did not require A to be a finite set. In this paper, we will consider languages over profinite alphabets, where the topology of the alphabet will be taken into account (finite alphabets are simply viewed as discrete spaces). Languages over profinite alphabets will be the subject of Section 3. Unless specified otherwise, we will use the letters $A, B, \dots$ to denote finite alphabets, and $X, Y, \dots$ for profinite (not necessarily finite) ones.

2.7 lp-strains and lp-varieties of languages

A homomorphism between free monoids is said to be length-preserving (also called an lp-morphism) provided it maps generators to generators. Therefore, lp-morphisms $A^* \to B^*$ are in a bijection with functions $A \to B$ . Given a map $h:A \to B$ , we use $h^*$ to denote the unique lp-morphism $h^*:A^* \to B^*$ whose restriction to A is h. As we will see in Section 2.10, lp-morphisms are important in the treatment of fragments of logic on words.

Combining Pippenger (Reference Pippenger1997) and Straubing’s (2002) terminology, we call lp-strain of languages an assignment, for each finite alphabet A, of a Boolean algebra $\mathcal{V}(A)$ of languages over A such that, for every lp-morphism $h^*:B^* \to A^*$ , if $L \in \mathcal{V}(A)$ then $(h^*)^{-1}(L)\in \mathcal{V}(B)$ . Since $(h \circ g)^* = h^*\circ g^*$ whenever f and g are composable set functions, each lp-strain of languages defines a presheaf $\mathcal{V}: {\bf Set}_{fin}^{\rm op} \to {\bf BA}$ of Boolean algebras. Moreover, the presheaves obtained in this way are precisely the sub-presheaves of the presheaf $\mathcal{P}: {\bf Set}_{fin}^{\rm op} \to{\bf BA}$ defined by the lp-strain assigning the full powerset $\mathcal{P}(A^*)$ to each finite alphabet A. This notion allows us to extend languages to profinite alphabets (cf. Section 3), which is crucial when handling infinite Boolean algebras of formulas (cf. Section 5.2). Notice that, if $\mathcal{V}$ is an lp-strain of languages, then every function $h: B \to A$ induces dually a continuous function $\widehat h: X_{\mathcal{V}(B)} \to X_{\mathcal{V}(A)}$ making the following diagram commute:

where $A^* \to X_{\mathcal{V}(A)}$ and $B^* \to X_{\mathcal{V}(B)}$ are the maps with dense image obtained by dualizing and restricting the embeddings $\mathcal{V}(A) \rightarrowtail \mathcal{P}(A^*)$ and $\mathcal{V}(B) \rightarrowtail\mathcal{P}(B^*)$ , respectively. This is a particular case of Lemma 3.9, for which we will provide a detailed proof.

In the case where $\mathcal{V}(A)$ is closed under quotients for every alphabet A, we call $\mathcal{V}$ an lp-variety of languages.

2.8 Logic on words

We shall consider classes of languages that are definable in fragments of first-order logic. Variables are denoted by $x,y,z, \ldots$ , and we fix a finite alphabet A. Our logic signature consists of:

  • (unary) letter predicates ${\sf P}_a$ , one for each letter $a \in A$ ;

  • a set $\mathcal{N}$ of numerical predicates R, each of finite arity. A k-ary numerical predicate R is given by a subset $R \subseteq {\mathbb{N}}^k$ . For instance, the usual (binary) numerical predicate $<$ formally corresponds to the predicate $R = \{(i,j) \mid i,j \in {\mathbb{N}}, \ i < j\}$ ;

  • a set $\mathcal{Q}$ of (unary) quantifiers Q, each one being given by a function $Q:\{0,1\}^* \to \{0,1\}$ . For instance, the existential quantifier $\exists$ corresponds to the function mapping $\varepsilon_0 \dots \varepsilon_{n-1} \in \{0,1\}^*$ to 1 exactly when there exists an i so that $\varepsilon_i = 1$ .

Then, (first-order) formulas are recursively built as follows:

  • for each letter predicate ${\sf P}_a$ and variable x, there is an atomic formula ${\sf P}_a(x)$ ;

  • if $R \in \mathcal{N}$ is a k-ary numerical predicate and $x_1, \dots, x_k$ are (possibly nondistinct) variables, then $R(x_1, \dots, x_k)$ is an atomic formula;

  • Boolean combinations of formulas are formulas, that is, if $\phi$ and $\psi$ are formulas, then so are $\phi \wedge \psi$ , $\phi \vee \psi$ , and $\neg \phi$ ;

  • if Q is a quantifier, x a variable and $\phi$ a formula, then $Q x \ \phi$ is also a formula.

An occurrence of a variable x in a formula is said to be free provided it appears outside of the scope of a quantifier Q x. A sentence is a formula with no free occurrences of variables.

A context x is a finite set of distinct variables. Given disjoint contexts x and y, we denote by ${\bf x}{\bf y}$ the union of x and y. Whenever we mention union of contexts, we will assume they are disjoint without further mention. Also, we simply write x to refer to the context $\{x\}$ so that by ${\bf x} x$ we mean the context ${\bf x} \cup \{x\}$ . We denote by $\left\lvert{\bf x}\right\rvert$ the cardinality of x. The models of a formula will always be considered in a fixed context. We say that $\phi$ is in context x provided all the free variables of $\phi$ belong to x. Notice that, if $\phi$ is in context x, then it is also in every other context containing x.

Models of sentences in the empty context are words over A. More generally, models of formulas in the context $\bf x$ are given by $ {\bf x}$ -marked words, that is, words $w \in A^*$ equipped with an interpretation $ {\bf i} \in \{0, \dots, \left\lvert w\right\rvert -1\}^{\bf x}$ of the context x. We identify maps from x to $\{0, \dots, \left\lvert w\right\rvert - 1\}$ with $\left\lvert{\bf x}\right\rvert$ -tuples ${\bf i} = (i_1, \dots, i_{\left\lvert{\bf x}\right\rvert }) \in \{0, \dots, \left\lvert w\right\rvert -1\}^{\left\lvert{\bf x}\right\rvert} \subseteq {\mathbb{N}}^{\left\lvert{\bf x}\right\rvert}$ . We denote the set of all $ {\bf x}$ -marked words by $A^* \otimes {\mathbb{N}}^{ {\bf x}}$ . Given a word $w \in A^*$ and a vector ${\bf i} = (i_1, \dots, i_{\left\lvert{\bf x}\right\rvert })\in \{0, \dots, \left\lvert w\right\rvert -1\}^{\left\lvert{\bf x}\right\rvert}$ , we denote by $(w,{\bf i})$ the marked word based on w equipped with the map given by i. Moreover, if x and y are disjoint contexts, ${\bf i} = (i_1,\dots, i_{\left\lvert{\bf x}\right\rvert}) \in \{0, \dots, \left\lvert w\right\rvert - 1\}^{\left\lvert{\bf x}\right\rvert}$ and ${\bf j} = (j_1, \dots, j_{\left\lvert{\bf y}\right\rvert}) \in \{0, \dots, \left\lvert w\right\rvert -1\}^{\left\lvert{\bf y}\right\rvert}$ , then $(w, {\bf i}, {\bf j})$ denotes the ${ {\bf z}}$ -marked word $(w, {\bf k})$ , where ${\bf z} = {\bf x}{\bf y}$ and ${\bf k} = (i_1, \dots, i_{\left\lvert{\bf x}\right\rvert}, j_1, \dots, j_{\left\lvert{\bf y}\right\rvert})$ .

The semantics of a formula in context ${\bf x} = \{x_1, \dots, x_k\}$ is defined inductively as follows. We let $w \in A^*$ be a word and ${\bf i} = (i_1, \dots, i_k) \in \{0, \dots, \left\lvert w\right\rvert - 1\}^{\left\lvert{\bf x}\right\rvert}$ . Then,

  • the marked word $(w, {\bf i})$ satisfies ${\sf P}_{a}(x_j)$ if and only if its $i_j$ -th letter is an a. As a consequence, for all $j \in \{1, \dots, k\}$ , there exists exactly one letter $a \in A$ such that $(w, {\bf i}) \models {\sf P}_a(x_j)$ ;

  • the marked word $(w, {\bf i})$ satisfies the formula $R(x_{j_1}, \dots, x_{j_\ell})$ given by a numerical predicate R if and only if $(i_{j_1}, \dots, i_{j_\ell}) \in R$ ;

  • the Boolean connectives and $\wedge$ , or $\vee$ , and not $\neg$ are interpreted classically;

  • the marked word $(w, {\bf i})$ satisfies the formula $Q x \ \phi$ , for a quantifier $Q: \{0,1\}^* \to \{0,1\}$ , if and only if $Q((w, {\bf i}, 0) \models \phi, \dots, (w, {\bf i}, \left\lvert w\right\rvert -1)\models \phi ) = 1$ , where “ $(w, {\bf i}, i)\models \phi$ ” denotes the truth value of “ $(w, {\bf i}, i)$ satisfies $\phi$ .”

Important examples of quantifiers are given by the: existential quantifier $\exists$ mentioned above and its variant $\exists !$ (there exists a unique); modular quantifiers $\exists_q^r$ , for each $q \in {\mathbb{N}}$ and $0\leq r<q$ , mapping a word of $\{0,1\}^*$ to 1 if and only if its number of 1’s is congruent to r modulo q; and the majority quantifiers ${\sf Maj}_k$ , for each $k \in {\mathbb{N}}$ , sending an element of $\{0,1\}^*$ to 1 if and only if it has strictly k more occurrences of 1 than of 0. Finally, given a formula $\phi$ in context x, we denote by $L_{\phi}$ the set of marked words that are models of $\phi$ .

Given a finite alphabet A, a context $\bf x$ , a set of numerical predicates $\mathcal{N}$ , and a set of quantifiers $\mathcal{Q}$ , we denote by ${\mathcal{Q}}_{A,{\bf x}}[\mathcal{N}]$ the corresponding set of first-order formulas in context $\bf x$ , up to semantic equivalence. The set of sentences ${\mathcal{Q}}_{A,\emptyset}[\mathcal{N}]$ in the empty context is simply denoted by $\mathcal{Q}_A[\mathcal{N}]$ . Notice that, as long as ${\mathcal{Q}}_{A,{\bf x}}[\mathcal{N}]$ is nonempty, say it contains a formula $\phi$ , it will also contain the always-true formula ${\bf 1}$ , which is semantically equivalent to $\phi \vee \neg\phi$ , and the always-false formula ${\bf 0}$ , which is semantically equivalent to $\phi \wedge \neg \phi$ . In particular, we have $L_{{\bf 1}} = A^* \otimes {\mathbb{N}}^{ {\bf x}}$ and $L_{{\bf 0}} = \emptyset$ , and taking the set of models of a given formula defines an embedding of Boolean algebras:

\[{\mathcal{Q}}_{A,{\bf x}}[\mathcal{N}]\rightarrowtail \mathcal{P}(A^* \otimes {\mathbb{N}}^{ {\bf x}}).\]

In formal language theory, one usually considers languages that are subsets of a free monoid, as presented in the previous paragraph. However, formulas in a nonempty context ${\bf x} = \{x_1,\dots, x_k\}$ define languages of marked words. Nevertheless, there is a natural injection $A^* \otimes {\mathbb{N}}^{ {\bf x}} \rightarrowtail (A\times2^{\bf y})^*$ whenever y is a context containing x. Here, we make a slight abuse of notation by writing $2^{{\bf y}}$ to actually mean the set $\mathcal{P}({\bf y})$ . Indeed, let $(w, {\bf i})$ be a marked word, where $w =a_0 \dots a_{n -1}$ and ${\bf i} = (i_1, \dots, i_k)$ . Then, $(w, {\bf i})$ is completely determined by the word $(a_0, S_0)\dots (a_{n-1},S_{n-1})$ over $(A \times 2^{\bf y})$ , where $S_i = \{x_{j} \mid i_j =i\}$ . Thus, we will often regard the language defined by a formula in a context x as a language over an extended alphabet $A \times 2^{\bf y}$ for some ${\bf y} \supseteq {\bf x}$ . Note that words of the form $(a_0,S_0)\dots (a_{n-1}, S_{n-1})$ as above are usually called x-structures (cf. Straubing Reference Straubing1994, Chapter II).

2.9 Marked words

We now consider the case where x consists of a single variable x, as this will play an important role in the remaining paper (cf. Sections 4.2 and 6). Note, however, that a similar treatment is possible in general.

When ${\bf x} = \{x\}$ , the set of x-marked words may be described as:

\begin{equation*} A^* \otimes {\mathbb{N}} = \{(w, i) \mid w \in A^*, \ i \in \{0, \dots, \left\lvert w\right\rvert - 1\}\},\end{equation*}

and we have an embedding $A^*\otimes {\mathbb{N}} \rightarrowtail (A \times2)^*$ defined by:

\[(w, i) \mapsto (a_0, 0) \dots (a_{i-1}, 0) (a_i, 1) (a_{i+1}, 0) \dots (a_{n-1}, 0),\]

for every word $w = a_0 \dots a_{n-1}$ and $i \in \{0, \dots, n-1\}$ .

Identifying the elements of $A^* \otimes{\mathbb{N}}$ with their images in $(A\times 2)^*$ , we may compute the Boolean algebra closed under quotients generated by this language. It is not difficult to see that $A^* \otimes{\mathbb{N}}$ is a regular language in $(A \times 2)^*$ , and that its syntactic monoid is the three-element commutative monoid $\{e,m,z\}$ satisfying $m^2 = z$ , with z acting as zero and e its identity element. The syntactic morphism of the language $A^* \otimes{\mathbb{N}}$ is $\mu: (A \times 2)^*\twoheadrightarrow \{e,m,z\}$ given by $\mu(a,0) =e$ and $\mu(a, 1) = m$ , and we have

\[A^* = \mu^{-1}(e), \quad A^*\otimes {\mathbb{N}} = \mu^{-1}(m), \quad \text{and} \quad A_z = \mu^{-1}(z),\]

where we identify $A^*$ with $(A\times\{0\})^*$ and $A_z= (A \times2)^* \setminus (A^* \cup A^* \otimes {\mathbb{N}})$ .

2.10 Classes of sentences

Given a function $\zeta: A \to B$ , we may define a map $\zeta^*_{\bf x}:A^* \otimes {\mathbb{N}}^{ {\bf x}} \to B^* \otimes{\mathbb{N}}^ {\bf x}$ by setting $\zeta^*_{\bf x}(w,{\bf i}) = (\zeta^*(w), {\bf i})$ , for every $w \in A^*$ and ${\bf i}\in \{0, \dots, \left\lvert w\right\rvert - 1\}^{\bf x}$ . By discrete duality, $\zeta_{\bf x}^*$ yields a homomorphism of Boolean algebras $(\zeta^*_{\bf x})^{-1}: \mathcal{P}(B^*\otimes {\mathbb{N}}^{ {\bf x}}) \to \mathcal{P}(A^* \otimes {\mathbb{N}}^{ {\bf x}})$ . By identifying fragments of logic with the set of languages they define, we may show that, for every set of quantifiers $\mathcal{Q}$ and every set of numerical predicates $\mathcal{N}$ , the homomorphism $(\zeta^*_{\bf x})^{-1}$ restricts and co-restricts to a homomorphism ${\mathcal{Q}}_{B,{\bf x}} [\mathcal{N}] \to {\mathcal{Q}}_{A,{\bf x}}[\mathcal{N}]$ . Indeed, for x consisting of a single variable x, we have

\[\zeta_x(w, i) \models {\sf P}_b(x) \iff (w, i) \models \bigvee_{\zeta(a) = b} {\sf P}_a(x),\]

for every $w \in A^*$ and $i \in \{0, \dots, \left\lvert w\right\rvert - 1\}$ . Note that, since A is finite, $\bigvee_{\zeta(a) = b} {\sf P}_a(x)$ is a well-defined formula of ${\mathcal{Q}}_{A,{\bf x}}[\mathcal{N}]$ . With a routine structural induction on the construction of formulas, we can then show that $(\zeta^*_{\bf x})^{-1}$ sends the language defined by a formula $\phi\in{\mathcal{Q}}_{B,{\bf x}}[\mathcal{N}]$ to the language defined by the formula obtained from $\phi$ by substituting, for every occurrence of a predicate ${\sf P}_b(x)$ with $b\in B$ , the formula $\bigvee_{\zeta(a)=b}{\sf P}_a(x)$ . Note that if b is not in the image of $\zeta$ , then ${\sf P}_b(x)$ is replaced by the empty join, which is logically equivalent to the always-false proposition. We denote by $\zeta_{\mathcal{Q}_{\bf x}[\mathcal{N}]}$ this restriction and co-restriction of $(\zeta_{\bf x}^*)^{-1}$ , so that the following diagram commutes

(3)

In particular, $\mathcal{Q}_{\_,{\bf x}}[\mathcal{N}]$ defines a presheaf $\mathcal{Q}_{\_,{\bf x}}[\mathcal{N}]: {\bf Set}_{fin}^{\rm op} \to {\sf BA}$ and, as so, it takes right (respectively, left) inverses to left (respectively, right) inverses and thus surjections (respectively, injections) on finite sets to embeddings (respectively, quotients) in Boolean algebras.

A class of sentences is, intuitively speaking, the notion corresponding to that of an lp-strain of languages in the setting of logic on words. This notion models what is usually referred to as a fragment of logic. Formally, a class of sentences $\Gamma$ is a map that associates with each finite alphabet A a set of sentences $\Gamma(A)\subseteq\mathcal{Q}_{A}[\mathcal{N}]$ which satisfies the following properties:

(LC.1) Each set $\Gamma(A)$ is closed under Boolean connectives $\wedge$ and $\neg$ (and thus, under $\vee$ );

(LC.2) For each map $\zeta\colon A\to B$ between finite alphabets, the homomorphism $\zeta_{\mathcal{Q}[\mathcal{N}]}: \mathcal{Q}_B[\mathcal{N}] \to \mathcal{Q}_A[\mathcal{N}]$ restricts and co-restricts to a homomorphism $\zeta_{\Gamma}: \Gamma(B)\to \Gamma(A)$ .

Since we consider sentences up to semantic equivalence, by (LC.1), each $\Gamma(A)$ is a Boolean algebra, and (LC.2) simply says that the assignment $A \mapsto\mathcal{V}_\Gamma(A) = \{L_\phi\mid\phi\in\Gamma(A)\}$ defines an lp-strain of languages.

3 Languages over Profinite Alphabets

As in the regular setting, it is sometimes useful to consider languages over profinite alphabets. Indeed, we will see in Section 5 that these appear naturally when extending substitution to arbitrary Boolean algebras. In this section, we see how any lp-strain of languages may be naturally extended to profinite alphabets. Since lp-strains of languages are the formal language counterpart of classes of sentences, such extension will be another key ingredient in the subsequent sections (cf. Corollary 5.12, but also Theorem 6.9).

Let $\mathcal{V}$ be an lp-strain of languages. We extend the assignment $A\mapsto \mathcal{V}(A)$ for A finite to profinite alphabets as follows. Let Y be a profinite alphabet. By definition, Y is the projective limit of all its finite continuous quotients, say $\{h_i : Y\twoheadrightarrow Y_i\}_{i \in I}$ , the connecting morphisms being all the surjective maps $h_{i,j}: Y_i \twoheadrightarrow Y_j$ satisfying $h_j = h_{i,j} \circ h_i$ . Note that I is a directed set ordered by $i \ge j$ if and only if $h_{i,j}$ is defined. In turn, each of the maps $h_{i,j}$ uniquely defines an lp-morphism $h_{i,j}^*: Y_i^* \twoheadrightarrow Y_j^*$ . Since $\mathcal{V}$ is an lp-strain, for every $i \ge j$ , there is a well-defined embedding of Boolean algebras $\theta_{i,j}: \mathcal{V}(Y_j) \rightarrowtail \mathcal{V}(Y_i)$ sending $L \in \mathcal{V}(Y_j)$ to its preimage $(h_{i,j}^*)^{-1}(L) \in\mathcal{V}(Y_j)$ . A similar phenomenon happens with respect to each of the continuous quotients $h_i: Y \twoheadrightarrow Y_i$ , when $Y^*$ is viewed as the topological space which is the union over $n \ge 0$ of the product spaces $Y^n$ . Indeed, since the clopen subsets of $Y^*$ are the unions of the form $\bigcup_{n \ge 0} C_n$ , where $C_n \subseteq Y^n$ is a clopen subset of $Y^n$ , we have that $h_i^*$ is continuous, and thus, it defines an embedding of Boolean algebras $\iota_i: \mathcal{V}(Y_i) \rightarrowtail {\sf Clopen}(Y^*)$ . Clearly, the family $\{ \mathcal{V}(Y_i) \mid i \in I\}$ forms a direct limit system with connecting morphisms $\{\theta_{i,j}\mid i \le j\}$ satisfying $\iota_j = \iota_i \circ \theta_{i,j}$ . Thus, we may define

(4) \begin{equation} \mathcal{V}(Y) = \lim_{\longrightarrow} \ \{\mathcal{V}(A) \mid A \text{ is a finite continuous quotient of }Y\}.\end{equation}

Note that, by identifying $\mathcal{V}(Y_i)$ with $\iota_i[\mathcal{V}(Y_i)] \subseteq{\sf Clopen}(Y^*)$ for each $i \in I$ , we may regard $\mathcal{V}(Y)$ as a Boolean subalgebra of ${\sf Clopen}({Y^*}) \subseteq \mathcal{P}(Y^*)$ . More precisely, we have

\[\mathcal{V}(Y) = \bigcup_{i \in I} \mathcal{V}(Y_i),\]

and each $\mathcal{V}(Y_i)$ is a Boolean subalgebra of $\mathcal{V}(Y)$ (which dually means that $X_{\mathcal{V}(Y_i)}$ is a quotient of $X_{\mathcal{V}(Y)}$ ). Moreover, given a map $h: Y \to A$ , words $u, v\in Y^*$ , and a language $K \subseteq A^*$ , a routine computation shows that

\[u^{-1}[(h^*)^{-1}(K)]v^{-1}= (h^*)^{-1}(s^{-1}Kt^{-1}),\]

where $s = h^*(u)$ and $t = h^*(v)$ . Thus, if $\mathcal{V}(A)$ is closed under quotients, then so is $(h^*)^{-1}(\mathcal{V}(A))$ . Therefore, we have:

Lemma 3.1 A language $L \subseteq Y^*$ belongs to $\mathcal{V}(Y)$ if and only if there is a finite continuous quotient $h: Y \twoheadrightarrow A$ and a language $K \in \mathcal{V}(A)$ such that $L = (h^*)^{-1}(K)$ . Further, if $\mathcal{V}$ is an lp-variety, then $\mathcal{V}(Y)$ is a Boolean algebra closed under quotients by words of $Y^*$ .

Remark 3.2. As we have seen above, every language over a profinite alphabet may be seen as a clopen subset of $Y^*$ . However, even when $\mathcal{V}$ is the full variety of languages, meaning that $\mathcal{V}(A) = \mathcal{P}(A^*)$ for every finite alphabet A, the equality $\mathcal{V}(Y) = {\sf Clopen}(Y^*)$ does not hold in general. For instance, if $Y = {\mathbb{N}} \cup \{\infty\}$ is the one-point compactification of ${\mathbb{N}}$ , we have $Y = \lim\limits_{\longleftarrow} \ \{h_i: {\mathbb{N}} \twoheadrightarrow \{0, 1, \dots, i\}\}_{i \in {\mathbb{N}}}$ , where $h_i(j) = j$ for $j \in \{0, 1, \dots, i\}$ , and $h_i(j) = i$ for $j \in {\mathbb{N}}_{> i} \cup \{\infty\}$ . Moreover, the set $L = \{0 1 2\dots i \mid i \in {\mathbb{N}}\}$ is a clopen subset of $Y^*$ which does not belong to $\mathcal{V}(Y)$ . Indeed, by Lemma 3.1, this is due to the fact that there is no finite continuous quotient $h:Y \twoheadrightarrow A$ of Y so that L belongs to $(h^*)^{-1}(\mathcal{P}(A^*))$ , as the preimage under h of every language over A necessarily contains some word with the letter $\infty$ .

An interesting open question is then to provide a nice description of the dual space of $\mathcal{V}(Y)$ for the full variety $\mathcal{V}$ . This would provide the “best” ambient space for studying languages over a profinite alphabet. Lacking such a description, we will define a language over a profinite alphabet Y to be a clopen subset of $Y^*$ . Note that we have the following:

Proposition 3.3. Let Y be a profinite set and $L\subseteq Y^*$ . Then the following conditions are equivalent:

  1. (a) There exists $q\colon Y\twoheadrightarrow A$ finite continuous quotient with $L=(q^*)^{-1}(q^*[L])$ ;

  2. (b) There exist $V_0,\dots, V_{n-1}$ a clopen partition of Y such that, if $w\in L$ , then there is $f:|w|\to n$ with

    \[ w\in V_{f(0)}\dots V_{f(|w| -1)}\subseteq L;\]
  3. (c) There exist $V_0,\dots, V_{n -1}$ clopen subsets of Y such that, if $w\in L$ , then there is $f:|w|\to n$ with

    \[w\in V_{f(0)}\dots V_{f(|w| - 1)}\subseteq L.\]

Proof. $(a){\Rightarrow}(b)$ : Fix an enumeration of $A=\{a_0,\dots,a_{n-1}\}$ and take $V_i=q^{-1}(a_i)$ . Then $V_0,\dots, V_{n-1}$ is a clopen partition of Y. Also, for any $w\in L$

\[ w\in (q^*)^{-1}(q^*(w))=q^{-1}(q(w_0))\dots q^{-1}(q(w_{|w| -1}))\subseteq (q^*)^{-1}(q^*[L])= L,\]

where the last equality holds by hypothesis. Now (b) follows as $q^{-1}(q(w_0))\dots q^{-1}(q(w_{|w| - 1}))=V_{f(0)}\dots V_{f(|w| -1)}$ , where $f(j)=i$ if and only if $w_j=a_i$ . Clearly $(b){\Rightarrow}(c)$ .

For $(c){\Rightarrow}(a)$ : Let $W_0,\dots, W_{m-1}$ be the atoms of the finite Boolean algebra of clopens generated by the clopen subsets $V_0,\dots, V_{n-1}$ stipulated in (c) and let $q:Y\twoheadrightarrow m$ be the corresponding quotient map. Then q is continuous and, by (c), for each $w\in L$ we have $f_w:|w|\to n$ with $w\in V_{f_w(0)}\dots V_{f_w(|w| - 1)}\subseteq L$ . Since the $W_i$ are atoms, we have $w_j\in q^{-1}(q(w_{j}))=W_{q(w_{j})}\subseteq V_{f_w(j)}$ , and it follows that

\[ L\subseteq (q^*)^{-1}(q^*[L])=\bigcup_{w\in L}(q^*)^{-1}(q^*(w))\qquad\qquad\qquad\qquad\qquad\qquad\qquad\qquad\qquad\]
\[\qquad\qquad\qquad\qquad\qquad =\bigcup_{w\in L}W_{q(w_{0})}\dots W_{q(w_{|w| - 1})}\subseteq\bigcup_{w\in L} V_{f_w(0)}\dots V_{f_w(|w| - 1)}\subseteq L.\qquad\qquad \]

Notice that each of the spaces $Y^n$ is a Boolean space, but $Y^*$ is not. Nevertheless, since it is 0-dimensional, it may be naturally embedded in a Boolean space, namely in the Banaschewski compactification of $Y^*$ , cf. Section 2.3. As a consequence, we have the following:

Lemma 3.4. Let $\mathcal{B} \subseteq {\sf Clopen}(Y^*)$ be a Boolean algebra of languages over the profinite alphabet Y. Then, the dual of the embedding $\mathcal{B} \rightarrowtail {\sf Clopen}(Y^*)$ restricts to $Y^*$ , yielding a continuous function $Y^* \to X_\mathcal{B}$ with dense image, mapping a word $w \in Y^*$ to the ultrafilter $\{K \in \mathcal{B} \mid w \in K\}$ .

Proof. The dual of the embedding $\mathcal{B} \rightarrowtail {\sf Clopen}(Y^*)$ is a quotient $\beta_0(Y^*) \twoheadrightarrow X_\mathcal{B}$ . Since $Y^*$ densely embeds in $\beta_0(Y^*)$ via the assignment $w \mapsto \{K \in {\sf Clopen}(Y^*) \mid w \in K\}$ , the claim follows.

Now, we show that the Banaschewski and Čech–Stone compactifications of $Y^*$ coincide.

Lemma 3.5. Every disjoint union of topological spaces of Čech-dimension zero has Čech-dimension zero.

Proof. Let $\{X_i\}_{i \in I}$ be a family of topological spaces of Čech-dimension zero, and $X = \bigcup_{i \in I}X_i$ be its disjoint union. Let also $X = U_1 \cup \dots \cup U_n$ be a finite open cover of X. Then, for every $i \in I$ , $X_i = \bigcup_{k = 1}^n(U_k \cap X_i)$ is a finite open cover of $X_i$ . Since $X_i$ has Čech-dimension zero, this open cover admits a finite clopen refinement, say $X_i = \bigcup_{j \in F_i} V_{j, i}$ , where $F_i$ is a finite set and each $V_{j,i}$ is a clopen subset of $X_i$ contained in some $U_k \cap X_i$ . For each $k = 1, \dots, n$ , we set

\[V_k = \bigcup \{V_{j, i} \mid i \in I, \ j \in F_i, \ V_{j, i}\subseteq U_k \cap X_i\}.\]

Then, $V_i$ is a clopen subset of X contained in $U_k$ and $X = \bigcup_{k = 1}^n V_k$ . Thus, $\{V_k\}_{k = 1}^n$ is the desired finite clopen refinement of the given finite open cover of X.

By a straightforward application of Theorem 2.1, we may conclude that the Banaschewski and Čech–Stone compactifications of $Y^*$ indeed coincide.

Theorem 3.6. Let Y be a Boolean space and $Y^*$ be the topological space which is the union over $n \ge 0$ of the product spaces $Y^n$ . Then, the dual space of ${\sf Clopen}(Y^*)$ is $\beta(Y^*)$ , the Čech–Stone compactification of $Y^*$ . In particular, there is a continuous embedding $Y^* \rightarrowtail \beta(Y^*)$ with dense image.

The following immediate consequence is stated for later reference.

Corollary 3.7. Let X, Y be Boolean spaces. Then, $\beta(Y^*) \times X$ is the Banaschewski compactification of $Y^* \times X$ .

A consequence of Lemma 3.1 is that the extension to profinite alphabets of an lp-strain of languages is closed under preimages of continuous lp-morphisms between profinite alphabets:

Lemma 3.8. Let $\alpha: Z \to Y$ be a continuous function and $\mathcal{V}$ an lp-strain of languages. Then, the Boolean algebra homomorphism $(\alpha^*)^{-1}: {\sf Clopen}(Y^*) \to{\sf Clopen}(Z^*)$ restricts and co-restricts to a homomorphism $\mathcal{V}(Y) \to \mathcal{V}(Z)$ so that we have the following commutative diagram:

Proof. Let $L \in \mathcal{V}(Y)$ . By Lemma 3.1, there is a finite continuous quotient $h: Y \twoheadrightarrow A$ and a language $K \in \mathcal{V}(A)$ such that $L = (h^*)^{-1}(K)$ . Consider the factorization of $h \circ \alpha$ through its image: $Z \overset{\;q\;}{\twoheadrightarrow} B \overset{\;e\;}{\rightarrowtail} A$ . Then, $B ={\sf Im }(h \circ \alpha)$ is a finite continuous quotient of Z which embeds in A. Since $\mathcal{V}$ is an lp-strain of languages, and thus closed under preimages of lp-morphisms, the language $(e^*)^{-1}(K)$ belongs to $\mathcal{V}(B)$ . Finally, using again Lemma 3.1, we may conclude that $(q^*)^{-1}((e^*)^{-1}(K))$ belongs to $\mathcal{V}(Z)$ . But, since $e \circ q = h \circ \alpha$ and $L = (h^*)^{-1}(K)$ , it follows that $(q^*)^{-1}((e^*)^{-1}(K)) = (\alpha^*)^{-1}((h^*)^{-1}(K)) = (\alpha^*)^{-1}(L)$ . Thus, $(\alpha^*)^{-1}(L)$ belongs to $\mathcal{V}(Z)$ , as intended.

The dual statement of Lemma 3.8, yields the following:

Lemma 3.9. Let $\alpha: Z \to Y$ be a continuous function and $\mathcal{V}$ an lp-strain of languages. Then, there exists a continuous map $\widehat \alpha: X_{\mathcal{V}(Z)} \to X_{\mathcal{V}(Y)}$ making the following diagram commute:

where $f:Y^* \to X_{\mathcal{V}(Y)}$ and $g:Z^* \to X_{\mathcal{V}(Z)}$ are the continuous functions with dense image obtained by dualizing and restricting the embeddings $\mathcal{V}(Y) \rightarrowtail {\sf Clopen}(Y^*)$ and $\mathcal{V}(Z) \rightarrowtail {\sf Clopen}(Z^*)$ , respectively (cf. Lemma 3.4).

Proof. By Lemma 3.8, the diagram below restricts correctly:

Let $\widehat \alpha$ be the dual of the homomorphism $(\alpha^*)^{-1}: \mathcal{V}(Y) \to \mathcal{V}(Z)$ . We claim that $\widehat \alpha$ makes the desired diagram commute. Indeed, given $K \in \mathcal{V}(Y)$ and $w \in Z^*$ , by definition of dual map and by Lemma 3.4, we have the following:

\begin{align*} K \in \widehat \alpha \circ f(w) & \iff (\alpha^*)^{-1}(K) \in f(w) \iff w \in (\alpha^*)^{-1}(K) \\ & \iff \alpha^* (w) \in K \iff K \in g \circ \alpha^*(w). \end{align*}

Thus, $ \widehat \alpha \circ f = g \circ \alpha^*$ as required.

4. Recognition of Boolean Algebras Closed Under Quotients

Quotients by words are natural operations on languages. In the regular setting, they dually correspond to the multiplication in profinite monoids (Gehrke et al. Reference Gehrke, Grigorieff, Pin and Aceto2008), and one is often interested in studying Boolean algebras that are closed under quotients. Beyond regularity, the closure under quotients will bring additional algebraic structure, and thus computational power, to the dual space of the Boolean algebra considered. We may thus think of the dual space of a Boolean algebra closed under quotients as having “almost” a monoid structure. This intuitive idea is captured by the BIMs, which were introduced in Gehrke et al. (Reference Gehrke, Petrişan and Reggio2016) as an alternative to the semiuniform monoids of Gehrke et al. (Reference Gehrke, Grigorieff and Pin2010). In Section 4.1, we will introduce BiMs and define its variant of a BiM-stamp.

Then, in Section 4.2, we will discuss recognition of languages that are defined by some formula with a free variable, that is, languages of marked words. We already saw in Section 2.9 that the set $A^* \otimes {\mathbb{N}}$ of all marked words is not equipped with a monoid structure. For this reason, BiM’s will not appear as a natural notion of recognizer. Nevertheless, the fact that $A^* \otimes {\mathbb{N}}$ embeds in the free monoid $(A \times 2)^*$ allows us to identify some monoid actions that will be crucial when defining semidirect products in Section 6.

4.1 Languages over profinite alphabets

Let Y be a profinite alphabet and $\mathcal{B} \subseteq {\sf Clopen}(Y^*)$ be a Boolean algebra of languages. If $\mathcal{B}$ is closed under quotients, then for every word w over Y, the homomorphisms

\[\ell_{w}:{\sf Clopen}(Y^*) \to {\sf Clopen}(Y^*), \qquad L \mapsto w^{-1}L\]

and

\[r_w:{\sf Clopen}(Y^*) \to {\sf Clopen}(Y^*), \qquad L \mapsto Lw^{-1}\]

restrict and co-restrict to endomorphisms of $\mathcal{B}$ . Thus, we have the two following commutative diagrams:

Dually, we have the following commutative diagrams of continuous functions:

(5)

We make a few remarks about these diagrams. First, notice that concatenation of words over Y turns $Y^*$ into a topological monoid. Indeed, the multiplication $m: Y^* \times Y^* \to Y^*$ is continuous because, if $C_1, \dots, C_n \subseteq Y$ are (cl)open subsets of Y, then

\[m^{-1}(C_1 \times \dots \times C_n) = \bigcup_{i = 0}^n (C_1 \times \dots C_i) \times (C_{i+1} \times \dots \times C_n) \]

is a (cl)open subset of $Y^* \times Y^*$ . Moreover, since for every $u, v \in Y^*$ we have $\ell_u \circ r_v = r_v \circ \ell_u$ , we also have $\widetilde\ell_u \circ \widetilde r_v = \widetilde r_v \circ\widetilde\ell_u$ . Therefore, the monoid structure on $Y^*$ induces a biaction with continuous components of $Y^*$ on $X_\mathcal{B}$ which is given by:

(6) \begin{equation} Y^* \times X_\mathcal{B} \to X_\mathcal{B}, \quad (u, x) \mapsto \widetilde\ell_u(x)\qquad\text{and}\qquad X_\mathcal{B} \times Y^* \to X_\mathcal{B}, \quad (x, v) \mapsto \widetilde r_v(x).\end{equation}

However, $X_\mathcal{B}$ itself does not necessarily inherit a monoid structure. Indeed, as it was shown in Gehrke et al. (Reference Gehrke, Grigorieff and Pin2010), Gehrke (Reference Gehrke2016), when A is a finite alphabet, this is case if and only if $\mathcal{B}$ consists of regular languages. Nevertheless, (6) induces a monoid structure on the dense subspace $M= \pi[Y^*]$ of $X_\mathcal{B}$ , which is given by:

(7) \begin{equation} \pi(u) \cdot \pi(v) = \pi(uv),\end{equation}

for every $u,v \in Y^*$ . Indeed, with a routine computation we may derive the following equalities:

(8) \begin{equation} \widetilde\ell_u(\pi(v)) = \pi(uv) = \widetilde r_v(\pi(u)).\end{equation}

These not only show that (7) is well defined in the sense that $\pi(uv) = \pi(u'v')$ whenever $\pi(u) = \pi(u')$ and $\pi(v) =\pi(v')$ , but also show that the monoid structure on M is indeed inherited from (6). In fact, it is not hard to see that $\pi: Y^* \twoheadrightarrow M$ is precisely the syntactic homomorphism of $\mathcal{B}$ as defined in Section 2.6. Moreover, since M is dense in $X_\mathcal{B}$ , (8) also shows that whenever $\pi(u) = \pi(u')$ (respectively, $\pi(v) = \pi(v')$ ), the continuous functions $\widetilde\ell_u$ and $\widetilde\ell_{u'}$ (respectively, $\widetilde r_v$ and $\widetilde r_{v'}$ ) coincide on all of $X_\mathcal{B}$ . Therefore, the natural biaction of M on itself extends to a biaction of M on $X_\mathcal{B}$ given by:

\[M \times X_\mathcal{B} \to X_\mathcal{B}, \quad (m, x) \mapsto \lambda_m(x)\qquad\text{and}\qquad X_\mathcal{B} \times M \to X_\mathcal{B}, \quad (x, m) \mapsto \rho_m(x),\]

with continuous components at each $m \in M$ . Here, for each $m \in M$ , $\lambda_m$ (respectively, $\rho_m$ ) denotes the continuous function $\widetilde \ell_u$ (respectively, $\widetilde r_u$ ) where $u\in Y^*$ is any word satisfying $\pi(u)= m$ .

Example 4.1. Let A be a finite alphabet and ${\rm Reg}(A)$ denote the set of regular languages over A (which, as already observed, is a Boolean algebra closed under quotients). Let also $\widehat {A^*}$ denote the free profinite monoid over A (see e.g., Almeida 2005 for a description of $\widehat{A^*}$ both as a projective limit and as a completion of a suitable metric space $(A^*, d)$ ). Then, ${\rm Reg}(A)$ is a Boolean algebra isomorphic to the Boolean algebra of clopen subsets of $\widehat{A^*}$ , via the assignment that sends a regular language to its topological closure (Almeida 1994, Section 3.6). In particular, the Stone dual of ${\rm Reg} (A)$ is the underlying topological space of $\widehat{A^*}$ and the map $\pi: A^* \to \widehat{A^*}$ from diagrams (5) is an embedding (with dense image). The biaction of $A^*$ on $\widehat{A^*}$ is obtained by restriction of the multiplication on $\widehat{A^*}$ (Gehrke et al. Reference Gehrke, Grigorieff, Pin and Aceto2008, Section 6) (see also Gehrke Reference Gehrke2016, Section 4.2).

The following definition, which captures duality for Boolean subalgebras of languages closed under the quotient operations, originates in Gehrke et al. (Reference Gehrke, Petrişan and Reggio2016), where it was used for recognition over finite alphabets. The above considerations verify that it remains the appropriate notion for recognition over profinite alphabets.

Definition 4.2. A Boolean space with an internal monoid (BiM) is a triple (M, p, X) where X is a Boolean space equipped with a biaction of a monoid M whose right and left components at each $m\in M$ are continuous and an injective function $p\colon M\rightarrowtail X$ which has dense image and is a morphism of sets with M-biactions. That is, for each $m\in M$ , there are continuous functions $\lambda_m, \rho_m: X \to X$ that make the following diagrams commute:

A morphism $(M, p, X) \to (N, q, W)$ of ${\sf{BiM}}$ s is a pair $(g, \varphi)$ , where $g: M \to N$ is a monoid homomorphism, $\varphi: X \to W$ is a continuous function, and the following diagram commutes

We say that $(g, \varphi)$ is a quotient provided g is surjective (and consequently, so is $\varphi$ ).

Example 4.3. Let Y be a profinite alphabet. Since ${\sf Clopen}(Y^*)$ is a Boolean algebra closed under quotients, the Čech–Stone compactification of $Y^*$ , defines a BiM $(Y^*, \, e,\, \beta(Y^*))$ (cf. Theorem 3.6).

Given a profinite alphabet Y, we say that a language $L \subseteq{\sf Clopen}(Y^*)$ is recognized by the BiM (M, p, X) if there exists a homomorphism $\mu: Y^* \to M$ such that $p \circ \mu$ is continuous, and a clopen subset $C \subseteq X$ satisfying $L = (p\circ \mu)^{-1}(C)$ . Notice that, since X is a Boolean space, $p\circ \mu$ may be uniquely extended to a continuous function $\varphi:\beta(Y^*) \to X$ , and so, if $e: Y^* \rightarrowtail \beta(Y^*)$ denotes the Čech–Stone compactification of $Y^*$ , then the homomorphisms $\mu: Y^* \to M$ for which $p \circ \mu$ is continuous are in a bijective correspondence with the BiM morphisms $(Y^*,\,e,\, \beta(Y^*)) \to (M, p, X)$ . Moreover, the dual of the function $\varphi$ is the homomorphism of Boolean algebras $(p \circ\mu)^{-1}: {\sf Clopen}(X) \to {\sf Clopen}(\beta(Y^*)) = {\sf Clopen}(Y^*)$ whose image consists of the languages recognized by (M, p, X) via $\mu$ . Finally, since p is a morphism of sets with M-biactions and the biaction of M on X has continuous components, the Boolean algebra of languages recognized by (M, p, X) via $\mu$ is closed under quotients. Indeed, if $L = (p \circ \mu)^{-1}(C)$ for some clopen subset $C \subseteq X$ , and $u, v, w \in Y^*$ , then we have

\begin{align*} w \in u^{-1}Lv^{-1} & \iff p\circ \mu(uwv) \in C \iff p (\mu(u)\, \mu(w)\,\mu(v)) \in C \\ & \iff \lambda_{\mu(u)}\circ \rho_{\mu(v)} \circ p \circ \mu(w) \in C \\ & \iff w \in (p \circ \mu)^{-1}((\lambda_{\mu(u)}\circ \rho_{\mu(v)})^{-1}(C)).\end{align*}

Thus, $u^{-1}Lv^{-1} = (p \circ \mu)^{-1}((\lambda_{\mu(u)}\circ\rho_{\mu(v)})^{-1}(C))$ is recognized by the clopen $(\lambda_{\mu(u)}\circ \rho_{\mu(v)})^{-1}(C) \subseteq X$ .

In order to handle lp-varieties (i.e., lp-strains closed under quotients), it is useful to refine the notion of BiM. In the regular setting, this corresponds to the notion of stamp (Pin and Straubing Reference Pin and Straubing2005). The idea is to consider BiMs with a chosen profinite set of generators for the monoid component and constrain the set of languages we recognize accordingly. Formally, a BiM-stamp (called BiM presentation in Borlido et al. Reference Borlido, Czarnetzki, Gehrke and Krebs2017) is a tuple ${\sf R} = (Y,\mu,M,p, X)$ , where $\mu: Y^* \twoheadrightarrow M$ is a monoid quotient, (M, p, X) is a BiM, and $p \circ \mu$ is a continuous function. By the observations made in the preceding paragraph, BiM-stamps $(Y,\mu,M, p, X)$ may also be understood as BiM quotients $(Y^*, e, \beta (Y^*)) \twoheadrightarrow (M, p, X)$ . We remark that each Boolean algebra closed under quotients $\mathcal{B}\subseteq {\sf Clopen}(Y^*)$ naturally defines a BiM-stamp:

\[Y^* \twoheadrightarrow M = \pi[Y^*] \rightarrowtail X_\mathcal{B},\]

as described in the beginning of this section. This is called the syntactic BiM-stamp of $\mathcal{B}$ , and it is denoted by ${\sf Synt}(\mathcal{B}) =(Y, \mu_\mathcal{B}, M_\mathcal{B}, p_\mathcal{B}, X_\mathcal{B})$ .

BiM-stamps encode two types of behavior: algebraic behavior given by the triple $(Y, \mu, M)$ and topological behavior given by the continuous function $p \circ \mu: Y^* \to X$ . The interplay between these two plays a central role in this paper. We say that a language L is recognized by the BiM-stamp ${\sf R} = (Y, \mu, M, p, X)$ provided L is recognized by the BiM (M, p, X) via $\mu$ . As observed above, the set of all languages recognized by a BiM-stamp forms a Boolean algebra closed under quotients. Similarly to what happens in the regular setting, the syntactic BiM-stamp of a given Boolean algebra closed under quotients is the “optimal” BiM-stamp recognizing that Boolean algebra, in the following sense:

Proposition 4.4. A BiM-stamp ${\sf R} = (Y, \mu, M, p, X)$ recognizes a Boolean algebra closed under quotients $\mathcal{B}$ if and only if the syntactic BiM-stamp of $\mathcal{B}$ factors through ${\sf R}$ , that is, there is a commutative diagram:

where g is a homomorphism and $\varphi$ a continuous function.

Proof. It is clear that if ${\sf Synt}(\mathcal{B})$ factors through ${\sf R}$ , then every language recognized by ${\sf Synt}(\mathcal{B})$ is also recognized by ${\sf R}$ and, in particular, $\mathcal{B}$ is recognized by ${\sf R}$ . Conversely, suppose that $\mathcal{B}$ is recognized by ${\sf R}$ . Then, the kernel of $\mu$ is contained in the syntactic congruence ${\sim_\mathcal{B}}$ , and therefore, the syntactic morphism $\mu_\mathcal{B}$ factors through $\mu$ , say $g \circ \mu = \mu_\mathcal{B}$ . On the other hand, since $p \circ \mu$ has dense image, there are embeddings $\mathcal{B} \rightarrowtail {\sf Clopen}(X)$ and ${\sf Clopen}(X) \rightarrowtail {\sf Clopen}(Y^*)$ , or dually, continuous quotients $\pi: \beta(Y^*) \twoheadrightarrow X$ and $\varphi: X \twoheadrightarrow X_\mathcal{B}$ , where the restriction of $\pi$ to $Y^*$ is $p \circ \mu$ , and $p_\mathcal{B} = \varphi \circ p$ . Finally, since $\mu$ is surjective and $\mu_\mathcal{B} = g \circ \mu$ , it follows that $\varphi \circ p = p_\mathcal{B} \circ g$ as intended.

A morphism between BiM-stamps ${\sf R} = (Y, \mu, M, p, X)$ and ${\sf S} =(Z, \nu, N, q, W)$ is a triple $\Phi = (h, g, \varphi)$ , where $h: Y^*\to Z^*$ is a continuous homomorphism and $(g, \varphi)$ is a morphism between the corresponding BiM components of ${\sf R}$ and ${\sf S}$ , so that the following diagram commutes:

When h is an lp-morphism, we say that $\Phi$ is an lp-morphism of BiM-stamps. Note that when h is onto, then so are g and $\varphi$ . If moreover h is length-preserving, then we will call $\Phi$ an lp-quotient. It is worth mentioning that every lp-quotient $\Phi = (h, g, \varphi):{\sf R} \twoheadrightarrow {\sf S}$ induces the following commutative diagram of continuous functions:

Thus, taking preimages yields the following commutative diagram of homomorphisms of Boolean algebras:

Therefore, ${\sf S}$ being an lp-quotient of ${\sf R}$ means that every language recognized by ${\sf S}$ is also recognized by ${\sf R}$ , under the identification ${\sf Clopen}(Z^*) \subseteq {\sf Clopen}(Y^*)$ .

Finally, we show that BiM-stamps over profinite alphabets can be described in terms of certain projective limits of BiM-stamps over finite alphabets. Although a bit technical, the proof mostly uses standard arguments used in the computation of projective limits in set-based categories.

Proposition 4.5. Projective limits exist in the category of BiM-stamps with lp-morphisms. Moreover, each BiM-stamp is the projective limit of its BiM-stamp lp-quotients over finite alphabets.

Proof. We give the general idea of the proof. All the missing details are routine computations. Let $\mathcal{F} = \{{\sf R}_i = (Y_i, \mu_i, M_i, p_i, X_i)\}_{i \in I}$ be a projective system in the category of BiM-stamps with lp-morphisms. For $i \ge j$ , we denote by $\Phi_{i,j} = (h_{i,j}^*, g_{i,j}, \varphi_{i,j}): {\sf R}_i \to {\sf R}_j$ the corresponding connecting morphism, with $h_{i,j}: Y_i \to Y_j$ a continuous function. Then, each of the families $\{Y_i\}_{i \in I}$ , $\{M_i\}_{i \in I}$ , and $\{X_i\}_{i \in I}$ forms itself a projective system, with the maps $h_{i,j}$ , $g_{i,j}$ and $\varphi_{i,j}$ as connecting morphisms, respectively. We set

(9) \begin{equation} Y = \lim\limits_{\longleftarrow}\ \{Y_i \mid i \in I\}, \qquad M_0 = \lim\limits_{\longleftarrow}\ \{M_i \mid i \in I\}, \qquad \text{and}\qquad X_0 = \lim\limits_{\longleftarrow}\ \{X_i \mid i \in I\}, \end{equation}

and for each $i \in I$ , we denote by $h_i: Y \twoheadrightarrow Y_i$ , $\zeta_i: M_0 \twoheadrightarrow M_i$ , and $\pi_i: X_0 \twoheadrightarrow X_i$ the corresponding projections.

Using the explicit description of projective limits displayed in (2), we may check that there are well-defined maps:

(10) \begin{equation} \mu_0: Y^* \to M_0, \ z \mapsto (\mu_i \circ h_i^*(z))_{i \in I}\qquad\text{and} \qquad p_0: M_0 \rightarrowtail X_0, \ m \mapsto (p_i\circ \zeta_i(m))_{i \in I}. \end{equation}

Graphically, we have the following commutative diagram:

Moreover, by continuity of $p_i\circ \mu_i \circ h_i^*$ , for every $i \in I$ , the map $p_0 \circ \mu_0$ is also continuous. We set

\[M = \mu_0 [Y^*] \qquad \text{and} \qquad X = \overline{p_0[M]}.\]

Co-restricting $\mu_0$ to M, we have an onto homomorphism $\mu: Y^* \twoheadrightarrow M$ , and the restriction and co-restriction of $p_0$ to M and to X, respectively, yields a map $p:M \rightarrowtail X$ with dense image. We claim that ${\sf R} = (Y, \mu, M, p, X)$ is the projective limit of $\mathcal{F}$ . The fact that ${\sf R}$ satisfies the universal property of projective limits is inherited from the universal properties satisfied by Y, $M_0$ , and $X_0$ .

Thus, it remains to check that M continuously bi-acts on X.

We denote by $\lambda_{i,m_i}: X_i \to X_i$ the (continuous) component at $m_i \in M_i$ of the left action of $M_i$ on $X_i$ . Then, for $m = (m_i)_{i \in I} \in M$ , setting

\[\lambda_m: X \to X, \qquad x = (x_i)_{i \in I} \mapsto (\lambda_{i,m_i}(x_i))_{i \in I}\]

defines a continuous map which induces a left action of M on X. Indeed, for every $m, m' \in M$ , we may compute $\lambda_m \circ p(m') = p (mm')$ , using the analogous property for each $\lambda_{i,m_i}$ . This proves not only that $\lambda_m$ is well defined (because $\lambda_m[X] = \lambda_m\left[\overline{p[M]}\right] \subseteq \overline{p[M]} = X$ ) but also that p is a morphism of sets with a left M-action. Similarly, we can define the right action of M on X. The compatibility between the left and right actions is inherited from the compatibility between the left and right actions for each ${\sf R}_i$ . Thus, ${\sf R}$ is a BiM-stamp.

Finally, that each BiM-stamp is the projective limit of all its lp-quotients over finite alphabets is an easy consequence of the explicit computation of projective limits just made.

We remark that, even though each of the maps $\mu_i \circ h_i^*$ is onto, since $Y^*$ is not compact, the map $\mu_0$ defined in (10) is not onto in general (cf. Almeida and Weil Reference Almeida and Weil1995, Lemma 1.2). To see this, consider for instance, for each integer n, the BiM-stamp $(\{0\}^* \twoheadrightarrow {\mathbb{Z}}_n\xrightarrow {id} {\mathbb{Z}}_n)$ . Then, we have a projective limit system $\mathcal{F} = \{(\{0\}^* \twoheadrightarrow {\mathbb{Z}}_n \xrightarrow{id} {\mathbb{Z}}_n)\}_{n \in {\mathbb{N}}}$ and $M_0$ (which equals $X_0$ ) defined in (9) is the additive profinite group $\widehat {\mathbb{Z}}$ (seen as a monoid). The map $\mu_0$ is then the inclusion $\{0\}^*\rightarrowtail \widehat {\mathbb{Z}}$ , which is not surjective. Finally, the projective limit of $\mathcal{F}$ is the BiM-stamp $(\{0\}^*\twoheadrightarrow {\mathbb{N}} \rightarrowtail \widehat {\mathbb{N}})$ , where $\widehat{\mathbb{N}}$ denotes the closure of ${\mathbb{N}}$ in $\widehat {\mathbb{Z}}$ .

4.2 Languages of marked words and quotienting operations

Recall that the set $A^* \otimes{\mathbb{N}}$ of marked words may be seen as a regular language in $(A \times 2)^*$ . The fact that the syntactic morphism of this language is $\mu: (A\times 2)^*\to\{e,m,z\}$ given by $(a,0)\mapsto e$ and $(a,1)\mapsto m$ corresponds to saying that

\[ (A\times 2)^*\cong A^* \,\uplus\, (A^* \otimes{\mathbb{N}})\,\uplus\, A_z,\]

as defined in Section 2.9, and that concatenation on $(A \times 2)^*$ decomposes as follows: it yields biactions of $A^*$ on each of these components, thus accounting for all pairs involving an element of $A^*$ , and any other pair is sent to $A_z$ .

Dually, the above disjoint union decomposition of $(A\times 2)^*$ yields the following Cartesian product decomposition:

\[\mathcal{P}((A\times 2)^*)\cong \mathcal{P}(A^*) \,\times\, \mathcal{P}(A^* \otimes{\mathbb{N}})\,\times\, \mathcal{P}(A_z),\]

We are interested in Boolean subalgebras $\mathcal{D}$ of $\mathcal{P}((A \times2)^*)$ . Dual to the inclusions of the disjoint components, we get projections of $\mathcal{D}$ onto

\[ \mathcal{D}_0=\{L\cap A^*\mid L\in \mathcal{D}\}, \quad \mathcal{D}_1=\{L\cap (A^* \otimes{\mathbb{N}})\mid L\in \mathcal{D}\},\quad \text{and}\quad\mathcal{D}_z=\{L\cap A_z\mid L\in \mathcal{D}\},\]

respectively. While $\mathcal{D}$ is always contained in $\mathcal{D}_0\times\mathcal{D}_1\times\mathcal{D}_z$ it is not difficult to see that we have equality if and only if $\mathcal{D}$ contains the language $A^*\otimes {\mathbb{N}}$ and its orbit under the biaction of $(A\times 2)^*$ .

We will be particularly interested in such algebras for which $\mathcal{D}_z=2$ . In this case, any component of the dual biaction by quotients on $\mathcal{P}((A\times 2)^*)$ with domain $\mathcal{P}(A_z)$ just sends the bounds to bounds of the appropriate component. The remaining components are as follows. First, we have the biactions of $A^*$ on $\mathcal{P}(A^*)$ and on $\mathcal{P}(A^* \otimes {\mathbb{N}})$ given, for $u\in A^*$ , respectively, by:

(11) \begin{gather} \begin{aligned} \ell^0_u: \mathcal{P}(A^*) &\to \mathcal{P}(A^*) \qquad\qquad & r^0_u: \mathcal{P}(A^*) &\to \mathcal{P}(A^*), \\ L \ \ &\mapsto u^{-1}L & L\ \ &\mapsto Lu^{-1} \end{aligned}\end{gather}

and by

(12) \begin{gather} \begin{aligned} \ell^1_u: \mathcal{P}(A^* \otimes {\mathbb{N}}) &\to \mathcal{P}(A^* \otimes {\mathbb{N}}) \qquad\qquad & r^1_u: \mathcal{P}(A^* \otimes {\mathbb{N}}) &\to \mathcal{P}(A^* \otimes {\mathbb{N}}). \\ L\ \ &\mapsto u^{-1}L & L\ \ &\mapsto Lu^{-1} \end{aligned}\end{gather}

For every marked word $(w,i)$ , the biaction of $A^*$ on $A^* \otimes {\mathbb{N}}$ also induces two functions $A^* \to A^* \otimes {\mathbb{N}}$ , which are given by left and by right multiplication by (w,i). Dually, these define

(13) \begin{gather} \begin{aligned} \ell^1_{(w,i)}:\mathcal{P}(A^* \otimes {\mathbb{N}}) & \to \mathcal{P}(A^*)\qquad\qquad& r^1_{(w,i)}: \mathcal{P}(A^* \otimes {\mathbb{N}}) & \to \mathcal{P}(A^*).\\ L\ \ &\mapsto (w,i)^{-1}L & L\ \ &\mapsto L(w,i)^{-1} \end{aligned}\end{gather}

Lemma 4.6. Let $\mathcal{D} \subseteq \mathcal{P}((A \times 2)^*)$ be a Boolean subalgebra such that $\mathcal{D}_z = 2$ . Then, the following are equivalent:

  1. (a) $\mathcal{D}$ is closed under quotients and $(A^* \otimes {\mathbb{N}}) \in \mathcal{D}$ or $A_z\in\mathcal{D}$ ;

  2. (b) $\mathcal{D}\cong\mathcal{D}_0\times\mathcal{D}_1\times 2$ , $\mathcal{D}_0$ and $\mathcal{D}_1$ are closed under the respective biactions of $A^*$ , and for all $L\in \mathcal{D}_1$ and all $(w,i) \in A^* \otimes{\mathbb{N}}$ we have $(w,i)^{-1}L,\ L(w,i)^{-1}\in\mathcal{D}_0$ .

Proof. Suppose we have (a). If $A_z \in \mathcal{D}$ then, as $\mathcal{D}$ is closed under quotients,

\[ (a,1)^{-1}A_z=(A^* \otimes {\mathbb{N}})\cup A_z\in\mathcal{D}, \]

and thus $A^*\otimes {\mathbb{N}}\in\mathcal{D}$ . Since $\mathcal{D}$ contains the language $A^*\otimes {\mathbb{N}}$ and is closed under quotients, it also contains its orbit under the biaction of $(A\times 2)^*$ . Thus, we have that $\mathcal{D}$ is isomorphic to $\mathcal{D}_0\times\mathcal{D}_1\times \mathcal{D}_z$ which, by hypothesis, is $\mathcal{D}_0\times\mathcal{D}_1\times 2$ . The rest of the assertion in (b) follows from $\mathcal{D}$ being closed under quotients and the splitting of the biaction of $(A\times 2)^*$ on $\mathcal{P}((A\times 2)^*)$ as given by (11), (12), and (13).

Conversely, let us assume that (b) holds. Notice that having $\mathcal{D}\cong\mathcal{D}_0\times\mathcal{D}_1\times \mathcal{D}_z$ amounts to having that, for every $L_1, L_2, L_3 \in \mathcal{D}$ , there exists some $L \in \mathcal{D}$ such that

\[L_1 \cap A^* = L \cap A^*, \qquad L_2 \cap (A^* \otimes {\mathbb{N}}) = L \cap (A^*\otimes {\mathbb{N}}), \qquad \text{and} \qquad L_3 \cap A_z = L \cap A_z.\]

So, in particular, by taking $L_1 = L_3 = \emptyset$ and $L_2 = (A \times 2)^*$ , we may conclude that $A^* \otimes {\mathbb{N}} \in \mathcal{D}$ . To show that $\mathcal{D}$ is closed under quotients, it suffices to consider quotients by letters. Also, since quotient operations are homomorphisms, it suffices to consider languages belonging to each component. By hypothesis, $\mathcal{D}_0$ and $\mathcal{D}_1$ are closed under quotients by letters $a\in A$ and clearly so is 2 within $\mathcal{P}(A_z)$ . Also by hypothesis, $(a,1)^{-1}L,\ L(a,1)^{-1}\in \mathcal{D}_0$ for $L\in\mathcal{D}_1$ . Finally, for $L\in\mathcal{D}_0$ , we have

\[ (a,1)^{-1} L=L(a,1)^{-1}=\emptyset\in\mathcal{D} \qquad \text{and} \qquad (a,1)^{-1} A_z=A_z(a,1)^{-1}=(A^*\otimes{\mathbb{N}})\cup A_z\in\mathcal{D}. \]

Notice that in the case where $\mathcal{D}$ satisfies the equivalent conditions of Lemma 4.6, the Boolean subalgebra with atoms $A^*$ , $A^*\otimes{\mathbb{N}}$ , and $A_z$ is a subalgebra of $\mathcal{D}$ . Also note that the Boolean subalgebra of $\mathcal{P}((A \times 2)^*)$ generated by $A^*$ is closed under quotients, and it is such that $\mathcal{D}_z = 2$ , but it does not satisfy the equivalent conditions of the the lemma.

Corollary 4.7. Let $\mathcal{C}\subseteq \mathcal{P}(A^* \cup (A^* \otimes {\mathbb{N}}))$ be a Boolean subalgebra. Then, the Boolean subalgebra $\mathcal{D}$ of $\mathcal{P}((A \times 2)^*)$ closed under quotients generated by $\mathcal{C}$ is generated, as a lattice, by $\mathcal{S}_0 \cup \mathcal{S}_1 \cup \{A_z\}$ , where

\begin{multline*} \mathcal{S}_0 = \{u^{-1}Lv^{-1} \mid u, v \in A^*, \ L \in \mathcal{C}_0\} \\ \cup\ \{(w,i)^{-1} L u^{-1},\ u^{-1}L(w,i)^{-1} \mid u \in A^*, (w,i)\in A^* \otimes {\mathbb{N}}, \ L \in\mathcal{C}_1\} \subseteq \mathcal{P}(A^*) \end{multline*}

and

\[\mathcal{S}_1 = \{u^{-1}L v^{-1}\mid u, v \in A^*, \ L \in \mathcal{C}_1\} \subseteq \mathcal{P}(A^* \otimes {\mathbb{N}}).\]

Proof. Since the quotient operations are Boolean homomorphisms, ${\mathcal{D}}$ is generated as a Boolean algebra by the quotients of the languages in $\mathcal{C}$ . Also, since, for $a\in A$ and $L\subseteq A^* \cup (A^* \otimes {\mathbb{N}})$ , we have

\[ a^{-1}L\subseteq A^* \cup (A^* \otimes {\mathbb{N}})\ \text{ and }\ (a,1)^{-1}L\subseteq A^*, \]

it follows that ${\mathcal{D}}_z=2$ . Also, since $\mathcal{C}$ is a Boolean subalgebra of $\mathcal{P}(A^* \cup (A^*\otimes {\mathbb{N}}))$ , it contains $A^* \cup (A^*\otimes {\mathbb{N}})$ and thus ${\mathcal{D}}$ contains $A_z$ . It follows that Lemma 4.6 applies to ${\mathcal{D}}$ and the conclusion of the corollary easily follows.

Another immediate consequence of the equivalence between 4.6 and 4.6 in Lemma 4.6 is the following:

Corollary 4.8. Let $\mathcal{D} \subseteq \mathcal{P}((A \times 2)^*)$ be a Boolean subalgebra closed under quotients that contains $A^* \otimes {\mathbb{N}}$ and such that $\mathcal{D}_z = 2$ . We let $\pi: (A \times 2)^* \to M_\mathcal{D}$ denote the syntactic morphism of $\mathcal{D}$ and we set

\[M = \pi[A^*] \qquad\text{and}\qquad T= \pi[A^* \otimes {\mathbb{N}}].\]

Then,

  1. (a) the biaction of $M_{\mathcal{D}}$ on $X_{\mathcal{D}}$ restricts and co-restricts to a biaction of M on $X_{\mathcal{D}_0}$ and to a biaction of M on $X_{\mathcal{D}_1}$ ;

  2. (b) for every $t \in T$ , the components of the biaction of $M_\mathcal{D}$ on itself at t restrict and co-restrict as follows:

    \[\lambda_{t}: X_{\mathcal{D}_0} \to X_{\mathcal{D}_1} \qquad \text{and}\qquad\rho_{t}: X_{\mathcal{D}_0} \to X_{\mathcal{D}_1}.\]

We remark that, if $\mathcal{D} \subseteq \mathcal{P}((A \times 2)^*)$ is a Boolean subalgebra closed under quotients then $\mathcal{D}_0$ , seen as a Boolean subalgebra of $\mathcal{P}(A^*)$ , is also closed under quotients: it is so because, for every $u, v \in A^*$ and $L \in \mathcal{D}$ , the following equality holds

\[u^{-1}(L \cap A^*) v^{-1} = (u^{-1}Lv^{-1}) \cap A^*.\]

Moreover, since the quotient $\mathcal{P}((A \times 2)^*) \twoheadrightarrow\mathcal{P}(A^*)$ restricts and co-restricts to a quotient $\mathcal{D}\twoheadrightarrow \mathcal{D}_0$ , the syntactic morphism of $\mathcal{D}_0 \subseteq\mathcal{P}(A^*)$ is a restriction and co-restriction of the syntactic morphism of $\mathcal{D} \subseteq \mathcal{P}((A \times 2)^*)$ . In particular, the monoid M defined in Corollary 4.8 is the syntactic monoid $M_{\mathcal{D}_0}$ of $\mathcal{D}_0$ and the biaction of M on $X_{\mathcal{D}_0}$ is simply the natural biaction of $M_{\mathcal{D}_0}$ on $X_{\mathcal{D}_0}$ .

5. The Substitution Principle

The concept of substitution for the study of logic on words, as laid out by Tesson and Thérien in (2007), is quite different from substitution in predicate logic. Substitution in predicate logic works on terms, whereas the notion of substitution in Tesson and Thérien (Reference Tesson and Thérien2007) works at the level of predicates. As such it provides a method for decomposing complex formulas into simpler ones. The core idea is to enrich the alphabet over which the logic is defined in order to be able to substitute large subformulas through letter predicates.

In this section, we start by defining substitution maps with respect to finite Boolean algebras, and we prove a local version of the Substitution Principle (cf. Corollary 5.4), whose main ingredient is the duality between finite sets and finite Boolean algebras. Then, by proving that the substitution maps form a direct limit system, we are able to extend the construction to arbitrary Boolean algebras using full fledged Stone duality. In this process, we are naturally led to consider profinite alphabets, and we are able to state a global version of the Substitution Principle (cf. Corollary 5.12). Finally, in Section 5.4, we show how in practical terms these techniques may be useful in the study of fragments of logic.

5.1 Substitution with respect to finite Boolean algebras

As an example, consider the sentence $\psi = \exists x \phi(x)$ . Then, $\psi$ may be obtained from the sentence $\exists x {\sf P}_b(x)$ by replacing ${\sf P}_b(x)$ by $\phi(x)$ , and, thus, understanding $\psi$ amounts to understanding both the sentence $\exists x \ {\sf P}_b(x)$ and the formula $\phi(x)$ . If we want to substitute away several subformulas in this way, we must account for their logical relations. For instance, suppose that $\phi(x)$ is the conjunction $\phi_1(x)\wedge\phi_2(x)$ , and that $\phi_1(x)$ and $\phi_2(x)$ are the simpler subformulas we wish to consider. Should $\psi$ be obtained from a simpler sentence as above, such a sentence would be $\exists x \ ({\sf P}_{b_1}(x) \wedge{\sf P}_{b_2}(x))$ for two letters $b_1, b_2$ . But, since $\phi_1$ and $\phi_2$ may be related we would then need to impose relations on letters. Then, no complexity is removed. Thus instead, we will consider a finite Boolean algebra of formulas to be substituted away. These can all be accounted for by having a letter predicate for each atom of this finite Boolean algebra. The fact that letter predicates model the atoms of a finite Boolean algebra of formulas in a free variable is built into their interpretation. This explains why, when substituting a formula of a given finite set $\mathcal{F}$ for each occurrence of a letter predicate from a corresponding alphabet in a sentence, we should only consider sets $\mathcal{F}$ of formulas that have the same logical behavior as letter predicates, meaning that $\mathcal{F}$ satisfies

  1. (A.1) $\bigvee_{\phi \in \mathcal{F}}\ \phi$ is the always-true proposition;

  2. (A.2) for every $\phi_1, \phi_2 \in \mathcal{F}$ distinct, $\phi_1 \wedge \phi_2 $ is the always-false proposition.

In other words, we require that $\mathcal{F}$ is the set of atoms of the finite Boolean algebra it generates.

We now formalize this concept of substitution. Throughout this section, we fix a context x and a variable x which does not belong to x. Moreover, $\Gamma$ will be a fixed class of sentences and $\Delta \subseteq {\mathcal{Q}}_{A,{\bf x} x}[\mathcal{N}]$ a finite Boolean subalgebra. We regard the set of atoms of $\Delta$ as a finite alphabet, and in order to emphasize both the fact that it is an alphabet and the fact that it is determined by $\Delta$ , we will denote it by $C_{\Delta}$ . On the other hand, when we wish to view an element c of $C_{\Delta}$ as a formula of $\Delta$ , we will write $\phi_c$ instead of c.

Definition 5.1. The $\Gamma$ -substitution given by $\Delta$ (with respect to the variable x) is the map:

\[\sigma_{\Gamma, \Delta}: \Gamma(C_{\Delta}) \to{\mathcal{Q}}_{A,{\bf x}}[\mathcal{N}]\]

sending a sentence to the formula in context x obtained by substituting for each occurrence of a letter predicate ${\sf P}_c(z)$ , the formula $\phi_c[x/z]$ (i.e., the formula obtained by substituting z for x in the formula $\phi_c\in {\rm At}(\Delta)$ ). Note that here we assume (without loss of generality) that the variables occurring in $\psi \in \Gamma(C_\Delta)$ , such as z, do not occur in the formulas of $\Delta$ . When $\Gamma$ is clear from the context, as will very often be the case, we will simply write $\sigma_\Delta$ instead of $\sigma_{\Gamma, \Delta}$ .

Since the only constraints on the interpretation of letters in a word are given by the properties (A.1) and (A.2), it follows by a simple structural induction that $\sigma_{\Delta}$ is a homomorphism of Boolean algebras. We denote the image of this homomorphism by $\Gamma \odot \Delta$ . In the sequel, we will consider $\sigma_\Delta$ as denoting the co-restriction of the substitution to its image, that is,

\[\sigma_\Delta: \Gamma(C_\Delta) \twoheadrightarrow \Gamma \odot \Delta.\]

Remark 5.2. Let us compare our substitution map with that of Tesson and Thérien (2007, Definition 2.6). In Tesson and Thérien (2007), given a finite set of formulas $\Phi \subseteq \mathcal{Q}_{A, x}[\mathcal{N}]$ , the authors define the associated $\Phi$ -substitution to be the function $\sigma_\Phi: \Gamma(2^\Phi) \to \mathcal{Q}_A[\mathcal{N}]$ that sends a formula $\psi \in \Gamma(2^\Phi)$ to the formula obtained from $\psi$ by replacing each letter predicate ${\sf P}_S(z)$ , with $S \subseteq \Phi$ , by the formula $\varphi_S = \bigwedge_{\varphi \in S} \varphi[x/z] \wedge \bigwedge_{\varphi \notin S} \neg \varphi[x/z]$ . Let $\Delta$ be the Boolean subalgebra of $\mathcal{Q}_{A, x}[\mathcal{N}]$ generated by $\Phi$ . It is not hard to see that, for every subset $S \subseteq \Phi$ , either $\varphi_S$ is the always-false formula or it is an atom of $\Delta$ , and that every atom of $\Delta$ is semantically equivalent to the formula $\varphi_S$ , where $S = \{\varphi \in \Phi \mid \phi \leq \varphi\}$ . In particular, we have an embedding $\zeta: {\rm At}(\Delta) \rightarrowtail 2^\Phi$ mapping an atom $\phi \in \Delta$ to the set $\{\varphi \in \Phi \mid \phi \leq \varphi\}$ . Finally, since $\Gamma$ is a class of sentences, $\zeta$ induces a Boolean algebra quotient $\zeta_\Gamma:\Gamma(2^\Phi) \twoheadrightarrow \Gamma(C_\Delta)$ which, by definition of our substitution map $\sigma_\Delta$ and of the substitution map $\sigma_\Phi$ from Tesson and Thérien (2007), is such that $\sigma_\Delta \circ \zeta_\Gamma = \sigma_\Phi$ . In particular, the images of $\sigma_\Delta$ and of $\sigma_\Phi$ are the same.

Next we describe the languages of $\Gamma \odot \Delta$ via those of $\Gamma$ and those of $\Delta$ . Since $\Gamma(C_\Delta)$ and $\Gamma\odot \Delta$ define, respectively, a Boolean algebra of languages over $C_\Delta$ and a Boolean algebra of marked words over A, we have embeddings:

Our first goal is to prove that the substitution map $\sigma_\Delta$ extends to a complete homomorphism of Boolean algebras $\mathcal{P}(C^*_\Delta)\to \mathcal{P}(A^* \otimes {\mathbb{N}}^{ {\bf x}})$ . Formulated dually, this means we are looking for a map on the level of (marked) words $\tau_\Delta: A^* \otimes {\mathbb{N}}^{ {\bf x}} \to C^*_\Delta$ making the following diagram commute:

Here, the maps $p_\Delta\colon C_{\Delta}^*\to X_{\Gamma{(C_{\Delta})}}$ and $q_\Delta\colon A^* \otimes {\mathbb{N}}^{ {\bf x}} \to X_{\Gamma \odot \Delta}$ are, respectively, the restrictions of the dual maps of the embeddings $\Gamma{(C_{\Delta})}\rightarrowtail \mathcal{P}(C_{\Delta}^*)$ and $\Gamma\odot \Delta \rightarrowtail\mathcal{P}(A^* \otimes {\mathbb{N}}^{ {\bf x}})$ .

In order to be able to define the map $\tau_\Delta$ , we need to understand the Boolean algebra $\Delta$ via duality. Recall that ${\mathcal{Q}}_{A,{\bf x} x}[\mathcal{N}]$ embeds in $\mathcal{P}(A^* \otimes {\mathbb{N}}^{ {{\bf x} x}})$ as a Boolean subalgebra, and therefore, so does $\Delta$ :

\[\Delta \rightarrowtail {\mathcal{Q}}_{A,{\bf x} x}[\mathcal{N}] \rightarrowtail \mathcal{P}(A^* \otimes {\mathbb{N}}^{ {{\bf x} x}}).\]

Applying discrete duality to this composition, we obtain a map:

\[\xi_\Delta: A^* \otimes {\mathbb{N}}^{ {{\bf x} x}} \twoheadrightarrow C_\Delta = {\rm At}(\Delta) \]

defined, for $w\in A^* \otimes {\mathbb{N}}^{ {\bf x}}$ , $i<|w|$ , $c\in C_\Delta$ , and $\phi_c$ the atom of $\Delta$ corresponding to c, by:

(14) \begin{equation} \xi_\Delta(w, i) = c \ \iff \ (w, i)\in L_{\phi_c} \ \iff \ (w, i)\vDash \phi_c .\end{equation}

Proposition 5.3. Let $\Gamma$ be a class of sentences, $\Delta$ a finite Boolean subalgebra of ${\mathcal{Q}}_{A,{\bf x}x}[\mathcal{N}]$ , and $\sigma_\Delta:\Gamma(C_\Delta)\to\Gamma\odot\Delta$ the associated substitution as defined above. Then, the function ${\tau_\Delta: A^* \otimes {\mathbb{N}}^{ {\bf x}} \to C_{\Delta}^*}$ defined by $\tau_\Delta(w) = \xi_\Delta(w, 0) \cdots \xi_\Delta(w, \left\lvert w\right\rvert - 1)$ makes the following diagram commute:

Proof. We show that the dual diagram:

commutes. To this end, let $w \in A^* \otimes {\mathbb{N}}^{ {\bf x}}$ , $i<|w|$ , and $c\in C_\Delta$ with $\phi_c$ the corresponding atom of $\Delta$ . First, we argue that

\[(\tau_\Delta(w), i) \models {\sf P}_c(x) \ \iff \ (w, i) \models \phi_c.\]

This is so because, by the definition of letter predicates, the marked word over $C_\Delta^*$ , $(\tau_\Delta(w), i)$ , is a model of ${\sf P}_c(x)$ if and only if its i-th letter is a c. By definition of $\tau_\Delta$ , this is equivalent to having $\xi_\Delta(w, i) = c$ , which, by (14), means that $(w, i) \models \phi_c$ , as required.

Now, since the validity in a marked word of a quantified formula $Qx\ \psi$ is fully determined once we know the truth value of the given formula $\psi$ at each point of the marked word, and since $\psi\in\Gamma(C_\Delta)$ and $\sigma_\Delta(\psi)$ are built up identically once the substitutions of ${\sf P}_c(x)$ by $\phi_c$ have been made, it follows that, for all $\psi\in\Gamma(C_\Delta)$ , we have

(15) \begin{equation} \tau_\Delta(w) \in L_\psi \ \iff \ w \in L_{\sigma_\Delta(\psi)}.\end{equation}

However,

\[ w \in L_{\sigma_\Delta(\psi)}\quad\iff\quad L_{\sigma_\Delta(\psi)}\in p_\Delta(w)\quad\iff\quad L_\psi\in \Sigma_\Delta( p_\Delta(w))\]

so that

\[ \tau_\Delta(w) \in L_\psi \ \iff \ L_\psi\in \Sigma_\Delta( p_\Delta(w))\]

and thus $\Sigma_\Delta( p_\Delta(w))=q_\Delta(\tau_\Delta(w))$ as required.

The existence of the map $\tau_\Delta$ defined in Proposition 5.3 yields the next result.

Corollary 5.4. (Substitution Principle -- local version) Let $\Gamma$ be a class of sentences and $\Delta \subseteq {\mathcal{Q}}_{A,{\bf x}x}[\mathcal{N}]$ be a finite Boolean subalgebra. Then, the languages definable by a formula of $\Gamma \odot \Delta$ are precisely those of the form $\tau_{\Delta}^{-1}(K)$ , where K is a language definable in $\Gamma{(C_{\Delta})}$ .

Proof. This is an immediate consequence of the commutativity of the diagram dual to that of Proposition 5.3.

Remark 5.5. We warn the reader that the operator $(\_\odot\_)$ just defined does not coincide with the operator $(\_\circ\_)$ considered both in Borlido et al. (2017) and in the regular version of Tesson and Thérien (2007). The relationship between these operators is expressed by the following equality:

\[\Gamma \circ \Delta = \langle (\Gamma \odot \Delta) \cup \Delta_{\bf x}\rangle_{\sf BA},\]

where $\Delta_{\bf x}$ denotes the set of formulas of $\Delta$ in context x (i.e., those whose free variables belong to x). We choose to first study the operator $(\_\odot\_)$ in order to emphasize the role of Stone duality in the Substitution Principle. It should be clear for the reader how to state the corresponding results for $(\_\circ\_)$ .

5.2 The extension to arbitrary Boolean algebras using profinite alphabets

As the reader may have noticed, the context x fixed along the previous section is not playing an active role. For that reason, and in order to simplify the notation, we now assume that x is the empty context. Later, in Section 5.3, we will see that this assumption may be done without the loss of generality.

Here, we show that substitution as defined in Section 5.1 extends to arbitrary Boolean algebras in a meaningful way. Fix a class of sentences $\Gamma$ . We start by comparing the substitution maps obtained for two finite Boolean algebras, one contained in the other. To this end, suppose $\Delta_1\rightarrowtail \Delta_2$ is such an inclusion of finite Boolean subalgebras of ${\mathcal{Q}}_{A,x}[\mathcal{N}]$ and let $\zeta: C_2\twoheadrightarrow C_1$ be the dual of the inclusion. Recall that the semantics of logic on words provides embeddings of $\Gamma(C_i)$ in $\mathcal{P}(C_i^*)$ and that, since $\Gamma$ is a class of sentences, the surjection $\zeta$ yields an embedding $\zeta_\Gamma\colon\Gamma(C_1)\rightarrowtail\Gamma(C_2)$ making the following diagram commute (cf. diagram (3) and LC.2):

(16)

We also have

Lemma 5.6. The following diagram is commutative:

Proof. We first recall from Proposition 5.3 that, for $i = 1, 2$ , and $w \in A^*$ , we have

\[\tau_{\Delta_i}(w) = \xi_i(w, 0) \dots \xi_i(w, \left\lvert w\right\rvert -1),\]

where $\xi_i: A^* \otimes {\mathbb{N}} \twoheadrightarrow C_i$ is the quotient dual to the embedding $\Delta_i \rightarrowtail \mathcal{P}(A^* \otimes {\mathbb{N}})$ . On the other hand, by hypothesis, we have a commutative diagram:

which dually gives

It then follows that

\begin{align*} \tau_{\Delta_1}(w) & = \xi_1(w, 0) \dots \xi_1(w, \left\lvert w\right\rvert - 1) \\ & = \zeta^*(\xi_2(w, 0) \dots \xi_2(w, \left\lvert w\right\rvert - 1)) = \zeta^* (\tau_{\Delta_2}(w)), \end{align*}

that is, the statement of the lemma is true.

As a straightforward consequence of diagram (16), Lemma 5.6, and Proposition 5.3, we have the following:

Theorem 5.7. Let $\Delta_1 \rightarrowtail \Delta_2$ be an embedding of finite Boolean subalgebras of ${\mathcal{Q}}_{A,x}[\mathcal{N}]$ , and let $\zeta: C_2 \twoheadrightarrow C_1$ be its dual map. Then, the following diagram commutes:

Corollary 5.8. If $\Delta_1 \subseteq \Delta_2$ are finite Boolean subalgebras of ${\mathcal{Q}}_{A,x}[\mathcal{N}]$ , then the inclusion $\Gamma \odot \Delta_1 \subseteq \Gamma \odot \Delta_2$ holds. In particular, for every Boolean subalgebra $\Delta\subseteq {\mathcal{Q}}_{A,x}[\mathcal{N}]$ , the family $\{\Gamma \odot \Delta' \mid \Delta' \subseteq \Delta\text{ is a finite Boolean subalgebra}\}$ forms a direct limit system.

Proof. Let $\Delta_1 \rightarrowtail \Delta_2$ be an embedding of finite Boolean subalgebras of ${\mathcal{Q}}_{A,x}[\mathcal{N}]$ , and $\zeta: C_2 \twoheadrightarrow C_1$ its dual map. Commutativity of the outer triangle of Theorem 5.7 yields

Thus, we have a direct system of Boolean subalgebras of $\mathcal{Q}_A[\mathcal{N}]$ and $\Gamma\odot \Delta_1\subseteq\Gamma\odot \Delta_2$ .

This leads to the following definition.

Definition 5.9. For an arbitrary Boolean subalgebra $\Delta \subseteq {\mathcal{Q}}_{A,{x}}[\mathcal{N}]$ , we set

(17) \begin{equation} \Gamma \odot \Delta = \lim\limits_{\longrightarrow} \ \{\Gamma\odot \Delta'\mid \Delta' \subseteq \Delta \text{ is a finite Boolean subalgebra}\}. \end{equation}

Note that $\Gamma \odot \Delta$ is simply given by the union of all Boolean subalgebras $\Gamma\odot \Delta' \subseteq \mathcal{Q}_A[\mathcal{N}]$ . In particular, it is also a Boolean subalgebra of $\mathcal{Q}_A[\mathcal{N}]$ .

In turn, on the side of Boolean spaces, we have a projective limit system formed by the maps $\tau_{\Delta'}$ .

Corollary 5.10. For every Boolean subalgebra $\Delta \subseteq {\mathcal{Q}}_{A,x}[\mathcal{N}]$ , the set of maps

(18) \begin{equation} \{\tau_{\Delta'}: A^* \to C_{\Delta'}^* \mid \Delta' \subseteq \Delta\text{ is a finite Boolean subalgebra}\} \end{equation}

forms a projective limit system, where the connecting morphisms are the homomorphisms of monoids $C_{\Delta_{2}}^* \twoheadrightarrow C_{\Delta_1}^*$ induced by the dual maps of inclusions $\Delta_1 \rightarrowtail \Delta_2$ of finite Boolean subalgebras of $\Delta$ . Moreover, the limit of this system is the map $\tau_\Delta: A^* \to X_{\Delta}^*$ sending a word $w \in A^*$ to the word $\gamma_0 \cdots \gamma_{\left\lvert w\right\rvert - 1}$ with $\gamma_i = \{\phi \in \Delta \mid (w, i) \text{ satisfies } \phi\}$ . In other words, $\gamma_i$ is the projection to $X_{\Delta}$ of the principal (ultra)filter of $\mathcal{P}(A^*\otimes {\mathbb{N}})$ generated by (w, i).

Proof. The fact that (18) forms a projective limit system follows from Lemma 5.6.

Now, we remark that the maps $\tau_{\Delta'}$ are all length-preserving; thus, the above system factors into projective limit systems $\{\tau_{\Delta'}: A^n \to (C_{\Delta'} )^n \mid \Delta' \subseteq \Delta\text{ is a finite Boolean subalgebra}\}$ for each $n \ge 0$ , and each of these systems has itself a projective limit, which is induced by the projections $A^* \otimes {\mathbb{N}} \to X_\Delta$ . Thus, the projective limit of (18) is the map $\tau_\Delta: A^* \to X_\Delta^*$ defined in the statement, where the space $X^*_{\Delta}$ is seen as the (disjoint) union over $n \ge 0$ of the spaces $X_{\Delta}^n$ , each one of those being a Boolean space when equipped with the product topology.

The reader should now recall the extension of an lp-strain of languages $\mathcal{V}$ to profinite alphabets presented in Section 3, and in particular that such extension yields an embedding $\mathcal{V}(Y) \rightarrowtail {\sf Clopen}(Y^*)$ for every profinite alphabet Y.

Corollary 5.11. Let $\Gamma$ be a class of sentences and $\Delta \subseteq {\mathcal{Q}}_{A,{x}}[\mathcal{N}]$ be a Boolean subalgebra. Then, the Boolean algebra $\Gamma \odot \Delta$ is a quotient of $\mathcal{V}_\Gamma(X_{\Delta})$ , or equivalently, $X_{\Gamma \odot \Delta}$ is isomorphic to a closed subspace of $X_{\mathcal{V}_\Gamma(X_\Delta)}$ . More precisely, there exist commutative diagrams:

which are dual to each other.

Proof. We prove that the left-hand diagram commutes. Let L be a language of $\Gamma\odot \Delta$ . By Definition 5.9 of $\Gamma \odot \Delta$ , this means that there exists a finite Boolean subalgebra $\Delta' \subseteq \Delta$ such that L belongs to $\Gamma \odot \Delta'$ . Denote by $C_{\Delta'}$ the alphabet consisting of the atoms of $\Delta'$ . Using the local version of the Substitution Principle (cf. Corollary 5.4), this is equivalent to the existence of some language $K \in \mathcal{V}_\Gamma(C_{\Delta'})$ such that $L = \tau_{\Delta'}^{-1}(K)$ . But by Lemma 3.1 and Corollary 5.10, this amounts to having that L belongs to $\tau_\Delta^{-1}(\mathcal{V}_\Gamma(X_\Delta))$ . So, we just proved that the image of the following composition:

is precisely $\Gamma \odot \Delta$ . That is, we have a commutative diagram as in the left-hand side of the statement.

We can now state the already announced global version of the Substitution Principle, which is simply a rephrasing of the commutativity of the left-hand diagram of Corollary 5.11.

Corollary 5.12. (Substitution Principle -- global version). Let $\Gamma$ be a class of sentences and $\Delta \subseteq {\mathcal{Q}}_{A,{x}}[\mathcal{N}]$ be a Boolean subalgebra. Then, the languages (over A) definable in $\Gamma \odot \Delta$ are precisely the languages of the form $\tau_\Delta^{-1}(K)$ , where $K \in \mathcal{V}_\Gamma(X_{\Delta})$ .

5.3 Using an alphabet to encode free variables

We now explain how free variables may be encoded in an alphabet, provided the logic at hand is expressive enough. The idea is that a formula $\phi\in {\mathcal{Q}}_{A,x}[\mathcal{N}]$ may be regarded as a sentence over the extended alphabet $(A \times 2)$ ; in the same way, we may regard marked words as words over $(A \times 2)$ via the embedding $A^*\otimes {\mathbb{N}} \rightarrowtail (A \times 2)^*$ . More generally, given two disjoint contexts x and y, we will define two functions:

\[\varepsilon_{{\bf x}, {\bf y}} :{\mathcal{Q}}_{A,{\bf x}{\bf y}}[\mathcal{N}] \to \mathcal{Q}_{A \times 2^{\bf x}, {\bf y}}[\mathcal{N}] \quad \text{and}\quad \delta_{{\bf x},{\bf y}}: \mathcal{Q}_{A \times 2^{\bf x}, {\bf y}}[\mathcal{N}] \to {\mathcal{Q}}_{A,{{\bf x}{\bf y}}}[\mathcal{N}],\]

which we call, respectively, the encoding and the decoding function. As the name suggests, $\varepsilon_{{\bf x} {\bf y}}$ “encodes” the variables from x in the extended alphabet $A \times2^{\bf x}$ , while $\delta_{{\bf x},{\bf y}}$ reverts this process (see Corollary 5.20 for a precise statement).

Given contexts x and y, we let

\[\iota_{{\bf x}}:A^* \otimes {\mathbb{N}}^{ {\bf x}} \rightarrowtail (A \times 2^{\bf x})^*\]

denote the natural embedding of $ {\bf x}$ -marked words into the set of words over $(A \times 2^ {\bf x})$ , and

\[\iota_{{\bf x},{\bf y}}: A^* \otimes {\mathbb{N}}^{ {{\bf x}{\bf y}}} \rightarrowtail (A \times 2^{\bf x})^* \otimes {\mathbb{N}}^{ {\bf y}}\]

be defined by $\iota_{{\bf x}, {\bf y}}(w, {\bf i}, {\bf j}) = (\iota_{\bf x}(w, {\bf i}), {\bf j})$ , for every $(w, {\bf i}, {\bf j}) \in A^* \otimes{\mathbb{N}}^{ {{\bf x}{\bf y}}}$ .

In order to keep the notation simple, we first consider the case where x consists of a single variable x.

To define the encoding function, we need to assume that the quantifier $\exists!$ (there exists a unique) and the numerical predicate $=$ (formally given by $\{(n,n) \mid n \in {\mathbb{N}}\}$ ) are expressible in our logic. This assumption is needed so that we are able to define the set of all marked words $A^* \otimes {\mathbb{N}}$ as a language over $(A \times 2)$ . Indeed, one can easily check that the models of the sentence:

(19) \begin{equation} \Phi= \exists ! z\ \bigvee_{a \in A} {\sf P}_{(a,1)}(z)\end{equation}

are precisely the words of $(A \times 2)^*$ having exactly one letter of the form (a, 1), that is, those that belong to the image of $\iota_x: A^* \otimes {\mathbb{N}} \rightarrowtail (A \times 2)^*$ . More generally, if $\Phi$ is seen as a formula in a given context y, then its models are the y-marked words in the image of $\iota_{x, {\bf y}}: A^* \otimes {\mathbb{N}}^{ {x{\bf y}}} \rightarrowtail (A \times 2)^* \otimes{\mathbb{N}}^{ {\bf y}}$ .

Definition 5.13. Let $\phi$ be a formula in context $x{\bf y}$ over the alphabet A and assume that all occurrences of x in $\phi$ are free. We denote by $\phi^x$ the formula obtained from $\phi$ by:

  • replacing each predicate ${\sf P}_a(x)$ by $\exists z\ {\sf P}_{(a,1)}(z)$ ,

  • replacing each predicate ${\sf P}_a(z)$ by ${\sf P}_{(a,1)} (z) \vee {\sf P}_{(a, 0)}(z)$ , if $z \neq x$ ,

  • replacing each predicate $R(\_, x, \_)$ by $\exists z \ (\bigvee_{a \in A}{\sf P}_{(a,1)}(z) \wedge R(\_, z,\_))$ .

By a routine structural induction on the construction of a formula, we can show the following:

Lemma 5.14. Let $\phi$ be a formula in context $x{\bf y}$ over the alphabet A. Then, for every $w \in A^*$ , $i \in \{0, \dots, \left\lvert w\right\rvert - 1\}$ , and ${\bf j} \in \{0, \dots,\left\lvert w\right\rvert - 1\}^{\left\lvert{\bf y}\right\rvert}$ , we have

\[(w, i, {\bf j}) \models \phi \iff \iota_{x, {\bf y}}(w, i,{\bf j}) \models \phi^x.\]

Corollary 5.15. Let $\Phi$ be the formula defined in (19). Then, there is a well-defined lattice homomorphism:

\[\varepsilon_{x, {\bf y}}: {\mathcal{Q}}_{A,x{\bf y}} [\mathcal{N}] \to {\mathcal{Q}}_{A\times 2,{\bf y}} [\mathcal{N}], \qquad \phi \mapsto \phi^x \wedge \Phi,\]

which makes the following diagram commute:

That is, for every $\phi \in {\mathcal{Q}}_{A,x{\bf y}} [\mathcal{N}]$ , we have $L_{\varepsilon_{x,{\bf y}}(\phi)} = \iota_{x,{\bf y}}[L_{\phi}]$ .

Proof. We first note that $\varepsilon_{x, {\bf y}}$ is well defined, that is, that $\phi^x \wedge \Phi$ and $(\phi')^x \wedge \Phi$ are semantically equivalent formulas provided so are $\phi$ and $\phi'$ . Indeed, since the models of $\Phi$ , when seen as a formula in the context y, are the y-marked words over $(A \times 2)$ that belong to the image of $\iota_{x, {\bf y}}$ , that is a consequence of Lemma 5.14. It is easily seen that $\varepsilon_{x, {\bf y}}$ is a lattice homomorphism. Commutativity of the diagram also follows from Lemma 5.14.

We call encoding function (with respect to $x, {\bf y}$ ) the map $\varepsilon_{x {\bf y}}: {\mathcal{Q}}_{A,x{\bf y}}[\mathcal{N}] \to {\mathcal{Q}}_{A\times 2,{\bf y}}[\mathcal{N}]$ of Corollary 5.15. Note that, if ${\bf y}'$ is a context containing y (so that $ {\mathcal{Q}}_{A,x{\bf y}}[\mathcal{N}] \subseteq {\mathcal{Q}}_{A,x{\bf y}'}[\mathcal{N}]$ and $ {\mathcal{Q}}_{A \times 2,{\bf y}}[\mathcal{N}] \subseteq {\mathcal{Q}}_{A\times 2,{\bf y}'} [\mathcal{N}]$ ) then $\varepsilon_{x, {\bf y}}$ is the restriction and co-restriction of $\varepsilon_{x, {\bf y}'}$ .

Conversely, a formula $\psi$ over $(A \times 2)$ in context y may be regarded as a formula over A in context $x{\bf y}$ as follows:

Definition 5.16. Let $\psi$ be a formula in context y over the alphabet $A \times 2$ . We denote by $\psi_x$ the formula obtained from $\psi$ by replacing:

  • each predicate ${\sf P}_{(a,1)}(z)$ by ${\sf P}_a(z) \wedge (x = z)$ ,

  • each predicated ${\sf P}_{(a, 0)}(z)$ by ${\sf P}_a(z) \wedge (x \neq z)$ .

Of course, we are assuming that things have been arranged so that the variable x does not occur in $\psi$ .

Again, a structural induction on construction of formulas shows the following:

Lemma 5.17. Let $\psi$ be a formula in context y over the alphabet $(A \times 2)$ . Then, for every $w \in A^*$ , $i \in \{0, \dots, \left\lvert w\right\rvert - 1\}$ , and ${\bf j} \in \{0, \dots,\left\lvert w\right\rvert - 1\}^{\left\lvert{\bf y}\right\rvert}$ , we have

\[(w, i,{\bf j}) \models \psi_x \iff \iota_{x, {\bf y}}(w, i, {\bf j}) \models \psi.\]

The next result is a straightforward consequence of Lemma 5.17.

Corollary 5.18. There is a well-defined Boolean algebra homomorphism:

\[\delta_{x,{\bf y}}: {\mathcal{Q}}_{A\times 2,{\bf y}} [\mathcal{N}] \to {\mathcal{Q}}_{A,x{\bf y}}[\mathcal{N}], \qquad\psi \mapsto \psi_x,\]

which makes the following diagram commute:

That is, for every $\psi \in {\mathcal{Q}}_{A\times 2,{\bf y}}[\mathcal{N}]$ , we have $L_{\delta_{x,{\bf y}} (\phi)} = \iota_{x,{\bf y}}^{-1}(L_{\phi})$ .

The decoding function (with respect to $x, {\bf y}$ ) is then the map $\delta_{x,{\bf y}}: {\mathcal{Q}}_{A\times 2,{\bf y}} [\mathcal{N}] \to {\mathcal{Q}}_{A,x{\bf y}}[\mathcal{N}]$ defined in Corollary 5.18. Similarly to what happens for $\varepsilon_{x, {\bf y}}$ , if ${\bf y}'$ is a context containing y, then $\delta_{x, {\bf y}}(\psi)$ is the restriction and co-restriction of $\delta_{x, {\bf y}'}(\psi)$ .

Let us now define $\varepsilon_{{\bf x}, {\bf y}}$ and $\delta_{{\bf x}, {\bf y}}$ , for all contexts x and y. We write ${\bf x} = \{x_1, \dots, x_k\}$ and, for each $i \in \{0, \dots, k\}$ , we let ${\bf x}_i$ denote the context $\{x_{1}, \dots, x_i\}$ (in particular, ${\bf x}_0$ denotes the empty context and ${\bf x}_k = {\bf x}$ ). Then, for each $i \in \{1, \dots, k\}$ , we have defined:

\[\varepsilon_{x_i, {\bf x}_{i-1}{\bf y}}: {\mathcal{Q}}_{{A\times 2^{k-i}},{\bf x_i}{\bf y}}[\mathcal{N}] \to {\mathcal{Q}}_{A\times 2^{k-i+1},{\bf x}_{i-1}{\bf y}}[\mathcal{N}]\]

and

\[\delta_{x_i, {\bf x}_{i-1}{\bf y}}:{\mathcal{Q}}_{A\times 2^{k-i+1},{\bf x}_{i-1}{\bf y}} [\mathcal{N}] \to {\mathcal{Q}}_{A\times 2^{k-i},{\bf x}_i{\bf y}}[\mathcal{N}].\]

An iteration of these maps yields two functions:

\[ \varepsilon_{{\bf x}, {\bf y}}: {\mathcal{Q}}_{A,{\bf x}{\bf y}}[\mathcal{N}] \to \mathcal{Q}_{A \times 2^{\bf x},{\bf y}}[\mathcal{N}] \quad \text{and}\quad \delta_{{\bf x},{\bf y}}: \mathcal{Q}_{A \times 2^{\bf x},{\bf y}}[\mathcal{N}] \to {\mathcal{Q}}_{A,{\bf x}{\bf y}}[\mathcal{N}],\]

which are given, respectively, by $\varepsilon_{{\bf x}, {\bf y}} =\varepsilon_{x_1,{\bf x}_0{\bf y}} \circ \dots \circ\varepsilon_{x_k,{\bf x}_{k-1}{\bf y}}$ and by $\delta_{{\bf x},{\bf y}} =\delta_{x_k,{\bf x}_{k-1}{\bf y}} \circ \dots \circ \delta_{x_1,{\bf x}_0{\bf y}}$ . In particular, $\varepsilon_{{\bf x}, {\bf y}}$ is a lattice homomorphism, while $\delta_{{\bf x}, {\bf y}}$ is a Boolean algebra homomorphism. Moreover, when x consists of a single variable x, the functions $\varepsilon_{x, {\bf y}}$ and $\delta_{x, {\bf y}}$ are precisely those previously defined.

By Corollaries 5.15 and 5.18, we have the following:

Proposition 5.19. Let x and y be disjoint contexts. Then, the following diagrams commute:

An immediate, but important, consequence of Proposition 5.19 is the following:

Corollary 5.20. For every $\phi\in {\mathcal{Q}}_{A,{\bf x}{\bf y}}[\mathcal{N}]$ and $\psi\in{\mathcal{Q}}_{A\times 2^{\bf x},{\bf y}}[\mathcal{N}]$ , we have the following:

  • (a) $\phi$ is semantically equivalent to $\delta_{{\bf x},{\bf y}}\varepsilon_{{\bf x},{\bf y}}(\phi)$ ,

  • (b) the models of $\psi$ that belong to ${\sf Im }(\iota_{{\bf x},{\bf y}})$ are precisely the models of $\varepsilon_{{\bf x},{\bf y}}\delta_{{\bf x},{\bf y}}(\psi)$ .

In particular, $\varepsilon_{{\bf x}, {\bf y}}$ is a lattice embedding, and $\delta_{{\bf x}, {\bf y}}$ is a quotient of Boolean algebras.

We finally show that it suffices to consider substitution with respect to a single variable.

Proposition 5.21. Let $\Gamma$ be a class of sentences, and let $\Delta \subseteq {\mathcal{Q}}_{A,{\bf x} x}[\mathcal{N}]$ be a finite Boolean subalgebra. Then, there exist a finite Boolean subalgebra $\Delta' \subseteq {\mathcal{Q}}_{A \times 2^{\bf x},x}[\mathcal{N}]$ , and an embedding $\zeta:{\rm At}(\Delta) \rightarrowtail {\rm At}(\Delta')$ , for which the following diagram commutes:

where $C_{\Delta} = {\rm At}(\Delta)$ and $C_{\Delta'} = {\rm At}(\Delta')$ . In particular, since $\zeta_\Gamma$ is surjective, we have

\[\Gamma \odot \Delta = \delta_{{\bf x}, \emptyset}[\Gamma \odot \Delta'].\]

Proof. We take $\Delta' = \{\psi \in {\mathcal{Q}}_{A \times 2^{\bf x},x} [\mathcal{N}] \mid \delta_{{\bf x} x}(\psi) \in \Delta\}$ . Since $\delta_{{\bf x} x}$ is a Boolean algebra homomorphism, $\Delta'$ is a Boolean algebra. Moreover, $\delta_{{\bf x} x}$ restricts and co-restricts to a Boolean algebra quotient $\delta_{{\bf x} x}': \Delta' \twoheadrightarrow \Delta$ . We let $\zeta: {\rm At}(\Delta) \rightarrowtail {\rm At}(\Delta')$ be its dual. Let us show that the diagram of the statement commutes. Let $w \in A^* \otimes {\mathbb{N}}^{ {\bf x}}$ and $\phi \in \Gamma(C_{\Delta'})$ . Then, we have

and

Therefore, the diagram commutes provided the equality

(20) \begin{equation} \zeta^* \circ \tau_\Delta(w) = \tau_{\Delta'}\circ \iota_{{\bf x}}(w) \end{equation}

holds. We first note that both sides of this equality are words over $C_{\Delta'} = {\rm At}(\Delta')$ of length $\left\lvert w\right\rvert$ . So, we fix $i \in \{0, \dots, \left\lvert w\right\rvert -1\}$ . We let $c_\ell$ and $c_r$ denote, respectively, the i-th letter of the left and of the right-hand side of (20). We also let $\psi_{c_\ell}$ and $\psi_{c_r}$ denote the atoms of $\Delta'$ represented by the letters $c_\ell$ and $c_r$ , respectively. Now, since $\zeta$ is dual to $\delta_{{\bf x}, x}$ , we have that $\psi_{c_\ell}$ is determined by the condition $(\tau_\Delta(w))_i \leq \delta_{{\bf x}, x}(\psi_{c_\ell})$ where, by (14), $(\tau_\Delta(w))_i$ the unique atom of $\Delta$ satisfying $(w,i)\models (\tau_\Delta(w))_i$ . In particular, we have $(w, i) \models \delta_{{\bf x}, x}(\psi_{c_\ell})$ . On the other hand, by (14), $\psi_{c_r}$ is the unique atom of $\Delta'$ satisfying $(\iota_{{\bf x}}(w), i) \models \psi_{c_r}$ . Since $(\iota_{{\bf x}}(w), i) = \iota_{{\bf x}, x}(w,i)$ , by Proposition 5.19, this is equivalent to $(w, i) \models \delta_{{\bf x}, x}(\psi_{c_r})$ . Thus, we have $\psi_{c_\ell} = \psi_{c_r}$ as required.

Using the definition of $(\Gamma\odot \Delta)$ when $\Delta$ is an arbitrary Boolean algebra, we may easily derive the following result from Proposition 5.21 and from its proof:

Corollary 5.22. Let $\Gamma$ be a class of sentences, and let $\Delta \subseteq {\mathcal{Q}}_{A,{\bf x} x}[\mathcal{N}]$ be a Boolean subalgebra. Then, there exists a Boolean subalgebra $\Delta' \subseteq {\mathcal{Q}}_{A \times 2^{\bf x},x}[\mathcal{N}]$ , namely, $\Delta' = \{\psi \in {\mathcal{Q}}_{A \times 2^{\bf x},x}[\mathcal{N}] \mid \delta_{{\bf x}, x}(\psi) \in \Delta\}$ , such that

\[\Gamma \odot \Delta = \delta_{{\bf x}, \emptyset}[\Gamma \odot \Delta'].\]

5.4 Applications to logic on words

In this section, we show the utility of the Substitution Principle in handling the application of one layer of quantifiers to a Boolean algebra of formulas. As we will see, this allows for a systematic study of fragments of logic via smaller subfragments (cf. Proposition 5.25 below). Before stating it, we show how we can assign to a given set of quantifiers a class of sentences.

Lemma 5.23. Let $\mathcal{Q}$ be a set of quantifiers. The assignment

\[\Gamma_\mathcal{Q}: A \mapsto \mathcal{Q}_A[\emptyset] = \left\langle \left\{Q x \bigvee_{a \in B} {\sf P}_a(x) \mid Q \in \mathcal{Q}, \ B \subseteq A\right\}\right\rangle_{\sf BA}\]

defines a class of sentences.

Proof. By definition, $\Gamma_\mathcal{Q}(A)$ is a Boolean algebra, so Property (LC.1) holds. To check (LC.2), it is enough to observe that replacing a letter predicate by a disjunction of letter predicates yields formulas without numerical predicates. Thus, every map $\zeta: A \to B$ defines a homomorphism $\zeta_\Gamma: \mathcal{Q}_{B}[\emptyset] \to \mathcal{Q}_{A}[\emptyset]$ .

The following is an immediate consequence of the definitions of $\Gamma_\mathcal{Q}$ and of $(\_\odot\_)$ , and it shows that $\Gamma_\mathcal{Q}$ -substitutions model the application of a layer of quantifiers to a given Boolean algebra.

Lemma 5.24. Let A be a finite alphabet, $\mathcal{Q}$ a set of quantifiers, $\mathcal{N}$ a set of numerical predicates, and $\Delta \subseteq \mathcal{Q}_{A, x}[\mathcal{N}]$ a Boolean subalgebra. Then,

\[\Gamma_\mathcal{Q}\odot \Delta = \langle\{Q x\ \phi \mid \phi \in \Delta\}\rangle_{\sf BA}.\]

Finally, an iteration of Lemma 5.24 yields the following:

Proposition 5.25. Let A be a finite alphabet, $\mathcal{Q}$ a set of quantifiers, and $\mathcal{N}$ a set of numerical predicates. For each $n > 0$ , let $\Delta_n$ be the Boolean algebra of quantifier-free formulas in the context ${\bf x} = \{x_1, \ldots, x_n\}$ . Then, a sentence of $\mathcal{Q}_A[\mathcal{N}]$ has quantifier-depth n if and only if it belongs to

\[\underbrace{\Gamma_{\mathcal{Q}} \odot (\dots \odot(\Gamma_{\mathcal{Q}}}_{n\text{ times}} \odot \ \Delta_n)\dots).\]

In particular, we have

\[\mathcal{Q}_A [\mathcal{N}] = \bigcup_{n \in {\mathbb{N}}} \underbrace{\Gamma_{\mathcal{Q}} \odot (\dots \odot(\Gamma_{\mathcal{Q}}}_{n\text{ times}} \odot \ \Delta_n)\dots).\]

Proof. Let n be a positive integer. For each $k \geq 0$ , we denote

\[\Gamma_\mathcal{Q}^k \odot \Delta_{n} = \underbrace{\Gamma_{\mathcal{Q}} \odot (\dots \odot(\Gamma_{\mathcal{Q}}}_{k\text{ times}} \odot \ \Delta_n)\dots).\]

We prove by induction on $k\ge 0$ that $\Gamma_\mathcal{Q}^k \odot \Delta_{n}$ consists of the formulas of quantifier-depth k in the context $\{x_{k+1}, \dots,x_n\}$ . The case $k = 0$ follows from the definition of $\Delta_n$ . Suppose the claim is valid for k, and let ${\bf x} = \{x_{k+2}, \dots, x_n\}$ . By induction hypothesis, we have $\Gamma_\mathcal{Q}^k \odot \Delta_n \subseteq {\mathcal{Q}}_{A,{\bf x}x_{k+1}}[\mathcal{N}]$ . Thus, by Corollary 5.22, the Boolean algebra $\Delta= \{\psi \in {\mathcal{Q}}_{A \times 2^{\bf x},{x_{k+1}}}[\mathcal{N}] \mid \delta_{{\bf x}, x_{k+1}}(\psi) \in \Gamma_\mathcal{Q}^k \odot \Delta_n\}$ is such that

\[\Gamma_\mathcal{Q}^{k+1} \odot \Delta_n = \Gamma_\mathcal{Q} \odot (\Gamma_\mathcal{Q}^k \odot \Delta_n) = \delta_{{\bf x}, \emptyset}\ [\Gamma_\mathcal{Q} \odot \Delta].\]

Since $\delta_{{\bf x}, \emptyset}$ is a homomorphism of Boolean algebras, by Lemma 5.24, it follows that $\Gamma_\mathcal{Q}^{k+1} \odot \Delta_n$ is the Boolean algebra generated by the formulas of the form $\delta_{{\bf x}, \emptyset}(Q x_{k+1}\ \psi)$ , where $\psi$ belongs to $\Delta$ . Finally, since

\[ \delta_{{\bf x}, \emptyset}(Q x_{k+1}\ \psi) = Q x_{x+1} \ \delta_{{\bf x}, x_{k+1}}(\psi),\]

by definition of $\Delta$ and by induction hypothesis, we may conclude that $\Gamma_\mathcal{Q}^{k+1}\odot \Delta_n$ consists of the quantifier-depth $k+1$ formulas in the context $\{x_{k+2}, \dots, x_n\} = {\bf x}$ as we intended to show.

6. Semidirect Products

Let $\Gamma$ be a class of sentences and $\Delta \subseteq {\mathcal{Q}}_{A,x}[\mathcal{N}]$ a Boolean subalgebra of formulas in one free variable. By Corollary 5.12, the languages definable by a formula of $\Gamma \odot \Delta$ may be described using the Boolean algebra of languages $\mathcal{V}_\Gamma(X_\Delta)$ , where $X_\Delta$ is the dual space of $\Delta$ , and the map $\tau_\Delta: A^* \to X_{\Delta}^*$ as defined in Corollary 5.10. On the other hand, the map $\tau_\Delta$ depends only on the Boolean subalgebra $\mathcal{D}_\Delta=\{L_{\phi} \mid \phi \in \Delta\} \subseteq \mathcal{P}(A^* \otimes {\mathbb{N}})$ of the languages definable by a formula in $\Delta$ . The following definition is an abstract version of this construction.

Definition 6.1. Let $\mathcal{C} \subseteq \mathcal{P}(A^* \otimes {\mathbb{N}})$ be a Boolean subalgebra and $\mathcal{W} \subseteq {\sf Clopen}(X_\mathcal{C}^*)$ be a Boolean subalgebra of languages over $X_\mathcal{C}$ , the dual space of $\mathcal{C}$ , viewed as a profinite alphabet. Also, let $\pi: A^* \otimes {\mathbb{N}} \to X_{\mathcal{C}}$ be the (restriction of the) dual of the embedding $\mathcal{C}\rightarrowtail \mathcal{P}(A^* \otimes {\mathbb{N}})$ . Define the map $\tau_{\mathcal{C}}: A^* \to X_{\mathcal{C}}^*$ by:

\[\tau_{\mathcal{C}}(w) = \pi(w, 0) \dots \pi(w, \left\lvert w\right\rvert - 1),\]

and let

\[\mathcal{W} \odot \mathcal{C} =\{\tau_{\mathcal{C}}^{-1}(K) \mid K \in \mathcal{W}\}.\]

Clearly, $\mathcal{W} \odot \mathcal{C}$ is a Boolean subalgebra of $\mathcal{P}(A^*)$ and, by duality, we have that the dual space of $\mathcal{W} \odot \mathcal{C}$ is a closed subspace of $X_{\mathcal{W}}$ given by the dual of the map $\tau_{\mathcal{C}}^{-1}\colon \mathcal{W}\to\mathcal{P}(A^*).$ That is, we have commutative diagrams:

which are dual to each other. Comparing these diagrams with those of Corollary 5.11, we readily see that Definition 6.1 is indeed an abstraction of the construction in Section 5.2. More precisely, by Corollary 5.12, as discussed above, if $\mathcal{C}=\mathcal{D}_\Delta$ and $\mathcal{W}=\mathcal{V}_\Gamma(X_\Delta)$ , then $\mathcal{W} \odot \mathcal{C}$ is the Boolean algebra of languages over A given by the logic fragment $\Gamma\odot \Delta$ . Apart from this logic application, which is our focus here, note that this definition is useful more widely in the theory of formal languages; see Weil (Reference Weil1992) for some examples.

Our purpose now is to obtain a description of the BiM dual to the Boolean algebra closed under quotients generated by a Boolean algebra of the form $\mathcal{V}(X_\mathcal{C}) \odot \mathcal{C}$ , for an lp-variety of languages $\mathcal{V}$ and $\mathcal{C}$ coming from a Boolean subalgebra closed under quotients of $\mathcal{P}((A \times 2)^*)$ as studied in Section 4.2.

Thus, we let $\mathcal{D}$ be a Boolean subalgebra of $\mathcal{P}((A \times 2)^*)$ which is closed under the quotient operations and is generated by the union of its retracts $\mathcal{D}_0\subseteq\mathcal{P}(A^*)$ and $\mathcal{D}_1\subseteq\mathcal{P}(A^*\otimes{\mathbb{N}})$ as considered in Lemma 4.6. Recall that $\mathcal{D}_0$ and $\mathcal{D}_1$ are, respectively, the following quotients of $\mathcal{D}$ :

\[ \mathcal{D}_0=\{L\cap A^*\mid L\in \mathcal{D}\} \qquad \text{and}\qquad \mathcal{D}_1=\{L\cap (A^* \otimes{\mathbb{N}})\mid L\in \mathcal{D}\}.\]

Let $\pi: (A \times 2)^* \to X_\mathcal{D}$ be the restriction to $(A \times2)^*$ of the map dual to the embedding $\mathcal{D} \rightarrowtail \mathcal{P}((A\times 2)^*)$ . As in Section 4.2, we denote

\[M= \pi[A^*]\qquad \text{and}\qquad T = \pi[A^* \otimes {\mathbb{N}}].\]

These are, respectively, a dense monoid in $X_{\mathcal{D}_0}$ and a dense subset of $X_{\mathcal{D}_1}$ . By Corollary 4.8(a), the natural biaction of the syntactic monoid $M_{{\mathcal{D}}}$ of $\mathcal{D}$ on $X_{{\mathcal{D}}}$ restricts and co-restricts to a biaction of M on $X_{\mathcal{D}_0}$ and to a biaction of M on $X_{\mathcal{D}_1}$ . In particular, M is a monoid quotient of $A^*$ , and T comes equipped with a biaction of M.

Now, for each $m \in M_\mathcal{D}$ , we let $\lambda_m$ and $\rho_m$ denote, respectively, the left and right components of the biaction of $M_{{\mathcal{D}}}$ on $X_{{\mathcal{D}}}$ . Then, there is also a biaction of M on $X_{\mathcal{D}_1}^*$ with continuous components given by:

(21) \begin{equation} \lambda_m^* (x_0 \dots x_{k-1}) = \lambda_m(x_0) \dots \lambda_m(x_{k-1}) \quad \text{and}\quad \rho_m^*(x_0 \dots x_{k-1}) = \rho_m(x_0) \dots \rho_m(x_{k-1})\end{equation}

for every $m \in M$ and $x_0 \dots x_{k-1}\in X_{\mathcal{D}_1}^*$ . Moreover, since T is invariant under the biaction of M on $X_{\mathcal{D}_1}$ , the T-generated free monoid $T^*$ is invariant under the above biaction of M on $X_{\mathcal{D}_1}^*$ , thereby yielding a monoid biaction of M on $T^*$ . Thus, we have a well-defined semidirect product $T^*{**}M$ given by this monoid biaction, cf. Section 2.5. Explicitly, the multiplication on $T^*{**}M$ is given by:

(22) \begin{equation} (\underline t, m)(\underline t', m') = (\rho_{m'}(t_0) \dots \rho_{m'}(t_{k-1})\lambda_m(t_0') \dots \lambda_m(t_{\ell-1}'), \, mm'),\end{equation}

for every $(\underline t, m) = (t_0 \dots t_{k-1}, m)$ and $(\underline t', m') = (t_0' \dots t_{\ell-1}', m')$ in $T^* {**} M$ .

Our next goal is to show that $T^*{**}M$ has a monoid quotient that is part of a BiM having $X_{\mathcal{V}(X_{\mathcal{D}_1})} \times X_{\mathcal{D}_0}$ as space component. This BiM is relevant because, via a suitable morphism, it exactly recognizes the lattice $\mathcal{V}\circ \mathcal{D}$ generated by the languages in $\mathcal{V}(X_{\mathcal{D}_1}) \odot \mathcal{D}_1$ and in $\mathcal{D}_0$ (cf. Theorem 6.9). We proceed in two steps. First, we show that the multiplication on $T^*{**}M$ naturally extends to a biaction of $T^*{**}M$ on $\beta(X_{\mathcal{D}_1}^*) \times X_{\mathcal{D}_0}$ with continuous components, so that the inclusion

\[T^*{**}M \rightarrowtail \beta(X_{\mathcal{D}_1}^*) \times X_{\mathcal{D}_0}\]

has a BiM structure (cf. Lemma 6.2 and Proposition 6.4). Then, for every Boolean subalgebra $\mathcal{W}$ of ${\sf Clopen}(X_{\mathcal{D}_1}^*)$ , we have a continuous quotient $\eta:\beta(X_{\mathcal{D}_1}^*) \twoheadrightarrow X_\mathcal{W}$ , and hence, a continuous quotient $(\eta \times id): \beta(X_{\mathcal{D}_1}^*) \times X_{\mathcal{D}_0}\twoheadrightarrow X_\mathcal{W} \times X_{\mathcal{D}_0}$ . The second step is then to show that, if $\mathcal{W} = \mathcal{V}(X_{\mathcal{D}_1})$ for some lp-variety $\mathcal{V}$ , then $(\eta \times id)$ defines a BiM quotient:

(23)

where N denotes the image of $T^*{**}M$ under $(\eta \times id)$ , that is, $N = (\eta \times id)[T^*{**}M]$ , and $\iota$ is the inclusion map. This is a consequence of Corollary 6.6. In Lemma 6.7, we show that N is given by a semidirect product $\eta[T^*]{**} M$ . The BiM $(\eta[T^*]{**} M, \, \iota, \,X_{\mathcal{V}(X_{\mathcal{D}_1})} \times X_{\mathcal{D}_0})$ will be denoted by ${\bf R}_{\mathcal{V} \circ \mathcal{D}}$ . For the reader’s convenience, we list the notation just introduced in the following table:

For the first part, we first observe that, using the equality $\rho_{m'}(t) = \lambda_t(m')$ valid for every $t, m' \in M_\mathcal{D}$ (and in particular, for every $t \in T$ and $m' \in M$ ), (22) may be rewritten as follows:

\[(\underline t, m)(\underline t', m') = (\lambda_{t_0}(m') \dots \lambda_{t_{k-1}}(m')\lambda_m(t_0') \dots \lambda_m(t_{\ell-1}'), \, mm').\]

This provides a natural way of defining an element $\widetilde\lambda_{(\underline t, m)}(\underline x, x_0)$ when $(\underline x, x_0)$ belongs to $X_{\mathcal{D}_1}^* \times X_{\mathcal{D}_0}$ , namely,

(24) \begin{equation} \widetilde\lambda_{(\underline t, m)}(\underline x, x_0) = (\lambda_{t_1}(x_0) \dots \lambda_{t_k}(x_0) \lambda_{m}(x_1) \dots \lambda_{m}(x_\ell),\, \lambda_{m}(x_0)),\end{equation}

where $\underline x = x_1 \dots x_\ell$ . Similarly, we define

(25) \begin{equation} \widetilde\rho_{(\underline t, m)}(\underline x, x_0) = (\rho_{m}(x_1) \dots \rho_{m}(x_\ell)\rho_{t_1}(x_0) \dots \rho_{t_k}(x_0),\, \rho_{m}(x_0)).\end{equation}

Lemma 6.2. The functions

\begin{align*} \widetilde\lambda_{(\underline t, m)}: X_{\mathcal{D}_1}^* \times X_{\mathcal{D}_0} & \to X_{\mathcal{D}_1}^* \times X_{\mathcal{D}_0} \\ (x_1 \dots x_\ell, x_0) & \mapsto (\lambda_{t_0}(x_0) \dots \lambda_{t_{k-1}}(x_0) \lambda_{m}(x_1) \dots \lambda_{m}(x_\ell), \, \lambda_{m}(x_0)), \end{align*}

and

\begin{align*} \widetilde\rho_{(\underline t, m)}: X_{\mathcal{D}_1}^* \times X_{\mathcal{D}_0} & \to X_{\mathcal{D}_1}^* \times X_{\mathcal{D}_0} \\ (x_1 \dots x_\ell, x_0) & \mapsto (\rho_{m}(x_1) \dots \rho_{m}(x_\ell)\rho_{t_0}(x_0) \dots \rho_{t_{k-1}}(x_0),\, \rho_{m}(x_0)), \end{align*}

define a biaction of $T^*{**}M$ on the space $X_{\mathcal{D}_1}^* \times X_{\mathcal{D}_0}$ .

Proof. This is a consequence of the fact that the family $\{\lambda_m, \rho_m\}_{m \in M_{\mathcal{D}}}$ defines a biaction of $M_\mathcal{D}$ on $X_\mathcal{D}$ . We illustrate the computations involved by showing that $\widetilde\lambda_{(\underline t, m)} \circ \widetilde \lambda_{(\underline t', m')} = \widetilde \lambda_{(\underline t, m)(\underline t', m')}$ for every $(\underline t, m), (\underline t', m') \in T^*{**}M$ and leave the rest for the reader. Let us write $\underline t = t_0 \dots t_{k-1}$ and $\underline t' = t'_0 \dots t'_{\ell-1}$ , and pick $(\underline x, x_0) \in X_{\mathcal{D}_1}^* \times X_{\mathcal{D}_0}$ . We will use the following notation: if $f_0, \dots, f_{i-1}$ are functions $P \to Q$ then, for every $p \in P$ , we denote by $\langle f_0, \dots, f_{i-1}\rangle(p)$ the word $f_0(p) \dots f_{i-1}(p)$ over Q. Then, we may compute

\begin{align*} \widetilde\lambda_{(\underline t, m)} \circ \widetilde \lambda_{(\underline t', m')} (\underline x, x_0) & = \widetilde\lambda_{(\underline t, m)} ( \langle \lambda_{t_0'}, \dots, \lambda_{t_{\ell-1}'}\rangle(x_0)\, \lambda^*_{m'}(\,\underline x\,), \lambda_{m'}(x_0)) \\ & = (\langle \lambda_{t_0}, \dots \lambda_{t_{k-1}}\rangle (\lambda_{m'}(x_0)) \lambda_m^*( \langle \lambda_{t_0'}, \dots, \lambda_{t_{\ell-1}'}\rangle(x_0)\, \lambda^*_{m'}(\,\underline x\,)), \lambda_m(\lambda_{m'}(x_0))) \\ & = (\langle \lambda_{t_0}\lambda_{m'}, \dots \lambda_{t_{k-1}}\lambda_{m'}, \lambda_m\lambda_{t_0'}, \dots, \lambda_m\lambda_{t_{\ell-1}'}\rangle(x_0)\, \lambda_m^*(\lambda^*_{m'}(\,\underline x\,)), \lambda_m(\lambda_{m'}(x_0))) \\ & \stackrel{(*)}{=} (\langle \lambda_{t_0 m'}, \dots \lambda_{t_{k-1} m'}, \lambda_{m t_0'}, \dots,\lambda_{m t_{\ell-1}'}\rangle(x_0)\, \lambda^*_{mm'}(\,\underline x\,), \lambda_{mm'}(x_0)) \\ & = \widetilde \lambda_{(\underline t, m)(\underline t', m')}(\underline x, x_0). \end{align*}

Here, every equality follows from the appropriate definitions, except for the equality marked with $(*)$ , which uses the fact that $\{\lambda_m\}_{m \in M_{\mathcal{D}}}$ is itself a left action.

We now see that the biaction defined in Lemma 6.2 further extends to a biaction of $T^*{**}M$ on $\beta(X_{\mathcal{D}_1}^*)\times X_{\mathcal{D}_0}$ with continuous components, thereby defining a BiM:

\[T^* {**} M \rightarrowtail \beta(X_{\mathcal{D}_1}^*) \times X_{\mathcal{D}_0}.\]

Indeed, that may be seen as a consequence of the following technical lemma.

Lemma 6.3. Let $L_0 \in \mathcal{D}_0$ and $L_1, \dots, L_n \in \mathcal{D}_1$ . Then, for every $\underline t = t_0 \dots t_{k-1} \in T^*$ and $m \in M$ , the following equalities hold

(26) \begin{equation} \widetilde \lambda_{(\underline t, m)}^{-1} (\widehat L_1 \times \dots \times \widehat L_n \times \widehat L_0) = \begin{cases} \lambda_m^{-1}(\widehat L_{k+1}) \times \dots \times \lambda_m^{-1}(\widehat L_n) \times (\lambda_m^{-1}(\widehat L_0) \cap \bigcap_{i = 1}^k \lambda_{t_{i-1}}^{-1}(\widehat L_i)), \quad\!\!\! \text{if $k \leq n$;}\\ \emptyset, \quad\!\!\! \text{ otherwise;} \end{cases} \end{equation}

and

\begin{align*}\widetilde \rho_{(\underline t, m)}^{\,-1} (\widehat L_1 \times \dots \times \widehat L_n \times \widehat L_0) = \begin{cases} \rho_m^{-1}(\widehat L_{1}) \times \dots \times \rho_m^{-1}(\widehat L_{n-k}) \times (\rho_m^{-1}(\widehat L_0) \cap \bigcap_{i = 1}^k \rho_{t_{i-1}}^{-1}(\widehat L_{n-k+i})), \quad \text{if $k \leq n$;}\\ \emptyset, \quad \text{ otherwise.} \end{cases} \end{align*}

(Here, we adopt the convention that $\widehat L_{k+1} \times \dots \times \widehat L_n = \{\varepsilon\} = \widehat L_{1} \times \dots \times \widehat L_{n-k}$ when $k = n$ .)

In particular, the biaction of $T^*{**}M$ on $X_{\mathcal{D}_1}^* \times X_{\mathcal{D}_0}$ has continuous components.

Proof. We will only show the first equality, as the second one is analogous. First note that the pair $(\underline x, x_0) \in X_{\mathcal{D}_1}^* \times X_{\mathcal{D}_0}$ belongs to the left-hand side of (26) if and only if

(27) \begin{equation} \lambda_{t_0}(x_0) \dots \lambda_{t_{k-1}}(x_0) \lambda^*_{m}(\underline x) \in \widehat L_1 \times \dots \times \widehat L_n \quad\text{ and }\quad \lambda_{m}(x_0) \in \widehat L_0. \end{equation}

Thus, the left-hand side of (26) is empty unless $k \leq n$ . When that is the case, (27) is equivalent to

\[\lambda_{t_{i-1}}(x_0) \in \widehat L_i, \text{ for $i = 1, \dots, k$,} \quad \lambda_m^*(\underline x) \in \widehat L_{k+1} \times \dots \times \widehat L_n, \quad \text{and}\quad \lambda_{m}(x_0) \in \widehat L_0,\]

that is,

\[(\underline x, x_0) \in \lambda_m^{-1}(\widehat L_{k+1}) \times \dots \times \lambda_m^{-1}(\widehat L_n) \times (\lambda_m^{-1}(\widehat L_0) \cap \bigcap_{i = 1}^k \lambda_{t_{i-1}}^{-1}(\widehat L_i)).\]

This shows that (26) holds.

Proposition 6.4. The biaction of $T^* {**} M$ on $X_{\mathcal{D}_1}^* \times X_{\mathcal{D}_0}$ extends to a biaction of $T^*{**}M$ on $\beta(X_{\mathcal{D}_1}^*) \times X_{\mathcal{D}_0}$ with continuous components, and the inclusion $T^* {**} M \rightarrowtail \beta(X_{\mathcal{D}_1}^*) \times X_{\mathcal{D}_0}$ admits a BiM structure with respect to such biaction.

Proof. By Lemma 6.3, the biaction of $T^*{**}M$ on $X_{\mathcal{D}_1}^* \times X_{\mathcal{D}_0}$ has continuous components. Therefore, since $\beta(X_{\mathcal{D}_1}^*) \times X_{\mathcal{D}_0}$ and $\beta_0(X_{\mathcal{D}_1}^* \times X_{\mathcal{D}_0})$ are homeomorphic (cf. Corollary 3.7), each of its components may be uniquely extended to a continuous endofunction of $\beta(X_{\mathcal{D}_1}^*) \times X_{\mathcal{D}_0}$ . Since $X_{\mathcal{D}_1}^* \times X_{\mathcal{D}_0}$ is dense in $\beta(X_{\mathcal{D}_1}^*) \times X_{\mathcal{D}_0}$ , it follows that $T^* {**} M \rightarrowtail \beta(X_{\mathcal{D}_1}^*) \times X_{\mathcal{D}_0}$ is a BiM, as required.

The extensions of the maps defined in Lemma 6.2 to continuous endofunctions of $\beta(X_{\mathcal{D}_1}^*) \times X_{\mathcal{D}_0}$ will also be denoted by $\widetilde\lambda_{(\underline t, m)}$ and by $\widetilde\rho_{(\underline t, m)}$ , respectively. We will now give an expression for $\widetilde\lambda_{(\underline t, m)} (\gamma,x_0)$ and $\widetilde\rho_{(\underline t, m)} (\gamma, x_0)$ when $(\gamma, x_0) \in \beta(X_{\mathcal{D}_1}^*) \times X_{\mathcal{D}_0}$ . Recall from Section 4.1 that the embedding $X_{\mathcal{D}_1}^*\rightarrowtail\beta(X_{\mathcal{D}_1}^*)$ defines a BiM. We let $\{\widetilde \ell_w,\widetilde r_w\}_{w \in X_{\mathcal{D}_1}^*}$ be the components of the biaction of $X_{\mathcal{D}_1}^*$ on $\beta(X_{\mathcal{D}_1}^*)$ . On the other hand, since $\lambda_m$ and $\rho_m$ are continuous functions and the topological space $X^*_{\mathcal{D}_1}$ is the union over $n \ge 0$ of the product spaces $X_{\mathcal{D}_1}^n$ , the maps $(\lambda_m^*)^{-1}$ and $(\rho_m^*)^{-1}$ define endomorphisms of ${\sf Clopen}(X_{\mathcal{D}_1}^*)$ . Therefore, $\lambda_m^*$ and $\rho_m^*$ extend to continuous functions $\beta(\lambda_m^*): \beta(X_{\mathcal{D}_1}^*) \to\beta(X_{\mathcal{D}_1}^*)$ and $\beta(\rho_m^*): \beta(X_{\mathcal{D}_1}^*) \to\beta(X_{\mathcal{D}_1}^*)$ which are, respectively, the duals of $(\lambda_m^*)^{-1}$ and of $(\rho_m^*)^{-1}$ .

We now observe that, given $(\underline x, x_0) \in X_{\mathcal{D}_1}^* \times X_{\mathcal{D}_0}$ , we may rewrite (24) as follows:

\[\widetilde\lambda_{(\underline t, m)}(\underline x, x_0) = (\widetilde \ell_{\lambda_{\underline t}(x_0)}\circ \lambda_m^*(\underline x),\, \lambda_{m}(x_0)),\]

where $\lambda_{\underline t}(x_0) = \lambda_{t_0}(x_0)\dots\lambda_{t_{k-1}}(x_0)$ . Therefore, the function $f:\beta(X_{\mathcal{D}_1}^*) \times X_{\mathcal{D}_0} \to \beta(X_{\mathcal{D}_1}^*) \times X_{\mathcal{D}_0}$ defined by:

\[f(\gamma, x_0) = (\widetilde \ell_{\lambda_{\underline t}(x_0)}\circ \beta(\lambda_m^*)(\gamma),\, \lambda_{m}(x_0)),\]

for every $(\gamma, x_0) \in \beta(X_{\mathcal{D}_1}^*) \times X_{\mathcal{D}_0}$ , is an extension of the restriction $\widetilde\lambda_{(\underline t, m)}:X_{\mathcal{D}_1}^* \times X_{\mathcal{D}_0} \to X_{\mathcal{D}_1}^* \times X_{\mathcal{D}_0}$ . Since $X_{\mathcal{D}_1}^* \times X_{\mathcal{D}_0}$ is a dense subspace of $\beta(X_{\mathcal{D}_1}^*) \times X_{\mathcal{D}_0}$ and $\widetilde\lambda_{(\underline t, m)}: \beta(X_{\mathcal{D}_1}^*) \times X_{\mathcal{D}_0} \to \beta(X_{\mathcal{D}_1}^*) \times X_{\mathcal{D}_0}$ is, up to isomorphism, the map dual to $\widetilde\lambda^{-1}_{(\underline t, m)}: {\sf Clopen} (X_{\mathcal{D}_1}^* \times X_{\mathcal{D}_0}) \to {\sf Clopen} (X_{\mathcal{D}_1}^* \times X_{\mathcal{D}_0})$ , in order to show that f coincides with $\widetilde\lambda_{(\underline t, m)}: \beta(X_{\mathcal{D}_1}^*) \times X_{\mathcal{D}_0} \to \beta(X_{\mathcal{D}_1}^*) \times X_{\mathcal{D}_0}$ , it suffices to show that $f^{-1}$ induces a well-defined map ${\sf Clopen} (\beta(X_{\mathcal{D}_1}^*)\times X_{\mathcal{D}_0}) \to {\sf Clopen} (\beta(X_{\mathcal{D}_1}^*) \times X_{\mathcal{D}_0})$ (thus, in particular, f is continuous) that is isomorphic to $\widetilde\lambda^{-1}_{(\underline t, m)}$ . Given a clopen subset $K\subseteq X_{\mathcal{D}_1}^*$ , we let cl(K) denote the closure of K in $\beta(X_{\mathcal{D}_1}^*)$ so that the assignment $K \mapsto cl(K)$ establishes an isomorphism between ${\sf Clopen}(X_{\mathcal{D}_1}^*)$ and ${\sf Clopen}(\beta(X_{\mathcal{D}_1}^*))$ . Let us fix $L_1, \dots, L_n \in\mathcal{D}_1$ and $L_0 \in \mathcal{D}_0$ . Then, for every $\gamma \in \beta(X_{\mathcal{D}_1}^*) =X_{{\sf Clopen}(X_{\mathcal{D}_1}^*)}$ and $x_0 \in X_{\mathcal{D}_0}$ , we have

\begin{align*} (\gamma, x_0) \in f^{-1} (cl(\widehat L_1 \times \dots \times \widehat L_n) \times \widehat L_0) & \iff (\widetilde \ell_{\lambda_{\underline t}(x_0)}\circ \beta(\lambda_m^*)(\gamma),\, \lambda_{m}(x_0)) \in cl(\widehat L_1 \times \dots \times \widehat L_n) \times \widehat L_0 \\ & \iff \widehat L_1 \times \dots \times \widehat L_n \in \widetilde \ell_{\lambda_{\underline t}(x_0)}\circ \beta(\lambda_m^*)(\gamma) \quad \text{and} \quad x_0 \in \lambda_m^{-1}(\widehat L_0) \\ & \iff (\lambda_m^*)^{-1}(\lambda_{\underline t}(x_0)^{-1}(\widehat L_1 \times \dots \times \widehat L_n)) \in \gamma \quad \text{and} \quad x_0 \in \lambda_m^{-1}(\widehat L_0),\end{align*}

where $\lambda_{\underline t}(x_0)^{-1}(\widehat L_1 \times \dots\times \widehat L_n)$ denotes the left quotient of $\widehat L_1\times \dots \times \widehat L_n \subseteq X_{\mathcal{D}_1}^*$ by the word $\lambda_{\underline t}(x_0) \in X_{\mathcal{D}_1}^*$ . Now, either $k \leq n$ and $\lambda_{\underline t}(x_0) \in \widehat L_1 \times \dots \times\widehat L_k$ , and then we have

\[(\lambda_m^*)^{-1}(\lambda_{\underline t}(x_0)^{-1}(\widehat L_1 \times \dots \times \widehat L_k)) = \lambda_m^{-1}(\widehat L_{k+1}) \times \dots \times \lambda_m^{-1}(\widehat L_{n}),\]

or else,

\[(\lambda_m^*)^{-1}(\lambda_{\underline t}(x_0)^{-1}(\widehat L_1 \times \dots \times \widehat L_k)) = \emptyset.\]

Therefore, $f^{-1} (cl(\widehat L_1 \times \dots \times \widehat L_n)\times \widehat L_0) $ is nonempty if and only if $k \leq n$ and, in that case, it is given by:

(28) \begin{equation} f^{-1} (cl(\widehat L_1 \times \dots \times \widehat L_n) \times \widehat L_0) = cl(\lambda_m^{-1}(\widehat L_{k+1}) \times \dots \times \lambda_m^{-1}(\widehat L_{n})) \times (\lambda_m^{-1}(\widehat L_0) \cap \bigcap_{i = 1}^k \lambda_{t_{i-1}}^{-1}(\widehat L_i)).\end{equation}

By Lemma 6.3, we may then conclude that $f^{-1}$ induces a function isomorphic to $\widetilde\lambda^{-1}_{(\underline t, m)}$ , as claimed.

Proceeding similarly with $\widetilde\rho_{(\underline t, m)}$ , we may conclude that, for every $(\gamma, x_0) \in \beta(X_{\mathcal{D}_1}^*) \times X_{\mathcal{D}_0}$ , the following equalities hold

(29) \begin{equation} \widetilde\lambda_{(\underline t, m)}(\gamma, x_0) = (\widetilde \ell_{\lambda_{\underline t}(x_0)}\circ \beta(\lambda_m^*)(\gamma),\, \lambda_{m}(x_0)) \quad \text{and} \quad\widetilde\rho_{(\underline t, m)}(\gamma, x_0) = (\widetilde r_{\rho_{\underline t}(x_0)} \circ \beta(\rho_m^*) (\gamma),\, \rho_{m}(x_0)),\end{equation}

As promised, we now prove that the continuous quotient $(\eta \times id) :\beta(X_{\mathcal{D}_1}^*) \times X_{\mathcal{D}_0}\twoheadrightarrow X_{\mathcal{V}(X_{\mathcal{D}_1})} \times X_{\mathcal{D}_0}$ induces a BiM quotient as in (23). This is a consequence of the following lemma:

Lemma 6.5. If $L_1, \dots, L_n \in \mathcal{D}_1$ are such that $\widehat L_1 \times \dots \times \widehat L_n$ belongs to $\mathcal{V}(X_{\mathcal{D}_1}) \subseteq {\sf Clopen}(X_{\mathcal{D}_1}^*)$ then, for every $0 \leq k \leq n$ , the language $\lambda_m^{-1}(\widehat L_{k+1}) \times \dots \lambda_m^{-1}(\widehat L_n)$ also belongs to $\mathcal{V}(X_{\mathcal{D}_1})$ .

Proof. Let us denote $K = \lambda_m^{-1}(\widehat L_{k+1}) \times \dots \times \lambda_m^{-1}(\widehat L_{n})$ . Since, by Corollary 4.8, we have a continuous function $\lambda_m: X_{\mathcal{D}_1} \to X_{\mathcal{D}_1}$ , by Lemma 3.8, we have $K \in \mathcal{V}(X_{\mathcal{D}_1}) $ provided the language $L = \widehat L_{k+1} \times \dots \times \widehat L_{n} \subseteq X_{\mathcal{D}_1}^*$ belongs to $\mathcal{V}(X_{\mathcal{D}_1})$ . To show this, we pick a word $w \in \widehat L_1 \times \dots \times \widehat L_{k}$ . Then, since $ \widehat L_1 \times \dots \times \widehat L_n$ belongs to $\mathcal{V}(X_{\mathcal{D}_1})$ , by Lemma 3.1, the quotient $w^{-1}( \widehat L_1 \times \dots \times \widehat L_n)$ also does. But this quotient is precisely L. Indeed, for every word $v \in X_{\mathcal{D}_1}^*$ , we have

\[v \in w^{-1}( \widehat L_1 \times \dots \times \widehat L_n) \iff w v \in \widehat L_1 \times \dots \times \widehat L_n \iff v \in \widehat L_{k+1} \times \dots \times \widehat L_{n},\]

where the last equivalence follows from the choice of w.

Corollary 6.6. Let $\mathcal{V}$ be an lp-variety of languages. Then, for every $(\underline t, m) \in T^* {**} M$ , there are continuous functions $\lambda_{(\underline t, m)}$ and $\rho_{(\underline t, m)}$ making the following diagrams commute:

(30)

where $\widetilde \lambda_{(\underline t, m)}$ and $\widetilde \rho_{(\underline t, m)}$ are, respectively, the left and right components at $(\underline t, m)$ of the biaction of $T^*{**}M$ on $\beta(X_{\mathcal{D}_1}^*)\times X_{\mathcal{D}_0}$ (cf. Proposition 6.4). Moreover, the family $\{\lambda_{(\underline t, m)}, \rho_{(\underline t, m)}\}_{(\underline t, m)}$ defines a biaction of $T^*{**}M$ on $X_{\mathcal{V}(X_{\mathcal{D}_1})} \times X_{\mathcal{D}_0}$ .

Proof. By Lemmas 6.3 and 6.5, the Boolean algebra homomorphism

\[\widetilde \lambda_{(\underline t, m)}^{-1}: {\sf Clopen}(\beta(X_{\mathcal{D}_1}^*) \times X_{\mathcal{D}_0}) \to {\sf Clopen}(\beta(X_{\mathcal{D}_1}^*) \times X_{\mathcal{D}_0})\]

restricts and co-restricts to a Boolean algebra homomorphism

\[{\sf Clopen}(X_{\mathcal{V}(X_{\mathcal{D}_1})} \times X_{\mathcal{D}_0}) \to {\sf Clopen}(X_{\mathcal{V}(X_{\mathcal{D}_1})} \times X_{\mathcal{D}_0}).\]

Let $\lambda_{(\underline t, m)}$ be the dual of the latter. Then, $\lambda_{(\underline t, m)}$ is a continuous function that makes the left-hand side of (30) commute. Existence of $\rho_{(\underline t, m)}$ is shown similarly.

The fact that $\{\lambda_{(\underline t, m)}, \rho_{(\underline t, m)}\}_{(\underline t, m)}$ defines a biaction of $T^*{**}M$ on $X_{\mathcal{V}(X_{\mathcal{D}_1})} \times X_{\mathcal{D}_0}$ follows from having that $\{\widetilde\lambda_{(\underline t, m)}, \widetilde\rho_{(\underline t, m)}\}_{(\underline t, m)}$ defines a biaction of $T^*{**}M$ on $\beta(X_{\mathcal{D}_1}^*) \times X_{\mathcal{D}_0}$ , together with surjectivity of $(\eta \times id)$ and commutativity of the diagrams in (30).

By Corollary 6.6, we have that the restriction of $(\eta \times id): \beta(X_{\mathcal{D}_1}^*) \times X_{\mathcal{D}_0} \to X_{\mathcal{V}(X_{\mathcal{D}_1})} \times X_{\mathcal{D}_0}$ to $T^* {**} M$ is a morphism of sets with $(T^*{**}M)$ -biactions. Therefore, $N = (\eta \times id)[T^*{**}M]$ comes equipped with a monoid structure induced by the monoid structure of $T^*{**} M$ , and we have a BiM ${\bf R}_{\mathcal{V} \circ \mathcal{D}} = (N,\, \iota,\,X_{\mathcal{V}(X_{\mathcal{D}_1})} \times X_{\mathcal{D}_0})$ which is a quotient of $(T^*{**} M \rightarrowtail X_{\mathcal{D}_1}^* \times X_{\mathcal{D}_0})$ as in (23). We will now give a precise description of the monoid N. Note that the underlying set of N is $\eta[T^*] \times M$ . Moreover, since $\mathcal{V}$ is an lp-variety, by Lemma 3.1, $\mathcal{V}(X_{\mathcal{D}_1}^*)$ is closed under quotients and thus, $\eta[X_{\mathcal{D}_1}^*]$ is a monoid and the restriction and co-restriction of $\eta$ to a map $X_{\mathcal{D}_1}^* \twoheadrightarrow \eta[X_{\mathcal{D}_1}^*]$ is a monoid quotient. Since $T^*$ is a submonoid of $X_{\mathcal{D}_1}^*$ , we have that $\eta[T^*]$ is also a monoid. We will show that M biacts on $\eta[T^*]$ and that N is the semidirect product $\eta[T^*] {**}M$ defined by this biaction (Lemma 6.7). In fact, we show the following slightly more general fact: M biacts on $\eta[X_{\mathcal{D}_1}^*]$ and the ensuing semidirect product $\eta[X_{\mathcal{D}_1}^*]{**} M$ is a monoid quotient of $X^*_{\mathcal{D}_1}{**}M$ (recall that M biacts on the free monoid $X_{\mathcal{D}_1}^*$ so that we have a semidirect product $X_{\mathcal{D}_1}^*{**} M$ of which $T^*{**} M$ is a submonoid, cf. (21)).

Lemma 6.7. For every $m \in M$ , the assignments

\[\ell_m: \eta[X_{\mathcal{D}_1}^*] \to \eta[X_{\mathcal{D}_1}^*], \qquad \eta(\underline x) \mapsto \eta(\lambda_m^*(\underline x))\]

and

\[r_m: \eta[X_{\mathcal{D}_1}^*] \to \eta[X_{\mathcal{D}_1}^*], \qquad \eta(\underline x) \mapsto \eta(\rho_m^*(\underline x))\]

are well-defined functions, which define a monoid biaction of M on $\eta[X_{\mathcal{D}_1}^*]$ . In particular, the monoid N is a semidirect product $\eta[X_{\mathcal{D}_1}^*] {**} M$ whose multiplication is defined by:

(31) \begin{equation} (\eta(\underline x), m) (\eta(\underline x'), m') = (\eta(\rho_{m'}^*(\underline x)\lambda_{m}^*(\underline x')), mm') \end{equation}

for every $(\underline x, m), (\underline x', m') \in X_{\mathcal{D}_1}^*{**} M$ , and $(\eta \times id): X_{\mathcal{D}_1}^*{**} M \twoheadrightarrow \eta[X_{\mathcal{D}_1}^*]{**} M$ is a monoid quotient.

Proof. We first show that $\ell_m$ is well defined. Let $\underline x, \underline x' \in X_{\mathcal{D}_1}^*$ be such that $\eta(\underline x) = \eta(\underline x')$ . We need to show that $\eta(\lambda_m^*(\underline x)) = \eta(\lambda_m^*(\underline x'))$ . By definition of dual map, having $\eta(\underline x) = \eta(\underline x')$ is equivalent to having that, for every $K \in \mathcal{V}(X_{\mathcal{D}_1})$ ,

\[\underline x \in K \iff \underline x' \in K.\]

Since $\lambda_m$ is a continuous function and $\mathcal{V}$ is an lp-strain of languages, by Lemma 3.9, we then have that, for every $K \in \mathcal{V}(X_{\mathcal{D}_1})$ ,

\[\underline x \in (\lambda^*_m)^{-1}(K) \iff \underline x' \in (\lambda^*_m)^{-1}(K),\]

and this is equivalent to having $\eta(\lambda_m^*(\underline x)) = \eta(\lambda_m^*(\underline x'))$ as required.

Similarly, one can show that $r_m$ is well defined. The fact that $\{\ell_m, r_m\}_{m \in M}$ defines a monoid biaction is inherited from the fact that $\{\lambda_m, \rho_m\}_{m \in M}$ defines a monoid biaction.

Finally, the fact that the multiplication on $\eta[X_{\mathcal{D}_1}^*]{**}M$ is given by (31) is a straightforward consequence of the definition of semidirect products (cf. Section 2.5). To conclude that $\eta \times id$ is a monoid quotient, it suffices to observe that (31) may be rewritten as:

\[ (\underline x, m)(\underline x', m') = (\eta\times id) (\rho_{m'}^*(\underline x)\lambda^*_m(\underline x'), mm') = (\eta\times id) ((\underline x, m)(\underline x', m')). \]

We just finished proving the following:

Proposition 6.8. Let $\mathcal{V}$ be an lp-variety of languages and $\mathcal{D} \subseteq \mathcal{P}((A \times 2)^*)$ be a Boolean subalgebra closed under quotients that is generated by the union of its retracts $\mathcal{D}_0 \subseteq \mathcal{P}(A^*)$ and $\mathcal{D}_1 \subseteq \mathcal{P}(A^* \otimes {\mathbb{N}})$ . Then, we have a quotient of BiMs:

We remark that, by (29) and by Corollary 6.6, we have that the biaction of $\eta[T^*]{**}M$ on $X_{\mathcal{V}(X_{\mathcal{D}_1})} \times X_{\mathcal{D}_0}$ is defined by:

(32) \begin{equation} \lambda_{(\eta(\underline t), m)}(\eta(\gamma), x_0) = (\eta\circ \widetilde \ell_{\lambda_{\underline t}(x_0)} \circ \beta(\lambda_m^*)(\gamma), \lambda_m(x_0))\end{equation}

and

(33) \begin{equation} \rho_{(\eta(\underline t), m)}(\eta(\gamma), x_0) = (\eta \circ \widetilde r_{\rho_{\underline t}(x_0)} \circ \beta(\rho_m^*) (\gamma),\, \rho_{m}(x_0))\end{equation}

for every $(\underline t, m) \in T^* {**}M$ and $(\gamma, x_0) \in\beta(X_{\mathcal{D}_1}^*) \times X_{\mathcal{D}_0}$ . (Recall that, if $\underline t =t_0 \dots t_{k-1}$ then $\lambda_{\underline t}(x_0) =\lambda_{t_0}(x_0) \dots \lambda_{t_{k-1}}(x_0) \in X_{\mathcal{D}_1}^*$ , the maps $\widetilde \ell_w, \widetilde r_w: \beta(X_{\mathcal{D}_1}^*) \to\beta(X_{\mathcal{D}_1}^*)$ are the unique continuous extensions of the left and right multiplication in the free monoid $X_{\mathcal{D}_1}^*$ by the word $w \in X_{\mathcal{D}_1}^*$ , and $\beta(\lambda_m^*), \beta(\rho_m^*):\beta(X_{\mathcal{D}_1}^*) \to \beta(X_{\mathcal{D}_1}^*)$ are the unique continuous extensions of the maps $\lambda_m^*, \rho_m^*: X_{\mathcal{D}_1}^* \to X_{\mathcal{D}_1}^*$ .)

We are ready to prove the main result of this section. The reader should recall the notation listed in Table 1.

Table 1: Notation

Theorem 6.9. Let $\mathcal{V}$ be an lp-variety of languages and $\mathcal{D} \subseteq \mathcal{P}((A \times 2)^*)$ be a Boolean subalgebra closed under quotients that is generated by the union of its retracts $\mathcal{D}_0 \subseteq \mathcal{P}(A^*)$ and $\mathcal{D}_1 \subseteq \mathcal{P}(A^* \otimes {\mathbb{N}})$ . Let $h: A^* \to \eta[T^*] {**} M$ be the homomorphism defined by:

\[h(a) = (\eta\circ \pi(a,1), \pi(a)).\]

Then, a language is recognized by the BiM ${\bf R}_{\mathcal{V} \circ \mathcal{D}} = (\eta[T^*]{**}M,\, \iota, \,X_{\mathcal{V}(X_{\mathcal{D}_1})} \times X_{\mathcal{D}_0})$ via h if and only if it belongs to $\mathcal{V} \circ \mathcal{D}$ .

Proof. We first show, by induction on the length of words, that $h(w) = (\eta \circ \tau_{\mathcal{D}_1}(w), \pi(w))$ , for every $w \in A^*$ . This is trivially the case if w is the empty word. Let $w \in A^*$ and $a \in A$ be a letter. Then, by induction hypothesis, we have

\[h(wa) = h(w)h(a) = (\eta \circ \tau_{\mathcal{D}_1}(w), \pi(w))(\eta\circ \tau_{\mathcal{D}_1}(a), \pi(a)),\]

and by definition of the multiplication on $(\eta[T^*]{**}M)$ (cf. (31)), it follows that

\[h(wa) = (\eta(\rho_{\pi(a)}^*(\tau_{\mathcal{D}_1}(w)) \lambda_{\pi(w)}^*(\tau_{\mathcal{D}_1}(a))), \pi(wa)).\]

Now, since $\tau_{\mathcal{D}_1}(w) = \pi(w, 0) \dots \pi(w,\left\lvert w\right\rvert - 1)$ and $\tau_{\mathcal{D}_1}(a) = \pi(a,1)$ , we have

\[\rho_{\pi(a)}^*(\tau_{\mathcal{D}_1}(w)) = \rho_{\pi(a)}\circ\pi(w, 0) \dots \rho_{\pi(a)}\circ \pi(w,\left\lvert w\right\rvert - 1) = \pi(wa, 0) \dots \pi(wa,\left\lvert w\right\rvert - 1)\]

and

\[\lambda_{\pi(w)}^*(\tau_{\mathcal{D}_1}(a)) = \lambda_{\pi(w)}(\pi(a,1)) = \pi(w(a,1)) = \pi(wa, \left\lvert w\right\rvert ).\]

Therefore,

\[h(wa) = (\eta (\pi(wa, 0) \dots \pi(wa,\left\lvert w\right\rvert - 1) \pi(wa, \left\lvert w\right\rvert)), \pi(wa)) = (\eta \circ \tau_{\mathcal{D}_1}(wa), \pi(wa)).\]

Now, the languages recognized by $(\eta[T^*]{**}M\rightarrowtail X_{\mathcal{V}(X_{\mathcal{D}_1})} \times X_{\mathcal{D}_0})$ via h are precisely the finite unions of languages of the form:

\[h^{-1}(\widehat K \times \widehat L) = \tau_{\mathcal{D}_1}^{-1}\circ \eta^{-1}(\widehat K) \cap \pi^{-1}(\widehat L)\]

for some $K\in \mathcal{V}(X_{\mathcal{D}_1})$ and $L \in \mathcal{D}_0$ . Since $\eta$ and $\pi$ are, respectively, dual to the embeddings $\mathcal{V}(X_{\mathcal{D}_1}) \rightarrowtail {\sf Clopen}(X_{\mathcal{D}_1}^*)$ and $\mathcal{D} \rightarrowtail \mathcal{P}((A \times 2)^*)$ , it follows that

\[h^{-1}(\widehat K \times \widehat L) = \tau_{\mathcal{D}_1}^{-1}(K) \cap L.\]

Thus, by Definition 6.1 of $\mathcal{V}(X_{\mathcal{D}_1}) \odot \mathcal{D}_1$ , the languages recognized by h are precisely the lattice combinations of languages of $\mathcal{V}(X_{\mathcal{D}_1}) \odot \mathcal{D}_1$ and of $\mathcal{D}_0$ .

In particular, since the set of languages recognized by a BiM via a fixed morphism is a Boolean algebra closed under quotients (cf. Section 4.1), it follows that the lattice $\mathcal{V} \circ \mathcal{D}$ is, in fact, a Boolean algebra closed under quotients.

Corollary 6.10. Let $\mathcal{V}$ be an lp-variety of languages and $\mathcal{D} \subseteq \mathcal{P}((A \times 2)^*)$ be a Boolean subalgebra closed under quotients that is generated by the union of its retracts $\mathcal{D}_0 \subseteq \mathcal{P}(A^*)$ and $\mathcal{D}_1 \subseteq \mathcal{P}(A^* \otimes {\mathbb{N}})$ . Then, the lattice $\mathcal{V} \circ \mathcal{D}$ is a Boolean algebra closed under quotients.

Finally, for the reader who is familiar with Almeida and Weil’s work (Almeida andWeil Reference Almeida and Weil1995), we sketch here the relationship between their Decomposition Theorem for semidirect products (Almeida and Weil Reference Almeida and Weil1995, Theorem 5.1) and our Theorem 6.9, thereby exhibiting in which sense our result may be seen as a generalization of their. We start by stating (Almeida and Weil Reference Almeida and Weil1995, Theorem 5.1) in a language that is closer to that used in this paper. Given pseudovarieties $\bf V$ and $\bf W$ of (finite) monoids, we will denote by $\mathcal{V}$ and $\mathcal{W}$ , respectively, the varieties of (regular) languages determined by $\bf V$ and $\bf W$ under Eilenberg’s correspondence (see Eilenberg Reference Eilenberg1976, Chapter VII, Section 3). It follows from Almeida (Reference Almeida1994, Theorem 3.6.1) that, for any pseudovariety $\bf V$ , the underlying topological space of the free A-generated pro-V monoid $ \overline{\Omega}_A{\bf V}$ is homeomorphic to the Stone dual $X_{\mathcal{V}(A)}$ of $\mathcal{V}(A)$ , and the canonical embedding $\iota: A \rightarrowtail \overline{\Omega}_A{\bf V}$ is a restriction of the continuous quotient $\pi:\beta(A^*)\twoheadrightarrow X_{\mathcal{V}(A)}$ dual to the embedding $\mathcal{V}(A)\rightarrowtail \mathcal{P}(A^*)$ . Moreover, the co-restriction $A^*\twoheadrightarrow \pi[A^*]$ of $\pi$ is the syntactic morphism of $\mathcal{V}(A)$ , and $A^* \twoheadrightarrow \pi[A^*] \rightarrowtail X_{\mathcal{V}(A)}$ its syntactic BiM-stamp (cf. Section 4.1). Finally, we will say that a homomorphism $\theta:A^* \to Z$ into a profinite monoid Z recognizes the language $L \subseteq A^*$ provided $L = \theta^{-1}(V)$ for some clopen subset $V \subseteq Z$ . We may then state Almeida and Weil’s Decomposition Theorem for semidirect products as follows:

Theorem 6.11. Let $\bf V$ and $\bf W$ be two pseudovarieties of finite monoids, and A be a finite alphabet. Footnote 1 We write $Y = X_{\mathcal{W}(A)} \times A \times X_{\mathcal{W}(A)}$ and denote by $\eta': Y^* \to X_{\mathcal{V}(Y)}$ the restriction of the dual of the embedding $\mathcal{V}(Y) \rightarrowtail {\sf Clopen}(Y^*)$ . Then, the right and left actions of the profinite monoid $X_{\mathcal{W}(A)}$ on the set Y given by $x \cdot (x_1, a, x_2) = (xx_1, a, x_2)$ and $(x_1, a, x_2) \cdot x = (x_1, a, x_2x)$ can be extended to a continuous biaction of $X_{\mathcal{W}(A)}$ on $X_{\mathcal{V}(Y)}$ , which satisfies

\[x \cdot \eta'(y_1 \dots y_n)\cdot x' = \eta'((xy_1x') \dots (xy_nx')),\]

for every $x, x' \in X_{\mathcal{D}_1}$ and $y_1 \dots y_n \in Y^*$ . The resulting two-sided semidirect product $X_{\mathcal{V}(Y)} {**} X_{\mathcal{W}(A)}$ is a profinite monoid (for the product topology of its underlying space $X_{\mathcal{V}(Y)} \times X_{\mathcal{W}(A)}$ ).

Moreover, if $\pi': A^* \to X_{\mathcal{W}(A)}$ is the restriction of the dual of the embedding $\mathcal{W}(A) \rightarrowtail \mathcal{P}(A^*)$ , then letting $\theta(a) = (\eta'(\pi'(1),a,\pi'(1)), \pi'(a))$ defines a unique monoid homomorphism $\theta: A^* \to X_{\mathcal{V}(Y)} {**} X_{\mathcal{W}(A)}$ that recognizes exactly the lattice combinations of languages recognized by a two-sided semidirect product $M{**}N$ with $M \in \bf V$ and $N \in \bf W$ .

Now, given pseudovarieties of monoids $\bf V$ and $\bf W$ , we take for $\mathcal{D}_0 \subseteq \mathcal{P}(A^*)$ the Boolean algebra $\mathcal{W}(A)$ and we let $\pi_0: A^* \to X_{\mathcal{D}_0}$ be the restriction of the dual of $\mathcal{D}_0\rightarrowtail \mathcal{P}(A^*)$ . In order to suitably define $\mathcal{D}_1$ , we will bear in mind the isomorphism $\xi: A^* \otimes {\mathbb{N}} \to A^* \times A \times A^*$ , which identifies a marked word (w, i) with the triple $(a_0 \dots a_{i-1}, a_i, a_{i+1}\dots a_{n-1})$ , for $w = a_0 \dots a_{n-1}$ . Accordingly, we let $\widetilde \pi_0: A^* \times A \times A^* \to X_{\mathcal{D}_0} \times A \times X_{\mathcal{D}_0}$ be defined by $\widetilde\pi_0(u, a, v) = (\pi_0(u), a, \pi_0(v))$ and $\widetilde\pi_1: A^*\otimes {\mathbb{N}} \to X_{\mathcal{D}_0} \times A \times X_{\mathcal{D}_0}$ be defined by $\widetilde\pi_1(w, i) = \widetilde \pi_0(a_0 \dots a_{i-1}, a_i,a_{i+1}\dots a_{n-1})$ , for $w = a_0 \dots a_{n-1}$ . Finally, $\mathcal{D}_1\rightarrowtail \mathcal{P}(A^* \otimes {\mathbb{N}})$ is the embedding whose dual is the co-restriction $\pi_1$ of $\widetilde\pi_1$ to the closure of its image, so that $\widetilde \pi_1 = e \circ \pi_1$ , where $e: X_{\mathcal{D}_1}\rightarrowtail X_{\mathcal{D}_0} \times A \times X_{\mathcal{D}_0}$ is the subspace embedding. Equivalently, $\mathcal{D}_1$ consists of the lattice combinations of languages of the form $\xi^{-1}(L_1 \times \{a\} \times L_2)$ , with $L_1, L_2 \in \mathcal{D}_0$ and $a \in A$ . Using Corollary 4.7, one may then verify that the sublattice $\mathcal{D}$ of $\mathcal{P}((A \times 2)^*)$ generated by $\mathcal{D}_0 \cup \mathcal{D}_1 \cup \{A_z\}$ is a Boolean algebra closed under quotients and thus, $\mathcal{D}_0$ , $\mathcal{D}_1$ , and $\mathcal{D}$ are as in the statement of Theorem 6.9, and the map $\pi: (A\times 2)^*\to X_{\mathcal{D}}$ restricts and co-restricts to $\pi_0$ and to $\pi_1$ . Also note that, by definition of Y in Theorem 6.11 and of $\mathcal{D}_0$ above, we have $Y = X_{\mathcal{D}_0} \times A \times X_{\mathcal{D}_0}$ . If $\eta$ and $\eta'$ are as in Theorems 6.9 and 6.11, respectively, then by Lemma 3.9, we have a commutative diagram:

Moreover, it is easily seen that the biaction of $X_{\mathcal{W}(X)} =X_{\mathcal{D}_0}$ on $X_{\mathcal{V}(Y)}$ given by Theorem 6.11 is an extension of the biaction of $M \subseteq X_{\mathcal{D}_0}$ on $\eta[X_{\mathcal{D}_1}^*]\subseteq X_{\mathcal{V}(Y)}$ given by Lemma 6.7. In particular, the map $(\widehat e \times id) \circ \iota:\eta[T^*]{**} M\rightarrowtail X_{\mathcal{V}(Y)}{**}X_{\mathcal{W}(A)}$ is a submonoid embedding, and the following diagram commutes:

Thus, in order to conclude that Theorem 6.11 of Almeida and Weil is, indeed, a consequence of our Theorem 6.9, it suffices to argue that $\mathcal{V} \circ \mathcal{D}$ consists of the lattice combinations of languages recognized by a monoid of the form $M{**}N$ , with $M \in{\bf V}$ and $N \in {\bf W}$ . In case, $\bf V$ and $\bf W$ are definable by suitable fragments of first-order logic, that is, the content of Tesson and Thérien (Reference Tesson and Thérien2007, Lemma 3.8). The general case follows standard arguments underlying the so-called block-product principle for finite monoids (see e.g., Pin Reference Pin1997, Section 6.3 or Straubing and Thérien Reference Straubing and Thérien2002, Section 4.2). For the reader’s convenience, we sketch its proof below.

Proposition 6.12. A language $L \subseteq A^*$ belongs to $\mathcal{V} \circ \mathcal{D}$ if and only if it is a lattice combination of languages over A recognized by a monoid of the form $M{**}N$ , with $M \in {\bf V}$ and $N \in \bf W$ .

Proof. Let us denote by $(\mathcal{V}{**}\mathcal{W})(A)$ the lattice generated by the languages over A recognized by a monoid of the form $M{**}N$ , with $M \in {\bf V}$ and $N \in \bf W$ . Also recall that $\mathcal{V}\circ \mathcal{D} = (\mathcal{V}(X_{\mathcal{D}_1})\odot X_{\mathcal{D}_1}) \cup \mathcal{D}_0$ . Since $\mathcal{D}_0 = \mathcal{W}(A)$ and N is a submonoid of every semidirect product of the form $M{**}N$ , it is clear that $\mathcal{D}_0 \subseteq (\mathcal{V}{**}\mathcal{W})(A)$ . Let $L \in \mathcal{V}(X_{\mathcal{D}_1})\odot X_{\mathcal{D}_1}$ , say $L = \tau_{\mathcal{D}_1}^{-1}(K)$ , for some $K \in \mathcal{V}(X_{\mathcal{D}_1})$ . Since $X_{\mathcal{D}_1}$ embeds in Y via e, by definition of variety, there exists a language $K_0 \in \mathcal{V}(Y)$ such that $K = (e^*)^{-1}(K_0)$ , and by Lemma 3.1, there is a finite continuous quotient $q: Y \twoheadrightarrow B$ and a monoid homomorphism $g: B^* \to M$ into a monoid $M \in \bf V$ such that $K_0 = (g \circ q^*)^{-1}(P)$ , for some $P \subseteq M$ . On the other hand, for each $b \in B$ , $q^{-1}(b)$ is a clopen subset of $Y = X_{\mathcal{D}_0} \times A \times X_{\mathcal{D}_0}$ and, as so, it may be written as a finite union $\bigcup_{i = 1}^{k_b} (\widehat {L_{i,b}} \times \{a_{i, b}\} \times \widehat {L_{i,b}'})$ , for some $L_{i,b}, L_{i, b}' \in \mathcal{D}_0 = \mathcal{W}(A)$ and $a_{i, b} \in A$ . We let $h: A^* \to N$ be a monoid homomorphism into a monoid $N \in \bf W$ that recognizes every language $L_{i,b}$ and $L_{i, b}'$ , and we define $\widetilde h: A^* \times A \times A^* \to N \times A \times N$ by $\widetilde h (u, a, v) = (h(u), a, h(v))$ . Note that, in particular, every clopen subset in the image of $q^{-1}$ is of the form $\widehat { \widetilde h^{-1}(Q)}$ , for some $Q \subseteq N \times A \times N$ . Therefore, the map $q \circ \widetilde \pi_0$ factors through $\widetilde h$ , say via $r: N \times A \times N \to B$ and, since $\widetilde \pi_0$ is dense, q is surjective, and B is finite, we have that r is a quotient. Finally, we let $\tau_h: A^* \to (N \times A \times N)^*$ be given by $\tau_h(a_0 \dots a_{n-1}) = (1, a_0, h(a_1 \dots a_{n-1}))(h(a_0), a_1, h(a_2 \dots a_{n-1})) \dots (h(a_0 \dots a_{n-2}), a_{n-1}, 1)$ . Routine computations show that the following diagram commutes:

In particular, we have $L = \tau_h^{-1}(f \circ r^*)^{-1}(P)$ , which is a language recognized by the semi-direct product $M^{N \times N}{**}N$ (usually called block product and denoted $M \Box N$ ), with N biacting on $M^{N \times N}$ by:

\[n \cdot \alpha \cdot n': N \times N \to M, \quad (x, y) \mapsto \alpha(xn, n'y),\]

for every $n,n' \in N$ and $\alpha \in M^{N\times N}$ . Indeed, one may verify that

\[L = g^{-1}(\{\alpha \in M^{N \times N} \mid \alpha(1,1) \in P \} \times N),\]

where g is the unique homomorphism $g: A^* \to M^{N \times N}{**}N$ satisfying $g(a) = (\alpha_a, h(a))$ , with $\alpha_a: N \times N \to M$ given by $\alpha_a(x,y) = \alpha \circ r (x, a, y)$ .

Conversely, let $h: A^* \to M{**}N$ be a monoid homomorphism, with $M \in \bf V$ and $N \in \bf W$ , and write $h(w) = (h_1(w), h_2(w))$ (note that $h_2$ is a monoid homomorphism, but $h_1$ may not be). Since $\mathcal{V} \circ \mathcal{D}$ is a lattice, and the languages recognized by h are all lattice combinations of languages of the form $h^{-1}(P \times N)$ and $h^{-1}(M \times Q)$ , with $P \subseteq M$ and $Q \subseteq N$ , it suffices to show that the latter belong to $\mathcal{V} \circ \mathcal{D}$ . It is clear that every language of the form $h^{-1}(M \times Q)$ belongs to $\mathcal{D}_0 = \mathcal{W}(A)$ , as it is recognized by N. We will now argue that the language $L = h^{-1}(P \times N)$ belongs to $\mathcal{V}(X_{\mathcal{D}_1}) \odot \mathcal{D}_1$ . Since $N \in \bf W$ , we have that $h_2$ factors through $\pi_0$ via a continuous map, say $h_2 = r \circ \pi_0$ , with $r: X_{\mathcal{D}_0} \to N$ continuous. We further let $\widetilde r: X_{\mathcal{D}_1} \to N \times A \times N$ be the continuous function defined by $\widetilde r (x) = (r \times id \times r) \circ e(x)$ , for every $x \in X_{\mathcal{D}_1}$ , and $g: (N \times A \times N)^* \to M$ be the unique monoid homomorphism satisfying $g(n, a, n') = n \cdot h_1(a) \cdot n'$ , where $(\_) \cdot n$ and $n \cdot (\_)$ denote, respectively, the right and left actions of n on M. Finally, if $q: M{**} N \twoheadrightarrow M$ is the projection onto M, one may check that the following diagram commutes:

In particular, we have $L = \tau_{\mathcal{D}_1}^{-1}(K)$ , where $K = (g \circ \widetilde r^{\,*})^{-1}(P)$ is a language of $\mathcal{V}(X_{\mathcal{D}_1})$ . This shows that L belongs to $\mathcal{V}(X_{\mathcal{D}_1} \odot \mathcal{D}_1)$ , as required.

7. Examples

We finish the paper with the computation of the BiM ${\bf R}_{\mathcal{V} \circ \mathcal{D}} =(\eta[T^*]{**}M, \, \iota, \, X_{\mathcal{V}(X_{\mathcal{D}_1})} \times X_{\mathcal{D}_0})$ for an arbitrary Boolean subalgebra $\mathcal{D} \subseteq \mathcal{P}((A \times2)^*)$ closed under quotients that is generated by the union of its retracts $\mathcal{D}_0 \subseteq \mathcal{P}(A^*)$ and $\mathcal{D}_1 \subseteq \mathcal{P}(A^*\otimes {\mathbb{N}})$ and three specific lp-varieties $\mathcal{V}$ (recall the notation from Table 1). The lp-varieties considered are naturally defined by classes of sentences of the form $\Gamma_\mathcal{Q}$ , as in Lemma 5.6, for some set of quantifiers $\mathcal{Q}$ . In particular, by Theorem 6.9, it follows that if $\varphi(x)$ is a formula defining a language in $\mathcal{D}$ then, for every quantifier $Q \in \mathcal{Q}$ , the language $Qx \ \varphi(x)$ is recognized by ${\bf R}_{\mathcal{V} \circ \mathcal{D}}$ . In Section 7.1, we consider the case where $\mathcal{Q} = \{\exists\}$ consists of the existential quantifier, in Section 7.2 we consider the set of modular quantifiers $\mathcal{Q} = \{\exists^r_q \mid r \in \{0, \dots,q-1\}\}$ for some positive integer r, and in Section 7.3 we consider the set of majority quantifiers $\mathcal{Q} = \{{\sf Maj}_k\mid k \in {\mathbb{Z}}_{\geq 0}\}$ . It is easy to verify that the lp-strain of languages obtained from each of these choices of $\mathcal{Q}$ is indeed an lp-variety so that Theorem 6.9 does apply.

In what follows, for every set X and every monoid $(N, +)$ , we will use $[X \to N]_{fin}$ to denote the set of functions $f: X \to N$ such that $f(x) = 0$ for all but finitely many $x \in X$ . Note that, when equipped with pointwise addition, $[X \to N]_{fin}$ is also a monoid whose identity is the constant function equal to the identity of N. If w is a word over some alphabet (finite or profinite), then we will use c(w) to denote the set of letters that occur in w, that is, c(w) denotes the content of w. Finally, for a subset S of the alphabet at hand, we use $\left\lvert w\right\rvert_S$ to denote the number of occurrences of a letter of S in w, and in the case where $S = \{a\}$ is a singleton, we simply write $\left\lvert w\right\rvert_a$ .

7.1 Existential quantifier

Given a finite alphabet A and a subset $S \subseteq A$ , we denote by $L_S$ the language over A defined by the formula $\exists x \\bigvee_{a \in S} P_a(x)$ , that is, $L_S = A^* S A^* = \{w \in A^*\mid c(w) \cap S \neq 0\}$ . We let $\mathcal{V}_\exists$ be defined by:

\[A \mapsto \mathcal{V}_\exists(A)= \langle{\{L_S \mid S \subseteq A\}}\rangle_{\sf BA},\]

for every finite alphabet A. Then, $\mathcal{V}_\exists$ is an lp-variety of regular languages. It is easy to see that, for each finite alphabet A, the syntactic monoid of $\mathcal{V}_\exists(A)$ is the free join-semilattice generated by A, that is, $(\mathcal{P}(A), \cup)$ – the powerset of A equipped with union. Moreover, the syntactic morphism of $\mathcal{V}_\exists(A)$ is given by:

\[\eta_A: A^* \twoheadrightarrow \mathcal{P}(A), \quad w \mapsto c(w) = \{a \in A \mid a \text{ occurs in }w\}.\]

Let Y be a profinite alphabet. Since $\mathcal{P}(A)$ is the dual space of $\mathcal{V}_\exists(A)$ , it follows from the definition of $\mathcal{V}_\exists(Y)$ that

\[X_{\mathcal{V}_\exists(Y)} = \lim_{\longleftarrow} \ \{\mathcal{P}(A) \mid Y \twoheadrightarrow A \text{ is a finite continuous quotient}\}.\]

In other words, $X_{\mathcal{V}_\exists(Y)}$ is the free profinite join-semilattice on Y. It is a well-known fact that this is the Vietoris space of Y, that is, the topological space ${\sf Viet}(Y)$ consisting of the closed subsets of Y and equipped with the topology generated by the subsets of the form:

\[\Diamond V = \{C \in {\sf Viet}(Y) \mid C \cap V \neq \emptyset\}\qquad \text{and}\qquad \Box V= \{C \in {\sf Viet}(Y) \mid C \subseteq V\},\]

where V is a clopen subset of Y. Note that, since $(\Diamond V)^c= \Box (V^c)$ , the clopen subsets of ${\sf Viet}(Y)$ are the Boolean combinations of the subsets of the form $\Box V$ , for $V \in{\sf Clopen}(Y)$ . We refer to Kuratowski (Reference Kuratowski1966, Chapter I, Section 17) and toMichael (1951) for further reading on the Vietoris construction.

Our next goal is to compute the map $\eta: \beta(Y^*)\twoheadrightarrow X_{\mathcal{V}_\exists(Y)}$ dual to the embedding $\mathcal{V}_\exists(Y)\rightarrowtail {\sf Clopen}(Y^*)$ . The following is an easy consequence of Lemma 3.1:

Lemma 7.1. Let Y be a profinite alphabet. Then, the Boolean algebra $\mathcal{V}_\exists(Y)$ is generated, as a lattice, by the languages of the form $V^*$ and $Y^*VY^*$ , where $V \subseteq Y$ is a clopen subset.

Proof. By Lemma 3.1, we know that $\mathcal{V}_\exists(Y)$ consists of the languages of the form $(\pi^*)^{-1}(K)$ for some finite continuous quotient $\pi: Y \twoheadrightarrow A$ and some language $K \in \mathcal{V}_\exists(A)$ . Since $\mathcal{V}_\exists(A)$ is the Boolean algebra generated by the languages of the form $A^* SA^*$ , with $S \subseteq A$ , it follows that $\mathcal{V}_\exists(Y)$ is the Boolean algebra generated by the languages of the form $(\pi^*)^{-1}(A^* S A^*) = Y^* \pi^{-1}(S) Y^*$ . Finally, we observe that the clopen subsets of Y are precisely the subsets of the form $\pi^{-1}(S)$ for some finite continuous quotient $\pi: Y \twoheadrightarrow A$ and some subset $S \subseteq A$ , and that $(Y^* V Y^*)^c = (V^c)^*$ for every $V \in {\sf Clopen}(Y)$ . Thus, $\mathcal{V}_\exists(Y)$ is generated, as a lattice, by the languages of the form $V^*$ and $Y^* V Y^*$ , where $V \in {\sf Clopen}(Y)$ .

We may now describe $\eta: \beta(Y^*) \twoheadrightarrow {\sf Viet}(Y)$ .

Proposition 7.2. Let Y be a profinite alphabet. Then, the dual of the embedding $\mathcal{V}_\exists(Y) \rightarrowtail {\sf Clopen}(Y^*)$ is the map:

\[\eta: \beta(Y^*) \twoheadrightarrow {\sf Viet}(Y), \quad \gamma \mapsto \bigcap \{V \in {\sf Clopen}(Y) \mid V^* \in \gamma\}.\]

Proof. Since arbitrary intersections of clopen sets are closed, the map $\eta$ is well defined. Using the duality between Boolean algebras and Boolean spaces, it suffices to show that $\eta$ is a continuous surjective map whose dual $\eta^{-1}: {\sf Clopen}({\sf Viet}(Y)) \to {\sf Clopen}(Y^*)$ has image $\mathcal{V}_\exists(Y)$ . Let us first prove that $\eta$ is surjective. Given $C \in {\sf Viet}(Y)$ , we let

\[\mathcal{F}_C = \{V^* \mid V \in {\sf Clopen}(Y), \, C \subseteq V\} \cup \{Y^* V Y^* \mid V \in {\sf Clopen}(Y), \, C \cap V \neq \emptyset\},\]

and we argue that $\mathcal{F}_C$ has the finite intersection property. Let $V_1, \dots, V_m, W_1, \dots, W_n \in {\sf Clopen}(Y)$ be such that $C \subseteq V_i$ ( $i = 1, \dots, m$ ) and $C \cap W_j \neq \emptyset$ ( $j = 1, \dots, n$ ). If $y_j \in C \cap W_j$ for every $j \in \{1, \dots, n\}$ , then the word $w = y_1 \dots y_n$ belongs to $\bigcap_{i = 1}^m V_i^* \cap \bigcap_{j = 1}^n Y^*W_jY^*$ , and thus, this is a nonempty subset of $Y^*$ as required. Therefore, there is an ultrafilter $\gamma_C \in \beta(Y^*)$ extending $\mathcal{F}_C$ . Since such an ultrafilter must satisfy

\[C \subseteq V \iff V^* \in \gamma_C,\]

for every $V \in {\sf Clopen}(Y)$ , it follows that $\eta(\gamma_C) = C$ , and this shows surjectivity of $\eta$ . Now, if $W \in {\sf Clopen}(Y)$ and $\gamma \in \beta(Y^*)$ , we have

\begin{align*} \gamma \in \eta^{-1}(\Box W) & \iff \bigcap\{V \in {\sf Clopen}(Y) \mid V^* \in \gamma\} \subseteq W\qquad \text{(by definition of~$\eta$)} \\ & \iff W^c \subseteq \bigcup \{V^c \mid V \in {\sf Clopen}(Y), \, V^* \in \gamma\} \\ & \iff \exists V_1, \dots, V_n \in {\sf Clopen}(Y) \colon V_i^* \in \gamma \text{ and } W^c \subseteq \bigcup_{i = 1}^n V_i^c \\ & \hspace{6.4cm} (\text{because $W^c$ is a compact subset of}\,\textit{Y}) \\ & \iff \exists V \in {\sf Clopen}(Y) \colon V^* \in \gamma \text{ and } V \subseteq W \\ & \hspace{1.1cm} \text{(because $\gamma$ is closed under finite intersections and $\bigcap_{i = 1}^n V_i^* = (\bigcap_{i = 1}^n V_i)^*$)} \\ & \iff W^* \in \gamma \qquad \text{(because $\gamma$ is upward closed)} \\ & \iff \gamma \in \widehat {W^*}. \end{align*}

That is, we have $\eta^{-1}(\Box W) = \widehat {W^*}$ for every $W \in {\sf Clopen}(Y)$ . This proves both continuity of $\eta$ and the equality $\eta^{-1}({\sf Clopen}({\sf Viet}(Y))) = \mathcal{V}_\exists(Y)$ as intended.

We now make explicit each item in Table 1 for this case. Note that, the restriction of $\eta$ to $Y^*$ is the map:

\[\eta_Y: Y^* \to {\sf Viet}(Y), \quad w \mapsto c(w) = \{y \in Y \mid y \text{ occurs in } w\},\]

whose image is $\mathcal{P}_{fin}(Y)$ – the family of finite subsets of Y. Indeed, that is because, for every word $w \in Y^*$ , if $\gamma_w =\{K \in {\sf Clopen}(Y^*) \mid w \in K\}$ is the associated principal ultrafilter and V a clopen subset of Y, then we have

\[V^* \in \gamma_w \iff w \in V^* \iff c(w) \subseteq V.\]

Now, if $\mathcal{D} \subseteq \mathcal{P}((A \times 2)^*)$ is a Boolean subalgebra closed under quotients, $\pi: (A \times 2)^* \to X_\mathcal{D}$ is the dual of the embedding $\mathcal{D} \rightarrowtail \mathcal{P}((A \times 2)^*)$ , and $T=\pi[A^* \otimes {\mathbb{N}}]$ and $M= \pi[A^*]$ are as before, we have $\eta[T^*] = \mathcal{P}_{fin}(T) \subseteq \mathcal{P}_{fin}(X_{\mathcal{D}_1})$ . By a straightforward application of Lemma 6.7, the biaction of M on $\mathcal{P}_{fin}(T)$ is given by:

\[\lambda_m(P) = \{m t \mid t \in P\}\quad \text{and}\quad \rho_m(P) = \{tm\mid t \in P\},\]

for every $P \in \mathcal{P}_{fin}(T)$ and $m \in M$ . In particular, the monoid multiplication of $\eta[T^*]{**}M = \mathcal{P}_{fin}(T){**}M$ is given by:

\[(P_1, m_1) (P_2, m_2) = (\{t_1m_2 \mid t_1 \in P_1\} \cup \{m_1t_2 \mid t_2 \in P_2\}, m_1m_2),\]

for every $(P_1, m_1), (P_2, m_2) \in \mathcal{P}_{fin}(T){**}M$ . Although this suffices to characterize the BiM

\[{\bf R}_{\mathcal{V}_{\exists} \circ \mathcal{D}}= (\eta[T^*]{**}M, \iota, X_{\mathcal{V}_\exists(X_{\mathcal{D}_1})} \times X_{\mathcal{D}_0}) = (\mathcal{P}_{fin}(T){**}M , \iota, {\sf Viet}(X_{\mathcal{D}_1}) \times X_{\mathcal{D}_0}),\]

we may use (32) and (33) to explicitly compute the biaction of ${\sf Viet}(X_{\mathcal{D}_1}) \times X_{\mathcal{D}_0}$ on $\mathcal{P}_{fin}(T){**}M$ . We start by computing the first coordinate of (32) and (33). Given a closed subset $C \subseteq Y$ , we let $\gamma_C \in \beta(Y^*)$ satisfy

(34) \begin{equation} C \subseteq V \iff V^* \in \gamma_C,\end{equation}

for every $V \in {\sf Clopen}(Y)$ . Note that, by the proof of Proposition 7.2, such an ultrafilter always exists and is such that $\eta(\gamma_C) = C$ .

Lemma 7.3. For every $C \in {\sf Viet}(X_{\mathcal{D}_1})$ , $w \in X_{\mathcal{D}_1}^*$ , and $m \in M$ , the following equalities hold

\[\eta \circ \widetilde\ell_w\circ \beta(\lambda_m^*) (\gamma_C) = \lambda_m[C] \cup c(w)\qquad \text{and}\qquad \eta \circ \widetilde r_w\circ \beta(\rho_m^*) (\gamma_C) = \rho_m[C]\cup c(w).\]

Proof. By definition of $\eta$ , the closed set $\eta \circ \widetilde\ell_w\circ \beta(\lambda_m^*) (\gamma_C)$ consists of those points $x \in X_{\mathcal{D}_1}$ that belong to every clopen subset $V \subseteq X_{\mathcal{D}_1}$ satisfying $V^* \in \widetilde\ell_w\circ \beta(\lambda_m^*) (\gamma_C)$ . Since $\widetilde\ell_w$ and $\beta(\lambda_m^*)$ are, respectively, the duals of the homomorphisms $\ell_w$ (recall (5)) and $(\lambda_m^*)^{-1}$ , we have that $V^* \in \widetilde\ell_w\circ \beta(\lambda_m^*) (\gamma_C)$ if and only if $(\lambda_m^*)^{-1} \circ \ell_w(V^*) \in \gamma_C$ . Now, it is easy to see that

\[(\lambda_m^*)^{-1} \circ \ell_w(V^*) = \begin{cases} (\lambda_m^{-1}(V))^*, &\text{if $w \in V^*$}; \\\emptyset, &\text{else}. \end{cases} \]

Therefore, using (34), we may conclude that $\eta \circ \widetilde\ell_w\circ \beta(\lambda_m^*) (\gamma_C)$ consists of those points $x \in X_{\mathcal{D}_1}$ that belong to every clopen subset $V \subseteq X_{\mathcal{D}_1}$ satisfying $w \in V^*$ (i.e., $c(w)\subseteq V$ ) and $C \subseteq \lambda_m^{-1}(V)$ (that is, $\lambda_m[C] \subseteq V$ ). Since c(w) and $\lambda_m[C]$ are both closed subsets of Y, it follows that

\[\eta \circ \widetilde\ell_w\circ \beta(\lambda_m^*) (\gamma_C) = \lambda_m[C] \cup c(w)\]

as required. The second equality of the statement is proved in a similar manner.

Now, given $(C, x_0) \in {\sf Viet}(X_{\mathcal{D}_1}) \times X_{\mathcal{D}_0}$ and $(P,m) \in \mathcal{P}_{fin}(T) {**} M$ , we let $\underline t \in T^*$ be such that $\eta(\underline t) = P$ . Using (32), (33), and Lemma 7.3, we have that the biaction of $\mathcal{P}_{fin}(T){**}M$ on ${\sf Viet}(X_{\mathcal{D}_1}) \times X_{\mathcal{D}_0}$ is given by:

\[\lambda_{(P, m)}(C, x_0) = (\lambda_m[C] \cup \{\lambda_t(x_0) \mid t \in P\}, \lambda_m(x_0))\]

and

\[\rho_{(P, m)} (C, x_0) = (\rho_m[C] \cup \{\rho_t(x_0) \mid t \in P\}, \rho_m(x_0)).\]

Finally, comparing our construction with that of Gehrke et al. (Reference Gehrke, Petrişan and Reggio2016), we have that the BiM ${\bf R}_{\mathcal{V}_{\exists} \circ \mathcal{D}}$ just described properly embeds in the unary Schützenberger product of $(M_\mathcal{D} \rightarrowtail X_\mathcal{D})$ introduced in Gehrke et al. (Reference Gehrke, Petrişan and Reggio2016, Definition 11). Indeed, since we are assuming that $\mathcal{D} \subseteq \mathcal{P}((A \times 2)^*)$ is a Boolean subalgebra closed under quotients that is generated by the union of its retracts $\mathcal{D}_0$ and $\mathcal{D}_1$ , we have that $\mathcal{D}$ is isomorphic to the product $\mathcal{D}_0 \times \mathcal{D}_1 \times \mathcal{D}_z$ (cf. Section 4.2). Dually, it follows that $X_\mathcal{D}$ may be seen as the disjoint union of its closed subspaces $X_{\mathcal{D}_0}$ , $X_{\mathcal{D}_1}$ , and $X_{\mathcal{D}_z}$ . In particular, $X_{\mathcal{D}_0}$ and $X_{\mathcal{D}_1}$ are properly contained in $X_\mathcal{D}$ . On the other hand, we recall that the space component of the unary Schützenberger product of $(M_\mathcal{D} \rightarrowtail X_\mathcal{D})$ is ${\sf Viet}(X_\mathcal{D}) \times X_\mathcal{D} $ , while that of the ${\bf R}_{\mathcal{V}_{\exists} \circ \mathcal{D}}$ is ${\sf Viet}(X_{\mathcal{D}_1}) \times X_{\mathcal{D}_0}$ . Thus, it is clear that the first one properly contains the latter. A similar analysis may be done for the monoid component of each of the BiMs involved.

7.2 Modular quantifiers

Before describing the elements of Table 1 for the lp-variety defined by the modular quantifiers, we remark that the existential quantifier considered in the previous section is defined by the two-element Boolean algebra ${\bf 2}$ in the following sense: for every word w and every formula $\varphi(x)$ , we have

\[w \models \exists x \ \varphi(x)\iff \bigvee_{i = 0}^{\left\lvert w\right\rvert -1} \chi_{ \varphi(x)}(w, i) = 1,\]

where the right-hand side is an equality in $\bf 2$ and $\chi_{\varphi(x)}(w, i) = 1$ if and only if $(w, i) \models\varphi(x)$ . In Gehrke et al. (Reference Gehrke, Petrişan and Reggio2017, 2021), the authors extended the results of Gehrke et al. (Reference Gehrke, Petrişan and Reggio2016) to a broader class of quantifiers defined by finite semirings in a similar way: if $(S, +, \cdot\, , 0, 1)$ is a semiring and $r \in S$ , then $Q_S^r$ is defined by:

\[w \models Q_S^r x \ \varphi(x) \iff \sum_{i = 0}^{\left\lvert w\right\rvert - 1} \chi_{ \varphi(x)}(w, i) = r,\]

where $\chi_{\varphi(x)}(w, i)$ is 1 if $(w, i) \models\varphi(x)$ and is 0 otherwise. When $S = {{\mathbb{Z}}}_q$ is the semiring of integers modulo q, the quantifier $Q_{q}^r$ is nothing but the modular quantifier $\exists_q^r$ . Unlike in Gehrke et al. (Reference Gehrke, Petrişan and Reggio2017, 2021), where the full semiring structure of ${\mathbb{Z}}_q$ is needed, the reader should notice that we will only make use of the underlying additive structure of ${\mathbb{Z}}_q$ .

We fix a positive integer q, and we let $\mathcal{V}_{{\sf mod} \, q}$ denote the lp-variety given by:

\[A \mapsto \mathcal{V}_{{\sf mod} \, q}(A) = \langle \{L_{S, r} \mid S \subseteq A, \ r \in \{0, \dots, q-1\}\}\rangle_{\sf BA},\]

where $L_{S,r}$ denotes the language defined by the formula $\exists_q^r\, x \ \bigvee_{a \in S} P_a(x)$ , that is, $L_{S, r} = \{w\in A^* \mid \left\lvert w\right\rvert_S \equiv r\ {\rm mod} \ q\}$ . Once again, we have that $\mathcal{V}_{{\sf mod}\, q}$ is an lp-variety of regular languages. Note that, since for every $S \subseteq A$ and $r \in \{0,\dots, q-1\}$ , the equality $L_{S,r}^c = \bigcup \{L_{S, r'} \mid r'\in \{0, \dots, q-1\}, \, r' \neq r\}$ holds, the Boolean algebra $\mathcal{V}_{{\sf mod} \, q}(A)$ is generated, as a lattice, by the languages of the form $L_{S, r}$ . It is easy to see that, for each finite alphabet A, ${\mathbb{Z}}_q^A$ equipped with pointwise addition is the syntactic monoid of $\mathcal{V}_{{\sf mod} \, q}(A)$ and

\[\eta_A: A^* \twoheadrightarrow {{Z}}_q^A, \quad w \mapsto (\left\lvert w\right\rvert_a {\rm mod }\ q)_{a \in A}\]

is its syntactic morphism. In Reggio (Reference Reggio2020), a computation of the space

\[X_{\mathcal{V}_{{\sf mod} \, q}(Y)} =\lim_{\longleftarrow} \ \{ \mathbb{Z}_q^A\mid A \text{ is a finite continuous quotient of }Y\},\]

is provided and the Boolean algebra $\mathcal{V}_{{\sf mod} \, q}(Y)$ is described (cf. Lemmas 3.5 and 4.1, respectively). We will now proceed with the description of ${\bf R}_{\mathcal{V}_{{\sf mod} \, q} \circ \mathcal{D}}$ without using these results.

Lemma 7.4. Let Y be a profinite alphabet. Then, the Boolean algebra $\mathcal{V}_{{\sf mod} \, q}(Y)$ is generated, as a lattice, by the languages of the form $L_{V, r} = \{w \in Y^* \mid \left\lvert w\right\rvert_V \equiv r \ {\rm mod}\ q\}$ , where $V \subseteq Y$ is a clopen subset and $r \in \{0, \dots, q-1\}$ .

Proof. Since each $\mathcal{V}_{{\sf mod} \, q}(A)$ is generated, as a lattice, by the languages of the form $L_{S, r}$ , with $S \subseteq A$ and $r \in \{0, \dots, q-1\}$ , by Lemma 3.1, the languages of $\mathcal{V}_{{\sf mod} \, q}(Y)$ are the lattice combinations of languages of the form $(\pi^*)^{-1}(L_{S, r})$ for some finite continuous quotient $\pi: Y \twoheadrightarrow A$ , and some $S \subseteq A$ and $r \in \{0, \dots, q-1\}$ . The claim follows from observing that

\[(\pi^*)^{-1}(L_{S, r}) = \{w \in Y^* \mid \left\lvert w\right\rvert_{\pi^{-1}(S)} \equiv r \ {\rm mod}\ q\},\]

and that the clopen subsets of Y are exactly the subsets of the form $\pi^{-1}(S)$ .

Proposition 7.5. Let Y be a profinite alphabet. Then, $([Y\to {\mathbb{Z}}_q]_{fin}, +)$ is the syntactic monoid of $\mathcal{V}_{{\sf mod} \, q}(Y)$ and

\[\eta_Y: Y^* \twoheadrightarrow [Y\to {\mathbb{Z}}_q]_{fin}, \quad w \mapsto (f_w: y \mapsto \left\lvert w\right\rvert_y {\rm mod }\ q)\]

is its syntactic morphism. Moreover, $\mathcal{V}_{{\sf mod} \, q}(Y)$ is isomorphic to the Boolean subalgebra $\mathcal{B}_Y$ of $\mathcal{P}([Y\to \mathbb{Z}_q]_{fin})$ generated by the subsets of the form:

\[\langle V, r\rangle = \{f \in [Y\to {\mathbb{Z}}_q]_{fin} \mid \sum \{f(y) \mid y \in V\} \equiv r \ {\rm mod } \ q \},\]

where $V \in {\sf Clopen}(Y)$ and $r \in {\mathbb{Z}}_q$ .

Proof. First note that $\eta_Y$ is a well-defined surjective function. Then, proving that $\eta_Y$ is the syntactic morphism of $\mathcal{V}_{{\sf mod} \, q}(Y)$ amounts to proving that $\ker(\eta_Y)$ is the syntactic congruence $\sim$ of $\mathcal{V}_{{\sf mod} \, q}(Y)$ . By Lemma 7.4, two words $u, v \in Y^*$ are $\sim$ -equivalent if and only if $\left\lvert u\right\rvert_V \equiv \left\lvert v\right\rvert_V \ {\rm mod} \ q$ , for every $V \in {\sf Clopen}(Y)$ . Since, for every $y \in Y$ , there is a clopen subset $V \subseteq Y$ such that $y \in V$ and $(V \setminus \{y\}) \cap (c(u) \cup c(v)) = \emptyset$ , it follows that $u \sim v \iff \eta_Y (u) = \eta_Y(v)$ as required.

The last claim follows from having $\eta_Y^{-1}(\langle V, r\rangle) = L_{V, r}$ , for every $V \in {\sf Clopen}(Y)$ and $r \in \{0, \dots, q-1\}$ , and from Lemma 7.4.

It remains to describe the monoid $\eta[T^*]{**}M$ (recall Table 1). By Proposition 7.5, we have that $\eta[T^*] = [T\to {\mathbb{Z}}_q]_{fin}$ is the submonoid of $[Y\to{\mathbb{Z}}_q]_{fin}$ whose functions vanish in $Y \setminus T$ . Using Lemma 6.7, we may then conclude that the biaction of M on $\eta[T^*] = [T\to {\mathbb{Z}}_q]_{fin}$ is defined by:

\[\lambda_m(f): t \mapsto \sum \{f(t') \mid m t' = t\} \quad \text{and} \quad \rho_m(f): t \mapsto \sum \{f(t') \mid t'm = t\} ,\]

for every $f \in [T \to {\mathbb{Z}}_q]_{fin}$ and $m \in M$ . As expected, our construction is tightly related to that of Gehrke et al. (Reference Gehrke, Petrişan and Reggio2017, 2021): the BiM ${\bf R}_{\mathcal{V}_{{\sf mod} \, q} \circ \mathcal{D}}$ embeds in the BiM $(\Diamond_{{\mathbb{Z}}_q}(X_{\mathcal{D}}), \Diamond_{{\mathbb{Z}}_q}(M_{\mathcal{D}}))$ of Gehrke et al. (Reference Gehrke, Petrişan and Reggio2017, 2021).

7.3 Majority quantifiers

Our last example concerns the majority quantifiers. Although an algebraic approach was undertaken in Krebs et al. (Reference Krebs, Lange and Reifferscheid2005), it relies on an iterative application of a block product construction and a simple description of a topo-algebraic recognizer for the languages of $\mathcal{V}_{\sf Maj}\circ \mathcal{D}$ built on a recognizer for the languages of $\mathcal{D}$ is not known (here, $\mathcal{V}_{\sf Maj}$ denotes the lp-variety defined by the majority quantifiers). Therefore, this example is a truly new application of the work of this paper.

Let A be a finite alphabet. For each subset $S \subseteq A$ and for each integer $k \in {\mathbb{Z}}$ , we let $L_{S, k}$ denote the language of all words $w \in A^*$ satisfying $\left\lvert w\right\rvert_S - \left\lvert w\right\rvert_{S^c} >k$ . Note that, if $k \geq 0$ , then the language $L_{S, k}$ is defined by the formula ${\sf Maj}_k x\ \bigvee_{a \in S}{\sf P}_a(x)$ . Then, $\mathcal{V}_{\sf Maj}$ is the lp-variety defined by:

\[A \mapsto \mathcal{V}_{\sf Maj}(A) = \langle \{L_{S, k} \mid S \subseteq A, \ k \in {\mathbb{N}}\}\rangle_{\sf BA}.\]

Since, for every $S \subseteq A$ and for every $k \in {\mathbb{Z}}$ , we have $L_{S, k}^c = L_{S^c, -(k+1)}$ , it follows that $\mathcal{V}_{\sf Maj}(A)$ is the lattice generated by $\{L_{S, k} \mid S \subseteq A,\, k \in {\mathbb{Z}}\}$ .

The proof of the following is omitted as it is very similar to the proof of Lemma 7.4.

Lemma 7.6. For every profinite alphabet Y, the Boolean algebra $\mathcal{V}_{\sf Maj}(Y)$ is generated, as a lattice, by the languages of the form $L_{V, k}= \{w \in Y^* \mid \left\lvert w\right\rvert_V - \left\lvert w\right\rvert_{V^c} > k\}$ , where $V \in{\sf Clopen}(Y)$ and $k \in {\mathbb{Z}}$ .

Using this characterization, we may then describe the syntactic morphism of $\mathcal{V}_{\sf Maj}(Y)$ .

Proposition 7.7. Let Y be a profinite alphabet. Then, $([Y\to {\mathbb{N}}]_{fin}, +)$ is the syntactic monoid of $\mathcal{V}_{\sf Maj}(Y)$ and

\[\eta_Y: Y^* \twoheadrightarrow [Y\to {\mathbb{N}}]_{fin}, \quad w \mapsto (f_w: y \mapsto \left\lvert w\right\rvert_y)\]

is the syntactic morphism. Moreover, $\mathcal{V}_{\sf Maj}(Y)$ is isomorphic to the Boolean subalgebra $\mathcal{B}_Y$ of $\mathcal{P}([Y\to \mathbb{N}]_{fin})$ generated by the subsets of the form:

\[\langle V, k\rangle = \{f \in [Y\to {\mathbb{N}}]_{fin} \mid \sum \{f(y) \mid y \in V\} - \sum \{f(y) \mid y \notin V\} > k \},\]

where $V \in {\sf Clopen}(Y)$ and $k \in {\mathbb{N}}$ .

Proof. It is clear that $\eta_Y$ is a surjective well-defined function. To show that $\eta_Y$ is the syntactic morphism of $\mathcal{V}_{\sf Maj}(Y)$ , we need to show that the syntactic congruence $\sim$ of $\mathcal{V}_{\sf Maj}(Y)$ coincides with $\ker(\eta_Y)$ . By Lemma 7.6, two words $u, v \in Y^*$ are $\sim$ -equivalent if and only if, for every $V \in {\sf Clopen}(Y)$ and $k \in {\mathbb{Z}}$ , the equivalence

(35) \begin{equation} \left\lvert u\right\rvert_V - \left\lvert u\right\rvert_{V^c} > k \iff \left\lvert v\right\rvert_V - \left\lvert v\right\rvert_{V^c} > k \end{equation}

holds. In particular, we have the inclusion $\ker(\eta_Y) \subseteq {\sim}$ . Conversely, suppose that $u, v \in Y^*$ are $\sim$ -equivalent. We first note that $\left\lvert u\right\rvert = \left\lvert v\right\rvert$ . Indeed, that is because, by taking $V = Y$ in (35), we have

\[\forall k \in {\mathbb{Z}}, \ \left\lvert u\right\rvert > k \iff \left\lvert v\right\rvert > k.\]

We denote $n = \left\lvert u\right\rvert = \left\lvert v\right\rvert$ . Now, given a letter $y \in Y$ , we let $V \in {\sf Clopen}(Y)$ be such that $y \in V$ and $(V \setminus \{y\}) \cap (c(u) \cup c(v)) = \emptyset$ . That is, V is such that $\left\lvert u\right\rvert_{V} = \left\lvert u\right\rvert_y$ and $\left\lvert v\right\rvert_{V} = \left\lvert v\right\rvert_y$ . Then, using the equalities $\left\lvert u\right\rvert_{V^c} = n - \left\lvert u\right\rvert_V$ and $\left\lvert v\right\rvert_{V^c} = n - \left\lvert v\right\rvert_V$ , it follows from (35) that

\[\forall k \in {\mathbb{Z}}, \ 2 \left\lvert u\right\rvert_Y - n > k \iff 2 \left\lvert v\right\rvert_y - n> k,\]

which clearly implies that $\left\lvert u\right\rvert_y = \left\lvert v\right\rvert_y$ as required.

The last assertion follows from Lemma 7.6 and from the equality $(\eta_Y)^{-1}(\langle V, k\rangle) = L_{V, k}$ , which holds for every $V \in {\sf Clopen}(Y)$ and $k \in {\mathbb{Z}}$ .

Finally, we let $\mathcal{D} \subseteq \mathcal{P}((A \times 2)^*)$ be a Boolean subalgebra closed under quotients that is generated by its retracts $\mathcal{D}_0$ and $\mathcal{D}_1$ . As before, we let $\pi: (A \times 2)^* \to X_\mathcal{D}$ be dual to the embedding $\mathcal{D} \rightarrowtail \mathcal{P}((A \times 2)^*)$ , and we denote $T = \pi[A^* \otimes {\mathbb{N}}]$ and $M: = \pi[A^*]$ . Then, by Proposition 7.7, we have that the underlying set of the monoid component of ${\bf R}_{\mathcal{V}_{\sf Maj} \circ \mathcal{D}}$ is $[T \to {\mathbb{N}}]_{fin} \times M$ , and by Lemma 6.7, the biaction of M on $[T \to {\mathbb{N}}]_{fin}$ is defined by:

\[\lambda_m(f): t \mapsto \sum \{f(t') \mid m t' = t\} \quad \text{and} \quad \rho_m(f): t \mapsto \sum \{f(t') \mid t'm = t\} ,\]

for every $f \in [T \to {\mathbb{N}}]_{fin}$ and $m \in M$ . Therefore, for every $(f_1, m_1), (f_2, m_2) \in [T \to {\mathbb{N}}]_{fin} {**} M$ , we have $(f_1, m_1)(f_2, m_2) = (f, m_1m_2)$ , where $f(t) = \sum \{f_1(t')\mid t'm_2 = t\} + \sum \{f_2(t') \mid mt' = t\}$ for every $t \in T$ .

Competing interests.

The authors declare none.

Footnotes

*

This project has received funding from the European Research Council (ERC) under the European Union’s Horizon 2020 research and innovation program (grant agreement No.670624). The first-named author was also partially supported by the Centre for Mathematics of the University of Coimbra – UIDB/00324/2020, funded by the Portuguese Government through FCT/MCTES.

1 We remark that their result is valid for A profinite, but for the sake of simplicity we will consider the case where A is finite.

References

Almeida, J. (1994). Finite Semigroups and Universal Algebra, World Scientific Publishing Co. Inc., River Edge, NJ. Translated from the 1992 Portuguese original and revised by the author.Google Scholar
Almeida, J. (2005). Profinite semigroups and applications. In: Structural Theory of Automata, Semigroups, and Universal Algebra, NATO Science Series II: Mathematics, Physics and Chemistry, vol. 207, Dordrecht, Springer, 1–45. Notes taken by Alfredo Costa.CrossRefGoogle Scholar
Almeida, J. and Weil, P. (1995). Free profinite semigroups over semidirect products. Russian Mathematics (Iz. VUZ) 39 (1) 127.Google Scholar
Banaschewski, B. (1955). Über nulldimensionale Räume. Mathematische Nachrichten 13 129140.CrossRefGoogle Scholar
Banaschewski, B. (1964). Extensions of topological spaces. Canadian Mathematical Bulletin 7 122.CrossRefGoogle Scholar
Barrington, D. A. M., Compton, K., Straubing, H. and Thérien, D. (1992). Regular languages in ${\rm NC}^1$ . Journal of Computer and System Sciences 44 (3) 478499.CrossRefGoogle Scholar
Borlido, C., Czarnetzki, S., Gehrke, M. and Krebs, A. (2017). Stone duality and the substitution principle. In: Computer Science Logic 2017, LIPIcs, Leibniz International Proceedings in Informatics, vol. 82, Schloss Dagstuhl, Leibniz-Zent. Inform., Wadern, Art. No. 13, 20.Google Scholar
Dekkers, M. (2008). Stone Duality: An Application in the Theory of Formal Languages. Master’s thesis, Radboud University Nijmegen, Netherlands.Google Scholar
Eilenberg, S. (1976). Automata, Languages, and Machines. Vol. B, New York, Academic Press [Harcourt Brace Jovanovich Publishers].Google Scholar
Furst, M., Saxe, J. B. and Sipser, M. (1984). Parity, circuits, and the polynomial-time hierarchy. Mathematical Systems Theory 17 (1) 1327.CrossRefGoogle Scholar
Gehrke, M. (2016). Stone duality, topological algebra, and recognition. Journal of Pure and Applied Algebra 220 (7) 27112747.CrossRefGoogle Scholar
Gehrke, M., Grigorieff, S. and Pin, J.-É. (2008). Duality and equational theory of regular languages. In: Aceto, L., et al. (eds.) ICALP 2008, Part II, Lecture Notes in Computer Science, vol. 5126, Berlin, Springer, 246257.Google Scholar
Gehrke, M., Grigorieff, S. and Pin, J.-É. (2010). A topological approach to recognition. In Automata, Languages and Programming. Part II, Lecture Notes in Computer Science, vol. 6199, Berlin, Springer, 151162.CrossRefGoogle Scholar
Gehrke, M., Jakl, T. and Reggio, L. (2020). A Cook’s tour of duality in logic: from quantifiers, through Vietoris, to measures. arXiv:2007.15415, preprint.Google Scholar
Gehrke, M., Petrişan, D. and Reggio, L. (2016). The Schützenberger product for syntactic spaces. In: 43rd International Colloquium on Automata, Languages, and Programming, LIPIcs. Leibniz International Proceedings in Informatics, vol. 55, Schloss Dagstuhl. Leibniz-Zent. Inform., Wadern, Art. No. 112, 14.Google Scholar
Gehrke, M., Petrişan, D. and Reggio, L. (2017). Quantifiers on languages and codensity monads. In: 2017 32nd Annual ACM/IEEE Symposium on Logic in Computer Science (LICS), NJ, IEEE, [Piscataway], 12.Google Scholar
Gehrke, M., Petrişan, D. and Reggio, L. (2020). Quantifiers on languages and codensity monads. Mathematical Structures in Computer Science 30 (10) 10541088.CrossRefGoogle Scholar
Immerman, N. (1999). Descriptive Complexity , Graduate Texts in Computer Science, New York, Springer-Verlag.Google Scholar
Johnstone, P. T. (1986). Stone Spaces, Cambridge Studies in Advanced Mathematics, vol. 3, Cambridge, Cambridge University Press. Reprint of the 1982 edition.Google Scholar
Krebs, A., Lange, K.-J. and Reifferscheid, S. (2005). Characterizing ${TC}^0$ in terms of infinite groups. In: STACS 2005, 22nd Annual Symposium on Theoretical Aspects of Computer Science, Stuttgart, Germany, February 24–26, 2005, Proceedings, 496–507.Google Scholar
Kuratowski, K. (1966). Topology. Vol. I, New York-London, Academic Press; Warsaw, Państwowe Wydawnictwo Naukowe. New edition, revised and augmented, Translated from the French by J. Jaworowski.Google Scholar
Lawvere, F. W. (2006). Adjointness in foundations. Repr. Theory and Applications of Categories (16) 1–16. Reprinted from Dialectica 23 (1969).Google Scholar
Marquès, J. (2021). Polyadic spaces and profinite monoids. In: Fahrenberg, U., Gehrke, M., Santocanale, L. and Winter, M. (eds.) Relational and Algebraic Methods in Computer Science - 19th International Conference, RAMiCS 2021, Marseille, France, November 2–5, 2021, Proceedings, Lecture Notes in Computer Science, vol. 13027, Springer, 292–308.CrossRefGoogle Scholar
Michael, E. (1951). Topologies on spaces of subsets. Transactions of the American Mathematical Society 71 152182.CrossRefGoogle Scholar
Pin, J.-É. (1997). Syntactic semigroups. In: Handbook of Formal Languages, vol. 1, Berlin, Springer, 679746.CrossRefGoogle Scholar
Pin, J.-É. and Straubing, H. (2005). Some results on $\mathcal{C}$ -varieties. Theoretical Informatics and Applications 39 239262.CrossRefGoogle Scholar
Pippenger, N. (1997). Regular languages and Stone duality. Theory of Computing Systems 30 (2) 121134.CrossRefGoogle Scholar
Porter, J. R. and Woods, R. G. (1988). Extensions and Absolutes of Hausdorff Spaces, New York, Springer-Verlag.CrossRefGoogle Scholar
Reggio, L. (2020). Codensity, profiniteness and algebras of semiring-valued measures. Journal of Pure and Applied Algebra 224 (1) 181205.CrossRefGoogle Scholar
Straubing, H. (1994). Finite Automata, Formal Logic, and Circuit Complexity , Progress in Theoretical Computer Science, Boston, MA, Birkhäuser Boston, Inc.Google Scholar
Straubing, H. (2002). On logical descriptions of regular languages. In: LATIN 2002, Lecture Notes in Computer Science, vol. 2286, Berlin, Springer, 528538.CrossRefGoogle Scholar
Straubing, H. and Thérien, D. (2002). Weakly iterated block products of finite monoids. In: LATIN 2002: Theoretical informatics (Cancun), Lecture Notes in Computer Science, vol. 2286, Berlin, Springer, 91104.CrossRefGoogle Scholar
Tesson, P. and Thérien, D. (2007). Logic meets algebra: the case of regular languages. Logical Methods in Computer Science 3 (1) 1:4, 37.Google Scholar
Weil, P. (1992). Closure of varieties of languages under products with counter. Journal of Computer and System Sciences 45 (3) 316339.CrossRefGoogle Scholar
Willard, S. (2004). General Topology, Mineola, NY, Dover Publications, Inc. Reprint of the 1970 original.Google Scholar
Williams, R. (2014). Nonuniform ACC circuit lower bounds. Journal of the ACM 61 (1) Art. 2, 32.CrossRefGoogle Scholar
Borlido, C and Gehrke, M (2023). Substitution Principle and semidirect products. Mathematical Structures in Computer Science. https://doi.org/10.1017/S0960129523000294 CrossRefGoogle Scholar
Figure 0

Table 1: Notation