
Propagation of chaos and large deviations in mean-field models with jumps on block-structured networks

Published online by Cambridge University Press:  05 June 2023

Donald A. Dawson*
Affiliation:
Carleton University
Ahmed Sid-Ali*
Affiliation:
Carleton University
Yiqiang Q. Zhao*
Affiliation:
Carleton University
*Postal address: School of Mathematics and Statistics, Carleton University, 1125 Colonel By Drive, Ottawa, Ontario K1S 5B6, Canada.

Abstract

A system of interacting multi-class finite-state jump processes is analyzed. The model under consideration consists of a block-structured network with dynamically changing multi-color nodes. The interactions are local and described through local empirical measures. Two levels of heterogeneity are considered: between and within the blocks where the nodes are labeled into two types. The central nodes are those connected only to nodes from the same block, whereas the peripheral nodes are connected to both nodes from the same block and nodes from other blocks. Limits of such systems as the number of nodes tends to infinity are investigated. In particular, under specific regularity conditions, propagation of chaos and the law of large numbers are established in a multi-population setting. Moreover, it is shown that, as the number of nodes goes to infinity, the behavior of the system can be represented by the solution of a McKean–Vlasov system. Then, we prove large deviations principles for the vectors of empirical measures and the empirical processes, which extend the classical results of Dawson and Gärtner (Stochastics 20, 1987) and Léonard (Ann. Inst. H. Poincaré Prob. Statist. 31, 1995).

Type
Original Article
Copyright
© The Author(s), 2023. Published by Cambridge University Press on behalf of Applied Probability Trust

1. Introduction

Since McKean’s seminal paper [Reference McKean51], mean-field theory has been widely used to study large stochastic interacting particle systems arising in various domains such as statistical physics [Reference Dawson21, Reference Gärtner37, Reference McKean51, Reference McKean52], biological systems [Reference Dawson23, Reference Méléard and Bansaye53], communication networks [Reference Benaïm and Le Boudec8, Reference Graham39, Reference Graham and Méléard41, Reference Graham and Méléard42], and mathematical finance [Reference Giesecke, Spiliopoulos, Sowers and Sirignano38, Reference Kley, Klüppelberg and Reichel47]. This theory, first initiated in connection with the mathematical foundation of the Boltzmann equation, aims for a mathematically rigorous treatment of the time evolution of stochastic systems with weak long-range interaction where the interaction between the particles is realized via the empirical measure of the particle configuration. For such scenarios, it is then natural to investigate the behavior of the empirical process instead of considering the particle configuration itself. In particular, one is interested in investigating laws of large numbers, limit theorems, and large deviations for the empirical process in the limit as the number of particles tends to infinity. Another concept that plays an important role is chaos propagation, first introduced by Kac as part of kinetic theory [Reference Kac45] and widely studied in the literature since; see, e.g., [Reference Gärtner37] and [Reference Sznitman61] for detailed developments on the subject.

Classically, the systems studied are homogeneous with complete interaction graphs; that is, the particles are exchangeable and each particle interacts with all the others. In such a setting, the big picture is well understood and various asymptotic results have been established for a variety of models. One can consult [Reference Dawson22], [Reference Gärtner37], or [Reference Sznitman61] for an overview. However, though such assumptions are reasonable in statistical physics and accurately describe a variety of phenomena, this may no longer be the case when considering other applications. Researchers have therefore studied many new interacting particle systems where the homogeneity or complete interaction assumptions are not tenable; this is the direction in which this paper proceeds.

Systems in which particles carry intrinsic distinguishing features lead naturally to heterogeneous models. Thus, one cannot presume the particles to be identically distributed. Instead, one relies on additional conditions to establish limiting results. For instance, in [Reference Finnoff34, Reference Finnoff35, Reference Giesecke, Spiliopoulos, Sowers and Sirignano38], models for the activities of heterogeneous economic agents were proposed and laws of large numbers were proved under some regularity conditions. In [Reference Chong and Klüppelberg18], the authors investigated systems of interacting stochastic differential equations with two kinds of heterogeneity: one originating from different weights of the linkages, and the other concerning their asymptotic relevance when the system becomes large. The authors then introduced a partial mean-field system by averaging only over the particles with weak interactions and proved a law of large numbers together with a large deviations principle.

A particular instance of heterogeneity, with which our current work is in line, is the multi-population paradigm, also known as multi-types or multi-species. Here, the particles of the system are divided into groups, within which they are homogeneous or partially homogeneous. The interest in these structures is motivated by their ubiquity in different fields (see references below). We thus propose in this paper to take a step forward in the understanding of their behavior by studying the large-scale asymptotics of interacting particle systems with jumps on block-structured networks. Specifically, we will set up a model for block-structured networks with dynamically changing multi-color nodes. The evolution of node colors is described by a sequence of finite-state pure-jump processes interacting through local empirical measures describing the neighborhood of each node. The nodes of the network are divided into a finite number of blocks. In addition, the nodes within each block are divided into two subgroups: central and peripheral nodes. The central nodes are those connected only to nodes from the same block, whereas the peripheral nodes interact with both particles from the same block and some particles from other blocks. Thus, our model describes two levels of heterogeneity: between blocks and within blocks.

Block-structured networks are ubiquitous in various interacting particle systems composed of different communities, where a given community consists of a group of agents densely connected to each other but sparsely connected to the other dense groups of the population. For instance, a community in a social network might refer to a circle of friends, a community on the World Wide Web might include a group of pages on closely related topics, and a community in a cellular or genetic network might be related to a functional module. The study of the grouping patterns of communities, together with their detection, is an active field of research among physicists and applied mathematicians, and the study of what has become known as community structure is now one of the most prominent areas of network science. The reader can consult [Reference Porter, Onnela and Mucha59] and the references therein for an overview of the subject.

Our idea builds on the results in several existing works. The authors of [Reference Collet19, Reference Collet, Formentin and Tovazzi20] studied a bi-populated Curie–Weiss model and established, via a large deviations approach, the propagation of chaos and the asymptotic dynamics of the pair of group magnetizations in the infinite-volume limit. Laws of large numbers and a central limit theorem were proved in [Reference Kirsch and Toth46] for an extension of this model to the case of heterogeneous coupling within and between groups. A related paper [Reference Löwe and Schubert50] studied high-temperature fluctuations for a block spin Ising model and established a central limit theorem. A variant of this model was analyzed in [Reference Knöpfel, Löwe, Schubert and Sinulis48], where the vertices are divided into a finite number of blocks and pair interactions are given according to their blocks. The authors proved a large deviations principle and a central limit theorem. In the same spirit, limiting results were established in [Reference Nagasawat and Tanaka56] for a system of reflected diffusions segregated into two groups of blue and red particles and subject to a reflection condition. These results were extended in [Reference Nagasawat and Tanaka57] to the case with drift coefficients not of average form. Other recent related works are [Reference Meylahn54], which studied the two-community noisy Kuramoto model, and [Reference Aleandri and Minelli4], which studied opinion dynamics in a model with Lotka–Volterra-type interactions. Among other instances of the multi-population paradigm, we mention in particular the works [Reference Buckdahn, Li and Peng14, Reference Carmona and Zhu17], which considered mean-field game models with a single major player and statistically identical minor players. Propagation of chaos was proved for the minor players conditioned on the major player.

Another closely related model was proposed in [Reference Bayraktar and Wu7], where systems of weakly interacting jump processes on time-varying random graphs with dynamically changing multi-color edges were studied. In [Reference Bayraktar and Wu7], the dynamics of a node depend on the joint empirical distribution of all other nodes and the edges to which it connects. In contrast, the dynamics of an edge depend only on the corresponding nodes to which it connects. The paper [Reference Bayraktar and Wu7] established the law of large numbers, propagation of chaos, and central limit theorems for these systems. Despite certain similarities, the class of models which we are considering in the current paper differs in several aspects from the models contemplated in [Reference Bayraktar and Wu7]. First, the interacting particle systems that we study are on static block-structured graphs, whereas the ones considered in [Reference Bayraktar and Wu7] are on time-varying random graphs with edge-structure dynamics. Moreover, in the current work, we consider a multi-population setting where the interaction between the nodes is local, i.e. each node interacts only with its neighbors, whereas in [Reference Bayraktar and Wu7] the interaction between nodes is global, since the dynamic of a given node depends on the empirical distribution of all the other nodes. Finally, the analysis carried out and the results obtained in our current work are established on the vector of local empirical measures adapted to the multi-population context, and thus on product spaces, which allows us to overcome the heterogeneity due to the block structure of the graph. Furthermore, note that the current paper addresses the topic of interacting particle systems on large (random) networks, which has attracted increasing attention in recent years; see, e.g., [Reference Bayraktar, Chakraborty and Wu6, Reference Bayraktar and Wu7, Reference Bhamidi, Budhiraja and Wu9] and the references therein.

Alongside the papers listed above, the multi-population framework has also been considered for systems of interacting diffusions. We mention for instance [Reference Kley, Klüppelberg and Reichel47] for an analysis of a system of interacting Ornstein–Uhlenbeck processes on a heterogeneous network of credit-interlinked agents, [Reference Bossy, Faugeras and Talay13, Reference Budhiraja and Wu16, Reference Touboul62] and the references therein for studies of neuronal networks composed of separate populations, or [Reference Nguyen, Nguyen and Du58] and the references therein for mean-field multi-class interacting diffusions models in a general setting. (Note that some erroneous results that were originally stated in [Reference Touboul62] were corrected in [Reference Touboul63].)

The goal of the current work is to develop limiting results for interacting finite-state pure-jump processes on a class of block-structured networks. Our first main result, Theorem 5.1, and its consequence Corollary 5.1 give propagation of chaos and a law of large numbers under some regularity conditions on the degrees of the nodes. We show that in the mean-field limit, the asymptotic behavior of the node colors can be represented by the solution of a McKean–Vlasov system. Because of the lack of symmetry, we make use of the extension of the notion of chaoticity and Sznitman coupling methods to multi-class systems developed in [Reference Graham40, Reference Graham and Robert43]. The existence and uniqueness results for the limiting system are established in Theorem 4.1. The regularity conditions which we impose (cf. Condition 4.1) can be compared to the uniform degree property introduced in [Reference Delattre, Giacomin and Luçon28] for a model of interacting diffusions on random graphs and to the one introduced in [Reference Budhiraja, Mukherjee and Wu15] for a model of interacting pure-jump processes on sparse graphs.

Another aspect which we are interested in is the large deviations properties of the system. For this purpose, with the aim of simplicity, we will restrict ourselves to the case where the blocks are cliques and the peripheral subgraph is complete, that is, the case where all peripheral nodes of the system are connected and all the central nodes within the same block are connected. We then state our next main results in Theorem 6.1, which establishes the large deviations principle for the empirical measure vector over finite time duration, followed by Theorem 6.2, which gives the large deviations principle for the empirical process vector. These results generalize those of [Reference Borkar and Sundaresan12, Reference Léonard49] to the multi-population context. Also, unlike [Reference Léonard49] and similarly to [Reference Borkar and Sundaresan12], we do not impose chaotic initial conditions, but only converging initial conditions. The proofs of the large deviations principles, which provide tools for handling the technicalities arising from the multi-population context, generalize the classical approach developed in [Reference Dawson and Gärtner24] and its adaptation to the context of jump processes in [Reference Léonard49].

In summary, the current work is a contribution to the multi-population paradigm and a move towards heterogeneity for mean-field models and their large deviations behavior. The rest of this paper is organized as follows. The detailed model for interacting finite-state pure-jump processes on block-structured graphs is introduced in Section 2. Section 3 provides some practical examples of applications of the class of models studied in this paper. In Section 4, we introduce the McKean–Vlasov limiting system, and we prove the existence and uniqueness of its solution under specific regularity conditions introduced in Condition 4.1. Then, under the same conditions, in Section 5 we prove propagation of chaos (Theorem 5.1) and the law of large numbers (Corollary 5.1). Next, in Section 6, we present the large deviations principles for the empirical measure vector (Theorem 6.1) and for the empirical process vector (Theorem 6.2).

2. Formulation of the model

This section introduces the model and related notation.

2.1. The setting

A block-structured network:

  • Consider an undirected block-structured graph $\mathcal{G}=(\mathcal{V},\Xi)$ , where $\mathcal{V}$ is the set of nodes and $\Xi$ is the set of edges. The set $\mathcal{V}$ is partitioned into r (finite) blocks $C_1,\ldots,C_r$ of sizes $N_1,\ldots,N_r$ , respectively. Denote by $|\mathcal{V}|\,:\!=\,N_1+\cdots+N_r=N$ the cardinality of the set $\mathcal{V}$ , which corresponds to the total number of nodes in the network.

  • The nodes of each block $C_j$ are divided into two categories:

    • Central nodes $C^c_j$ are connected to some nodes from the same block but not to any node from another block. We set $|C_j^c|=N_j^c$.

    • Peripheral nodes $C^p_j$ are connected to some nodes from the same block and some nodes from other blocks. We set $|C^p_j|=N^p_j$.

Multi-color nodes:

Let $\mathcal{Z}\,:\!=\,\{1,2,\ldots,K\}\subset\mathbb{N}$ be a set of $K$ colors. Suppose that each node of the graph $\mathcal{G}=(\mathcal{V},\Xi)$ is colored by one of these colors at each time. One can associate each node with a particle whose state space is $\mathcal{Z}$. Thus, we will use the terms ‘node’ and ‘particle’ interchangeably. Denote by $(\mathcal{Z},\mathcal{E})$ the directed graph where $\mathcal{E}\subset\mathcal{Z}\times\mathcal{Z}\backslash \{(z,z)| z \in\mathcal{Z}\}$ describes the set of admissible jumps for each particle. Moreover, whenever $(z,z')\in\mathcal{E}$, a particle colored by $z$ is allowed to move from $z$ to $z'$ at a rate that depends on the current state of the node and the state of its neighbors (adjacent nodes).

For each $1\leq j\leq r$ and $n\in C_j^c$ (resp. $n\in C_j^p$), denote by $(X^c_{n,j}(t),t\geq 0)$ (resp. $(X^p_{n,j}(t),t\geq 0)$) the stochastic process that describes the state (color) of the central (resp. peripheral) node $n$ at time $t$. In addition, we denote by $\mu^N_j(t)$ the local empirical measure describing the state of the $j$th block at time $t$, which is given by

(1) \begin{equation}\begin{split}\mu_j^N(t)&\,:\!=\,\frac{1}{N_j}\Bigg(\sum_{n\in C^c_j}\delta_{X^c_{n,j}(t)}+\sum_{n\in C^p_j}\delta_{X^p_{n,j}(t)}\Bigg)\\ &=\frac{N_j^c}{N_j}\frac{1}{N_j^c}\sum_{n\in C^c_j}\delta_{X^c_{n,j}(t)}+\frac{N_j^p}{N_j}\frac{1}{N_j^p}\sum_{n\in C^p_j}\delta_{X^p_{n,j}(t)}\\ &=:\frac{N_j^c}{N_j}\mu_j^{c,N}(t)+\frac{N_j^p}{N_j}\mu_j^{p,N}(t),\end{split}\end{equation}

where $\mu_j^{c,N}(t)=\frac{1}{N_j^c}\sum_{n\in C^c_j}\delta_{X^c_{n,j}(t)}$ $\Big($ resp. $\mu_j^{p,N}(t)=\frac{1}{N_j^p}\sum_{n\in C^p_j}\delta_{X^p_{n,j}(t)}\Big)$ is the empirical measure describing the state of the central (resp. peripheral) nodes of the jth block at time t. The fractions $\frac{N_j^c}{N_j}$ (resp. $\frac{N_j^p}{N_j}$ ) thus represent the proportion of central (resp. peripheral) nodes in the block j. Denote by $\mathcal{M}_1(\mathcal{Z})$ the set of all probability measures over $\mathcal{Z}$ , endowed with the topology of weak convergence.
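The decomposition in (1) can be checked numerically on a toy block. The following is a minimal sketch, not part of the original analysis: the function names `empirical_measure` and `block_measure` and the sample colors are hypothetical, and a probability measure on $\mathcal{Z}=\{1,\ldots,K\}$ is simply stored as a list of $K$ masses.

```python
from collections import Counter


def empirical_measure(states, K):
    """Empirical measure of a list of colors in {1, ..., K}, as a list of K masses."""
    counts = Counter(states)
    n = len(states)
    return [counts.get(z, 0) / n for z in range(1, K + 1)]


def block_measure(central, peripheral, K):
    """Right-hand side of (1): mixture of the central and peripheral empirical
    measures, weighted by the proportions N_j^c / N_j and N_j^p / N_j."""
    n_c, n_p = len(central), len(peripheral)
    n = n_c + n_p
    mu_c = empirical_measure(central, K)
    mu_p = empirical_measure(peripheral, K)
    return [(n_c / n) * a + (n_p / n) * b for a, b in zip(mu_c, mu_p)]


# Hypothetical block with 4 central and 2 peripheral nodes and K = 3 colors.
K = 3
central = [1, 1, 2, 3]
peripheral = [2, 3]

mu = block_measure(central, peripheral, K)
direct = empirical_measure(central + peripheral, K)  # first line of (1)
print(mu, direct)
```

Both computations should agree up to floating-point rounding, which is exactly the content of the three equalities in (1).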

The random dynamics:

The process $X(t)=\Big(X^c_{n,j}(t), X^p_{m,j}(t),n\in C_j^c, m\in C_j^p, 1\leq j\leq r\Big)$ describing the evolution of the entire system is a continuous-time Markov chain with state space $\mathcal{Z}^N$ . The transition rate of each node depends on its current state and the state of its neighbors, together with the block to which it belongs. To characterize these neighborhoods, we introduce a set of local empirical measures describing the state of the star-shaped subgraph centered at each node n and composed of the nodes connected to it. To lighten the formulas and for ease of reading, we introduce the following shorthand notation: for any two nodes $n,m\in\mathcal{V}$ , $m\sim n$ means that $\{m,n\}\in\Xi$ . Moreover, for any block $1\leq j\leq r$ and $\iota\in\{c,p\}$ , we denote by $\mathfrak{N}_j^{\iota}(n)\,:\!=\,\{n'\in C_j^{\iota}\,:\,n\sim n'\}$ the set of nodes in $C_j^{\iota}$ that are connected to n. Let deg(n) denote the degree of the node n, and let $M_i^{\iota}(n)\,:\!=\,|\mathfrak{N}_i^{\iota}(n)|$ for $\iota\in\{c,p\}$ and $1\leq i\leq r$ be the cardinality of the set $\mathfrak{N}_i^{\iota}(n)$ . Thus, one notices that for $n\in C_j^c$ , $deg(n)=M_{j}^{c}(n)+M_{j}^{p}(n)$ , and for $n\in C_j^p$ , $deg(n)=M_{j}^{c}(n)+\sum_{k=1}^r M_k^{p}(n)$ .

Now, for any $n\in C_j^c$ and $1\leq j\leq r$ , let us define

(2) \begin{equation}\begin{split}\aleph^c_{n,j}(t)&\,:\!=\, \frac{1}{1+M_{j}^{c}(n)}\sum_{\substack{m\in\{n\}\cup\mathfrak{N}_j^c(n) }}\delta_{X^c_{m,j}(t)},\quad \aleph^p_{n,j}(t)\,:\!=\,\frac{1}{M_{j}^{p}(n)}\sum_{m\in\mathfrak{N}_j^p(n)}\delta_{X^p_{m,j}(t)},\end{split}\end{equation}

and

(3) \begin{equation}\begin{split}\varrho^c_{n,j}&\,:\!=\,\frac{1+M_{j}^{c}(n)}{1+deg(n)},\quad \varrho^p_{n,j}\,:\!=\,\frac{M_{j}^{p}(n)}{1+deg(n)},\end{split}\end{equation}

and finally

(4) \begin{equation}\begin{split}\mu_{n,j}^{c,N}(t)\,:\!=\, & \varrho^c_{n,j}\aleph^c_{n,j}(t)+\varrho^p_{n,j}\aleph^p_{n,j}(t).\end{split}\end{equation}
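As an illustration of (2)–(4), the sketch below computes the node-centered measure $\mu_{n,j}^{c,N}$ for a central node on a small star-shaped subgraph. The data layout (dictionaries `neighbors`, `kind`, `state`) and the function name are assumptions made for this example only. One further assumption is flagged in the code: when a central node has no peripheral neighbors, $\aleph^p_{n,j}$ is undefined in (2), and the sketch treats it as the zero vector since its weight $\varrho^p_{n,j}$ vanishes.

```python
def local_measure_central(n, neighbors, kind, state, K):
    """Sketch of mu^{c,N}_{n,j} in (4) for a central node n.

    neighbors: dict node -> set of adjacent nodes (all in the block of n)
    kind:      dict node -> 'c' (central) or 'p' (peripheral)
    state:     dict node -> color in {1, ..., K}
    """
    def measure(nodes):
        return [sum(1 for m in nodes if state[m] == z) / len(nodes)
                for z in range(1, K + 1)]

    cen = [m for m in neighbors[n] if kind[m] == 'c']
    per = [m for m in neighbors[n] if kind[m] == 'p']
    deg = len(neighbors[n])

    aleph_c = measure([n] + cen)                   # first measure in (2), includes n itself
    # Assumption: with no peripheral neighbors the weight rho_p is 0,
    # so the undefined aleph_p is replaced by the zero vector.
    aleph_p = measure(per) if per else [0.0] * K   # second measure in (2)
    rho_c = (1 + len(cen)) / (1 + deg)             # weights in (3)
    rho_p = len(per) / (1 + deg)
    return [rho_c * a + rho_p * b for a, b in zip(aleph_c, aleph_p)]


# Hypothetical star: central node 0 with one central and two peripheral neighbors.
K = 3
neighbors = {0: {1, 2, 3}, 1: {0}, 2: {0}, 3: {0}}
kind = {0: 'c', 1: 'c', 2: 'p', 3: 'p'}
state = {0: 1, 1: 2, 2: 2, 3: 3}

mu_local = local_measure_central(0, neighbors, kind, state, K)
print(mu_local)
```

Since $\varrho^c_{n,j}+\varrho^p_{n,j}=1$ for a central node, the result coincides with the plain empirical measure of the node together with all its neighbors.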

Similarly, for any $n\in C_j^p$, $1\leq j\leq r$, and $j'\neq j$, define

(5) \begin{equation}\begin{split}\beth^c_{n,j}(t)\,:\!=\,\frac{1}{M_{j}^{c}(n)}&\sum_{m\in\mathfrak{N}_j^c(n)}\delta_{X^c_{m,j}(t)},\mbox{ }\beth^p_{n,j,j}(t)\,:\!=\,\frac{1}{1+M_{j}^{p}(n)}\sum_{m\in\{n\}\cup \mathfrak{N}_j^p(n)}\delta_{X^p_{m,j}(t)},\\ &\mbox{ } \beth^p_{n,j,j'}(t)\,:\!=\,\frac{1}{M_{j'}^p(n)}\sum_{m\in \mathfrak{N}_{j'}^p(n)}\delta_{X^p_{m,j'}(t)},\end{split}\end{equation}

and

(6) \begin{equation}\begin{split}\varsigma^c_{n,j}&\,:\!=\,\frac{M_{j}^{c}(n)}{1+deg(n)},\quad\varsigma^p_{n,j,j}\,:\!=\,\frac{1+M_{j}^{p}(n)}{1+deg(n)},\quad\varsigma^p_{n,j,j'}\,:\!=\,\frac{M_{j'}^p(n)}{1+deg(n)},\end{split}\end{equation}

and finally

(7) \begin{equation}\begin{split}\mu_{n,j}^{p,N}(t)&\,:\!=\, \varsigma^c_{n,j}\beth^c_{n,j}(t)+\sum_{\substack{j'=1\\ j'\neq j}}^r\varsigma^p_{n,j,j'}\beth^p_{n,j,j'}(t).\end{split}\end{equation}

Therefore, the random dynamic in each block $1\leq j\leq r$ is described as follows:

  • The central nodes dynamic. For each central node $n\in C^c_j$, its color $X^c_{n,j}(t)$ goes from $z$ to $z'$, for $(z,z')\in\mathcal{E}$, at rate

    (8) \begin{align} \lambda_{j,z,z'}^{c}\bigg(\aleph^c_{n,j}(t),\aleph^p_{n,j}(t),\varrho^c_{n,j},\varrho^p_{n,j}\bigg),\end{align}
    which depends on its current state and on the states of its neighbors through the functions $ \lambda_{j,z,z'}^{c}\,:\,\mathcal{M}_1(\mathcal{Z})\times\mathcal{M}_1(\mathcal{Z})\times [0,1]\times [0,1]\rightarrow\mathbb{R}_+$ .
  • The peripheral nodes dynamic. For each peripheral node $n\in C^p_j$, its color $X^p_{n,j}(t)$ goes from $z$ to $z'$, for $(z,z')\in\mathcal{E}$, at rate

    (9) \begin{equation}\begin{split}\lambda^p_{j,z,z'}\bigg(\beth^c_{n,j}(t),\beth^p_{n,j,1}(t),\ldots,\beth^p_{n,j,r}(t),\varsigma^c_{n,j},\varsigma^p_{n,j,1},\ldots,\varsigma^p_{n,j,r}\bigg),\end{split}\end{equation}
    which also depends on its state and the states of its neighbors through the functions $ \lambda_{j,z,z'}^{p}\,:\,\big(\mathcal{M}_1(\mathcal{Z})\big)^{r+1}\times [0,1]^{r+1}\rightarrow\mathbb{R}_+$ .

The explicit forms of the rate functions will be introduced in Condition 4.1. To avoid cluttering our notation, let us introduce the following vectors:

(10) \begin{align} \upsilon_{n,j}^{c,N}(t)\,:\!=\,\bigg(\aleph^c_{n,j}(t),\aleph^p_{n,j}(t),\varrho^c_{n,j},\varrho^p_{n,j}\bigg),\end{align}
(11) \begin{equation} \begin{split} \upsilon_{n,j}^{p,N}(t)\,:\!=\, \bigg(\beth^c_{n,j}(t),\beth^p_{n,j,1}(t),\ldots,\beth^p_{n,j,r}(t),\varsigma^c_{n,j},\varsigma^p_{n,j,1},\ldots,\varsigma^p_{n,j,r}\bigg). \end{split}\end{equation}

Thus, we will write $ \lambda_{j,z,z^{\prime}}^{c}\Big(\upsilon_{n,j}^{c,N}(t)\Big)$ instead of (8) and $\lambda^p_{j,z,z'}\Big(\upsilon_{n,j}^{p,N}(t)\Big)$ instead of (9).

Remark 2.1. One can see the model under investigation as a multi-species system where each block $C_j$ represents a separate species. In particular, the rate functions $\lambda_{j,z,z'}^{c}$ and $\lambda_{j,z,z'}^{p}$ being block-dependent, the dynamic of each particle depends on its species, i.e., the block to which it belongs. This idea has been extensively used in the literature on multi-type systems; see, e.g., [Reference Agliari, Migliozzi and Tantari1, Reference Alberici, Camilli, Contucci and Mingione3, Reference Barra, Contucci, Mingione and Tantari5, Reference Budhiraja and Wu16] and the references therein. The specificity here is the existence of heterogeneity even across particles of the same species/block. Indeed, the central/peripheral paradigm creates two sub-types of particles within the same species whose rate functions differ. This construction appears to be natural in certain multi-group systems where only a few particles from the different groups interact; detailed examples are given below. Also, the interaction structure differs even for the central (resp. peripheral) particles of the same species/block, given that the rate functions depend on the node-centered local empirical measures, which differ even within the same block.

2.2. The infinitesimal generator

For any $T\in (0,+\infty)$ , the processes $X^c_{n,j}\,:\,[0,T]\rightarrow\mathcal{Z}$ for $n\in C_j^c$ and $X^p_{m,j}\,:\, [0,T] \rightarrow\mathcal{Z}$ for $m\in C_j^p$ , which respectively describe the evolution of the central and the peripheral particles over the time interval [0,T], are càdlàg paths, and thus are elements of the Skorokhod space $\mathcal{D}([0,T],\mathcal{Z})$ equipped with the Skorokhod topology. Let $X^N=\big(X^c_{n,j},X^p_{m,j},n\in C_j^c,m\in C_j^p, 1\leq j\leq r\big)\in\mathcal{D}([0,T],\mathcal{Z}^N)$ denote the full path description of all N particles. Thus the process $X^N$ is a Markov process with càdlàg paths, with state space $\mathcal{Z}^N$ , and with the infinitesimal generator $\mathcal{L}^N$ acting on the bounded measurable functions $\phi$ on $\mathcal{Z}^N$ according to

\begin{align*}\mathcal{L}^N\phi\big(x^N\big) &\,:\!=\, \sum_{j=1}^r\Bigg[ \sum_{n\in C_j^c}\sum_{z'\,:\,(z,z')\in\mathcal{E}}\lambda_{j,z,z'}^{c}\\& \qquad \Bigg(\frac{1}{M_{j}^{c}(n)+1}\sum_{\substack{m\in\{n\}\cup\mathfrak{N}_j^c(n) }}\delta_{x_{m,j}},\frac{1}{M_{j}^{p}(n)}\sum_{m\in\mathfrak{N}_j^p(n)}\delta_{x_{m,j}}, \varrho^c_{n,j},\varrho^p_{n,j}\Bigg)\\&\qquad\qquad\qquad\qquad\qquad\qquad\qquad\qquad\qquad\qquad\qquad\qquad\times\left( \phi\big(x^N_{n,z,z'}\big)-\phi\big(x^N\big) \right)\\ &+\sum_{n\in C_j^p}\sum_{z'\,:\,(z,z')\in\mathcal{E}}\lambda^p_{j,z,z'}\Bigg(\frac{1}{M_{j}^{c}(n)}\sum_{m\in\mathfrak{N}_j^c(n)}\delta_{x_{m,j}},\frac{1}{M_1^{p}(n)}\sum_{m\in\mathfrak{N}_1^p(n)}\delta_{x_{m,1}},\ldots,\\ & \qquad \frac{1}{M_{j}^{p}(n)+1}\sum_{\substack{m\in\{n\}\cup\mathfrak{N}_j^p(n) }}\delta_{x_{m,j}},\ldots\\ &\qquad\qquad\qquad\qquad\qquad\quad\ldots,\frac{1}{M_r^{p}(n)}\sum_{m\in\mathfrak{N}_r^p(n)}\delta_{x_{m,r}},\varsigma^c_{n,j},\varsigma^p_{n,j,1},\ldots,\varsigma^p_{n,j,r}\Bigg)\\ &\qquad\qquad\qquad\qquad\qquad\qquad\qquad\qquad\qquad\qquad\times\left( \phi\Big(x^N_{n,z,z'}\Big)-\phi\big(x^N\big) \right)\Bigg],\end{align*}

where $x^N=\big(x_{n,j},x_{m,j},n\in C_j^c,m\in C_j^p, 1\leq j\leq r\big)\in\mathcal{Z}^N$ and $x^N_{n,z,z'}$ describes the new configuration of the system when the state of the $n$th node has changed from $z$ to $z'$.

2.3. Stochastic differential equation representation

Recall that, for each central node $n\in C^c_j$ (resp. peripheral node $n\in C^p_j$ ) from a given block $1\leq j\leq r$ , the evolution of its color is described by the continuous-time stochastic process $(X_{n,j}^{c}(t),t\geq 0)$ $\big($ resp. $\big(X_{n,j}^{p}(t),t\geq 0\big)\big)$ that takes values in the finite state space $\mathcal{Z}$ , and whose dynamic is given by the time-dependent transition rate matrix $\left(\lambda_{j,z,z'}^{c}(\upsilon_{n,j}^{c,N}(t))\right)_{(z,z')\in\mathcal{E}}$ (resp. $\left(\lambda^p_{j,z,z'}(\upsilon_{n,j}^{p,N}(t))\right)_{(z,z')\in\mathcal{E}}$ ). Therefore, using a classical approach (see e.g. [Reference Skorokhod60, p. 104]), the processes $X_{n,j}^{c}$ and $X_{n,j}^{p}$ can be represented, at least weakly, by the following system of stochastic differential equations:

(12) \begin{equation}\begin{split}X^{c}_{n,j}(t)&=X^{c}_{n,j}(0)+\int_{[0,t]\times\mathbb{R}_+}\sum_{(z,z')\in\mathcal{E}}{\unicode{x1D7D9}}_{X^{c}_{n,j}(s{-})=z}(z'-z){\unicode{x1D7D9}}_{\left[0,\lambda_{j,z,z'}^{c}(\upsilon_{n,j}^{c,N}(s{-}))\right]}(y)\mathcal{N}_{n,j}^c(ds,dy),\\X^{p}_{n,j}(t)&=X^{p}_{n,j}(0)+\int_{[0,t]\times\mathbb{R}_+}\sum_{(z,z')\in\mathcal{E}}{\unicode{x1D7D9}}_{X^{p}_{n,j}(s{-})=z}(z'-z){\unicode{x1D7D9}}_{\left[0,\lambda^p_{j,z,z'}(\upsilon_{n,j}^{p,N}(s{-}))\right]}(y)\mathcal{N}_{n,j}^p(ds,dy),\end{split}\end{equation}

where $\{\mathcal{N}_{n,j}^c, n\in C_j^c, 1\leq j\leq r\}$ and $\{\mathcal{N}_{n,j}^p, n\in C_j^p, 1\leq j\leq r\}$ are collections of Poisson random measures on $\mathbb{R}^2_+$ whose intensity measure is the Lebesgue measure. We will use the representation (12) in the analysis of the asymptotic behavior of the system when the total number of nodes $N$ goes to infinity.
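For numerical experiments, a continuous-time Markov chain with state-dependent jump rates such as $X^N$ can be simulated with the standard Gillespie (stochastic simulation) algorithm. The sketch below swaps in that technique; it is not the Poisson-random-measure construction (12) itself, only a common way to sample a chain with the same rates. The function `gillespie`, the callback signature `rate(n, z, z2, state)`, and the toy rate used in the demonstration are all hypothetical.

```python
import random


def gillespie(state, admissible, rate, T, rng):
    """Simulate a finite-state jump system on [0, T].

    state:      dict node -> current color
    admissible: set of (z, z2) pairs, the admissible jumps (the edge set E)
    rate:       callable (n, z, z2, state) -> nonnegative jump rate
    Returns a list of (jump_time, configuration) pairs, starting at time 0.
    """
    state = dict(state)
    t = 0.0
    path = [(0.0, dict(state))]
    while True:
        # All currently possible moves with strictly positive rate.
        moves = [(n, z2, rate(n, z, z2, state))
                 for n, z in sorted(state.items())
                 for (a, z2) in sorted(admissible) if a == z]
        moves = [mv for mv in moves if mv[2] > 0]
        total = sum(r for _, _, r in moves)
        if total == 0:
            break                        # absorbing configuration
        t += rng.expovariate(total)      # exponential waiting time
        if t > T:
            break
        # Pick one move with probability proportional to its rate.
        u = rng.random() * total
        acc = 0.0
        for n, z2, r in moves:
            acc += r
            if u < acc:
                break
        state[n] = z2
        path.append((t, dict(state)))
    return path


# Hypothetical demo: 4 nodes, K = 2 colors, rate increasing in the
# fraction of nodes that already carry the target color.
def toy_rate(n, z, z2, state):
    frac = sum(1 for c in state.values() if c == z2) / len(state)
    return 1.0 + frac


T = 2.0
init = {0: 1, 1: 1, 2: 2, 3: 2}
admissible = {(1, 2), (2, 1)}
path = gillespie(init, admissible, toy_rate, T, random.Random(0))
print(len(path) - 1, "jumps before time", T)
```

In the model of Section 2.1, the callback would evaluate $\lambda^c_{j,z,z'}$ or $\lambda^p_{j,z,z'}$ at the node-centered vectors (10) and (11); the toy rate above uses the global color fractions only to keep the example short.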

3. Examples

As mentioned in the introduction, mean-field block models have been proposed to investigate various phenomena arising in fields such as physics, engineering, and biology. This section presents some examples of applications of the model analyzed in the current paper, with the goal of illustrating its usefulness and its flexibility in capturing various phenomena. Of course, it remains a toy model that should be appropriately adapted to different applications, but we believe that the insights from the current study are of great interest for both theoretical and practical purposes.

3.1. Load-balancing networks

Load-balancing protocols are often used in queueing networks to improve system performance by shortening the queue length, reducing the waiting time, and increasing the system throughput. In this regard, the mean-field approach has proven useful; see, e.g., [Reference Mitzenmather55, Reference Vvedenskaya, Dobrushin and Karpelevich65, Reference Vvedenskaya and Suhov66]. In particular, interesting work in this direction was proposed in [Reference Dawson, Tang and Zhao26], where the authors considered a queueing network with $N$ nodes in which queue lengths are balanced through mean-field interactions using an interaction function. Here we summarize their model and then describe how our current model can be used to generalize the ideas in [Reference Dawson, Tang and Zhao26].

Consider a system consisting of $N$ queues with a mean-field interaction. At $t = 0$, for $1 \leq n \leq N$, the arrival rate at the $n$th queue is $\zeta_{X_n(0)}$, and the service rate at queue $n$ is $\vartheta_{X_n(0)}$. Let $h\,:\,\mathbb{R}_+\times\mathbb{R}_+\rightarrow \mathbb{R}$ be a continuous nondecreasing interaction function satisfying certain regularity conditions (see [Reference Dawson, Tang and Zhao26, p. 339]). This function captures the mean-field interaction between queues as follows: for each queue $n= 1, 2,\ldots,N$, the arrival rate at time $t$ is given by $\zeta_{X_n(t)}- h(X_n (t),\langle \mu^N (t)(dx),x\rangle)$, where $\mu^N (t)\,:\!=\,\frac{1}{N}\sum_{j=1}^N\delta_{X_j(t)}$ is the empirical measure corresponding to the $N$ queues at time $t$, and $\langle \mu^N (t)(dx),x\rangle=\frac{1}{N}\sum_{j=1}^NX_j(t)$ is the mean queue length of the $N$ queues at time $t$. Roughly speaking, the arrival rate at each queue depends on the current size of the queue and on the mean size of its neighbors (which is the entire set of queues in this case). The authors of [Reference Dawson, Tang and Zhao26] studied the performance of this system when the number of queues $N$ goes to infinity.
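To make the balancing mechanism concrete, here is a minimal sketch of the arrival-rate formula $\zeta_{X_n(t)}-h(X_n(t),\langle\mu^N(t)(dx),x\rangle)$ for the homogeneous model. The particular choices below are assumptions for illustration only: a constant base rate `zeta` and a linear interaction `h(x, m) = 0.25 * (x - m)`; neither is taken from [Reference Dawson, Tang and Zhao26], whose regularity conditions on $h$ are not checked here.

```python
def mean_queue_length(queues):
    """<mu^N, x>: the mean of the N queue lengths."""
    return sum(queues) / len(queues)


def arrival_rate(n, queues, zeta, h):
    """Arrival rate at queue n: zeta(X_n) - h(X_n, mean queue length)."""
    x = queues[n]
    return zeta(x) - h(x, mean_queue_length(queues))


# Hypothetical choices: constant base rate and linear interaction.
def zeta(x):
    return 2.0


def h(x, m):
    return 0.25 * (x - m)


queues = [0, 2, 4]  # queue lengths X_1, X_2, X_3 (0-indexed in code)
rates = [arrival_rate(n, queues, zeta, h) for n in range(len(queues))]
print(rates)
```

With these choices, a queue shorter than the average receives arrivals faster than the base rate and a longer one more slowly, which is the balancing effect described above.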

The model proposed in the current paper can be seen as a generalization of the model in [Reference Dawson, Tang and Zhao26] to heterogeneous queueing networks, namely, to block-structured networks. To see this, let us consider the graph $\mathcal{G}=(\mathcal{V},\Xi)$ as a queueing network where the particles (nodes) are finite-buffer server queues of maximum size $K$ (arbitrarily large), and the corresponding states $\Big(X^c_{n,j}(t), X^p_{m,j}(t),n\in C_j^c,m\in C_j^p,1\leq j\leq r, t\geq 0\Big)$ represent the number of customers waiting in each queue at each time $t$. Again, at $t = 0$, for $1 \leq n \leq N$, the arrival rate at the $n$th queue is $\zeta_{X^{\iota}_{n,j}(0)}$, and the service rate at queue $n$ is $\vartheta_{X^{\iota}_{n,j}(0)}$, for $\iota\in\{c,p\}$. Since the network is now heterogeneous, the mean-field interaction is local. Thus, the arrival rate at a central node queue $n\in C_j^c$ at time $t$ is given by $\zeta^c_{X^c_{n,j}(t)}-h^c\Big(X^c_{n,j}(t),\Big\langle \mu^{c,N}_{n,j} (t)(dx),x\Big\rangle\Big)$, whereas the arrival rate at a peripheral node queue $n\in C_j^p$ at time $t$ is given by $\zeta^p_{X^p_{n,j}(t)}-h^p\Big(X^p_{n,j}(t),\Big\langle \mu_{n,j}^{p,N} (t)(dx),x\Big\rangle\Big)$, where $\mu^{c,N}_{n,j} (t)$ and $\mu_{n,j}^{p,N} (t)$ are the local empirical measures respectively given by (4) and (7). The service rates $\vartheta^c_{X^c_{n,j}(t)}$ and $\vartheta^p_{X^p_{n,j}(t)}$ depend only on the queue sizes $X^c_{n,j}(t)$ and $X^p_{n,j}(t)$ at time $t$. Hence, the transition rates $\lambda_{j,z,z'}^{c}$ and $\lambda^p_{j,z,z'}$ are specified as follows:

  • The size $X^c_{n,j}(t)$ of each central queue $n\in C^c_j$ at time t goes from $z$ to $z'$ at rate

    \begin{align*}\lambda_{j,z,z'}^{c}\,:\!=\,\left\{\begin{array}{l@{\quad}l} \zeta^c_{X^c_{n,j}(t)}-h^c\bigg(X^c_{n,j}(t),\frac{1}{deg(n)}\sum\limits_{\iota\in\{c,p\}}\sum\limits_{\substack{m\in \mathfrak{N}^{\iota}_j(n)}} X^{\iota}_{m,j}(t)\bigg) &{\text{if}\ z'=z+1\ \text{and}\ z'\leq K },\\[3pt] \vartheta^c_{X^c_{n,j}(t)} & {\text{if}\ z'=z-1\ \text{and}\ X^c_{n,j}(t)\geq 1 },\\[11pt] -\sum\limits_{y\neq z}\lambda^c_{j,z,y} & {\text{if}\ z'=z},\\[3pt] 0&\text{otherwise.} \end{array}\right.\end{align*}
  • The size $X^p_{n,j}(t)$ of each peripheral queue $n\in C^p_j$ at time t goes from $z$ to $z'$ at rate

    \begin{align*}\lambda^p_{j,z,z'}\,:\!=\,\left\{\begin{array}{l@{\quad}l} \zeta^p_{X^p_{n,j}(t)}-h^p\Bigg(X^p_{n,j}(t),\frac{1}{deg(n)}\sum\limits_{\substack{1\leq k\leq r\\ \iota\in\{c,p\}}}\sum\limits_{\substack{m\in\mathfrak{N}_k^{\iota}(n)}}(X^{\iota}_{m,k}(t))\Bigg) &{\text{if}\ z'=z+1\ \text{and}\ z'\leq K },\\ \vartheta^p_{X^p_{n,j}(t)} & {\text{if}\ z'=z-1\ \text{and}\ X^p_{n,j}(t)\geq 1 },\\[12pt] -\sum\limits_{y\neq z}\lambda^p_{j,z,y} & {\text{if}\ z'=z},\\ 0& \text{otherwise}. \end{array}\right.\end{align*}
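The central-queue rates above can be evaluated numerically once concrete choices are made for the model ingredients. The following is a minimal sketch only: the function name `central_queue_rates` and the particular `zeta`, `theta`, and `h` used in the example call are hypothetical placeholders, not choices made in the paper.

```python
from typing import Callable, Dict, Sequence


def central_queue_rates(
    z: int,
    neighbor_sizes: Sequence[int],
    K: int,
    zeta: Callable[[int], float],
    theta: Callable[[int], float],
    h: Callable[[int, float], float],
) -> Dict[int, float]:
    """Return {z': rate} for a central queue currently holding z customers.

    neighbor_sizes lists the queue lengths of all neighbors of the node
    (central and peripheral, same block), so its length plays the role of deg(n).
    """
    local_mean = sum(neighbor_sizes) / len(neighbor_sizes)
    rates: Dict[int, float] = {}
    if z + 1 <= K:  # arrival allowed only if the buffer is not full
        rates[z + 1] = zeta(z) - h(z, local_mean)
    if z >= 1:  # service possible only if the queue is nonempty
        rates[z - 1] = theta(z)
    return rates


# Hypothetical ingredients, for illustration only.
example = central_queue_rates(
    z=3,
    neighbor_sizes=[1, 4, 4],
    K=10,
    zeta=lambda z: 2.0,
    theta=lambda z: 1.0,
    h=lambda z, m: 0.1 * m,
)
```

The diagonal entry $\lambda^c_{j,z,z}$ is then minus the sum of the returned values. The peripheral case differs only in that the neighbor list ranges over all blocks.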

It is worth mentioning that the sparse graph topologies have been considered in applications in response to some issues encountered when implementing load-balancing protocols. In particular, many service systems are geographically constrained; therefore, when a task arrives at any specific server, it may be impossible to collect instantaneous state information from all the servers. In addition, executing a task commonly involves the use of some data, and storing such data for all possible tasks on all servers requires an excessive amount of storage capacity. The use of sparser graph topologies is then considered, such that tasks that arrive at a specific server can only be forwarded, following a specific load-balancing scheme, to the servers that possess the data required to process the tasks. In other words, a specific server can only interact with its neighbors in a suitable sparse topology; see, e.g., [Reference Budhiraja, Mukherjee and Wu15] and the references therein for more insights about the subject. The block-structured topology with the central/peripheral paradigm can for instance be considered to overcome the geographic constraint by allowing central nodes to rely only on the information collected locally on nodes from the same block, which may represent nodes within the same geographic area, while the peripheral nodes are those relying on information from both within and outside the block. To increase system efficiency, one could restrict the number of peripheral nodes allowed. The results obtained in the current work allow us to understand the behavior of such systems when the number N of servers of the network is very large. In particular, the multi-chaotic property established in Theorem 5.1 tells us that the queue lengths at any finite collection of tagged servers are statistically asymptotically independent, and the queue-length process for each server converges in distribution to the corresponding McKean–Vlasov process given by (14). 
Also, Condition 4.1 and Remark 4.1 tell us that the multi-chaoticity result holds even when the peripheral subgraph is not complete, which means that one can achieve similar asymptotic performance even with far fewer connections between the peripheral nodes than when all the peripheral nodes are connected and all the central nodes of the same block are connected.

3.2. Multi-population SIS epidemics

The susceptible–infected–susceptible (SIS) model, originally used in epidemiology, is also convenient to model the spread of information in networks, since the two phenomena are similar. The SIS model can be summarized as follows: consider a piece of information, or an infectious disease, that propagates across a population. A member that has a copy of the information/disease is said to be infected, and a member that does not have a copy of the information/disease is said to be susceptible. When an infected member comes into contact with a susceptible one, the former transmits a copy of the information (disease) to the latter, which gets infected. Moreover, an infected member may spontaneously get rid of the information/disease, a phenomenon called curing, and become susceptible again.

In both epidemiology and network information diffusion, the population often consists of relatively isolated subgroups such that members of the same subgroup interact a great deal, but only a few pairs of members from different subgroups are connected. One might think, for instance, of countries as isolated communities connected by travelers across the globe, or of interactions in social media, which often happen in almost closed communities, with only a few influential members interacting across groups. Our model allows one to study the spreading dynamics of information or of a disease among the members of a population structured as separate communities.

Consider a population consisting of r isolated communities and a ‘mobile’ community. The members of each isolated community interact only among themselves and with members of the mobile community. Thus, there is no direct interaction between members of different communities. However, indirect inter-community interactions happen via the set of mobile members. This idea was used in [Reference Akhil, Altman and Sundaresan2], where the authors considered an optimal control problem to find the optimal resource allocation strategy that maximizes information spread over a multi-community population. Their objective was to obtain a good tradeoff between the information spread in the network and the use of system resources.

Now, let $\mathcal{Z}\,:\!=\,\{0,1\}$ be the state space that indicates whether the particle is susceptible $(=0)$ or infected $(=1)$ . Recalling the model description introduced in Section 2, one might think of the central nodes of each block as an isolated community that interacts with other communities only through the members of the mobile community represented by the peripheral nodes. Note that in contrast to [Reference Akhil, Altman and Sundaresan2], the central nodes of a given block interact only with peripheral (mobile) nodes from the same block, and not with all the peripheral/mobile nodes, as stipulated in [Reference Akhil, Altman and Sundaresan2]. Also, the interaction graph for the peripheral members is not complete; thus, not all the peripheral nodes interact with each other. Nevertheless, the fact that the multi-chaotic property holds under Condition 4.1 (cf. Theorem 5.1) tells us that systems with full connections among the peripheral components and among the components of each block are asymptotically close to those with fewer connections, as specified by Condition 4.1 and Remark 4.1. This is of interest, for example, in resource allocation problems where a cost is attributed to each connection; however, such considerations are beyond the scope of the present paper.

Denote by $X^c_{n,j}(t)$ , for $n\in C_j^c$ , and $X^p_{m,j}(t)$ , for $m\in C_j^p$ , the state (‘susceptible’ or ‘infected’) of the nth central particle and the mth peripheral particle, respectively, in the jth community. Two connected central members of the same community j come in contact with each other at rate $\gamma_j$ . Connected peripheral and central nodes from the same community interact with each other at rate $\nu_j$ . Two connected peripheral nodes come in contact with each other at rate $\eta$ . Finally, an infected node in the jth community spontaneously gets rid of the infection at rate $\zeta_j$ . Therefore, the transition rates $\lambda_{j,z,z'}^{c}$ and $\lambda^p_{j,z,z'}$ , which sum up the dynamics that we are interested in, are specified as follows:

  • The state $X^c_{n,j}(t)$ of each central member $n\in C^c_j$ at time t goes from $z$ to $z'$ at rate

    \begin{align*}\lambda_{j,z,z'}^{c}\,:\!=\,\left\{\begin{array}{l@{\quad}l} \sum\limits_{\substack{m\in \mathfrak{N}_j^c(n) }}X^c_{m,j}(t)\gamma_j+\sum\limits_{\substack{m\in \mathfrak{N}_j^p(n) }}X^p_{m,j}(t)\nu_j & \text{if}\ \text{z}=0\ \text{and}\ \text{z}'=1 ,\\ \zeta_j & \text{if}\ \text{z}=1\ \text{and}\ \text{z}'=0 ,\\ -\sum\limits_{y\neq z}\lambda^c_{j,z,y} & \text{if}\ \text{z}'=\text{z},\\ 0&\mbox{otherwise.} \end{array}\right.\end{align*}
  • The state $X^p_{n,j}(t)$ of each peripheral (mobile) member $n\in C^p_j$ at time t goes from $z$ to $z'$ at rate

    \begin{align*}\lambda^p_{j,z,z'}\,:\!=\,\left\{\begin{array}{l@{\quad}l} \sum\limits_{\substack{m\in \mathfrak{N}_j^c(n) }}X^c_{m,j}(t)\nu_j+\sum\limits_{k=1}^r\sum\limits_{\substack{m\in \mathfrak{N}_k^p(n) }}X^p_{m,k}(t)\eta & \text{if}\ \text{z}=0\ \text{and}\ \text{z}'=1 ,\\ \zeta_j & \text{if}\ \text{z}=1\ \text{and}\ \text{z}'=0 ,\\ -\sum\limits_{y\neq z}\lambda^p_{j,z,y} & \text{if}\ \text{z}'=\text{z},\\ 0&\text{otherwise.} \end{array}\right.\end{align*}
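The two SIS rate tables can be transcribed directly into code. The sketch below is illustrative only: the function names and argument layout are assumptions, and the neighbor states are passed in as plain lists of 0/1 values.

```python
from typing import Dict, Sequence


def sis_rates_central(
    z: int,
    central_nbrs: Sequence[int],
    peripheral_nbrs: Sequence[int],
    gamma_j: float,
    nu_j: float,
    zeta_j: float,
) -> Dict[int, float]:
    """Off-diagonal rates of a central node in block j (state 0 or 1)."""
    if z == 0:
        # Infection pressure from infected central and peripheral neighbors.
        return {1: gamma_j * sum(central_nbrs) + nu_j * sum(peripheral_nbrs)}
    return {0: zeta_j}  # spontaneous curing


def sis_rates_peripheral(
    z: int,
    central_nbrs: Sequence[int],
    peripheral_nbrs_by_block: Sequence[Sequence[int]],
    nu_j: float,
    eta: float,
    zeta_j: float,
) -> Dict[int, float]:
    """Off-diagonal rates of a peripheral node attached to block j."""
    if z == 0:
        infected_peripheral = sum(sum(block) for block in peripheral_nbrs_by_block)
        return {1: nu_j * sum(central_nbrs) + eta * infected_peripheral}
    return {0: zeta_j}
```

For instance, a susceptible central node with two infected central neighbors and one infected peripheral neighbor gets infected at rate $2\gamma_j+\nu_j$ .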

Note that the large deviations properties established in Section 6 constitute a step towards the study of the large-time behavior of such systems. Indeed, the large deviations of the empirical measures established in Theorem 6.2 can be used to investigate the large deviations of the invariant measure, from which one can study the large-time behavior of the system and related phenomena such as metastability and convergence to the invariant measure. The interested reader can consult, e.g., [Reference Dawson, Sid-Ali and Zhao25, Reference Freidlin and Wentzell36, Reference Hwang and Sheu44, Reference Yasodharan and Sundaresan67] and the references therein for further insight.

4. Existence and uniqueness of the limiting system

This section aims to introduce and prove the existence of the limiting equation that describes the behavior of the interacting particle system detailed in Section 2, as the total number of particles N in the system tends to infinity. In particular, this equation is of McKean–Vlasov type as explained below. The main result of this section is Theorem 4.1, which establishes the existence and uniqueness of the solution of the limiting McKean–Vlasov equation. The convergence of the system towards this equation will then be investigated in Section 5.

4.1. Notation and conventions

Let $(\mathbb{S}, d)$ be a Polish space. For any $y\in\mathbb{S}^d$ , for some $d\in\mathbb{N}$ , one writes $\|y\|\,:\!=\,\max (y_1, \ldots, y_d)$ . For any $x\in\mathcal{D}([0,T],\mathbb{S}^d)$ , $\|x\|_T$ denotes $\sup_{0\leq t\leq T}\|x(t)\|$ . Let $\mathcal{M}(\mathbb{S})$ be the set of all measures on $\mathbb{S}$ . Given $\mu,\nu\in\mathcal{M}(\mathbb{S})$ , the bounded-Lipschitz metric $d_{BL}(\cdot,\cdot)$ is defined by

(13) \begin{align}d_{BL}(\mu,\nu)\,:\!=\, \sup_{g\in Lip(\mathbb{S})} \big| \langle \mu,g\rangle-\langle\nu,g\rangle \big|,\end{align}

where

\begin{align*}Lip(\mathbb{S})\,:\!=\,\bigg\{g\in C_b(\mathbb{S})\,:\,\sup_{x\in\mathbb{S}} |g(x)|\leq 1,\sup_{x\neq y} \frac{|g(x)-g(y)|}{d(x,y)} \leq 1\bigg\}.\end{align*}

Recall that the bounded-Lipschitz metric metrizes the weak convergence of probability measures on $\mathbb{S}$ with respect to bounded continuous test functions $C_b(\mathbb{S})$ . For $p\geq 1$ , let $\mathcal{P}_p(\mathbb{S})$ be the collection of all probability measures on $\mathbb{S}$ with finite pth moment. Then, for any $\mu$ and $\nu$ in $\mathcal{P}_p(\mathbb{S})$ , the pth Wasserstein distance between $\mu$ and $\nu$ is defined as

\begin{align*} \mathcal{W}_p( \mu, \nu)\,:\!=\, \left(\inf_{\gamma\in\Gamma (\mu,\nu)} \int_{\mathbb{S}\times \mathbb{S}} d(x,y)^p d \gamma(x,y)\right)^{1/p},\end{align*}

where $\Gamma (\mu ,\nu )$ denotes the collection of all measures on $\mathbb{S}\times \mathbb{S}$ with marginals $\mu$ and $\nu$ . Moreover, for $M_1,M_2$ in $\mathcal{P}_p\big(\mathcal{D}([0,T],\mathbb{S})\times\cdots\times\mathcal{D}([0,T],\mathbb{S})\big)$ , the pth Wasserstein distance between $M_1$ and $M_2$ is given, for any $t\in [0,T]$ , by

\begin{align*}\mathcal{W}_{p,t}(M_1, M_2) \,:\!=\, \inf\bigg\{\big[\mathbb{E}\|Y_1-Y_2\|^p_t\big]^{1/p}\,:\, & Y_1, Y_2 \in \mathcal{D}([0,T],\mathbb{S})\times\cdots\times\mathcal{D}([0,T],\mathbb{S}),\\& M_1=\mathcal{L}(Y_1),\ M_2= \mathcal{L}(Y_2)\bigg\}.\end{align*}
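On a finite state space of the form $\{0,1,\ldots,K\}$ with $d(x,y)=|x-y|$ (an assumption made here for illustration), the first Wasserstein distance between two probability vectors reduces to the sum of absolute differences of their cumulative distribution functions. This is a standard one-dimensional identity rather than a result of the paper. A minimal sketch, with a hypothetical function name:

```python
from itertools import accumulate
from typing import Sequence


def wasserstein1_on_integers(mu: Sequence[float], nu: Sequence[float]) -> float:
    """W_1 between two pmfs on {0, ..., K} with ground metric |x - y|.

    Uses the one-dimensional identity W_1 = sum_k |F_mu(k) - F_nu(k)|,
    where the sum runs over k = 0, ..., K - 1.
    """
    if len(mu) != len(nu):
        raise ValueError("both measures must live on the same state space")
    cdf_mu = list(accumulate(mu))
    cdf_nu = list(accumulate(nu))
    # The last CDF value is 1 for both measures and contributes nothing.
    return sum(abs(a - b) for a, b in zip(cdf_mu[:-1], cdf_nu[:-1]))
```

For two point masses at 0 and at K the function returns K, the cost of moving all the mass across the whole state space.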

4.2. The limiting system

We use in the sequel the convention that N goes to infinity when both $\min_{1\leq j\leq r}N_j^c$ and $\min_{1\leq j\leq r}N_j^p$ go to infinity. Given the multi-population setting, one describes the state of the system at each time t using the following empirical measure vector:

\begin{align*}\mu^N(t)\,:\!=\,\left(\mu_1^{c,N}(t),\mu_1^{p,N}(t),\cdots,\mu_r^{c,N}(t),\mu_r^{p,N}(t)\right),\end{align*}

where we recall that for each $1\leq j\leq r$ , $\mu_j^{c, N}(t)$ (resp. $\mu_j^{p, N}(t)$ ) is the empirical measure describing the states of the central (resp. peripheral) nodes of the jth block at time t. Under some regularity conditions (cf. Condition 4.1), we will prove in Section 5 the convergence, as N tends to infinity, of the empirical measure vector $\mu^{N}$ towards the distribution $\mu\in\mathcal{M}_1(\mathcal{D}\big([0,T],\mathcal{Z}^{2r}\big))$ , the solution of an appropriate limiting system. Namely, the empirical vector $\mu^N$ should converge weakly to $\mu$ with

\begin{align*}\mu&\,:\!=\,\left(\mu_1^{c},\mu_1^{p},\cdots,\mu_r^{c},\mu_r^{p}\right)\in\big(\mathcal{M}_1(\mathcal{D}([0,T],\mathcal{Z}))\big)^{2r},\end{align*}

where $ \mu_j^{c}\,:\!=\,\mathcal{L}\big( \bar{X}^c_{n,j}\big)$ for $n\in C_j^c$ and $\mu_j^{p}\,:\!=\,\mathcal{L}( \bar{X}^p_{m,j})$ for $m\in C_j^p$ , with $\Big(\bar{X}^c_{n,j}, \bar{X}^p_{m,j},n\in C_j^c,m\in C_j^p, 1\leq j\leq r\Big)$ being the solution of the following system of stochastic differential equations:

(14) \begin{equation}\begin{split} \bar{X}^{c}_{n,j}(t)&= \bar{X}^{c}_{n,j}(0)+\int_{[0,t]\times\mathbb{R}_+}\sum_{(z,z')\in\mathcal{E}}{\unicode{x1D7D9}}_{ \bar{X}^{c}_{n,j}(s{-})=z}(z'-z){\unicode{x1D7D9}}_{\left[0,\lambda_{j,z,z'}^{c}\left(\upsilon_{j}^c(s{-})\right)\right]}(y)\mathcal{N}_{n,j}^c(ds,dy),\\ \bar{X}^{p}_{m,j}(t)&= \bar{X}^{p}_{m,j}(0)+\int_{[0,t]\times\mathbb{R}_+}\sum_{(z,z')\in\mathcal{E}}{\unicode{x1D7D9}}_{ \bar{X}^{p}_{m,j}(s{-})=z}(z'-z){\unicode{x1D7D9}}_{\left[0,\lambda^p_{j,z,z'}\left(\upsilon_{j}^p(s{-})\right)\right]}(y)\mathcal{N}_{m,j}^p(ds,dy).\end{split}\end{equation}

The vectors $\upsilon_{j}^c(t)$ and $\upsilon_{j}^p(t)$ are defined by

(15) \begin{equation}\begin{split}\upsilon_{j}^c(t)&\,:\!=\,\big(\mu^c_j(t),\mu^p_j(t),p_j^c,p_j^p\big), \\\upsilon_{j}^p(t)&\,:\!=\,\big(\mu^c_j(t),\mu^p_1(t),\ldots,\mu^p_r(t),\alpha_j^c,q_{j,1},\ldots,q_{j,r}\big),\end{split}\end{equation}

where $p_j^c,p_j^p,\alpha_j^c, q_{j1},\ldots,q_{jr}\in (0,1)$ are parameters satisfying

\begin{align*} p_j^c+p_j^p=1\quad\text{and}\quad \alpha_j^c+q_{j1}+\cdots+q_{jr}=1\quad\text{for each} \quad 1\leq j\leq r;\end{align*}

these parameters will later be chosen appropriately (cf. Condition 4.1). The link between the initial conditions of the systems (12) and (14) will be introduced in the sequel. Observe that the solution of (14) depends not only on its sample path but also on the distribution of the process itself. Thus, the system (14) is McKean–Vlasov.
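The Poisson-random-measure representation in (14) can be read as a thinning construction: a point $(s,y)$ of the driving measure triggers the jump $z\to z'$ exactly when $y$ falls below the current rate. The sketch below illustrates this for a two-state process whose rates are prescribed functions of time, that is, with the law-dependent argument frozen in advance. The function name, the uniform bound `rate_bound` on the rates, and the restriction to the state space $\{0,1\}$ are all assumptions made for the example.

```python
import random
from typing import Callable, List, Tuple


def simulate_two_state(
    x0: int,
    rate_up: Callable[[float], float],
    rate_down: Callable[[float], float],
    rate_bound: float,
    T: float,
    rng: random.Random,
) -> List[Tuple[float, int]]:
    """Simulate a {0,1}-valued jump process on [0, T] by thinning.

    Points (s, y) of a Poisson random measure with Lebesgue intensity on
    [0, T] x [0, rate_bound] are generated in time order. A point triggers
    the jump out of the current state when y lies below the current rate.
    Returns the list of (jump time, new state), starting with (0.0, x0).
    """
    t, x = 0.0, x0
    path = [(0.0, x0)]
    while True:
        t += rng.expovariate(rate_bound)  # next point of the time projection
        if t > T:
            return path
        y = rng.uniform(0.0, rate_bound)  # mark of the point
        lam = rate_up(t) if x == 0 else rate_down(t)
        if lam > rate_bound:
            raise ValueError("rate exceeds the assumed bound")
        if y < lam:  # strict inequality so that a zero rate never fires
            x = 1 - x
            path.append((t, x))
```

Points with mark above `rate_bound` can be ignored because they never trigger a jump, which is why only the strip $[0,T]\times[0,\texttt{rate\_bound}]$ is sampled.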

4.3. Regularity assumptions

We introduce and discuss here the regularity conditions under which the existence and uniqueness of the limiting system (14), together with the propagation of chaos and laws of large numbers investigated in Section 5, hold.

Condition 4.1.

  1. For all $1\leq j\leq r$ and $(z,z')\in\mathcal{E}$ , there exist measurable functions $\gamma^{j,c}_{z,z'}\,:\,\mathcal{Z}\rightarrow\mathbb{R}^+$ and $\gamma^{j,p}_{z,z'}\,:\,\mathcal{Z}\rightarrow\mathbb{R}^+$ such that the following hold:

  • For any probability measures $\nu,\mu\in\mathcal{M}_1(\mathcal{Z})$ and any real numbers $a_1,a_2$ satisfying $0<a_1,a_2<1$ and $a_1+a_2=1$ , we have

    (16) \begin{equation}\lambda_{j,z,z'}^{c}(\nu,\mu,a_1,a_2)=a_1\int_{\mathcal{Z}}\gamma^{j,c}_{z,z'}(x)\nu (dx)+ a_2\int_{\mathcal{Z}}\gamma^{j,p}_{z,z'}(x)\mu (dx). \end{equation}
  • For any $\nu,\mu_1,\ldots,\mu_r\in\mathcal{M}_1(\mathcal{Z})$ and any real numbers $a,b_1,\ldots,b_r$ satisfying $0<a,b_1,\ldots,b_r<1$ and $a+b_1+\cdots+b_r=1$ , we have

    (17) \begin{align}\lambda^p_{j,z,z'}(\nu,\mu_1,\ldots,\mu_r,a,b_1,&\dots,b_r)= a\int_{\mathcal{Z}}\gamma^{j,c}_{z,z'}(x)\nu (dx) \nonumber\\&+b_1\int_{\mathcal{Z}}\gamma^{1,p}_{z,z'}(x)\mu_1 (dx)+\cdots+b_r\int_{\mathcal{Z}}\gamma^{r,p}_{z,z'}(x)\mu_r (dx).\end{align}
  2. For each block $1\leq j\leq r$ , there exist $p_j^c,p_j^p\in(0,1)$ such that, as $N\rightarrow\infty$ ,

    (18) \begin{align}\frac{N_j^p}{N_j}\rightarrow p_j^p,\quad \frac{N_j^c}{N_j}\rightarrow p_j^c,\quad \text{and} \quad p^p_j+p^c_j=1.\end{align}
  3. For each block $1\leq j\leq r$ , as $N\rightarrow\infty$ ,

    (19) \begin{align}\sup_{n\in C_j^c}\left|\varrho_{n,j}^c-p^c_j\right|\rightarrow 0 \quad \text{and} \quad \sup_{n\in C_j^c}\left|\varrho_{n,j}^p-p^p_j\right|\rightarrow 0.\end{align}
  4. For each block $1\leq j\leq r$ , there exist $\alpha_j^c,q_{j1},\ldots,q_{jr}\in (0,1)$ with $\alpha_j^c+q_{j1}+\cdots+q_{jr}=1$ such that the following conditions hold for each block $1\leq i\leq r$ , as $N\rightarrow\infty$ :

    (20) \begin{align}\sup_{n \in C_j^p}\bigg|\varsigma_{n,j}^c-\alpha_{j}^c\bigg|\rightarrow 0\quad\text{ and }\quad\sup_{n\in C_j^p}\bigg|\varsigma_{n,j,i}^p-q_{ji}\bigg|\rightarrow 0.\end{align}
  5. For all nodes $n\in\mathcal{V}$ , $deg(n)\rightarrow\infty$ as $N\rightarrow\infty$ .
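Since $\mathcal{Z}$ is finite, the integrals in (16) and (17) are finite sums, so the rate functions are straightforward to evaluate. The following sketch assumes $\mathcal{Z}=\{0,\ldots,K\}$ , represents each measure and each function $\gamma$ as a list indexed by the state, and uses hypothetical function names.

```python
from typing import Sequence


def integrate(gamma: Sequence[float], measure: Sequence[float]) -> float:
    """Integral of gamma against a measure on the finite set {0, ..., K}."""
    return sum(g * m for g, m in zip(gamma, measure))


def rate_central(gamma_c, gamma_p, nu, mu, a1: float, a2: float) -> float:
    """Right-hand side of (16) for fixed j and (z, z')."""
    return a1 * integrate(gamma_c, nu) + a2 * integrate(gamma_p, mu)


def rate_peripheral(gamma_c, gammas_p, nu, mus, a: float, bs: Sequence[float]) -> float:
    """Right-hand side of (17): gammas_p[k] and mus[k] belong to block k + 1."""
    total = a * integrate(gamma_c, nu)
    for b, gamma_p, mu in zip(bs, gammas_p, mus):
        total += b * integrate(gamma_p, mu)
    return total
```

Both rates are affine in each measure argument, which is the structural property the contraction estimate in the existence proof relies on.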

Remark 4.1.

  1. Since $\mathcal{Z}$ is a finite state space, the functions $\gamma^{j,c}_{z,z'}$ and $\gamma^{j,p}_{z,z'}$ are bounded on $\mathcal{Z}$ . Moreover, since $\mathcal{Z}\subset \mathbb{N}$ and since every bounded function on $\mathbb{N}$ is automatically Lipschitz, $\gamma^{j,c}_{z,z'}$ and $\gamma^{j,p}_{z,z'}$ are also Lipschitz. Denote by $\bar{\gamma}>0$ the maximum bound and by $L_{\gamma}$ the maximum Lipschitz coefficient of the families of functions $\Big\{\gamma^{j,c}_{z,z'},1\leq j \leq r,(z,z')\in\mathcal{E}\Big\}$ and $\Big\{\gamma^{j,p}_{z,z'},1\leq j \leq r,(z,z')\in\mathcal{E}\Big\}$ .

  2. The conditions in (19) and (20) are satisfied if, for instance, for all $1\leq j\leq r$ , the following hold:

  • For each $n\in C_j^c$ , we have $M_{j}^{c}(n)/N_j^c\rightarrow 1$ and $M_{j}^{p}(n)/N_j^p\rightarrow 1$ as $N\rightarrow\infty$ .

  • For each $n\in C_j^p$ , we have $M_{j}^{c}(n)/N_j^c\rightarrow 1$ and $M_i^{p}(n)/N_i^p\rightarrow 1$ as $N\rightarrow\infty$ for all $1\leq i\leq r$ .

  Indeed, under this assumption one can define

    (21) \begin{align}\alpha_j^c&=\lim_{N\rightarrow\infty}\frac{N_j^c}{N_j^c+N_1^p+\cdots+N_r^p}\qquad\forall 1\leq j\leq r,\end{align}
    (22) \begin{align}q_{ji}&=\lim_{N\rightarrow\infty}\frac{N_{i}^p}{N_j^c+N_1^p+\cdots+N_r^p}\qquad\forall 1\leq j,i\leq r, \end{align}
    and thus, one can easily verify that, as $N\rightarrow\infty$ , the following hold:
  • For all $n\in C_j^c$ ,

    (23) \begin{align}\frac{1+M_{j}^{c}(n)}{1+deg(n)}\rightarrow p^c_j\mbox{ and } \frac{M_{j}^{p}(n)}{1+deg(n)}\rightarrow p^p_j.\end{align}
  • For all $n\in C_j^p$ and $i\neq j$ ,

    (24) \begin{align}\frac{M_i^{p}(n)}{1+deg(n)}\rightarrow q_{ji},\quad \frac{1+M_j^{p}(n)}{1+deg(n)}\rightarrow q_{jj},\quad\text{and}\quad\frac{M_{j}^{c}(n)}{1+deg(n)}\rightarrow\alpha_{j}^c.\end{align}
  3. A special case where the conditions (19) and (20) are satisfied is when the blocks are cliques and the peripheral subgraph is complete, that is, when all peripheral nodes are connected (see Figure 1) and all the nodes in the same block are connected. In such a case, the central (resp. peripheral) nodes in the same block are exchangeable.

    Figure 1. Example of a block-structured graph with a complete peripheral subgraph. Here we have a 4-block-structured graph linked by a set of peripheral nodes. For the first block, the set of central nodes is $C_1^c=\{1,2\}$ and the set of peripheral nodes is $C_1^p=\{3,4\}$ . The set of all peripheral nodes of the graph is given by the set of nodes $C^p=\{3,4,5,10,11,14,18\}$ .

  4. Even though the conditions (19) and (20) are somewhat restrictive, the construction of the model allows one to have very different degrees in each block. One might further compare these conditions with existing conditions in the literature. Consider for example the condition imposed in [Reference Budhiraja, Mukherjee and Wu15] for a supermarket model on sparse graphs to asymptotically behave as on cliques. The condition in [Reference Budhiraja, Mukherjee and Wu15] relies on the local properties of the graph by requiring direct neighbors of any node to have asymptotically similar degrees; see [Reference Budhiraja, Mukherjee and Wu15, Condition 1(ii)]. This condition is violated in our model. Indeed, the conditions (19) and (20) allow central and peripheral nodes from the same block to have very different degrees, even if they are neighbors, which goes beyond [Reference Budhiraja, Mukherjee and Wu15, Condition 1(ii)]. In addition, under our condition, $deg_{\max} (G)/deg_{\min}(G)$ need not go to 1 as $N\rightarrow\infty$ , nor need $\max_j \left|\left(deg_{\min} (C_j)/deg_{\max}(C_j)\right)-1\right|$ go to zero as proposed in [Reference Budhiraja, Mukherjee and Wu15, Remark 1] (here $deg_{\min}(C_j)$ and $deg_{\max}(C_j)$ refer to the minimum and maximum degrees of nodes within the same block j). In this sense, the family of graphs which we are considering in the present work is sparser than the ones covered by [Reference Budhiraja, Mukherjee and Wu15, Condition 1(ii)]. Another condition with which to compare ours is the one proposed in [Reference Delattre, Giacomin and Luçon28], under which an n-dimensional diffusion system converges to a limiting Fokker–Planck equation; see [Reference Delattre, Giacomin and Luçon28, Equations (1.1) and (1.3)]. Note that [Reference Delattre, Giacomin and Luçon28, Equations (1.5) and (1.7)] impose global regularity conditions in the sense that the degrees of all the nodes should converge to the same limit; such conditions are not imposed here.

  5. While the current paper considers deterministic graphs, one can investigate the case where the underlying graph topology is random. For example, it is of interest for some applications to have a scenario where the connections between the peripheral nodes are random. One can then search for adequate conditions to impose on the edge dynamics for the propagation-of-chaos property to hold. This, however, goes beyond the scope of the current paper.
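The limiting weights in (21) and (22) of Remark 4.1 are limits of simple ratios of block sizes. For a fixed finite network, the pre-limit ratios can be computed as in the following sketch; the function name and the zero-based block index are assumptions of the example.

```python
from typing import List, Sequence, Tuple


def peripheral_weights(
    N_c: Sequence[int], N_p: Sequence[int], j: int
) -> Tuple[float, List[float]]:
    """Finite-N analogues of alpha_j^c in (21) and (q_{j1}, ..., q_{jr}) in (22).

    N_c[k] and N_p[k] are the numbers of central and peripheral nodes of
    block k (zero-based), and j is the block of the peripheral node considered.
    """
    denom = N_c[j] + sum(N_p)
    alpha = N_c[j] / denom
    q = [n / denom for n in N_p]
    return alpha, q


alpha, q = peripheral_weights(N_c=[60, 20], N_p=[10, 10], j=0)
```

By construction the returned weights sum to one, matching the constraint $\alpha_j^c+q_{j1}+\cdots+q_{jr}=1$ in Condition 4.1.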

4.4. Existence and uniqueness

We now prove the existence and uniqueness of the solution of the limiting McKean–Vlasov system introduced in (14).

Theorem 4.1. Suppose that Condition 4.1 holds. Then, for a given initial condition $\Big(\Big(\bar{X}_n^{c}(0),\bar{X}_m^{p}(0)\Big),n\in C_j^c,m\in C_j^p;\,1\leq j\leq r\Big)$ , the McKean–Vlasov system (14) has a unique solution over any finite time interval [0,T]. In addition, this solution depends continuously on the initial condition in the following sense: if $(\bar{X}^1(t),t\in[0, T])$ and $(\bar{X}^2(t),t\in[0, T])$ are two solutions of (14) with two different initial conditions $(\bar{X}^1(0))$ and $(\bar{X}^2(0))$ , respectively, then there exists a constant $A_T$ , depending on the time horizon T, such that

(25) \begin{equation}\begin{split}&\max_{1\leq j\leq r}\max_{n\in C_j^c}\mathbb{E}\left[\Big\|\bar{X}^{1,c}_{n,j}-\bar{X}^{2,c}_{n,j}\Big\|_T\right]+\max_{1\leq j\leq r}\max_{n\in C_j^p}\mathbb{E}\left[\Big\|\bar{X}_{n,j}^{1,p}-\bar{X}_{n,j}^{2,p} \Big\|_T\right]\\&\qquad\leq \bigg( \max_{1\leq j\leq r}\max_{n\in C_j^c} \mathbb{E}\Big[\Big|\bar{X}^{1,c}_{n,j}(0)-\bar{X}^{2,c}_{n,j}(0)\Big|\Big]+\max_{1\leq j\leq r}\max_{n\in C_j^p}\mathbb{E}\Big[\Big|\bar{X}_{n,j}^{1,p}(0)-\bar{X}_{n,j}^{2,p}(0)\Big|\Big] \bigg)e^{A_T}.\end{split}\end{equation}

Proof. For $1\leq j\leq r$ , with a slight abuse of notation, let

\[ e_{j,c}\,:\, \big(x^c_1, x_1^p,\ldots, x_r^c,x_r^p\big) \in \big(\mathcal{Z}^{2r}\big)\rightarrow x_j^c\in\mathcal{Z}\]

and

\[ e_{j,p}\,:\, \big(x^c_1, x_1^p,\ldots, x_r^c,x_r^p\big) \in \big(\mathcal{Z}^{2r}\big)\rightarrow x_j^p\in\mathcal{Z}\]

be the cth and the pth component, respectively, of the jth projection. Moreover, for $t\leq T$ , let $p_t\,:\, f\in \mathcal{D}\big([0,T],\mathcal{Z}^{2r}\big)\rightarrow f(t)\in\mathcal{Z}^{2r}$ .

For $M\in\mathcal{M}_1\big(\mathcal{D}\big([0,T],\mathcal{Z}^{2r}\big)\big)$ , let $M (t) \,:\!=\, M \circ p_t^{-1}$ . Consider the system starting at $\bar{X}_0=(\bar{X}_0^{1,c},\bar{X}_0^{1,p},\ldots,\bar{X}_0^{r,c},\bar{X}_0^{r,p})$ and given, at each $t\in (0,T]$ , by

(26) \begin{equation}\begin{split} \bar{X}^{c}_{j}(t)&= \bar{X}^{c}_{j}(0)+\int_{[0,t]\times\mathbb{R}_+}\sum_{(z,z')\in\mathcal{E}}{\unicode{x1D7D9}}_{ \bar{X}^{c}_{j}(s{-})=z}(z'-z){\unicode{x1D7D9}}_{\left[0,\lambda_{j,z,z'}^{c}\left(\upsilon_{j}^c(s{-})\right)\right]}(y)\mathcal{N}_{j}^c(ds,dy),\\ \bar{X}^{p}_{j}(t)&= \bar{X}^{p}_{j}(0)+\int_{[0,t]\times\mathbb{R}_+}\sum_{(z,z')\in\mathcal{E}}{\unicode{x1D7D9}}_{ \bar{X}^{p}_{j}(s{-})=z}(z'-z){\unicode{x1D7D9}}_{\left[0,\lambda^p_{j,z,z'}\left(\upsilon_{j}^p(s{-})\right)\right]}(y)\mathcal{N}_{j}^p(ds,dy),\end{split}\end{equation}

for $1\leq j\leq r$ , where $\mu_j^c(t)=M(t)\circ e_{j,c}^{-1}$ and $\mu_j^p(t)=M(t)\circ e_{j,p}^{-1}$ , the vectors $\upsilon_{j}^c(t)$ and $\upsilon_{j}^p(t)$ are given by (15), and $\big\{\mathcal{N}_{j}^c, 1\leq j\leq r\big\}$ and $\big\{\mathcal{N}_{j}^p, 1\leq j\leq r\big\}$ are collections of Poisson random measures on $\mathbb{R}^2$ whose intensity measures are Lebesgue on $\mathbb{R}^2_+$ . Denote by $\psi$ and $\phi$ the mappings that associate to M the solution of this system and its corresponding law. Thus, $\psi (M)=\big(\bar{X}^c_{j}, \bar{X}^p_{j},1\leq j\leq r\big)$ and $\phi (M)=\mathcal{L}\big(\bar{X}^c_{j}, \bar{X}^p_{j},1\leq j\leq r\big)$ . Observe that if $\bar{X}$ is a solution of (14), then its law is a fixed point of $\phi$ . Conversely, if M is a fixed point of $\phi$ for the system (26), then the corresponding solution $\psi (M)$ defines a solution of the limiting system (14). The idea is then to prove the existence of a fixed point of $\phi$ .

Take $M_1,M_2\in\mathcal{M}_1\big(\mathcal{D}\big([0,T],\mathcal{Z}^{2r}\big)\big)$ . Set $\bar{X}_1\,:\!=\,\big(\bar{X}^{1,c}_1,\bar{X}^{1,p}_1\ldots,\bar{X}^{1,c}_r,\bar{X}^{1,p}_r\big)=\psi (M_1)$ and $\bar{X}_2\,:\!=\,\big(\bar{X}^{2,c}_1,\bar{X}^{2,p}_1\ldots,\bar{X}^{2,c}_r,\bar{X}^{2,p}_r\big)=\psi (M_2)$ . Thus, $\mathcal{L}(\bar{X}_1)=\phi (M_1)$ and $\mathcal{L}(\bar{X}_2)=\phi (M_2)$ . Moreover, for all $t\in [0,T]$ , define $\mu_1(t)\,:\!=\,(\mu_1^{1,c}(t),\mu_1^{1,p}(t),\ldots,\mu_r^{1,c}(t),\mu_r^{1,p}(t))$ and $\mu_2(t)\,:\!=\,(\mu_1^{2,c}(t),\mu_1^{2,p}(t),\ldots,\mu_r^{2,c}(t),\mu_r^{2,p}(t))$ with $\mu_j^{1,c}(t)\,:\!=\,M_1(t)\circ e_{j,c}^{-1}$ , $\mu_j^{1,p}(t)\,:\!=\,M_1(t)\circ e_{j,p}^{-1}$ , $\mu_j^{2,c}(t)\,:\!=\,M_2(t)\circ e_{j,c}^{-1}$ and $\mu_j^{2,p}(t)\,:\!=\,M_2(t)\circ e_{j,p}^{-1}$ for $1\leq j\leq r$ . According to (15), we introduce the following notation:

(27) \begin{equation}\begin{split}\upsilon^{1,c}_j(t)&\,:\!=\,\big(\mu^{1,c}_j(t),\mu^{1,p}_j(t),p_j^{c},p_j^{p}\big), \\\upsilon^{2,c}_j(t)&\,:\!=\,\big(\mu^{2,c}_j(t),\mu^{2,p}_j(t),p_j^{c},p_j^{p}\big), \\\upsilon^{1,p}_{j}(t)&\,:\!=\,\big(\mu^{1,c}_j(t),\mu^{1,p}_1(t),\ldots,\mu^{1,p}_r(t),\alpha_j^c,q_{j,1},\ldots,q_{j,r}\big),\\\upsilon^{2,p}_{j}(t)&\,:\!=\,\big(\mu^{2,c}_j(t),\mu^{2,p}_1(t),\ldots,\mu^{2,p}_r(t),\alpha_j^c,q_{j,1},\ldots,q_{j,r}\big).\end{split}\end{equation}

We first prove that $\phi$ is a contraction mapping on $\mathcal{M}_1\big(\mathcal{D}\big([0,T],\mathcal{Z}^{2r}\big)\big)$ ; that is, for any $t\in[0,T]$ ,

(28) \begin{align}\mathcal{W}_{1,t}\big(\phi(M_1),\phi(M_2)\big)\leq C(t)\mathbb{E}\bigg[\int_{[0,t]} \mathcal{W}_{1,s}(M_1,M_2) ds\bigg].\end{align}

To this end, for ease of reading, let us introduce the following notation:

(29) \begin{equation}\begin{split}\Delta_j^c(t)\,:\!=\,\big\|\bar{X}^{1,c}_{j}-\bar{X}^{2,c}_{j}\big\|_t,\quad \Delta_j^p(t)\,:\!=\,\big\|\bar{X}^{1,p}_{j}-\bar{X}^{2,p}_{j}\big\|_t.\end{split}\end{equation}

Indeed, for any $1\leq j\leq r$ we have that

(30) \begin{equation}\begin{split}\Delta_j^c(t)&\leq \int_{[0,t]\times\mathbb{R}_+}\bigg|\sum_{(z,z')\in\mathcal{E}}(z'-z)\bigg\{{\unicode{x1D7D9}}_{\bar{X}^{1,c}_{j}(s{-})=z}{\unicode{x1D7D9}}_{\left[0,\lambda_{j,z,z'}^{c}\Big(\upsilon^{1,c}_j(s)\Big)\right]}(y) \\ &\qquad\qquad\qquad\qquad -{\unicode{x1D7D9}}_{\bar{X}^{2,c}_{j}(s{-})=z}{\unicode{x1D7D9}}_{\left[0,\lambda_{j,z,z'}^{c}(\upsilon^{2,c}_{j}(s))\right]}(y)\bigg\}\bigg| \mathcal{N}_{j}^c(ds,dy).\end{split}\end{equation}

Using a martingale argument (see (63)) and taking the expectation, by adding and subtracting terms (see (65)) one gets, for any $t\in [0, T]$ ,

(31) \begin{equation}\begin{split}\mathbb{E}\left[\Delta_j^c(t)\right]&\leq K\mathbb{E}\Bigg[\int_{[0,t]}\sum_{(z,z')\in\mathcal{E}}\bigg|\Big({\unicode{x1D7D9}}_{\bar{X}^{1,c}_{j}(s)=z}-{\unicode{x1D7D9}}_{\bar{X}^{2,c}_{j}(s)=z}\Big)\lambda_{j,z,z'}^{c}\Big(\upsilon^{1,c}_j(s)\Big) \\ &\qquad\qquad\qquad\qquad\qquad+{\unicode{x1D7D9}}_{\bar{X}^{2,c}_{j}(s)=z}\bigg(\lambda_{j,z,z'}^{c}\Big(\upsilon^{1,c}_j(s)\Big)-\lambda_{j,z,z'}^{c}\Big(\upsilon^{2,c}_j(s)\Big)\bigg) \bigg|ds\Bigg].\end{split}\end{equation}

Recall the definition of the functions $\lambda_{j,z,z'}^{c}$ in (16). Given that $\mu_j^c(t)$ and $\mu_j^p(t)$ are probability measures and using the boundedness of the functions $\gamma^{j,c}_{z,z'}$ and $\gamma^{j,p}_{z,z'}$ , one easily gets that

(32) \begin{align}\lambda_{j,z,z'}^{c}\Big(\upsilon^{1,c}_j(s)\Big)\leq \bar{\gamma},\end{align}

and

(33) \begin{equation} \begin{split} \bigg|\lambda_{j,z,z'}^{c}\Big(\upsilon_j^{1,c}(s)\Big){-}\lambda_{j,z,z'}^{c}\Big(\upsilon_j^{2,c}(s)\Big)\bigg|&\leq p_j^c\bigg|\Big\langle \gamma^{j,c}_{z,z'}, \mu_j^{1,c}(s)-\mu_j^{2,c}(s)\Big\rangle\bigg|+p_j^p\bigg|\Big\langle \gamma^{j,p}_{z,z'}, \mu_j^{1,p}(s){-}\mu_j^{2,p}(s)\Big\rangle\bigg|\\ &\leq p_j^c\bar{\gamma} d_{BL} \Big(\mu_j^{1,c}(s),\mu_j^{2,c}(s)\Big)+p_j^p\bar{\gamma}d_{BL} \Big(\mu_j^{1,p}(s),\mu_j^{2,p}(s)\Big). \end{split} \end{equation}

Therefore one obtains

(34) \begin{align}\mathbb{E}\big[\Delta_j^c(t)\big]&\leq K|\mathcal{E}|\bar{\gamma}\mathbb{E}\bigg[\int_{[0,t]}\bigg(\big|\bar{X}^{1,c}_{j}(s)-\bar{X}^{2,c}_{j}(s)\big|+p_j^c d_{BL} \Big(\mu_j^{1,c}(s),\mu_j^{2,c}(s)\Big)\nonumber\\& \qquad +p_j^pd_{BL} \Big(\mu_j^{1,p}(s),\mu_j^{2,p}(s)\Big)\bigg) ds\bigg].\end{align}

Using (17) and the same steps as previously, one finds, for any $1\leq j\leq r$ ,

(35) \begin{equation}\begin{split}\mathbb{E}\big[\Delta_j^p(t)\big]&\leq K|\mathcal{E}|\bar{\gamma}\mathbb{E}\bigg[\int_{[0,t]}\bigg(\big|\bar{X}^{1,p}_{j}(s)-\bar{X}^{2,p}_{j}(s)\big|+\alpha_j^c d_{BL} \Big(\mu_j^{1,c}(s),\mu_j^{2,c}(s)\Big) \\ &\qquad\qquad\quad\qquad+q_{j1}d_{BL} \Big(\mu_1^{1,p}(s),\mu_1^{2,p}(s)\Big)+\cdots+q_{jr}d_{BL} \Big(\mu_r^{1,p}(s),\mu_r^{2,p}(s)\Big)\bigg)ds\bigg]. \end{split}\end{equation}

On the one hand, from the Kantorovich–Rubinstein theorem, one has that for $1\leq j\leq r$ and $\alpha\in\{c,p\}$ ,

(36) \begin{align}\mathcal{W}_1\Big(\mu_j^{1,\alpha}(s),\mu_j^{2,\alpha}(s)\Big)=\sup_{\|g\|_L\leq 1} \Big| \Big\langle \mu_j^{1,\alpha}(s),g\Big\rangle-\Big\langle \mu_j^{2,\alpha}(s),g\Big\rangle \Big|,\end{align}

where the supremum is taken over the functions g with Lipschitz constant 1. Therefore,

(37) \begin{align}d_{BL} \Big(\mu_j^{1,\alpha}(s),\mu_j^{2,\alpha}(s)\Big)\leq \mathcal{W}_1\Big(\mu_j^{1,\alpha}(s),\mu_j^{2,\alpha}(s)\Big).\end{align}
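As a sanity check on (37), consider the two-point state space $\mathcal{Z}=\{0,1\}$ of Section 3.2 with $d(0,1)=1$ , and two probability measures $\mu,\nu$ with $\mu(\{1\})=a$ and $\nu(\{1\})=b$ (the symbols $a$ and $b$ are introduced only for this illustration). For any test function g,

\begin{align*}\langle \mu,g\rangle-\langle\nu,g\rangle=(a-b)\big(g(1)-g(0)\big).\end{align*}

The Lipschitz constraint gives $|g(1)-g(0)|\leq 1$ , with equality for $g(0)=0$ and $g(1)=1$ , a function which also satisfies $\sup_x|g(x)|\leq 1$ . Hence, in this case,

\begin{align*}d_{BL}(\mu,\nu)=\mathcal{W}_1(\mu,\nu)=|a-b|,\end{align*}

so that (37) holds with equality. On larger state spaces the additional bound $\sup_x|g(x)|\leq 1$ in $Lip(\mathbb{S})$ can make the inequality strict.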

On the other hand, one can easily verify that

(38) \begin{align}\mathcal{W}_1\Big(\mu_j^{1,\alpha}(s),\mu_j^{2,\alpha}(s)\Big)\leq\mathcal{W}_{1,s}\big(M_1,M_2\big).\end{align}

Thus, using (37) and (38) and taking the supremum over $1\leq j\leq r$ in (34) and (35), one obtains

(39) \begin{equation}\begin{split}\mathbb{E}\bigg[\sup_{1\leq j\leq r}\Delta_j^c(t)\bigg]&\leq K|\mathcal{E}|\bar{\gamma}\mathbb{E}\bigg[\int_{[0,t]}\bigg(\sup_{1\leq j\leq r}\Delta_j^c(s)+ \mathcal{W}_{1,s}(M_1,M_2)\bigg) ds\bigg]\end{split}\end{equation}

and

(40) \begin{equation}\begin{split}\mathbb{E}\bigg[\sup_{1\leq j\leq r}\Delta_j^p(t)\bigg]&\leq K|\mathcal{E}|\bar{\gamma}\mathbb{E}\bigg[\int_{[0,t]}\bigg(\sup_{1\leq j\leq r}\Delta_j^p(s) + \mathcal{W}_{1,s}(M_1,M_2)\bigg) ds\bigg].\end{split}\end{equation}

Adding the last two inequalities and applying Grönwall’s lemma leads to

(41) \begin{equation}\begin{split}\mathbb{E}\bigg[\sup_{1\leq j\leq r}\Delta_j^c(t)+\sup_{1\leq j\leq r}\Delta_j^p(t)\bigg]&\leq K|\mathcal{E}|\bar{\gamma}\mathbb{E}\bigg[\int_{[0,t]} \mathcal{W}_{1,s}(M_1,M_2) ds\bigg]e^{K|\mathcal{E}|\bar{\gamma}t}.\end{split}\end{equation}

Hence,

(42) \begin{equation}\begin{split}\mathbb{E}\big[\|\bar{X}^{1}-\bar{X}^{2}\|_t\big]&\leq C(t)\mathbb{E}\bigg[\int_{[0,t]} \mathcal{W}_{1,s}(M_1,M_2) ds\bigg],\end{split}\end{equation}

with $C(t)=K|\mathcal{E}|\bar{\gamma}e^{K|\mathcal{E}|\bar{\gamma}t}$ . From the definition of the Wasserstein distance, it is easy to observe that

\begin{align*}\mathcal{W}_{1,t}\big(\phi(M_1),\phi(M_2)\big)\leq \mathbb{E}\big[\|\bar{X}^{1}-\bar{X}^{2}\|_t\big],\end{align*}

from which one deduces (28).

Consider now the following recursive scheme:

  • $M_0\in\mathcal{M}_1\big(\mathcal{D}\big([0,T],\mathcal{Z}^{2r}\big)\big)$ ;

  • $M_{k+1}=\phi (M_k),\quad k\geq 0$ .

By iterating the formula in (28) and using the fact that $\mathcal{W}_{1,t}(M_1,M_2)$ is nondecreasing in t, one finds that for all $k\geq 0$ ,

\begin{align*}\mathcal{W}_{1,t}(M_{k+2},M_{k+1})\leq \frac{(tC(t))^k}{k!}\mathcal{W}_{1,t}(M_1,M_2).\end{align*}

Moreover, it is easy to verify that $\mathbb{E}[\|\bar{X}\|_T]<\infty$ , where $\bar{X}=\psi(M)$ for any $M\in\mathcal{M}_1\big(\mathcal{D}\big([0,T],\mathcal{Z}^{2r}\big)\big)$ , from which we deduce that $\mathcal{W}_{1,t}(M_1,M_2)<\infty$ and thus the sequence $\{M_k\}_{k\geq 0}$ is a Cauchy sequence. Note that the space $\mathcal{P}_1\big(\mathcal{D}\big([0,T],\mathcal{Z}^{2r}\big)\big)$ endowed with the Wasserstein distance $\mathcal{W}_{1,T}$ is complete (see [Reference Bolley11]). Hence the sequence $\{M_k\}_{k\geq 0}$ converges to some measure M in $\mathcal{P}_1\big(\mathcal{D}\big([0,T],\mathcal{Z}^{2r}\big)\big)$ which is a fixed point of $\phi$ on $\mathcal{P}_1\big(\mathcal{D}\big([0,T],\mathcal{Z}^{2r}\big)\big)$ . This proves the existence of the solution of the equation in (26) and thus that of the equation in (14). Uniqueness follows from again using (28) and Grönwall’s lemma.
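The Cauchy property of the Picard iterates rests on the summability of the factorial bounds $(tC(t))^k/k!$ : the distance between $M_n$ and any later iterate is controlled by a tail of the exponential series. The following sketch, with the hypothetical helper `picard_tail` and an arbitrary value standing in for $TC(T)$ , illustrates how quickly that tail vanishes. It is a numerical aside under these assumptions, not a step of the proof.

```python
import math


def picard_tail(c, n, terms=200):
    """Sum of c**k / k! over k = n, ..., n + terms - 1 (a truncated exponential tail)."""
    term = c ** n / math.factorial(n)
    total = 0.0
    for k in range(n, n + terms):
        total += term
        term *= c / (k + 1)  # passes from c**k / k! to c**(k+1) / (k+1)!
    return total


c = 2.0  # stands in for T * C(T); the value is an assumption made for illustration
full_series = picard_tail(c, 0)
tails = [picard_tail(c, n) for n in (0, 5, 10, 20, 30)]
```

With `n = 0` the sum recovers $e^{c}$ up to truncation error, and the tails decrease to zero as n grows, which is what makes $\{M_k\}_{k\geq 0}$ a Cauchy sequence.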

Let $\big(\bar{X}^1(t)\big)\,:\!=\,\Big(\bar{X}^{1,c}_{n,j}(t),\bar{X}^{1,p}_{m,j}(t),n\in C_j^c,m\in C_j^p, 1\leq j\leq r\Big)$ and $\big(\bar{X}^2(t)\big)\,:\!=\,\Big(\bar{X}^{2,c}_{n,j}(t),\bar{X}^{2,p}_{m,j}(t),n\in C_j^c,m\in C_j^p, 1\leq j\leq r\Big)$ be two solutions of (14) with respective initial conditions $(\bar{X}^1(0))$ and $(\bar{X}^2(0))$ . Denote by $\mu_j^{1,c}(t)\,:\!=\,\mathcal{L}\Big(\bar{X}^{1,c}_{n,j}(t)\Big)$ and $\mu_j^{1,p}(t)\,:\!=\,\mathcal{L}\Big(\bar{X}^{1,p}_{m,j}(t)\Big)$ , for $1\leq j \leq r$ , the probability measures corresponding to the first solution. Similarly, denote by $\mu_j^{2,c}(t)\,:\!=\,\mathcal{L}\Big(\bar{X}^{2,c}_{n,j}(t)\Big)$ and $\mu_j^{2,p}(t)\,:\!=\,\mathcal{L}\Big(\bar{X}^{2,p}_{m,j}(t)\Big)$ the probability measures corresponding to the second solution. Again, for ease of reading, let us introduce the following notation:

(43) \begin{equation}\begin{split}\Delta_{n,j}^c(t)\,:\!=\,\big\|\bar{X}^{1,c}_{n,j}-\bar{X}^{2,c}_{n,j}\big\|_t,\quad \Delta_{n,j}^p(t)\,:\!=\,\big\|\bar{X}_{n,j}^{1,p}-\bar{X}_{n,j}^{2,p} \big\|_t.\end{split}\end{equation}

Using this together with the notation in (27), one finds that for any $n\in C_j^c$ ,

(44) \begin{equation}\begin{split}\Delta_{n,j}^c(t)&\leq \Big|\bar{X}^{1,c}_{n,j}(0)-\bar{X}^{2,c}_{n,j}(0)\Big|\\ &\qquad+\int_{[0,t]\times\mathbb{R}_+}\Bigg|\sum_{(z,z')\in\mathcal{E}}(z'-z)\Bigg\{{\unicode{x1D7D9}}_{\bar{X}^{1,c}_{n,j}(s{-})=z}{\unicode{x1D7D9}}_{\left[0,\lambda_{j,z,z'}^{c}\Big(\upsilon_{j}^{1,c}(s)\Big)\right]}(y) \\ &\qquad\qquad\qquad\qquad\qquad\qquad\qquad -{\unicode{x1D7D9}}_{\bar{X}^{2,c}_{n,j}(s{-})=z}{\unicode{x1D7D9}}_{\left[0,\lambda_{j,z,z'}^{c}\Big(\upsilon_{j}^{2,c}(s)\Big)\right]}(y)\Bigg\}\Bigg| \mathcal{N}_{n,j}^c(ds,dy).\end{split}\end{equation}

Using a martingale argument (see (63)), taking the conditional expectation $\mathbb{E}^0$ given $(\bar{X}^1(0),\bar{X}^2(0))$ , and finally adding and subtracting terms (see (65)), one gets that for $t\in [0,T]$ ,

(45) \begin{equation}\begin{split}\mathbb{E}^0\Big[\Delta_{n,j}^c(t)\Big]&\leq \Big|\bar{X}^{1,c}_{n,j}(0)-\bar{X}^{2,c}_{n,j}(0)\Big|\\&+K\mathbb{E}^0\Bigg[\int_{[0,t]}\sum_{(z,z')\in\mathcal{E}}\bigg|\Big({\unicode{x1D7D9}}_{\bar{X}^{1,c}_{n,j}(s)=z}-{\unicode{x1D7D9}}_{\bar{X}^{2,c}_{n,j}(s)=z}\Big)\lambda_{j,z,z'}^{c}\Big(\upsilon_{j}^{1,c}(s)\Big) \\ &\quad\qquad\qquad\qquad\qquad+{\unicode{x1D7D9}}_{\bar{X}^{2,c}_{n,j}(s{-})=z}\Big(\lambda_{j,z,z'}^{c}\Big(\upsilon_{j}^{1,c}(s)\Big)-\lambda_{j,z,z'}^{c}\Big(\upsilon_{j}^{2,c}(s)\Big)\Big) \bigg|ds\Bigg].\end{split}\end{equation}

Since $\mu_{j}^{1,c}(t)$ and $\mu_{j}^{1,p}(t)$ are probability measures, it is easy to see using (16) and the boundedness of the functions $\gamma^{j,c}_{z,z'}$ and $\gamma^{j,p}_{z,z'}$ that

(46) \begin{align}\lambda_{j,z,z'}^{c}\Big(\upsilon_{j}^{1,c}(s)\Big)\leq \bar{\gamma}.\end{align}

Additionally, the Lipschitz property of the functions $\gamma^{j,c}_{z,z'}$ and $\gamma^{j,p}_{z,z'}$ leads to

(47) \begin{align} \bigg|\lambda_{j,z,z'}^{c}\Big(\upsilon_{j}^{1,c}(s)\Big)-\lambda_{j,z,z'}^{c}\Big(\upsilon_{j}^{2,c}(s)\Big)\bigg|&\leq p_{j}^c\bar{\gamma}\mathbb{E}\Big[\Big|\bar{X}^{1,c}_{n,j}(s)-\bar{X}^{2,c}_{n,j}(s)\Big|\Big]+p_{j}^p\bar{\gamma}\mathbb{E}\Big[\Big|\bar{X}^{1,p}_{1,j}(s)-\bar{X}^{2,p}_{1,j}(s)\Big|\Big]. \end{align}

Therefore one obtains

(48) \begin{equation}\begin{split}\mathbb{E}^0\Big[\Delta_{n,j}^c(t)\Big]&\leq \Delta_{n,j}^c(0)+K\bar{\gamma}|\mathcal{E}|\int_{[0,t]}\bigg(\mathbb{E}^0\Big[\Delta_{n,j}^c(s)\Big]+p_{j}^c\mathbb{E}\Big[\Delta_{n,j}^c(s)\Big]+p_{j}^p\mathbb{E}\Big[\Delta_{1,j}^p(s)\Big]\bigg) ds.\end{split}\end{equation}

Taking the expectation on both sides of the last inequality, and recalling that $p_j^c+p_j^p=1$ for all $1\leq j \leq r$ , one gets

(49) \begin{equation}\begin{split}\mathbb{E}\Big[\Delta_{n,j}^c(t)\Big]&\leq \mathbb{E}\big[\Delta_{n,j}^c(0)\big]+2K\bar{\gamma}|\mathcal{E}|\int_{[0,t]}\bigg(\mathbb{E}\Big[\Delta_{n,j}^c(s)\Big]+\mathbb{E}\Big[\Delta_{1,j}^p(s)\Big]\bigg) ds.\end{split}\end{equation}

Thus,

(50) \begin{align}\mathbb{E}\Big[\Delta_{n,j}^c(t)\Big]&\leq \max_{1\leq j\leq r}\max_{m\in C_j^c} \mathbb{E}\Big[\Delta_{m,j}^c(0)\Big]\nonumber\\& \qquad +2K\bar{\gamma}|\mathcal{E}|\int_{[0,t]}\bigg(\max_{1\leq j\leq r}\max_{m\in C_j^c}\mathbb{E}\Big[\Delta_{m,j}^c(s)\Big]+\max_{1\leq j\leq r}\max_{m\in C_j^p}\mathbb{E}\Big[\Delta_{m,j}^p(s)\Big]\bigg) ds.\end{align}

Taking the maximum over $n\in C_j^c$ and $1\leq j\leq r$ gives

(51) \begin{equation}\begin{split}\max_{1\leq j\leq r}\max_{n\in C_j^c}\mathbb{E}\Big[\Delta_{n,j}^c(t)\Big]&\leq \max_{1\leq j\leq r}\max_{m\in C_j^c} \mathbb{E}\Big[\Delta_{m,j}^c(0)\Big]+2K\bar{\gamma}|\mathcal{E}|\int_{[0,t]}\bigg(\max_{1\leq j\leq r}\max_{m\in C_j^c}\mathbb{E}\Big[\Delta_{m,j}^c(s)\Big]\\&\qquad\qquad\qquad\qquad\qquad\qquad+\max_{1\leq j\leq r}\max_{m\in C_j^p}\mathbb{E}\Big[\Delta_{m,j}^p(s)\Big]\bigg) ds.\end{split}\end{equation}

Using similar arguments, one finds that for any $1\leq j\leq r$ and $n\in C_j^p$ ,

(52) \begin{align} \mathbb{E}^0\left[\Delta_{n,j}^p(t)\right] & \leq |\bar{X}_{n,j}^{1,p}(0)-\bar{X}_{n,j}^{2,p}(0)|+K\mathbb{E}^0\bigg[\int_{[0,t]}\sum_{(z,z')\in\mathcal{E}}\bigg|\bigg(\bar{X}^{1,p}_{n,j}(s)-\bar{X}^{2,p}_{n,j}(s)\bigg)\lambda^p_{j,z,z'}\left(\upsilon_{j}^{1,p}(s)\right)\nonumber \\ &\qquad+\bigg(\lambda^p_{j,z,z'}\left(\upsilon^{1,p}_{j}(s)\right)-\lambda^p_{j,z,z'}\left(\upsilon_{j}^{2,p}(s)\right)\bigg) \bigg|ds\bigg].\end{align}

By (17) and the Lipschitz and boundedness properties of the functions $\gamma^{j,c}_{z,z'}$ and $\gamma^{j,p}_{z,z'}$ , one finds that

(53) \begin{align}\mathbb{E}^0\left[\Delta_{n,j}^p(t)\right]&\leq \Delta_{n,j}^p(0)+K\bar{\gamma}|\mathcal{E}|\int_{[0,t]}\bigg(\mathbb{E}^0\Big[\Delta_{n,j}^p(s)\Big] +\alpha_{j}^c\mathbb{E}\Big[\Delta_{1,j}^c(s)\Big]+q_{j1}\mathbb{E}\Big[\Delta_{1,1}^p(s)\Big]+\ldots\nonumber\\& \qquad \qquad\qquad\qquad +q_{jr}\mathbb{E}\Big[\Delta_{1,r}^p(s)\Big]\bigg)ds.\end{align}

Taking the expectation on both sides of the last inequality gives

(54) \begin{align}\mathbb{E}\left[\Delta_{n,j}^p(t)\right]&\leq \mathbb{E}\left[\Delta_{n,j}^p(0)\right]+K\bar{\gamma}|\mathcal{E}|\int_{[0,t]}\bigg(\mathbb{E}\Big[\Delta_{n,j}^p(s)\Big] +\alpha_{j}^c\mathbb{E}\Big[\Delta_{1,j}^c(s)\Big]+q_{j1}\mathbb{E}\Big[\Delta_{1,1}^p(s)\Big]+\ldots\nonumber\\& \qquad \qquad \qquad +q_{jr}\mathbb{E}\Big[\Delta_{1,r}^p(s)\Big]\bigg)ds.\end{align}

Recall that, for all $1\leq j \leq r$ , $\alpha_j^c+q_{j1}+\cdots+q_{jr}=1$ . Therefore

(55) \begin{align}\mathbb{E}\left[\Delta_{n,j}^p(t)\right]&\leq \max_{1\leq j \leq r}\max_{m\in C_j^p}\mathbb{E}\Big[\Delta_{m,j}^p(0)\Big]\nonumber\\& \qquad +K\bar{\gamma}|\mathcal{E}|\int_{[0,t]}\bigg( \max_{1\leq j \leq r}\max_{m\in C_j^c}\mathbb{E}\Big[\Delta_{m,j}^c(s)\Big]+(r+1)\max_{1\leq j \leq r}\max_{m\in C_j^p} \mathbb{E}\Big[\Delta_{m,j}^p(s)\Big]\bigg)ds.\end{align}

Taking the maximum over $n\in C_j^p$ and $1\leq j\leq r$ gives

(56) \begin{equation}\begin{split}\max_{1\leq j\leq r}\max_{n\in C_j^p}\mathbb{E}\big[\Delta_{n,j}^p(t)\big]&\leq \max_{1\leq j \leq r}\max_{m\in C_j^p}\mathbb{E}\big[\Delta_{m,j}^p(0)\big]\\& + K\bar{\gamma}|\mathcal{E}|\!\int_{[0,t]}\!\!\bigg( \!\max_{1\leq j \leq r}\max_{m\in C_j^c}\mathbb{E}\big[\Delta_{m,j}^c(s)\big] +(r+1)\max_{1\leq j \leq r}\max_{m\in C_j^p} \mathbb{E}\big[\Delta_{m,j}^p(s)\big]\bigg)ds.\end{split}\end{equation}

Now (51) and (56) together lead to

(57) \begin{equation}\begin{split}\max_{1\leq j\leq r}&\max_{n\in C_j^c}\mathbb{E}\Big[\Delta_{n,j}^c(t)\Big]+\max_{1\leq j\leq r}\max_{n\in C_j^p}\mathbb{E}\left[\Delta_{n,j}^p(t)\right]\\&\leq \max_{1\leq j\leq r}\max_{m\in C_j^c} \mathbb{E}\Big[\Delta_{m,j}^c(0)\Big]+\max_{1\leq j\leq r}\max_{m\in C_j^p} \mathbb{E}\Big[\Delta_{m,j}^p(0)\Big] \\&\quad+4K\bar{\gamma}|\mathcal{E}|(r+2)\int_{[0,t]}\bigg(\max_{1\leq j\leq r}\max_{m\in C_j^c} \mathbb{E}\Big[\Delta_{m,j}^c(s)\Big]+\max_{1\leq j\leq r}\max_{m\in C_j^p} \mathbb{E}\Big[\Delta_{m,j}^p(s)\Big]\bigg)ds.\end{split}\end{equation}

Then Grönwall’s lemma gives

(58) \begin{equation}\begin{split}\max_{1\leq j\leq r}&\max_{n\in C_j^c}\mathbb{E}\Big[\Delta_{n,j}^c(t)\Big]+\max_{1\leq j\leq r}\max_{n\in C_j^p}\mathbb{E}\left[\Delta_{n,j}^p(t)\right]\\&\leq \bigg( \max_{1\leq j\leq r}\max_{m\in C_j^c} \mathbb{E}\Big[\Delta_{m,j}^c(0)\Big]+\max_{1\leq j\leq r}\max_{m\in C_j^p} \mathbb{E}\Big[\Delta_{m,j}^p(0)\Big]\bigg)\exp\big\{t4K\bar{\gamma}|\mathcal{E}|(r+2) \big\}.\end{split}\end{equation}

Since the exponent in (58) is at most $A_t\,:\!=\,8K\bar{\gamma}|\mathcal{E}|(2+r)t$ , this leads to (25). The theorem is proved.
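As a numerical aside on the Grönwall step, consider the extremal case of the integral inequality $u(t)\leq a+b\int_0^t u(s)ds$ , namely the equation $u'=bu$ with $u(0)=a$ . The sketch below integrates it with an explicit Euler scheme and compares the result with the bound $ae^{bt}$ . The function names and the values of a, b, and t are assumptions chosen only for illustration.

```python
import math


def gronwall_bound(a, b, t):
    """Right-hand side of Gronwall's lemma: a * exp(b * t)."""
    return a * math.exp(b * t)


def euler_extremal(a, b, t, steps=10_000):
    """Explicit Euler scheme for u' = b * u with u(0) = a on [0, t]."""
    u, h = a, t / steps
    for _ in range(steps):
        u += h * b * u  # each step multiplies u by (1 + h * b) <= exp(h * b)
    return u


a, b, t = 0.5, 2.0, 1.0
u_end = euler_extremal(a, b, t)
bound = gronwall_bound(a, b, t)
```

The Euler value stays below the Grönwall bound and approaches it as the step size shrinks, consistent with the bound being attained by the extremal equation.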

5. Laws of large numbers and propagation of chaos

This section investigates the weak convergence of the finite particle system represented by the stochastic differential equation in (12) towards the limiting McKean–Vlasov system (14) as the total number of particles N tends to infinity. In particular, as the main results of this section, we establish propagation of chaos in Theorem 5.1 and laws of large numbers in Corollary 5.1.

Let us start this section by recalling the notions of multi-exchangeability and multi-chaoticity introduced in [Reference Graham40].

Definition 5.1. A sequence of random variables $(X_{n,k}, 1 \leq n \leq N_k, 1 \leq k \leq K)$ indexed by $(N_k, 1\leq k\leq K)\in \mathbb{N}^K$ is said to be multi-exchangeable if its law is invariant under permutation of the indexes within the classes; that is, for any permutations $\sigma_k$ of $\{1,\ldots, N_k\}$ for $1 \leq k \leq K$ , the following equality holds in distribution:

\begin{align*}(X_{\sigma_k(n),k} , 1 \leq n \leq N_k, 1 \leq k \leq K) \overset{dist}{=} (X_{n,k}, 1 \leq n \leq N_k, 1 \leq k \leq K).\end{align*}

A sequence of random variables $(X_{n,k}, 1\leq n\leq N_k, 1\leq k\leq K)$ indexed by $(N_k,1\leq k\leq K)\in \mathbb{N}^K$ is $P_1\otimes\cdots\otimes P_K$ -multi-chaotic if, for any $m\geq 1$ , the convergence in distribution

\begin{align*}\lim_{N\rightarrow\infty}(X_{n,k}, 1\leq n \leq m, 1 \leq k \leq K) \overset{dist}{=} P_1^{\otimes m}\otimes\cdots\otimes P_K^{\otimes m}\end{align*}

holds for the topology of uniform convergence on compact sets, where $P_k$ , for $1\leq k \leq K$ , is a probability distribution on $\mathbb{R}_+$ , and with the convention that N goes to infinity when $\min N_k$ goes to infinity.

5.1. Propagation of chaos

The following result establishes the weak convergence of the pre-limiting system (12) towards the McKean–Vlasov system (14) as the number of particles N goes to infinity, and thus its multi-chaoticity.

Theorem 5.1. Suppose that Condition 4.1 holds. Moreover, suppose that the initial conditions $\Big(X_{n,j}^{c}(0),X_{m,j}^{p}(0),n\in C_j^c,m\in C_j^p;\,1\leq j\leq r\Big)$ are multi-exchangeable and $\nu^{1,c}\otimes\nu^{1,p}\otimes\cdots\otimes\nu^{r,c}\otimes\nu^{r,p}$ -multi-chaotic. Then, for any $t\in [0,T]$ , as $N\rightarrow\infty$ ,

(59) \begin{align}\max_{1\leq j \leq r}\max_{n\in C_j^c}\mathbb{E}\Big[\big\|X^{c}_{n,j}-\bar{X}^{c}_{n,j}\big\|_t \Big]+\max_{1\leq j \leq r}\max_{n\in C_j^p}\mathbb{E}\Big[\big\|X^{p}_{n,j}-\bar{X}^{p}_{n,j}\big\|_t \Big]\rightarrow 0,\end{align}

and the sequence of processes $\Big(\Big(X_n^{c,N}(t),X_m^{p,N}(t),t\geq 0\Big),n\in C_j^c,m\in C_j^p;\,1\leq j\leq r\Big)$ , the solutions of the stochastic differential equation (12) with initial conditions $\Big(X_n^{c,N}(0),X_m^{p,N}(0),n\in C_j^c,m\in C_j^p;\,1\leq j\leq r\Big)$ , is $P_{\bar{X}}$ -multi-chaotic, where $P_{\bar{X}}=\mu_{1}^c\otimes\mu_{1}^p\otimes\cdots\otimes\mu^c_{r}\otimes \mu_{r}^p$ is the distribution of the process $\Big(\Big(\bar{X}_n^{c}(t),\bar{X}_m^{p}(t),t\geq 0\Big),n\in C_j^c,m\in C_j^p;\,1\leq j\leq r\Big)$ , the solution of the limiting stochastic differential equation (14) with initial distribution $\nu^{1,c}\otimes\nu^{1,p}\otimes\cdots\otimes\nu^{r,c}\otimes\nu^{r,p}$ .

Before proceeding to the proof, we recall, without proof, an elementary result on (conditionally) independent and identically distributed (i.i.d.) random variables.

Lemma 5.1. Let $\{S_i\,:\,i=1,\ldots,n\}$ be a collection of $\mathbb{S}$ -valued random variables defined on some probability space $(\Omega,\mathcal{F},\mathbb{P})$ , where $\mathbb{S}$ is a Polish space. Suppose that $S_1,\ldots,S_n$ are conditionally i.i.d. given some $\sigma$ -algebra $\mathcal{G}\subset\mathcal{F}$ . Then, for any $k\in\mathbb{N}$ , there exists a constant $0<a_k<\infty$ such that

(60) \begin{align} \sup_{\|f\|_{\infty}\leq 1}\mathbb{E}\bigg|\frac{1}{n}\sum_{i=1}^n\big(f(S_i)-\mathbb{E}[f(S_i)|\mathcal{G}]\big)\bigg|^k\leq \frac{a_k}{n^{k/2}}. \end{align}
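The rate in (60) can be observed by simulation in the simplest special case: $k=2$ , a trivial $\sigma$ -algebra $\mathcal{G}$ , and $f(S_i)$ equal to independent fair signs $\pm 1$ , for which the left-hand side equals exactly $1/n$ . The sketch below is a Monte Carlo illustration under these assumptions. The helper name, the sample sizes, and the seed are hypothetical choices.

```python
import random


def second_moment_of_centered_mean(n, trials, rng):
    """Monte Carlo estimate of E|(1/n) * sum_i (f(S_i) - E f(S_i))|^2 for fair +/-1 signs."""
    acc = 0.0
    for _ in range(trials):
        # f(S_i) takes the values +1 and -1 with probability 1/2, so E f(S_i) = 0
        mean = sum(rng.choice((-1.0, 1.0)) for _ in range(n)) / n
        acc += mean * mean
    return acc / trials


rng = random.Random(0)
n = 100
estimate = second_moment_of_centered_mean(n, trials=2000, rng=rng)
scaled = n * estimate  # should be close to 1, i.e. a_2 = 1 suffices in this special case
```

The scaled estimate stays near 1 as n varies, in line with the $n^{-k/2}$ decay for $k=2$ .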

Proof of Theorem 5.1. We use a coupling method. Let $X(t)=\Big(\Big(X_{n,j}^{c}(t),X_{m,j}^{p}(t)\Big),n\in C_j^c,m\in C_j^p;\,1\leq j\leq r\Big)$ be the solution of the stochastic differential equation (12) with initial conditions $X(0)=\Big(\Big(X_{n,j}^{c}(0),X_{m,j}^{p}(0)\Big),n\in C_j^c,m\in C_j^p;\,1\leq j\leq r\Big)$ . Moreover, let $Y(t)=\Big(\Big(Y_{n,j}^{c}(t),Y_{m,j}^{p}(t)\Big),n\in C_j^c,m\in C_j^p;\,1\leq j\leq r\Big)$ be the solution of the limiting stochastic differential equation (14) with the same initial conditions as X(t), i.e., $Y(0)=X(0)$ . Also, let the processes Y(t) and X(t) be defined on the same probability space by taking the same sequences of Poisson random measures $\big\{\mathcal{N}^{c}_{n,j}\big\}$ and $\big\{\mathcal{N}^{p}_{n,j}\big\}$ in both cases. We first prove that these two processes are asymptotically close, that is, for any $t\in [0, T]$ ,

(61) \begin{align}\max_{1\leq j\leq r}\max_{n\in C_j^c}\mathbb{E}\Big[\Big\|X^{c}_{n,j}-Y^{c}_{n,j}\Big\|_t \Big]+\max_{1\leq j\leq r}\max_{n\in C_j^p}\mathbb{E}\Big[\Big\|X^{p}_{n,j}-Y^{p}_{n,j}\Big\|_t \Big]\rightarrow 0,\mbox{ as $N\rightarrow\infty$}.\end{align}

To this end, we treat the central and peripheral nodes in two separate steps. For convenience, define

\begin{align*}o(n)\,:\!=\,\frac{1}{\deg(n)+1}.\end{align*}
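The coupling used in this proof drives both processes with the same Poisson random measures, so that an atom $(s,y)$ triggers a jump exactly when its mark y falls below the current rate. The following sketch illustrates this on a toy two-state chain with constant rates. In the paper the rates depend on the empirical or limiting measures, so the constant rates, the state space $\{0,1\}$ , and all function names here are simplifying assumptions.

```python
import random


def poisson_atoms(horizon, y_max, rng):
    """Atoms (s, y) of a unit-intensity Poisson random measure on [0, horizon] x [0, y_max]."""
    atoms, s = [], 0.0
    while True:
        s += rng.expovariate(y_max)  # the time marginal is a Poisson process of rate y_max
        if s > horizon:
            return atoms
        atoms.append((s, rng.uniform(0.0, y_max)))


def coupled_sup_distance(x0, rates_x, rates_y, atoms):
    """Drive two chains on {0, 1} with the same atoms; return sup_s |X(s) - Y(s)|."""
    x = y = x0
    sup_gap = 0
    for _, mark in atoms:
        jump_x = mark <= rates_x[x]  # thinning: accept the atom iff its mark is below the rate
        jump_y = mark <= rates_y[y]
        if jump_x:
            x = 1 - x
        if jump_y:
            y = 1 - y
        sup_gap = max(sup_gap, abs(x - y))
    return sup_gap


def mean_sup_distance(rates_x, rates_y, horizon=1.0, trials=2000, seed=0):
    """Monte Carlo estimate of E[sup_s |X(s) - Y(s)|] under the shared-noise coupling."""
    rng = random.Random(seed)
    y_max = max(max(rates_x), max(rates_y))
    total = 0
    for _ in range(trials):
        atoms = poisson_atoms(horizon, y_max, rng)
        total += coupled_sup_distance(0, rates_x, rates_y, atoms)
    return total / trials


same = mean_sup_distance((1.0, 2.0), (1.0, 2.0))
close = mean_sup_distance((1.0, 2.0), (1.1, 2.1))
```

With identical rates the two chains never separate. With rates that differ by 0.1 they separate only when an atom's mark falls between the two rates, so the expected supremum distance is small. This is the mechanism behind the bounds derived in Steps 1 and 2.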

Step 1. Fix $1\leq j\leq r$ . For each central node $n\in C_j^c$ and any $t\in [0,T]$ ,

(62) \begin{align}\mathbb{E}\left[\big\|X_{n,j}^{c}-Y_{n,j}^{c}\big\|_t\right]&=\mathbb{E}\left[\sup_{0\leq s\leq t}\Big|X_{n,j}^{c}(s)-Y_{n,j}^{c}(s)\Big|\right]\nonumber\\ &\leq\mathbb{E}\Bigg[\Bigg|\int_{[0,t]\times\mathbb{R}_+}\sum_{(z,z')\in\mathcal{E}}{\unicode{x1D7D9}}_{X^{c}_{n,j}(s{-})=z}(z'-z){\unicode{x1D7D9}}_{\left[0,\lambda_{j,z,z'}^{c}\Big(\upsilon_{n,j}^{c,N}(s)\Big)\right]}(y)\mathcal{N}_{n,j}^c(ds,dy)\nonumber \\ &\qquad -\int_{[0,t]\times\mathbb{R}_+}\sum_{(z,z')\in\mathcal{E}}{\unicode{x1D7D9}}_{Y^{c}_{n,j}(s{-})=z}(z'-z){\unicode{x1D7D9}}_{\left[0,\lambda_{j,z,z'}^{c}\big(\upsilon_{j}^c(s)\big)\right]}(y)\mathcal{N}_{n,j}^c(ds,dy) \Bigg| \Bigg]\nonumber\\ &\leq\mathbb{E}\bigg[\int_{[0,t]\times\mathbb{R}_+}\bigg|\sum_{(z,z')\in\mathcal{E}}(z'-z)\bigg\{{\unicode{x1D7D9}}_{X^{c}_{n,j}(s{-})=z}{\unicode{x1D7D9}}_{\left[0,\lambda_{j,z,z'}^{c}\Big(\upsilon_{n,j}^{c,N}(s)\Big)\right]}(y)\nonumber \\ &\qquad\qquad -{\unicode{x1D7D9}}_{Y^{c}_{n,j}(s{-})=z}{\unicode{x1D7D9}}_{\left[0,\lambda_{j,z,z'}^{c}\big(\upsilon_{j}^c(s)\big)\right]}(y)\bigg\}\mathcal{N}_{n,j}^c(ds,dy) \bigg| \bigg].\end{align}

Denote by $\mathcal{F}_t$ the filtration generated by the Poisson random measures, namely

\begin{align*}\mathcal{F}_t\,:\!=\,\sigma\Big\langle \mathcal{N}_{n,j}^c(A\times B),\ \mathcal{N}_{m,j}^p(A\times B)\,:\,n\in C_j^c,\ m\in C_j^p,\ 1\leq j\leq r,\ A\in\mathcal{B}([0,t]),\ B\in\mathcal{B}(\mathbb{R}_+) \Big\rangle.\end{align*}

Then $X^{c}_{n,j}(t)$ and $Y^{c}_{n,j}(t)$ are adapted to the filtration $\mathcal{F}_t$ . Therefore, the two processes

(63) \begin{equation}\begin{split}&\int_{[0,t]\times\mathbb{R}_+}\sum_{(z,z')\in\mathcal{E}}{\unicode{x1D7D9}}_{X^{c}_{n,j}(s{-})=z}(z'-z){\unicode{x1D7D9}}_{\left[0,\lambda_{j,z,z'}^{c}\Big(\upsilon_{n,j}^{c,N}(s)\Big)\right]}(y)\Big[\mathcal{N}_{n,j}^c(ds,dy)-dsdy\Big],\\&\int_{[0,t]\times\mathbb{R}_+}\sum_{(z,z')\in\mathcal{E}}{\unicode{x1D7D9}}_{Y^{c}_{n,j}(s{-})=z}(z'-z){\unicode{x1D7D9}}_{\left[0,\lambda_{j,z,z'}^{c}\big(\upsilon_{j}^c(s)\big)\right]}(y)\Big[\mathcal{N}_{n,j}^c(ds,dy)-dsdy\Big]\end{split}\end{equation}

are $\mathcal{F}_t$ -martingales. Furthermore, (62) reduces to

(64) \begin{align}\mathbb{E}\left[\big\|X_{n,j}^{c}-Y_{n,j}^{c} \big\|_t\right]&\leq \mathbb{E}\Bigg[\int_{[0,t]}\bigg|\sum_{(z,z')\in\mathcal{E}}(z'-z)\bigg\{{\unicode{x1D7D9}}_{X^{c}_{n,j}(s)=z}\lambda_{j,z,z'}^{c}\Big(\upsilon_{n,j}^{c,N}(s)\Big)\nonumber\\& \qquad -{\unicode{x1D7D9}}_{Y^{c}_{n,j}(s)=z}\lambda_{j,z,z'}^{c}\big(\upsilon_{j}^c(s)\big)\bigg\} \bigg|ds \Bigg].\end{align}

Recall that $K=|\mathcal{Z}|$ is the cardinality of the set $\mathcal{Z}$ . By adding and subtracting terms we obtain

(65) \begin{align}\mathbb{E}&\left[\big\|X_{n,j}^{c}-Y_{n,j}^{c} \big\|_t\right]\nonumber\\ &\leq K\mathbb{E}\Bigg[\int_{[0,t]}\bigg|\sum_{(z,z')\in\mathcal{E}}\bigg\{{\unicode{x1D7D9}}_{X^{c}_{n,j}(s)=z}\lambda_{j,z,z'}^{c}\Big(\upsilon_{n,j}^{c,N}(s)\Big)-{\unicode{x1D7D9}}_{Y^{c}_{n,j}(s)=z}\lambda_{j,z,z'}^{c}\Big(\upsilon_{n,j}^{c,N}(s)\Big)\nonumber\\ &\qquad\qquad\qquad\qquad\qquad+{\unicode{x1D7D9}}_{Y^{c}_{n,j}(s)=z}\lambda_{j,z,z'}^{c}\Big(\upsilon_{n,j}^{c,N}(s)\Big)-{\unicode{x1D7D9}}_{Y^{c}_{n,j}(s)=z}\lambda_{j,z,z'}^{c}\big(\upsilon_{j}^c(s)\big)\bigg\} \bigg|ds \Bigg]\nonumber \\ &\leq K\int_{[0,t]}\sum_{(z,z')\in\mathcal{E}}\mathbb{E}\bigg[\bigg|\Big({\unicode{x1D7D9}}_{X^{c}_{n,j}(s)=z}-{\unicode{x1D7D9}}_{Y^{c}_{n,j}(s)=z}\Big)\lambda_{j,z,z'}^{c}\Big(\upsilon_{n,j}^{c,N}(s)\Big)\nonumber\\& +{\unicode{x1D7D9}}_{Y^{c}_{n,j}(s)=z}\bigg(\lambda_{j,z,z'}^{c}\Big(\upsilon_{n,j}^{c,N}(s)\Big)-\lambda_{j,z,z'}^{c}\big(\upsilon_{j}^c(s)\big)\bigg) \bigg|\bigg]ds.\end{align}

The goal now is to bound the right-hand side of (65). Let us start with the second term. Again, by adding and subtracting terms we get

(66) \begin{align}\mathbb{E}\bigg[\bigg|{\unicode{x1D7D9}}_{Y^{c}_{n,j}(s)=z}\bigg(\lambda_{j,z,z'}^{c}&\Big(\upsilon_{n,j}^{c,N}(s)\Big)-\lambda_{j,z,z'}^{c}\big(\upsilon_{j}^c(s)\big)\bigg) \bigg|\bigg]\nonumber\\&\leq \mathbb{E}\bigg[\bigg|\lambda_{j,z,z'}^{c}\Big(\upsilon_{n,j}^{c,N}(s)\Big)-\lambda_{j,z,z'}^{c}\big(\upsilon_{j}^c(s)\big) \bigg|\bigg]\nonumber\\&= \mathbb{E}\Bigg[\Bigg| \Bigg(o(n)\sum_{m\in\{n\}\cup\mathfrak{N}_j^c(n)}\gamma^{j,c}_{z,z'}\big(X_{m,j}^{c}(s)\big)+o(n)\sum_{m\in\mathfrak{N}_j^p(n)}\gamma^{j,p}_{z,z'}\Big(X_{m,j}^{p}(s)\Big)\Bigg)\nonumber\\&\qquad-\Bigg(o(n)\sum_{m\in\{n\}\cup\mathfrak{N}_j^c(n)}\gamma^{j,c}_{z,z'}\big(Y_{m,j}^{c}(s)\big)+o(n)\sum_{m\in\mathfrak{N}_j^p(n)}\gamma^{j,p}_{z,z'}\Big(Y_{m,j}^{p}(s)\Big)\Bigg)\nonumber\\&\qquad+\Bigg(o(n)\sum_{m\in\{n\}\cup\mathfrak{N}_j^c(n)}\gamma^{j,c}_{z,z'}\big(Y_{m,j}^{c}(s)\big)+o(n)\sum_{m\in\mathfrak{N}_j^p(n)}\gamma^{j,p}_{z,z'}\Big(Y_{m,j}^{p}(s)\Big)\Bigg)\nonumber\\&\qquad-\bigg(p_j^c\int_{\mathcal{Z}}\gamma^{j,c}_{z,z'}(x)\mu_j^c(s)ds+p_j^p\int_{\mathcal{Z}}\gamma^{j,p}_{z,z'}(x)\mu_j^p(s)ds\bigg)\Bigg|\Bigg]\nonumber\\&\leq \mathbb{E}\Bigg[\Bigg| \Bigg(o(n)\sum_{m\in\{n\}\cup\mathfrak{N}_j^c(n)}\gamma^{j,c}_{z,z'}\big(X_{m,j}^{c}(s)\big)+o(n)\sum_{m\in\mathfrak{N}_j^p(n)}\gamma^{j,p}_{z,z'}\Big(X_{m,j}^{p}(s)\Big)\Bigg)\nonumber\\&\qquad-\Bigg(o(n)\sum_{m\in\{n\}\cup\mathfrak{N}_j^c(n)}\gamma^{j,c}_{z,z'}\big(Y_{m,j}^{c}(s)\big)+o(n)\sum_{m\in\mathfrak{N}_j^p(n)}\gamma^{j,p}_{z,z'}\Big(Y_{m,j}^{p}(s)\Big)\Bigg)\Bigg|\Bigg]\nonumber\\&+\mathbb{E}\Bigg[\Bigg|\Bigg(o(n)\sum_{m\in\{n\}\cup\mathfrak{N}_j^c(n)}\gamma^{j,c}_{z,z'}\big(Y_{m,j}^{c}(s)\big)+o(n)\sum_{m\in\mathfrak{N}_j^p(n)}\gamma^{j,p}_{z,z'}\Big(Y_{m,j}^{p}(s)\Big)\Bigg)\nonumber\\&\qquad-\bigg(p_j^c\int_{\mathcal{Z}}\gamma^{j,c}_{z,z'}(x)\mu_j^c(s)ds+p_j^p\int_{\mathcal{Z}}\gamma^{j,p}_{z,z'}(x)\mu_j^p(s)ds\bigg)\Bigg|\Bigg].\end{align}

Denote by $\mathcal{J}_1$ and $\mathcal{J}_2$ respectively the first and the second expectation in the right-hand side of (66). Then, from the Lipschitz property of the functions $\gamma^{j,c}_{z,z'}$ and $\gamma^{j,p}_{z,z'}$ , $\mathcal{J}_1$ is bounded as follows:

(67) \begin{align}\mathcal{J}_1&\leq o(n)\sum_{m\in\{n\}\cup\mathfrak{N}_j^c(n)}L_{\gamma}\mathbb{E}\Big[\Big|X_{m,j}^{c}(s)-Y_{m,j}^{c}(s)\Big|\Big]+o(n)\sum_{m\in\mathfrak{N}_j^p(n)}L_{\gamma}\mathbb{E}\Big[\Big|X_{m,j}^{p}(s)-Y_{m,j}^{p}(s)\Big|\Big]\nonumber\\ &\leq \varrho_{n,j}^c L_{\gamma}\max_{ m\in C_j^c}\mathbb{E}\big\|X_{m,j}^{c}-Y_{m,j}^{c}\big\|_s+\varrho_{n,j}^p L_{\gamma}\max_{m\in C_j^p}\mathbb{E}\big\|X_{m,j}^{p}-Y_{m,j}^{p}\big\|_s, \end{align}

where $L_{\gamma}$ is the maximum Lipschitz constant of the functions $\gamma^{j,c}_{z,z'}$ and $\gamma^{j,p}_{z,z'}$ for all $(z,z')\in\mathcal{Z}^2$ . Moreover, by adding and subtracting terms and using the fact that both $\big\{Y_{n,j}^{c}(s)\big\}$ and $\big\{Y_{n,j}^p(s)\big\}$ are sequences of i.i.d. random variables, $\mathcal{J}_2$ can be bounded as follows:

(68) \begin{equation}\begin{split}\mathcal{J}_2&\leq \mathbb{E}\Bigg[\Bigg|o(n)\sum_{m\in\{n\}\cup\mathfrak{N}_j^c(n)}\gamma^{j,c}_{z,z'}\big(Y_{m,j}^{c}(s)\big)-p_j^c\int_{\mathcal{Z}}\gamma^{j,c}_{z,z'}(x)\mu_j^c(s)ds\Bigg|\Bigg]\\&\qquad\qquad\qquad+\mathbb{E}\Bigg[\Bigg|o(n)\sum_{m\in\mathfrak{N}_j^p(n)}\gamma^{j,p}_{z,z'}\Big(Y_{m,j}^{p}(s)\Big)-p_j^p\int_{\mathcal{Z}}\gamma^{j,p}_{z,z'}(x)\mu_j^p(s)ds\Bigg|\Bigg]\\&\leq \mathbb{E}\Bigg[\Bigg|p_j^c\frac{1}{M_{j}^{c}(n)+1}\sum_{m\in\{n\}\cup\mathfrak{N}_j^c(n)}\bigg(\gamma^{j,c}_{z,z'}\big(Y_{m,j}^{c}(s)\big)-\mathbb{E}\Big[\gamma^{j,c}_{z,z'}\big(Y_{m,j}^{c}(s)\big)\Big]\bigg)\Bigg|\Bigg]\\&\qquad\qquad\qquad+\bigg|o(n)-\frac{p_j^c}{M_{j}^{c}(n)+1}\bigg|\mathbb{E}\Bigg[\Bigg|\sum_{m\in\{n\}\cup\mathfrak{N}_j^c(n)}\gamma^{j,c}_{z,z'}\big(Y_{m,j}^{c}(s)\big)\Bigg|\Bigg]\\&+\mathbb{E}\Bigg[\Bigg|p_j^p\frac{1}{M_{j}^{p}(n)}\sum_{m\in\mathfrak{N}_j^p(n)}\bigg(\gamma^{j,p}_{z,z'}\Big(Y_{m,j}^{p}(s)\Big)-\mathbb{E}\Big[\gamma^{j,p}_{z,z'}\big(Y_{m,j}^{p}(s)\big)\Big]\bigg)\Bigg|\Bigg]\\&\qquad\qquad\qquad+\Bigg|o(n)-\frac{p_j^p}{M_{j}^{p}(n)}\Bigg|\mathbb{E}\Bigg[\Bigg|\sum_{m\in\mathfrak{N}_j^p(n)}\gamma^{j,p}_{z,z'}\Big(Y_{m,j}^{p}(s)\Big)\Bigg|\Bigg].\end{split}\end{equation}

Note that, by the exchangeability of $\big\{Y_{n,j}^{c}(s), n\in C_j^c\big\}$ and the boundedness of the functions $\gamma^{j,c}_{z,z'}$ , one obtains

(69) \begin{equation}\Bigg|o(n)-\frac{p_j^c}{M_{j}^{c}(n)+1}\Bigg|\mathbb{E}\Bigg[\sum_{m\in\{n\}\cup\mathfrak{N}_j^c(n)}\gamma^{j,c}_{z,z'}\big(Y_{m,j}^{c}(s)\big)\Bigg]=\big|\varrho_{n,j}^c-p_j^c\big|\mathbb{E}\bigg[\gamma^{j,c}_{z,z'}\big(Y_{1,j}^{c}(s)\big)\bigg]\leq\bar{\gamma} \Big|\varrho_{n,j}^c-p_j^c\Big|.\end{equation}

In the same manner, the fourth term on the right-hand side of (68) is bounded as follows:

(70) \begin{equation}\Bigg|o(n)-\frac{p_j^p}{M_{j}^{p}(n)}\Bigg|\mathbb{E}\Bigg[\sum_{m\in\mathfrak{N}_j^p(n)}\gamma^{j,p}_{z,z'}\Big(Y_{m,j}^{p}(s)\Big)\Bigg]\leq\bar{\gamma} \Big|\varrho_{n,j}^p-p_j^p\Big|.\end{equation}

Furthermore, using (60), the first and third expectations in (68) are bounded by $\frac{\kappa_1 p_j^c}{\sqrt{M_{j}^{c}(n)+1}}$ and $\frac{\kappa_2 p_j^p}{\sqrt{M_{j}^{p}(n)}}$ , respectively, where $\kappa_1$ and $\kappa_2$ are positive constants.

Now let us take a look at the first term of the right-hand side of (65). Since $X^c_{n,j}$ and $Y^c_{n,j}$ are $\mathcal{Z}$ -valued, and $\mathcal{Z}$ is a subset of $\mathbb{N}$ , one easily sees that

(71) \begin{equation} \begin{split} \mathbb{E}\bigg[\lambda_{j,z,z'}^{c}\Big(\upsilon_{n,j}^{c,N}(s)\Big)\bigg|\Big({\unicode{x1D7D9}}_{X^{c}_{n,j}(s)=z}-{\unicode{x1D7D9}}_{Y^{c}_{n,j}(s)=z}\Big)\bigg|\bigg]&\leq \mathbb{E}\bigg[\lambda_{j,z,z'}^{c}\Big(\upsilon_{n,j}^{c,N}(s)\Big)\Big|X^{c}_{n,j}(s)-Y^{c}_{n,j}(s)\Big|\bigg]\\ &\leq\bar{\gamma}\mathbb{E}\Big|X^{c}_{n,j}(s)-Y^{c}_{n,j}(s)\Big|\\ &\leq\bar{\gamma}\mathbb{E}\Big\|X^{c}_{n,j}-Y^{c}_{n,j}\Big\|_s, \end{split} \end{equation}

where $\bar{\gamma}$ is the upper bound of the functions $\gamma^{j,c}_{z,z'}$ and $\gamma^{j,p}_{z,z'}$ for all $(z,z')\in\mathcal{Z}^2$ . Finally, by combining (65), (66), (67), (68), (69), (70), and (71), one obtains

(72) \begin{equation}\begin{split}\mathbb{E}\left[\big\|X_{n,j}^{c}-Y_{n,j}^{c} \big\|_t\right]\leq K |\mathcal{E}|\int_{0}^t \Bigg[&\bar{\gamma}\mathbb{E}\big\|X^{c}_{n,j}-Y^{c}_{n,j}\big\|_s+\varrho_{n,j}^c L_{\gamma}\max_{\substack{m\in C_j^c}}\mathbb{E}\big\|X_{m,j}^{c}-Y_{m,j}^{c}\big\|_s\\&+\varrho_{n,j}^p L_{\gamma}\max_{\substack{m\in C_j^p}}\mathbb{E}\Big\|X_{m,j}^{p}-Y_{m,j}^{p}\Big\|_s+\bar{\gamma} \Big|\varrho_{n,j}^c-p_j^c\Big|\\&+\bar{\gamma} \Big|\varrho_{n,j}^p-p_j^p\Big|+\frac{\kappa_1 p_j^c}{\sqrt{M_{j}^{c}(n)+1}}+\frac{\kappa_2 p_j^p}{\sqrt{M_{j}^{p}(n)}}\Bigg]ds,\end{split}\end{equation}

where $|\mathcal{E}|$ stands for the cardinality of the set of edges $\mathcal{E}$ of the graph $(\mathcal{Z},\mathcal{E})$ . Recall that $\deg(n)=M_{j}^{c}(n)+M_{j}^{p}(n)$ for any $n\in C_j^c$ . Using this and then taking the maximum over $n\in C_j^c$ in (72) and over $1\leq j \leq r$ , one finally obtains

(73) \begin{equation}\begin{split}\max_{1\leq j\leq r}&\max_{n\in C_j^c}\mathbb{E}\left[\big\|X_{n,j}^{c}-Y_{n,j}^{c}\big\|_t\right]\\&\!\!\!\!\leq K |\mathcal{E}|\int_{0}^t \Bigg[(\bar{\gamma}+L_{\gamma})\max_{1\leq j\leq r}\max_{\substack{n\in C_j^c}}\mathbb{E}\big\|X^{c}_{n,j}-Y^{c}_{n,j}\big\|_s+L_{\gamma} \max_{1\leq j\leq r}\max_{\substack{n\in C_j^p}}\mathbb{E}\Big\|X_{n,j}^{p}-Y_{n,j}^{p}\Big\|_s\quad\\&\!\!\!\!+\sum_{1\leq j\leq r}\Bigg( \bar{\gamma}\max_{n\in C_j^c}\Big|\varrho_{n,j}^c-p_j^c\Big|+ \bar{\gamma}\max_{n\in C_j^c}\Big|\varrho_{n,j}^p-p_j^p\Big|+\max_{n\in C_j^c}\frac{\kappa_1 p_j^c}{\sqrt{M_{j}^{c}(n)+1}}+\max_{n\in C_j^c}\frac{\kappa_2 p_j^p}{\sqrt{M_{j}^{p}(n)}}\Bigg)\Bigg]ds.\quad\end{split}\end{equation}

Step 2. Fix a block $1\leq j\leq r$ and a peripheral node $n\in C_j^p$ . For any $t\in [0,T]$ , one has

(74) \begin{equation}\begin{split}\mathbb{E}\Big[\Big\|X_{n,j}^{p}-Y_{n,j}^p \Big\|_t\Big]&=\mathbb{E}\left[\sup_{0\leq s\leq t}\Big|X_{n,j}^{p}(s)-Y_{n,j}^p(s)\Big|\right]\\ &\leq\mathbb{E}\Bigg[\Bigg|\int_{[0,t]\times\mathbb{R}_+}\sum_{(z,z')\in\mathcal{E}}{\unicode{x1D7D9}}_{X^{p}_{n,j}(s)=z}(z'-z){\unicode{x1D7D9}}_{\left[0,\lambda^p_{j,z,z'}(\upsilon_{n,j}^{p,N}(s))\right]}(y)\mathcal{N}_{n,j}^p(ds,dy)\\ &\qquad\quad -\int_{[0,t]\times\mathbb{R}_+}\sum_{(z,z')\in\mathcal{E}}{\unicode{x1D7D9}}_{Y^{p}_{n,j}(s)=z}(z'-z){\unicode{x1D7D9}}_{\left[0,\lambda^p_{j,z,z'}\big(\upsilon_{j}^p(s)\big)\right]}(y)\mathcal{N}_{n,j}^p(ds,dy) \Bigg| \Bigg]\\ &\leq K\int_{[0,t]}\sum_{(z,z')\in\mathcal{E}}\mathbb{E}\bigg[\bigg|\left({\unicode{x1D7D9}}_{X^{p}_{n,j}(s)=z}-{\unicode{x1D7D9}}_{Y^{p}_{n,j}(s)=z}\right)\lambda^p_{j,z,z'}\left(\upsilon_{n,j}^{p,N}(s)\right) \\ &\qquad\qquad+{\unicode{x1D7D9}}_{Y^{p}_{n,j}(s)=z}\bigg(\lambda^p_{j,z,z'}\left(\upsilon_{n,j}^{p,N}(s)\right)-\lambda^p_{j,z,z'}\left(\upsilon_{j}^p(s)\right)\bigg) \bigg|\bigg]ds,\end{split}\end{equation}

where the last inequality is obtained by following the same steps as in (63) and (65). Again, given that $X_{n,j}^p$ and $Y_{n,j}^p$ are $\mathcal{Z}$ -valued and that $\mathcal{Z}\subset\mathbb{N}$ , the first expectation in the right-hand side of (74) can be bounded as follows:

(75) \begin{equation}\begin{split}\mathbb{E}\bigg[\bigg|\left({\unicode{x1D7D9}}_{X^{p}_{n,j}(s)=z}-{\unicode{x1D7D9}}_{Y^{p}_{n,j}(s)=z}\right)\lambda^p_{j,z,z'}\left(\upsilon_{n,j}^{p,N}(s)\right)\bigg|\bigg]\leq \bar{\gamma}\mathbb{E}\bigg[\Big\|X^{p}_{n,j}-Y^{p}_{n,j}\Big\|_s\bigg].\end{split}\end{equation}

It remains to bound the second term in the right-hand side of (74). Using Condition 4.1 one gets

(76) \begin{equation}\begin{split}\mathbb{E}\bigg[\bigg|&{\unicode{x1D7D9}}_{Y^{p}_{n,j}(s)=z}\left(\lambda^p_{j,z,z'}\left(\upsilon_{n,j}^{p,N}(s)\right)-\lambda^p_{j,z,z'}\big(\upsilon_{j}^p(s)\big)\right) \bigg|\bigg]\\&\leq \mathbb{E}\bigg[\bigg|\lambda^p_{j,z,z'}\left(\upsilon_{n,j}^{p,N}(s)\right)-\lambda^p_{j,z,z'}\left(\upsilon_{j}^p(s)\right) \bigg|\bigg]\\&\!\!\!\!\!=\mathbb{E}\Bigg[\Bigg|o(n)\Bigg(\sum_{m\in \mathfrak{N}_j^c(n)}\!\!\gamma^{j,c}_{z,z'}\big(X_{m,j}^{c}(s)\big)+\!\sum_{m\in \mathfrak{N}_1^p(n)}\!\!\!\gamma^{j,p}_{z,z'}\big(X_{m,1}^{p}(s)\big)+\cdots+\!\!\!\sum_{m\in\{n\}\cup \mathfrak{N}_j^p(n)}\!\!\gamma^{j,p}_{z,z'}\Big(X_{m,j}^{p}(s)\Big)\!+\cdots\\&\!\!\!\qquad\qquad\qquad\qquad\qquad\qquad\qquad\qquad\cdots+\sum_{m\in \mathfrak{N}_r^p(n)} \gamma^{j,p}_{z,z'}\big(X_{m,r}^{p}(s)\big)\Bigg)\\&\qquad-\bigg(\alpha_j^c\int_{\mathcal{Z}}\gamma^{j,c}_{z,z'}(x)\mu_j^c(s)ds+q_{j1}\int_{\mathcal{Z}}\gamma^{j,p}_{z,z'}(x)\mu_1^p(s)ds+\cdots+q_{jr}\int_{\mathcal{Z}}\gamma^{j,p}_{z,z'}(x)\mu_r^p(s)ds\bigg)\bigg|\Bigg].\end{split}\end{equation}

By rearranging the terms and using the triangle inequality one obtains

(77) \begin{equation}\begin{split} \mathbb{E}\bigg[\bigg|\bigg(\lambda^p_{j,z,z'}\left(\upsilon_{n,j}^{p,N}(s)\right)-\lambda^p_{j,z,z'}\left(\upsilon_{j}^p(s)\right)\bigg) \bigg|\bigg] &\leq\mathbb{E}\bigg[\bigg| o(n)\sum_{m\in \mathfrak{N}_j^c(n)}\gamma^{j,c}_{z,z'}\big(X_{m,j}^{c}(s)\big)\\& \qquad\qquad -\alpha_j^c\int_{\mathcal{Z}}\gamma^{j,c}_{z,z'}(x)\mu_j^c(s)ds \bigg|\bigg]\\ &+\mathbb{E}\bigg[\bigg|o(n)\sum_{m\in \mathfrak{N}_1^p(n)}\gamma^{j,p}_{z,z'}\big(X_{m,1}^{p}(s)\big)-q_{j1}\int_{\mathcal{Z}}\gamma^{j,p}_{z,z'}(x)\mu_1^p(s)ds\\ &\qquad\vdots\\ &\qquad+o(n)\!\!\sum_{m\in\{n\}\cup \mathfrak{N}_j^p(n)}\gamma^{j,p}_{z,z'}\big(X_{m,j}^{p}(s)\big)-q_{jj}\!\int_{\mathcal{Z}}\gamma^{j,p}_{z,z'}(x)\mu_j^p(s)ds\\ &\qquad\vdots\\ &\qquad+o(n)\sum_{m\in \mathfrak{N}_r^p(n)} \gamma^{j,p}_{z,z'}\big(X_{m,r}^{p}(s)\big)\\ & \qquad\qquad -q_{jr}\int_{\mathcal{Z}}\gamma^{j,p}_{z,z'}(x)\mu_r^p(s)ds \bigg|\bigg].\end{split}\end{equation}

Note that $\int_{\mathcal{Z}}\gamma^{j,p}_{z,z'}(x)\mu_i^p(s)ds= \mathbb{E}\Big[\gamma^{j,p}_{z,z'}(Y_{m,i}^{p}(s))\Big]$ for $m\in C_i^p$ and $\int_{\mathcal{Z}}\gamma^{j,c}_{z,z'}(x)\mu_j^c(s)ds=\mathbb{E}\Big[\gamma^{j,c}_{z,z'}\big(Y_{n,j}^{c}(s)\big)\Big]$ for $n\in C_j^c$ . Then, by using the exchangeability of $Y_{n,j}^{c}(t)$ for $n\in C_j^c$ , one finds

(78) \begin{equation}\begin{split} \mathbb{E}\bigg[\bigg|\bigg(\lambda^p_{j,z,z'}\!\left(\upsilon_{n,j}^{p,N}(s)\right)-\lambda^p_{j,z,z'}\!\left(\upsilon_{j}^p(s)\right)\bigg) \bigg|\bigg] &\leq\mathbb{E}\Bigg[\Bigg| o(n)\sum_{m\in \mathfrak{N}_j^c(n)}\gamma^{j,c}_{z,z'}\big(X_{m,j}^{c}(s)\big)-\alpha_j^c\mathbb{E}\Big[\gamma^{j,c}_{z,z'}\big(Y_{m,j}^{c}(s)\big)\Big] \Bigg|\Bigg] \\ &+\mathbb{E}\Bigg[\Bigg|o(n)\sum_{m\in \mathfrak{N}_1^p(n)}\gamma^{j,p}_{z,z'}\big(X_{m,1}^{p}(s)\big)-q_{j1}\mathbb{E}\Big[\gamma^{j,p}_{z,z'}\big(Y_{m,1}^{p}(s)\big)\Big] \\ &\quad\vdots\\ &\quad+o(n)\!\!\sum_{m\in\{n\}\cup \mathfrak{N}_j^p(n)}\!\gamma^{j,p}_{z,z'}\Big(X_{m,j}^{p}(s)\Big)-q_{jj}\mathbb{E}\Big[\gamma^{j,p}_{z,z'}\Big(Y_{m,j}^{p}(s)\Big)\Big] \\ &\quad\vdots\\ &\quad+o(n)\sum_{m\in \mathfrak{N}_r^p(n)}\gamma^{j,p}_{z,z'}\big(X_{m,r}^{p}(s)\big)-q_{jr}\mathbb{E}\Big[\gamma^{j,p}_{z,z'}\Big(Y_{m,r}^{p}(s)\Big)\Big]\Bigg|\Bigg].\end{split}\end{equation}

Observe that there are $r+1$ terms on the right-hand side of the last inequality. Let us start with the first expectation. By adding and subtracting terms one gets

(79) \begin{equation} \begin{split}&\mathbb{E}\Bigg[\bigg| o(n)\sum_{m\in \mathfrak{N}_j^c(n)}\gamma^{j,c}_{z,z'}\big(X_{m,j}^{c}(s)\big)-\alpha_j^c\mathbb{E}\Big[\gamma^{j,c}_{z,z'}\big(Y_{m,j}^{c}(s)\big)\Big] \bigg|\Bigg] \\ &\leq \mathbb{E}\Bigg[\bigg| o(n)\sum_{m\in \mathfrak{N}_j^c(n)}\bigg(\gamma^{j,c}_{z,z'}\big(X_{m,j}^{c}(s)\big)-\gamma^{j,c}_{z,z'}\big(Y_{m,j}^{c}(s)\big)\bigg)\bigg|\Bigg]\\&\qquad+\mathbb{E}\Bigg[\bigg|o(n)\sum_{m\in \mathfrak{N}_j^c(n)}\gamma^{j,c}_{z,z'}\big(Y_{m,j}^{c}(s)\big)-\alpha_j^c\mathbb{E}\Big[\gamma^{j,c}_{z,z'}\big(Y_{m,j}^{c}(s)\big)\Big] \bigg|\Bigg]. \end{split} \end{equation}

Note that, by the Lipschitz property of the functions $\gamma^{j,c}_{z,z'}$ , one finds that

(80) \begin{equation}\begin{split}\mathbb{E}\Bigg[\Bigg| o(n)\sum_{m\in \mathfrak{N}_j^c(n)}\bigg(\gamma^{j,c}_{z,z'}\big(X_{m,j}^{c}(s)\big)-\gamma^{j,c}_{z,z'}\big(Y_{m,j}^{c}(s)\big)\bigg)\Bigg|\Bigg]\leq \varsigma^c_{n,j}L_{\gamma}\max_{m\in C_j^c}\mathbb{E}\big\|X_{m,j}^{c}-Y_{m,j}^{c}\big\|_s .\end{split} \end{equation}
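
To spell out how (80) can be obtained, the following sketch relies on three readings that are not restated in this passage and should be treated as assumptions: that $o(n)\geq 0$, that $M_j^c(n)$ is the cardinality of $\mathfrak{N}_j^c(n)\subseteq C_j^c$, and that $\varsigma^c_{n,j}=o(n)M_j^c(n)$, which is how this quantity enters (81). Under these readings, the Lipschitz property and the bound $|x(s)-y(s)|\leq\|x-y\|_s$ give

\begin{align*}\mathbb{E}\Bigg[\Bigg| o(n)\sum_{m\in \mathfrak{N}_j^c(n)}\Big(\gamma^{j,c}_{z,z'}\big(X_{m,j}^{c}(s)\big)-\gamma^{j,c}_{z,z'}\big(Y_{m,j}^{c}(s)\big)\Big)\Bigg|\Bigg]&\leq o(n)\sum_{m\in \mathfrak{N}_j^c(n)}L_{\gamma}\,\mathbb{E}\big|X_{m,j}^{c}(s)-Y_{m,j}^{c}(s)\big|\\&\leq o(n)M_j^c(n)\,L_{\gamma}\max_{m\in C_j^c}\mathbb{E}\big\|X_{m,j}^{c}-Y_{m,j}^{c}\big\|_s.\end{align*}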

Moreover, using (60) together with the exchangeability of $\big\{Y_{n,j}^{c}(s),n\in C_j^c\big\}$ leads to

(81) \begin{align}&\mathbb{E}\Bigg[\bigg|o(n)\sum_{m\in \mathfrak{N}_j^c(n)}\gamma^{j,c}_{z,z'}\big(Y_{m,j}^{c}(s)\big)-\alpha_j^c\mathbb{E}\Big[\gamma^{j,c}_{z,z'}\big(Y_{m,j}^{c}(s)\big)\Big] \bigg|\Bigg]\nonumber\\&\leq \mathbb{E}\Bigg[\bigg|\frac{\alpha_j^c}{M_{j}^{c}(n)}\sum_{m\in \mathfrak{N}_j^c(n)}\bigg(\gamma^{j,c}_{z,z'}\big(Y_{m,j}^{c}(s)\big)-\mathbb{E}\Big[\gamma^{j,c}_{z,z'}\big(Y_{m,j}^{c}(s)\big)\Big]\bigg) \bigg|\Bigg]\nonumber\\& +\mathbb{E}\Bigg[\bigg|\left(o(n)-\frac{\alpha_j^c}{M_{j}^{c}(n)}\right)\sum_{m\in \mathfrak{N}_j^c(n)}\gamma^{j,c}_{z,z'}\big(Y_{m,j}^{c}(s)\big)\bigg|\Bigg]\nonumber\\&\leq \frac{\alpha_j^{c}\kappa_3}{\sqrt{M_{j}^{c}(n)}}+\big|\varsigma_{n,j}^c-\alpha_j^c\big|\bar{\gamma}.\end{align}

Now let us examine the remaining r terms on the right-hand side of (78). For simplicity of notation, denote by $\mathcal{I}$ the left-hand side of (78). Then, by adding and subtracting terms, one gets

(82) \begin{equation}\begin{split}&\mathcal{I}\leq \mathbb{E}\Bigg[\bigg|o(n)\sum_{m\in \mathfrak{N}_1^p(n)}\bigg(\gamma^{j,p}_{z,z'}\big(X_{m,1}^{p}(s)\big)-\gamma^{j,p}_{z,z'}\big(Y_{m,1}^{p}(s)\big)\bigg)+\cdots\\&\quad\qquad\cdots+o(n)\sum_{m\in\{n\}\cup \mathfrak{N}_j^p(n)}\bigg(\gamma^{j,p}_{z,z'}\Big(X_{m,j}^{p}(s)\Big)-\gamma^{j,p}_{z,z'}\Big(Y_{m,j}^{p}(s)\Big)\bigg)+\cdots\\&\quad\qquad\cdots+o(n)\sum_{m\in \mathfrak{N}_r^p(n)}\bigg(\gamma^{j,p}_{z,z'}\big(X_{m,r}^{p}(s)\big)-\gamma^{j,p}_{z,z'}\Big(Y_{m,r}^{p}(s)\Big)\bigg)\bigg|\Bigg]\\&\quad+\mathbb{E}\Bigg[\bigg|o(n)\sum_{m\in \mathfrak{N}_1^p(n)}\gamma^{j,p}_{z,z'}\big(Y_{m,1}^{p}(s)\big)-q_{j1} \mathbb{E}\Big[\gamma^{j,p}_{z,z'}\big(Y_{m,1}^{p}(s)\big)\Big]+\cdots\\&\quad\qquad\cdots+o(n)\sum_{m\in\{n\}\cup \mathfrak{N}_j^p(n)}\gamma^{j,p}_{z,z'}\Big(Y_{m,j}^{p}(s)\Big)-q_{jj} \mathbb{E}\Big[\gamma^{j,p}_{z,z'}\Big(Y_{m,j}^{p}(s)\Big)\Big]+\cdots\\&\quad\qquad\cdots+o(n)\sum_{m\in \mathfrak{N}_r^p(n)}\gamma^{j,p}_{z,z'}\Big(Y_{m,r}^{p}(s)\Big)-q_{jr} \mathbb{E}\Big[\gamma^{j,p}_{z,z'}\Big(Y_{m,r}^{p}(s)\Big)\Big] \bigg|\Bigg].\end{split}\end{equation}

Let $\mathcal{I}_1$ and $\mathcal{I}_2$ denote respectively the first and the second expectation in the right-hand side of the inequality (82). Then, by using the Lipschitz property of the functions $\gamma^{j,p}_{z,z'}$ , and recalling that, for $1\leq i\leq r$ , $M_i^{p}(n)$ represents the number of peripheral nodes of the ith block connecting to node n, the first expectation $\mathcal{I}_1$ is straightforwardly bounded as follows:

(83) \begin{equation}\begin{split}\mathcal{I}_1\leq L_{\gamma}\bigg(&\varsigma_{n,j,1}^p \max_{m\in C_1^p}\mathbb{E}\Big\|X_{m,1}^{p}-Y_{m,1}^{p}\Big\|_s+\cdots+ \varsigma_{n,j,j}^p\max_{m\in C_j^p}\mathbb{E}\big\|X_{m,j}^{p}-Y_{m,j}^{p}\big\|_s+\cdots\\&\qquad\qquad\qquad\qquad\cdots+\varsigma_{n,j,r}^p\max_{m\in C_r^p}\mathbb{E}\left\|X_{m,r}^{p}-Y_{m,r}^{p}\right\|_s\bigg).\end{split}\end{equation}

Moreover, by adding and subtracting terms in $\mathcal{I}_2$ one obtains

(84) \begin{equation}\begin{split}\mathcal{I}_2 \leq \mathcal{I}_3+\mathcal{I}_4,\end{split}\end{equation}

where

(85) \begin{align}\mathcal{I}_3&\,:\!=\,\mathbb{E}\Bigg[\bigg|\frac{q_{j1}}{M_1^{p}(n)}\sum_{m\in \mathfrak{N}_1^p(n)}\bigg(\gamma^{j,p}_{z,z'}\big(Y_{m,1}^{p}(s)\big)-\mathbb{E}\left[\gamma^{j,p}_{z,z'}\big(Y_{m,1}^{p}(s)\big)\right]\bigg)\nonumber\\&\qquad\vdots\nonumber \\&\qquad+\frac{q_{jj}}{M_{j}^{p}(n)+1}\sum_{m\in\{n\}\cup \mathfrak{N}_j^p(n)}\bigg(\gamma^{j,p}_{z,z'}\Big(Y_{m,j}^{p}(s)\Big)-\mathbb{E}\left[\gamma^{j,p}_{z,z'}\Big(Y_{m,j}^{p}(s)\Big)\right]\bigg)\\&\qquad\vdots\nonumber\\&\qquad+\frac{q_{jr}}{M_r^{p}(n)}\sum_{m\in \mathfrak{N}_r^p(n)}\bigg(\gamma^{j,p}_{z,z'}\Big(Y_{m,r}^{p}(s)\Big)-\mathbb{E}\left[\gamma^{j,p}_{z,z'}\Big(Y_{m,r}^{p}(s)\Big)\right]\bigg)\bigg|\Bigg],\nonumber\end{align}

and

(86) \begin{equation}\begin{split}\mathcal{I}_4\,:\!=\,\mathbb{E}\Bigg[\Bigg|\left(o(n)-\frac{q_{j1}}{M_1^{p}(n)}\right)&\!\sum_{m\in \mathfrak{N}_1^p(n)}\!\gamma^{j,p}_{z,z'}\big(Y_{m,1}^{p}(s)\big)+\cdots+\left(o(n)-\frac{q_{jj}}{M_{j}^{p}(n)+1}\right)\!\!\sum_{m\in\{n\}\cup \mathfrak{N}_j^p(n)}\!\gamma^{j,p}_{z,z'}\Big(Y_{m,j}^{p}(s)\Big)+\\&\qquad\qquad\qquad\qquad\quad\cdots+\left(o(n)-\frac{q_{jr}}{M_r^{p}(n)}\right)\sum_{m\in \mathfrak{N}_r^p(n)}\gamma^{j,p}_{z,z'}\Big(Y_{m,r}^{p}(s)\Big)\Bigg|\Bigg].\end{split}\end{equation}

First, from the triangle inequality one gets

(87) \begin{equation}\begin{split}\mathcal{I}_4 &\leq \mathbb{E}\Bigg[\Bigg|\left(o(n)-\frac{q_{j1}}{M_1^{p}(n)}\right)\sum_{m\in \mathfrak{N}_1^p(n)}\gamma^{j,p}_{z,z'}\big(Y_{m,1}^{p}(s)\big)\Bigg|\Bigg]+\cdots\\&\cdots+\mathbb{E}\Bigg[\Bigg|\left(o(n)-\frac{q_{jj}}{M_{j}^{p}(n)+1}\right)\sum_{m\in\{n\}\cup \mathfrak{N}_j^p(n)}\gamma^{j,p}_{z,z'}\Big(Y_{m,j}^{p}(s)\Big)\Bigg|\Bigg]+\cdots\\&\cdots+\mathbb{E}\Bigg[\Bigg|\left(o(n)-\frac{q_{jr}}{M_r^{p}(n)}\right)\sum_{m\in \mathfrak{N}_r^p(n)}\gamma^{j,p}_{z,z'}\Big(Y_{m,r}^{p}(s)\Big)\Bigg|\Bigg].\end{split}\end{equation}

Using (20), the exchangeability of the variables $\{Y^p_{n,j}(s), n\in C_j^p\}$ , and the boundedness of the functions $\gamma^{j,p}_{z,z'}$ , we easily show that the right-hand side of (87) vanishes as $N\rightarrow\infty$ . Indeed, the jth term satisfies

(88) \begin{equation}\begin{split}\bigg|o(n)-\frac{q_{jj}}{M_{j}^{p}(n)+1}\bigg|\mathbb{E}&\Bigg|\sum_{m\in\{n\}\cup \mathfrak{N}_j^p(n)}\gamma^{j,p}_{z,z'}\Big(Y_{m,j}^{p}(s)\Big)\Bigg|\\&=\Big|\varsigma_{n,j,j}^p-q_{jj}\Big|\frac{1}{M_{j}^{p}(n)+1}\mathbb{E}\Bigg|\sum_{m\in\{n\}\cup \mathfrak{N}_j^p(n)}\gamma^{j,p}_{z,z'}\Big(Y_{m,j}^{p}(s)\Big)\Bigg|\\ &\leq\Big|\varsigma_{n,j,j}^p-q_{jj}\Big|\bar{\gamma},\end{split}\end{equation}

and thus goes to zero by (20). Using the same steps, one obtains for all $1\leq i\leq r$ with $i\neq j$ that

(89) \begin{equation}\begin{split}\bigg|o(n)-\frac{q_{ji}}{M_i^{p}(n)}\bigg|\mathbb{E}\Bigg|\sum_{m\in \mathfrak{N}_i^p(n)}\gamma^{j,p}_{z,z'}\big(Y_{m,i}^{p}(s)\big)\Bigg|&\leq\bigg|\frac{M_i^{p}(n)}{deg(n)+1}-q_{ji}\bigg|\bar{\gamma},\end{split}\end{equation}

which also vanishes as $N\rightarrow\infty$ by (20); thus, so does $\mathcal{I}_4$ . In order to bound $\mathcal{I}_3$ , we use again the moment inequality (60), which straightforwardly gives us

(90) \begin{align}\mathcal{I}_3\leq \frac{\theta_1 q_{j1}}{\sqrt{M_1^{p}(n)}}+\cdots+\frac{\theta_j q_{jj}}{\sqrt{M_{j}^{p}(n)+1}}+\cdots+\frac{\theta_r q_{jr}}{\sqrt{M_r^{p}(n)}},\end{align}

where $\theta_1,\cdots,\theta_r$ are positive constants. Now, by (75), (80), (81), (83), (88), (89), and (90), one obtains

(91) \begin{equation}\begin{split}\mathbb{E}\Big[\Big\|X_{n,j}^{p}-Y_{n,j}^p \Big\|_t\Big]\leq &K |\mathcal{E}|\int_0^t\Bigg[\bar{\gamma}\mathbb{E}\Big\|X^{p}_{n,j}-Y^{p}_{n,j}\Big\|_s+\varsigma_{n,j}^cL_{\gamma}\max_{m\in C_j^c}\mathbb{E}\big\|X_{m,j}^{c}-Y_{m,j}^{c}\big\|_s +\frac{\alpha_j^c\kappa_3}{\sqrt{M_{j}^{c}(n)}}\\&+\Big|\varsigma_{n,j}^c-\alpha_j^c\Big|\bar{\gamma}+L_{\gamma}\Bigg(\varsigma_{n,j,1}^p\max_{m\in C_1^p}\mathbb{E}\Big\|X_{m,1}^{p}-Y_{m,1}^{p}\Big\|_s+\cdots\\&\qquad+\varsigma_{n,j,j}^p\max_{m\in C_j^p}\mathbb{E}\big\|X_{m,j}^{p}-Y_{m,j}^{p}\big\|_s+\cdots+\varsigma_{n,j,r}^p\max_{m\in C_r^p}\mathbb{E}\left\|X_{m,r}^{p}-Y_{m,r}^{p}\right\|_s\Bigg)\\&+\Big|\varsigma_{n,j,1}^p-q_{j1}\Big|\bar{\gamma}+\cdots+\Big|\varsigma_{n,j,j}^p-q_{jj}\Big|\bar{\gamma}+\cdots+\Big|\varsigma_{n,j,r}^p-q_{jr}\Big|\bar{\gamma}\\&+\frac{\theta_1 q_{j1}}{\sqrt{M_1^{p}(n)}}+\cdots+\frac{\theta_j q_{jj}}{\sqrt{M_{j}^{p}(n)+1}}+\cdots+\frac{\theta_r q_{jr}}{\sqrt{M_r^{p}(n)}}\Bigg]ds.\end{split}\end{equation}

Recall that $deg(n)=M_{j}^{c}(n)+\sum_{k=1}^r M_k^{p}(n)$ for any $n\in C_j^p$ . Using this and taking the maximum over $n\in C_j^p$ and over $1\leq j\leq r$ in (91), one gets

(92) \begin{align}\max_{1\leq j\leq r}&\max_{n\in C_j^p} \mathbb{E}\Big[\Big\|X_{n,j}^{p}-Y_{n,j}^p \Big\|_t\Big]\nonumber\\&\leq K |\mathcal{E}|\int_0^t\Bigg[L_{\gamma}\max_{1\leq j\leq r}\max_{n\in C_j^c}\mathbb{E}\left\|X_{n,j}^{c}-Y_{n,j}^{c}\right\|_s +(\bar{\gamma}+ L_{\gamma})\max_{1\leq j\leq r}\max_{n\in C_j^p}\mathbb{E}\left\|X_{n,j}^{p}-Y_{n,j}^{p}\right\|_s\nonumber\\ & \quad\quad+\sum_{1\leq j\leq r}\max_{n\in C_j^p}\Bigg(\frac{\alpha_j^c\kappa_3}{\sqrt{M_{j}^{c}(n)}}\Bigg)+\sum_{1\leq j\leq r}\max_{n\in C_j^p}\big|\varsigma_{n,j}^c-\alpha_j^c\big|\bar{\gamma}\nonumber\\&\quad\quad+\sum_{1\leq j\leq r}\Bigg(\max_{n\in C_j^p}\Big|\varsigma_{n,j,1}^p-q_{j1}\Big|\bar{\gamma}+\cdots+\max_{n\in C_j^p}\Big|\varsigma_{n,j,j}^p-q_{jj}\Big|\bar{\gamma}+\cdots+\max_{n\in C_j^p}\Big|\varsigma_{n,j,r}^p-q_{jr}\Big|\bar{\gamma}\nonumber\\&\quad\quad+\max_{n\in C_j^p}\frac{\theta_1 q_{j1}}{\sqrt{M_1^{p}(n)}}+\cdots+\max_{n\in C_j^p}\frac{\theta_j q_{jj}}{\sqrt{M_{j}^{p}(n)+1}}+\cdots+\max_{n\in C_j^p}\frac{\theta_r q_{jr}}{\sqrt{M_r^{p}(n)}}\Bigg)\Bigg]ds.\end{align}

Adding side by side the two inequalities in (73) and (92) leads to

(93) \begin{equation}\begin{split}&\max_{1\leq j\leq r}\max_{n\in C_j^c}\mathbb{E}\Big\|X_{n,j}^{c}-Y_{n,j}^{c} \Big\|_t+\max_{1\leq j\leq r}\max_{n\in C_j^p}\mathbb{E}\Big\|X_{n,j}^{p}-Y_{n,j}^p \Big\|_t\\&\quad\leq K |\mathcal{E}|\int_0^t\bigg(C_1\max_{1\leq j\leq r}\max_{n\in C_j^c}\mathbb{E}\Big\|X_{n,j}^{c}-Y_{n,j}^{c} \Big\|_s +C_2\max_{1\leq j\leq r}\max_{n\in C_j^p}\mathbb{E}\Big\|X_{n,j}^{p}-Y_{n,j}^p \Big\|_s+C_3(N)\bigg)ds,\end{split}\end{equation}

where, with a slight abuse of notation, the constants $C_1,C_2$ and the function $C_3(N)$ are defined by

\begin{align*}C_1&\,:\!=\, \bar{\gamma}+2 L_{\gamma},\\C_2&\,:\!=\,\bar{\gamma}+2L_{\gamma},\end{align*}

and

\begin{align*}C_3(N)&\,:\!=\, \sum_{1\leq j\leq r}\Bigg( \bar{\gamma}\max_{n\in C_j^c}\big|\varrho_{n,j}^c-p_j^c\big|+ \bar{\gamma}\max_{n\in C_j^c}\big|\varrho_{n,j}^c-p_j^p\big|+\max_{n\in C_j^c}\frac{\kappa_1 p_j^c}{\sqrt{M_{j}^{c}(n)+1}}+\max_{n\in C_j^c}\frac{\kappa_2 p_j^p}{\sqrt{M_{j}^{p}(n)}}\Bigg)\\&\quad+\sum_{1\leq j\leq r}\max_{n\in C_j^p}\Bigg(\frac{\alpha_j^c\kappa_3}{\sqrt{M_{j}^{c}(n)}}\Bigg)+\sum_{1\leq j\leq r}\max_{n\in C_j^p}\Big|\varsigma_{n,j}^c-\alpha_j^c\Big|\bar{\gamma}\\&\quad+\sum_{1\leq j\leq r}\!\Bigg(\max_{n\in C_j^p}\big|\varsigma_{n,j,1}^p-q_{j1}\big|\bar{\gamma}+\cdots+\max_{n\in C_j^p}\big|\varsigma_{n,j,j}^p-q_{jj}\big|\bar{\gamma}+\cdots+\max_{n\in C_j^p}\big|\varsigma_{n,j,r}^p-q_{jr}\big|\bar{\gamma}\\&\quad+\max_{n\in C_j^p}\frac{\theta_1 q_{j1}}{\sqrt{M_1^{p}(n)}}+\cdots+\max_{n\in C_j^p}\frac{\theta_j q_{jj}}{\sqrt{M_{j}^{p}(n)+1}}+\cdots+\max_{n\in C_j^p}\frac{\theta_r q_{jr}}{\sqrt{M_r^{p}(n)}}\Bigg).\end{align*}

Therefore, applying Grönwall’s lemma to (93) gives

(94) \begin{equation}\begin{split}&\max_{1\leq j\leq r}\max_{n\in C_j^c}\mathbb{E}\Big[\Big\|X_{n,j}^{c}-Y_{n,j}^{c}\Big\|_t\Big]+\max_{1\leq j\leq r}\max_{n\in C_j^p}\mathbb{E}\Big[\Big\|X_{n,j}^{p}-Y_{n,j}^p \Big\|_t\Big]\leq K |\mathcal{E}| C_3(N)t\exp\bigg\{\int_{0}^t C_4ds\bigg\},\end{split}\end{equation}

with $C_4=C_1+C_2$ . Finally, Condition 4.1 ensures that $C_3(N)\rightarrow 0$ as $N\rightarrow\infty$ , which proves (61).
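
The passage from (93) to (94) can be read as an instance of the integral form of Grönwall's lemma. The following sketch uses the shorthand $u(t)$ and the constant $C$, neither of which appears in the text, and assumes that $u$ is bounded and measurable on $[0,T]$; the constant it produces in the exponent need not coincide with the one displayed in (94). Set

\begin{align*}u(t)\,:\!=\,\max_{1\leq j\leq r}\max_{n\in C_j^c}\mathbb{E}\Big\|X_{n,j}^{c}-Y_{n,j}^{c}\Big\|_t+\max_{1\leq j\leq r}\max_{n\in C_j^p}\mathbb{E}\Big\|X_{n,j}^{p}-Y_{n,j}^p \Big\|_t,\qquad C\,:\!=\,K|\mathcal{E}|\max\{C_1,C_2\}.\end{align*}

Since both maxima are nonnegative, (93) implies

\begin{align*}u(t)\leq K|\mathcal{E}|\,C_3(N)\,t+C\int_0^t u(s)\,ds,\end{align*}

and, because $t\mapsto K|\mathcal{E}|\,C_3(N)\,t$ is nondecreasing, Grönwall's lemma yields

\begin{align*}u(t)\leq K|\mathcal{E}|\,C_3(N)\,t\,e^{Ct},\end{align*}

which tends to zero for each fixed $t$ as soon as $C_3(N)\rightarrow 0$.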

We are now ready to conclude the proof. First, Theorem 4.1 ensures the uniqueness of the solution of the limiting stochastic differential equation (14). In addition, the relation in (25) shows that the solution is continuous with respect to the initial condition. Therefore, the process $Y(t)=\big(\big(Y_n^{c}(t),Y_m^{p}(t)\big),n\in C_j^c,m\in C_j^p;\,1\leq j\leq r\big)$ is $\mu^c_{1}\otimes\mu_{1}^p\otimes\cdots\otimes\mu_{r}^c\otimes\mu_{r}^p$-multi-chaotic since the initial condition $Y(0)=X(0)$ is multi-exchangeable and $\nu^{1,c}\otimes\nu^{1,p}\otimes\cdots\otimes\nu^{r,c}\otimes\nu^{r,p}$-multi-chaotic. Then, by the relation in (61), we conclude that the convergence in (59) holds and the sequence of processes $\big(\big(X_n^{c,N}(t),X_m^{p,N}(t),t\geq 0\big),n\in C_j^c,m\in C_j^p;\,1\leq j\leq r\big)$ is also $P_{\bar{X}}=\mu^c_{1}\otimes\mu_{1}^p\otimes\cdots\otimes\mu_{r}^c\otimes\mu_{r}^p$-multi-chaotic, which concludes the proof.

5.2. Laws of large numbers

The following laws of large numbers are immediate consequences of Theorem 5.1.

Corollary 5.1. Suppose that the conditions of Theorem 5.1 hold. Define $\mu_j^{c}\,:\!=\,\mathcal{L}\big( \bar{X}^c_{n,j}\big)$, $\mu_j^{p}\,:\!=\,\mathcal{L}\big( \bar{X}^p_{m,j}\big)$ for $1\leq j\leq r$, where $\big(\big(\bar{X}^c_{n,j}(t), \bar{X}^p_{m,j}(t),t\geq 0\big),1\leq j\leq r\big)$ is the solution of the McKean–Vlasov limiting system in (14) with initial distribution $\nu^{1,c}\otimes\nu^{1,p}\otimes\cdots\otimes\nu^{r,c}\otimes\nu^{r,p}$. Then, for each $1\leq j\leq r$, as $N\rightarrow\infty$,

(95) \begin{align}\mu_{j}^{c,N}=\frac{1}{N_j^c}\sum_{n\in C_j^c}\delta_{X_{n,j}^c}\rightarrow\mu_{j}^c\end{align}

and

(96) \begin{align}\mu_{j}^{p,N}=\frac{1}{N_j^p}\sum_{n\in C_j^p}\delta_{X_{n,j}^p}\rightarrow\mu_{j}^p,\end{align}

for the weak topology on $\mathcal{M}_1(\mathcal{D}([0,T],\mathcal{Z}))$ with $\mathcal{D}([0,T],\mathcal{Z})$ endowed with the Skorokhod topology.

Proof. We prove (96); the proof of (95) is similar. Let

\begin{align*} \bar{\mu}_j^{p,N}\,:\!=\,\frac{1}{N_j^p}\sum\limits_{n\in C_j^p}\delta_{\bar{X}_{n,j}^p},\end{align*}

and recall that the bounded-Lipschitz metric $d_{BL}$ metrizes weak convergence on $\mathcal{M}_1(\mathcal{D}([0,T],\mathcal{Z}))$. Therefore, to prove the convergence in (96), it suffices to prove that $d_{BL}\Big(\mu_j^{p,N},\bar{\mu}_j^{p,N}\Big)\Rightarrow 0$ and that $\bar{\mu}_j^{p,N}\Rightarrow \mu_j^p$. First note that for all $1\leq j\leq r$,

(97) \begin{equation}\begin{split}\mathbb{E}\Big[d_{BL}\Big(\mu_j^{p,N},\bar{\mu}_j^{p,N}\Big)\Big]&=\mathbb{E}\bigg[\sup_{g\in Lip(\mathcal{Z})}\Big|\Big\langle \mu_j^{p,N},g\Big\rangle-\Big\langle \bar{\mu}_j^{p,N},g\Big\rangle\Big|\bigg]\\ &=\mathbb{E}\Bigg[\sup_{g\in Lip(\mathcal{Z})}\Bigg|\frac{1}{N_j^p}\sum_{n\in C_j^p}\Big(g\Big(X_{n,j}^{p}\Big)-g\Big(\bar{X}_{n,j}^p\Big)\Big)\Bigg|\Bigg] \\ &\leq \frac{1}{N_j^p}\sum_{n\in C_j^p}\mathbb{E}\Big[\Big\|X_{n,j}^{p}-\bar{X}_{n,j}^p\Big\|_T\Big] \\ &\leq \max_{n\in C_j^p}\mathbb{E}\Big\|X_{n,j}^{p}-\bar{X}_{n,j}^p\Big\|_T,\end{split}\end{equation}

which goes to zero according to (59). Thus, $d_{BL}\Big(\mu_j^{p,N},\bar{\mu}_j^{p,N}\Big)\Rightarrow 0$ as $N\rightarrow\infty$. It remains to show that $\bar{\mu}_j^{p,N}\Rightarrow \mu_j^p$ as $N\rightarrow\infty$. Since the stochastic processes $\Big\{\bar{X}_{n,j}^p, n\in C_j^p\Big\}$ are i.i.d., for any continuous and bounded function $g\in C_b(\mathcal{Z})$ one finds that

(98) \begin{equation}\begin{split}\mathbb{E}\bigg( \Big\langle \bar{\mu}_j^{p,N},g\Big\rangle-\Big\langle \mu_j^p,g\Big\rangle\bigg)^2&=\mathbb{E}\Bigg(\frac{1}{N_j^p}\sum_{n\in C_j^p}\Big(g\Big(\bar{X}^p_{n,j}\Big)-\Big\langle \mu_j^p,g\Big\rangle\Big)\Bigg)^2\\ &=\mathbb{E}\Bigg(\frac{1}{\big(N_j^p\big)^2}\sum_{n\in C_j^p}\Big(g\Big(\bar{X}^p_{n,j}\Big)-\Big\langle \mu_j^p,g\Big\rangle\Big)^2\Bigg)\\ &\leq\frac{1}{N_j^p}4\|g\|_{\infty}^2,\end{split}\end{equation}

which goes to zero given the boundedness of g. Therefore, $\bar{\mu}_j^{p,N}$ converges weakly to $\mu_j^p$ as $N\rightarrow\infty$ . Thus, combining the two convergence results, we conclude that $\mu_j^{p,N}$ converges weakly to $\mu_j^{p}$ as $N\rightarrow\infty$ . The corollary is proved.
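
Two elementary steps are used implicitly in the proof above; the following sketch records them, with the shorthand $\xi_n$ introduced here only for illustration. First, Markov's inequality turns the bound (97) into convergence in probability: for every $\varepsilon>0$,

\begin{align*}\mathbb{P}\Big(d_{BL}\Big(\mu_j^{p,N},\bar{\mu}_j^{p,N}\Big)>\varepsilon\Big)\leq\frac{1}{\varepsilon}\,\mathbb{E}\Big[d_{BL}\Big(\mu_j^{p,N},\bar{\mu}_j^{p,N}\Big)\Big]\rightarrow 0.\end{align*}

Second, write $\xi_n\,:\!=\,g\big(\bar{X}^p_{n,j}\big)-\big\langle\mu_j^p,g\big\rangle$ for $n\in C_j^p$. Since $\mu_j^p$ is the law of $\bar{X}^p_{n,j}$, the variables $\xi_n$ are i.i.d., centered, and bounded by $2\|g\|_{\infty}$, so the cross terms vanish when the square in (98) is expanded:

\begin{align*}\mathbb{E}\Bigg(\frac{1}{N_j^p}\sum_{n\in C_j^p}\xi_n\Bigg)^2=\frac{1}{\big(N_j^p\big)^2}\sum_{n\in C_j^p}\mathbb{E}\big[\xi_n^2\big]+\frac{1}{\big(N_j^p\big)^2}\sum_{\substack{n,m\in C_j^p\\ n\neq m}}\mathbb{E}[\xi_n]\,\mathbb{E}[\xi_m]=\frac{1}{N_j^p}\,\mathbb{E}\big[\xi_n^2\big]\leq\frac{4\|g\|_{\infty}^2}{N_j^p}.\end{align*}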

6. Large deviations

We investigate here the large deviations principles of the interacting particle system introduced in Section 2 over finite time durations. For the sake of simplicity, we restrict ourselves to the case of block-structured graphs where the blocks are cliques, i.e., complete subgraphs, and the peripheral subgraph is complete, that is, all peripheral nodes in the system are connected; see, e.g., Figure 1. The first main result of this section is Theorem 6.1, which states the large deviations principle of the vector of empirical measures. The second main result is Theorem 6.2, which states the large deviations principle of the corresponding vector of empirical processes. The approach we take to establish these results is based on a generalization to the multi-class setting of the classical approach developed in [Reference Dawson and Gärtner24] and adapted in [Reference Léonard49] to the context of jump processes. One might also consult [Reference Feng32, Reference Feng33] for an alternative approach.

Let us first introduce the assumptions under which the results of this section hold.

Assumption 6.1.

  1. The peripheral subgraph is complete; that is, for any two peripheral nodes $n,m\in\bigcup\limits_{1\leq j\leq r} C_{j}^p$, there exists an edge $(n,m)\in\Xi$ connecting n and m.

  2. The r blocks of the graphs are cliques; that is, for any two nodes $n,m\in C_{j}^p$ of the same block $1\leq j\leq r$, there exists an edge $(n,m)\in\Xi$ connecting n and m.

  3. The mappings $\lambda_{j,z,z'}^{c}$ and $\lambda^p_{j,z,z'}$ introduced in (16) and (17) are uniformly bounded away from zero; that is, there exists $c > 0$ such that, for all $\nu,\mu_1,\ldots,\mu_r\in\mathcal{M}_1(\mathcal{Z})$ and all $(z, z')\in\mathcal{E}$, we have $\lambda_{j,z,z'}^{c}(\nu,\mu_j)\geq c$ and $\lambda^p_{j,z,z'}(\nu,\mu_1,\ldots,\mu_r)\geq c$.

  4. As $N\rightarrow\infty$,

    (99) \begin{align}\frac{N_j}{N}\rightarrow \alpha_j\end{align}
    for some $\alpha_j\in (0,1)$, where we recall that $N_j$ is the number of nodes in the jth block and $N_j^c$ (resp. $N_j^p$) is the number of central (resp. peripheral) nodes in the jth block.

Remark 6.1.

  1. From (16) and (17), together with Remark 4.1, the functions $\lambda_{j,z,z'}^{c}$ and $\lambda^p_{j,z,z'}$ are Lipschitz.

  2. Since $\mathcal{M}_1(\mathcal{Z})$ is compact and the rate functions $\lambda_{j,z,z'}^{c}$ and $\lambda^p_{j,z,z'}$ are continuous and Lipschitz, the rates are uniformly bounded from above; that is, there exists a constant $C <\infty$ such that for all $\nu,\mu_1,\ldots,\mu_r\in\mathcal{M}_1 (\mathcal{Z})$ and all $(z,z')\in\mathcal{E}$, we have $\lambda_{j,z,z'}^{c}(\nu,\mu_j)\leq C$ and $\lambda^p_{j,z,z'}(\nu,\mu_1,\ldots,\mu_r)\leq C$.

  3. For ease of reading, we have omitted subscripts indicating the dependence of the rate functions $\lambda_{j,z,z'}^{c}$ and $\lambda^p_{j,z,z'}$ on the proportions $a_1,b_1,a,b_1,\ldots,b_r$.

  4. We use again throughout this section the convention that N goes to infinity when both $\min_{1\leq j\leq r}N_j^c$ and $\min_{1\leq j\leq r}N_j^p$ go to infinity.

  5. We emphasize that, for simplicity, the results obtained in this section hold under Assumption 6.1, which describes a special case of the class of models given by Condition 4.1; that is, we suppose here that each block is a clique and the peripheral subgraph is complete.
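
To make the special case concrete, the neighbourhood sizes can be written out explicitly. The following is a sketch that assumes $N_j=N_j^c+N_j^p$ and reads $M_j^c(n)$ and $M_i^p(n)$ as the numbers of central neighbours in block j and of peripheral neighbours in block i of node n; the numerical values are hypothetical. For a peripheral node $n\in C_j^p$ and a central node $n'\in C_j^c$, cliques and a complete peripheral subgraph give

\begin{align*}M_j^c(n)=N_j^c,\qquad M_j^p(n)=N_j^p-1,\qquad M_i^p(n)=N_i^p\quad (i\neq j),\qquad deg(n)=N_j^c+\sum_{k=1}^r N_k^p-1,\qquad deg(n')=N_j-1.\end{align*}

For instance, with $r=2$, $N_1^c=3$, $N_1^p=2$, $N_2^c=4$, and $N_2^p=1$, a peripheral node of block 1 has $deg(n)=3+(2+1)-1=5$, while a central node of block 1 has $deg(n')=5-1=4$.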

Let $\mathbb{M}^N\in(\mathcal{M}_1(\mathcal{D}([0,T],\mathcal{Z})))^{2r}$ denote the vector of empirical measures defined by

(100) \begin{equation}\begin{split}\mathbb{M}^N&\,:\!=\,\Big(\mathbb{M}_1^{c,N},\mathbb{M}_1^{p,N},\cdots,\mathbb{M}_r^{c,N},\mathbb{M}_r^{p,N}\Big)\\&=\Bigg(\frac{1}{N_1^c}\sum_{n\in C^c_1}\delta_{X^c_{n,1}},\frac{1}{N_1^p}\sum_{n\in C^p_1}\delta_{X^p_{n,1}},\ldots, \frac{1}{N^c_r}\sum_{n\in C^c_r}\delta_{X^c_{n,r}}, \frac{1}{N^p_r}\sum_{n\in C^p_r}\delta_{X^p_{n,r}} \Bigg),\end{split}\end{equation}

where $X^N=\Big(X^c_{n,j},X^p_{m,j};\,n\in C_j^c, m\in C_j^p;\, 1\leq j\leq r\Big)\in\mathcal{D}([0,T],\mathcal{Z}^N)$ denotes the full description of the N particles and $\mathbb{M}_j^{c,N}$ (resp. $\mathbb{M}_j^{p,N}$) is the empirical measure of the central (resp. peripheral) nodes of the jth block, for $1\leq j\leq r$. With a slight abuse of notation, denote by $G_N\,:\,\mathcal{D}\big([0,T],\mathcal{Z}^N\big)\rightarrow (\mathcal{M}_1(\mathcal{D}([0,T],\mathcal{Z})))^{2r}$ the mapping that takes the full description $X^N$ to the vector of empirical measures $\mathbb{M}^N$, that is,

\begin{align*}G_N\,:\, \Big(X^c_{n,j}, &X^p_{m,j};\,n\in C_j^c, m\in C_j^p;\, 1\leq j\leq r\Big)\rightarrow\\&\Bigg(\frac{1}{N_1^c}\sum_{n\in C^c_1}\delta_{X^c_{n,1}},\frac{1}{N_1^p}\sum_{n\in C^p_1}\delta_{X^p_{n,1}},\ldots, \frac{1}{N^c_r}\sum_{n\in C^c_r}\delta_{X^c_{n,r}}, \frac{1}{N^p_r}\sum_{n\in C^p_r}\delta_{X^p_{n,r}} \Bigg).\end{align*}

Thus, $\mathbb{M}^N=G_N\big(X^N\big)$ . Denote by $\mathbb{P}_{z^N}^N$ the law of $X^N$ with initial condition $z^N=\Big(z^c_{n,j},z^p_{m,j};\,n\in C_j^c, m\in C_j^p;\, 1\leq j\leq r\Big)$ . Note that the distribution of the empirical vector $\mathbb{M}^N$ depends on the initial condition only through its empirical vector, defined by

(101) \begin{align} & \nu_N\,:\!=\,\left(\nu_N^{1,c},\nu_N^{1,p},\ldots,\nu_N^{r,c},\nu_N^{r,p} \right)=\nonumber\\& \qquad \Bigg(\frac{1}{N_1^c}\sum_{n\in C_1^c}\delta_{z^c_{n,1}},\frac{1}{N_1^p}\sum_{n\in C_1^p}\delta_{z_{n,1}^p},\ldots,\frac{1}{N_r^c}\sum_{n\in C_r^c}\delta_{z^c_{n,r}},\frac{1}{N_r^p}\sum_{n\in C_r^p}\delta_{z^p_{n,r}}\Bigg).\end{align}

Moreover, denote by $P_{\nu^N}^N$ the distribution of $\mathbb{M}^N$ , which is the pushforward of $\mathbb{P}_{z^N}^N$ under the mapping $G_N$ ; that is, $P_{\nu^N}^N=\mathbb{P}_{z^N}^N\circ G_N^{-1}$ .

Let us now introduce the $(\mathcal{M}_1(\mathcal{Z}))^{2r}$ -valued vector of empirical processes

(102) \begin{equation}\begin{split}\mu^N\,:\, t\in [0,T]\longrightarrow \mu^N(t)&=\left(\mu_1^{c,N}(t),\mu_1^{p,N}(t),\cdots,\mu_r^{c,N}(t),\mu_r^{p,N}(t)\right)\\ &=\Bigg(\frac{1}{N_1^c}\!\sum_{n\in C^c_1}\delta_{X^c_{n,1}(t)},\frac{1}{N_1^p}\!\sum_{n\in C^p_1}\delta_{X^p_{n,1}(t)},\ldots, \frac{1}{N^c_r}\!\sum_{n\in C^c_r}\delta_{X^c_{n,r}(t)}, \frac{1}{N^p_r}\!\sum_{n\in C^p_r}\delta_{X^p_{n,r}(t)} \Bigg), \end{split}\end{equation}

and denote by $\gamma_N$ the corresponding mapping that takes a full description $X^N \in\mathcal{D}([0,T], \mathcal{Z}^N)$ of the N particles of the system to the empirical process vector $\mu^N$ , that is,

\begin{align*}\gamma_N \,:\, \Big(X^c_{n,j},X^p_{m,j};\,n\in C_j^c, m\in C_j^p;\, 1\leq j\leq r\Big)\in \mathcal{D}\big([0,T], \mathcal{Z}^N\big)\rightarrow \mu^N\,:\, [0,T] \rightarrow \big(\mathcal{M}_1(\mathcal{Z})\big)^{2r}.\end{align*}

Observe that $\mu^N(0)=\nu_N$ and that $\mu^N(t)$ is the projection $\pi_t\big(\mathbb{M}^N\big)$ at time t, that is,

\begin{align*}\mu^N=\pi\big(\mathbb{M}^N\big)=\pi\big(G_N\big(X^N\big)\big)=\gamma_N\big(X^N\big),\end{align*}

where the notation $\pi$ denotes, again with a slight abuse of notation, both the vector projection

\begin{align*}\pi\,:\,\left(\mathcal{M}_1(\mathcal{D}([0,T],\mathcal{Z}))\right)^{2r}\rightarrow\left(\mathcal{D}([0,T],\mathcal{M}_1(\mathcal{Z}))\right)^{2r}\end{align*}

and the component projection

\begin{align*}\pi\,:\,\mathcal{M}_1(\mathcal{D}([0,T],\mathcal{Z}))\rightarrow\mathcal{D}([0,T],\mathcal{M}_1(\mathcal{Z})).\end{align*}

Finally, denote by $p_{\nu_N}^N$ the distribution of $\mu^N$, which is the pushforward $p_{\nu_N}^N=\mathbb{P}_{z^N}^N\circ\gamma_N^{-1}$. Note that, since $\mu^N=\pi\big(\mathbb{M}^N\big)$, we can also write $p^N_{\nu_N}$ as the pushforward $p_{\nu_N}^N=P_{\nu_N}^N\circ\pi^{-1}$.
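
To illustrate how $G_N$, $\pi$, and $\gamma_N$ fit together, here is a minimal sketch for a single component. It assumes that $\pi(Q)(t)$ is the time-t marginal of $Q$, and the two paths $x,y\in\mathcal{D}([0,T],\mathcal{Z})$ are hypothetical. If $C_1^c=\{1,2\}$ with $X^c_{1,1}=x$ and $X^c_{2,1}=y$, then

\begin{align*}\mathbb{M}_1^{c,N}=\frac{1}{2}\big(\delta_{x}+\delta_{y}\big)\in\mathcal{M}_1(\mathcal{D}([0,T],\mathcal{Z})),\qquad \pi\big(\mathbb{M}_1^{c,N}\big)(t)=\frac{1}{2}\big(\delta_{x(t)}+\delta_{y(t)}\big)=\mu_1^{c,N}(t)\in\mathcal{M}_1(\mathcal{Z}).\end{align*}

More generally, under the same reading, for $Q\in\mathcal{M}_1(\mathcal{D}([0,T],\mathcal{Z}))$ and a bounded function $\phi$ on $\mathcal{Z}$,

\begin{align*}\big\langle \pi(Q)(t),\phi\big\rangle=\int_{\mathcal{D}([0,T],\mathcal{Z})}\phi(x(t))\,Q(dx).\end{align*}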

The goal of this section is to study the large deviations principles for the sequences of probability measures $\big(P_{\nu^N}^N,N\geq 1\big)$ and $\big(p_{\nu_N}^N,N\geq 1\big)$ . The two main results are Theorem 6.1 and Theorem 6.2.

6.1. Large deviations principle for the empirical measure vector

We start by investigating the large deviations principle of the sequence $\big(P_{\nu^N}^N, N\geq 1\big)$ . To this end, we first establish the result in the non-interacting case. Then, through the Radon–Nikodym derivative, one uses the Laplace–Varadhan principle to deduce the case with interactions.

Let us first describe the hypothetical non-interacting system. Suppose that all the nodes are independent of each other and that the color of each node changes with a constant rate equal to 1 for all allowed transitions $(z, z')\in\mathcal{E}$ , while all other transition rates are zero. Denote by $P_{z_0}$ the marginal law on $\mathcal{D} ([0,T],\mathcal{Z})$ of this process with initial condition $z_0$ . Thus, $P_{z_0}$ is the unique solution to the martingale problem in $\mathcal{D}([0,T],\mathcal{Z})$ associated with the generator $\mathcal{L}^0$ operating on bounded measurable functions $\phi$ on $\mathcal{Z}$ according to

\begin{align*}\mathcal{L}^0\phi(z)\,:\!=\,\sum_{z'\,:\,(z,z')\in\mathcal{E}}1\cdot(\phi (z')-\phi (z))\end{align*}

and the initial condition $z_0$ . Given that the transition rates are upper-bounded and that

\begin{align*}\sup_{z\in\mathcal{Z}}\sum_{z'\,:\,(z,z')\in\mathcal{E}}|z'-z|<\digamma (1+z)\end{align*}

for some constant $\digamma$ , there exists a unique solution to the martingale problem for $(\mathcal{L}^0,z_0)$ (cf. [Reference Ethier and Kurtz31, Problem 4.11.15]).
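
For concreteness, consider a hypothetical two-state example that is not taken from the text: $\mathcal{Z}=\{0,1\}$ and $\mathcal{E}=\{(0,1),(1,0)\}$. The reference generator then reads

\begin{align*}\mathcal{L}^0\phi(0)=\phi(1)-\phi(0),\qquad \mathcal{L}^0\phi(1)=\phi(0)-\phi(1),\end{align*}

so that, under $P_{z_0}$, the node flips between the two colors at the jump times of a Poisson process with rate 1. In general, the holding time of the reference process in a state z is exponentially distributed with parameter equal to the number of allowed transitions $(z,z')\in\mathcal{E}$ out of z.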

For any $\eta,\rho_1,\ldots, \rho_r$ in $\mathcal{D}([0,T], \mathcal{M}_1 (\mathcal{Z}))$ , let $R_{z_0}^c(\eta,\rho_j)$ be the unique solution to the martingale problem in $\mathcal{D}([0,T],\mathcal{Z})$ associated with the time-varying generator

(103) \begin{align}\mathcal{L}^c_{\eta (t),\rho_j(t)}\phi (z)\,:\!=\,\sum_{z'\,:\,(z,z')\in\mathcal{E}}\lambda_{j,z,z'}^{c}(\eta (t),\rho_j(t))(\phi (z')-\phi (z))\end{align}

and the initial condition $z_0$ . Similarly, let $R_{z_0}^p(\eta,\rho_1,\ldots,\rho_r)$ be the unique solution to the martingale problem in $\mathcal{D}([0,T],\mathcal{Z})$ associated with the time-varying generator

(104) \begin{align}\mathcal{L}^p_{\eta (t),\rho_1 (t),\ldots,\rho_r(t)}\phi (z)\,:\!=\,\sum_{z'\,:\,(z,z')\in\mathcal{E}}\lambda^p_{j,z,z'}(\eta (t),\rho_1 (t),\ldots,\rho_r(t))(\phi (z')-\phi (z))\end{align}

and the initial condition $z_0$ . Again by the upper-boundedness of $\lambda_{j,z,z'}^{c}$ and $\lambda^p_{j,z,z'}$ , the uniqueness of $R_{z_0}^c(\eta,\rho_j)$ and $R_{z_0}^p(\eta,\rho_1,\ldots,\rho_r)$ follows (see again [Reference Ethier and Kurtz31, Problem 4.11.15]). Therefore, the density of $R_{z_0}^c(\eta,\rho_j)$ and $R_{z_0}^p(\eta,\rho_1,\ldots,\rho_r)$ with respect to $P_{z_0}$ can be written as follows (see [Reference Léonard49, Equation (2.4)]):

(105) \begin{align}\frac{d R_{z_0}^c(\eta,\rho_j)}{d P_{z_0}}(x)=\exp\{h_1(x,\eta,\rho_j)\}\quad\text{and}\quad \frac{d R_{z_0}^p(\eta,\rho_1,\ldots,\rho_r)}{d P_{z_0}}(x)=\exp\{h_2(x,\eta,\rho_1,\ldots,\rho_r)\},\end{align}

where

(106) \begin{align}h_1(x,\eta,\rho_j)&\,:\!=\, \sum_{0\leq t\leq T}{\unicode{x1D7D9}}_{\{x_t\neq x_{t-}\}}\log \Big(\lambda^c_{j,x_{t-},x_t}(\eta (t{-}),\rho_j(t{-}))\Big)\nonumber \\ &\qquad-\int_0^T\Bigg(\sum_{z\,:\,(x_t,z)\in\mathcal{E}}\lambda^{c}_{j,x_t,z}(\eta(t),\rho_j(t))-1\Bigg)dt \end{align}

and

(107) \begin{align}h_2(x,\eta,\rho_1,\ldots,\rho_r)&\,:\!=\,\sum_{0\leq t\leq T}{\unicode{x1D7D9}}_{\{x_t\neq x_{t-}\}}\log \Big(\lambda^p_{j,x_{t-},x_{t}}(\eta (t{-}),\rho_1(t{-}),\ldots,\rho_r(t{-}))\Big)\nonumber \\ &\qquad-\int_0^T\Bigg(\sum_{z\,:\,(x_t,z)\in\mathcal{E}}\lambda^{p}_{j,x_t,z}(\eta (t),\rho_1(t),\ldots,\rho_r(t))-1\Bigg)dt. \end{align}
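
As a sanity check on (105) and (106), consider a hypothetical two-state example that is not taken from the text: $\mathcal{Z}=\{0,1\}$, $\mathcal{E}=\{(0,1),(1,0)\}$, and rates frozen at constants $a>0$ for the transition $0\rightarrow 1$ and $b>0$ for the transition $1\rightarrow 0$. For a path x that starts at 0 and has a single jump, to 1, at time $\tau\in(0,T)$, formula (106) reduces to

\begin{align*}h_1(x)=\log a-(a-1)\tau-(b-1)(T-\tau).\end{align*}

This agrees with a direct computation: the likelihood of one jump at time $\tau$ and no further jump up to T is $a\,e^{-a\tau}e^{-b(T-\tau)}$ under the rates $(a,b)$ and $e^{-\tau}e^{-(T-\tau)}$ under the reference law, so that

\begin{align*}\frac{a\,e^{-a\tau}\,e^{-b(T-\tau)}}{e^{-\tau}\,e^{-(T-\tau)}}=\exp\big\{\log a-(a-1)\tau-(b-1)(T-\tau)\big\}.\end{align*}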

Consider now a system of N non-interacting particles where the law of the nth particle is $P_{z_n}$ with initial condition $z_n$ . The law of such a system is the product distribution $\mathbb{P}_{z^N}^{0,N}=\otimes_{n=1}^NP_{z_n}$ . Moreover, the distribution of the corresponding empirical vector is given by $P_{\nu^N}^{0, N}=\mathbb{P}_{z^N}^{0, N}\circ G_N^{-1}$ where $\nu_N$ is the initial empirical vector (101). Therefore, by applying an analogue of the Cameron–Martin–Girsanov formula for stochastic integrals with respect to point processes (see e.g. [Reference Dawson and Zheng27, Lemma 3.7] or [Reference Léonard49, Equation (2.8)]), one can compute the Radon–Nikodym derivative $dP_{\nu^N}^{N}/dP_{\nu^N}^{0, N}$ at any $\textbf{Q}=(Q_{1}^c, Q_{1}^p,\cdots, Q_{r}^c, Q_{r}^p)\in (\mathcal{M}_1(\mathcal{D}([0, T],\mathcal{Z})))^{2r}$ as follows:

(108) \begin{equation}\begin{split}\frac{dP_{\nu^N}^{N}}{dP_{\nu^N}^{0,N}}(\textbf{Q})&=\exp\Bigg\{\sum_{j=1}^r\Bigg[N_j^c\int_{D([0,T],\mathcal{Z})}h_1\big(x, \pi\big(Q_{j}^c\big),\pi\big(Q_{j}^p\big)\big)Q_{j}^c(dx)\\&\qquad\qquad+N_j^p\int_{D([0,T],\mathcal{Z})}h_2\big(x, \pi\big(Q_{j}^c\big), \pi\big(Q_{1}^p\big),\ldots, \pi\big(Q_{r}^p\big)\big)Q_{j}^p(dx)\Bigg]\Bigg\}\\ &=\exp\big\{N h^N(\textbf{Q})\big\},\end{split}\end{equation}

with

\begin{align*}\pi(\textbf{Q})=\big(\pi\big(Q_{1}^c\big), \pi\big(Q_{1}^p\big),\cdots, \pi\big(Q_{r}^c\big), \pi\big(Q_{r}^p\big)\big)\end{align*}

being the vector containing the component projections $\pi\big(Q_j^{\iota}\big)\in \mathcal{D}([0,T],\mathcal{M}_1(\mathcal{Z}))$ for $1\leq j\leq r$ and $\iota\in\{c,p\}$ , and where

(109) \begin{equation}\begin{split}h^N(\textbf{Q})&\,:\!=\,\sum_{j=1}^r\bigg[\frac{N_j^c}{N}\int_{D([0,T],\mathcal{Z})}h_1\big(x,\pi\big(Q_{j}^c\big),\pi\big(Q_{j}^p\big)\big)Q_{j}^c(dx)\\&\qquad\qquad+\frac{N_j^p}{N}\int_{D([0,T],\mathcal{Z})}h_2\big(x,\pi\big(Q_{j}^c\big), \pi\big(Q_{1}^p\big),\ldots, \pi\big(Q_{r}^p\big)\big)Q_{j}^p(dx)\bigg].\end{split}\end{equation}

Note that under Assumption 6.1, the sequence of functions $\{h^N\}_{N\geq 1}$ converges, as $N\rightarrow\infty$ , towards the function h given by

(110) \begin{equation}\begin{split}h(\textbf{Q})&\,:\!=\,\sum_{j=1}^r\bigg[\alpha_jp_j^c\int_{D([0,T],\mathcal{Z})}h_1\big(x,\pi\big(Q_{j}^c\big),\pi\big(Q_{j}^p\big)\big)Q_{j}^c(dx)\\&\qquad\qquad+\alpha_j p_j^p\int_{D([0,T],\mathcal{Z})}h_2\big(x,\pi\big(Q_{j}^c\big), \pi\big(Q_{1}^p\big),\ldots, \pi\big(Q_{r}^p\big)\big)Q_{j}^p(dx)\bigg].\end{split}\end{equation}
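
The limiting weights in (110) can be traced as follows, under the assumption (suggested by the notation but not restated in this section) that $p_j^c$ and $p_j^p$ denote the limits of $N_j^c/N_j$ and $N_j^p/N_j$, respectively. Combining this with (99) gives

\begin{align*}\frac{N_j^c}{N}=\frac{N_j}{N}\cdot\frac{N_j^c}{N_j}\rightarrow\alpha_j p_j^c,\qquad \frac{N_j^p}{N}=\frac{N_j}{N}\cdot\frac{N_j^p}{N_j}\rightarrow\alpha_j p_j^p.\end{align*}

In that case $p_j^c+p_j^p=1$ and, since the r blocks partition the N nodes, $\sum_{j=1}^r\alpha_j=1$, so the $2r$ weights in (110) sum to one.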

We now introduce the necessary spaces and topologies following the notation of [Reference Borkar and Sundaresan12, Reference Léonard49]. Consider the Polish space $(\mathcal{X},d)$ where

\begin{align*}\mathcal{X}\,:\!=\,\bigg\{x\in \mathcal{D}([0,T], \mathcal{Z})\bigg| &\sum_{0\leq t\leq T}{\unicode{x1D7D9}}_{x_t\neq x_{t-}}<+\infty,\\&\text{ and for each $t\in(0, T ]$ with $x_t\neq x_{t-}$, we have $(x(t{-}),x(t))\in\mathcal{E}$}\bigg\},\end{align*}

and the metric d is defined by

(111) \begin{align}d(x,y)\,:\!=\, d_{Sko}(x,y)+|\varphi (x)-\varphi (y)|,\quad x,y\in\mathcal{X},\end{align}

with $\varphi(x)\,:\!=\,\sum_{0\leq t\leq T}{\unicode{x1D7D9}}_{x_t\neq x_{t-}}$ denoting the number of jumps and $d_{Sko}$ standing for the Skorokhod complete metric (see [Reference Billingsley10, Section 12]). For this topology, the function $\varphi$ is continuous, and two paths are close to each other if they have the same number of jumps and if they are Skorokhod-close [Reference Léonard49, p. 299]. For any function $f\,:\,\mathcal{X}\rightarrow\mathbb{R}$ , define

(112) \begin{align}\|f\|_{\varphi}\,:\!=\,\sup_{x\in\mathcal{X}}\frac{|f(x)|}{1+\varphi (x)},\end{align}

and write

(113) \begin{align}C_{\varphi}(\mathcal{X}) \,:\!=\, \big\{f | f \,:\, \mathcal{X}\rightarrow\mathbb{R}\text{ is continuous and }\|f\|_{\varphi}<\infty\big\},\end{align}
(114) \begin{align}\mathcal{M}_{1,\varphi}(\mathcal{X})\,:\!=\,\bigg\{Q\in\mathcal{M}_1(\mathcal{X})\big|\int_{\mathcal{X}}\varphi dQ<+\infty \bigg\}.\end{align}
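As a minimal sketch of the objects just introduced, the following Python snippet computes the jump count $\varphi(x)$ and checks the edge constraint defining $\mathcal{X}$ for a piecewise-constant path. The path encoding (the ordered list of visited states, with holding times omitted) and the function names are assumptions made for illustration only; the Skorokhod part of the metric d is not implemented.

```python
def num_jumps(states):
    """phi(x): number of jumps of a piecewise-constant path, encoded
    (hypothetically) as the ordered list of states it occupies."""
    return sum(1 for a, b in zip(states, states[1:]) if a != b)


def in_path_space(states, edges):
    """True if every jump of the path follows a directed edge in `edges`,
    which is the constraint imposed in the definition of the space X."""
    return all((a, b) in edges
               for a, b in zip(states, states[1:]) if a != b)


# Hypothetical state space {0, 1, 2} with allowed transitions 0 -> 1 -> 2.
edges = {(0, 1), (1, 2)}
path = [0, 1, 1, 2]              # repeated entries are not jumps
jumps = num_jumps(path)          # two jumps: 0 -> 1 and 1 -> 2
allowed = in_path_space(path, edges)
```

Two paths at distance less than 1 in the metric d necessarily have the same value of `num_jumps`, since $|\varphi(x)-\varphi(y)|$ is an integer.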

We endow the set $\mathcal{M}_{1,\varphi}(\mathcal{X})$ with the weak- $^*$ topology $\sigma(\mathcal{M}_{1,\varphi}(\mathcal{X}), C_{\varphi}(\mathcal{X}))$ , the weakest topology under which $Q_N\rightarrow Q$ as $N\rightarrow +\infty$ if and only if

\begin{align*}\int_{\mathcal{X}} f dQ_N\rightarrow \int_{\mathcal{X}} f dQ \quad\text{for each $f\in C_{\varphi}(\mathcal{X})$}.\end{align*}

For a measure $\nu=\big(\nu^{1,c},\nu^{1,p},\ldots,\nu^{r,c},\nu^{r,p}\big)\in\big(\mathcal{M}_1(\mathcal{Z})\big)^{2r}$ we define, for all $1\leq j\leq r$ and $\iota\in\{c,p\}$ , the mixture

\begin{align*}dP_{j,\iota}(x)\,:\!=\,\sum_{z_0\in\mathcal{Z}}\nu^{\,j,\iota}(z_0)dP_{z_0}(x).\end{align*}

Moreover, let $R^c(\eta,\rho_j)$ and $R^p(\eta,\rho_1,\ldots,\rho_r)$ be the mixtures given by

(115) \begin{equation}\begin{split}dR^c(\eta,\rho_j)(x)&\,:\!=\,\sum_{z_0\in\mathcal{Z}}\nu^{\,j,c}(z_0)dR^c_{z_0}(\eta,\rho_j)(x),\\dR^p(\eta,\rho_1,\ldots,\rho_r)(x)&\,:\!=\,\sum_{z_0\in\mathcal{Z}}\nu^{\,j,p}(z_0)dR^p_{z_0}(\eta,\rho_1,\ldots,\rho_r)(x).\end{split}\end{equation}

Finally, let us introduce the relative entropy $H\,:\, \mathcal{M}_{1,\varphi} (\mathcal{X})\rightarrow [0,+\infty]$ of Q with respect to P as follows:

(116) \begin{align}H(Q|P)\,:\!=\,\left\{\begin{array}{l@{\quad}c@{\quad}l}\int_{\mathcal{X}}\log \big(\frac{dQ}{dP}\big)dQ & \mbox{if $Q \ll P$}, & \\[4pt]+\infty&\text{otherwise}. &\end{array}\right.\end{align}
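For intuition, the definition (116) can be evaluated directly when $\mathcal{X}$ is replaced by a finite set. The sketch below is a toy finite-state analogue, not the path-space object of the paper; the vectors `q` and `p` are arbitrary illustrative choices. It also checks the classical Donsker–Varadhan variational formula $H(Q|P)=\sup_f\big[\int f\,dQ-\log\int e^f\,dP\big]$, whose supremum is attained at $f=\log(dQ/dP)$ when $Q\ll P$ with full support.

```python
import math


def relative_entropy(q, p):
    """H(Q|P) for probability vectors on a finite set; +inf unless Q << P."""
    total = 0.0
    for qi, pi in zip(q, p):
        if qi == 0.0:
            continue              # convention 0 * log 0 = 0
        if pi == 0.0:
            return math.inf       # Q is not absolutely continuous w.r.t. P
        total += qi * math.log(qi / pi)
    return total


def dv_objective(f, q, p):
    """Donsker-Varadhan objective: int f dQ - log int e^f dP."""
    mean_f = sum(fi * qi for fi, qi in zip(f, q))
    log_mgf = math.log(sum(math.exp(fi) * pi for fi, pi in zip(f, p)))
    return mean_f - log_mgf


q = [0.5, 0.3, 0.2]               # illustrative probability vectors
p = [0.2, 0.5, 0.3]
f_star = [math.log(qi / pi) for qi, pi in zip(q, p)]   # maximizer log(dQ/dP)
entropy = relative_entropy(q, p)
attained = dv_objective(f_star, q, p)
```

Any other test function, for instance `f = [0, 0, 0]`, gives an objective value no larger than `entropy`.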

We are now ready to state the large deviations principle for the sequence $(P_{\nu_N}^N,N\geq 1)$ .

Theorem 6.1. Let the space $\mathcal{M}_{1,\varphi}(\mathcal{X})$ be equipped with the weak- $^*$ topology $\sigma(\mathcal{M}_{1,\varphi}(\mathcal{X}), C_{\varphi}(\mathcal{X}))$ . Moreover, suppose that the initial condition $\nu_N\rightarrow\nu$ weakly as $N\rightarrow\infty$ . Then the sequence $(P_{\nu_N}^N,N\geq 1)$ satisfies the large deviations principle in the space $(\mathcal{M}_{1,\varphi}(\mathcal{X}))^{2r}$ , endowed with the product topology, with speed N and the good rate function $I(\textbf{Q}) = L(\textbf{Q})-h(\textbf{Q})$ , where the function $h(\textbf{Q})$ is given by (110) and $L\,:\,(\mathcal{M}_{1,\varphi}(\mathcal{X}))^{2r}\rightarrow [0,\infty]$ is defined as

(117) \begin{equation}\begin{split}L(\textbf{Q})\,:\!=\,\alpha_1p_1^c J^{1,c}\big(Q_{1}^c\big)+ \alpha_1p_1^p J^{1,p}\big(Q_{1}^p\big)+\cdots+\alpha_rp_r^c J^{r,c}\big(Q_{r}^c\big)+ \alpha_rp_r^p J^{r,p}\big(Q_{r}^p\big),\end{split}\end{equation}

with, for each $1\leq j\leq r$ , $\iota\in\{c,p\}$ , and $Q\in\mathcal{M}_{1,\varphi}(\mathcal{X})$ ,

(118) \begin{align}J^{j,\iota}(Q)\,:\!=\, \sup_{f\in C_{\varphi}(\mathcal{X})}\Bigg[\int_{\mathcal{X}}fdQ-\sum_{z_0\in\mathcal{Z}}\nu^{\,j,\iota} (z_0)\log\int_{\mathcal{X}}e^fdP_{z_0}\Bigg],\end{align}

and $\alpha_j,p_j^c,p_j^p$ being given in (99) and (18). Furthermore, for each $\textbf{Q}\in(\mathcal{M}_{1,\varphi}(\mathcal{X}))^{2r}$ , the rate function $I(\textbf{Q})$ admits the representation

(119) \begin{align}& I(\textbf{Q})= \nonumber \\& \left\{ \begin{array}{l@{\quad}l}\sum_{j=1}^r\bigg[\alpha_jp_j^cH\bigg(Q_{j}^c\big|R^c\big(\pi\big(Q_{j}^c\big),\pi\big(Q_{j}^p\big)\big)\bigg) & \\\qquad +\alpha_jp_j^pH\bigg(Q_{j}^p\big| R^p\big(\pi\big(Q_{j}^c\big),\pi\big(Q_{1}^p\big),\ldots,\pi\big(Q_{r}^p\big)\big)\bigg)\bigg] & \text{if $\textbf{Q}\circ\pi_0^{-1}=\nu$}, \\+\infty & \text{otherwise}.\end{array}\right.\end{align}
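The structure $I=L-h$ and its rewriting as a relative entropy in (119) can be illustrated on a much simpler toy model, sketched below under stated assumptions. Take N independent Bernoulli(p) variables as the reference law and an exponentially tilted product law whose density with respect to the reference is $\exp\{Nh(m)\}$, where m is the empirical mean and $h(m)=\theta m-\log(1-p+pe^{\theta})$. For this toy model, L is the Bernoulli relative entropy with respect to p, and $L-h$ coincides with the relative entropy with respect to the tilted parameter. The parameter values and function names are hypothetical, and the tilt here is linear in m, unlike the mean-field functional h of (110).

```python
import math


def bernoulli_entropy(m, p):
    """Relative entropy H(Bernoulli(m) | Bernoulli(p)) for 0 < m, p < 1."""
    return m * math.log(m / p) + (1 - m) * math.log((1 - m) / (1 - p))


def tilt(m, p, theta):
    """h(m) = theta*m - log(1 - p + p*e^theta): the log-density per sample
    of the tilted product law with respect to Bernoulli(p)^N."""
    return theta * m - math.log(1 - p + p * math.exp(theta))


def tilted_parameter(p, theta):
    """Success probability of the exponentially tilted law."""
    return p * math.exp(theta) / (1 - p + p * math.exp(theta))


p, theta = 0.3, 0.8               # hypothetical base parameter and tilt
p_tilted = tilted_parameter(p, theta)
gaps = []
for m in (0.1, 0.4, 0.75):
    rate_minus_h = bernoulli_entropy(m, p) - tilt(m, p, theta)   # L - h
    entropy_tilted = bernoulli_entropy(m, p_tilted)              # H(. | tilted)
    gaps.append(abs(rate_minus_h - entropy_tilted))
```

The entries of `gaps` vanish up to floating-point error, which is the toy counterpart of passing from $L(\textbf{Q})-h(\textbf{Q})$ to the entropy representation.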

Remark 6.2. This is a generalization of [Reference Léonard49, Theorem 2.1] to our multi-population setting. Also, while [Reference Léonard49] studied the case where $z_n=z_0$ for some fixed $z_0$ , so that $\nu_N=\delta_{z_0}$ , we consider, as in [Reference Borkar and Sundaresan12, Theorem 3.1], more general initial conditions, provided that the initial empirical vector $\nu_N$ converges weakly towards $\nu=\left(\nu^{1,c},\nu^{1,p},\cdots,\nu^{r,c},\nu^{r,p}\right)$ . Moreover, similarly to [Reference Borkar and Sundaresan12], we consider here the case where not all transitions are allowed, but only those in $\mathcal{E}$ , the set of directed edges in the graph $(\mathcal{Z},\mathcal{E})$ .

Note that, from Definition 5.1, the weak convergence of the initial empirical vector $\nu_N$ towards $\nu$ amounts to the assertion that the initial conditions $\Big(X_{n,j}^{c}(0),X_{m,j}^{p}(0),n\in C_j^c,m\in C_j^p;\,1\leq j\leq r\Big)$ are $\nu^{1,c}\otimes\nu^{1,p}\otimes\cdots\otimes \nu^{r,c}\otimes\nu^{r,p}$ -multi-chaotic (cf. [Reference Sznitman61]).

Proof of Theorem 6.1. The proof of Theorem 6.1 is based on the generalization of Sanov’s theorem for empirical measures on Polish spaces due to Dawson and Gärtner [Reference Dawson and Gärtner24], the Girsanov transformation, and the Laplace–Varadhan principle [Reference Varadhan64]. We proceed through several lemmas. We follow [Reference Léonard49, Theorem 2.1] and [Reference Borkar and Sundaresan12, Theorem 3.1].

Large deviations principle for the non-interacting case. We first establish a large deviations principle in the non-interacting case.

Lemma 6.1. Suppose that the initial condition $\nu_N$ converges towards $\nu$ weakly as $N\rightarrow\infty$ . Let $\mathcal{M}_{1,\varphi}(\mathcal{X})$ be endowed with the weak- $^*$ topology $\sigma(\mathcal{M}_{1,\varphi} (\mathcal{X}), C_{\varphi} (\mathcal{X}))$ . Then the sequence $\big(P_{\nu_N}^{0,N}, N\geq 1\big)$ satisfies a large deviations principle in $(\mathcal{M}_{1,\varphi}(\mathcal{X}))^{2r}$ , endowed with the product topology, with speed N and the action functional $L\,:\,(\mathcal{M}_{1,\varphi}(\mathcal{X}))^{2r}\rightarrow [0,\infty]$ given by (117).

Proof. Fix a given block $1\leq j\leq r$ . Denote by

$$\left(P_{\nu_{N}^{j,c}}^{0,N_j^c}, N_j^c\geq 1\right) \quad \text{and} \quad \left(P_{\nu_{N}^{j,p}}^{0,{N_j^p}}, N_j^p\geq 1\right)$$

the sequences of probability distributions of the local empirical measures $\mathbb{M}_j^{N,c}$ and $\mathbb{M}_j^{N,p}$ of the central and peripheral nodes, respectively, of the jth block. Note that in the non-interacting case, the transition rate from any state to any other state is bounded by 1. Therefore, the family of probability measures $\{P_z\,:\,z\in\mathcal{Z}\}$ is a subset of $\mathcal{M}_{1,\varphi}(\mathcal{X})$ . Moreover, for any continuous function $F\in C_{\varphi}(\mathcal{X})$ , the integral $\int F(y)P_{z_0}(dy)$ depends continuously upon $z_0$ , so that $\{P_{z_0}\,:\,z_0\in\mathcal{Z}\}$ is a Feller continuous family of probability measures on $\mathcal{X}$ . Now, since $\nu^{\,j,c}_N\rightarrow\nu^{\,j,c}$ and $\nu^{\,j,p}_N\rightarrow\nu^{\,j,p}$ as $N\rightarrow\infty$ , by applying the generalization of Sanov’s theorem [Reference Dawson and Gärtner24, Theorem 3.5], we find that both the sequences $\left(P_{\nu_{N}^{j,c}}^{0,N_j^c}, N_j^c\geq 1\right)$ and $\left(P_{\nu_{N}^{j,p}}^{0,{N_j^p}}, N_j^p\geq 1\right)$ satisfy the large deviations principle in $\mathcal{M}_{1,\varphi} (\mathcal{X})$ endowed with the weak- $^*$ topology $\sigma(\mathcal{M}_{1,\varphi} (\mathcal{X}), C_{\varphi} (\mathcal{X}))$ , with speeds $N_j^c$ and $N_j^p$ , respectively, and good rate functions $J^{j,c}(Q)$ and $J^{j,p}(Q)$ defined by (118). Let $\mathcal{K}^c_1,\mathcal{K}^p_1,\ldots,\mathcal{K}_r^c,\mathcal{K}_r^p\in\mathcal{B}(\mathcal{M}_{1,\varphi}(\mathcal{X}))$ be closed Borel sets. By independence, one has

(120) \begin{equation}\begin{split}P_{\nu_{N}}^{0,{N}}\Bigg\{M^N\in \prod_{j=1}^r\big(\mathcal{K}^c_j\times \mathcal{K}^p_j\big)\Bigg\}= \prod_{j=1}^r\left( P_{\nu_N^{j,c}}^{0,{N_j^c}}\left\{M_j^{N,c}\in \mathcal{K}_j^c\right\}\times P_{\nu_N^{j,p}}^{0,{N_j^p}}\left\{M_j^{N,p}\in \mathcal{K}_j^p\right\}\right).\end{split}\end{equation}

Therefore, by Assumption 6.1 we get

(121) \begin{align}\limsup_{N\rightarrow\infty}\frac{1}{N}\log P_{\nu_{N}}^{0,{N}}\Bigg(\prod_{j=1}^r\mathcal{K}^c_j\times \mathcal{K}^p_j\Bigg)&=\limsup_{N\rightarrow\infty}\frac{1}{N}\log\Bigg(\prod_{j=1}^rP_{\nu_N^{j,c}}^{0,{N_j^c}}\big(\mathcal{K}_j^c\big)P_{\nu_N^{j,p}}^{0,{N_j^p}}\big(\mathcal{K}_j^p\big) \Bigg)\nonumber\\ &= \limsup_{N\rightarrow\infty}\sum_{j=1}^r\Bigg(\frac{N_j}{N}\frac{N_j^c}{N_j}\frac{1}{N_j^c}\log P_{\nu_N^{j,c}}^{0,{N_j^c}}\big(\mathcal{K}_j^c\big)\nonumber\\ & \qquad\qquad\qquad\qquad +\frac{N_j}{N}\frac{N_j^p}{N_j}\frac{1}{N_j^p}\log P_{\nu_N^{j,p}}^{0,{N_j^p}}\big(\mathcal{K}_j^p\big)\Bigg)\nonumber\\ &\leq \sum_{j=1}^r\Bigg(\alpha_jp_j^c\limsup_{N_j^c\rightarrow\infty}\frac{1}{N_j^c}\log P_{\nu_N^{j,c}}^{0,{N_j^c}}\big(\mathcal{K}_j^c\big)\nonumber\\ & \qquad\qquad\qquad +\alpha_jp_j^p\limsup_{N_j^p\rightarrow\infty}\frac{1}{N_j^p}\log P_{\nu_N^{j,p}}^{0,{N_j^p}}\big(\mathcal{K}_j^p\big)\Bigg)\nonumber\\ &\leq \sum_{j=1}^r\Bigg({-}\alpha_jp_j^c\inf_{Q_{j}^c\in \mathcal{K}_j^c}J^{j,c}\big(Q_{j}^c\big)-\alpha_jp_j^p\inf_{Q_{j}^p\in \mathcal{K}_j^p}J^{j,p}\big(Q_{j}^p\big)\Bigg)\nonumber\\ &=-\inf_{\substack{Q_{1}^c\in \mathcal{K}_1^c\\ Q_{1}^p\in \mathcal{K}_1^p\\ \vdots\\ Q_{r}^c\in \mathcal{K}_r^c\\ Q_{r}^p\in \mathcal{K}_r^p }}\sum_{j=1}^r\Big( \alpha_jp_j^c J^{j,c}\big(Q_{j}^c\big)+\alpha_jp_j^p J^{j,p}\big(Q_{j}^p\big)\Big).\end{align}

Similar arguments allow us to prove the lower bound for the large deviations principle, which concludes the proof.
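The way the weights $\alpha_jp_j^{\iota}$ arise in (121) can be illustrated numerically on a toy model, sketched below under explicit assumptions. Sanov's theorem on path space is replaced here by the elementary large deviations bound for binomial tails, with two independent groups of fair coin flips playing the role of two node classes. The group sizes, thresholds, and function names are hypothetical. The code computes $\frac{1}{N}\log$ of the product probability exactly and compares it with the weighted sum of the individual rates.

```python
import math


def log_binom_tail(n, k, p):
    """log P(S >= k) for S ~ Binomial(n, p), computed via log-sum-exp."""
    logs = [
        math.lgamma(n + 1) - math.lgamma(j + 1) - math.lgamma(n - j + 1)
        + j * math.log(p) + (n - j) * math.log(1 - p)
        for j in range(k, n + 1)
    ]
    top = max(logs)
    return top + math.log(sum(math.exp(v - top) for v in logs))


def bernoulli_rate(a, p):
    """Relative entropy of Bernoulli(a) with respect to Bernoulli(p)."""
    return a * math.log(a / p) + (1 - a) * math.log((1 - a) / (1 - p))


sizes = [1200, 800]          # hypothetical group sizes, weights 0.6 and 0.4
thresholds = [720, 560]      # events {S_i >= k_i}: group means >= 0.6, 0.7
p = 0.5
N = sum(sizes)

# (1/N) log of the product probability, using independence of the groups.
finite_n = sum(log_binom_tail(n, k, p)
               for n, k in zip(sizes, thresholds)) / N
# Weighted sum of the individual rates, the analogue of the last line of (121).
limit = -sum((n / N) * bernoulli_rate(k / n, p)
             for n, k in zip(sizes, thresholds))
```

In this toy setting `finite_n` lies below `limit` by the Chernoff bound, and the gap is of order $(\log N)/N$.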

The next result gives a characterization of the space containing the probability measures satisfying $L(\textbf{Q})<\infty$ .

Lemma 6.2. If, for a given $\textbf{Q}=\big(Q_{1}^c,Q_{1}^p,\ldots,Q_{r}^c,Q_{r}^p\big)\in(\mathcal{M}_{1}(\mathcal{D}([0,T],\mathcal{Z})))^{2r}$ , the action functional $L(\textbf{Q})<\infty$ , then the following hold:

  1. $\textbf{Q}\in(\mathcal{M}_{1,\varphi}(\mathcal{X}))^{2r}$ .

  2. $ \textbf{Q}\circ\pi_0^{-1}=\nu$ . That is,

    $$\Big(Q_{1}^c\circ\pi_0^{-1},Q_{1}^p\circ\pi_0^{-1},\ldots,Q_{r}^c\circ\pi_0^{-1},Q_{r}^p\circ\pi_0^{-1}\Big)=\Big(\nu^{1,c},\nu^{1,p},\cdots,\nu^{r,c},\nu^{r,p}\Big).$$

Proof. This is a generalization of [Reference Borkar and Sundaresan12, Lemma 5.2]. Recall that the function $\varphi (x)=\sum_{0\leq t\leq T}{\unicode{x1D7D9}}_{x(t{-})\neq x(t)}$ denotes the number of jumps of x in the interval [0,T]. From (112) we have that $\|\varphi\|_{\varphi}\leq 1$ . Moreover, $\varphi$ is continuous in the topology induced by the metric d defined in (111). Hence $\varphi\in C_{\varphi}(\mathcal{X})$ . Furthermore, $L(\textbf{Q})<\infty$ implies that, for all $1\leq j\leq r$ ,

(122) \begin{align}\int_{\mathcal{X}}\varphi dQ_{j}^c-\sum_{z_0\in\mathcal{Z}}\nu^{\,j,c}(z_0)\log\int_{\mathcal{X}}e^{\varphi} dP_{z_0}<\infty\end{align}

and

(123) \begin{align}\int_{\mathcal{X}}\varphi dQ_{j}^p-\sum_{z_0\in\mathcal{Z}}\nu^{\,j,p}(z_0)\log\int_{\mathcal{X}}e^{\varphi} dP_{z_0}<\infty.\end{align}

Now, note that under the non-interacting distribution $P_{z_0}$ , the transition rates are bounded by 1. Since the number of allowed transitions from any state is at most equal to $K-1$ , $\varphi$ is thus stochastically dominated by a Poisson random variable with mean $(K-1)T$ . Therefore, for any initial condition $z_0\in\mathcal{Z}$ , we have $1\leq\int_{\mathcal{X}}e^{\varphi} dP_{z_0}<\infty$ . It follows from (122) and (123) that $\int_{\mathcal{X}}\varphi dQ_{j}^c<\infty$ and $\int_{\mathcal{X}}\varphi dQ_{j}^p<\infty$ for each $1\leq j\leq r$ , and so $\textbf{Q}=(Q_{1}^c,Q_{1}^p,\cdots,Q_{r}^c,Q_{r}^p)\in(\mathcal{M}_{1,\varphi}(\mathcal{X}))^{2r}$ , which proves the first claim. In order to prove the second point, we proceed by contradiction. Suppose that for a given measure $\textbf{Q}$ , $L(\textbf{Q})<\infty$ and $\textbf{Q}\circ\pi_0^{-1}=\nu_{\textbf{Q}}\neq\nu$ . Consider bounded continuous functions $f_1^c(x),f_1^p(x),\ldots,f_r^c(x),f_r^p(x)$ defined on $\mathcal{X}$ and depending on x only through the initial condition; that is, there exist functions $g_1^c,g_1^p,\ldots,g_r^c,g_r^p$ such that, for all $1\leq j\leq r$ ,

\begin{align*}f_j^c(x)=g_j^c(\pi_0(x))\quad\text{and}\quad f_j^p(x)=g_j^p(\pi_0(x)).\end{align*}

Since $\nu_{\textbf{Q}}\neq \nu$ , the functions $g_j^c$ and $g_j^p$ can be chosen so that either

(124) \begin{align}\sum_zg_j^c(z)\nu_{\textbf{Q}}^{j,c}(z)-\sum_zg_j^c(z)\nu^{\,j,c}(z)\neq 0\end{align}

or

(125) \begin{align}\sum_zg_j^p(z)\nu_{\textbf{Q}}^{j,p}(z)-\sum_zg_j^p(z)\nu^{\,j,p}(z)\neq 0\end{align}

for at least one $1\leq j\leq r$ . Therefore, one can always find, for at least one j, an arbitrarily large $a_j^c>0$ (or $a_j^p>0$ ) such that $\sum_zg_j^c(z)\nu_{\textbf{Q}}^{j,c}(z)-\sum_zg_j^c(z)\nu^{\,j,c}(z)=a_j^c$ $\big($ or $\sum_zg_j^p(z)\nu_{\textbf{Q}}^{j,p}(z)-\sum_zg_j^p(z)\nu^{\,j,p}(z)=a_j^p\big)$ . Indeed, this can be done by flipping the sign of $f_j^c$ (or $f_j^p$ ) if necessary and scaling the functions. Note that, by the assumption, $f_j^c,f_j^p\in C_{\varphi}(\mathcal{X})$ since they are bounded and continuous. Suppose without loss of generality that, for a given j, (124) is satisfied; then by direct calculation we obtain

\begin{align*}\int_{\mathcal{X}}f_j^cdQ_{j}^c-\sum_{z_0\in\mathcal{Z}}\nu^{\,j,c} (z_0)\log\int_{\mathcal{X}}e^{f_j^c}dP_{z_0}&=\int_{\mathcal{X}}g_j^c(\pi_0(x))Q_{j}^c(dx)\\& \quad -\sum_{z_0\in\mathcal{Z}}\nu^{\,j,c} (z_0)\log\int_{\mathcal{X}}\exp\big\{g_j^c(\pi_0(x))\big\}dP_{z_0}\\ &=\sum_{z_0}g_j^c(z_0)\nu_{\textbf{Q}}^{j,c}(z_0)-\sum_{z_0}g_j^c(z_0)\nu^{\,j,c}(z_0)=a_j^c.\end{align*}

Hence, since $a_j^c>0$ may be arbitrarily large, one gets that $J^{j,c}\big(Q_{j}^c\big)=\infty$ and then $L(\textbf{Q})=\infty$ , which contradicts the assumption of the lemma and thus proves the second claim.
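The finiteness of $\int_{\mathcal{X}}e^{\varphi}dP_{z_0}$ used for the first claim rests on the Poisson domination: if a random variable is Poisson with mean m, its exponential moment $\mathbb{E}[e^{N}]$ equals $\exp\{m(e-1)\}$, so the integral is at most $\exp\{(K-1)T(e-1)\}$. The sketch below checks this closed form against a truncated series; the values $K=3$ and $T=1$ are hypothetical and chosen only for illustration.

```python
import math


def poisson_exp_moment(mean, terms=200):
    """Truncated series for E[exp(N)] with N ~ Poisson(mean)."""
    total = 0.0
    term = math.exp(-mean)        # k = 0 term: P(N = 0) * e^0
    for k in range(terms):
        total += term
        # Ratio of consecutive terms e^k * P(N = k) is mean * e / (k + 1).
        term *= mean * math.e / (k + 1)
    return total


K, T = 3, 1.0                     # hypothetical number of states and horizon
mean = (K - 1) * T                # mean of the dominating Poisson variable
series_value = poisson_exp_moment(mean)
closed_form = math.exp(mean * (math.e - 1.0))
```

The two values agree up to rounding, and the bound is finite for every finite K and T, which is all the proof requires.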

Conditions for application of the Laplace–Varadhan lemma. Lemma 6.1 establishes the large deviations principle for the sequence $\big(P_{\nu_N}^{0,N},N\geq 1\big)$ in the topological space $(\mathcal{M}_{1,\varphi}(\mathcal{X}))^{2r}$ . Moreover, recall that the Radon–Nikodym derivative is given by

(126) \begin{equation}\begin{split}\frac{dP_{\nu^N}^{N}}{dP_{\nu^N}^{0,N}}(\textbf{Q})=\exp\big\{N h^N(\textbf{Q})\big\},\end{split}\end{equation}

where the function $h^N(\textbf{Q})$ is given by (109). Therefore, to find the large deviations principle for $\big(P_{\nu_N}^{N},N\geq 1\big)$ , one can apply the Laplace–Varadhan principle (cf. [Reference Varadhan64, Theorem 3.4]) to the sequence $\big(P_{\nu_N}^{0,N},N\geq 1\big)$ . The Laplace–Varadhan principle holds under the following conditions:

  1. The sequence of functions $\big\{h^N(\textbf{Q})\big\}$ satisfies

    (127) \begin{align}\lim_{A\rightarrow\infty} \limsup_{N\rightarrow\infty} \frac{1}{N} \log\int_{h^N(\textbf{Q})\geq A}\exp\big\{Nh^N(\textbf{Q})\big\}dP_{\nu_N}^{0,N}=-\infty.\end{align}
  2. For every $\textbf{Q}$ in $(\mathcal{M}_{1,\varphi}(\mathcal{X}))^{2r}$ such that $L(\textbf{Q})<\infty$ and $ \textbf{Q}^N\rightarrow \textbf{Q}$ ,

    (128) \begin{align}\limsup_N h^N\big(\textbf{Q}^N\big)\leq h(\textbf{Q}).\end{align}
  3. For every $\textbf{Q}$ in $(\mathcal{M}_{1,\varphi}(\mathcal{X}))^{2r}_0$ and $ \textbf{Q}^N\rightarrow \textbf{Q}$ ,

    (129) \begin{align}\liminf_N h^N\big(\textbf{Q}^N\big)\geq h(\textbf{Q}),\end{align}
    where $(\mathcal{M}_{1,\varphi}(\mathcal{X}))^{2r}_0$ is the set of points $\textbf{Q}^*$ in $(\mathcal{M}_{1,\varphi}(\mathcal{X}))^{2r}$ for which, given any $\varepsilon > 0$ , there exists a neighborhood V of $\textbf{Q}^*$ such that, for any $\textbf{Q}\in V$ and N large enough,
    (130) \begin{align}h^N(\textbf{Q})\geq h(\textbf{Q}^*)-\varepsilon.\end{align}
  4. We have

    (131) \begin{align}\sup_{\textbf{Q}\in \big(\mathcal{M}_{1,\varphi}(\mathcal{X})\big)^{2r}}[h(\textbf{Q})-L(\textbf{Q})]=\sup_{\textbf{Q}\in \big(\mathcal{M}_{1,\varphi}(\mathcal{X})\big)^{2r}_0}[h(\textbf{Q})-L(\textbf{Q})].\end{align}

Note that, on the one hand, the condition in (127) is satisfied if, for any $\alpha>0$ ,

(132) \begin{align}\limsup_{N\rightarrow\infty}\frac{1}{N}\log\int_{\big(\mathcal{M}_{1,\varphi}(\mathcal{X})\big)^{2r}}\exp\big\{ N\alpha \big|h^N\big| \big\} dP_{\nu_N}^{0,N}<\infty.\end{align}

Indeed, take $\alpha>1$ ; then

\begin{align*} \frac{1}{N} \log\int_{h^N(\textbf{Q})\geq A}\exp\big\{Nh^N(\textbf{Q})\big\}dP_{\nu_N}^{0,N}&=A+ \frac{1}{N} \log\int_{h^N(\textbf{Q})\geq A}\exp\big\{N\big(h^N(\textbf{Q})-A\big)\big\}dP_{\nu_N}^{0,N}\\ &\leq A+ \frac{1}{N} \log\int_{h^N(\textbf{Q})\geq A} \exp\big\{\alpha N\big(h^N(\textbf{Q})-A\big)\big\}dP_{\nu_N}^{0,N}\\ &= (1-\alpha )A+ \frac{1}{N} \log \int_{h^N(\textbf{Q})\geq A} \exp\big\{\alpha Nh^N(\textbf{Q})\big\}dP_{\nu_N}^{0,N}\\ &\leq (1-\alpha )A+ \frac{1}{N} \log \int_{\big(\mathcal{M}_{1,\varphi}(\mathcal{X})\big)^{2r}} \exp\big\{\alpha N\big|h^N(\textbf{Q})\big|\big\}dP_{\nu_N}^{0,N}.\end{align*}

Therefore,

\begin{align*}& \limsup_{N\rightarrow\infty} \frac{1}{N} \log\int_{h^N(\textbf{Q})\geq A}\exp\big\{Nh^N(\textbf{Q})\big\}dP_{\nu_N}^{0,N}\leq\limsup_{N\rightarrow\infty}\bigg( (1-\alpha )A\\& \qquad + \frac{1}{N} \log \int_{\big(\mathcal{M}_{1,\varphi}(\mathcal{X})\big)^{2r}} \exp\big\{\alpha N\big|h^N(\textbf{Q})\big|\big\}dP_{\nu_N}^{0,N}\bigg),\end{align*}

where the right-hand side of the last inequality goes to $-\infty$ as $A\rightarrow\infty$ if (132) is true. On the other hand, it is easy to see that the conditions (128), (129), and (131) hold if the functions $h^N$ defined in (109) are continuous and the sequence $\big\{h^N\big\}$ converges uniformly on $(\mathcal{M}_{1,\varphi}(\mathcal{X}))^{2r}$ towards the function h given in (110). The next lemmas are thus dedicated to the verification of these conditions. First, we establish a regularity property for all the probability measures $\textbf{Q}$ satisfying $L(\textbf{Q})<\infty$ .

Lemma 6.3. Let $\textbf{Q}=\big(Q_{1}^c,Q_{1}^p,\cdots,Q_{r}^c,Q_{r}^p \big)\in(\mathcal{M}_{1,\varphi}(\mathcal{X}))^{2r}$ be such that $L(\textbf{Q})<\infty$ . Moreover, suppose that the random vector $\textbf{X}=\big(X_1^c,X_1^p,\cdots,X_r^c,X_r^p \big)$ is distributed according to $\textbf{Q}$ . Then

(133) \begin{align}\sup_{t\in [0,T]}\mathbb{E}\bigg[\sup_{u\in \left[t-\alpha ,t+\alpha\right]\cap [0,T]}\big\{{\unicode{x1D7D9}}_{\textbf{X}(u)\neq\textbf{X}(u{-})}\big\}\bigg]\rightarrow 0\quad\text{as $\alpha\downarrow 0$}.\end{align}

Proof. This is a generalization of [Reference Borkar and Sundaresan12, Lemma 5.7]. Note that $\textbf{X}(u)\neq\textbf{X}(u{-})$ if $X_j^c(u)\neq X_j^c(u{-})$ or $X_j^p(u)\neq X_j^p(u{-})$ for at least one $1\leq j\leq r$ . Therefore, for each $t\in [0,T]$ one obtains

(134) \begin{equation}\begin{split}\mathbb{E}&\bigg[\sup_{u\in \left[t-\alpha ,t+\alpha\right]\cap [0,T]}\big\{{\unicode{x1D7D9}}_{\textbf{X}(u)\neq\textbf{X}(u{-})}\big\}\bigg]\leq \mathbb{E}\bigg[\sup_{u\in \left[t-\alpha ,t+\alpha\right]\cap [0,T]}\Big\{{\unicode{x1D7D9}}_{X_1^c(u)\neq X_1^c(u{-})}\Big\}\bigg]\\& + \mathbb{E}\bigg[\sup_{u\in \left[t-\alpha ,t+\alpha\right]\cap [0,T]}\Big\{{\unicode{x1D7D9}}_{X_1^p(u)\neq X_1^p(u{-})}\Big\}\bigg] \\& +\cdots+ \mathbb{E}\bigg[\sup_{u\in \left[t-\alpha ,t+\alpha\right]\cap [0,T]}\big\{{\unicode{x1D7D9}}_{X_r^c(u)\neq X_r^c(u{-})}\big\}\bigg]+ \mathbb{E}\bigg[\sup_{u\in \left[t-\alpha ,t+\alpha\right]\cap [0,T]}\Big\{{\unicode{x1D7D9}}_{X_r^p(u)\neq X_r^p(u{-})}\Big\}\bigg].\end{split}\end{equation}

Moreover, since $L(\textbf{Q})<\infty$ , one gets that $J^{j,c}\big(Q_{j}^c\big)<\infty$ and $J^{j,p}\big(Q_{j}^p\big)<\infty$ for all $1\leq j\leq r$ . Hence, applying [Reference Borkar and Sundaresan12, Lemma 5.7] to each of the $X_j^c$ and $X_j^p$ with respective marginal distributions $Q_{j}^c$ and $Q_{j}^p$ gives us that, for each $1\leq j\leq r$ ,

(135) \begin{align}\sup_{t\in [0,T]}\mathbb{E}\bigg[\sup_{u\in \left[t-\alpha ,t+\alpha\right]\cap [0,T]}\Big\{{\unicode{x1D7D9}}_{X_j^c(u)\neq X_j^c(u{-})}\Big\}\bigg]\rightarrow 0\quad\text{as $\alpha\downarrow 0$}\end{align}

and

(136) \begin{align}\sup_{t\in [0,T]}\mathbb{E}\bigg[\sup_{u\in \left[t-\alpha ,t+\alpha\right]\cap [0,T]}\Big\{{\unicode{x1D7D9}}_{X_j^p(u)\neq X_j^p(u{-})}\Big\}\bigg]\rightarrow 0\quad\text{as $\alpha\downarrow 0$}.\end{align}

Combining (134), (135), and (136) leads to (133).

The next lemma establishes the continuity of the projection $\pi$ , which is needed for the continuity of the function $h(\textbf{Q})$ .

Lemma 6.4. Let $\mathcal{M}_1(\mathcal{D}([0,T],\mathcal{Z}))$ be equipped with its usual weak topology and let $\mathcal{D}([0,T],\mathcal{M}_1(\mathcal{Z}))$ be equipped with the metric

(137) \begin{align}\rho_T (\mu,\nu)=\sup_{0\leq t\leq T}\rho_0 (\mu_t,\nu_t),\quad\mu,\nu\in\mathcal{D}([0,T],\mathcal{M}_1(\mathcal{Z})),\end{align}

where $\rho_0(\cdot,\cdot)$ is a metric on $\mathcal{M}_1(\mathcal{Z})$ which generates the weak topology $\sigma(\mathcal{M}_1(\mathcal{Z}),C_b(\mathcal{Z}))$ . Moreover, let $(\mathcal{M}_1(\mathcal{D}([0,T],\mathcal{Z})))^{2r}$ be endowed with the product topology induced by the product metric. Similarly, let $(\mathcal{D}([0,T],\mathcal{M}_1(\mathcal{Z})))^{2r}$ be equipped with the product topology obtained from the product metric $\rho^{2r}_T=\max\{\rho_{T},\ldots,\rho_{T}\}$ . Then the projection

\begin{align*}\pi\,:\, \textbf{Q}\in\left(\mathcal{M}_1(\mathcal{D}([0,T],\mathcal{Z}))\right)^{2r}&\rightarrow\pi(\textbf{Q})=(\textbf{Q}_t)_{0\leq t\leq T}\in\left(\mathcal{D}([0,T],\mathcal{M}_1(\mathcal{Z}))\right)^{2r}\end{align*}

is continuous at each $\textbf{Q}\in\left(\mathcal{M}_1(\mathcal{D}([0,T],\mathcal{Z}))\right)^{2r}$ where $L(\textbf{Q})<\infty$ .

Proof. The statement of our lemma resembles the statement of [Reference Léonard49, Lemma 2.8]. The difference here is that our spaces of interest are the product spaces $(\mathcal{M}_1(\mathcal{D}([0,T],\mathcal{Z})))^{2r}$ and $(\mathcal{D}([0,T],\mathcal{M}_1(\mathcal{Z})))^{2r}$ endowed with the product metrics. Moreover, the rate function $J(Q)$ in [Reference Léonard49, Lemma 2.8] is here replaced by $L(\textbf{Q})$ . Therefore, if we replace the norm $|\cdot|$ by the product norm $\|\cdot\|$ adapted to the context of our product spaces, then the proof of our lemma follows verbatim the proof of [Reference Léonard49, Lemma 2.8], provided that we can prove [Reference Léonard49, Equation (2.14)]. This is done in Lemma 6.3. Thus, the proof is complete.

We next prove the continuity of the functions $h^N$ .

Lemma 6.5. The functions $h^N\,:\,(\mathcal{M}_{1,\varphi}(\mathcal{X}))^{2r}\rightarrow\mathbb{R}$ defined at (109) are continuous at any $\textbf{Q}$ such that $L(\textbf{Q})<\infty$ .

Proof. This is a generalization of [Reference Léonard49, Lemma 2.9]. For any $\textbf{Q}\in(\mathcal{M}_{1,\varphi}(\mathcal{X}))^{2r}$ , define

(138) \begin{align}\theta_{\textbf{Q}}^{j,c}(x)&\,:\!=\,\sum_{0\leq t\leq T}{\unicode{x1D7D9}}_{x_t\neq x_{t-}}\log\Bigg(\sum_{(x_{t-},x_t)\in\mathcal{E}}\lambda_{j,x_{t-},x_t}^{c}\left(Q^c_j(t{-}),Q^p_j(t{-})\right)\Bigg),\qquad\qquad\end{align}
(139) \begin{align}\theta_{\textbf{Q}}^{j,p}(x)&\,:\!=\,\sum_{0\leq t\leq T}{\unicode{x1D7D9}}_{x_t\neq x_{t-}}\log\Bigg(\sum_{(x_{t-},x_t)\in\mathcal{E}}\lambda_{j,x_{t-},x_t}^{p}(Q^c_j(t{-}),Q^p_1(t{-}),\ldots,Q^p_r(t{-}) )\Bigg),\end{align}
(140) \begin{align}\gamma_{\textbf{Q}}^{j,c}(x)&\,:\!=\,\int_0^T\Bigg(\sum_{z\,:\,(x_t,z)\in\mathcal{E}}\lambda^{c}_{j,x_t,z}(Q_{j}^c (t),Q_{j}^p (t))-1\Bigg)dt,\qquad\qquad\qquad\qquad\qquad\end{align}
(141) \begin{align}\gamma_{\textbf{Q}}^{j,p}(x)&\,:\!=\,\int_0^T\Bigg(\sum_{z\,:\,(x_t,z)\in\mathcal{E}}\lambda^{p}_{j,x_t,z}(Q_{j}^c (t),Q_{1}^p (t),\ldots,Q_{r}^p (t))-1\Bigg)dt.\qquad\qquad\qquad\qquad\end{align}

Note that the function $h^N$ given by (109) can be rewritten using the functions $\theta_{\textbf{Q}}^{j,c}(x)$ , $\theta_{\textbf{Q}}^{j,p}(x)$ , $\gamma_{\textbf{Q}}^{j,c}(x)$ , and $\gamma_{\textbf{Q}}^{j,p}(x)$ as follows:

(142) \begin{equation} h^N(\textbf{Q}) =\sum_{j=1}^r\Bigg[\frac{N_j^c}{N}\int_{\mathcal{X}}\bigg(\theta_{\textbf{Q}}^{j,c}(x)-\gamma_{\textbf{Q}}^{j,c}(x)\bigg)Q_{j}^c(dx) + \frac{N_j^p}{N}\int_{\mathcal{X}}\bigg(\theta_{\textbf{Q}}^{j,p}(x)-\gamma_{\textbf{Q}}^{j,p}(x)\bigg)Q_{j}^p(dx)\Bigg].\end{equation}

Therefore, to show the continuity of $h^N(\textbf{Q})$ , we show that for any $1\leq j\leq r$ , the functions

\begin{align*}\textbf{Q}\rightarrow\int_{\mathcal{X}}\theta_{\textbf{Q}}^{j,c}(x)Q_{j}^c(dx),&\qquad \textbf{Q}\rightarrow\int_{\mathcal{X}}\theta_{\textbf{Q}}^{j,p}(x)Q_{j}^p(dx),\\\textbf{Q}\rightarrow\int_{\mathcal{X}}\gamma_{\textbf{Q}}^{j,c}(x)Q_{j}^c(dx),&\qquad \textbf{Q}\rightarrow\int_{\mathcal{X}}\gamma_{\textbf{Q}}^{j,p}(x)Q_{j}^p(dx)\\\end{align*}

are continuous at any $\textbf{Q}$ where $L(\textbf{Q})<\infty$ . First, from Assumption 6.1, there exists a positive constant $C>0$ such that, for each $1\leq j\leq r$ ,

(143) \begin{equation}\begin{split}\Big|\theta_{\textbf{Q}}^{j,c}(x)\Big|&\leq\sup_{\xi,\zeta}\Bigg(\Bigg|\log\Bigg(\sum_{(x_{t-},x_t)\in\mathcal{E}}\lambda_{j,x_{t-},x_t}^{c}\left(\xi,\zeta\right)\Bigg)\Bigg|\Bigg)\varphi (x)\\ &\leq C(1+\varphi (x)) \qquad\forall x\in\mathcal{X}\end{split}\end{equation}

and

(144) \begin{equation}\begin{split}\Big|\theta_{\textbf{Q}}^{j,p}(x)\Big|&\leq\sup_{\xi,\zeta_1,\ldots,\zeta_r}\Bigg(\Bigg|\log\Bigg(\sum_{(x_{t-},x_t)\in\mathcal{E}}\lambda_{j,x_{t-},x_t}^{p}\left(\xi,\zeta_1,\ldots,\zeta_r\right)\Bigg)\Bigg|\Bigg)\varphi (x)\\ &\leq C(1+\varphi (x)) \qquad\forall x\in\mathcal{X}.\end{split}\end{equation}

Similarly, by Assumption 6.1 we have that, for each $1\leq j\leq r$ ,

(145) \begin{equation}\Big|\gamma_{\textbf{Q}}^{j,c}(x)\Big|\leq\sup_{\xi,\zeta}\Bigg|\int_0^T\Bigg(\sum_{z\,:\,(x_t,z)\in\mathcal{E}}\lambda^{c}_{j,x_t,z}(\xi,\zeta)-1\Bigg)dt\Bigg|<\infty\qquad\forall x\in\mathcal{X}\end{equation}

and

(146) \begin{equation}\Big|\gamma_{\textbf{Q}}^{j,p}(x)\Big|\leq\sup_{\xi,\zeta_1,\ldots,\zeta_r}\Bigg|\int_0^T\Bigg(\sum_{z\,:\,(x_t,z)\in\mathcal{E}}\lambda^{p}_{j,x_t,z}(\xi,\zeta_1,\ldots,\zeta_r)-1\Bigg)dt\Bigg|<\infty\qquad\forall x\in\mathcal{X}.\end{equation}

Take $\textbf{Q}'\in(\mathcal{M}_{1,\varphi}(\mathcal{X}))^{2r}$ in the neighborhood of $\textbf{Q}$ . Note that

(147) \begin{equation}\begin{split}\big|h^N(\textbf{Q})-h^N(\textbf{Q}')\big|\leq\sum_{j=1}^r\Bigg[&\frac{N_j^c}{N}\bigg(\Big|\langle \theta_{\textbf{Q}}^{j,c}, Q_{j}^c \rangle-\langle \theta_{\textbf{Q}'}^{j,c}, Q_{j}'^c \rangle\Big|+ \Big|\langle \gamma_{\textbf{Q}}^{j,c}, Q_{j}^c \rangle-\langle \gamma_{\textbf{Q}'}^{j,c}, Q_{j}'^c \rangle\Big|\bigg)\\ &+\frac{N_j^p}{N}\bigg(\Big|\langle \theta_{\textbf{Q}}^{j,p}, Q_{j}^p \rangle-\langle \theta_{\textbf{Q}'}^{j,p}, Q_{j}'^p \rangle\Big|+ \Big|\langle \gamma_{\textbf{Q}}^{j,p}, Q_{j}^p \rangle-\langle \gamma_{\textbf{Q}'}^{j,p}, Q_{j}'^p \rangle\Big|\bigg)\Bigg].\end{split}\end{equation}

In addition, for each $1\leq j\leq r$ , the following inequalities hold:

(148) \begin{align}\Big|\langle \theta_{\textbf{Q}}^{j,c}, Q_{j}^c \rangle-\langle \theta_{\textbf{Q}'}^{j,c}, Q_{j}'^c \rangle\Big|&\leq \Big|\langle \theta_{\textbf{Q}}^{j,c}, Q_{j}^c-Q_{j}'^c \rangle\Big| +\Big|\langle \theta_{\textbf{Q}}^{j,c}-\theta_{\textbf{Q}'}^{j,c}, Q_{j}'^c \rangle\Big|,\end{align}
(149) \begin{align}\Big|\langle \theta_{\textbf{Q}}^{j,p}, Q_{j}^p \rangle-\langle \theta_{\textbf{Q}'}^{j,p}, Q_{j}'^p \rangle\Big|&\leq \Big|\langle \theta_{\textbf{Q}}^{j,p}, Q_{j}^p-Q_{j}'^p \rangle\Big| +\Big|\langle \theta_{\textbf{Q}}^{j,p}-\theta_{\textbf{Q}'}^{j,p}, Q_{j}'^p \rangle\Big|,\end{align}
(150) \begin{align}\Big|\langle \gamma_{\textbf{Q}}^{j,c}, Q_{j}^c \rangle-\langle \gamma_{\textbf{Q}'}^{j,c}, Q_{j}'^c \rangle\Big|&\leq \Big|\langle \gamma_{\textbf{Q}}^{j,c}, Q_{j}^c-Q_{j}'^c \rangle\Big| +\Big|\langle \gamma_{\textbf{Q}}^{j,c}-\gamma_{\textbf{Q}'}^{j,c}, Q_{j}'^c \rangle\Big|,\end{align}
(151) \begin{align}\Big|\langle \gamma_{\textbf{Q}}^{j,p}, Q_{j}^p \rangle-\langle \gamma_{\textbf{Q}'}^{j,p}, Q_{j}'^p \rangle\Big|&\leq \Big|\langle \gamma_{\textbf{Q}}^{j,p}, Q_{j}^p-Q_{j}'^p \rangle\Big| +\Big|\langle \gamma_{\textbf{Q}}^{j,p}-\gamma_{\textbf{Q}'}^{j,p}, Q_{j}'^p \rangle\Big|.\end{align}

The idea now is to control the right-hand sides of the last four inequalities. We show this for the inequality in (148). Similar arguments can be used to treat the other three inequalities. First, notice that the function $ \theta_{\textbf{Q}}^{j,c}$ is continuous. Indeed, the topology of $\mathcal{X}$ is built so that the function $x\rightarrow\sum_{0\leq t\leq T}{\unicode{x1D7D9}}_{x_t\neq x_{t-}}$ is continuous. Moreover, from Assumption 6.1, the functions $\lambda_{j,z,z'}^{c}$ are continuous. Furthermore, from Lemma 6.4, the component projection $Q_j^c\rightarrow\pi(Q_j^c)=(Q_{j}^c(t))_{0\leq t\leq T}$ is continuous since $\pi(\textbf{Q})=(\textbf{Q}(t))_{0\leq t\leq T}$ is continuous. Finally, the $\log$ function being continuous gives that $ \theta_{\textbf{Q}}^{j,c}$ is continuous provided that $L(\textbf{Q})<\infty$ . In addition, from (143) we have that $\big|\theta_{\textbf{Q}}^{j,c}\big|\leq C(1+\varphi)$ ; thus $\theta_{\textbf{Q}}^{j,c}\in C_{\varphi}(\mathcal{X})$ provided that $L(\textbf{Q})<\infty$ . Therefore, the term $ \Big|\langle \theta_{\textbf{Q}}^{j,c}, Q_{j}^c-Q_{j}'^c \rangle\Big|$ can be made as small as desired by taking $\textbf{Q}'$ close enough to $\textbf{Q}$ (and thus $Q_{j}'^c$ close enough to $Q_{j}^c$ ). The second term in the right-hand side of (148) is bounded as follows:

(152) \begin{equation}\begin{split}\Big|\langle \theta_{\textbf{Q}}^{j,c}-\theta_{\textbf{Q}'}^{j,c}, Q_{j}'^c \rangle\Big|\leq \sup_{t}&\Bigg|\log\Bigg(\sum_{(x_{t-},x_{t})\in\mathcal{E}}\lambda_{j,x_{t-},x_{t}}^c\left(Q^c_j(t{-}),Q^p_j(t{-})\right)\Bigg)\\ &-\log\Bigg(\sum_{(x_{t-},x_t)\in\mathcal{E}}\lambda_{j,x_{t-},x_{t}}^c\left(Q'^c_j(t{-}),Q'^p_j(t{-})\right)\Bigg)\Bigg|\int_{\mathcal{X}}\varphi dQ'^c_j.\end{split}\end{equation}

Therefore, using again Assumption 6.1, Lemma 6.4, and the continuity of the $\log$ function, the right-hand side of (152) is controlled for any $\textbf{Q}'$ in the neighborhood of $\textbf{Q}$ in $(\mathcal{M}_{1,\varphi}(\mathcal{X}))^{2r}$ provided that $L(\textbf{Q})<\infty$ . Thus, the integral $\textbf{Q}\rightarrow\int\theta_{\textbf{Q}}^{j,c} dQ_{j}^c$ is continuous. The same steps allow us to show that

\begin{align*} \textbf{Q}\rightarrow\int_{\mathcal{X}}\theta_{\textbf{Q}}^{j,p}(x)Q_{j}^p(dx),\qquad \textbf{Q}\rightarrow\int_{\mathcal{X}}\gamma_{\textbf{Q}}^{j,c}(x)Q_{j}^c(dx),\qquad\text{and}\quad \textbf{Q}\rightarrow\int_{\mathcal{X}}\gamma_{\textbf{Q}}^{j,p}(x)Q_{j}^p(dx)\end{align*}

are also continuous at any $\textbf{Q}$ where $L(\textbf{Q})<\infty$ . Hence, the function $h^N$ is a linear combination of continuous functions and thus is continuous, which concludes the proof.

The following lemma states the uniform convergence of $\big\{h^N,N\geq 1\big\}$ towards h.

Lemma 6.6. Suppose Assumption 6.1 holds. Then the sequence of functions $\big\{h^N,N\geq 1\big\}$ introduced in (109) converges uniformly towards the function h given by (110).

Proof. From (110) and (142) one has

(153) \begin{align} \sup_{\textbf{Q}\in\big(\mathcal{M}_{1,\varphi}(\mathcal{X})\big)^{2r}}\big|h^N(\textbf{Q})-h(\textbf{Q})\big|&= \sup_{\textbf{Q}\in\big(\mathcal{M}_{1,\varphi}(\mathcal{X})\big)^{2r}} \Bigg|\sum_{j=1}^r\bigg[\bigg(\frac{N_j^c}{N}-\alpha_jp_j^c\bigg)\!\int_{\mathcal{X}}\!\bigg(\theta_{\textbf{Q}}^{j,c}(x)-\gamma_{\textbf{Q}}^{j,c}(x)\bigg)Q_{j}^c(dx)\nonumber\\ &\qquad\qquad+ \bigg(\frac{N_j^p}{N}-\alpha_jp_j^p\bigg)\int_{\mathcal{X}}\bigg(\theta_{\textbf{Q}}^{j,p}(x)-\gamma_{\textbf{Q}}^{j,p}(x)\bigg)Q_{j}^p(dx)\bigg]\Bigg|\nonumber \\ &\leq \sum_{j=1}^r\Bigg(\bigg|\frac{N_j^c}{N}-\alpha_jp_j^c\bigg|\sup_{\textbf{Q}\in(\mathcal{M}_{1,\varphi}(\mathcal{X}))^{2r}}\bigg|\int_{\mathcal{X}}\bigg(\theta_{\textbf{Q}}^{j,c}(x)-\gamma_{\textbf{Q}}^{j,c}(x)\bigg)Q_{j}^c(dx)\bigg|\qquad\qquad\nonumber\\ &+ \bigg|\frac{N_j^p}{N}-\alpha_jp_j^p\bigg|\sup_{\textbf{Q}\in(\mathcal{M}_{1,\varphi}(\mathcal{X}))^{2r}}\bigg|\int_{\mathcal{X}}\bigg(\theta_{\textbf{Q}}^{j,p}(x)-\gamma_{\textbf{Q}}^{j,p}(x)\bigg)Q_{j}^p(dx)\bigg|\Bigg).\end{align}

First, observe that

\begin{align*}\sup_{\textbf{Q}\in\big(\mathcal{M}_{1,\varphi}(\mathcal{X})\big)^{2r}}\bigg|\int_{\mathcal{X}}\bigg(\theta_{\textbf{Q}}^{j,c}(x)-\gamma_{\textbf{Q}}^{j,c}(x)\bigg)Q_{j}^c(dx)\bigg|&\leq \sup_{\textbf{Q}\in\big(\mathcal{M}_{1,\varphi}(\mathcal{X})\big)^{2r}}\bigg|\int_{\mathcal{X}}\theta_{\textbf{Q}}^{j,c}(x)Q_{j}^c(dx)\bigg|\\&\qquad\qquad+\sup_{\textbf{Q}\in\big(\mathcal{M}_{1,\varphi}(\mathcal{X})\big)^{2r}}\bigg|\int_{\mathcal{X}}\gamma_{\textbf{Q}}^{j,c}(x)Q_{j}^c(dx)\bigg|.\end{align*}

Using (143) one obtains

\begin{align*}\sup_{\textbf{Q}\in\big(\mathcal{M}_{1,\varphi}(\mathcal{X})\big)^{2r}}\bigg|\int_{\mathcal{X}}\theta_{\textbf{Q}}^{j,c}(x)Q_{j}^c(dx)\bigg|&\leq \sup_{\textbf{Q}\in\big(\mathcal{M}_{1,\varphi}(\mathcal{X})\big)^{2r}}\int_{\mathcal{X}}C(1+\varphi(x))Q_j^c(dx)\\&\leq \sup_{\textbf{Q}\in\big(\mathcal{M}_{1,\varphi}(\mathcal{X})\big)^{2r}}C\bigg(\int_{\mathcal{X}}Q_j^c(dx)+\int_{\mathcal{X}}\varphi(x)Q_j^c(dx)\bigg),\end{align*}

which is $<\infty$ since $Q_j^c$ is a probability measure and the second integral is finite for $Q_j^c\in\mathcal{M}_{1,\varphi}(\mathcal{X})$ . Moreover, by (145) one has

\begin{align*}\sup_{\textbf{Q}\in\big(\mathcal{M}_{1,\varphi}(\mathcal{X})\big)^{2r}}\bigg|\int_{\mathcal{X}}\gamma_{\textbf{Q}}^{j,c}(x)Q_{j}^c(dx)\bigg|<\infty,\end{align*}

again since $Q_j^c$ is a probability measure. Therefore, by Assumption 6.1, one deduces that

\begin{align*}\Bigg(\frac{N_j^c}{N}-\alpha_jp_j^c\Bigg)\sup_{\textbf{Q}\in\big(\mathcal{M}_{1,\varphi}(\mathcal{X})\big)^{2r}}\bigg|\int_{\mathcal{X}}\bigg(\theta_{\textbf{Q}}^{j,c}(x)-\gamma_{\textbf{Q}}^{j,c}(x)\bigg)Q_{j}^c(dx)\bigg|\overset{N\rightarrow\infty}{\longrightarrow} 0.\end{align*}

Similarly, one obtains

\begin{align*}\Bigg(\frac{N_j^p}{N}-\alpha_jp_j^p\Bigg)\sup_{\textbf{Q}\in\big(\mathcal{M}_{1,\varphi}(\mathcal{X})\big)^{2r}}\bigg|\int_{\mathcal{X}}\bigg(\theta_{\textbf{Q}}^{j,p}(x)-\gamma_{\textbf{Q}}^{j,p}(x)\bigg)Q_{j}^p(dx)\bigg|\overset{N\rightarrow\infty}{\longrightarrow} 0.\end{align*}

Thus $h^N$ converges uniformly towards h.

The final step before applying the Laplace–Varadhan principle is to verify that (132) is satisfied.

Lemma 6.7. For any $\alpha>0$ ,

\begin{align*}\limsup_{N\rightarrow\infty}\frac{1}{N}\log\int_{\big(\mathcal{M}_{1,\varphi}(\mathcal{X})\big)^{2r}}\exp\big\{ N\alpha \big|h^N\big| \big\} dP_{\nu_N}^{0,N}<\infty.\end{align*}

Proof. First note that, using the bounds (143), (144), (145), and (146), we find that for all $\textbf{Q}\in(\mathcal{M}_{1,\varphi}(\mathcal{X}))^{2r}$ ,

(154) \begin{equation} \big|h^N(\textbf{Q})\big| \leq\sum_{j=1}^r\Bigg[\frac{N_j^c}{N}C\left(1+\int_{\mathcal{X}}\varphi (x) Q_{j}^c(dx)\right) + \frac{N_j^p}{N}C\bigg(1+\int_{\mathcal{X}}\varphi (x)Q_{j}^p(dx)\bigg)\Bigg].\end{equation}

Therefore, to show (132), it is enough to show that, for any $\alpha>0$ ,

(155) \begin{align} & \limsup_{N\rightarrow\infty}\frac{1}{N}\log\int_{\big(\mathcal{M}_{1,\varphi}(\mathcal{X})\big)^{2r}}\exp\Bigg\{ N\alpha \sum_{j=1}^r\bigg[\frac{N_j^c}{N}\int_{\mathcal{X}}\varphi (x) Q_{j}^c(dx)\nonumber\\& \quad +\frac{N_j^p}{N}\int_{\mathcal{X}}\varphi (x)Q_{j}^p(dx)\bigg] \Bigg\} dP_{\nu_N}^{0,N}(\textbf{Q})<\infty. \end{align}

Recall that $P_{\nu^N}^{0, N}=\mathbb{P}_{z^N}^{0, N}\circ G_N^{-1}$, where $\mathbb{P}_{z^N}^{0, N}=\otimes_{n=1}^NP_{z_n}$ and $P_{z_n}$ is the law of the nth particle in the case of non-interaction, with the initial condition being $z_n$. Hence, by independence, the integral term in the left-hand side of (155) is equal to

(156) \begin{align} & \prod_{j=1}^r \Bigg(\int_{\mathcal{M}_{1,\varphi}(\mathcal{X})}\exp\bigg\{ N_j^c\alpha\int_{\mathcal{X}}\varphi (x) Q_{j}^c(dx)\bigg\}dP_{\nu_N^{j,c}}^{0,N_j^c}\big(Q_{j}^c\big)\int_{\mathcal{M}_{1,\varphi}(\mathcal{X})}\nonumber\\& \qquad\exp\bigg\{N_j^p\alpha\int_{\mathcal{X}}\varphi (x)Q_{j}^p(dx)\bigg\} dP_{\nu_N^{j,p}}^{0,N_j^p}\big(Q_{j}^p\big)\Bigg).\end{align}

Now, using [Reference Léonard49, Lemma 2.10], we find that for all $1\leq j\leq r$ ,

(157) \begin{align}\limsup_{N_j^c\rightarrow\infty}\frac{1}{N_j^c}\log \int_{\mathcal{M}_{1,\varphi}(\mathcal{X})}\exp\bigg\{ N_j^c\alpha \int_{\mathcal{X}}\varphi (x) Q_{j}^c(dx)\bigg\}dP_{\nu_N^{j,c}}^{0,N_j^c}\big(Q_{j}^c\big)<\infty\end{align}

and

(158) \begin{align}\limsup_{N_j^p\rightarrow\infty}\frac{1}{N_j^p}\log \int_{\mathcal{M}_{1,\varphi}(\mathcal{X})}\exp\bigg\{ N_j^p\alpha \int_{\mathcal{X}}\varphi (x) Q_{j}^p(dx)\bigg\}dP_{\nu_N^{j,p}}^{0,N_j^p}\big(Q_{j}^p\big)<\infty.\end{align}

Since $N_j^c<N$ and $N_j^p<N$ for all $1\leq j\leq r$ , (157), (158), and (156) lead to (155), which concludes the proof.
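
The passage from (156)–(158) to (155) can be spelled out as follows. This is only a sketch: it assumes that $\varphi\geq 0$ and that $N_j^c$ and $N_j^p$ tend to infinity with N, and the symbols $a_N^{j,c}$ and $a_N^{j,p}$ are notation introduced for this remark only, namely

\begin{align*}a_N^{j,\iota}\,:\!=\,\frac{1}{N_j^{\iota}}\log\int_{\mathcal{M}_{1,\varphi}(\mathcal{X})}\exp\bigg\{N_j^{\iota}\alpha\int_{\mathcal{X}}\varphi(x)Q(dx)\bigg\}dP^{0,N_j^{\iota}}_{\nu_N^{j,\iota}}(Q),\qquad \iota\in\{c,p\}.\end{align*}

Taking the logarithm of the product in (156) and dividing by N gives

\begin{align*}\sum_{j=1}^r\bigg[\frac{N_j^c}{N}a_N^{j,c}+\frac{N_j^p}{N}a_N^{j,p}\bigg]\leq\sum_{j=1}^r\Big[a_N^{j,c}+a_N^{j,p}\Big],\end{align*}

where the inequality uses $a_N^{j,\iota}\geq 0$ (each integrand is at least 1 when $\varphi\geq 0$) together with $N_j^c<N$ and $N_j^p<N$. Since the limit superior of a finite sum is at most the sum of the limits superior, (157) and (158) show that the right-hand side has a finite limit superior, which gives (155).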

The interacting case. We are now ready to apply the Laplace–Varadhan principle to the sequence of probability measures $\big\{P^{0,N}_{\nu_N},N\geq 1\big\}$. By Lemma 6.1, the sequence $\big\{P^{0,N}_{\nu_N},N\geq 1\big\}$ obeys a large deviations principle in the topological space $(\mathcal{M}_{1,\varphi}(\mathcal{X}))^{2r}$ with rate function $L(\textbf{Q})$ defined by (117), and with speed N. Moreover, by Lemma 6.5, the functions $h^N$ defined in (109) are continuous at any $\textbf{Q}$ such that $L(\textbf{Q})<\infty$. Furthermore, by Lemma 6.2, the functions $h^N$ are continuous on the set $\big\{\textbf{Q}\in\big(\mathcal{M}_{1}(\mathcal{D}([0,T],\mathcal{Z}))\big)^{2r}\,\big|\,L(\textbf{Q})<\infty\big\}$. Therefore, the conditions in (128), (129), and (131) hold true. Finally, we have seen in Lemma 6.7 that (132) is satisfied. Hence, a straightforward application of [Reference Varadhan64, Theorem 3.4] gives

(159) \begin{equation}\begin{split}\frac{1}{N}\log\int_{\big(\mathcal{M}_{1,\varphi}(\mathcal{X})\big)^{2r}}\exp\big\{ N h^N \big\} dP_{\nu_N}^{0,N}\longrightarrow\sup_{\textbf{Q}\in\big(\mathcal{M}_{1,\varphi}(\mathcal{X})\big)^{2r}}\big[h(\textbf{Q})-L(\textbf{Q})\big]\end{split}\end{equation}

as $N\rightarrow\infty$ , and the sequence

(160) \begin{equation}\begin{split}\Bigg\{ \frac{\exp\big(Nh^N\big)}{\int_{\big(\mathcal{M}_{1,\varphi}(\mathcal{X})\big)^{2r}}\exp\big(Nh^N\big)dP^{0,N}_{\nu_N}}\cdot P^{0,N}_{\nu_N}, N\geq 1 \Bigg\}\end{split}\end{equation}

obeys a large deviations principle with speed N and rate function

(161) \begin{align}\textbf{Q}\rightarrow L(\textbf{Q})-h(\textbf{Q})-\inf_{\textbf{Q}'\in\big(\mathcal{M}_{1,\varphi}(\mathcal{X})\big)^{2r}}[L(\textbf{Q}')-h(\textbf{Q}')].\end{align}

Now, from (108) we have

(162) \begin{align} \frac{dP_{\nu^N}^{N}}{dP_{\nu^N}^{0,N}}(\textbf{Q})=\exp\big\{N h^N(\textbf{Q})\big\}.\end{align}

Since $P_{\nu^N}^{N}$ is a probability measure we obtain

(163) \begin{align}\int_{\big(\mathcal{M}_{1,\varphi}(\mathcal{X})\big)^{2r}}\exp\big(Nh^N\big)dP^{0,N}_{\nu_N}=\int_{\big(\mathcal{M}_{1,\varphi}(\mathcal{X})\big)^{2r}}dP_{\nu^N}^{N}=1.\end{align}

Thus, the left-hand side of (159) is always zero and so

(164) \begin{align}\sup_{\textbf{Q}\in\big(\mathcal{M}_{1,\varphi}(\mathcal{X})\big)^{2r}}\big[h(\textbf{Q})-L(\textbf{Q})\big]=0,\end{align}

which gives that

(165) \begin{align}\inf_{\textbf{Q}'\in\big(\mathcal{M}_{1,\varphi}(\mathcal{X})\big)^{2r}}[L(\textbf{Q}')-h(\textbf{Q}')]=0.\end{align}

We then conclude that the sequence $\big\{P_{\nu^N}^{N},N\geq 1\big\}$ obeys a large deviations principle in the topological space $(\mathcal{M}_{1,\varphi} (\mathcal{X}))^{2r}$ , with speed N and rate function

(166) \begin{align}I(\textbf{Q})=L(\textbf{Q})-h(\textbf{Q}).\end{align}

In order to obtain the representation in (119), we proceed as follows. First, from (154) we have that, for $\textbf{Q}\in(\mathcal{M}_{1,\varphi}(\mathcal{X}))^{2r}$ , $h(\textbf{Q})<\infty$ . Moreover, from [Reference Borkar and Sundaresan12, Lemma 5.6], the functions $J^{j,\iota}(Q)$ defined by (118) have the following representation:

(167) \begin{align}J^{j,\iota}(Q)=\left\{\begin{array}{l@{\quad}l} H\big(Q\big|P_{j,\iota}\big) & \mbox{if $Q\circ\pi_0^{-1}=\nu^{\,j,\iota}$},\\[4pt] +\infty & \mbox{otherwise},\end{array}\right.\end{align}

where $H\big(Q\big|P_{j,\iota}\big)$ is the relative entropy defined by (116). Therefore, if either $Q\circ\pi_0^{-1}\neq\nu^{\,j,\iota}$ or Q is not absolutely continuous with respect to $P_{j,\iota}$, then one can immediately observe that $J^{j,\iota}(Q)=\infty$; thus $L(\textbf{Q})=\infty$, and finally $I(\textbf{Q})=\infty$. Now, assume that for all $1\leq j\leq r$ we have $Q_{j}^c\circ\pi_0^{-1}=\nu^{\,j,c}$, $Q_{j}^c\ll P_{j,c}$, and $Q_{j}^p\circ\pi_0^{-1}=\nu^{\,j,p}$, $Q_{j}^p\ll P_{j,p}$; then

\begin{align*}L(\textbf{Q})=\alpha_1p_1^c H\big(Q_{1}^c\big|P_{1,c}\big)+ \alpha_1p_1^p H\big(Q_{1}^p\big|P_{1,p}\big)+\cdots+\alpha_rp_r^c H\big(Q_{r}^c\big|P_{r,c}\big)+ \alpha_rp_r^p H\big(Q_{r}^p\big|P_{r,p}\big).\end{align*}

Furthermore, one can observe from (105) that the densities $\exp\big\{h_1(x,\eta,\rho_j)\big\}$ and $\exp\big\{h_2(x,\eta,\rho_1,\ldots,\rho_r)\big\}$ do not depend on the initial condition $z_0$. Therefore, for each $1\leq j\leq r$, the densities of $R^c\big(\pi\big(Q_{j}^c\big),\pi\big(Q_{j}^p\big)\big)$ and $R^p\big(\pi\big(Q_{j}^c\big),\pi\big(Q_{1}^p\big),\ldots,\pi\big(Q_{r}^p\big)\big)$ with respect to the mixtures $P_{j,c}$ and $P_{j,p}$ are given by, respectively,

\begin{align*}\frac{d R^c\big(\pi\big(Q_{j}^c\big),\pi\big(Q_{j}^p\big)\big)}{d P_{j,c}}(x)=\exp\Big\{h_1\big(x,\pi\big(Q_{j}^c\big),\pi\big(Q_{j}^p\big)\big)\Big\}\end{align*}

and

\begin{align*} \frac{d R^p\big(\pi\big(Q_{j}^c\big),\pi\big(Q_{1}^p\big),\ldots,\pi\big(Q_{r}^p\big)\big)}{d P_{j,p}}(x)=\exp\Big\{h_2\big(x,\pi\big(Q_{j}^c\big),\pi\big(Q_{1}^p\big),\ldots,\pi\big(Q_{r}^p\big)\big)\Big\}.\end{align*}

Replacing $h_1(\cdot)$ and $h_2(\cdot)$ in (109) by the two representations above, we find

(168) \begin{equation}\begin{split}h^N(\textbf{Q})&=\sum_{j=1}^r\Bigg[\frac{N_j^c}{N}\int_{D([0,T],\mathcal{Z})}dQ_{j}^c\log\frac{d R^c\big(\pi\big(Q_{j}^c\big),\pi\big(Q_{j}^p\big)\big)}{d P_{j,c}}\\&\qquad\qquad+\frac{N_j^p}{N}\int_{D([0,T],\mathcal{Z})}dQ_{j}^p\log\frac{d R^p\big(\pi\big(Q_{j}^c\big),\pi\big(Q_{1}^p\big),\ldots,\pi\big(Q_{r}^p\big)\big)}{d P_{j,p}}\Bigg].\end{split}\end{equation}

Finally, using (99) we find that, as $N\rightarrow\infty$ ,

(169) \begin{equation}\begin{split}L(\textbf{Q})-h(\textbf{Q})&=\sum_{j=1}^r\Bigg[\alpha_jp_j^c\Bigg(\int_{D([0,T],\mathcal{Z})}dQ_{j}^c\log\frac{d Q_{j}^c}{d P_{j,c}}-\int_{D([0,T],\mathcal{Z})}dQ_{j}^c\log\frac{d R^c\big(\pi\big(Q_{j}^c\big),\pi\big(Q_{j}^p\big)\big)}{d P_{j,c}}\Bigg)\\&\qquad\qquad+\alpha_jp_j^p\Bigg(\int_{D([0,T],\mathcal{Z})}dQ_{j}^p\log\frac{d Q_{j}^p}{d P_{j,p}}\\& \qquad -\int_{D([0,T],\mathcal{Z})}dQ_{j}^p\log\frac{d R^p\big(\pi\big(Q_{j}^c\big),\pi\big(Q_{1}^p\big),\ldots,\pi\big(Q_{r}^p\big)\big)}{d P_{j,p}}\Bigg)\Bigg]\\ &=\sum_{j=1}^r\bigg[\alpha_jp_j^c\int_{D([0,T],\mathcal{Z})}dQ_{j}^c\log\frac{d Q_{j}^c}{d R^c\big(\pi\big(Q_{j}^c\big),\pi\big(Q_{j}^p\big)\big)}\\&\qquad\qquad+\alpha_jp_j^p\int_{D([0,T],\mathcal{Z})}dQ_{j}^p\log\frac{d Q_{j}^p}{d R^p\big(\pi\big(Q_{j}^c\big),\pi\big(Q_{1}^p\big),\ldots,\pi\big(Q_{r}^p\big)\big)}\bigg] \\ &=\sum_{j=1}^r\bigg[\alpha_jp_j^cH\bigg(Q_{j}^c\Big|R^c\big(\pi\big(Q_{j}^c\big),\pi\big(Q_{j}^p\big)\big)\bigg)\\ & \qquad +\alpha_jp_j^pH\bigg(Q_{j}^p\Big| R^p\big(\pi\big(Q_{j}^c\big),\pi\big(Q_{1}^p\big),\ldots,\pi\big(Q_{r}^p\big)\big)\bigg)\bigg].\end{split}\end{equation}

This concludes the proof.
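
The second equality in (169) rests on the chain rule for Radon–Nikodym derivatives. As a sketch, write Q, R, and P as shorthand (used only in this remark) for $Q_j^c$, $R^c\big(\pi\big(Q_{j}^c\big),\pi\big(Q_{j}^p\big)\big)$, and $P_{j,c}$, and assume that $Q\ll P$, that $h_1$ is finite P-almost surely, and that both integrals below are finite. Since $dR/dP=\exp\{h_1\}>0$, the measures R and P are mutually absolutely continuous, so that $dQ/dR=(dQ/dP)\,(dR/dP)^{-1}$ holds Q-almost surely and

\begin{align*}\int dQ\log\frac{dQ}{dP}-\int dQ\log\frac{dR}{dP}=\int dQ\log\bigg(\frac{dQ}{dP}\bigg/\frac{dR}{dP}\bigg)=\int dQ\log\frac{dQ}{dR}=H(Q|R).\end{align*}

The same computation, with $h_2$, $R^p$, and $P_{j,p}$ in place of $h_1$, $R^c$, and $P_{j,c}$, gives the peripheral terms.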

6.2. Large deviations principle for the empirical process

We now investigate the large deviations of the sequence $\big(p_{\nu_N}^N,N\geq 1\big)$ where, for any $N\geq 1$ , $p_{\nu_N}^N=\mathbb{P}_{z^N}^N\circ\gamma_N^{-1}=\pi(\mathbb{M}^N)$ is the distribution of the $(\mathcal{M}_1(\mathcal{Z}))^{2r}$ -valued empirical process defined by

\begin{equation*}\begin{split}\mu^N\,:\, t\in [0,T]\longrightarrow \mu^N(t)&=\left(\mu_1^{c,N}(t),\mu_1^{p,N}(t),\cdots,\mu_r^{c,N}(t),\mu_r^{p,N}(t)\right)\\ &=\Bigg(\frac{1}{N_1^c}\sum_{n\in C^c_1}\delta_{X^c_{n,1}(t)},\frac{1}{N_1^p}\sum_{n\in C^p_1}\delta_{X^p_{n,1}(t)},\ldots, \frac{1}{N^c_r}\sum_{n\in C^c_r}\delta_{X^c_{n,r}(t)}, \frac{1}{N^p_r}\sum_{n\in C^p_r}\delta_{X^p_{n,r}(t)} \Bigg).\end{split}\end{equation*}

The flow $\mu^N$ takes values in the product space $\big(\mathcal{D}([0,T],\mathcal{M}_1(\mathcal{Z}))\big)^{2r}$ . Again let $\mathcal{D}([0,T],\mathcal{M}_1(\mathcal{Z}))$ be equipped with the metric

(170) \begin{align}\rho_T (\mu,\nu)\,:\!=\,\sup_{0\leq t\leq T}\rho_0 (\mu_t,\nu_t),\quad\mu,\nu\in\mathcal{D}([0,T],\mathcal{M}_1(\mathcal{Z})),\end{align}

where $\rho_0(\alpha,\beta)$ , $\alpha,\beta\in\mathcal{M}_1(\mathcal{Z})$ , is a metric on $\mathcal{M}_1(\mathcal{Z})$ that generates the weak topology on $\mathcal{M}_1(\mathcal{Z})$ . Moreover, let the product space $\big(\mathcal{D}([0,T],\mathcal{M}_1(\mathcal{Z}))\big)^{2r}$ be equipped with the product topology induced by the product metric $\rho^{2r}_T=\max\{\rho_{T},\ldots,\rho_{T}\}$ .

For any $\xi=\big(\xi_1^c,\xi_1^p,\ldots,\xi_r^c,\xi_r^p\big)\in\big(\mathcal{M}_1(\mathcal{Z})\big)^{2r}$ , define the rate matrices

(171) \begin{equation}\begin{split}A^{j,c}_{\xi}\,:\!=\,\bigg(\lambda^c_{j,z,z'}\Big(\xi_j^c,\xi_j^p\Big)\bigg)_{(z,z')\in\mathcal{Z}\times\mathcal{Z}}\qquad\text{and}\quad A^{j,p}_{\xi}\,:\!=\,\bigg(\lambda^p_{j,z,z'}\Big(\xi_j^c,\xi_1^p,\ldots,\xi^p_r\Big)\bigg)_{(z,z')\in\mathcal{Z}\times\mathcal{Z}},\end{split}\end{equation}

where

$$\lambda^c_{j,z,z}\Big(\xi_j^c,\xi_j^p\Big)=-\sum_{z'\neq z}\lambda^c_{j,z,z'}\Big(\xi_j^c,\xi_j^p\Big)$$

and

$$\lambda^p_{j,z,z}\Big(\xi_j^c,\xi_1^p,\ldots,\xi_r^p\Big)=-\sum_{z'\neq z}\lambda^p_{j,z,z'}\Big(\xi_j^c,\xi_1^p,\ldots,\xi_r^p\Big).$$

From the laws of large numbers given in Corollary 5.1, one can deduce that, as $N\rightarrow\infty$ , the sequence $\big(\mu^N, N\geq 1\big)$ converges weakly, for converging initial conditions, towards the solution $\mu$ of the following McKean–Vlasov system:

(172) \begin{equation}\begin{split}\left\{ \begin{array}{l@{\quad}c@{\quad}l}\dot{\mu}_j^c(t)=A^{j,c^*}_{\mu (t)}\mu_j^c(t), & & \\[3pt] \dot{\mu}_j^p(t)=A^{j,p^*}_{\mu(t)}\mu_j^p(t), & & \\[3pt] \mu_j^c(0)=\nu_j^c,\mu_j^p(0)=\nu_j^p, & & \\[3pt] 1\leq j\leq r, & & \end{array}\right.\end{split}\end{equation}

where $A^*$ is the adjoint (transpose) of the matrix A and $\dot{\mu}(t)= \frac{\partial}{\partial t} \mu(t)$. Note that the Lipschitz property of the functions $\lambda_{j,z,z'}^{c}$ and $\lambda^p_{j,z,z'}$ ensures that (172) is well-posed. Also, one can notice that the representation (172) is consistent with the infinitesimal generators $\mathcal{L}^c_{\xi,\eta_j}$ and $\mathcal{L}_{\xi,\eta_1,\dots,\eta_r}^p$ introduced in (103) and (104). Indeed, if we consider $\phi$, $\mathcal{L}^c_{\xi,\eta_j}\phi$, and $\mathcal{L}_{\xi,\eta_1,\dots,\eta_r}^p\phi$ as column vectors, then the right-hand sides of (103) and (104) are the results of right-multiplying the rate matrices $A^{j,c}$ and $A^{j,p}$, respectively, by the vector $\phi$.
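
For concreteness, the first line of (172) can be written out coordinate by coordinate. The following is a sketch that uses only the definition (171) and the diagonal convention stated after it; the rates are abbreviated as $\lambda^c_{j,z,z'}(t)\,:\!=\,\lambda^c_{j,z,z'}\big(\mu_j^c(t),\mu_j^p(t)\big)$, a shorthand introduced for this remark only. Since $(A^*\mu)(z')=\sum_{z}\mu(z)A_{z,z'}$, for each $z'\in\mathcal{Z}$,

\begin{align*}\dot{\mu}_j^c(t)(z')&=\sum_{z\in\mathcal{Z}}\mu_j^c(t)(z)\,\lambda^c_{j,z,z'}(t)\\&=\sum_{z\neq z'}\Big[\mu_j^c(t)(z)\,\lambda^c_{j,z,z'}(t)-\mu_j^c(t)(z')\,\lambda^c_{j,z',z}(t)\Big],\end{align*}

that is, the rate of inflow into $z'$ minus the rate of outflow from $z'$. Summing over $z'$ gives zero, so the total mass of $\mu_j^c(t)$ is conserved along the flow. The peripheral equation has the same form with $\lambda^p_{j,z,z'}$ in place of $\lambda^c_{j,z,z'}$.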

Denote by $\tau$ the log-Laplace transform of the centered Poisson distribution with parameter 1 given by $\tau(u)=e^u-u-1$ , and let $\tau^*$ be its Legendre transform, defined by

\begin{align*}\tau^*(u)\,:\!=\,\left\{\begin{array}{l@{\quad}l@{\quad}l} (u+1) \log (u+1)-u & \text{if} & u>-1 , \\1 & \text{if} & u=-1 ,\\ +\infty & \text{if} & u<-1 . \\\end{array}\right.\end{align*}
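
The formula for $\tau^*$ can be checked directly. The following short computation is a sketch based only on the definition $\tau^*(u)=\sup_{v\in\mathbb{R}}\{uv-\tau(v)\}$, where $uv-\tau(v)=(u+1)v-e^v+1$. For $u>-1$ this map is strictly concave in v with critical point $v=\log(u+1)$, so

\begin{align*}\tau^*(u)=(u+1)\log(u+1)-(u+1)+1=(u+1)\log(u+1)-u.\end{align*}

For $u=-1$ the supremum of $1-e^v$ equals 1, approached as $v\rightarrow-\infty$, and for $u<-1$ the expression $(u+1)v-e^v+1$ tends to $+\infty$ as $v\rightarrow-\infty$. In particular, $\tau^*(0)=0$, and $\tau^*\geq 0$ since the choice $v=0$ always yields the value 0.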

Let us recall now the notion of absolute continuity introduced in [Reference Dawson and Gärtner24, Definition 4.1]. Denote by $\mathcal{S}$ the Schwartz space of test functions $\mathbb{R}^d\rightarrow\mathbb{R}$ having compact support and possessing continuous derivatives of all orders. We endow $\mathcal{S}$ with the usual inductive topology. Let $\mathcal{S}'$ be the corresponding space of real distributions. For each compact set $K\subset\mathbb{R}^d$ , $\mathcal{S}_K$ will denote the subspace of $\mathcal{S}$ consisting of all test functions whose support is contained in K. Finally, let $\langle\nu,f\rangle$ denote the application of the test function f to the distribution $\nu$ .

Definition 6.1. Let I be an interval of the real line. A map $\nu (\cdot)\,:\, I\rightarrow\mathcal{S}'$ is called absolutely continuous if, for each compact set $K\subset\mathbb{R}^d$, there exist a neighborhood $U_K$ of 0 in $\mathcal{S}_K$ and an absolutely continuous function $H_K\,:\, I\rightarrow\mathbb{R}$ such that

\begin{align*}|\langle\nu(u),f\rangle-\langle\nu(v),f\rangle|\leq |H_K(u)-H_K(v)|\end{align*}

for all $u,v\in I$ and $f\in U_K$ .

Finally, for any $\theta\in\mathcal{M}(\mathcal{Z})$ , define

\begin{align*}|||\theta|||^{j,c}_{\mu(t)}&\,:\!=\sup_{\Phi\,:\,\mathcal{Z}\rightarrow\mathbb{R}} \!\Bigg\{\! \sum_{z\in\mathcal{Z}}\theta(z)\cdot\Phi(z)-\!\sum_{{ (z,z')\in\mathcal{E}}} \tau \big(\Phi(z')- \Phi(z)\big)\cdot\mu_j^c(t)(z)\cdot\lambda^{c}_{j,z,z'}\big(\mu_j^c(t),\mu_j^p(t)\big) \Bigg\},\\|||\theta|||^{j,p}_{\mu(t)}&\,:\!=\,\sup_{\Phi\,:\,\mathcal{Z}\rightarrow\mathbb{R}} \Bigg\{ \sum_{z\in\mathcal{Z}}\theta(z)\cdot\Phi(z)\\& -\sum_{{ (z,z')\in\mathcal{E}}} \tau \big(\Phi(z')- \Phi(z)\big)\cdot\mu_j^p(t)(z)\cdot\lambda^p_{j,z,z'}\big(\mu_j^c(t),\mu_1^p(t),\ldots,\mu_r^p(t)\big) \Bigg\}.\end{align*}

Also let us introduce, for each $\nu\in(\mathcal{M}_1(\mathcal{Z}))^{2r}$, and according to [Reference Dawson and Gärtner24, Equation (4.9)], the functional $S_{[0,T]}(\mu|\nu)$ defined from $(\mathcal{D}([0, T ], \mathcal{M}_1 (\mathcal{Z})))^{2r}$ to $[0,\infty]$ by setting

(173) \begin{equation}\begin{split} & S_{[0,T]}(\mu|\nu)\,:\!=\, \sum_{j=1}^r\bigg[\alpha_jp_j^c \int_0^T \big|\big|\big| \dot{\mu}_j^c(t)-{ A^{j,c^*}_{\mu (t)}}\mu_j^c(t)\big|\big|\big|^{j,c}_{\mu(t)}dt +\alpha_jp_j^p\int_0^T \big|\big|\big| \dot{\mu}_j^p(t)\\& \qquad -{ A^{j,p^*}_{\mu (t)}}\mu_j^p(t)\big|\big|\big|^{j,p}_{\mu(t)}dt \bigg]\end{split}\end{equation}

if $\mu(0)=\nu$ and $\mu_j^c,\mu_j^p$ are absolutely continuous in the sense of Definition 6.1 for all $1\leq j\leq r$ , and $S_{[0,T]}(\mu|\nu)=+\infty$ otherwise.

We are now ready to state our large deviations principle for the sequence $\big(p^N_{\nu_N}, N\geq 1\big)$ .

Theorem 6.2. Suppose that $\nu_N\rightarrow\nu$ weakly. The sequence of probability measures $\big(p^N_{\nu_N},N\geq 1\big)$ obeys a large deviations principle in the space $\big(\mathcal{D}([0,T],\mathcal{M}_1(\mathcal{Z}))\big)^{2r}$ , with speed N, and rate function $S_{[0,T]}(\mu|\nu)$ given by (173).

Moreover, if a path $\mu\in\big(\mathcal{D}([0,T],\mathcal{M}_1(\mathcal{Z}))\big)^{2r}$ satisfies $S_{[0,T]}(\mu|\nu)<\infty$ , then $\mu_j^c$ and $\mu_j^p$ are absolutely continuous and there exist families of rate matrices $L_{j,c}(t)=\Big(l_{z,z'}^{j,c}(t), (z, z')\in\mathcal{E}\Big)$ and $L_{j,p}(t)=\Big(l_{z,z'}^{j,p}(t), (z, z')\in\mathcal{E}\Big)$ such that, for all $1\leq j\leq r$ and $t \in [0, T ]$ ,

\begin{align*}\dot{\mu}_j^c(t)&={L_{j,c}(t)}^{*}\mu_j^c(t),\\\dot{\mu}_j^p(t)&={L_{j,p}(t)}^{*}\mu_j^p(t).\end{align*}

Furthermore, in this case, the good rate function $S_{[0,T]}(\mu|\nu)$ is given by

(174) \begin{equation}\begin{split}\sum_{j=1}^r\Bigg[&\alpha_jp_j^c \int_0^T \Bigg( \sum_{(z,z')\in\mathcal{E}}\Big(\mu_j^c(t)(z)\Big)\lambda_{j,z,z'}^{c}\Big(\mu_j^c(t),\mu_j^p(t)\Big) \tau^*\Bigg(\frac{l_{z,z'}^{j,c}(t)}{\lambda_{j,z,z'}^{c}\Big(\mu_j^c(t),\mu_j^p(t)\Big)}-1\Bigg) \Bigg)dt \\ &+\alpha_jp_j^p\int_0^T \Bigg( \sum_{(z,z')\in\mathcal{E}}\Big(\mu_j^p(t)(z)\Big)\lambda^p_{j,z,z'}\Big(\mu_j^c(t),\mu_1^p(t),\ldots,\mu_r^p(t)\Big)\\ & \tau^*\Bigg(\frac{l_{z,z'}^{j,p}(t)}{\lambda^p_{j,z,z'}\Big(\mu_j^c(t),\mu_1^p(t),\ldots,\mu_r^p(t)\Big)}-1\Bigg) \Bigg) dt \Bigg].\end{split}\end{equation}
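
As a consistency check on (174) (a sketch, not part of the statement, assuming the rates are strictly positive on $\mathcal{E}$ so that the ratios are well defined), take $l_{z,z'}^{j,c}(t)=\lambda^c_{j,z,z'}\big(\mu_j^c(t),\mu_j^p(t)\big)$ and $l_{z,z'}^{j,p}(t)=\lambda^p_{j,z,z'}\big(\mu_j^c(t),\mu_1^p(t),\ldots,\mu_r^p(t)\big)$. Every argument of $\tau^*$ in (174) is then 0, and

\begin{align*}\tau^*(0)=(0+1)\log(0+1)-0=0,\end{align*}

so the expression (174) vanishes, while the two evolution equations above reduce to the McKean–Vlasov system (172). Conversely, since $(\tau^*)'(u)=\log(u+1)$ and $(\tau^*)''(u)=1/(u+1)>0$ for $u>-1$, the function $\tau^*$ is strictly convex with its unique zero at $u=0$, so (174) can vanish only if $l=\lambda$ for almost every t on those pairs $(z,z')$ carrying positive mass.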

Proof. We first use a contraction argument to derive a large deviations principle for the sequence $\big(p_{\nu_N}^N,N\geq 1\big)$ . From Theorem 6.1, the sequence $\big(P_{\nu_N}^N,N\geq 1\big)$ obeys a large deviations principle with speed N and rate function $I(\textbf{Q})$ given by

\begin{align*}I(\textbf{Q})= \left\{ \begin{array}{l@{\quad}l@{\quad}l}\sum_{j=1}^r\Big[\alpha_jp_j^cH\Big(Q_{j}^c\big|R^c\big(\pi\big(Q_{j}^c\big),\pi\big(Q_{j}^p\big)\big)\Big) & \text{if $\textbf{Q}\circ\pi_0^{-1}=\nu$},\\ \qquad +\alpha_jp_j^pH\Big(Q_{j}^p\big| R^p\big(\pi\big(Q_{j}^c\big),\pi\big(Q_{1}^p\big),\ldots,\pi\big(Q_{r}^p\big)\big)\Big)\Big] & \\+\infty &\text{otherwise}.&\end{array}\right.\end{align*}

Moreover, from Lemma 6.4, the projection

\begin{align*}\pi\,:\, \big(\mathcal{M}_1(\mathcal{D}([0,T],\mathcal{Z}))\big)^{2r}&\rightarrow\big(\mathcal{D}([0,T],\mathcal{M}_1(\mathcal{Z}))\big)^{2r}\end{align*}

is continuous at each $\textbf{Q}\in\big(\mathcal{M}_1(\mathcal{D}([0,T],\mathcal{Z}))\big)^{2r}$ where $L(\textbf{Q})<\infty$ , and thus at any $\textbf{Q}$ such that $I(\textbf{Q})<\infty$ . The latter corresponds to the effective domain $\mathcal{D}_I=\{\textbf{Q}\,:\,I(\textbf{Q})<\infty\}$ of the rate function I (see [Reference Dembo and Zeitouni29, p. 4]). Therefore, by applying the contraction principle to the large deviations principle of $\big(P_{\nu_N}^N,N\geq 1\big)$ (see [Reference Dembo and Zeitouni29, Theorem 4.2.1, Remark (c)]) with rate I, we deduce that the family of probability measures $\big(P_{\nu_N}^N\circ\pi^{-1},N\geq 1\big)$ obeys a large deviations principle in $\big(\mathcal{D}([0,T],\mathcal{M}_1(\mathcal{Z}))\big)^{2r}$ with the rate function defined, for any $\mu\in\big(\mathcal{D}([0,T],\mathcal{M}_1(\mathcal{Z}))\big)^{2r}$ , by

(175) \begin{align}V(\mu)\,:\!=\,\inf\Big\{ I(\textbf{Q}), \textbf{Q}\in\big(\mathcal{M}_1(D([0,T],\mathcal{Z}))\big)^{2r},\pi(\textbf{Q})=\mu \Big\}.\end{align}

We now derive another representation for the rate function V following [Reference Dawson and Gärtner24, Reference Léonard49]. Fix $\mu=\big(\mu_j^c,\mu_j^p,1\leq j\leq r\big)\in\big(\mathcal{D}([0,T],\mathcal{M}_1(\mathcal{Z}))\big)^{2r}$ . Note that writing $\pi(\textbf{Q})=\mu$ , with $\textbf{Q}\in\big(\mathcal{M}_1(D([0,T],\mathcal{Z}))\big)^{2r}$ , is equivalent to $\pi\big(Q_{j}^c\big)=\mu_j^c$ and $\pi\big(Q_{j}^p\big)=\mu_j^p$ for all $1\leq j\leq r$ . Therefore V can be rewritten as

\begin{align*}V(\mu)=\inf\Bigg\{ \sum_{j=1}^r\Big[\alpha_jp_j^cH\Big(Q_{j}^c\big|R^c\big(\mu_j^c,\mu_j^p\big)\Big) &+\alpha_jp_j^pH\Big(Q_{j}^p\big| R^p\big(\mu_j^c,\mu_1^p,\ldots, \mu_r^p\big) \Big)\Big],\\&\qquad\qquad \textbf{Q}\in\big(\mathcal{M}_1(D([0,T],\mathcal{Z}))\big)^{2r}, \pi(\textbf{Q})=\mu \Bigg\}.\end{align*}

Fix $1\leq j\leq r$ . Let $\left(X^{(i)}_{j,c}\right)_{i\geq 1}$ and $\left(X^{(i)}_{j,p}\right)_{i\geq 1}$ be sequences of i.i.d. processes with common laws $R^c\big(\mu_j^c,\mu_j^p\big)$ and $R^p\big(\mu_j^c,\mu_1^p,\ldots, \mu_r^p\big)$ , respectively. By Sanov’s theorem, the empirical measures

$$\frac{1}{N_j^c}\sum_{i=1}^{N_j^c}X^{(i)}_{j,c} \quad \text{and} \quad \frac{1}{N_j^p}\sum_{i=1}^{N_j^p}X^{(i)}_{j,p}$$

obey large deviations principles as $N_j^c\rightarrow\infty$ and $N_j^p\rightarrow\infty$ , with speeds $N_j^c$ and $N_j^p$ , respectively, and rate functions given by

\[ Q\in \mathcal{M}_1(\mathcal{D}([0,T],\mathcal{Z}))\rightarrow H\big(Q\big|R^c\big(\mu_j^c,\mu_j^p\big)\big)\]

and

\[ Q\in \mathcal{M}_1(\mathcal{D}([0,T],\mathcal{Z}))\rightarrow H\big(Q\big|R^p\big(\mu_j^c,\mu_1^p,\ldots,\mu_r^p\big)\big),\]

respectively. Using the same arguments as in the proof of Lemma 6.4, one can show that the projection $\pi$ is continuous at any $\textbf{Q}\in\big(\mathcal{M}_1(D([0,T],\mathcal{Z}))\big)^{2r}$ such that

\begin{align*}\sum_{j=1}^r\Big[\alpha_jp_j^cH\Big(Q_{j}^c\big|R^c\big(\mu_j^c,\mu_j^p\big)\Big) +\alpha_jp_j^pH\Big(Q_{j}^p\big| R^p\big(\mu_j^c,\mu_1^p,\ldots, \mu_r^p \big)\Big)\Big]<\infty.\end{align*}

Thus, the component projections $\pi\big(Q_{j}^c\big)$ and $\pi\big(Q_{j}^p\big)$ are also continuous. Hence, using the contraction principle ([Reference Dembo and Zeitouni29, Theorem 4.2.1]), the sequences

$$\Bigg\{t\in [0,T]\rightarrow \frac{1}{N_j^c}\sum_{i=1}^{N_j^c}X^{(i)}_{j,c}(t);\,N_j^c\geq 1\Bigg\}$$

and

$$\Bigg\{t\in [0,T]\rightarrow \frac{1}{N_j^p}\sum_{i=1}^{N_j^p}X^{(i)}_{j,p}(t);\,N_j^p\geq 1\Bigg\}$$

obey large deviations principles with speeds $N_j^c$ and $N_j^p$ , respectively, and rate functions

\begin{align*}&\qquad\eta\in\mathcal{D}([0,T],\mathcal{M}_1(\mathcal{Z})) \rightarrow S^{j,c}_{\mu}(\eta)\,:\!=\,\inf\Big\{ H\Big(Q\big|R^c\big(\mu_j^c,\mu_j^p\big)\Big),\\& \qquad\qquad Q\in\mathcal{M}_1(D([0,T],\mathcal{Z})),\pi(Q)=\eta \Big\} \\&\text{and}\\&\qquad\eta\in\mathcal{D}([0,T],\mathcal{M}_1(\mathcal{Z}))\rightarrow S^{j,p}_{\mu}(\eta)\,:\!=\,\inf\Big\{ H\Big(Q\big|R^p\big(\mu_j^c,\mu_1^p,\ldots,\mu_r^p\big)\Big),\\& \qquad\qquad Q\in\mathcal{M}_1(D([0,T],\mathcal{Z})),\pi(Q)=\eta \Big\},\end{align*}

respectively. Note that, by using an independence argument and following the same steps as in the proof of Lemma 6.1, one can show that the sequence

\begin{align*}\Bigg(t\in [0,T]\rightarrow \Bigg(\frac{1}{N_1^c}\sum_{i=1}^{N_1^c}X^{(i)}_{1,c}(t),\frac{1}{N_1^p}\sum_{i=1}^{N_1^p}X^{(i)}_{1,p}(t),\ldots, \frac{1}{N_r^c}\sum_{i=1}^{N_r^c}X^{(i)}_{r,c}(t),\frac{1}{N_r^p}\sum_{i=1}^{N_r^p}X^{(i)}_{r,p}(t) \Bigg)\Bigg)_{N\geq 1}\end{align*}

obeys a large deviations principle with speed N and rate function

\begin{align*}{\eta}=\big(\eta_1^c,\eta_1^p,\ldots,\eta_r^c,\eta_r^p \big)\in\big(\mathcal{D}([0,T],\mathcal{M}_1(\mathcal{Z}))\big)^{2r}\rightarrow S_{\mu}(\eta)=\sum_{j=1}^r\bigg[\alpha_jp_j^c S^{j,c}_{\mu}\big(\eta_j^c \big) +\alpha_jp_j^pS^{j,p}_{\mu}\big(\eta_j^p\big)\bigg].\end{align*}

In addition, the vector

\begin{align*}\Bigg( \frac{1}{N_1^c}\sum_{i=1}^{N_1^c}X^{(i)}_{1,c},\frac{1}{N_1^p}\sum_{i=1}^{N_1^p}X^{(i)}_{1,p},\ldots, \frac{1}{N_r^c}\sum_{i=1}^{N_r^c}X^{(i)}_{r,c},\frac{1}{N_r^p}\sum_{i=1}^{N_r^p}X^{(i)}_{r,p} \Bigg)_{N\geq 1}\end{align*}

obeys a large deviations principle with rate $I(\textbf{Q})$. Therefore, by a contraction argument and using again the continuity of the projection, we find that

\begin{align*}\Bigg(t\in [0,T]\rightarrow\Bigg( \frac{1}{N_1^c}\sum_{i=1}^{N_1^c}X^{(i)}_{1,c}(t),\frac{1}{N_1^p}\sum_{i=1}^{N_1^p}X^{(i)}_{1,p}(t),\ldots, \frac{1}{N_r^c}\sum_{i=1}^{N_r^c}X^{(i)}_{r,c}(t),\frac{1}{N_r^p}\sum_{i=1}^{N_r^p}X^{(i)}_{r,p}(t) \Bigg)\Bigg)_{N\geq 1}\end{align*}

obeys a large deviations principle with rate $V(\mu)$ . Hence, by the uniqueness of the rate function (cf. [Reference Deuschel and Stroock30, Lemma 2.1.1]), we find

(176) \begin{align}V(\mu)=S_{\mu}(\mu).\end{align}

We next derive another representation for $S_{\mu}(\nu)$ . For any $1\leq j\leq r$ and $\nu\in\mathcal{D}([0,T],\mathcal{M}_1(\mathcal{Z}))$ , we have from [Reference Léonard49, p. 319] that

(177) \begin{equation}\begin{split}S^{j,c}_{\mu}(\nu)&=U^{j,c}_{\mu}(\nu_t(dz)dt), \\S^{j,p}_{\mu}(\nu)&=U^{j,p}_{\mu}(\nu_t(dz)dt),\end{split}\end{equation}

where, for all $\tilde{\nu}\in\mathcal{M}_1([0,T[\times\mathcal{Z})$ , $U^{j,c}_{\mu}(\tilde{\nu})$ and $U^{j,p}_{\mu}(\tilde{\nu})$ are given by the following (see [Reference Léonard49, Equation (3.14)]):

(178) \begin{equation}\begin{split}U^{j,c}_{\mu}(\tilde{\nu})=\sup_{f\in \mathcal{C}_1^c}\Bigg\{\int_0^T\bigg\langle-&\bigg(\frac{\partial}{\partial t}+ { A^{j,c}_{\mu (t)}}\bigg)f(t,z)\\&-\sum_{z'\,:\,(z,z')\in\mathcal{E}}\tau\big(f(t,z')-f(t,z)\big)\lambda^{c}_{j,z,z'}\Big(\mu_j^c(t),\mu_j^p(t)\Big),\nu_t (dz)\bigg\rangle\Bigg\},\end{split}\end{equation}
(179) \begin{equation}\begin{split}U^{j,p}_{\mu}(\tilde{\nu})& =\sup_{f\in \mathcal{C}_1^c}\Bigg\{\int_0^T\bigg\langle-\bigg(\frac{\partial}{\partial t}+{ A^{j,p}_{\mu (t)}}\bigg)f(t,z)\\&\quad -\sum_{z'\,:\,(z,z')\in\mathcal{E}}\tau\big(f(t,z')-f(t,z)\big)\lambda^p_{j,z,z'}\Big(\mu_j^c(t),\mu_1^p(t),\ldots,\mu_r^p(t)\Big),\nu_t (dz)\bigg\rangle\Bigg\},\end{split}\end{equation}

where $\mathcal{C}_1^c$ stands for the set of all continuous functions with compact support on $[0,T[ \times\mathcal{Z}$ which are t-differentiable. Using (177), (178), and (179) together with [Reference Léonard49, Lemma 3.2], we obtain

(180) \begin{equation}\begin{split}S^{j,c}_{\mu}(\mu)&=\int_0^T \Big|\Big|\Big| \dot{\mu}_j^c(t)-{ A^{j,c^*}_{\mu (t)}}\mu_j^c(t)\Big|\Big|\Big|^{j,c}_{\mu(t)}dt, \\S^{j,p}_{\mu}(\mu)&=\int_0^T \Big|\Big|\Big| \dot{\mu}_j^p(t)-{ A^{j,p^*}_{\mu (t)}}\mu_j^p(t)\Big|\Big|\Big|^{j,p}_{\mu(t)}dt.\end{split}\end{equation}

Finally, using (176), we deduce that $\big(p^N_{\nu_N},N\geq 1\big)$ obeys a large deviations principle with speed N and good rate function (173). The representation (174) follows immediately from [Reference Léonard49, Lemma 3.2], and the statement about absolute continuity follows from [Reference Léonard49, Theorem 3.1]. The theorem is proved.

The following result shows that the large deviations principle for $\big(p_{\nu}^{N},N\geq 1\big)$ holds uniformly in the initial condition.

Corollary 6.1. For any compact set $K\subset\big(\mathcal{M}_1(\mathcal{Z})\big)^{2r}$, any closed set $F\subset\big(\mathcal{D}([0,T],\mathcal{M}_1(\mathcal{Z}))\big)^{2r}$, and any open set $G\subset\big(\mathcal{D}([0,T],\mathcal{M}_1(\mathcal{Z}))\big)^{2r}$, we have

(181) \begin{align}\limsup_{N\rightarrow\infty}\frac{1}{N} \log \sup_{\nu\in K} p_{\nu}^{N} \big({\mu^N\in F}\big) \leq - \inf_{\nu\in K}\inf_{\mu\in F} S_{[0,T]} (\mu|\nu),\end{align}
(182) \begin{align}\liminf_{N\rightarrow\infty}\frac{1}{N} \log \inf_{\nu\in K} p_{\nu}^{N} \big({\mu^N\in G}\big) \geq - \sup_{\nu\in K}\inf_{\mu\in G} S_{[0,T]} (\mu|\nu).\end{align}

Proof. This follows immediately from [Reference Dembo and Zeitouni29, Corollary 5.6.15] and Theorem 6.2.

Acknowledgements

We would like to thank the anonymous referees and the associate editor for having read the paper with great care and made several very important comments that improved the exposition.

Funding information

This research was supported by the Natural Sciences and Engineering Research Council of Canada Discovery Grants and by Carleton University.

Competing interests

There were no competing interests to declare which arose during the preparation or publication process of this article.

References

Agliari, E., Migliozzi, D. and Tantari, D. (2018). Non-convex multi-species Hopfield models. J. Statist. Phys. 172, 1247–1269.
Akhil, P. T., Altman, E. and Sundaresan, R. (2019). A mean-field approach for controlling singularly perturbed multi-population SIS epidemics. Preprint. Available at https://arxiv.org/abs/1902.05713.
Alberici, D., Camilli, F., Contucci, P. and Mingione, E. (2021). The multi-species mean-field spin-glass on the Nishimori line. J. Statist. Phys. 182, article no. 2.
Aleandri, M. and Minelli, I. G. (2019). Opinion dynamics with Lotka–Volterra type interactions. Electron. J. Prob. 24, 31 pp.
Barra, A., Contucci, P., Mingione, E. and Tantari, D. (2015). Multi-species mean field spin glasses. Rigorous results. Ann. Inst. H. Poincaré Prob. Statist. 16, 691–708.
Bayraktar, E., Chakraborty, S. and Wu, R. (2020). Graphon mean field systems. Preprint. Available at https://arxiv.org/abs/2003.13180.
Bayraktar, E. and Wu, R. (2021). Mean field interaction on random graphs with dynamically changing multi-color edges. Stoch. Process. Appl. 141, 197–244.
Benaïm, M. and Le Boudec, J.-Y. (2008). A class of mean field interaction models for computer and communication systems. Performance Evaluation 65, 823–838.
Bhamidi, S., Budhiraja, A. and Wu, R. (2019). Weakly interacting particle systems on inhomogeneous random graphs. Stoch. Process. Appl. 129, 2174–2206.
Billingsley, P. (1999). Convergence of Probability Measures, 2nd edn. John Wiley, New York.
Bolley, F. (2008). Separability and completeness for the Wasserstein distance. In Séminaire de Probabilités XLI, eds C. Donati-Martin, M. Émery, A. Rouault and C. Stricker, Springer, Berlin, Heidelberg, pp. 371–377.
Borkar, V. S. and Sundaresan, R. (2012). Asymptotics of the invariant measure in mean field models with jumps. Stoch. Systems 2, 322–380.
Bossy, M., Faugeras, O. and Talay, D. (2015). Clarification and complement to ‘Mean-field description and propagation of chaos in networks of Hodgkin–Huxley and FitzHugh–Nagumo neurons’. J. Math. Neurosci. 5, article no. 31, 23 pp.
Buckdahn, R., Li, J. and Peng, S. (2014). Nonlinear stochastic differential games involving a major player and a large number of collectively acting minor agents. SIAM J. Control Optimization 52, 451–492.
Budhiraja, A., Mukherjee, D. and Wu, R. (2019). Supermarket model on graphs. Ann. Appl. Prob. 29, 1740–1777.
Budhiraja, A. and Wu, R. (2016). Some fluctuation results for weakly interacting multi-type particle systems. Stoch. Process. Appl. 126, 2253–2296.
Carmona, R. and Zhu, X. (2016). A probabilistic approach to mean field games with major and minor players. Ann. Appl. Prob. 26, 1535–1580.
Chong, C. and Klüppelberg, C. (2019). Partial mean field limits in heterogeneous networks. Stoch. Process. Appl. 129, 4998–5036.
Collet, F. (2014). Macroscopic limit of a bipartite Curie–Weiss model: a dynamical approach. J. Statist. Phys. 157, 1309–1319.
Collet, F., Formentin, M. and Tovazzi, D. (2016). Rhythmic behavior in a two-population mean-field Ising model. Phys. Rev. E 94, article no. 042139.
Dawson, D. A. (1983). Critical dynamics and fluctuations for a mean field model of cooperative behaviour. J. Statist. Phys. 41, 29–85.
Dawson, D. A. (1991). Measure-valued Markov processes. In École d’Été de Probabilités de Saint-Flour XXI–1991, Springer, Berlin, Heidelberg, pp. 1–260.
Dawson, D. A. (2017). Introductory lectures on stochastic population systems. Preprint. Available at https://arxiv.org/abs/1705.03781.
Dawson, D. A. and Gärtner, J. (1987). Large deviations from the McKean–Vlasov limit for weakly interacting diffusions. Stochastics 20, 247–308.
Dawson, D. A., Sid-Ali, A. and Zhao, Y. Q. (2022). Large-time behavior of finite-state mean-field systems with multi-classes. Stoch. Systems 13, 93–127.
Dawson, D. A., Tang, J. and Zhao, Y. Q. (2005). Balancing queues by mean field interaction. Queueing Systems 49, 335–361.
Dawson, D. A. and Zheng, X. (1991). Law of large numbers and central limit theorem for unbounded jump mean-field models. Adv. Appl. Math. 12, 293–326.
Delattre, S., Giacomin, G. and Luçon, E. (2016). A note on dynamical models on random graphs and Fokker–Planck equations. J. Statist. Phys. 165, 785–798.
Dembo, A. and Zeitouni, O. (2010). Large Deviations Techniques and Applications, 2nd edn. Springer, Berlin, Heidelberg.
Deuschel, J.-D. and Stroock, D. W. (1989). Large Deviations. Academic Press, New York.
Ethier, S. N. and Kurtz, T. G. (1986). Markov Processes: Characterization and Convergence. John Wiley, New York.
Feng, S. (1994). Large deviations for empirical process of mean-field interacting particle system with unbounded jumps. Ann. Prob. 22, 2122–2151.
Feng, S. (1994). Large deviations for Markov processes with mean field interaction and unbounded jumps. Prob. Theory Relat. Fields 100, 227–252.
Finnoff, W. (1993). Law of large numbers for a general system of stochastic differential equations with global interaction. Stoch. Process. Appl. 46, 153–182.
Finnoff, W. (1994). Law of large numbers for a heterogeneous system of stochastic differential equations with strong local interaction and economic applications. Ann. Appl. Prob. 4, 494–528.
Freidlin, M. I. and Wentzell, A. D. (2012). Random Perturbations of Dynamical Systems, 3rd edn. Springer, Berlin, Heidelberg.
Gärtner, J. (1988). On the McKean–Vlasov limit for interacting diffusions. Math. Nachr. 137, 197–248.
Giesecke, K., Spiliopoulos, K., Sowers, R. B. and Sirignano, J. A. (2015). Large portfolio asymptotics for loss from default. Math. Finance 25, 77–114.
Graham, C. (2000). Chaoticity on path space for a queueing network with selection of the shortest queue amongst several. J. Appl. Prob. 37, 198–211.
Graham, C. (2008). Chaoticity for multiclass systems and exchangeability within classes. J. Appl. Prob. 45, 1196–1203.
Graham, C. and Méléard, S. (1993). Propagation of chaos for a fully connected loss network with alternate routing. Stoch. Process. Appl. 44, 159–180.
Graham, C. and Méléard, S. (1995). Dynamic asymptotic results for a generalized star-shaped loss network. Ann. Appl. Prob. 5, 666–680.
Graham, C. and Robert, P. (2009). Interacting multi-class transmissions in large stochastic networks. Ann. Appl. Prob. 19, 2334–2361.
Hwang, C.-R. and Sheu, S.-J. (1990). Large-time behavior of perturbed diffusion Markov processes with applications to the second eigenvalue problem for Fokker–Planck operators and simulated annealing. Acta Appl. Math. 19, 253–295.
Kac, M. (1956). Foundations of kinetic theory. In Proc. 3rd Berkeley Symposium on Mathematical Statistics and Probability, Vol. III: Contributions to Astronomy and Physics, University of California Press, Berkeley, pp. 171–197.
Kirsch, W. and Toth, G. (2020). Two groups in a Curie–Weiss model with heterogeneous coupling. J. Theoret. Prob. 33, 2001–2026.
Kley, O., Klüppelberg, C. and Reichel, L. (2015). Systemic risk through contagion in a core-periphery structured banking network. Banach Center Publ. 104, 133–149.
Knöpfel, H., Löwe, M., Schubert, K. and Sinulis, A. (2020). Fluctuation results for general block spin Ising models. J. Statist. Phys. 178, 1175–1200.
Léonard, C. (1995). Large deviations for long range interacting particle systems with jumps. Ann. Inst. H. Poincaré Prob. Statist. 31, 289–323.
Löwe, M. and Schubert, K. (2018). Fluctuations for block spin Ising models. Electron. Commun. Prob. 23, 12 pp.
McKean, H. P. (1966). A class of Markov processes associated with nonlinear parabolic equations. Proc. Nat. Acad. Sci. USA 56, 1907–1911.
McKean, H. P. (1966). Speed of approach to equilibrium for Kac’s caricature of a Maxwellian gas. Arch. Rational Mech. Anal. 21, 343–367.
Méléard, S. and Bansaye, V. (2015). Stochastic Models for Structured Populations: Scaling Limits and Long Time Behavior. Springer, Cham.
Meylahn, J. M. (2020). Two-community noisy Kuramoto model. Nonlinearity 33, 1847–1880.
Mitzenmacher, M. (1996). The power of two choices in randomized load balancing. Doctoral Thesis, University of California, Berkeley.
Nagasawa, M. and Tanaka, H. (1987). Diffusion with interactions and collisions between coloured particles and the propagation of chaos. Prob. Theory Relat. Fields 74, 161–198.
Nagasawa, M. and Tanaka, H. (1987). On the propagation of chaos for diffusion processes with drift coefficients not of average form. Tokyo J. Math. 10, 403–418.
Nguyen, D. T., Nguyen, S. L. and Du, N. H. (2020). On mean field systems with multiclasses. Discrete Continuous Dynam. Systems 40, 683–707.
Porter, M. A., Onnela, J. P. and Mucha, P. J. (2009). Communities in networks. Notices Amer. Math. Soc. 56, 1082–1097.
Skorokhod, A. V. (1989). Asymptotic Methods in the Theory of Stochastic Differential Equations. American Mathematical Society, Providence, RI.
Sznitman, A. S. (1991). Topics in propagation of chaos. In École d’Été de Probabilités de Saint-Flour XIX–1989, ed. P. L. Hennequin, Springer, Berlin, Heidelberg, pp. 165–251.
Touboul, J. (2014). Propagation of chaos in neural fields. Ann. Appl. Prob. 24, 1298–1328.
Touboul, J. (2018). Erratum: ‘Propagation of chaos in neural fields’. Ann. Appl. Prob. 28, 3287–3289.
Varadhan, S. R. S. (1966). Asymptotic probabilities and differential equations. Commun. Pure Appl. Math. 19, 261–286.
Vvedenskaya, N. D., Dobrushin, R. L. and Karpelevich, F. I. (1996). Queueing system with selection of the shortest of two queues: an asymptotic approach. Problems Inf. Transmission 32, 15–27.
Vvedenskaya, N. D. and Suhov, Y. M. (1997). Dobrushin’s mean-field approximation for a queue with dynamic routing. Markov Process. Relat. Fields 3, 493–526.
Yasodharan, S. and Sundaresan, R. (2022). Large-time behaviour and the second eigenvalue problem for finite-state mean-field interacting particle systems. Adv. Appl. Prob. 55, 85–125.

Figure 1. Example of a block-structured graph with a complete peripheral subgraph. Here we have a 4-block-structured graph linked by a set of peripheral nodes. For the first block, the set of central nodes is $C_1^c=\{1,2\}$ and the set of peripheral nodes is $C_1^p=\{3,4\}$. The set of all peripheral nodes of the graph is given by the set of nodes $C^p=\{3,4,5,10,11,14,18\}$.