1. Introduction
Pólya urns are popular modeling tools, as they capture the dynamics of many real-world applications involving change over time. Numerous applications in informatics and biosciences are listed in [Reference Mahmoud24]; see also [Reference Johnson and Kotz15, Reference Kotz and Balakrishnan19] for a wider perspective.
In the early models, single balls are drawn one at a time. In the last two decades, several authors have generalized the models to schemes evolving by multiple drawings; see [Reference Bose, Dasgupta and Maulik3, Reference Chen and Kuba5–Reference Crimaldi, Louis and Minelli7, Reference Kuba and Mahmoud21–Reference Lasmar, Mailler and Selmi23, Reference Mahmoud25]. However, most of these investigations deal only with urn schemes involving two colors. So far, explicit expressions and asymptotic expansions for the expected value, the variance and covariances, and higher moments have proven elusive in the multicolor case.
In this manuscript, we present a more comprehensive approach in which there are multiple colors in the urn, and we sample several balls in each drawing. Depending on the composition of the sample, action is taken to place balls in the urn. We ought to mention that this is a new trend in investigating multicolor urns; little exists in the literature beyond the study of urn schemes with an infinite number of colors, where the authors imposed restrictions on the number of colors to add after each drawing [Reference Bandyopadhyay and Thacker2, Reference Janson13].
As a modeling tool, multicolor multiple-drawing urns have their niche in applications to capture the dynamics in systems that internally have multiple interacting parts. For instance, the Maki–Thompson rumor spreading model [Reference Maki and Thompson26] can be modeled by an urn scheme on three colors (representing ignorants, spreaders, and stiflers of the rumor) experiencing pair interactions, such as social visits and phone calls (represented as draws of pairs of balls at a time). Another application is to produce a profile of the nodes in random hypergraphs; see [Reference Sparks29, Chapter 6], and its prelude [Reference Sparks, Balaji and Mahmoud30].
2. The mechanics of a multicolor multiple-drawing urn scheme
We first describe a multicolor multiple-drawing urn model qualitatively. We consider an urn growing on k colors, which we label $1,\ldots, k$ . Initially, the urn contains a positive number of balls of colors $i\in [k]$ (some of the colors may initially be absent), where [k] stands for the set $\{1, 2, \ldots, k\}$ . At each discrete time step, a sample of balls is drawn out of the urn (the sample is taken without replacement). Note that we speak of the entire sample as one draw. After the sample is collected, we put it back in the urn. Depending on what appears in the sample, we add balls of various colors. Such models will be called multicolor multiple-drawing schemes.
Many such schemes can emerge upon specifying a particular growth algorithm (how many balls to draw, what to add, etc.). We can also develop a model based on sampling with replacement. The proof techniques and results for such a sampling scheme are very similar to those for a scheme sampling without replacement. We devote Section 7 to outlining results under sampling with replacement. We refer readers interested in more details to [Reference Sparks29].
A sample of size s is a partition of the number s into k nonnegative integer parts, one for each color. Each part of the partition is the number of balls of the corresponding color appearing in the sample. In other words, we view a sample as a row vector $\textbf{{s}}$ with nonnegative components $s_1, \ldots, s_k$ . The components of $\textbf{{s}}$ are the solutions of the Diophantine equation $s_1 + \cdots + s_k = s$ in nonnegative integers. There are $\binom{s +k-1}{s}$ such solutions.
In a very general model, when the sample $\textbf{{s}}$ is drawn, we add $m_{\textbf{{s}},j}$ balls of color j, for $j=1, \ldots, k$ , where $m_{\textbf{{s}},j}$ can be a random variable. We only consider the case of fixed additions, $m_{\textbf{{s}}, j}\in \mathbb Z$ , within the class of urn schemes where the sample size (in each drawing) is a fixed positive number s (not changing over time) and the total number of balls added after each drawing is also a fixed number b. Such an urn is said to be balanced, and b is called the balance factor. Further specification will follow while crafting conditions for linear recurrence.
We canonically rank the various samples through an application of [Reference Knuth17, Algorithm L], called reverse lexicographic order [Reference Konzem and Mahmoud18]. In this scenario, the sample $(x_1, \ldots, x_k)$ precedes the sample $(y_1, \ldots, y_k)$ if, for some $r \in [k]$ , we have $x_i = y_i$ for $i=1, \ldots, r-1$ , and $x_r > y_r$ . For example, drawing samples of size $s=3$ from an urn with $k=3$ colors, in this modification of the canonical ranking the sample (3,0,0) precedes the sample (2,1,0), which in turn precedes the sample (2, 0, 1).
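As an illustration of this ranking, the following minimal Python sketch (an illustrative aid, not part of the scheme's definition) enumerates all samples of size $s$ on $k$ colors in reverse lexicographic order and confirms the count $\binom{s+k-1}{s}$ mentioned above.

```python
from itertools import product
from math import comb

def ranked_samples(k, s):
    """All samples (s_1, ..., s_k) with s_1 + ... + s_k = s, listed in
    reverse lexicographic order (descending lexicographic comparison)."""
    sols = [v for v in product(range(s + 1), repeat=k) if sum(v) == s]
    return sorted(sols, reverse=True)

if __name__ == "__main__":
    k, s = 3, 3
    ranking = ranked_samples(k, s)
    assert len(ranking) == comb(s + k - 1, s)   # 10 samples for k = s = 3
    print(ranking[:3])  # [(3, 0, 0), (2, 1, 0), (2, 0, 1)], matching the text
```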
Our interest is in urns that can endure indefinite drawing. That is, no matter which stochastic path is followed, we can draw ad infinitum and are able to execute the addition rules after each drawing.
We call such a scheme tenable. For instance, if all the numbers $m_{\textbf{{s}}, j}$ , for all feasible $\textbf{{s}}$ and $j=1, \ldots, k$ , are positive, the scheme is tenable.
3. The (k, s, b)-urn
Consider a k-color multiple-drawing tenable balanced urn scheme grown by taking samples of a fixed size $s\ge 1$ , with balance factor $b \ge 1$ . We call such a class (k, s, b)-urns. Let $X_{n,i}$ be the number of balls of color $i\in [k]$ in the urn after $n\ge 0$ draws, and let $\textbf{X}_n$ be the row vector with these components. We call $\textbf{X}_n$ the composition vector. Thus, initially, there are $X_{0,i}\ge 0$ balls of color i, for $i=1,\ldots, k$ . Since the scheme is tenable, we must enable a first draw by having $\sum_{i=1}^k X_{0,i}\ge s$ . Let $\tau_n$ be the total number of balls after n draws. For a balanced urn, we have $\tau_n = \tau_{n-1} + b = bn + \tau_0$ . To study (k, s, b)-urns, we first need to set up some notation.
3.1. Notation
We use the notation $1_{\mathcal E}$ for the indicator of event $\mathcal E$ . For integer $r\ge 0$ and $z\in \mathbb C$ , the falling factorial $z(z-1) \cdots(z-r+1)$ is denoted by $(z)_r$ , and $(z)_0$ is to be interpreted as 1. For a row vector $\textbf{{s}} = (s_1, \ldots, s_k)$ , with $s_i\ge 0$ (for $i\in [k]$ ) and $\sum_{i=1}^k s_i = s$ , the multinomial coefficient $\binom{s}{s_1, \ldots, s_k}$ may be written as $\binom{s}{\textbf{{s}}}$ , and the product $(y_1)_{s_1}(y_2)_{s_2}\cdots(y_k)_{s_k}$ of falling factorials may be succinctly written as $(\textbf{y})_\textbf{s}$ for economy, where $\textbf{y} = (y_1, \ldots, y_k)$ .
There is a multinomial theorem for falling factorials, which expands a falling factorial of a sum in terms of the falling factorials of the individual summands. Namely, for $\textbf{z}=(z_1, \ldots, z_k)\in \mathbb{ C}^k$ , the theorem states that
\begin{equation*} (z_1 + z_2 + \cdots + z_k)_s = \sum_{s_1 + \cdots + s_k = s} \binom{s}{s_1, \ldots, s_k}\, (z_1)_{s_1} (z_2)_{s_2} \cdots (z_k)_{s_k}, \tag{1} \end{equation*}
with the sum running over nonnegative integers $s_1, \ldots, s_k$ .
This expansion is known as the (multivariate) Chu–Vandermonde identity; the reader can find it in classic books like [Reference Bailey1, Reference Graham, Knuth and Patashnik8].
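A quick numerical check of the identity may be reassuring. The sketch below (illustrative only, for a few integer arguments) verifies the falling-factorial multinomial theorem stated above.

```python
from itertools import product
from math import factorial

def falling(z, r):
    """Falling factorial (z)_r = z (z-1) ... (z-r+1), with (z)_0 = 1."""
    out = 1
    for i in range(r):
        out *= z - i
    return out

def check_chu_vandermonde(z, s):
    """Verify (z_1 + ... + z_k)_s equals the sum over s_1+...+s_k = s of
    binom(s; s_1,...,s_k) * (z_1)_{s_1} ... (z_k)_{s_k}."""
    k = len(z)
    lhs = falling(sum(z), s)
    rhs = 0
    for v in product(range(s + 1), repeat=k):
        if sum(v) != s:
            continue
        coeff = factorial(s)
        for si in v:
            coeff //= factorial(si)          # multinomial coefficient
        term = coeff
        for zi, si in zip(z, v):
            term *= falling(zi, si)
        rhs += term
    return lhs == rhs

print(check_chu_vandermonde((5, 7, 4), 3))   # True
```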
For probabilistic convergence modes we use ${\buildrel \textrm{D} \over \longrightarrow}$ , $\buildrel \mathbb{P} \over \longrightarrow$ , and $\buildrel \textrm{a.s.} \over \longrightarrow$ to respectively denote convergence in distribution, convergence in probability, and convergence almost surely.
We denote a diagonal matrix with the provided entries $a_1, a_2, \ldots, a_k$ as
with $\textbf{diag}(\textbf{v})$ representing the diagonal matrix with diagonal entries equal to the components of a vector $\textbf{v}$ .
We employ the standard asymptotic notation o and O. We also need them in a matricial sense, indicated in bold: $\textbf{o}(n)$ and $\textbf{O}(n)$ denote a matrix of appropriate dimensions with all its entries being o(n) or O(n), respectively.
In the ensuing text, unless otherwise stated, all vectors are row vectors of k components and all square matrices are $k\times k$ . The identity matrix is denoted by $\textbf{I}$ , the row vector of zeros is denoted by $\textbf{0}$ , and the zero matrix is denoted by $\textbf{0}$ . Likewise, the vector of ones is denoted by $\textbf{1}$ .
We define the discrete simplex $\Delta_{k,s}$ to be the collection of all possible samples $\textbf{s}$ or, more precisely,
We arrange the ball addition numbers in a $\binom{s+k-1}{s} \times k$ replacement matrix $\textbf{M} = [m_{\textbf{{s}},j}]_{\textbf{{s}}\in \Delta_{k,s},j\in [k]}$ , where the entry $m_{\textbf{{s}},j}$ is the number of balls of color j that we add to the urn upon drawing the sample $\textbf{{s}}$ . We take the entire row corresponding to $\textbf{{s}}$ as a row vector denoted by $\textbf{{m}}_\textbf{{s}} =(m_{\textbf{{s}}, 1},m_{\textbf{{s}},2}, \ldots ,m_{\textbf{{s}},k})$ .
When we deal with the eigenvalues $\lambda_1, \ldots, \lambda_k$ of a $k\times k$ matrix, we arrange them in decreasing order of their real parts. In other words, we consider them indexed in the fashion $\textrm{Re}\;\lambda_1\ge\textrm{Re}\;\lambda_2\ge\cdots\ge\textrm{Re}\;\lambda_k$ . In the case that $\lambda_1$ is of multiplicity 1, we call it the principal eigenvalue. The associated eigenvector $\textbf{v}_1$ is dubbed the principal left eigenvector.
The replacement matrix $\textbf{M} = [\textbf{{m}}_\textbf{{s}}]_{\textbf{{s}}\in \Delta_{k,s}}$ is obtained by stacking up the row vectors $\textbf{{m}}_\textbf{{s}}$ for $\textbf{{s}}\in\Delta_{k,s}$ from canonically smallest to largest (that is, the smallest canonical sample labels the bottom row of $\textbf{M}$ and the largest canonical sample labels the top row of $\textbf{M}$ ).
Example 1. Consider a (3,3,9)-urn scheme. The replacement matrix shown below is $10\times 3$ , filled with some numbers consistent with the chosen parameters – the sum across each row is the balance factor $b=9$ . The rows are labeled to the left with the partitions of $s=3$ into $k=3$ components listed (from top to bottom) in decreasing canonical order. For example, for $\textbf{{s}}=(2,0,1)$ , the entry $m_{\textbf{{s}},3}$ is 5.
3.2. Tools from linear algebra
For a square matrix $\textbf{A}$ , we let $\sigma(\textbf{A})$ be the spectrum of $\textbf{A}$ (its set of eigenvalues). We use the Jordan decomposition of $\textbf{A}$ to produce a direct sum $\bigoplus_\lambda\textbf{E}_\lambda$ of generalized eigenspaces $\textbf{E}_\lambda$ that decomposes $\mathbb{C}^k$ , where $\textbf{A}-\lambda\textbf{I}$ is a nilpotent operator on $\textbf{E}_\lambda$ for all $\lambda\in\sigma(\textbf{A})$ .
Using [Reference Nomizu27, Theorem 7.6], this result implies that, for each $\lambda\in\sigma(\textbf{A})$ , there exists a projection matrix $\textbf{P}_\lambda$ that commutes with $\textbf{A}$ and satisfies the conditions
where $\textbf{N}_\lambda = \textbf{P}_\lambda \textbf{N}_\lambda=\textbf{N}_\lambda \textbf{P}_\lambda$ is nilpotent. Furthermore, these projection matrices satisfy $\textbf{P}_{\lambda_i} \textbf{P}_{\lambda_j} = \textbf{0}$ , when $\lambda_i \ne \lambda_j$ . We define $\nu_\lambda \ge 0$ to be the integer such that $\textbf{N}_\lambda^{\nu_\lambda} \ne \textbf{0}_{k \times k}$ but $\textbf{N}_\lambda^{\nu_\lambda+1} = \textbf{0}$ .
Remark 1. In the Jordan normal form of $\textbf{A}$ , the largest Jordan block associated with $\lambda$ has size $(\nu_\lambda+1)$ , and thus $(\nu_\lambda+1)$ is at most equal to the multiplicity of $\lambda$ . If the algebraic multiplicity of $\lambda$ is $\mathcal{M}_\lambda$ , we can determine the sizes of the Jordan blocks for $\lambda$ by analyzing the nullity of $(\textbf{A}-\lambda\textbf{I})^i$ for $i=1,\ldots,\mathcal{M}_\lambda$ . If $\nu_\lambda=0$ for all $\lambda\in\sigma(\textbf{A})$ , then $\textbf{A}$ is diagonalizable.
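The quantities $\nu_\lambda$ can be computed symbolically from these nullities. The following sketch (illustrative, using SymPy) implements the procedure of Remark 1: the nullity of $(\textbf{A}-\lambda\textbf{I})^i$ stabilizes once $i$ exceeds the size of the largest Jordan block.

```python
import sympy as sp

def jordan_indices(A):
    """For each eigenvalue lam of A, return nu_lam, where the largest Jordan
    block for lam has size nu_lam + 1 (cf. Remark 1)."""
    A = sp.Matrix(A)
    k = A.shape[0]
    out = {}
    for lam in A.eigenvals():                      # eigenvalue -> algebraic multiplicity
        prev, nu = 0, 0
        for i in range(1, k + 1):
            null_dim = k - ((A - lam * sp.eye(k)) ** i).rank()
            if null_dim == prev:                   # nullity stopped growing
                break
            nu = i - 1                             # largest block seen so far has size i
            prev = null_dim
        out[lam] = nu
    return out

# A toy matrix with a size-2 Jordan block for eigenvalue 2:
print(jordan_indices([[2, 1, 0], [0, 2, 0], [0, 0, 5]]))   # {2: 1, 5: 0}
```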
3.3. The affine subclass
Suppose $\mathbb{F}_n$ is the sigma field generated by the first n samples. For a general scheme, the conditional expectation $\mathbb{E}[\textbf{X}_n\mid\mathbb{F}_{n-1}]$ may contain combinations of $\textbf{X}_{n-1}$ corresponding to higher moments of the counts in $\textbf{X}_{n-1}$ , such as the entries of $\textbf{X}_{n-1}^\top\textbf{X}_{n-1}$ , the matrix of all mixed second-order products.
We call an urn scheme affine if, for each $n\ge 0$ , the conditional expectation of $\textbf{X}_n$ takes the form $\mathbb{E}[\textbf{X}_n\mid\mathbb{F}_{n-1}] = \textbf{X}_{n-1}\textbf{C}_n$ for some real-valued $k\times k$ matrix $\textbf{C}_n$ , thus keeping all unconditional expectations linear. This condition aids the mathematical tractability of the model.
To guarantee affinity, certain conditions must be met. The following theorem specifies necessary and sufficient conditions for affinity. These conditions are expressed in terms of the vectors $\textbf{e}_1,\ldots,\textbf{e}_k$ , which form the standard basis of a k-dimensional vector space where, for $i\in [k]$ , we have
The affinity will be built around the $k\times k$ submatrix of $\textbf{M}$ corresponding to the rows labeled $s\textbf{e}_i$ , for $i \in [k]$ . This submatrix is filled with a set of integers that are balanced (that is, the sum across any row in the submatrix adds up to b). We call this square submatrix the core, and denote it by $\textbf{A}$ ; we refer to the ith row in $\textbf{A}$ as $\textbf{a}_{s\textbf{e}_i}$ (which is also $\textbf{{m}}_{s\textbf{e}_i}$ ). To ensure tenability, we assume all the entries of $\textbf{A}$ are nonnegative, except possibly the diagonal elements – if one such element is negative, we assume it must be greater than or equal to $-s$ . The initial number $\tau_{0}$ must be at least s, too.
Remark 2. This class may be expanded to cases where a diagonal element is less than $-s$ , given specific starting conditions and column configurations for $\textbf{A}$ are met.
A version of the following theorem appears in [Reference Kuba20].
Theorem 1. A (k,s,b)-urn with core $\textbf{A}$ is affine if and only if
for all $\textbf{{s}}\in \Delta_{k,s}$ .
Proof. Recall that for the basis vector $\textbf{e}_i = (e_1,\ldots,e_k)$ , all the components are 0 except the ith, which is 1. By construction, in the core we have the trivial relationship
Let the sample in the nth draw be recorded in $\textbf{Q}_n$ . The ball addition rules translate into a stochastic recurrence:
giving rise to the conditional expectation
Assume the scheme is affine. For affinity to hold, $\mathbb{E}[\textbf{X}_n\mid\mathbb{F}_{n-1}]$ should be equated to $\textbf{X}_{n-1}\textbf{C}_n$ for some $k\times k$ matrix $\textbf{C}_n$ , for each $n\ge1$ . This is the same as
Using the fact that $\tau_{n-1} = \sum_{i={1}}^k X_{n-1, i}$ , write the latter relation in the form
On the left-hand side, the highest power of $X_{n-1,i}$ for $i\in [k]$ is $s+1$ , while that on the right-hand side is only s, unless $\textbf{C}_n - \textbf{I}$ rids the left-hand side of these powers. So, $\textbf{C}_n-\textbf{I}$ must be of the form $({1}/{\tau_{n-1}})\textbf{B}_n$ for some matrix $\textbf{B}_n$ that does not depend on the history of composition vectors. Write $\textbf{B}_n = \big[\textbf{b}_{n,1}^\top,\ldots,\textbf{b}_{n,k}^\top\big]$ , where $\textbf{b}_{n,j}^\top$ is the jth column vector of $\textbf{B}_n$ , for $j\in [k]$ . So, we write (3) as
We utilize the multinomial theorem stated in (1). For any $i\in [k]$ , expand the falling factorial ${(\tau_{n-1}-1)_{s-1}=\big(\big(\sum_{i=1}^k X_{n-1,i} \big) -1\big)_{s-1}}$ as
This being valid for any $i\in [k]$ , we get
To match coefficients of similar falling factorials, execute the vectorial product on the left:
Extracting the coefficient of $(\textbf{X}_{n-1})_\textbf{{s}}$ on both sides, we get
Applying this to a row in the core, where $\textbf{{s}}= s\textbf{e}_j$ for some $j\in [k]$ , we get $\textbf{b}_{n,j}^\top = s \textbf{e}_j$ (as all the multinomial coefficients involve a negative lower index, and hence vanish, except the jth). It follows that
For the converse, assume the affinity condition in the statement of the theorem. Recall that $\textbf{Q}_n$ is the random sample at time n, which has a conditional multi-hypergeometric distribution (given $\textbf{X}_{n-1}$ ) with parameters $\tau_{n-1}$ , $\textbf{X}_{n-1}$ , and s. Now, we write the stochastic recurrence (2) in the form
The mean of the multi-hypergeometric distribution is well known [Reference Kendall, Stuart and Ord16]. In vectorial form, the conditional mean (given $\textbf{X}_{n-1}$ ) is $(s/{\tau_{n-1}})\textbf{X}_{n-1}$ . We thus have the conditional expectation
which is an affine form.
Remark 3. It is worth describing (k, s, b)-urn schemes from a qualitative viewpoint. There is a saying that 'the whole is equal to the sum of its parts'. Here, each color plays an independent role in how the urn changes within a given step. That is, a ball of color 1 in the sample creates one set of changes, a ball of color 2 in the sample produces another set of changes, and so on. Thus, $\textbf{m}_\textbf{s}$ is merely the sum of the separate impacts of each ball color, with no additional effect produced by a specific sample combination. Furthermore, this class reduces to the traditional single-draw urn models when $s=1$ .
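One convenient way to realize this additive structure in code is to take each row of the replacement matrix as the sample-weighted average of the core rows, $\textbf{m}_\textbf{s} = ({1}/{s})\,\textbf{s}\,\textbf{A}$ , which is consistent with the stochastic recurrence $\textbf{X}_n = \textbf{X}_{n-1} + ({1}/{s})\textbf{Q}_n\textbf{A}$ used later in Section 5. The sketch below builds such a replacement matrix under this assumption (the core must be chosen so that every resulting row is integer-valued; the numerical core shown is hypothetical, not one of the paper's examples).

```python
import numpy as np
from itertools import product

def replacement_matrix(A, s):
    """Fill a replacement matrix from the core A of an affine (k, s, b)-urn:
    the row for sample s is m_s = (1/s) * (sample vector) @ A, i.e. the
    sample-weighted average of the core rows a_{s e_i}.  Rows are listed in
    reverse lexicographic order, as in Section 2."""
    A = np.asarray(A)
    k = A.shape[0]
    samples = sorted((v for v in product(range(s + 1), repeat=k) if sum(v) == s),
                     reverse=True)
    M = np.array([(np.array(v) @ A) / s for v in samples])
    return samples, M

# Hypothetical 3-color core with s = 2 (every row sums to b = 8; even entries
# keep every m_s integer-valued).
samples, M = replacement_matrix([[4, 2, 2], [2, 4, 2], [2, 2, 4]], s=2)
for v, row in zip(samples, M):
    print(v, row)      # each row sums to the balance factor b = 8
```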
Corollary 1.
Proof. The corollary follows from direct iteration of (5).
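For a concrete computation of the exact mean, one can simply iterate the relation $\boldsymbol{\mu}_j = \boldsymbol{\mu}_{j-1}(\textbf{I} + ({1}/{\tau_{j-1}})\textbf{A})$ (stated again in the proof of Corollary 2). A minimal numerical sketch follows; the core used in the usage line is hypothetical.

```python
import numpy as np

def exact_mean(A, X0, n):
    """Iterate mu_j = mu_{j-1} (I + A / tau_{j-1}), with tau_j = b*j + tau_0,
    to obtain E[X_n] for a balanced affine (k, s, b)-urn with core A."""
    A = np.asarray(A, dtype=float)
    mu = np.asarray(X0, dtype=float)
    b = A[0].sum()                      # balance factor: every row of A sums to b
    tau0 = mu.sum()
    k = A.shape[0]
    for j in range(n):
        mu = mu @ (np.eye(k) + A / (b * j + tau0))
    return mu

# Hypothetical small example (not from the paper): a 3-color core with b = 8.
print(exact_mean([[4, 2, 2], [2, 4, 2], [2, 2, 4]], X0=[3, 2, 1], n=2000) / 2000)
```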
Example 2. The matrix $\textbf{M}$ in Example 1 is affine. It is built from the core
As an instantiation of the computation of a row of $\textbf{M}$ , as given in Theorem 1, take the sample $\textbf{{s}} =(2,0,1)$ ; the entry $\textbf{{m}}_\textbf{{s}}$ in $\textbf{M}$ is obtained as
4. Classification of cores
There is a large variety of $k\times k$ matrices that can be used as cores to build replacement matrices for urn schemes. As already mentioned, our focus in this paper is on (k, s, b)-urn schemes. Results are more easily obtained when the core of the replacement matrix is irreducible. Loosely speaking, this means that all the colors are essential (or communicating). More precisely, this is defined in terms of dominant colors. Color $i\in [k]$ is said to be dominant when we start with only balls of color i and almost surely balls of color $j\in [k]$ appear at some future point in time, for every $j\not = i$ . An urn scheme is then called irreducible if every color is dominant.
For affine (k, s, b)-urn schemes, irreducibility in the above sense implies that the core matrix $\bf{A}$ is irreducible in the matrix sense. The following lemma and example illustrate this fact.
Lemma 1. If an affine (k,s,b)-urn scheme is irreducible, then its core matrix $\bf A$ is irreducible in the matrix sense.
Proof. Consider an affine (k, s, b)-urn scheme that is irreducible. Let $\mathcal{U}$ be the set of all partitions labeling the drawing scheme of the urn. Assume, to the contrary, that its core matrix, denoted by $\bf A$ , is reducible in the matrix sense. Then, there exist a permutation matrix $\bf P$ and a positive integer $r <k$ such that
where $\textbf{A}_{11}$ and $\textbf{A}_{22}$ are square matrices of orders r and $(k-r)$ , respectively. Let $W=\{W_1, W_2, \ldots, W_r\} \subsetneq [k]$ be the colors represented by $\textbf{A}_{11}$ . Then, monocolor samples restricted to colors in W only add colors in W. By the affine nature of our urn scheme, samples which come from the set of partitions $\mathcal{U}_{W} \subsetneq \mathcal U$ , generated from all multiple drawings of size s from colors in W, would never introduce balls of colors outside of W as well. Therefore, given that an urn begins with only balls of color $i \in W$ , no path of samples exists which would produce balls of color $j \not\in W$ ; such a conclusion contradicts the irreducibility of the urn scheme.
Example 3. To illustrate this result, we present two affine urn schemes, represented by replacement matrices $\textbf{M}_1$ and $\textbf{M}_2$ (with core matrices $\textbf{A}_1$ and $\textbf{A}_2$ , respectively). We determine whether a square matrix is irreducible in the matricial sense by checking whether all elements of $(\textbf{I} + |{\bf A}|)^{k-1}$ are positive; for further details, see [Reference Horn and Johnson11, Section 6.2].
Here, the urn scheme represented by $\textbf{M}_1$ is reducible. If we start monochromatically in either color 1 or color 2, there is no path of samples from only balls of colors 1 and 2 which will produce balls of color 3. Furthermore, we see that its core matrix $\textbf{A}_1$ is reducible in the matricial sense. The urn scheme represented by $\textbf{M}_2$ , however, is irreducible. No matter which initiating color is chosen, there exist sampling paths which eventually produce every possible color. Such conclusions can also be made from the results from the core matrix $\textbf{A}_2$ .
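The matrix-sense test quoted above is easy to automate; a minimal sketch (the $2\times2$ cores shown are hypothetical illustrations):

```python
import numpy as np

def is_irreducible(A):
    """A k x k matrix A is irreducible (in the matrix sense) if and only if
    every entry of (I + |A|)^(k-1) is positive [Horn and Johnson, Section 6.2]."""
    A = np.asarray(A, dtype=float)
    k = A.shape[0]
    return bool(np.all(np.linalg.matrix_power(np.eye(k) + np.abs(A), k - 1) > 0))

# Reducible: a monochromatic color-1 sample adds only color-1 balls,
# so starting from color 1 never produces color 2.
print(is_irreducible([[2, 0], [1, 1]]))   # False
print(is_irreducible([[1, 1], [1, 1]]))   # True
```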
When an urn is irreducible, the eigenvalue of $\textbf{A}$ with the largest real part, $\lambda_1$ , is real, positive, and simple (of multiplicity 1), a consequence of a classic Perron–Frobenius theorem [Reference Horn and Johnson11, Section 8.4].
Remark 4. When $\textbf{A}$ is nonnegative (balls are not removed for any sample), the consequence is direct. If $\textbf{A}$ has negative entries, they must be along the diagonal, and so we decompose $\textbf{A}$ into $\textbf{A}=\textbf{R}-s\textbf{I}$ , where $\textbf{R}$ is nonnegative and irreducible, and therefore has leading eigenvalue $(\lambda_1+s)$ that is simple. The desired results for $\textbf{A}$ thus follow.
In what follows, the results for an affine irreducible (k, s, b)-urn scheme are governed by the ratio of $\textrm{Re}(\lambda_2)$ to $\lambda_1$ . We call this ratio the core index and denote it by $\Lambda$ ; it is always strictly less than 1 for an irreducible core matrix. A trichotomy arises:
• Small-index core, where $\Lambda < \frac 1 2$ .
• Critical-index core, where $\Lambda = \frac 1 2$ .
• Large-index core, where $\Lambda > \frac 1 2$ .
Occasionally, we may just call the schemes small, critical, or large.
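A small classification sketch follows (exact criticality should of course be decided in exact arithmetic; the floating-point version below is only indicative, and the core in the usage line is hypothetical).

```python
import numpy as np

def core_index(A):
    """Lambda = Re(lambda_2) / lambda_1, with eigenvalues ordered by
    decreasing real part as in Section 3.1."""
    eig = sorted(np.linalg.eigvals(np.asarray(A, dtype=float)),
                 key=lambda z: z.real, reverse=True)
    return eig[1].real / eig[0].real

def classify(A, tol=1e-9):
    L = core_index(A)
    if abs(L - 0.5) < tol:
        return "critical"
    return "small" if L < 0.5 else "large"

print(classify([[4, 2, 2], [2, 4, 2], [2, 2, 4]]))   # 'small' (Lambda = 1/4)
```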
5. A multivariate martingale underlying the (k, s, b)-urn class
The conditional expectation in (5) leads to a martingale formulation. We use the operator $\nabla$ to denote backward differences. At draw n, we have
Define $\textbf{Y}_n := \nabla\textbf{X}_n - ({1}/{\tau_{n-1}})\textbf{X}_{n-1}\textbf{A}.$ Then, $\textbf{Y}_n$ is $\mathbb{F}_{n}$ -measurable and
Thus, we have the recurrence
In the proofs, we make use of the matrix products $\textbf{F}_{i,j}:=\prod_{i\le\ell < j}(\textbf{I} + ({1}/{\tau_{\ell}})\textbf{A})$ , $0\le i \le j$ , to unravel the martingale differences in the process $\textbf{X}_n$ . The product $\textbf{F}_{i,j}$ can also be written in terms of the polynomial function $f_{i,j}$ , where
We order the eigenvalues of $\textbf{A}$ in the fashion specified in Section 3.1 and note that $\lambda_1={b}$ is a simple eigenvalue and $\textrm{Re}(\lambda_i) < \lambda_1$ for $i > 1$ , since $\textbf{A}$ is irreducible. Accordingly, there exist corresponding left and right eigenvectors of $\textbf{A}$ , to be called $\textbf{v}_1$ and $\textbf{w}_1$ , unique up to normalization. Since $\textbf{A}$ is row-balanced, all the components of $\textbf{w}_1$ are equal. We may set $\textbf{w}_1=\textbf{1}$ and normalize $\textbf{v}_1$ such that $\textbf{w}_1\textbf{v}_1^\top=1$ . This projection results in $\textbf{P}_{\lambda_1}=\textbf{w}_1^\top\textbf{v}_1$ , and so we have
The following lemmas provide important properties that can be applied to the polynomial function (8). We omit the proofs as they are nearly identical to those in [Reference Janson14].
Lemma 2. For $1\le i\le j$ and $\lambda \in \sigma(\textbf{A})$ ,
Furthermore, for any $\nu \ge \nu_\lambda$ , as $i, j \to \infty$ with $i\le j$ , we have
Lemma 3. When $\lambda_1=b$ is a simple eigenvalue then, for $0\le i \le j$ ,
Lemma 4. For any fixed $x \in (0,1]$ , as $n\to \infty$ , $\textbf{F}_{\lceil xn\rceil,n} \to x^{-({1}/{b})\textbf{A}}$ .
5.1. Asymptotic mean
While exact, the execution of the calculation in Corollary 1 may be quite cumbersome. Here, we present a shortcut to the asymptotics of this computation via a multivariate martingale.
Theorem 2. Suppose an affine (k, s,b)-urn scheme is built from an irreducible core $\textbf{A}$ with principal left eigenvector $\textbf{v}_1$ and core index $\Lambda$ . Then the expected value of the composition vector is $\mathbb{E}[\textbf{X}_n] = \lambda_1 n \textbf{v}_1 + \tau_0\textbf{v}_1 + \textbf{O}\big(n^\Lambda\ln^{\nu_2}(n)\big)$ .
Proof. The linear vectorial recurrence (7) is of the general form $\textbf{L}_n = \textbf{L}_{n-1}\textbf{G}_{n} + \textbf{H}_n$ , with solution (interpreting the product corresponding to $i=n$ as the identity matrix $\textbf{I}$ )
Write the recurrence in the form $\textbf{X}_n = \textbf{X}_{n-1}\textbf{F}_{n-1,n} + \textbf{Y}_n$ , to recognize the solution
The factors $\textbf{X}_0$ and $\textbf{F}_{i,j}$ are nonrandom. We have $\mathbb{E}[\textbf{Y}_i]=\textbf{0}$ , and by taking expectations we get
Janson [Reference Janson14] analyzes the function $\textbf{F}_{0,n}$ carefully. When $\lambda \ne \lambda_1$ , [Reference Janson14, Lemma 4.3] implies that
By (9), we get $\textbf{X}_0\textbf{P}_{\lambda_1} = \textbf{X}_0\textbf{1}^\top\textbf{v}_1 = \tau_0\textbf{v}_1$ , and so, by (10), we have
Therefore, we have $\mathbb{E}[\textbf{X}_n] = (\lambda_1n+\tau_0)\textbf{v}_1+\textbf{O}\big(n^{\Lambda}\ln^{\nu_2}(n)\big)$ .
5.2. The covariance matrix
For the covariance matrix, we define $\hat{\textbf{P}} = \sum_{\lambda\ne\lambda_1}\textbf{P}_{\lambda} = \textbf{I} - \textbf{P}_{\lambda_1}$ and the symmetric matrix
Lastly, we use ${\textrm{e}}^{{s}\textbf{A}} := \sum_{k=0}^{\infty}({1}/{k!})({s}\textbf{A})^k$ for all $s\in\mathbb{R}$ to define the integral
Theorem 3. Consider a growing irreducible affine (k,s,b)-urn. For unordered samples drawn either with or without replacement, the following hold for large n:
(i) If the core index is small, then $({1}/{n})\boldsymbol{\Sigma}_n \to \boldsymbol{\Sigma}_{(\textbf{A})}$ , where $\boldsymbol{\Sigma}_{(\textbf{A})}$ is a limiting matrix defined in (14).
(ii) If the core index is critical, then
\begin{equation*} \frac1{n\ln^{2\nu_2+1}(n)}\boldsymbol{\Sigma}_n \to \frac1{{b}^{2\nu_2}(2\nu_2+1)(\nu_2!)^2} \sum_{\textrm{Re}(\lambda)={b}/2}(\textbf{N}^*_\lambda)^{\nu_2}\textbf{P}^*_\lambda\textbf{B}\textbf{P}_\lambda \textbf{N}^{\nu_2}_\lambda. \end{equation*}
Before we prove the theorem for the covariance, we provide some additional scaffolding. Using (12), we rewrite (11) as
Thus, the covariance can be computed as
Since $\textbf{Y}_j$ is $\mathbb{F}_{j}$ -measurable and $\mathbb{E}[\textbf{Y}_i\mid\mathbb{F}_j]=\textbf{0}$ when $i>j$ , $\mathbb{E}[\textbf{Y}_i^\top\textbf{Y}_j] = \mathbb{E}[\mathbb{E}[\textbf{Y}_i^\top\mid\mathbb{F}_j]\textbf{Y}_j] = \textbf{0}$ . In similar fashion, we have $\mathbb{E}[\textbf{Y}_i^\top\textbf{Y}_j] = \textbf{0}$ when $i<j$ , and so (16) reduces to
Since the urn is balanced, $\nabla\textbf{X}_n\textbf{1}^\top={b}$ for all $n\ge1$ . Thus, we have
which implies that $\textbf{Y}_n\textbf{P}_{\lambda_1}=\textbf{0}$ . Thus, we can rewrite (17) as
We define $\textbf{T}_{i,n,\lambda,\varrho}:=\textbf{F}_{i,n}^\top\textbf{P}_{\lambda}^\top\mathbb{E}[\textbf{Y}_{i}^\top\textbf{Y}_i]\textbf{P}_\varrho\textbf{F}_{i,n}$ for convenience. When either $\lambda$ or $\varrho$ is equal to $\lambda_1$ , $\textbf{P}_{\lambda}^\top\mathbb{E}[\textbf{Y}_{i}^\top\textbf{Y}_i]\textbf{P}_\varrho = \textbf{0}$ , and so we may drop those cases and rewrite (18) as
From this result we produce three lemmas: the first gives the general asymptotic growth of the covariances, the second demonstrates convergence in $\mathcal{L}_2$ and an asymptotic approximation in $\mathcal{L}_1$ , and the last provides the development of the $\textbf{B}$ matrix described in (13).
Lemma 5. If $\lambda_1$ is a simple eigenvalue then, for $n\ge 2$ ,
Consequently, we have $\mathbb{C}\textrm{ov}[\textbf{X}_n]=\textbf{o}(n^2)$ whenever $\textrm{Re}(\lambda_2) < \lambda_1$ .
Proof. By construction, $\textbf{Y}_n=(({1}/{s})\textbf{Q}_n - ({1}/{\tau_{n-1}})\textbf{X}_{n-1})\textbf{A}$ , and so
We provide detailed results for this in Lemma 7. This result, along with Lemma 2, implies that if $\lambda$ and $ \varrho$ are two eigenvalues of $\textbf{A}$ then, for $1\le i \le n$ ,
If $\textrm{Re}(\lambda)+\textrm{Re}(\varrho) \ge {b}$ , we have
If $\textrm{Re}(\lambda)+\textrm{Re}(\varrho) < {b}$ , choose $\alpha$ such that $({1}/{b})(\textrm{Re}(\lambda)+\textrm{Re}(\varrho)) < \alpha < 1$ . Then, we are guaranteed that
Summing together over i, we get
Summing over all combinations of $\lambda,\varrho\in\sigma(\textbf{A})\backslash\{\lambda_1\}$ , $\sum_{i=1}^{n}\textbf{T}_{i,n,\lambda,\varrho}$ is of highest order when $\lambda=\varrho=\lambda_2$ , and so $\mathbb{C}\textrm{ov}[\textbf{X}_n]=\textbf{o}(n^2)$ when $\textrm{Re}(\lambda_2) < \lambda_1$ .
Lemma 6. As $n\to\infty$ , $({1}/{n})\textbf{X}_n\ \buildrel {\mathcal{L}_2} \over \longrightarrow\ b\textbf{v}_1$ . Furthermore, we have the asymptotic approximation
Proof. From Lemma 5, we have
and $({1}/{n})\boldsymbol{\mu}_n \to b\textbf{v}_1$ by Theorem 2. Thus, $\|({1}/{n})\textbf{X}_n - b\textbf{v}_1\|_{\mathcal{L}_2}^2 \to 0$ . Now, given Lemma 5 and Theorem 2, for each $i \in [k]$ we obtain
So, by Jensen’s inequality, $\mathbb{E}[|X_{n,i} - bnv_{1,i}|] \le \sqrt{\mathbb{E}[(X_{n,i}-bv_{1,i}n)^2]}$ . It follows that $(\textbf{X}_{n} - bn\textbf{v}_1)$ has the asymptotic approximation claimed.
Lemma 7. If $\lambda_1$ is a simple eigenvalue then, as $n\to\infty$ , $\mathbb{E}\big[\textbf{Y}_n^\top\textbf{Y}_n\big] \to \textbf{B} - ({b^2}/{s})\textbf{v}_1^\top\textbf{v}_1$ . Hence, for any eigenvalue $\lambda \ne \lambda_1$ , $\mathbb{E}\big[\textbf{Y}_n^\top\textbf{Y}_n\big]\textbf{P}_\lambda \to \textbf{B}\textbf{P}_\lambda$ .
Proof. Since $\textbf{Y}_n=\nabla\textbf{X}_n - ({1}/{\tau_{n-1}})\textbf{X}_{n-1}\textbf{A}$ and $\textbf{X}_{n}=\textbf{X}_{n-1}+({1}/{s})\textbf{Q}_n\textbf{A}$ ,
Thus, $\mathbb{E}\big[\textbf{Y}^\top_n\textbf{Y}_n \mid \mathbb{F}_{n-1}\big] = \mathbb{E}[(\nabla\textbf{X}_n)^\top(\nabla\textbf{X}_n) \mid \mathbb{F}_{n-1}] - \tau_{n-1}^{-2}\textbf{A}^\top\textbf{X}_{n-1}^\top\textbf{X}_{n-1}\textbf{A}$ .
We tackle the asymptotics of this expression in two parts. First, note that
Also, $\mathbb{E}\big[\textbf{Q}_{n}^\top\textbf{Q}_{n} \mid \mathbb{F}_{n-1}\big] = {\mathcal{Q}}_n$ , with the entries
which follow from the known second moments of the multivariate hypergeometric distribution [Reference Kendall, Stuart and Ord16].
Let $n_{\Lambda}$ be a function that represents the appropriate asymptotic order, given $\Lambda$ and Lemma 6. Rewrite $X_{n-1,i}= b(n-1)v_{1,i}+{O}_{{\mathcal{L}}_1}(n_{\Lambda})$ . Then, when $i=j$ ,
For $i\ne j$ , we get $\mathbb{E}[{\mathcal{Q}}_n[i,j]] \to s(s-1)v_{1,i}v_{1,j}$ as $n\to\infty$ . Putting the two results together, we have $\mathbb{E}[{\mathcal{Q}}_{n}] \to s(s-1)\textbf{v}_1^\top\textbf{v}_1 + s\textbf{diag}(\textbf{v}_1)$ . We take the expectation of (24) to get
Since $\mathbb{C}\textrm{ov}[\textbf{X}_n]=\textbf{o}(n^2)$ , we have
Define the matrix $\textbf{B}$ as provided by (13). Since $\textbf{v}_1\textbf{A}=\lambda_1\textbf{v}_1=b\textbf{v}_1$ , our convergence simplifies to
We complete the proof by noting that when $\lambda \ne \lambda_1$ , $\textbf{v}_1\textbf{P}_\lambda=\textbf{v}_1\textbf{P}_{\lambda_1}\textbf{P}_\lambda = \textbf{0}$ .
With these results, we now prove Theorem 3. We use techniques quite similar to those found in [Reference Janson14], applying them to the multiple drawing scheme.
Proof of Theorem 3(i). Let $\lambda, \varrho \in \sigma(\textbf{A})\backslash \{\lambda_1\}$ . Then, $\textrm{Re}(\lambda)$ and $\textrm{Re}(\varrho)$ are at most $\textrm{Re}(\lambda_2) < b/2$ . We convert the inner sum into an integral and get
For each fixed $x\in(0,1]$ , Lemmas 4 and 7 imply that
Now, choose some $\alpha \in [0,1)$ , such that ${\textrm{Re}(\lambda_2)}/b<{\alpha}/{2}$ . Then (22) can be applied provided that, for some $C<\infty$ ,
which is integrable over (0,1]. Applying Lebesgue dominated convergence to (26) and a change of variables to $x={\textrm{e}}^{-bs}$ , we get
Thus, (19) and the definition (14) give us the result
as desired.
Proof of Theorem 3(ii). We again use (19) and consider $\sum_{i=1}^{n}\textbf{T}_{i,n,\lambda,\varrho}$ for eigenvalues $\lambda,\varrho\in\sigma(\textbf{A})\backslash\{\lambda_1\}$ . By assumption, $\textrm{Re}(\lambda)+\textrm{Re}(\varrho) \le 2\textrm{Re}(\lambda_2) = b$ . If $\textrm{Re}(\lambda)+\textrm{Re}(\varrho) < b$ , then we have $\sum_{i=1}^{n}\textbf{T}_{i,n,\lambda,\varrho}=\textbf{O}(n)$ , based on (23). Thus, we only need to consider cases where $\textrm{Re}(\lambda)=\textrm{Re}(\varrho)=\textrm{Re}(\lambda_2)=b/2$ . Furthermore, we have $\nu_\lambda,\nu_\varrho \le \nu_2$ .
We again transform the sum into an integral, but first separate the term corresponding to $i=1$ from the sum and then use the change of variables $x=n^y = {\textrm{e}}^{y \ln(n)}$ . From these steps, we get
From (21), we know that $\textbf{T}_{1,n,\lambda,\varrho}=\textbf{O}\big(n\ln^{2\nu_2}(n)\big)$ , and so
We fix $y\in (0,1)$ . Then, by Lemma 2,
We reach similar conclusions when using the eigenvalue $\varrho$ .
Since $\textrm{Re}(\lambda)+\textrm{Re}(\varrho) = b$ , we define $\delta = \textrm{Im}(\lambda)+\textrm{Im}(\varrho)$ , and so $\lambda+\varrho = b+\delta\textrm{i}\mkern1mu$ . Then, we have
From Lemma 5, for $y\in (0,1]$ and $n\ge 2$ ,
Thus, the error bound $\textbf{o}(1)$ in (27) is also uniformly bounded, and so we apply Lebesgue dominated convergence to the new integral to obtain
If $\delta=0$ $(\bar{\varrho}={\lambda})$ , then the integral in (28) simplifies to $b^{-2\nu_2}(2\nu_2+1)^{-1}$ . Furthermore, this situation yields $\textbf{P}_\lambda=\textbf{P}_{\bar{\varrho}}=\bar{\textbf{P}}_\varrho$ , and so $\textbf{P}_\lambda^\top=\textbf{P}_\varrho^*$ . In similar fashion, we get $\textbf{N}_\lambda^\top=\textbf{N}_\varrho^*$ , and thus (28) becomes
If $\delta \ne 0$ , then, by setting $u=1-y$ , we have
as $n\to\infty$ , via integration by parts. Hence, when $\lambda \ne \bar{\varrho}$ , we get
As mentioned earlier, we may ignore pairs where $\textrm{Re}(\lambda) < b/2$ or $\textrm{Re}(\varrho) < b/2$ . We have now shown that cases of pairs such that $\textrm{Re}(\lambda)=\textrm{Re}(\varrho)=b/2$ but $\lambda\ne\bar{\varrho}$ are asymptotically negligible as well. Therefore, we get
as desired.
5.3. A martingale multivariate central limit theorem
Theorem 4. Consider an irreducible affine (k,s,b)-urn scheme. As $n\to\infty$ ,
where
Proof. The proof draws from the construction of $\textbf{Y}_n$ . By construction and (6), $\textbf{Y}_i$ is a martingale difference sequence, and thus so is $\textbf{Y}_i\textbf{F}_{i,n}$ . Furthermore, by (15), the sum of our martingale differences leads to $(\textbf{X}_n - \boldsymbol{\mu}_n)$ .
To prove the conditional Lindeberg condition, choose $\varepsilon>0$ and rewrite $\textbf{Y}_i\textbf{F}_{i,n}$ as $\textbf{Y}_i\textbf{F}_{i,n} = (\textbf{X}_i-\boldsymbol{\mu}_i) - (\textbf{X}_{i-1}-\boldsymbol{\mu}_{i-1})$ . Then, we have
since componentwise we have $\textbf{Q}_i \le s\textbf{1}$ . Based on this, there exists a natural number $n_0(\varepsilon)$ such that $\{\|\textbf{Y}_i\textbf{F}_{i,n}\|_{\mathcal{L}_2} > \xi_n\varepsilon\}$ is empty for all $n > n_0(\varepsilon)$ . Thus,
which implies convergence in probability as well. The conditional Lindeberg condition is thus verified.
By Lemma 7, Theorem 3, and based on the construction of ${\mathcal Q}$ , we have
which implies convergence in probability. By the martingale central limit theorem (applied from [Reference Hall and Heyde9, pp. 57–59]),
Applying Slutsky’s theorem [Reference Slutsky28] to $\xi_n^{-{1}/{2}}\textbf{O}\big(1+n^{\textrm{Re}(\lambda_2)/b}\ln^{\nu_2}(n)\big)$ , we get the result:
We conclude this section with a strong law for the composition vector. The details of the proof are identical to those found in [Reference Janson14, p. 19].
Theorem 5. For an affine irreducible (k,s,b)-urn scheme, as $n\to\infty$ , $({1}/{n})\textbf{X}_n \buildrel \textrm{a.s.} \over \longrightarrow \lambda_1\textbf{v}_1$ .
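The strong law (and, empirically, the fluctuations of Theorem 4) can be observed in simulation. Below is a minimal Monte Carlo sketch for a hypothetical small (3,2,8)-urn (not one of the paper's examples); the core is chosen with even entries so that the additions $({1}/{s})\textbf{Q}_n\textbf{A}$ are always integer-valued.

```python
import numpy as np

rng = np.random.default_rng(2024)

# Hypothetical affine (3, 2, 8)-urn: every row of the core sums to b = 8.
A = np.array([[4, 2, 2],
              [2, 4, 2],
              [2, 2, 4]])
s = 2
X = np.array([3, 2, 1])                  # initial composition, tau_0 = 6 >= s

n = 50_000
for _ in range(n):
    Q = rng.multivariate_hypergeometric(X, s)   # sample drawn without replacement
    X = X + (Q @ A) // s                        # additions dictated by the core

lam1, v1 = 8, np.ones(3) / 3             # principal eigenvalue and left eigenvector
print(X / n)                             # close to lam1 * v1 = (8/3, 8/3, 8/3)
```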
6. Comments on large-index urns
As mentioned in [Reference Chauvin, Pouyanne and Sahnoun4, Reference Janson12, Reference Janson14] and elsewhere, $\textbf{X}_n$ possesses an almost-sure expansion based on the spectral decomposition of $\textbf{A}$ . Unfortunately, deriving the covariance for large-index urns becomes trickier, as it may depend on the initial condition of the urn.
However, by using the recursion (4), we have a recursive relationship for the covariance that may be used regardless of core index size.
Corollary 2. For a linear k-color urn model with unordered multiple drawings, the covariance matrix $\boldsymbol{\Sigma}_n = \mathbb{E}[(\textbf{X}_n-\boldsymbol{\mu}_{n})^\top(\textbf{X}_n-\boldsymbol{\mu}_{n})]$ satisfies the following recurrence relation:
Proof. We first derive a recursion for $\mathbb{E}[\textbf{X}_n^\top\textbf{X}_n \mid \mathbb{F}_{n-1}]$ and then compute $\boldsymbol{\Sigma}_n = \mathbb{E}[\textbf{X}_n^\top\textbf{X}_n]-\boldsymbol{\mu}_{n}^\top\boldsymbol{\mu}_{n}$ .
To begin, consider $\mathbb{E}[\textbf{X}_n^\top\textbf{X}_n \mid \mathbb{F}_{n-1}]$ . Using (4), we have
We then substitute (25) into this expression to obtain
From here, we take the expected value of both sides and use $\mathbb{E}\big[\textbf{X}_j^\top\textbf{X}_j\big]=\boldsymbol{\Sigma}_j+\boldsymbol{\mu}_{j}^\top\boldsymbol{\mu}_{j}$ as well as the relationship $\boldsymbol{\mu}_j = \boldsymbol{\mu}_{j-1}(\textbf{I} + ({1}/{\tau_{j-1}})\textbf{A})$ to get
7. Growth under sampling with replacement
If the sample is drawn with replacement, we get results that are very similar to the case of sampling without replacement, with very similar proof techniques. So, we only point out the salient points of difference and very succinctly describe the main result.
To create a notational distinction between the without-replacement and the with-replacement sampling schemes, we use tilded variables to refer to the counterparts in the without-replacement schemes. Letting $\tilde{\textbf{Q}}_n$ be the sample drawn with replacement at step n, we get a multinomial distribution with conditional probability
The composition vector in this case has a mean value identical to that in the case of sampling without replacement. The covariance matrix develops slightly differently from that of sampling without replacement, but it remains of the same order as described in (20) and can be obtained via Corollary 2, with
Furthermore, for small- and critical-index urns, a central limit theorem follows identically to Theorem 4.
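The only change in the sampling step is the conditional distribution of the drawn vector: multivariate hypergeometric without replacement versus multinomial (with cell probabilities $X_{n-1,i}/\tau_{n-1}$ ) with replacement. A minimal sketch of both one-step draws (helper names are ours, for illustration only):

```python
import numpy as np

rng = np.random.default_rng(7)

def draw_without_replacement(X, s):
    """Q_n: multivariate hypergeometric sample of size s from composition X."""
    return rng.multivariate_hypergeometric(np.asarray(X, dtype=int), s)

def draw_with_replacement(X, s):
    """Q~_n: multinomial sample of size s with probabilities X_i / tau."""
    X = np.asarray(X, dtype=float)
    return rng.multinomial(s, X / X.sum())

X = np.array([4, 3, 5])
print(draw_without_replacement(X, 2), draw_with_replacement(X, 2))
```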
8. Examples
We give four examples of (k, s, b)-urns. The first two are small urns, one with a diagonalizable core and one with a nondiagonalizable core. We work out all the details in Example 4, and give a sketchier picture in the rest. Example 5 provides a useful application. Example 6 focuses on a critical urn, and we finish with a brief mention of the behavior of a large-index urn in Example 7.
Example 4. (Small diagonalizable core.) Consider an affine urn scheme with $s=2$ draws per sample and irreducible core
This is a (3,2,16)-urn. Theorem 1 completes the replacement matrix to
Suppose the urn starts in $\textbf{X}_0= (4,3,5)$ . With $n=2000$ , the computation in Corollary 1 gives
The eigenvalues of $\textbf{A}$ are $\lambda_1=16$ , $\lambda_2=1+\sqrt{5}$ , and $\lambda_3=1-\sqrt{5}$ . With $\lambda_2 < \lambda_1/2$ , this is a small-index case. The principal left eigenvector is $\textbf{v}_1 = \big(\frac{13}{55}, \frac{19}{55}, \frac{23}{55}\big)$ . As $n\to\infty$ , we have
We apply Theorem 3 to obtain $\boldsymbol{\Sigma}_\infty$ . We note that $\textbf{A}$ is diagonalizable, and so we may write $\textbf{A}=\textbf{T}\textbf{diag}(\lambda_1,\lambda_2,\lambda_3)\textbf{T}^{-1}$ , where
Set $\textbf{P}_{\lambda_1}=\textbf{1}^\top\textbf{v}_1$ , and take $\textbf{B}$ as in (13). Let ${\textrm{e}}^{s\textbf{A}}=\textbf{T}\textbf{diag}({\textrm{e}}^{16s},{\textrm{e}}^{(1+\sqrt{5})s},{\textrm{e}}^{(1-\sqrt{5})s})\textbf{T}^{-1}$ . Then,
We apply Theorem 4, and conclude that
Example 5. (Small nondiagonalizable core.) A new series of weekly art events is about to begin in the city. You and three of your friends decide to go together to the kick-off ceremony, but for the first official event, one friend cannot make it. To keep the group size at four, a new person is invited to join the group for the next event. For each subsequent event this process repeats: three of the prior attendees group together with a new person to attend. Former attendees may return to each new event, and the group can be formed from individuals who are linked by an outside acquaintance (so, the group does not necessarily need to include someone who knows all attending members).
To estimate, over a long run, the number of individuals from the collective who have attended exactly one, two, three, or at least four of these events, we may apply an affine urn model. Note that we begin with four individuals who have attended one event (the kick-off), and with each new art event we sample three former attendees to return along with a new member. We always add one person to the group who has attended only one event (the new member), and if a former member is selected to attend the new event, they move from the list of those who have attended i events to the list of those who have attended $(i+1)$ events, unless they have attended at least four events, in which case they remain in that category. Thus, the affine model with $s=3$ drawings has an initial state of $\textbf{X}_0=(4,0,0,0)$ and the core matrix
Here, we have an irreducible affine (4,3,1)-urn. The eigenvalues of $\textbf{A}$ are $\lambda_1=1$ and $\lambda_\ell=-3$ for $\ell\ge2$ . With $\lambda_2 < \lambda_1 / 2$ , this is a small urn. The leading left eigenvector is $\textbf{v}_1=\big(\frac{1}{4},\frac{3}{16},\frac{9}{64},\frac{27}{64}\big)$ . Since $\textbf{A}$ is 1-balanced, we have $\boldsymbol{\mu}_{\infty}=\big(\frac{1}{4},\frac{3}{16},\frac{9}{64},\frac{27}{64}\big)$ .
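A short numerical check can recover the eigenstructure quoted here. The core used below is our own reading of the verbal rules above (an assumption, since the displayed matrix is not reproduced in this sketch): a monochromatic sample of three attendees in category i moves them to category $i+1$ (capped at ''four or more''), and one brand-new attendee (category 1) is always added.

```python
import numpy as np

# Assumed core consistent with the rules of Example 5; every row sums to b = 1.
A = np.array([[-2, 3, 0, 0],
              [ 1,-3, 3, 0],
              [ 1, 0,-3, 3],
              [ 1, 0, 0, 0]])

print(np.round(np.sort(np.linalg.eigvals(A).real)[::-1], 6))  # [ 1. -3. -3. -3.]

v1 = np.array([1/4, 3/16, 9/64, 27/64])
print(np.allclose(v1 @ A, v1))            # True: principal left eigenvector

# The eigenspace for -3 is one-dimensional while its algebraic multiplicity
# is 3, so this core is not diagonalizable (one Jordan block of size 3).
print(4 - np.linalg.matrix_rank(A + 3 * np.eye(4)))           # 1
```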
Note that $\textbf{A}$ is not diagonalizable, so the analysis requires the Jordan decomposition $\textbf{A}=\textbf{T}\boldsymbol{J}\textbf{T}^{-1}$ , where
Again, we set $\textbf{P}_{\lambda_1}=\textbf{1}_k^\top\textbf{v}_1$ and $\textbf{B}$ as in (13), and let
We apply Theorem 3 to get the limiting matrix $\boldsymbol{\Sigma}_\infty$ . We apply Theorem 4 to conclude that
Example 6. (Critical core.) Consider an affine urn model with $s=2$ and replacement matrix with the core
The eigenvalues of $\textbf{A}$ are $\lambda_1=6$ , $\lambda_2=3+\textrm{i}\mkern1mu\sqrt{3}$ , and $\lambda_3=3-\textrm{i}\mkern1mu\sqrt{3}$ . Here, the core index is $\frac 1 2$ , and thus this urn is critical. The principal left eigenvector is $\textbf{v}_1=\frac{1}{3}\textbf{1}$ . For large n, we have $({1}/{n})\mathbb{E}[\textbf{X}_n] \to 6\textbf{v}_1 = (2, 2, 2)$ .
Note that $\textbf{A}$ is diagonalizable, and so $\nu_2=0$ and we have
By Theorem 3, we get $({1}/({n\ln(n)}))\boldsymbol{\Sigma}_n \to \boldsymbol{\Sigma}_{\infty} = \textbf{P}^*_{\lambda_2}\textbf{B}\textbf{P}_{\lambda_2}+\textbf{P}^*_{\lambda_3}\textbf{B}\textbf{P}_{\lambda_3}$ . Applying Theorem 4, we conclude that
Example 7. (Large core.) Consider an affine urn scheme with $s=3$ draws per sample and irreducible core
This is a (3,3,12)-urn. Suppose the urn starts in $\textbf{X}_0= (3,2,2)$ . With $n=2000$ , the computation in Corollary 1 gives $({1}/{2000})\mathbb{E}[\textbf{X}_{2000}] \approx {(3.992, 4.062, 3.947)}$ .
The eigenvalues of $\textbf{A}$ here are $\lambda_1=12$ , $\lambda_2=7.5+3\textrm{i}\mkern1mu\sqrt{\frac{3}{4}}$ , and $\lambda_3=7.5-3\textrm{i}\mkern1mu\sqrt{\frac{3}{4}}$ . With $\textrm{Re}\;\lambda_2 > \lambda_1/2$ , this is a large-index case. The principal left eigenvector is $\textbf{v}_1 =\frac{1}{3} \textbf{1}$ . As $n \to \infty$ , we have $({1}/{n})\mathbb{E}[\textbf{X}_n] \to \boldsymbol{\mu}_\infty = 12\textbf{v}_1 = (4, 4, 4)$ .
Acknowledgements
We wish to thank the referee for their thorough examination of the manuscript, which led to a richer analysis on the connection between urn irreducibility and matrix-theory irreducibility.
Funding information
There are no funding bodies to thank relating to the creation of this article.
Competing interests
There were no competing interests to declare which arose during the preparation or publication process of this article.