
Self-normalized Cramér moderate deviations for a supercritical Galton–Watson process

Published online by Cambridge University Press:  24 April 2023

Xiequan Fan*
Affiliation:
Northeastern University at Qinhuangdao
Qi-Man Shao*
Affiliation:
Southern University of Science and Technology
*Postal address: School of Mathematics and Statistics, Northeastern University at Qinhuangdao, Qinhuangdao, 066004, Hebei, China. Email address: fanxiequan@hotmail.com
**Postal address: Department of Statistics and Data Science, SICM, National Center for Applied Mathematics Shenzhen, Southern University of Science and Technology, Shenzhen 518055, Guangdong, China. Email address: shaoqm@sustech.edu.cn

Abstract

Let $(Z_n)_{n\geq0}$ be a supercritical Galton–Watson process. Consider the Lotka–Nagaev estimator for the offspring mean. In this paper we establish self-normalized Cramér-type moderate deviations and Berry–Esseen bounds for the Lotka–Nagaev estimator. The results are believed to be optimal or near-optimal.

Type
Original Article
Copyright
© The Author(s), 2023. Published by Cambridge University Press on behalf of Applied Probability Trust

1. Introduction

A Galton–Watson process can be described as follows:

\begin{equation*} Z_0=1,\quad Z_{n+1}= \sum_{i=1}^{Z_n} X_{n,i} \quad \text{for $ n \geq 0$,} \end{equation*}

where $X_{n,i}$ is the offspring number of the ith individual of the generation n. Moreover, the random variables $ (X_{n,i})_{i\geq 1} $ are independent of each other with common distribution law

\begin{equation*} \mathbb{P}(X_{n,i} =k ) = p_k,\quad k \in \mathbb{N},\end{equation*}

and are also independent of $Z_n$ .
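For intuition, the branching recursion above is easy to simulate; a minimal Python sketch (the two-point offspring law with $p_1=p_3=\tfrac12$, hence $p_0=0$ and $m=2$, is our illustrative assumption, not from the paper):

```python
import random

def simulate_gw(n_gen, offspring, z0=1, seed=0):
    """Simulate Z_0, ..., Z_{n_gen} of a Galton--Watson process.

    `offspring(rng)` draws one offspring number X_{n,i}; the draws are
    i.i.d. across individuals and generations, as in the definition above.
    """
    rng = random.Random(seed)
    z = [z0]
    for _ in range(n_gen):
        # Z_{n+1} is the sum of Z_n independent offspring numbers.
        z.append(sum(offspring(rng) for _ in range(z[-1])))
    return z

# Illustrative offspring law with p_0 = 0: X is 1 or 3 with probability
# 1/2 each, so m = 2 (supercritical) and v^2 = 1.
trajectory = simulate_gw(10, lambda rng: rng.choice([1, 3]))
```

Since every individual has at least one offspring here, the simulated population never dies out and is nondecreasing.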

An important task in statistical inference for Galton–Watson processes is to estimate the average offspring number m of an individual, usually termed the offspring mean. Clearly we have

\[ m=\mathbb{E}Z_1=\mathbb{E} X_{n,i} =\sum_{k=0}^\infty k p_k. \]

Let $v^2$ denote the variance of $Z_1$, that is,

\begin{equation*} v^2=\mathbb{E} (Z_1-m)^2.\end{equation*}

To avoid triviality, assume that $v> 0$ . For estimation of the offspring mean m, the Lotka–Nagaev [Reference Lotka12, Reference Nagaev14] estimator $Z_{n+1}/Z_{n}$ plays an important role. Throughout the paper we assume that

\[ p_0=0. \]

Then the Lotka–Nagaev estimator is well-defined $\mathbb{P}$ -a.s. For the Galton–Watson processes, Athreya [Reference Athreya1] has established large deviations for the normalized Lotka–Nagaev estimator (see also Chu [Reference Chu3] for self-normalized large deviations); Ney and Vidyashankar [Reference Ney and Vidyashankar15, Reference Ney and Vidyashankar16] and He [Reference He9] obtained sharp rate estimates for the large deviation behavior of the Lotka–Nagaev estimator; Maaouia and Touati [Reference Maaouia and Touati13] established a self-normalized central limit theorem (CLT) for the maximum likelihood estimator of m; Bercu and Touati [Reference Bercu and Touati2] proved an exponential inequality for the Lotka–Nagaev estimator via self-normalized martingale methods. Alternative approaches for obtaining self-normalized exponential inequalities can be found in de la Peña, Lai, and Shao [Reference De la Peña, Lai and Shao4]. Despite the fact that the Lotka–Nagaev estimator is well studied, there is no result for self-normalized Cramér moderate deviations for the Lotka–Nagaev estimator. The main purpose of this paper is to fill this gap.

Let us briefly introduce our main result. Assume that $n_0, n \in \mathbb{N}$ . Notice that, by the classical CLT for independent and identically distributed (i.i.d.) random variables,

\[\textbf{X}_{n_0, n }=\biggl( \sqrt{Z_{n_0}} \biggl( \dfrac{Z_{n_0+1}}{Z_{ n_0}} -m \biggr),\ldots , \sqrt{Z_{ n_0+n-1}} \biggl( \dfrac{Z_{ n_0+n}}{Z_{n_0+n-1}} -m \biggr) \biggr)\]

asymptotically behaves like a vector of i.i.d. Gaussian random variables with mean 0 and variance $v^2$ (even if $n_0$ depends on n), and the rate of convergence to the Gaussian distribution is exponential; see Kuelbs and Vidyashankar [Reference Kuelbs and Vidyashankar11]. Because

\[\dfrac1n \sum_{k=n_0}^{n_0+n-1} Z_{ k} \biggl( \dfrac{Z_{ k+1}}{Z_{ k}} - m \biggr)^2\]

is an estimator of the offspring variance $v^2$ , it is natural to compare the self-normalized sum

\[M_{n_0,n}\,{:\!=}\, \dfrac{ {(n v^2)^{-1/2}} \sum_{k=n_0}^{n_0+n-1} \sqrt{Z_{ k}} ( {{Z_{ k+1}}/{Z_{ k}}} -m )}{\sqrt{ {(n v^2)^{-1}} \sum_{k=n_0}^{n_0+n-1} Z_{ k} ( {{Z_{ k+1}}/{Z_{ k}}} - m )^2 } }=\dfrac{ \sum_{k=n_0}^{n_0+n-1} \sqrt{Z_{ k}} ( {{Z_{ k+1}}/{Z_{ k}}} -m )}{\sqrt{\sum_{k=n_0}^{n_0+n-1} Z_{ k} ( {{Z_{ k+1}}/{Z_{ k}}} - m )^2 } }\]

to the tail of the Gaussian distribution. This is the main purpose of the paper. Assume that $ \mathbb{E} Z_1 ^{2+\rho}< \infty$ for some $ \rho \in (0, 1]$ . We prove the following self-normalized Cramér moderate deviations for the Lotka–Nagaev estimator. It holds that

(1) \begin{equation} \mathbb{P} (\!\pm \, M_{n_0,n}\geq x ) = (1-\Phi(x))(1+{\textrm{o}}(1) )\end{equation}

uniformly for $x \in [0, \, {\textrm{o}}( n^{\rho/(4+2\rho)} ))$ as $n\rightarrow \infty$ ; see Theorem 2.1. This type of result is user-friendly in the statistical inference of m, since in practice we usually do not know the variance $v^2$ or the distribution of $Z_1$ . Let $\kappa_n \in (0, 1)$ . Assume that

\begin{equation*} |\! \ln \kappa_n | ={\textrm{o}}\bigl( n^{\rho/(2+\rho)} \bigr), \quad n\rightarrow \infty .\end{equation*}

From (1) we can easily obtain a $1-\kappa_n$ confidence interval for m, for n large enough. Clearly, the right-hand side of (1) and $M_{n_0,n}$ do not depend on $v^2$ , so the confidence interval of m does not depend on $v^2$ ; see Proposition 3.1. Due to these significant advantages, the limit theory for self-normalized processes is attracting more and more attention. We refer to Jing, Shao, and Wang [Reference Jing, Shao and Wang10] and Fan et al. [Reference Fan, Grama, Liu and Shao8] for closely related results.

The paper is organized as follows. In Section 2 we present Cramér moderate deviations for the self-normalized Lotka–Nagaev estimator, provided that $(Z_n)_{n\geq0}$ can be observed. In Section 3 we present some applications of our results in statistics. The remaining sections are devoted to the proofs of theorems.

2. Main results

Assume that the total populations $(Z_{k})_{k\geq 0}$ of all generations can be observed. For $n_0, n \in \mathbb{N}$ , recall the definition of $M_{n_0,n}$ :

\begin{equation*}M_{n_0,n}= \dfrac{ \sum_{k=n_0}^{n_0+n-1} \sqrt{Z_{ k}} ( {{Z_{ k+1}}/{Z_{ k}}} -m )}{\sqrt{\sum_{k=n_0}^{n_0+n-1} Z_{ k} ( {{Z_{ k+1}}/{Z_{ k}}} - m )^2 } }.\end{equation*}

Here $n_0$ may depend on n; in particular, we may take $n_0=0$. However, in real-world applications it may happen that we know the historical data $(Z_{k})_{ n_0 \leq k \leq n_0+n}$ for some $n_0\geq2$, as well as the increment n of generation numbers, but do not know the data $(Z_{k})_{ 0 \leq k \leq n_0-1}$. In such a case $M_{0,n}$ is no longer applicable for estimating m, whereas $M_{n_0,n}$ is. Motivated by this problem, we consider the general case $n_0\geq0$ instead of only $n_0=0$. As $(Z_k)_{k=n_0,\ldots,n_0+n}$ can be observed, $M_{n_0,n}$ can be regarded as a time-type self-normalized process for the Lotka–Nagaev estimator $Z_{k+1}/Z_{k}$. The following theorem gives a self-normalized Cramér moderate deviation result for Galton–Watson processes.
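As an illustration, $M_{n_0,n}$ can be computed directly from an observed stretch $(Z_{n_0},\ldots,Z_{n_0+n})$; a minimal Python sketch, assuming the offspring mean m is given (the function name is ours):

```python
import math

def self_normalized_M(z, m):
    """M_{n0,n} from z = [Z_{n0}, ..., Z_{n0+n}] and the offspring mean m.

    The offspring variance v^2 cancels in the ratio, so it is not needed.
    """
    terms = [(zk, z1 / zk - m) for zk, z1 in zip(z, z[1:])]
    num = sum(math.sqrt(zk) * d for zk, d in terms)
    den = math.sqrt(sum(zk * d * d for zk, d in terms))
    return num / den  # undefined if every observed ratio equals m exactly

# Toy trajectory with ratios 2 and 3, taking m = 2.
print(self_normalized_M([1, 2, 6], 2))  # → 1.0
```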

Theorem 2.1. Assume that $ \mathbb{E} Z_1 ^{2+\rho}< \infty$ for some $ \rho \in (0, 1]$ .

  1. (i) If $\rho \in (0, 1)$ , then for all $x \in [0,\, {\textrm{o}}( \sqrt{n} ))$ ,

    (2) \begin{equation}\biggl|\ln\dfrac{\mathbb{P}(M_{n_0,n} \geq x)}{1-\Phi(x)} \biggr| \leq C_\rho \biggl( \dfrac{ x^{2+\rho} }{n^{\rho/2}} + \dfrac{ (1+x)^{1-\rho(2+\rho)/4} }{n^{\rho(2-\rho)/8}} \biggr) ,\end{equation}
    where $C_\rho$ depends only on the constants $\rho, v$ and $ \mathbb{E} Z_1 ^{2+\rho}$ .
  2. (ii) If $\rho =1$ , then for all $x \in [0,\, {\textrm{o}}( \sqrt{n} ))$ ,

    (3) \begin{equation}\biggl|\ln\dfrac{\mathbb{P}(M_{n_0,n} \geq x)}{1-\Phi(x)} \biggr| \leq C \biggl( \dfrac{ x^{3} }{\sqrt{n} } + \dfrac{ \ln n }{\sqrt{n} } + \dfrac{ (1+x)^{1/4} }{n^{1/8}} \biggr) ,\end{equation}
    where C depends only on the constants v and $ \mathbb{E} Z_1 ^{3}$ .

In particular, inequalities (2) and (3) together imply that

(4) \begin{equation}\dfrac{\mathbb{P}(M_{n_0,n} \geq x)}{1-\Phi(x)} =1+{\textrm{o}}(1)\end{equation}

uniformly for $ x \in [0, \, {\textrm{o}}(n^{\rho/(4+2\rho)}))$ as $n\rightarrow \infty$ . Moreover, the same inequalities remain valid when

\[ \frac{\mathbb{P}( M_{n_0,n} \geq x)}{1-\Phi(x)} \]

is replaced by

\[ \frac{\mathbb{P}(M_{n_0,n} \leq -x)}{ \Phi (\!-x)}. \]

Notice that a standard normal random variable has mean 0. Since $M_{n_0,n}$ is asymptotically standard normal, it is natural to set $M_{n_0,n}=0$; then we have

\[\sum_{k=n_0}^{n_0+n-1} \sqrt{Z_{ k}} \biggl( \frac{Z_{ k+1}}{Z_{ k}} -m \biggr)=0,\]

which implies that

\[\overline{m}_n \,{:\!=}\, \dfrac{1}{\sum_{k=n_0}^{n_0+n-1} \sqrt{Z_{ k}} } \sum_{k=n_0}^{n_0+n-1} \sqrt{Z_{ k}} \biggl( \dfrac{Z_{ k+1}}{Z_{ k}} \biggr)\]

can be regarded as a random weighted Lotka–Nagaev estimator for m.
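A sketch of this weighted estimator in Python (`z` holds the observed populations $Z_{n_0},\ldots,Z_{n_0+n}$; the function name is ours):

```python
import math

def weighted_lotka_nagaev(z):
    """Random weighted Lotka--Nagaev estimator \\bar m_n from
    z = [Z_{n0}, ..., Z_{n0+n}]: a sqrt(Z_k)-weighted average of the
    successive ratios Z_{k+1}/Z_k."""
    weights = [math.sqrt(zk) for zk in z[:-1]]
    ratios = [z1 / zk for zk, z1 in zip(z, z[1:])]
    return sum(w * r for w, r in zip(weights, ratios)) / sum(weights)
```

For the toy trajectory $[1, 2, 6]$ the estimate is $(2+3\sqrt{2})/(1+\sqrt{2})=4-\sqrt{2}\approx 2.586$, a weighted compromise between the observed ratios 2 and 3.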

Equality (4) implies that $\mathbb{P}(M_{n_0,n} \leq x) \rightarrow \Phi(x)$ as n tends to $\infty$. Thus Theorem 2.1 implies a central limit theorem for $M_{n_0,n}$. Moreover, equality (4) states that the relative error of the normal approximation for $M_{n_0,n}$ tends to zero uniformly for $ x \in [0, \, {\textrm{o}}(n^{\rho/(4+2\rho)})) $ as $n\rightarrow \infty$.

Theorem 2.1 implies the following moderate deviation principle (MDP) result for the time-type self-normalized Lotka–Nagaev estimator.

Corollary 2.1. Assume the conditions of Theorem 2.1. Let $(a_n)_{n\geq1}$ be any sequence of real numbers satisfying $a_n \rightarrow \infty$ and $a_n/ \sqrt{n} \rightarrow 0$ as $n\rightarrow \infty$ . Then, for each Borel set B,

\begin{equation*}- \inf_{x \in B^o}\dfrac{x^2}{2} \leq \liminf_{n\rightarrow \infty}\dfrac{1}{a_n^2}\ln \mathbb{P}\biggl( \dfrac{M_{n_0,n} }{ a_n } \in B \biggr) \leq \limsup_{n\rightarrow \infty}\dfrac{1}{a_n^2}\ln \mathbb{P}\biggl(\dfrac{ M_{n_0,n} }{ a_n } \in B \biggr) \leq - \inf_{x \in \overline{B}}\dfrac{x^2}{2} , \end{equation*}

where $B^o$ and $\overline{B}$ denote the interior and the closure of B, respectively.

Remark 2.1. From (2) and (3), it is easy to derive the following Berry–Esseen bound for the self-normalized Lotka–Nagaev estimator:

\begin{equation*}\sup_{x \in \mathbb{R} }|\mathbb{P}(M_{n_0,n} \leq x) - \Phi(x) | \leq \dfrac{ C_\rho }{n^{\rho(2-\rho)/8}},\end{equation*}

where $C_\rho$ depends only on the constants $\rho, v$ and $ \mathbb{E} Z_1 ^{2+\rho}$ . When $\rho> 1$ , by the self-normalized Berry–Esseen bound for martingales in Fan and Shao [Reference Fan and Shao6], we can get a Berry–Esseen bound of order $n^{- \rho/(6+2\rho) }$ .

The last remark gives a self-normalized Berry–Esseen bound for the Lotka–Nagaev estimator, while the next theorem presents a normalized Berry–Esseen bound for the Lotka–Nagaev estimator. Denote

\[H_{n_0, n}= \dfrac{1}{ \sqrt{n } v }\sum_{k=n_0}^{n_0+n-1} \sqrt{Z_{ k}} \biggl( \dfrac{Z_{k+1}}{Z_{k}} -m \biggr).\]

Notice that the random variables $(X_{k,i})_{1\leq i\leq Z_k}$ have the same distribution as $Z_1$ , and that $(X_{k,i})_{1\leq i\leq Z_k}$ are independent of $Z_k$ . Then, for the Galton–Watson processes, it holds that

\[\mathbb{E} [ (Z_{k+1} -m Z_{k})^2 \mid Z_k] = \mathbb{E} \Biggl[ \Biggl( \sum_{i=1}^{Z_k} (X_{k,i} -m) \Biggr)^2 \biggm| Z_k \Biggr] = Z_{k} v^2.\]

It is easy to see that

\[H_{n_0, n}= \sum_{k=n_0}^{n_0+n-1} \frac{1}{ \sqrt{\, n v^2 / Z_{ k} }} \biggl( \frac{Z_{k+1}}{Z_{k}} -m \biggr) .\]

Thus $H_{n_0,n}$ can be regarded as a normalized process for the Lotka–Nagaev estimator $Z_{k+1}/Z_{k}$ . We have the following normalized Berry–Esseen bounds for the Galton–Watson processes.

Theorem 2.2. Assume the conditions of Theorem 2.1.

  1. (i) If $\rho \in (0, 1)$ , then

    (5) \begin{equation} \sup_{x \in \mathbb{R}}|\mathbb{P}( H_{n_0,n} \leq x) - \Phi(x) | \leq \dfrac{ C_\rho }{ n^{\rho/2}},\end{equation}
    where $C_\rho$ depends only on $\rho, v$ and $ \mathbb{E} Z_1 ^{2+\rho}$ .
  2. (ii) If $\rho =1$ , then

    (6) \begin{equation}\sup_{x \in \mathbb{R}}|\mathbb{P}( H_{n_0,n} \leq x) - \Phi(x) | \leq C \dfrac{ \ln n }{ \sqrt{n} },\end{equation}
    where C depends only on v and $ \mathbb{E} Z_1 ^{3}$ .

Moreover, the same inequalities remain valid when $H_{n_0,n}$ is replaced by $-H_{n_0,n}$ .

The convergence rates of (5) and (6) are identical to the best possible convergence rates of the Berry–Esseen bounds for martingales; see Theorem 2.1 of Fan [Reference Fan5] and the associated comment. Notice that $H_{n_0,n}$ is a martingale with respect to the natural filtration.
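When m and v are known, the statistic $H_{n_0,n}$ defined above is straightforward to evaluate; a minimal Python sketch (the function name is ours):

```python
import math

def normalized_H(z, m, v):
    """H_{n0,n} from z = [Z_{n0}, ..., Z_{n0+n}], with the offspring
    mean m and standard deviation v assumed known."""
    n = len(z) - 1
    s = sum(math.sqrt(zk) * (z1 / zk - m) for zk, z1 in zip(z, z[1:]))
    return s / (math.sqrt(n) * v)
```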

3. Applications

Cramér moderate deviations certainly have many applications in statistics.

3.1. p-value for hypothesis testing

Self-normalized Cramér moderate deviations can be applied to hypothesis testing on m for Galton–Watson processes. When $(Z_{k})_{k=n_0,\ldots,n_0+n}$ can be observed, we can use Theorem 2.1 to estimate the p-value. Assume that $ \mathbb{E} Z_1 ^{2+\rho}< \infty$ for some $0 < \rho \leq 1$, and that $m> 1$. Let $(z_{k})_{k=n_0,\ldots,n_0+n}$ be the observed values of $(Z_{k})_{k=n_0,\ldots,n_0+n}$. To estimate the offspring mean m, we can use the Harris estimator [Reference Bercu and Touati2] given by

\[\widehat{m}_n=\dfrac{\sum_{k=n_0}^{n_0+n-1} Z_{ k+1}}{\sum_{k=n_0}^{n_0+n-1}Z_{ k}}.\]

Then the observed value of the Harris estimator is

\[\widehat{m}_n=\dfrac{\sum_{k=n_0}^{n_0+n-1} z_{ k+1}}{\sum_{k=n_0}^{n_0+n-1}z_{ k}}.\]

By Theorem 2.1, it is easy to see that

(7) \begin{equation} \dfrac{\mathbb{P}( M_{n_0,n} \geq x)}{1-\Phi(x)}=1+{\textrm{o}}(1)\quad \textrm{and} \quad \dfrac{\mathbb{P}(M_{n_0,n} \leq- x)}{1-\Phi(x)}=1+{\textrm{o}}(1)\end{equation}

uniformly for $ x \in [0, {\textrm{o}}( n^{\rho/(4+2\rho)} ))$. Notice that $ 1-\Phi(x) = \Phi (\!-x)$. Thus, when $|\widetilde{m}_n|={\textrm{o}}( n^{\rho/(4+2\rho)} )$, by (7), the probability $\mathbb{P}(|M_{n_0,n}| > |\widetilde{m}_n|)$ is almost equal to $2 \Phi (\!-|\widetilde{m}_n|) $, where

\[ \widetilde{m}_n= \dfrac{ \sum_{k=n_0}^{n_0+n-1} \sqrt{z_{ k}} ( z_{ k+1}/z_{ k} -\widehat{m}_n )}{\sqrt{\sum_{k=n_0}^{n_0+n-1} z_{ k} ( z_{ k+1}/z_{ k} - \widehat{m}_n )^2 } }. \]
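The p-value computation described above can be sketched in Python using only the standard library (`NormalDist` supplies $\Phi$; the function name is ours):

```python
import math
from statistics import NormalDist

def p_value(z):
    """Approximate two-sided p-value from observed counts
    z = [z_{n0}, ..., z_{n0+n}], plugging the Harris estimator into
    the self-normalized statistic, as in Section 3.1."""
    m_hat = sum(z[1:]) / sum(z[:-1])          # Harris estimator
    terms = [(zk, z1 / zk - m_hat) for zk, z1 in zip(z, z[1:])]
    num = sum(math.sqrt(zk) * d for zk, d in terms)
    den = math.sqrt(sum(zk * d * d for zk, d in terms))
    m_tilde = num / den
    # By (7), P(|M| > |m_tilde|) is approximately 2 * Phi(-|m_tilde|).
    return 2 * NormalDist().cdf(-abs(m_tilde))
```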

3.2. Construction of confidence intervals

Assume the data $(Z_{k})_{k\geq 0}$ can be observed. Cramér moderate deviations can also be applied to the construction of confidence intervals of m. We use Theorem 2.1 to construct confidence intervals.

Proposition 3.1. Assume that $\mathbb{E} Z_1 ^{2+\rho}< \infty$ for some $ \rho \in (0, 1]$ . Let $\kappa_n \in (0, 1)$ . Assume that

(8) \begin{equation} | \!\ln \kappa_n | ={\textrm{o}} \bigl( n^{\rho/(2+\rho)} \bigr) .\end{equation}

Let

\begin{align*}a_{n_0, n}&= \Biggl( \sum_{k=n_0}^{n_0+n-1} \sqrt{Z_{k}} \Biggr)^2- (\Phi^{-1}(1-\kappa_n /2) )^2 \sum_{k=n_0}^{n_0+n-1} Z_{k}, \\b_{n_0, n}&= 2 (\Phi^{-1}(1-\kappa_n /2) )^2 \sum_{k=n_0}^{n_0+n-1} Z_{k+1} - 2 \Biggl( \sum_{k=n_0}^{n_0+n-1} \dfrac{Z_{k+1}}{\sqrt{Z_{k }}} \Biggr)\Biggl( \sum_{k=n_0}^{n_0+n-1} \sqrt{Z_{k}} \Biggr), \\c_{n_0, n}&= \Biggl( \sum_{k=n_0}^{n_0+n-1} \dfrac{Z_{k+1}}{\sqrt{Z_{k }}} \Biggr)^2 - (\Phi^{-1}(1-\kappa_n /2) )^2 \sum_{k=n_0}^{n_0+n-1} \dfrac{Z_{k+1}^2}{Z_{k}}.\end{align*}

Then $[A_{n_0, n},B_{n_0, n}]$ , with

\begin{equation*}A_{n_0, n}=\dfrac{- b_{n_0, n} - \sqrt{b_{n_0, n}^2-4 a_{n_0, n}c_{n_0, n} }}{ 2 a_{n_0, n} }\end{equation*}

and

\begin{equation*}B_{n_0, n}=\dfrac{ -b_{n_0, n} + \sqrt{b_{n_0, n}^2-4 a_{n_0, n}c_{n_0, n} }}{ 2 a_{n_0, n} },\end{equation*}

is a $1-\kappa_n$ confidence interval for m, for n large enough.

Proof. Notice that $ 1-\Phi(x) = \Phi (\!-x). $ Theorem 2.1 implies that

(9) \begin{equation} \dfrac{\mathbb{P}( M_{n_0,n} \geq x)}{1-\Phi(x)}=1+{\textrm{o}}(1)\quad \textrm{and} \quad \dfrac{\mathbb{P}(M_{n_0,n} \leq- x)}{1-\Phi(x)}=1+{\textrm{o}}(1)\end{equation}

uniformly for $0\leq x={\textrm{o}}( n^ {\rho/(4+2\rho)} )$; see (4). Notice that the inverse function $\Phi^{-1}$ of the standard normal distribution function $\Phi$ has the following asymptotic expansion:

\[\Phi^{-1}(1-p)=\sqrt{\ln(1/p^2)-\ln\ln(1/p^2) -\ln(2\pi) }+ {\textrm{o}}(p) ,\quad p \searrow 0.\]

In particular, this says that for any positive sequence $(\kappa_n)_{n\geq1} $ that converges to zero, as $n\rightarrow \infty$ , we have

\[\Phi^{-1}( 1-\kappa_n/2) = \sqrt{2|\!\ln \kappa_n |}+{\textrm{o}}(\sqrt{ |\!\ln \kappa_n |}).\]

Thus, when $\kappa_n$ satisfies condition (8), the upper $(\kappa_n/2)$th quantile of the standard normal distribution is of order ${\textrm{o}}( n^ {\rho/(4+2\rho)} )$, and applying (9) with this quantile proves the claim. Notice that $A_{n_0, n}$ and $B_{n_0, n}$ are the solutions of the following equation:

\[\dfrac{ \sum_{k=n_0}^{n_0+n-1} \sqrt{Z_{ k}} ( Z_{ k+1}/Z_{ k} -x )}{\sqrt{\sum_{k=n_0}^{n_0+n-1} Z_{ k} ( Z_{ k+1}/Z_{ k} - x )^2 } }=\Phi^{-1}(1-\kappa_n /2).\]

This completes the proof of Proposition 3.1.
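For concreteness, Proposition 3.1 translates directly into code: the endpoints $A_{n_0,n}$ and $B_{n_0,n}$ are the roots of the quadratic with coefficients $a_{n_0,n}$, $b_{n_0,n}$, $c_{n_0,n}$. A Python sketch, assuming n is large enough that $a_{n_0,n}>0$, so that the two roots form a genuine interval:

```python
import math
from statistics import NormalDist

def confidence_interval(z, kappa):
    """[A_{n0,n}, B_{n0,n}] of Proposition 3.1 from observed
    z = [Z_{n0}, ..., Z_{n0+n}] at confidence level 1 - kappa."""
    q = NormalDist().inv_cdf(1 - kappa / 2)   # Phi^{-1}(1 - kappa/2)
    s_root = sum(math.sqrt(zk) for zk in z[:-1])
    s_pop = sum(z[:-1])
    s_ratio = sum(z1 / math.sqrt(zk) for zk, z1 in zip(z, z[1:]))
    s_next = sum(z[1:])
    s_sq = sum(z1 * z1 / zk for zk, z1 in zip(z, z[1:]))
    a = s_root ** 2 - q ** 2 * s_pop
    b = 2 * q ** 2 * s_next - 2 * s_ratio * s_root
    c = s_ratio ** 2 - q ** 2 * s_sq
    # Discriminant clipped at 0 to guard against rounding error.
    d = math.sqrt(max(b * b - 4 * a * c, 0.0))
    return (-b - d) / (2 * a), (-b + d) / (2 * a)
```

On a pure-doubling trajectory every ratio equals 2, and the interval collapses to the single point $m=2$, as the quadratic then has a double root.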

When the parameter $v^2$ is known, we can apply the normalized Berry–Esseen bounds of Theorem 2.2 to construct confidence intervals.

Proposition 3.2. Assume that $\mathbb{E} Z_1 ^{2+\rho}< \infty$ for some $ \rho \in (0, 1]$ . Let $\kappa_n \in (0, 1)$ . Assume that

(10) \begin{equation} |\! \ln \kappa_n | ={\textrm{o}}(\! \ln n ) .\end{equation}

Then $[A_n,B_n]$ , with

\begin{equation*}A_n=\dfrac{ \sum_{k=n_0}^{n_0+n-1 } Z_{k+1}/\sqrt{Z_k}-\sqrt{n } v \Phi^{-1}(1-\kappa_n/2) }{ \sum_{k=n_0}^{n_0+n-1 } \sqrt{Z_{k}}}\end{equation*}

and

\begin{equation*}B_n=\dfrac{ \sum_{k=n_0}^{n_0+n-1 } Z_{k+1}/\sqrt{Z_k}+\sqrt{n } v \Phi^{-1}(1-\kappa_n/2) }{ \sum_{k=n_0}^{n_0+n-1 } \sqrt{Z_{k}}} ,\end{equation*}

is a $1-\kappa_n$ confidence interval for m, for n large enough.

Proof. Theorem 2.2 implies that

(11) \begin{equation} \dfrac{\mathbb{P}(H_{n_0,n} \geq x)}{1-\Phi(x)}=1+{\textrm{o}}(1)\quad \textrm{and} \quad \dfrac{\mathbb{P}(H_{n_0,n} \leq -x)}{1-\Phi(x)}=1+{\textrm{o}}(1)\end{equation}

uniformly for $0\leq x={\textrm{o}}( \sqrt{\ln n} )$ . The upper $(\kappa_n/2)$ th quantile of a standard normal distribution satisfies

\[\Phi^{-1}( 1-\kappa_n/2) = {\textrm{O}}(\sqrt{|\! \ln \kappa_n |} ),\]

which, by (10), is of order $ {\textrm{o}}( \sqrt{\ln n} )$ . Proposition 3.2 follows from applying (11) to $H_{n_0,n}$ .
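Similarly, the known-variance interval of Proposition 3.2 is available in closed form; a Python sketch, with the sums taken over the same n ratios that enter $H_{n_0,n}$ (the function name is ours):

```python
import math
from statistics import NormalDist

def confidence_interval_known_v(z, v, kappa):
    """[A_n, B_n] of Proposition 3.2 from z = [Z_{n0}, ..., Z_{n0+n}],
    with the offspring standard deviation v assumed known."""
    q = NormalDist().inv_cdf(1 - kappa / 2)
    n = len(z) - 1
    s_root = sum(math.sqrt(zk) for zk in z[:-1])
    s_ratio = sum(z1 / math.sqrt(zk) for zk, z1 in zip(z, z[1:]))
    half = math.sqrt(n) * v * q               # half-width of the band
    return (s_ratio - half) / s_root, (s_ratio + half) / s_root
```

The interval is symmetric about the weighted ratio $\sum_k Z_{k+1}/\sqrt{Z_k} \big/ \sum_k \sqrt{Z_k}$ and shrinks as the observed populations grow.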

4. Proof of Theorem 2.1

In the proof of Theorem 2.1 we will use the following lemma (see Corollary 2.3 of Fan et al. [Reference Fan, Grama, Liu and Shao7]), which gives self-normalized Cramér moderate deviations for martingales.

Lemma 4.1. Let $(\eta_k, \mathcal{F}_k)_{k=1,\ldots,n}$ be a finite sequence of martingale differences. Assume that there exist a constant $\rho \in (0, 1]$ and numbers $\gamma_n>0$ and $\delta_n\geq 0$ satisfying $\gamma_n, \delta_n \rightarrow 0$ such that for all $1\leq k\leq n$,

(12) \begin{equation} \mathbb{E}[ |\eta_{k}|^{2+\rho} \mid \mathcal{F}_{k-1 } ] \leq \gamma_n^\rho \mathbb{E}[ \eta_{k} ^2 \mid \mathcal{F}_{k-1 } ]\end{equation}

and

(13) \begin{equation} \Biggl| \sum_{k=1}^n \mathbb{E}[ \eta_{k} ^2 \mid \mathcal{F}_{k-1 } ]-1\Biggr| \leq \delta_n^2 \quad {a.s.}\end{equation}

Denote

\[V_n= \dfrac{\sum_{k=1}^n \eta_{k} }{ \sqrt{\sum_{k=1}^n \eta_{k}^2} }\]

and

\[\widehat{\gamma}_n(x, \rho) = \dfrac{ \gamma_n^{ \rho(2-\rho)/4 } }{ 1+ x ^{ \rho(2+\rho)/4 }}.\]
  1. (i) If $\rho \in (0, 1)$ , then for all $0\leq x = {\textrm{o}}( \gamma_n^{-1} )$ ,

    \begin{equation*}\biggl|\ln \dfrac{\mathbb{P}(V_n \geq x)}{1-\Phi(x)} \biggr| \leq C_{\rho} \bigl( x^{2+\rho} \gamma_n^\rho+ x^2 \delta_n^2 +(1+x)( \delta_n + \widehat{\gamma}_n(x, \rho) ) \bigr) .\end{equation*}
  2. (ii) If $\rho =1$ , then for all $0\leq x = {\textrm{o}}( \gamma_n^{-1} )$ ,

    \begin{equation*}\biggl|\ln \dfrac{\mathbb{P}(V_n \geq x)}{1-\Phi(x)} \biggr| \leq C \bigl( x^{3} \gamma_n + x^2 \delta_n^2+(1+x)( \delta_n+ \gamma_n |\!\ln \gamma_n| +\widehat{\gamma}_n(x, 1) ) \bigr) .\end{equation*}

Now we are in a position to prove Theorem 2.1. Denote

\[\hat{\xi}_{k+1}= \sqrt{Z_{ k}} ( Z_{ k+1}/Z_{ k} -m ),\]

$\mathcal{F}_{n_0} =\{ \emptyset, \Omega \} $ and $\mathcal{F}_{k+1}=\sigma \{ Z_{i}\colon n_0\leq i\leq k+1 \}$ for all $k\geq n_0$ . Notice that $X_{k,i}$ is independent of $Z_k$ . Then it is easy to verify that

\begin{align*} \mathbb{E}[ \hat{\xi}_{k+1} \mid \mathcal{F}_{k } ] &= Z_{ k} ^{-1/2} \mathbb{E}[ Z_{ k+1} -mZ_{ k} \mid \mathcal{F}_{k } ]\notag \\ &= Z_{ k} ^{-1/2} \sum_{i=1}^{Z_k} \mathbb{E}[ X_{ k, i} -m \mid \mathcal{F}_{k } ] \notag \\& = Z_{ k} ^{-1/2} \sum_{i=1}^{Z_k} \mathbb{E}[ X_{ k, i} -m ] \notag \\& =0.\end{align*}

Thus $(\hat{\xi}_k, \mathcal{F}_k)_{k=n_0+1,\ldots,n_0+n}$ is a finite sequence of martingale differences. Notice that $X_{ k, i}-m$ , $i\geq 1$ , are centered and independent random variables. Thus the following equalities hold:

\begin{align*} \sum_{k=n_0}^{n_0+n-1} \mathbb{E}[ \hat{\xi}_{k+1}^2 \mid \mathcal{F}_{k } ] &=\sum_{k=n_0}^{n_0+n-1} Z_{k}^{-1} \mathbb{E}[ ( Z_{ k+1} -mZ_{ k} )^2 \mid \mathcal{F}_{k } ]\notag \\& = \sum_{k=n_0}^{n_0+n-1} Z_{k}^{-1} \mathbb{E}\Biggl[ \Biggl( \sum_{i=1}^{Z_k} (X_{ k, i} -m) \Biggr)^2 \biggm|\mathcal{F}_{k } \Biggr] \notag \\&=\sum_{k=n_0}^{n_0+n-1} Z_{k}^{-1} Z_k \mathbb{E}[ (X_{ k, 1} -m)^2 ] \notag \\&= n v^2. \end{align*}

Moreover, it is easy to see that

(14) \begin{align} \mathbb{E}[ |\hat{\xi}_{k+1}|^{2+\rho} \mid \mathcal{F}_{k } ] &= Z_{k}^{-1-\rho/2} \mathbb{E}[ | Z_{ k+1} -mZ_{ k} |^{2+\rho } \mid \mathcal{F}_{k } ] \notag \\[3pt] & = Z_{k}^{-1-\rho/2} \mathbb{E}\Biggl[ \Biggl| \sum_{i=1}^{Z_k} (X_{ k, i} -m) \Biggr|^{2+\rho } \biggm| \mathcal{F}_{k } \Biggr] . \end{align}

By Rosenthal’s inequality, we have

\begin{align*} \mathbb{E}\Biggl[ \Biggl| \sum_{i=1}^{Z_k} (X_{ k, i} -m) \Biggr|^{2+\rho } \biggm|\mathcal{F}_{k } \Biggr]&\leq C^{\prime}_\rho \Biggl(\Biggl( \sum_{i=1}^{Z_k}\mathbb{E}( X_{ k, i} -m )^2 \Biggr)^{1+\rho/2} + \sum_{i=1}^{Z_k}\mathbb{E}| X_{ k, i} -m |^{2+\rho } \Biggr) \\[3pt] & \leq C^{\prime}_\rho \bigl( Z_k^{1+\rho/2} v^{2+\rho} + Z_k \mathbb{E}| Z_1 -m |^{2+\rho } \bigr).\end{align*}

Since the set of extinction of the process $(Z_{k})_{k\geq 0}$ is negligible with respect to the annealed law $\mathbb{P}$ , we have $Z_k\geq 1$ for any k. From (14), by the last inequality and the fact $Z_k\geq 1$ , we deduce that

\begin{align} \mathbb{E}[ |\hat{\xi}_{k+1}|^{2+\rho} \mid \mathcal{F}_{k } ] &\leq C^{\prime}_\rho \bigl( v^{ \rho} + \mathbb{E}| Z_1 -m |^{2+\rho }/v^2 \bigr) v^2 \notag \\[3pt] & = C^{\prime}_\rho \bigl( v^{ \rho} + \mathbb{E}| Z_1 -m |^{2+\rho }/v^2 \bigr) \mathbb{E}[ \hat{\xi}_{k+1}^{2} \mid \mathcal{F}_{k } ]\notag \\[3pt] &\leq C^{\prime}_\rho \bigl( v^{ \rho} + 2^{1+\rho} \bigl(\mathbb{E} Z_1 ^{2+\rho } + m ^{2+\rho }\bigr)/v^2 \bigr) \mathbb{E}[ \hat{\xi}_{k+1}^{2} \mid \mathcal{F}_{k } ] . \notag\end{align}

By Jensen’s inequality, we have $ m ^{2+\rho } = (\mathbb{E} Z_1) ^{2+\rho } \leq \mathbb{E} Z_1 ^{2+\rho }. $ Thus we have

\begin{equation*} \mathbb{E}[ |\hat{\xi}_{k+1}|^{2+\rho} \mid \mathcal{F}_{k } ] \leq C_\rho \bigl( v^{ \rho} + \mathbb{E} Z_1 ^{2+\rho }/v^2 \bigr) \mathbb{E}[ \hat{\xi}_{k+1}^{2} \mid \mathcal{F}_{k } ]. \end{equation*}

Let $\eta_k =\hat{\xi}_{n_0+k}/(\sqrt{n} v)$ and $\mathcal{F}_{k}=\mathcal{F}_{n_0+k}$ . Then $(\eta_k, \mathcal{F}_{k})_{k=1,\ldots,n}$ is a martingale difference sequence and satisfies conditions (12) and (13) with $ \delta_n=0$ and

\[ \gamma_n=\bigl( C_\rho \bigl( v^{ \rho} + \mathbb{E} Z_1 ^{2+\rho }/v^2 \bigr) \bigr)^{1/\rho}/ {(\sqrt{n} v)}. \]

Clearly,

\[M_{n_0,n}= \dfrac{ \sum_{k=1}^{n} \eta_{k } }{\sqrt{\sum_{k=1}^{n} \eta_{k }^2 } }.\]

Applying Lemma 4.1 to $(\eta_k, \mathcal{F}_{k})_{k=1,\ldots,n}$ , we obtain the desired inequalities. Notice that for any $\rho \in (0, 1]$ and all $x\geq 0$ , the following inequality holds:

\[ \dfrac{ 1+x}{ 1+ x ^{ \rho(2+\rho)/4 }} \leq C_\rho (1+x)^{1-\rho(2+\rho)/4}. \]

5. Proof of Corollary 2.1

We only give a proof of Corollary 2.1 for $\rho \in (0, 1)$ . The proof for $\rho=1$ is similar. We first show that for any Borel set $B\subset \mathbb{R}$ ,

(15) \begin{equation}\limsup_{n\rightarrow \infty}\dfrac{1}{a_n^2}\ln \mathbb{P}\biggl(\dfrac{M_{n_0,n} }{ a_n } \in B \biggr) \leq - \inf_{x \in \overline{B}}\dfrac{x^2}{2}.\end{equation}

When $B =\emptyset$ , the last inequality is obvious, with $-\inf_{x \in \emptyset}{{x^2}/{2}}=-\infty$ . Thus we may assume that $B \neq \emptyset$ . Let $x_0=\inf_{x\in B} |x|$ . Clearly, we have $x_0\geq\inf_{x\in \overline{B}} |x|$ . Then, by Theorem 2.1, it follows that for $\rho \in (0, 1)$ and $a_n ={\textrm{o}}(\sqrt{n})$ ,

\begin{align*} \mathbb{P}\biggl(\dfrac{ M_{n_0,n} }{ a_n } \in B \biggr) &\leq \mathbb{P}( | M_{n_0,n}| \geq a_n x_0)\\ & \leq 2( 1-\Phi ( a_nx_0)) \exp\biggl\{ C_\rho \biggl( \dfrac{ ( a_nx_0)^{2+\rho} }{n^{\rho/2}} + \dfrac{ (1+a_nx_0)^{1-\rho(2+\rho)/4} }{n^{\rho(2-\rho)/8}} \biggr) \biggr\}.\end{align*}

Using the inequalities

(16) \begin{equation}\dfrac{1}{\sqrt{2 \pi}(1+x)} {\textrm{e}}^{-x^2/2} \leq 1-\Phi ( x ) \leq \dfrac{1}{\sqrt{ \pi}(1+x)} {\textrm{e}}^{-x^2/2}, \quad x\geq 0,\end{equation}

and the fact that $a_n \rightarrow \infty$ and $a_n/\sqrt{n}\rightarrow 0$ , we obtain

\begin{equation*}\limsup_{n\rightarrow \infty}\dfrac{1}{a_n^2}\ln \mathbb{P}\biggl(\dfrac{ M_{n_0,n} }{ a_n } \in B \biggr) \leq -\dfrac{x_0^2}{2} \leq - \inf_{x \in \overline{B}}\dfrac{x^2}{2} ,\end{equation*}

which gives (15).

Next we prove that

(17) \begin{equation}\liminf_{n\rightarrow \infty}\dfrac{1}{a_n^2}\ln \mathbb{P}\biggl(\dfrac{ M_{n_0,n} }{ a_n } \in B \biggr) \geq - \inf_{x \in B^o}\dfrac{x^2}{2} .\end{equation}

When $B^o =\emptyset$ , the last inequality is obvious, with $ -\inf_{x \in \emptyset}{{x^2}/{2}}=-\infty$ . Thus we may assume that $B^o \neq \emptyset$ . Since $B^o$ is an open set, for any given small $\varepsilon_1>0$ there exists an $x_0 \in B^o$ such that

\begin{equation*} 0< \dfrac{x_0^2}{2} \leq \inf_{x \in B^o}\dfrac{x^2}{2} +\varepsilon_1.\end{equation*}

Again by the fact that $B^o$ is an open set, for $x_0 \in B^o$ and all sufficiently small $\varepsilon_2 \in (0, |x_0|] $ , we have $(x_0-\varepsilon_2, x_0+\varepsilon_2] \subset B^o$ . Without loss of generality, we may assume that $x_0>0$ . Clearly, we have

(18) \begin{align}\mathbb{P}\biggl(\dfrac{M_{n_0,n} }{ a_n } \in B \biggr) &\geq \mathbb{P}( M_{n_0,n} \in (a_n ( x_0-\varepsilon_2), a_n( x_0+\varepsilon_2)] ) \notag \\&= \mathbb{P}( M_{n_0,n} \geq a_n ( x_0-\varepsilon_2) )-\mathbb{P}( M_{n_0,n} \geq a_n( x_0+\varepsilon_2) ). \end{align}

Again by Theorem 2.1, it is easy to see that for $a_n \rightarrow \infty$ and $ a_n ={\textrm{o}}(\sqrt{n} )$ ,

\[\lim_{n\rightarrow \infty} \dfrac{\mathbb{P}(M_{n_0,n} \geq a_n( x_0+\varepsilon_2) ) }{\mathbb{P}( M_{n_0,n} \geq a_n ( x_0-\varepsilon_2) ) } =0 .\]

From (18), by the last line and Theorem 2.1, for all n large enough and $a_n ={\textrm{o}}(\sqrt{n} )$ it holds that

\begin{align*}\mathbb{P}\biggl(\dfrac{M_{n_0,n} }{ a_n } \in B \biggr) & \geq \dfrac12 \mathbb{P}( M_{n_0,n} \geq a_n ( x_0-\varepsilon_2) ) \\& \geq \dfrac12 ( 1-\Phi ( a_n( x_0-\varepsilon_2))) \exp\biggl\{ -C_\rho \biggl( \dfrac{ ( a_nx_0)^{2+\rho} }{n^{\rho/2}} + \dfrac{ (1+a_nx_0)^{1-\rho(2+\rho)/4} }{n^{\rho(2-\rho)/8}} \biggr) \biggr\}.\end{align*}

Using (16) and the fact that $a_n \rightarrow \infty$ and $a_n/\sqrt{n}\rightarrow 0$ , after some calculations we get

\begin{equation*} \liminf_{n\rightarrow \infty}\dfrac{1}{a_n^2}\ln \mathbb{P}\biggl(\dfrac{M_{n_0,n} }{ a_n } \in B \biggr) \geq - \dfrac{1}{2}( x_0-\varepsilon_2)^2 . \end{equation*}

Letting $\varepsilon_2\rightarrow 0$ , we deduce that

\begin{equation*}\liminf_{n\rightarrow \infty}\dfrac{1}{a_n^2}\ln \mathbb{P}\biggl(\dfrac{ M_{n_0,n} }{ a_n } \in B \biggr) \geq - \dfrac{x_0^2}{2} \geq -\inf_{x \in B^o}\dfrac{x^2}{2} -\varepsilon_1.\end{equation*}

Since $\varepsilon_1$ can be arbitrarily small, we get (17). Combining (15) and (17), we complete the proof of Corollary 2.1.

6. Proof of Theorem 2.2

Recall the martingale differences $(\eta_k, \mathcal{F}_{k})_{k=1,\ldots,n }$ defined in the proof of Theorem 2.1. Then $\eta_k$ satisfies conditions (12) and (13) with $\delta_n=0$ and

\[\gamma_n=\bigl( C_\rho \bigl( v^{ \rho} + \mathbb{E} Z_1 ^{2+\rho }/v^2 \bigr) \bigr)^{1/\rho}/ {(\sqrt{n}\, v)}.\]

Clearly, we have $H_{n_0,n}= \sum_{k=1}^{n} \eta_k$ . Applying Theorem 2.1 of Fan [Reference Fan5] to $(\eta_k, \mathcal{F}_{k})_{k=1,\ldots,n}$ , we obtain the desired inequalities.

Acknowledgements

The authors would like to thank the two referees for their helpful remarks and suggestions.

Funding information

The research is partially supported by the National Natural Science Foundation of China NSFC (grants 12031005 and 11971063) and the Shenzhen Outstanding Talents Training Fund, China.

Competing interests

The authors declare that no competing interests arose during the preparation or publication of this article.

References

Athreya, K. B. (1994). Large deviation rates for branching processes I: Single type case. Ann. Appl. Prob. 4, 779–790.
Bercu, B. and Touati, A. (2008). Exponential inequalities for self-normalized martingales with applications. Ann. Appl. Prob. 18, 1848–1869.
Chu, W. (2018). Self-normalized large deviation for supercritical branching processes. J. Appl. Prob. 55, 450–458.
De la Peña, V. H., Lai, T. L. and Shao, Q. M. (2009). Self-Normalized Processes: Limit Theorems and Statistical Applications. Springer.
Fan, X. (2019). Exact rates of convergence in some martingale central limit theorems. J. Math. Anal. Appl. 469, 1028–1044.
Fan, X. and Shao, Q. M. (2018). Berry–Esseen bounds for self-normalized martingales. Commun. Math. Statist. 6, 13–27.
Fan, X., Grama, I., Liu, Q. and Shao, Q. M. (2019). Self-normalized Cramér type moderate deviations for martingales. Bernoulli 25, 2793–2823.
Fan, X., Grama, I., Liu, Q. and Shao, Q. M. (2020). Self-normalized Cramér type moderate deviations for stationary sequences and applications. Stoch. Process. Appl. 130, 5124–5148.
He, H. (2016). On large deviation rates for sums associated with Galton–Watson processes. Adv. Appl. Prob. 48, 672–690.
Jing, B. Y., Shao, Q. M. and Wang, Q. (2003). Self-normalized Cramér-type large deviations for independent random variables. Ann. Prob. 31, 2167–2215.
Kuelbs, J. and Vidyashankar, A. N. (2011). Weak convergence results for multiple generations of a branching process. J. Theor. Prob. 24, 376–396.
Lotka, A. (1939). Théorie analytique des associations biologiques. Actualités Sci. Ind. 780, 123–136.
Maaouia, F. and Touati, A. (2005). Identification of multitype branching processes. Ann. Statist. 33, 2655–2694.
Nagaev, S. V. (1967). On estimating the expected number of direct descendants of a particle in a branching process. Theory Prob. Appl. 12, 314–320.
Ney, P. E. and Vidyashankar, A. N. (2003). Harmonic moments and large deviation rates for supercritical branching processes. Ann. Appl. Prob. 13, 475–489.
Ney, P. E. and Vidyashankar, A. N. (2004). Local limit theory and large deviations for supercritical branching processes. Ann. Appl. Prob. 14, 1135–1166.