
Matrix calculations for moments of Markov processes

Published online by Cambridge University Press:  02 September 2022

Andrew Daw*
Affiliation:
University of Southern California
Jamol Pender**
Affiliation:
Cornell University
*Postal address: Marshall School of Business, University of Southern California, 401B Bridge Hall, Los Angeles, CA 90089. Email address: andrew.daw@usc.edu
**Postal address: School of Operations Research and Information Engineering, Cornell University, 228 Rhodes Hall, Ithaca, NY 14853.

Abstract

Matryoshka dolls, the traditional Russian nesting figurines, are known worldwide for each doll’s encapsulation of a sequence of smaller dolls. In this paper, we exploit the structure of a new sequence of nested matrices we call matryoshkan matrices in order to compute the moments of the one-dimensional polynomial processes, a large class of Markov processes. We characterize the salient properties of matryoshkan matrices that allow us to compute these moments in closed form at a specific time without computing the entire path of the process. This simplifies the computation of the polynomial process moments significantly. Through our method, we derive explicit expressions for both transient and steady-state moments of this class of Markov processes. We demonstrate the applicability of this method through explicit examples such as shot noise processes, growth–collapse processes, ephemerally self-exciting processes, and affine stochastic differential equations from the finance literature. We also show that we can derive explicit expressions for the self-exciting Hawkes process, for which finding closed-form moment expressions has been an open problem since their introduction in 1971. In general, our techniques can be used for any Markov process for which the infinitesimal generator of an arbitrary polynomial is itself a polynomial of equal or lower order.

Type: Original Article
Copyright: © The Author(s), 2022. Published by Cambridge University Press on behalf of Applied Probability Trust

1. Introduction

In our recent study of the intensity of Markovian Hawkes processes, originally defined in [Reference Hawkes33], we have been interested in computing all the moments of this process. In surveying the previous literature for this process, there do not seem to be any closed-form transient solutions at the fourth order or higher (see Proposition 5 of [Reference Gao and Zhu29] for moments one through three), and both steady-state solutions and ordinary differential equations (ODEs) have only been available up to the fourth moment; see [Reference Da Fonseca and Zaatour15, Reference Errais, Giesecke and Goldberg22]. Similarly, [Reference At-Sahalia, Cacho-Diaz and Laeven2] give expressions for the fourth transient moment of a self-exciting jump-diffusion model up to squared error in the length of time, and one could simplify these expressions to represent the Markovian Hawkes intensity with the same error. The standard methodology for finding moments is to differentiate the moment generating function to obtain the moments, but this is intractable for practical reasons; see for example [Reference Errais, Giesecke and Goldberg22]. The problem of finding the moments of the Hawkes process intensity is also the subject of the recent interesting research in [Reference Cui, Hawkes and Yi13, Reference Cui, Wu and Yin14], works that are concurrent with and independent from this one. In [Reference Cui, Hawkes and Yi13], the authors propose a new approach for calculating moments that they construct from elementary probability arguments and also relate to the infinitesimal generator. Like the infinitesimal generator, this new methodology produces differential equations that can be solved algebraically or numerically to yield the process moments, and the authors provide closed-form transient expressions up to the second moment. The paper [Reference Cui, Wu and Yin14] extends this methodology to cases of gamma decay kernels. In other recent previous works [Reference Daw and Pender18, Reference Koops, Saxena, Boxma and Mandjes41], the authors have identified the differential equation for an arbitrary moment of the Hawkes process intensity, although the closed-form solutions for these equations have remained elusive and prompted closer investigation. Upon inspecting the differential equation for a given moment of the Hawkes process intensity, one can notice that this expression depends on the moments of lower order. Thus, to compute a given moment one must solve a system of differential equations with size equal to the order of the moment, meaning one must at least implicitly solve for all the lower-order moments first. This same pattern occurs in [Reference Cui, Hawkes and Yi13]. Noticing this nesting pattern leads one to wonder: what other processes have moments that follow this structure?

In this paper, we explore this question by identifying what exactly this nesting structure is. In the sequel, we will define a novel sequence of matrices that captures this pattern. Just as matryoshka dolls—the traditional Russian nesting figurines—stack inside of one another, these matrices are characterized by their encapsulation of their predecessors in the sequence. Hence, we refer to this sequence as matryoshkan matrices. As we will show, these matrices can be used to describe the linear system of differential equations that arise in solving for the moments of the Hawkes process, as well as the moments of a large class of other Markov processes. Similar structure has been successfully leveraged previously in the case of discrete state spaces (see, e.g., [Reference Nielsen, Nilsson, Thygesen and Beyer44, Reference Nasr, Charanek and Maddah43, Reference Bladt, Asmussen and Steffensen5]), but here we consider continuous state spaces as well, such as our original motivating example, the Hawkes process. As we will demonstrate through detailed examples, this includes a wide variety of popular stochastic processes, such as Itô diffusions and shot noise processes. By utilizing this nesting structure we are able to solve for the moments of these processes in closed form. In comparison to traditional methods of solving these systems of differential equations numerically, the advantage of the approach introduced herein is the fact that the moments can be computed at a specific point in time rather than on a path through time.

In fact, the only assumption we make on these processes is that their moments satisfy differential equations that do not depend on any higher-order moments. Notably, this assumption exactly coincides with the definition of polynomial processes, hence specifying the exact class of Markov processes we will consider. In [Reference Cuchiero, Keller-Ressel and Teichmann11], this is defined as the collection of Markov processes such that the expected value of any polynomial is ‘again a polynomial in the initial value of the process’ up to the same order. Hence, the processes we study here are precisely the one-dimensional polynomial processes. The paper [Reference Cuchiero, Keller-Ressel and Teichmann11] demonstrates that the moments of such processes can be computed through a matrix exponential, and thus our contribution to the polynomial process literature is that the sequence of these matrices (and their inverses, exponentials, and decompositions) exhibits a convenient nested structure that enables simple iterative computation. Hence, this paper can be seen as building on a line of work that aims to efficiently and/or recursively compute polynomial process moments, such as [Reference Forman and Sørensen26, Reference Ackerer, Filipović and Pulido1, Reference Benth and Lavagnini3]. This also provides context and relevance for the various examples on which we demonstrate our methods. For example, we show the nested matrix structure arising in Itô diffusions, and this connects to a rich history of study on polynomial diffusions; see for example [Reference Wong47, Reference Mazet42, Reference Forman and Sørensen26, Reference Filipović and Larsson24, Reference Filipović and Larsson25] and references therein. Polynomial processes have been used to great success in many contexts in finance, such as [Reference Biagini and Zhang4, Reference Cuchiero10, Reference Guasoni and Wong31], in addition to many of the papers mentioned above.

The breadth of applications of polynomial processes suggests that our methodology has the potential to be quite relevant in practice. Of course, these techniques can be used to efficiently calculate the commonly used first four moments, thus obtaining the mean, variance, skewness, and kurtosis. Moreover, though, let us note that the higher moment calculations are also of practical use. For example, these higher moments can be used in Markov-style concentration inequalities, as the higher order should improve the accuracy of the tail bounds. To that end, one can also use the vector of moments to approximate generating functions such as the moment generating function or Laplace–Stieltjes transforms. This could then be used to characterize the stationary distribution of the process, for example, or to provide approximate calculations of quantities such as the cumulative distribution function through transform methods. The calculation of moments is also highly relevant for many applications in mathematical finance. One could also expect this efficiently calculated vector of moments to be of use in estimation through method of moments techniques. Again in this case, access to higher-order moments should improve fit.

The remainder of this paper is organized as follows. In Section 2, we introduce matryoshkan matrix sequences and identify some of their key properties. In Section 3, we use these matrices to find the moments of a large class of general Markov processes—the one-dimensional polynomial processes. We also give specific examples. In Section 4, we conclude. Throughout the course of this study, we make the following two primary contributions:

  (i) We define a novel class of matrix sequences, which we call matryoshkan matrix sequences for their nesting structure. We identify key properties of these matrices, such as their inverses and matrix exponentials.

  (ii) Through these matryoshkan matrices, we solve for closed-form expressions for the moments of polynomial Markov processes. Furthermore, we demonstrate the general applicability of this technique through application to notable stochastic processes, including Hawkes processes, shot noise processes, Itô diffusions, growth–collapse processes, and linear birth–death–immigration processes. In the case of the Hawkes process and growth–collapse processes this resolves an open problem, as closed-form expressions for these general transient moments were not previously known in the literature.

2. Matryoshkan matrix sequences

For the sake of clarity, let us begin this section by introducing general notation patterns we will use throughout this paper. Because of the heavy use of matrices in this work, we reserve boldface uppercase variables for these objects; for example, we use $\textbf{I}$ for the identity matrix. Similarly we let boldface lowercase variables denote vectors—for example, using $\textbf{v}$ for the vector of all ones or $\textbf{v}_i$ for the unit vector in the ith direction. One can assume that all vectors are column vectors unless otherwise noted. Scalar terms will not be bolded. A special matrix that we will use throughout this work is the diagonal matrix, denoted by $\textbf{diag}(\textbf{a})$ , which is a square matrix with the values of the vector $\textbf{a}$ along its diagonal and zeros elsewhere. We will also make use of a generalization of this, denoted by $\textbf{diag}(\textbf{a}, k)$ , which instead contains the values of $\textbf{a}$ on the kth off-diagonal, with negative k being below the diagonal and positive k being above.

Let us now introduce a sequence of matrices that will be at the heart of this work. We begin as follows: consider a sequence of lower triangular matrices $\big\{\textbf{M}_n, n \in \mathbb{Z}^+\big\}$ such that

(1) \begin{align}\textbf{M}_{n}=\left[\begin{array}{c@{\quad}c}\textbf{M}_{n-1} & \textbf{0}_{n-1 \times 1} \\[4pt]\textbf{m}_{n} & m_{n,n}\end{array}\right],\end{align}

where $\textbf{m}_{n} \in \mathbb{R}^{n-1}$ is a row vector, $m_{n,n} \in \mathbb{R}$ , and $\textbf{M}_1 = m_{1,1}$ , an initial value. Taking inspiration from matryoshka dolls, the traditional Russian nesting dolls, we will refer to these objects as matryoshkan matrices. Using their nested and triangular structures, we can make four quick observations of note regarding matryoshkan matrices.

Proposition 1. Each of the following statements is a consequence of the definition of matryoshkan matrices given by Equation (1):

  (i) If $\textbf{X}_n \in \mathbb{R}^{n \times n}$ and $\textbf{Y}_n \in \mathbb{R}^{n \times n}$ are both matryoshkan matrix sequences, then so are $\textbf{X}_n + \textbf{Y}_n$ and $\textbf{X}_n\textbf{Y}_n$ .

  (ii) If $m_{i,i} \ne 0$ for all $i \in \{1, \dots , n\}$ then the matryoshkan matrix $\textbf{M}_n\in \mathbb{R}^{n \times n}$ is nonsingular. Moreover, the inverse of $\textbf{M}_{n}$ is given by the recursion

    (2) \begin{align}\textbf{M}_n^{-1}=\left[\begin{array}{c@{\quad}c}\textbf{M}_{n-1}^{-1} & \textbf{0}_{n-1 \times 1} \\[8pt]-\frac{1}{m_{n,n}}\textbf{m}_{n}\textbf{M}_{n-1}^{-1} & \frac{1}{m_{n,n}}\end{array}\right].\end{align}
  (iii) If the $m_{i,i}$ for $i \in \{1, \dots , n\}$ are nonzero and all distinct, then the matrix exponential of the matryoshkan matrix $\textbf{M}_n\in \mathbb{R}^{n \times n}$ multiplied by $t \in \mathbb{R}$ follows the recursion

    (3) \begin{align}e^{\textbf{M}_{n}t}=\left[\begin{array}{c@{\quad}c}e^{\textbf{M}_{n-1}t} & \textbf{0}_{n-1 \times 1} \\[4pt]\textbf{m}_{n}\!\left(\textbf{M}_{n-1} - m_{n,n}\textbf{I}\right)^{-1}\!\left(e^{\textbf{M}_{n-1}t} - e^{m_{n,n} t}\textbf{I}\right)&e^{m_{n,n} t}\end{array}\right].\end{align}
  (iv) If the $m_{i,i}$ for $i \in \{1, \dots , n\}$ are nonzero and all distinct, then the matrices $\textbf{U}_n \in \mathbb{R}^{n \times n}$ and $\textbf{D}_n \in \mathbb{R}^{n \times n}$ are such that

    \begin{equation*}\textbf{M}_n \textbf{U}_n = \textbf{U}_n \textbf{D}_n\end{equation*}
    for the matryoshkan matrix $\textbf{M}_n \in \mathbb{R}^{n \times n}$ , when defined recursively as
    (4) \begin{align}\textbf{U}_n=\left[\begin{array}{c@{\quad}c}\textbf{U}_{n-1} & \textbf{0}_{n-1 \times 1} \\[4pt]\textbf{m}_{n}\textbf{U}_{n-1}\!\left(\textbf{D}_{n-1} - m_{n,n}\textbf{I} \right)^{-1} & 1\end{array}\right],\,\textbf{D}_n=\left[\begin{array}{c@{\quad}c}\textbf{D}_{n-1} & \textbf{0}_{n-1 \times 1} \\[4pt]\textbf{0}_{1 \times n-1} & m_{n,n}\end{array}\right].\end{align}
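
These recursions are straightforward to check numerically. The following minimal Python sketch (with arbitrary example entries of our choosing, not values from the text) builds a small matryoshkan matrix with nonzero, distinct diagonal entries and verifies the inverse recursion (2) and the matrix exponential recursion (3) against direct computation with NumPy and SciPy.

```python
import numpy as np
from scipy.linalg import expm

def grow(M_prev, m_row, m_nn):
    """Append one layer as in Equation (1)."""
    k = M_prev.shape[0]
    M = np.zeros((k + 1, k + 1))
    M[:k, :k], M[k, :k], M[k, k] = M_prev, m_row, m_nn
    return M

rng = np.random.default_rng(0)
M = np.array([[1.0]])
for k in range(2, 5):                      # diagonal entries 1, 2, 3, 4: nonzero and distinct
    M = grow(M, rng.standard_normal(k - 1), float(k))

M_prev, m_row, m_nn = M[:-1, :-1], M[-1, :-1], M[-1, -1]
k = M_prev.shape[0]

# Recursion (2): inverse of the newest layer from the previous inverse.
P_inv = np.linalg.inv(M_prev)
bottom = (-(m_row @ P_inv) / m_nn).reshape(1, k)
inv_rec = np.block([[P_inv, np.zeros((k, 1))], [bottom, np.array([[1.0 / m_nn]])]])
assert np.allclose(inv_rec, np.linalg.inv(M))

# Recursion (3): matrix exponential of M * t from the previous exponential.
t = 0.7
corner = m_row @ np.linalg.inv(M_prev - m_nn * np.eye(k)) @ (
    expm(M_prev * t) - np.exp(m_nn * t) * np.eye(k))
exp_rec = np.block([[expm(M_prev * t), np.zeros((k, 1))],
                    [corner.reshape(1, k), np.array([[np.exp(m_nn * t)]])]])
assert np.allclose(exp_rec, expm(M * t))
```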

One can pause to note that in some sense any lower triangular matrix could be considered matryoshkan, or at least could satisfy these properties. However, we note that some of the most significant insights we can gain from the matryoshkan structure are the recursive implications available for sequences of matrices. Moreover, it is the combination of the nested relationship of consecutive matrices and the lower triangular structure that enables us to find these patterns. We will now see how this notion of matryoshkan matrix sequences and the associated properties above can be used to specify element-wise solutions to a sequence of differential equations.

Lemma 1. Let $\textbf{M}_n \in \mathbb{R}^{n \times n}$ , $\textbf{c}_n \in \mathbb{R}^{n}$ , and $\textbf{s}_n(t) \,:\, \mathbb{R}^+ \to \mathbb{R}^n$ be such that

\begin{equation*}\textbf{M}_n=\left[\begin{array}{c@{\quad}c}\textbf{M}_{n-1} & \textbf{0}_{n-1\times 1} \\[5pt]\textbf{m}_{n} & m_{n,n}\\\end{array}\right],\quad\textbf{c}_{n}=\left[\begin{array}{c}\textbf{c}_{n-1}\\[5pt]c_n\end{array}\right],\quad\textrm{ and }\quad\textbf{s}_{n}(t)=\left[\begin{array}{c}\textbf{s}_{n-1}(t)\\[5pt]s_n(t)\end{array}\right],\end{equation*}

where $\textbf{m}_{n} \in \mathbb{R}^{n-1}$ is a row vector, $\textbf{c}_{n-1} \in \mathbb{R}^{n-1}$ , $s_n(t) \in \mathbb{R}$ , and $\textbf{M}_1 = m_{1,1}$ . Further, suppose that

\begin{equation*}\frac{\textrm{d}}{\textrm{d}t}\textbf{s}_{n}(t)=\textbf{M}_{n}\textbf{s}_{n}(t)+\textbf{c}_{n} .\end{equation*}

Then, if $m_{k,k} \ne 0$ for all $k \in \{1, \dots, n\}$ , the vector function $\textbf{s}_n(t)$ is given by

(5) \begin{align}\textbf{s}_n(t)=e^{\textbf{M}_n t} \textbf{s}_n(0)-\textbf{M}_n^{-1} \big(\textbf{I} - e^{\textbf{M}_n t}\big) \textbf{c}_n ,\end{align}

and if the $m_{k,k}$ for $k \in \{1, \dots, n\}$ are nonzero and all distinct, the nth scalar function $s_n(t)$ is given by

(6) \begin{align}s_n(t)&=\textbf{m}_{n}\!\left(\textbf{M}_{n-1} - m_{n,n}\textbf{I}\right)^{-1}\big(e^{\textbf{M}_{n-1} t} - e^{m_{n,n}t}\textbf{I}\big)\bigg(\textbf{s}_{n-1}(0)+\frac{\textbf{c}_{n-1}}{m_{n,n}}\bigg)+e^{m_{n,n}t}s_n(0)\nonumber\\[4pt]&\quad-\frac{c_n}{m_{n,n}} \!\left( 1-e^{m_{n,n}t} \right)+\frac{\textbf{m}_{n}}{m_{n,n}}\textbf{M}_{n-1}^{-1}\big(\textbf{I}-e^{\textbf{M}_{n-1}t}\big)\textbf{c}_{n-1},\end{align}

where $t \geq 0$ .

In Equation (5), we can see that the vector solution to the n-dimensional ODE system is written in terms of both the inverse and the exponential of the nth matryoshkan matrix. To compute the full vector solution, then, we could simply directly exploit the matryoshkan structure of both the inverse and the exponential from Equations (2) and (3) and calculate each individual matrix function recursively. However, one promising method to compute these at once would be to instead leverage the matryoshkan form of the diagonalized decomposition in Equation (4). That is, both the matrix exponential and the inverse can of course be easily computed from the decomposition matrices, but this way one can also conveniently use the recursion in Equation (4) to simultaneously compute terms needed for both the matrix exponential and the inverse. At each step in the recursion, one needs to compute $\textbf{U}_n$, $\textbf{U}_n^{-1}$, and $\textbf{D}_n$. Because $\textbf{U}_n$ is itself a matryoshkan matrix, note that we can use Equation (2) to find $\textbf{U}_n^{-1}$. Starting from $\textbf{U}_1=\textbf{U}_1^{-1}=1$ and $\textbf{D}_1 = m_{1,1}$ with pre-allocated space, the update from $n-1$ to n takes $O\!\left(1\right)$ operations to append $m_{n,n}$ to $\textbf{D}_{n-1}$, $O\!\left((n-1)^2\right)$ operations to form the row vector $\textbf{m}_n\textbf{U}_{n-1}$ along with $O\!\left(n-1\right)$ divisions by the entries of $\textbf{D}_{n-1} - m_{n,n}\textbf{I}$ to complete the bottom row of $\textbf{U}_n$, and a further $O\!\left((n-1)^2\right)$ operations to multiply that row by $\textbf{U}_{n-1}^{-1}$ for the bottom row of $\textbf{U}_n^{-1}$. Thus, Step i is $O\!\left((i-1)^2\right)$ for $i \in \{1, \dots, n\}$, yielding that in sum it takes $O\big(n^3\big)$ operations to form $\textbf{U}_n$, $\textbf{U}_n^{-1}$, and $\textbf{D}_n$. Naturally, these same ideas apply to the computation of Equation (6), where many matryoshkan forms—the inverses, exponentials, and even products of square matrices—can be recognized and calculated recursively.
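
As a concrete sketch of this recursive computation (again with small illustrative values; the helper name grow_decomposition is ours), one can grow $\textbf{U}_n$, $\textbf{U}_n^{-1}$, and $\textbf{D}_n$ layer by layer via Equations (2) and (4) and then evaluate Equation (5) through $e^{\textbf{M}_n t} = \textbf{U}_n e^{\textbf{D}_n t}\textbf{U}_n^{-1}$ and $\textbf{M}_n^{-1} = \textbf{U}_n \textbf{D}_n^{-1}\textbf{U}_n^{-1}$.

```python
import numpy as np

def grow_decomposition(U, Uinv, d, m_row, m_nn):
    """Append the nth layer: U and D via Equation (4), U^{-1} via Equation (2)."""
    k = U.shape[0]
    w = (m_row @ U) / (d - m_nn)          # m_n U_{n-1} (D_{n-1} - m_{n,n} I)^{-1}
    U_new = np.block([[U, np.zeros((k, 1))], [w.reshape(1, k), np.ones((1, 1))]])
    Uinv_new = np.block([[Uinv, np.zeros((k, 1))],
                         [(-w @ Uinv).reshape(1, k), np.ones((1, 1))]])
    return U_new, Uinv_new, np.append(d, m_nn)

# Small example: rows m_n and diagonal entries m_{n,n} (nonzero and distinct).
rows = {2: np.array([0.5]), 3: np.array([0.2, 1.0])}
diags = {1: -1.0, 2: -2.0, 3: -3.0}

U, Uinv, d = np.array([[1.0]]), np.array([[1.0]]), np.array([diags[1]])
for n in (2, 3):
    U, Uinv, d = grow_decomposition(U, Uinv, d, rows[n], diags[n])

M = U @ np.diag(d) @ Uinv                 # recovers the matryoshkan matrix M_3
assert np.allclose(np.diag(M), [diags[1], diags[2], diags[3]])
assert np.allclose(M[2, :2], rows[3]) and np.isclose(M[1, 0], rows[2][0])

# Evaluate Equation (5) from the decomposition, for a sample initial state and shift.
t, s0, c = 0.5, np.ones(3), np.array([1.0, 0.0, 0.0])
expMt = U @ np.diag(np.exp(d * t)) @ Uinv
Minv = U @ np.diag(1.0 / d) @ Uinv
print(expMt @ s0 - Minv @ (np.eye(3) - expMt) @ c)
```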

With this lemma in hand, we can now move to using these matrix sequences for calculating Markov process moments. To do so, we will use the infinitesimal generator, a key tool for Markov processes, to find the derivatives of the moments through time. By identifying a matryoshkan matrix structure in these differential equations, we are able to apply Lemma 1 to find closed-form expressions for the moments.

3. Calculating moments through matryoshkan matrix sequences

In this section we connect matryoshkan matrix sequences with the moments of Markov processes. In Subsection 3.1, we prove the main result, which is the computation of the moment of a one-dimensional polynomial Markov process through matryoshkan matrices. To demonstrate the applicability of this result, we now apply it to a series of examples. First, in Subsection 3.2, we obtain the moments of the self-exciting Hawkes process, for which finding moments in closed form has been an open problem. Then, in Subsection 3.3, we study the Markovian shot noise process, a stochastic intensity process that trades self-excitement for external shocks. Next, in Subsection 3.4, we showcase the use of these techniques for diffusive dynamics through application to Itô diffusions. In Subsection 3.5, we consider growth–collapse processes as an application of these techniques to models with state-dependent and randomly sized down jumps. Finally, in Subsection 3.6 we apply this technique to a process with jumps both upwards and downwards, a linear birth–death–immigration process we have studied previously. In each scenario, we describe the process of interest, define the infinitesimal generator, and identify the matrix structure. Through this, we solve for the process moments.

3.1. The moments of general polynomial Markov processes

The connection between matryoshkan matrices and Markov processes is built upon a key tool for Markov processes, the infinitesimal generator. For a detailed introduction to infinitesimal generators and their use in studying Markov processes, see e.g. [Reference Ethier and Kurtz23]. For a Markov process $X_t$ on a state space $\mathbb{S}$ , the infinitesimal generator on a function $f \,:\, \mathbb{S} \to \mathbb{R}$ is defined as

\begin{equation*}\mathcal{L}\,f(x)=\lim_{\tau \to 0}\frac{{\mathbb{E}\!\left[f(X_\tau) \mid X_0 = x\right]} - f(x)}{\tau}.\end{equation*}

In our context and in many others, the power of the infinitesimal generator comes through use of Dynkin’s formula, which gives us that

\begin{equation*}\frac{\textrm{d}}{\textrm{d}t}{\mathbb{E}\!\left[f(X_t)\right]}={\mathbb{E}\!\left[\mathcal{L}\,f(X_t)\right]}.\end{equation*}

To study the moments of a Markov process, we are interested in functions f that are polynomials. Let us suppose now that $\mathcal{L}x^n$ for any $n \in \mathbb{Z}^+$ is polynomial in the lower powers of x for a given Markov process $X_t$ , meaning that $X_t$ is a one-dimensional polynomial process. Then we can write

\begin{equation*}\mathcal{L}X_t^n=\theta_{0,n}+\sum_{i=1}^n \theta_{i,n} X_t^i,\end{equation*}

which implies that the differential equation for the nth moment of this process is

\begin{equation*}\frac{\textrm{d}}{\textrm{d}t} {\mathbb{E}\!\left[X_t^n\right]}=\theta_{0,n} + \sum_{i=1}^n \theta_{i,n} {\mathbb{E}\big[X_t^i\big]},\end{equation*}

for some collection of constants $\theta_{0,n}, \theta_{1,n}, \dots, \theta_{n,n}$. Thus, to solve for the nth moment of $X_t$ we must first solve for all the moments of lower order. We can also observe that to solve for the $(n-1)$th moment we must have all the moments below it. In comparing these systems of differential equations, we can see that all of the equations in the system for the $(n-1)$th moment are also in the system for the nth moment. No coefficients are changed in any of these lower moment equations; the only difference between the two systems is the inclusion of the differential equation for the nth moment in its own system. Hence, the nesting matryoshkan structure arises. By expressing each system of linear ODEs in terms of a vector of moments, a matrix of coefficients, and a vector of shift terms, we can use these matrix sequences to capture how one moment’s system encapsulates all the systems below it. This observation then allows us to calculate all the moments of the process in closed form, as we now show in Theorem 1. In steady state, this also implies that the $(n+1)$th moment can be computed as a linear combination of moments 1 through n.

Theorem 1. Let $X_t$ be a Markov process such that the time derivative of its nth moment can be written as

(7) \begin{align}\frac{\textrm{d}}{\textrm{d}t} {\mathbb{E}\!\left[X_t^n\right]}&=\theta_{0,n} + \sum_{i=1}^n \theta_{i,n} {\mathbb{E}\big[X_t^i\big]},\end{align}

for any $n \in \mathbb{Z}^+$ , where $t \geq 0$ and $\theta_{i, n} \in \mathbb{R}$ for all $i \leq n$ . Let $\boldsymbol{\Theta}_n \in \mathbb{R}^{n \times n}$ be defined recursively by

(8) \begin{align}{\boldsymbol{\Theta}}_n=\left[\begin{array}{c@{\quad}c}\boldsymbol{\Theta}_{n-1} & \textbf{0}_{n-1 \times 1} \\[5pt]\boldsymbol{\theta}_n & \theta_{n,n}\end{array}\right],\end{align}

where $\boldsymbol{\theta}_n = [\theta_{1, n}, \, \dots, \, \theta_{n-1, n}]$ and $\boldsymbol{\Theta}_1 = \theta_{1,1}$. Furthermore, let $\boldsymbol{\theta}_{0,n} = [\theta_{0,1}, \, \dots, \, \theta_{0,n}]^{\textrm{T}}$. Then, if the diagonal entries $\theta_{k,k}$ for $k \in \{1, \dots , n\}$ are nonzero and distinct, the nth moment of $X_t$ can be expressed as

(9) \begin{align}{\mathbb{E}\!\left[X_t^n\right]}&=\boldsymbol{\theta}_{n}\!\left(\boldsymbol{\Theta}_{n-1} - \theta_{n,n}\textbf{I}\right)^{-1}\big(e^{\boldsymbol{\Theta}_{n-1} t} - e^{\theta_{n,n}t}\textbf{I}\big)\left(\textbf{x}_{n-1}(x_0)+\frac{\boldsymbol{\theta}_{0,n-1}}{\theta_{n,n}}\right)\nonumber\\&\quad+x_0^n e^{\theta_{n,n}t}-\frac{\theta_{0,n}\big(1-e^{\theta_{n,n}t}\big)}{\theta_{n,n}}+\frac{\boldsymbol{\theta}_{n}}{\theta_{n,n}}\boldsymbol{\Theta}_{n-1}^{-1}\big(\textbf{I}-e^{\boldsymbol{\Theta}_{n-1}t}\big)\boldsymbol{\theta}_{0,n-1},\end{align}

where $x_0$ is the initial value of $X_t$ and where $\textbf{x}_n(a) \in \mathbb{R}^n$ is such that $\left(\textbf{x}_n(a)\right)_i = a^i$ . If $X_t$ has a stationary distribution and the nth steady-state moment ${\mathbb{E}\!\left[X_\infty^n\right]}$ exists, it is given by

(10) \begin{align}{\mathbb{E}\!\left[X_\infty^n\right]}&=\frac{1}{\theta_{n,n}}\big(\boldsymbol{\theta}_{n}\boldsymbol{\Theta}_{n-1}^{-1}\boldsymbol{\theta}_{0,n-1}-\theta_{0,n}\big),\end{align}

and these steady-state moments satisfy the recursive relationship

(11) \begin{align}{\mathbb{E}\big[X_\infty^{n+1}\big]}&=-\frac{1}{\theta_{n+1,n+1}}\!\left(\boldsymbol{\theta}_{n+1}\textbf{s}_{n}^X(\infty)+\theta_{0,n+1}\right),\end{align}

where $\textbf{s}_n^X(\infty) \in \mathbb{R}^n$ is the vector of steady-state moments such that $\left(\textbf{s}_n^X(\infty)\right)_i = {\mathbb{E}\!\left[X_\infty^i\right]}$ .

Proof. Using the definition of $\boldsymbol{\Theta}_n$ in Equation (8), Equation (7) gives rise to the system of ODEs given by

(12) \begin{align}\frac{\textrm{d}}{\textrm{d}t}\textbf{s}_n^X(t)=\boldsymbol{\Theta}_n \textbf{s}_n^X(t)+\boldsymbol{\theta}_{0,n},\end{align}

where $\textbf{s}_n^X(t) \in \mathbb{R}^n$ is the vector of transient moments at time $t \geq 0$ such that $\left(\textbf{s}_n^X(t)\right)_i = {\mathbb{E}\big[X_t^i\big]}$ for all $1 \leq i \leq n$ . We can observe that by definition the matrices $\boldsymbol{\Theta}_n$ form a matryoshkan sequence, and thus by Lemma 1 we achieve the stated transient solution. To prove the steady-state solution, we can first note that if the process has a steady-state distribution then the vector $\textbf{s}_n^X(\infty) \in \mathbb{R}^n$ defined by $\left(\textbf{s}_n^X(\infty)\right)_i = {\mathbb{E}\!\left[X_\infty^i\right]}$ will satisfy

(13) \begin{align}0=\boldsymbol{\Theta}_n \textbf{s}_n^X(\infty)+\boldsymbol{\theta}_{0,n},\end{align}

as this is the equilibrium solution to the differential equation corresponding to each of the moments. This system has a unique solution, since $\boldsymbol{\Theta}_n$ is nonsingular owing to the assumption that the diagonal values are unique and nonzero. Using Proposition 1, we find the nth moment by

\begin{equation*}{\mathbb{E}\!\left[X_\infty^n\right]}=-\textbf{v}_n^{\textrm{T}} \boldsymbol{\Theta}_n^{-1} \boldsymbol{\theta}_{0,n}=\left[\begin{array}{c@{\quad}c}\frac{1}{\theta_{n,n}}\boldsymbol{\theta}_{n}\boldsymbol{\Theta}_{n-1}^{-1} & -\frac{1}{\theta_{n,n}}\end{array}\right]\left[\begin{array}{c}\boldsymbol{\theta}_{0,n-1} \\[5pt]\theta_{0,n}\end{array}\right]=\frac{1}{\theta_{n,n}}\Big(\boldsymbol{\theta}_{n}\boldsymbol{\Theta}_{n-1}^{-1}\boldsymbol{\theta}_{0,n-1}-\theta_{0,n}\Big),\end{equation*}

which completes the proof of Equation (10). To conclude, one can note that each line of the linear system in Equation (13) implies the stated recursion.
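
Before moving to the examples, we note that Theorem 1 translates directly into a short computational routine. The following Python sketch is our own illustration (the interface theta(i, k) and the Ornstein–Uhlenbeck-type coefficients used as a test are assumptions chosen for demonstration); it assembles $\boldsymbol{\Theta}_n$ and $\boldsymbol{\theta}_{0,n}$ from the coefficients in Equation (7) and returns the transient moment vector of Equation (12) together with the steady-state vector from Equation (13).

```python
import numpy as np
from scipy.linalg import expm

def moment_vectors(theta, n, x0, t):
    """theta(i, k): coefficient of E[X^i] in d/dt E[X^k] (Equation (7)), with i = 0
    giving the constant term theta_{0,k}. Returns the transient moment vector at
    time t (Equation (12) solved as in Lemma 1) and the steady-state vector
    (Equation (13)), assuming every theta(k, k) is nonzero."""
    Theta = np.array([[theta(i, k) if i <= k else 0.0
                       for i in range(1, n + 1)] for k in range(1, n + 1)])
    theta0 = np.array([theta(0, k) for k in range(1, n + 1)])
    s0 = np.array([x0 ** k for k in range(1, n + 1)])
    expTt = expm(Theta * t)
    transient = expTt @ s0 - np.linalg.solve(Theta, (np.eye(n) - expTt) @ theta0)
    steady = -np.linalg.solve(Theta, theta0)
    return transient, steady

# Test coefficients: an Ornstein-Uhlenbeck-type diffusion (Subsection 3.4 with gamma = 0),
# for which d/dt E[X^k] = k*mu*E[X^{k-1}] + k*th*E[X^k] + k(k-1)/2 * sig^2 * E[X^{k-2}].
mu, th, sig = 1.0, -2.0, 0.5

def theta_ou(i, k):
    if i == k:
        return k * th
    if i == k - 1:
        return k * mu
    if i == k - 2:
        return 0.5 * k * (k - 1) * sig ** 2
    return 0.0

trans, stat = moment_vectors(theta_ou, n=4, x0=0.0, t=1.0)
print(stat[0], -mu / th)   # steady-state mean agrees with -mu/th
```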

3.2. Application to Hawkes process intensities

For our first example of this method let us turn to our motivating application, the Markovian Hawkes process intensity. Via [Reference Hawkes33], this process is defined as follows. Let $\lambda_t$ be a stochastic arrival process intensity such that

\begin{equation*}\lambda_t=\lambda^* + (\lambda_0 - \lambda^*) e^{-\beta t} + \int_0^t \alpha e^{-\beta (t-s)} \textrm{d}N_s=\lambda^* + (\lambda_0 - \lambda^*) e^{-\beta t} + \sum_{i=1}^{N_t} \alpha e^{-\beta (t - A_i)},\end{equation*}

where $\big\{A_i \mid i \in \mathbb{Z}^+\big\}$ is the sequence of arrival epochs in the point process $N_t$ , with

\begin{equation*}\mathbb{P}\!\left( {N_{t + s} - N_t = 0 \mid \mathcal{F}_t} \right)=\mathbb{P}\!\left( {N_{t + s} - N_t = 0 \mid \lambda_t} \right)=e^{-\int_0^s \lambda_{t + u} \textrm{d}u},\end{equation*}

where $\mathcal{F}_t$ is the filtration generated by the history of $\lambda_t$ up to time t. Let us note that the Hawkes process as defined in [Reference Hawkes33] is not in general a Markov process. If we restrict the excitation kernel to being exponential, then we obtain a Markovian process. In what remains, we will assume that the excitation kernel is exponential. Moreover, we will assume that $\beta > \alpha > 0$ , so that the process has a stationary distribution, and we will also let $\lambda^* > 0$ and $\lambda_0 > 0$ . Note that the process behaves as follows: at arrivals $\lambda_t$ increases by $\alpha$ , and in the interims it decays exponentially at rate $\beta$ towards the baseline level $\lambda^*$ . Thus, $(\lambda_t, N_t)$ is referred to as a self-exciting point process, as the occurrence of an arrival increases the intensity and thus increases the likelihood that another arrival will occur soon afterwards. Because the intensity $\lambda_t$ forms a Markov process, we can follow the literature (see, e.g., Equation (7) of [Reference Da Fonseca and Zaatour15]) and write its infinitesimal generator for a (sufficiently regular) function $f\,:\,\mathbb{R}^+ \to \mathbb{R}$ as follows:

\begin{equation*}\mathcal{L}\,f(\lambda_t) = \lambda_t \!\left(\,f(\lambda_t + \alpha) - f(\lambda_t)\right) - \beta \!\left(\lambda_t - \lambda^*\right) f^{\prime}(\lambda_t),\end{equation*}

where $f^{\prime}(x)$ is the first derivative of $f({\cdot})$ evaluated at x. Note that this expression showcases the process dynamics that we have described, as the first term on the right-hand side corresponds to the product of the arrival rate and the change in the process when an arrival occurs, while the second term captures the decay.

To obtain the nth moment we must consider $f({\cdot})$ of the form $f(x) = x^n$ . In the simplest case, when $n = 1$ , this formula yields an ODE for the mean, which can be written as

\begin{equation*}\frac{\textrm{d}}{\textrm{d}t}{\mathbb{E}\!\left[\lambda_t\right]}=\alpha{\mathbb{E}\!\left[\lambda_t\right]} - \beta\!\left({\mathbb{E}\!\left[\lambda_t\right]} - \lambda^*\right)=\beta \lambda^*-(\beta - \alpha) {\mathbb{E}\!\left[\lambda_t\right]}.\end{equation*}

By comparison, for the second moment at $n = 2$ we have

\begin{equation*}\frac{\textrm{d}}{\textrm{d}t}{\mathbb{E}\big[\lambda_t^2\big]}={\mathbb{E}\!\left[\lambda_t \big((\lambda_t + \alpha)^2 - \lambda_t^2\big) - 2 \beta \lambda_t \big(\lambda_t - \lambda^*\big)\right]}=\big(2 \beta \lambda^* + \alpha^2\big){\mathbb{E}\!\left[ \lambda_t \right]}-2(\beta-\alpha){\mathbb{E}\big[\lambda_t^2\big]},\end{equation*}

and we can note that while the ODE for the mean is autonomous, the second moment equation depends on both the mean and the second moment. Thus, to solve for the second moment we must also solve for the mean, leading us to the following system of differential equations:

\begin{equation*}\frac{\textrm{d}}{\textrm{d}t}\!\left[\begin{array}{c}{\mathbb{E}\!\left[\lambda_t\right]}\\[5pt] {\mathbb{E}\!\left[\lambda_t^2\right]}\\[2pt]\end{array}\right]=\left[\begin{array}{c@{\quad}c}-(\beta - \alpha) & 0 \\[5pt]2\beta\lambda^* + \alpha^2 & -2(\beta-\alpha)\end{array}\right]\left[\begin{array}{c}{\mathbb{E}\!\left[\lambda_t\right]}\\[5pt]{\mathbb{E}\!\left[\lambda_t^2\right]}\\[2pt]\end{array}\right]+\left[\begin{array}{c}\beta\lambda^* \\[3pt] 0\end{array}\right].\end{equation*}

Moving on to the third moment, the infinitesimal generator formula yields

\begin{align*}\frac{\textrm{d}}{\textrm{d}t}{\mathbb{E}\big[\lambda_t^3\big]}&={\mathbb{E}\!\left[\lambda_t\big((\lambda_t + \alpha)^3 - \lambda_t^3\big) - 3\beta \lambda_t^{2}\!\left(\lambda_t - \lambda^*\right)\right]}\\&=\alpha^{3}{\mathbb{E}\!\left[\lambda_t \right]}+3\big(\beta\lambda^* + \alpha^2\big){\mathbb{E}\big[\lambda_t^{2}\big]}-3(\beta - \alpha) {\mathbb{E}\big[\lambda_t^{3} \big]},\end{align*}

and hence we see that this ODE now depends on all of the first three moments. Thus, to solve for ${\mathbb{E}\!\left[\lambda_t^3\right]}$ we need to solve the system of ODEs

\begin{equation*}\frac{\textrm{d}}{\textrm{d}t}\!\left[\begin{array}{c}{\mathbb{E}\!\left[\lambda_t\right]}\\[5pt]{\mathbb{E}\!\left[\lambda_t^2\right]}\\[5pt] {\mathbb{E}\!\left[\lambda_t^3\right]}\\[2pt]\end{array}\right]=\left[\begin{array}{c@{\quad}c@{\quad}c}-(\beta - \alpha) & 0 & 0\\[5pt]2\beta\lambda^* + \alpha^2 & -2(\beta-\alpha) & 0 \\[5pt]\alpha^3 & 3\big(\beta\lambda^* + \alpha^2\big) & - 3(\beta-\alpha)\end{array}\right]\left[\begin{array}{c}{\mathbb{E}\!\left[\lambda_t\right]}\\[5pt] {\mathbb{E}\!\left[\lambda_t^2\right]}\\[5pt] {\mathbb{E}\!\left[\lambda_t^3\right]}\\[2pt]\end{array}\right]+\left[\begin{array}{c}\beta\lambda^* \\[5pt] 0 \\[5pt] 0\end{array}\right],\end{equation*}

and this now suggests the matryoshkan structure of these process moments: we can note that the system for the second moment is nested within the system for the third moment. That is, the matrix for the three-dimensional system contains the two-dimensional system in its upper left-hand block, just as the vector of the first three moments has the first two moments in its first two coordinates. In general, we can see that the nth moment will satisfy the ODE given by

\begin{align*}\frac{\textrm{d}}{\textrm{d}t}{\mathbb{E}\!\left[\lambda_t^n\right]}&={\mathbb{E}\big[\lambda_t\!\left((\lambda_t + \alpha)^n - \lambda_t^n\right) - n\beta \lambda_t^{n-1}\!\left(\lambda_t - \lambda^*\right)\!\big]}\\&=\sum_{k=1}^{n}\!\left(\begin{array}{c}n\\k - 1\end{array}\right)\alpha^{n-k+1}{\mathbb{E}\big[\lambda_t^{k} \big]}-n\beta {\mathbb{E}\!\left[\lambda_t^{n} \right]}+n\beta\lambda^* {\mathbb{E}\big[\lambda_t^{n-1}\big]},\end{align*}

where we have simplified by use of the binomial theorem. Thus, the system of differential equations needed to solve for the nth moment uses the matrix from the $(n-1)$ th system augmented below by the row

(14) \begin{align}\left[\begin{array}{c@{\quad}c@{\quad}c@{\quad}c@{\quad}c@{\quad}c@{\quad}c}\alpha^n & n\alpha^{n-1},& \scriptsize\left(\begin{array}{c}n\\2\\\end{array}\right)\alpha^{n-2},& \dots,& \scriptsize\left(\begin{array}{c}n\\n - 3\\\end{array}\right)\alpha^3,& \scriptsize\left(\begin{array}{c}n\\n - 2\\\end{array}\right)\alpha^2 + n \beta\lambda^*,& -n(\beta-\alpha)\end{array}\right],\end{align}

and buffered on the right by a column of zeros. To collect these coefficients into a coherent structure, let us define the matrix $\boldsymbol{\mathcal{P}}_n(a) \in \mathbb{R}^{n \times n}$ for $a \in \mathbb{R}$ such that

(15) \begin{align}\left(\boldsymbol{\mathcal{P}}_n(a)\right)_{i,j}=\begin{cases}\scriptsize\big(\begin{array}{c}i\\[-3pt]\,j - 1\end{array}\big) a^{i-j+1}, & i \geq j, \\[5pt]0, & i < j .\end{cases}\end{align}

If we momentarily disregard the terms with $\beta$ in the general augment row in Equation (14), one can observe that the remaining terms in this vector are given by the bottom row of the matrix $\boldsymbol{\mathcal{P}}_n(\alpha)$ . Furthermore, by definition $\big\{\boldsymbol{\mathcal{P}}_n(a) \mid n \in \mathbb{Z}^+\big\}$ forms a matryoshkan matrix sequence. We can also note that $\boldsymbol{\mathcal{P}}_n(a)$ can be equivalently defined as

\begin{equation*}\boldsymbol{\mathcal{P}}_n(a)=\sum_{k=1}^na\!\left[\begin{array}{c@{\quad}c}\textbf{0}_{n-k \times n-k} & \textbf{0}_{n-k \times k}\\[6pt]\textbf{0}_{k \times n-k} & \textbf{L}_k(a)\end{array}\right],\end{equation*}

where $\textbf{L}_k(a) = e^{a\textbf{diag}\!\left(1:k-1,-1\right)}$ is the kth lower triangular Pascal matrix; i.e. the nonzero terms in $\textbf{L}_k(1)$ yield the first k rows of Pascal’s triangle. Alternatively, $\boldsymbol{\mathcal{P}}_n(a)$ can be found by creating a lower triangular matrix from the strictly lower triangular values in $\textbf{L}_{n+1}(a)$ . For these reasons, we refer to the sequence of $\boldsymbol{\mathcal{P}}_n(a)$ as matryoshkan Pascal matrices. For brief overviews and beautiful properties of Pascal matrices, see [Reference Brawer and Pirovino8, Reference Call and Velleman9, Reference Zhang49, Reference Edelman and Strang21]. As we have seen in the preceding derivation, matryoshkan Pascal matrices arise naturally in using the infinitesimal generator to calculate moments of Markov processes. This follows from the application of the binomial theorem to jump terms. Now, in the case of the Markovian Hawkes process intensity we find closed-form expressions for all transient moments in Corollary 1.

Corollary 1. Let $\lambda_t$ be the intensity of a Hawkes process with baseline intensity $\lambda^* > 0$ , intensity jump $\alpha > 0$ , and decay rate $\beta > \alpha$ . Then the nth moment of $\lambda_t$ is given by

(16) \begin{align}{\mathbb{E}\!\left[\lambda_t^n\right]}&=\textbf{m}^\lambda_{n}\!\left(\textbf{M}^\lambda_{n-1} + n(\beta-\alpha)\textbf{I}\right)^{-1}\big(e^{\textbf{M}^\lambda_{n-1} t} - e^{-n(\beta-\alpha)t}\textbf{I}\big)\left(\textbf{x}_{n-1}(\lambda_0)-\frac{\beta \lambda^* \textbf{v}_1 }{n(\beta-\alpha)}\right)\nonumber\\&\quad+\lambda_0^n e^{-n(\beta-\alpha)t}+\mathbb{I}_{\{n = 1\}}\frac{\beta\lambda^*}{\beta - \alpha}\big(1 - e^{-(\beta - \alpha) t}\big)\nonumber\\&\quad-\frac{\beta\lambda^*}{n(\beta-\alpha)}\textbf{m}^\lambda_{n}\!\left(\textbf{M}^\lambda_{n-1}\right)^{-1}\Big(\textbf{I}-e^{\textbf{M}^\lambda_{n-1} t}\Big)\textbf{v}_{1},\end{align}

for all $t \geq 0$ and $n \in \mathbb{Z}^+$ , where $\textbf{v}_1 \in \mathbb{R}^n$ is the unit vector in the first coordinate, $\textbf{M}_n^\lambda = \beta \lambda^* \textbf{diag}\!\left(2\,:\,n,-1\right) - \beta \textbf{diag}\!\left(1\,:\,n\right) + \boldsymbol{\mathcal{P}}_n(\alpha)$ , $\textbf{m}_n^{\lambda}=\left[\left(\textbf{M}_n^\lambda\right)_{n,1},\,\dots,\,\left(\textbf{M}_n^\lambda\right)_{n,n-1}\right]$ is given by

\begin{align*}\left(\textbf{m}_n^{\lambda}\right)_{j}&=\begin{cases}\scriptsize\big(\begin{array}{c}n\\j - 1\end{array}\big)\alpha^{n-j+1}& if\ \ j < n-1,\\[8pt]\scriptsize\big(\begin{array}{c}n\\n - 2\end{array}\big) \alpha^{2} + n\beta \lambda^*& if\ \ j =n-1,\end{cases}\end{align*}

and $\textbf{x}_n(a) \in \mathbb{R}^n$ is such that $\left(\textbf{x}_n(a)\right)_i = a^i$ . In steady state, the nth moment of $\lambda_t$ is given by

(17) \begin{align}\lim_{t \to \infty}{\mathbb{E}\!\left[\lambda_t^n\right]}=-\frac{\beta\lambda^*}{n(\beta - \alpha)}\textbf{m}_n^\lambda \big(\textbf{M}_{n-1}^\lambda\big)^{-1}\textbf{v}_1,\end{align}

for $n \geq 2$ , with $\lim_{t \to \infty} {\mathbb{E}\!\left[\lambda_t\right]} = \frac{\beta \lambda^*}{\beta - \alpha}$ . Moreover, the $(n+1)$ th steady-state moment of the Hawkes process intensity is given by the recursion

(18) \begin{align}\lim_{t \to \infty}{\mathbb{E}\big[\lambda_t^{n+1}\big]} = \frac{1}{(n+1)(\beta-\alpha)}\textbf{m}_{n+1}^\lambda \textbf{s}_n^\lambda ,\end{align}

for all $n \in \mathbb{Z}^+$ , where $\textbf{s}_n^\lambda \in \mathbb{R}^n$ is the vector of steady-state moments defined so that $\left(\textbf{s}_n^\lambda\right)_i = \lim_{t \to \infty}{\mathbb{E}\!\left[\lambda_t^i\right]}$ for $1 \leq i \leq n$ .
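
As a brief numerical illustration of Corollary 1 (with example parameter values that are ours, not from the text), the sketch below builds $\textbf{M}_n^\lambda$ from the matryoshkan Pascal matrix of Equation (15), iterates the steady-state recursion (18), and then confirms that the resulting moment vector solves the stationary system (13).

```python
import numpy as np
from math import comb

alpha, beta, lam_star = 0.8, 1.2, 1.0      # example parameters with beta > alpha > 0
n_max = 5

# Matryoshkan Pascal matrix P_n(alpha) from Equation (15).
P = np.array([[comb(i, j - 1) * alpha ** (i - j + 1) if i >= j else 0.0
               for j in range(1, n_max + 1)] for i in range(1, n_max + 1)])

# M_n^lambda = beta*lam_star*diag(2:n, -1) - beta*diag(1:n) + P_n(alpha).
M = (beta * lam_star * np.diag(np.arange(2, n_max + 1), -1)
     - beta * np.diag(np.arange(1, n_max + 1)) + P)

# Steady-state recursion (18): s_{n+1} = m_{n+1} s_n / ((n+1)(beta - alpha)).
s = [beta * lam_star / (beta - alpha)]     # first moment beta*lam_star/(beta - alpha)
for n in range(1, n_max):
    s.append(float(M[n, :n] @ np.array(s)) / ((n + 1) * (beta - alpha)))

print(np.round(s, 4))                      # steady-state moments E[lambda], ..., E[lambda^5]
# Consistency check with the full linear system 0 = M s + beta*lam_star*e_1 (Equation (13)).
print(np.allclose(M @ np.array(s), -beta * lam_star * np.eye(n_max)[:, 0]))
```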

3.3. Application to shot noise processes

As a second example of calculating moments through matryoshkan matrices, consider a Markovian shot noise process; see e.g. [Reference Daley and Vere-Jones16, Chapter 6] for an introduction. That is, let $\psi_t$ be defined so that

\begin{equation*}\psi_t=\sum_{i=1}^{N_t} J_i e^{-\beta (t - A_i)},\end{equation*}

where $\beta > 0$ , $\big\{J_i \mid i \in \mathbb{Z}^+\big\}$ is a sequence of independent and identically distributed positive random variables with cumulative distribution function $F_J({\cdot})$ , $N_t$ is a Poisson process at rate $\lambda > 0$ , and $\big\{A_i \mid i \in \mathbb{Z}^+\big\}$ is the sequence of arrival times in the Poisson process. These dynamics yield the following infinitesimal generator:

\begin{equation*}\mathcal{L}\,f(\psi_t)=\lambda \int_0^\infty \!\left(\,f(\psi_t + j) - f(\psi_t)\right) \textrm{d}F_J(\,j) - \beta \psi_t f^{\prime}(\psi_t),\end{equation*}

which can be simplified directly from Equation (2.2) of [Reference Dassios and Zhao17]. We can note that this is similar to the Hawkes process discussed in Subsection 3.2, as the right-hand side contains a term for jumps and a term for exponential decay. However, this infinitesimal generator formula also shows key differences between the two processes, as the jumps in the shot noise process are of random size and they occur at the fixed, exogenous rate $\lambda > 0$ . Supposing the mean jump size is finite, this now yields that the mean satisfies the ODE

\begin{equation*}\frac{\textrm{d}}{\textrm{d}t}{\mathbb{E}\!\left[\psi_t\right]}=\lambda {\mathbb{E}\!\left[J_1\right]} - \beta {\mathbb{E}\!\left[\psi_t\right]},\end{equation*}

whereas if ${\mathbb{E}\big[J_1^2\big]} < \infty$ , the second moment of the shot noise process is given by the solution to

\begin{equation*}\frac{\textrm{d}}{\textrm{d}t}{\mathbb{E}\big[\psi_t^2\big]}={\mathbb{E}\!\left[\lambda \big( (\psi_t + J_1)^2 - \psi_t^2 \big) - 2 \beta \psi_t^2\right]}=\lambda {\mathbb{E}\big[J_1^2\big]} + 2\lambda {\mathbb{E}\!\left[J_1\right]} {\mathbb{E}\!\left[\psi_t\right]} - 2 \beta {\mathbb{E}\big[\psi_t^2\big]},\end{equation*}

which depends on both the second moment and the mean. This gives rise to the linear system of differential equations

\begin{equation*}\frac{\textrm{d}}{\textrm{d}t}\!\left[\begin{array}{c}{\mathbb{E}\!\left[\psi_t\right]}\\[5pt] {\mathbb{E}\big[\psi_t^2\big]}\\[2pt] \end{array}\right]=\left[\begin{array}{c@{\quad}c}-\beta & 0 \\[5pt] 2\lambda{\mathbb{E}\!\left[J_1\right]} & -2\beta\end{array}\right]\left[\begin{array}{c}{\mathbb{E}\!\left[\psi_t\right]}\\[5pt] {\mathbb{E}\big[\psi_t^2\big]}\\[2pt]\end{array}\right]+\left[\begin{array}{c}\lambda {\mathbb{E}\!\left[J_1\right]} \\[5pt] \lambda {\mathbb{E}\big[J_1^2\big]}\\[2pt]\end{array}\right],\end{equation*}

and by observing that the differential equation for the third moment depends on the first three moments if the third moment of the jump size is finite,

\begin{align*}\frac{\textrm{d}}{\textrm{d}t}{\mathbb{E}\big[\psi_t^3\big]} =\, & {\mathbb{E}\!\left[\lambda \!\left( (\psi_t + J_1)^3 - \psi_t^3 \right) - 3 \beta \psi_t^3\right]}=\lambda {\mathbb{E}\big[J_1^3\big]} + 3\lambda {\mathbb{E}\big[J_1^2\big]} {\mathbb{E}\!\left[\psi_t\right]}\\& + 3\lambda{\mathbb{E}\!\left[J_1\right]} {\mathbb{E}\big[\psi_t^2\big]} - 3 \beta {\mathbb{E}\big[\psi_t^3\big]},\end{align*}

we can see that the system for the first two moments is again contained in the system for the first three moments:

\begin{equation*}\frac{\textrm{d}}{\textrm{d}t}\!\left[\begin{array}{c}{\mathbb{E}\!\left[\psi_t\right]}\\[5pt] {\mathbb{E}\big[\psi_t^2\big]}\\[5pt] {\mathbb{E}\big[\psi_t^3\big]}\end{array}\right]=\left[\begin{array}{c@{\quad}c@{\quad}c}-\beta & 0 & 0 \\[5pt]2\lambda{\mathbb{E}\!\left[J_1\right]} & -2\beta & 0 \\[5pt]3\lambda{\mathbb{E}\big[J_1^2\big]} & 3\lambda{\mathbb{E}\!\left[J_1\right]} & - 3\beta\\[3pt]\end{array}\right]\left[\begin{array}{c}{\mathbb{E}\!\left[\psi_t\right]}\\[5pt] {\mathbb{E}\big[\psi_t^2\big]}\\[5pt] {\mathbb{E}\big[\psi_t^3\big]}\end{array}\right]+\left[\begin{array}{c}\lambda {\mathbb{E}\!\left[J_1\right]} \\[5pt] \lambda {\mathbb{E}\big[J_1^2\big]} \\[5pt] \lambda {\mathbb{E}\big[J_1^3\big]}\end{array}\right].\end{equation*}

By use of the binomial theorem, we can observe that if ${\mathbb{E}\!\left[J_1^n\right]} < \infty$ then the nth moment of the shot noise process satisfies

\begin{equation*}\frac{\textrm{d}}{\textrm{d}t}{\mathbb{E}\!\left[\psi_t^n\right]}={\mathbb{E}\!\left[\lambda \!\left( (\psi_t + J_1)^n - \psi_t^n \right) - n \beta \psi_t^n\right]}=\lambda\sum_{k=0}^{n-1} \!\left(\begin{array}{c}n\\k\end{array}\right){\mathbb{E}\!\left[J_1^{n-k}\right]}{\mathbb{E}\!\left[\psi_t^k\right]} - n \beta {\mathbb{E}\!\left[\psi_t^n\right]},\end{equation*}

which means that the n-dimensional system is equal to the preceding one augmented below by the row vector

\begin{equation*}\left[\begin{array}{c@{\quad}c@{\quad}c@{\quad}c@{\quad}c@{\quad}c@{\quad}c}n\lambda{\mathbb{E}\big[J_1^{n-1}\big]},& \scriptsize\left(\begin{array}{c}n\\2\end{array}\right)\!\lambda{\mathbb{E}\big[J_1^{n-2}\big]},& \scriptsize\left(\begin{array}{c}n\\3\end{array}\right)\!\lambda{\mathbb{E}\big[J_1^{n-3}\big]},& \dots,& \scriptsize\left(\begin{array}{c}n\\n - 2\end{array}\right)\!\lambda {\mathbb{E}\big[J_1^2\big]},& n\lambda{\mathbb{E}\!\left[J_1\right]} & -n\beta\end{array}\right],\end{equation*}

and to the right by zeros. Bringing these pieces together leads us to Corollary 2.

Corollary 2. Let $\psi_t$ be the intensity of a shot noise process with epochs given by a Poisson process with rate $\lambda > 0$ , jump sizes drawn from the sequence of independent and identically distributed random variables $\big\{J_i \mid i \in \mathbb{Z}^+\big\}$ , and exponential decay at rate $\beta > 0$ . If ${\mathbb{E}\!\left[J_1^n\right]} < \infty$ , the nth moment of $\psi_t$ is given by

(19) \begin{align}{\mathbb{E}\!\left[\psi_t^n\right]}&=\textbf{m}^{\psi}_{n}\!\left(\textbf{M}^{\psi}_{n-1} + n \beta\textbf{I}\right)^{-1}\!\left(e^{\textbf{M}^{\psi}_{n-1} t} - e^{-n \beta t}\textbf{I}\right)\left(\textbf{x}_{n-1}\!\left(\psi_0\right)-\frac{\lambda \textbf{j}_{n-1}}{n\beta}\right)+\psi_0^n e^{-n\beta t}\nonumber\\&\quad+\frac{\lambda{\mathbb{E}\!\left[J_1^n\right]}}{n\beta}\!\left(1-e^{-n\beta t}\right)-\frac{\lambda \textbf{m}^{\psi}_{n}}{n\beta}\!\left(\textbf{M}^{\psi}_{n-1}\right)^{-1}\Big(\textbf{I}-e^{\textbf{M}^{\psi}_{n-1}t}\Big)\textbf{j}_{n-1},\end{align}

for all $t \geq 0$ and $n \in \mathbb{Z}^+$ , where $\textbf{j}_{n} \in \mathbb{R}^n$ is such that $\left(\textbf{j}_{n}\right)_i = {\mathbb{E}\!\left[J_1^i\right]}$ ; $\textbf{M}_n^\psi \in \mathbb{R}^{n \times n}$ is recursively defined by

\begin{equation*}\textbf{M}_n^\psi=\left[\begin{array}{c@{\quad}c}\textbf{M}_{n-1}^\psi & \textbf{0}_{n-1 \times 1} \\[5pt]\textbf{m}_n^\psi & - n\beta\end{array}\right],\end{equation*}

with the row vector $\textbf{m}_n^{\psi} \in \mathbb{R}^{n-1}$ defined so that $\big(\textbf{m}_n^{\psi}\big)_i = \scriptsize\left(\begin{array}{c}n\\i\end{array}\right)\!\lambda {\mathbb{E}\big[J_1^{n-i}\big]}$ , and with $\textbf{M}_1^\psi = -\beta$ ; and $\textbf{x}_n(a) \in \mathbb{R}^n$ is such that $\left(\textbf{x}_n(a)\right)_i = a^i$ . In steady state, the nth moment of the shot noise process is given by

(20) \begin{align}\lim_{t \to \infty} {\mathbb{E}\!\left[\psi_t^n\right]}&=\frac{\lambda}{n\beta}\!\left({\mathbb{E}\!\left[J_1^n\right]} - \textbf{m}_n^\psi \!\left( \textbf{M}_{n-1}^\psi\right)^{-1} \textbf{j}_{n-1} \right)\end{align}

for $n \geq 2$ , where $\lim_{t \to \infty} {\mathbb{E}\!\left[\psi_t\right]} = \frac{\lambda}{\beta}{\mathbb{E}\!\left[J_1\right]}$ . Moreover, if ${\mathbb{E}\big[J_1^{n+1}\big]} < \infty$ , the $(n+1)$ th steady-state moment of the shot noise process is given by the recursion

(21) \begin{align}\lim_{t \to \infty} {\mathbb{E}\big[\psi_t^{n+1}\big]}&=\frac{1}{(n+1)\beta}\!\left(\textbf{m}_{n+1}^\psi \textbf{s}_n^\psi + \lambda{\mathbb{E}\big[J_1^{n+1}\big]}\right),\end{align}

for all $n \in \mathbb{Z}^+$ , where $\textbf{s}_n^\psi \in \mathbb{R}^n$ is the vector of steady-state moments defined so that $\big(\textbf{s}_n^\psi\big)_i = \lim_{t \to \infty} {\mathbb{E}\!\left[\psi_t^i\right]}$ for $1 \leq i \leq n$ .
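
The same pattern applies to Corollary 2. In the sketch below (ours; exponentially distributed jump sizes with rate $\eta$ are an assumed example, so that ${\mathbb{E}[J_1^k]} = k!/\eta^k$ is available in closed form), we build $\textbf{M}_n^\psi$, iterate the steady-state recursion (21), and verify the result against the stationary system (13).

```python
import numpy as np
from math import comb, factorial

lam, beta, eta = 2.0, 1.5, 3.0             # Poisson rate, decay rate, jump-size rate
n_max = 5
EJ = [factorial(k) / eta ** k for k in range(n_max + 1)]    # E[J_1^k] for Exp(eta) jumps

# M_n^psi: entry (k, i) is C(k, i)*lam*E[J^{k-i}] for i < k, and -k*beta on the diagonal.
M = np.array([[comb(k, i) * lam * EJ[k - i] if i < k else (-k * beta if i == k else 0.0)
               for i in range(1, n_max + 1)] for k in range(1, n_max + 1)])

# Steady-state recursion (21): s_{n+1} = (m_{n+1} s_n + lam*E[J^{n+1}]) / ((n+1)*beta).
s = [lam * EJ[1] / beta]                   # first moment lam*E[J]/beta
for n in range(1, n_max):
    s.append((float(M[n, :n] @ np.array(s)) + lam * EJ[n + 1]) / ((n + 1) * beta))

print(np.round(s, 4))                      # steady-state moments E[psi], ..., E[psi^5]
# Consistency check with the full linear system 0 = M s + lam * j_n (Equation (13)).
theta0 = lam * np.array(EJ[1:n_max + 1])
print(np.allclose(M @ np.array(s), -theta0))
```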

3.4. Application to Itô diffusions

For our third example, we consider an Itô diffusion; see e.g. [Reference Øksendal45] for an overview. Let $S_t$ be given by the stochastic differential equation

\begin{equation*}\textrm{d}S_t = g(S_t) \textrm{d}t + h(S_t) \textrm{d}B_t,\end{equation*}

where $B_t$ is a Brownian motion and $g({\cdot})$ and $h({\cdot})$ are real-valued functions. By Theorem 7.9 of [Reference Øksendal45], the infinitesimal generator for this process is given by

\begin{equation*}\mathcal{L}\,f(S_t)=g(S_t) f^{\prime}(S_t)+\frac{h(S_t)^2}{2} f^{\prime\prime}(S_t),\end{equation*}

where $f^{\prime\prime}(x)$ is the second derivative of $f({\cdot})$ evaluated at x. Because we will be considering functions of the form $f(x) = x^n$ for $n \in \mathbb{Z}^+$ , we now specify the forms of $g({\cdot})$ and $h({\cdot})$ to be $g(x) = \mu + \theta x$ for some $\mu, \theta \in \mathbb{R}$, and $h(x) = \sigma x^{\gamma/ 2}$ for some $\sigma \in \mathbb{R}$ and $\gamma \in \{0,1,2\}$ . One can note that this encompasses a myriad of relevant stochastic processes, including many that are popular in the financial models literature, such as Ornstein–Uhlenbeck processes, geometric Brownian motion, and Cox–Ingersoll–Ross processes. In this case, the infinitesimal generator becomes

\begin{equation*}\mathcal{L}\,f(S_t)=\left(\mu + \theta S_t\right) f^{\prime}(S_t)+\frac{\sigma^2 S_t^{\gamma}}{2} f^{\prime\prime}(S_t),\end{equation*}

meaning that we can express the ODE for the mean as

\begin{equation*}\frac{\textrm{d}}{\textrm{d}t}{\mathbb{E}\!\left[S_t\right]}=\mu + \theta {\mathbb{E}\!\left[S_t\right]},\end{equation*}

and similarly the second moment will be given by the solution to

\begin{equation*}\frac{\textrm{d}}{\textrm{d}t}{\mathbb{E}\big[S_t^2\big]}={\mathbb{E}\big[2(\mu + \theta S_t) S_t + \sigma^2 S_t^{\gamma}\big]}=2\mu {\mathbb{E}\!\left[S_t\right]} + 2\theta {\mathbb{E}\big[S_t^2\big]} + \sigma^2 {\mathbb{E}\!\left[S_t^\gamma\right]}.\end{equation*}

For the sake of example, we now let $\gamma = 1$ , as is the case in the Cox–Ingersoll–Ross process. Then the first two transient moments of $S_t$ will be given by the solution to the system

\begin{equation*}\frac{\textrm{d}}{\textrm{d}t}\!\left[\begin{array}{c}{\mathbb{E}\!\left[S_t\right]} \\[5pt]{\mathbb{E}\big[S_t^2\big]}\\[2pt]\end{array}\right]=\left[\begin{array}{c@{\quad}c}\theta & 0 \\[5pt]2 \mu + \sigma^2 & 2\theta\end{array}\right]\left[\begin{array}{c}{\mathbb{E}\!\left[S_t\right]} \\[5pt]{\mathbb{E}\big[S_t^2\big]}\\[2pt]\end{array}\right]+\left[\begin{array}{c}\mu \\[5pt]0\end{array}\right].\end{equation*}

By observing that the third moment differential equation is

\begin{equation*}\frac{\textrm{d}}{\textrm{d}t}{\mathbb{E}\big[S_t^3\big]}={\mathbb{E}\Big[3(\mu + \theta S_t) S_t^2 + 3\sigma^2 S_t^{\gamma + 1}\Big]}=3\mu {\mathbb{E}\big[S_t^2\big]} + 3\theta {\mathbb{E}\big[S_t^3\big]} + 3 \sigma^2 {\mathbb{E}\big[S_t^{\gamma + 1}\big]},\end{equation*}

we can note that the third moment system for $\gamma = 1$ is

\begin{equation*}\frac{\textrm{d}}{\textrm{d}t}\left[\begin{array}{c}{\mathbb{E}\!\left[S_t\right]} \\[5pt]{\mathbb{E}\big[S_t^2\big]} \\[5pt]{\mathbb{E}\big[S_t^3\big]}\\[2pt]\end{array}\right]=\left[\begin{array}{c@{\quad}c@{\quad}c}\theta & 0 & 0 \\[5pt]2 \mu + \sigma^2 & 2\theta & 0 \\[5pt]0 & 3\mu + 3 \sigma^2 & 3\theta\end{array}\right]\left[\begin{array}{c}{\mathbb{E}\!\left[S_t\right]} \\[5pt]{\mathbb{E}\big[S_t^2\big]} \\[5pt]{\mathbb{E}\big[S_t^3\big]}\\[2pt]\end{array}\right]+\left[\begin{array}{c}\mu \\[5pt]0 \\[5pt]0\end{array}\right],\end{equation*}

and this showcases the matryoshkan nesting structure, as the second moment system is contained within the third. Because the general nth moment for $n \geq 2$ has differential equation given by

\begin{align*}\frac{\textrm{d}}{\textrm{d}t}{\mathbb{E}\!\left[S_t^n\right]}&={\mathbb{E}\!\left[n(\mu + \theta S_t) S_t^{n-1} + \frac{n(n-1)\sigma^2}{2} S_t^{n + \gamma - 2}\right]}\\&=n\mu {\mathbb{E}\big[S_t^{n-1}\big]} + n\theta {\mathbb{E}\!\left[S_t^n\right]} + \frac{n(n-1)\sigma^2}{2} {\mathbb{E}\Big[S_t^{n + \gamma - 2}\Big]},\end{align*}

we can see that, for $\gamma = 1$ , the $(n-1)$ th system can be augmented below by the row vector

\begin{equation*}\left[\begin{array}{c@{\quad}c@{\quad}c@{\quad}c@{\quad}c@{\quad}c}0, & 0, & \dots, & 0, & n\mu + \frac{n(n-1)\sigma^2}{2}, & n\theta\end{array}\right],\end{equation*}

and to the right by zeros. Through this observation, we can now give the moments of Itô diffusions in Corollary 3.

Corollary 3. Let $S_t$ be an Itô diffusion that satisfies the stochastic differential equation

(22) \begin{align}\textrm{d}S_t=(\mu + \theta S_t)\textrm{d}t+\sigma S_t^{\gamma / 2} \textrm{d} B_t,\end{align}

where $B_t$ is a Brownian motion, $\mu, \theta, \sigma \in \mathbb{R}$ , and $\gamma \in \{0, 1, 2\}$ . Then the nth moment of $S_t$ is given by

(23) \begin{align}{\mathbb{E}\!\left[S_t^n\right]}&=\textbf{m}^S_{n}\Big(\textbf{M}^S_{n-1} - \chi_n\textbf{I}\Big)^{-1}\Big(e^{\textbf{M}^S_{n-1} t} - e^{\chi_n t}\textbf{I}\Big)\left(\textbf{x}_{n-1}(S_0)+\frac{\mu \textbf{v}_1 + \sigma^2\mathbb{I}_{\{\gamma = 0\}}\textbf{v}_2}{\chi_n}\right)\nonumber\\&\quad+S_0^n e^{\chi_n t}-\big(\mu \mathbb{I}_{\{n=1\}} + \sigma^2\mathbb{I}_{\{\gamma = 0, n= 2\}}\big)\frac{1-e^{\chi_n t}}{\chi_n}\nonumber\\&\quad+\frac{\textbf{m}^S_{n}}{\chi_{n}}\!\left(\textbf{M}^S_{n-1} \right)^{-1}\!\left(\textbf{I}-e^{\textbf{M}^S_{n-1}t}\right)\left(\mu\textbf{v}_1 + \sigma^2\mathbb{I}_{\{\gamma = 0\}}\textbf{v}_2\right),\end{align}

for all $t \geq 0$ and $n \in \mathbb{Z}^+$ , where $\chi_n = n\theta + \frac{n}{2}(n-1)\sigma^2\mathbb{I}_{\{\gamma = 2\}}$ ;

\begin{equation*}\textbf{M}^S_n = \theta \textbf{diag}(1\,:\,n) + \mu \textbf{diag}(2\,:\,n,-1) + \frac{\sigma^2}{2}\textbf{diag}\!\left(\textbf{d}^{2-\gamma}_{n+\gamma-2}, \gamma - 2 \right)\end{equation*}

for $\textbf{d}^j_k \in \mathbb{R}^k$ such that $\big(\textbf{d}_k^j\big)_i = (\,j+i)(\,j+i-1)$ ;

\begin{equation*}\textbf{m}_n^{S}=\left[\big(\textbf{M}_n^S\big)_{n,1},\,\dots,\,\big(\textbf{M}_n^S\big)_{n,n-1}\right]\end{equation*}

is such that

\begin{equation*}\left(\textbf{m}_n^{S}\right)_{j}=\begin{cases}n\mu + \frac{n(n-1)\sigma^2}{2}\mathbb{I}_{\{\gamma = 1\}}, & j = n -1, \\[5pt]\frac{n(n-1)\sigma^2}{2}\mathbb{I}_{\{\gamma = 0\}}, & j = n - 2, \\[4pt]0, & 1 \leq j < n - 2;\end{cases}\end{equation*}

and $\textbf{x}_n(a) \in \mathbb{R}^n$ is such that $\left(\textbf{x}_n(a)\right)_i = a^i$ . If $\theta < 0$ and $\gamma \in \{0,1\}$ , then the nth steady-state moment of $S_t$ is given by

(24) \begin{align}\lim_{t \to \infty}{\mathbb{E}\!\left[S_t^n\right]}=\frac{\mu}{\chi_n}\textbf{m}_n^S \!\left(\textbf{M}_{n-1}^S\right)^{-1} \textbf{v}_1,\end{align}

for $n \geq 2$ , with $\lim_{t \to \infty} {\mathbb{E}\!\left[S_t\right]} = -\frac{\mu}{\theta}$ . Moreover, the $(n+1)$ th steady-state moment of $S_t$ is given by the recursion

(25) \begin{align}\lim_{t \to \infty} {\mathbb{E}\big[S_t^{n+1}\big]}&=-\frac{1}{\chi_{n+1}}\textbf{m}_{n+1}^S \textbf{s}_n^S,\end{align}

for all $n \in \mathbb{Z}^+$ , where $\textbf{s}_n^S \in \mathbb{R}^n$ is the vector of steady-state moments defined so that $\left(\textbf{s}_n^S\right)_i = \lim_{t \to \infty}{\mathbb{E}\!\left[S_t^i\right]}$ for $1 \leq i \leq n$ .
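
For a quick numerical check of Corollary 3 (using Cox–Ingersoll–Ross-type example parameters of our choosing), take $\gamma = 1$ and $\theta < 0$; iterating the steady-state recursion (25) then reproduces the well-known stationary mean of the Cox–Ingersoll–Ross process, $-\mu/\theta$, and, through the second moment, its stationary variance $\mu\sigma^2/(2\theta^2)$.

```python
import numpy as np

mu, theta, sigma = 2.0, -1.0, 0.5          # CIR-type parameters: theta < 0, gamma = 1
n_max = 5

# M_n^S for gamma = 1: diagonal n*theta, first subdiagonal n*mu + n(n-1)*sigma^2/2.
M = (theta * np.diag(np.arange(1, n_max + 1))
     + np.diag([n * mu + 0.5 * n * (n - 1) * sigma ** 2 for n in range(2, n_max + 1)], -1))

# Steady-state recursion (25): s_{n+1} = -m_{n+1}^S s_n / chi_{n+1}, with chi_k = k*theta.
s = [-mu / theta]                          # stationary mean
for n in range(1, n_max):
    s.append(-float(M[n, :n] @ np.array(s)) / ((n + 1) * theta))

mean, var = s[0], s[1] - s[0] ** 2
print(mean, -mu / theta)                   # stationary mean
print(var, mu * sigma ** 2 / (2 * theta ** 2))   # stationary CIR variance
```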

As a consequence of these expressions we can also gain insight on the moments of an Itô diffusion in the case of non-integer $\gamma \in [0,2]$ , as is used in volatility models such as the CEV model and the SABR model; see e.g. [Reference Henry-Labordère34]. This can be achieved through bounding the differential equations, as the nth moment of such a diffusion is again given by

\begin{equation*}\frac{\textrm{d}}{\textrm{d}t}{\mathbb{E}\!\left[S_t^n\right]}=n\mu {\mathbb{E}\big[S_t^{n-1}\big]} + n\theta {\mathbb{E}\!\left[S_t^n\right]} + \frac{n(n-1)\sigma^2}{2} {\mathbb{E}\Big[S_t^{n + \gamma - 2}\Big]},\end{equation*}

the rightmost term of which can be bounded above and below by

\begin{equation*} {\mathbb{E}\Big[S_t^{n + \lfloor\gamma\rfloor - 2}\Big]}\leq {\mathbb{E}\Big[S_t^{n + \gamma - 2}\Big]}\leq {\mathbb{E}\Big[S_t^{n + \lceil\gamma\rceil - 2}\Big]},\end{equation*}

and the differential equations obtained by substituting these bounding terms form a closed system solvable by Corollary 3. Assuming that the true differential equation and the upper and lower bounding equations all share an initial value, the solutions of the bounding equations bound the solution of the true moment equation; see [Reference Hale and Lunel32].
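As an illustration of this bounding argument, the following sketch (again only a numerical illustration under the assumptions above, with hypothetical parameter values) solves the two bounding systems obtained by rounding a non-integer $\gamma \in (1,2)$ down to 1 and up to 2; by the inequality above, the two resulting values sandwich the true, analytically unavailable moment.

\begin{verbatim}
# Sketch of the bounding idea for non-integer gamma in (1, 2): solve the closed moment
# systems for the rounded exponents gamma = 1 and gamma = 2, which bound the true moments
# per the inequality above.  Parameters are illustrative.
import numpy as np
from scipy.linalg import expm

mu, theta, sigma, S0, t, n = 1.0, -0.5, 0.3, 2.0, 1.5, 4

def moment_matrix(gamma):
    # M^S_n = theta*diag(1:n) + mu*diag(2:n,-1) + (sigma^2/2)*diag(d^{2-gamma}, gamma-2)
    M = np.diag(theta * np.arange(1.0, n + 1))
    M += np.diag(mu * np.arange(2.0, n + 1), k=-1)
    j = 2 - gamma
    d = np.array([(j + i) * (j + i - 1.0) for i in range(1, n + gamma - 1)])
    return M + np.diag(0.5 * sigma**2 * d, k=gamma - 2)

def moments_at(gamma, t):
    c = np.zeros(n); c[0] = mu
    if gamma == 0:
        c[1] = sigma**2
    s0 = np.array([S0**k for k in range(1, n + 1)])
    M = moment_matrix(gamma)
    return expm(M * t) @ s0 + np.linalg.solve(M, (expm(M * t) - np.eye(n)) @ c)

lower, upper = moments_at(1, t), moments_at(2, t)   # bounds for any gamma in (1, 2)
print(lower[-1], upper[-1])                         # brackets the unavailable nth moment
\end{verbatim}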

3.5. Application to growth–collapse processes

For a fourth example, we consider growth–collapse processes with Poisson-driven shocks. These processes have been studied in a variety of contexts; see e.g. [Reference Boxma, Perry, Stadje and Zacks7, Reference Kella38, Reference Kella and Löpker39, Reference Boxma, Kella and Perry6]. More recently, these processes and their related extensions have seen renewed interest in connection with the study of the crypto-currency Bitcoin; see for example [Reference Frolkova and Mandjes28, Reference Koops40, Reference Javier and Fralix37, Reference Fralix27]. While growth–collapse processes can be defined in many different ways, for this example we use a definition in the style of Section 4 from [Reference Boxma, Perry, Stadje and Zacks7]. We let $Y_t$ be the state of the growth–collapse model and let $\{U_i \mid i \in \mathbb{Z}^+\}$ be a sequence of independent $\textrm{Uni}(0,1)$ random variables that are also independent from the state and history of the growth–collapse process. From Proposition 1 of [Reference Boxma, Perry, Stadje and Zacks7], the infinitesimal generator of $Y_t$ is given by

\begin{equation*}\mathcal{L}\,f(Y_t)= \lambda f^{\prime}(Y_t) + \mu \int_0^1 \!\left( f( u Y_t ) - f( Y_t ) \right) \textrm{d}u.\end{equation*}

Thus, $Y_t$ experiences linear growth at rate $\lambda > 0$ throughout time, but it also collapses at epochs given by a Poisson process with rate $\mu > 0$ . At the ith collapse epoch the process falls to a fraction of its current level; specifically it jumps down to $U_i Y_t$ . Using the infinitesimal generator, we can see that the mean of this growth–collapse process satisfies

\begin{equation*}\frac{\textrm{d}}{\textrm{d}t}{\mathbb{E}\!\left[Y_t\right]}=\lambda + \mu \!\left( {\mathbb{E}\!\left[U_1 Y_t\right]} - {\mathbb{E}\!\left[Y_t\right]} \right)=\lambda - \frac{\mu}{2} {\mathbb{E}\!\left[Y_t\right]},\end{equation*}

and its second moment satisfies

\begin{equation*}\frac{\textrm{d}}{\textrm{d}t}{\mathbb{E}\big[Y_t^2\big]}=2 \lambda {\mathbb{E}\!\left[Y_t\right]} + \mu \!\left( {\mathbb{E}\big[U_1^2 Y_t^2\big]} - {\mathbb{E}\big[Y_t^2\big]} \right)=2\lambda {\mathbb{E}\!\left[Y_t\right]} - \frac{2 \mu}{3} {\mathbb{E}\big[Y_t^2\big]}.\end{equation*}

Therefore, we can write the linear system of differential equations for the second moment of this growth–collapse process as

\begin{equation*}\frac{\textrm{d}}{\textrm{d}t}\!\left[\begin{array}{c}{\mathbb{E}\!\left[Y_t\right]}\\[5pt] {\mathbb{E}\big[Y_t^2\big]}\\[2pt]\end{array}\right]=\left[\begin{array}{c@{\quad}c}-\frac{\mu}2 & 0 \\[5pt] 2\lambda & - \frac{2\mu}{3}\\[2pt]\end{array}\right]\left[\begin{array}{c}{\mathbb{E}\!\left[Y_t\right]}\\[5pt] {\mathbb{E}\big[Y_t^2\big]}\\[2pt]\end{array}\right]+\left[\begin{array}{c}\lambda \\[5pt] 0\end{array}\right].\end{equation*}

Moving to the third moment, via the infinitesimal generator we write its differential equation as

\begin{equation*}\frac{\textrm{d}}{\textrm{d}t}{\mathbb{E}\big[Y_t^3\big]}=3 \lambda {\mathbb{E}\big[Y_t^2\big]} + \mu \!\left( {\mathbb{E}\big[U_1^3 Y_t^3\big]} - {\mathbb{E}\big[Y_t^3\big]} \right)=3\lambda {\mathbb{E}\big[Y_t^2\big]} - \frac{3 \mu}{4} {\mathbb{E}\big[Y_t^3\big]},\end{equation*}

which shows that the system of differential equations for the third moment is

\begin{equation*}\frac{\textrm{d}}{\textrm{d}t}\!\left[\begin{array}{c}{\mathbb{E}\!\left[Y_t\right]}\\[5pt] {\mathbb{E}\big[Y_t^2\big]}\\[5pt] {\mathbb{E}\big[Y_t^3\big]}\\[2pt]\end{array}\right]= \left[\begin{array}{c@{\quad}c@{\quad}c}-\frac{\mu}2 & 0 & 0\\[5pt] 2\lambda & - \frac{2\mu}{3} & 0\\[5pt] 0 & 3\lambda & - \frac{3\mu}{4}\\[2pt]\end{array}\right] \left[\begin{array}{c}{\mathbb{E}\!\left[Y_t\right]}\\[5pt] {\mathbb{E}\big[Y_t^2\big]}\\[5pt] {\mathbb{E}\big[Y_t^3\big]}\\[2pt]\end{array}\right]+ \left[\begin{array}{c}\lambda \\[5pt] 0\\[5pt] 0\end{array}\right];\end{equation*}

this obviously encapsulates the system for the first two moments. We can note that the general nth moment will satisfy

\begin{equation*}\frac{\textrm{d}}{\textrm{d}t}{\mathbb{E}\!\left[Y_t^n\right]}=n \lambda {\mathbb{E}\big[Y_t^{n-1}\big]} + \mu \!\left( {\mathbb{E}\!\left[U_1^n Y_t^n\right]} - {\mathbb{E}\!\left[Y_t^n\right]} \right)=n\lambda {\mathbb{E}\big[Y_t^{n-1}\big]} - \frac{n \mu}{n+1} {\mathbb{E}\!\left[Y_t^n\right]},\end{equation*}

and thus the system for the nth moment is given by appending the row vector

\begin{equation*}\left[\begin{array}{c@{\quad}c@{\quad}c@{\quad}c@{\quad}c@{\quad}c}0, & 0, & \dots, & 0, & n\lambda, & -\frac{n\mu}{n+1}\end{array}\right]\end{equation*}

below the matrix from the $(n-1)$ th system augmented by zeros on the right. Following this derivation, we reach the general expressions for the moments in Corollary 4. Furthermore, we can note that because of the relative simplicity of this particular structure, we are able to solve the recursion for the steady-state moments and give these terms explicitly.

Corollary 4. Let $Y_t$ be a growth–collapse process with growth rate $\lambda > 0$ and uniformly sized collapses occurring according to a Poisson process with rate $\mu > 0$ . Then the nth moment of $Y_t$ is given by

(26) \begin{align}{\mathbb{E}\!\left[Y_t^n\right]}&=n\lambda\textbf{v}_{n-1}^{\textrm{T}}\!\left(\textbf{M}^Y_{n-1} + \frac{n\mu}{n+1}\textbf{I}\right)^{-1}\!\left(e^{\textbf{M}^Y_{n-1} t} - e^{-\frac{n\mu t}{n+1}}\textbf{I}\right)\left(\textbf{x}_{n-1}(y_0)-\frac{(n+1)\lambda \textbf{v}_1}{n\mu}\right)\nonumber\\&\quad+y_0^n e^{-\frac{n\mu t}{n+1}}+\frac{(n+1)\lambda \mathbb{I}_{\{n=1\}}}{n\mu}\!\left(1-e^{-\frac{n\mu t}{n+1}}\right)\nonumber\\&\quad-\frac{(n+1)\lambda^2}{\mu}\textbf{v}^{\textrm{T}}_{n-1}\!\left(\textbf{M}^Y_{n-1}\right)^{-1}\!\left(\textbf{I}-e^{\textbf{M}^Y_{n-1}t}\right)\textbf{v}_1,\end{align}

where $y_0$ is the initial value of $Y_t$ , $\textbf{x}_n(a) \in \mathbb{R}^n$ is such that $\left(\textbf{x}_n(a)\right)_i = a^i$ ,

\begin{equation*}\textbf{M}_n^Y = \lambda \textbf{diag}\!\left(2\,:\,n,-1\right) - \mu \textbf{diag}\!\left(\frac{1}{2} \,:\, \frac{n}{n+1}\right),\end{equation*}

and

\begin{equation*}\textbf{m}_n^Y = \left[\left(\textbf{M}_n^Y\right)_{n,1}, \dots, \left(\textbf{M}_n^Y\right)_{n,n-1} \right]\end{equation*}

is such that $\textbf{m}_n^Y = n\lambda\textbf{v}_{n-1}^{\textrm{T}}$ . Moreover, the nth steady-state moment of $Y_t$ is given by

(27) \begin{align}\lim_{t \to \infty}{\mathbb{E}\!\left[Y_t^n\right]}&=(n+1)!\left(\frac{\lambda}{\mu}\right)^n,\end{align}

for $n \in \mathbb{Z}^+$ .
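The following is a brief numerical sketch (not part of the original derivation, with illustrative parameter values) that builds $\textbf{M}^Y_n$ exactly as defined above, solves the transient system by variation of constants, and confirms that for large t the solution approaches the factorial-form steady-state moments in Equation (27).

\begin{verbatim}
# Sketch checking Corollary 4: transient moments of the growth-collapse process approach
# (k+1)! (lambda/mu)^k for k = 1, ..., n.  Illustrative parameter values.
import numpy as np
from scipy.linalg import expm
from math import factorial

lam, mu, n, y0 = 2.0, 3.0, 4, 1.0     # growth rate, collapse rate, highest moment, Y_0

# M^Y_n = lambda*diag(2:n, -1) - mu*diag(1/2, 2/3, ..., n/(n+1))
M = np.diag(lam * np.arange(2.0, n + 1), k=-1)
M -= mu * np.diag(np.arange(1.0, n + 1) / np.arange(2.0, n + 2))
c = np.zeros(n); c[0] = lam           # forcing vector (lambda, 0, ..., 0)

def moments(t):
    s0 = np.array([y0**k for k in range(1, n + 1)])
    return expm(M * t) @ s0 + np.linalg.solve(M, (expm(M * t) - np.eye(n)) @ c)

print(moments(200.0))                                                 # effectively steady state
print([factorial(k + 1) * (lam / mu)**k for k in range(1, n + 1)])    # Equation (27)
\end{verbatim}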

3.6. Application to ephemerally self-exciting processes

As a final detailed example of the applicability of matryoshkan matrices, we now consider a stochastic process we have analyzed in [Reference Daw and Pender19]. This process is a linear birth–death–immigration process, which has been of interest in classical teletraffic theory; see for example [Reference Wallström46], [Reference Iversen35], and Section 8.4 of [Reference Iversen36]. By comparison, our interests in [Reference Daw and Pender19] are in alternate representations of self-exciting processes and alternate constructions of the original self-exciting process, the Hawkes process. Here, the occurrence of an arrival increases the arrival rate by an amount $\alpha > 0$ , as in the Hawkes process, and this increase expires after an exponentially distributed duration with some rate $\mu > \alpha$ . Accordingly, this process is an ephemerally self-exciting process. Given a baseline intensity $\nu^* > 0$ , let $Q_t$ be such that new arrivals occur at rate $\nu^* + \alpha Q_t$ and then the overall rate until the next excitement expiration is $\mu Q_t$ . One can then think of $Q_t$ as the number of entities still causing active excitement at time $t \geq 0$ . We will refer to $Q_t$ as the number in system for this ephemerally self-exciting process. The infinitesimal generator for a function $f \,:\, \mathbb{N} \to \mathbb{R}$ is thus

\begin{equation*}\mathcal{L}\,f(Q_t) = \left(\nu^* + \alpha Q_t\right) \left(\,f(Q_t + 1) - f(Q_t)\right) + \mu Q_t \!\left(\,f(Q_t - 1) - f(Q_t)\right),\end{equation*}

which again captures the dynamics of the process, as the first term on the right-hand side is the product of the up-jump rate and the change in function value upon an increase in the process, while the second term is the product of the down-jump rate and the corresponding process decrease. This now yields an ODE for the mean given by

\begin{equation*}\frac{\textrm{d}}{\textrm{d}t}{\mathbb{E}\!\left[Q_t\right]}=\nu^* + \alpha{\mathbb{E}\!\left[Q_t\right]} - \mu {\mathbb{E}\!\left[Q_t\right]}=\nu^*-(\mu - \alpha) {\mathbb{E}\!\left[Q_t\right]},\end{equation*}

while the second moment will satisfy

\begin{align*}\frac{\textrm{d}}{\textrm{d}t}{\mathbb{E}\big[Q_t^2\big]}&={\mathbb{E}\!\left[\left(\nu^* + \alpha Q_t\right)\left((Q_t+1)^2 - Q_t^2\right) + \mu Q_t \!\left((Q_t-1)^2 - Q_t^2\right)\right]}\\&=\left(2\nu^* + \mu + \alpha\right){\mathbb{E}\!\left[Q_t\right]}+ \nu^*- 2(\mu - \alpha){\mathbb{E}\big[Q_t^2\big]}.\end{align*}

Thus, the first two moments are given by the solution to the linear system

\begin{equation*}\frac{\textrm{d}}{\textrm{d}t}\!\left[\begin{array}{c}{\mathbb{E}\!\left[Q_t\right]}\\[4pt] {\mathbb{E}\big[Q_t^2\big]}\\[2pt]\end{array}\right]=\left[\begin{array}{c@{\quad}c}-(\mu - \alpha) & 0 \\[4pt] 2\nu^* + \mu + \alpha & -2(\mu-\alpha)\end{array}\right]\left[\begin{array}{c}{\mathbb{E}\!\left[Q_t\right]}\\[4pt] {\mathbb{E}\big[Q_t^2\big]}\\[2pt]\end{array}\right]+\left[\begin{array}{c}\nu^* \\[4pt] \nu^*\end{array}\right],\end{equation*}

and by observing that the third moment differential equation is

\begin{align*}\frac{\textrm{d}}{\textrm{d}t}{\mathbb{E}\big[Q_t^3\big]}&={\mathbb{E}\!\left[\left(\nu^* + \alpha Q_t\right)\big((Q_t+1)^3 - Q_t^3\big) + \mu Q_t \big((Q_t-1)^3 - Q_t^3\big)\right]}\\&=\left(3\nu^* + 3\alpha + 3\mu\right){\mathbb{E}\big[Q_t^2\big]}+\left(3\nu^* + \alpha - \mu\right){\mathbb{E}\!\left[Q_t\right]}+\nu^*- 3(\mu - \alpha){\mathbb{E}\big[Q_t^3\big]},\end{align*}

we can see that the third moment system does indeed encapsulate the second moment system:

\begin{equation*}\frac{\textrm{d}}{\textrm{d}t}\!\left[\begin{array}{c}{\mathbb{E}\!\left[Q_t\right]}\\[5pt] {\mathbb{E}\big[Q_t^2\big]} \\[5pt] {\mathbb{E}\big[Q_t^3\big]}\\[2pt]\end{array}\right]=\left[\begin{array}{c@{\quad}c@{\quad}c}-(\mu - \alpha) & 0 & 0 \\[5pt] 2\nu^* + \mu + \alpha & -2(\mu-\alpha) & 0 \\[5pt] 3\nu^* + \alpha - \mu & 3\nu^* + 3\alpha + 3\mu & - 3(\mu-\alpha)\\[2pt]\end{array}\right] \left[\begin{array}{c} {\mathbb{E}\!\left[Q_t\right]}\\[5pt] {\mathbb{E}\big[Q_t^2\big]} \\[5pt] {\mathbb{E}\big[Q_t^3\big]}\\[2pt]\end{array}\right]+ \left[\begin{array}{c}\nu^* \\[5pt] \nu^* \\[5pt] \nu^*\end{array}\right].\end{equation*}

In general, the nth moment is given by the solution to

\begin{align*}\frac{\textrm{d}}{\textrm{d}t}{\mathbb{E}\!\left[Q_t^n\right]}&={\mathbb{E}\!\left[(\nu^* + \alpha Q_t) \left((Q_t + 1)^n - Q_t^n\right) + \mu Q_t \!\left((Q_t - 1)^n - Q_t^n\right)\right]}\\&=\nu^*+\nu^* \sum_{k=1}^{n-1}\!\left(\begin{array}{c}n\\k\end{array}\right) {\mathbb{E}\!\left[Q_t^k\right]}+\alpha \sum_{k=1}^{n}\!\left(\begin{array}{c}n\\k - 1\end{array}\right) {\mathbb{E}\!\left[Q_t^k\right]}\\& \quad+\mu \sum_{k=1}^{n} \!\left(\begin{array}{c}n\\k - 1\end{array}\right) {\mathbb{E}\!\left[Q_t^{k}\right]}({-}1)^{n-k-1},\end{align*}

which means that the nth system is given by augmenting the previous system below by

\begin{equation*}\left[\begin{array}{c@{\quad}c@{\quad}c@{\quad}c@{\quad}c}n\nu^* + \alpha + \mu({-}1)^n,&\!\!\scriptsize\left(\begin{array}{c}n\\2\end{array}\right)\!\nu^* + n\alpha + n\mu({-}1)^{n-1},&\!\!\dots,&\!\! n\nu^* + \scriptsize\left(\begin{array}{c}n\\2\end{array}\right)\!\alpha + \scriptsize\left(\begin{array}{c}n\\2\end{array}\right)\!\mu,&\!\! -n(\mu-\alpha)\end{array}\right],\end{equation*}

and to the right by zeros. By comparing this row vector to the definition of the matryoshkan Pascal matrices in Equation (15), we arrive at explicit forms for the moments of this process, shown now in Corollary 5.

Corollary 5. Let $Q_t$ be the number in system for an ephemerally self-exciting process with baseline intensity $\nu^* > 0$ , intensity jump $\alpha > 0$ , and duration rate $\mu > \alpha$ . Then the nth moment of $Q_t$ is given by

(28) \begin{align}{\mathbb{E}\!\left[Q_t^n\right]}&=\textbf{m}^Q_{n}\!\left(\textbf{M}^Q_{n-1} + n(\mu-\alpha)\textbf{I}\right)^{-1}\!\left(e^{\textbf{M}^Q_{n-1} t} - e^{-n(\mu-\alpha)t}\textbf{I}\right)\left(\textbf{x}_{n-1}(Q_0)-\frac{\nu^* \textbf{v}}{n(\mu-\alpha)}\right)\nonumber\\&\quad+Q_0^n e^{-n(\mu-\alpha)t}+\frac{\nu^*}{n(\mu-\alpha)}\!\left(1-e^{-n(\mu-\alpha)t}\right)\nonumber\\&\quad-\frac{\nu^* \textbf{m}^Q_{n}}{n(\mu-\alpha)}\!\left(\textbf{M}^Q_{n-1}\right)^{-1}\!\left(\textbf{I}-e^{\textbf{M}^Q_{n-1}t}\right)\textbf{v},\end{align}

for all $t \geq 0$ and $n \in \mathbb{Z}^+$ , where $\textbf{M}^Q_n = \nu^*\boldsymbol{\mathcal{P}}_n(1)\textbf{diag}(\textbf{v},-1) + \alpha\boldsymbol{\mathcal{P}}_n(1) + \mu\boldsymbol{\mathcal{P}}_n({-}1)$ ,

\begin{equation*}\textbf{m}_n^{Q}=\left[\left(\textbf{M}_n^Q\right)_{n,1},\,\dots,\,\left(\textbf{M}_n^Q\right)_{n,n-1}\right]\end{equation*}

is such that

\begin{equation*}\left(\textbf{m}_n^{Q}\right)_{j}=\left(\begin{array}{c}n\\j\end{array}\right) \nu^*+\left(\begin{array}{c}n\\j - 1\end{array}\right)\alpha+\left(\begin{array}{c}n\\j - 1\end{array}\right)\mu ({-}1)^{n-j-1},\end{equation*}

and $\textbf{x}_n(a) \in \mathbb{R}^n$ is such that $\left(\textbf{x}_n(a)\right)_i = a^i$ . In steady state, the nth moment of $Q_t$ is given by

(29) \begin{align}\lim_{t \to \infty}{\mathbb{E}\!\left[Q_t^n\right]}=\frac{\nu^*}{n(\mu - \alpha)}\!\left(1 - \textbf{m}_n^Q \!\left(\textbf{M}_{n-1}^Q\right)^{-1} \textbf{v}\right),\end{align}

for $n \geq 2$ , with $\lim_{t \to \infty} {\mathbb{E}\!\left[Q_t\right]} = \frac{\nu^*}{\mu-\alpha}$ . Moreover, the $(n+1)$ th steady-state moment of the ephemerally self-exciting process is given by the recursion

(30) \begin{align}\lim_{t \to \infty} {\mathbb{E}\big[Q_t^{n+1}\big]}&=\frac{1}{(n+1)(\mu-\alpha)}\Big(\textbf{m}_{n+1}^Q \textbf{s}_n^Q + \nu^*\Big),\end{align}

for all $n \in \mathbb{Z}^+$ , where $\textbf{s}_n^Q \in \mathbb{R}^n$ is the vector of steady-state moments defined so that $\big(\textbf{s}_n^Q\big)_i = \lim_{t \to \infty}{\mathbb{E}\!\left[Q_t^i\right]}$ for $1 \leq i \leq n$ .

Let us note that the ephemerally self-exciting process is known to be negative binomially distributed in steady state [Reference Daw and Pender19]. Hence, while the vector and recursive computations in Equations (29) and (30), respectively, may be useful for higher moments, for lower moments there may be closed-form expressions available.
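As a small numerical check (not part of the original analysis, with hypothetical parameter values satisfying $\mu > \alpha$), the following sketch rebuilds $\textbf{M}^Q_n$ from the binomial row formula above, which for $n = 3$ reproduces the explicit system displayed earlier in this subsection, and then compares the recursion in Equation (30) with the direct steady-state solve of $\textbf{0} = \textbf{M}^Q_n \textbf{s} + \nu^* \textbf{v}$.

\begin{verbatim}
# Sketch for the ephemerally self-exciting process: build M^Q_n from the binomial row
# formula, then compare the recursion (30) with the direct steady-state solve.
import numpy as np
from math import comb

nu, alpha, mu, N = 1.0, 0.5, 2.0, 5   # baseline intensity, jump size, expiry rate, order

def row(m):
    # (m_m^Q)_j = C(m,j)*nu + C(m,j-1)*alpha + C(m,j-1)*mu*(-1)^(m-j-1), j = 1, ..., m-1
    return np.array([comb(m, j) * nu + comb(m, j - 1) * alpha
                     + comb(m, j - 1) * mu * (-1.0) ** (m - j - 1) for j in range(1, m)])

M = np.zeros((N, N))
for m in range(1, N + 1):
    M[m - 1, :m - 1], M[m - 1, m - 1] = row(m), -m * (mu - alpha)
print(M[:3, :3])                      # reproduces the 3x3 system displayed above

s_direct = -np.linalg.solve(M, nu * np.ones(N))      # steady state solves 0 = M s + nu*1

s = [nu / (mu - alpha)]                              # first steady-state moment
for m in range(2, N + 1):                            # recursion from Equation (30)
    s.append((row(m) @ np.array(s) + nu) / (m * (mu - alpha)))
print(s_direct, np.array(s))                         # the two vectors coincide
\end{verbatim}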

3.7. Additional applications by combination and permutation

While the preceding examples are the only detailed examples we include in this paper, we can note that these matryoshkan matrix methods can be applied to many other settings. In fact, one can observe that these example derivations can be applied directly to processes that feature a combination of their structures, such as the dynamic contagion process introduced in [Reference Dassios and Zhao17]. The dynamic contagion process is a point process that is both self-excited and externally excited, meaning that its intensity experiences jumps driven both by its own activity and by the activity of an exogenous Poisson process. In this way, the process combines the behavior of the Hawkes and shot noise processes. Hence, its infinitesimal generator can be written using a combination of expressions used in Subsections 3.2 and 3.3, implying that all moments of the process can be calculated through this methodology. Similarly, these methods can also be readily applied to processes that combine dynamics from Hawkes processes and from Itô diffusions, such as affine point processes. These processes, studied in e.g. [Reference Errais, Giesecke and Goldberg22, Reference Zhang, Blanchet, Giesecke and Glynn48, Reference Gao and Zhu30], feature both self-excitement and diffusion behavior and thus have an infinitesimal generator that can be expressed using terms from the generators for Hawkes and Itô processes. One could also study the combination of externally driven jumps and diffusive behavior, such as in affine jump diffusions; see e.g. [Reference Duffie, Pan and Singleton20]. Of course, one can also consider permutations of the model features seen in our examples, such as trading fixed-size jumps for random ones to form marked Hawkes processes or changing to randomly sized batches of arrivals in the ephemerally self-exciting process. In general, the key requirement from the assumptions in Theorem 1 is the closure of the system of moment differential equations specified in Equation (7). This is equivalent to requiring that the infinitesimal generator of any polynomial be a polynomial of order no more than the original, which again connects this work to the literature on polynomial processes. In fact, the notion of closure of polynomial processes under combinations of infinitesimal generators has been studied by [Reference Cuchiero, Larsson and Svaluto-Ferro12]. In summary, infinitesimal generators of the form

\begin{align*}\mathcal{L}\,f(X_t)&=\underbrace{(\alpha_0 + \alpha_1 X_t ) \int_0^\infty \!\left(\,f(X_t + a) - f(X_t) \right)\! \textrm{d}F_{A}(a) }_{\text{Up-jumps}}+\underbrace{(\alpha_2 + \alpha_3 X_t ) f^{\prime}(X_t)}_{\text{Drift, decay, or growth}}\\&\quad+\underbrace{(\alpha_4 + \alpha_5 X_t ) \int_0^\infty \!\left(\,f(X_t - b) - f(X_t) \right)\! \textrm{d}F_{B}(b)}_{\text{Down-jumps}}+\underbrace{\big(\alpha_6 + \alpha_7 X_t + \alpha_8 X_t^2 \big) f^{\prime\prime}(X_t)}_{\text{Diffusion}}\\&\quad+\underbrace{\alpha_9 \int_0^\infty \!\left( f(c X_t) - f(X_t) \right)\! \textrm{d}F_C(c)}_{\text{Expansion or collapse}}\end{align*}

can be handled by this methodology, where $\alpha_j \in \mathbb{R}$ for all j and where the sequences $\{A_i\}$ , $\{B_i\}$ , and $\{C_i\}$ are of mutually independent random variables with respective cumulative distribution functions $F_A({\cdot})$ , $F_B({\cdot})$ , and $F_C({\cdot})$ . Finally, we note that this example generator need not be exhaustive, as other dynamics may also meet the closure requirement in Equation (7).
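As a simple symbolic illustration of this closure requirement, the sketch below (an assumption-laden simplification in which $F_A$, $F_B$, and $F_C$ are taken to be point masses at fixed values $a$, $b$, and $c$, so that the integrals reduce to single evaluations) applies the generator above to $f(x) = x^n$ and confirms that the result is a polynomial of degree at most n.

\begin{verbatim}
# Symbolic sketch of the closure property: applying the generator above to f(x) = x^n
# yields a polynomial of degree at most n.  For simplicity the jump and collapse
# distributions are point masses at a, b, c; general distributions only replace powers
# of a, b, c by their moments, which does not change the degree in x.
import sympy as sp

x, a, b, c = sp.symbols('x a b c', positive=True)
alpha = sp.symbols('alpha0:10')            # alpha_0, ..., alpha_9
n = 4
f = x**n

Lf = ((alpha[0] + alpha[1] * x) * ((x + a)**n - f)                        # up-jumps
      + (alpha[2] + alpha[3] * x) * sp.diff(f, x)                        # drift/decay/growth
      + (alpha[4] + alpha[5] * x) * ((x - b)**n - f)                     # down-jumps
      + (alpha[6] + alpha[7] * x + alpha[8] * x**2) * sp.diff(f, x, 2)   # diffusion
      + alpha[9] * ((c * x)**n - f))                                     # expansion/collapse

print(sp.degree(sp.expand(Lf), gen=x))     # prints 4, i.e. no more than n
\end{verbatim}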

4. Conclusion

In this work, we have defined a novel sequence of matrices called matryoshkan matrices, which stack like their Russian nesting doll namesakes. In doing so, we have found an efficient manner of calculating the moments of a large class of Markov processes that satisfy a closure condition for the time derivatives of their transient moments, namely the one-dimensional polynomial processes. Furthermore, this has yielded closed-form expressions for the transient and steady-state moments of these processes. Notably, this includes the intensity of the Hawkes process, for which finding an expression for the nth moment was previously an open problem. Other examples we have discussed include Itô diffusions from the mathematical finance literature and shot noise processes from the physics literature, showcasing the breadth of this methodology.

We can note that there are many applications of this methodology that we have not explored in this paper; these present opportunities for future work. For example, the vector form of the moments arising from this matrix-based method naturally lends itself to use in the method of moments. Thus, matryoshkan matrices have the potential to greatly simplify estimation for the myriad Markov processes to which they apply. Additionally, this vector of solutions may also be of use in providing computationally tractable approximations of moment generating functions. That is, by a Taylor expansion one can approximate a moment generating function by a weighted sum of its moments. Because this paper's matryoshkan matrix methods enable efficient calculation of higher-order moments, they support correspondingly higher-order approximations of the moment generating function.
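As a simple illustration of this last point (not tied to any particular process in this paper), the truncated expansion ${\mathbb{E}\!\left[e^{uX}\right]} \approx \sum_{k=0}^N u^k {\mathbb{E}\!\left[X^k\right]}/k!$ can be evaluated directly from a vector of moments; the sketch below checks it on a hypothetical exponential random variable, whose moments and moment generating function are known in closed form.

\begin{verbatim}
# Truncated moment expansion of an MGF: E[exp(uX)] ~ sum_{k=0}^N u^k E[X^k] / k!.
# Sanity check on an exponential random variable with rate lam, where E[X^k] = k!/lam^k
# and the exact MGF is lam/(lam - u) for u < lam.  Values are illustrative.
from math import factorial

lam, u, N = 2.0, 0.5, 15
moments = [factorial(k) / lam**k for k in range(N + 1)]
mgf_approx = sum(u**k * moments[k] / factorial(k) for k in range(N + 1))
print(mgf_approx, lam / (lam - u))       # the truncated sum is close to the exact MGF
\end{verbatim}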

As another important direction of future work, we are also interested in extending these techniques to multivariate Markov processes. This is of practical relevance in many of the settings we have described, such as point processes driven by the Hawkes or shot noise process intensities. The challenge in this case arises in the fact that a moment’s differential equation now depends on the lower product moments rather than just the lower moments, so the nesting structure is not as neatly organized. Nevertheless, addressing this generalization is an extension worth pursuing, as this would render these techniques even more widely applicable.

Appendix A. Proof of Proposition 1

Proof. For the sake of clarity and ease of reference, we will enumerate the proofs of the four statements.

(i) Suppose $\textbf{X}_{n}$ and $\textbf{Y}_{n}$ are both matryoshkan matrices. Then, by Equation (1), we have that

    \begin{align*}\textbf{X}_{n} + \textbf{Y}_{n}&=\left[\begin{array}{c@{\quad}c}\textbf{X}_{n-1} & \textbf{0}_{n-1 \times 1} \\[5pt]\textbf{x}_{n} & x_{n,n}\end{array}\right]+\left[\begin{array}{c@{\quad}c}\textbf{Y}_{n-1} & \textbf{0}_{n-1 \times 1} \\[5pt]\textbf{y}_{n} & y_{n,n}\end{array}\right]=\left[\begin{array}{c@{\quad}c}\textbf{X}_{n-1} + \textbf{Y}_{n-1} & \textbf{0}_{n-1 \times 1} \\[5pt]\textbf{x}_{n} + \textbf{y}_{n} & x_{n,n} + y_{n,n}\end{array}\right],\end{align*}
    and
    \begin{align*}\textbf{X}_{n} \textbf{Y}_{n}&=\left[\begin{array}{c@{\quad}c}\textbf{X}_{n-1} & \textbf{0}_{n-1 \times 1} \\[5pt]\textbf{x}_{n} & x_{n,n}\end{array}\right]\left[\begin{array}{c@{\quad}c}\textbf{Y}_{n-1} & \textbf{0}_{n-1 \times 1} \\[5pt]\textbf{y}_{n} & y_{n,n}\end{array}\right]=\left[\begin{array}{c@{\quad}c}\textbf{X}_{n-1}\textbf{Y}_{n-1} & \textbf{0}_{n-1 \times 1} \\[5pt]\textbf{x}_{n}\textbf{Y}_{n-1} + x_{n,n}\textbf{y}_{n} & x_{n,n} y_{n,n}\end{array}\right].\end{align*}
    We can now again invoke Equation (1) to observe that these forms satisfy this definition and thus are also matryoshkan matrices.
(ii) Let $\textbf{M}_{n} \in \mathbb{R}^{n \times n}$ be a matryoshkan matrix with all nonzero diagonal elements $m_{i,i}$ for $i \in \{1, \dots, n\}$ . By definition $\textbf{M}_n$ is lower triangular and hence its eigenvalues are on its diagonal. Since all the eigenvalues are nonzero by assumption, $\textbf{M}_n$ is invertible. Moreover, it is known that the inverse of a lower triangular matrix is lower triangular as well. Thus, we will now solve for a lower triangular matrix $\textbf{W}_{n} \in \mathbb{R}^{n \times n}$ such that $ \textbf{I}_n = \textbf{M}_n \textbf{W}_n, $ where $\textbf{I}_n \in \mathbb{R}^{n \times n}$ is the identity. This can be written as

    \begin{align*} \left[\begin{array}{c@{\quad}c} \textbf{I}_{n-1} & \textbf{0}_{n-1 \times 1} \\[5pt] \textbf{0}_{1 \times n-1} & 1 \end{array}\right] = \textbf{I}_n = \textbf{M}_n \textbf{W}_n = \left[\begin{array}{c@{\quad}c} \textbf{M}_{n-1} & \textbf{0}_{n-1 \times 1} \\[5pt] \textbf{m}_{n} & m_{n,n} \end{array}\right] \left[\begin{array}{c@{\quad}c} \textbf{A} & \textbf{0}_{n-1 \times 1} \\[5pt] \textbf{b} & c \end{array}\right] , \end{align*}
    where $\textbf{A} \in \mathbb{R}^{n-1\times n-1}$ , $\textbf{b} \in \mathbb{R}^{1\times n -1}$ , and $c \in \mathbb{R}$ . Because $m_{i,i} \ne 0$ for all $i \in \{1,\dots,n-1\}$ , we also know that $\textbf{M}_{n-1}$ is nonsingular. Thus, we can see that $\textbf{A} = \textbf{M}_{n-1}^{-1}$ from $\textbf{M}_{n-1}\textbf{A} = \textbf{I}_{n-1}$ . Likewise, $c m_{n,n} = 1$ implies $c = \frac{1}{m_{n,n}}$ . Then, we have that
    \begin{equation*} \textbf{0}_{1 \times n-1} = \textbf{m}_n \textbf{A} + m_{n,n} \textbf{b} = \textbf{m}_n \textbf{M}_{n-1}^{-1} + m_{n,n} \textbf{b} , \end{equation*}
    and so $\textbf{b} = - \frac{1}{m_{n,n}}\textbf{m}_n \textbf{M}_{n-1}^{-1}$ . This completes the solution for $\textbf{W}_{n}$ , and hence provides the inverse of $\textbf{M}_n$ .
(iii) To begin, we will prove that

    \begin{equation*} \textbf{M}_n^k = \left[\begin{array}{c@{\quad}c} \textbf{M}_{n-1}^k & \textbf{0}_{n-1 \times 1} \\[9pt] \textbf{m}_{n}\sum_{j=0}^{k-1} \textbf{M}_{n-1}^j m_{n,n}^{k-1-j} & m_{n,n}^k \end{array}\right] \end{equation*}
    for $k \in \mathbb{Z}^+$ . We proceed by induction. The base case, $k=1$ , holds by definition. Therefore we suppose that the hypothesis holds at k. Then at $k+1$ we can observe that
    \begin{align*} \textbf{M}_{n}^{k+1} & = \textbf{M}_{n} \textbf{M}_{n}^{k} \\[5pt] & = \left[\begin{array}{c@{\quad}c} \textbf{M}_{n-1} & \textbf{0}_{n-1 \times 1} \\[5pt] \textbf{m}_{n} & m_{n,n} \end{array}\right] \left[\begin{array}{c@{\quad}c} \textbf{M}_{n-1}^k & \textbf{0}_{n-1 \times 1} \\[9pt] \textbf{m}_{n}\sum_{j=0}^{k-1} \textbf{M}_{n-1}^j m_{n,n}^{k-1-j} & m_{n,n}^k \end{array}\right] \\[5pt] & = \left[\begin{array}{c@{\quad}c} \textbf{M}_{n-1}^{k+1} & \textbf{0}_{n-1 \times 1} \\[9pt] \textbf{m}_{n}\textbf{M}_{n-1}^k + \textbf{m}_{n}\sum_{j=0}^{k-1} \textbf{M}_{n-1}^j m_{n,n}^{k-j} & m_{n,n}^{k+1} \end{array}\right] \\[5pt] & = \left[\begin{array}{c@{\quad}c} \textbf{M}_{n-1}^{k+1} & \textbf{0}_{n-1 \times 1} \\[9pt] \textbf{m}_{n}\sum_{j=0}^{k} \textbf{M}_{n-1}^j m_{n,n}^{k-j} & m_{n,n}^{k+1} \end{array}\right] , \end{align*}
    which completes the induction. We now observe further that for matrices $\textbf{A} \in \mathbb{R}^{n \times n}$ and $\textbf{B} \in \mathbb{R}^{n \times n}$ such that $\textbf{A}\textbf{B} = \textbf{B}\textbf{A}$ and $\textbf{A} - \textbf{B}$ is nonsingular,
    \begin{equation*} \sum_{j=0}^{k-1} \textbf{A}^j \textbf{B}^{k-1-j} = \left(\textbf{A} -\textbf{B}\right)^{-1}\big(\textbf{A}^k-\textbf{B}^k\big) . \end{equation*}
    This relationship can be verified by multiplying the left-hand side by $\textbf{A}-\textbf{B}$ :
    \begin{equation*} (\textbf{A}-\textbf{B})\sum_{j=0}^{k-1} \textbf{A}^j \textbf{B}^{k-1-j} = \sum_{j=0}^{k-1} \textbf{A}^{j+1} \textbf{B}^{k-1-j} - \sum_{j=0}^{k-1} \textbf{A}^j \textbf{B}^{k-j} = \textbf{A}^k-\textbf{B}^k . \end{equation*}
    This allows us to observe that
    \begin{equation*} \textbf{M}_n^k = \left[\begin{array}{c@{\quad}c} \textbf{M}_{n-1}^k & \textbf{0}_{n-1 \times 1} \\[9pt] \textbf{m}_{n}\big(\textbf{M}_{n-1} - m_{n,n}\textbf{I}\big)^{-1}\Big(\textbf{M}_{n-1}^k - m_{n,n}^k\textbf{I}\Big) & m_{n,n}^k \end{array}\right] , \end{equation*}
    and thus
    \begin{align*} e^{\textbf{M}_{n} t} & = \sum_{k=0}^\infty \frac{t^k \textbf{M}_n^k}{k!} = \sum_{k=0}^\infty \frac{t^k }{k!} \left[\begin{array}{c@{\quad}c} \textbf{M}_{n-1}^k & \textbf{0}_{n-1 \times 1} \\[9pt] \textbf{m}_{n}\big(\textbf{M}_{n-1} - m_{n,n}\textbf{I}\big)^{-1}\Big(\textbf{M}_{n-1}^k - m_{n,n}^k\textbf{I}\Big) & m_{n,n}^k \end{array}\right] \\[5pt] & = \left[\begin{array}{c@{\quad}c} e^{\textbf{M}_{n-1} t} & \textbf{0}_{n-1 \times 1} \\[9pt] \textbf{m}_{n}\big(\textbf{M}_{n-1} - m_{n,n}\textbf{I}\big)^{-1}\!\left(e^{\textbf{M}_{n-1} t} - e^{m_{n,n} t}\textbf{I}\right) & e^{m_{n,n} t} \end{array}\right] , \end{align*}
    which completes the proof. Note that because $\textbf{M}_{n-1}$ is triangular and because we have assumed $m_{1,1}, \dots, m_{n,n}$ are distinct, we know that $\textbf{M}_{n-1} - m_{n,n}\textbf{I}$ is invertible.
(iv) From the statement, we seek a matrix $\textbf{A} \in \mathbb{R}^{n-1\times n-1}$ , a row vector $\textbf{b} \in \mathbb{R}^{1 \times n -1}$ , and a scalar $c \in \mathbb{R}$ such that

    \begin{equation*} \left[\begin{array}{c@{\quad}c} \textbf{M}_{n-1} & \textbf{0}_{n-1 \times 1} \\[4pt] \textbf{m}_{n} & m_{n,n} \end{array}\right] \left[\begin{array}{c@{\quad}c} \textbf{A} & \textbf{0}_{n-1 \times 1} \\[4pt] \textbf{b} & c \end{array}\right] = \left[\begin{array}{c@{\quad}c} \textbf{A} & \textbf{0}_{n-1 \times 1} \\[4pt] \textbf{b} & c \end{array}\right] \left[\begin{array}{c@{\quad}c} \textbf{D}_{n-1} & \textbf{0}_{n-1 \times 1} \\[4pt] \textbf{0}_{1 \times n-1} & m_{n,n} \end{array}\right], \end{equation*}
    where $\textbf{D}_{n-1} \in \mathbb{R}^{n-1 \times n -1}$ is a diagonal matrix with values $m_{1,1}, \dots , m_{n-1,n-1}$ . From the triangular structure of $\textbf{M}_n$ , we know that $\textbf{D}_n$ contains all the eigenvalues of $\textbf{M}_n$ . We will now solve the resulting sub-systems. From $\textbf{M}_{n-1} \textbf{A} = \textbf{A} \textbf{D}_{n-1}$ , we take $\textbf{A} = \textbf{U}_{n-1}$ . Substituting this forward, we see that
    \begin{equation*} \textbf{m}_n \textbf{U}_{n-1} + m_{n,n} \textbf{b} = \textbf{m}_n \textbf{A} + m_{n,n} \textbf{b} = \textbf{b}\textbf{D}_{n-1} \end{equation*}
    and so $\textbf{b} = \textbf{m}_n \textbf{U}_{n-1}\big(\textbf{D}_{n-1} - m_{n,n} \textbf{I}\big)^{-1}$ , where as in Step (iii) we are justified in inverting $\textbf{D}_{n-1} - m_{n,n} \textbf{I}$ because $m_{1,1}, \dots, m_{n,n}$ are distinct. Finally, we take $c = 1$ , as any value will satisfy $c m_{n,n} = c m_{n,n}$ .
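As a brief numerical sanity check of the block identities established in (ii) and (iii) above (using a randomly generated lower-triangular matrix with distinct nonzero diagonal entries, which is purely illustrative), one may verify the bottom-left blocks of both the inverse and the matrix exponential directly:

\begin{verbatim}
# Numerical check of the block formulas from steps (ii) and (iii) of the proof, on a
# random matryoshkan matrix with distinct nonzero diagonal entries (illustrative only).
import numpy as np
from scipy.linalg import expm

rng = np.random.default_rng(0)
n, t = 5, 0.7
M = np.tril(rng.standard_normal((n, n)))
np.fill_diagonal(M, -np.arange(1.0, n + 1))           # distinct, nonzero diagonal

M_prev, m_row, m_nn = M[:n-1, :n-1], M[n-1, :n-1], M[n-1, n-1]
I = np.eye(n - 1)

# Step (ii): bottom-left block of the inverse equals -(1/m_nn) m_n M_{n-1}^{-1}
inv_block = -(m_row @ np.linalg.inv(M_prev)) / m_nn
print(np.allclose(np.linalg.inv(M)[n-1, :n-1], inv_block))

# Step (iii): bottom-left block of exp(M t) equals
# m_n (M_{n-1} - m_nn I)^{-1} (exp(M_{n-1} t) - exp(m_nn t) I)
exp_block = m_row @ np.linalg.inv(M_prev - m_nn * I) @ (expm(M_prev * t) - np.exp(m_nn * t) * I)
print(np.allclose(expm(M * t)[n-1, :n-1], exp_block))
\end{verbatim}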

Appendix B. Proof of Lemma 1

Proof. The vector solution in Equation (5) is known and is thus displayed for reference. Expanding this expression in bracket-notation form, by use of Proposition 1 this is

\begin{align*}&\left[\begin{array}{c}\textbf{s}_{n-1}(t)\\[5pt]s_n(t)\end{array}\right]=\left[\begin{array}{c@{\quad}c}e^{\textbf{M}_{n-1}t} & \textbf{0}_{n-1 \times 1} \\[5pt]\textbf{m}_{n}\!\left(\textbf{M}_{n-1} - m_{n,n}\textbf{I}\right)^{-1}\!\left(e^{\textbf{M}_{n-1}t} - e^{m_{n,n} t}\textbf{I}\right)&e^{m_{n,n} t}\end{array}\right]\left[\begin{array}{c}\textbf{s}_{n-1}(0)\\[5pt]s_n(0)\end{array}\right]\\[5pt]&\quad-\left[\begin{array}{c@{\quad}c}\textbf{M}_{n-1}^{-1} & \textbf{0}_{n-1 \times 1} \\[5pt]-\frac{1}{m_{n,n}}\textbf{m}_{n}\textbf{M}_{n-1}^{-1} & \frac{1}{m_{n,n}}\end{array}\right]\left[\begin{array}{c@{\quad}c}\textbf{I} - e^{\textbf{M}_{n-1}t} & \textbf{0}_{n-1 \times 1} \\[5pt]-\textbf{m}_{n}\!\left(\textbf{M}_{n-1} - m_{n,n}\textbf{I}\right)^{-1}\!\left(e^{\textbf{M}_{n-1}t} - e^{m_{n,n} t}\textbf{I}\right)&1 - e^{m_{n,n} t}\end{array}\right]\left[\begin{array}{c}\textbf{c}_{n-1}\\[5pt]c_n\end{array}\right].\end{align*}

Thus, we can find $s_n(t)$ by multiplying each left side of the equality by a unit row vector in the direction of the nth coordinate, which we denote by $\textbf{v}_n^{\textrm{T}}$ . This yields

\begin{align*}s_n(t)&=\textbf{v}_n^{\textrm{T}}\!\left[\begin{array}{c}\textbf{s}_{n-1}(t)\\[5pt]s_n(t)\end{array}\right]\\[5pt]&=\left[\begin{array}{c@{\quad}c}\textbf{m}_{n}\!\left(\textbf{M}_{n-1} - m_{n,n}\textbf{I}\right)^{-1}\!\left(e^{\textbf{M}_{n-1}t} - e^{m_{n,n} t}\textbf{I}\right)&e^{m_{n,n} t}\end{array}\right]\left[\begin{array}{c}\textbf{s}_{n-1}(0)\\[5pt]s_n(0)\end{array}\right]\\[5pt]&\quad-\left[\begin{array}{c@{\quad}c}-\frac{1}{m_{n,n}}\textbf{m}_{n}\textbf{M}_{n-1}^{-1} & \frac{1}{m_{n,n}}\end{array}\right]\left[\begin{array}{c@{\quad}c}\textbf{I} - e^{\textbf{M}_{n-1}t} & \textbf{0}_{n-1 \times 1} \\[5pt]-\textbf{m}_{n}\!\left(\textbf{M}_{n-1} - m_{n,n}\textbf{I}\right)^{-1}\!\left(e^{\textbf{M}_{n-1}t} - e^{m_{n,n} t}\textbf{I}\right)&1 - e^{m_{n,n} t}\end{array}\right]\left[\begin{array}{c}\textbf{c}_{n-1}\\[5pt]c_n\end{array}\right]\\[5pt]&=\left[\begin{array}{c@{\quad}c}\textbf{m}_{n}\!\left(\textbf{M}_{n-1} - m_{n,n}\textbf{I}\right)^{-1}\!\left(e^{\textbf{M}_{n-1}t} - e^{m_{n,n} t}\textbf{I}\right)&e^{m_{n,n} t}\end{array}\right]\left[\begin{array}{c}\textbf{s}_{n-1}(0)\\[5pt]s_n(0)\end{array}\right]\\[5pt]&\quad-\left[\begin{array}{c@{\quad}c}-\frac{1}{m_{n,n}}\textbf{m}_{n}\textbf{M}_{n-1}^{-1} & \frac{1}{m_{n,n}}\end{array}\right]\left[\begin{array}{c}\left(\textbf{I} - e^{\textbf{M}_{n-1}t}\right)\textbf{c}_{n-1}\\[5pt]-\textbf{m}_{n}\!\left(\textbf{M}_{n-1} - m_{n,n}\textbf{I}\right)^{-1}\!\left(e^{\textbf{M}_{n-1}t} - e^{m_{n,n} t}\textbf{I}\right) \textbf{c}_{n-1}+c_n(1 - e^{m_{n,n} t})\end{array}\right].\end{align*}

Then, by taking these inner products, we obtain

\begin{align*}s_n(t)&=\textbf{m}_{n}\!\left(\textbf{M}_{n-1} - m_{n,n}\textbf{I}\right)^{-1}\!\left(e^{\textbf{M}_{n-1}t} - e^{m_{n,n} t}\textbf{I}\right)\textbf{s}_{n-1}(0)+s_n(0) e^{m_{n,n} t}\\& \quad+\textbf{m}_{n}\textbf{M}_{n-1}^{-1}\!\left(\textbf{I} - e^{\textbf{M}_{n-1}t}\right)\frac{\textbf{c}_{n-1}}{m_{n,n}}\\&\quad+\textbf{m}_{n}\!\left(\textbf{M}_{n-1} - m_{n,n}\textbf{I}\right)^{-1}\!\left(e^{\textbf{M}_{n-1}t} - e^{m_{n,n} t}\textbf{I}\right)\frac{ \textbf{c}_{n-1} }{m_{n,n}}-\frac{c_n}{m_{n,n}}\!\left(1 - e^{m_{n,n} t}\right),\end{align*}

and this simplifies to the stated solution.

Acknowledgements

We are grateful to the anonymous referees and to Nicolas Privault, who have improved the paper through their comments and insights.

Funding information

We acknowledge the generous support of the National Science Foundation (NSF) for A. D.’s Graduate Research Fellowship under grant DGE-1650441, received during his graduate studies at Cornell.

Competing interests

No competing interests arose during the preparation or publication of this article.

References

Ackerer, D., Filipović, D. and Pulido, S. (2018). The Jacobi stochastic volatility model. Finance Stoch. 22, 667–700.
Aït-Sahalia, Y., Cacho-Diaz, J. and Laeven, R. J. (2015). Modeling financial contagion using mutually exciting jump processes. J. Financial Econom. 117, 585–606.
Benth, F. E. and Lavagnini, S. (2021). Correlators of polynomial processes. SIAM J. Financial Math. 12, 1374–1415.
Biagini, F. and Zhang, Y. (2016). Polynomial diffusion models for life insurance liabilities. Insurance Math. Econom. 71, 114–129.
Bladt, M., Asmussen, S. and Steffensen, M. (2019). Matrix calculations for inhomogeneous Markov reward processes, with applications to life insurance and point processes. Preprint. Available at https://arxiv.org/abs/1905.04605.
Boxma, O., Kella, O. and Perry, D. (2011). On some tractable growth–collapse processes with renewal collapse epochs. J. Appl. Prob. 48, 217–234.
Boxma, O., Perry, D., Stadje, W. and Zacks, S. (2006). A Markovian growth–collapse model. Adv. Appl. Prob. 38, 221–243.
Brawer, R. and Pirovino, M. (1992). The linear algebra of the Pascal matrix. Linear Algebra Appl. 174, 13–23.
Call, G. S. and Velleman, D. J. (1993). Pascal's matrices. Amer. Math. Monthly 100, 372–376.
Cuchiero, C. (2019). Polynomial processes in stochastic portfolio theory. Stoch. Process. Appl. 129, 1829–1872.
Cuchiero, C., Keller-Ressel, M. and Teichmann, J. (2012). Polynomial processes and their applications to mathematical finance. Finance Stoch. 16, 711–740.
Cuchiero, C., Larsson, M. and Svaluto-Ferro, S. (2018). Polynomial jump-diffusions on the unit simplex. Ann. Appl. Prob. 28, 2451–2500.
Cui, L., Hawkes, A. and Yi, H. (2020). An elementary derivation of moments of Hawkes processes. Adv. Appl. Prob. 52, 102–137.
Cui, L., Wu, B. and Yin, J. (2021). Moments for Hawkes processes with gamma decay kernel functions. To appear in Methodology Comput. Appl. Prob.
Da Fonseca, J. and Zaatour, R. (2014). Hawkes process: fast calibration, application to trade clustering, and diffusive limit. J. Futures Markets 34, 548–579.
Daley, D. and Vere-Jones, D. (2003). An Introduction to the Theory of Point Processes, Vol. I, Elementary Theory and Methods. Springer, New York.
Dassios, A. and Zhao, H. (2011). A dynamic contagion process. Adv. Appl. Prob. 43, 814–846.
Daw, A. and Pender, J. (2018). Queues driven by Hawkes processes. Stoch. Systems 8, 192–229.
Daw, A. and Pender, J. (2022). An ephemerally self-exciting point process. Adv. Appl. Prob. 54, 340–403.
Duffie, D., Pan, J. and Singleton, K. (2000). Transform analysis and asset pricing for affine jump-diffusions. Econometrica 68, 1343–1376.
Edelman, A. and Strang, G. (2004). Pascal matrices. Amer. Math. Monthly 111, 189–197.
Errais, E., Giesecke, K. and Goldberg, L. R. (2010). Affine point processes and portfolio credit risk. SIAM J. Financial Math. 1, 642–665.
Ethier, S. N. and Kurtz, T. G. (2009). Markov Processes: Characterization and Convergence. John Wiley, Hoboken, NJ.
Filipović, D. and Larsson, M. (2016). Polynomial diffusions and applications in finance. Finance Stoch. 20, 931–972.
Filipović, D. and Larsson, M. (2020). Polynomial jump-diffusion models. Stoch. Systems 10, 71–97.
Forman, J. L. and Sørensen, M. (2008). The Pearson diffusions: a class of statistically tractable diffusion processes. Scand. J. Statist. 35, 438–465.
Fralix, B. (2020). On classes of Bitcoin-inspired infinite-server queueing systems. Queueing Systems 95, 29–52.
Frolkova, M. and Mandjes, M. (2019). A Bitcoin-inspired infinite-server model with a random fluid limit. Stoch. Models 35, 1–32.
Gao, X. and Zhu, L. (2018). Limit theorems for Markovian Hawkes processes with a large initial intensity. Stoch. Process. Appl. 128, 3807–3839.
Gao, X. and Zhu, L. (2019). Affine point processes: refinements to large-time asymptotics. Preprint. Available at https://arxiv.org/abs/1903.06371.
Guasoni, P. and Wong, K. C. (2020). Asset prices in segmented and integrated markets. Finance Stoch. 24, 939–980.
Hale, J. K. and Lunel, S. M. V. (2013). Introduction to Functional Differential Equations. Springer, New York.
Hawkes, A. G. (1971). Spectra of some self-exciting and mutually exciting point processes. Biometrika 58, 83–90.
Henry-Labordère, P. (2008). Analysis, Geometry, and Modeling in Finance: Advanced Methods in Option Pricing. Chapman and Hall/CRC, Boca Raton, FL.
Iversen, V. B. (1985). A generalization of the classical teletraffic theory. In Proc. Eleventh International Teletraffic Congress, Elsevier, Amsterdam, pp. 58–64.
Iversen, V. B. (ed.) (2005). Teletraffic Engineering Handbook. International Telecommunication Union, Geneva.
Javier, K. and Fralix, B. (2020). A further study of some Markovian Bitcoin models from Göbel et al. Stoch. Models 36, 223–250.
Kella, O. (2009). On growth–collapse processes with stationary structure and their shot-noise counterparts. J. Appl. Prob. 46, 363–371.
Kella, O. and Löpker, A. (2010). A Markov-modulated growth collapse model. Prob. Eng. Inf. Sci. 24, 99–107.
Koops, D. (2018). Predicting the confirmation time of Bitcoin transactions. Preprint. Available at https://arxiv.org/abs/1809.10596.
Koops, D. T., Saxena, M., Boxma, O. J. and Mandjes, M. (2018). Infinite-server queues with Hawkes input. J. Appl. Prob. 55, 920–943.
Mazet, O. (1997). Classification des semi-groupes de diffusion sur IR associés à une famille de polynômes orthogonaux. In Séminaire de Probabilités XXXI, Springer, Berlin, Heidelberg, pp. 40–53.
Nasr, W. W., Charanek, A. and Maddah, B. (2018). MAP fitting by count and inter-arrival moment matching. Stoch. Models 34, 292–321.
Nielsen, B. F., Nilsson, L. F., Thygesen, U. H. and Beyer, J. E. (2007). Higher order moments and conditional asymptotics of the batch Markovian arrival process. Stoch. Models 23, 1–26.
Øksendal, B. (2013). Stochastic Differential Equations: an Introduction with Applications. Springer, Berlin, Heidelberg.
Wallström, B. (1964). A distribution model for telephone traffic with varying call intensity, including overflow traffic. Ericsson Technics 20, 183–202.
Wong, E. (1964). The construction of a class of stationary Markoff processes. Stoch. Process. Math. Phys. Eng. 17, 264–276.
Zhang, X., Blanchet, J., Giesecke, K. and Glynn, P. W. (2015). Affine point processes: approximation and efficient simulation. Math. Operat. Res. 40, 797–819.
Zhang, Z. (1997). The linear algebra of the generalized Pascal matrix. Linear Algebra Appl. 250, 51–60.