A Markov multiple state model for epidemic and insurance modelling

Minh-Hoang Tran

doi:10.1017/asb.2024.8

A Markov multiple state model for epidemic and insurance modelling

Published online by Cambridge University Press: 14 March 2024

Minh-Hoang Tran

Show author details

Minh-Hoang Tran*: Affiliation:
International School of Business, University of Economics Ho Chi Minh City, Vietnam
*: Email: hoang.tran@isb.edu.vn

Article contents

Abstract
Introduction
The deterministic SIR model
The SIR Markov multiple state model
The distribution of waiting times
The duration and final susceptible size of the epidemic
Valuations of insurance benefits
Calculation of premiums and reserves
Numerical illustration
Conclusions
Competing interests
References

Rights & Permissions

Abstract

With recent epidemics such as COVID-19, H1N1 and SARS causing devastating financial loss to the economy, it is important that insurance companies plan for financial costs of epidemics. This article proposes a new methodology for epidemic and insurance modelling by combining the existing deterministic compartmental models and the Markov multiple state models to facilitate actuarial computations to design new health insurance plans that cover epidemics. Our method is inspired by the seminal paper (Feng and Garrido (2011) North American Actuarial Journal, 15, 112–136.) of Feng and Garrido and complements the work of Hillairet and Lopez et al. in Hillairet and Lopez ((2021) Scandinavian Actuarial Journal, 2021(8), 671–694.) and Hillairet et al. ((2022) Insurance: Mathematics and Economics, 107, 88–101.) In this work, we use the deterministic SIR model and the Eyam epidemic data set to provide numerical illustrations for our method.

Keywords

Epidemic Markov multiple state compartmental models SIR

Type: Research Article
Information: ASTIN Bulletin: The Journal of the IAA , Volume 54 , Issue 2 , May 2024 , pp. 360 - 384

DOI: https://doi.org/10.1017/asb.2024.8 [Opens in a new window]
Creative Commons: This is an Open Access article, distributed under the terms of the Creative Commons Attribution licence (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted re-use, distribution and reproduction, provided the original article is properly cited.
Copyright: © The Author(s), 2024. Published by Cambridge University Press on behalf of The International Actuarial Association

1. Introduction

Given the impact of epidemics such as COVID-19, and to a lesser extent H1N1, SARS, etc. on our society, health insurance providers must pay more attention to developing health insurance coverage to the public that includes those caused by infectious diseases. In that regard, the actuaries who design those health insurance plans may find the vast literature of epidemiological compartmental models immensely useful. Epidemiological compartmental models have been developed since the 1910s with the works of Ross in fighting malaria (Ross, Reference Ross1910) and Kermack and McKendrick (Reference Kermack and McKendrick1927) and (Reference Kermack and McKendrick1932) who introduced the celebrated SIR model. Those early compartmental models are deterministic (i.e., it gives the exact number of individuals in each compartment at any given time). In Bartlett (Reference Bartlett1949), Bartlett introduced a stochastic version of the SIR model known as the general stochastic epidemic model. Since then, stochastic compartmental models have been studied and generalised extensively (see Andersson and Britton, Reference Andersson and Britton2000; Britton et al., Reference Britton2019).

Even though the literature on epidemiological compartmental models is vast, its applications to actuarial science and in designing insurance plans are modest in comparison. Early papers to investigate actuarial applications of epidemiological models include Hua and Cox (Reference Hua and Cox2009) by Hua and Cox and Feng and Garrido (Reference Feng and Garrido2011) by Feng and Garrido. In Feng and Garrido (Reference Feng and Garrido2011), the authors developed actuarial models to provide financial arrangement arising due to epidemics based on deterministic epidemiological compartmental models. Other papers in this direction include Chen (Reference Chen2021), Feng et al. (Reference Feng2022, Reference Feng, Jin and Loke2020), and Hainaut (Reference Hainaut2021). In particular, the paper Feng et al. (Reference Feng2022) by Feng et al. contains a detailed bibliography of related works.

In another development, Lefevre, Picard and Simon in a series of papers (Lefevre et al., Reference Lefevre, Picard and Simon2017-Lefevre and Simon, Reference Lefevre and Simon2021) also developed actuarial applications of the general stochastic epidemic models such as one proposed in Bartlett (Reference Bartlett1949) and its generalisations. Using sophisticated techniques of probability theory, they obtained formulas for various important quantities of actuarial interest, for example, the final susceptible size distribution of the epidemic, the expected susceptibility and infectivity times, the ruin probability for insurers, etc.

It has been pointed out that modelling the financial impact of epidemics is significantly different from those caused by regular mortality or diseases. One of main differences is that the speed of infection depends heavily on the number of infected individuals in the population. As a result, stochastic Markovian epidemic models usually take the entire population into account rather than looking at each individual. This approach, even though is very natural and powerful, has one drawback that the state space grows with the population size which makes actuarial calculations impractical.

In this paper, we want to propose an idea that starting with a deterministic compartmental model as an exogenous input, we can construct a Markov multiple state model for the infection time line of an individual such that the behaviour of the population of such independent individuals match the deterministic compartmental model. Such a model can then be used to calculate premiums and reserves for an insurance policy covering epidemics. The advantage of this approach is that we can then utilise the full machinery of Markov multiple state models to study the financial impacts of epidemics at the individual level and then obtain related results for the entire population. To illustrate the idea, we analyse in detail the Markov multiple state model arising from the deterministic SIR model. Consequently, we can clarify and extend some of the results in Feng and Garrido (Reference Feng and Garrido2011).

The idea of using Markov multiple state models at the individual level to study the financial impact of epidemics is not new. Indeed, many of the quantities introduced by Feng and Garrido in Feng and Garrido (Reference Feng and Garrido2011) are remarkably similar to standard actuarial functions in Markov multiple state models. Furthermore, Billard and Dayananda constructed a multistage compartmental model for HIV-infected individuals based on waiting time in each compartment and studied its actuarial applications in Billard and Dayananda (Reference Billard and Dayananda2014a) and (Reference Billard and Dayananda2014b). Nevertheless, Billard and Dayananda model the waiting times directly, whereas our approach involves setting the transition intensities to match the deterministic compartmental model.

It has come to our attention that our model is a special case of the model used in Hillairet and Lopez (Reference Hillairet and Lopez2021) and Hillairet et al. (Reference Hillairet2022) by Hillairet and Lopez et al. in their study of cyber insurance management. Note that in Hillairet et al. (Reference Hillairet2022), the authors not only study the SIR model but also extend their analysis to multigroup SIR model as well. Nevertheless, our objective and motivation as well as methodology are significantly different from those of Hillairet and Lopez et al. We also acknowledge that there is a parallel paper (Francis and Steffensen, Reference Francis and Steffensen2023) by Francis and Steffensen which has the same fundamental ideas as those in our paper. Francis and Steffensen not only treat the SIR model but also generalise it to several related models as well. The results obtained in Francis and Steffensen (Reference Francis and Steffensen2023) are very similar to ours. Although our paper focuses exclusively on the SIR model, we also discuss several topics not included in Francis and Steffensen (Reference Francis and Steffensen2023) such as the distributions of the waiting times, the duration and the final susceptible size of the epidemic. Thus, despite the overlap, we believe our paper makes a genuine contribution to the literature.

This paper is organised as follows. In Section 2, we recall the deterministic SIR model and its basic properties. In Section 3, we construct the Markov multiple state model based on the deterministic SIR model. In Section 4, we discuss the waiting times for an individual to remain susceptible or infected. In Section 5, we analyse the spread of the epidemic in a population. More precisely, we give the distributions of the final susceptible size and the duration of the epidemic. In Sections 6 and 7, we give formulas for the actuarial present values of the traditional insurance benefits and discuss premium and reserve calculations. In Section 8, we provide numerical illustrations for our method using the Eyam data set from Raggett (Reference Raggett1982).

2. The deterministic SIR model

Let us briefly recall the deterministic SIR model in epidemiology. The readers should consult (Brauer et al., Reference Brauer, Castillo-Chavez and Feng2019) for further description of this model. The population is divided into three compartments Susceptible (S), Infected (I) and Removed (R) where possible transitions are indicated in Figure 1. Let S(t), I(t) and R(t) be the number of susceptible, infected and removed individuals at time t, respectively. Assume that at time 0, the population has N individuals and $R(0)=0$ . Let us define s(t), i(t) and r(t) as the corresponding proportions of susceptible, infected and removed individuals in the population, respectively. The SIR model states that the dynamics of this epidemic is governed by the following system of differential equations

(2.1)

\begin{equation}\left\{ \begin{array}{ll}s'(t) &= -\beta s(t)i(t) \\i'(t) &= \beta s(t)i(t) - \alpha i(t) \\r'(t) &= \alpha i(t)\end{array}\right.\end{equation}

where $\alpha$ and $\beta$ are positive constants. This is a first-order system of differential equations. Although explicit solutions in terms of elementary functions are not available, we can still numerically solve for (s(t), i(t), r(t)) for any $t>0$ . The following proposition gives the phase-plane equation for (s(t), i(t)) (see Brauer et al., Reference Brauer, Castillo-Chavez and Feng2019, p. 37).

Figure 1. The deterministic SIR model.

Proposition 2.1. The functions s(t) and i(t) are related via the equation

(2.2)

\begin{equation}s(t) + i(t) = 1 + \frac{\alpha}{\beta}\log\left(\frac{s(t)}{s(0)}\right).\end{equation}

Equivalently,

(2.3)

\begin{equation}s(t) = s(0)e^{-\frac{\beta}{\alpha}r(t)}.\end{equation}

Proposition 2.2. Let $s(\infty), i(\infty)$ and $r(\infty)$ be the limits of s(t), i(t) and r(t), respectively, as t approaches $\infty$ . Then $i(\infty)=0$ , $r(\infty)=1-s(\infty)$ where $s(\infty)$ is the unique solution between 0 and $\alpha/\beta$ of the equation

(2.4)

\begin{equation}F(z) = z - \frac{\alpha}{\beta} \log\left(\frac{z}{s(0)} \right) - 1 = 0.\end{equation}

In particular, both $s(\infty)$ and $r(\infty)$ are strictly between 0 and 1 and

(2.5)

\begin{align}s(\infty) = 1 + \frac{\alpha}{\beta} \log\left(\frac{s(\infty)}{s(0)} \right).\end{align}

Proof. See Brauer et al. (Reference Brauer, Castillo-Chavez and Feng2019, p. 37).

The following formula for the inverse of s(t) will be used in Subsection 8.6.

Proposition 2.3. The inverse function of s(t) is given by: for $s(\infty)<u\leq s(0)$ ,

(2.6)

\begin{equation}s^{-1}(u) = \int_u^{s(0)} \frac{dz}{\beta z\left[1+ \frac{\alpha}{\beta} \log\left(\frac{z}{s(0)} \right)-z \right]}.\end{equation}

Proof. From (2.2), we deduce

\[ i\left(s^{-1}(u) \right) = 1 + \frac{\alpha}{\beta} \log\left(\frac{u}{s(0)} \right)-u.\]

Furthermore,

\begin{align*}( s^{-1})^{\prime}(u) = \frac{1}{s'\left(s^{-1}(u) \right)}= \frac{-1}{\beta u i\left(s^{-1}(u) \right)}= \frac{-1}{\beta u\left[1+ \frac{\alpha}{\beta} \log\left(\frac{u}{s(0)} \right)-u \right]}.\end{align*}

Integrating both sides and noting that $s^{-1}(s(0))=0$ yields the required formula.

The proof of the following proposition can be found in Brauer et al. (Reference Brauer, Castillo-Chavez and Feng2019, p. 40).

Proposition 2.4. If $s(0)<\alpha/\beta$ , the epidemic fails to take off. Suppose $s(0)>\alpha/\beta$ . The infection proportion i(t) peaks at time $t^{*} = s^{-1}(\alpha/\beta)$ with maximum value

(2.7)

\begin{equation}i(t^{*}) = 1 + \frac{\alpha}{\beta} \left(\log\left(\frac{\alpha}{\beta s(0)}\right)-1\right).\end{equation}

For the rest of this paper, we assume that $s(0)>\alpha/\beta$ .

Proposition 2.5. If $z>t^{*}=s^{-1}(\alpha/\beta)$ and i(z) is sufficiently small, then for $t\geq z$ , we have the following approximations:

(2.8)

\begin{align}s(t) &= s(z) - \frac{\beta s(\infty)i(z) }{\alpha-\beta s(\infty)}\left(1 - e^{-(\alpha-\beta s(\infty))(t-z)}\right) + o(i(z)) \end{align}

(2.9)

\begin{align}i(t) &= i(z)e^{-(\alpha-\beta s(\infty))(t-z)} + o\left(i(z)e^{-(\alpha-\beta s(\infty))(t-z)/2} \right)\end{align}

(2.10)

\begin{align}r(t) &= r(z) + \frac{\alpha i(z)}{\alpha-\beta s(\infty)}\left(1- e^{-(\alpha-\beta s(\infty))(t-z)}\right)+ o(i(z)). \end{align}

Proof. From Proposition 2.4, $s(t)/s(\infty)$ is in the interval $\left( 1,\alpha/(\beta s(\infty))\right)$ for $t>t^{*}$ . Let G(x) be the function

(2.11)

\begin{equation}G(x) = x - \frac{\alpha}{\beta s(\infty)} \log(x)-1.\end{equation}

Clearly, $G(1)=0$ and that

\[ G'(x) = 1 - \frac{\alpha}{\beta s(\infty)x}.\]

Thus, G(x) is decreasing on the interval $\left( 1,\alpha/(\beta s(\infty))\right)$ and $G^{-1}$ exists and is differentiable. From (2.2) and (2.5), $s(t)/s(\infty)$ is the unique solution of the equation $G(x) = -i(t)/s(\infty)$ . Let $\epsilon = (\alpha-\beta s(\infty))/(3\beta)$ . There exists $\delta>0$ such that if $i(t)<\delta$ then $s(t)-s(\infty)<\epsilon$ . Let $t_0>t^{*}$ such that $i(t)<\delta$ for $t>t_0$ . For $z>t_0$ and $w>0$ , integrating the following equation

\begin{align*}\frac{i'(t)}{i(t)} = - (\alpha-\beta s(\infty)) + \beta (s(t)-s(\infty))\end{align*}

between z and $z+w$ and taking exponentiation yields

\begin{align*}i(z+w) = i(z) e^{-(\alpha-\beta s(\infty))w} \exp\left(\beta \int_z^{z+w} (s(t)-s(\infty))dt \right).\end{align*}

Hence, for $z>z_0$ and $w>0$ , we have

(2.12)

\begin{equation}1\leq \frac{i(z+w)}{i(z) e^{-(\alpha-\beta s(\infty))w}} \leq e^{\beta w \epsilon}.\end{equation}

As a result,

\begin{align*}0 \leq i(z+w) - i(z) e^{-(\alpha-\beta s(\infty))w} & \leq i(z) e^{-(\alpha-\beta s(\infty))w}\left(e^{\beta w \epsilon}-1 \right) \\&\leq i(z)e^{-2(\alpha-\beta s(\infty))w/3} = o\left( i(z)e^{-(\alpha-\beta s(\infty))w/2}\right) .\end{align*}

Thus, (2.9) follows. To prove $(2.10)$ , we integrate equation $r'(t) = \alpha i(t)$ from z to $z+w$

\begin{align*}r(z+w) &= r(z) + \alpha \int_z^{z+w} i(t)dt \\&= r(z) + \alpha \int_z^{z+w} i(z)e^{-(\alpha-\beta s(\infty))(t-z)}dt +o\left( \int_z^{z+w} i(z)e^{-(\alpha-\beta s(\infty))(t-z)/2}dt \right) \\&= r(z) + \frac{\alpha i(z)}{\alpha-\beta s(\infty)} \left(1-e^{-(\alpha-\beta s(\infty))w} \right)+ o(i(z)).\end{align*}

Corollary 2.6. For $z>t^{*}=s^{-1}(\alpha/\beta)$ and $i(z)$ sufficiently small,

\begin{align*}s(z) &= s(\infty)\left(1 + \frac{\beta i(z) }{\alpha-\beta s(\infty)} \right) + o(i(z)) \\r(z) &= r(\infty) - \frac{\alpha i(z) }{\alpha-\beta s(\infty)} + o(i(z)).\end{align*}

In particular, for $t\to\infty$ ,

(2.13)

\begin{equation}s(t)-s(\infty) \sim e^{-(\alpha-\beta s(\infty))t}, \quad i(t) \sim e^{-(\alpha-\beta s(\infty))t},\quad r(t) - r(\infty) \sim e^{-(\alpha-\beta s(\infty))t}.\end{equation}

3. The SIR Markov multiple state model

In this section, we will construct a Markov multiple state model that can capture the dynamics of the deterministic SIR model. Consider a random individual in this population. At any given time, this individual can be in either one of the three states 0,1 and 2 which correspond to the three compartments (S), (I) and (R). For $t\geq 0$ , let $X_t$ be the state that this individual is in at time t. We assume that $(X_t)_{t\geq 0}$ is a continuous Markov process. Let P be the transition matrix for $X_t$ . That is for $0\leq z \leq t$ , P(z, t) is the matrix whose entries are given by

(3.1)

\begin{equation}P^{ij}(z,t) = \mathbb{P}(X_{t}=j|X_{z}=i)\end{equation}

for all $i,j \in \{0,1,2\}$ . To connect the deterministic SIR model with this Markov multiple state model, we will match the observed number of individuals in each compartment in the former model with the corresponding expected numbers of individuals in each compartment in the latter model. This leads us to the following proposition that allows us to construct the transition matrix from the values of S(t), I(t) and R(t).

Proposition 3.1. For the observed number of individuals in each compartment of the deterministic SIR model to match the expected number of individuals in the SIR Markov multiple state model, the transition matrix $P(0,t)$ needs to satisfy the following equations:

(3.2)

\begin{align}s(t) &= s(0)P^{00}(0,t)\nonumber \\i(t) &= s(0)P^{01}(0,t) + i(0)P^{11}(0,t) \\r(t) &= s(0)P^{02}(0,t) + i(0)P^{12}(0,t).\nonumber\end{align}

Equations (3.2) can be written in matrix form as follows

(3.3)

\begin{equation}(s(t),i(t),r(t)) = (s(0),i(0),r(0))P(0,t).\end{equation}

Proof. At time 0, there are S(0) susceptible individuals who can get infected at time t with probability $P^{01}(0,t)$ , and there are I(0) infected individuals who remain infected at time t with probability $P^{11}(0,t)$ . Therefore, the expected number of infected individuals at time t is

\[ S(0)P^{01}(0,t) + I(0)P^{11}(0,t).\]

which is equal to I(t) from our assumption. Dividing by N yields the required equation for i(t). The other equations can also be obtained by a similar reasoning.

Proposition 3.2. For $0\leq z\leq t$ ,

(3.4)

\begin{align}s(t) &= s(z)P^{00}(z,t)\nonumber \\i(t) &= s(z)P^{01}(z,t) + i(z)P^{11}(z,t) \\r(t) &= s(z)P^{02}(z,t) + i(z)P^{12}(z,t) + r(z)\nonumber\end{align}

Equivalently,

(3.5)

\begin{equation}(s(t),i(t),r(t))=(s(z),i(z),r(z))P(z,t).\end{equation}

Proof. The result follows directly from Chapman–Kolmogorov equation.

Let us recall the Kolmogorov differential equations for multiple state models (see Haberman and Pitacco, Reference Haberman and Pitacco1998, p. 17).

Theorem 3.3. For a Markov multiple state model with transition matrix P and transition intensity matrix $\mu$ , the Kolmogorov forward differential equation is given by

(3.6)

\begin{equation}P_t(z,t) \;:\!=\; \frac{\partial{}}{\partial{t}} P(z,t) = P(z,t)\mu_t.\end{equation}

The initial condition is $P(z,z)$ equals to the identity matrix.

Proposition 3.4. Suppose for $t\geq 0$

(3.7)

\begin{equation}\mathbb{P}(X_t = 0) = s(t),\quad \mathbb{P}(X_t = 1) = i(t), \quad\mathbb{P}(X_t = 2) = r(t).\end{equation}

Then, the transition intensities are given by

(3.8)

\begin{equation}\begin{array}{ll}\mu_t^{01} = \beta i(t),\qquad\mu_t^{12} = \alpha.\end{array}\end{equation}

All the other $\mu^{ij}_t$ for $i\neq j$ are identically zero.

Proof. Differentiating (3.3) with respect to t yields.

\begin{align*} \frac{d}{dt}(s(t),i(t),r(t)) &= (s(0),i(0),r(0)) \frac{\partial{}}{\partial{t}} P(0,t)\\ &= (s(0),i(0),r(0)) P(0,t)\mu_t = (s(t),i(t),r(t))\mu_t. \end{align*}

Note that $\mu^{00}_t=-\mu^{01}_t$ and $\mu^{11}_t=-\mu^{12}_t$ are the only nonzero entries of $\mu_t$ . Thus, combining the above equation with (2.1) yields

(3.9)

\begin{equation} ({-}\beta s(t)i(t),\beta s(t)i(t)-\alpha i(t),\alpha i(t)) = ({-}s(t)\mu^{01}_t,s(t)\mu^{01}_t-i(t)\mu^{12}_t, i(t)\mu^{12}_t). \end{equation}

We deduce that $\mu^{01}_t = \beta i(t)$ and $\mu^{12}_t = \alpha$ .

Definition 3.5. By the SIR multiple state model, we mean the model given by the continuous Markov process $(X_t)_{t\geq0}$ taking values in $\{0,1,2\}$ with transition intensity matrix given by (3.8) and initial distribution $(s(0), i(0), r(0))$ .

Corollary 3.6. For the SIR multiple state model, the forward Kolmogorov equations are

(3.10)

\begin{align}P^{00}_t(z,t) &= -\beta i(t) P^{00}(z,t), \\P^{01}_t(z,t) &= \beta i(t)P^{00}(z,t)- \alpha P^{01}(z,t), \nonumber \\P^{02}_t(z,t) &= \alpha P^{01}(z,t). \nonumber\end{align}

In addition, for $\Delta t>0$ , we have

(3.11)

\begin{align}P^{00}(t,t+\Delta t) &= 1-\beta i(t)\Delta t + o(\Delta t) \\P^{01}(t,t+\Delta t) &= \beta i(t)\Delta t + o(\Delta t) \nonumber \\P^{02}(t,t+\Delta t) &= o(\Delta t). \nonumber\end{align}

Theorem 3.7. For the SIR multiple state model, the transition probabilities are given by

(3.12)

\begin{equation}P^{11}(z,t) = e^{-\alpha(t-z)},\quad P^{12}(z,t) = 1 - e^{-\alpha(t-z)}.\end{equation}

In addition,

(3.13)

\begin{align}P^{00}(z,t) &= \frac{s(t)}{s(z)}, \quad \& \quad P^{01}(z,t) = \frac{1}{s(z)}\left( i(t)-i(z)e^{-\alpha(t-z)}\right), \nonumber \\P^{02}(z,t) &= \frac{1}{s(z)}\left( r(t)-r(z)-i(z)(1-e^{-\alpha(t-z)})\right).\end{align}

Proof. The formulas for $P^{11}(z,t)$ and $P^{12}(z,t)$ follow immediately from $\mu^{12}_w \equiv \alpha$ . We have

\begin{align*}P^{00}(z,t) &= \exp\left({-}\int_z^t \mu^{01}_w dw \right)= \exp\left({-}\int_z^t \beta i(w)dw \right) \\& = \exp\left(\int_z^t \frac{s'(w)}{s(w)}dw \right)= \frac{s(t)}{s(z)}.\end{align*}

Furthermore,

\begin{align*}P^{01}(z,t)&= \int_z^t P^{00}(z,w)\mu^{01}_w P^{\overline{11}}(w,t)dw = - \frac{e^{-\alpha t}}{s(z)} \int_z^t e^{\alpha w}s'(w)dw\\&= -\frac{e^{-\alpha t}}{ s(z)} \left(e^{\alpha t}s(t)-e^{\alpha z}s(z)-\alpha\int_z^t e^{\alpha w}s(w)dw\right).\end{align*}

We can directly verify that

\[\alpha\int e^{\alpha w}s(w)dw = e^{\alpha w}(s(w)+i(w)) + c,\]

where c is any constant. Thus, the formula for $P^{01}(z,t)$ follows.

Combining (3.13) and Proposition 2.2 yield the following Corollary.

Corollary 3.8. Let $P^{ij}(z,\infty) = \lim_{t\to\infty} P^{ij}(z,t)$ . Then

\begin{align*}P^{00}(z,\infty) = \frac{s(\infty)}{s(z)},\quad P^{01}(z,\infty) = 0,\quad P^{02}(z,\infty) = 1-\frac{s(\infty)}{s(z)}.\end{align*}

For $z>t^{*}=s^{-1}(\alpha/\beta)$ and $i(z)$ sufficiently small, for $t>z$ we have

\begin{align*}P^{00}(z,t) &= P^{00}(z,\infty)\left(1+ \frac{\beta i(z)}{\alpha-\beta s(\infty)}e^{-(\alpha-\beta s(\infty))(t-z)} \right) + o(i(z)) \\P^{01}(z,t) &= \frac{i(z)}{s(z)}\left[e^{-(\alpha-\beta s(\infty))(t-z)}-e^{-\alpha(t-z)}\right] + o\left(i(z)e^{-(\alpha-\beta s(\infty))(t-z)/2} \right), \\P^{02}(z,t) &= P^{02}(z,\infty) - \frac{1}{s(z)}\left( \frac{\alpha i(t)}{\alpha-\beta s(\infty)}-i(z) e^{-\alpha(t-z)}\right) + o(i(z)).\end{align*}

Remark 3.9. The SIR multiple state model defined by Definition 3.5 is a special case of the model constructed by Hillairet and Lopez in (2021).

We have showed that it is the only Markov multiple state model that satisfies (3.7). That is, the probability a randomly chosen individual being in each compartment matches with the proportion of the population in the corresponding compartment which is given by the deterministic SIR model.
We have also determined the transition probabilities of the SIR multiple state model and show that they can be given by simple expressions of $(s(t),i(t),r(t))$ (see (3.13)). Note that we could also obtain (3.13) by solving (3.4).

Remark 3.10. The technique that used to construct the SIR multiple state model can be extended to other deterministic compartment models as well. Suppose we are given a deterministic compartment model with compartments $0,1,\ldots,M$ . Let $\textbf{s}(t) = (s_i(t))_{i=0}^M$ be the row vector whose components are proportions of the population in each compartment. Assume that the dynamics of the population are characterised by the system of differential equations

(3.14)

\begin{equation}\textbf{s}'(t) = F(t,\textbf{s}(t))\end{equation}

where F is a function of t and $\textbf{s}(t)$ . Consider a random individual in this population. Let $X_t$ be the compartment that this individual belongs to at time t. We want $(X_t)_{t\geq0}$ to follow a Markov multiple state model that matches the behaviour of the model (3.14) in aggregate. That is for $t\geq 0$

\[ \textbf{s}(t) = (s_i(t))_{i=0}^{M} = (P(X_t=i))_{i=0}^{M}. \]

Let $\mu_t$ be the transition intensity matrix. By the Chapman–Kolmogorov equation,

(3.15)

\begin{equation}\textbf{s}(t) = (P(X_t=i))_{i=0}^{M} = (P(X_0=i))_i^M P(0,t) = \textbf{s}(0)P(0,t).\end{equation}

Differentiating (3.15) and using (3.6) yields

\begin{align*} \textbf{s}'(t) = \textbf{s}(0)\frac{\partial{}}{\partial{t}} P(0,t) = \textbf{s}(0) P(0,t)\mu_t = \textbf{s}(t) \mu_t.\end{align*}

As a result, $\mu_t$ satisfies the following equation

(3.16)

\begin{equation}\textbf{s}(t)\mu_t = F(t,\textbf{s}(t)).\end{equation}

For the deterministic SIR model, (3.16) reduces to (3.9) which in turn yields (3.8).

Remark 3.11. The fact that many of the constructions in Feng and Garrido (Reference Feng and Garrido2011) are remarkably similar to the actuarial functions in Markov multiple state models, for example, forces of transition, EPVs of annuity and insurance benefits, etc. suggests that there might be a Markov multiple state model underlying all of these constructions. Moreover, this model should describe the state of a random individual in the population during the epidemic rather than the state of the entire population. The chance that an individual getting infected depends on the current state of the population. Unfortunately, this information is not available if we only consider one individual. Therefore, we use the expected state of the population which is given by the deterministic SIR model to approximate the actual state of the population. We are not claiming that this model actually describes the infection mechanism of the epidemic, but it is a reasonable approximation.

4. The distribution of waiting times

In this section, we will analyse the individual’s waiting times in each state for the SIR multiple state model. The results derived here are useful for simulation (see Subsection 8.6). Consider a random individual in the population. Let $T^{(0)}$ and $T^{(1)}$ be the waiting times in states 0 and 1, respectively. If the individual is in state 1 at time 0, then we set $T^{(0)}$ equal to zero. Furthermore, let $T=T^{(0)}+T^{(1)}$ be the time until removal or time to absorption. The following proposition gives the distributions of $T^{(0)}$ , $T^{(1)}$ and T.

Proposition 4.1. The survival function of $T^{(0)}$ is given by

(4.1)

\begin{align}\mathbb{P}\left(T^{(0)}>t \right) = \frac{s(t)}{s(0)},\quad\mathbb{P}\left(T^{(0)}=\infty \right) = \frac{s(\infty)}{s(0)}. \end{align}

The survival function of $T^{(1)}$ is given by

(4.2)

\begin{align}\mathbb{P}\left(T^{(1)}=0 \right) = \frac{s(\infty)}{s(0)},\quad\mathbb{P}\left(T^{(1)}>t \right) = e^{-\alpha t}\left(1-\frac{s(\infty)}{s(0)}\right). \end{align}

The distribution function of T is given by

(4.3)

\begin{align}\mathbb{P}\left(T < t \right) = \frac{r(t)-i(0)(1-e^{-\alpha t})}{s(0)},\quad\mathbb{P}\left(T=\infty \right) = \frac{s(\infty)}{s(0)}.\end{align}

Proof. We have $\mathbb{P}\left(T^{(0)}>t \right) = P^{00}(0,t) = s(t)/s(0).$ Note that $T^{(1)}=0$ if and only if $T^{(0)}=\infty$ . Because of the Markov property, $ \mathbb{P}(T^{(1)}>t|T^{(0)}<\infty)=e^{-\alpha t}$ . Hence,

\begin{align*}\mathbb{P}(T^{(1)}>t)=\mathbb{P}(T^{(1)}>t|T^{(0)}<\infty)\mathbb{P}(T^{(0)}<\infty)= e^{-\alpha t}\left(1-\frac{s(\infty)}{s(0)}\right).\end{align*}

Note that $T\leq t$ is equivalent to $X_t = 2$ given that $X_0=0$ . Thus, $\mathbb{P}(T\leq t)=P^{02}(0,t)$ and that implies (4.3) in view of (3.13).

Remark 4.2. The random variables $T^{(0)}$ , $T^{(1)}$ and T all have mixed distributions with mass probabilities at $\infty$ , 0 and $\infty$ , respectively, and continuous density elsewhere. The fact that $\mathbb{P}(T^{(0)}=\infty)=s(\infty)/s(0)$ can be interpreted as a susceptible individual in this population has a positive chance of never getting infected. Consequently, neither $T^{(0)}$ nor T has finite moments. However, conditioning on $T^{(0)}<\infty$ , all of their moments are finite as showed by the following proposition.

Proposition 4.3 Let $M_{T^{(0)}|T^{(0)}<\infty}(z)$ and $M_{T|T^{(0)}<\infty}(z)$ be the moment generating functions of $T^{(0)}$ and T conditioning on $T^{(0)}<\infty$ , respectively. Then for $z<\alpha - \beta s(\infty)$ ,

(4.4)

\begin{align}M_{T^{(0)}|T^{(0)}<\infty}(z) &= 1+ \frac{z}{s(0)-s(\infty)}\int_0^{\infty}e^{zt} (s(t)-s(\infty))dt\end{align}

(4.5)

\begin{align}M_{T|T^{(0)}<\infty}(z) &= \frac{\alpha}{s(0)-s(\infty)} \left[ \int_0^{\infty}e^{zt} i(t)dt - \frac{i(0)}{\alpha-z} \right]. \end{align}

Consequently, the moments of $T^{(0)}$ and T conditioning on $T^{(0)}<\infty$ are given by

(4.6)

\begin{align}\mathbb{E}\left((T^{(0)})^n\left|T^{(0)}<\infty \right. \right) &=\frac{n}{s(0)-s(\infty)} \int_0^{\infty} t^{n-1}(s(t)-s(\infty))dt. \end{align}

(4.7)

\begin{align} \mathbb{E}\left(T^n\left| T^{(0)}<\infty\right. \right)&=\frac{\alpha}{s(0)-s(\infty)} \left(\int_0^{\infty} t^n i(t)dt - \frac{i(0)n!}{\alpha^{n+1}} \right). \end{align}

Proof. From (2.13), $M_{T^{(0)}|T^{(0)}<\infty}(z)$ is well defined for $z<\alpha - \beta s(\infty)$ and is given by

\begin{align*}M_{T^{(0)}|T^{(0)}<\infty}(z) &= -\int_0^{\infty} \frac{e^{zt}s'(t)}{s(0)-s(\infty)}dt \\&= \frac{1}{s(0)-s(\infty)}\left\{s(0)-s(\infty)+ z\int_0^{\infty}e^{zt} (s(t)-s(\infty))dt \right\}.\end{align*}

From (3.10), the conditional probability density functions of T is

\begin{align*}f_{T|T^{(0)}<\infty}(t) &= \frac{d}{dt} \left(\frac{P^{02}(0,t)}{1-\frac{s(\infty)}{s(0)}}\right)= \frac{s(0) \alpha P^{01}(0,t)}{s(0)-s(\infty)} \\&= \frac{\alpha}{s(0)-s(\infty)} \left( i(t)-i(0)(1-e^{-\alpha t})\right).\end{align*}

From this, (4.5) can be derived straightforwardly. Finally, (4.6) and (4.7) can be obtained by examining the corresponding Taylor’s expansions of (4.4) and (4.5).

5. The duration and final susceptible size of the epidemic

In this section, we will investigate the spread of the epidemic in the whole population. Suppose our population consists of $S_0$ susceptible and $I_0$ infected individuals initially where $S_0,I_0>0$ . Let $N=S_0+I_0$ . We assume that everyone’s state is independent from one another and follows the Markov process described in Definition 3.5 with $s(0)=S_0/N$ and $i(0)=I_0/N$ . Let $\tilde{S}(t)$ and $\tilde{I}(t)$ be the number of susceptible and infected individuals at time t. The spread of the epidemic in the population is given by the process $(\tilde{S}(t),\tilde{I}(t))_{t\geq 0}$ with $(\tilde{S}(0),\tilde{I}(0))=(S_0,I_0)$ . We observe that $(\tilde{S}(t),\tilde{I}(t))_{t\geq 0}$ is a continuous Markov process on the state space

\[ \left\{ (m,n)\in [0,1,\ldots,S_0]\times[0,1,\ldots,N]:\; m+n\leq N\right\}.\]

Proposition 5.1. $\tilde{S}(t)$ follows the binomial distribution $B(S_0,P^{00}(0,t))$ and $\tilde{I}(t)$ is the sum of two independent binomial random variables $B(S_0,P^{01}(0,t))$ and $B(I_0,P^{11}(0,t))$ . In particular,

\begin{align*}\mathbb{E}(\tilde{S}(t)) = Ns(t),\quad \mathbb{E}(\tilde{I}(t)) = Ni(t), \quad\mathrm{Cov}(\tilde{S}(t),\tilde{I}(t))=-S_0P^{00}(0,t)P^{01}(0,t).\end{align*}

Let $\Phi$ be the distribution function of the standard normal distribution. For N large and $0<\alpha<1$ , the $(1-\alpha)$ -confidence intervals of $s(t)$ and $i(t)$ are contained in the intervals

(5.1)

\begin{equation}\frac{\tilde{S}(t)}{N} \pm \frac{\Phi^{-1}(1-\alpha/2)}{2\sqrt{N}},\quad \&\quad\frac{\tilde{I}(t)}{N} \pm \frac{\Phi^{-1}(1-\alpha/2)}{2\sqrt{N}}\end{equation}

respectively. In particular, as $N\to \infty$ , $\tilde{S}(t)/N$ and $\tilde{I}(t)/N$ converge to $s(t)$ and $i(t)$ in probability.

Proof. For $i=0,1,2$ , let $\tilde{S}^{0i}(t)$ be the number of individuals in state i at time t having been susceptible at time 0. Let $I^{11}(t)$ be the number of infected individuals at time t who are infected at time 0. We have

\[ \tilde{S}(t) = S^{00}(t),\quad \tilde{I}(t) = S^{01}(t)+I^{11}(t).\]

From the independence assumption between individuals, $I^{11}(t)\sim B(I_0,P^{11}(0,t))$ and

\[ \left(\tilde{S}^{0i}(t)\right)_{i=0}^2 \sim\mathrm{Multinomial}\left(S_0,\left(P^{0i}(0,t)\right)_{i=0}^2\right).\]

In addition, $I^{11}(t)$ is independent of the $\tilde{S}^{0i}(t)$ for $i=0,1,2$ . The formulas for the expected values and covariance of $\tilde{S}(t)$ and $\tilde{I}(t)$ then follow. The formulas for the confidence intervals can be derived using normal approximations to the binomial distributions.

Remark 5.2. The final susceptible size of the epidemic is the total number of susceptible individuals remained at the end of the epidemic which is given by $\tilde{S}(\infty)$ . It follows the binomial distribution $B(S_0,s(\infty)/s(0))$ . The final size of the epidemic is given by $S_0-\tilde{S}(\infty)$ .

Proposition 5.3. For $0\leq z\leq t$ , the transition probability

\[\mathbb{P}\left( (\tilde{S}(t),\tilde{I}(t))=(S_t,I_t)\left| (\tilde{S}(z),\tilde{I}(z))=(S_z,I_z) \right. \right)\]

is given by

\begin{multline*}\binom{S_z}{S_t} P^{00}(z,t)^{S_t} P^{02}(z,t)^{S_z-S_t}P^{11}(z,t)^{I_t} P^{12}(z,t)^{I_z-I_t} \\ \sum_{k=\max(0,I_t-I_z)}^{\min(S_z-S_t,I_t)} \binom{S_z-S_t}{k}\binom{I_z}{I_t-k} \left(\frac{P^{01}(z,t)P^{12}(z,t)}{P^{02}(z,t)P^{11}(z,t)} \right)^k\end{multline*}

for $0\leq S_t\leq S_z$ and $0\leq I_t\leq S_z+I_z$ . It is zero otherwise.

Proof. Let k be the number of individuals who are in state 0 at time z but are in state 1 at time t. Therefore,

there are $S_t$ , k and $S_z-S_t-k$ individuals who are in state 0 at time z and are in states 0,1 and 2, respectively, at time t.
there are $I_t-k$ and $I_z-(I_t-k)$ individuals who are in state 1 at time z and are in states 1 and 2, respectively, at time t.

Note that k needs to be between $ \max\left(0,I_t-I_z\right)$ and $ \min\left(S_z-S_t,I_t\right)$ . The probability that $ \tilde{S}(t)=S_t, \tilde{I}(t)=I_t$ and k individuals moving from state 0 to 1 is

\begin{multline*}\binom{S_z}{S_t \quad k \quad S_z-S_t-k} P^{00}(z,t)^{S_t} P^{01}(z,t)^{k}P^{02}(z,t)^{S_z-S_t-k}\binom{I_z}{I_t-k} P^{11}(z,t)^{I_t-k}P^{12}(z,t)^{I_z-I_t+k}.\end{multline*}

Summing over all possible values of k and simplifying yields the proposition.

The following proposition gives the transition intensities for the process $(\tilde{S}(t),\tilde{I}(t))_{t\geq 0}$ .

Proposition 5.4. For $\Delta t>0$ ,

(5.2)

(5.3)

\begin{align} \mathbb{P}\left((\tilde{S}(t+\Delta t),\tilde{I}(t+\Delta t))=(S_t,I_t-1)\left| (\tilde{S}(t),\tilde{I}(t))=(S_t,I_t)\right. \right) &= \alpha I_t \Delta t + o(\Delta t).\end{align}

Proof. From Proposition 5.3, the first probability is given by

\begin{align*}S_t P^{00}(t,t+\Delta t)^{S_t-1} P^{01}(t,t+\Delta t) P^{11}(t,t+\Delta t)^{I_t},\end{align*}

whereas the second probability is given by

\begin{align*}I_t P^{00}(t,t+\Delta t)^{S_t} P^{11}(t,t+\Delta t)^{I_t-1}P^{12}(t,t+\Delta t).\end{align*}

Therefore, both formulas follow from (3.11).

Remark 5.5. Using our notations, the general stochastic epidemic model proposed by Bartlett in Bartlett (Reference Bartlett1949) is characterised by the following relations (see Andersson and Britton, Reference Andersson and Britton2000, p. 14 or Britton et al., Reference Britton2019, p. 26)

(5.4)

\begin{align}\mathbb{P}\left((\tilde{S}(t+\Delta t),\tilde{I}(t+\Delta t))=(S_t-1,I_t+1)\left| (\tilde{S}(t),\tilde{I}(t))=(S_t,I_t)\right. \right) &= \beta S_t I_t/N\Delta t + o(\Delta t), \\\mathbb{P}\left((\tilde{S}(t+\Delta t),\tilde{I}(t+\Delta t))=(S_t,I_t-1)\left| (\tilde{S}(t),\tilde{I}(t))=(S_t,I_t)\right. \right) &= \alpha I_t \Delta t + o(\Delta t).\nonumber\end{align}

Compared with Proposition 5.4, the key difference is in the infection intensity where $\beta S_tI_t/N$ is replaced by $\beta S_t i(t)$ . In our model, the infection intensity is given exogenously from the SIR deterministic model, whereas in the general stochastic epidemic model (5.4), the infection intensity is determined endogenously by the actual number of susceptible and infected individuals in the population. Nevertheless, for large N, $\tilde{I}(t)/N \approx i(t)$ in probability. Hence, our model can be considered an approximation of the general stochastic epidemic model. It would be interesting to work out a comprehensive comparison between the two models, and we would like to revisit this question at a future date.

Remark 5.6. The independence assumption among members of the population allows for mathematical tractability. We offer the following heuristic argument to partially justify it. For another point of view, the readers are referred to Hillairet and Lopez (Reference Hillairet and Lopez2021) and Hillairet et al. (Reference Hillairet2022). It is plausible to assume independence between individuals because the dependence nature of the epidemic has already been encoded in the individual’s infection intensity. Let us elaborate by considering the following instances.

Suppose there are many infected individuals in the population at time t, that is, $\tilde{I}(t)$ is large. Then (5.1) implies that i(t) is also likely to be large. As the infection intensity for an individual is $\beta i(t)$ , a randomly chosen individual is more likely to get infected even though all individuals are independent.
Suppose we observe no infected individuals at time t. According to our model, further infection is possible which is counter intuitive. However, by (5.1), if $\tilde{I}(t)=0$ then it is likely that i(t) is small. The infection intensity is also small and even though further infection is possible, it is also unlikely.

Hence, although the independence assumption is unrealistic, this model can still capture the dynamics of the epidemic to a certain extent because the aggregate level of infection in the population indirectly influences every individual’s chance of getting infected.

We now turn to the question of the duration of the epidemic.

Definition 5.7. The duration of the epidemic, D, is defined as the first instance that there are no infected individuals and there are no further infections thereafter. In other words,

Equivalently, the duration can be defined as

\[ D = \max_{1\leq i \leq N}\{ T_i\; :\; T_i <\infty \}\]

where $T_i$ is the time to absorption of the $i{\rm th}$ individual as defined in Section 4.

Proposition 5.8. The distribution function of D is given by

(5.5)

\begin{equation}\mathbb{P}(D\leq t) = \left(P^{00}(0,\infty)+P^{02}(0,t)\right)^{S_0} \left(1-e^{-\alpha t} \right)^{I_0}.\end{equation}

Proof. The epidemic is over by time t if and only if at time t

all $I_0$ infected individuals have been removed;
and all $S_0$ susceptible individuals have either been removed or will never be infected.

Therefore, (5.5) follows from the independence assumption.

Let us give the asymptotic distribution of D as $N\to \infty$ such that s(0) and i(0) are constants. Let $D_0$ be the maximum value of the time to absorption for the $S_0$ individuals who are susceptible at time 0 and are removed eventually. Let $D_1$ be the maximum value of the time to absorption for the $I_0$ individuals who are infected at time 0. Thus, $D_0$ is the maximum value of $S_0$ independently identically distributed (i.i.d.) random variables $(Y_i)_{i=1}^{S_0}$ whose common distribution, F(t), is given by

(5.6)

\begin{equation} F(t) = P^{00}(0,\infty) + P^{02}(0,t).\end{equation}

On the other hand, $D_1$ is the maximum value of $I_0$ i.i.d. exponential random variables with rate $\alpha$ . It is clear that D is the maximum value of $D_0$ and $D_1$ . The following proposition gives the limiting distributions of $D_0$ and $D_1$ .

Proposition 5.9. There exists a constant b such that both $(\alpha-\beta s(\infty))D_0-\log(S_0)-b$ and $\alpha D_1-\log(I_0)$ converge in distribution to the standard Gumbel distribution as $N\to\infty$ . Furthermore, the variable $D/\log(N)$ is bounded in probability. More precisely,

(5.7)

\begin{equation}\lim_{N\to\infty} \mathbb{P}\left(D\leq \frac{2\log(N)}{\alpha-\beta s(\infty)} \right) = 1.\end{equation}

Proof. The fact that the normalised maximum of i.i.d. exponential random variables converges in distribution to the standard Gumbel distribution as sample size increases to infinity is well-known (see Embrechts et al., Reference Embrechts, Klüppelberg and Mikosch1997, p. 125). From Corollary 3.8, $1-F(t) \sim e^{-(\alpha-\beta s(\infty))t}$ . Thus, the random variable $Y_i$ is tail equivalent to the exponential random variable with rate $\alpha-\beta s(\infty)$ . From Embrechts et al. (Reference Embrechts, Klüppelberg and Mikosch1997, Proposition 3.3.28), the norming constants of the exponential distribution can be used for those of F(t) with the limiting distribution shifted by a constant. As $D_0$ and $D_1$ are independent, $P(D\leq t) = P(D_0\leq t)P(D_1\leq t)$ . By direct calculation,

(5.8)

\begin{equation}\lim_{N\to\infty} \mathbb{P}\left(D \leq \frac{2\log(N)}{\alpha-\beta s(\infty)}\right)= \lim_{N\to\infty}\exp\left[-\left(\frac{s(0)e^{-b}}{N}+ \frac{i(0)}{N^{\frac{\alpha+\beta s(\infty)}{\alpha-\beta s(\infty)}}} \right) \right] = 1.\end{equation}

Remark 5.10. It is important to emphasise that the duration of the epidemic is not given by

\[ \inf\{t\geq 0\;:\; \tilde{I}(t)=0\}.\]

The reason is that even if $\tilde{I}(t)=0$ , further infections are still possible after time t. If that is the case, we say the epidemic restarts after time t. Since we cannot rule out the possibility that the epidemic may restart as long as there remain susceptible individuals, the duration is not an observable quantity.

For the rest of this section, we will estimate the duration of the epidemic under different sets of available information. Let us start with the following scenario. Suppose z is the last instance we observe infected individuals and t is the first instance we observe no infected individuals. That is, for $z<t$

(5.9)

\begin{equation}(\tilde{S}(z), \tilde{I}(z))=(S_{z},I_{z}),\quad (\tilde{S}(t),\tilde{I}(t))=(S_{t},0)\end{equation}

where $S_{z},S_{t}$ and $I_{z} >0$ . Under these circumstances, the epidemic is not over at time z, but it is not clear whether it is over at time t. Proposition 5.11 below describes the conditional distribution of the duration given (5.9). Thus, we can estimate how long it takes until the epidemic is finally over having observed no infections for the first time.

Proposition 5.11. For $w\geq 0$ , the conditional distribution of D

\[ \mathbb{P}\left( D\leq z+w\left| (\tilde{S}(z),\tilde{I}(z))=(S_z,I_z), (\tilde{S}(t),\tilde{I}(t))=(S_t,0) \right.\right)\]

is given by

(5.10)

\begin{align} \left\{ \begin{array}{ll} P^{00}(t,\infty)^{S_{t}} \left( \frac{P^{02}(z,z+w)}{P^{02}(z,t)}\right)^{S_{z}-S_{t}} \left(\frac{P^{12}(z,z+w)}{P^{12}(z,t)}\right)^{I_{z}} & \mbox{if $w\leq t-z$,}\\ \left(P^{00}(t,\infty)+P^{02}(t,z+w)\right)^{S_t} & \mbox{if $w> t-z$.} \end{array} \right.\end{align}

Note that the distribution above is continuous but not differentiable at $w=t-z$ .

Remark 5.12. Let us give a more intuitive explanation of (5.10).

For $w>t-z$ , (5.10) gives the probability that the duration is over at time $z+w>t$ based on the state of the population at time t. Then the formula is just a variation of (5.5).
For $w<t-z$ , (5.10) gives the probability that the duration is over at time $z+w<t$ given that there remain $S_t$ susceptible individuals at the end of the epidemic. Then the required probability is the product of the probability that all $S_t$ susceptible individuals at time t remain susceptible forever with the probability that all $S_z-S_t$ susceptible individuals at time z being removed by time $z+w$ given they are removed by time t with the probability that all $I_z$ infected individuals at time z being removed by time $z+w$ given they are being removed by time t.

Corollary 5.13. Suppose that in addition to the information (5.9), we also know that there are no further infections after time t. For example, all the remaining susceptible individuals are removed for some other reasons. Then the distribution of the duration under the combined information is

\[ \mathbb{P}\left( D\leq z+w\left| D\leq t,(\tilde{S}(z),\tilde{I}(z))=(S_z,I_z), (\tilde{S}(t),\tilde{I}(t))=(S_t,0) \right.\right)\]

which, for $0\leq w \leq t-z$ , is given by

\[ \left( \frac{P^{02}(z,z+w)}{P^{02}(z,t)}\right)^{S_{z}-S_{t}} \left(\frac{P^{12}(z,z+w)}{P^{12}(z,t)}\right)^{I_{z}}.\]

Proposition 5.14. Suppose $z>t^{*}$ and i(z) is sufficiently small as in Proposition 2.5. Let $D_z$ be the duration of the epidemic beyond time z. That is $D_z = D-z$ if $D> z$ and 0 otherwise. Then the expected value of $D_z$ given the state of the population at time z can be approximated by

(5.11)

\begin{equation}\mathbb{E}\left(D_z \left|(\tilde{S}(z),\tilde{I}(z))=(S_z,I_z) \right.\right) =\frac{S_z \beta i(z)}{\alpha-\beta s(\infty)}\left( \frac{1}{\alpha-\beta s(\infty)}+ \frac{1}{\alpha}\right) + \frac{I_z}{\alpha} + o(i(z)).\end{equation}

Remark 5.15. Proposition 5.14 whose proof is omitted is applicable in the following situation. Suppose we observe that the infection number is zero at some time z after the peak of the epidemic. How long do we expect to wait until the epidemic is finally over ? From Proposition 5.1, $\tilde{I}(z)=0$ implies that $i(z)<2/\sqrt{N}$ with at least 99% probability. Thus, if the population is large, we can be fairly confident that the condition of the Proposition 5.14 is applied. Then (5.11) gives an approximation to the expected duration of the epidemic from time z. Equation (5.11) is quite intuitive. The factor $S_z \beta i(z)/(\alpha-\beta s(\infty))$ approximates the expected number of infections after time z and the factor $1/(\alpha-\beta s(\infty))+1/\alpha$ approximates the expected time to absorption for individuals who get infected after time z. Hence, even though the epidemic can continue after zero infection level is observed, we do not expect it to last very long or be very severe.

Remark 5.16. Let us make a brief comparison between the model constructed in this section and the general stochastic epidemic model. According to Andersson and Britton (Reference Andersson and Britton2000) Theorem 2.2, for the general stochastic epidemic model the distribution of the final size is typically bimodal. That is either a few individuals are infected or a fairly large number are infected eventually, whereas for our model the distribution for the final size is binomial. In addition, from Andersson and Britton (Reference Andersson and Britton2000, p. 34), the asymptotic distribution of the duration is either short or grows as $\log(N),$ whereas for our model the duration grows as $\log(N)$ .

6. Valuations of insurance benefits

In this section, we derive formulas for the expected present values (EPV) of insurance and annuity benefits in the SIR multiple state model. The results of this section will be used to analyse premiums and reserves on insurance products in the next section.

6.1. Review of general results

For the readers convenience, we collect some useful results for the EPV calculations in a general Markov multiple state model in this subsection. For comprehensive treatments, the readers are referred to Dickson et al. (Reference Dickson, Hardy and Waters2020) and Haberman and Pitacco (Reference Haberman and Pitacco1998).

Definition 6.1. Consider a Markov multiple state model with $M+1$ states $0,1,\ldots,M$ . The transition matrix is P and the force of transition matrix is $\mu_t$ . Let $\delta$ be the constant force of interest and $v = e^{-\delta}$ . Recall the following standard actuarial functions.

For any i,j, let $\bar{a}^{ij}(z,n)$ be the EPV of an insurance benefit that pays a continuous annuity at the rate of $1 per annum between time z and n to an individual, who is at state i at time 0, as long as he is in state j.
\[ \bar{a}^{ij}(z,n) = \int_z^n v^t P^{ij}(z,t)dt.\]
For $i\neq j$ , let $\bar{A}^{ij}(z,n)$ be the EPV of an insurance benefit that pays $1 at time t for $z\leq t\leq n$ if the policyholder who is at state i at time 0 moves to state j at time t.
\[ \bar{A}^{ij}(z,n) = \sum_{k:\; k\neq j}^M \int_z^n v^t P^{ik}(z,t) \mu^{kj}_t dt.\]
Let $\bar{B}^{ij}(z,n)$ be the EPV of an insurance benefit that pays $1 at time t for $z\leq t\leq n$ if the policyholder who was at state i at time 0 transfer out of state t at time t.
\[ \bar{B}^{ij}(z,n) = \sum_{k:\; k\neq j}^M \int_z^n v^t P^{ij}(z,t) \mu^{jk}_t dt.\]

Lemma 6.2. We have

(6.1)

\begin{equation}\sum_{j=0}^M \bar{a}^{ij}(z,n) = \frac{v^z-v^n}{\delta}.\end{equation}

Lemma 6.3. For $i\neq j$ ,

\[ \bar{A}^{ij}(z,n) = v^nP^{ij}(z,n) + \delta \bar{a}^{ij}(z,n) + \bar{B}^{ij}(z,n).\]

Proof. Combining (3.6) and integration by parts yields

\begin{align*}\bar{A}^{ij}(z,n) &= \int_z^n v^t \left\{ P^{ij}_t(z,t)+\sum_{k:k\neq j}^M P^{ij}(z,t)\mu^{jk}_t\right\} dt \\& = \left[v^t P^{ij}(z,t)\right]_z^n - \int_0^n v^t \log(v) P^{ij}(z,t) dt +\sum_{k:k\neq j}^M \int_z^n v^tP^{ij}(z,t)\mu^{jk}_t dt\\&= v^nP^{ij}(z,n) + \delta {}[ij](z,n) +\bar{B}^{ij}(z,n).\end{align*}

Remark 6.4 Consider an insurance policy in force between time z and n for an individual who is currently in state i. He will deposit $1 to an insurance fund the moment he enters state j. In return, he will receive a continuous annuity of rate $\delta$ per annum as long as he is still in state j. His deposit will also be returned either on the moment he leaves state j or on the expiration of the policy. Then Lemma 6.3 states that for such a policy the EPV of the cost to the insurer is equal to the EPV of the fund paid to the insurer.

Corollary 6.5. If j is an absorbing state, then

\[ \bar{A}^{ij}(z,n) = v^nP^{ij}(z,n) + \delta \bar{a}^{ij}(z,n).\]

6.2. Applications to the SIR multiple state model

Let us apply the results of the previous subsection to derive formulas for the standard actuarial functions in the context of the SIR multiple state model.

Proposition 6.6. For the SIR multiple state model, we have

(6.2)

\begin{align}\bar{A}^{01}(z,n) &= v^nP^{01}(z,n) + (\alpha+\delta) \bar{a}^{01}(z,n),\end{align}

(6.3)

\begin{align}\bar{A}^{02}(z,n) &= \alpha \bar{a}^{01}(z,n),\end{align}

(6.4)

\begin{align}\bar{A}^{12}(z,n) &= \frac{\alpha v^z}{\alpha+\delta} \left(1 - e^{-(\alpha+\delta)(n-z)} \right).\end{align}

In addition,

(6.5)

\begin{align}\bar{a}^{11}(z,n) &= \frac{v^z}{\alpha+\delta}\left(1-e^{-(\alpha+\delta)(n-z)} \right),\\\bar{a}^{12}(z,n) &= \frac{v^z-v^n}{\delta} -\frac{v^z}{\alpha+\delta}\left(1-e^{-(\alpha+\delta)(n-z)} \right). \nonumber\end{align}

Furthermore, the following equations hold

(6.6)

\begin{equation}\bar{a}^{00}(z,n) + \left(1+\frac{\alpha}{\delta} \right) \bar{a}^{01}(z,n)=\frac{1}{\delta} \left( v^z - v^n(1-P^{02}(z,n))\right),\end{equation}

and

(6.7)

\begin{equation}\bar{a}^{02}(z,n) = \frac{1}{\delta} \left(\alpha \bar{a}^{01}(z,n)-v^nP^{02}(z,n)\right).\end{equation}

and

(6.8)

\begin{equation} \bar{A}^{01}(z,n)+\delta \bar{a}^{00}(z,n) = v^z-v^nP^{00}(z,n).\end{equation}

Proof. The formulas for $\bar{A}^{12}(z,n)$ , $\bar{A}^{02}(z,n)$ , $\bar{a}^{11}(z,n)$ and $\bar{a}^{12}(z,n)$ can be derived directly from the fact that $\mu^{12}_t = \alpha$ and $P^{11}(z,t) = e^{-\alpha (t-z)}$ . From Proposition 6.3,

\begin{align*}\bar{A}^{01}(z,n) &= v^nP^{01}(z,n) + \delta \bar{a}^{01}(z,n) + \int_z^n v^t P^{01}(z,t)\mu^{12}_t dt\\&= v^nP^{01}(z,n) + (\alpha+\delta) \bar{a}^{01}(z,n).\end{align*}

By Corollary 6.5,

\[ \bar{A}^{02}(z,n) = v^n P^{02}(z,n) + \delta \bar{a}^{02}(z,n).\]

On the other hand, we have just showed that $\bar{A}^{02}(z,n) = \alpha \bar{a}^{01}(z,n)$ . Thus, we deduce (6.7). Equation (6.6) follows from (6.1) and (6.7). Finally, combining (6.2) with (6.6) yields (6.8).

Definition 6.7. Following Feng and Garrido (Reference Feng and Garrido2011), we define the following actuarial functions:

\begin{align*}\bar{a}^{s}(z,n) \;:\!=\; \int_z^n v^{t}s(t)dt, \quad\bar{a}^{i}(z,n) \;:\!=\; \int_z^n v^{t}i(t)dt, \quad\bar{a}^{r}(z,n) \;:\!=\; \int_z^n v^{t}r(t)dt.\end{align*}

In addition,

\begin{align*}\bar{A}^{i}(z,n) &\;:\!=\; \int_z^n v^{t}s(t)\mu^{01}_t dt = \beta \int_z^n v^ts(t)i(t)dt, \\\bar{A}^{r}(z,n) &\;:\!=\; \int_z^n v^{t}i(t)\mu^{12}_t dt = \alpha \int_z^n v^t i(t)dt.\end{align*}

Let us refer to these actuarial functions as the aggregate actuarial functions. These functions are useful to calculate reserves and premiums for an insurance scheme consisting of all the population including those who are already infected or removed. For example, this insurance scheme might be part of a compulsory national insurance plan in which the entire population is required to participate. In this context, the aggregate actuarial functions have the following economic interpretations. If each susceptible (or infected or removed) individual is entitled to a continuous annuity at the rate of $1 per annum between time z and n, then the EPV of the cost to the insurer per individual is given by $\bar{a}^{s}(z,n)$ (or $\bar{a}^{i}(z,n)$ or $\bar{a}^{r}(z,n)$ ), respectively. Similarly, if each individual is compensated $1 immediately upon getting infected (or removed) between time z and n, then $\bar{A}^{i}(z,n)$ (or $\bar{A}^{r}(z,n)$ ) represents the EPV of the cost to the insurer per individual.

Lemma 6.8. From (3.4), the aggregate actuarial functions can be expressed in terms of the standard actuarial functions as follows.

\begin{align*}\bar{a}^{s}(z,n) &= s(z)\bar{a}^{00}(z,n), \\\bar{a}^{i}(z,n) &= s(z)\bar{a}^{01}(z,n) + i(z)\bar{a}^{11}(z,n),\\\bar{a}^{r}(z,n) &= s(z)\bar{a}^{02}(z,n) + i(z)\bar{a}^{12}(z,n)+ r(z) \left(\frac{v^z-v^n}{\delta}\right),\\\bar{A}^{i}(z,n) &= s(z)\bar{A}^{01}(z,n), \\\bar{A}^{r}(z,n) &= s(z)\bar{A}^{02}(z,n) + i(z)\bar{A}^{12}(z,n).\end{align*}

Lemma 6.9.

(6.9)

\begin{equation} \delta \bar{a}^{r}(z,n) = \alpha \bar{a}^{i}(z,n) + {v^zr(z)-v^nr(n)}.\end{equation}

Proof. Equation (6.9) can be derived from integration by parts. It has the following economic interpretation. Consider an insurance policy between time z and n where each individual deposits $1 to an insurance fund either at time z if he is already removed or the moment he is removed. In return, he is entitled to a continuous annuity of rate $\delta$ per annum until time n when the $1 deposited earlier will be returned. The EPV of the cost to the insurer per individual is given by

\[ v^n r(n) + \delta \bar{a}^{r}(z,n).\]

The EPV of the fund paid to the insurer per individual is the sum of the payments of those who are already removed at time z and the payments by those who are not yet removed at time z. Thus, it is equal to

\begin{align*}v^zr(z) + \int_z^n v^tr'(t)dt = v^z r(z) + \alpha \bar{a}^{i}(z,n).\end{align*}

As a result, (6.9) simply says that the EPV of the cost to the insurer is the same as the EPV of the fund paid to the insurer.

Proposition 6.10. For the SIR multiple state model, the following identities hold

(6.10)

\begin{align}& \delta\bar{a}^{s}(z,n) + (\alpha + \delta) \bar{a}^{i}(z,n)= v^z(1-r(z))-v^n(1-r(n)), \end{align}

(6.11)

\begin{align}& \bar{A}^{i}(z,n) + \delta\bar{a}^{s}(z,n) =v^zs(z)-v^ns(n),\end{align}

(6.12)

\begin{align}& \bar{A}^{i}(z,n) + v^zi(z)-v^ni(n) = (\alpha + \delta) \bar{a}^{i}(z,n). \end{align}

Proof. Different versions of these identities are proved in Feng and Garrido (Reference Feng and Garrido2011). We note here that they can be derived from Proposition 6.6 and Lemma 6.8.

Remark 6.11. The economic interpretations of Proposition 6.10 can be obtained by equating the EPV of the cost to the insurer and the EPV of the fund paid to the insurer for suitably chosen insurance policies. As an example, let us interpret (6.10). Consider the insurance policy between time z and n in which each susceptible or infected individual contributes $1 at time z. In return, he will receive a continuous payment of rate $\delta$ per annum until he is either removed or until the expiration of the policy when he is also returned his deposit. The other identities can be interpreted similarly (see Feng and Garrido, Reference Feng and Garrido2011 for more information).

7. Calculation of premiums and reserves

7.1. Premiums

Consider the following insurance policy which is in force for n years:

If the policyholder is susceptible, then he pays a continuous premium at the rate of P per annum;
If the policyholder is infected, then he receives immediately a payment of $S^1$ and a continuous hospitalisation benefit at the rate of H per annum;
If the policyholder is removed, then he receives immediately a payment of $S^2$ .

Note that n should not exceed 1 otherwise constant removal intensity might not be appropriate. We have the options of either work at the individual or aggregate level. That means we can either determine the premium by considering a single policy consisting of one susceptible policyholder or by considering a portfolio consisting of $S_0$ susceptible and $I_0$ infected policyholders. The latter is the approach taken in Feng and Garrido (Reference Feng and Garrido2011). Nevertheless, being able to work at the individual level allows more flexibility, for example, if only a subset of the population is insured.

Proposition 7.1. Under the equivalence principle, the premium rate P at the individual level is

(7.1)

\begin{equation}P = \frac{S^1 \bar{A}^{01}(0,n) + H \bar{a}^{01}(0,n) + S^2 \bar{A}^{02}(0,n)}{\bar{a}^{00}(0,n)}.\end{equation}

The premium rate per individual $\pi$ at the aggregate level is given by

(7.2)

\begin{equation}\pi = \frac{S^1 \bar{A}^{i}(0,n) + H \bar{a}^{i}(0,n) + S^2 \bar{A}^{r}(0,n)}{\bar{a}^{s}(0,n)}.\end{equation}

Proof. Let us prove (7.2) as (7.1) is clear. Let $B^{(0)}$ and $B^{(1)}$ be the EPVs of future benefits for an individual who is susceptible and infected at time 0, respectively. Then,

\begin{align*} B^{(0)} &= S^1 \bar{A}^{01}(0,n) + H \bar{a}^{01}(0,n) + S^2 \bar{A}^{02}(0,n),\\ B^{(1)} &= H\bar{a}^{11}(0,n) + S^2\bar{A}^{12}(0,n). \end{align*}

Therefore, $\pi = (s(0) B^{(0)} + i(0)B^{(1)})/(s(0)\bar{a}^{00}(0,n))$ which can be rearranged to give (7.2).

7.2. Reserves

For reserves we differentiate 4 different types

prospective reserves at the individual or aggregate level,
retrospective reserves at the individual or aggregate level,

where prospective means we are taking the EPV of the difference between future benefits and future premium payments, whereas retrospective means we are taking the accumulating value of the premium payments already received less that of the benefits already paid.

Proposition 7.2. Consider the insurance policy described at the beginning of this section.

The prospective reserves at the individual level at time t for a susceptible policyholder and an infected policyholder are, respectively,

(7.3)

\begin{align}{}_{t}V^{(0)} &= v^{-t}\left(S^1\bar{A}^{01}(t,n) + H\bar{a}^{01}(t,n)+S^2\bar{A}^{02}(t,n) - P\bar{a}^{00}(t,n)\right),\end{align}

(7.4)

\begin{align}{}_{t}V^{(1)} &= v^{-t}\left(H\bar{a}^{11}(t,n)+S^2\bar{A}^{12}(t,n)\right).\end{align}

The retrospective reserves at the individual level at time t for a susceptible policyholder and an infected policyholder are, respectively,

(7.5)

\begin{align}{}_{t}V^{(0)}_{R} &= v^{-t}\left(P\bar{a}^{00}(0,t) - S^1\bar{A}^{01}(0,t) - H\bar{a}^{01}(0,t) - S^2\bar{A}^{02}(0,t)\right), \end{align}

(7.6)

\begin{align}{}_{t}V^{(1)}_{R} &= -v^{-t}\left(H\bar{a}^{11}(0,t)+S^2\bar{A}^{12}(0,t)\right).\end{align}

Furthermore, $({}_{t}V^{(0)},{}_{t}V^{(1)})$ satisfy the differential equation

(7.7)

\begin{equation}\frac{d}{dt} \left( \begin{array}{c}{}_{t}V^{(0)} \\ {}_{t}V^{(1)} \end{array} \right)= \left[ \begin{array}{cc}\delta + \beta i(t) & -\beta i(t) \\ 0 & \delta + \alpha \end{array} \right] \left( \begin{array}{c}{}_{t}V^{(0)} \\ {}_{t}V^{(1)} \end{array} \right) + \left( \begin{array}{c}P \\ -H \end{array} \right) -\left( \begin{array}{c}S^1 \beta i(t) \\ \alpha S^2 \end{array} \right).\end{equation}

whereas $({}_{t}V^{(0)}_{R},{}_{t}V^{(1)}_{R})$ satisfy the differential equation

(7.8)

\begin{equation}\frac{d}{dt} \left( \begin{array}{c}{}_{t}V^{(0)}_{R} \\ {}_{t}V^{(1)}_{R} \end{array} \right)= \left[ \begin{array}{cc}\delta & 0 \\ 0 & \delta \end{array} \right] \left( \begin{array}{c}{}_{t}V^{(0)}_{R} \\ {}_{t}V^{(1)}_{R} \end{array} \right) + \frac{1}{s(0)}\left( \begin{array}{c}Ps(t)-S^1 \beta s(t)i(t) \\ 0 \end{array} \right) - \frac{(H+\alpha S^2)}{s(0)}\left( \begin{array}{c}i(t)-i(0)e^{-\alpha t} \\ s(0)e^{-\alpha t} \end{array} \right).\end{equation}

Proof. The formulas for the prospective and retrospective reserves are clear. We note that (7.7) is a special case of Thiele’s differential equation for prospective reserves (see Dickson et al., Reference Dickson, Hardy and Waters2020, p. 325) and (7.8) can be obtained by direct differentiation.

Proposition 7.3. Let $W^R(t)$ and $W^P(t)$ be the retrospective and prospective reserves per individual at the aggregate level. Then

\begin{align*}W^R(t) &= v^{-t} \left(\pi \bar{a}^{s}(0,t) - S^1 \bar{A}^{i}(0,t) - H \bar{a}^{i}(0,t) - S^2 \bar{A}^{r}(0,t) \right) \\ W^P(t) &= v^{-t}\left( S^1 \bar{A}^{i}(t,n) + H \bar{a}^{i}(t,n) + S^2 \bar{A}^{r}(t,n) - \pi \bar{a}^{s}(t,n)\right).\end{align*}

The prospective and retrospective reserves $W^P(t)$ and $W^R(t)$ are related via

\[ W^R(t) = W^P(t) - e^{\delta t} W^{P}(0).\]

In addition, both $W^R(t)$ and $W^P(t)$ satisfy the differential equation

(7.9)

\begin{align}\frac{dX}{dt} = \delta X + \pi s(t) - \beta S^1 s(t) i(t) - (H+ \alpha S^2) i(t).\end{align}

Proof. The formulas for the prospective and retrospective reserves are clear due to the interpretation for the aggregate actuarial functions. The differential Equation (7.9) follows by direct differentiation.

Remark 7.4. Proposition 7.3 is proved in Feng and Garrido (Reference Feng and Garrido2011) for retrospective reserves. Suppose that P is set equal to $\pi$ . Then we have the following decompositions of aggregate reserves.

(7.10)

\begin{equation}W^P(t) = s(t) {}_{t}V^{(0)} + i(t) {}_{t}V^{(1)}, \quad W^R(t) = s(0) {}_{t}V^{(0)}_{R} + i(0) {}_{t}V^{(1)}_{R}.\end{equation}

Thus, we can also obtain the same results by combining (7.10) and Proposition 7.2.

8. Numerical illustration

8.1. Data

For numerical illustration, let us consider the Eyam data set from Raggett (Reference Raggett1982). This data set describes the evolution of the bubonic plague in the village of Eyam in 1665–1666. From a population of 261 in June 1666, there were only 83 survivors in October 1666 when the plague ended. Note that we have rounded non-integral values to the nearest integers.

8.2. Parameter estimation

Given the values $(S_{t_i},I_{t_i})$ of the number of susceptible and infected individuals $(\tilde{S}(t_i),\tilde{I}(t_i))$ at time $t_i$ for $i=1,\ldots,M$ where $t_1=0$ , we can form the conditional log-likelihood function as follows.

(8.1)

\begin{equation}l(\alpha,\beta) = \sum_{i=1}^{M-1} \log\left(\mathbb{P}\left( (\tilde{S}(t_{i+1}),\tilde{I}(t_{i+1}))=(S_{t_{i+1}},I_{t_{i+1}}) \left|(\tilde{S}(t_{i}),\tilde{I}(t_{i}))=(S_{t_{i}},I_{t_{i}}) \right. \right)\right)\end{equation}

Given $\alpha,\beta$ , numerical methods can be employed to solve the differential Equations (2.1) with initial values

(8.2)

\begin{equation}s(0) = \frac{254}{261},\quad i(0) = \frac{7}{261}\end{equation}

for $(s(t_i),i(t_i),r(t_i))$ for $i=1,\ldots,M$ . The transition probabilities can be obtained from (3.13). Then (8.1) can be calculated via Proposition 5.3. The graph of the function $l(\alpha,\beta)$ is given in Figure 2. The function $l(\alpha,\beta)$ can be maximised using numerical methods. Initial estimates for $(\alpha,\beta)$ can be obtained by minimizing the sum of squares

(8.3)

\begin{equation}\sum_{k=1}^{M} \left(\hat{s}(t_k)-s(t_k) \right)^2 + \sum_{k=1}^{M}\left(\hat{i}(t_k)-i(t_k) \right)^2\end{equation}

where $s(t_k),i(t_k)$ are the values obtained from solving (2.1) and $\hat{s}(t_k),\hat{i}(t_k)$ are the values calculated from the data. The initial estimates obtained from minimizing (8.3) are $\hat{\alpha}_0 = 34.739$ and $\hat{\beta}_0 = 56.441$ . The maximum likelihood estimates of $\alpha$ and $\beta$ are given by

(8.4)

\begin{equation}\hat{\alpha} = 34.150,\quad \hat{\beta} = 55.437.\end{equation}

Figure 2. Log-likelihood function.

These estimates are slightly different from those obtained by Raggett (Reference Raggett1982) who used a different method of estimation. Note that (8.1) does not take into account the fact that the epidemic is already over on October 20. To take that into account, the term $S_{t_M} \log(s(\infty)/s(t_M))$ needs to be added to (8.1). The modified log-likelihood function becomes

(8.5)

\begin{equation}\tilde{l}(\alpha,\beta) = l(\alpha,\beta) + S_{t_M} \log(s(\infty)/s(t_M)).\end{equation}

Note that $s(\infty)$ can be calculated from Proposition 2.2. The estimates obtained from maximizing (8.5) are $\tilde{\alpha}_0 = 35.090$ and $\tilde{\beta}_0 = 56.804$ . However, we do not use these estimates as we would like to treat the data as part of an ongoing epidemic. Figure 3 shows the graphs of the estimated s(t) and i(t) and the observed data.

Figure 3. Fitting s(t), i(t) to the data.

8.3. Transition probabilities and transition intensities

Substituting the values of $\hat{\alpha}$ and $\hat{\beta}$ from (8.4) into (2.1), we can solve for s(t), i(t), r(t) and the transition probabilities $P^{ij}(0,t)$ . From Figure 4, we observe that $P^{00}(0,t)$ , $P^{01}(0,t)$ and $P^{02}(0,t)$ have remarkably similar shapes to s(t), i(t) and r(t), which are consistent with (3.13). To find the value of $s(\infty)$ and $r(\infty)$ , we solve (2.5) which yields

\[ s(\infty) = 0.3257,\quad r(\infty)= 0.6743.\]

As a result, we can deduce

\[ P^{00}(0,\infty)= 0.3346,\quad P^{01}(0,\infty)=0,\quad P^{02}(0,\infty)=0.6654.\]

Hence, at the start of the epidemic a susceptible individual has a 33.46% chance of never getting infected and 66.54% chance of getting removed eventually. From (2.7), the infection intensity peaks at $t^{*}=0.12$ (years) when approximately 10% of the population is infected. Using (3.13), we can calculate the transition probabilities $P(z,z+t)$ for any z, t. The graphs of $P^{0i}(z,z+t)$ for $i=0,1,2$ are given in Figure 5.

Figure 4. Transition probabilities $P^{0i}(0,t)$ for $i=0,1,2$ and s(t), i(t), r(t).

Figure 5. Transition probabilities $P^{0i}(z,z+t)$ for $i=0,1,2$ .

8.4. Duration estimation

Note that the last time that we could observe infected individuals is when $z=0.2521$ and the first time we could observe no infected individuals is when $t=0.3370$ . Based on this information alone, the distribution of the duration D of the epidemic is given by Proposition 5.11. On the other hand, if we could be sure that there are no further infections after time $t=0.3370$ , for example, all susceptible individuals are to die from some other causes, then the distribution of D is given by Corollary 5.13. The graphs of the distributions of $D-z$ in both cases are given in Figure 6. The mean and standard deviation of the duration in the first case are 0.4653 and 0.0528, respectively. On the other hand, the mean and standard deviation of the duration in the second case are 0.3317 and 0.0048, respectively.

Figure 6. Distribution functions of the duration.

8.5. Premium and reserve calculation

Consider an insurance policy that pays a continuous annual hospitalisation benefit of $1000 per annum for a maximum of one year. We assume the force of interest is constant at 5% per annum. Then from (7.1), the annual individual premium rate is given by

\[ P = \frac{1000\bar{a}^{01}(0,1)}{\bar{a}^{00}(0,1)} = 47.5408\]

where

\[ \bar{a}^{01}(0,1)= 0.01934,\quad \bar{a}^{00}(0,1) =0.4068.\]

From (7.2), the annual aggregate premium rate is $\pi = 49.5219$ . As pointed out in Feng and Garrido (Reference Feng and Garrido2011), the aggregate retrospective reserves are usually negative since the infection rate decreases towards the end of the epidemic. Hence, to not have policyholders surrender policies once the peak of the epidemic has passed and running at a deficit, insurance providers should charge premiums at the higher rate than that determined by the equivalence principle. Adopting the algorithm in Feng and Garrido (Reference Feng and Garrido2011), we obtained a premium rate of $113.90. From Figure 7, we observe that with the higher premium rate, the aggregate reserve is always non-negative. Thus, the insurer is protected from loss due to early surrender. However, at the end of the epidemic, there is a surplus of $26.79 which should be returned to all remaining susceptible individuals.

Figure 7. Aggregate retrospective reserves for premium rates $49.52 and $113.90.

8.6. Simulations

In this subsection, we describe the simulation procedure for the process $(X_t)_{t\geq0}$ described in Section 3. Suppose $X_0=0$ . From Proposition 4.1, $T^{(0)}$ and $T^{(1)}$ can be simulated by the below algorithm. Note that the inverse of s(t) is given by (2.6).

Simulate $u\sim \mbox{Uniform}(0,1)$ and $w\sim \mathrm{Exponential}(\alpha)$ .
Then $T^{(0)}$ and $T^{(1)}$ are given by
\[ T^{(0)} = \left\{ \begin{array}{ll}s^{-1}(s(0)u) & \mbox{if $u>s(\infty)/s(0)$,}\\\infty & \mbox{if $u\leq s(\infty)/s(0)$.}\end{array}\right.\quad \& \quad T^{(1)} = \left\{ \begin{array}{ll}0 & \mbox{if $T^{(0)}=\infty$},\\w & \mbox{if $T^{(0)}<\infty$}.\end{array}\right.\]

Finally, we study the duration and final susceptible size of the epidemic. Assume that the population size is 261 with 254 susceptible and 7 infected individuals as in the Eyam data set. We simulate 20,000 such populations. For each population, we simulate 254 paths starting from state 0 and 7 paths starting from state 1 independently. The duration for each population is determined as the instance the last infected individual is removed. The average duration of the epidemic from the simulations is 0.4749 years compared with the expected value of 0.4751 years and standard deviation 0.0798. The histograms of the duration and the final susceptible size of the epidemic are included in Figure 8. They are fitted with the probability density functions given by Remark 5.2 and Proposition 5.8.

Figure 8. Histograms of the duration and the final susceptible size.

Alternatively, we can simulate the dynamics of the above population as follows. Let $t_n=n\Delta T$ for $n=0,1,2,\ldots$ where $\Delta T$ is a small time interval. Let $\hat{S}(0)$ be the number of individuals who are initially susceptible but eventually removed. Thus, $\hat{S}(0)=S_0-\tilde{S}(\infty)$ . For $i=0,1,2$ , let $\hat{S}^{0i}(t_n)$ be the number of individuals who are susceptible at time $t_n$ but are susceptible, infected, and removed at time $t_{n+1}$ , respectively. Then,

\[ (\hat{S}^{0i}(t_{n}))_{i=0}^{2} \sim \mbox{Multinomial}\left({\hat{S}(t_n)},(\hat{P}^{0i}(t_n,t_{n+1}))_{i=0}^{2}\right)\]

where

\begin{align*}\hat{P}^{00}(z,t) = \frac{P^{00}(z,t)-P^{00}(z,\infty)}{P^{02}(z,\infty)},\quad\hat{P}^{01}(z,t) = \frac{P^{01}(z,t)}{P^{02}(z,\infty)},\quad\hat{P}^{02}(z,t) = \frac{P^{02}(z,t)}{P^{02}(z,\infty)}.\end{align*}

Note that

\[ \tilde{S}(t_n)=\tilde{S}(\infty)+\hat{S}(t_n).\]

The simulation algorithm is as follows.

Simulate $\tilde{S}(\infty) \sim B(S_0,P^{00}(0,\infty))$ and set $\hat{S}(0)=N-\tilde{S}(\infty)$ .
For $n\geq 0$ , simulate
\[ (\hat{S}^{0i}(t_{n}))_{i=0}^{2} \sim \mbox{Multinomial}\left({\hat{S}(t_n)},(\hat{P}^{0i}(t_n,t_{n+1}))_{i=0}^{2}\right),\]
and
\[ \tilde{I}^{11}(t_n) \sim B({\tilde{I}(t_n)},P^{11}(t_n,t_{n+1})). \]
Update
\[ \hat{S}(t_{n+1}) = \hat{S}^{00}(t_{n}),\quad \tilde{I}(t_{n+1}) = \hat{S}^{01}(t_{n})+\tilde{I}^{11}(t_{n}). \]
Repeat until $\hat{S}(t_{n})=\tilde{I}(t_{n})=0$ .

Figure 9 shows one possible realisation of the dynamics of the population. As expected, the observed number of susceptible and infected individuals fluctuated around the corresponding expected numbers. In addition, the first time zero infection was observed is around 0.36 years. The epidemic then restarted around 0.41 years and eventually ended around 0.45 years. We note that even though the epidemic continued for almost 0.1 years after zero infection was observed, the number of further infections is only 1. Thus, the restart of the epidemic in this case does not significantly alter the dynamics of the epidemic. This fact is consistent with Proposition 5.14 and Remark 5.15.

Figure 9. One possible realisation of the population dynamics.

9. Conclusions

In this paper, we propose a method of studying the financial repercussions of epidemics by combining deterministic compartment models in epidemiology and Markov multiple state models in life insurance. For illustrative purposes, we focus on the classical SIR model. The model that we analyse, which is a special case of Hillairet & Lopez’s model in Hillairet and Lopez (Reference Hillairet and Lopez2021) and Hillairet et al. (Reference Hillairet2022) and which is also studied and generalised by Francis and Steffensen in Francis and Steffensen (Reference Francis and Steffensen2023), has clarified and extended some of the constructions and results of Feng and Garrido in (2011). In particular, our approach is more flexible in dealing with small portfolios and more complex insurance products. We also study the spread of an epidemic in a population and provide concrete descriptions of the distributions of the final susceptible size and the duration of the epidemic albeit under a very strong independence assumption. Clearly, further work is required to relax that assumption and to arrive at a more realistic model. Nevertheless, we believe our model having some desirable features can serve as a starting point for further research and the techniques used in this paper are applicable in other aspect of epidemic risk management for, for example, resource planning and allocation as in Chen (2021).

Acknowledgements

This research is funded by University of Economics Ho Chi Minh City, Vietnam (UEH). We would like to thank Professor Xueyuan Wu of the University of Melbourne for advice and encouragement. We are also grateful to the editor and the anonymous referees for their careful reading and very helpful comments that improve the paper significantly.

Competing interests

The author declares none.

References

Andersson, H. and Britton, T. (2000) Stochastic Epidemie Models and Their Statistical Analysis, Lecture Notes in Statistics, Vol. 151. Springer.Google Scholar

Bartlett, M. (1949) Some evolutionary stochastic processes. Journal of the Royal Statistical Society B, 11, 211–229.10.1111/j.2517-6161.1949.tb00031.xCrossRef Google Scholar

Billard, L. and Dayananda, P. (2014a) A multi-stage compartmental model for HIV-infected individuals: I Waiting time approach. Mathematical Biosciences, 249, 92–101.10.1016/j.mbs.2013.08.011CrossRef Google Scholar PubMed

Billard, L. and Dayananda, P. (2014b) A multi-stage compartmental model for HIV-infected individuals: II Application to insurance functions and health-care costs. Mathematical Biosciences, 249, 102–109.10.1016/j.mbs.2014.01.009CrossRef Google Scholar PubMed

Brauer, F., Castillo-Chavez, C. and Feng, Z. (2019) Mathematical Models in Epidemiology, 3rd edition, Texts in Applied Mathematics, Vol. 69. Springer.10.1007/978-1-4939-9828-9CrossRef Google Scholar

Britton, T., et al. (2019) Stochastic Epidemic Models with Inference, Lecture Notes in Mathematics, Vol. 2255. Springer.Google Scholar

Chen, X., et al. (2021) Pandemic risk management: Resources contingency planning and allocation, Insurance: Mathematics and Economics, 101, 359–383.Google Scholar PubMed

Dickson, D., Hardy, M. and Waters, H. (2020) Actuarial Mathematics for Life Contingent Risks, 3rd edition, International Series in Actuarial Science. Cambridge University Press.10.1017/9781108784184CrossRef Google Scholar

Embrechts, P., Klüppelberg, C. and Mikosch, T. (1997) Modelling Extremal Events for Insurance and Finance, Stochastic Modelling and Applied Probability. Springer.10.1007/978-3-642-33483-2CrossRef Google Scholar

Feng, R. and Garrido, J. (2011) Actuarial applications of epidemiological models. North American Actuarial Journal, 15, 112–136.10.1080/10920277.2011.10597612CrossRef Google Scholar

Feng, R., et al. (2022) Epidemic compartmental models and their insurance applications. In Pandemics: Insurance and Social Protection.10.1007/978-3-030-78334-1_2CrossRef Google Scholar

Feng, R., Jin, L. and Loke, S.-H. (2020) Interplay between epidemiology and actuarial modeling, Submitted to the Casualty Actuarial Society E-Forum.Google Scholar

Francis, L. and Steffensen, M. (2023) Individual life insurance during epidemics. Annals of Actuarial Science, 2023, 1–24.Google Scholar

Haberman, S. and Pitacco, E. (1998) Actuarial Models for Disability Insurance. Boca Raton: Chapman and Hall.Google Scholar

Hainaut, D. (2021) An actuarial approach for modeling pandemic risk. Risks, 9, 3.10.3390/risks9010003CrossRef Google Scholar

Hillairet, C. and Lopez, O. (2021) Propagation of cyber incidents in an insurance portfolio: Counting processes combined with compartmental epidemiological models. Scandinavian Actuarial Journal, 2021(8), 671–694.10.1080/03461238.2021.1872694CrossRef Google Scholar

Hillairet, C., et al. (2022) Cyber-contagion model with network structure applied to insurance. Insurance: Mathematics and Economics, 107, 88–101.Google Scholar

Hua, C. and Cox, S. (2009) An option-based operational risk management model for pandemics. North American Actuarial Journal, 13(1), 54–76.Google Scholar

Kermack, W. and McKendrick, A. (1927) A contribution to the mathematical theory of epidemics. Proceedings of the Royal Society of London. Series A, Containing Papers of a Mathematical and Physical Character, 115(772), 700–721.Google Scholar

Kermack, W. and McKendrick, A. (1932) Contributions to the mathematical theory of epidemics. II. The problem of endemicity. Proceedings of the Royal Society of London. Series A, Containing Papers of a Mathematical and Physical Character, 138(834), 55–83.Google Scholar

Lefevre, C., Picard, P. and Simon, M. (2017) Epidemic risk and insurance coverage. Journal of Applied Probability, 54(1), 286–303.10.1017/jpr.2016.100CrossRef Google Scholar

Lefevre, C. and Picard, P. (2018a) Final outcomes and disease insurance for a controlled epidemic model. Applied Stochastic Models in Business and Industry, 34(6), 803–815.10.1002/asmb.2341CrossRef Google Scholar

Lefevre, C. and Picard, P. (2018b) A general approach to the integral functionals of epidemic processes. Journal of Applied Probability, 55(2), 593–609.10.1017/jpr.2018.37CrossRef Google Scholar

Lefevre, C. and Simon, M. (2020) SIR-type epidemic models as block structured markov processes. Methodology and Computing in Applied Probability, 22, 433–453 10.1007/s11009-019-09710-yCrossRef Google Scholar

Lefevre, C. and Simon, M. (2021) Ruin problems for epidemic insurance. Advances in Applied Probability, 53, 484–509 10.1017/apr.2020.66CrossRef Google Scholar