1. Introduction
In property and casualty (P&C) insurance, accurate loss reserving is vital for financial stability, regulatory compliance, and effective risk management (Bjarnason and Sjögren, Reference Bjarnason and Sjögren2014). Reserves typically comprise two parts: reserves for reported but not settled (RBNS) claims and reserves for incurred but not reported (IBNR) claims. This paper focuses on the latter, where the number of unreported claims at the valuation date is unknown. Estimating the IBNR reserve requires advanced modeling to predict both the frequency and severity of unreported claims, making it a complex yet crucial aspect of the reserving process.
Most of the literature on modeling insurance claims and reserving is based on aggregated claims data, using methods such as the chain ladder method (Mack, Reference Mack1993, Reference Mack1999) and the Bornhuetter-Ferguson method (Bornhuetter and Ferguson, Reference Bornhuetter and Ferguson1972). These “macro-level” models summarize claim development by aggregating payments across occurrence and development years. Subsequently, stochastic models were introduced to account for the variability in reserves derived from these models; see Taylor (Reference Taylor2012), England and Verrall (Reference England and Verrall2002), and Wüthrich and Merz (Reference Wüthrich and Merz2008) for a comprehensive overview of stochastic reserving. Despite their simplicity, these models come with several disadvantages, including the loss of valuable information on the characteristics of individual claims, the high uncertainty of parameter estimates, and potential bias under certain conditions (Charpentier and Pigeon, Reference Charpentier and Pigeon2016). To address these shortcomings, “micro-level” models have been proposed, with the aim of using individual-level data to describe the development of individual claims.
Arjas (Reference Arjas1989) and Norberg (Reference Norberg1993), Norberg (Reference Norberg1999) were the first to propose a stochastic loss reserving framework at the individual claim level. They proposed a marked non-homogeneous Poisson process (NHPP) model and a general mathematical framework for the development of individual claims. Haastrup and Arjas (Reference Haastrup and Arjas1996) then implemented the NHPP model using non-parametric Bayesian statistics. The subject of micro-level reserving subsequently remained largely dormant before re-emerging in recent years. Some early contributions include Larsen (Reference Larsen2007), who revisited the framework of Arjas (Reference Arjas1989) and Norberg (Reference Norberg1993), Norberg (Reference Norberg1999), and Rosenlund (Reference Rosenlund2012), who proposed a deterministic reserving approach that allows conditioning on individual claim characteristics. Antonio and Plat (Reference Antonio and Plat2014) presented a case study in which they applied Norberg’s model to an auto-insurance dataset. Data-driven comparisons between macro-level and micro-level reserving have shown the superiority of the micro-level models; see Charpentier and Pigeon (Reference Charpentier and Pigeon2016) and Huang et al. (Reference Huang, Qiu and Wu2015), Huang et al. (Reference Huang, Wu and Zhou2016).
Since then, various research streams have emerged in the micro-level estimation of IBNR reserves. One stream employs machine learning and neural network techniques for IBNR claim estimation. As described in Bücher and Rosenstock (Reference Bücher and Rosenstock2023), this approach utilizes Chain Ladder (CL)-based methods to estimate IBNR reserves over discrete time intervals, as demonstrated by studies such as Wüthrich (Reference Wüthrich2018a), Wüthrich (Reference Wüthrich2018b), Baudry and Robert (Reference Baudry and Robert2019), De Felice and Moriconi (Reference De Felice and Moriconi2019), Delong et al. (Reference Delong2022) and Bücher and Rosenstock (Reference Bücher and Rosenstock2023). Another stream focuses on modeling the claim arrival process, where the NHPP model has traditionally played a prevalent role. However, NHPPs assume temporal independence in claims count and overlook portfolio-wide environmental effects, as noted by Badescu et al. (Reference Badescu, Lin and Tang2016) and Grandell (Reference Grandell2012), p. 54. This limits their ability to capture dependencies and explain the calendar year effect observed in run-off triangles (Holmberg, Reference Holmberg1994, Shi et al., Reference Shi, Basu and Meyers2012).
Addressing these issues involves incorporating a temporal dependence structure into the claim arrival process model, which can be achieved by transitioning to a Cox process. Notable research in this direction includes Avanzi et al. (Reference Avanzi, Wong and Yang2016), who utilized a shot noise Cox process, and Badescu et al. (Reference Badescu, Lin and Tang2016), who employed a hidden Markov model (HMM) with an Erlang state-dependent distribution to model the claim arrival process. It is crucial to note that while both studies incorporated the Cox model, they focused on modeling the claim arrival process for the entire portfolio, overlooking policyholder information that is likely to enhance the accuracy of the different components within the model.
To this end, we propose a novel micro-level Cox model that flexibly accommodates various levels of granularity, including the policyholder level, lines of business, or the entire portfolio. Building on Badescu et al. (Reference Badescu, Lin and Tang2016), our model incorporates a shared HMM that governs claim arrivals across granular units. At the policyholder level, the HMM captures environmental dynamics that influence all policyholders simultaneously. Conditional on the HMM state, each policyholder’s claim arrival intensity is constant and depends on their risk attributes through a regression function. Thus, the intensity reflects both shared external influences and individual risk profiles. This framework captures temporal dependence and environmental variation often overlooked in traditional reserving models. We show that, within our framework, the discretely observed claim arrival, reported claim, and IBNR claim processes for each policyholder belong to the family of Poisson-HMMs, with state-dependent distributions shaped by the policyholder’s risk attributes.
We initially propose our model in continuous time. In general, fitting continuous-time marked count process models to fully granular reported claims data – that is, data including exact accident times and reporting delays – presents significant challenges. The main challenge is maximizing the likelihood of the observed data, which entails addressing two distinct components: the likelihood of the right-truncated reporting delays and the likelihood of the claim frequency model for reported claims, which in turn depends on the reporting delay distribution. The interdependence of these two components makes the maximization a formidable task. Traditionally, researchers have adopted a two-step approach to maximize the likelihood of the observed data: first fitting the right-truncated reporting delay distribution, and then estimating the frequency model by treating the estimated delay distribution as fixed. However, this method overlooks the inherent interaction between the frequency and reporting delay components; an increase in reported claims could result from a higher claim frequency, faster reporting, or a combination of both.
We address such a challenge within our marked Cox model framework by integrating reporting delay and claim frequency modeling cohesively, transitioning from a continuous-time framework to a discrete-time framework. Specifically, we aggregate continuous-time events to a coarser time scale, which is more common in practice. This coarser time scale (or period) can be days, weeks, months, etc. We assume a hierarchical structure: if
$N_{i,t}$
denotes the number of claims that occurred in period t for the policyholder i, and
$Z_{i,t,d}$
denotes the number of these claims reported d periods later, then we assume that
$Z_{i,t,d} \mid N_{i,t}$
follows a multinomial distribution. We simultaneously model both the delay and the occurrence components, capturing the intricate interplay between claim occurrence and reporting behavior. This type of discrete-time hierarchical structure has been explored in the literature. For instance, Verbelen et al. (Reference Verbelen, Antonio, Claeskens and Crevecoeur2022) considered a model in which the underlying claim counts follow a non-homogeneous Poisson process, and employed the EM algorithm to maximize the joint likelihood of the claim occurrence and reporting delay components. Similarly, Avanzi et al. (Reference Avanzi, Wong and Yang2016) modeled claim arrivals via a shot noise Cox process, using reversible jump MCMC for joint inference within a discrete-time framework. Notably, neither of these approaches incorporates policyholder-specific covariates. Crevecoeur et al. (Reference Crevecoeur, Antonio, Desmedt and Masquelein2023) extended the structure used in Verbelen et al. (Reference Verbelen, Antonio, Claeskens and Crevecoeur2022) by incorporating covariates into the model, though assuming a homogeneous Poisson process for the underlying claim counts and applying a similar EM-based estimation procedure.
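To make this hierarchical structure concrete, the following minimal R sketch simulates per-period claim counts for a single policyholder and allocates them across reporting delays; the Poisson rate and delay probabilities are purely illustrative assumptions.

```r
set.seed(42)
T_per <- 10; D <- 3                        # number of periods and maximum reporting delay
lambda <- 0.3                              # illustrative per-period claim rate
p_delay <- c(0.60, 0.25, 0.10, 0.05)       # P(reported d periods later), d = 0, ..., D

N <- rpois(T_per, lambda)                              # N_{i,t}: claims occurring in period t
Z <- sapply(N, function(n) rmultinom(1, n, p_delay))   # Z_{i,t,d} | N_{i,t} ~ Multinomial
# entries of Z with occurrence period t and delay d such that t + d > T_per are IBNR
```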
The existing literature typically relies on static or time-dependent covariates to model variation in reporting delays. However, practical scenarios frequently involve unexplained changes in delay structures that are not captured by conventional covariates. As previously demonstrated by Swamy (Reference Swamy1970) and Sentürk and Müller (Reference Sentürk and Müller2005), accounting for such unexplained changes requires treating the delay probabilities as random. To address this, we extend the discrete-time model to account for the likely presence of unobserved confounding covariates by assuming that the probabilities of the multinomial distribution follow a Dirichlet distribution.
We outline the EM algorithm needed to obtain the maximum likelihood estimates for our proposed models. In particular, for the discrete-time model with the Dirichlet assumption, we employ the Monte Carlo EM (MCEM) algorithm, which uses Monte Carlo simulations to approximate the E-step and handles the complexities introduced by the Dirichlet distribution. We also fit our models to an auto-insurance dataset from a major European insurance company. We show that despite losing information by switching to the discrete-time framework, the discrete-time models perform better than the continuous-time model because they account for the joint modeling of occurrence and reporting. Moreover, we demonstrate that adding the Dirichlet assumption provides more realistic interval estimates for the IBNR claim count.
This work contributes to the literature by:
• Extending the HMM-based Cox model of Badescu et al. (Reference Badescu, Lin and Tang2016); Badescu et al. (Reference Badescu, Chen, Lin and Tang2019) to various levels of granularity (e.g., policyholder or business line), which is essential when portfolio composition shifts and claim behaviors vary across units.
• Extending the EM-based estimation framework of Verbelen et al. (Reference Verbelen, Antonio, Claeskens and Crevecoeur2022) by modeling the underlying occurrence process as a Poisson-HMM rather than a Poisson process and conducting the analysis at the micro (policyholder-level) scale.
• Capturing unsystematic variation in reporting delays by modeling the delay probabilities via a Dirichlet distribution – introducing randomness that was not accounted for in previous approaches such as Verbelen et al. (Reference Verbelen, Antonio, Claeskens and Crevecoeur2022) and Avanzi et al. (Reference Avanzi, Wong and Yang2016) – and outlining the corresponding MCEM algorithm for model fitting.
The paper is structured as follows. In Section 2, we outline our modeling framework and discuss its interpretability. The properties of the model and the associated individual-level reported claim process and IBNR process are presented in Section 3. In Section 4, we outline the discrete-time version of the model and show the likelihood for the three different models under consideration. In Section 5, we provide the EM algorithm needed to estimate the parameters of the three different models. Section 6 details the application of the model in predicting the IBNR claims count and reserve, while we apply the models to real-life data in Section 7 to assess their performance in estimating the IBNR claim count. We conclude in Section 8.
2. Modeling framework
We present our proposed model for modeling the claim arrival process at the policyholder level. In line with the notation established by Badescu et al. (Reference Badescu, Lin and Tang2016), we describe the arrival of the k-th claim for the ith policy
$(i=1,\dots,m)$
through the random variables
$(T_{ik},U_{ik})$
, where
$T_{ik}$
denotes the occurrence time of the claim and
$U_{ik}$
denotes its reporting delay. Chronologically ordered,
$\{(T_{ik},U_{ik}), k=1,2,\dots\}$
constitutes the claim arrival process for policy i, whose risk attributes are denoted by
$\boldsymbol{x}_{i} \in \mathcal{X}$
, where
$\mathcal{X}$
is the covariate space. Additionally, the total claim count process for the ith policy is defined as
$N_i(t)=\sum_{k=1}^{\infty}\unicode{x1D7D9}\{T_{ik} \lt t\}$
, where
$\unicode{x1D7D9}\{.\}$
is the indicator function.
In our modeling approach, we represent
$\{N_i(t), t\ge 0 \}$
as a marked Cox process with two components:
1. The marks
$\{U_{i1},U_{i2},\dots\}$ , and
2. The stochastic intensity function
$\Lambda_i(t)$ .
The assumption is made that the marks are independent, with a density function given by
$f_{U\mid t,\boldsymbol{x}_i}(u)$
. Note that the distribution of the reporting delay is conditioned on the time of claim occurrence and the risk attributes of the policyholder. Additionally, the stochastic intensity function
$\Lambda_i(t)$
is modeled as a piecewise stochastic process, where
$\Lambda_i(t) = e_{i,l}\Lambda_l(t;\,\boldsymbol{x}_i)$
for
$d_{l-1} \le t \lt d_l$
, with
$l=1,2,\dots$
and
$d_0=0$
. The time points
$d_l$
,
$l=1,2,\dots$
, are predetermined, and
$t=d_0=0$
marks the beginning of the observation window for the entire portfolio. Moreover,
$e_{i,l} \in [0,1]$
denotes the exposure of the ith policy in the interval
$d_{l-1} \le t \lt d_l$
, defined as the proportion of days in that interval for which the contract is in force. Notably,
$e_{i,l}$
can be zero, indicating the case when the ith policy’s contract is not active during
$d_{l-1} \le t \lt d_l$
.
To describe
$\{\Lambda_1(t;\,\boldsymbol{x}_i), \Lambda_2(t;\,\boldsymbol{x}_i),\dots\}$
, we propose a structure involving two components:
1. A hidden parameter process
$\{C_1,C_2,\dots\}$ , which constitutes a time-homogeneous Markov chain with a finite state space
$\{1,\dots,g\}$ . Denoting its initial distribution and transition probability matrix as
$\boldsymbol{\pi}_1$ and
$\boldsymbol{\Gamma}=\{\gamma_{jk}\}$ , respectively, where
$\gamma_{jk}=P(C_l=k\mid C_{l-1}=j)$ , we assume the Markov chain is irreducible, aperiodic, and has all states being positive recurrent, leading to a unique limiting distribution.
2. A state-dependent process
$\{\Lambda_1(t;\,\boldsymbol{x}_i), \Lambda_2(t;\,\boldsymbol{x}_i),\dots\}$ , which depends on the current state
$C_l$ . Given
$C_l=j$ , we assume that
$\Lambda_l(t;\,\boldsymbol{x}_i)$ is constant such that
\begin{align*}\Lambda_l(t;\,\boldsymbol{x}_i)\mid (C_l=j) = \lambda^{(j)}(\boldsymbol{x}_i),\end{align*}
$\lambda^{(j)}\,:\,\mathcal{X}\to \mathbb{R}^+$ is a regression function.
Specifically, each policyholder’s risk attributes
$\boldsymbol{x}_i$
are incorporated into the model through the regression function
$\lambda^{(j)}(\boldsymbol{x}_i)$
. These risk attributes could include factors such as age, location, coverage type, and any other relevant information about the policyholder that may affect claim occurrence rates. By including these covariates, we tailor the intensity function
$\Lambda_l(t;\,\boldsymbol{x}_i)$
to reflect the unique risk profile of each policyholder.
In summary, the intensity for each policy i remains constant during each period l, contingent on the realization of hidden parameter
$C_l$
, and is determined by the regression function based on policyholder’s risk attributes. The hidden states represent environmental variation (e.g. weather or seasonal effect), affecting all policies in the portfolio, with policies assumed independent given the environmental variation. It is important to note that the state of the lth period is identical for all policies i. Consequently, we have that, for each policy i
$(i=1,\dots,m)$
,
\begin{align*}P\big(\Lambda_l(t;\,\boldsymbol{x}_i)=\lambda^{(j)}(\boldsymbol{x}_i)\big)=\pi_{lj}, \qquad j=1,\dots,g,\end{align*}
where
$\pi_{lj} = P(C_l=j)$
is the jth element of the row vector
$\boldsymbol{\pi}_{l}=\boldsymbol{\pi}_{1}\Gamma^{l-1}$
.
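To illustrate the data-generating mechanism, the following R sketch simulates the shared hidden state path and the resulting policy-level claim counts, assuming a log-linear regression function $\lambda^{(j)}(\boldsymbol{x}_i)=\exp(\boldsymbol{x}_i^{\top}\boldsymbol{\theta}_j)$ with two states; all parameter values are illustrative.

```r
set.seed(1)
m <- 5; T_per <- 12; g <- 2                          # policies, periods, hidden states
Gamma <- matrix(c(0.9, 0.1,
                  0.2, 0.8), nrow = g, byrow = TRUE) # transition probability matrix
pi1   <- c(0.5, 0.5)                                 # initial state distribution
X     <- cbind(1, rnorm(m))                          # covariates: intercept + one feature
theta <- list(c(-2.0, 0.3), c(-1.0, 0.3))            # state-specific regression coefficients
lambda <- sapply(theta, function(th) exp(X %*% th))  # m x g matrix of lambda^{(j)}(x_i)
expo  <- matrix(1, m, T_per)                         # exposures e_{i,l}

# shared hidden environment path C_1, ..., C_T
C <- numeric(T_per)
C[1] <- sample(1:g, 1, prob = pi1)
for (l in 2:T_per) C[l] <- sample(1:g, 1, prob = Gamma[C[l - 1], ])

# given C_l, claim counts are independent Poisson across policies in each period
N <- matrix(rpois(m * T_per, lambda = expo * lambda[, C]), nrow = m)
```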
Remarks
• Our Cox model generalizes the framework of Badescu et al. (Reference Badescu, Lin and Tang2016) by modeling claim arrivals at the individual policyholder level, using covariates
$\boldsymbol{x}_i$ to capture heterogeneity across risk profiles. This allows the claim intensity to reflect both individual characteristics and broader environmental variation such as seasonality. Moreover, in contrast to their assumption of an Erlang-distributed intensity over
$[d_l, d_{l+1})$ , we model intensity as a covariate-dependent constant within each period. This choice simplifies estimation and leads to a Poisson mixture distribution, which is appropriate for overdispersed claim count data. While the Erlang assumption yields a Pascal mixture, it adds complexity without a clear improvement in model performance. Moreover, with appropriate segmentation of the timeline, the constant-intensity assumption remains a reasonable approximation.
• A special case of our framework arises when modeling claim arrivals across lines of business, with index i denoting each line. If the regression function
$\lambda^{(j)}(\!\cdot\!)$ is constant, the model reduces to the contemporaneously conditionally independent multivariate HMM of MacDonald and Zucchini (Reference MacDonald and Zucchini2016). This HMM structure would capture dependence between lines of business through shared hidden states – an important feature given the emphasis in the literature on modeling such dependencies in reserving. Alternatively, i can represent claim types within a single line of business, allowing the model to account for heterogeneity in mark distributions across types.
• While our model assumes predetermined time intervals
$[d_{l-1}, d_l)$ , it would be ideal to select these intervals in a data-driven manner to better capture changes in intensity over time. Nonetheless, using fixed intervals – such as weekly, monthly, or quarterly – offers a practical and sufficiently flexible solution in most applications.
3. Properties of the model
In this section, we outline the immediate properties of our model, focusing on its application at the policyholder level. At a given reserve valuation date
$\tau$
, the complete claim arrival process is not fully observed; only reported claims are observable. Denoting the reported claim process for the ith policy with respect to
$\tau$
as
$\{N_i^{{\text{r}}}(t);\, 0\le t\le \tau\}$
, it is defined as
$N_i^{{\text{r}}}(t) = \sum_{k=1}^{\infty}\unicode{x1D7D9}\{T_{ik} \lt t;\, T_{ik}+U_{ik}\le \tau\}$
, where
$T_{ik}$
denotes the claim occurrence time and
$U_{ik}$
denotes the reporting delay for the kth claim of the ith policyholder. Claims not reported by
$\tau$
(i.e.,
$T_{ik}+U_{ik}\gt\tau$
) contribute to the IBNR claim process, denoted as
$\{N_i^{\text{ibnr}}(t);\, 0 \le t \le \tau\}$
, and defined as
$N_i^{\text{ibnr}}(t) = \sum_{k=1}^{\infty}\unicode{x1D7D9}\{T_{ik} \lt t;\, T_{ik}+U_{ik} \gt \tau\}$
. Although both processes should be indexed by
$\tau$
, we omit it for simplicity. Note that
$N_i(t)=N_i^\textrm{r}(t)+N_i^\textrm{ibnr}(t)$
.
It is notable that both the reported claim process and the IBNR claim process remain marked Cox processes, with stochastic intensity functions and mark densities that are easily obtained from those of the original process. We revisit a proposition proven in Theorem 3.1 of Badescu et al. (Reference Badescu, Lin and Tang2016) to demonstrate this.
Proposition 1. Assume that the claim arrival process for the ith policyholder,
$\{N_i(t);\,t\ge0\}$
, is a marked Cox process with stochastic intensity function
$\Lambda_i(t)$
and independent marks
$\{U_{i1},$
$U_{i2},\dots\}$
with common density function
$f_{U\mid t,\boldsymbol{x}_i}(u)$
. Then for a given valuation date
$\tau$
, its associated reported claim process
$\{N_i^{{\text{r}}}(t);\, 0 \le t \le \tau\}$
and IBNR claim process
$\{N_i^{\text{ibnr}}(t);\, 0 \le t \le \tau\}$
are also marked Cox processes. Their adjusted stochastic intensity functions are
\begin{align*}\Lambda_i^{\text{r}}(t) = \Lambda_i(t)\,F_{U\mid t,\boldsymbol{x}_i}(\tau-t) \qquad \text{and} \qquad \Lambda_i^{\text{ibnr}}(t) = \Lambda_i(t)\,\big(1-F_{U\mid t,\boldsymbol{x}_i}(\tau-t)\big),\end{align*}
respectively, and their independent marks follow adjusted position-dependent mark distributions
\begin{align*}f^{\text{r}}_{U\mid t,\boldsymbol{x}_i}(u) = \frac{f_{U\mid t,\boldsymbol{x}_i}(u)}{F_{U\mid t,\boldsymbol{x}_i}(\tau-t)}\,\unicode{x1D7D9}\{0\le u\le \tau-t\} \qquad \text{and} \qquad f^{\text{ibnr}}_{U\mid t,\boldsymbol{x}_i}(u) = \frac{f_{U\mid t,\boldsymbol{x}_i}(u)}{1-F_{U\mid t,\boldsymbol{x}_i}(\tau-t)}\,\unicode{x1D7D9}\{u \gt \tau-t\},\end{align*}
respectively, where
$F_{U\mid t,\boldsymbol{x}_i}$
is the distribution function for the reporting delay U given a claim occurrence time t and risk attributes
$\boldsymbol{x}_i$
.
Recall that we model the stochastic intensity function
$\Lambda_i(t)$
of the claim arrival process
$\{N_i(t;\,\boldsymbol{x}_i);\,t\ge \,0\}$
as a piecewise stochastic process, where
$\Lambda_i(t) = e_{i,l}\Lambda_l(t;\,\boldsymbol{x}_i)$
for
$d_{l-1} \le t \lt d_l$
, with
$d_0=0$
, and the time points
$d_l$
$(l=1,2,\dots)$
are predetermined. These predetermined time points constitute the time units which make up the time series on which the HMM is defined. We define
$N_{i,l} \,:\!=\, N_{i}(d_l) - N_i(d_{l-1})$
as the number of claims occurring during
$[d_{l-1},d_l)$
for the ith policyholder (
$i=1,\dots,m$
). Hence,
$\{N_{i,1},N_{i,2},\dots\}$
represents the discrete observations of the claim arrival process at these time points
$d_l$
(
$l=1,2,\dots$
). The corresponding discrete observations of the reported claim process and the IBNR claim process with respect to a valuation date
$\tau$
are denoted by
$\{N_{i,1}^{\text{r}},N_{i,2}^{\text{r}},\dots\}$
and
$\{N_{i,1}^{\text{ibnr}},N_{i,2}^{\text{ibnr}},\dots\}$
, respectively. Notably, only
$\{N_{i,1}^{\text{r}},N_{i,2}^{\text{r}},\dots\}$
is observed at time
$\tau$
. It is straightforward to see that these three discretely observed processes follow Poisson-HMMs.
Proposition 2. For the proposed Cox model for
$N_i(t;\,\boldsymbol{x}_i)$
, the discretely observed claim arrival process,
$\{N_{i,1},N_{i,2},\dots \}$
, the discretely observed reported claim process,
$\{N_{i,1}^{\text{r}},\dots,N_{i,k}^{\text{r}} \}$
, and the discretely observed IBNR claim process,
$\{N_{i,1}^{\text{ibnr}},\dots,N_{i,k}^{\text{ibnr}}\}$
, all fall under the class of Poisson-HMMs, sharing the same hidden parameter process
$\{C_1,C_2,\dots\}$
with
$N_i(t)$
. That is, for each
$n \in \mathbb{N}_0$
,
\begin{align*}P\big(N_{i,l}^{*}=n \mid C_l=j\big) = \frac{\big(e_{i,l}\,\theta_l^{(j)}\big)^{n}\, e^{-e_{i,l}\,\theta_l^{(j)}}}{n!},\end{align*}
where
$N_{i,l}^*$
and
$\theta_l^{(j)}$
denote any of the pairs
$\{N_{i,l}, \tilde{\lambda}_l^{(j)}\},\{N_{i,l}^{\mathrm{r}}, \mu_l^{(j)}\},\{N_{i,l}^{\mathrm{ibnr}}, \nu_l^{(j)}\},$
with
\begin{align*}\tilde{\lambda}_l^{(j)} = \lambda^{(j)}(\boldsymbol{x}_i)\,(d_l-d_{l-1}), \qquad \mu_l^{(j)} = \lambda^{(j)}(\boldsymbol{x}_i)\int_{d_{l-1}}^{d_l} F_{U\mid t,\boldsymbol{x}_i}(\tau-t)\,dt, \qquad \nu_l^{(j)} = \lambda^{(j)}(\boldsymbol{x}_i)\int_{d_{l-1}}^{d_l} \big(1-F_{U\mid t,\boldsymbol{x}_i}(\tau-t)\big)\,dt.\end{align*}
Consequently, for each
$l=1,2,\dots$
, the marginals
$N_{i,l}$
,
$N_{i,l}^{\mathrm{r}}$
, and
$N_{i,l}^{\mathrm{ibnr}}$
follow mixed Poisson distributions with probability functions
\begin{align*}P\big(N_{i,l}^{*}=n\big) = \sum_{j=1}^{g}\pi_{lj}\, p\big(n;\, e_{i,l}\,\theta_l^{(j)}\big), \qquad n\in\mathbb{N}_0,\end{align*}
where
$N_{i,l}^*$
and
$\theta_l^{(j)}$
denote any of the pairs above, and
$p(n;\, \lambda) = \frac{\lambda^n e^{-\lambda}}{n!}.$
The above proposition follows directly from our assumption that the intensity functions are constant given a realization of the hidden state. When
$C_l = j$
, the intensity functions for the claim arrival process, reported claim process, and IBNR claim process become constant over the period
$[d_{l-1},d_l)$
, resulting in Poisson processes for the claim arrival process and time-varying Poisson processes for the reported and IBNR claim processes. It is also evident that the three processes share the same hidden parameter process. This observation leads directly to the conclusion that all three observed processes are Poisson-HMMs.
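For illustration, the marginal probabilities in Proposition 2 can be evaluated as a finite Poisson mixture. The following R sketch does this; all numerical values (initial distribution, transition matrix, state-dependent intensities) are illustrative assumptions.

```r
# marginal pmf of N_{i,l}: a finite mixture of Poissons over the hidden states,
# P(N_{i,l} = n) = sum_j pi_{lj} * dpois(n, e_{i,l} * theta_l^{(j)})
mixed_pois_pmf <- function(n, pi_l, theta_l, e_il = 1) {
  sum(pi_l * dpois(n, lambda = e_il * theta_l))
}

# state probabilities at period l: pi_l = pi_1 %*% Gamma^(l-1)
state_probs <- function(pi1, Gamma, l) {
  out <- pi1
  if (l > 1) for (k in 2:l) out <- out %*% Gamma
  as.numeric(out)
}

# example: P(N_{i,3} = 0, 1, 2) with illustrative state-dependent intensities
pi_3 <- state_probs(c(0.5, 0.5), matrix(c(0.9, 0.1, 0.2, 0.8), 2, byrow = TRUE), 3)
sapply(0:2, mixed_pois_pmf, pi_l = pi_3, theta_l = c(0.05, 0.20))
```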
While Poisson-HMMs have traditionally been employed to model claim count processes for entire portfolios, our application extends this methodology to the individual level (or other granular unit). Properties similar to those in Badescu et al. (Reference Badescu, Lin and Tang2016) can be easily derived, but we omit them for brevity.
4. Likelihood of the observed data
In this section, we propose the discrete-time version of our model, both with and without the Dirichlet assumption for the reporting probability vector. We begin by outlining the likelihood of the observed data in the continuous-time framework and briefly discussing the associated complexities, particularly the difficulty of maximizing the likelihood of the observed data. These complexities motivate our discrete-time alternative.
4.1. Continuous-time model
To estimate the IBNR reserve, we seek to maximize the likelihood of the observed data. Let
$n_{i,l}^{\text{r}}$
denote the number of claims incurred by policyholder (or any other granular unit) i during period $[d_{l-1},d_l)$ and reported by
$d_T = \tau$
, where T represents the total number of observed periods. These claims, ordered by their arrival time, are described through the pairs
$\{t_{ijl},u_{ijl}\,:\,i=1,\dots,m\,;\,j=1,\dots,n_{i,l}^{\text{r}}\}$
, where
$t_{ijl}$
is the claim arrival time of the j-th ordered reported claim for policyholder i in period
$[d_{l-1},d_l)$
, and
$u_{ijl}$
is its reporting delay. We consider the set of observed data given by

Note that since the Poisson intensity is assumed constant within each interval given the realization of the hidden state, the exact claims occurrence times do not influence the inference for the model parameters and are, therefore, omitted from the analysis. Let
$\boldsymbol{N}_{i}^{(\text{r},s:t)} = (N_{i,s}^{\text{r}},\dots,N_{i,t}^{\text{r}})$
be the discretely observed reported claim process from period s to period t for the policyholder i, and let
$\boldsymbol{n}_{i}^{(\text{r},s:t)}=(n_{i,s}^{\text{r}},\dots,n_{i,t}^{\text{r}})$
be its realization. It is then easy to show that the likelihood of the observed data is equal to (see, e.g., Antonio and Plat, Reference Antonio and Plat2014 and Fung et al., Reference Fung, Badescu and Lin2021):

where
$\boldsymbol{\Phi}_1$
represents the vector of the model’s parameters, including: the initial distribution
$\boldsymbol{\pi}_1$
and the transition matrix
$\boldsymbol{\Gamma}$
of the HMM, the regression coefficients
$\boldsymbol{\theta}_j \in \mathbb{R}^p $
for the
$\lambda^{(j)}$
$(j=1,\dots,g)$
, where p is the number of regression parameters in
$ \lambda^{(j)} $
, and the parameters associated with modeling the reporting delay.
The “Claim Frequency” component captures the probability of observing the reported claim counts across all policyholders and time periods. It characterizes the underlying claim frequency patterns in the data. On the other hand, the “Reporting Delay” component accounts for the right-truncated reporting delay for claims that are reported before the valuation date
$\tau$
. It is modeled as a regression that depends on the characteristics of the policy and the time of occurrence.
As discussed in the introduction, maximizing the above likelihood is challenging. Proposition 2 establishes that the distribution of the discretely observed reported claim process depends on the reporting delay distribution, making the maximization of
$\mathcal{L}^{(1)}$
particularly complex. A common approach in the micro-level reserving literature is to perform a two-step maximization: first fitting the “Reporting Delay” component, then using the fitted parameters to estimate the “Claim Frequency” component. This approach has been employed in all studies we are aware of that adopt a similar framework (see, e.g., Antonio and Plat, Reference Antonio and Plat2014; Fung et al., Reference Fung, Badescu and Lin2021, etc.), particularly those following the framework of Norberg (Reference Norberg1993), with the exception of Wahl (Reference Wahl2019), where the likelihood is maximized jointly under the assumption that claim arrivals follow a non-homogeneous Poisson process with piecewise constant intensity and the delay distribution belongs to the exponential family. However, this iterative two-step strategy may lead to suboptimal parameter estimates, as it fails to account for the interdependence between the two components of the model.
4.2. Discrete-time models
4.2.1. The multinomial model
With the aim of jointly modeling frequency and delay, we move from the continuous-time framework to a discrete-time framework by aggregating the events to a coarser time scale. To this end, we assume that claims can be reported with a maximum delay of D periods. This means that if a claim occurs in period t, it will be reported in periods
$t+d$
(
$d=0,1,\dots,D$
). Note that a period t corresponds to the time interval
$[d_{t-1},d_t)$
. We also, for simplicity, assume that the predetermined time points
$d_t$
$(t=0,\dots,T)$
are equidistant, so that
$d_0=0$
,
$d_{t+1}-d_{t}=1$
and T represents the last observed period before the valuation date
$\tau$
. That is, we assume that we consider periods of the same length, where each period represents one unit of time.
We denote the probability that a policyholder’s claim with risk attributes
$\boldsymbol{x}_i$
that occurred in period t is reported in period
$t+d$
by
$p_{t}(d;\,\boldsymbol{x}_i)$
. It is easy to show that the discretely observed claim arrival process,
$\{N_{i,1},N_{i,2},\dots,N_{i,T} \}$
, the discretely observed reported claim process,
$\{N_{i,1}^{\text{r}},\dots,N_{i,T}^{\text{r}} \}$
, and the discretely observed IBNR claim process,
$\{N_{i,1}^{\text{ibnr}},\dots,N_{i,T}^{\text{ibnr}}\}$
, fall under the class of Poisson-HMMs, with the same hidden parameter process
$\{C_1,C_2,\dots\}$
as
$N(t;\,\boldsymbol{x}_i)$
, and with state-dependent Poisson intensities given by
\begin{align*}e_{i,t}\,\lambda^{(j)}(\boldsymbol{x}_i), \qquad e_{i,t}\,\lambda^{(j)}(\boldsymbol{x}_i)\,p_{i,t}^{\text{r}}, \qquad \text{and} \qquad e_{i,t}\,\lambda^{(j)}(\boldsymbol{x}_i)\,\big(1-p_{i,t}^{\text{r}}\big),\end{align*}
respectively, where
$p_{i,t}^{\text{r}} = \sum_{d=0}^{\min\{D,T-t\}}p_{t}(d;\,\boldsymbol{x}_i)$
is the probability that a claim from policyholder i that occurred in period t is reported before the reserve valuation date.
Let
$Z_{i,t,d}=\sum_{k=1}^\infty \unicode{x1D7D9}\left\{T_{ik} \in [t-1,t),\, T_{ik}+U_{ik} \in [t+d-1, t+d)\right\}$
denote the number of claims from policyholder i that occurred in period t and are reported in period
$t+d$
. Note that
$N_{i,t}^{\text{r}} = \sum_{d=0}^{\min\{D,T-t\}}Z_{i,t,d}$
. Our observed data are now given by
\begin{align*}\mathcal{O}_D=\big\{n_{i,t}^{\text{r}},\,z_{i,t,d}\,:\,i=1,\dots,m;\ t=1,\dots,T;\ d=0,\dots,\min\{D,T-t\}\big\},\end{align*}
where
$n_{i,t}^{\text{r}}$
and
$z_{i,t,d}$
are the realizations of
$N_{i,t}^{\text{r}}$
and
$Z_{i,t,d}$
, respectively.
Similar to Verbelen et al. (Reference Verbelen, Antonio, Claeskens and Crevecoeur2022), we make the assumption that
\begin{align*}\big(Z_{i,t,0},Z_{i,t,1},\dots,Z_{i,t,D}\big) \mid N_{i,t} \sim \text{Multinomial}\big(N_{i,t},\, \boldsymbol{p}_t(\boldsymbol{x}_i)\big).\end{align*}
Note that for
$t \le T-D$
, all claims that occurred in period t are fully observed, that is,
$N_{i,t}^{\text{r}} = N_{i,t}$
, and the counts
$(Z_{i,t,0},\dots,Z_{i,t,D})\mid N_{i,t}^{\text{r}}\sim \text{Multinomial}(N_{i,t}^{\text{r}},\boldsymbol{p}_t(\boldsymbol{x}_i)),$
where
$\boldsymbol{p}_t(\boldsymbol{x}_i)=(p_t(0;\,\boldsymbol{x}_i),\dots,p_t(D;\,\boldsymbol{x}_i))$
.
The likelihood of the observed data is then given by

where
$\boldsymbol{\Phi}_2$
represents the vector of the model’s parameters, including the initial distribution
$\boldsymbol{\pi}_1$
and the transition matrix
$\boldsymbol{\Gamma}$
of the HMM, the regression coefficients
$\boldsymbol{\theta}_j$
for the
$\lambda^{(j)}$
’s, and the parameters related to the modeling of
$\boldsymbol{p}_t(\boldsymbol{x}_i)$
, denoted by
$\boldsymbol{\delta}$
. An equivalent way to write the multinomial component of the likelihood above is to decompose it into a series of conditional binomial likelihoods. That is, we can write
$\mathcal{L}^{(2)}$
as:

where
\begin{align*}q_t(d;\,\boldsymbol{x}_i) = \frac{p_t(d;\,\boldsymbol{x}_i)}{\sum_{j=0}^{d}p_t(j;\,\boldsymbol{x}_i)}\end{align*}
represents the conditional probability that a claim incurred at period t by policyholder with risk attributes
$\boldsymbol{x}_i$
is reported in period
$t+d$
, given that the claim is reported by period
$t+d$
. Thus, we break down the multinomial likelihood by considering the binomial likelihood of observing each count given the cumulative counts and adjusting the success probabilities accordingly. Note that
$p_t(d;\,\boldsymbol{x}_i)$
can be obtained iteratively from
$q_t(d;\,\boldsymbol{x}_i)$
by:
\begin{align*}p_t(d;\,\boldsymbol{x}_i) = q_t(d;\,\boldsymbol{x}_i)\prod_{k=d+1}^{D}\big(1-q_t(k;\,\boldsymbol{x}_i)\big), \qquad d=0,1,\dots,D,\end{align*}
with $q_t(0;\,\boldsymbol{x}_i)=1$ and the empty product equal to one.
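As a small numerical check of this relation, the following R sketch recovers the multinomial delay probabilities from an illustrative vector of conditional probabilities:

```r
# recover the multinomial delay probabilities p_t(d) from the conditional binomial
# probabilities q_t(d), with q_t(0) = 1
q_to_p <- function(q) {                       # q[d + 1] corresponds to delay d
  D <- length(q) - 1
  p <- numeric(D + 1)
  for (d in 0:D) {
    surv <- if (d < D) prod(1 - q[(d + 2):(D + 1)]) else 1
    p[d + 1] <- q[d + 1] * surv
  }
  p
}

q <- c(1, 0.4, 0.3, 0.5)                      # illustrative conditional probabilities
p <- q_to_p(q); sum(p)                        # the recovered probabilities sum to 1
```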
Remark
Note that the t used in the discrete-time setting differs from the t in
$ N_i(t) = \sum_{k=1}^\infty \unicode{x1D7D9}\{T_{ik} \lt t\} $
. In the continuous-time setting, t represents calendar time and
$ N_i(t) $
counts the number of claims of the policyholder i that occurred before t. In the discrete-time setting, t represents calendar time discretized into periods (e.g., months), where
$ t = 1 $
corresponds to the first period,
$ t = 2 $
to the second, and so on. For simplicity, we use the same notation t in both settings, but its meaning is clear from the context.
4.2.2. The Dirichlet-multinomial model
The probabilities of reporting delay often exhibit temporal variability, which can be characterized as either explainable or unexplainable. Explainable variations in reporting delay are systematic variations that arise from differences in the period of occurrence of claims and/or the risk attributes associated with policyholders. For example, when working with daily data, claims occurring on weekends may exhibit different reporting patterns compared to those occurring on weekdays. Similarly, the month in which a claim occurs can also influence reporting delay probabilities. In contrast, unexplainable variability refers to unsystematic fluctuations in reporting delay that are not captured by our covariates. In cases where data exhibit high random variability in reporting delay, both continuous-time and multinomial models may produce interval estimates for IBNR claim counts that are significantly off from the actual IBNR claim count, as we will observe in our data analysis. Although an exact cause for the high fluctuations cannot be precisely identified without additional data collection, one possible explanation is the presence of unobserved covariates that have direct and interaction effects on the delay probabilities.
To this end, we add randomness to the reporting delay probabilities
$p_{t}(d;\,\boldsymbol{x}_i)$
’s by assuming that
$\boldsymbol{p}_t(\boldsymbol{x}_i) = (p_{t}(0;\,\boldsymbol{x}_i),\dots,p_{t}(D;\,\boldsymbol{x}_i))$
follows a Dirichlet distribution with parameters
$\boldsymbol{\eta}_{i,t} = (\eta_{t,0}(\boldsymbol{x}_i),\dots,\eta_{t,D}(\boldsymbol{x}_i))$
. This choice is motivated by the fact that the Dirichlet distribution serves as a conjugate prior for the multinomial distribution, which significantly simplifies the mathematical treatment and estimation of our model.
Parameters
$\boldsymbol{\eta}_{i,t} = (\eta_{t,0}(\boldsymbol{x}_i), \dots, \eta_{t,D}(\boldsymbol{x}_i))$
can be specified in various ways. One approach is to model
$\boldsymbol{\eta}_{i,t}$
using a regression with fixed effects, where the parameters are expressed as functions of observed covariates such as policyholder characteristics or temporal factors. Alternatively, the Dirichlet assumption can be incorporated as a random effect by assuming that, for each period t, the delay probability vector
$\boldsymbol{p}_t$
follows a Dirichlet distribution with global parameters
$\boldsymbol{\eta}_t$
. In this paper, we adopt the fixed-effects regression structure for
$\boldsymbol{\eta}_{i,t}$
.
We make use of the following proposition which can also be found in Lawless (Reference Lawless1994):
Proposition 3. If
$\boldsymbol{p}_t(\boldsymbol{x}_i) = (p_{t}(0;\,\boldsymbol{x}_i),\dots,p_{t}(D;\,\boldsymbol{x}_i))$
follows a Dirichlet distribution with parameters
$\boldsymbol{\eta}_{i,t} = (\eta_{t,0}(\boldsymbol{x}_i),\dots,\eta_{t,D}(\boldsymbol{x}_i))$
, then the conditional probabilities
$q_t(d;\,\boldsymbol{x}_i)= \frac{p_t(d;\,\boldsymbol{x}_i)}{\sum_{j=0}^{d}p_t(j;\,\boldsymbol{x}_i)}$
$(d=1,\dots,D)$
are independently distributed beta random variables with
\begin{align*}q_t(d;\,\boldsymbol{x}_i) \sim \text{Beta}\Big(\eta_{t,d}(\boldsymbol{x}_i),\ \sum_{j=0}^{d-1}\eta_{t,j}(\boldsymbol{x}_i)\Big), \qquad d=1,\dots,D.\end{align*}
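A quick simulation check of this aggregation property (using rdirichlet from the gtools package; the parameter values are illustrative):

```r
library(gtools)                                   # for rdirichlet
set.seed(7)
eta <- c(2, 1.5, 1, 0.5)                          # eta_{t,0}, ..., eta_{t,D} with D = 3
P <- rdirichlet(1e5, eta)                         # draws of (p_t(0), ..., p_t(D))

# q_t(d) = p_t(d) / sum_{j <= d} p_t(j) should be Beta(eta_{t,d}, sum_{j < d} eta_{t,j})
d <- 2
q_d <- P[, d + 1] / rowSums(P[, 1:(d + 1)])
c(mean(q_d), eta[d + 1] / sum(eta[1:(d + 1)]))    # empirical vs. theoretical mean
```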
Thus, the binomial likelihood components in
$\mathcal{L}^{(2)}$
would be replaced with beta-binomial likelihood components, and the likelihood for our model under the Dirichlet assumption becomes:

where B is the beta function, and
$\boldsymbol{\Phi}_3$
represents the vector of the model’s parameters, including: the initial distribution
$\boldsymbol{\pi}_1$
and the transition matrix
$\boldsymbol{\Gamma}$
of the HMM, the regression coefficients
$\boldsymbol{\theta}_j$
for the
$\lambda^{(j)}$
’s, and the parameters related to the modeling of the
$\boldsymbol{\eta}$
’s, denoted by
$\boldsymbol{\delta}$
.
Remark
With
$ g=1 $
(single state), we have that
$ N_{i,t}^{\text{r}} $
follows a Poisson distribution, so the model with the Dirichlet assumption can be seen as an extension of the model of Verbelen et al. (Reference Verbelen, Antonio, Claeskens and Crevecoeur2022). We provide the EM algorithm for this model in the next section.
5. Parameter estimation
In this section, we present the estimation methodology employed to fit the discrete-time models from the previous section. To achieve this, we utilize the EM algorithm, providing an iterative framework for updating the model parameters to maximize the likelihood of the observed reported claims while considering the unobserved hidden states of the HMM. The EM algorithm for fitting the continuous-time model using the two-step maximization can be found in Appendix A. Additionally, to assess the ability of our algorithm to recover model parameters, we conduct a simulation study based on the proposed models. The simulation details are provided in the Online Supplementary Material, and the corresponding code is available at: https://github.com/HassanAbdelrahman/HMM_Simulations.
The EM algorithm for the multinomial model extends the algorithm outlined in Verbelen et al. (Reference Verbelen, Antonio, Claeskens and Crevecoeur2022), where the reported claims process now follows a Poisson-HMM, rather than a Poisson process. On the other hand, the EM algorithm for the Dirichlet-multinomial model requires special attention. This is because the Dirichlet assumption introduces additional complexity in estimating the probability vector, making the E-step analytically intractable. Therefore, we will need to switch from the EM algorithm to the MCEM algorithm, which uses Monte Carlo simulations to approximate the E-step and handle the complexities introduced by the Dirichlet distribution.
5.1. The multinomial model
Recall that the likelihood of the observed data for the discrete-time multinomial model,
$\mathcal{L}^{(2)}$
, is given by

where
$\boldsymbol{N}_{i}^{(r,1:T)} = (N_{i,1}^r,\dots,N_{i,T}^r)$
denote the discretely observed reported claim process from period 1 to T for policy i, and
$\boldsymbol{n}_{i}^{(r,1:T)}=(n_{i,1}^r,\dots,n_{i,T}^r)$
is its realization. As we established in Proposition 2, our model dictates that the discretely observed reported claims process for policy i,
$\{N_{i,1}^{\text{r}},\dots,N_{i,T}^{\text{r}}\}$
, belongs to the class of Poisson-HMMs. Specifically, we demonstrated that:
\begin{align*}P\big(N_{i,t}^{\text{r}}=n \mid C_t=j\big) = p\big(n;\, e_{i,t}\,\lambda^{(j)}(\boldsymbol{x}_i)\,p_{i,t}^{\text{r}}\big), \qquad n\in\mathbb{N}_0.\end{align*}
The likelihood for the “Claim Frequency” component is thus given by
\begin{align*}P\big(\boldsymbol{N}_{i}^{(\text{r},1:T)}=\boldsymbol{n}_{i}^{(\text{r},1:T)},\ i=1,\dots,m\big) = \boldsymbol{\pi}_1\,\boldsymbol{P}_1(\boldsymbol{n}_1^{\text{r}})\,\boldsymbol{\Gamma}\,\boldsymbol{P}_2(\boldsymbol{n}_2^{\text{r}})\cdots\boldsymbol{\Gamma}\,\boldsymbol{P}_T(\boldsymbol{n}_T^{\text{r}})\,\mathbf{1}',\end{align*}
where
\begin{align*}\boldsymbol{P}_t(\boldsymbol{n}_t^{\text{r}}) = \text{diag}\big(\mathbb{P}_1(\boldsymbol{n}_t^{\text{r}}),\dots,\mathbb{P}_g(\boldsymbol{n}_t^{\text{r}})\big)\end{align*}
is a diagonal matrix whose diagonal entries correspond to the probabilities associated with each state, and
$\mathbf{1}$
is a row vector of ones with dimension g, where g is the number of states.
The Forward-Backward Algorithm
Before sketching the EM algorithm, we first define the forward and backward probabilities as follows:
\begin{align*}\alpha_t(j) = P\big(\boldsymbol{N}^{(\text{r},1:t)}=\boldsymbol{n}^{(\text{r},1:t)},\, C_t=j\big) \qquad \text{and} \qquad \beta_t(j) = P\big(\boldsymbol{N}^{(\text{r},t+1:T)}=\boldsymbol{n}^{(\text{r},t+1:T)}\mid C_t=j\big),\end{align*}
respectively. These probabilities are used to iteratively update the hidden state probabilities during the EM algorithm, and they have the following properties (see MacDonald and Zucchini, Reference MacDonald and Zucchini2016 for details):
\begin{align*}\alpha_t(j)\,\beta_t(j) = P\big(\boldsymbol{N}^{(\text{r},1:T)}=\boldsymbol{n}^{(\text{r},1:T)},\, C_t=j\big) \qquad \text{and} \qquad P\big(C_{t-1}=j,\,C_t=k \mid \boldsymbol{N}^{(\text{r},1:T)}=\boldsymbol{n}^{(\text{r},1:T)}\big) = \frac{\alpha_{t-1}(j)\,\gamma_{jk}\,\mathbb{P}_k(\boldsymbol{n}_t^{\text{r}})\,\beta_t(k)}{\sum_{j'=1}^{g}\alpha_T(j')},\end{align*}
where
$\mathbb{P}_k(\boldsymbol{n}_t^r)=\prod_{i=1}^m \boldsymbol{P}(N_{i,t}^r=n_{i,t}^r\mid C_t=k)$
.
They are computed using the following recursive relations:
\begin{align*}\boldsymbol{\alpha}_1 = \boldsymbol{\pi}_1\boldsymbol{P}_1(\boldsymbol{n}_1^{\text{r}}), \qquad \boldsymbol{\alpha}_t = \boldsymbol{\alpha}_{t-1}\boldsymbol{\Gamma}\boldsymbol{P}_t(\boldsymbol{n}_t^{\text{r}})\ \ (t=2,\dots,T), \qquad \boldsymbol{\beta}_T' = \mathbf{1}', \qquad \boldsymbol{\beta}_t' = \boldsymbol{\Gamma}\boldsymbol{P}_{t+1}(\boldsymbol{n}_{t+1}^{\text{r}})\boldsymbol{\beta}_{t+1}'\ \ (t=T-1,\dots,1),\end{align*}
where $\boldsymbol{\alpha}_t = (\alpha_t(1),\dots,\alpha_t(g))$ and $\boldsymbol{\beta}_t = (\beta_t(1),\dots,\beta_t(g))$.
We now proceed with the details of the EM algorithm.
Complete Data Likelihood
The EM algorithm iteratively estimates the parameters
$\boldsymbol{\Phi}_2$
, where the number of hidden states, g, is treated as preset. The algorithm consists of two main steps: the E-step and the M-step, which will be demonstrated in the following subsections. We first introduce the following notations:
•
$\boldsymbol{c}^{(T)}=(c_1,\dots,c_T)$ : the unobserved states of the HMM,
•
$u_{tj} \,:\!=\, \unicode{x1D7D9}\{C_t = j\}$ , and
•
$v_{tjk} \,:\!=\, \unicode{x1D7D9}\{C_{t-1}=j, C_{t}=k\}$ .
We let our complete data consist of all claims that occurred before our reserve valuation date (both reported and unreported), as well as the realizations of the states of the HMM. Thus, the complete data is given by:
\begin{align*}\mathcal{C}=\big\{n_{i,t},\,z_{i,t,d},\,c_t\,:\,i=1,\dots,m;\ t=1,\dots,T;\ d=0,\dots,D\big\}.\end{align*}
Note that the complete data assumes that we know the total number of claims that occurred in each period for each policyholder, as well as the periods in which these claims are reported. The likelihood of the complete data is then given by

where
$\mathbb{P}_{c_t}(n_{i,t}) = P(N_{i,t}=n_{i,t}\mid C_t = c_t)$
.
It is straightforward to see that we can write the log-likelihood of the complete data as

where
$l_{\text{C}}^{(2)}(\boldsymbol{\Phi}_2\mid \mathcal{C})\,:\!=\,\log(\mathcal{L}_{\text{C}}^{(2)}(\boldsymbol{\Phi}_2\mid \mathcal{C}))$
. We now outline the E-step and M-step of the EM algorithm that are applied to maximize the complete-data log-likelihood.
E-step
In the k-th E-step, we compute the conditional expectation of the complete-data log-likelihood given the observed data and the current estimator
$\boldsymbol{\Phi}_2^{(k-1)}$
of the model parameters
$\boldsymbol{\Phi}_2$
. Note that the unobserved components of our complete data consist of the hidden states,
$c^{(T)}$
, as well as the IBNR claims,
$\{n_{i,t}^{\text{ibnr}}, z_{i,t,d} \mid i=1,\dots,m;\, t=1,\dots,T;\, t+d \gt T\}$
. The expectation of these components given the observed data and
$\boldsymbol{\Phi}_2^{(k-1)}$
should thus be derived. The conditional expectation is given by

where the expressions for
$\hat{u}_{tj}^{(k)}$
and
$\hat{v}_{tjl}^{(k)}$
are given by
\begin{align*}\hat{u}_{tj}^{(k)} = P\big(C_t=j \mid \mathcal{O}_D,\, \boldsymbol{\Phi}_2^{(k-1)}\big) = \frac{\alpha_t(j)\,\beta_t(j)}{\sum_{j'=1}^{g}\alpha_T(j')}\tag{5.3}\end{align*}
and
\begin{align*}\hat{v}_{tjl}^{(k)} = P\big(C_{t-1}=j,\,C_t=l \mid \mathcal{O}_D,\, \boldsymbol{\Phi}_2^{(k-1)}\big) = \frac{\alpha_{t-1}(j)\,\gamma_{jl}\,\mathbb{P}_l(\boldsymbol{n}_t^{\text{r}})\,\beta_t(l)}{\sum_{j'=1}^{g}\alpha_T(j')}.\tag{5.4}\end{align*}
In these expressions, the forward probabilities, backward probabilities, and observed likelihood are obtained using the model parameters
$\boldsymbol{\Phi}_2 = \boldsymbol{\Phi}_2^{(k-1)}$
from the
$(k-1)$
-th iteration. The terms
$\hat{u}_{tj}^{(k)}$
and
$\hat{v}_{tjl}^{(k)}$
represent the updated probabilities of being in state j at time t and the transition probability from state j to state l at time t, respectively, based on the current model parameter estimates
$\boldsymbol{\Phi}_2^{(k-1)}$
and the observed data.
The expressions for
$\hat{n}_{i,t,j}^{(k)}$
and
$\hat{z}_{i,t,d}^{(k)}$
are given by

where
$p_t^{(k-1)}(d;\, \boldsymbol{x}_i)$
and
$\lambda_j^{(k-1)}(\boldsymbol{x}_i)$
are the estimates of
$p_t(d;\, \boldsymbol{x}_i)$
and
$\lambda^{(j)}(\boldsymbol{x}_i)$
, respectively, computed using the parameter estimates obtained from the M-step in the
$(k-1)$
-th iteration. The term
$\hat{n}_{i,t,j}^{(k)}$
represents the updated expectation of the total number of claims that occurred in period t for policyholder i, given that the state of the HMM is j, while the term
$\hat{z}_{i,t,d}^{(k)}$
represents the updated expectation of the portion of these claims that are reported in period
$t+d$
. Note that all claims that occurred at
$t \le T-D$
are reported and are part of the observed data
$\mathcal{O}_D$
, and thus, the expected values of
$N_{i,t}$
and
$Z_{i,t,d}$
(
$t \le T-D;\, d=0, \dots, D$
) given the observed data are equal to
$n_{i,t}$
and
$z_{i,t,d}$
, respectively.
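The following R sketch illustrates how such expectations can be computed for a tail period; it relies on the independent-thinning property of the Poisson distribution rather than reproducing the exact expressions above, and the function and argument names are illustrative.

```r
# E-step expectations for a tail period t > T - D, given hidden state j
# (a sketch based on the independent-thinning property of the Poisson distribution)
e_step_counts <- function(n_rep, e_it, lambda_j, p_delay, T_per, t) {
  obs_d <- 0:(T_per - t)                            # delays observable by tau
  p_rep <- sum(p_delay[obs_d + 1])                  # reporting probability p^r_{i,t}
  n_hat <- n_rep + e_it * lambda_j * (1 - p_rep)    # E[N_{i,t} | data, C_t = j]
  z_hat <- e_it * lambda_j * p_delay                # expected counts per delay d
  z_hat[obs_d + 1] <- NA                            # delays observed by tau: use the data
  list(n_hat = n_hat, z_hat_unobserved = z_hat)
}

# illustrative call: one reported claim in period t = 11 of T = 12, maximum delay D = 3
e_step_counts(n_rep = 1, e_it = 1, lambda_j = 0.2,
              p_delay = c(0.6, 0.25, 0.1, 0.05), T_per = 12, t = 11)
```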
By maximizing this expected complete-data log-likelihood, we can iteratively update the model parameters in the M-step of the algorithm, leading to improved parameter estimates that better fit the observed data.
M-step
The M-step in the EM algorithm involves maximizing the expected complete-data log-likelihood (Equation (5.2)) with respect to the model parameters
$\boldsymbol{\Phi}_2$
, subject to the constraints
\begin{align*}\sum_{j=1}^{g}\pi_{1j}=1 \qquad \text{and} \qquad \sum_{l=1}^{g}\gamma_{jl}=1, \quad j=1,\dots,g.\end{align*}
For this step, we can obtain the updated estimates for
$\boldsymbol{\pi}_1$
and
$\boldsymbol{\Gamma}$
separately by maximizing the first two terms of Equation (5.2), respectively. The optimal values for
$\pi_{1j}^{(k)}$
and
$\gamma_{jl}^{(k)}$
at the kth iteration are given by
\begin{align*}\pi_{1j}^{(k)} = \hat{u}_{1j}^{(k)}, \qquad j=1,\dots,g,\tag{5.5}\end{align*}
\begin{align*}\gamma_{jl}^{(k)} = \frac{\sum_{t=2}^{T}\hat{v}_{tjl}^{(k)}}{\sum_{l'=1}^{g}\sum_{t=2}^{T}\hat{v}_{tjl'}^{(k)}}, \qquad j,l=1,\dots,g.\tag{5.6}\end{align*}
As for the third term:

it is easy to see that for each j, this is a weighted Poisson log-likelihood, with the exception that
$\hat{n}_{i,t,j}^{(k)}$
can be non-integer. Yet, we can still obtain the updated regression parameters for each
$\lambda^{(j)}$
using quasi-likelihood methods available through standard packages in R. Finally, for the term:

it is easy to see that for each d, this is a binomial log-likelihood, again with the exception that
$\hat{z}_{i,t,d}^{(k)}$
can be non-integer. For each d, we model
$q_t(d;\,\boldsymbol{x}_i)$
as a regression function of the policyholder risk attributes,
$\boldsymbol{x}_i$
, and time-dependent covariates (e.g. the month of the period t). The binomial log-likelihood can then be optimized using quasi-likelihood methods available through standard packages in R.
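For concreteness, a minimal sketch of these two M-step regressions in R follows; the data frames df_freq and df_delay, their column names, and the log-exposure offset are illustrative assumptions rather than the exact specification used in our analysis.

```r
# M-step regressions with (possibly non-integer) E-step working responses
# weighted Poisson log-likelihood for lambda^{(j)}: response n_hat, weights u_hat
fit_lambda_j <- glm(n_hat ~ age + region + offset(log(exposure)),
                    family = quasipoisson(link = "log"),
                    weights = u_hat, data = df_freq)

# binomial log-likelihood for q_t(d; x): z_hat_d "successes" out of cum_hat_d "trials"
fit_q_d <- glm(cbind(z_hat_d, cum_hat_d - z_hat_d) ~ age + region + month,
               family = quasibinomial(link = "logit"), data = df_delay)
```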
The model parameters are then given by
$\boldsymbol{\Phi}_2 = \{\boldsymbol{\pi}_1,\boldsymbol{\Gamma},\boldsymbol{\theta}_1,\dots,\boldsymbol{\theta}_g,\boldsymbol{\delta}_1,\dots,\boldsymbol{\delta}_D\}$
, where
$\boldsymbol{\theta}_j$
is the vector of regression parameters for
$\lambda^{(j)}(\boldsymbol{x})$
, and
$\boldsymbol{\delta}_d$
is the vector of regression parameters for
$q_t(d;\,\boldsymbol{x}).$
Convergence
We continue iterating the E-step and M-step until the relative distance between the estimates from consecutive iterations falls below a predetermined threshold. The relative distance is defined as

The EM algorithm converges when the relative distance is below the threshold.
Initialization of Model Parameters
To effectively apply the EM algorithm, appropriate initialization of model parameters is crucial. Note that we have complete data (all claims are reported) for periods
$t \le T - D$
, so we can fit the model to these data and use the resulting parameter estimates as initial values.
Model Selection
Finally, we need to decide on the number of states g. Typically, a larger g would provide a better fit, but it could also lead to overfitting. We usually select the model with the lowest chosen information criterion (e.g., AIC or BIC). For fast and efficient computation, we can employ a backward fitting strategy similar to Badescu et al. (Reference Badescu, Chen, Lin and Tang2019). We start by fitting the model with a large g, and then iteratively delete a state until the chosen information criterion stops decreasing. In this strategy, we use the parameter estimates obtained from fitting with g states to initialize the algorithm for
$ g-1 $
states, with the initial values for
$ \boldsymbol{\pi}_1 $
and
$ \boldsymbol{\Gamma} $
being normalized estimates excluding the deleted state. This approach should make the estimation process very efficient.
Under-/Overflow Problems
As noted by MacDonald and Zucchini (Reference MacDonald and Zucchini2016), the computation of forward and backward probabilities, along with the subsequent calculations in the E- and M- steps, is susceptible to under- or overflow issues. MacDonald and Zucchini (Reference MacDonald and Zucchini2016) recommend scaling the forward and backward probability computations by working with the logarithms of these quantities and provide code solutions for this purpose. However, for our model, additional considerations are needed.
The computation of forward and backward probabilities involves matrix multiplication of the matrices
$\boldsymbol{P}_t(\boldsymbol{n}_t^r)$
. Unlike standard HMMs fitted to a single time series, where these diagonal matrices contain the probability of the count process attaining its observed value given each state, in our model each diagonal entry is a product, over all policyholders, of the probabilities of the observed counts. For our model, this matrix is defined as
\begin{align*}\boldsymbol{P}_t(\boldsymbol{n}_t^{\text{r}}) = \text{diag}\Big(\prod_{i=1}^{m}P\big(N_{i,t}^{\text{r}}=n_{i,t}^{\text{r}}\mid C_t=1\big),\,\dots,\,\prod_{i=1}^{m}P\big(N_{i,t}^{\text{r}}=n_{i,t}^{\text{r}}\mid C_t=g\big)\Big).\end{align*}
It becomes evident that the elements of the diagonal matrix are extremely small, potentially causing underflow problems in computing forward and backward probabilities, as well as subsequent calculations, even after scaling. To address this issue, we implement the LogSumExp algorithm (e.g., see Blanchard et al., Reference Blanchard, Higham and Higham2021), which is effective in mitigating numerical instability arising from the small probabilities.
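As an illustration of this approach, the following R sketch carries out the forward recursion entirely in log space with a log-sum-exp helper; the emission matrix log_emit, holding $\log \mathbb{P}_j(\boldsymbol{n}_t^{\text{r}})$ for each period and state, is assumed to be precomputed.

```r
logsumexp <- function(x) { m <- max(x); m + log(sum(exp(x - m))) }

# forward recursion in log space; log_emit[t, j] is assumed to hold
# log P_j(n_t^r) = sum_i log dpois(n_{i,t}^r, e_{i,t} * lambda^{(j)}(x_i) * p^r_{i,t})
forward_log <- function(log_pi1, log_Gamma, log_emit) {
  T_per <- nrow(log_emit); g <- ncol(log_emit)
  la <- matrix(-Inf, T_per, g)                       # la[t, j] = log alpha_t(j)
  la[1, ] <- log_pi1 + log_emit[1, ]
  for (t in 2:T_per)
    for (j in 1:g)
      la[t, j] <- logsumexp(la[t - 1, ] + log_Gamma[, j]) + log_emit[t, j]
  la                                                 # log-likelihood = logsumexp(la[T_per, ])
}
```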
5.2. The Dirichlet-multinomial model
Complete Data Likelihood
We now aim to find the parameter estimates that maximize
$\mathcal{L}^{(3)}$
using the EM algorithm. As we illustrated, the Dirichlet-multinomial model differs from the multinomial model by the assumption that the probability vector
$\boldsymbol{p}_{t}(\boldsymbol{x}_i)$
is random. To this end, we let our complete data be given by
\begin{align*}\tilde{\mathcal{C}}=\big\{n_{i,t},\,z_{i,t,d},\,p_{i,t,d},\,c_t\,:\,i=1,\dots,m;\ t=1,\dots,T;\ d=0,\dots,D\big\},\end{align*}
where
$p_{i,t,d}$
is the realization of
$p_t(d;\, \boldsymbol{x}_i)$
. That is, we now assume we know the actual reporting probability vector
$\boldsymbol{p}_t(\boldsymbol{x}_i)$
. The likelihood of the complete data is then given by:

where
$\mathbb{P}_{c_t}(n_{i,t}) = P(N_{i,t}=n_{i,t}\mid C_t = c_t)$
. Note that the “Multinomial Component” in the complete data likelihood is constant since it does not depend on the model’s parameters. The log-likelihood of the complete data is then given by

where
$l_{\text{C}}^{(3)}(\boldsymbol{\Phi}_3\mid \tilde{\mathcal{C}})\,:\!=\,\log(\mathcal{L}_{\text{C}}^{(3)}(\boldsymbol{\Phi}_3\mid \tilde{\mathcal{C}}))$
.
E-Step
In the k-th iteration of the E-step, we take the expectation of
$l_{\text{C}}^{(3)}(\boldsymbol{\Phi}_3\mid \tilde{\mathcal{C}})$
conditional on the observed data $\mathcal{O}_D$ and evaluated at the current parameter estimates $\boldsymbol{\Phi}_3^{(k-1)}$. We have:

where
$\hat{u}_{tj}^{(k)}$
and
$\hat{v}_{tjl}^{(k)}$
are given by Equations (5.3) and (5.4), respectively, and

where
$\lambda_j^{(k-1)}(\boldsymbol{x}_i)$
is the estimate of
$\lambda^{(j)}(\boldsymbol{x}_i)$
computed using the parameter estimates obtained from the M-step in the
$(k-1)$
-th iteration. We do not have closed-form analytical expressions for
$\hat{n}_{i,t,j}^{(k)}$
(
$t \gt T-D$
) and
$\widehat{\log p_{i,t,d}}^{(k)}$
, and thus, numerical methods are needed to compute these expectations. We can obtain Monte Carlo estimates of these expectations using samples from the conditional distribution of $\boldsymbol{p}_t(\boldsymbol{x}_i) = (p_t(0;\, \boldsymbol{x}_i), \dots, p_t(D;\, \boldsymbol{x}_i))$ given the observed data, evaluated at the current parameters $\boldsymbol{\Phi}_3^{(k-1)}$. This, however, adds an extra layer of difficulty: because the exact expectation is replaced with a numerical approximation, there is no guarantee that the likelihood will increase with each iteration, unlike in the standard EM algorithm.
Remark
The computation of the forward probabilities, backward probabilities, the likelihood, and consequently,
$\hat{u}_{tj}$
and
$\hat{v}_{tjk}$
in the E-step differs slightly for the two models considered. The differences arise when computing the probability
$ P(N_{i,t}^{\text{r}} = n_{i,t}^{\text{r}} \mid \mathcal{O}_D, \boldsymbol{\Phi}^{(k-1)}) $
. As previously discussed, this probability depends on the reporting delay. For the discrete-time Multinomial model, the estimation of the probability vector
$ \boldsymbol{p}_t(\boldsymbol{x}_i) $
changes after each iteration of the algorithm, and so we use
$ \boldsymbol{p}_t(\boldsymbol{x}_i) $
estimated from the $(k-1)$-th iteration in our computations in the k-th iteration. As for the Dirichlet-multinomial model,
$ \boldsymbol{p}_t(\boldsymbol{x}_i) $
is a Dirichlet random vector, and thus we compute

Distribution of
$\boldsymbol{p}_t \mid \mathcal{O}_D$
In the case where
$ t \le T - D $
, we have:
\begin{align*}\boldsymbol{p}_t(\boldsymbol{x}_i) \mid \mathcal{O}_D \sim \text{Dirichlet}\big(\eta_{t,0}(\boldsymbol{x}_i)+z_{i,t,0},\,\dots,\,\eta_{t,D}(\boldsymbol{x}_i)+z_{i,t,D}\big),\end{align*}
since the Dirichlet distribution is a conjugate prior of the multinomial distribution. Challenges arise in the case where
$ t \gt T - D $
, as we do not observe
$ z_{i,t,d} $
for
$ d \gt T - t $
, and thus we lose the nice conjugate property. Therefore, we need to derive the conditional distribution of
$\boldsymbol{p}_t$
for
$ t \gt T - D $
.
The probability that
$\boldsymbol{p}_t(\boldsymbol{x}_i)$
equals
$\boldsymbol{p}=(p_0,\dots,p_D)$
given the observed data is given by

where

Thus,

where
$p^{{\text{r}}}=\sum_{d=0}^{T-t}p_d$
. Note that the conditional distribution in this case is not a standard distribution, and thus, we cannot sample from it directly.
Rejection Sampling
Samples from the conditional distribution of
$\boldsymbol{p}_t(\boldsymbol{x}_i)$
can be drawn by multivariate rejection sampling. We can write
$P(\boldsymbol{p}_t(\boldsymbol{x}_i)=\boldsymbol{p}\mid \mathcal{O}_D)=a\,h(\boldsymbol{p})\,g(\boldsymbol{p}).$
It is straightforward to see that h is the density function of a Dirichlet distribution with parameters
$(z_{i,t,0}+\eta_{t,0}(\boldsymbol{x}_i),\dots,z_{i,t,T-t}+\eta_{t,T-t}(\boldsymbol{x}_i),\eta_{t,T-t+1}(\boldsymbol{x}_i),\dots,\eta_{t,D}(\boldsymbol{x}_i)),$
and thus, we can easily sample from it. We perform the rejection sampling as follows:
• Step 1: sample
$\boldsymbol{p}$ from h, and independently, sample u from a Uniform(0,1) distribution.
• Step 2: If
$u \le \frac{g(\boldsymbol{p})}{\sup_{\boldsymbol{p}}\{g(\boldsymbol{p})\}},$ then accept
$\boldsymbol{p}$ . If not, go to Step 1.
Note that
$\sup_{\boldsymbol{p}}\{g(\boldsymbol{p})\}\le \sum_{j=1}^g \pi_{tj}(\lambda^{(j)}(\boldsymbol{x}_i))^{n_{i,t}^{\text{r}}}$
. By obtaining a sample for
$\boldsymbol{p}_t(\boldsymbol{x}_i)$
given
$\mathcal{O}_D$
and
$\boldsymbol{\Phi}_3^{(k-1)},$
we can estimate
$\hat{n}_{i,t,j}^{(k)}$
and
$\widehat{\log p_{i,t,d}}^{(k)}$
using Monte Carlo methods.
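A minimal R sketch of the sampler described above follows (using rdirichlet from the gtools package). The unnormalized factor g below is written as the mixture-Poisson likelihood implied by the observed reported count, which is an assumption consistent with the bound stated above; exposure is taken to be one, as in the remark that follows.

```r
library(gtools)                                   # for rdirichlet

# rejection sampler for p_t(x_i) | O_D when t > T - D, for one policyholder-period
sample_p_post <- function(n_samp, eta, z_obs, pi_t, lambda_states, T_per, t) {
  D      <- length(eta) - 1
  obs_d  <- 0:(T_per - t)                          # delays observable by the valuation date
  n_rep  <- sum(z_obs)                             # n^r_{i,t}
  eta_h  <- eta; eta_h[obs_d + 1] <- eta_h[obs_d + 1] + z_obs   # proposal h: Dirichlet
  g_fun  <- function(p) {                          # assumed unnormalized likelihood factor
    p_rep <- sum(p[obs_d + 1])                     # reporting probability p^r_{i,t}
    sum(pi_t * lambda_states^n_rep * exp(-lambda_states * p_rep))
  }
  g_sup  <- sum(pi_t * lambda_states^n_rep)        # upper bound used in the acceptance step
  out <- matrix(NA_real_, n_samp, D + 1)
  k <- 0
  while (k < n_samp) {
    p <- as.numeric(rdirichlet(1, eta_h))
    if (runif(1) <= g_fun(p) / g_sup) { k <- k + 1; out[k, ] <- p }
  }
  out
}

# illustrative call: period t = 11 of T = 12, D = 3, one reported claim, two states
draws <- sample_p_post(n_samp = 500, eta = c(2, 1.5, 1, 0.5), z_obs = c(1, 0),
                       pi_t = c(0.6, 0.4), lambda_states = c(0.1, 0.4),
                       T_per = 12, t = 11)
colMeans(draws)    # Monte Carlo estimates of E[p_t(d; x_i) | O_D]
```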
Remark: The formulas presented above in the derivation of the conditional distribution of
$\boldsymbol{p}_t$
implicitly assume an exposure of
$e_{i,t}=1$
. To account for exposure, simply multiply the Poisson intensity by the exposure factor.
M-Step
We now maximize Equation (5.7) from the E-step after replacing
$\hat{n}_{i,t,j}^{(k)}$
(
$t \gt T-D$
) and
$\widehat{\log p_{i,t,d}}^{(k)}$
with their Monte Carlo estimates. Again, the updated estimates of
$\boldsymbol{\pi}_1$
and
$\boldsymbol{\Gamma}$
are given by Equations (5.5) and (5.6), respectively. Moreover, similar to the Multinomial model case, given each j, it is straightforward to see that the third term is a weighted Poisson log-likelihood, and thus can be fit using standard packages in R. Finally, for the fourth and final term, it is straightforward to see that this is a Dirichlet log-likelihood. As previously illustrated, there are several flexible ways to model
$\boldsymbol{\eta}$
. For our data analysis, we model
$\boldsymbol{\eta}$
as a regression function of
$\boldsymbol{x}_i$
and time-dependent covariates. We obtain the parameter estimates for this Dirichlet regression using the DirichletReg package in R (Maier, Reference Maier2014), which rescales the probability estimates from the E-step to ensure that the constraints are satisfied.
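For concreteness, a minimal sketch of this Dirichlet regression step is given below; the data frame df_eta and its column names are illustrative, while DR_data and DirichReg are the relevant functions of the DirichletReg package.

```r
library(DirichletReg)

# response: (rescaled) delay probability estimates from the E-step, one row per (i, t)
df_eta$Y <- DR_data(df_eta[, c("p0", "p1", "p2", "p3")])

# common parametrization: a log-linear predictor for each Dirichlet parameter eta_{t,d}(x_i)
fit_eta <- DirichReg(Y ~ age + region + month, data = df_eta, model = "common")
coef(fit_eta)       # regression coefficients (the delta parameters)
```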
Convergence Criterion, Model Initialization, and Model Selection
We deviate from the standard EM algorithm and use the MCEM algorithm, as we compute the expectations in the E-step using Monte Carlo methods. In the MCEM algorithm, it is not guaranteed that the likelihood will increase after each iteration, unlike the standard EM algorithm (see Booth and Hobert, Reference Booth and Hobert1999). Thus, special care should be taken. Many methods have been proposed for approximating the EM algorithm (see Ruth, Reference Ruth2024 for a review). From a practical point of view, we use the same convergence criterion as we have for the previous model (the relative distance criterion), as this would have a negligible effect on the parameter estimates even though the likelihood can fluctuate.
For model initialization, we fit the model to our complete data up to period
$ T - D $
(as we did in the Multinomial case) and use the parameter estimates as the initialization for the algorithm. Finally, we choose the number of states g using a backward strategy as explained before.
6. IBNR reserve prediction
The estimation of the IBNR claims count varies slightly across the three models. Below, we outline the steps for the continuous-time model:
IBNR Claim Reserve Estimation Steps
1. Fit Right-Truncated Reporting Delay Regression: Estimate the parameters of the right-truncated reporting delay regression model to obtain the fitted distribution of the reporting delay
$F_{U\mid t,\boldsymbol{x}}(u)$ . The choice of model depends on the data at hand. Options include flexible mixture-based models, as proposed in Fung et al. (Reference Fung, Badescu and Lin2022) and Bücher and Rosenstock (Reference Bücher and Rosenstock2023). Alternatively, the flexsurv package in R (Jackson, Reference Jackson2016) provides tools for fitting parametric proportional hazard models for data with random right truncation.
-
2. Compute Reporting Delay Probability: Use the fitted reporting delay regression to compute the probability
$\int_{d_{l-1}}^{d_l} F_{U\mid t,\boldsymbol{x}_i}(\tau-t) \, dt$ for each policyholder
$i=1,\dots,m$ and each period
$l=1,\dots,T$ . This probability is used in maximizing the likelihood of the discretely observed reported claims processes and simulating the IBNR claim counts.
-
3. Fit the Discretely Observed Reported Claim Process: Maximize the likelihood of the observed reported claim counts using the EM algorithm (see Appendix A). This involves estimating the initial state probabilities and the transition matrix for the HMM, as well as the claim frequency regression parameters.
-
4. Viterbi’s Global Decoding: Apply Viterbi’s algorithm (see Chapter 5 in MacDonald and Zucchini, Reference MacDonald and Zucchini2016 for details) to find the most likely sequence of hidden states
$\boldsymbol{c}^*$ given the observed reported claim counts
$\boldsymbol{n}^{(r,1:T)}$ and the fitted model parameters from step 3.
-
5. Simulate IBNR Claim Counts: For each policyholder
$i=1,\dots,m$ and each period
$l=1,\dots,T$ where the exposure is positive, simulate the IBNR claim count
$N_{i,l}^{\text{IBNR}}$ using the Poisson distribution with mean
$e_{i,l}\lambda_l^{(c_l^*)}(\boldsymbol{x}_i) \times (1 - \int_{d_{l-1}}^{d_l} F_{U\mid t,\boldsymbol{x}_i}(\tau-t) \, dt)$ for the corresponding hidden state from the previous step. The total IBNR claim count at time
$\tau$ is then given by
$\sum_{i=1}^m \sum_{l=1}^T \hat{N}_{i,l}^{\text{IBNR}}$ , where
$\hat{N}_{i,l}^{\text{IBNR}}$ is the simulated IBNR claim count for policyholder i and period l (a code sketch of this step is given below).
Note that Step 5 is repeated 1000 times to obtain a distribution for the IBNR claim counts. The point estimate for the IBNR claim count is the mean of the 1000 simulated values.
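The R sketch below illustrates Step 5 under stated assumptions: expo, lambda_hat, and p_rep are m-by-T matrices holding the exposures $e_{i,l}$, the decoded state-dependent intensities $\lambda_l^{(c_l^*)}(\boldsymbol{x}_i)$, and the fitted reporting probabilities from Step 2; these object names are illustrative and not taken from the paper.

    # Sketch of Step 5 for the continuous-time model. Assumed inputs (names are
    # illustrative): expo[i, l] = e_{i,l}, lambda_hat[i, l] = lambda_l^{(c_l*)}(x_i),
    # p_rep[i, l] = fitted probability that a claim from period l is reported by tau.
    simulate_ibnr_total <- function(expo, lambda_hat, p_rep, n_sim = 1000) {
      mu <- expo * lambda_hat * (1 - p_rep)          # Poisson mean per policy-period
      mu <- mu[expo > 0]                             # keep periods with positive exposure
      replicate(n_sim, sum(rpois(length(mu), mu)))   # simulated totals of IBNR counts
    }

    # Point estimate: the mean of the simulated totals
    # ibnr_draws <- simulate_ibnr_total(expo, lambda_hat, p_rep); mean(ibnr_draws)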
The estimation process for the discrete-time models (Multinomial and Dirichlet-Multinomial) begins at Step 3, where the relevant likelihood is maximized, allowing the delay and frequency components to be simultaneously fitted.
In the multinomial model, maximizing the likelihood
$ \mathcal{L}^{(2)} $
provides direct estimates of the reporting probability vector
$\boldsymbol{p}_t(\boldsymbol{x}_i)$
. In Step 5, these probabilities are used to simulate the IBNR claim counts. Specifically, for each policyholder i and each period t, the IBNR claim count
$N_{i,t}^{\text{IBNR}}$
is simulated by drawing from a Poisson distribution whose mean is the product of the exposure-weighted, state-dependent intensity
$e_{i,t}\,\lambda^{(c_t^*)}(\boldsymbol{x}_i)$
and the probability, implied by
$\boldsymbol{p}_t(\boldsymbol{x}_i)$
, that a claim occurring in period t has not been reported by the valuation date.
In contrast, the Dirichlet-Multinomial model involves additional steps to account for the variability in reporting probabilities. Instead of directly estimating
$\boldsymbol{p}_t(\boldsymbol{x}_i)$
, Step 3 yields the Dirichlet parameters
$\boldsymbol{\eta}_{i,t}$
, which characterize the distribution of reporting probabilities. These parameters are used in Step 5 to simulate
$\boldsymbol{p}_t(\boldsymbol{x}_i)$
by drawing from a Dirichlet distribution. The simulated
$\boldsymbol{p}_t(\boldsymbol{x}_i)$
is then used in the same way as in the Multinomial model to estimate the IBNR claim counts. The additional simulation step in the Dirichlet-Multinomial model introduces variability, which is captured by repeating the process multiple times to better account for uncertainty in the reporting probabilities.
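To illustrate the additional simulation layer, the sketch below draws a reporting probability vector from a Dirichlet distribution with given parameters $\boldsymbol{\eta}$ (via normalized Gamma variates, since base R provides no Dirichlet sampler); the numerical values of $\boldsymbol{\eta}$ are arbitrary.

    # Draw the reporting probability vector p_t(x_i) from the fitted Dirichlet
    # distribution, then use it exactly as in the Multinomial model. Base R has no
    # Dirichlet sampler, so we normalize independent Gamma draws.
    rdirichlet1 <- function(eta) {
      z <- rgamma(length(eta), shape = eta, rate = 1)
      z / sum(z)
    }

    set.seed(1)
    eta   <- c(20, 4, 2, 1, 1, 0.5, 0.5, 0.5, 0.5)  # illustrative values, D = 9
    p_sim <- rdirichlet1(eta)                        # one simulated reporting vector
    round(p_sim, 3)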
7. Case study: real-life data
In this section, we apply the proposed models to estimate the IBNR claim count for a major European auto insurance company. We then compare these estimates with those obtained from the traditional Chain Ladder method. This analysis aims to evaluate the performance of our models in a practical insurance context. Additionally, we investigate how the joint modeling of occurrence and reporting in the discrete-time models performs compared to the continuous-time model, where we maximize the likelihood of the observed data using a two-step maximization process. This is particularly interesting given the loss of information when moving to discrete-time models.
7.1. Data analysis
Data Description
Our dataset contains information on Physical Damage policies issued by the company between January 2009 and December 2017. During these years, a total of 293,709 one-year policies were issued, resulting in 106,187 reported claims. For each policy, the dataset includes the policy start and end dates, as well as various vehicle and policyholder attributes. These attributes include the age of the driver (18–90), the age of the car (0–26), the class of the car (A, B, or C), the fuel type (Gasoline or Diesel), whether the contract is a renewal or issued to a new policyholder, and the region (five regions). For each claim, the dataset records the day on which the claim occurred, the day on which the claim was reported, and the progression of claim payments. In this study, we focus on estimating the IBNR claim count, which requires us to concentrate on the time of occurrence and reporting delay of each claim. Therefore, we will not consider the payments in our analysis.
Reporting Delay
From the 106,197 claims reported between January 2009 and December 2017, only two were reported more than two years after the occurrence date. This suggests that nearly all claims occurring before 2016 have been reported. Table 1 displays the distribution of reporting delays in months for these claims. A delay of zero months indicates a claim was reported in the same month it occurred, while a delay of one month means it was reported the following month, and so on. The majority of claims (81.71%) were reported in the same month they occurred. Over 99% of claims were reported within five months, and only 0.1% were reported after ten months. This indicates that most claims are reported promptly after occurrence, with only a small fraction experiencing delays beyond ten months.
Table 1. Reporting delay distribution for claims that occurred before 2016.

Frequency
Figure 1 illustrates the trends in exposure (left) and claim count per exposure (right) from January 2009 to December 2015. The exposure, which represents the number of policies at risk, exhibited a steady increase from January 2009 until December 2012. Following this period, it decreased consistently until January 2015, after which it began to rise again. In contrast, the claim count per exposure displayed significant fluctuations over the same period. It reached a minimum of 0.023 claims per exposure in December 2014 and peaked at 0.045 claims per exposure in May 2013. This variation in claim frequency highlights the need for a sophisticated model capable of capturing such dynamic changes. The HMM employed in our model is well-suited for this purpose, as it can account for the underlying state changes that drive fluctuations in claim frequency over time.

Figure 1. Monthly exposure (left) and claim count per exposure (right) between January 2009 and December 2015.
7.2. IBNR claim count prediction
In this subsection, we outline the four different models used to estimate the IBNR claim count for our data, along with their specifications. The models considered are the continuous-time model, the multinomial model, the Dirichlet-multinomial model, and the traditional Chain Ladder method.
For the discrete-time models, we set the maximum reporting delay, D, to 9 months. This choice is based on the observation that nearly 99.9% of claims are reported within this period, ensuring that the majority of claims are captured within the model.
Specifications for the Fitted Models
-
• The Continuous-time Model (CM): As previously explained, our approach begins by fitting the distribution for the reporting delay, which is then utilized to fit our model, and consequently estimate the IBNR claim count. A notable challenge in this process is the right truncation of the reporting delay, which complicates the fitting of the distribution. We provide the continuous-time model with an advantage over the discrete-time models as we fit the reporting delay using all claims that occurred before the reserve valuation date
$\tau$ , including those that have not yet been reported. This overcomes the issue of right truncation, allowing for a more accurate fit.
While this method is not feasible in practice – since unreported claims cannot be known beforehand – it serves to highlight the strengths of the discrete-time models, which we will demonstrate in the results section. In a practical scenario, one could use a “back censoring” technique, where a time threshold within the training period is identified. Claims occurring before this threshold would have been reported by the valuation date, allowing the model to be fit without concern for right truncation.
In our analysis, the log-logistic proportional hazard regression performed well in fitting the reporting delay data, as evidenced by the Cox-Snell residuals plot in Figure 2. The covariates used in the regression include the vehicle and policyholder risk attributes introduced in the data description (Section 7.1), the month in which the claim occurred, and the day of occurrence.

Figure 2. Cox-Snell residual plot for assessing the fit of the reporting delay model
-
• The Multinomial Model (MM): In Section 5.1, we outlined the EM algorithm required to fit the multinomial model, where we model the binomial probabilities
$ q_t(d;\,\boldsymbol{x}_i) $ for each
$ d \in \{1,\dots,D\} $ as a regression function of the vehicle and policyholder’s risk attributes
$\boldsymbol{x}_i$ and time-dependent covariates. The former are the vehicle and policyholder risk attributes introduced in the data description (Section 7.1), while the time-dependent covariates include the month of occurrence and the weekday on which the period ends, both of which were found to be significant in our regression analysis. Given that
$ q_t(d;\,\boldsymbol{x}_i) $ is small for
$ d \gt 1 $ , we use a complementary log-log link function to model it. For
$ d = 1 $ , we use the standard logit link function, as is typical for binomial GLMs. We fit the binomial component of the likelihood using the standard glm function in R (a brief code sketch is given after this list of model specifications).
-
• The Dirichlet-Multinomial Model (DM): The fitting process for the Dirichlet-Multinomial model is thoroughly detailed in Section 5.2. We model the Dirichlet parameters
$\boldsymbol{\eta}$ as a regression function of the same covariates used for
$q_t(d;\,\boldsymbol{x}_i)$ in the Multinomial Model.
-
• The Chain Ladder (CL): We estimate the IBNR claim count using the CL method on data aggregated monthly, serving as the baseline for comparing our models. As with the discrete-time models, we assume that claims are reported no later than 9 months after the month of occurrence.
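As referenced in the Multinomial model specification above, the following R sketch fits one conditional reporting probability $q_t(d;\,\boldsymbol{x}_i)$, $d \gt 1$, with a complementary log-log link. The data, covariate names, and weights are simulated stand-ins; in the EM algorithm the response and weights come from the E-step.

    # Illustrative fit of q_t(d; x_i) for a single delay d > 1 using a binomial GLM
    # with a complementary log-log link. Data and covariates are simulated stand-ins.
    set.seed(1)
    n  <- 500
    df <- data.frame(driver_age = runif(n, 18, 90),
                     month      = factor(sample(1:12, n, replace = TRUE)),
                     reported   = rbinom(n, 1, 0.05),  # indicator of reporting at delay d
                     w          = runif(n, 0.5, 1.5))  # stand-in E-step weights

    # Fractional weights trigger a harmless non-integer warning from glm.
    fit_d <- glm(reported ~ driver_age + month,
                 family = binomial(link = "cloglog"),
                 weights = w, data = df)
    summary(fit_d)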
Remarks
-
• For the continuous-time model, we do not assume a maximum reporting delay, so by following the steps in Section 6, we can estimate the full number of IBNR claims, unlike the discrete-time models. However, for a fair comparison with the discrete-time models, we use the estimated parameters after fitting the continuous-time model to estimate only the IBNR claims with a maximum delay period of 9 months after the month of claim occurrence. This is achieved by adjusting Step 5 so that the probability factor
$\left(1 - \int_{d_{l-1}}^{d_l} F_{U\mid t,\boldsymbol{x}_i}(\tau-t) \, dt\right)$ is replaced with:
$\int_{d_{l-1}}^{d_l} \left( F_{U\mid t,\boldsymbol{x}_i}(t_{\text{max}}-t) - F_{U\mid t,\boldsymbol{x}_i}(\tau-t) \right) dt,$
-
where t is the day of claim occurrence, U is the reporting delay random variable, and
$t_{\text{max}}$ is the date by which the claim should be reported if we assume a maximum delay of 9 months after the month of claim occurrence. For example, if the claim occurrence date is in January, then
$t_{\text{max}}$ will be the end of October for the same year. This adjustment ensures that only the first 9 months of reporting delay are considered in the estimation, aligning the continuous-time model’s output with the assumptions made in the discrete-time models.
-
• The covariates used for the state-dependent intensities,
$\lambda^{(j)}(\!\cdot\!)$ , are the vehicle and policyholder risk attributes introduced in the data description (Section 7.1). More specifically, these attributes include the age of the driver, the age of the car, the car class, whether the contract is a renewal or issued to a new policyholder, and the policyholder’s region.
7.3. Results
We present the results of estimating the IBNR claim count using the four models specified in the previous section, focusing on claims reported within 9 months after the month of occurrence. Estimations were performed at the end of each month from January 2014 to December 2016, resulting in 36 valuation points. For each valuation date, the training set consists of all claims reported up to that date; the models are fitted to these reported claims to estimate the parameters, which are then used to predict the IBNR claim count as outlined in Section 6. Since our dataset extends to December 2017, we can compare the estimates with the actual IBNR claim counts at each of the 36 valuation points, providing a robust assessment of the models' predictive performance.
To evaluate the performance of each model, we calculated the absolute percentage error (APE) of the IBNR claim count point estimates at each of the 36 valuation points. The APE is defined as
$\text{APE} = \dfrac{\big|\widehat{N}^{\text{IBNR}} - N^{\text{IBNR}}\big|}{N^{\text{IBNR}}} \times 100\%,$
where $\widehat{N}^{\text{IBNR}}$ denotes the point estimate and $N^{\text{IBNR}}$ the actual IBNR claim count.
The results are summarized in boxplots that display the distribution of APE for each model across the valuation period, as well as in a table that reports the mean and standard deviation of these errors (see Figure 3 and Table 2). We considered models with 2, 3, and 4 states for our HMMs. While more than 4 states could potentially capture additional granularity – especially for later reserve valuation dates with longer time series – such complexity may be excessive for the relatively short monthly time series available at earlier dates. Moreover, fitting models with more than 4 states for early dates often resulted in spurious states. Even for later valuation dates where models with 5 states did not produce spurious states, the Bayesian information criterion (BIC) generally favored the 4-state model, indicating no clear benefit from increased complexity. To ensure comparability, we therefore adopt a unified analysis using up to 4 states across all valuation dates.
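For reference, the BIC used in this comparison takes the standard form
$\text{BIC} = -2\log \hat{L} + k \log n,$
where $\hat{L}$ is the maximized likelihood, k the number of free parameters, and n the number of observations; lower values indicate a better trade-off between fit and complexity.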
Table 2. Mean and standard deviation of APE for IBNR estimates by model and HMM states (g):
$g=1$
corresponds to a Poisson process.


Figure 3. Boxplots of absolute percentage errors for IBNR claim count estimates by model and number of HMM states across 36 valuation dates.
Performance Comparison of the Chain Ladder Method and HMM-Based Models
The classical CL method performs worse than our proposed HMM-based models. The CL method shows a mean and median APE of around 16%, with high variability in the error, as indicated by both the boxplot and the standard deviation values presented in Table 2. This highlights the limitations of the CL method compared to more advanced approaches.
Performance of the Proposed Models
Our proposed models – the continuous-time model (CM), the multinomial model (MM), and the Dirichlet-Multinomial model (DM) – perform much better than the CL method, demonstrating their effectiveness in estimating IBNR claim counts. The discrete-time models (MM and DM) generally outperform the CM, as evidenced by their lower median and mean APE and lower variability (see Figure 3) across all values of g (2, 3, and 4). The CM model shows a median APE close to that of the MM and DM models, but its mean APE is considerably larger owing to its high variability, as reflected in the boxplot.
In general, the CM model tends to produce good estimates of IBNR claim counts when the integral
$\int_{d_{l-1}}^{d_l} F_{U\mid t,\boldsymbol{x}_i}(\tau-t) \, dt$ ,
representing the probability that a claim occurring within the period
$(d_{l-1}, d_l]$
is reported before the valuation date
$\tau$
, aligns closely with the actual percentage of such claims. However, when this integral deviates significantly from the actual percentage, the CM model’s estimates deteriorate. For example, at the end of January 2016, the average value of the integral
$\int_{d_{T-1}}^{d_T} F_{U\mid t,\boldsymbol{x}_i}(\tau-t) \, dt$ ,
where
$d_T =\tau$
(which represents the probability that a claim that occurred in January 2016 was reported before the end of the month), across policyholders was 0.81095, while the actual percentage was 0.7465753 (i.e., 75% of such claims were reported by the end of the month) – a difference that diverged from the usual pattern. This discrepancy, which was not captured by our model for reporting delay despite using all available data, significantly impacted the IBNR claim count estimate; the underestimated value of
$\left(1 - \int_{d_{T-1}}^{d_T} F_{U\mid t,\boldsymbol{x}_i}(\tau-t) \, dt\right)$
led to an underestimated IBNR claim count.
Effect of Increasing States on Model Performance
As the number of states g increases, the performance of the discrete-time models (MM and DM) improves, suggesting that these models benefit from additional states when capturing the underlying dynamics of the data. However, this trend does not hold for the continuous-time model (CM), whose performance remains inconsistent because of its sensitivity to the fitted reporting delay distribution and the two-step likelihood maximization, as explained above. Therefore, while adding states improves the accuracy of the discrete-time models, it does not necessarily translate into better performance for the continuous-time model.
BIC for the Fitted Models
We report the BIC for the fitted models at the first reserve valuation date. The BIC values for the Continuous-time, Multinomial, and Dirichlet-Multinomial models with
$g = 2, 3, 4$
are shown in Table 3.
Table 3. BIC values for three fitted HMM-based models with
$g=2,3,4$
at the first reserve valuation date.

The results indicate that the BIC is lowest for
$g = 4$
across all three models, suggesting that
$g = 4$
provides the best fit. However, the BIC values for
$g = 3$
and
$g = 4$
are very close, indicating that the improvement in fit from
$g = 3$
to
$g = 4$
is relatively small.
Effect of Incorporating the HMM
Table 2 also shows summary statistics for the APE in the case
$ g = 1 $
, which corresponds to a Poisson process. The performance of both discrete-time models clearly improves once the HMM is incorporated: for both, the mean APE decreases markedly as the number of states increases. This indicates that the HMM structure, with its ability to account for state transitions over time, yields more accurate IBNR claim count estimates. The improvement appears not only in the mean APE but also in the reduced standard deviation of the APE, reflecting greater stability and precision with more states.
In contrast, the CM model shows minimal change when the HMM is incorporated. Its APE remains largely unchanged, indicating that the continuous-time model is less affected by the number of states; the reasoning outlined above regarding the effect of increasing the number of states applies here as well.
Effect of Including Covariates
To evaluate the effect of including the vehicle and policyholder’s risk characteristics, we fitted our models without incorporating any covariates in the framework. The results indicate that, across all numbers of hidden states g, the APE ranged between 16 and 17% for all three models (the errors are all very similar so we do not include a table). This is comparable to the results obtained using the CL method. These findings highlight the significant role of covariates in improving model performance, as their inclusion enables the models to account for variations in claim counts arising from individual-level characteristics, which cannot be captured in a purely aggregate framework.
Comparison with the IPW Method (Vanegas et al., Reference Vanegas, Badescu and Lin2024)
To provide additional context for the performance of our proposed models, we compared them to the inverse probability weighting (IPW) method proposed in Vanegas et al. (Reference Vanegas, Badescu and Lin2024). This method was implemented using the same reporting delay regression structure as in our continuous-time model. The mean APE for the IPW method was 15%, only about two percentage points lower than that obtained with the CL method.
It is important to note that our proposed models fundamentally differ from the IPW method in their approach. While the IPW focuses on estimating development factors through reporting delay regressions, our models incorporate the temporal dynamics of the claim arrival process. This temporal component enables our models to capture additional patterns and dependencies in the data, resulting in superior performance relative to the IPW method.
Comparison with Chain Ladder under Yearly Aggregation
Figure 4 reveals a clear seasonal pattern in the IBNR claim counts. Although Tee and Käärik (Reference Tee and Käärik2022) argue that CL estimation tends to be more accurate under finer (e.g., monthly) aggregation, coarser aggregations are also worth considering when the reporting structure exhibits seasonal effects. In practice, practitioners may prefer yearly aggregation because it captures full seasonal cycles in the reporting structure.
To assess the impact of aggregation choice on model performance, we compare our models against the CL method applied to yearly aggregated data. Since the CL method with yearly aggregation does not allow restriction to claims reported within a maximum delay of 9 months, we estimate the total IBNR claim count for all models to ensure a fair comparison. For our discrete-time models, which assume a maximum delay of
$D = 9$
, we estimate the long-delay component (i.e., for delays
$d \gt 9$
) using the empirical average:

and
$z_{l,d}$
is the number of claims occurring at time l and reported with delay d. This correction is applied to all censored entries with
$d \gt 9$
and
$t + d \gt T$
, up to the maximum observed reporting delay in the data, allowing for a complete IBNR claim count estimate.
We compute IBNR claim counts at 24 monthly valuation dates spanning January 2014 to December 2015. Given that the maximum reporting delay in the dataset is two years and data are available up to December 2017, we are able to compare these estimates with the actual observed IBNR claim counts. The mean APE for IBNR claim counts across the 24 valuation dates – calculated for the CL method with monthly aggregation, the CL method with yearly aggregation, and the three proposed HMM-based models with 4 states – is shown in Table 4.
Table 4. Mean APE for IBNR claim count estimates across models.


Figure 4. Error bar plots showing the 95% confidence intervals for simulated IBNR claim count estimates using the DM and MM models with
$ g = 2, 3, 4 $
. The red points indicate when the actual IBNR claim count falls outside the interval, and the blue points indicate when it falls within the interval.
The results confirm that yearly aggregation improves CL performance compared to monthly aggregation, likely due to better alignment with seasonal patterns in reporting delays. Notably, the performance of CL-Yearly is comparable to our continuous-time model. Nevertheless, both of our discrete-time HMM-based models outperform the CL benchmark under both aggregation schemes.
Analysis of Confidence Intervals for DM and MM Models
The error bar plots in Figure 4 illustrate the 95% confidence intervals for the simulated IBNR claim count estimates using the DM and MM models with
$ g = 2, 3, 4 $
, while the points represent the actual IBNR claim counts. The red points indicate instances where the actual IBNR claim count falls outside the interval, and the blue points indicate instances where it falls within the interval. It is evident that the DM model’s confidence intervals are generally wider and more frequently contain the actual IBNR claim counts compared to the MM model. Specifically, for the DM model, the confidence intervals include the actual IBNR claim count in 28, 29, and 28 out of the 36 valuation dates for
$ g = 2, 3, 4 $
, respectively. In contrast, the MM model’s intervals contain the actual IBNR claim counts only 17, 20, and 21 times for
$ g = 2, 3, 4 $
, respectively. The increased width of the confidence intervals in the DM model reflects the introduction of variability in the reporting probability vector through the Dirichlet assumption. This added variability leads to a more appropriate distribution of the IBNR claim counts, capturing the inherent uncertainty and providing a more realistic estimate. This highlights the importance of considering such variability for better risk management, as underestimating or overestimating reserves can have significant financial implications.
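For completeness, the coverage counts reported above can be computed directly from the simulation output; the sketch below assumes sims is a matrix with one column of simulated IBNR totals per valuation date and actual is the vector of observed counts (both names are illustrative).

    # Empirical coverage of the simulated 95% intervals across valuation dates.
    coverage_count <- function(sims, actual, level = 0.95) {
      lo <- apply(sims, 2, quantile, probs = (1 - level) / 2)
      hi <- apply(sims, 2, quantile, probs = 1 - (1 - level) / 2)
      sum(actual >= lo & actual <= hi)   # number of dates where the interval covers
    }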
8. Conclusion
In this paper, we proposed a novel micro-level Cox model designed to enhance the estimation of IBNR claim counts in P&C insurance. Our model incorporates an HMM to capture the temporal dependencies in the claim arrival process, allowing for the integration of policyholder-level data and environmental factors. This framework supports analysis at varying levels of granularity, providing flexibility for different applications. We initially presented the model in a continuous-time framework, then extended it to a discrete-time framework to enable simultaneous modeling of claim occurrence and reporting delays. Additionally, we introduced a Dirichlet distribution assumption for the reporting delay probabilities, addressing nonsystematic or unexplainable variations in delay structures.
Our empirical analysis demonstrated that the discrete-time models, particularly the one incorporating the Dirichlet assumption, outperformed the continuous-time model, providing more accurate estimates of the IBNR claim count, with both frameworks outperforming the classical CL method. This finding highlights the importance of jointly modeling claim occurrence and reporting behavior, as well as the value of accounting for randomness in reporting delays.
For future research, we plan to conduct a deeper analysis of the applicability of continuous-time models in the literature. Given the superior performance of our discrete-time approach, it is essential to reassess the contexts in which continuous-time models are most appropriate. Additionally, the inclusion of the Dirichlet assumption in our model suggests a promising direction for the application of Bayesian methods in IBNR reserving. By transitioning to a Bayesian framework, we aim to further improve the robustness and flexibility of our reserve estimates.
Supplementary material
The supplementary material for this article can be found at https://doi.org/10.1017/asb.2025.10056
Funding statement
The authors acknowledge the financial support provided by the Natural Sciences and Engineering Research Council of Canada (Andrei L. Badescu: Grant No. RGPIN-2017-06684; Radu Craiu: Grant No. RGPIN-2024-04506; X. Sheldon Lin: Grant No. RGPIN-2023-04326).
Competing interests
The authors declare none.
Appendix
A. EM Algorithm for the Continuous-time Model
We maximize the likelihood for the continuous-time model (Equation 4.1) using the two-step procedure commonly adopted in the micro-level reserving literature: first, we fit the right-truncated reporting delay distribution, and then estimate the frequency model while treating the delay distribution as fixed. This appendix focuses on the estimation of the “Claim Frequency” component under this framework.
Recall from Proposition 2 that
$N_{i,l}^{\text{r}}|C_l=j \sim \text{Poisson}( \lambda^{(j)}(\boldsymbol{x}_i)a_{l}(\boldsymbol{x}_i)),$
where
$a_{l}(\boldsymbol{x}_i) = e_{i,l}\left(\int_{d_{l-1}}^{d_l}P_{U|\boldsymbol{x}_i}(\tau-t)\,dt\right)$
. We now describe the EM algorithm needed to fit the “Claim Frequency” component (Equation 5.1), assuming that
$a_{l}(\boldsymbol{x}_i)$
is a known constant for all
$l=1,\dots,T$
and
$i=1,\dots,m$
.
Complete Log-Likelihood
The complete data are given by the observed data (the discretely observed reported claims) and the unobserved data (the states of the HMM). It is straightforward to show that the log-likelihood of the complete data is given by
$\ell_c(\boldsymbol{\Phi}_1) = \sum_{j=1}^{g} u_{1j}\log \pi_{1j} + \sum_{t=2}^{T}\sum_{j=1}^{g}\sum_{k=1}^{g} v_{tjk}\log \gamma_{jk} + \sum_{t=1}^{T}\sum_{j=1}^{g} u_{tj} \sum_{i=1}^{m} \log \mathbb{P}_{j}(n_{i,t}^{\text{r}}),$
where
$u_{tj} = \unicode{x1D7D9}\{C_t = j\}$
,
$v_{tjk} = \unicode{x1D7D9}\{C_{t-1}=j, C_{t}=k\}$
, and
$\mathbb{P}_{j}(n_{i,t}^r) = P(N_{i,t}^r=n_{i,t}^r|C_t = j)$
.
E-step
In the E-step, we compute the conditional expectation of the complete log-likelihood given the observed data and the current estimator
$\boldsymbol{\Phi}_1^{(k-1)}$
of the model parameters
$\boldsymbol{\Phi}_1$
. This conditional expectation is given by
$Q\big(\boldsymbol{\Phi}_1;\boldsymbol{\Phi}_1^{(k-1)}\big) = \sum_{j=1}^{g} \hat{u}_{1j}^{(k)}\log \pi_{1j} + \sum_{t=2}^{T}\sum_{j=1}^{g}\sum_{l=1}^{g} \hat{v}_{tjl}^{(k)}\log \gamma_{jl} + \sum_{t=1}^{T}\sum_{j=1}^{g} \hat{u}_{tj}^{(k)} \sum_{i=1}^{m} \log \mathbb{P}_{j}(n_{i,t}^{\text{r}}), \qquad (\text{A}1)$
where
$\hat{u}_{tj}^{(k)}$
and
$\hat{v}_{tjl}^{(k)}$
are given by Equations (5.3) and (5.4), respectively.
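Equations (5.3) and (5.4) are not restated here. Assuming standard HMM notation, with forward probabilities $\alpha_t(j)$ and backward probabilities $\beta_t(j)$ evaluated at $\boldsymbol{\Phi}_1^{(k-1)}$, these smoothed quantities take the familiar Baum–Welch form
$\hat{u}_{tj}^{(k)} = \dfrac{\alpha_t(j)\,\beta_t(j)}{\sum_{j'=1}^{g}\alpha_t(j')\,\beta_t(j')}, \qquad \hat{v}_{tjl}^{(k)} = \dfrac{\alpha_{t-1}(j)\,\gamma_{jl}\left(\prod_{i=1}^{m}\mathbb{P}_{l}(n_{i,t}^{\text{r}})\right)\beta_{t}(l)}{\sum_{j'=1}^{g}\sum_{l'=1}^{g}\alpha_{t-1}(j')\,\gamma_{j'l'}\left(\prod_{i=1}^{m}\mathbb{P}_{l'}(n_{i,t}^{\text{r}})\right)\beta_{t}(l')},$
where the emission probability in each state is the product over policyholders of the state-dependent Poisson probabilities.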
M-step
The M-step in the EM algorithm involves maximizing Equation (A1) with respect to the model parameters
$\boldsymbol{\Phi}_1=\{\boldsymbol{\pi}_1,\boldsymbol{\Gamma},\boldsymbol{\theta}_1,\dots,\boldsymbol{\theta}_g\}$
, subject to the necessary constraints. The optimal values for
$\pi_{1j}^{(k)}$
and
$\gamma_{jl}^{(k)}$
at the kth iteration are given by Equations (5.5) and (5.6), respectively. Finally, we aim to obtain the regression parameter
$\boldsymbol{\theta}_j^{(k)}$
that maximizes the third term of Equation (A1), which is a weighted Poisson regression log-likelihood. This maximization is readily handled by standard software packages in R; a brief sketch follows.
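As a concrete illustration with simulated stand-in data and placeholder covariate names, the weighted Poisson regression for a given state j can be fitted with glm, using the known $a_{l}(\boldsymbol{x}_i)$ as an offset on the log scale and the E-step probabilities $\hat{u}_{tj}^{(k)}$ as prior weights.

    # Sketch of the M-step weighted Poisson regression for one state j. The data
    # frame has one row per (policyholder, period): the reported count n_r, the
    # covariates, the known a_l(x_i), and the E-step weight u_hat (all illustrative).
    set.seed(1)
    n  <- 1000
    df <- data.frame(n_r        = rpois(n, 0.1),
                     driver_age = runif(n, 18, 90),
                     a          = runif(n, 0.5, 1),   # known a_l(x_i), illustrative
                     u_hat      = runif(n))           # E-step state probabilities

    fit_j <- glm(n_r ~ driver_age + offset(log(a)), family = poisson,
                 weights = u_hat, data = df)
    coef(fit_j)   # theta_j^{(k)} for state j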
Convergence Criterion, Model Initialization and Model Selection
We use similar procedures as those for the discrete-time models.