1. Introduction
A physics-inspired neural network (PINN) incorporates principles from physics into its learning process, enhancing its efficiency in solving complex scientific problems. Compared to existing approaches, PINNs reduce the reliance on large datasets and provide accurate predictions even with sparse or noisy measurements. Researchers have employed PINNs to address a diverse range of problems, including fluid dynamics, solid mechanics, heat transfer, and quantum mechanics.
In this study, we demonstrate that PINNs can be used for the valuation and design of specific variable annuities (VAs) known as guaranteed minimum accumulation benefits (GMABs). This retirement savings vehicle guarantees the policyholder the higher of the account value or a specified amount upon maturity, assuming survival. The proposed model incorporates the following risk factors: interest rates, equity prices, and mortality rates. We also consider four asset classes: cash, zero-coupon bonds, stocks, and mortality-linked bonds. Our goal is to develop a single neural network capable of pricing GMABs with various specifications, including different investment strategies and maturities.
In summary, a PINN serves as an approximate solution to a nonlinear partial differential equation (PDE) that describes the dynamics of a model. The neural network is trained by minimizing the approximation error at sampled points within the PDE domain and on its boundaries. This method originates from the work of Lee & Kang (Reference Lee and Kang1990) and has since been applied to various nonlinear PDEs. For example, Raissi et al. (Reference Raissi, Perdikaris and Karniadakis2019) used PINNs for solving two main classes of mathematical problems: data-driven solutions and data-driven discovery of PDEs. In the field of physics, Carleo & Troyer (Reference Carleo and Troyer2017) and Cai (Reference Cai2018) employed PINNs to accurately approximate quantum many-body wave functions. For further information and recent developments and applications of PINNs, we recommend referring to Cuomo et al. (Reference Cuomo, Schiano Di Cola, Giampaolo, Rozza, Raissi and Piccialli2022) for a comprehensive literature review.
The literature on the valuation of financial and actuarial liabilities using neural networks is relatively recent. Hejazi & Jackson (Reference Hejazi and Jackson2016) developed a neural network to price and estimate the "Greeks" for a large portfolio of VAs. Doyle & Groendyke (Reference Doyle and Groendyke2019) priced and hedged equity-linked contracts using neural networks. They constructed a dataset of variable annuity prices with various features through Monte Carlo (MC) simulations. The neural network was then estimated by minimizing prediction errors using a scaled conjugate gradient descent. Sirignano & Spiliopoulos (Reference Sirignano and Spiliopoulos2018) introduced a deep Galerkin method (DGM) based on a network architecture inspired by long short-term memory (LSTM) neural cells. Their approach was tested on a class of high-dimensional free boundary PDEs and on high-dimensional Hamilton–Jacobi–Bellman PDEs and Burgers’ equation. Gatta et al. (Reference Gatta, Di Cola, Giampaolo and Piccialli F.Cuomo2023) evaluated a suitable PINN for the pricing of American multiasset options and proposed a novel algorithmic technique for free boundary training. Al-Aradi et al. (Reference Al-Aradi, Correia, Jardim, de Freitas Naiff and Saporito2022) extended the DGM in several directions to solve Fokker–Planck and Hamilton–Jacobi–Bellman equations. More recently, Jiang et al. (Reference Jiang, Sirignano and Cohen2023) demonstrated, under mild assumptions, the convergence of the deep Galerkin and PINNs method for solving PDEs. Glau & Wunderlich (Reference Glau and Wunderlich2022) formalized and analyzed the deep parametric PDE method for solving high-dimensional parametric PDEs.
A close alternative to PINN is developed in Weinan et al. (Reference Weinan, Han and Jentzen2017), where they solve parabolic PDEs in high dimensions using neural networks connected to backward stochastic differential equations (BSDEs). This approach employs two separate networks for the solution and its gradient. They apply their algorithm to price options in a multivariate Black and Scholes model. Beck et al. (Reference Beck, Weinan and Jentzen2019) introduce a similar method for solving high-dimensional PDEs based on a connection between fully nonlinear second-order PDEs and second-order backward stochastic differential equations. They illustrate the accuracy of their method with the Black and Scholes and Hamilton–Jacobi–Bellman equations. Using this principle, Barigou & Delong (Reference Barigou and Delong2022) evaluate equity-linked life insurances with neural networks by solving a BSDE, building upon previous results from Delong et al. (Reference Delong, Dhaene and Barigou2019). Unlike PINNs, BSDE methods rely on two networks instead of one, and any modification of contract specifications requires retraining.
Another valuation approach based on neural networks relies on simulations. Buehler et al. (Reference Buehler, Gonon, Teichmann and Wood2019) present a framework for pricing and hedging derivatives using neural networks that are trained on simulated sample paths of risk factors. In this case, the neural network approximates the optimal investment strategy, and the value is determined by averaging discounted payoffs obtained through simulations. Similarly, Horvath et al. (Reference Horvath, Teichmann and Zuric2021) studied the performance of deep hedging under rough volatility models, while Biagini et al. (Reference Biagini, Gonon and Reitsam2023) developed a neural network approximation using simulations for the superhedging price and replicating strategy. As with BSDE approaches, any modification of contract specifications requires retraining. For a comprehensive review of algorithms based on neural networks for stochastic control and PDEs in finance, we recommend referring to Germain et al. (Reference Germain, Pham and Warin2021).
In the financial and insurance industry, four dominant numerical methods are commonly used for pricing contingent claim contracts: MC simulations, bi- or trinomial trees, PDE solving, or fast Fourier transform (FFT) inversion. For situations involving more than two risk factors, trees, PDE solving, or FFT methods become too complex, making MC simulations the only viable alternative. In contrast, the valuation using a PINN does not require the simulation of sample paths of underlying risk factors. Furthermore, once the PINN is trained, valuation becomes nearly instantaneous. Another noteworthy feature is the ability to parameterize a PINN with contract features such as asset-mix or contract durations. This provides a quick solution for understanding the impact of asset-liability management (ALM) on pricing without the need to retrain the network for each setting. This presents a significant advantage compared to BSDE or deep hedging approaches.
This article makes the following contributions. First, we develop a closed-form expression for the price of a GMAB with participation in a market consisting of five classes of assets, including a mortality bond for hedging exposure to longevity risk. This analytical formula enables us to assess the pricing accuracy of a large set of contracts using PINNs without the need for MC simulations. Next, we introduce a specific type of neural network in which intermediate layers are fed with both the output of the previous layer and the initial input vector. Such a network better captures the nonlinear behavior of GMAB prices compared to classical feed-forward networks. Third, we consider risk factors (interest and mortality rates) driven by mean-reverting processes, whereas the existing literature on PINNs predominantly focuses on high-dimensional geometric processes. Lastly, we train the network for various investment strategies, asset mixes, and contract durations. In this sense, our model is parametric and offers significant flexibility compared with BSDE or deep hedging methods, which require retraining for each specific setting.
The article is structured as follows. The next section introduces the financial and biometric risk factors. We consider a multivariate Brownian setting to maintain the analytical tractability needed for benchmarking the PINN. In Section 3, we develop the closed-form expressions for zero-coupon bonds, survival probabilities, and pure endowments. Section 4 provides detailed specifications of the GMAB, including the assumption that assets are invested in a portfolio of cash, bonds, stocks, and mortality bonds. We offer the analytical formula for pricing the GMAB within this framework. Section 5 presents a scaled version of the Feynman–Kac equation that GMAB prices satisfy. In Section 6, we outline the architecture of the neural network used to solve this PDE and explain the training procedure. The final section offers a comprehensive analysis of pricing errors based on a validation set of 500,000 contracts with various features.
2. Risk factors
Our objective is to demonstrate the effectiveness of PINNs in valuing GMABs and establishing investment strategies for them. To validate our methodology, we compare the exact GMAB prices to the estimates generated by the neural network. To facilitate this comparison, we consider Brownian financial and biometric risk factors and derive a closed-form expression for the GMAB price. An alternative validation approach would involve evaluating GMAB prices through MC simulations, but this method is computationally intensive and less accurate. This section introduces the dynamics of risk factors, while Section 3 provides analytical expressions for bond prices and survival probabilities. Finally, we obtain the analytical value of the GMAB in Section 4.
The GMABs provide policyholders with a promise that the value of their investment, conditionally to their survival, will reach a minimum predetermined amount at expiry, regardless of how the underlying investments perform. The value of GMABs depends both on biometrical and financial risk factors that we detail in this section. We consider an hybrid market in which we invest in cash, stocks, interest rate bonds, and mortality bonds. The stock indice and the interest rate are, respectively, denoted by $\left (S_{t}\right )_{t\geq 0}$ , $\left (r_{t}\right )_{t\geq 0}$ . The insured is $x_{\mu }$ year old, and the reference force of mortality is denoted by $\left (\mu _{x+t}\right )_{t\geq 0}$ . The insurer can also invest in mortality bonds written on a reference population of age $x_{\lambda }$ and mortality rates $\left (\lambda _{x_{\lambda }+t}\right )_{t\geq 0}$ .
The financial and biometric processes are defined on a probability space $\left ({\Omega },\mathcal{F},\mathbb{P}\right )$ associated to four independant Brownian motions under $\mathbb{P}$ , ${\tilde{\boldsymbol{W}}}_{t}=\left (\tilde{W}_{t}^{(1)},\tilde{W}_{t}^{(2)},\tilde{W}_{t}^{(3)},\tilde{W}_{t}^{(4)}\right )_{t\geq 0}^{\top }$ . The state variables $\left (S_{t},r_{t},\mu _{x_{\mu }+t},\lambda _{x_{\lambda }+t}\right )$ are driven by the following SDEs:
where $\kappa _{r}$ , $\kappa _{\mu }$ , $\kappa _{\lambda }$ , $\sigma _{S}$ , and $\sigma _{r}$ belong to $\mathbb{R}^{+}$ , whereas $\gamma _{r}(t)$ , $\gamma _{\mu }(t)$ , $\gamma _{\lambda }(t)$ , $\sigma _{\mu }(t)$ , and $\sigma _{\lambda }(t)$ are positive functions of time. $\gamma _{r}(t)$ $\gamma _{\mu }(t)$ , and $\gamma _{\lambda }(t)$ are fitted to term structures of interest and mortality rates. Details about the estimation procedure are provided in Appendix. Notice that $\gamma _{\mu }(t)$ , $\gamma _{\lambda }(t)$ , $\sigma _{\mu }(t)$ , and $\sigma _{\lambda }(t)$ depend on the initial age of the insured and of the hedging population; $x_{\mu },x_{\lambda }\in [0,x_{max})$ , $x_{max}\gt 0$ .
In our framework, mortality rates have a mean-reverting Gaussian dynamic, which offers a good trade-off between complexity and analytical tractability. Milevsky & Promislow (Reference Milevsky and Promislow2001) were among the firsts to consider mean-reverting stochastic processes for modeling mortality. Luciano & Vigna (Reference Luciano and Vigna2005) developed a jump-diffusion affine model for modeling rates and showed that constant mean reverting process are not adapted for describing the mortality. Luciano & Vigna (Reference Luciano and Vigna2008) calibrated various time homogeneous mean-reverting models to different generations in the UK population and investigate their empirical appropriateness. Jevtic et al. (Reference Jevtic, Luciano and Vigna2013) studied and calibrated a cohort-based model which captures the characteristics of a mortality surface with a continuous-time factor approach. Zeddouk & Devolder (Reference Zeddouk and Devolder2020) modeled the force of mortality with mean reverting affine processes, where the level of reversion is time dependent. Hainaut (Reference Hainaut2023) proposed a mortality model based on a mean reverting random field indexed by time and age.
In Equation (1), the parameters $\nu _{S}$ , $\nu _{r}$ , $\nu _{\mu }$ , and $\nu _{\lambda }$ tune the risk premiums $\nu _{S}\sigma _{S}$ , $-\nu _{r}\sigma _{r}$ , $-\nu _{\mu }\sigma _{\mu }(t)$ , $-\nu _{\lambda }\sigma _{\lambda }(t)$ of processes. Furthermore, we assume that the standard deviation of mortality rates is related to age through the relations $\sigma _{\mu }(t)=\alpha _{\mu }e^{\beta _{\mu }(x_{\mu }+t)}$ and $\sigma _{\lambda }(t)=\alpha _{\lambda }e^{\beta _{\lambda }(x_{\lambda }+t)}$ . $\Sigma _{S}$ , $\Sigma _{r}$ , $\Sigma _{\mu }$ , and $\Sigma _{\lambda }$ are vectors such that $\Sigma =\left (\Sigma _{S}^{\top },\Sigma _{r}^{\top }\Sigma _{\mu }^{\top },\Sigma _{\lambda }^{\top }\right )$ is the (upper) Choleski decomposition of the correlation matrix:
where $\rho _{Sr,}\,\rho _{S\mu },\,\rho _{S\lambda },\rho _{r\mu },\rho _{r\lambda }$ , and $\rho _{\mu \lambda }\in ({-}1,1)$ . This model incorporates the correlation between financial and mortality shocks, which can be relevant in the context of events such as a pandemic like COVID-19. We assume that costs of risk, noted $\boldsymbol{\theta }=\left (\theta _{j}\right )_{j=1,\ldots,4}^{\top }$ , are constant, i.e., the Brownian motions with drift under $\mathbb{P}$ ,
are Brownian motions under $\mathbb{Q}$ . To keep identical dynamics under $\mathbb{P}$ and $\mathbb{Q}$ , the risk premium parameters, stored in a vector $\boldsymbol{\theta }=\left (\theta _{1},\theta _{2},\theta _{3},\theta _{4}\right )^{\top }$ , are such that
Under the risk neutral measure $\mathbb{Q}$ , financial and biometric processes are ruled by the following SDEs:
We will use later these equations to build the valuation equation fulfilled by the GMAB. As mentioned earlier, we have chosen a pure Brownian model specifically for the purpose of benchmarking exact GMAB prices against PINN predictions. The following two sections present intermediate analytical results that are essential for deriving a closed-form expression for the GMAB.
3. Bonds and survival probabilities
We denote by $P(t,T)$ the present value of a zero-coupon bond that pays one monetary unit at expiry date, $T$ . The survival probabilities of an $x_{z}+t$ year old individual up to time $t$ are noted $\,_{T}p_{x_{z}+t}^{z}$ , for $z\in \{\mu,\lambda \}$ . Pure endowments provide a lump sum payment at expiry in case of survival and nothing if the policyholder dies before the term. We denote them by $_{T}E_{t}^{z}$ for $z\in \{\mu,\lambda \}$ . Zero-coupon bonds, survival probabilities, and pure endowents are valued by the following expectations under the risk neutral measure:
The model being affine, we derive the closed-form expressions of these products. In the remainder of this article, we adopt the following notation
where $y\in \mathbb{R}^{+}$ is a positive parameter. Various combined integrals of $B_{y}(t,T)$ , $\sigma _{\mu }(t)$ , and $\sigma _{\lambda }(t)$ are provided in Appendix B and serve to analytically evaluate the GMAB. From Proposition 4 in Appendix A, $\gamma _{r}(T)$ matches the initial yield curve of interest rate if
where $-\partial _{T}\ln P(0,T)$ is the instantaneous forward rate. In a similar manner, for a given initial mortality curve $\,_{T}p_{x_{z}}^{z}$ , the function $\gamma _{z}(u)$ is such that
Bond prices, survival probabilities, and endowments admit closed-form expressions presented in the next proposition.
Proposition 1. The price at time $t\leq T$ of a discount bond of maturity $T$ is linked to the initial interest rate curve at time $t=0$ by the relation
In a similar manner, we can show that, when alive at age $x_{z}+t$ , the survival probability up to time $T$ depends on the initial survival term structure as follows for $z\in \{\mu,\lambda \}$
whereas the pure endowment contracts are valued by
and
The sketch of the proof is provided in Appendix A. Using standard developments, the bond price dynamic is equal to
We assume that policyholder’s savings is invested in cash, stocks, zero-coupon bonds, and in mortality bonds written on a reference population of age $x_{\lambda }$ at time $t=0$ . We denote by $D(t,T)$ , the value of this mortality linked bond expiring at time $T$ . Its return depends on the benchmark mortality rate, $\lambda _{x_{\lambda }+t}$ , and is valued at by the following expectation:
This product is designed in such a way that, in the event of over- or undermortality, the bondholder realizes a capital loss or gain at maturity. The dynamics of this bond are described in the next proposition, which is proven in Appendix A.
Proposition 2. Under the risk neutral measure $\mathbb{Q}$ , the mortality bond earns on average the risk-free rate and its price obeys to the SDE:
Under $\mathbb{P}$ , the risk premium of the mortality bond is proportional to interest and mortality volatilities:
4. Contract and analytical valuation
GMABs are retirement savings products which promises in case of survival, the maximum between investments, and a guaranteed capital. The policyholder savings are invested in cash, stocks, risk-free and mortality bonds of maturity $T$ . We denote by $\left (A_{t}\right )_{t\geq 0}$ the total value of these investments. The percentages of stocks, risk-free and mortality bonds are assumed constant and denoted by $\boldsymbol{\pi }=\left (\pi _{S},\pi _{P},\pi _{D}\right )_{t\geq 0}$ . We assume that the investment policy is self-financed, the dynamic of assets under management is equal to
where $dS_{t}$ , $dP(t,T)$ , and $dD(t,T)$ are, respectively, provided in Equations (1), (7), and (9). We can therefore reformulate the differential of $A_{t}$ under $\mathbb{P}$ as follows
where $\boldsymbol{\nu }_{A}(t)$ is a time-varying vector of risk premiums,
and $\Sigma _{A}(t)$ is a volatility $3\times 4$ -matrix :
Under the risk neutral measure, investments earn on average the risk-free rate and the differential of $A_{t}$ is ruled by
The contract is subscribed by an individual of age $x_{\mu }$ and guarantees at the expiry date, denoted by $T^{*}$ , a payout equal to the maximum between a guaranteed capital $G_{m}$ and the portfolio $A_{T^{*}}$ , in the event of survival. However, the benefit is eventually bounded by $G_{M}$ . In practice, $G_{m}$ is often set to the initial investment, $A_{0}$ , or to a percentage of $A_{0}$ . Whereas, the participation is most of the time unbounded ( $G_{M}=+\infty$ ). We denote by $\tau _{\mu }\in \mathbb{R}^{+}$ , the random time of insured’s death and $N_{t}^{\mu }=\textbf{1}_{\{\tau _{\mu }\geq t\}}$ . The payoff in case of survival is
The fair value of such a policy, denoted by $L_{t}$ , is equal to the expected discounted cash flows under the chosen risk neutral measure, noted $\mathbb{Q}$ . This is a function of all state variables:
In order to obtain a closed form expression of the saving contract (10), we perform a change of measure using as Radon–Nykodym derivative:
From Equations (A6), this change of measure is rewritten as follows:
We recognize a Doleans–Dade exponential and then under the measure $\mathbb{F}.$ We denote by $\boldsymbol{W}_{t}^{\mathbb{F}}$ the vector of $\mathbb{F}$ -Brownian motions $\left (W_{t}^{(1)\mathbb{F}},W_{t}^{(2)\mathbb{F}},W_{t}^{(3)\mathbb{F}},W_{t}^{(4)\mathbb{F}}\right )$ , defined by
The dynamic of the total asset under the $\mathbb{F}$ - measure is then equal to
As $A_{T^{*}}=\frac{A_{T^{*}}}{P(T^{*},T^{*})}$ , we focus on the dynamics of $d\frac{A_{t}}{P(t,T^{*})}$ in order to prove that this ratio is a log-normal process.
Proposition 3. The quantity $\frac{A_{t}}{P(t,T^{*})}$ has a log-normal distribution, $\ln \frac{A_{T^{*}}/P(T^{*},T^{*})}{A_{t}/P(t,T^{*})}$ ∼ $N\left (\mu _{\mathbb{F}}(t),v_{\mathbb{F}}(t)\right )$ , with a time dependent variance equal to
and a time dependent mean given by
The proof is provided in Appendix A, while the full analytical expressions of $v_{\mathbb{F}}(t)$ and $\mu _{\mathbb{F}}(t)$ are detailed in Appendix C. As $\frac{A_{t}}{P(t,T^{*})}$ is log-normal under the forward measure, we can evaluate the call options embedded in the GMAB.
Corollary 1. Let us denote by $\Phi (.)$ is the cumulative distribution function (cdf) of a $N(0,1)$ . If we define
then expected positive differences between $A_{T^{*}}$ , $G_{T^{*}}$ , and $G_{M}$ under the forward measure are given by
This corollary is proven using standard Black & Scholes developments. This last result allows us to infer that the market value of the GMAB (11) is given by
where $\,_{T^{*}}E_{t}^{\mu }$ is given by Equation (5). This closed form is used to measure the accuracy of the PINN for predicting GMAB prices.
5. Scaled Feynman–Kac equation
A PINN integrates principles from physics, including PDEs governing the behavior of state variables. In a financial context, we adopt the Feynman–Kac (FK) equation as our guiding principle. The FK equation is a PDE satisfied by all assets traded in an arbitrage-free market. In this section, we construct this equation for the model under consideration in this article. In the following section, we utilize a neural network to solve it for various investment strategies, contract maturities, and bond maturities. To lighten future developments, we denote the vector of state variables, augmented with the total asset value, by
Under the risk neutral measure $\mathbb{Q}$ , this multivariate process is ruled by the following SDE:
where $\boldsymbol{\mu} _{\boldsymbol{y}}(.)$ is a vector of dimension 5 and $\Sigma _{\boldsymbol{y}}(.)$ is a $5\times 4$ matrix:
The GMAB is a function of time and state variables parameterized by the investment policy $\boldsymbol{\pi }$ , the duration of the contract $T^{*}$ , and the maturity of bonds, $T$ . As we aim to understand the relation between assets and liabilities, we emphasize the dependence to investment parameters by denoting the contract price as follows:
Under the assumption of arbitrage-free market, all traded securities, including the insurance contract, earn on average the risk-free rate, i.e. :
Let us, respectively, denote the gradient and the Hessian of $V(.)$ with respect to $\boldsymbol{y}$ by $\nabla _{\boldsymbol{y}}V$ and $\mathcal{H}_{\boldsymbol{y}}(V)$ :
Applying the Itô’s lemma to $V\left (.\right )$ allows us to rewrite (19) as a partial differential valuation equation:
This last expression is called the “Feynman–Kac (FK) equation”. The boundary constraints on $V(.)$ are
Notice that we have 5 state variables and 4 Brownian motions. The model is, therefore, over-specified, and we can remove one state variable from $\boldsymbol{y}$ , for example, the stock price, which is redundant with respect to the total asset value, $A_{t}$ . Nonetheless, we have retained all state variables in the numerical illustration to observe how the PINN handles this naive over-specification.
Due to the numerous risk factors involved, solving numerically the PDE (20) is numerically challenging. Implementing either an explicit or implicit finite difference method necessitates constructing a 6-dimensional grid in the spacetime of time and state variables, with relatively fine meshes to ensure convergence. In practice, this becomes intractable. As an alternative, we approximate the solution to the FK Equation (20) using a neural network that takes as input the time, state variables, asset mix, and durations of GMAB and bonds. The mathematical definition of a neural network will be revisited in the following section, but it is worth noting that a neural network employs bounded activation functions, such as sigmoid or hyperbolic tangents. To mitigate convergence issues associated with the vanishing gradient problem, we must standardize and scale the network’s inputs. This preprocessing step slightly modifies the FK equation. Let us define the vectors $\boldsymbol{a},\boldsymbol{b}\in \mathbb{R}^{5}$ as follows
$\boldsymbol{a}$ and $\boldsymbol{b}$ are chosen to normalize a random sample of state variables. This point is discussed in the next section. The vector of centered and scaled state variables is denoted by
where $\odot$ is the elementwise product. In a similar manner, we center and scale the time with coefficients $a_{h}$ and $b_{h}$ that will depends on the horizon of valuation. The centered scaled time and durations are defined as follows:
We do not need to scale the vector $\boldsymbol{\pi }\in [0,1]^{3}$ . The normalized state variables are ruled by the SDE:
where $\mu _{\tilde{y}}(.)$ is a $5\times$ vector and $\Sigma _{{\tilde{\boldsymbol{y}}}}(.)$ is a 5 $\times$ 4 matrix:
The value of the contract may be seen as a function of time and rescaled state variables, $L_{t}=V(\tilde{t},\tilde{\boldsymbol{y}}_{t},N_{t}^{\mu },|\,\boldsymbol{\pi },\tilde{T}^{*},\tilde{T})$ . The valuation Equation (20) is then rewritten in rescaled form:
where $\nabla _{{\tilde{\boldsymbol{y}}}}V$ and $\mathcal{H}_{{\tilde{\boldsymbol{y}}}}(V)$ are, respectively, the gradient and the Hessian of $V$ with respect to standardized state variables, ${\tilde{\boldsymbol{y}}}$ . The rescaled version of boundary conditions (21) is
The next section explains how to solve Equation (25) with a neural network. Notice that the stock and total asset values exhibit geometric dynamics. An alternative formulation of the FK equation may be derived by considering log-stock and log-asset prices as state variables. This has the advantage of partially reducing the nonlinearity of the FK PDE. However, in the existing literature, such a transformation is not applied and does not adversely affect the accuracy of PINNs. For instance, Sirignano & Spiliopoulos (Reference Sirignano and Spiliopoulos2018), Glau & Wunderlich (Reference Glau and Wunderlich2022), or Gatta et al. (Reference Gatta, Di Cola, Giampaolo and Piccialli F.Cuomo2023) evaluate European or American basket options by solving the Feynman–Kac equation in a multivariate geometric market without prior log-transform.
In practice, a log-transform is deemed unnecessary because the partial derivatives of the approximated GMAB prices are computed precisely through automatic differentiation, not finite differences. This ensures the robustness of the valuation of the Feynman–Kac equation.
6. Neural networks
We approximate the value function solving Equation (25) using a particular type of neural network. This network takes as input a vector of dimension 11, denoted by ${\boldsymbol{z}}$ , containing the scaled time and state variables, the asset mix vector, and scaled contract and bond maturities:
There is no systematic procedure for determining the optimal structure of a neural network. As proven by Hornik (Reference Hornik1991), we know that single-layer neural networks can approximate regular functions arbitrarily well, but achieving reasonable accuracy may require a high number of neurons. An alternative approach is provided by feed-forward networks, where information flows forward through several neural layers. However, these networks often struggle to replicate non-linear functions with a limited number of layers.
To address this challenge, we adopt a network with an architecture illustrated in Fig. 1 This network is a basic version of recurrent neural networks. During the valuation of the network response, the outputs of neurons in hidden layers, added to the initial data vector, serve as inputs for the immediate next layer. The long short-term memory (LSTM) networks are built on the same principle and are powerful tools for time-series prediction (see, e.g., Denuit et al., Reference Denuit, Hainaut and Trufin2019, Chapter 8). The deep Galerkin method (DGM) network, initially proposed by Sirignano & Spiliopoulos (Reference Sirignano and Spiliopoulos2018) for PINN pricing of basket options, belongs to the family of LSTM models. Recurrent neural networks better replicate non-linear functions than classical feed-forward networks. The same network as the one proposed in this article is successfully applied in Hainaut & Casas (Reference Hainaut and Casas2024) for pricing European options in the Heston model.
The network used in this work presents the same drawbacks as a classical recurrent architecture. A model with too many layers becomes untrainable due to the phenomenon of the “vanishing gradient,” as revealed by Hochreiter et al. (Reference Hochreiter, Bengio, Frasconi and Schmidhuber2001). The lagged responses from hidden neural layers pass through several intermediate activation functions, having an indirect feedback effect on the final response of the network. This effect may become marginal when considering too many intermediate layers, rendering the network untrainable.
Definition Let $l$ , $n_{0}$ , $n_{1}$ , …, $n_{l}\in \mathbb{N}$ be, respectively, the number of layers and neurons in each layer. $n_{0}$ is also the size of input vector. The activation function of layer $k=1,2,\ldots,l$ is noted $\phi _{k}(.)\,{:}\,\mathbb{R}\to \mathbb{R}$ . Let $C_{1}\in \mathbb{R}^{n_{1}}\times \mathbb{R}^{n_{0}}$ , ${\boldsymbol{c}}_{1}\in \mathbb{R}^{n_{1}}$ , $C_{k}\in \mathbb{R}^{n_{k}}\times \mathbb{R}^{n_{0}+n_{k-1}}$ , ${\boldsymbol{c}}_{k}\in \mathbb{R}^{n_{k}}$ for $k=2,\ldots,l-1$ , $C_{l}\in \mathbb{R}^{n_{l}}\times \mathbb{R}^{n_{l-1}}$ , ${\boldsymbol{c}}_{l}\in \mathbb{R}^{n_{l}}$ be neural weights defining the input, intermediate, and output layers. We define the following functions
where activation functions $\phi _{k}(.)$ are applied componentwise. The residual neural network is a function $F\,{:}\,\mathbb{R}^{n_{0}}\to \mathbb{R}^{n_{l}}$ defined by
After having chosen a network architecture, the model is trained by minimizing a loss function that is proportional to errors of approximation. This error is measured by replacing $V(.)$ with $F(.)$ in the scaled FK Equation (25), at random points in the domain.
At time $t\in [0,T^{*}]$ , the domain of the state vector $\boldsymbol{y}_{t}=\left (S_{t},r_{t},\mu _{x_{\mu }+t},\lambda _{x_{\mu }+t},A_{t}\right )^{\top }$ is $\,\mathbb{R}^{+}\times \mathbb{R}\times \mathbb{R}^{+\,3}$ . We approximate this domain by a closed convex subspace:
In order to fit the neural network, we draw a sample of $n_{\mathcal{D}}$ realizations of $\boldsymbol{y}_{t}$ in $\mathcal{D}^{\boldsymbol{y}}$ , at random times. We also sample parameters $\left (\boldsymbol{\pi }_{j},T_{j}^{*},T_{j}\right )_{j=1,\ldots,,n_{\mathcal{D}}}$ under the constraints:
The set of sampled state variables and parameters is noted $\mathcal{S}_{\mathcal{D}}=\left (t_{j},\boldsymbol{y}_{j},\pi _{j},T_{j}^{*},T_{j}\right )_{j=1,\ldots,n_{\mathcal{D}}}$ . We next center and scale state variables and times. The mean and standard deviation of sampled state variables are computed with the standard formulas:
The scaling vectors (22) are defined by $\boldsymbol{a}=-\frac{\bar{\boldsymbol{y}}}{{\boldsymbol{S}}_{\boldsymbol{y}}}$ and $\boldsymbol{b}=\frac{1}{{\boldsymbol{S}}_{\boldsymbol{y}}}$ . For $1,\ldots,n_{\mathcal{D}}$ , we define $\tilde{\boldsymbol{y}_{j}}=\boldsymbol{a}+\boldsymbol{b}\odot \boldsymbol{y}_{j}\,$ . The times, contract, and bond maturities are also rescaled with relations (24) and coefficients
The set of sampled normalized state variables and parameters is denoted by
During the training phase, the error of approximation is measured for all points of this sample sets with the scaled FK Equation (26). The error at expiry is measured with another sample set, denoted by $\mathcal{\tilde{S}}_{T^{*}}$ , of $n_{T^{*}}$ realizations of state variables and parameters:
Let us denote by $\boldsymbol{\Theta },$ the vector containing all neural weights $\left (C_{k},{\boldsymbol{c}}_{k}\right )_{k=1,\ldots,l}$ . At points of $\tilde{\mathcal{S}}_{\mathcal{D}}$ , we define for $j=1,\ldots,n_{\mathcal{D}}$ , the error in Equation (25) when $V$ is replaced by the neural network as follows:
We refer to $e_{j}^{\mathcal{D}}$ as the error on the interior domain since it measures the goodness of fit before expiry. The average quadratic loss on $\tilde{\mathcal{S}}_{\mathcal{D}}$ is the first component of the total loss function used to fit the neural network,
Since the error $e_{j}^{\mathcal{D}}(\boldsymbol{\Theta })$ depends on first- and second-order partial derivatives, special attention must be given to the accuracy of their calculation. Computing these derivatives using a standard finite difference method may introduce numerical instabilities. Therefore, we opt for an alternative approach known as automatic or algorithmic differentiation. Automatic differentiation leverages the fact that every computer calculation executes a sequence of elementary arithmetic operations and elementary functions, allowing us to compute partial derivatives with accuracy up to the working precision. Automatic differentiation is implemented in TensorFlow through functions such as GradientTape(.) and tape.gradient(.). For a more in-depth introduction and perspectives, we refer the reader to van Merriënboer et al. (Reference van Merriënboer, Breuleux and Bergeron2018).
On the boundary sample set $\mathcal{\tilde{S}}_{T^{*}}$ , we define the error $e_{k}^{T^{*}}(\boldsymbol{\Theta })$ for $k=1,\ldots,n_{T^{*}}$ , as the difference between the output of the neural network and the payoff at expiry:
The average quadratic loss at expiry is the second component of the total loss function.
The optimal network weights are found by minimizing the losses in $\tilde{\mathcal{S}}_{\mathcal{D}}$ and $\mathcal{\tilde{S}}_{T^{*}}$ ,
In practice, the minimization is performed with a gradient descent algorithm. For an introduction, we refer, e.g., to Denuit et al. (Reference Denuit, Hainaut and Trufin2019), Section 1.6.
7. Numerical illustration
We fit a Nelson–Siegel model to the Belgian state yield curveFootnote 1 on the 1/9/23. Initial survival probabilities, $\,_{t}p_{x}^{\mu }$ and $\sigma _{\mu }(t)$ , are fitted to male Belgian mortality rates.Footnote 2 Details are provided in Appendix D. Parameters of survival probabilities $\,_{t}p_{x}^{\lambda }$ , of the reference population for the mortality bond, are assumed proportional to these of $\,_{t}p_{x}^{\mu }$ . Reference ages, $x_{\mu }$ and $x_{\lambda }$ , are set to $50$ years. Other market parameters are estimated from daily time series of the Belgian stock index BEL 20 and of the 1 year Belgian state yield from the 1/9/10 to the 1/9/23. The correlations $\rho _{S\,\mu }$ and $\rho _{r\,\mu }$ are set to –5% and 0%. Parameter estimates are reported in Table 1.
We first build a sample set, $\mathcal{S_{D}}$ , of contract features, asset-mixes, and durations. We randomly draw $n_{\mathcal{D}}=100,000$ combinations of risk factors and investment-duration parameters. Most of these quantities are drawn from uniform distributions whose characteristics are reported in Table 2. The contract and bond maturities are drawn from exponential distributions to ensure a smoother allocation of durations than what would be obtained with uniform distributions. As $\pi _{S}$ , $\pi _{P}$ , and $\pi _{D}$ are randomly distributed in $[0,1]$ , we allow for short positions in cash. Using the same distributions as for $\mathcal{S_{D}}$ , we generate a boundary sample set $\mathcal{S}_{T^{*}}$ of size, $n_{T^{*}}=50,000$ . The last step consists in scaling and centering the sample sets, as described in the previous section. Notice that setting $G_{m}=100$ is not constraining. As the payoff is piecewise linear, we can use the rule of thumb to evaluate any contract with another minimum capital. If $V(A_{t},G_{m})$ is the GMAB price at time $t$ , for an asset value $A_{t}$ and a guarantee $G_{m}$ , the value of a contract with a guaranteed capital $G_{m}^{'}\neq G_{m}$ is equal to:
The neural network is implemented in TensorFlow using Python. As mentioned earlier, the necessary partial derivatives for evaluating the inner loss are computed through automatic differentiation. This procedure is implemented in TensorFlow functions like GradientTape(.) and tape.gradient(.), allowing for accurate derivative calculations without relying on finite differences. To train the network, we employ the ADAM optimizer and utilize random batches containing 34,000 and 17,000 samples from $\mathcal{\tilde{S}_{D}}$ and $\mathcal{\tilde{S}}_{T^{*}}$ , respectively. Over the course of one epoch, neural weights are updated three times. We run a total of 26,000 epochs and adjust the ADAM learning rate based on the pattern provided in Table 3, to expedite calibration. The training is conducted on a laptop equipped with an AMD Ryzen 7 5825U processor and 16 Gb of RAM (without a CUDA compatible GPU). The training time depends on the network configuration. As indicated in Table 4, running 200 epochs takes approximately 12–18 minutes for networks with 3 and 5 hidden layers with 32 neurons, respectively. The entire calibration process spans from 26 to 40 hours. Choosing 32 neurons per hidden layer strikes a good balance between accuracy and training time. Tests conducted with 16 neurons over 8000 epochs revealed that this configuration performed less effectively compared to the 32-neuron networks.
Table 4 reports the losses on $\mathcal{\tilde{S}_{D}}$ , $\mathcal{\tilde{S}}_{T^{*}}$ and their total, computed with Equations (29) and (30). We observe a significant reduction of losses between 3 and 4 hidden layers models with 32 neurons. The reduction is relatively smaller when we switch from a 4 to a 5 layers model.
Table 5 presents statistics about relative errorsFootnote 3 between PINN and exact prices obtained with formula (18). These statistics are computed for the 100,000 contracts in the training data set $\tilde{\mathcal{S}}_{\mathcal{D}}$ and for a validation set of same size and built in the same manner. The similarity of figures on both sets does not reveal that we overfit the training dataset. The average relative errors are small and around 0.15% for 4 and 5 hidden layers models. The 95% confidence level is also below 1% for these networks which is quite acceptable.
In the remainder of this section, we focus on the network with 4 hidden layers as it yields the lowest relative errors on training and validation sets. Fig. 2 compares exact and approximated PINN prices for an 8-year GMAB. The upper plots show the values at issuance and at expiry for various values of the total asset. Market conditions are as of September 1, 2023. Across a wide range of asset values, the exact and PINN prices closely align. The largest deviations are observed for the lowest and highest values of $A_{t}$ . The lower plots display the value at issuance of the same contract with $A_{0}\in \{90,100\}$ for $r_{0}$ ranging from 0% and 8%. We observe that the spreads between true and approximated prices depend on both parameters. However, it is noteworthy that the error remains under 1% in most cases.
To understand the circumstances in which relative errors are the highest, we analyze them by subgroups of contracts. In order to have a sufficient number of representatives in each sub-category, we generate a third validation set, this time consisting of 500,000 contracts instead of 100,000. Table 6 presents the averages and standard deviations of relative errors for 20 classes of contracts grouped by asset values. We observe that the highest errors occur with the lowest asset values. The left heatmap in Fig. 3 displays relative errors by categories of asset values and residual contract maturities, $T^{*}-t$ . We draw two conclusions from this graph. First, for $A_{t}\leq 80$ , the error increases with the residual contract duration and may be significant. Second, the errors in the usual pricing range, around the guarantee level $G_{m}=100$ , remain close to 0.01% regardless of the time horizon.
It is worth noting that for low asset values, GMAB is deeply out of the money, and the contract price is very close to that of a pure endowment for which we have a closed-form expression. One way to improve accuracy would be to add a third penalty term to the loss function (31). This term would penalize the quadratic spread between PINN and endowment prices for low asset values. However, we deliberately avoided this solution to observe how the network behaves when provided with minimal information.
Table 7 presents the means and standard deviations of errors for 10 groups of contracts, categorized by ranges of interest rates, $r_{t}$ . Regardless of the category, the average relative error remains close to 0.14%. The right heatmap in Fig. 3 suggests that, on average, errors increase with maturity. However, this observation should be qualified by the noticeable variability of errors between consecutive interest rate classes for the same maturity. This variability indicates that the increase in errors is partly attributed to other factors, such as the low asset values present within each of the interest rate classes.
Tables 8 and 9 provide information about errors for contracts categorized by classes of mortality rates. Similar to interest rates, the average error remains small and close to 0.14% across all class. The heatmaps of Fig. 4 show a slight increase in errors with the contract time to expiry. Once again, we observe a higher variability of errors for longer maturities. As mentioned earlier, this variability signals that the increase in errors is influenced by factors beyond mortality rates.
Tables 10 and 11 provide statistics on relative errors for contracts categorized by fractions of stocks and bonds (including interest rate and mortality bonds). The left plot in Fig. 5 displays a heatmap of errors based on these two parameters. Interestingly, we do not observe any particular trend, and the pricing error remains acceptable regardless of the asset mix. This indicates that the PINN can be used for ALM purposes, particularly in approximating the impact of changes in asset mix on contract fair values.
Nevertheless, the right heatmap of Fig. 5 reveals that the error increases with the time to expiry. To illustrate this, we compare the exact and PINN prices of a 5- and 8-year contract in Fig. 6, where $\pi _{S}$ ranges from 0 to 1 and $\pi _{D}=\pi _{M}=\frac{1-\pi _{S}}{2}$ . This graph shows that the PINN captures the trend of the relationship between prices and the fraction of stocks, but the error may reach 1% for certain combinations of parameters and risk factors. While this error may appear relatively high, it should be considered in the context of existing alternatives. For instance, in risk management, a common approach to valuing a contract at a future date relies on the least aquare Monte-Carlo (LSMC) method. As demonstrated in Hainaut & Akbaraly (Reference Hainaut and Akbaraly2023) for a similar contract, the average valuation error often exceeds 1% and can even surpass 3% in extreme cases. Replacing LSMC with the PINN in such a context would clearly reduce the overall valuation error.
We conclude from the error analysis that the average accuracy of a PINN is sufficient for risk management and ALM purposes. The significant advantage of using this approach is the substantial time savings once the calibration is completed. Computing prices for a sample set of 500,000 contracts, each with varying durations, asset mixes, and market conditions, takes less than 45 seconds. In contrast, evaluating the same portfolio using a standard MC method would require a considerable amount of time because we would need to regenerate a sufficient number of sample paths for each initial combination of risk factors. The PINN model is capable of pricing an impressive range of contracts, and with additional computing power, it is likely possible to further improve both accuracy and the diversity of contracts.
8. Conclusions
In this work, we explore the capability of PINNs to price GMABs. The first part of the article introduces market and biometric risk factors within a Gaussian framework. We also develop an analytical formula for GMAB pricing to serve as a benchmark for the PINN. Next, we propose a scaled version of the Feynman–Kac (FK) equation, which helps mitigate the vanishing gradient phenomenon during training. An approximate solution is constructed using a neural network with skip connections. We conclude with a detailed analysis of pricing errors.
This study highlights several advantages of PINNs. First, they excel in evaluating products exposed to multiple risk factors. Alternative procedures, such as those based on finite differences or FFT, become challenging and time-consuming when dealing with more than two risk factors. Second, PINNs offer parameterization options, allowing flexibility in incorporating contract features such as duration or asset mix. This flexibility sets them apart from other methods that require retraining for each unique combination of contract features. Third, once trained, PINNs provide nearly instantaneous valuation. We can appraise a portfolio of 500,000 GMABs with various market conditions and features in less than a minute. In contrast, performing the same exercise with a MC method would entail simulating sample paths for each initial risk factor combination.
However, PINNs also have some drawbacks. First, there is no systematic procedure to determine the optimal network architecture and training set size. After several attempts, we opted for a network with direct connections between input and hidden layers. Second, the training time can be lengthy, although parallelization on GPUs may help reduce it. On a positive note, updating the network to adapt to new market conditions is expected to be less time-consuming when initialized with neural weights obtained after the initial training.
Numerical results are undeniably encouraging. Within the typical range of risk factors, a PINN with four intermediate layers approximates prices with a relative error smaller than ten basis points. Furthermore, the PINN effectively captures the relationship between GMAB prices, risk factors, and asset mix. Its accuracy is sufficient to conduct sensitivity analyses on contract specifications and test various asset allocation strategies.
However, we do observe non-negligible errors for very low asset values. In these cases, the embedded call option in the GMAB is deeply out of the money, and the contract price closely resembles that of a pure endowment. Additionally, we notice that errors tend to slightly increase with the contract’s maturity. While errors may appear relatively high in certain configurations, it is crucial to consider these results in the context of existing alternatives. For example, in risk management, the common approach for valuing contracts at a future date relies on the LSMC method, which is known to be significantly less accurate (as demonstrated in Hainaut & Akbaraly, Reference Hainaut and Akbaraly2023).
This article lays the groundwork for future experiments. First, errors for deep out-of-the-money contracts can be reduced by introducing a third penalty term into the loss function (31). As mentioned earlier, in such cases, the GMAB value is nearly identical to the value of a pure endowment, which can be calculated analytically. Therefore, we can include a term in the loss function that penalizes the quadratic spread between PINN and endowment prices when the asset value falls below a certain threshold. We deliberately avoided this solution initially to assess how the network performs with minimal information. Second, it would be interesting to expand the range of parameters and contract features to generalize the procedure. Lastly, we have not explored all possible neural architectures, and there may be room for optimization to enhance accuracy and reduce computing time.
Data availability statements
The data and code that support the findings of this study are available from the corresponding author (DH) upon reasonable request.
Financial support
The first author thanks the FNRS (Fonds de la recherche scientifique) for the financial support through the EOS project Asterisk (research grant 40007517).
Competing interests
The author declares none.
Appendix A
The next proposition enable us to infer the expressions of $\gamma _{r}(t)$ and $\gamma _{z\in \{\mu,\lambda \}}(t)$ matching the initial term structures of interest and mortality rates.
Proposition 4. At time $0\leq t\leq T$ , the value of the discount bond of maturity $T$ is equal to
The survival probability up to time $T$ for $z\in \{\mu,\lambda \}$ , is given by
The pure endowments admit the following expressions:
and
Sketch of the proof
We can show by direct differentation that interest and mortality rates are equal to
The integrals of interest and mortality rates are obtained by direct integration
The results follow from the log-normality of $e^{-\int _{t}^{T}z_{s}ds}$ for $z\in \{r,\mu,\lambda \}$ , from $\epsilon _{rr}^{2}+\epsilon _{r\mu }^{2}+\epsilon _{r\lambda }^{2}=1$ and $\epsilon _{\mu \mu }^{2}+\epsilon _{\mu \lambda }^{2}=1$ .
end
From Equation (A1), we deduce that the function $\gamma _{r}(u)$ must satisfy the next relation to match the initial yield curve of zero-coupon bond:
Deriving twice this expression leads to the following useful reformulation of $\gamma _{r}(T)$ :
where $-\partial _{T}\ln P(0,T)$ is the instantaneous forward rate. For a given initial mortality curve $\,_{T}p_{x_{z}}^{z}$ , $z\in \{\mu,\lambda \}$ , we show in a similar manner that the function $\gamma _{z}(u)$ satisfies the relation
Equations (A8) and (A9) allows us to rewrite bond prices, survival probabilities and endowments as function of initial term structures of mortality and interest rates. Analytical expressions are provided in Proposition 1 the proof is summarized below.
Proposition 1: sketch of the proof
By direct integration of Equations (A8) and (A9), we obtain that
and
Combining these expressions with these of Proposition 4.
end
Proposition 2 : sketch of the proof
Let us denote the mortality indexed discount factor by $\,_{T}m_{t}^{\lambda }=\mathbb{E^{Q}}\left (e^{-\int _{t}^{T}\left (r_{s}+\lambda _{x_{\lambda }+s}\right )ds}|\mathcal{F}_{t}\right )$ such that $D(t,T)=e^{-\int _{0}^{t}\lambda _{x_{\lambda }+s}ds}\,_{T}m_{t}^{\lambda }$ . The differential of $D(t,T)$ then equal to
From Proposition 5, we infer that the mortality indexed discount factor is ruled by the SDE:
From Equation (A10), we infer the dynamics under $\mathbb{Q}$ , Equation (8). As $d\boldsymbol{W}_{t}=d\tilde{\boldsymbol{W}}_{t}+\boldsymbol{\theta }dt$ with
we obtain Equation (9).
end
The next result presents the dynamics of the discount bond and endowment under the pricing measure. This is a direct consequence of the Itô’s lemma applied to Proposition 4.
Proposition 5. Under the risk neutral measure $\mathbb{Q}$ , the dynamics of the zero-coupon bond and of the pure endowment at time $t\leq T$ are given by
As $\mathbb{E}^{\mathbb{Q}}\left (d\textbf{1}_{\{\tau _{\mu }\geq t\}}\right )=-\mu _{x+t}dt$ for, we check that the pure endowment has a return equal to the risk-free rate: $\mathbb{E}^{\mathbb{Q}}\left (d\,_{T}E_{t}^{\mu }\right )={}_{T}E_{t}^{\mu }\,r_{t}\,dt$ .
Proposition 3: sketch of the proof
From the Itô’s lemma, the dynamic of $\frac{1}{P(t,T^{*})}$ under the risk neutral measure $\mathbb{Q}$ is equal to
From Equation (13), we infer that under $\mathbb{F}$ , this ratio is ruled by
On the other hand, the differential of $\frac{A_{t}}{P(t,T^{*})}$ is related to $dA_{t}$ and $d\left (1/P(t,T^{*})\right )$ by the relation:
From Equations (A12) and (14), we rewrite this last expression as follows:
Applying the Itô’s lemma to $\ln \left (A_{t}/P(t,T^{*})\right )$ , leads to the following result:
This last equation emphasizes that $\ln \frac{A_{T^{*}}/P(T^{*},T^{*})}{A_{t}/P(t,T^{*})}\sim N\left (\mu _{\mathbb{F}}(t),v_{\mathbb{F}}(t)\right )$ .
end
Appendix B
This appendix provides various integrals used for calculating analytical prices. For $t\leq T^{*}$ , $T^{*}\leq T$ , $T^{*}\leq S$ and $y\in \mathbb{R}^{+}$ , we have the following expressions:
The combined integrals of $B_{y}(t,T)$ , $\sigma _{\mu }(t)$ and $\sigma _{\lambda }(t)$ are:
Appendix C
This appendix, provides the full analytical expressions of the variance and mean of the lognormal random process, $\ln \frac{A_{T^{*}}/P(T^{*},T^{*})}{A_{t}/P(t,T^{*})}$ (see Proposition 3). The variance $v_{\mathbb{F}}(t)$ is developed as the following sum:
The integral in the second term of the right hand side is a $3\times 4$ matrix equal to
The integral in the third term of the right hand side of Equation (3) is a symmetric $3\times 3$ matrix:
with the following entries:
The mean function, $\mu _{\mathbb{F}}(t)$ , is developed as follows
The integrals in the third and fourth terms of Equation (C4) are respectively equal to
and
The integrals in Equations (C2), (C3), (C5) and (C6) are detailed in Appendix B.
Appendix D
In the numerical illustration, we model the initial yield curve with the Nelson-Siegel (NS) model. In this framework, initial instantaneous forward rates are provided by the following function:
Parameters $\{b_{0},b_{10},b_{11},c_{1}\}$ are estimated by minimizing the quadratic spread between market and model zero-coupon yields:
We fit the NS model to the yield curve of Belgian state bonds observed on the first of September 2023 and obtain estimates reported in Table D.1.
The volatility of mortality rates is fitted by least square minimization of spreads between $\sigma _{x}(.)$ and empirical deviations of variations of mortality rates by cohort (ages between 20 and 90 years from 1950 to 2020). If $\mu _{x}^{(y)}$ is the observed mortality rates at age $x$ during the calendar year $y$ , we denote by $\Delta \mu _{x}^{(y)}=\mu _{x}^{(y)}-\mu _{x-1}^{(y-1)}$ and by $S_{x}$ the standard deviation of $\Delta \mu _{x}^{(y)}$ for $y$ =1950 to 2020. The $\alpha _{\mu }$ and $\beta _{\mu }$ are obtained by minimizing the sum
On the other hand, the initial curve of survival probabilities is described by a Makeham’s model, i.e.
where $a^{(\mu )},b^{(\mu )},c^{(\mu )}\in \mathbb{R}^{+}$ . These parameters and the reversion speed $\kappa _{u}$ are obtained by least square minimization of spreads between prospective and model survival probabilities. Prospective survival probabilities are computed with a Lee-Carter model fitted to Belgian mortality rates from 1950 to 2020 for 0 to 105 years, male population. Model $\,_{t}p_{x}$ are computed with Equation (25) for $x=20$ years old. Estimated parameters are provided in Table D.2.