
TREE-BASED MACHINE LEARNING METHODS FOR MODELING AND FORECASTING MORTALITY

Published online by Cambridge University Press: 20 May 2022

Dorethe Skovgaard Bjerre
Affiliation:
CREATES and Department of Economics and Business Economics, Aarhus University, Fuglesangs Allé 4, 8210 Aarhus V, Denmark. E-mail: dorethebjerre@econ.au.dk

Abstract

Machine learning has recently entered the mortality literature as a way to improve the forecasts of stochastic mortality models. This paper proposes two pure tree-based machine learning models, random forests and gradient boosting, based on the differenced log-mortality rates, to produce more accurate mortality forecasts. These forecasts are compared with forecasts from traditional stochastic mortality models and with forecasts from random forest and gradient boosting variants of the stochastic models. The comparisons are based on the Model Confidence Set procedure. The results show that the pure tree-based models significantly outperform all other models in the majority of cases considered. To address the lack of interpretability associated with machine learning models, we demonstrate how to extract information about the relationships uncovered by the tree-based models. For this purpose, we consider variable importance, partial dependence plots, and variable split conditions. Results from the in-sample fit suggest that tree-based models can be very useful tools for detecting patterns within and between variables that are not commonly identifiable with traditional methods.
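As a rough illustration of the target variable named in the abstract, the sketch below computes differenced log-mortality rates from a small table of rates and then turns predicted differences back into rate forecasts. The data are synthetic, the function names are hypothetical, and the per-age mean difference is only a placeholder predictor standing in for a tree-based model; none of this is taken from the paper itself.

```python
import math


def differenced_log_rates(rates):
    """For each age row, return log m[x, t] - log m[x, t-1] for t >= 1."""
    diffs = []
    for row in rates:
        logs = [math.log(m) for m in row]
        diffs.append([logs[t] - logs[t - 1] for t in range(1, len(logs))])
    return diffs


def forecast_from_differences(last_rate, predicted_diffs):
    """Cumulate predicted differences onto the last observed log rate."""
    log_m = math.log(last_rate)
    forecasts = []
    for d in predicted_diffs:
        log_m += d
        forecasts.append(math.exp(log_m))
    return forecasts


# Synthetic rates (assumption): rows are ages, columns are calendar years.
rates = [
    [0.0100, 0.0098, 0.0095],
    [0.0200, 0.0197, 0.0191],
]

diffs = differenced_log_rates(rates)

# Placeholder predictor (assumption): the mean historical difference per age.
# A random forest or gradient boosting model would supply these values instead.
mean_diffs = [sum(d) / len(d) for d in diffs]

# Two-step-ahead forecasts for each age.
forecasts = [
    forecast_from_differences(row[-1], [mu] * 2)
    for row, mu in zip(rates, mean_diffs)
]
```

Modeling the differences rather than the levels means a forecast is obtained by adding predicted changes to the last observed log rate, which is what `forecast_from_differences` does.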

Type
Research Article
Copyright
© The Author(s), 2022. Published by Cambridge University Press on behalf of The International Actuarial Association

