
Rationale and Applications of Survival Tree and Survival Ensemble Methods

Published online by Cambridge University Press: 01 January 2025

Yan Zhou*
Affiliation:
University of California, Los Angeles
John J. McArdle
Affiliation:
University of Southern California
* Correspondence should be sent to Yan Zhou, Mary S. Easton Center for Alzheimer’s Disease Research, Department of Neurology, University of California, Los Angeles, 10911 Weyburn Avenue, Suite 200, Los Angeles, CA 90095, USA. E-mail: YanZhou@mednet.ucla.edu

Abstract

Classification and Regression Trees (CART) and their successors, bagging and random forests, are statistical learning tools that are receiving increasing attention. However, due to characteristics of censored data collection, standard CART algorithms are not immediately transferable to the context of survival analysis. Questions about the occurrence and timing of events arise throughout the psychological and behavioral sciences, especially in longitudinal studies. The predictive power and other key features of tree-based methods are promising in studies where an event occurrence is the outcome of interest. This article reviews existing tree algorithms designed specifically for censored responses as well as recently developed survival ensemble methods, and introduces available computer software. Through simulations and a practical example, merits and limitations of these methods are discussed. Suggestions are provided for practical use.

Type
Original Paper
Copyright
Copyright © 2014 The Psychometric Society

Footnotes

Electronic Supplementary Material The online version of this article (doi:10.1007/s11336-014-9413-1) contains supplementary material, which is available to authorized users.
