A Nonparametric Test of Missing Completely at Random for Incomplete Multivariate Data

Published online by Cambridge University Press: 01 January 2025

Jun Li*
Affiliation:
University of California, Riverside
Yao Yu
Affiliation:
University of California, Riverside
*Correspondence should be sent to Jun Li, Department of Statistics, University of California, Riverside, Riverside, CA 92521, USA. E-mail: jun.li@ucr.edu

Abstract

Missing data occur in many real-world studies. Knowing the type of missing-data mechanism is important for adopting an appropriate statistical analysis procedure. Many statistical methods assume that data are missing completely at random (MCAR) because of the assumption's simplicity. It is therefore necessary to test whether this assumption is satisfied before applying those procedures. In the literature, most procedures for testing MCAR were developed under a normality assumption, which is sometimes difficult to justify in practice. In this paper, we propose a nonparametric test of MCAR for incomplete multivariate data that does not require distributional assumptions. The proposed test is carried out by comparing the distributions of the observed data across different missing-pattern groups. We prove that the proposed test is consistent against any distributional differences in the observed data. Simulations show that the proposed procedure controls the Type I error well at the nominal level for testing MCAR and has good power against a variety of non-MCAR alternatives.
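The idea of comparing observed data across missing-pattern groups can be illustrated with a minimal sketch. It is not the authors' test statistic, which this page does not give. It assumes missing values are coded as `None`, compares only two pattern groups on the variables both observe, and swaps in a two-sample energy distance with a permutation p-value as a stand-in for the distributional comparison. All function names (`group_by_pattern`, `shared_columns`, `energy_distance`, `permutation_p_value`) are hypothetical.

```python
import math
import random
from collections import defaultdict


def group_by_pattern(rows):
    """Group rows by missing pattern; None marks a missing value (assumption)."""
    groups = defaultdict(list)
    for row in rows:
        # True where the variable is observed, False where it is missing.
        pattern = tuple(value is not None for value in row)
        groups[pattern].append(row)
    return dict(groups)


def shared_columns(pattern_a, pattern_b):
    """Indices of the variables observed under both patterns."""
    return [j for j, (a, b) in enumerate(zip(pattern_a, pattern_b)) if a and b]


def mean_distance(xs, ys):
    """Average Euclidean distance over all pairs (x, y) with x in xs, y in ys."""
    total = sum(math.dist(x, y) for x in xs for y in ys)
    return total / (len(xs) * len(ys))


def energy_distance(xs, ys):
    """Two-sample energy distance, 2E|X-Y| - E|X-X'| - E|Y-Y'| (V-statistic form)."""
    return 2 * mean_distance(xs, ys) - mean_distance(xs, xs) - mean_distance(ys, ys)


def permutation_p_value(xs, ys, n_perm=199, seed=0):
    """Permutation p-value for the energy distance between two samples."""
    rng = random.Random(seed)
    observed = energy_distance(xs, ys)
    pooled = xs + ys  # new list, so the inputs are not shuffled in place
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(pooled)
        if energy_distance(pooled[:len(xs)], pooled[len(xs):]) >= observed:
            hits += 1
    return (hits + 1) / (n_perm + 1)


def compare_patterns(rows, pattern_a, pattern_b, n_perm=199, seed=0):
    """Compare two missing-pattern groups on their commonly observed variables."""
    groups = group_by_pattern(rows)
    cols = shared_columns(pattern_a, pattern_b)
    xs = [tuple(row[j] for j in cols) for row in groups[pattern_a]]
    ys = [tuple(row[j] for j in cols) for row in groups[pattern_b]]
    return permutation_p_value(xs, ys, n_perm=n_perm, seed=seed)
```

A small p-value from `compare_patterns` would suggest that the two groups differ in distribution on their shared variables, which is evidence against MCAR. A full test would need to combine all pattern groups and handle groups with few observations, which this sketch does not attempt.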

Type: Original Paper

Copyright © 2014 The Psychometric Society
