A General Approach to Categorical Data Analysis with Missing Data, using Generalized Linear Models with Composite Links

Published online by Cambridge University Press: 01 January 2025

David Rindskopf*
Affiliation: City University of New York Graduate Center
* Requests for reprints should be sent to David Rindskopf, Educational Psychology Program, CUNY Graduate Center, 33 West 42nd Street, New York, NY 10036.

Abstract

A general approach for analyzing categorical data when there are missing data is described and illustrated. The method is based on generalized linear models with composite links. The approach can be used, among other applications, to fill in contingency tables with supplementary margins, fit loglinear models when data are missing, fit latent class models (with or without missing data on observed variables), fit models with fused cells (including many models from genetics), and fill in tables or fit models to data when variables are more finely categorized for some cases than others. Both Newton-like and EM methods are easy to implement for parameter estimation.
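
To make the idea of "filling in a contingency table with a supplementary margin" concrete, here is a minimal sketch of a generic EM iteration for a two-way table in which some cases are classified on the row variable only. It is not taken from the paper and does not use the composite-link formulation. It assumes a saturated model for the complete table and ignorable missingness, and the function name `em_fill_table` and the counts in the usage note are hypothetical.

```python
def em_fill_table(full, row_only, n_iter=200, tol=1e-10):
    """EM estimate of cell probabilities for a two-way table when some
    cases are classified on the row variable only.

    full[i][j]  : count of cases observed in row i and column j
    row_only[i] : count of cases observed in row i with the column missing
    Assumes a saturated model and ignorable missingness (illustrative only).
    """
    n_rows, n_cols = len(full), len(full[0])
    total = sum(map(sum, full)) + sum(row_only)
    # Start from uniform cell probabilities.
    pi = [[1.0 / (n_rows * n_cols)] * n_cols for _ in range(n_rows)]
    for _ in range(n_iter):
        # E-step: split each row-only count across the columns in
        # proportion to the current conditional column probabilities.
        expected = []
        for i in range(n_rows):
            row_prob = sum(pi[i])
            expected.append([
                full[i][j]
                + (row_only[i] * pi[i][j] / row_prob if row_prob > 0 else 0.0)
                for j in range(n_cols)
            ])
        # M-step (saturated model): probabilities are expected counts / N.
        new_pi = [[e / total for e in row] for row in expected]
        change = max(abs(new_pi[i][j] - pi[i][j])
                     for i in range(n_rows) for j in range(n_cols))
        pi = new_pi
        if change < tol:
            break
    return pi
```

As a usage example with made-up counts, `em_fill_table([[20, 10], [5, 15]], [10, 10])` returns estimated cell probabilities that sum to one. Under these assumptions the estimate for each cell equals the estimated row proportion times the column proportion observed within that row among fully classified cases, so the result can be checked against that closed form.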

Type: Original Paper
Copyright © 1992 The Psychometric Society

Footnotes

The author thanks the editor, the reviewers, Laurie Hopp Rindskopf, and Clifford Clogg for comments and suggestions that substantially improved the paper.
