A Markov Chain Monte Carlo Approach to Confirmatory Item Factor Analysis

Michael C. Edwards

doi:10.1007/s11336-010-9161-9

Published online by Cambridge University Press: 01 January 2025

Michael C. Edwards*
Affiliation: The Ohio State University
* Requests for reprints should be sent to Michael C. Edwards, 1827 Neil Avenue, Columbus, OH 43210, USA. E-mail: edwards.134@osu.edu

Abstract

Item factor analysis has a rich tradition in both the structural equation modeling and item response theory frameworks. The goal of this paper is to demonstrate a novel combination of various Markov chain Monte Carlo (MCMC) estimation routines to estimate parameters of a wide variety of confirmatory item factor analysis models. Further, I show that these methods can be implemented in a flexible way which requires minimal technical sophistication on the part of the end user. After providing an overview of item factor analysis and MCMC, results from several examples (simulated and real) will be discussed. The bulk of these examples focus on models that are problematic for current “gold-standard” estimators. The results demonstrate that it is possible to obtain accurate parameter estimates using MCMC in a relatively user-friendly package.

Keywords

item factor analysis; multidimensional item response theory; Markov chain Monte Carlo

Type: Original Paper
Information: Psychometrika, Volume 75, Issue 3, September 2010, pp. 474–497

DOI: https://doi.org/10.1007/s11336-010-9161-9
Copyright: © 2010 The Psychometric Society

Footnotes

I would like to thank Li Cai, David Thissen, and R.J. Wirth for comments on earlier versions of this draft. I would like to thank Roger Millsap and the reviewers for their guidance on revisions. The resulting paper is better for all of your efforts. Any remaining faults are my own.
