Bayesian Estimation of a Multilevel IRT Model Using Gibbs Sampling

Jean-Paul Fox; Cees A. W. Glas

doi:10.1007/BF02294839

Published online by Cambridge University Press: 01 January 2025

Jean-Paul Fox*, University of Twente
Cees A. W. Glas, University of Twente
*Requests for reprints should be sent to Jean-Paul Fox, Department of Educational Measurement and Data Analysis, University of Twente, P.O. Box 217, 7500 AE Enschede, The Netherlands. E-mail: FoxJ@edte.utwente.nl

Abstract

In this article, a two-level regression model is imposed on the ability parameters in an item response theory (IRT) model. The advantage of using latent rather than observed scores as dependent variables of a multilevel model is that it offers the possibility of separating the influence of item difficulty and ability level and modeling response variation and measurement error. Another advantage is that, contrary to observed scores, latent scores are test-independent, which offers the possibility of using results from different tests in one analysis where the parameters of the IRT model and the multilevel model can be concurrently estimated. The two-parameter normal ogive model is used for the IRT measurement model. It will be shown that the parameters of the two-parameter normal ogive model and the multilevel model can be estimated in a Bayesian framework using Gibbs sampling. Examples using simulated and real data are given.
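The structure described in the abstract can be illustrated with a small simulation. The sketch below is not the authors' estimation procedure; it only generates data from one possible version of the model, under assumptions that the abstract does not state: the response probability is parameterized as Phi(a_k * theta_ij - b_k), and the two-level model is reduced to a random intercept, theta_ij = beta_0j + e_ij with beta_0j = gamma_00 + u_0j. The function name `simulate_multilevel_irt` and all argument names are hypothetical.

```python
import random
from statistics import NormalDist


def simulate_multilevel_irt(n_groups, n_per_group, a, b,
                            gamma00=0.0, tau=0.5, sigma=1.0, seed=0):
    """Simulate binary responses from a two-level normal ogive IRT model.

    Assumed parameterization (not taken from the abstract):
      level 2: beta_0j  = gamma00 + u_0j,  u_0j ~ N(0, tau^2)
      level 1: theta_ij = beta_0j + e_ij,  e_ij ~ N(0, sigma^2)
      items:   P(Y_ijk = 1) = Phi(a_k * theta_ij - b_k)
    """
    rng = random.Random(seed)
    phi = NormalDist().cdf  # standard normal CDF (the normal ogive)
    theta, group, responses = [], [], []
    for j in range(n_groups):
        # Group-level random intercept, e.g. a school effect.
        beta0j = gamma00 + rng.gauss(0.0, tau)
        for _ in range(n_per_group):
            # Latent ability of one person within group j.
            th = beta0j + rng.gauss(0.0, sigma)
            theta.append(th)
            group.append(j)
            # One Bernoulli draw per item with normal ogive probability.
            row = [1 if rng.random() < phi(a_k * th - b_k) else 0
                   for a_k, b_k in zip(a, b)]
            responses.append(row)
    return theta, group, responses


if __name__ == "__main__":
    # Illustrative values only: 10 groups of 20 persons, 5 items.
    a = [1.0, 0.8, 1.2, 1.0, 0.9]    # discrimination parameters
    b = [0.0, -0.5, 0.5, 1.0, -1.0]  # difficulty parameters
    theta, group, y = simulate_multilevel_irt(10, 20, a, b)
    print(len(y), "persons,", len(y[0]), "items")
```

Because the latent abilities `theta` carry the group structure, the item parameters and the regression parameters can in principle be estimated together, which is the concurrent estimation the abstract refers to; a Gibbs sampler for this model would alternate between draws of the item, person, and multilevel parameters.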

Keywords

Bayes estimates; Gibbs sampler; item response theory (IRT); Markov chain Monte Carlo; multilevel model; two-parameter normal ogive model

Information

Type: Articles
Information: Psychometrika, Volume 66, Issue 2, June 2001, pp. 271–288

DOI: https://doi.org/10.1007/BF02294839
Copyright: Copyright © 2001 The Psychometric Society
