Bayesian Modeling of Measurement Error in Predictor Variables Using Item Response Theory

Jean-Paul Fox; Cees A. W. Glas

doi:10.1007/BF02294796

Published online by Cambridge University Press: 01 January 2025

Jean-Paul Fox*, University of Twente
Cees A. W. Glas, University of Twente
*: Requests for reprints should be sent to Jean-Paul Fox, Department of Educational Measurement and Data Analysis, University of Twente, P.O. Box 217, 7500 AE Enschede, THE NETHERLANDS. E-Mail: Fox@edte.utwente.nl

Abstract

It is shown that measurement error in predictor variables can be modeled using item response theory (IRT). The predictor variables, which may be defined at any level of a hierarchical regression model, are treated as latent variables. The normal ogive model is used to describe the relation between the latent variables and dichotomous observed variables, which may be responses to tests or questionnaires. It is shown that the multilevel model with measurement error in the observed predictor variables can be estimated in a Bayesian framework using Gibbs sampling. Handling measurement error via the normal ogive model is compared with alternative approaches using the classical true score model. Examples using real data are given.
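
As a rough illustration of why the measurement model matters, the following Python sketch simulates dichotomous responses under a two-parameter normal ogive model, parameterized here as P(Y = 1 | theta) = Phi(a * theta - b), and compares the regression slope obtained with the latent predictor against the slope obtained with a standardized sum score. It is not the Gibbs sampler described in the article. The item parameters, the sample size, the true slope of 0.5, and all function names are assumptions chosen for illustration only.

```python
import math
import random


def normal_ogive_prob(theta, a, b):
    """P(Y = 1 | theta) = Phi(a * theta - b) for one item (assumed parameterization)."""
    z = a * theta - b
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def simulate_responses(thetas, items, rng):
    """Draw 0/1 responses; one row per person, one column per item."""
    return [
        [1 if rng.random() < normal_ogive_prob(t, a, b) else 0 for a, b in items]
        for t in thetas
    ]


def standardize(x):
    """Rescale a list of numbers to mean 0 and standard deviation 1."""
    mean = sum(x) / len(x)
    sd = math.sqrt(sum((xi - mean) ** 2 for xi in x) / len(x))
    return [(xi - mean) / sd for xi in x]


def ols_slope(x, y):
    """Slope of the simple least-squares regression of y on x."""
    mx = sum(x) / len(x)
    my = sum(y) / len(y)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    sxx = sum((xi - mx) ** 2 for xi in x)
    return sxy / sxx


rng = random.Random(1)  # fixed seed so the sketch is reproducible
n_persons = 5000

# Hypothetical item parameters (a_k, b_k) for a short five-item test.
items = [(1.0, b) for b in (-1.0, -0.5, 0.0, 0.5, 1.0)]

# Latent predictor and an outcome that depends on it with true slope 0.5.
thetas = [rng.gauss(0.0, 1.0) for _ in range(n_persons)]
outcome = [0.5 * t + rng.gauss(0.0, 1.0) for t in thetas]

# Observed, error-prone predictor: standardized sum score of the item responses.
responses = simulate_responses(thetas, items, rng)
sum_scores = standardize([sum(row) for row in responses])

slope_latent = ols_slope(thetas, outcome)
slope_observed = ols_slope(sum_scores, outcome)

print("slope using latent predictor:  ", round(slope_latent, 3))
print("slope using observed sum score:", round(slope_observed, 3))
```

With these settings the slope based on the sum score is typically smaller than the slope based on the latent variable. This attenuation is the kind of bias that treating the predictor as a latent variable is meant to address.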

Keywords

classical test theory; Gibbs sampler; item response theory; hierarchical linear models; Markov Chain Monte Carlo; measurement error; multilevel model; multilevel IRT; two-parameter normal ogive model

Information

Type: Articles
Information: Psychometrika, Volume 68, Issue 2, June 2003, pp. 169–191

DOI: https://doi.org/10.1007/BF02294796
Copyright: Copyright © 2003 The Psychometric Society

Footnotes

This paper is part of the dissertation by Fox (2001) that won the 2002 Psychometric Society Dissertation Award.

References

Albert, J.H. (1992). Bayesian estimation of normal ogive item response curves using Gibbs sampling. Journal of Educational Statistics, 17, 251–269.

Bock, R.D., Zimowski, M.F. (1997). Multiple group IRT. In van der Linden, W.J., Hambleton, R.K. (Eds.), Handbook of modern item response theory (pp. 433–448). New York, NY: Springer.

Béguin, A.A. (2000). Robustness of equating high-stakes tests. Enschede, Netherlands: Twente University.

Béguin, A.A., Glas, C.A.W. (2001). MCMC estimation of multidimensional IRT models. Psychometrika, 66, 541–562.

Bernardo, J.M., Smith, A.F.M. (1994). Bayesian theory. New York, NY: John Wiley & Sons.

Best, N.G., Cowles, M.K., Vines, S.K. (1995). CODA: Convergence diagnosis and output analysis software for Gibbs sampler output: Version 0.3 [Computer software and manual]. University of Cambridge: MRC Biostatistics Unit.

Bollen, K.A. (1989). Structural equations with latent variables. New York, NY: John Wiley & Sons.

Bosker, R.J., Blatchford, P., Meijnen, G.W. (1999). Enhancing educational excellence, equity and efficiency. In Bosker, R.J., Creemers, B.P.M., Stringfield, S. (Eds.), Evidence from evaluations of systems and schools in change (pp. 89–112). Dordrecht/Boston/London: Kluwer Academic Publishers.

Box, G.E.P., Tiao, G.C. (1973). Bayesian inference in statistical analysis. Reading, MA: Addison-Wesley Publishing.

Bryk, A.S., Raudenbush, S.W. (1992). Hierarchical linear models. Newbury Park, CA: Sage Publications.

Carlin, B.P., Louis, T.A. (1996). Bayes and empirical Bayes methods for data analysis. London: Chapman & Hall.

Carroll, R., Ruppert, D., Stefanski, L.A. (1995). Measurement error in nonlinear models. London: Chapman & Hall.

Chen, M.-H., Shao, Q.-M. (1999). Monte Carlo estimation of Bayesian credible and HPD intervals. Journal of Computational and Graphical Statistics, 8, 69–92.

Chib, S., Greenberg, E. (1995). Understanding the Metropolis-Hastings Algorithm. The American Statistician, 49, 327–335.

Cook, T.D., Campbell, D.T. (1979). Quasi-experimentation, design & analysis issues for field settings. Chicago, IL: Rand McNally College Publishing.

de Leeuw, J., Kreft, I.G.G. (1986). Random coefficient models for multilevel analysis. Journal of Educational and Behavioral Statistics, 11, 57–86.

Fox, J.-P. (2001). Multilevel IRT: A Bayesian perspective on estimating parameters and testing statistical hypotheses. Enschede, Netherlands: Twente University.

Fox, J.-P., Glas, C.A.W. (2001). Bayesian estimation of a multilevel IRT model using Gibbs sampling. Psychometrika, 66, 269–286.

Fuller, W.A. (1987). Measurement error models. New York, NY: John Wiley & Sons.

Gelfand, A.E., Smith, A.F.M. (1990). Sampling-based approaches to calculating marginal densities. Journal of the American Statistical Association, 85, 398–409.

Gelfand, A.E., Hills, S.E., Racine-Poon, A., Smith, A.F.M. (1990). Illustration of Bayesian inference in normal data models using Gibbs sampling. Journal of the American Statistical Association, 85, 972–985.

Gelman, A., Carlin, J.B., Stern, H.S., Rubin, D.B. (1995). Bayesian data analysis. London: Chapman & Hall.

Gelman, A., Meng, X.-L., Stern, H.S. (1996). Posterior predictive assessment of model fitness via realized discrepancies. Statistica Sinica, 6, 733–807.

Geman, S., Geman, D. (1984). Stochastic relaxation, Gibbs distribution, and the Bayesian restoration of images. IEEE Transactions on Pattern Analysis and Machine Intelligence, 6, 721–741.

Gilks, W.R., Roberts, G.O. (1996). Strategies for improving MCMC. In Gilks, W.R., Richardson, S., Spiegelhalter, D.J. (Eds.), Markov Chain Monte Carlo in practice (pp. 89–114). London: Chapman & Hall.

Goldstein, H. (1995). Multilevel statistical models (2nd ed.). London: Edward Arnold.

Gruber, M.H.J. (1998). Improving efficiency by shrinkage. New York, NY: Marcel Dekker.

Hoijtink, H., Boomsma, A. (1995). On person parameter estimation in the dichotomous Rasch model. In Fischer, G.H., Molenaar, I.W. (Eds.), Rasch models: Foundations, recent developments and applications (pp. 53–68). New York, NY: Springer.

Johnson, V.E., Albert, J.H. (1999). Ordinal data modeling. New York, NY: Springer-Verlag.

Lindley, D.V., Smith, A.F.M. (1972). Bayes estimates for the linear model. Journal of the Royal Statistical Society, 34, 1–41.

Liu, J.S., Wong, W.H., Kong, A. (1994). Covariance structure of the Gibbs sampler with applications to the comparisons of estimators and augmentation schemes. Biometrika, 81, 27–40.

Lord, F.M. (1980). Applications of item response theory to practical testing problems. Hillsdale, NJ: Lawrence Erlbaum Associates.

Lord, F.M., Novick, M.R. (1968). Statistical theories of mental test scores. Reading, MA: Addison-Wesley.

MacEachern, S.N., Berliner, L.M. (1994). Subsampling the Gibbs sampler. The American Statistician, 48, 188–190.

McDonald, R.P. (1967). Nonlinear factor analysis. Psychometrika, Monograph Number 15.

McDonald, R.P. (1982). Linear versus nonlinear models in latent trait theory. Applied Psychological Measurement, 6, 379–396.

McDonald, R.P. (1997). Normal-ogive multidimensional model. In van der Linden, W.J., Hambleton, R.K. (Eds.), Handbook of modern item response theory (pp. 257–269). New York, NY: Springer.

Muthén, B.O. (1989). Latent variable modeling in heterogeneous populations. Psychometrika, 54, 557–585.

Patz, R.J., Junker, B.W. (1999). Applications and extensions of MCMC in IRT: Multiple item types, missing data, and rated responses. Journal of Educational and Behavioral Statistics, 24, 342–366.

Raudenbush, S.W. (1988). Educational applications of hierarchical linear models: A review. Journal of Educational Statistics, 13, 85–116.

Raudenbush, S.W., Bryk, A.S., Cheong, Y.F., Congdon, R.T. Jr. (2000). HLM 5. Hierarchical linear and nonlinear modeling. Lincolnwood, IL: Scientific Software International.

Richardson, S. (1996). Measurement error. In Gilks, W.R., Richardson, S., Spiegelhalter, D.J. (Eds.), Markov Chain Monte Carlo in practice (pp. 401–417). London: Chapman & Hall.

Robert, C.P., Casella, G. (1999). Monte Carlo statistical methods. New York, NY: Springer.

Roberts, G.O., Sahu, S.K. (1997). Updating schemes, correlation structure, blocking and parameterization for the Gibbs sampler. Journal of the Royal Statistical Society, 59, 291–317.

Seltzer, M.H. (1993). Sensitivity analysis for fixed effects in the hierarchical model: A Gibbs sampling approach. Journal of Educational Statistics, 18, 207–235.

Seltzer, M.H., Wong, W.H., Bryk, A.S. (1996). Bayesian analysis in applications of hierarchical models: Issues and methods. Journal of Educational and Behavioral Statistics, 21, 131–167.

Snijders, T.A.B., Bosker, R.J. (1999). Multilevel analysis. London: Sage Publications.

Tanner, M.A., Wong, W.H. (1987). The calculation of posterior distributions by data augmentation. Journal of the American Statistical Association, 82, 528–550.

Tierney, L. (1994). Markov chains for exploring posterior distributions. The Annals of Statistics, 22, 1701–1762.

van der Linden, W.J. (1998). Optimal assembly of psychological and educational tests. Applied Psychological Measurement, 22, 195–211.

Zellner, A. (1971). An introduction to Bayesian inference in econometrics. New York, NY: John Wiley & Sons.

Zimowski, M.F., Muraki, E., Mislevy, R.J., Bock, R.D. (1996). Bilog MG, Multiple-group IRT analysis and test maintenance for binary items. Chicago, IL: Scientific Software International.
