Bayesian Modeling of Measurement Error in Predictor Variables Using Item Response Theory

Published online by Cambridge University Press:  01 January 2025

Jean-Paul Fox*
Affiliation:
University of Twente
Cees A. W. Glas
Affiliation:
University of Twente
*
Requests for reprints should be sent to Jean-Paul Fox, Department of Educational Measurement and Data Analysis, University of Twente, P.O. Box 217, 7500 AE Enschede, THE NETHERLANDS. E-Mail: Fox@edte.utwente.nl

Abstract

It is shown that measurement error in predictor variables can be modeled using item response theory (IRT). The predictor variables, which may be defined at any level of a hierarchical regression model, are treated as latent variables. The normal ogive model is used to describe the relation between the latent variables and dichotomous observed variables, which may be responses to tests or questionnaires. It is further shown that the multilevel model with measurement error in the observed predictor variables can be estimated in a Bayesian framework using Gibbs sampling. Handling measurement error via the normal ogive model is compared with alternative approaches using the classical true score model. Examples using real data are given.

Type
Articles
Copyright
Copyright © 2003 The Psychometric Society

Footnotes

This paper is part of the dissertation by Fox (2001) that won the 2002 Psychometric Society Dissertation Award.

References

Albert, J.H. (1992). Bayesian estimation of normal ogive item response curves using Gibbs sampling. Journal of Educational Statistics, 17, 251–269.
Bock, R.D., Zimowski, M.F. (1997). Multiple group IRT. In van der Linden, W.J., Hambleton, R.K. (Eds.), Handbook of modern item response theory (pp. 433–448). New York, NY: Springer.
Béguin, A.A. (2000). Robustness of equating high-stakes tests. Enschede, Netherlands: Twente University.
Béguin, A.A., Glas, C.A.W. (2001). MCMC estimation of multidimensional IRT models. Psychometrika, 66, 541–562.
Bernardo, J.M., Smith, A.F.M. (1994). Bayesian theory. New York, NY: John Wiley & Sons.
Best, N.G., Cowles, M.K., Vines, S.K. (1995). CODA: Convergence diagnosis and output analysis software for Gibbs sampler output: Version 0.3 [Computer software and manual]. University of Cambridge: MRC Biostatistics Unit.
Bollen, K.A. (1989). Structural equations with latent variables. New York, NY: John Wiley & Sons.
Bosker, R.J., Blatchford, P., Meijnen, G.W. (1999). Enhancing educational excellence, equity and efficiency. In Bosker, R.J., Creemers, B.P.M., Stringfield, S. (Eds.), Evidence from evaluations of systems and schools in change (pp. 89–112). Dordrecht/Boston/London: Kluwer Academic Publishers.
Box, G.E.P., Tiao, G.C. (1973). Bayesian inference in statistical analysis. Reading, MA: Addison-Wesley Publishing.
Bryk, A.S., Raudenbush, S.W. (1992). Hierarchical linear models. Newbury Park, CA: Sage Publications.
Carlin, B.P., Louis, T.A. (1996). Bayes and empirical Bayes methods for data analysis. London: Chapman & Hall.
Carroll, R., Ruppert, D., Stefanski, L.A. (1995). Measurement error in nonlinear models. London: Chapman & Hall.
Chen, M.-H., Shao, Q.-M. (1999). Monte Carlo estimation of Bayesian credible and HPD intervals. Journal of Computational and Graphical Statistics, 8, 69–92.
Chib, S., Greenberg, E. (1995). Understanding the Metropolis-Hastings Algorithm. The American Statistician, 49, 327–335.
Cook, T.D., Campbell, D.T. (1979). Quasi-experimentation, design & analysis issues for field settings. Chicago, IL: Rand McNally College Publishing.
de Leeuw, J., Kreft, I.G.G. (1986). Random coefficient models for multilevel analysis. Journal of Educational and Behavioral Statistics, 11, 57–86.
Fox, J.-P. (2001). Multilevel IRT: A Bayesian perspective on estimating parameters and testing statistical hypotheses. Enschede, Netherlands: Twente University.
Fox, J.-P., Glas, C.A.W. (2001). Bayesian estimation of a multilevel IRT model using Gibbs sampling. Psychometrika, 66, 269–286.
Fuller, W.A. (1987). Measurement error models. New York, NY: John Wiley & Sons.
Gelfand, A.E., Smith, A.F.M. (1990). Sampling-based approaches to calculating marginal densities. Journal of the American Statistical Association, 85, 398–409.
Gelfand, A.E., Hills, S.E., Racine-Poon, A., Smith, A.F.M. (1990). Illustration of Bayesian inference in normal data models using Gibbs sampling. Journal of the American Statistical Association, 85, 972–985.
Gelman, A., Carlin, J.B., Stern, H.S., Rubin, D.B. (1995). Bayesian data analysis. London: Chapman & Hall.
Gelman, A., Meng, X.-L., Stern, H.S. (1996). Posterior predictive assessment of model fitness via realized discrepancies. Statistica Sinica, 6, 733–807.
Geman, S., Geman, D. (1984). Stochastic relaxation, Gibbs distribution, and the Bayesian restoration of images. IEEE Transactions on Pattern Analysis and Machine Intelligence, 6, 721–741.
Gilks, W.R., Roberts, G.O. (1996). Strategies for improving MCMC. In Gilks, W.R., Richardson, S., Spiegelhalter, D.J. (Eds.), Markov Chain Monte Carlo in practice (pp. 89–114). London: Chapman & Hall.
Goldstein, H. (1995). Multilevel statistical models (2nd ed.). London: Edward Arnold.
Gruber, M.H.J. (1998). Improving efficiency by shrinkage. New York, NY: Marcel Dekker.
Hoijtink, H., Boomsma, A. (1995). On person parameter estimation in the dichotomous Rasch model. In Fischer, G.H., Molenaar, I.W. (Eds.), Rasch models: Foundations, recent developments and applications (pp. 53–68). New York, NY: Springer.
Johnson, V.E., Albert, J.H. (1999). Ordinal data modeling. New York, NY: Springer-Verlag.
Lindley, D.V., Smith, A.F.M. (1972). Bayes estimates for the linear model. Journal of the Royal Statistical Society, 34, 1–41.
Liu, J.S., Wong, H.W., Kong, A. (1994). Covariance structure of the Gibbs sampler with applications to the comparisons of estimators and augmentation schemes. Biometrika, 81, 27–40.
Lord, F.M. (1980). Applications of item response theory to practical testing problems. Hillsdale, NJ: Lawrence Erlbaum Associates.
Lord, F.M., Novick, M.R. (1968). Statistical theories of mental test scores. Reading, MA: Addison-Wesley.
MacEachern, S.N., Berliner, L.M. (1994). Subsampling the Gibbs sampler. The American Statistician, 48, 188–190.
McDonald, R.P. (1967). Nonlinear factor analysis. Psychometrika, Monograph Number 15.
McDonald, R.P. (1982). Linear versus nonlinear models in latent trait theory. Applied Psychological Measurement, 6, 379–396.
McDonald, R.P. (1997). Normal-ogive multidimensional model. In van der Linden, W.J., Hambleton, R.K. (Eds.), Handbook of modern item response theory (pp. 257–269). New York, NY: Springer.
Muthén, B.O. (1989). Latent variable modeling in heterogeneous populations. Psychometrika, 54, 557–585.
Patz, J.P., Junker, B.W. (1999). Applications and extensions of MCMC in IRT: Multiple item types, missing data, and rated responses. Journal of Educational and Behavioral Statistics, 24, 342–366.
Raudenbush, S.W. (1988). Educational applications of hierarchical linear models: A review. Journal of Educational Statistics, 13, 85–116.
Raudenbush, S.W., Bryk, A.S., Cheong, Y.F., Congdon, R.T. Jr. (2000). HLM 5. Hierarchical linear and nonlinear modeling. Lincolnwood, IL: Scientific Software International.
Richardson, S. (1996). Measurement error. In Gilks, W.R., Richardson, S., Spiegelhalter, D.J. (Eds.), Markov Chain Monte Carlo in practice (pp. 401–417). London: Chapman & Hall.
Robert, C.P., Casella, G. (1999). Monte Carlo statistical methods. New York, NY: Springer.
Roberts, G.O., Sahu, S.K. (1997). Updating schemes, correlation structure, blocking and parameterization for the Gibbs sampler. Journal of the Royal Statistical Society, 59, 291–317.
Seltzer, M.H. (1993). Sensitivity analysis for fixed effects in the hierarchical model: A Gibbs sampling approach. Journal of Educational Statistics, 18, 207–235.
Seltzer, M.H., Wong, W.H., Bryk, A.S. (1996). Bayesian analysis in applications of hierarchical models: Issues and methods. Journal of Educational and Behavioral Statistics, 21, 131–167.
Snijders, T.A.B., Bosker, R.J. (1999). Multilevel analysis. London: Sage Publications.
Tanner, M.A., Wong, W.H. (1987). The calculation of posterior distributions by data augmentation. Journal of the American Statistical Association, 82, 528–550.
Tierney, L. (1994). Markov chains for exploring posterior distributions. The Annals of Statistics, 22, 1701–1762.
van der Linden, W.J. (1998). Optimal assembly of psychological and educational tests. Applied Psychological Measurement, 22, 195–211.
Zellner, A. (1971). An introduction to Bayesian inference in econometrics. New York, NY: John Wiley & Sons.
Zimowski, M.F., Muraki, E., Mislevy, R.J., Bock, R.D. (1996). Bilog MG, Multiple-group IRT analysis and test maintenance for binary items. Chicago, IL: Scientific Software International.