On Latent Trait Estimation in Multidimensional Compensatory Item Response Models

Chun Wang

doi:10.1007/s11336-013-9399-0

Published online by Cambridge University Press: 01 January 2025

Chun Wang*
Affiliation: University of Minnesota
* Requests for reprints should be sent to Chun Wang, University of Minnesota, 75 East River Road, Elliott Hall, N658, Minneapolis, MN, 55455, USA. E-mail: wang4066@umn.edu

Abstract

Making inferences from IRT-based test scores requires accurate and reliable methods of person parameter estimation. Given an already calibrated set of item parameters, the latent trait can be estimated either via maximum likelihood estimation (MLE) or with Bayesian methods such as maximum a posteriori (MAP) or expected a posteriori (EAP) estimation. In addition, Warm’s (Psychometrika 54:427–450, 1989) weighted likelihood estimation method was proposed to reduce the bias of the latent trait estimate in unidimensional models. In this paper, we extend the weighted MLE method to multidimensional models. The new method, denoted multivariate weighted MLE (MWLE), is proposed to reduce the bias of the MLE even for short tests. MWLE is compared to alternative estimators (i.e., MLE, MAP, and EAP) and shown, both analytically and through simulation studies, to be more accurate in terms of bias than MLE while maintaining a similar variance. In contrast, the Bayesian estimators (i.e., MAP and EAP) yield biased estimates with smaller variability.
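
As a rough illustration of the four estimators named in the abstract, the following sketch evaluates them on a grid for a unidimensional two-parameter logistic (2PL) model. It is not the paper's MWLE, which is multidimensional and is not reproduced here. The item parameters, the standard normal prior, the grid range, and the function names `two_pl_prob` and `estimate_theta` are all assumptions made for the example. For the 2PL, Warm's weight function equals the square root of the test information, which is what the `WLE` line uses.

```python
import numpy as np

def two_pl_prob(theta, a, b):
    """P(correct) under the 2PL; rows index theta values, columns index items."""
    theta = np.atleast_1d(theta)[:, None]
    return 1.0 / (1.0 + np.exp(-a * (theta - b)))

def estimate_theta(responses, a, b, grid=None):
    """Grid-based MLE, WLE, MAP, and EAP for one response pattern (sketch)."""
    if grid is None:
        grid = np.linspace(-4.0, 4.0, 801)  # assumed search range
    p = two_pl_prob(grid, a, b)
    # Log-likelihood of the response pattern at every grid point.
    loglik = (responses * np.log(p) + (1 - responses) * np.log(1 - p)).sum(axis=1)
    # Test information of the 2PL at every grid point.
    info = (a**2 * p * (1 - p)).sum(axis=1)
    # Standard normal prior (assumed), up to an additive constant.
    log_post = loglik - 0.5 * grid**2
    post = np.exp(log_post - log_post.max())
    post /= post.sum()
    return {
        "MLE": float(grid[np.argmax(loglik)]),
        # Warm's weighted likelihood for the 2PL: L(theta) * sqrt(I(theta)).
        "WLE": float(grid[np.argmax(loglik + 0.5 * np.log(info))]),
        "MAP": float(grid[np.argmax(log_post)]),
        "EAP": float((grid * post).sum()),
    }

# Hypothetical calibrated item parameters (not taken from the paper).
a = np.array([1.0, 1.2, 0.8, 1.5, 1.0])
b = np.array([-1.0, -0.5, 0.0, 0.5, 1.0])
mixed = estimate_theta(np.array([1, 1, 1, 0, 0]), a, b)
perfect = estimate_theta(np.array([1, 1, 1, 1, 1]), a, b)
```

With a perfect response pattern the likelihood has no finite maximum, so the grid MLE sits at the upper boundary of the grid, whereas the weighted and Bayesian estimates remain inside it. For the mixed pattern, the MAP estimate lies between the prior mean and the MLE, which is the shrinkage behind the bias of the Bayesian estimators noted in the abstract.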

Keywords

maximum likelihood estimation (MLE); weighted maximum likelihood estimation (WLE); multivariate weighted maximum likelihood estimation (MWLE); Bayesian estimation

Type: Original Paper
Information: Psychometrika, Volume 80, Issue 2, June 2015, pp. 428–449

DOI: https://doi.org/10.1007/s11336-013-9399-0
Copyright: Copyright © 2014 The Psychometric Society

References

Ackerman, T., Gierl, M.J., & Walker, C.M. (2003). Using multidimensional item response theory to evaluate educational and psychological tests. Educational Measurement: Issues and Practice, 22, 37–51.

Anderson, J.A., & Richardson, S.C. (1979). Logistic discrimination and bias correction in maximum likelihood estimation. Technometrics, 21, 71–78.

Bock, R.D., Gibbons, R., Schilling, S.G., Muraki, E., Wilson, D.T., & Wood, R. (2003). TESTFACT 4.0 [Computer software and manual]. Lincolnwood: Scientific Software International.

Cai, L. (2008). A Metropolis–Hastings Robbins–Monro algorithm for maximum likelihood nonlinear latent structure analysis with a comprehensive measurement model. Unpublished doctoral dissertation, University of North Carolina, Chapel Hill, NC.

Cai, L. (2010). High-dimensional exploratory item factor analysis by a Metropolis–Hastings Robbins–Monro algorithm. Psychometrika, 75, 33–57.

Cai, L. (2010). Metropolis–Hastings Robbins–Monro algorithm for confirmatory item factor analysis. Journal of Educational and Behavioral Statistics, 35, 307–335.

Cai, L., Thissen, D., & du Toit, S.H.C. (2011). IRTPRO: flexible, multidimensional, multiple categorical IRT modeling [Computer software]. Lincolnwood: Scientific Software International.

Chalmers, R.P. (2012). MIRT: A multidimensional item response theory package for the R environment. Journal of Statistical Software. www.jstatsoft.org.

Eignor, D.R., & Schaeffer, G.A. (1995). Comparability studies for the GRE General CAT and the NCLEX using CAT. Paper presented at the meeting of the National Council on Measurement in Education, San Francisco, April.

Finkelman, M., Nering, M.L., & Roussos, L.A. (2009). A conditional exposure method for multidimensional adaptive testing. Journal of Educational Measurement, 46, 84–103.

Firth, D. (1993). Bias reduction of maximum likelihood estimates. Biometrika, 80, 27–38.

Fraser, C. (1998). NOHARM: a Fortran program for fitting unidimensional and multidimensional normal ogive models in latent trait theory. The University of New England, Center for Behavioral Studies, Armidale, Australia.

Hattie, J. (1981). Decision criteria for determining unidimensionality. Unpublished doctoral dissertation, University of Toronto, Canada.

Kim, J.K., & Nicewander, W.A. (1993). Ability estimation for conventional tests. Psychometrika, 58, 587–599.

Lee, P. (1989). Bayesian statistics: an introduction. London: Edward Arnold.

Lehmann, E.L., & Casella, G. (1998). Theory of point estimation. New York: Springer.

Lord, F.M. (1983). Unbiased estimation of ability parameters, of their variance and of their parallel forms reliability. Psychometrika, 48, 223–245.

Lord, F.M. (1986). Maximum likelihood and Bayesian parameter estimation in item response theory. Journal of Educational Measurement, 23, 157–162.

Lord, F.M., & Novick, M.R. (1968). Statistical theories of mental test scores. Reading: Addison-Wesley.

Mulder, J., & van der Linden, W.J. (2009). Multidimensional adaptive testing with optimal design criteria for item selection. Psychometrika, 74, 273–296.

Reckase, M.D. (2009). Multidimensional item response theory. New York: Springer.

Samejima, F. (1993). An approximation for the bias function of the maximum likelihood estimate of a latent variable for the general case where the item responses are discrete. Psychometrika, 58, 119–138.

Schaefer, R.L. (1983). Bias correction in maximum likelihood logistic regression. Statistics in Medicine, 2, 71–78.

Segall, D.O. (1996). Multidimensional adaptive testing. Psychometrika, 61, 331–354.

Serfling, R.J. (1980). Approximation theorems of mathematical statistics. New York: Wiley.

Stroud, A.H., & Sechrest, D. (1966). Gaussian quadrature formulas. Englewood Cliffs: Prentice-Hall.

Tao, J., Shi, N., & Chang, H. (2012). Item-weighted likelihood method for ability estimation in tests composed of both dichotomous and polytomous items. Journal of Educational and Behavioral Statistics, 37, 298–315.

Tseng, F.L., & Hsu, T.C. (2001). Multidimensional adaptive testing using the weighted likelihood estimation: a comparison of estimation methods. Paper presented at the annual meeting of Seattle, WA.

van der Linden, W.J. (1999). Multidimensional adaptive testing with a minimum error-variance criterion. Journal of Educational and Behavioral Statistics, 24, 398–412.

van der Linden, W.J. (1999). A procedure for empirical initialization of the trait estimator in adaptive testing. Applied Psychological Measurement, 23, 21–29.

van der Linden, W.J. (2007). A hierarchical framework for modeling speed and accuracy on test items. Psychometrika, 72, 287–308.

van der Linden, W.J. (2008). Using response times for item selection in adaptive testing. Journal of Educational and Behavioral Statistics, 33, 5–20.

Veldkamp, B.P., & van der Linden, W.J. (2002). Multidimensional adaptive testing with constraints on test content. Psychometrika, 67, 575–588.

Warm, T.A. (1989). Weighted likelihood estimation of ability in item response theory. Psychometrika, 54, 427–450.

Wang, C., & Chang, H. (2011). Item selection in multidimensional computerized adaptive tests: gaining information from different angles. Psychometrika, 76, 363–384.

Wang, S., & Wang, T. (2001). Precision of Warm’s weighted likelihood estimates for a polytomous model in computerized adaptive testing. Applied Psychological Measurement, 25, 317–331.

Wang, T., Hanson, B.A., & Lau, C.-M.A. (1999). Reducing bias in CAT trait estimation: a comparison of approaches. Applied Psychological Measurement, 23, 263–278.

Wang, W.C., Chen, P.H., & Cheng, Y.Y. (2004). Improving measurement precision of test batteries using multidimensional item response models. Psychological Methods, 9, 116–136.

Wang, C., Chang, H., & Boughton, K. (2011). Kullback–Leibler information and its applications in multi-dimensional adaptive testing. Psychometrika, 76, 13–39.

Zhang, J., Xie, M., Song, X., & Lu, T. (2011). Investigating the impact of uncertainty about item parameters on ability estimation. Psychometrika, 76, 97–118.
