Efficient Standard Error Formulas of Ability Estimators with Dichotomous Item Response Models

Published online by Cambridge University Press:  01 January 2025

David Magis*
Affiliation:
University of Liège and KU Leuven
*Correspondence should be made to David Magis, Department of Education (B32), University of Liège, Boulevard du Rectorat 5, 4000 Liège, Belgium. Email: david.magis@ulg.ac.be

Abstract

This paper focuses on the computation of asymptotic standard errors (ASE) of ability estimators with dichotomous item response models. A general framework is considered in which ability estimators are defined from a small set of assumptions and formulas. This approach encompasses most standard methods, such as maximum likelihood, weighted likelihood, maximum a posteriori, and robust estimators. A general formula for the ASE is derived from the theory of M-estimation. Well-known results are recovered as particular cases for the maximum likelihood and robust estimators, while new ASE proposals for the weighted likelihood and maximum a posteriori estimators are presented. These new formulas are compared to the traditional ones by means of a simulation study under the Rasch model.
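
As a rough illustration of the well-known maximum likelihood case mentioned above, the following minimal sketch estimates ability under the Rasch model and computes its ASE in two ways: as the inverse square root of the test information, and through the generic M-estimation "sandwich" form, which coincides with it for maximum likelihood. This is not the paper's new weighted likelihood or maximum a posteriori formulas. The function names, the item difficulties, and the response pattern are hypothetical and chosen only for illustration.

```python
import math

def rasch_prob(theta, b):
    """Probability of a correct response under the Rasch model."""
    return 1.0 / (1.0 + math.exp(-(theta - b)))

def ml_ability(responses, difficulties, tol=1e-10, max_iter=100):
    """Newton-Raphson solution of the ML estimating equation sum_j (x_j - P_j) = 0.

    Assumes the response pattern is neither all 0s nor all 1s
    (otherwise the ML estimate is not finite).
    """
    theta = 0.0
    for _ in range(max_iter):
        probs = [rasch_prob(theta, b) for b in difficulties]
        score = sum(x - p for x, p in zip(responses, probs))
        info = sum(p * (1.0 - p) for p in probs)
        step = score / info
        theta += step
        if abs(step) < tol:
            break
    return theta

def ml_ase(theta, difficulties):
    """Classical ASE of the ML estimator: 1 / sqrt(test information)."""
    info = sum(rasch_prob(theta, b) * (1.0 - rasch_prob(theta, b))
               for b in difficulties)
    return 1.0 / math.sqrt(info)

def sandwich_ase(theta, difficulties):
    """Generic M-estimation (sandwich) ASE, specialised to psi_j = x_j - P_j.

    For this psi_j: E[psi_j^2] = P_j (1 - P_j) and
    E[d psi_j / d theta] = -P_j (1 - P_j).
    """
    probs = [rasch_prob(theta, b) for b in difficulties]
    meat = sum(p * (1.0 - p) for p in probs)     # sum of E[psi_j^2]
    bread = -sum(p * (1.0 - p) for p in probs)   # sum of E[d psi_j / d theta]
    return math.sqrt(meat) / abs(bread)

# Hypothetical five-item test and response pattern (illustration only).
difficulties = [-1.0, -0.5, 0.0, 0.5, 1.0]
responses = [1, 1, 0, 1, 0]

theta_hat = ml_ability(responses, difficulties)
print("ML ability estimate:", theta_hat)
print("ASE (1/sqrt(information)):", ml_ase(theta_hat, difficulties))
print("ASE (sandwich form):", sandwich_ase(theta_hat, difficulties))
```

For maximum likelihood the two ASE computations agree because the expected squared estimating function and the negative expected derivative are both equal to the item information. For other estimating functions the two quantities generally differ, which is where a sandwich-type formula becomes informative.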

Type
Original paper
Copyright
Copyright © 2015 The Psychometric Society
