
Continuous Online Item Calibration: Parameter Recovery and Item Utilization

Published online by Cambridge University Press:  01 January 2025

Hao Ren*
Affiliation:
Pacific Metrics
Wim J. van der Linden
Affiliation:
Pacific Metrics
Qi Diao
Affiliation:
Pacific Metrics
*
Correspondence should be made to Hao Ren, Pacific Metrics, 1 Lower Ragsdale Dr #150, Monterey, CA 93940, USA. Email: hren@pacificmetrics.com

Abstract

Parameter recovery and item utilization were investigated for different designs for online test item calibration. The design was adaptive in a double sense: it assumed both adaptive testing of examinees from an operational pool of previously calibrated items and adaptive assignment of field-test items to the examinees. Four criteria of optimality for the assignment of the field-test items were used, each of them based on the information in the posterior distributions of the examinee’s ability parameter during adaptive testing as well as the sequentially updated posterior distributions of the field-test item parameters. In addition, different stopping rules based on target values for the posterior standard deviations of the field-test parameters and the size of the calibration sample were used. The impact of each of the criteria and stopping rules on the statistical efficiency of the estimates of the field-test parameters and on the time spent by the items in the calibration procedure was investigated. Recommendations as to the practical use of the designs are given.
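The sequential design the abstract describes can be illustrated with a minimal sketch: a single field-test item is assigned to a stream of examinees, its posterior distribution is updated after each response, and calibration stops once the posterior standard deviations of its parameters fall below target values or the calibration sample reaches a maximum size. The sketch below assumes a 2PL item, a grid-based posterior approximation, and simulated abilities; the item parameters, grid ranges, and target values are all hypothetical choices for illustration, not the paper's actual optimality criteria.

```python
import numpy as np

rng = np.random.default_rng(7)

# Grid over 2PL item parameters: discrimination a, difficulty b.
a_grid = np.linspace(0.3, 2.5, 60)
b_grid = np.linspace(-3.0, 3.0, 60)
A, B = np.meshgrid(a_grid, b_grid, indexing="ij")

def p_correct(theta, a, b):
    """2PL probability of a correct response."""
    return 1.0 / (1.0 + np.exp(-a * (theta - b)))

def posterior_sds(post):
    """Posterior standard deviations of a and b over the grid."""
    ea, eb = (post * A).sum(), (post * B).sum()
    sda = np.sqrt((post * (A - ea) ** 2).sum())
    sdb = np.sqrt((post * (B - eb) ** 2).sum())
    return sda, sdb

# Flat prior over the grid (informative priors are equally possible).
post = np.ones_like(A)
post /= post.sum()

true_a, true_b = 1.2, 0.5   # hypothetical field-test item
sd_target = 0.25            # stopping rule: target posterior SDs
max_n = 2000                # stopping rule: maximum calibration sample

n = 0
while n < max_n:
    theta = rng.normal()    # examinee ability during adaptive testing
    u = rng.random() < p_correct(theta, true_a, true_b)  # simulated response
    like = p_correct(theta, A, B) if u else 1.0 - p_correct(theta, A, B)
    post *= like            # sequential Bayesian update of the item posterior
    post /= post.sum()
    n += 1
    sda, sdb = posterior_sds(post)
    if sda < sd_target and sdb < sd_target:
        break               # item leaves the calibration pool
```

The paper's designs go further: each field-test item is assigned adaptively to the examinees for whom it is most informative under one of four optimality criteria, and the examinee's ability posterior (rather than a simulated point value) enters the update. The sketch only shows the shared backbone of sequential updating and SD-based stopping.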

Type
Original Paper
Copyright
Copyright © 2017 The Psychometric Society

