
Classical Test Theory as a First-Order Item Response Theory: Application to True-Score Prediction from a Possibly Nonparallel Test

Published online by Cambridge University Press: 01 January 2025

Paul W. Holland*
Affiliation: Educational Testing Service

Machteld Hoskens
Affiliation: CTB-McGraw Hill

* Requests for reprints should be sent to Paul W. Holland, Educational Testing Service, Rosedale Road 12-T, Princeton, NJ 08541. E-mail: pholland@ets.org

Abstract

We give an account of Classical Test Theory (CTT) in terms of the more fundamental ideas of Item Response Theory (IRT). This approach views classical test theory as a very general version of IRT, and the commonly used IRT models as detailed elaborations of CTT for special purposes. We then use this approach to CTT to derive some general results regarding the prediction of the true score of a test from an observed score on that test, as well as from an observed score on a different test. This leads us to a new view of linking tests that were not developed to be linked to each other. In addition, we propose true-score prediction analogues of the Dorans and Holland measures of the population sensitivity of test linking functions. We illustrate the accuracy of the first-order theory using simulated data from the Rasch model, and illustrate the effect of population differences using a set of real data.
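The true-score prediction the abstract refers to has its simplest form in Kelley's (1923) regression estimate, $\hat{\tau} = \rho X + (1 - \rho)\mu_X$, where $\rho$ is the test's reliability. As a minimal sketch of the kind of simulation check the abstract describes (not the authors' code; the sample sizes, item parameters, and use of Cronbach's alpha as the reliability estimate are all illustrative assumptions), the following Python snippet simulates Rasch-model responses and compares the raw observed score and the Kelley estimate against the simulated true scores:

```python
import numpy as np

rng = np.random.default_rng(0)

# Simulate Rasch-model item responses (hypothetical setup).
n_persons, n_items = 2000, 25
theta = rng.normal(0.0, 1.0, n_persons)   # person abilities
b = rng.normal(0.0, 1.0, n_items)         # item difficulties
p = 1.0 / (1.0 + np.exp(-(theta[:, None] - b[None, :])))  # Rasch P(correct)
responses = (rng.random((n_persons, n_items)) < p).astype(int)

x = responses.sum(axis=1)   # observed number-correct score
tau = p.sum(axis=1)         # true score: expected number correct given theta

# Reliability estimated by Cronbach's alpha (a common CTT lower bound).
item_var = responses.var(axis=0, ddof=1).sum()
total_var = x.var(ddof=1)
alpha = n_items / (n_items - 1) * (1 - item_var / total_var)

# Kelley's (1923) regression estimate of the true score:
# tau_hat = rho * x + (1 - rho) * mean(x)
tau_hat = alpha * x + (1 - alpha) * x.mean()

print(f"alpha = {alpha:.3f}")
print(f"RMSE(observed score vs. true score) = {np.sqrt(np.mean((x - tau) ** 2)):.3f}")
print(f"RMSE(Kelley estimate vs. true score) = {np.sqrt(np.mean((tau_hat - tau) ** 2)):.3f}")
```

On data like these, the Kelley estimate typically shows a smaller root-mean-square error than the raw observed score, illustrating the shrinkage effect that motivates true-score prediction from an observed score.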

Type: Articles
Copyright: © 2003 The Psychometric Society


Footnotes

This research is collaborative in every respect and the order of authorship is alphabetical. It was begun when both authors were on the faculty of the Graduate School of Education at the University of California, Berkeley.

We would like to thank Neil Dorans, Skip Livingston, and two anonymous referees for many suggestions that have greatly improved this paper.

References

Bock, R. D., & Mislevy, R. J. (1982). Adaptive EAP estimation in a microcomputer environment. Applied Psychological Measurement, 6, 431–444.
Dorans, N., & Holland, P. W. (2000). Population invariance and the equatability of tests: Basic theory and the linear case. Journal of Educational Measurement, 37, 281–306.
Feuer, M. J., Holland, P. W., Green, B. F., Bertenthal, M. W., & Hemphill, F. C. (1999). Uncommon measures. Washington, DC: National Academy Press.
Gelman, A., Carlin, J. B., Stern, H. S., & Rubin, D. B. (1995). Bayesian data analysis. London: Chapman and Hall.
Holland, P. W. (1990). On the sampling theory foundations of item response theory models. Psychometrika, 55, 577–601.
Kelley, T. L. (1923). Statistical method. New York, NY: Macmillan.
Lord, F. M., & Novick, M. R. (1968). Statistical theories of mental test scores. Reading, MA: Addison-Wesley.
Mislevy, R. J., Beaton, A. E., Kaplan, B., & Sheehan, K. M. (1992). Estimating population characteristics from sparse matrix samples of item responses. Journal of Educational Measurement, 29, 133–161.
Pashley, P. J., & Phillips, G. W. (1993). Toward world-class standards: A research study linking national and international assessments. Princeton, NJ: Educational Testing Service.
Wainer, H., et al. (2001). Augmented scores: "Borrowing strength" to compute scores based on small numbers of items. In D. Thissen & H. Wainer (Eds.), Test scoring (pp. 343–387). Mahwah, NJ: Erlbaum.
Williams, V., et al. (1995). Projecting to the NAEP scale: Results from the North Carolina End-of-Grade testing program. Chapel Hill, NC: National Institute of Statistical Science, University of North Carolina.
Wu, M., Adams, R., & Wilson, M. (1997). ConQuest [Computer program]. Melbourne, Australia: Australian Council for Educational Research.