
A Priori Reliability of Tests with Cut Score

Published online by Cambridge University Press: 01 January 2025

Guido Magnano*
Affiliation:
University of Turin
Chiara Tannoia
Affiliation:
University of Turin
Chiara Andrà
Affiliation:
University of Turin
* Requests for reprints should be sent to Guido Magnano, Department of Mathematics, University of Turin, Via Carlo Alberto 10, 10123 Torino, Italy. E-mail: guido.magnano@unito.it

Abstract

The theoretical probability of misclassification in a mastery test is computed exactly, as a function of the examinee’s latent ability, using the raw score probability distribution in the Rasch model. The resulting misclassification probability curve, together with the latent ability distribution in the group of examinees, completely determines the expected rate of classification errors. It is shown that several distinct ability thresholds, playing different roles in connection with classification reliability, can be associated with a test with a single cut score. In particular, it is possible to define (and compute) two relevant ability intervals, which encapsulate the functioning of a mastery test near and far from the cut score, respectively; the dependence of these intervals on the item difficulty spectrum is investigated. Extension to the 2PL model is also discussed, with emphasis on the effects of weighted scoring.
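
The computation outlined in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation. It assumes that an examinee counts as a true master when their ability is at or above a hypothetical threshold `theta_cut`, which is a definition this page does not state, and it builds the raw score distribution with a standard item-by-item recursion for a sum of independent Bernoulli variables. All function and parameter names are illustrative.

```python
import math


def rasch_p(theta, b):
    """Rasch probability of a correct response (ability theta, difficulty b)."""
    return 1.0 / (1.0 + math.exp(-(theta - b)))


def raw_score_distribution(theta, difficulties):
    """Return [P(score = 0), ..., P(score = n)] at ability theta.

    Items are added one at a time; each step convolves the current
    distribution with a single Bernoulli outcome.
    """
    dist = [1.0]  # with no items, a score of 0 is certain
    for b in difficulties:
        p = rasch_p(theta, b)
        new = [0.0] * (len(dist) + 1)
        for score, prob in enumerate(dist):
            new[score] += prob * (1.0 - p)   # item answered incorrectly
            new[score + 1] += prob * p       # item answered correctly
        dist = new
    return dist


def misclassification_probability(theta, difficulties, cut_score, theta_cut):
    """Probability that the pass/fail decision contradicts true mastery.

    ASSUMPTION: true mastery means theta >= theta_cut (hypothetical threshold);
    the examinee passes when the raw score is >= cut_score.
    """
    p_pass = sum(raw_score_distribution(theta, difficulties)[cut_score:])
    return 1.0 - p_pass if theta >= theta_cut else p_pass


def expected_error_rate(difficulties, cut_score, theta_cut, grid, weights):
    """Average the misclassification curve over a discretized ability distribution.

    `grid` holds ability values and `weights` their (unnormalized) masses;
    both are illustrative inputs, not quantities taken from the paper.
    """
    total = sum(weights)
    return sum(
        w * misclassification_probability(t, difficulties, cut_score, theta_cut)
        for t, w in zip(grid, weights)
    ) / total
```

Evaluating `misclassification_probability` over a range of `theta` values traces a misclassification curve of the kind the abstract describes, and `expected_error_rate` combines that curve with an ability distribution to give an expected rate of classification errors under the stated assumptions.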

Type
Original Paper
Copyright
Copyright © 2013 The Psychometric Society

