Bayesian Checks on Cheating on Tests

Published online by Cambridge University Press: 01 January 2025

Wim J. van der Linden* (CTB/McGraw-Hill)
Charles Lewis (Fordham University)

* Correspondence should be sent to Wim J. van der Linden, CTB/McGraw-Hill, 20 Ryan Ranch Road, Monterey, CA 93940. Email: wim_vanderlinden@ctb.com

Abstract

Posterior odds of cheating on achievement tests are presented as an alternative to p values reported for statistical hypothesis testing for several of the probabilistic models in the literature on the detection of cheating. It is shown how to calculate their combinatorial expressions with the help of a reformulation of the simple recursive algorithm for the calculation of number-correct score distributions used throughout the testing industry. Using the odds avoids the arbitrary choice between statistical tests of answer copying that do and do not condition on the responses the test taker is suspected to have copied and allows the testing agency to account for existing circumstantial evidence of cheating through the specification of prior odds.
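As a rough illustration of the two ingredients named in the abstract, the sketch below shows a generic recursion for a number-correct score distribution over independent items and the standard identity that posterior odds equal prior odds times a likelihood ratio. It is not the reformulated algorithm or any of the combinatorial expressions from the paper; the function names, the item probabilities, and the numeric values are hypothetical and chosen only for illustration.

```python
from typing import List


def number_correct_distribution(p: List[float]) -> List[float]:
    """Return P(X = x) for x = 0..n, where X is the number-correct score
    on n independent items with success probabilities p[0..n-1].

    Generic recursion: items are added one at a time, and each existing
    score either stays the same (item wrong) or moves up by one (item right).
    """
    dist = [1.0]  # with no items, the score is 0 with probability 1
    for p_i in p:
        new = [0.0] * (len(dist) + 1)
        for x, prob in enumerate(dist):
            new[x] += prob * (1.0 - p_i)  # item answered incorrectly
            new[x + 1] += prob * p_i      # item answered correctly
        dist = new
    return dist


def posterior_odds(prior_odds: float, likelihood_ratio: float) -> float:
    """Bayes' rule in odds form: posterior odds = prior odds * likelihood ratio.

    The likelihood ratio here is a placeholder argument; how it is derived
    for a particular cheating model is not shown in this sketch.
    """
    return prior_odds * likelihood_ratio


if __name__ == "__main__":
    # Hypothetical success probabilities for three items.
    dist = number_correct_distribution([0.8, 0.6, 0.5])
    print(dist)
    # Hypothetical prior odds and likelihood ratio.
    print(posterior_odds(0.1, 20.0))
```

In this form the prior odds are the place where a testing agency could encode circumstantial evidence, while the likelihood ratio would come from the chosen probabilistic model of the response data.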

Type
Original Paper
Copyright
Copyright © 2014 The Psychometric Society
