
Using Deterministic, Gated Item Response Theory Model to Detect Test Cheating due to Item Compromise

Published online by Cambridge University Press:  01 January 2025

Zhan Shu* (The University of North Carolina at Greensboro)
Robert Henson (The University of North Carolina at Greensboro)
Richard Luecht (The University of North Carolina at Greensboro)

*Requests for reprints and correspondence concerning this article should be addressed to Zhan Shu, Educational Testing Service, 660 Rosedale Road, Princeton, NJ 08541, USA. E-mail: zshu@ets.org

Abstract

The Deterministic, Gated Item Response Theory Model (DGM; Shu, unpublished dissertation, The University of North Carolina at Greensboro, 2010) is proposed to identify cheaters who obtain significant score gains on tests due to item exposure/compromise by conditioning on item status (exposed or unexposed). A “gated” function is introduced to decompose observed examinee performance into two distributions: a true-ability distribution, governed by an examinee’s true ability, and a cheating distribution, governed by the examinee’s cheating ability. Test cheaters whose score gains stem from item exposure are identified by comparing the two distributions. Hierarchical Markov chain Monte Carlo is used as the model’s estimation framework. Finally, the model is applied to a real data set to illustrate how it can identify examinees with pre-knowledge of the exposed items.
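The gating idea in the abstract can be made concrete with a minimal sketch. This is not the authors' estimation code: it assumes a Rasch response kernel (consistent with the dissertation's IRT framing) and binary indicators for item compromise and examinee cheating status; all function and variable names are invented for illustration.

```python
import math

def rasch_p(theta: float, b: float) -> float:
    """Probability of a correct response under a Rasch model."""
    return 1.0 / (1.0 + math.exp(-(theta - b)))

def gated_p(theta_true: float, theta_cheat: float, b: float,
            item_compromised: bool, is_cheater: bool) -> float:
    """Gated response probability (illustrative sketch): the cheating
    ability governs the response only when the item is compromised AND
    the examinee is a cheater; otherwise the true ability does."""
    gate = item_compromised and is_cheater
    return rasch_p(theta_cheat if gate else theta_true, b)

# A low-ability cheater (theta_true = -1) with high cheating ability
# (theta_cheat = 2) succeeds far more often on a compromised item than
# on a secure item of the same difficulty -- the gap between the two
# distributions that the comparison in the abstract exploits.
p_exposed = gated_p(-1.0, 2.0, 0.0, item_compromised=True, is_cheater=True)
p_secure = gated_p(-1.0, 2.0, 0.0, item_compromised=False, is_cheater=True)
```

In the actual model the cheating indicator and both abilities are unknown and are estimated jointly via hierarchical MCMC; the sketch only shows why performance on exposed versus unexposed items carries the classification signal.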

Type: Original Paper
Copyright © 2012 The Psychometric Society



References

Angoff, W.H. (1974). The development of statistical indices for detecting cheaters. Journal of the American Statistical Association, 69, 44–49.
Bellezza, F., & Bellezza, S. (1995). Detection of copying on multiple-choice tests: an update. Teaching of Psychology, 22(3), 180–182.
Cizek, G.J. (1999). Cheating on tests: how to do it, detect it, and prevent it. Mahwah: Lawrence Erlbaum Associates.
Drasgow, F., & Levine, M.V. (1986). Optimal detection of certain forms of inappropriate test scores. Applied Psychological Measurement, 10, 59–67.
Drasgow, F., Levine, M.V., & Williams, E.A. (1985). Appropriateness measurement with polychotomous item response models and standardized indices. British Journal of Mathematical & Statistical Psychology, 38(1), 67–86.
Drasgow, F., Luecht, R.M., & Bennett, R. (2006). Technology and testing. In Brennan, R.L. (Ed.), Educational measurement (4th ed., pp. 471–515). Washington: American Council on Education/Praeger Publishers.
Dwyer, D.J., & Hecht, J.B. (1996). Using statistics to catch cheaters: methodological and legal issues for student personnel administrators. NASPA Journal, 33(2), 125–135.
Frary, R., Tideman, N., & Watts, T. (1977). Indices of cheating on multiple choice tests. Journal of Educational Statistics, 2(4), 235–256.
Hanson, B., Harris, D., & Brennan, R. (1987). A comparison of several methods for examining allegations of copying (ACT Research Report No. 87-15). Iowa City: American College Testing.
Holland, P. (1996). Assessing unusual agreement between the incorrect answers of two examinees using the K-index: statistical theory and empirical support (ETS Technical Report No. 96-4). Princeton: Educational Testing Service.
Levine, M.V., & Rubin, D.B. (1979). Measuring the appropriateness of multiple-choice test scores. Journal of Educational Statistics, 4, 269–290.
Lewis, C., & Thayer, D.T. (1998). The power of the K-index to detect test copying (Research Report No. 08541). Princeton: Educational Testing Service.
Luecht, R.M. (1998). A framework for exploring and controlling risks associated with test item exposure over time. Paper presented at the National Council on Measurement in Education annual meeting, San Diego.
Luecht, R.M. (2005). Some useful cost-benefit criteria for evaluating computer-based test delivery models and systems. Journal of Applied Testing Technology. http://www.testpublishers.org/assets/documents/Volum%207%20Some%20useful%20cost%20benefit.pdf
McLeod, L., & Lewis, C. (1999). Detecting item memorization in the CAT environment. Applied Psychological Measurement, 23(2), 147–159.
McLeod, L., Lewis, C., & Thissen, D. (2003). A Bayesian method for the detection of item pre-knowledge in computerized adaptive testing. Applied Psychological Measurement, 27(2), 121–137.
Meijer, R.R. (Ed.) (1996). Person-fit research: theory and applications. Applied Measurement in Education, 9(1), 9–18 [Special issue].
Nering, M.L. (1996). The effects of person misfit in computerized adaptive testing. Unpublished doctoral dissertation, University of Minnesota, Minneapolis.
Nering, M.L. (1997). The distribution of indexes of person fit within the computerized adaptive testing environment. Applied Psychological Measurement, 21, 115–127.
Patz, R.J., & Junker, B.W. (1999). A straightforward approach to Markov chain Monte Carlo methods for item response models. Journal of Educational and Behavioral Statistics, 24(2), 146–178.
Rasch, G. (1960). Probabilistic models for some intelligence and attainment tests. Copenhagen: Danish Institute for Educational Research.
Rost, J. (1990). Rasch models in latent classes: an integration of two approaches to item analysis. Applied Psychological Measurement, 14(3), 271–282.
Segall, D. (2002). An item response model for characterizing test compromise. Journal of Educational and Behavioral Statistics, 27(2), 163–179.
Segall, D. (2004). A sharing item response theory model for computerized adaptive testing. Journal of Educational and Behavioral Statistics, 29(4), 439–460.
Shu, Z. (2010). Using the deterministic, gated item response model to detect test cheating. Unpublished doctoral dissertation, The University of North Carolina at Greensboro.
Sotaridona, L.S. (2003). Statistical methods for the detection of answer copying on achievement tests. Enschede: Twente University Press.
Sotaridona, L.S., & Meijer, R.R. (2003). Two new statistics to detect answer copying. Journal of Educational Measurement, 40(1), 53–69.
Stocking, M.L., Ward, W.C., & Potenza, M.T. (1998). Simulating the use of disclosed items in computerized adaptive testing. Journal of Educational Measurement, 35, 48–68.
Tatsuoka, K. (1996). Use of generalized person-fit indexes, zetas for statistical pattern classification. Applied Measurement in Education, 9(1), 65–75.
van der Linden, W.J., & Jeon, M. (2012). Modeling answer changes on test items. Journal of Educational and Behavioral Statistics, 37(1), 180–199.
van der Linden, W.J., & Sotaridona, L.S. (2004). A statistical test for detecting answer copying on multiple-choice tests. Journal of Educational Measurement, 41(4), 361–377.
van der Linden, W.J., & Sotaridona, L.S. (2006). Detecting answer copying when the regular response process follows a known response model. Journal of Educational and Behavioral Statistics, 31(3), 283–304.
Watson, S.A., Iwamoto, C.K., Nungester, R.J., & Luecht, R.M. (1998). The use of response similarity statistics to study examinees who benefit from copying. Paper presented at the National Council on Measurement in Education annual meeting, San Diego.
Wollack, J.A. (1997). A nominal response model approach for detecting answer copying. Applied Psychological Measurement, 21(4), 307–320.
Wollack, J.A. (2006). Simultaneous use of multiple answer copying indexes to improve detection rates. Applied Measurement in Education, 19(4), 265–288.
Wollack, J.A., & Cohen, A.S. (1998). Detection of answer copying with unknown item and trait parameters. Applied Psychological Measurement, 22(2), 144–152.
Wollack, J.A., Cohen, A.S., & Serlin, R.C. (2001). Defining error rate and power for detecting answer copying. Applied Psychological Measurement, 25(4), 385–404.