
Detection of Test Speededness Using Change-Point Analysis

Published online by Cambridge University Press:  01 January 2025

Can Shao, University of Notre Dame
Jun Li*, University of Notre Dame
Ying Cheng*, University of Notre Dame

* Jun Li, Department of Applied and Computational Mathematics and Statistics, University of Notre Dame, 171 Hurley Hall, Notre Dame, IN 46556, USA. Email: jun.li@nd.edu
Correspondence should be made to Ying Cheng, Department of Psychology, University of Notre Dame, 118 Haggar Hall, Notre Dame, IN 46556, USA. Email: ycheng4@nd.edu

Abstract

Change-point analysis (CPA) is a well-established statistical method for detecting abrupt changes, if any, in a sequence of data. In this paper, we propose a CPA-based procedure to detect test speededness. The procedure not only classifies examinees into speeded and non-speeded groups but also identifies the point at which an examinee starts to speed. Identifying the change point is useful in two ways. First, it informs decision makers of the appropriate length of a test. Second, removing only the speeded responses, rather than the entire response sequence of an examinee suspected of speededness, improves ability estimation. Simulation studies show that the procedure is efficient in detecting both speeded examinees and the speeding point, and that ability estimation improves dramatically when the speeded responses it identifies are removed. The procedure is then applied to a real dataset for illustration.
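
To make the general idea of a change-point scan concrete, the following is a minimal sketch and not the procedure proposed in the paper. It assumes a deliberately simplified setting: one examinee's responses are scored 0/1 in item order, item parameters and ability are ignored, and the only question is whether the proportion correct drops at some unknown item. The function names `bernoulli_loglik` and `scan_change_point` are hypothetical and chosen for illustration only.

```python
import math


def bernoulli_loglik(successes, trials):
    """Maximized Bernoulli log-likelihood for `successes` out of `trials`."""
    if trials == 0 or successes == 0 or successes == trials:
        return 0.0  # the estimated proportion is 0 or 1, so the maximum is 0
    p = successes / trials
    return successes * math.log(p) + (trials - successes) * math.log(1 - p)


def scan_change_point(responses):
    """Scan every split of a 0/1 response sequence.

    Returns (k, stat), where the split after item k maximizes the
    likelihood-ratio statistic for a change in proportion correct.
    Returns (None, 0.0) when no split improves on the no-change model.
    Assumption: a simplified Bernoulli model without item parameters.
    """
    n = len(responses)
    total = sum(responses)
    null_ll = bernoulli_loglik(total, n)  # one proportion for the whole test
    best_k, best_stat = None, 0.0
    correct_so_far = 0
    for k in range(1, n):
        correct_so_far += responses[k - 1]
        # Separate proportions before and after the candidate change point.
        alt_ll = (bernoulli_loglik(correct_so_far, k)
                  + bernoulli_loglik(total - correct_so_far, n - k))
        stat = 2.0 * (alt_ll - null_ll)
        if stat > best_stat:
            best_k, best_stat = k, stat
    return best_k, best_stat


# Illustrative (made-up) response pattern: mostly correct early, mostly wrong late.
responses = [1, 1, 1, 0, 1, 1, 1, 1, 0, 0, 0, 1, 0, 0, 0]
k, stat = scan_change_point(responses)
print("estimated change point after item", k, "with statistic", round(stat, 2))
```

In a sketch like this, the maximized statistic would still have to be compared against a critical value before an examinee is flagged, and responses after item `k` are the ones a practitioner might set aside before re-estimating ability. How the critical value is obtained is left open here.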

Type: Article
Copyright © 2015 The Psychometric Society
