Analyzing Test-Taking Behavior: Decision Theory Meets Psychometric Theory

Published online by Cambridge University Press:  01 January 2025

David V. Budescu*
Affiliation:
Fordham University
Yuanchao Bo
Affiliation:
Fordham University
* Correspondence should be addressed to David V. Budescu, Department of Psychology, Fordham University, 441 East Fordham Road, Bronx, NY 10458, USA. Email: budescu@fordham.edu

Abstract

We investigate the implications of penalizing incorrect answers to multiple-choice tests, from the perspective of both test-takers and test-makers. To do so, we use a model that combines a well-known item response theory model with prospect theory (Kahneman and Tversky, Prospect theory: An analysis of decision under risk, Econometrica 47:263–91, 1979). Our results reveal that when test-takers are fully informed of the scoring rule, the use of any penalty has detrimental effects for both test-takers (they are always penalized in excess, particularly those who are risk averse and loss averse) and test-makers (the bias of the estimated scores, as well as the variance and skewness of their distribution, increase as a function of the severity of the penalty).
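The abstract does not spell out the model, so the following is only an illustrative sketch of the kind of tension it describes, not the authors' specification. It assumes a scoring rule of +1 for a correct answer, a deduction of `penalty` for an incorrect one, and 0 for an omission, and it compares the expected score of answering with a prospect-theory valuation of the same gamble. The function names are hypothetical, and the default parameters (`alpha = 0.88`, `lam = 2.25`) are the median estimates reported by Tversky and Kahneman (1992), used here purely as an assumption; probability weighting is left out for simplicity.

```python
def expected_score(p_correct: float, penalty: float) -> float:
    """Expected points from answering an item (omitting scores 0).

    Assumed rule: +1 if correct, -penalty if incorrect.
    """
    return p_correct * 1.0 - (1.0 - p_correct) * penalty


def prospect_value(p_correct: float, penalty: float,
                   alpha: float = 0.88, lam: float = 2.25) -> float:
    """Subjective value of answering under a simplified prospect-theory model.

    Assumptions (not taken from the paper): a power value function with
    exponent alpha for gains and losses, loss-aversion coefficient lam,
    and objective (unweighted) probabilities.
    """
    gain_value = 1.0 ** alpha               # value of gaining one point
    loss_value = -lam * (penalty ** alpha)  # value of losing `penalty` points
    return p_correct * gain_value + (1.0 - p_correct) * loss_value


if __name__ == "__main__":
    k = 4                      # hypothetical number of answer options
    penalty = 1.0 / (k - 1)    # conventional formula-scoring penalty
    p = 1.0 / 3.0              # one distractor eliminated, guess among three
    print("expected score of answering:", expected_score(p, penalty))
    print("prospect value of answering:", prospect_value(p, penalty))
```

Under these assumed parameters, a test-taker who has eliminated one of four options faces a positive expected score from answering but a negative prospect-theory value, so a loss-averse test-taker would omit the item and forgo points on average.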

Type
Original Paper
Copyright
Copyright © 2015 The Psychometric Society

References

Bar-Hillel, M., Budescu, D.V., & Attali, Y. (2005). Scoring and keying multiple choice tests: A case study in irrationality. Mind and Society, 4, 2–12.
Bechger, T.M., Maris, G., & Verstralen, H.H.F.M. (2005). The Nedelsky model for multiple-choice items. In van der Ark, A., Croon, M., & Sijtsma, K. (Eds.), Chapter 10 in New developments in categorical data analysis for the social and behavioral sciences. New York: Lawrence Erlbaum.
Bereby-Meyer, Y., Meyer, J., & Budescu, D.V. (2003). Decision making under internal uncertainty: The case of multiple-choice tests with different scoring rules. Acta Psychologica, 112, 207–220.
Bereby-Meyer, Y., Meyer, J., & Flascher, O.M. (2002). Prospect theory analysis of guessing in multiple choice tests. Journal of Behavioral Decision Making, 15, 313–327.
Ben-Simon, A., Budescu, D.V., & Nevo, B. (1997). A comparative study of measures of partial knowledge in multiple-choice tests. Applied Psychological Measurement, 21, 65–88.
Birnbaum, A. (1968). Some latent trait models and their use in inferring an examinee’s ability. In F. M. Lord, & M. R. Novick (Eds.), Statistical theories of mental test scores. Reading, MA: Addison-Wesley.
Bock, R. D. (1972). Estimating item parameters and latent ability when responses are scored in two or more nominal categories. Psychometrika, 37, 29–51.
Booij, A.S., Van Praag, B.M.S., & Van de Kuilen, G. (2010). A parametric analysis of prospect theory’s functionals for the general population. Theory and Decision, 68, 115–148.
Budescu, D.V., & Bar-Hillel, M. (1993). To guess or not to guess: A decision theoretic view of formula scoring. Journal of Educational Measurement, 30, 227–291.
De Finetti, B. (1965). Methods for discriminating levels of partial knowledge concerning a test item. British Journal of Mathematical and Statistical Psychology, 18, 87–123.
Diamond, J., & Evans, W. (1973). The correction for guessing. Journal of Educational Research, 43, 181–191.
Delgado, A.R. (2007). Using the Rasch model to quantify the causal effect of test instructions. Behavioral Research Methods, 39, 570–573.
Embretson, S.E., & Reise, S. (2000). Item response theory for psychologists. Mahwah, NJ: Erlbaum Publishers.
Espinosa, M.P., & Gardeazabal, J. (2010). Optimal correction for guessing in multiple-choice tests. Journal of Mathematical Psychology, 54, 415–425.
Espinosa, M.P., & Gardeazabal, J. (2013). Do students behave rationally in multiple choice tests? Evidence from a field experiment. Journal of Economics and Management, 9, 107–135.
Frary, R.B. (1988). Formula scoring of multiple choice tests (Correction for guessing). Educational Measurement: Issues and Practice, 7, 33–38.
Holzinger, K.J. (1924). On scoring multiple response tests. Journal of Educational Psychology, 15, 445–447.
Johnson, T.R., Budescu, D.V., & Wallsten, T.S. (2001). Averaging probability judgments: Monte Carlo analyses of asymptotic diagnostic values. Journal of Behavioral Decision Making, 14, 123–140.
Kahneman, D., & Tversky, A. (1979). Prospect theory: An analysis of decision under risk. Econometrica, 47, 263–291.
Kahneman, D., & Tversky, A. (2000). Choices, values and frames. New York: Cambridge University Press.
Karelitz, T.M., & Budescu, D.V. (2013). The effect of the raters’ marginal distributions on their matched agreement: A rescaling framework for interpreting Kappa. Multivariate Behavioral Research, 48(6), 923–952.
Karmarkar, U.S. (1978). Subjectively weighted utility: A descriptive extension of the expected utility model. Organizational Behavior and Human Performance, 21, 61–72.
Kruskal, W.H. (1958). Ordinal measures of association. Journal of the American Statistical Association, 53, 814–861.
Lichtenstein, S., Fischhoff, B., & Phillips, L. (1982). Calibration and probabilities: The state of the art to 1980. In Kahneman, D., Slovic, P., & Tversky, A. (Eds.), Judgment under uncertainty: Heuristics and biases (pp. 306–334). Cambridge: Cambridge University Press.
Lord, F.M. (1975). Formula scoring and number-right scoring. Journal of Educational Measurement, 12, 7–12.
Lord, F. M., & Novick, M. R. (1968). Statistical theories of mental test scores. Reading, MA: Addison-Wesley.
Markowitz, H.M. (1959). Portfolio selection: Efficient diversification of investments. New York: Wiley.
Merkle, E.C., Smithson, M., & Verkuilen, J. (2011). Hierarchical models of simple mechanisms underlying confidence in decision making. Journal of Mathematical Psychology, 55, 57–67.
Nedelsky, L. (1954). Absolute grading standards for objective tests. Educational and Psychological Measurement, 14, 3–19.
Samejima, F. (1970). A new family of models for the multiple-choice item (Office of Naval Research Rep. 79–4, N400014–77-C-0360). Knoxville: University of Tennessee, Department of Psychology.
San Martín, E., del Pino, G., & De Boeck, P. (2006). IRT models for ability-based guessing. Applied Psychological Measurement, 30, 183–203.
Stott, H.P. (2006). Cumulative prospect theory’s functional menagerie. Journal of Risk and Uncertainty, 32, 101–130.
Thissen, D., & Steinberg, L. (1984). A response model for multiple choice items. Psychometrika, 49, 501–519.
Thurstone, L.L. (1919). A method for scoring tests. Psychological Bulletin, 16, 235–240.
Traub, R.E., Hambleton, R.K., & Singh, B. (1969). Effects of promised reward and threatened penalty on performance of a multiple choice vocabulary test. Educational and Psychological Measurement, 29, 847–861.
Tversky, A., & Kahneman, D. (1992). Advances in prospect theory: Cumulative representation of uncertainty. Journal of Risk and Uncertainty, 5, 297–323.
Tversky, A., & Kahneman, D. (1991). Loss aversion in riskless choice: A reference-dependent model. The Quarterly Journal of Economics, 106, 1039–1061.
Wallsten, T.S., & Budescu, D.V. (1983). Encoding subjective probabilities: A psychological and psychometric review. Management Science, 29, 151–173.
Williams, C.A. (1966). Attitudes towards speculative risk as an indicator of attitudes towards pure risk. Journal of Risk and Insurance, 33, 577–587.
Wright, G., & Ayton, P. (1994). Subjective probability. Chichester: Wiley.