Paradoxical Results and Item Bundles

Published online by Cambridge University Press:  01 January 2025

Giles Hooker*
Affiliation:
Cornell University
Matthew Finkelman
Affiliation:
Tufts School of Dental Medicine
* Requests for reprints should be sent to Giles Hooker, Cornell University, Ithaca, NY, USA. E-mail: giles.hooker@cornell.edu

Abstract

Hooker, Finkelman, and Schwartzman (Psychometrika, 2009, in press) defined a paradoxical result as the attainment of a higher test score by changing answers from correct to incorrect, and demonstrated that such results are unavoidable for maximum likelihood estimates in multidimensional item response theory. The potential for these results to occur leads to the undesirable possibility that a subject’s best answer is detrimental to them. This paper considers the existence of paradoxical results in tests composed of item bundles when compensatory models are used. We demonstrate that paradoxical results can occur when bundle effects are modeled as nuisance parameters for each subject. However, when these nuisance parameters are modeled as random effects, or used in a Bayesian analysis, it is possible to design tests composed of many short bundles that avoid paradoxical results, and we provide an algorithm for doing so. We also examine alternative models for handling dependence between item bundles and show that using fixed dependency effects is guaranteed to avoid paradoxical results.
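
To make the definition of a paradoxical result concrete, the following is a minimal sketch of one in a plain two-dimensional compensatory logistic model, fitted by maximum likelihood. It is a toy illustration only and not the item-bundle models or the test-design algorithm studied in the paper. The item parameters, the response patterns, and the function name `mle_theta` are all assumptions chosen for the example.

```python
import numpy as np

def mle_theta(A, b, y, n_iter=50):
    """Newton-Raphson MLE of theta for P(y_i = 1) = sigmoid(a_i . theta - b_i).

    A: (items x dims) discrimination matrix, b: item difficulties,
    y: 0/1 responses. All inputs here are illustrative assumptions.
    """
    theta = np.zeros(A.shape[1])
    for _ in range(n_iter):
        p = 1.0 / (1.0 + np.exp(-(A @ theta - b)))
        grad = A.T @ (y - p)                       # score vector
        info = A.T @ (A * (p * (1 - p))[:, None])  # Fisher information
        step = np.linalg.solve(info, grad)
        theta = theta + step
        if np.max(np.abs(step)) < 1e-12:
            break
    return theta

# Hypothetical 8-item test: items 0-3 load on dimension 1 only,
# items 4-7 load equally on both dimensions (compensatory).
A = np.array([[1.0, 0.0]] * 4 + [[1.0, 1.0]] * 4)
b = np.zeros(8)

y_before = np.array([1, 1, 0, 0, 1, 1, 0, 0], dtype=float)
y_after = y_before.copy()
y_after[0] = 0.0  # one correct answer on a dimension-1 item becomes incorrect

theta_before = mle_theta(A, b, y_before)
theta_after = mle_theta(A, b, y_after)

print("before:", theta_before)
print("after: ", theta_after)
```

In this toy case the score equations can be solved by hand: the estimate moves from (0, 0) to roughly (-1.0986, 1.0986), that is (-ln 3, ln 3). Turning a correct answer into an incorrect one lowers the estimate on the first dimension but raises it on the second, because the unchanged responses to the two-dimensional items must now be explained by a higher second ability.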

Type
Theory and Methods
Copyright
Copyright © 2010 The Psychometric Society

Footnotes

The authors would like to thank an anonymous referee of Hooker et al. (2009) for suggesting the problem of item bundles.

References

Ackerman, T. (1996). Graphical representation of multidimensional item response theory analyses. Applied Psychological Measurement, 20(4), 311–329.
Bock, R., Gibbons, R., & Muraki, E. (1988). Full-information item factor analysis. Applied Psychological Measurement, 12, 261–280.
Craven, B.D. (1988). Fractional programming, Berlin: Heldermann.
Douglas, J.A., Roussos, L.A., & Stout, W. (1996). Item-bundle DIF hypothesis testing: Identifying suspect bundles and assessing their differential functioning. Journal of Educational Measurement, 33, 465–484.
Finkelman, M., Hooker, G., & Wang, J. (2009). Technical Report BU-1768-M, Department of Biological Statistics and Computational Biology, Cornell University.
Hooker, G., Finkelman, M., & Schwartzman, A. (2009). Paradoxical results in multidimensional item response theory. Psychometrika, 74(3), 419–442.
Hoskens, M., & de Boeck, P. (1997). A parametric model for local dependence among test items. Psychological Methods, 2, 261–277.
Kelderman, H. (1984). Loglinear Rasch model tests. Psychometrika, 49, 223–245.
Li, Y., Bolt, D.M., & Fu, J. (2006). A comparison of alternative models for testlets. Applied Psychological Measurement, 30(1), 3–21.
McCullagh, P., & Nelder, J.A. (1989). Generalized linear models, London: Chapman and Hall/CRC.
Reckase, M. (1985). The difficulty of test items that measure more than one ability. Applied Psychological Measurement, 9, 401–412.
Rijmen, F., Tuerlinckx, F., de Boeck, P., & Kuppens, P. (2003). A nonlinear mixed model framework for item response theory. Psychological Methods, 8(2), 185–205.
Rosenbaum, P.R. (1988). Item bundles. Psychometrika, 53, 349–359.
Veldkamp, B.P. (2002). Multidimensional constrained test assembly. Applied Psychological Measurement, 26(2), 133–146.
Wang, W., & Wilson, M. (2005). The Rasch testlet model. Applied Psychological Measurement, 29(2), 126–149.
Wang, X., Bradlow, E.T., & Wainer, H. (2002). A general Bayesian model for testlets: Theory and applications. Applied Psychological Measurement, 26(1), 109–128.
Wilson, M., & Adams, R.J. (1995). Rasch models for item bundles. Psychometrika, 60, 181–198.