
Application of a Multidimensional Nested Logit Model to Multiple-Choice Test Items

Published online by Cambridge University Press: 01 January 2025

Daniel M. Bolt*
Affiliation:
University of Wisconsin, Madison
James A. Wollack
Affiliation:
University of Wisconsin, Madison
Youngsuk Suh
Affiliation:
Rutgers University
* Requests for reprints should be sent to Daniel M. Bolt, University of Wisconsin, Madison, Room 859, Educational Sciences Building, Madison, WI, USA. E-mail: dmbolt@wisc.edu

Abstract

Nested logit models have been presented as an alternative to multinomial logistic models for multiple-choice test items (Suh and Bolt in Psychometrika 75:454–473, 2010) and possess a mathematical structure that naturally lends itself to evaluating the incremental information provided by attending to distractor selection in scoring. One potential concern in attending to distractors is the possibility that distractor selection reflects a different trait/ability than that underlying the correct response. This paper illustrates a multidimensional extension of a nested logit item response model that can be used to evaluate such distinctions and also defines a new framework for incorporating collateral information from distractor selection when differences exist. The approach is demonstrated in application to questions faced by a university testing center over whether to incorporate distractor selection into the scoring of its multiple-choice tests. Several empirical examples are presented.
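To make the structure referred to above concrete, the following is a minimal sketch of the nested logit model of Suh and Bolt (2010) and the multidimensional extension the paper studies; the notation here is illustrative and may not match the paper's exact parameterization. The correct/incorrect response U_ij of examinee j to item i follows a 2PL-type model, and, conditional on an incorrect response, the selected distractor V_ij follows a nominal-model-type choice among the K_i - 1 distractors:

\[
P(U_{ij}=1 \mid \theta_{1j}) \;=\; \frac{\exp\{a_i(\theta_{1j}-b_i)\}}{1+\exp\{a_i(\theta_{1j}-b_i)\}},
\qquad
P(V_{ij}=k \mid U_{ij}=0,\,\theta_{2j}) \;=\; \frac{\exp(\zeta_{ik}+\lambda_{ik}\theta_{2j})}{\sum_{k'=1}^{K_i-1}\exp(\zeta_{ik'}+\lambda_{ik'}\theta_{2j})}.
\]

In the multidimensional extension, the trait underlying the correct response, \(\theta_{1j}\), and the trait underlying distractor choice, \(\theta_{2j}\), are allowed to be distinct but correlated; constraining \(\theta_{1j}=\theta_{2j}\) recovers the unidimensional nested logit model, so the estimated correlation between the two traits indexes how much collateral information distractor selection carries about the ability being scored.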

Type
Original Paper
Copyright
Copyright © 2012 The Psychometric Society


References

Baker, F.B., & Kim, S.-H. (2004). Item response theory: Parameter estimation techniques (2nd ed.). New York: Dekker.
Bechger, T.M., Maris, G., Verstralen, H.H.F.M., & Verhelst, N.D. (2005). The Nedelsky model for multiple-choice items. In van der Ark, L.A., Croon, M.A., & Sijtsma, K. (Eds.), New developments in categorical data analysis for the social and behavioral sciences (pp. 187–206). Mahwah: Lawrence Erlbaum Associates.
Bock, R.D. (1972). Estimating item parameters and latent ability when responses are coded in two or more nominal categories. Psychometrika, 37, 29–51.
Bolt, D.M., & Johnson, T.R. (2009). Addressing score bias and DIF due to individual differences in response style. Applied Psychological Measurement, 33, 335–352.
de la Torre, J. (2009). Improving the quality of ability estimates through multidimensional scoring and incorporation of ancillary variables. Applied Psychological Measurement, 33, 465–485.
de la Torre, J., & Patz, R.J. (2005). Making the most of what we have: A practical application of multidimensional item response theory in test scoring. Journal of Educational and Behavioral Statistics, 30, 295–311.
Geisser, S., & Eddy, W. (1979). A predictive approach to model selection. Journal of the American Statistical Association, 74, 153–160.
Gelfand, A., & Dey, D. (1994). Bayesian model choice: Asymptotic and exact calculations. Journal of the Royal Statistical Society, Series B, 56, 501–514.
Gelman, A., & Rubin, D.B. (1992). Inference from iterative simulation using multiple sequences (with discussion). Statistical Science, 7, 457–511.
Geweke, J. (1992). Evaluating the accuracy of sampling-based approaches to calculating posterior moments. In Bernardo, J.M., Berger, J.O., Dawid, A.P., & Smith, A.F.M. (Eds.), Bayesian statistics (pp. 169–193). Oxford: Oxford University Press.
Hutchinson, T.P. (1991). Ability, partial information, and guessing: Statistical modeling applied to multiple-choice tests. Sydney: Rumsby Scientific.
Li, Y., Bolt, D.M., & Fu, J. (2006). A comparison of alternative models for testlets. Applied Psychological Measurement, 30, 3–21.
Ntzoufras, I. (2009). Bayesian modeling using WinBUGS. Hoboken: Wiley.
Patz, R.J., & Junker, B.W. (1999). A straightforward approach to Markov chain Monte Carlo methods for item response models. Journal of Educational and Behavioral Statistics, 24, 146–178.
Raftery, A.E., & Lewis, S.M. (1992). How many iterations in the Gibbs sampler? In Bernardo, J.M., Berger, J.O., Dawid, A.P., & Smith, A.F.M. (Eds.), Bayesian statistics (pp. 765–776). Oxford: Oxford University Press.
Rodriguez, M.C. (2003). Construct equivalence of multiple-choice and constructed-response items: A random effects synthesis of correlations. Journal of Educational Measurement, 40, 163–184.
Samejima, F. (1972). A general model for free-response data. Psychometrika Monographs, 18.
Spiegelhalter, D.J., Thomas, A., Best, N., & Lunn, D. (2003). WinBUGS user's manual (Version 1.4) [Computer software manual]. Cambridge: MRC Biostatistics Unit.
Suh, Y., & Bolt, D.M. (2010). Nested logit models for multiple-choice item response data. Psychometrika, 75, 454–473.
Thissen, D., Cai, L., & Bock, R.D. (2010). The nominal categories item response model. In Nering, M.L., & Ostini, R. (Eds.), Handbook of polytomous item response theory models: Development and applications (pp. 43–75). Philadelphia: Taylor & Francis.
Thissen, D., & Steinberg, L. (1984). A response model for multiple-choice items. Psychometrika, 49, 501–519.
Thissen, D., & Steinberg, L. (1986). A taxonomy of item response models. Psychometrika, 51, 567–577.
Tutz, G. (1990). Sequential item response models with an ordered response. British Journal of Mathematical and Statistical Psychology, 43, 39–55.
van der Linden, W.J., Klein Entink, R.H., & Fox, J.-P. (2010). IRT parameter estimation with response times as collateral information. Applied Psychological Measurement, 34, 327–347.
Wainer, H., Vevea, J.L., Camacho, F., Reeve, B.B., Rosa, K., Nelson, L., Swygert, K.A., & Thissen, D. (2001). Augmented scores—"Borrowing strength" to compute scores based on small numbers of items. In Thissen, D., & Wainer, H. (Eds.), Test scoring (pp. 343–388). Mahwah: Erlbaum.
Wollack, J.A., Bolt, D.M., Cohen, A.S., & Lee, Y.-S. (2002). Recovery of item parameters in the nominal response model: A comparison of marginal maximum likelihood estimation and Markov chain Monte Carlo estimation. Applied Psychological Measurement, 26, 337–350.
Yen, W.M., & Fitzpatrick, A.R. (2006). Item response theory. In Brennan, R.L. (Ed.), Educational measurement (4th ed., pp. 111–153). Westport: Praeger.