
Item Selection in Multidimensional Computerized Adaptive Testing—Gaining Information from Different Angles

Published online by Cambridge University Press:  01 January 2025

Chun Wang*
Affiliation:
University of Illinois at Urbana-Champaign
Hua-Hua Chang
Affiliation:
University of Illinois at Urbana-Champaign
* Requests for reprints should be sent to Chun Wang, Department of Psychology, University of Illinois at Urbana-Champaign, 603 E. Daniel St., Champaign, IL 61820, USA. E-mail: cwang49@cyrus.psych.uiuc.edu

Abstract

Over the past thirty years, obtaining diagnostic information from examinees’ item responses has become an increasingly important feature of educational and psychological testing. This objective can be achieved by sequentially selecting multidimensional items to fit the class of latent traits being assessed, and Multidimensional Computerized Adaptive Testing (MCAT) is therefore one reasonable approach to such a task. This study conducts a rigorous investigation of the relationships among four promising item selection methods: D-optimality, the KL information index, continuous entropy, and mutual information. Some theoretical connections among the methods are demonstrated to show how information about the unknown vector θ can be gained from different perspectives. Two simulation studies were carried out to compare the performance of the four methods. The simulation results showed that mutual information not only improved the overall estimation accuracy but also yielded the smallest conditional mean squared error in most regions of θ. Finally, overlap rates were calculated to show empirically the similarities and differences among the four methods.
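
As an illustration of the first of the four criteria, the following is a minimal sketch of D-optimality item selection. It is not the authors' implementation. It assumes a compensatory multidimensional two-parameter logistic (M2PL) model and a Bayesian variant in which a prior precision matrix is added to the accumulated Fisher information; neither assumption is stated in the abstract. All function and variable names (`m2pl_prob`, `item_information`, `select_d_optimal`, `A`, `D`) are hypothetical.

```python
import numpy as np

def m2pl_prob(theta, a, d):
    """P(correct) under an assumed compensatory M2PL model: logistic(a . theta + d)."""
    return 1.0 / (1.0 + np.exp(-(a @ theta + d)))

def item_information(theta, a, d):
    """Fisher information matrix of one item at theta: P(1 - P) * a a^T."""
    p = m2pl_prob(theta, a, d)
    return p * (1.0 - p) * np.outer(a, a)

def select_d_optimal(theta_hat, A, D, administered, prior_precision):
    """Return the index of the unused item that maximizes the determinant of
    (prior precision + information of administered items + candidate information),
    all evaluated at the current estimate theta_hat."""
    base = np.array(prior_precision, dtype=float)  # copy, so the caller's matrix is untouched
    for j in administered:
        base += item_information(theta_hat, A[j], D[j])

    best_item, best_logdet = None, -np.inf
    for j in range(len(D)):
        if j in administered:
            continue
        candidate = base + item_information(theta_hat, A[j], D[j])
        # slogdet is numerically safer than det for comparing determinants
        sign, logdet = np.linalg.slogdet(candidate)
        if sign > 0 and logdet > best_logdet:
            best_item, best_logdet = j, logdet
    return best_item

# Toy two-dimensional bank (made-up parameters, for illustration only)
A = np.array([[1.5, 0.0], [0.0, 1.5], [0.2, 0.2]])  # discrimination vectors
D = np.zeros(3)                                      # intercepts
next_item = select_d_optimal(np.zeros(2), A, D, {0}, np.eye(2))
print(next_item)
```

In this toy bank, item 0 loads only on the first dimension and has already been administered, so the criterion favors the item that adds information on the second dimension. The other three criteria would replace the determinant with a different objective evaluated over the same candidate set.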

Type
Original Paper
Copyright
Copyright © 2011 The Psychometric Society

