
A Hierarchical Multi-Unidimensional IRT Approach for Analyzing Sparse, Multi-Group Data for Integrative Data Analysis

Published online by Cambridge University Press: 01 January 2025

Yan Huo*
Affiliation:
Rutgers, The State University of New Jersey
Jimmy de la Torre
Affiliation:
Rutgers, The State University of New Jersey
Eun-Young Mun
Affiliation:
Rutgers, The State University of New Jersey
Su-Young Kim
Affiliation:
Ewha Womans University
Anne E. Ray
Affiliation:
Rutgers, The State University of New Jersey
Yang Jiao
Affiliation:
Rutgers, The State University of New Jersey
Helene R. White
Affiliation:
Rutgers, The State University of New Jersey
* Correspondence should be sent to Yan Huo, Graduate School of Education, Rutgers, The State University of New Jersey, New Brunswick, NJ, USA. E-mail: yan.huo@gmail.com

Abstract

The present paper proposes a hierarchical, multi-unidimensional two-parameter logistic item response theory (2PL-MUIRT) model extended to accommodate a large number of groups. The proposed model was motivated by a large-scale integrative data analysis (IDA) study that combined data (N = 24,336) from 24 independent alcohol intervention studies. IDA projects face challenges that individual studies do not, such as the need to establish a common scoring metric across studies and to handle missingness in the pooled data. To address these challenges, we developed a Markov chain Monte Carlo (MCMC) algorithm for a hierarchical 2PL-MUIRT model for multiple groups that estimates not only the item parameters and latent traits but also the means and covariance structures of the multiple dimensions across groups. Whereas the few existing MCMC algorithms for multidimensional IRT models constrain the item parameters to facilitate estimation of the covariance matrix, we adapted an MCMC algorithm so that the correlation matrix for the anchor group could be estimated directly, without any constraints on the item parameters. The feasibility of the MCMC algorithm and the validity of the basic calibration procedure were examined in a simulation study. Results showed that model parameters could be adequately recovered and that estimated latent trait scores closely approximated the true latent trait scores. The algorithm was then applied to real data (69 items across 20 studies for 22,608 participants). The posterior predictive model check showed that the model fit all items well, and the correlations between the MCMC scores and the original scores were overall quite high. An additional simulation study demonstrated that the MCMC procedures were robust to a high proportion of missing data. The Bayesian hierarchical IRT model using the MCMC algorithms developed in the current study has the potential to be widely implemented for IDA studies or multi-site studies, and can be further refined to meet more complicated needs in applied research.
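
To make the data structure described above concrete, the following is a minimal Python sketch of how multi-group, multi-unidimensional 2PL data with study-level missingness could be generated. The abstract does not state the model equation, so the sketch assumes the standard 2PL form P(X = 1) = 1 / (1 + exp(-a(θ - b))), with each item loading on exactly one dimension. All function names, parameter values, group means, and the item-by-study administration pattern are hypothetical and chosen only for illustration; this is a data-generating sketch, not the MCMC estimation algorithm developed in the paper.

```python
import numpy as np


def irf_2pl(theta, a, b):
    """Assumed 2PL item response function: P(X = 1 | theta)."""
    return 1.0 / (1.0 + np.exp(-a * (theta - b)))


def simulate_2pl_muirt(n_per_group, group_means, corr, a, b, item_dim, rng):
    """Simulate binary responses for several groups (studies).

    Each item j measures exactly one dimension, item_dim[j]
    (the multi-unidimensional structure). Latent traits in group g
    are drawn from a multivariate normal with mean group_means[g]
    and a common correlation matrix (a simplifying assumption).
    """
    thetas, groups = [], []
    for g, mu in enumerate(group_means):
        thetas.append(rng.multivariate_normal(mu, corr, size=n_per_group))
        groups.append(np.full(n_per_group, g))
    theta = np.vstack(thetas)          # shape (N, D)
    group = np.concatenate(groups)     # shape (N,)
    theta_item = theta[:, item_dim]    # trait relevant to each item, (N, J)
    p = irf_2pl(theta_item, a, b)
    x = (rng.random(p.shape) < p).astype(float)
    return x, theta, group


def mask_unadministered(x, group, administered):
    """Set responses to NaN for items a study did not administer."""
    x = x.copy()
    x[~administered[group]] = np.nan
    return x


# Hypothetical setup: 3 studies, 2 dimensions, 6 items (3 per dimension).
rng = np.random.default_rng(0)
group_means = np.array([[0.0, 0.0], [0.5, -0.3], [-0.2, 0.4]])
corr = np.array([[1.0, 0.6], [0.6, 1.0]])
item_dim = np.array([0, 0, 0, 1, 1, 1])
a = np.array([1.0, 1.5, 0.8, 1.2, 1.0, 1.4])    # discriminations
b = np.array([-1.0, 0.0, 1.0, -0.5, 0.5, 0.0])  # difficulties
administered = np.array(
    [[1, 1, 1, 1, 1, 1],   # study 0 (anchor) administers every item
     [1, 1, 0, 1, 1, 0],
     [0, 1, 1, 0, 1, 1]],
    dtype=bool,
)

x, theta, group = simulate_2pl_muirt(200, group_means, corr, a, b, item_dim, rng)
x_obs = mask_unadministered(x, group, administered)
print("proportion missing:", np.isnan(x_obs).mean())
```

In this sketch the missingness is entirely by design: a study either administered an item or it did not. Items shared across studies are what link the studies to a common scale. Pooled IDA data may also contain other kinds of missingness, which this example does not represent.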

Type
Original Paper
Copyright
Copyright © 2014 The Psychometric Society

