
Robust Measurement via A Fused Latent and Graphical Item Response Theory Model

Published online by Cambridge University Press:  01 January 2025

Yunxiao Chen
Affiliation:
Emory University
Xiaoou Li
Affiliation:
University of Minnesota
Jingchen Liu*
Affiliation:
Columbia University
Zhiliang Ying
Affiliation:
Columbia University
* Correspondence should be made to Jingchen Liu, Columbia University, New York, USA. Email: jcliu@stat.columbia.edu

Abstract

Item response theory (IRT) plays an important role in psychological and educational measurement. Unlike classical test theory, IRT models aggregate item-level information, yielding more accurate measurements. Most IRT models assume local independence, an assumption unlikely to hold in practice, especially when the number of items is large. Results in the literature and simulation studies in this paper show that wrongly assuming local independence may result in inaccurate measurements and differential item functioning. To provide more robust measurements, we propose an integrated approach that adds a graphical component to a multidimensional IRT model to offset the effect of unknown local dependence. The new model contains a confirmatory latent variable component, which measures the targeted latent traits, and a graphical component, which captures the local dependence. An efficient proximal algorithm is proposed for parameter estimation and for learning the structure of the local dependence. This approach can substantially improve measurement even when no prior information on the local dependence structure is available. The model can be applied to measure either a unidimensional latent trait or multidimensional latent traits.
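
The abstract names a proximal algorithm but does not spell it out. As a rough illustration of the general technique only, the sketch below shows one proximal gradient step under the assumption that local dependence is encoded in a symmetric matrix `S` whose off-diagonal entries carry an L1 penalty. The names `S`, `grad`, `step`, and `lam` are hypothetical, the gradient is a placeholder, and none of this should be read as the authors' actual procedure.

```python
import numpy as np

def soft_threshold(x, tau):
    """Proximal operator of tau * ||x||_1: shrink each entry toward zero by tau."""
    return np.sign(x) * np.maximum(np.abs(x) - tau, 0.0)

def prox_offdiag_l1(S, tau):
    """Soft-threshold the off-diagonal entries of S; leave the diagonal unpenalized."""
    out = soft_threshold(S, tau)
    np.fill_diagonal(out, np.diag(S))  # restore the original diagonal
    return out

def proximal_gradient_step(S, grad, step, lam):
    """Gradient move on the smooth loss, followed by the L1 proximal map.

    Assumption: `grad` is the gradient of some smooth loss with respect to S;
    the loss itself is not specified in the abstract.
    """
    return prox_offdiag_l1(S - step * grad, step * lam)

# Toy symmetric matrix; a zero gradient is used so only the shrinkage is visible.
S = np.array([[1.0, 0.05, -0.8],
              [0.05, 2.0, 0.3],
              [-0.8, 0.3, 0.5]])
S_new = proximal_gradient_step(S, grad=np.zeros_like(S), step=1.0, lam=0.1)
print(S_new)
```

In this toy run the small entry 0.05 is set exactly to zero while larger entries are only shrunk, which is how an L1 proximal step can select a sparse dependence structure instead of merely estimating weights.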

Type
Original Paper
Copyright
Copyright © The Psychometric Society 2018

Footnotes

Electronic supplementary material: The online version of this article (https://doi.org/10.1007/s11336-018-9610-4) contains supplementary material, which is available to authorized users.

Supplementary material: File

Chen et al. supplementary material
