Robust Measurement via A Fused Latent and Graphical Item Response Theory Model

Yunxiao Chen; Xiaoou Li; Jingchen Liu; Zhiliang Ying

doi:10.1007/s11336-018-9610-4

Published online by Cambridge University Press: 01 January 2025


Yunxiao Chen, Emory University
Xiaoou Li, University of Minnesota
Jingchen Liu*, Columbia University
Zhiliang Ying, Columbia University
*Correspondence should be made to Jingchen Liu, Columbia University, New York, USA. Email: jcliu@stat.columbia.edu


Abstract

Item response theory (IRT) plays an important role in psychological and educational measurement. Unlike classical test theory, IRT models aggregate item-level information, yielding more accurate measurements. Most IRT models assume local independence, an assumption that is unlikely to hold in practice, especially when the number of items is large. Results in the literature and simulation studies in this paper reveal that misspecifying the local independence assumption may result in inaccurate measurements and differential item functioning. To provide more robust measurements, we propose an integrated approach that adds a graphical component to a multidimensional IRT model to offset the effect of unknown local dependence. The new model contains a confirmatory latent variable component, which measures the targeted latent traits, and a graphical component, which captures the local dependence. An efficient proximal algorithm is proposed for parameter estimation and for learning the structure of the local dependence. This approach can substantially improve the measurement even when no prior information on the local dependence structure is available. The model can be applied to measure either a unidimensional latent trait or multidimensional latent traits.
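
As a rough illustration of the kind of update a proximal algorithm performs, the sketch below applies soft-thresholding to the off-diagonal entries of a symmetric matrix of local dependence parameters. The L1 penalty form, the matrix name S, the step size, the tuning parameter lam, and the function name soft_threshold_offdiag are all assumptions made for illustration; the abstract does not specify them, and this is not the authors' implementation.

```python
import numpy as np


def soft_threshold_offdiag(S, step, lam):
    """Proximal step for an assumed L1 penalty on off-diagonal entries.

    Shrinks each off-diagonal entry of S toward zero by step * lam and
    leaves the diagonal untouched (assumed to be unpenalized).
    """
    S = np.asarray(S, dtype=float)
    # Elementwise soft-thresholding: sign(s) * max(|s| - step * lam, 0).
    shrunk = np.sign(S) * np.maximum(np.abs(S) - step * lam, 0.0)
    # Restore the original diagonal, which the assumed penalty does not touch.
    np.fill_diagonal(shrunk, np.diag(S))
    return shrunk


# Hypothetical 3-item example: small off-diagonal entries are set to zero,
# which is how a sparse local dependence graph could emerge.
S_example = np.array([
    [1.0, 0.3, -0.05],
    [0.3, -2.0, 0.0],
    [-0.05, 0.0, 0.5],
])
S_updated = soft_threshold_offdiag(S_example, step=1.0, lam=0.1)
```

In a full proximal gradient scheme, a step like this would follow a gradient step on the smooth part of the objective; entries shrunk exactly to zero would correspond to item pairs with no estimated local dependence.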

Keywords

item response theory; local dependence; robust measurement; differential item functioning; graphical model; Ising model; pseudo-likelihood; regularized estimator; Eysenck personality questionnaire-revised

Information

Type: Original Paper
Information: Psychometrika, Volume 83, Issue 3, September 2018, pp. 538-562

DOI: https://doi.org/10.1007/s11336-018-9610-4
Copyright: Copyright © The Psychometric Society 2018


Footnotes

Electronic supplementary material The online version of this article (https://doi.org/10.1007/s11336-018-9610-4) contains supplementary material, which is available to authorized users.


