
Learning Large Q-Matrix by Restricted Boltzmann Machines

Published online by Cambridge University Press:  01 January 2025

Chengcheng Li
Affiliation: University of Michigan
Chenchen Ma
Affiliation: University of Michigan
Gongjun Xu
Affiliation: University of Michigan

Correspondence should be made to Gongjun Xu, Department of Statistics, University of Michigan, Ann Arbor, USA. Email: gongjun@umich.edu

Abstract

Estimation of the large Q-matrix in cognitive diagnosis models (CDMs) with many items and latent attributes from observational data has been a major challenge due to its high computational cost. Borrowing ideas from the deep learning literature, we propose to learn the large Q-matrix with restricted Boltzmann machines (RBMs) to overcome these computational difficulties. In this paper, key relationships between RBMs and CDMs are identified, and consistent and robust learning of the Q-matrix in various CDMs is shown to be valid under certain conditions. Our simulation studies under different CDM settings show that RBMs not only outperform existing methods in learning speed but also maintain good recovery accuracy of the Q-matrix. Finally, we illustrate the applicability and effectiveness of our method through a TIMSS mathematics data set.
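The abstract's core idea — treating item responses as visible units and latent attributes as hidden units of an RBM, then reading a Q-matrix estimate off the learned item-by-attribute weights — can be sketched as follows. This is an illustrative sketch only, not the paper's implementation: the `RBM` class, the CD-1 update, and the weight-thresholding rule in `estimate_q_matrix` are assumptions made here for exposition.

```python
import numpy as np


def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))


class RBM:
    """Bernoulli-Bernoulli RBM trained with one-step contrastive divergence (CD-1)."""

    def __init__(self, n_visible, n_hidden, lr=0.1, seed=0):
        self.rng = np.random.default_rng(seed)
        # W[j, k] links item j (visible) to attribute k (hidden).
        self.W = 0.01 * self.rng.standard_normal((n_visible, n_hidden))
        self.b = np.zeros(n_visible)  # visible (item) biases
        self.c = np.zeros(n_hidden)   # hidden (attribute) biases
        self.lr = lr

    def _sample(self, p):
        return (self.rng.random(p.shape) < p).astype(float)

    def cd1_step(self, v0):
        """One CD-1 parameter update from a batch of binary responses v0 (n x J)."""
        # Positive phase: hidden activation probabilities given the data.
        ph0 = sigmoid(v0 @ self.W + self.c)
        h0 = self._sample(ph0)
        # Negative phase: one Gibbs step back to the visible layer and up again.
        v1 = self._sample(sigmoid(h0 @ self.W.T + self.b))
        ph1 = sigmoid(v1 @ self.W + self.c)
        # CD-1 gradient approximation: data statistics minus reconstruction statistics.
        n = v0.shape[0]
        self.W += self.lr * (v0.T @ ph0 - v1.T @ ph1) / n
        self.b += self.lr * (v0 - v1).mean(axis=0)
        self.c += self.lr * (ph0 - ph1).mean(axis=0)


def estimate_q_matrix(rbm, threshold=0.1):
    """Binarize the learned weights to obtain a J x K Q-matrix estimate."""
    return (np.abs(rbm.W) > threshold).astype(int)
```

A thresholding rule of this kind is one simple way to map continuous RBM weights to a binary Q-matrix; in practice, sparsity-inducing regularization during training makes the cut-off less arbitrary.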

Type
Theory & Methods
Copyright
Copyright © 2022 The Author(s) under exclusive licence to The Psychometric Society


Footnotes

Supplementary Information The online version contains supplementary material available at https://doi.org/10.1007/s11336-021-09828-4.
