Semiparametric Factor Analysis for Item-Level Response Time Data

Yang Liu; Weimeng Wang

doi:10.1007/s11336-021-09832-8

Published online by Cambridge University Press: 01 January 2025

Yang Liu*: University of Maryland
Weimeng Wang: University of Maryland
*Correspondence should be made to Yang Liu, Department of Human Development and Quantitative Methodology, University of Maryland, College Park, USA. Email: yliu87@umd.edu

Abstract

Item-level response time (RT) data can be conveniently collected from computer-based test/survey delivery platforms and have been demonstrated to bear a close relation to a miscellany of cognitive processes and test-taking behaviors. Individual differences in general processing speed can be inferred from item-level RT data using factor analysis. Conventional linear normal factor models make strong parametric assumptions, sacrificing modeling flexibility for interpretability, and thus are not ideal for describing complex associations between observed RT and the latent speed. In this paper, we propose a semiparametric factor model with minimal parametric assumptions. Specifically, we adopt a functional analysis of variance representation for the log conditional densities of the manifest variables, in which the main effect and interaction functions are approximated by cubic splines. Penalized maximum likelihood estimation of the spline coefficients can be performed by an Expectation-Maximization algorithm, and the penalty weight can be empirically determined by cross-validation. In a simulation study, we compare the semiparametric model with incorrectly and correctly specified parametric factor models with regard to the recovery of the data-generating mechanism. A real data example is also presented to demonstrate the advantages of the proposed method.
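The functional ANOVA representation of a log conditional density described above can be illustrated with a minimal Python sketch. It is not the authors' implementation. It assumes, for illustration only, that both the manifest variable y and the latent variable x have been rescaled to the unit interval, and it uses a truncated-power cubic spline basis with hypothetical coefficients. Identification side conditions, the penalty term, and the EM estimation step are all omitted. The sketch only shows how a main effect in y plus a y-by-x interaction defines a conditional density once the exponentiated sum is normalized over y.

```python
import math

def cubic_spline_basis(t, knots):
    # Truncated-power cubic spline basis without the constant term:
    # t, t^2, t^3, and (t - k)^3_+ for each interior knot k.
    return [t, t ** 2, t ** 3] + [max(t - k, 0.0) ** 3 for k in knots]

def log_kernel(y, x, main_coef, inter_coef, knots):
    # Unnormalized log conditional density: main effect of y plus y-by-x interaction.
    # Terms depending on x alone would cancel in the normalization, so none are included.
    by = cubic_spline_basis(y, knots)
    bx = cubic_spline_basis(x, knots)
    main = sum(a * b for a, b in zip(main_coef, by))
    inter = sum(inter_coef[j][k] * by[j] * bx[k]
                for j in range(len(by)) for k in range(len(bx)))
    return main + inter

def trapezoid(values, step):
    # Trapezoidal rule on an equally spaced grid.
    return step * (sum(values) - 0.5 * (values[0] + values[-1]))

def conditional_density(x, main_coef, inter_coef, knots, n_grid=201):
    # Evaluate f(y | x) on a grid over [0, 1], normalizing numerically over y.
    step = 1.0 / (n_grid - 1)
    grid = [i * step for i in range(n_grid)]
    kernel = [math.exp(log_kernel(y, x, main_coef, inter_coef, knots)) for y in grid]
    norm = trapezoid(kernel, step)
    return grid, [v / norm for v in kernel]

# Hypothetical coefficients, chosen only to make the example run.
knots = [0.5]
main_coef = [1.0, -2.0, 0.5, 1.0]
inter_coef = [[0.3 * (j - k) for k in range(4)] for j in range(4)]

grid, dens_low = conditional_density(0.1, main_coef, inter_coef, knots)
_, dens_high = conditional_density(0.9, main_coef, inter_coef, knots)
```

Setting every interaction coefficient to zero makes the resulting density the same for all values of x, which is one way to see that the interaction function carries the entire dependence of the manifest variable on the latent variable in this representation.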

Keywords

Factor analysis; Conditional density estimation; Functional ANOVA; Cubic spline; Penalized maximum likelihood; Expectation–maximization algorithm

Information

Type: Theory and Methods
Information: Psychometrika, Volume 87, Issue 2: Special Issue on Forecasting with Intensive Longitudinal Data, June 2022, pp. 666–692

DOI: https://doi.org/10.1007/s11336-021-09832-8
Copyright: © 2021 The Author(s), under exclusive licence to The Psychometric Society

Footnotes

Supplementary Information: The online version contains supplementary material available at https://doi.org/10.1007/s11336-021-09832-8.
