Item Response Theory Observed-Score Kernel Equating

Björn Andersson; Marie Wiberg

doi:10.1007/s11336-016-9528-7


Published online by Cambridge University Press: 01 January 2025


Björn Andersson*: Affiliation: Beijing Normal University; Uppsala University
Marie Wiberg: Affiliation: Umeå University
*Correspondence should be made to Björn Andersson, Collaborative Innovation Center of Assessment toward Basic Education Quality, Beijing Normal University, No. 19 Xinjiekou Wai Street, Haidian District, 100875 Beijing, China. Email: bjoern.andersson@bnu.edu.cn


Abstract

Item response theory (IRT) observed-score kernel equating is introduced for the non-equivalent groups with anchor test equating design using either chain equating or post-stratification equating. The equating function is treated in a multivariate setting and the asymptotic covariance matrices of IRT observed-score kernel equating functions are derived. Equating is conducted using the two-parameter and three-parameter logistic models with simulated data and data from a standardized achievement test. The results show that IRT observed-score kernel equating offers small standard errors and low equating bias under most settings considered.
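As a rough illustration of the two ingredients named in the abstract, the sketch below first derives a sum-score distribution from a two-parameter logistic model using the Lord-Wingersky recursion and then continuizes it with a Gaussian kernel. It is a minimal sketch under stated assumptions, not the authors' implementation: the item parameters, the quadrature grid, the bandwidth `h`, and all function names are hypothetical, and the equating step itself (NEAT design, chain or post-stratification equating, standard errors) is not shown.

```python
import math


def p_2pl(theta, a, b):
    """Probability of a correct response under the two-parameter logistic model."""
    return 1.0 / (1.0 + math.exp(-a * (theta - b)))


def score_dist_given_theta(theta, items):
    """Lord-Wingersky recursion: sum-score distribution at a fixed ability."""
    dist = [1.0]  # with no items, score 0 has probability 1
    for a, b in items:
        p = p_2pl(theta, a, b)
        new = [0.0] * (len(dist) + 1)
        for score, prob in enumerate(dist):
            new[score] += prob * (1.0 - p)   # item answered incorrectly
            new[score + 1] += prob * p       # item answered correctly
        dist = new
    return dist


def marginal_score_dist(items, nodes, weights):
    """Average the conditional distributions over a discrete ability grid."""
    total = sum(weights)
    out = [0.0] * (len(items) + 1)
    for theta, w in zip(nodes, weights):
        for score, prob in enumerate(score_dist_given_theta(theta, items)):
            out[score] += (w / total) * prob
    return out


def kernel_cdf(x, probs, h):
    """Gaussian-kernel continuized CDF that preserves the mean and variance."""
    mu = sum(j * r for j, r in enumerate(probs))
    var = sum((j - mu) ** 2 * r for j, r in enumerate(probs))
    a = math.sqrt(var / (var + h * h))
    total = 0.0
    for j, r in enumerate(probs):
        z = (x - a * j - (1.0 - a) * mu) / (a * h)
        total += r * 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))
    return total


# Hypothetical (discrimination, difficulty) pairs for a three-item test.
items = [(1.0, -0.5), (1.2, 0.0), (0.8, 0.7)]
# Hypothetical ability grid with unnormalized standard normal weights.
nodes = [-4.0 + 0.5 * k for k in range(17)]
weights = [math.exp(-0.5 * t * t) for t in nodes]

score_probs = marginal_score_dist(items, nodes, weights)
print(score_probs)
print(kernel_cdf(1.5, score_probs, h=0.6))
```

In a full equating, continuized distributions of this kind for the two test forms would be combined by mapping a score on one form through its CDF and the inverse CDF of the other.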

Keywords

observed-score equating; item response theory; equipercentile equating; standard errors; NEAT design

Type: Original Paper
Information: Psychometrika, Volume 82, Issue 1, March 2017, pp. 48-66

DOI: https://doi.org/10.1007/s11336-016-9528-7
Copyright: © 2016 The Psychometric Society


Footnotes

Electronic supplementary material The online version of this article (doi:10.1007/s11336-016-9528-7) contains supplementary material, which is available to authorized users.

The first author acknowledges the financial support from the Collaborative Innovation Center of Assessment toward Basic Education Quality at Beijing Normal University. The research in this article by the second author was funded by the Swedish Research Council Grant 2014-578.

