Rasch Trees: A New Method for Detecting Differential Item Functioning in the Rasch Model

Carolin Strobl; Julia Kopf; Achim Zeileis

doi:10.1007/s11336-013-9388-3

Published online by Cambridge University Press: 01 January 2025

Carolin Strobl*: Universität Zürich
Julia Kopf: Ludwig-Maximilians-Universität München
Achim Zeileis: Universität Innsbruck
*: Requests for reprints should be sent to Carolin Strobl, Department of Psychology, Universität Zürich, Binzmühlestr. 14, 8050 Zürich, Switzerland. E-mail: Carolin.Strobl@psychologie.uzh.ch

Abstract

A variety of statistical methods have been suggested for detecting differential item functioning (DIF) in the Rasch model. Most of these methods are designed for the comparison of pre-specified focal and reference groups, such as males and females. Latent class approaches, on the other hand, allow the detection of previously unknown groups exhibiting DIF. However, these approaches provide no straightforward interpretation of the groups with respect to person characteristics. Here, we propose a new method for DIF detection based on model-based recursive partitioning that can be considered a compromise between those two extremes. With this approach it is possible to detect groups of subjects exhibiting DIF that are not pre-specified but result from combinations of observed covariates. These groups are directly interpretable and can thus help generate hypotheses about the psychological sources of DIF. The statistical background and construction of the new method are introduced by means of an instructive example, and extensive simulation studies are presented to support and illustrate the statistical properties of the method, which is then applied to empirical data from a general knowledge quiz. A software implementation of the method is freely available in the R system for statistical computing.
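
For orientation, the Rasch model and the notion of DIF referred to in the abstract can be written in standard notation as follows. This is only a minimal sketch: the symbols \theta_i (ability of person i), \beta_j (difficulty of item j), U_{ij} (response of person i to item j), and the group superscripts (g), (h) are assumptions introduced here for illustration and are not taken from the abstract itself.

```latex
% Rasch model: probability that person i answers item j correctly
P(U_{ij} = 1 \mid \theta_i, \beta_j)
  = \frac{\exp(\theta_i - \beta_j)}{1 + \exp(\theta_i - \beta_j)}

% Item j exhibits DIF between two groups g and h
% if its difficulty parameter differs between them
\beta_j^{(g)} \neq \beta_j^{(h)}
```

Read this way, detecting DIF amounts to finding groups of persons, here groups defined by combinations of observed covariates, for which at least one item difficulty differs.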

Keywords

item response theory; IRT; Rasch model; differential item functioning; DIF; measurement invariance; structural change; model-based recursive partitioning

Type: Original Paper
Information: Psychometrika, Volume 80, Issue 2, June 2015, pp. 289–316

DOI: https://doi.org/10.1007/s11336-013-9388-3
Copyright: Copyright © 2013 The Psychometric Society
