
Item Bias Detection using Loglinear IRT

Published online by Cambridge University Press:  01 January 2025

Henk Kelderman*
Affiliation:
University of Twente
* Requests for reprints should be sent to Henk Kelderman, University of Twente, PO Box 217, 7500 AE Enschede, THE NETHERLANDS.

Abstract

A method is proposed for the detection of item bias with respect to observed or unobserved subgroups. The method uses quasi-loglinear models for the incomplete subgroup × test score × item 1 × ... × item k contingency table. If subgroup membership is unknown, the models are Haberman's incomplete-latent-class models.
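
As a reading aid, the incompleteness of the table can be written out explicitly. The notation here is illustrative and not necessarily the paper's: g indexes the subgroup, t the test score, and x_i the response to item i, with items assumed dichotomous (scored 0 or 1) and G subgroups. Because the test score is determined by the item responses, most cells can never be observed:

```latex
n_{g\,t\,x_1 \cdots x_k} = 0
\quad \text{whenever} \quad
t \neq \sum_{i=1}^{k} x_i ,
\qquad
\underbrace{G\,(k+1)\,2^{k}}_{\text{cells in the full table}}
\;\longrightarrow\;
\underbrace{G\,2^{k}}_{\text{cells that are not structurally zero}} .
```

Under these assumptions, a quasi-loglinear model is a loglinear model fitted only to the cells that are not structurally zero.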

The (conditional) Rasch model is formulated as a quasi-loglinear model. The parameters in this loglinear model that correspond to the main effects of the item responses are the conditional estimates of the parameters in the Rasch model. Item bias can then be tested by comparing the quasi-loglinear Rasch model with models that contain parameters for the interaction of item responses and the subgroups.
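
The model comparison described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the paper's procedure or the LOGIMO program: subgroups are observed, items are dichotomous, the counts are made up, and the models are fitted by plain iterative proportional fitting (a technique swapped in here for brevity). The Rasch-type model is taken to have a subgroup × score margin plus one main-effect margin per item; the bias model additionally lets the margin of one item (index `J`, hypothetical) vary by subgroup. All names (`fit_ipf`, `g2`, `item_margin`) are invented for this sketch.

```python
from itertools import product
from math import log

K = 3   # number of items (hypothetical)
J = 0   # index of the item examined for bias (hypothetical)

# Made-up counts for illustration only: one row per subgroup, one entry per
# response pattern in the order produced by product((0, 1), repeat=K).
raw = {
    0: [20, 15, 25, 30, 40, 45, 50, 60],
    1: [30, 20, 30, 25, 20, 30, 35, 40],
}
patterns = list(product((0, 1), repeat=K))
# A cell is (subgroup, pattern). The score is sum(pattern), so only the
# cells that are not structurally zero are ever stored.
counts = {(g, x): n for g, row in raw.items() for x, n in zip(patterns, row)}


def fit_ipf(counts, margins, cycles=500):
    """Fit a loglinear model by iterative proportional fitting.

    Each element of `margins` maps a cell to a label; the fitted counts are
    rescaled until they reproduce the observed total for every label.
    """
    fitted = {cell: 1.0 for cell in counts}
    for _ in range(cycles):
        for margin in margins:
            obs, fit = {}, {}
            for cell, n in counts.items():
                key = margin(cell)
                obs[key] = obs.get(key, 0.0) + n
                fit[key] = fit.get(key, 0.0) + fitted[cell]
            for cell in fitted:
                key = margin(cell)
                fitted[cell] *= obs[key] / fit[key]
    return fitted


def g2(counts, fitted):
    """Likelihood-ratio statistic G^2 = 2 * sum n * log(n / m)."""
    return 2.0 * sum(n * log(n / fitted[c]) for c, n in counts.items() if n > 0)


def score_by_group(cell):
    """Subgroup x score margin: ability distributions may differ by subgroup."""
    return (cell[0], sum(cell[1]))


def item_margin(i, by_group=False):
    """Main effect of item i, optionally interacting with the subgroup."""
    if by_group:
        return lambda cell: (cell[0], cell[1][i])
    return lambda cell: cell[1][i]


rasch_margins = [score_by_group] + [item_margin(i) for i in range(K)]
bias_margins = [score_by_group] + [
    item_margin(i, by_group=(i == J)) for i in range(K)
]

fit_rasch = fit_ipf(counts, rasch_margins)
fit_bias = fit_ipf(counts, bias_margins)
g2_rasch = g2(counts, fit_rasch)
g2_bias = g2(counts, fit_bias)
lr = g2_rasch - g2_bias  # improvement from adding the item x subgroup term

print(f"G2 Rasch = {g2_rasch:.3f}, G2 bias = {g2_bias:.3f}, difference = {lr:.3f}")
```

With two subgroups, the bias model adds one parameter for the examined item, so under the usual large-sample assumptions the difference would be referred to a chi-square distribution with one degree of freedom. A large difference suggests that the item behaves differently across subgroups after conditioning on the test score.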

Type
Original Paper
Copyright
Copyright © 1989 The Psychometric Society


Footnotes

The author thanks Wim J. van der Linden and Gideon J. Mellenbergh for comments and suggestions and Frank Kok for empirical data.

References

Baker, R. J., & Nelder, J. A. (1978). The GLIM system: Generalized linear interactive modeling. Oxford: The Numerical Algorithms Group.
Berk, R. A. (1982). Handbook of methods for detecting test bias. Baltimore: The Johns Hopkins University Press.
Binet, A., & Simon, T. (1916). The development of intelligence in children. Baltimore: Williams & Wilkins.
Bishop, Y. M. M., Fienberg, S. E., & Holland, P. W. (1975). Discrete multivariate analysis. Cambridge, MA: MIT Press.
Camilli, G. (1979). A critique of the chi-square method for assessing item bias. Boulder: University of Colorado, Laboratory of Educational Research.
Cressie, N., & Holland, P. W. (1983). Characterizing the manifest probabilities of latent trait models. Psychometrika, 48, 129–142.
Dempster, A. P., Laird, N. M., & Rubin, D. B. (1977). Maximum likelihood from incomplete data via the EM algorithm. Journal of the Royal Statistical Society, 39, 1–38.
Duncan, O. D. (1984). Rasch measurement: Further examples and discussion. In Turner, C. F. & Martin, E. (Eds.), Surveying subjective phenomena (pp. 367–403). New York: Russell Sage Foundation.
Durovic, J. (1975). Definitions of test bias: A taxonomy and an illustration of an alternative model. Unpublished doctoral dissertation, State University of New York at Albany.
Fienberg, S. E. (1972). The analysis of incomplete multi-way contingency tables. Biometrics, 28, 177–202.
Fischer, G. H., & Forman, A. F. (1982). Some applications of logistic latent trait models with linear constraints on parameters. Applied Psychological Measurement, 6, 397–416.
Goodman, L. A. (1974). Exploratory latent structure analysis. Biometrika, 61, 215–231.
Goodman, L. A. (1975). A new model for scaling response patterns: An application of the quasi-independence concept. Journal of the American Statistical Association, 70, 755–768.
Goodman, L. A. (1978). Analyzing qualitative/categorical data: Loglinear models and latent structure analysis. London: Addison Wesley.
Goodman, L. A., & Fay, R. (1974). ECTA program, description for users. Chicago: University of Chicago, Department of Statistics.
Haberman, S. J. (1979). Analysis of qualitative data: New developments (Vol. 2). New York: Academic Press.
Holland, P. W. (1985). On the study of differential item performance without IRT. Paper presented at the Annual Meeting of the Military Testing Association, San Diego.
Holland, P. W., & Thayer, D. (1986). Differential item performance and the Mantel-Haenszel statistic. Paper presented at the Annual Meeting of the American Educational Research Association, San Francisco.
Ironson, G. H. (1982). Use of chi-square and latent trait approaches for detecting item bias. In Berk, R. A. (Ed.), Handbook of methods for detecting test bias. Baltimore: The Johns Hopkins University Press.
Jensen, A. R. (1980). Bias in mental testing. London: Methuen.
Kelderman, H. (1984). Loglinear Rasch model tests. Psychometrika, 49, 223–245.
Kelderman, H. (1987). Estimating quasi-loglinear models for a Rasch table if the number of items is large. Enschede: University of Twente, Department of Education.
Kelderman, H., & Steen, R. (1988). LOGIMO: A program for loglinear IRT modeling. Enschede: University of Twente, Department of Education.
Kok, F. G. (1982). Het partijdige item [The biased item]. Psychologisch Laboratorium, University of Amsterdam.
Kok, F. G., & Mellenbergh, G. J. (1985, July). A mathematical model for item bias and a definition of bias effect size. Paper presented at the Fourth Meeting of the Psychometric Society, Cambridge, Great Britain.
Kok, F. G., Mellenbergh, G. J., & van der Flier, H. (1985). An iterative procedure for detecting biased items. Journal of Educational Measurement, 22, 295–303.
Larntz, K. (1978). Small-sample comparisons of exact levels for chi-square statistics. Journal of the American Statistical Association, 73, 412–419.
Lazarsfeld, P. F. (1950). The interpretation and computation of some latent structures. In Stouffer, S. A. et al. (Eds.), Measurement and prediction in World War II (pp. 413–472). Princeton: Princeton University Press.
Lazarsfeld, P. F., & Henry, N. W. (1968). Latent structure analysis. Boston: Houghton Mifflin.
Lord, F. M. (1980). Applications of item response theory to practical testing problems. Hillsdale, New Jersey: Lawrence Erlbaum.
McHugh, R. B. (1956). Efficient estimation and local identification in latent class analysis. Psychometrika, 21, 331–347.
Mellenbergh, G. J. (1982). Contingency table methods for assessing item bias. Journal of Educational Statistics, 7, 105–118.
Mislevy, R. J. (1981). A general linear model for the analysis of Rasch item threshold estimates. Unpublished doctoral dissertation, University of Chicago.
Muthén, B., & Lehman, J. (1985). Multiple group IRT modeling: Applications to item bias analysis. Journal of Educational Statistics, 10, 133–142.
Nungester, R. J. (1977). An empirical examination of three models of item bias. Dissertation Abstracts International, 38, 2726A. (University Microfilms No. 77-24,289, Doctoral dissertation, Florida State University, 1977)
Osterlind, S. J. (1983). Test item bias. Beverly Hills: Sage.
Petersen, N. S., & Novick, M. R. (1976). An evaluation of some models for culture-fair selection. Journal of Educational Measurement, 13, 3–29.
Rao, C. R. (1965). Linear statistical inference and its applications. New York: Wiley.
Rasch, G. (1960). Probabilistic models for some intelligence and attainment tests. Copenhagen: Paedagogiske Institut.
Rasch, G. (1966). An item analysis that takes individual differences into account. British Journal of Mathematical and Statistical Psychology, 19, 49–57.
Scheuneman, J. (1979). A method of assessing bias in test items. Journal of Educational Measurement, 16, 143–152.
Shepard, L. A., Camilli, G., & Averill, M. (1981). Comparison of procedures for detecting test-item bias with both internal and external ability criteria. Journal of Educational Statistics, 6, 317–377.
Tjur, T. (1982). A connection between Rasch's item analysis model and a multiplicative Poisson model. Scandinavian Journal of Statistics, 9, 23–30.
Wright, B. D., Mead, R. J., & Draba, R. (1975). Detecting and correcting test item bias with a logistic response model (RM 22). Chicago: University of Chicago, Department of Education, Statistical Laboratory.