Item Screening in Graphical Loglinear Rasch Models

Svend Kreiner; Karl Bang Christensen

doi:10.1007/s11336-011-9203-y

Published online by Cambridge University Press: 01 January 2025

Svend Kreiner*: University of Copenhagen
Karl Bang Christensen: University of Copenhagen
* Requests for reprints should be sent to Svend Kreiner, Department of Biostatistics, University of Copenhagen, Oster Farimagsgade 5, B, POB 2029, 1014 Copenhagen K, Denmark. E-mail: s.kreiner@biostat.ku.dk

Abstract

In the behavioural sciences, local dependence and differential item functioning (DIF) are common, and purification procedures that eliminate items with these weaknesses often result in short scales with poor reliability. Graphical loglinear Rasch models (Kreiner & Christensen, in Statistical Methods for Quality of Life Studies, ed. by M. Mesbah, F.C. Cole & M.T. Lee, Kluwer Academic, pp. 187–203, 2002), in which uniform DIF and uniform local dependence are permitted, solve this dilemma by modelling the local dependence and DIF. Identifying loglinear Rasch models by a stepwise model search is often very time-consuming, since the initial item analysis may disclose a great deal of spurious and misleading evidence of DIF and local dependence that has to be disposed of during the modelling procedure.

Like graphical models, graphical loglinear Rasch models possess Markov properties that are useful during the statistical analysis if they are used methodically. This paper describes how. It contains a systematic study of the Markov properties and the way they can be used to distinguish spurious from genuine evidence of DIF and local dependence and proposes a strategy for initial item screening that will reduce the time needed to identify a graphical loglinear Rasch model that fits the item responses. The last part of the paper illustrates the item screening procedure on simulated data and on data on the PF subscale measuring physical functioning in the SF36 Health Survey inventory.
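As a rough illustration of the kind of statistic such a screening step can rest on, the sketch below computes a partial gamma coefficient between one item and an exogenous grouping variable, stratified by the total score. It is a minimal sketch under stated assumptions, not the screening procedure proposed in the paper: the function names (`concordant_discordant`, `partial_gamma`, `dif_partial_gamma`), the data layout (one list of item scores per person), and the choice of the total score as the conditioning variable are all assumptions made for illustration.

```python
from collections import defaultdict
from itertools import combinations


def concordant_discordant(pairs):
    """Count concordant and discordant pairs among (x, y) observations."""
    conc = disc = 0
    for (x1, y1), (x2, y2) in combinations(pairs, 2):
        sign = (x1 - x2) * (y1 - y2)
        if sign > 0:
            conc += 1
        elif sign < 0:
            disc += 1
        # sign == 0 means a tie on x or y; ties are ignored by gamma
    return conc, disc


def partial_gamma(x, y, stratum):
    """Partial gamma: pool concordant/discordant counts across strata."""
    groups = defaultdict(list)
    for xi, yi, si in zip(x, y, stratum):
        groups[si].append((xi, yi))
    conc = disc = 0
    for pairs in groups.values():
        c, d = concordant_discordant(pairs)
        conc += c
        disc += d
    if conc + disc == 0:
        return None  # no informative pairs in any stratum
    return (conc - disc) / (conc + disc)


def dif_partial_gamma(responses, item, group):
    """Hypothetical helper: association between one item and a grouping
    variable, conditional on the total score (an assumed choice of stratum)."""
    x = [r[item] for r in responses]
    total = [sum(r) for r in responses]
    return partial_gamma(x, group, total)
```

A value near zero is what one would expect when the item and the grouping variable are conditionally unrelated given the score; how far from zero counts as evidence, and whether that evidence is genuine or spurious, is the question the paper addresses.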

Keywords

chain graph models; graphical Rasch models; loglinear Rasch models; global Markov properties; differential item functioning; local dependence; Mantel–Haenszel analysis; partial gamma coefficient

Type: Original Paper
Information: Psychometrika, Volume 76, Issue 2, April 2011, pp. 228–256

DOI: https://doi.org/10.1007/s11336-011-9203-y
Copyright: Copyright © 2011 The Psychometric Society

References

Ackerman, T.A. (1992). A didactic explanation of item bias, item impact, and item validity from a multidimensional perspective. Journal of Educational Measurement, 29, 67–91.

Agresti, A. (1984). Analysis of ordinal categorical data. New York: Wiley.

Andersen, E.B. (1977). Sufficient statistics and latent trait models. Psychometrika, 42, 69–81.

Anderson, C.J., Böckenholt, U. (2000). Graphical regression models for polytomous variables. Psychometrika, 65, 497–509.

Anderson, C.J., Yu, H.-T. (2007). Log-multiplicative association models as item response models. Psychometrika, 72, 5–23.

Bartolucci, F. (2007). A class of multidimensional IRT models for testing unidimensionality and clustering items. Psychometrika, 72, 141–158.

Bartolucci, F., Forcina, A. (2005). Likelihood inference on the underlying structure of IRT models. Psychometrika, 70, 31–44.

Benjamini, Y., Hochberg, Y. (1995). Controlling the false discovery rate: A practical and powerful approach to multiple testing. Journal of the Royal Statistical Society, Series B, 57, 289–300.

Besag, J., Clifford, P. (1991). Sequential Monte Carlo p-values. Biometrika, 78, 301–304.

Bishop, Y.M.M., Fienberg, S.E., Holland, P.W. (1975). Discrete multivariate analysis: theory and practice. Cambridge: MIT Press.

Christensen, K.B., Kreiner, S. (2007). A Monte Carlo approach to unidimensionality testing in polytomous Rasch models. Journal of Applied Psychological Measurement, 31, 20–30.

Clauser, B., Mazor, K.M., Hambleton, R.K. (1994). The effect of score group width on the Mantel–Haenszel procedure. Journal of Educational Measurement, 31, 67–78.

Davis, J.A. (1967). A partial coefficient for Goodman and Kruskal’s Gamma. Journal of the American Statistical Association, 69, 174–180.

Dawid, A.P. (1979). Conditional independence in statistical theory (with discussion). Journal of the Royal Statistical Society, Series A, 147, 278–292.

Fayers, P.M., Machin, D. (2007). Quality of life: the assessment, analysis, and interpretation of patient reported outcomes (2nd ed.). Chichester: Wiley.

Fidalgo, A.M., Mellenbergh, G.J., Muniz, J. (2000). Effects of DIF, test length, and purification type on robustness and power of Mantel–Haenszel procedures. Methods of Psychological Research Online, 5, 43–53.

Fischer, G.H. (1995). The derivation of polytomous Rasch models. In Fischer, G.H., Molenaar, I.W. (Eds.), Rasch models: Foundations, recent developments, and applications (pp. 293–306). New York: Springer.

Finch, H. (2005). The MIMIC model as a method for detecting DIF: comparison with Mantel–Haenszel, SIBTEST and the IRT Likelihood Ratio. Applied Psychological Measurement, 29, 278–295.

Frank, O., Strauss, D. (1986). Markov graphs. Journal of the American Statistical Association, 81, 832–842.

French, B.F., Maller, S.J. (2007). Iterative purification and effect size use with logistic regression for differential item functioning detection. Educational and Psychological Measurement, 67, 373–393.

Hagenaars, J.A. (1998). Categorical causal modelling: latent class analysis and directed Log-linear models with latent variables. Sociological Methods and Research, 26, 436–486.

Hanson, B.A. (1998). Uniform DIF and DIF defined by differences in item response functions. Journal of Educational and Behavioral Statistics, 23, 244–253.

Holland, P.W. (1981). When are item response models consistent with observed data. Psychometrika, 46, 79–92.

Holland, P.W., Hoskens, M. (2003). Classical test theory as a first-order item response theory: Application to true-score prediction from a possible nonparallel test. Psychometrika, 68, 123–150.

Holland, P.W., Rosenbaum, P.R. (1986). Conditional association and unidimensionality in monotone latent variable models. Annals of Statistics, 14, 1523–1543.

Holland, P.W., Thayer, D.T. (1988). Differential item performance and the Mantel–Haenszel procedure. In Wainer, H., Braun, H. (Eds.), Test validity (pp. 129–145). Hillsdale: Lawrence Erlbaum Associates.

Hoskens, M., De Boeck, P. (1997). A parametric model for local dependence among test items. Psychological Methods, 2, 261–277.

Humphreys, K., Titterington, D.M. (2003). Variational approximations for categorical causal modelling with latent variables. Psychometrika, 68, 391–412.

Ip, E.H. (2001). Testing for local dependence in dichotomous item response models. Psychometrika, 66, 109–132.

Ip, E.H. (2002). Locally dependent latent trait model and the Dutch Identity revisited. Psychometrika, 67, 367–386.

Junker, B.W. (1993). Conditional association, essential independence and monotone unidimensional item response models. Annals of Statistics, 21, 1359–1378.

Junker, B.W., Sijtsma, K. (2000). Latent and manifest monotonicity in item response models. Applied Psychological Measurement, 24, 65–81.

Kelderman, H. (1984). Loglinear Rasch model tests. Psychometrika, 49, 223–245.

Kelderman, H. (1989). Item bias detection using loglinear IRT. Psychometrika, 54, 681–697.

Kelderman, H. (1992). Computing maximum likelihood estimates of loglinear models from marginal sums with special attention to loglinear item response theory. Psychometrika, 57, 437–450.

Kelderman, H. (2005). Building IRT models from scratch: Graphical models, exchangeability, marginal freedom, scale type, and latent traits. In van der Ark, A., Croon, M.A., Sijtsma, K. (Eds.), New developments in categorical data analysis for the social and behavioural Sciences (pp. 167–187). Hillsdale: Lawrence Erlbaum.

Kreiner, S. (1986). Computerized exploratory screening of large-dimensional contingency tables. In De Antoni, F., Lauro, N., Rizzi, A. (Eds.), COMPSTAT 1986 (pp. 43–48). Heidelberg: Physica Verlag.

Kreiner, S. (1987). Analysis of multidimensional contingency tables by exact conditional tests: Techniques and strategies. Scandinavian Journal of Statistics, 14, 97–112.

Kreiner, S. (1993/2006). Validation of index scales for analysis of survey data. In Dean, K. (Ed.), Population health research (pp. 116–144). London: Sage Publications. Reprinted in D.J. Bartholomew (Ed.) (2006), Measurement, vol. III (pp. 297–328). London: Sage Publications.

Kreiner, S. (2003). Introduction to DIGRAM (Research report 03/10). Copenhagen: Dept. of Biostatistics, Univ. of Copenhagen.

Kreiner, S. (2007). Validity and objectivity: reflections on the role and nature of Rasch models. Nordic Psychology, 59, 268–298.

Kreiner, S., Christensen, K.B. (2002). Graphical Rasch models. In Mesbah, M., Cole, F.C., Lee, M.T. (Eds.), Statistical methods for quality of life studies (pp. 187–203). Dordrecht: Kluwer Academic.

Kreiner, S., Christensen, K.B. (2004). Analysis of local dependence and multidimensionality in graphical loglinear Rasch models. Communications in Statistics. Theory and Methods, 33, 1239–1276.

Kreiner, S., Christensen, K.B. (2006). Validity and objectivity in health related summated scales: Analysis by graphical loglinear Rasch models. In von Davier, M., Carstensen, C.H. (Eds.), Multivariate and mixture distribution Rasch models: extensions and applications (pp. 329–346). New York: Springer.

Kreiner, S., Pedersen, J.H., Siersma, V. (2009). Derivation and testing hypotheses in chain graph models (Research report 09/9). Copenhagen: Dept. of Biostatistics, University of Copenhagen. Retrieved from http://biostat.ku.dk/reports/2009/Research_report_09-09.pdf

Lauritzen, S.L. (1996). Graphical models. Oxford: Clarendon Press.

Lord, F.M. (1980). Applications of item response theory to practical testing problems. Hillsdale: Lawrence Erlbaum.

Mazor, K.M., Clauser, B.E., Hambleton, R.K. (1992). The effect of sample size on the functioning of the Mantel–Haenszel statistic. Educational and Psychological Measurement, 52, 443–451.

Mellenbergh, G.J. (1982). Contingency table models for assessing item bias. Journal of Educational Statistics, 7, 105–108.

Park, D.G., Lautenschlager, G.J. (1990). Improving IRT item bias with iterative linking and ability scale purification. Applied Psychological Measurement, 14, 1163–1173.

Penfield, R.D. (2001). Assessing differential item functioning among multiple groups: A comparison of three Mantel–Haenszel procedures. Applied Measurement in Education, 14, 235–259.

Penfield, R.D., Camilli, G. (2007). Differential item functioning and item bias. In Rao, C.R., Sinharay, S. (Eds.), Handbook of statistics: psychometrics (pp. 125–168). Amsterdam: Elsevier.

Raju, N.S., Drasgow, F., Slinde, J.A. (1993). An empirical comparison of the area methods, Lord’s chi-square test, and the Mantel–Haenszel technique for assessing differential item functioning. Educational and Psychological Measurement, 53, 301–315.

Rasch, G. (1961/2006). On general laws and the meaning of measurement in psychology. In Neyman, J. (Ed.), Proceedings of the 4th Berkeley symposium on mathematical statistics and probability (pp. 321–333). Berkeley: University of California Press. Reprinted in D.J. Bartholomew (Ed.), Measurement, vol. I (pp. 319–334). London: Sage Publications.

Rijmen, F., Vansteelandt, K., De Boeck, P. (2008). Latent class models for diary method data: Parameter estimation by local computations. Psychometrika, 73, 167–182.

Rosenbaum, P.R. (1984). Testing the conditional independence and monotonicity assumptions of item response theory. Psychometrika, 49, 425–435.

Rosenbaum, P.R. (1988). Item Bundles. Psychometrika, 53, 349–359.

Rosenbaum, P.R. (1989). Criterion-related construct validity. Psychometrika, 54, 625–633.

Sue, Y.-H., Wang, W.-C. (2005). Efficiency of the Mantel, Generalized Mantel–Haenszel, and logistic discriminant function analysis methods in detecting differential item functioning for polytomous items. Applied Measurement in Education, 18, 313–350.

Swaminathan, H., Rogers, J.H. (1990). Detecting differential item functioning using logistic regression procedures. Journal of Educational Measurement, 27, 361–370.

Tjur, T. (1982). A connection between Rasch’s item analysis model and a multiplicative Poisson model. Scandinavian Journal of Statistics, 9, 23–30.

Van der Ark, L.A., Bergsma, W.P. (2010). A Note on stochastic ordering of the latent trait using the sum of polytomous item scores. Psychometrika, 75, 272–279.

Williams, N.J., Beretvas, S.N. (2006). DIF identification using HGLM for polytomous items. Applied Psychological Measurement, 30, 22–42.

Zumbo, B.D. (1999). A handbook on the theory and methods of differential item functioning (DIF). Ottawa: Directorate of Human Resources Research and Evaluation, National Defence.
