
Identifying and Minimizing Measurement Invariance among Intersectional Groups

The Alignment Method Applied to Multi-category Items

Published online by Cambridge University Press: 23 June 2023

Rachel A. Gordon, Northern Illinois University
Tianxiu Wang, University of Pittsburgh
Hai Nguyen, University of Illinois, Chicago
Ariel M. Aloe, University of Iowa

Summary

This Element demonstrates how and why the alignment method can advance measurement fairness in developmental science. It explains the method's application to multi-category items in an accessible way, offering sample code and demonstrating an R package that facilitates interpretation of such items' multiple thresholds. It shows what ignoring between-category threshold differences, by treating items as continuous, implies for group mean differences, using an example of intersectional groups defined by assigned sex and race/ethnicity. It also demonstrates how to interpret item-level partial non-invariance results and their implications for group-level differences, and encourages substantive theorizing about measurement fairness.
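To make concrete why ignored threshold differences matter, here is a minimal simulation sketch (hypothetical threshold values and group setup, not the Element's own code or data): two groups share an identical latent trait distribution, but one group's item thresholds are shifted, so treating the ordinal item as continuous produces a spurious observed mean gap.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 100_000

# Both groups have identical latent trait distributions (mean 0, SD 1),
# so there is no true group difference to find.
theta_a = rng.normal(0.0, 1.0, n)
theta_b = rng.normal(0.0, 1.0, n)

# A four-category item (scored 0-3) is produced by cutting the latent
# trait at thresholds. Group B's thresholds are shifted upward by 0.5,
# i.e., the item is non-invariant: the same latent level earns a lower
# observed category in group B. (Illustrative values, not estimates.)
thresholds_a = np.array([-1.0, 0.0, 1.0])
thresholds_b = thresholds_a + 0.5

item_a = np.digitize(theta_a, thresholds_a)  # categories 0, 1, 2, 3
item_b = np.digitize(theta_b, thresholds_b)

# Treating the item as continuous, the groups appear to differ by
# roughly 0.4 raw-score points despite equal latent means.
mean_a, mean_b = item_a.mean(), item_b.mean()
print(f"observed means: A = {mean_a:.2f}, B = {mean_b:.2f}")
```

In a threshold-aware model (for example, a graded response model fit with alignment), the shifted thresholds would be absorbed into the item parameters rather than misread as a latent mean difference.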
Information

Type: Element
Online ISBN: 9781009357784
Publisher: Cambridge University Press
Print publication: 06 July 2023


