
Examining Differential Item Functioning from a Multidimensional IRT Perspective

Published online by Cambridge University Press:  01 January 2025

Terry A. Ackerman*
Affiliation:
The University of Iowa
Ye Ma
Affiliation:
Amazon Web Services
* Correspondence should be made to Terry A. Ackerman, The University of Iowa, 8 North Shore Drive, Edwardsville, IL 62025, USA. Email: tackerman@uiowa.edu

Abstract

Differential item functioning (DIF) is a standard analysis for every testing company. Research has demonstrated that DIF can result when test items measure different ability composites, and the groups being examined for DIF exhibit distinct underlying ability distributions on those composite abilities. In this article, we examine DIF from a two-dimensional multidimensional item response theory (MIRT) perspective. We begin by delving into the compensatory MIRT model, illustrating how items and the composites they measure can be graphically represented. Additionally, we discuss how estimated item parameters can vary based on the underlying latent ability distributions of the examinees. Analytical research highlighting the consequences of ignoring dimensionality and applying unidimensional IRT models, where the two-dimensional latent space is mapped onto a unidimensional scale, is reviewed. Next, we investigate three different approaches to understanding DIF from a MIRT standpoint: 1. Analytically Uniform and Nonuniform DIF: When two groups of interest have different two-dimensional ability distributions, a unidimensional model is estimated. 2. Accounting for complete latent ability space: We emphasize the importance of considering the entire latent ability space when using DIF conditional approaches, which leads to the mitigation of DIF effects. 3. Scenario-Based DIF: Even when underlying two-dimensional distributions are identical for two groups, differing problem-solving approaches can still lead to DIF. Modern software programs facilitate routine DIF procedures for comparing response data from two identified groups of interest. The real challenge is to identify why DIF could occur with flagged items. Thus, as a closing challenge, we present four items (Appendix A) from a standardized test and invite readers to identify which group was favored by a DIF analysis.

Type
Theory & Methods
Copyright
Copyright © 2024 The Author(s), under exclusive licence to The Psychometric Society

1. Introduction

Dimensionality has long posed challenges for testing practitioners attempting to model test response data. Most tests inherently measure different composites of the requisite skills outlined in their test specifications. It is important to understand that response data represent an interaction between examinees and the test items. While the resulting response data may appear unidimensional for one group of examinees, it could manifest as multidimensional for another group. For example, consider a math test that includes story problems that require both reading and math skills to answer correctly. If the test is written at a fourth-grade reading level and administered to fourth graders (some of whom may not read at the expected level), their responses may reflect deficits in either reading or math skills or both. However, when the same test is given to fifth graders who read at or above the fourth-grade level, the items should primarily differentiate based on math skills rather than reading abilities. Thus, due diligence demands that test practitioners thoroughly examine the dimensionality of the response data for individual subgroups as well as the entire test-taking population.

If the data are multidimensional, practitioners need to consider how the skills or subsequent scores may be misrepresented if the data are modeled as unidimensional. Fitting a two-dimensional model can help the practitioner understand substantively what composites are being measured and whether the potential for differential item functioning (DIF) exists.

DIF is a standard post-administration subgroup analysis conducted to ensure that test items do not favor one identifiable subgroup (e.g., males, females, whites, blacks, or Hispanics) when compared conditionally to another. The goal is to confirm test fairness. Over the years, many different approaches have been developed to detect DIF (Table 1). However, the challenge lies not merely in statistically identifying when items significantly favor one group over another, but rather in understanding the underlying reasons for why the DIF occurs.

Table 1 A compilation of seminal DIF methodology.

Kok (1988), Ackerman (1992), Camilli (1992), and Shealy and Stout (1993a, 1993b) have hypothesized that DIF can occur when items inadvertently measure invalid skills, and the two groups being examined have different ability distributions related to these invalid skills. These researchers described DIF from a two-dimensional perspective as

$$\varepsilon_{\theta_2}\left[P_{i,Ref}(u=1 \mid \theta_1,\theta_2) \mid \theta_1\right] \ne \varepsilon_{\theta_2}\left[P_{i,Foc}(u=1 \mid \theta_1,\theta_2) \mid \theta_1\right] \tag{1}$$

where

  1. $P_{i,Ref}$ and $P_{i,Foc}$ represent the probability of correct response for the Reference and Focal groups to item $i$,

  2. $\theta_1$ represents the valid skill that is intended to be measured by the test publisher, and

  3. $\theta_2$ represents an invalid or unintended-to-be-measured skill (e.g., speededness, test-wiseness, reading ability on a test designed to measure mathematics ability) that affects the correctness of an examinee’s response.

Even though the Reference and Focal group examinees have the same $\theta_1$ level of proficiency, DIF occurs because the $\theta_2$ latent ability distributions for the two groups are different. The inequality in Equation (1) cannot hold if

$$G_{Ref}\left(\theta_2 \mid \theta_1\right) = G_{Foc}\left(\theta_2 \mid \theta_1\right), \tag{2}$$

where $G_{Ref}$ and $G_{Foc}$ denote the conditional distributions of $\theta_2$ given fixed values of $\theta_1$. That is, for DIF to occur, items must measure invalid skills and the two groups of interest being examined must differ in their ability distributions on the invalid skill. It must be a “perfect storm”: both situations must occur. If no invalid skills are being measured, then ability differences on any invalid skill are moot and DIF should not occur. Likewise, if a test contains items that measure invalid skills, but the two groups of interest have identical underlying distributions on these invalid skills, no DIF should occur. Specifically, DIF manifests itself as a function of differences in underlying ability distributions.
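The two conditions can be checked numerically. The sketch below is only an illustration under stated assumptions: it takes the response function to be a compensatory two-dimensional logistic model with scaling constant 1.7, treats the conditional distribution of $\theta_2$ given $\theta_1$ as normal, and uses hypothetical item parameters and group means. It approximates the conditional expectation in Equation (1) with a simple trapezoid rule.

```python
import math

D = 1.7  # logistic scaling constant (assumed for this sketch)


def p_correct(theta1, theta2, a1, a2, d):
    """Compensatory two-dimensional logistic response function (assumed form)."""
    return 1.0 / (1.0 + math.exp(-D * (a1 * theta1 + a2 * theta2 + d)))


def normal_pdf(x, mean, sd):
    z = (x - mean) / sd
    return math.exp(-0.5 * z * z) / (sd * math.sqrt(2.0 * math.pi))


def expected_p_given_theta1(theta1, a1, a2, d, mean2, sd2, n=2001, width=8.0):
    """Approximate E[P(u=1 | theta1, theta2) | theta1] when
    theta2 | theta1 ~ Normal(mean2, sd2^2), using the trapezoid rule
    on [mean2 - width*sd2, mean2 + width*sd2]."""
    lo = mean2 - width * sd2
    step = 2.0 * width * sd2 / (n - 1)
    total = 0.0
    for k in range(n):
        t2 = lo + k * step
        weight = 0.5 if k in (0, n - 1) else 1.0
        total += weight * p_correct(theta1, t2, a1, a2, d) * normal_pdf(t2, mean2, sd2)
    return total * step


theta1 = 0.0  # both groups are matched on the valid skill

# Hypothetical item that also measures the invalid skill (a2 > 0).
item = {"a1": 1.0, "a2": 0.8, "d": 0.0}
ref = expected_p_given_theta1(theta1, mean2=0.0, sd2=1.0, **item)
foc_same = expected_p_given_theta1(theta1, mean2=0.0, sd2=1.0, **item)
foc_lower = expected_p_given_theta1(theta1, mean2=-1.0, sd2=1.0, **item)

# Hypothetical item that measures only the valid skill (a2 = 0).
pure = {"a1": 1.0, "a2": 0.0, "d": 0.0}
ref_pure = expected_p_given_theta1(theta1, mean2=0.0, sd2=1.0, **pure)
foc_pure = expected_p_given_theta1(theta1, mean2=-1.0, sd2=1.0, **pure)

print(f"invalid skill measured, same distributions:      {ref:.4f} vs {foc_same:.4f}")
print(f"invalid skill measured, different distributions: {ref:.4f} vs {foc_lower:.4f}")
print(f"no invalid skill, different distributions:       {ref_pure:.4f} vs {foc_pure:.4f}")
```

In this sketch the two conditional expectations differ only in the middle case, where the item loads on $\theta_2$ and the groups differ on $\theta_2$.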

While most DIF researchers simulate DIF by changing the parameters for one group versus another (and never stating why), our approach maintains identical generating item parameters for each group. Instead, we manipulate the underlying ability distributions (i.e., $G_{Ref}$ and $G_{Foc}$), resulting in distinct parameter estimates for each group. This perspective emphasizes the interaction between the skill composites being measured and the underlying ability distributions of the examinees. Consequently, it will be essential to focus initially on the two-dimensional IRT model and review relevant notation and characteristics that do not occur in the unidimensional IRT model.

In this address, we will build upon the five research studies cited above and illustrate how the multidimensional nature of a test can be used to comprehend and explore the underlying mechanisms of DIF using multidimensional item response theory (MIRT). Our analyses assume that testing practitioners have already conducted dimensionality assessments of their test response data (e.g., using scree plots (Cattell, 1966) or specialized software such as DETECT (Zhang and Stout, 1999) or DIMTEST (Stout, 1987)) and, in concert with test specifications, determined that their response data exhibit a two-dimensional structure. Subsequently, two-dimensional item response theory item parameters have been estimated.

The format for this article is as follows. First, a comprehensive review of the two-dimensional compensatory MIRT model is provided. This review includes an examination of how items can be graphically represented in a two-dimensional latent ability plane by detailing response surfaces, contour plots, item vector plots, and conditional centroid plots. Item vector plots provide insight into the range of composites being measured and assist testing practitioners in providing validity evidence regarding the test’s intended-to-be-measured skills. Items that measure the intended skills identified in a test’s specifications typically lie in an identifiable “validity sector.”

Following this, the work of Wang (1985) and Camilli (1992) is explained, focusing on the analytical derivation of the reference composite (RC) resulting from a unidimensional calibration of two-dimensional data. Using the RC direction, a centipede plot is created to illustrate how examinees’ latent abilities ($\theta_1$, $\theta_2$) are mapped onto the unidimensional IRT scale.

Next, Camilli’s (1992) work is examined, concentrating on the analytical derivation of unidimensional two-parameter logistic (2PL) IRT item parameter estimates when the underlying generating model is the two-dimensional compensatory MIRT model. Example results using these derivations are demonstrated for a hypothetical 19-item test specifically designed to showcase how changes in an examinee group’s underlying ability distribution affect the estimation of $\hat{a}$ and $\hat{b}$.

These explanations provide the terminology and graphical conceptualization as background for three studies that explore how dimensionality and disparate underlying latent ability distributions can influence DIF results. The first study adopts a strictly analytical approach, while the final two studies utilize two DIF statistics that condition on the number-correct scores: Mantel–Haenszel (Holland and Thayer, 1988) and Sibtest (Shealy and Stout, 1993a, 1993b). Simulated datasets are created for illustrative purposes to emphasize how DIF can occur. The article concludes with a challenge based on DIF analyses conducted on standardized test data. Readers are encouraged to inspect four items and identify the group indicated as significantly favored.

2. The Compensatory Two-Dimensional IRT Model

Before exploring MIRT models, it is best to examine the 2PL IRT model, which is widely used in measurement and standardized testing. This model describes the probability of a correct response for an examinee $j$, with latent ability $\theta_j$, responding to an item $i$, with difficulty and discrimination values denoted by the parameters $b_i$ and $a_i$, respectively. The model is written as

$$P\left(u_{ij}=1 \mid \theta_j, a_i, b_i\right) = \frac{1.0}{1.0+e^{-1.7a_i(\theta_j-b_i)}} \tag{3}$$

It should be noted that $\theta$ and $b$ are on the same metric (usually ranging from $-3$ to $+3$) and $b$ equals the $\theta$ value for which $p = .5$. Graphically, the model is represented as an item characteristic curve (ICC), where $b$ is the $\theta$ value corresponding to the point of inflection and $a$ corresponds to 2.35 times the slope of the ICC at this point.
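These two properties of Equation (3) can be verified numerically. The following is a minimal sketch with hypothetical parameter values ($a = 1.2$, $b = 0.5$); the function names are illustrative only. The factor 2.35 arises because the slope of the curve at the inflection point is $1.7a/4$, so $a = (4/1.7) \times \text{slope} \approx 2.35 \times \text{slope}$.

```python
import math

D = 1.7  # scaling constant used in Equation (3)


def icc_2pl(theta, a, b):
    """Probability of a correct response under the 2PL model."""
    return 1.0 / (1.0 + math.exp(-D * a * (theta - b)))


def slope_at(theta, a, b, h=1e-6):
    """Central-difference approximation to the slope of the ICC at theta."""
    return (icc_2pl(theta + h, a, b) - icc_2pl(theta - h, a, b)) / (2.0 * h)


a, b = 1.2, 0.5  # hypothetical item parameters

p_at_b = icc_2pl(b, a, b)          # probability where theta equals b
slope = slope_at(b, a, b)          # slope at the inflection point
recovered_a = (4.0 / D) * slope    # 4 / 1.7 is approximately 2.35

print(f"P(theta = b)        = {p_at_b:.3f}")
print(f"slope at theta = b  = {slope:.4f}")
print(f"(4 / 1.7) * slope   = {recovered_a:.4f}")
```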

McKinley and Reckase (1982) extended the unidimensional 2PL to the multidimensional case, M2PL, which can be written as

$$P\left(u_{ij}=1 \mid \boldsymbol{\theta}_j, \mathbf{a}_i, d_i\right) = \frac{1.0}{1.0+e^{-1.7\left(\sum_{k=1}^{m} a_{ik}\theta_{jk}+d_i\right)}}$$

where $\mathbf{a}_i$ is a vector of discrimination parameters, $d_i$ is a scalar difficulty parameter for item $i$, and $\boldsymbol{\theta}_j$ is a vector of ability parameters for person $j$. It is important to note that $d_i$ is added in the logit, so unlike in the 2PL model, negative values represent difficult items. For each dimension, there is a discrimination parameter and a latent ability. However, regardless of the number of dimensions, there is only one difficulty parameter because in this model a difficulty parameter for each dimension is indeterminate, i.e., there is an unlimited number of $a_i$ and $d_i$ values that yield the same probability of correct response.

To find multidimensional counterparts to the unidimensional discrimination and difficulty parameters, they made the following substitution, in which each coordinate of a point in the ability space is re-expressed in polar form:

$$\theta_{jk}=\zeta_j\cos\alpha_{jk},$$

where $\zeta_j$ is the distance from the origin to the point, and $\alpha_{jk}$ is the angle between the line from the origin to the point and the $k^{\text{th}}$ axis. Using this trigonometric substitution, the model can then be written as

$$P\left(u_{ij}=1 \mid \zeta_j, \mathbf{a}_i, \boldsymbol{\alpha}_j, d_i\right) = \frac{1.0}{1.0+e^{-1.7\left(\zeta_j\sum_{k=1}^{m} a_{ik}\cos\alpha_{jk}+d_i\right)}}.$$

To find the location of the steepest slope in the $\boldsymbol{\alpha}$-vector direction, it is necessary to compute the second derivative with respect to $\zeta_j$ and set it equal to zero. As in the unidimensional 2PL model, the maximum slope in the $\boldsymbol{\alpha}_j$ direction occurs when $P_{ij} = .5$. McKinley and Reckase (1982) defined the multidimensional discrimination analog to the unidimensional $a$-parameter for item $i$ as

$$\text{MDISC} = a_i = \sqrt{\sum_{k=1}^{m} a_{ik}^{2}}, \tag{4}$$

and the M2PL parameter corresponding to the unidimensional difficulty parameter, $b_i$, can be written as:

$$\text{MDIFF} = b_i = \frac{-d_i}{\sqrt{\sum_{k=1}^{m} a_{ik}^{2}}} = \frac{-d_i}{a_i}. \tag{5}$$
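Equations (4) and (5) are simple to compute. The sketch below uses a hypothetical item ($a_1 = 1.2$, $a_2 = 0.5$, $d = -0.65$) and illustrative function names. It also relies on a standard result that is not spelled out in the passage above, so it should be read as an assumption of this sketch: the direction of steepest slope has direction cosines $a_{ik}/\text{MDISC}$. Under that assumption, the point at signed distance MDIFF from the origin along this direction should have a probability of .5.

```python
import math

D = 1.7  # scaling constant used in the M2PL


def m2pl(theta, a, d):
    """Probability of a correct response under the compensatory M2PL."""
    logit = sum(ak * tk for ak, tk in zip(a, theta)) + d
    return 1.0 / (1.0 + math.exp(-D * logit))


def mdisc(a):
    """Multidimensional discrimination, Equation (4)."""
    return math.sqrt(sum(ak * ak for ak in a))


def mdiff(a, d):
    """Multidimensional difficulty, Equation (5)."""
    return -d / mdisc(a)


def direction_cosines(a):
    """Direction cosines of the steepest-slope direction (assumed: a_k / MDISC)."""
    m = mdisc(a)
    return [ak / m for ak in a]


a = [1.2, 0.5]  # hypothetical discrimination parameters
d = -0.65       # hypothetical intercept; negative, so a relatively difficult item

disc = mdisc(a)
diff = mdiff(a, d)
cosines = direction_cosines(a)
angle = math.degrees(math.acos(cosines[0]))  # angle from the theta_1 axis

# Point on the steepest-slope direction at signed distance MDIFF from the origin.
point = [diff * c for c in cosines]
p = m2pl(point, a, d)

print(f"MDISC = {disc:.3f}, MDIFF = {diff:.3f}, angle = {angle:.1f} degrees")
print(f"P at the MDIFF point = {p:.3f}")
```

Note that the negative intercept produces a positive MDIFF, which matches the remark that negative $d_i$ values represent difficult items.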

For this presentation, the focus will be on the two-dimensional case and the probability of correct response to item i is expressed as

$$P\left(u_{ij}=1 \mid \theta_{1j},\theta_{2j},a_{1i},a_{2i},d_i\right) = \frac{1.0}{1.0+e^{-1.7(a_{1i}\theta_{1j}+a_{2i}\theta_{2j}+d_i)}}. \tag{6}$$

Graphically, this function represents an item response surface. For an item $i$, we can inspect the surface plot or the contour plot to gain further insight. The software program Mathematica (Wolfram, 2020) allows one to manipulate the item parameters and observe changes in the response surface. Such a Mathematica plot is shown in Fig. 1. The plot is configured to enable the user to change the item parameters using the parameter sliding bars on the left of the plot. The surface plot is also rotatable for viewing from different perspectives.

Figure 1 Graphic representation of the response surface for the compensatory model and its corresponding contour.
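For readers without access to Mathematica, the response surface in Equation (6) can also be tabulated on a coarse grid. The sketch below is not the authors' Mathematica notebook; it is a plain-Python illustration with hypothetical parameter values ($a_1 = a_2 = 1.0$, $d = 0$) and illustrative function names. Each printed row holds $\theta_2$ fixed and varies $\theta_1$ from $-3$ to $+3$.

```python
import math

D = 1.7  # scaling constant used in Equation (6)


def p_correct(theta1, theta2, a1, a2, d):
    """Two-dimensional compensatory response function, Equation (6)."""
    return 1.0 / (1.0 + math.exp(-D * (a1 * theta1 + a2 * theta2 + d)))


def surface_grid(a1, a2, d, lo=-3.0, hi=3.0, steps=7):
    """Return the axis values and a grid of probabilities.
    grid[r][c] is the probability at theta2 = axis[r], theta1 = axis[c]."""
    axis = [lo + k * (hi - lo) / (steps - 1) for k in range(steps)]
    grid = [[p_correct(t1, t2, a1, a2, d) for t1 in axis] for t2 in axis]
    return axis, grid


axis, grid = surface_grid(a1=1.0, a2=1.0, d=0.0)

# Print with the highest theta2 on top so the table reads like a contour plot.
for t2, row in zip(reversed(axis), reversed(grid)):
    print(f"theta2={t2:+.1f}  " + "  ".join(f"{p:.2f}" for p in row))
print("theta1:      " + "  ".join(f"{t1:+.1f}" for t1 in axis))
```

Changing `a1`, `a2`, or `d` in the call to `surface_grid` plays the role of the sliders in the interactive plot; with equal discriminations, the .50 entries fall along the line where $\theta_1 + \theta_2 = 0$.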

Some researchers refer to the M2PL as “partially compensatory” because the abilities are additive in the logit, allowing for compensation (i.e., being high on one ability can “compensate” for being low on the second ability). As shown in Fig. 2, the equiprobability contour plot for an item with M2PL parameters $a_1 = a_2 = 1.0$ and $d = 0.0$, two examinees, A and B, having exactly opposite ability profiles, such as high on $\theta_1$ and low on $\theta_2$ versus low on $\theta_1$ and high on $\theta_2$, can have the same probability of correct response. This occurs because of the additive nature of the logit of the M2PL model. When $a_1 = a_2$, $p(\theta_{1i}, \theta_{2i}) = p(\theta_{1j}, \theta_{2j})$ for all examinees $i$ and $j$ where the sum $\theta_1 + \theta_2$ equals the same value (e.g., $p(\theta_1 = 2, \theta_2 = -2) = p(\theta_1 = 0, \theta_2 = 0) = p(\theta_1 = -2, \theta_2 = 2)$). Note that the slopes of the contours are all equal to $-a_1/a_2$, and all ($\theta_1$, $\theta_2$) combinations such that $\theta_2 = (-a_1/a_2)(\theta_1)$ will have the same probability of correct response or lie on the same equiprobability contour. Furthermore, when $a_1 = 0$ or $a_2 = 0$ there is no compensation and the model is equivalent to the 2PL unidimensional model. That is, when $a_2 = 0$, $p(\theta_1, \theta_2) = p(\theta_1)$ and when $a_1 = 0$, p ( θ 1 \documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} 
\setlength{\oddsidemargin}{-69pt} \begin{document}$$p(\theta _{1}$$\end{document} , θ 2 ) = p ( θ 2 ) \documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$$\theta _{2}) = p(\theta _{2})$$\end{document} .

Figure. 2 Contour plot of a compensatory model item with $a_{1} = a_{2} = 1.0$ and $d = 0.0$.
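
The compensation property described above can be checked numerically. The following is a minimal sketch, assuming the M2PL logit has the form $a_{1}\theta_{1} + a_{2}\theta_{2} + d$ with no additional scaling constant; the function name `m2pl_prob` is hypothetical and introduced only for illustration.

```python
import math


def m2pl_prob(theta1, theta2, a1, a2, d):
    """Probability of a correct response for a compensatory M2PL item.

    Assumption: p = 1 / (1 + exp(-(a1*theta1 + a2*theta2 + d))),
    i.e., no extra scaling constant in the logit.
    """
    logit = a1 * theta1 + a2 * theta2 + d
    return 1.0 / (1.0 + math.exp(-logit))


# Item shown in Fig. 2: a1 = a2 = 1.0 and d = 0.
# All three ability points satisfy theta1 + theta2 = 0, so they share one contour.
points = [(2.0, -2.0), (0.0, 0.0), (-2.0, 2.0)]
probs = [m2pl_prob(t1, t2, a1=1.0, a2=1.0, d=0.0) for t1, t2 in points]
print(probs)

# With a2 = 0 the item ignores theta2, so the model reduces to a 2PL in theta1.
p_low_theta2 = m2pl_prob(1.0, -3.0, a1=1.2, a2=0.0, d=0.5)
p_high_theta2 = m2pl_prob(1.0, 3.0, a1=1.2, a2=0.0, d=0.5)
print(p_low_theta2 == p_high_theta2)
```

The first three probabilities coincide because the logit depends on the abilities only through their weighted sum; the last comparison illustrates the no-compensation case.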

2.1. Item Vector Representation

A drawback of representing two-dimensional M2PL items as response surfaces or contour plots is that only one item can be examined at a time. This problem can be solved by representing each item as a vector in the two-dimensional latent ability plane. This is accomplished using the following guidelines (Reckase, 2009):

  • All vectors lie on lines that pass through the origin.

  • Vectors can lie only in the first and third quadrants because the a-parameters are constrained to be positive.

  • Vectors representing easy items lie in the third quadrant; those representing difficult items lie in the first quadrant. (Note that if $a_{1}$ is negative the vector will lie in the second quadrant and if $a_{2}$ is negative the vector will lie in the fourth quadrant.)

  • The tail of the vector lies on the $p = .5$ equiprobability contour and the vector is always orthogonal to this contour.

Using the derived information for multidimensional discrimination and difficulty, these vectors are created so that the length of the vector, which indicates how discriminating the item is, equals MDISC (4). The tail of the vector lies on the $p = .5$ equiprobability contour. The signed distance from the origin perpendicular to this contour is the analog of unidimensional difficulty, MDIFF (5). The angular direction from the $\theta_{1}$-axis indicates the $(\theta_{1}, \theta_{2})$-composite of ability that item $i$ is best measuring:

$$\alpha_{i} = \cos^{-1}\left(\frac{a_{1i}}{\text{MDISC}}\right).$$
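
As a rough illustration of these quantities, the sketch below computes the vector length, signed distance, and angle for a single item. It assumes the standard definitions $\text{MDISC} = \sqrt{a_{1}^{2} + a_{2}^{2}}$ and $\text{MDIFF} = -d/\text{MDISC}$, which are taken here to correspond to Equations (4) and (5); the helper name `item_vector` is hypothetical. The parameters are those of Item 1 in the two-item example of Section 2.4.

```python
import math


def item_vector(a1, a2, d):
    """Return (MDISC, MDIFF, angle in degrees, tail coordinates) for one item.

    Assumptions: MDISC = sqrt(a1^2 + a2^2) and MDIFF = -d / MDISC.
    The arccosine form of the angle presumes a2 >= 0.
    """
    mdisc = math.hypot(a1, a2)            # vector length
    mdiff = -d / mdisc                    # signed distance from origin to tail
    angle_deg = math.degrees(math.acos(a1 / mdisc))
    # The tail lies MDIFF units from the origin along the item's direction.
    tail = (mdiff * a1 / mdisc, mdiff * a2 / mdisc)
    return mdisc, mdiff, angle_deg, tail


mdisc, mdiff, angle_deg, tail = item_vector(a1=1.3, a2=0.4, d=-1.2)
print(round(mdisc, 4), round(mdiff, 4), round(angle_deg, 2), tail)

# Check that the tail sits on the p = .5 contour: the logit there should be zero.
logit_at_tail = 1.3 * tail[0] + 0.4 * tail[1] + (-1.2)
print(abs(logit_at_tail) < 1e-12)
```

The final check reflects the guideline that the tail of the vector lies on the $p = .5$ equiprobability contour.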

As illustrated in Fig. 3, item vectors are projections onto the latent ability plane that indicate the composite direction of maximum discrimination, or maximum slope. In the figure, the portion of the response surface over the third quadrant is removed so that the projected item vector can be seen in relation to the underlying contours. The greater the discrimination of the item, the steeper the response surface, the closer together the corresponding contours, and the longer the vector. Vectors of easy items appear in the third quadrant and vectors representing difficult items are in the first quadrant.

Figure. 3 Illustration of the direction of maximum slope for a compensatory item projected onto the latent ability plane to form its item vector.

When the angles of item vectors differ by only a few degrees, these nuances cannot be attributed solely to phrasing or vocabulary. Item writers and psychometricians need to examine the content sectors that contain different item contents or test specifications. These sectors help determine which $(\theta_{1}, \theta_{2})$-composites an item measures best (Ackerman, 1991). Ultimately, these content sectors will help in understanding or defining the $\theta_{1}$ and $\theta_{2}$ latent abilities and in defining an imposed unidimensional score scale.

Item vectors are often color-coded based on their content classification. Ideally, vectors with the same content should cluster in a narrow content sector, indicating that they are measuring similar $(\theta_{1}, \theta_{2})$-composites. For example, consider a standardized graduate admissions test with 101 items. In Fig. 4, observe how different content areas occupy unique sectors. The one vector in quadrant two had a negative $a_{2}$ value.

Figure. 4 Item vectors for a standardized 101-item test with three content areas.

2.2. The Validity Sector

Ackerman (1992) defined the validity sector as the sector containing vectors of items measuring practitioner-determined valid composites. Unlike the 101-item test shown above, most standardized tests yield vectors that can be enclosed by a $30^\circ$–$45^\circ$ sector, as illustrated in Fig. 5. In this figure, the green item vectors (and red dotted RC) are enclosed in a $45^\circ$ validity sector and are believed to be vectors of items measuring the valid $(\theta_{1}, \theta_{2})$-composites as described in the test’s specifications. Items whose vectors (red) fall outside the validity sector should be examined for DIF because they are likely measuring invalid or nuisance dimensions. DIF has the potential to occur if the groups of interest differ in their distributions of underlying abilities on these invalid skill composites. Ma et al. (2023) demonstrated that differences in multidimensional latent ability distributions on the invalid dimension can result in DIF, especially when items measure primarily the invalid dimension. The insight gained from item vectors and the validity sector can guide psychometricians in refining assessments, ensuring validity, and understanding the intricate interplay of latent abilities.

Figure. 5 A validity sector enclosing item vectors (green) for a 60-item standardized test. Vectors outside the sector (red) are measuring composites that could result in DIF. (Color figure online)
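
One simple way to operationalize this screening is to compute each item's angle and flag items whose angle falls outside a chosen sector. The sketch below is only illustrative: the item parameters, the item names, and the $22.5^\circ$ to $67.5^\circ$ bounds are hypothetical values, not taken from the test shown in Fig. 5, and in practice the sector would be set by the practitioner.

```python
import math


def item_angle_deg(a1, a2):
    """Angle of the item vector from the positive theta1-axis, in degrees."""
    return math.degrees(math.atan2(a2, a1))


def outside_sector(items, lower_deg, upper_deg):
    """Return names of items whose vector angle lies outside [lower_deg, upper_deg]."""
    flagged = []
    for name, (a1, a2) in items.items():
        angle = item_angle_deg(a1, a2)
        if not (lower_deg <= angle <= upper_deg):
            flagged.append(name)
    return flagged


# Hypothetical discrimination parameters (a1, a2) for four items.
items = {
    "item_1": (1.0, 0.9),
    "item_2": (0.8, 1.0),
    "item_3": (1.2, 0.1),   # measures almost only theta1
    "item_4": (0.2, 1.1),   # measures almost only theta2
}

# Hypothetical 45-degree validity sector centered on the 45-degree composite.
flagged = outside_sector(items, lower_deg=22.5, upper_deg=67.5)
print(flagged)
```

Flagged items are candidates for closer DIF review, not evidence of DIF by themselves.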

2.3. Score Scale Consistency Using Conditional Centroids

Another way to understand how DIF can occur from a two-dimensional perspective is to examine whether an imposed unidimensional score scale consistently represents the same $(\theta_{1}, \theta_{2})$-composite as one progresses across the observable score scale. When different parts of the unidimensional score scale represent different skill composites, score scale consistency breaks down.

For example, the 101-item test represents a dimensional scenario in which there is a lack of score scale consistency. That is, the same $(\theta_{1}, \theta_{2})$-composite is not being measured equally well throughout the observable score range. To examine scale consistency, “centroid” plots can be created to show $(\bar{\theta}_{1}, \bar{\theta}_{2})$ for each number-correct observed score, $x$ (i.e., $(\bar{\theta}_{1}, \bar{\theta}_{2}) \mid X = x$). This plot should be linear across the latent ability plane, indicating consistent measurement of $(\theta_{1}, \theta_{2})$ across the scale. In the top illustration of Fig. 6, centroid plots for each content category are graphed across the score range. The Analytic Reasoning plot tends to be linear, representing primarily differences in $\theta_{1}$. Logical Reasoning and Reading Comprehension represent differences in $\theta_{2}$, but not consistently. However, the results of this test are reported as a single score. The lower figure shows the centroid plot for the total test score scale. Scores in the range from 0 to 35 represent differences in the $\theta_{1}$-ability. Scores in the range from 35 to 80 indicate differences in the $\theta_{2}$-ability. Scores from 85 to 100 represent proficiency differences in the upper $\theta_{1}$-ability. Work by Strachan et al. (2022) found that if there is a confounding of difficulty and dimensionality (e.g., easy items measure one dimension and difficult items measure a second dimension) the composite may not be linear.

Such inconsistency makes score interpretation and certain psychometric procedures such as equating and computer adaptive testing pool development incredibly challenging. This variation also affects DIF procedures that group examinees according to their number correct scores for conditional analyses because different scores reflect different skills.

Figure. 6 Conditional centroid plots for each content category (top) and total test score (bottom).
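
Conditional centroids of this kind can be approximated by simulation when item parameters are available. The sketch below assumes a compensatory M2PL response model with logit $a_{1}\theta_{1} + a_{2}\theta_{2} + d$; the function names, the four items, and the simulation settings are all hypothetical and do not reproduce the 101-item test.

```python
import numpy as np


def conditional_centroids(theta, scores):
    """Mean (theta1, theta2) of examinees at each observed number-correct score."""
    theta = np.asarray(theta, dtype=float)
    scores = np.asarray(scores)
    centroids = {}
    for x in np.unique(scores):
        centroids[int(x)] = theta[scores == x].mean(axis=0)
    return centroids


def simulate_scores(theta, a, d, rng):
    """Simulate number-correct scores under an assumed compensatory M2PL model.

    theta: (N, 2) abilities; a: (n, 2) discriminations; d: (n,) intercepts.
    """
    logits = theta @ a.T + d
    p = 1.0 / (1.0 + np.exp(-logits))
    responses = rng.random(p.shape) < p
    return responses.sum(axis=1)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    # Hypothetical setup: easy items load on theta1, hard items on theta2,
    # i.e., difficulty is confounded with dimensionality.
    a = np.array([[1.2, 0.2], [1.0, 0.4], [0.3, 1.1], [0.2, 1.3]])
    d = np.array([1.0, 0.5, -0.5, -1.0])
    theta = rng.multivariate_normal([0.0, 0.0], np.eye(2), size=2000)
    scores = simulate_scores(theta, a, d, rng)
    for x, centroid in sorted(conditional_centroids(theta, scores).items()):
        print(x, np.round(centroid, 2))
```

Plotting the printed centroids in the $(\theta_{1}, \theta_{2})$ plane and checking whether they fall along a straight line gives a quick, informal look at score scale consistency.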

2.4. Reference Composite: Mapping a 2PL Scale in a Two-Dimensional Latent Space

Wang (1985) demonstrated that when calibrating multidimensional data, the estimated unidimensional 2PL model essentially combines latent abilities into a weighted composite known as the reference composite (RC). The RC is key because it indicates the $(\theta_{1}, \theta_{2})$-composite being best measured by the unidimensional IRT $\theta$-scale and the number-correct score scale. It is important to note that for, say, a two-dimensional two-item test, the RC $(\theta_{1}, \theta_{2})$-composite does not directly correspond with the composite skills being measured by either of the two items. This has important implications for how to substantiate or define the unidimensional scale for a test containing two-dimensional items. The RC is a useful tool for demonstrating the parallelism of multiple two-dimensional test forms. That is, after calibration and rescaling, the RCs of parallel forms should align closely within measurement error and other constraints based on the test specification (e.g., cut score measurement precision, content constraints).

For the two-dimensional case, Wang (1985) determined that the RC is a function of the $L'A'AL$ matrix, where $A$ is the $n \times 2$ matrix of discrimination parameters for a given $n$-item test and $L$ is the Cholesky decomposition of the underlying $\theta_{1}$–$\theta_{2}$ variance–covariance matrix, $\Omega$. The angle between the positive $\theta_{1}$-axis and the RC can be calculated as the arccosine of the first element of the eigenvector associated with the larger of the two eigenvalues of the $L'A'AL$ matrix.

A simple example will help to clarify. Assume a two-item case where $a_{1} = 1.3$, $a_{2} = .4$, and $d = -1.2$ are the parameters for Item 1 and $a_{1} = .4$, $a_{2} = 1.3$, and $d = -1.2$ are the parameters for Item 2. The item vectors for Item 1 and Item 2 lie $17.10^\circ$ and $72.90^\circ$, respectively, from the positive $\theta_{1}$-axis. Assume further that a group of examinees, Group A (the Reference group), has the two-dimensional underlying ability distribution $N\left[\begin{pmatrix} 0 \\ 0 \end{pmatrix}, \begin{pmatrix} 1.0 & .0 \\ .0 & .5 \end{pmatrix}\right]$ and that the underlying distribution of Group B (the Focal group) is $N\left[\begin{pmatrix} 0 \\ 0 \end{pmatrix}, \begin{pmatrix} .5 & .0 \\ .0 & 1.0 \end{pmatrix}\right]$. The Cholesky decomposition of $\Omega_{\text{A}}$ is $\begin{bmatrix} 1.0 & .0 \\ .0 & .7071 \end{bmatrix}$, and the $L'A'AL$ matrix for Group A is equal to $\begin{bmatrix} 1.8500 & .7353 \\ .7353 & .9250 \end{bmatrix}$. This matrix is positive definite. The eigenvalues of this matrix are 2.2562 and .5187, and the eigenvector that corresponds to the larger eigenvalue is $\begin{bmatrix} -.8753 \\ -.4835 \end{bmatrix}$. The squared elements sum to 1.0, so this eigenvector can be considered as direction cosines. Because the sign of an eigenvector is arbitrary, the angle associated with the RC can be computed by taking the arccosine of the absolute value of the first element. These calculations indicate that the RC for the Reference group lies $28.91^\circ$ from the positive $\theta_{1}$-axis. Similarly, using Group B's variance–covariance matrix, the RC for the Focal group lies $61.09^\circ$ from the positive $\theta_{1}$-axis. Notice that the angular difference between the RCs is $32.18^\circ$, which is the angle between the red dashed RC and the green dashed RC in Fig. 7. The underlying joint and marginal distributions, RCs, and item vectors are also displayed in the figure.
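
The two-item example can be reproduced numerically. The sketch below follows the calculation as described in the text (Cholesky factor of $\Omega$, the $L'A'AL$ matrix, and the arccosine of the first element of the principal eigenvector); the function name is hypothetical, and the use of NumPy's lower-triangular Cholesky factor is an assumption that makes no difference for the diagonal covariance matrices used here.

```python
import numpy as np


def reference_composite_angle(a_matrix, omega):
    """Angle (degrees) between the positive theta1-axis and the reference composite.

    a_matrix: (n, 2) discrimination parameters; omega: (2, 2) covariance of abilities.
    """
    a_matrix = np.asarray(a_matrix, dtype=float)
    omega = np.asarray(omega, dtype=float)
    chol = np.linalg.cholesky(omega)             # lower-triangular L, L @ L.T == omega
    m = chol.T @ a_matrix.T @ a_matrix @ chol    # the L'A'AL matrix
    eigvals, eigvecs = np.linalg.eigh(m)         # eigenvalues in ascending order
    principal = eigvecs[:, -1]                   # eigenvector of the larger eigenvalue
    # Eigenvector sign is arbitrary, so use the absolute value of the first element.
    cosine = min(abs(principal[0]), 1.0)
    return float(np.degrees(np.arccos(cosine)))


A = np.array([[1.3, 0.4],    # Item 1: a1, a2
              [0.4, 1.3]])   # Item 2: a1, a2
omega_ref = np.array([[1.0, 0.0], [0.0, 0.5]])   # Group A (Reference)
omega_foc = np.array([[0.5, 0.0], [0.0, 1.0]])   # Group B (Focal)

angle_ref = reference_composite_angle(A, omega_ref)
angle_foc = reference_composite_angle(A, omega_foc)
print(round(angle_ref, 2), round(angle_foc, 2), round(angle_foc - angle_ref, 2))
```

Up to rounding, the two angles agree with the values reported above for the Reference and Focal groups.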

Trying to compare scores from different RCs can be problematic because it could result in examinees being ordered differently on the two RCs. Ramsay (1990) and Junker and Stout (1991) examined the effects of differential ordering in a DIF context. Even though DIF procedures that condition upon the number correct score may not be sensitive to the dissimilar substantive interpretation of the two RCs, they are sensitive to differential ordering. This situation is graphically illustrated on the right in Fig. 7. In this diagram, two examinees, X and Y, would have one ordering if they are in the Focal group and orthogonally mapped onto the RC$_{\textrm{FOC}}$, but a reverse ordering if they belong to the Reference group and are mapped onto RC$_{\textrm{REF}}$.
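The reversal can be illustrated numerically. The sketch below assumes the two RC angles reported above (28.91$^\circ$ and 61.09$^\circ$) and two hypothetical examinee locations, X and Y, that are chosen only for illustration and are not the points plotted in Fig. 7. The orthogonal mapping onto an RC is taken to be the dot product with the RC's unit direction vector.

```python
import math

def project_onto_rc(theta, angle_deg):
    """Signed length of the orthogonal projection of theta = (theta1, theta2)
    onto a composite lying angle_deg from the positive theta1-axis."""
    rad = math.radians(angle_deg)
    return theta[0] * math.cos(rad) + theta[1] * math.sin(rad)

# Hypothetical examinees (assumed values, not taken from the article).
X = (1.5, 0.2)   # high theta1, low theta2
Y = (0.3, 1.2)   # low theta1, high theta2

RC_REF, RC_FOC = 28.91, 61.09

x_ref, y_ref = project_onto_rc(X, RC_REF), project_onto_rc(Y, RC_REF)
x_foc, y_foc = project_onto_rc(X, RC_FOC), project_onto_rc(Y, RC_FOC)

print("Reference RC: X above Y?", x_ref > y_ref)
print("Focal RC:     X above Y?", x_foc > y_foc)
```

With these assumed locations, X is ordered above Y on the Reference composite and below Y on the Focal composite.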

In an attempt to overcome scaling problems, DIF methodologies use different approaches. DIF analyses that use IRT methodology (Raju, 1988) require that before comparing item characteristic curves for Reference and Focal groups, the item parameters for one group must be rescaled and placed on the other group’s scale. This can be accomplished using either a mean–mean or mean–sigma rescaling (Kolen and Brennan, 2014). It should be noted that these linking/rescaling procedures apply optimally when the reference composites have identical composite directions. They are designed to account for mean and standard deviation scale differences that are established in calibrating response data to fit a unidimensional model.
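As a rough illustration of the two rescaling methods, the sketch below applies the usual linear transformation $\theta^{*} = A\theta + B$, under which $b^{*} = Ab + B$ and $a^{*} = a/A$. The function names and all parameter values are hypothetical, and the code is a simplified sketch of the textbook procedures rather than the implementation used in any particular DIF program.

```python
from statistics import mean, pstdev

def mean_sigma(b_from, b_to):
    """Slope A and intercept B from the difficulty estimates of common items."""
    A = pstdev(b_to) / pstdev(b_from)
    B = mean(b_to) - A * mean(b_from)
    return A, B

def mean_mean(a_from, a_to, b_from, b_to):
    """Slope A from mean discriminations, intercept B from mean difficulties."""
    A = mean(a_from) / mean(a_to)
    B = mean(b_to) - A * mean(b_from)
    return A, B

def rescale(a, b, A, B):
    """Place (a, b) estimates on the target scale."""
    return [ai / A for ai in a], [A * bi + B for bi in b]

# Hypothetical 2PL estimates for four common items (assumed values).
foc_a, foc_b = [0.9, 1.2, 1.0, 0.7], [-0.8, 0.1, 0.6, 1.3]
ref_a, ref_b = [0.75, 0.95, 0.82, 0.55], [-1.3, -0.2, 0.45, 1.25]

A_ms, B_ms = mean_sigma(foc_b, ref_b)
new_a, new_b = rescale(foc_a, foc_b, A_ms, B_ms)
A_mm, B_mm = mean_mean(foc_a, ref_a, foc_b, ref_b)

print("mean-sigma:", round(A_ms, 3), round(B_ms, 3))
print("mean-mean: ", round(A_mm, 3), round(B_mm, 3))
```

Both methods estimate only a slope and an intercept, which is why they cannot correct for composites that point in different directions.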

DIF methodologies are not designed to compensate for scale differences that would occur when scales represent different ability composites. The greater the angular separation between the RCs, the less effective rescaling becomes. For example, consider an extremely unrealistic case. If the RC for the Reference group is at 10$^\circ$ and the RC for the Focal group is at 70$^\circ$, placing the Focal group parameters on the Reference group’s scale would adjust primarily for $\theta_{1}$ differences, but the Focal group’s scale would primarily measure $\theta_{2}$. This problem is discussed further in Study 1 below. Several researchers, including Li and Lissitz (2000) and Oshima et al. (2000), examined linking from a multidimensional perspective, which would align the RCs for two distinct groups before adjusting for scale differences.
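The 10$^\circ$ versus 70$^\circ$ case can be made concrete by looking at the direction cosines of each composite. The sketch below additionally assumes, purely for illustration, that the two abilities are uncorrelated with unit variances; under that assumption the correlation between two linear composites equals the cosine of the angle between them.

```python
import math

def rc_weights(angle_deg):
    """Direction cosines (weights on theta1 and theta2) of a composite."""
    rad = math.radians(angle_deg)
    return math.cos(rad), math.sin(rad)

ref_w = rc_weights(10.0)   # Reference RC: weighted mostly on theta1
foc_w = rc_weights(70.0)   # Focal RC: weighted mostly on theta2

# Assumption: theta1 and theta2 are uncorrelated with unit variances, so the
# correlation between the two composites is the dot product of their weights.
corr = ref_w[0] * foc_w[0] + ref_w[1] * foc_w[1]

print("Reference weights:", [round(w, 3) for w in ref_w])
print("Focal weights:    ", [round(w, 3) for w in foc_w])
print("Composite correlation:", round(corr, 3))
```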

It should be further noted that the Mantel–Haenszel procedure accomplishes rescaling by including the studied item in the calculation of the total score. SIBTEST calculations utilize a regression correction. It is important to recognize that these rescaling techniques work optimally when the RCs’ angular directions are similar. Because the RC direction can vary for different content subsets of items, Shealy and Stout (1993a, 1993b) recommend that practitioners identify valid test items to ensure the conditioning score (i.e., RC) represents a valid composite. As the number of items on a test increases (say, beyond 40), the influence of any one item decreases. Usually, the RC lies within the validity sector.
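For readers less familiar with the conditioning step, the following sketch computes the standard Mantel–Haenszel common odds ratio and its delta-scale transformation from a set of 2 × 2 tables, one per total-score level (with the total score including the studied item, as noted above). The function name and the counts are hypothetical; this is a minimal illustration, not a replacement for operational DIF software.

```python
import math

def mantel_haenszel(tables):
    """Common odds ratio and MH D-DIF.

    tables: iterable of (A, B, C, D) per total-score level, where
      A = Reference correct, B = Reference incorrect,
      C = Focal correct,     D = Focal incorrect.
    """
    num = den = 0.0
    for A, B, C, D in tables:
        n = A + B + C + D
        if n == 0:
            continue  # skip empty score levels
        num += A * D / n
        den += B * C / n
    alpha = num / den
    return alpha, -2.35 * math.log(alpha)

# Hypothetical counts at three score levels (assumed values).
dif_tables = [(40, 60, 30, 70), (60, 40, 50, 50), (80, 20, 70, 30)]
alpha, d_dif = mantel_haenszel(dif_tables)

# Tables in which both groups have identical odds at every level.
null_tables = [(40, 60, 20, 30), (60, 40, 30, 20)]
alpha_null, d_null = mantel_haenszel(null_tables)

print(round(alpha, 3), round(d_dif, 3))
print(round(alpha_null, 3), round(d_null, 3))
```

An odds ratio above 1 (negative D-DIF) indicates that, at matched total scores, the item favors the Reference group.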

Additional research by Ackerman and Xie (2019) compared Camilli’s approach to two other unidimensional approaches to explore how well they capture the representation of the two-dimensional latent ability space. Carlson (2017) and Strachan et al. (2020) conducted research in which the RC is nonlinear when there is a confounding of difficulty and dimensionality. Additionally, Ma et al. (2023) evaluated the efficiency of DIF detection using both Camilli’s approach and the projective IRT approach (Ip, 2010) when potential DIF items fell outside the validity sector.

Figure 7. RCs for two groups having different underlying ability distributions based on a two-item test (left) and orthogonal mappings upon the composites for two examinees, X and Y (right).

Figure 8. RCs for the 101-item test for the three subsections and the total test.

The contour plot of the standardized 101-item test is displayed in Fig. 8. Within this plot are the RCs for the three individual content areas, as well as the total test. The RCs align with the direction of the sector containing content vectors. Interestingly, the wide angular range of the content RCs results in curved contours for higher score categories, resembling patterns seen in the noncompensatory model (10).

2.5. Centipede Plot: Graphically Mapping the Two-Dimensional IRT Latent Ability Space Onto the Unidimensional IRT Scale

This RC enables one to create an interesting visualization of how the two-dimensional latent ability plane gets mapped onto the unidimensional ability scale. This mapping can be illustrated by a “centipede” plot. In this type of plot, the compensatory model (6) test characteristic curve is first drawn in the RC-direction. Then, vectors are drawn from the generated $\left(\theta_{1},\theta_{2}\right)$ to their expected score, obtained using the estimated $\hat{\theta}$. Figure 9 displays two perspectives of a centipede plot for a 40-item test: one from a side view and another from an overhead view. The vertical axis represents the proportion correct true score. Vectors are displayed for a sample of 200 examinees. It is informative to observe which $\left(\theta_{1},\theta_{2}\right)$-combinations map onto the same proportion correct true scores. This information helps psychometricians and test developers explore how regions of the latent ability space with opposite $\left(\theta_{1},\theta_{2}\right)$-profiles, (high, low) vs (low, high), get mapped onto the same conditioning number correct score and thus will be in the same 2 × 2 contingency table often used in DIF approaches such as the Mantel–Haenszel.
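The point about opposite profiles sharing a true score can be sketched with a small compensatory test. The code below assumes a logistic M2PL form with the $d$-parameter added in the logit and no scaling constant, and uses five hypothetical items arranged symmetrically about the 45$^\circ$ line; none of these values come from the 40-item test in Fig. 9.

```python
import math

def m2pl_prob(theta, a, d):
    """Compensatory M2PL probability: logit = a1*theta1 + a2*theta2 + d."""
    z = a[0] * theta[0] + a[1] * theta[1] + d
    return 1.0 / (1.0 + math.exp(-z))

def true_score(theta, items):
    """Proportion correct true score at theta for a list of (a, d) items."""
    return sum(m2pl_prob(theta, a, d) for a, d in items) / len(items)

# Hypothetical items, mirrored about the 45-degree line (assumed values).
items = [((1.2, 0.4), 0.0), ((0.4, 1.2), 0.0),
         ((1.0, 0.2), -0.5), ((0.2, 1.0), -0.5),
         ((0.8, 0.8), 0.3)]

high_low = (1.0, -1.0)   # strong on theta1, weak on theta2
low_high = (-1.0, 1.0)   # weak on theta1, strong on theta2

ts_hl = true_score(high_low, items)
ts_lh = true_score(low_high, items)
print(round(ts_hl, 4), round(ts_lh, 4))
```

Because the item set is symmetric, both examinees receive the same true score and would be matched in the same score-level table even though their ability profiles are opposite.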

Figure 9. Two perspectives illustrating the mapping of the two-dimensional latent abilities onto the expected number correct score scale.

3. Analytically Estimating 2PL Item Parameters from a Two-Dimensional Latent Space

Although most tests are multidimensional, practitioners often fit unidimensional IRT models to the response data. Camilli (1992) analytically determined how the unidimensional 2PL model can be extracted from data where the true model is a two-dimensional model (6). The estimated unidimensional IRT model can be expressed in terms of the two underlying factor scores as

(7) $$\varepsilon_{\upsilon_{2}}\left[P\left(u=1 \mid \upsilon_{1},\upsilon_{2}\right) \mid \upsilon_{1}\right]=\int_{-\infty}^{+\infty} P\left(u=1 \mid \upsilon_{1},\upsilon_{2}\right) G\left(\upsilon_{2} \mid \upsilon_{1}\right)\, \textrm{d}\upsilon_{2},$$

where $\varepsilon_{\upsilon_{2}}$ is the expected value of the unidimensional item response function anchored over the factor score $\upsilon_{1}$, the RC; $\upsilon_{2}$ is the second factor score, which is orthogonal to $\upsilon_{1}$, the first factor score; and $G(\upsilon_{2} \mid \upsilon_{1})$ is the underlying conditional distribution (Camilli, 1992, p. 133). Using this formulation, Camilli derived the formulas to calculate the unidimensional $\hat{a}$ and $\hat{b}$ as

(8) $$\hat{a}_{j}=\frac{\boldsymbol{a}_{j}'\boldsymbol{W}_{1}}{\sqrt{2.89+\boldsymbol{a}_{j}'\boldsymbol{W}_{2}\boldsymbol{W}_{2}'\boldsymbol{a}_{j}}}$$

and

(9) $$\hat{b}_{j}=\frac{d_{j}-\boldsymbol{a}_{j}'\boldsymbol{\mu}}{\boldsymbol{a}_{j}'\boldsymbol{W}_{1}}.$$

where $\boldsymbol{a}_{j}$ is the two-dimensional discrimination vector for the M2PL model (4), $d_{j}$ is the difficulty parameter for the M2PL model, $\boldsymbol{W}_{1}$ and $\boldsymbol{W}_{2}$ are the first and second standardized eigenvectors of the matrix $\boldsymbol{L}'\boldsymbol{A}'\boldsymbol{A}\boldsymbol{L}$, where $\boldsymbol{A}$ is the matrix of discrimination parameters for all the items on the test and $\boldsymbol{L}'\boldsymbol{L}$ is the Cholesky decomposition of the two-dimensional latent ability variance–covariance matrix $\boldsymbol{\Omega}$. A visual representation using (7), (4), and (9) is illustrated in Fig. 10. For all items, the unidimensional ICCs would lie in the RC plane.
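Formulas (8) and (9) can be transcribed directly into code. The sketch below is a literal transcription of the two expressions as printed, assuming that $\boldsymbol{W}_{1}$ and $\boldsymbol{W}_{2}$ are supplied as orthonormal two-element vectors; the function name `camilli_2pl` and all input values are hypothetical, and the example is not intended to reproduce the values shown in Fig. 10.

```python
import math

def dot(u, v):
    """Inner product of two length-2 sequences."""
    return u[0] * v[0] + u[1] * v[1]

def camilli_2pl(a, d, mu, w1, w2):
    """Unidimensional a-hat and b-hat from (8) and (9) as printed.

    a  : two-dimensional discrimination vector of the item
    d  : M2PL d-parameter of the item
    mu : mean vector of the latent abilities
    w1, w2 : first and second standardized eigenvectors
    """
    a_w1 = dot(a, w1)
    a_w2 = dot(a, w2)
    a_hat = a_w1 / math.sqrt(2.89 + a_w2 ** 2)   # a'W2 W2'a = (a'W2)^2
    b_hat = (d - dot(a, mu)) / a_w1
    return a_hat, b_hat

# Assumed RC at 45 degrees and zero means (illustrative values only).
r = math.radians(45.0)
w1 = (math.cos(r), math.sin(r))
w2 = (-math.sin(r), math.cos(r))
mu = (0.0, 0.0)

off_item = (0.3, 1.2)                       # not aligned with the RC
mdisc = math.hypot(*off_item)
aligned_item = (mdisc * w1[0], mdisc * w1[1])  # same length, aligned

a_off, b_off = camilli_2pl(off_item, 0.5, mu, w1, w2)
a_aligned, b_aligned = camilli_2pl(aligned_item, 0.5, mu, w1, w2)
print(round(a_off, 3), round(a_aligned, 3))
```

For two items of equal vector length, the one aligned with the RC receives the larger $\hat{a}$ because the $\boldsymbol{W}_{2}$ term in the denominator vanishes.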

A more detailed graphical example is illustrated in Fig. 10: a projected ICC for an M2PL item with $a_{1} = 1.5$, $a_{2} = 0$, and $d = .0$, an RC angle of 45$^\circ$, and an underlying bivariate normal distribution with a mean vector of $\{0, 0\}$ and the covariance matrix $\Omega = [\{1, .4\}, \{.4, 1\}]$, yielding $\hat{a} = .73$ and $\hat{b} = -.47$. In this figure, the components of Camilli’s formulation (7) are illustrated, including the translucent M2PL response surface, the contour of this surface, the RC ($\upsilon_{1}$), the second principal component, $\upsilon_{2}$, and the estimated unidimensional 2PL ICC with calculated p values (see values in green) for $\upsilon_{1} = \{-2, -1, 0, 1, 2\}$. The complete calculations to derive these values are provided in Appendix B.

Figure 10. Projected unidimensional ICC for a M2PL item with $a_{1} = 1.5$, $a_{2} = 0$, and $d = .5$, and a reference composite (RC) angle of 45$^\circ$ yielding 2PL $\hat{a} = .73$ and $\hat{b} = -.47$.

3.1. Illustration of How Changes in a Two-Dimensional Latent Ability Distribution Can Affect $\hat{a}$ and $\hat{b}$ Values Using Camilli’s Formulation

Using (6) and (7), one can observe how changes in the underlying two-dimensional distribution of an examinee population impact the direction of the RC and consequently affect estimated a-parameters. As an item’s vector angle approaches the angle of the RC, the estimated unidimensional $\hat{a}$-parameter increases. Conversely, when an item’s vector angle deviates from the RC’s angular direction, the estimated $\hat{a}$ becomes smaller. For illustrative purposes, consider a generated test of 19 items. The vectors of these items span angles in 5$^\circ$ increments from 0$^\circ$ to 90$^\circ$. Assume all items have an MDISC value of 1.5 and an MDIFF value of $-1.0$. (Note in (6) the d-parameter is added in the logit, thus all of these would be considered difficult items.) The vectors for this test are shown in Fig. 11. These parameters were selected for illustration purposes only and would never reflect an actual test. In most cases, real tests have more items and item vectors typically lie within a relatively narrow validity sector (e.g., 30$^\circ$–45$^\circ$). Only in extreme distributional cases would the RC be pulled out of the validity sector.
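The discrimination vectors of such a test are easy to generate, since an item with a given MDISC and angle has $a_{1} = \textrm{MDISC}\cos(\alpha)$ and $a_{2} = \textrm{MDISC}\sin(\alpha)$. The sketch below builds the 19 vectors and, assuming uncorrelated abilities with unit variances (so that $\boldsymbol{L}$ is the identity matrix and $\boldsymbol{L}'\boldsymbol{A}'\boldsymbol{A}\boldsymbol{L}$ reduces to $\boldsymbol{A}'\boldsymbol{A}$), locates the RC as the principal axis of that matrix. Variable names are hypothetical, and the $d$-parameters are left out because they do not affect the RC.

```python
import math

MDISC = 1.5
angles = list(range(0, 91, 5))   # 0, 5, ..., 90 degrees: 19 items

# Discrimination vectors (a1, a2) for each item.
items = [(MDISC * math.cos(math.radians(t)), MDISC * math.sin(math.radians(t)))
         for t in angles]

# Elements of A'A (equal to L'A'AL when L is the identity matrix).
s11 = sum(a1 * a1 for a1, _ in items)
s12 = sum(a1 * a2 for a1, a2 in items)
s22 = sum(a2 * a2 for _, a2 in items)

# Angle of the principal axis (first eigenvector) of a symmetric 2x2 matrix.
rc_angle = math.degrees(0.5 * math.atan2(2.0 * s12, s11 - s22))

print(len(items), round(rc_angle, 2))
```

Because the item vectors are spread symmetrically between the two axes, the RC falls midway between them under this assumed distribution.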

Figure 11. Item vectors for a hypothetical 19-item test.

Using (6), $\hat{a}$ values were calculated for all items under four different two-dimensional latent ability distributional conditions: $\textrm{N}\left[\left(\begin{array}{c} 0\\ 0 \end{array}\right), \left(\begin{array}{cc} 1 & .0\\ .0 & 1 \end{array}\right)\right]$, $\textrm{N}\left[\left(\begin{array}{c} 0\\ 0 \end{array}\right), \left(\begin{array}{cc} 1 & .5\\ .5 & 1 \end{array}\right)\right]$, $\textrm{N}\left[\left(\begin{array}{c} 0\\ 0 \end{array}\right), \left(\begin{array}{cc} 1 & .0\\ .0 & .5 \end{array}\right)\right]$, and $\textrm{N}\left[\left(\begin{array}{c} 0\\ 0 \end{array}\right), \left(\begin{array}{cc} .5 & .0\\ .0 & 1 \end{array}\right)\right]$. The RC-angles for these cases are 45$^\circ$, 52.15$^\circ$, 60.22$^\circ$, and 22.78$^\circ$, respectively. As the correlation increased, the RC began to shift in the 45$^\circ$ direction. When the variances of the two groups are unequal, the RC shifts toward the axis of the ability with the greater variance. The results are displayed in Fig. 12 (top). For each of the four distributional conditions, the $\hat{a}$’s tend to approach the MDISC value (4) as an item’s angular direction aligns more closely with that of the RC. These trends are depicted in the plot by the respective colored vertical lines. The range of $\hat{a}$ was (.52 (0$^\circ$) to .66 (45$^\circ$)), (.44 (0$^\circ$) to .67 (60$^\circ$)), (.34 (0$^\circ$) to .72 (75$^\circ$)), and (.34 (90$^\circ$) to .72 (15$^\circ$)) for the four respective conditions.
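The tendency for $\hat{a}$ to peak where an item's direction coincides with the RC can be sketched for the simplest condition. The code below assumes uncorrelated abilities with unit variances, an RC at 45$^\circ$, and the 19 hypothetical item angles described earlier, and it evaluates expression (8) as printed. The absolute values it produces depend on the scaling conventions assumed here and are not meant to reproduce the ranges reported in the text or in Fig. 12; only the shape of the trend is of interest.

```python
import math

MDISC = 1.5
RC_ANGLE = 45.0   # assumed RC direction for the uncorrelated, unit-variance case

def a_hat(item_angle_deg, rc_angle_deg=RC_ANGLE, mdisc=MDISC):
    """Expression (8) for an item at item_angle_deg, with W1 along the RC
    and W2 orthogonal to it."""
    t = math.radians(item_angle_deg)
    r = math.radians(rc_angle_deg)
    a = (mdisc * math.cos(t), mdisc * math.sin(t))
    w1 = (math.cos(r), math.sin(r))
    w2 = (-math.sin(r), math.cos(r))
    a_w1 = a[0] * w1[0] + a[1] * w1[1]
    a_w2 = a[0] * w2[0] + a[1] * w2[1]
    return a_w1 / math.sqrt(2.89 + a_w2 ** 2)

values = {t: a_hat(t) for t in range(0, 91, 5)}
best_angle = max(values, key=values.get)

print("item angle with largest a-hat:", best_angle)
print("a-hat at 0, 45, 90 degrees:",
      round(values[0], 3), round(values[45], 3), round(values[90], 3))
```

Under these assumptions, the estimate is largest for the item aligned with the RC and falls off symmetrically as the item angle moves toward either axis.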

Additionally, two more conditions were considered in the calculation of $\hat{b}$: $\textrm{N}\left[\left(\begin{array}{c} 1\\ 0 \end{array}\right), \left(\begin{array}{cc} 1 & .0\\ .0 & 1 \end{array}\right)\right]$ and $\textrm{N}\left[\left(\begin{array}{c} 0\\ 1 \end{array}\right), \left(\begin{array}{cc} 1 & .0\\ .0 & 1 \end{array}\right)\right]$. Although mean differences do not affect the RC or $\hat{a}$, they do affect $\hat{b}$, along with changes in variances and correlations.
Figure 12 (bottom) shows how the estimated difficulties change as the items’ measurement angles vary in reference to the RC (denoted by the colored vertical lines). The range of b ^ \documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$$\hat{b}$$\end{document} was 1.41(0 \documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$$^\circ $$\end{document} ) to.99(45 ) ) \documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$$^\circ ))$$\end{document} , 1.62 (0 ) \documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$$^\circ )$$\end{document} to 1.00(50 ) \documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$$^\circ )$$\end{document} , 2.01(0 ) \documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$$^\circ )$$\end{document} to 1.00 (60 ) \documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} 
\begin{document}$$^\circ )$$\end{document} , - \documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$$-$$\end{document} 2.01(90 ) \documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$$^\circ )$$\end{document} to - \documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$$-$$\end{document} 1.00 (30 ) \documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$$^\circ )$$\end{document} , - \documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$$-$$\end{document} 2.82(0 ) \documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$$^\circ )$$\end{document} to - \documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$$-$$\end{document} 1.41(90 ) \documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} 
\usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$$^\circ )$$\end{document} and - \documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$$-$$\end{document} 2.82(90 ) \documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$$^\circ )$$\end{document} to - \documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$$-$$\end{document} 1.41(0 ) \documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$$^\circ )$$\end{document} for the six respective conditions. As an item’s angular direction approaches the angle of the RC, the closer the b ^ \documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$$\hat{b}$$\end{document} value will be to the item’s MDIFF value (5).

Figure 12. Estimated a- (top) and b-values (bottom) for different underlying ability distributions.

Figure 13 presents a composite graph containing the 19 M2PL response surfaces and the RC plane (depicted in green, top left panel). The RC plane intersects the latent probability space at a 45$^\circ$ angle from the $\theta_1$ axis. Additionally, the 19 ICCs are graphed in the RC plane (right panel). The top right panel shows the test characteristic surface (TCS) and the corresponding contours, marked by the reference composite (indicated by the red arrow). Drawn in blue is the test characteristic curve (TCC), the sum of the 19 estimated unidimensional ICCs obtained using (4) and (9). This illustrates how closely the analytical estimates of a and b align with the two-dimensional model. Notably, the unidimensional TCC is not as steep as the TCS, indicating that the discrimination parameters may be underestimated. Wang (1986) previously compared 2PL estimates derived from generated and real data with estimates using the two-dimensional compensatory model. However, more research needs to be done in this area.
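To make the construction of the unidimensional TCC concrete, the following minimal sketch sums 2PL ICCs over a set of items. It assumes the logistic 2PL form without a scaling constant, and the item parameters are hypothetical placeholders rather than the 19 estimates plotted in Fig. 13; the function names `icc_2pl` and `tcc` are introduced here for illustration only.

```python
import math

def icc_2pl(theta, a, b):
    """Unidimensional 2PL probability of a correct response at ability theta."""
    return 1.0 / (1.0 + math.exp(-a * (theta - b)))

def tcc(theta, items):
    """Test characteristic curve: expected number-correct score at theta,
    computed as the sum of the item characteristic curves."""
    return sum(icc_2pl(theta, a, b) for a, b in items)

# Hypothetical (a_hat, b_hat) pairs, NOT values from the article.
items = [(0.5, -1.0), (0.8, 0.0), (1.2, 1.0)]

for theta in (-2.0, 0.0, 2.0):
    print(f"theta={theta:+.1f}  TCC={tcc(theta, items):.3f}")
```

Because every ICC is increasing in the ability value, the TCC is increasing as well, and flatter ICCs (smaller discriminations) produce a flatter TCC.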

At the bottom of Fig. 13 are the 19 ICCs. The red ICCs represent item vectors furthest from 45$^\circ$ and have the lowest $\hat{a}$ values, that is, the flattest ICCs. The green ICCs correspond to item vectors with angles closest to the RC angle, which have the largest $\hat{a}$ values, that is, the steepest ICCs.

Figure 13. Composite graph illustrating the 19 M2PL item response surfaces and green RC plane (top left); the test characteristic surface, contour, reference composite (red arrow), and unidimensional TCC (top right); and the 19 estimated unidimensional ICCs colored by item vector angle (bottom). (Color figure online)

In the next sections, three studies are examined. Each study provides a different perspective on how DIF can occur when response data are two-dimensional. The goal is to give testing practitioners further insight, enabling them to conduct more informed DIF analyses and to better understand the underlying causes when DIF occurs.

4. Study 1: DIF Illustrated Analytically Using Unidimensional IRT Item Calibration of a Two-Dimensional Latent Space

For this example, two pieces of research are foundational. The first builds upon the work of Shealy and Stout (1993a, 1993b) and Ackerman (1992). This research identifies one of the causes of DIF. Imagine a scenario where valid items measure the intended ability, $\theta_1$, while the test also contains items that inadvertently measure invalid skills, denoted as $\theta_2$. By adopting a multidimensional perspective, we can estimate the potential for DIF by calculating the difference between the conditional expectations of the Reference and Focal groups, $E\left[\theta_{2R} \mid \theta_1\right] - E\left[\theta_{2F} \mid \theta_1\right]$. Assuming the regression of $\theta_2$ on $\theta_1$ is linear and homoscedastic, we can express the expected conditional difference, ECD, as follows:

$$
\begin{aligned}
\text{ECD} &= E\left[\theta_{2R} \mid \theta_1\right] - E\left[\theta_{2F} \mid \theta_1\right] \\
&= \left(\mu_{\theta_2 R} - \mu_{\theta_2 F}\right) + \left(\rho_R \frac{\sigma_{\theta_2 R}}{\sigma_{\theta_1 R}}\right)\left(\theta_1 - \mu_{\theta_1 R}\right) - \left(\rho_F \frac{\sigma_{\theta_2 F}}{\sigma_{\theta_1 F}}\right)\left(\theta_1 - \mu_{\theta_1 F}\right),
\end{aligned}
\tag{10}
$$

where for the Focal group, $\theta_{1F}$ and $\theta_{2F}$ are the two latent variables, which follow a bivariate normal distribution with mean vector components $\mu_{\theta_1 F}$ and $\mu_{\theta_2 F}$ and variance–covariance matrix given as

$$
\begin{pmatrix} \theta_{1F}\\ \theta_{2F} \end{pmatrix} \sim N\left[\begin{pmatrix} \mu_{\theta_1 F}\\ \mu_{\theta_2 F} \end{pmatrix}, \begin{pmatrix} \sigma^2_{\theta_1 F} & \rho_F \sigma_{\theta_1 F}\sigma_{\theta_2 F}\\ \rho_F \sigma_{\theta_1 F}\sigma_{\theta_2 F} & \sigma^2_{\theta_2 F} \end{pmatrix}\right],
$$

and for the Reference group, $\theta_{1R}$ and $\theta_{2R}$ are the two latent variables, which follow a bivariate normal distribution with mean vector components $\mu_{\theta_1 R}$ and $\mu_{\theta_2 R}$ and variance–covariance matrix given as

$$
\begin{pmatrix} \theta_{1R}\\ \theta_{2R} \end{pmatrix} \sim N\left[\begin{pmatrix} \mu_{\theta_1 R}\\ \mu_{\theta_2 R} \end{pmatrix}, \begin{pmatrix} \sigma^2_{\theta_1 R} & \rho_R \sigma_{\theta_1 R}\sigma_{\theta_2 R}\\ \rho_R \sigma_{\theta_1 R}\sigma_{\theta_2 R} & \sigma^2_{\theta_2 R} \end{pmatrix}\right].
$$
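As a sketch of the intermediate step behind Equation 10, the standard conditional mean of a bivariate normal distribution can be written for either group. The group index $g$ is introduced here only for illustration and is not part of the original notation:

$$
E\left[\theta_{2g} \mid \theta_1\right] = \mu_{\theta_2 g} + \rho_g \frac{\sigma_{\theta_2 g}}{\sigma_{\theta_1 g}}\left(\theta_1 - \mu_{\theta_1 g}\right), \qquad g \in \{R, F\}.
$$

Subtracting the Focal group expression ($g = F$) from the Reference group expression ($g = R$) and collecting the two mean terms gives the three terms of Equation 10.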

Equation 10 serves as a “Rosetta stone” for understanding how the potential for DIF arises. If we have estimates of the characteristics of the underlying ability distributions for the two groups of interest, we gain insight into how different parameters of the Reference and Focal distributions contribute to conditional differences with the potential for DIF. The conditional difference serves as a weighting of the differences between ICCs that have been calculated separately for each group and then rescaled. While the ECD is not a DIF analysis, it becomes valuable when subgroup distributional differences are estimated. It helps identify the potential for DIF to be significant using more traditional DIF approaches (e.g., SIBTEST, Mantel–Haenszel). When the ECD is close to zero, no DIF should be expected. If the ECD results in a constant value, uniform DIF could occur. Uniform DIF occurs when the rescaled ICCs for two groups differ only in their difficulty or $\hat{b}$ values (i.e., one group consistently has a lower probability of correct response across all levels of $\theta$). When the ECD is a function of $\theta$ (e.g., $.4\theta$), it suggests that nonuniform DIF could occur. Nonuniform DIF occurs when the rescaled ICCs for two groups differ only in their discrimination or $\hat{a}$ values (i.e., for some levels of $\theta$ the Reference group has a higher probability of correct response, while for other levels of $\theta$ the Focal group has a higher probability of correct response).
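The reading of Equation 10 described above can be sketched in a few lines of code. The example below rewrites the ECD as an intercept plus a slope in $\theta_1$ and labels the DIF potential accordingly. The function names, the dictionary layout, and all distribution values are hypothetical choices made for illustration; they are not taken from Table 3.

```python
import math

def ecd_coefficients(ref, foc):
    """Return (intercept, slope) such that ECD(theta1) = intercept + slope * theta1.

    Each group is a dict with means mu1, mu2, variances var1, var2,
    and correlation rho, matching the terms of Equation 10.
    """
    k_ref = ref["rho"] * math.sqrt(ref["var2"] / ref["var1"])
    k_foc = foc["rho"] * math.sqrt(foc["var2"] / foc["var1"])
    intercept = (ref["mu2"] - foc["mu2"]) - k_ref * ref["mu1"] + k_foc * foc["mu1"]
    slope = k_ref - k_foc
    return intercept, slope

def dif_potential(ref, foc, tol=1e-9):
    """Label the DIF potential implied by the ECD: none, uniform, or nonuniform."""
    intercept, slope = ecd_coefficients(ref, foc)
    if abs(slope) > tol:
        return "nonuniform"   # ECD changes with theta1
    if abs(intercept) > tol:
        return "uniform"      # ECD is a nonzero constant
    return "none"             # ECD is zero everywhere

# Hypothetical distributions, chosen only to trigger each outcome.
base = dict(mu1=0.0, mu2=0.0, var1=1.0, var2=1.0, rho=0.5)
shifted = dict(mu1=0.0, mu2=1.0, var1=1.0, var2=1.0, rho=0.5)
wide = dict(mu1=0.0, mu2=0.0, var1=4.0, var2=1.0, rho=0.5)

for label, ref in (("identical", base), ("mean shift", shifted), ("variance change", wide)):
    print(label, ecd_coefficients(ref, base), dif_potential(ref, base))
```

The tolerance `tol` is an arbitrary numerical threshold; in practice the distributional parameters are estimates, so a substantive cutoff for "close to zero" would need to be chosen by the analyst.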

The second piece of research takes a systematic analytic approach to exploring DIF and builds upon the work of Camilli (1992). In this study, we investigate scenarios in which unidimensional 2PL item parameters are analytically estimated even though the true model is the M2PL model. This situation, described by Equations (4) and (9), characterizes practitioners who ignore the dimensionality of their test data. For illustration purposes, we created a ten-item two-dimensional test. The M2PL item parameters are shown in Table 2. The first nine items fall in a narrow validity sector and primarily measure $\theta_1$. Item 10, however, serves as a potential DIF item, primarily measuring $\theta_2$ with a vector angle of 87.72$^\circ$. A vector plot of the ten items is shown in the upper left panel of Fig. 14. Although the same parameters for all ten items were used for both the Reference and Focal groups, their underlying ability distributions differ. Consequently, their RCs also differ, leading to distinct unidimensional item parameter estimates. Using Camilli’s formulation, Equations (4) and (9) were used to examine three different cases with dissimilar underlying distributions for the Reference and Focal groups. After calculating the $\hat{a}$ and $\hat{b}$ values, the item parameters for the Reference group were placed on the Focal group’s scale using a mean–mean transformation (Kolen and Brennan, 2014).
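The rescaling step can be sketched as follows. This is a minimal illustration of the mean–mean method in its usual linear form, $\theta_{\text{to}} = A\,\theta_{\text{from}} + B$, and it assumes that every item in the lists serves as a common item; the article does not state which items were used for linking, and the parameter values below are hypothetical.

```python
def mean_mean_constants(a_from, b_from, a_to, b_to):
    """Slope A and intercept B of the transformation theta_to = A * theta_from + B."""
    mean = lambda xs: sum(xs) / len(xs)
    A = mean(a_from) / mean(a_to)
    B = mean(b_to) - A * mean(b_from)
    return A, B

def rescale(a_from, b_from, A, B):
    """Place 'from'-scale item parameters on the 'to' scale."""
    return [a / A for a in a_from], [A * b + B for b in b_from]

# Hypothetical 2PL estimates for the same items in two groups (not from Table 3).
a_ref, b_ref = [1.0, 1.2, 0.8], [-0.5, 0.0, 0.5]
a_foc, b_foc = [0.5, 0.6, 0.4], [0.0, 1.0, 2.0]

A, B = mean_mean_constants(a_ref, b_ref, a_foc, b_foc)
a_star, b_star = rescale(a_ref, b_ref, A, B)
print(A, B)
print(a_star, b_star)
```

After rescaling, the mean discrimination and mean difficulty of the Reference estimates match those of the Focal estimates, so any remaining item-level differences can be compared on a common scale.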

Table 2 Generating compensatory model parameters for a 10-item test.

$^{a}$ This item is considered to be a biased item.

Figure 14. Graphical displays of the item vectors, and the underlying Reference (red) and Focal (green) distributions for each of the three cases outlined in Table 3. (Color figure online)

In Table 3, results are presented for two cases in which the Focal and Reference groups’ underlying ability distributions are identical and three cases in which the distributions differ. In this table, the underlying bivariate normal distributional parameters and the angular direction of the RC for each group are listed in the first two columns. The third column provides the ECD based on each group’s distributional parameters. The fourth and fifth columns contain the 2PL $\hat{a}$ and $\hat{b}$ values for Item 10, obtained using Camilli’s formulation (4) and (9) for each group. Note that once the Reference parameters were calculated, they were rescaled using a mean–mean transformation. In the final column, the ICC differences are designated as displaying no DIF, uniform DIF, or nonuniform DIF. It is essential to reemphasize that DIF is caused by differences in the underlying ability distributions.

As can be seen in the first two rows of Table 3, when the underlying distributions are identical, the ECD $=$ .0, and the $\hat{a}$ and $\hat{b}$ values for Item 10 are identical. Thus, whenever the underlying distributions are identical, regardless of the different composites being measured by the items, there should be no DIF.

Table 3 Analytical results of estimated 2PL item parameters for Item 10 for the Reference and Focal group based on their underlying different distributions.

$^{1}$ Reference group estimated parameters were rescaled to the Focal group’s scale using the mean–mean transformation.

$^{2}$ U $=$ Uniform DIF; NU $=$ Nonuniform DIF.

4.1. Case 1: Unequal $\mu_{\theta_1}$- and Unequal $\mu_{\theta_2}$-Values: Uniform Bias

This case has two parts: one in which the Reference and Focal groups differ in their $\theta_1$ means and one in which they differ in their $\theta_2$ means. The ECD values were 2 and $-$2, respectively. There were no $\hat{a}$ differences. However, when the Reference group had the greater $\theta_1$ mean, the estimated difficulty of Item 10 differed markedly between the groups, with $\hat{b} = 1.78$ for the Reference group compared to $\hat{b} = -.24$ for the Focal group. Conversely, when the Reference group had the greater mean for $\theta_2$, the Reference group’s $\hat{b} = -2.30$ versus $\hat{b} = 1.73$ for the Focal group. Notably, the RC angle was 26.68$^\circ$ for both groups. Since these examples resulted in only $\hat{b}$ differences, they were an indication of uniform DIF.

4.2. Case 2: Unequal $\eta$ Variance: Nonuniform Bias

In Case 2a, the Reference group has $\sigma_{\theta_{1}}^{2} = 2.5$, $\sigma_{\theta_{2}}^{2} = 1.0$, and $\rho = .4$, whereas the Focal group has $\sigma_{\theta_{1}}^{2} = .5$, $\sigma_{\theta_{2}}^{2} = 1.0$, and $\rho = .4$. Note that $\sigma_{\theta_{1}}^{2}$ is five times larger for the Reference group. In Case 2b, the variance differences were on $\theta_{2}$: $\sigma_{\theta_{2}}^{2} = 2.5$ for the Reference group and $\sigma_{\theta_{2}}^{2} = .5$ for the Focal group. The ECD was $-.31\,\theta_{1}$, and the RC-angles were $16.40^\circ$ and $42.05^\circ$ for the Focal group and $37.78^\circ$ and $18.23^\circ$ for the Reference group for Case 2a and Case 2b, respectively. In both of these cases, both $\hat{a}$ and $\hat{b}$ differed between the two groups, resulting in nonuniform DIF.
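
The ECD value reported for Case 2a can be checked against the bivariate normal regression of $\theta_{2}$ on $\theta_{1}$, whose slope is $\rho\,\sigma_{\theta_{2}}/\sigma_{\theta_{1}}$. The following is a minimal sketch, not the authors' code. It assumes the two groups share the same means (so the intercepts cancel) and reads the ECD as the Reference-minus-Focal difference in $E[\theta_{2} \mid \theta_{1}]$; the function name `conditional_slope` is hypothetical.

```python
import math

def conditional_slope(var_given: float, var_other: float, rho: float) -> float:
    """Slope of E[other | given] for a bivariate normal pair."""
    return rho * math.sqrt(var_other) / math.sqrt(var_given)

# Case 2a values from the text: theta1 variance, theta2 variance, correlation.
ref_slope = conditional_slope(var_given=2.5, var_other=1.0, rho=0.4)
foc_slope = conditional_slope(var_given=0.5, var_other=1.0, rho=0.4)

# Difference in expected theta2 per unit of theta1 (Reference minus Focal).
ecd_slope = ref_slope - foc_slope
print(round(ecd_slope, 2))  # prints -0.31
```

Under these assumptions the slope difference reproduces the coefficient of $\theta_{1}$ quoted in the text.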

4.3. Case 3: Unequal Correlations: Nonuniform Bias

In this case, the $\theta_{1}$ and $\theta_{2}$ means and variances are equal but the correlations differ: Reference $\rho_{\theta_{1},\theta_{2}} = .8$ and Focal $\rho_{\theta_{1},\theta_{2}} = .2$. The expected difference is $E\left[\eta_{R} \mid \theta\right] - E\left[\eta_{F} \mid \theta\right] = .6\,\theta$. The RC-angles were $16.17^\circ$ for the Focal group and $40.67^\circ$ for the Reference group. Again, both $\hat{a}$ and $\hat{b}$ differed between the groups, indicating nonuniform DIF.
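
The $.6\,\theta$ figure follows from the conditional mean of a bivariate normal distribution. The short derivation below treats $\theta$ as $\theta_{1}$ and $\eta$ as $\theta_{2}$, and assumes that both dimensions have mean 0 and equal variances within each group. The text states only that the means and variances are equal across groups, so these specific values are an assumption.

```latex
\begin{aligned}
E\left[\eta_{G} \mid \theta\right]
  &= \mu_{\eta} + \rho_{G}\,\frac{\sigma_{\eta}}{\sigma_{\theta}}\,(\theta - \mu_{\theta}),
  \qquad G \in \{R, F\}, \\
E\left[\eta_{R} \mid \theta\right] - E\left[\eta_{F} \mid \theta\right]
  &= (\rho_{R} - \rho_{F})\,\frac{\sigma_{\eta}}{\sigma_{\theta}}\,(\theta - \mu_{\theta})
   = (.8 - .2)\,\theta = .6\,\theta .
\end{aligned}
```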

Using the software package Mathematica (Wolfram, 2020), Ackerman and Xie (2019) created a DIF Graphical Simulator. This simulator allows researchers to change the underlying two-dimensional latent distributions for the Reference and Focal groups as well as the M2PL item parameters for a given suspect item. A graphical example is given in Appendix C.

5. Study 2: Examining the Effect of Different Conditioning Scores on DIF Analyses

Ackerman and Evans (1994) employed two DIF approaches that are standard among testing practitioners: Mantel–Haenszel and Sibtest. We will first provide a brief background of each approach and then examine how the approaches were used to assess the effect of changing the conditioning score in an attempt to account for the complete latent ability space used by examinees to respond to a hypothetical two-dimensional test. The biggest cause of DIF when conditioning on raw scores (e.g., Mantel–Haenszel) is that the raw score may not always account for the complete latent ability space that examinees used to respond to the items. In this study, two DIF statistics were used: the Mantel–Haenszel (MH) statistic (Holland and Thayer, 1988) and the Sibtest $\beta_{U}$ statistic (Shealy and Stout, 1993a, b). Both statistics are conditional analyses that group examinees by their number correct score.

5.1. Background: Mantel–Haenszel and Sibtest DIF Detection Methods

When calculating the MH-statistic for item i, we consider two groups of examinees: the Reference group and the Focal group. The Focal (F) group typically represents a minority group (e.g., Hispanic examinees). The Reference (R) group is frequently a nonminority group (e.g., White examinees). Examinees from each group are matched based on their number correct score.

For each score category $j$, a 2 × 2 contingency table (Table 4) is created. This table records the frequency of correct and incorrect answers for each group, along with the marginal and total frequencies. It is tacitly assumed that examinees in the same contingency table are matched on the latent abilities that they used to respond to the item being examined. The centipede plot above (Fig. 9) illustrates that examinees with the same number correct score can nevertheless have quite different two-dimensional latent ability profiles.

Table 4. A 2 × 2 contingency table used in the MH computation.

Summing over the contingency tables for item i and using a continuity correction, the MH statistic is calculated as

$$\mathrm{MH}_{i} = \frac{\left[\left|\sum_{j} A_{j} - \sum_{j} E\left(A_{j}\right)\right| - \frac{1}{2}\right]^{2}}{\sum_{j} \mathrm{Var}\left(A_{j}\right)},$$

where the expected value of Cell A frequency is given as

$$E\left(A_{j}\right) = \frac{N_{Rj}N_{1.j}}{N_{..j}}$$

and the variance of cell A frequencies equals

$$\mathrm{Var}\left(A_{j}\right) = \frac{N_{Rj}N_{Fj}N_{1.j}N_{0.j}}{\left(N_{..j}\right)^{2}\left(N_{..j} - 1\right)}.$$

Typically, the MH statistic is used to test the null hypothesis that, for each raw score category $j$, the odds of a Reference group examinee answering the item correctly equal the odds of a Focal group examinee answering the item correctly (Holland and Thayer, 1988). That is, if $p_{Rj}$ and $p_{Fj}$ are the probabilities of a Reference and Focal group examinee answering the item correctly, respectively, and $q_{Rj}$ and $q_{Fj}$ are the probabilities of a Reference and Focal group examinee answering the item incorrectly, respectively,

$$H_{o}: \frac{p_{Rj}}{q_{Rj}} = \frac{p_{Fj}}{q_{Fj}}, \quad j = 1, \ldots, K$$

is tested against the alternative of uniform DIF,

$$H_{1}: \frac{p_{Rj}}{q_{Rj}} = \alpha\,\frac{p_{Fj}}{q_{Fj}}, \quad \alpha \ne 1, \quad j = 1, \ldots, K$$

where $H_{o}$ is the null hypothesis, $H_{1}$ is the alternative hypothesis, and $\alpha$ is the common odds ratio in the $K$ 2 × 2 tables. Uniform DIF occurs when the rescaled, unidimensional item response functions differ only in difficulty. When $H_{o}$ is true, MH is distributed approximately as $\chi^{2}$ with 1 degree of freedom. It should be noted that for a given examinee, the score (i.e., 0 vs 1) on the item being examined is part of the conditioning score.
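
The MH computation above can be sketched in a few lines of Python. This is an illustrative sketch rather than the implementation used in the study. Because the body of Table 4 is not reproduced here, the cell labels `A`, `B`, `C`, `D` (Reference correct, Reference incorrect, Focal correct, Focal incorrect) are assumed, and the function names are hypothetical. The common odds ratio uses the standard Mantel–Haenszel estimator, which the text does not write out.

```python
from typing import Sequence, Tuple

# One stratum j is (A, B, C, D):
#   A = Reference correct, B = Reference incorrect,
#   C = Focal correct,     D = Focal incorrect.   (assumed labeling)
Table = Tuple[int, int, int, int]

def mantel_haenszel_chi2(tables: Sequence[Table]) -> float:
    """Continuity-corrected MH chi-square, summed over score strata."""
    sum_a = sum_expected = sum_var = 0.0
    for a, b, c, d in tables:
        n_r, n_f = a + b, c + d      # group totals N_Rj, N_Fj
        n_1, n_0 = a + c, b + d      # correct / incorrect totals N_1.j, N_0.j
        n = n_r + n_f                # stratum total N_..j
        if n < 2:
            continue                 # variance term is undefined here
        sum_a += a
        sum_expected += n_r * n_1 / n
        sum_var += n_r * n_f * n_1 * n_0 / (n ** 2 * (n - 1))
    return (abs(sum_a - sum_expected) - 0.5) ** 2 / sum_var

def common_odds_ratio(tables: Sequence[Table]) -> float:
    """Standard MH estimate of the common odds ratio alpha."""
    num = sum(a * d / (a + b + c + d) for a, b, c, d in tables)
    den = sum(b * c / (a + b + c + d) for a, b, c, d in tables)
    return num / den

# Two made-up score strata, for illustration only.
example_tables = [(40, 10, 30, 20), (30, 20, 20, 30)]
mh = mantel_haenszel_chi2(example_tables)
alpha_hat = common_odds_ratio(example_tables)
print(mh, alpha_hat)
```

An estimated $\alpha$ above 1 would indicate that, at matched scores, the odds of a correct response are higher for the Reference group.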

According to Shealy and Stout (1993a, b), DIF should be conceptualized by examining the difference in certain marginal item characteristic curves for the two groups of interest:

$$P\left(X_{i} = 1 \mid \Theta = \theta\right) = \int P_{i}\left[\theta, \eta\right] f\left(\eta \mid \theta\right)\,\mathrm{d}\eta,$$

where $P_{i}\left[\theta, \eta\right]$ is the M2PL model (Eq. 1) and $f(\eta \mid \theta)$ is a specified group's conditional distribution of the nuisance dimension, $\eta$, given a fixed value of $\theta$, the target ability. That is, for a fixed value of $\theta$, $P(X_{i} = 1 \mid \Theta = \theta)$ is obtained by averaging $P_{i}[\theta, \eta]$ over $\eta$. In other words, $P(X_{i} = 1 \mid \Theta = \theta)$ is the ICC that results when the differences in the nuisance direction are integrated out.
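
To make the integral concrete, the sketch below evaluates the marginal ICC numerically for two groups. It is a hedged illustration only. It assumes the compensatory logistic form $P_{i}[\theta, \eta] = 1/(1 + e^{-(a_{1}\theta + a_{2}\eta + d)})$ for the M2PL model (Eq. 1 is not reproduced in this section) and a bivariate normal ability distribution, and it uses Gauss–Hermite quadrature. The function names and the example item are hypothetical.

```python
import numpy as np
from numpy.polynomial.hermite import hermgauss

def m2pl(theta, eta, a1, a2, d):
    """Assumed compensatory M2PL response surface."""
    return 1.0 / (1.0 + np.exp(-(a1 * theta + a2 * eta + d)))

def marginal_icc(theta, a1, a2, d, mean, sd, rho, n_nodes=41):
    """P(X = 1 | theta): average the M2PL surface over eta | theta."""
    mu_t, mu_e = mean
    sd_t, sd_e = sd
    cond_mean = mu_e + rho * sd_e / sd_t * (theta - mu_t)
    cond_sd = sd_e * np.sqrt(1.0 - rho ** 2)
    # Gauss-Hermite integrates against exp(-x^2); rescale nodes to N(cond_mean, cond_sd^2).
    nodes, weights = hermgauss(n_nodes)
    eta = cond_mean + np.sqrt(2.0) * cond_sd * nodes
    return float(np.sum(weights * m2pl(theta, eta, a1, a2, d)) / np.sqrt(np.pi))

# Hypothetical item measuring both dimensions equally (45 degrees, MDISC = 1.5).
a1 = a2 = 1.5 * np.cos(np.pi / 4)
p_ref = marginal_icc(0.0, a1, a2, 0.0, mean=(1.0, -1.0), sd=(1.0, 1.0), rho=0.4)
p_foc = marginal_icc(0.0, a1, a2, 0.0, mean=(-1.0, 1.0), sd=(1.0, 1.0), rho=0.4)
print(p_ref, p_foc)
```

With these illustrative group distributions, examinees matched at the same $\theta$ have different expected $\eta$ values, so the two marginal curves separate even though the item parameters are identical for both groups.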

An estimate of the Sibtest test statistic is given as

$$\hat{\beta}_{U} = \sum_{h=0}^{n} \hat{p}_{h}\left(\bar{Y}_{Rh} - \bar{Y}_{Fh}\right)$$

where

$$\hat{p}_{h} = \frac{G_{Rh} + G_{Fh}}{\sum_{j=0}^{n}\left(G_{Rj} + G_{Fj}\right)},$$

and $G_{Rh}$ and $G_{Fh}$ are the numbers of examinees in the Reference and Focal groups at the valid score $X = h$. The Sibtest test statistic is computed as

$$\beta_{U} = \frac{\hat{\beta}_{U}}{\hat{\sigma}(\hat{\beta}_{U})},$$

where the standard deviation in the denominator is calculated as

$$\hat{\sigma}(\hat{\beta}_{U}) = \left\{\sum_{h=0}^{n} \hat{p}_{h}^{2}\left[\frac{1}{G_{Rh}}\hat{\sigma}^{2}\left(Y \mid h, R\right) + \frac{1}{G_{Fh}}\hat{\sigma}^{2}\left(Y \mid h, F\right)\right]\right\}^{1/2}.$$

The test statistic has an approximate $N(0,1)$ distribution when no DIF is present. Unlike in the MH procedure, an examinee's score on the studied item is not part of the conditioning score. Sibtest resolves rescaling issues by means of a regression correction (Shealy and Stout, 1993a, b).
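
The Sibtest quantities can be sketched as follows. This is a simplified illustration, not the Sibtest program itself: it omits the regression correction mentioned above, it takes the weight for each valid score to be the pooled proportion of examinees at that score, and it skips score levels where either group has fewer than two examinees. The function name `sibtest_uncorrected` and the toy data are hypothetical.

```python
import numpy as np

def sibtest_uncorrected(y, x, is_ref):
    """Return (beta_hat, se, z) for one studied item, without regression correction.

    y      : 0/1 responses to the studied item
    x      : valid-subtest score for each examinee
    is_ref : True for Reference examinees, False for Focal examinees
    """
    y = np.asarray(y, dtype=float)
    x = np.asarray(x)
    is_ref = np.asarray(is_ref, dtype=bool)
    counts, diffs, variances = [], [], []
    for h in np.unique(x):
        ref = y[is_ref & (x == h)]
        foc = y[~is_ref & (x == h)]
        if ref.size < 2 or foc.size < 2:
            continue  # a group variance cannot be estimated at this score
        counts.append(ref.size + foc.size)
        diffs.append(ref.mean() - foc.mean())
        variances.append(ref.var(ddof=1) / ref.size + foc.var(ddof=1) / foc.size)
    p_hat = np.array(counts) / np.sum(counts)   # weights over retained score levels
    beta_hat = float(np.sum(p_hat * np.array(diffs)))
    se = float(np.sqrt(np.sum(p_hat ** 2 * np.array(variances))))
    z = beta_hat / se if se > 0 else float("nan")
    return beta_hat, se, z

# Toy data: two valid-score levels, four examinees per group at each level.
y = [1, 1, 0, 0, 1, 0, 0, 0, 1, 1, 1, 0, 1, 1, 0, 0]
x = [0] * 8 + [1] * 8
is_ref = ([True] * 4 + [False] * 4) * 2
beta_hat, se, z = sibtest_uncorrected(y, x, is_ref)
print(beta_hat, se, z)
```

A positive $\hat{\beta}_{U}$ in this sketch means the Reference group outperforms the Focal group on the studied item at matched valid scores.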

5.2. DIF Detection with Different Conditioning Scores

Using the Mantel–Haenszel and Sibtest statistics, Ackerman and Evans (1994) examined the impact of different conditioning scores on DIF results. Specifically, they looked at generated two-dimensional compensatory data where $\theta$ was the valid skill and $\eta$ represented the invalid skill. The testing scenario involved a 30-item test measuring $(\theta, \eta)$-composites spanning measurement angles from $0^\circ$ to $90^\circ$, in $3^\circ$ increments. A vector plot of these items is displayed in Fig. 15. All items had a difficulty parameter ($d_{i}$) value of zero and an MDISC value of 1.5. The Reference and Focal groups had different latent ability distributions. The bivariate normal distributions for the Reference and Focal groups were $N\left[\begin{pmatrix} 1 \\ -1 \end{pmatrix}, \begin{pmatrix} 1 & .4 \\ .4 & 1 \end{pmatrix}\right]$ and $N\left[\begin{pmatrix} -1 \\ 1 \end{pmatrix}, \begin{pmatrix} 1 & .4 \\ .4 & 1 \end{pmatrix}\right]$, respectively. Three different sample size pairings were used, but results were similar for each pairing. The results shown here are for the pairing $N_{\mathrm{Ref}} = 1000$ and $N_{\mathrm{Foc}} = 500$.
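
A data-generation setup of this kind might be sketched as below. This is not the original simulation code. It assumes the compensatory logistic M2PL form, the usual relation $\text{MDISC} = \sqrt{a_{1}^{2} + a_{2}^{2}}$, and thirty angles at $0^\circ, 3^\circ, \ldots, 87^\circ$. The choice of endpoints is an assumption, since the text gives the range as $0^\circ$ to $90^\circ$. The seed and all variable names are arbitrary.

```python
import numpy as np

rng = np.random.default_rng(0)  # arbitrary seed, for reproducibility only

# Thirty item directions in 3-degree steps (assumed endpoints: 0 to 87 degrees).
angles = np.deg2rad(np.arange(30) * 3.0)
mdisc = 1.5
a = mdisc * np.column_stack([np.cos(angles), np.sin(angles)])  # (a1, a2) per item
d = np.zeros(30)                                                # all d_i = 0

cov = np.array([[1.0, 0.4], [0.4, 1.0]])  # shared covariance matrix

def simulate(mean, n):
    """Draw (theta, eta) pairs and 0/1 responses under the assumed M2PL form."""
    abilities = rng.multivariate_normal(mean, cov, size=n)  # columns: theta, eta
    prob = 1.0 / (1.0 + np.exp(-(abilities @ a.T + d)))
    responses = (rng.random(prob.shape) < prob).astype(int)
    return abilities, responses

ref_abilities, ref_responses = simulate([1.0, -1.0], 1000)  # Reference group
foc_abilities, foc_responses = simulate([-1.0, 1.0], 500)   # Focal group
```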

The purpose of comparing the four different conditioning scores is to illustrate that DIF can occur when one has not accounted for the complete latent ability space. Here are the details for each conditioning variable:

  1. To condition on $\theta$, the transformation used is $X_{\theta} = 10(\theta) + 25$. In this case, the ability $\eta$ is not accounted for and DIF should increase the more the ability $\eta$ is required (i.e., as the angle of the item vector increases toward $90^\circ$).

  2. To condition on $\eta$, the transformation used is $X_{\eta} = 10(\eta) + 25$. When conditioning on $\eta$, the ability $\theta$ is not accounted for, and DIF should increase the more the ability $\theta$ is required (i.e., as the angle of the item vector decreases toward $0^\circ$).

  3. The number correct score is equivalent to the case where $\theta = \eta$ (i.e., the RC angle is at $45^\circ$). Items that require an equal weighting of both skills (i.e., $a_{1} = a_{2}$) should show no DIF. However, DIF should increase as items require more of the $\theta$-skill (item vectors approach $0^\circ$) or more of the $\eta$-skill (item vectors approach $90^\circ$).

  4. Finally, to condition on $(\theta, \eta)$, the latent ability plane was divided into 64 square regions using an 8 × 8 grid (Fig. 16). All examinees in the same square of the grid were assigned the same conditioning score (e.g., examinees in the square $-3 \le \theta < -2.25$ and $2.25 < \eta \le 3$ would be assigned a conditioning score of 1). Note that the conditioning score does not enter the calculation of either DIF statistic, but rather ensures that examinees with the same abilities are placed into the same 2 × 2 conditioning table. When the conditioning score is a function of $(\theta, \eta)$, the complete latent ability space is accounted for and there should be no DIF.
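The grid-based conditioning score in step 4 can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name `grid_conditioning_score` is hypothetical, and the text specifies only that the top-left square receives a score of 1, so numbering the remaining cells row by row and clipping abilities outside $[-3, 3]$ into the edge cells are assumptions.

```python
import math


def grid_conditioning_score(theta, eta, lo=-3.0, hi=3.0, n=8):
    """Return a cell label in 1..n*n for an ability pair on an n x n grid.

    Assumptions (not stated in the text): cells are numbered row by row
    starting at the top-left square (lowest theta, highest eta), and
    abilities outside [lo, hi] are clipped into the nearest edge cell.
    """
    width = (hi - lo) / n
    # Column index grows with theta; each interval includes its lower bound.
    col = min(max(math.floor((theta - lo) / width), 0), n - 1)
    # Row index grows as eta falls; each interval includes its upper bound.
    row = min(max(math.floor((hi - eta) / width), 0), n - 1)
    return row * n + col + 1


# The square -3 <= theta < -2.25 and 2.25 < eta <= 3 is cell 1.
print(grid_conditioning_score(-2.9, 2.9))
```

Examinees from both groups that share a cell label would then be placed in the same 2 × 2 table, so the label itself acts only as a matching key.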

Figure 15. Item vectors for the 30-item symmetric test and conditioning score composite directions.

Figure 16. Overlaying an 8 × 8 grid on the $(\theta, \eta)$ latent ability plane with Reference (red) and Focal (green) underlying and marginal distributions. (Color figure online)

Results of the DIF analyses using the four different conditioning scores are illustrated in Fig. 17. When the conditioning score was the number correct, both DIF procedures identified items 1–9 and 22–30 as showing DIF 100% of the time. When the conditioning score was a linear transformation of the generating $\theta$, items 9–30 were rejected 100% of the time. Slight differences were observed between the MH and SIBTEST $\beta_{U}$ results. $\beta_{U}$ appeared to be more sensitive to DIF: its rejection rate increased faster than that of MH as the angular composite of the item deviated from $0^\circ$. This is shown by the fact that the rejection rate reached or exceeded .9 by Item 7 for $\beta_{U}$, whereas this did not occur until Item 9 for MH. Note that rather than using the SIBTEST regression correction for $\beta_{U}$, the conditioning variable was based on a latent trait parameter.
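To make the role of the 2 × 2 conditioning tables concrete, the following is a minimal sketch of the standard Mantel–Haenszel common odds ratio and continuity-corrected chi-square, computed from one table per conditioning-score level. The function name `mantel_haenszel` and the example counts are hypothetical, and this is not the code used for the reported results.

```python
def mantel_haenszel(tables):
    """Mantel-Haenszel statistics from stratified 2 x 2 tables.

    tables: iterable of (a, b, c, d), one per conditioning-score level, with
        a = Reference correct, b = Reference incorrect,
        c = Focal correct,     d = Focal incorrect.
    Returns (common odds ratio, continuity-corrected chi-square with 1 df).
    """
    sum_a = sum_expected = sum_var = num = den = 0.0
    for a, b, c, d in tables:
        t = a + b + c + d
        if t < 2:
            continue  # a level with fewer than two examinees adds nothing
        n_ref, n_foc = a + b, c + d
        m_right, m_wrong = a + c, b + d
        sum_a += a
        sum_expected += n_ref * m_right / t
        sum_var += n_ref * n_foc * m_right * m_wrong / (t * t * (t - 1))
        num += a * d / t
        den += b * c / t
    if sum_var == 0 or den == 0:
        raise ValueError("tables carry too little information for MH")
    chi_square = (abs(sum_a - sum_expected) - 0.5) ** 2 / sum_var
    return num / den, chi_square


# Hypothetical counts for three score levels, Reference group favored.
odds_ratio, chi_square = mantel_haenszel([(15, 5, 10, 10)] * 3)
print(odds_ratio, chi_square)
```

An odds ratio above 1 indicates that, among matched examinees, the Reference group had higher odds of a correct response; the chi-square is compared with the 1-df critical value (3.84 at the .05 level).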

Figure 17. MH and SIBTEST DIF results by item for each of the four conditioning scores.

As hypothesized, the opposite results occurred when the valid test direction was along the $\eta$-axis. That is, items measuring $\theta$ (i.e., Items 1–23) consistently exhibited DIF in favor of the Reference group. For the final analysis, the 64 score categories matching examinees on both $\theta$ and $\eta$ were used as the conditioning scores. The results, as shown in Fig. 17, revealed no DIF for any of the 30 items, regardless of the DIF procedure. However, it is important to recognize that this purely hypothetical testing situation would not occur in practice. Its purpose was to illustrate the significance of underlying group distributional differences interacting with items that measure a spread of two-dimensional composite skills. Identifying the specific composite skills being measured remains a genuine challenge for testing practitioners. Several studies, such as those by Clauser et al. (1996) and Mazor et al. (1998), have examined using multiple conditioning scores in an attempt to condition on the complete latent ability space.

6. DIF Even Though Reference and Focal Two-Dimensional Distributions are Identical

6.1. The Two-Dimensional Noncompensatory MIRT Model

Up to this point, the discussion has centered on how DIF can occur when items measure invalid skill composites and the two groups of interest have different ability distributions related to the invalid skill. Interestingly, DIF can also occur when the underlying two-dimensional ability distributions are identical and the vectors corresponding to the test items lie in a very narrow validity sector (e.g., a unidimensional test). This phenomenon was examined by Ackerman and Evans (1994), Bolt and Johnson (2009), and Ackerman et al. (2014). They found that DIF can occur when a two-dimensional test contains items for which different groups of students use distinct approaches to solve the same problem. These divergent solution strategies could arise from pedagogical differences in how students were taught to process information, particularly in items such as “story” problems. For instance, one group might have been explicitly taught to combine pieces of information, whereas another group was not instructed to integrate or combine these pieces.

In Ackerman and Evans (1994) and Ackerman et al. (2014), the different strategies were modeled using two distinct MIRT models. The integration strategy was modeled using the compensatory model (6), and the nonintegration strategy was modeled using the MIRT noncompensatory model developed by Sympson (1978). This model does not allow for compensation and can be expressed as

$$P_{NC}\left(u_{ij}=1 \mid \theta_{1j},\theta_{2j},a_{1i},a_{2i},b_{1i},b_{2i}\right) = \left[\frac{1.0}{1.0+e^{-a_{1i}(\theta_{1j}-b_{1i})}}\right]\left[\frac{1.0}{1.0+e^{-a_{2i}(\theta_{2j}-b_{2i})}}\right]. \tag{11}$$

This model is essentially the product of two 2PL (1) models, with a discrimination and a difficulty parameter for each dimension. Unlike the compensatory model (6), which assumes that abilities on different dimensions can compensate for each other, the noncompensatory model treats them independently. The multiplicative nature of $P_{NC}$ ensures that it can never exceed the smaller of the two dimensions' 2PL probabilities.
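The contrast between the two models can be illustrated numerically. The sketch below assumes the usual logistic form with a negative exponent for both models and the slope-intercept parameterization $(a_1, a_2, d)$ for the compensatory model (6), without any scaling constant; the function names and the example item parameters are hypothetical.

```python
import math


def logistic(z):
    return 1.0 / (1.0 + math.exp(-z))


def p_compensatory(theta1, theta2, a1, a2, d):
    # Assumed form of model (6): abilities are summed inside one logistic.
    return logistic(a1 * theta1 + a2 * theta2 + d)


def p_noncompensatory(theta1, theta2, a1, a2, b1, b2):
    # Model (11): one logistic term per dimension, multiplied together.
    return logistic(a1 * (theta1 - b1)) * logistic(a2 * (theta2 - b2))


# Hypothetical examinee: very high on theta1, very low on theta2.
theta1, theta2 = 3.0, -3.0
p_c = p_compensatory(theta1, theta2, a1=1.0, a2=1.0, d=0.0)
p_nc = p_noncompensatory(theta1, theta2, a1=1.0, a2=1.0, b1=0.0, b2=0.0)
print(p_c)   # the strong dimension offsets the weak one
print(p_nc)  # bounded above by the weak dimension's 2PL probability
```

Under the compensatory form the two abilities cancel and the examinee sits at the midpoint of the response surface, whereas the noncompensatory product stays below the probability implied by the weaker dimension alone.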

Figure 18. Contour and difference plots for a matched noncompensatory and compensatory item.

The multiplicative nature of this model causes the response surface equiprobability contours to become curved. That is, unlike the compensatory model contours, which are always parallel lines, the noncompensatory model contours are always parallel curves. The larger the $a$ values, the more discriminating the item and the closer together the equiprobability curves. A contour plot of a noncompensatory response surface where $a_{1} = a_{2} = 1.6$ and $b_{1} = b_{2} = -.47$ is displayed in the left panel of Fig. 18. In the left panel, the letters A, B, and C denote three different $(\theta_{1}, \theta_{2})$-profiles: (high, low), (low, low), and (low, high), respectively. Notice that all three lie on the same equiprobability contour and therefore have the same probability of correct response.
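The difference in contour shape follows from solving each model for $\theta_2$ at a fixed probability $p$. The sketch below does this under the same assumed logistic forms (negative exponent, no scaling constant); the function names are hypothetical, and the noncompensatory parameters are the ones quoted above for the left panel of Fig. 18.

```python
import math


def logistic(z):
    return 1.0 / (1.0 + math.exp(-z))


def logit(p):
    return math.log(p / (1.0 - p))


def comp_contour_theta2(theta1, p, a1, a2, d):
    # Solve logistic(a1*t1 + a2*t2 + d) = p for t2: a straight line in t1.
    return (logit(p) - d - a1 * theta1) / a2


def noncomp_contour_theta2(theta1, p, a1, a2, b1, b2):
    # Solve logistic(a1*(t1 - b1)) * logistic(a2*(t2 - b2)) = p for t2.
    p1 = logistic(a1 * (theta1 - b1))
    if p1 <= p:
        return None  # no theta2, however large, can lift the product to p
    return b2 + logit(p / p1) / a2


# Trace the p = .3 contour of the noncompensatory item from the text.
for t1 in (-3.0, 0.0, 1.0, 2.0):
    print(t1, noncomp_contour_theta2(t1, 0.3, 1.6, 1.6, -0.47, -0.47))
```

For the compensatory model the solved $\theta_2$ changes by the constant amount $-a_1/a_2$ per unit of $\theta_1$, which gives parallel straight lines. For the noncompensatory model the contour does not exist at all once the $\theta_1$ term alone falls to $p$ or below, which produces the curved, asymptote-like shape.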

6.2. DIF Study Simulation Using Matched Compensatory and Noncompensatory Items

A 30-item test was created in which the primary focus was on measuring $\theta_{1}$ and $\theta_{2}$ equally (i.e., the item vectors were enclosed in a narrow validity sector from $40^\circ$ to $50^\circ$). For the Reference group, all 30 items were modeled using the compensatory model (6). For the Focal group, the probability of correct response followed the compensatory model for items 1–10 and 15–30, but for items 11–14 it was determined using the noncompensatory model (11) with parameters chosen to match their compensatory counterparts. To estimate noncompensatory item parameters that would match the compensatory parameters for items 11–14, the approach proposed by Spray et al. (1990) was used. Specifically, $P_{NC}$ parameters were estimated by minimizing the function

$$\sum_{i=1}^{N} \left[P_{C}\left(\boldsymbol{\theta}_{i}, \boldsymbol{a}, d\right) - P_{NC}\left(\boldsymbol{\theta}_{i}, \hat{\boldsymbol{a}}, \hat{\boldsymbol{b}}\right)\right]^{2}$$

for 2000 randomly generated examinee abilities from the latent underlying bivariate normal distribution, $N\left[\begin{pmatrix} 0 \\ 0 \end{pmatrix}, \begin{pmatrix} 1 & .4 \\ .4 & 1 \end{pmatrix}\right]$. The NMinimize function in Mathematica (2020) was used for this optimization. This process was repeated for 10 replications for each of the four items to ensure that the estimates obtained were not unduly influenced by the samples selected or the starting values. The matched set of parameters for items 11–14 is displayed in Table 5. In Fig. 18, the center panel contains the equiprobability contour plot for the matched compensatory version of Item 14. The right panel displays the contour of the difference between the compensatory and noncompensatory versions of Item 14. From this plot, it appears that the compensatory Item 14 and the noncompensatory Item 14 would produce similar probabilities of correct response for examinees in the first and third quadrants, but noticeably different probabilities for examinees in the second and fourth quadrants.
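The matching criterion can be sketched in Python as follows. This is an illustration only: the function names, the compensatory item parameters, and the coarse grid search are assumptions (the grid is a crude stand-in for NMinimize and, for brevity, forces $\hat{a}_1 = \hat{a}_2$ and $\hat{b}_1 = \hat{b}_2$), and the values are not those in Table 5.

```python
import math
import random


def logistic(z):
    return 1.0 / (1.0 + math.exp(-z))


def draw_abilities(n, rho=0.4, seed=1):
    """Draw n (theta1, theta2) pairs from a standard bivariate normal."""
    rng = random.Random(seed)
    pairs = []
    for _ in range(n):
        z1, z2 = rng.gauss(0.0, 1.0), rng.gauss(0.0, 1.0)
        pairs.append((z1, rho * z1 + math.sqrt(1.0 - rho ** 2) * z2))
    return pairs


def matching_loss(nc_params, comp_params, abilities):
    """Sum of squared differences between the two response probabilities."""
    a1_hat, a2_hat, b1_hat, b2_hat = nc_params
    a1, a2, d = comp_params
    total = 0.0
    for t1, t2 in abilities:
        p_c = logistic(a1 * t1 + a2 * t2 + d)
        p_nc = logistic(a1_hat * (t1 - b1_hat)) * logistic(a2_hat * (t2 - b2_hat))
        total += (p_c - p_nc) ** 2
    return total


abilities = draw_abilities(2000)
comp_item = (1.0, 1.0, 0.0)  # hypothetical compensatory item (a1, a2, d)

# Coarse grid search over symmetric noncompensatory parameters.
grid_a = [0.5 + 0.25 * k for k in range(9)]   # 0.5 to 2.5
grid_b = [-2.0 + 0.25 * k for k in range(9)]  # -2.0 to 0.0
best = min(
    ((a, a, b, b) for a in grid_a for b in grid_b),
    key=lambda params: matching_loss(params, comp_item, abilities),
)
print(best, matching_loss(best, comp_item, abilities))
```

In practice `matching_loss` would be passed to a general-purpose numerical optimizer with several starting values, mirroring the 10 replications described above.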

Table 5 Compensatory and Noncompensatory item parameters matched on p value for a given underlying ability distribution.

Response data were generated for 1000 examinees in each of the Reference and Focal groups using the same underlying bivariate distribution. A DIF analysis was conducted for each item with the Mantel–Haenszel and SIBTEST procedures using the R package difR (Magis et al., 2010). This process was replicated 100 times. For each item, we calculated the proportion of replications in which it showed significant DIF under each method. The results are graphed in Fig. 19.
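The data-generation step for a single discordantly modeled item might look like the sketch below. It assumes the same logistic forms as before, a standard bivariate normal with correlation .4 for both groups, and hypothetical item parameters; the function name `simulate_item` is illustrative, and the DIF analysis itself (run with difR in the study) is not reproduced here.

```python
import math
import random


def logistic(z):
    return 1.0 / (1.0 + math.exp(-z))


def simulate_item(n, model, params, rho=0.4, seed=0):
    """Simulate n dichotomous responses to one item.

    model: "compensatory" with params (a1, a2, d), or
           "noncompensatory" with params (a1, a2, b1, b2).
    Abilities come from a standard bivariate normal with correlation rho.
    """
    rng = random.Random(seed)
    responses = []
    for _ in range(n):
        z1, z2 = rng.gauss(0.0, 1.0), rng.gauss(0.0, 1.0)
        t1, t2 = z1, rho * z1 + math.sqrt(1.0 - rho ** 2) * z2
        if model == "compensatory":
            a1, a2, d = params
            p = logistic(a1 * t1 + a2 * t2 + d)
        elif model == "noncompensatory":
            a1, a2, b1, b2 = params
            p = logistic(a1 * (t1 - b1)) * logistic(a2 * (t2 - b2))
        else:
            raise ValueError("unknown model: " + model)
        responses.append(1 if rng.random() < p else 0)
    return responses


# Hypothetical parameters; the matched values from Table 5 are not used here.
reference = simulate_item(1000, "compensatory", (1.0, 1.0, 0.0), seed=1)
focal = simulate_item(1000, "noncompensatory", (1.6, 1.6, -0.47, -0.47), seed=2)
print(sum(reference) / 1000, sum(focal) / 1000)
```

Repeating this for every item and every replication, running the DIF procedures on each generated data set, and recording how often each item is flagged yields the rejection proportions described above.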

There was a clear demarcation between the items. Items 12–14, three of the four items modeled discordantly, were flagged more frequently (in 65% to 85% of the replications) than the remaining items. All significant DIF results favored the Reference group, which is corroborated by the right panel of Fig. 18, where the probability of correct response in quadrants 2 and 4 is greater for the compensatory model. These results parallel the findings of Ackerman and Evans (1994), which illustrated that DIF detection for two groups using different models was affected greatly by the discrimination power of the items. The MDISC value for item 11 was only .4, whereas for items 12, 13, and 14 it was .8, 1.2, and 1.6, respectively. It was also noted that, for the remaining 26 matched items, some had Type I error rates as high as .12, possibly affected by the conditioning total score for each group. The contour of the test characteristic surface for the Focal group exhibited a slight curve, resembling the contour of a noncompensatory item. Interestingly, even when conditioning on both $\theta_{1}$ and $\theta_{2}$, as illustrated in Study 2 above, the DIF for groups using two different MIRT models did not disappear, as shown by Ackerman et al. (2014).

Figure 19. Mantel–Haenszel and SIBTEST DIF results for each of the 30 items.

7. The True Challenge: Substantively Identifying the Cause of the Manifested DIF

At a large national testing company, we routinely conducted DIF analyses after each administration using the Mantel–Haenszel procedure. Two DIF analyses were conducted: one looking for gender DIF (comparing Males vs. Females) and a second looking for racial DIF (comparing White examinees vs. Black examinees). We would often share these results with the content editors to see if they could explain our DIF results. Sometimes we would do a “blind” test. That is, we would assemble a set of items flagged for significant DIF, along with a few items that did not show DIF, and ask the editors to determine which items exhibited significant DIF and which group was favored. Below are four actual items for which item writers found the DIF results very difficult to explain. Using your psychometric knowledge and DIF expertise, determine whether each item below favored Males, Females, White examinees, or Black examinees, or showed No DIF. While statistically detecting DIF is straightforward, understanding why it occurs is the true challenge! Answers are provided in Appendix A.

Item 1

The Cold War threatened to erupt into a “hot” war in October 1962 when President Kennedy demanded that the Soviet Union

  A. dismantle naval bases located in Nicaragua.

  B. withdraw all troops from South Korea.

  C. remove all missiles and missile bases located in Cuba.

  D. return captured American pilot Gary Powers to the USA.

Item 2

In comparison with normal males, those with Klinefelter’s syndrome have:

  A. 1 extra X-chromosome

  B. 1 fewer Y-chromosome

  C. 1 extra Y-chromosome

  D. 2 extra X-chromosomes

Item 3

A bell was found fastened at a fork in a branch 15 feet from the ground in a 40-year-old tree. A person claimed that the bell was fastened to the tree about six feet from the ground when the tree was 10 years old. Of the following, the best evaluation of this story is that the claim is:

  A. true, because the bell moved upward as the tree grew taller.

  B. true, because the bell was fastened to a forked branch, which grew rapidly upward.

  C. false, because trees do not grow taller that quickly.

  D. false, because upward growth in trees occurs at the terminal buds, not within the trunk or branches.

Item 4

A customer at a service station asks the attendant to “put 30 pounds of air in my right rear tire.” Assuming that the tire is completely flat, air will be pumped into the tire until the:

  A. tire’s weight increases by 30 pounds.

  B. air pressure inside the tire equals the atmospheric pressure.

  C. air pressure inside the tire is 30 pounds per square inch greater than the atmospheric pressure.

  D. air pressure inside the tire is 30 times greater than the atmospheric pressure.

8. Summary and Concluding Remarks

It is widely recognized that test response data often exhibit multidimensionality. Due diligence requires that testing practitioners first examine the dimensionality of their data. By identifying the dominant dimensions and mapping them onto a test’s specifications, each dimension can be well defined. Additionally, practitioners need to be vigilant and identify unintended skills that are being measured. Using this foundational analysis, appropriate calibration model(s) can be selected. These models play a crucial role in estimating item parameters, scaling examinee abilities, and understanding the potential for DIF to occur. Research by Kok (1988), Ackerman (1992), Camilli (1992), and Shealy and Stout (1993a, 1993b) hypothesized that DIF can result when a test measures invalid or unintended skills and the groups of interest exhibit distinct conditional ability distributions on these skills for different levels of the valid skill.

Using this perspective as a starting point, this article provides a detailed examination of the two-dimensional compensatory MIRT model. Graphical representations of two-dimensional item response surfaces and their corresponding contours were examined. These plots provide a deeper understanding of how items perform across the latent ability space. Plots of items as vectors indicating the $(\theta_{1}, \theta_{2})$-composite that each item is optimally measuring were illustrated and discussed. Vectors of valid items should lie within a sector, termed the validity sector. Further insight about $(\theta_{1}, \theta_{2})$-composite consistency across the observable score scale was illustrated using plots of conditional centroids. Finally, centipede plots, detailing how examinees’ $(\theta_{1}, \theta_{2})$-abilities get mapped onto the unidimensional $\theta$-scale, were explained. Such graphical analytics, in concert with knowledge of underlying ability distributions for subgroups of interest, can provide detailed insight into the potential for DIF to occur.

The article then focused on the analytical work of Wang (1985) and Camilli (1992). They demonstrated how data generated using the two-dimensional compensatory model can be mapped onto a unidimensional 2PL IRT scale, referred to as the reference composite (RC). Additionally, they showed how estimated 2PL IRT item parameters can be derived given estimates of the underlying two-dimensional bivariate normal examinee ability distribution parameters ($\mu_{\theta_{1}}, \mu_{\theta_{2}}, \sigma_{\theta_{1}}, \sigma_{\theta_{2}}, \rho$) and the compensatory model (6) item parameters ($a_{1}$, $a_{2}$, and $d$). Specific examples were provided to illustrate how 2PL IRT parameter estimates can change as the underlying two-dimensional ability distributions change.

We then reviewed three studies that illustrate how DIF can occur using a two-dimensional framework:

  • The first study investigated how DIF can occur for an invalid item, one that deviates significantly from the validity sector. DIF results were examined across four different distributional scenarios. This study identified which distributional differences result in uniform DIF and which produce nonuniform DIF.

  • The second study emphasized the importance of considering the complete latent ability space when using conditional DIF approaches. Simulations revealed that DIF occurred when examinees were matched solely on their $\theta_{1}$ value, or only their $\theta_{2}$ value, or on their number-correct score where $\theta_{1} = \theta_{2}$. However, when examinees were matched on their $(\theta_{1}, \theta_{2})$-groupings, no DIF occurred.

  • The third study detailed an educational scenario with identical underlying two-dimensional distributions for the two groups. Despite this similarity, certain items displayed DIF. The DIF occurred for items where one group’s responses were generated using the compensatory model and the other group’s responses were generated using the noncompensatory model. These models were chosen to simulate different response strategies that resulted from different instructional pedagogies.

The paper concludes with a test for readers to correctly identify the Mantel–Haenszel DIF analysis results for four given items from a nationally administered standardized test. This involves determining which examinee group was favored for each item or whether the results indicated no DIF. Conducting DIF analyses is relatively straightforward thanks to computer programs written for all the approaches listed in Table 1. Substantively explaining the results is where the real challenge lies. Psychometricians and item writers must collaborate to interpret DIF findings. One should also never discount the possibility of Type I error!

The sets of item parameters generated for this study were created specifically to give the testing practitioner insight into how underlying ability distributions or different response styles can affect examinee performance and, consequently, unidimensional item parameter estimation, thereby creating DIF. They are not realistic. Real data are very messy, to say the least. We will never know the true item parameters, the true latent abilities, or whether the data are unidimensional or multidimensional. Only when we simulate data do we know the truth. It is paramount that psychometricians and testing practitioners always remember the words of wisdom of the noted British statistician George Box: “Essentially, all models are wrong, but some are useful” (Box & Draper, 1987).

Declarations

Conflict of interest

The authors have no conflicts of interest to declare that are relevant to the content of this article.

Open science statement

Data and the Mathematica code used in the illustration will be made available on the Open Science Framework upon publication.

Appendix A

Groups indicated as being favored in the Mantel–Haenszel analysis.

  • Male examinees

  • Black examinees

  • No DIF

  • Male examinees. This is the only item which has a possible explanation: that males, for the most part, know more about cars than females.

Appendix B

Example illustrating formulation of how the unidimensional 2PL model gets mapped into a two-dimensional latent ability space:

$$\varepsilon_{\upsilon_{2}}\left[P(u=1\mid\upsilon_{1},\upsilon_{2})\mid\upsilon_{1}\right]=\int_{-\infty}^{+\infty}P(u=1\mid\upsilon_{1},\upsilon_{2})\,G(\upsilon_{2}\mid\upsilon_{1})\,\mathrm{d}\upsilon_{2}.$$

Assume you want to find the unidimensional 2PL $\hat{a}$ and $\hat{b}$ values for a two-item test where the two-dimensional compensatory parameters are given as $A = [\{1.5,0\}, \{0,1.5\}]$ and $D = \{.5,.5\}$ and the underlying model is given as

$$P\left(u_{ij}=1\mid\theta_{1j},\theta_{2j},a_{1i},a_{2i},d_{i}\right)=\frac{1.0}{1.0+e^{-1.7(a_{1i}\theta_{1j}+a_{2i}\theta_{2j}+d_{i})}}.$$

It is also given that the underlying two-dimensional distribution is a bivariate normal with a mean vector of {0,0} and the covariance matrix, $\Omega$, as [{1,.4}, {.4,1}]. Note these are chosen only for illustration purposes. Item 1 measures only $\theta_{1}$ and item 2 measures only $\theta_{2}$.

Following the work of Wang (1986) and Camilli (1992), we first determine the Cholesky decomposition, L, of $\Omega$: $L = [\{1., 0.4\}, \{0., 0.91651\}]$. To compute the reference composite, we first need to calculate the L′A′AL matrix, which equals $[\{2.25,0.9\},\{0.9,2.25\}]$. The eigenvalues of this matrix are $\{3.15, 1.35\}$ and the eigenvectors associated with the eigenvalues, $w_{ij}$, are $\{0.7071,0.7071\},\{-0.7071,0.7071\}$.

The direction of the reference composite is then calculated as the arccosine of the first element of the eigenvector associated with the largest eigenvalue. The arccosine of .7071 corresponds to 45°, which is the reference composite direction measured from the positive $\theta_{1}$-axis. This is the composite that would represent the unidimensional $\theta$-scale if the data were fit to the 2PL model.
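As a numerical check on these steps, the short Python sketch below recomputes the Cholesky factor, the L′A′AL matrix, its eigenstructure, and the resulting angle for the illustrative values of A and Ω given above. It is not the Mathematica code used for the article. The helper functions (`matmul`, `transpose`, `cholesky_upper_2x2`, `eigen_sym_2x2`) are hypothetical names introduced only for this illustration, and the upper-triangular form of the Cholesky factor is assumed because it matches the L reported in the text.

```python
import math

def matmul(x, y):
    """Multiply two matrices stored as lists of rows."""
    return [[sum(x[i][k] * y[k][j] for k in range(len(y)))
             for j in range(len(y[0]))] for i in range(len(x))]

def transpose(x):
    """Transpose a matrix stored as a list of rows."""
    return [list(col) for col in zip(*x)]

def cholesky_upper_2x2(s):
    """Upper-triangular U with U'U = s, for a 2x2 covariance matrix s."""
    u11 = math.sqrt(s[0][0])
    u12 = s[0][1] / u11
    u22 = math.sqrt(s[1][1] - u12 ** 2)
    return [[u11, u12], [0.0, u22]]

def eigen_sym_2x2(m):
    """(eigenvalue, unit eigenvector) pairs of a symmetric 2x2 matrix, largest first."""
    a, b, c = m[0][0], m[0][1], m[1][1]
    if b == 0:  # already diagonal: the coordinate axes are the eigenvectors
        vecs = [(1.0, 0.0), (0.0, 1.0)] if a >= c else [(0.0, 1.0), (1.0, 0.0)]
        return [(max(a, c), vecs[0]), (min(a, c), vecs[1])]
    mid, half = (a + c) / 2, math.hypot((a - c) / 2, b)
    pairs = []
    for lam in (mid + half, mid - half):
        vx, vy = b, lam - a          # solves (m - lam*I) v = 0 when b != 0
        norm = math.hypot(vx, vy)
        pairs.append((lam, (vx / norm, vy / norm)))
    return pairs

omega = [[1.0, 0.4], [0.4, 1.0]]     # covariance matrix of (theta1, theta2)
A = [[1.5, 0.0], [0.0, 1.5]]         # rows are items, columns are dimensions

U = cholesky_upper_2x2(omega)                     # the text's L
AtA = matmul(transpose(A), A)
M = matmul(matmul(transpose(U), AtA), U)          # L'A'AL

(lam1, w1), (lam2, w2) = eigen_sym_2x2(M)
angle_deg = math.degrees(math.acos(abs(w1[0])))   # reference composite angle

print("L       =", [[round(v, 5) for v in row] for row in U])
print("L'A'AL  =", [[round(v, 4) for v in row] for row in M])
print("eigenvalues  =", round(lam1, 4), round(lam2, 4))
print("eigenvectors =", [round(v, 4) for v in w1], [round(v, 4) for v in w2])
print("angle (deg)  =", round(angle_deg, 2))
```

Eigenvectors are determined only up to sign, so the second eigenvector may print as {0.7071, −0.7071} rather than {−0.7071, 0.7071}; the 45° direction of the reference composite is unaffected.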

It should also be noted that the first and second factor scores, $\upsilon_{1}$ and $\upsilon_{2}$, are then defined as:

$$\begin{pmatrix}\upsilon_{1}\\ \upsilon_{2}\end{pmatrix}=\begin{bmatrix}w_{11}\left(\theta_{1}-\mu_{\theta_{1}}\right)+w_{12}\left(\theta_{2}-\mu_{\theta_{2}}\right)\\ w_{21}\left(\theta_{1}-\mu_{\theta_{1}}\right)+w_{22}\left(\theta_{2}-\mu_{\theta_{2}}\right)\end{bmatrix}=\begin{bmatrix}.7071\,\theta_{1}+.7071\,\theta_{2}\\ -.7071\,\theta_{1}+.7071\,\theta_{2}\end{bmatrix}$$

In Fig. 20, the left panel is a contour plot of Item 1 with the reference composite ($\upsilon_{1}$) direction indicated with a solid red arrow and the perpendicular $\upsilon_{2}$ direction indicated with a dotted red arrow. We then substitute $\upsilon_{1}$ and $\upsilon_{2}$ for $\theta_{1}$ and $\theta_{2}$ in the compensatory model to get

$$p\left(u_{ij}=1\mid\upsilon_{1},\upsilon_{2}\right)=\frac{1.0}{1.0+e^{-1.7(1.5\upsilon_{1}+.0\upsilon_{2}+.5)}}.$$

Figure. 20 The contour graph of the original item response surface with the directions of the first ($\upsilon_{1}$) and second ($\upsilon_{2}$) principal components (left) and the contour surface rotated 45° (right).

To determine $G(\upsilon_{2}\mid\upsilon_{1})$, we must first rotate the bivariate normal distribution 45° and then determine the conditional distribution. Assuming $\Sigma$ is the original covariance matrix, $\Sigma=\begin{bmatrix}\sigma_{1}^{2} & \rho\sigma_{1}\sigma_{2}\\ \rho\sigma_{1}\sigma_{2} & \sigma_{2}^{2}\end{bmatrix}$, and $R_{\theta}$ is the rotation matrix,

$$R_{\theta}=\begin{bmatrix}\cos 45^\circ & -\sin 45^\circ\\ \sin 45^\circ & \cos 45^\circ\end{bmatrix}=\begin{bmatrix}\frac{\sqrt{2}}{2} & -\frac{\sqrt{2}}{2}\\ \frac{\sqrt{2}}{2} & \frac{\sqrt{2}}{2}\end{bmatrix}$$

then the rotated mean vector, $\mu'$, and rotated covariance matrix, $\Sigma'$, are given by $\mu'=R_{\theta}\mu=\frac{\sqrt{2}}{2}\begin{bmatrix}\mu_{1}-\mu_{2}\\ \mu_{1}+\mu_{2}\end{bmatrix}$ and $\Sigma'=R_{\theta}\Sigma R_{\theta}^{T}=\frac{1}{2}\begin{bmatrix}\sigma_{1}^{2}+\sigma_{2}^{2}-2\rho\sigma_{1}\sigma_{2} & \sigma_{1}^{2}-\sigma_{2}^{2}\\ \sigma_{1}^{2}-\sigma_{2}^{2} & \sigma_{1}^{2}+\sigma_{2}^{2}+2\rho\sigma_{1}\sigma_{2}\end{bmatrix},$ where $\sigma_{1}^{2}$, $\sigma_{2}^{2}$, and $\rho$ are the variances and correlation of the original random variables.

The rotated mean vector is [0,0] and the rotated covariance matrix, $\Sigma'$, is [{.6,0}, {0,1.4}]. The formula for the conditional distribution $G(\upsilon_{1}\mid\upsilon_{2})$ is (Fig. 21)

$$G\left(\upsilon_{1}\mid\upsilon_{2}\right)\sim N\left(\mu_{Y}+\rho\frac{\sigma_{Y}}{\sigma_{X}}\left(x-\mu_{X}\right),\ \sigma_{Y}^{2}\left(1-\rho^{2}\right)\right)=N\left(0,1.4\right).$$

Figure. 21 A contour plot of the original bivariate normal distribution (left) and the contour plot of the rotated distribution (right).
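The rotation and the conditional-normal formula can be verified in a few lines. The following Python sketch is an illustration only, not the authors' code. It assumes the zero mean vector, unit variances, and correlation of .4 specified earlier; the function names are hypothetical, and the conditioning value x = 1.0 is arbitrary because the rotated components are uncorrelated.

```python
import math

def matmul(x, y):
    """Multiply two matrices stored as lists of rows."""
    return [[sum(x[i][k] * y[k][j] for k in range(len(y)))
             for j in range(len(y[0]))] for i in range(len(x))]

def transpose(x):
    """Transpose a matrix stored as a list of rows."""
    return [list(col) for col in zip(*x)]

def conditional_normal(mu_x, mu_y, var_x, var_y, cov_xy, x):
    """Mean and variance of Y given X = x for a bivariate normal (X, Y)."""
    sd_x, sd_y = math.sqrt(var_x), math.sqrt(var_y)
    rho = cov_xy / (sd_x * sd_y)
    mean = mu_y + rho * (sd_y / sd_x) * (x - mu_x)
    return mean, var_y * (1.0 - rho ** 2)

angle = math.radians(45.0)
R = [[math.cos(angle), -math.sin(angle)],
     [math.sin(angle),  math.cos(angle)]]

mu = [[0.0], [0.0]]                    # original mean vector (column)
sigma = [[1.0, 0.4], [0.4, 1.0]]       # original covariance matrix

mu_rot = matmul(R, mu)                               # rotated mean vector
sigma_rot = matmul(matmul(R, sigma), transpose(R))   # rotated covariance matrix

# Condition the rotated component with variance 1.4 on the one with variance 0.6.
cond_mean, cond_var = conditional_normal(
    mu_rot[0][0], mu_rot[1][0],
    sigma_rot[0][0], sigma_rot[1][1], sigma_rot[0][1], x=1.0)

print("rotated covariance:", [[round(v, 4) for v in row] for row in sigma_rot])
print("conditional mean and variance:", round(cond_mean, 4), round(cond_var, 4))
```

Because the rotated covariance matrix is diagonal, the correlation in the rotated space is zero, so the conditional distribution does not depend on the value being conditioned on and its variance equals the marginal variance.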

In Fig. 22 on the left are conditional normal distributions, $G\left(\upsilon_{1}\mid\upsilon_{2}\right)$, for $\upsilon_{1} = -2, -1, 0, 1, 2$. On the right are the conditional ICCs, $p\left(u_{ij}=1\mid\upsilon_{1},\upsilon_{2}\right)$, for $\upsilon_{1} = -2, -1, 0, 1, 2$.

Figure. 22 Conditional normal distributions, $G\left(\upsilon_{1}\mid\upsilon_{2}\right)$, for $\upsilon_{1} = -2, -1, 0, 1, 2$ (left) and conditional ICCs, $p\left(u_{ij}=1\mid\upsilon_{1},\upsilon_{2}\right)$, for $\upsilon_{1} = -2, -1, 0, 1, 2$ (right).

Using the formula

$$\varepsilon_{\upsilon_{2}}\left[P(u_{ij}=1\mid\upsilon_{1},\upsilon_{2})\mid\upsilon_{1}\right]=\int_{-6}^{+6}P(u=1\mid\upsilon_{1},\upsilon_{2})\,G(\upsilon_{2}\mid\upsilon_{1})\,\mathrm{d}\upsilon_{2}$$

where $\mathrm{d}\upsilon_{2} = .001$, we can estimate the unidimensional ICC for $\upsilon_{1}$ values of -2, -1, 0, 1, 2. These values are .13, .34, .64, .86, and .95, respectively. Using Camilli’s derivational formulas,

$$\hat{a}_{j}=\frac{\boldsymbol{a}_{j}'\boldsymbol{W}_{1}}{\sqrt{2.89+\boldsymbol{a}_{j}'\boldsymbol{W}_{2}\boldsymbol{W}_{2}'\boldsymbol{a}_{j}}}$$

and

$$\hat{b}_{j}=\frac{d_{j}-\boldsymbol{a}_{j}'\boldsymbol{\mu}}{\boldsymbol{a}_{j}'\boldsymbol{W}_{1}},$$

we obtain the 2PL item parameter estimates: $\hat{a}= .73$ and $\hat{b}=-.47$. Fig. 23 shows the estimated ICC using the $\hat{a}$ and $\hat{b}$ and the five color-coded estimated $(\upsilon_{1},p)$ values. Figure 24 illustrates three different perspectives of all the elements of Camilli’s formulation, including the M2PL response surface and corresponding contour plot, the RC ($\upsilon_{1}$), which represents the estimated unidimensional scale, $\upsilon_{2}$ (the orthogonal second principal component), the RC plane, the underlying conditional latent ability distribution, $G\left(\upsilon_{1}\mid\upsilon_{2}\right)$, and the estimated unidimensional ICC.
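To see how closely the reported estimates track the five reported points, the Python sketch below evaluates a 2PL curve at $\upsilon_{1}$ = -2, -1, 0, 1, 2 using $\hat{a}= .73$ and $\hat{b}=-.47$ and compares the heights with the reported values. It also evaluates the discrimination formula for Item 1 under one assumption that is ours rather than the article's: that $\boldsymbol{a}_{j}$ enters the formula in the logistic metric (multiplied by 1.7), which is how we read the 2.89 = 1.7² term. The function name `icc_2pl` is hypothetical.

```python
import math

D = 1.7  # logistic scaling constant appearing in the text's models

def icc_2pl(theta, a, b):
    """2PL item characteristic curve with the 1.7 scaling constant."""
    return 1.0 / (1.0 + math.exp(-D * a * (theta - b)))

# Estimates and marginal probabilities as reported in the text.
a_hat, b_hat = 0.73, -0.47
reported = {-2: 0.13, -1: 0.34, 0: 0.64, 1: 0.86, 2: 0.95}

# Height of the 2PL curve at each reported point, and the largest discrepancy.
fitted = {v1: icc_2pl(v1, a_hat, b_hat) for v1 in reported}
max_gap = max(abs(fitted[v1] - p) for v1, p in reported.items())

# Assumption (not stated in the text): a_j is expressed in the logistic
# metric, i.e. 1.7 times the compensatory discrimination of Item 1.
a_j = (D * 1.5, 0.0)
w1, w2 = (0.7071, 0.7071), (-0.7071, 0.7071)   # eigenvectors from the text
proj1 = a_j[0] * w1[0] + a_j[1] * w1[1]        # a_j' W_1
proj2 = a_j[0] * w2[0] + a_j[1] * w2[1]        # a_j' W_2
a_formula = proj1 / math.sqrt(2.89 + proj2 ** 2)

for v1 in sorted(reported):
    print(f"v1={v1:+d}  2PL={fitted[v1]:.3f}  reported={reported[v1]:.2f}")
print(f"largest gap: {max_gap:.4f}")
print(f"a-hat from the formula under the stated assumption: {a_formula:.4f}")
```

Small discrepancies between the curve and the reported points are to be expected, since the points come from numerical integration while the curve uses estimates rounded to two decimals.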

Figure. 23 Estimated unidimensional ICC with five estimated $(\upsilon_{1},p)$ values plotted.

Figure. 24 Three different perspectives of different elements that were used in the mapping of the two-dimensional compensatory model onto a unidimensional ICC.

Appendix C

Ackerman and Xie (2019) created a DIF Graphical Simulator. This simulator enables researchers to modify the underlying two-dimensional latent distributions for the Reference and Focal groups and the M2PL item parameters for a given suspect item. Using the Camilli (1992) analytical derivations, the 2PL unidimensional discrimination (a) and difficulty (b) parameters are estimated and the resulting ICC is illustrated. A mean–mean transformation is used to place the Focal group’s estimated parameters onto the scale of the Reference group. The transformed ICCs are then displayed, and the degree of misfit, defined as $\sum\nolimits_{\theta=-3}^{\theta=3}\left(P(\theta)_{Ref}-P(\theta)_{Foc}\right)^{2}$, is calculated. The DIF Graphical Simulator is shown in Fig. 25.

Figure. 25 The graphical display is shown by the DIF Graphical Simulator.
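The misfit index described in this appendix lends itself to a direct computation. The Python sketch below is a minimal illustration, not the simulator's implementation. The grid spacing of 0.1 between -3 and 3 is an assumption, since the increment of the summation is not stated; the function names are hypothetical; and the Focal group parameters are made-up values chosen only to produce a visible difference, while the Reference group reuses the Appendix B estimates.

```python
import math

D = 1.7  # logistic scaling constant used throughout the article's models

def icc_2pl(theta, a, b):
    """2PL item characteristic curve with the 1.7 scaling constant."""
    return 1.0 / (1.0 + math.exp(-D * a * (theta - b)))

def squared_icc_misfit(ref, foc, lo=-3.0, hi=3.0, step=0.1):
    """Sum of squared differences between two 2PL ICCs over a theta grid.

    ref and foc are (a, b) pairs that are already on a common scale.
    The step size is an assumption of this sketch.
    """
    n_steps = round((hi - lo) / step)
    grid = [lo + i * step for i in range(n_steps + 1)]
    return sum((icc_2pl(t, *ref) - icc_2pl(t, *foc)) ** 2 for t in grid)

ref_params = (0.73, -0.47)   # Appendix B estimates, reused as the Reference group
foc_params = (0.73, 0.00)    # hypothetical Focal group: same a, harder item

no_dif = squared_icc_misfit(ref_params, ref_params)
uniform_dif = squared_icc_misfit(ref_params, foc_params)

print("identical ICCs:", no_dif)
print("shifted difficulty:", round(uniform_dif, 4))
```

Identical curves give a misfit of zero, and the index grows as the two ICCs separate. Because the value depends on the number of grid points, comparisons are meaningful only across items evaluated on the same grid.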

Footnotes

Publisher's Note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

References

Ackerman, T. A. (1991). The use of unidimensional parameter estimates of multidimensional items in adaptive testing. Applied Psychological Measurement, 15, 13–24.
Ackerman, T. A. (1992). A didactic explanation of item bias, item impact, and item validity from a multidimensional perspective. Journal of Educational Measurement, 29, 67–91.
Ackerman, T. A., & Evans, J. A. (1994). The influence of conditioning scores in performing DIF analyses. Applied Psychological Measurement, 18, 329–342.
Ackerman, T. A., McCallaum, B., & Ngerano, G. (2014). Differential item functioning from a compensatory-noncompensatory perspective. Invited address to the International Congress of Educational Research, Hacettepe University, Ankara, Turkey.
Ackerman, T. A., & Xie, Q. (2019). DIF graphical simulator. Educational Measurement: Issues and Practice, 38(1), 5. https://doi.org/10.1111/emip.12171.
Bauer, D. J., Belzak, W. C., & Cole, V. T. (2020). Simplifying the assessment of measurement invariance over multiple background variables: Using regularized moderated nonlinear factor analysis to detect differential item functioning. Structural Equation Modeling: A Multidisciplinary Journal, 27, 43–55.
Bolt, D. M., & Johnson. (2009). Addressing score bias and differential item functioning due to individual differences in response style. Applied Psychological Measurement, 33(5), 335–352. https://doi.org/10.1177/0146621608329891.
Camilli, G. (1992). A conceptual analysis of differential item functioning in terms of a multidimensional item response model. Applied Psychological Measurement, 16(2), 129–147.
Camilli, G., & Penfield, D. A. (1997). Variance estimation for differential test functioning based on Mantel–Haenszel statistics. Journal of Educational Measurement, 34(2), 123–139.
Carlson, J. E. (2017). Unidimensional vertical scaling in multidimensional space. ETS Research Report Series, 2017(1), 1–28.
Cattell, R. B. (1966). The scree test for the number of factors. Multivariate Behavioral Research, 1(2), 245–276. PMID 26828106.
Clauser, B. E., & Mazor, K. M. (1998). Using Statistical Procedures To Identify Differentially Functioning Test Items. An NCME Instructional Module. Educational Measurement: Issues and Practice, 17(1), 31–44. https://doi.org/10.1111/j.1745-3992.1998.tb00619.x.
Clauser, B. E., Nungester, R. J., & Swaminathan, H. (1996). Improving the matching for DIF analysis by conditioning on both test score and an educational background variable. Journal of Educational Measurement, 33(4), 454–464.
Cohen, A.S., Kim, S.H., Baker, F.B. (1993). Detection of differential item functioning in the graded response model. Applied Psychological Measurement, 17(4), 335–350.
De Boeck, P. (2008). Random item IRT models. Psychometrika, 73, 533–559.
Fleishman, J. A., & Lawrence, W. F. (2003). Demographic variation in SF-12 scores: True differences or differential item functioning. Medical Care, 41(7), 75–86. https://doi.org/10.1097/01.MLR.0000076052.42628.
Ip, E. H. (2010). Empirically indistinguishable multidimensional IRT and locally dependent unidimensional item response models. British Journal of Mathematical and Statistical Psychology, 63, 395–416. https://doi.org/10.1348/000711009x466835
Kolen, M.J., Brennan, R.L. (2014). Test equating, scaling, and linking: Methods and practices. New York: Springer.
Lim, H., Choe, E.M., Han, K. (2022). A residual-based differential item functioning detection framework in item response theory. Journal of Educational Measurement.
Liu, Y., Zumbo, B., Gustafson, P., Huang, Y., Kroc, E., Wu, A. (2016). Investigating causal DIF via propensity score methods. Practical Assessment, Research and Evaluation, 21(13), 1–24.
Ma, Y., Ackerman, T., Ip, E., & Chung, J. (2023). The effect of the projective IRT model on DIF detection. IMPS 2023 Annual Meeting, College Park, Maryland, United States.
Mazor, K.M., Hambleton, R.K., Clauser, B.E. (1998). Multidimensional DIF analyses: The effects of matching on unidimensional subtest scores. Applied Psychological Measurement, 22(4), 357–367.
Flowers, C.P., Oshima, T.C., Raju, N.S. (1999). A description and demonstration of the polytomous-DFIT framework. Applied Psychological Measurement, 23(4), 309–326.
Holland, P. W., & Thayer, D. T. (1988). Differential item functioning detection and the Mantel–Haenszel procedure. In H. Wainer & H. I. Braun (Eds.), Test validity (pp. 129–145). Hillsdale, NJ: Lawrence Erlbaum. http://www.books.google.co.ke/books?isbn=1109103204
Huang, P.H. (2018). A penalized likelihood method for multi-group structural equation modelling. British Journal of Mathematical and Statistical Psychology, 71, 499–522.
Junker, B., & Stout, W. F. (1991). Robustness of ability estimation when multiple traits are present with one trait dominant. Paper presented at the International Symposium on Modern Theories in Measurement: Problems and Issues, Montebello, Quebec.
Kok, F. (1988). Item bias and test multidimensionality. In R. Langeheine & J. Rost (Eds.), Latent trait and latent class models (pp. 263–275). New York: Plenum Press. https://doi.org/10.1007/978-1-4757-5644-9_12
Li, Y.H., Lissitz, R.W. (2000). An evaluation of the accuracy of multidimensional IRT linking. Applied Psychological Measurement, 24, 115–138.
Lord, F. M. (1980). Applications of item response theory to practical testing problems. Hillsdale, NJ: Erlbaum. https://eric.ed.gov/?id=ED312280
Magis, D., Beland, S., Tuerlinckx, F., De Boeck, P. (2010). A general framework and an R package for the detection of dichotomous differential item functioning. Behavior Research Methods, 42(3), 847–862.
McKinley, R. L., & Reckase, M. D. (1982). The use of the general Rasch model with multidimensional item response data.
Muthen, B., Asparouhov, T. (2018). Recent methods for the study of measurement invariance with many groups: Alignment and random effects. Sociological Methods & Research, 47, 637–664.
Oshima, T. C., Davey, T. C., & Lee, K. (2000). Multidimensional linking: Four practical approaches. Journal of Educational Measurement, 37(4), 357–373. http://www.jstor.org/stable/1435246
Penfield, R., Algina, J. (2006). A generalized DIF effect variance estimator for measuring unsigned differential test functioning in mixed format tests. Journal of Educational Measurement, 43(4), 295–312.
Raju, N.S. (1988). The area between two item characteristic curves. Psychometrika, 53, 495–502.
Raju, N.S., van der Linden, W.J., Fleer, P.F. (1995). IRT-based internal measures of differential functioning of items and tests. Applied Psychological Measurement, 19, 353–368.
Ramsay, J. O. (1990). A kernel smoothing approach to IRT modeling. Talk presented at the Annual Meeting of the Psychometric Society at Princeton, New Jersey.
Reckase, M.D. (2009). Multidimensional item response theory. New York: Springer.
Shealy, R., Stout, W.F. (1993). An item response theory model for test bias. In Holland, P., Wainer, H. (Eds.), Differential item functioning (pp. 197–239). Hillsdale, NJ: Erlbaum.
Shealy, R., Stout, W.F. (1993). A model-based standardization approach that separates true bias/DIF from group differences and detects test bias/DTF as well as item bias/DIF. Psychometrika, 58, 159–19.
Spray, J., Davey, T., Reckase, M., Ackerman, T., & Carlson, J. (1990). Comparison of two logistic multidimensional item response theory models. ACT Research Report ONR90-8.
Stout, W.F. (1987). A nonparametric approach for assessing latent trait unidimensionality. Psychometrika, 52(4), 589–617.
Strachan, T., Ip, E., Fu, Y., Ackerman, T., Chen, S.H., Willse, J. (2020). Robustness of projective IRT to misspecification of the underlying multidimensional model. Applied Psychological Measurement, 44(5), 362–375.
Strachan, T., Cho, U.H., Ackerman, T., Chen, S.-H., de la Torre, J., Ip, E. (2022). Evaluation of the linear composite conjecture for unidimensional IRT scale for multidimensional responses. Applied Psychological Measurement, 46(5), 347–360.
Swaminathan, H., & Rogers, H. J. (1990). Detecting differential item functioning using logistic regression procedures. Journal of Educational Measurement, 27, 361–370. https://doi.org/10.1111/j.1745-3984.1990.tb00754.x
Sympson, B. (1978). A model for testing with multidimensional items. In Weiss, D. J. (Ed.), Proceedings of the 1977 Computerized Adaptive Testing Conference. University of Minnesota, Minneapolis.
Thissen, D., Steinberg, L., Wainer, H. (1988). Use of item response theory in the study of group differences in trace lines. In Wainer, H., Braun, H.I. (Eds.), Test validity (pp. 147–169). Hillsdale, NJ: Erlbaum.
Wang, M. (1985). Fitting a unidimensional model to multidimensional item response data: The effects of latent space misspecification on the application of IRT. Unpublished manuscript, University of Iowa.
Williams, N.J., Beretvas, S.N. (2006). DIF identification using HGLM for polytomous items. Applied Psychological Measurement, 30, 22–42.
Wolfram Research, Inc. (2020). Mathematica (Version 12.2) [Computer software]. Champaign, IL.
Zhang, J., Stout, W.F. (1999). The theoretical DETECT index of dimensionality and its application to approximate simple structure. Psychometrika, 64(2), 213–249.
Table 1. A compilation of seminal DIF methodology.

Figure 1. Graphic representation of the response surface for the compensatory model and its corresponding contour.

Figure 2. Contour plot of compensatory model item with $a_1 = a_2 = 1.0$ and $d = .0$.

Figure 3. Illustration of the direction of maximum slope for a compensatory item projected onto the latent ability plane to form its item vector.

Figure 4. Item vectors for a standardized 101-item test with three content areas.

Figure 5. A validity sector enclosing item vectors (green) for a 60-item standardized test. Vectors outside the sector (red) are measuring composites that could result in DIF. (Color figure online)

Figure 6. Conditional centroid plots for each content category (top) and total test score (bottom).

Figure 7. RCs for two groups having different underlying ability distributions based on a two-item test (left) and orthogonal mappings onto the composites for two examinees, X and Y (right).

Figure 8. RCs for the 101-item test for the three subsections and the total test.

Figure 9. Two perspectives illustrating the mapping of the two-dimensional latent abilities onto the expected number correct score scale.

Figure 10. Projected unidimensional ICC for an M2PL item with $a_1 = 1.5$, $a_2 = 0$, and $d = .5$, and a reference composite (RC) angle of $45^\circ$, yielding 2PL $\hat{a} = .73$ and $\hat{b} = -.47$.

Figure 11. Item vectors for a hypothetical 19-item test.

Figure 12. Estimated a values (top) and b values (bottom) for different underlying ability distributions.

Figure 13. Composite graph illustrating the 19 M2PL item response surfaces and the green RC plane (top left); the test characteristic surface, contour, reference composite (red arrow), and unidimensional TCC (top right); and the 19 estimated unidimensional ICCs colored by item vector angle (bottom). (Color figure online)

Table 2. Generating compensatory model parameters for a 10-item test.

Figure 14. Graphical displays of the item vectors and the underlying Reference (red) and Focal (green) distributions for each of the three cases outlined in Table 3. (Color figure online)

Table 3. Analytical results of estimated 2PL item parameters for Item 10 for the Reference and Focal groups based on their different underlying distributions.

Table 4. A 2 × 2 contingency table used in the Mantel–Haenszel (MH) computation.

Figure 15. Item vectors for the 30-item symmetric test and conditioning score composite directions.

Figure 16. Overlaying an 8 × 8 grid on the $(\theta, \eta)$ latent ability plane with Reference (red) and Focal (green) underlying and marginal distributions. (Color figure online)

Figure 17. MH and SIBTEST DIF results by item for each of the four condition scores.

Figure 18. Contour and difference plots for a matched noncompensatory and compensatory item.

Table 5. Compensatory and noncompensatory item parameters matched on p value for a given underlying ability distribution.

Figure 19. Mantel–Haenszel and SIBTEST DIF results for each of the 30 items.

Figure 20. The contour graph of the original item response surface with the direction of the first ($\nu_1$) and second ($\nu_2$) principal components (left) and the contour surface rotated $45^\circ$ (right).

Figure 21. A contour plot of the original bivariate normal distribution (left) and the contour plot of the rotated distribution (right).

Figure 22. Conditional normal distributions, $G(\upsilon_1 \vert \upsilon_2)$, for $\upsilon_1 = -2, -1, 0, 1, 2$ (left) and conditional ICCs, $p(u_{ij} = 1 \vert \upsilon_1, \upsilon_2)$, for $\upsilon_1 = -2, -1, 0, 1, 2$ (right).

Figure 23. Estimated unidimensional ICC with five estimated $(\upsilon_1, p)$ values plotted.

Figure 24. Three different perspectives of the elements used in mapping the two-dimensional compensatory model onto a unidimensional ICC.

Figure 25. The graphical display shown by the DIF Graphical Simulator.