1. Introduction
Latent variable models allow the estimation of latent traits, such as party ideology, based on manifest behavior, such as issue positions (Bafumi et al., 2005). This approach has proven useful in many areas, one of which is placing parties in political space (König et al., 2013). A recurring question in such applications, however, is whether the resulting estimates of party positions are comparable across contexts (Davidov et al., 2014). Comparability is hampered when the associations between latent and manifest variables vary across contexts, a phenomenon typically called differential item functioning (DIF).
DIF comes in two types. The first reflects artifacts emanating from measurement, which have received a great deal of attention (Bakker et al., 2014a,b; Hare et al., 2015). The second reflects substantive differences in the ideological space and its meaning and relevance for parties’ issue positions. This second type has received much less attention, although the fluidity of the specific meaning of general ideology is well documented at the mass and party levels (Bornschier, 2010; Wheatley and Mendez, 2021). It is our central focus.
We introduce a novel hierarchical item response-theoretic (IRT) model that allows researchers to disentangle and quantify different sources of DIF across countries, parties, and measurement instruments. The model particularly yields insight into the substantive aspect of DIF: the differential meaning of ideology across countries and the fact that ideology does not fully account for the positions of some parties on some issues.
We apply this model to the placement of Western-European parties in a two-dimensional ideological space using data from the 2019 Chapel Hill Expert Survey (CHES; Jolly et al., 2022). We find that economic left–right maps quite uniformly onto issues across Western Europe. To a slightly lesser degree, this is true of the cultural dimension of ideology. By contrast, issues related to the European Union map rather heterogeneously across Europe. We also find that the two ideological dimensions cannot fully account for the positioning of specific parties on specific issues. The analysis speaks to a variety of ongoing debates about the Western-European political space, including its meaning and dimensionality (Hooghe and Marks, 2018; De Vries and Hobolt, 2020; Wheatley and Mendez, 2021). More generally, the tool we develop (1) distinguishes between a variety of sources of contextual variation; (2) can be used for a wide variety of data sources on party positions, including expert surveys, voter placements, party manifestos, and other textual data; and (3) extends other frameworks such as Aldrich–McKelvey (A-M) scaling (Hare et al., 2015).
2. Understanding DIF in party ideology
Latent variable models typically assume that latent traits are reflected in observable behavior and that, indeed, those traits induce coherence in what would otherwise be disparate items (Borsboom et al., 2003). When the manifest variables are political issues, the latent variables are typically conceived of as ideological dimensions, following the logic of ideological constraint outlined by Converse (1964). In the Western-European context, there is increasing consensus that two dimensions are required to capture the nature of political conflict across political parties and citizens (Marks et al., 2006; Bornschier, 2010; Hooghe and Marks, 2018). We take this two-dimensional view of the political space, which imposes a common structure between latent traits and observable behavior, as our starting point in this research note.
DIF concerns the question of whether latent dimensions connect uniformly to manifest variables (for an overview, see Davidov et al., 2014). We identify four sources of DIF in placing parties in a political space: two related to measurement and two related to substantive differences across contexts and actors. A first measurement-related source of DIF is the raters of party positions, whether they be experts, voters, or coders of party manifestos (e.g., see Steenbergen and Marks, 2007). This problem is typically addressed through response aggregation (e.g., Jolly et al., 2022).
A second measurement-related source of DIF concerns cross-national differences in response behavior, which may result from different understandings of constructs or response styles. This problem can be addressed through the use of vignettes (Bakker et al., 2014a,b) or A-M scaling (Hare et al., 2015). Such approaches can be used to recover cross-national distortions in the linkage between observed indicators and latent traits.
Of course, we pay close attention to these sources of DIF in our analysis. Our primary interest, however, is one that we share with comparativists: substantive differences in the ideological space, as the meaning of political ideology is not the same everywhere (Huber and Inglehart, 1995; Kriesi et al., 2006; Bornschier, 2010). The central question here is how political ideology connects to specific issues, i.e., how issues are politicized (Bakker et al., 2012; Rovny and Whitefield, 2019). If it is true, as much of the literature assumes, that ideology has a static quality, then the question of how existing ideological conflicts translate into issues is highly relevant and a potential third source of cross-national DIF (Marks et al., 2006; Hooghe and Marks, 2018).
A fourth source of DIF concerns cross-party differences. It is possible that parties take issue positions that are unexpected given their placement in ideological space. This is consistent with the idea of political entrepreneurship (parties strategically seeking to politicize issues orthogonal to existing ideological dimensions of conflict), but it may also be a legacy of specific issues giving rise to the party (De Vries and Hobolt, 2020). We conceive of this in terms of idiosyncratic shocks, in the way Lauderdale et al. (2018) have done for public opinion. The result of these shocks is that parties with identical ideological positions can nevertheless display quite different positions on certain issues, reflecting that issue-specific dimensions of conflict need not align with ideological dimensions of conflict.
We therefore need a model that incorporates four different sources of variation relative to the common structure of a two-dimensional political space as described above: (1) variation among observers of issue positions; (2) national variation in response styles across those observers; (3) cross-national differences in the linkage between issue positions and political ideology; and (4) idiosyncratic shocks that cause parties with identical ideological positions to take heterogeneous issue positions. Such a model does not yet exist, and we develop it in the next section.
3. Model
3.1. Setup
In our exposition, $y_{ijce}$ is the position of political party $i$ on issue $j$, in country $c$, as indicated by expert $e$ on an ordered scale. While we focus on experts, $e$ can be thought of more generally as any coder of issue positioning. The probability $\Pr(y_{ijce} = k)$, where the response option $k = 1, \ldots, K$, is determined by an appropriate link function between $y_{ijce}$ and the latent continuum $y_{ijce}^\ast$. The link function can be adapted to incorporate other kinds of measures such as word counts for manifestos.
Two party-specific factors are systematically associated with $y_{ijce}^\ast$, namely a party's position on $D$ ideological dimensions, $\boldsymbol{\theta}_{ic}$, and a party's idiosyncratic preferences, $\gamma_{ij}$. The discrimination parameters of an issue in $\mathbb{R}^D$ are given by $\boldsymbol{\beta}_{jc}$ and capture the strength and direction of the association between latent ideological traits and issue positions. Importantly, the subscript $c$ allows for cross-national variation in these associations, so that ideological conflict can play out differently at the issue level in different locations. In keeping with random-effects IRT models, we postulate $\beta_{jcd} = \beta_{jd0} + \beta_{jcd1}$, where $\beta_{jd0}$ is the mean and $\beta_{jcd1}$ is a country-issue-specific error term (cf. De Jong et al., 2007; Fox and Verhagen, 2018). This way of modeling the discrimination parameters constitutes one of our main contributions.
The idiosyncratic shocks are applied to the item difficulty parameters of the model. Specifically, for each party and each issue, we postulate a difficulty of $\gamma_{ij}$. This is an additive component that makes it more or less likely for political parties to embrace a particular position, regardless of $\boldsymbol{\theta}_{ic}$. Thus, two parties with identical $\boldsymbol{\theta}_{ic}$ can still have different positions on an issue.
Two additional factors influence the relationship between latent and manifest variables, namely heteroskedasticity in the scale of experts’ errors, $\sigma_e$ (cf. Harvey, 1976), and country-level variation in scaling, $\zeta_c$. The expert-specific term captures variation across experts, whereas the parameter $\zeta_c$ is akin to the scale parameter in A-M models (Hare et al., 2015) and captures cross-national differences in response behavior.
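To make the random-effects decomposition concrete, the following Python sketch builds country-specific discrimination parameters for a single issue and dimension from a common mean plus country-issue deviations. The function name, the numeric values, and the choice to center the deviations at zero are illustrative assumptions, not estimates or code from the model itself.

```python
import random

def country_discriminations(beta_mean, sd_dev, n_countries, seed=0):
    """Return beta_jcd = beta_jd0 + beta_jcd1 for one issue j and one dimension d.

    beta_mean plays the role of beta_jd0; the centred deviations play the
    role of beta_jcd1 (all values here are hypothetical).
    """
    rng = random.Random(seed)
    # Country-issue-specific deviations around the common mean.
    devs = [rng.gauss(0.0, sd_dev) for _ in range(n_countries)]
    # Centre the deviations so that they average to zero across countries.
    centre = sum(devs) / n_countries
    devs = [d - centre for d in devs]
    return [beta_mean + d for d in devs]

betas = country_discriminations(beta_mean=1.2, sd_dev=0.3, n_countries=15)
```

The spread of `betas` around `beta_mean` is what cross-national variation in the link between an issue and an ideological dimension would look like in this toy setting.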
Combining everything so far, we obtain the following model:
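One form consistent with the definitions above is sketched here. The multiplicative placement of $\zeta_c$, the positive sign on $\gamma_{ij}$, and the error term $\varepsilon_{ijce}$ (normal with standard deviation $\sigma_e$) are assumptions of this sketch rather than a confirmed specification.

```latex
y_{ijce}^\ast = \zeta_c \left( \boldsymbol{\theta}_{ic}^\top \boldsymbol{\beta}_{jc} + \gamma_{ij} \right) + \varepsilon_{ijce},
\qquad \varepsilon_{ijce} \sim \mathcal{N}(0, \sigma_e)
```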
We use an ordered probit specification to link $y_{ijce}^\ast$ and $y_{ijce}$, using a set of ordered cutpoints $\alpha_{jc1}, \ldots, \alpha_{jc,K-1}$ (Samejima, 1969). Similar to the discrimination parameters, the cutpoints vary across issues and countries. We model them as $\alpha_{jck} = \alpha_{jk0} + \alpha_{jc1} + \alpha_c$, where $\alpha_c$ is akin to the shift parameter in the A-M model, which captures country-specific shifts in responses (Hare et al., 2015), and $\alpha_{jc1}$ is a country-issue-specific error term. The response probabilities are now given by:
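In the standard ordered probit form, and under the assumption that the systematic part of the latent position is $\mu_{ijc} = \zeta_c ( \boldsymbol{\theta}_{ic}^\top \boldsymbol{\beta}_{jc} + \gamma_{ij} )$ with expert-specific error scale $\sigma_e$, the probabilities can be sketched as follows. The conventions $\alpha_{jc0} = -\infty$ and $\alpha_{jcK} = +\infty$ are likewise assumed.

```latex
\Pr(y_{ijce} = k) =
  \Phi\!\left( \frac{\alpha_{jck} - \mu_{ijc}}{\sigma_e} \right)
  - \Phi\!\left( \frac{\alpha_{jc,k-1} - \mu_{ijc}}{\sigma_e} \right),
\qquad
\mu_{ijc} = \zeta_c \left( \boldsymbol{\theta}_{ic}^\top \boldsymbol{\beta}_{jc} + \gamma_{ij} \right)
```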
where Φ denotes the cumulative standard normal distribution function.
The model's innovation is to allow for variation in the discrimination parameters and thresholds. In addition, it combines research on idiosyncratic shocks (Lauderdale et al., 2018) and A-M scaling (Hare et al., 2015). Indeed, the model generalizes the A-M approach to settings with multiple observations of indicators for units of analysis, whereas that approach is typically used with single observations of indicators for units. This allows us to estimate country-item-specific scale and shift parameters in addition to the country-specific scale and shift parameters estimable via the A-M approach.
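The ordered probit link can be illustrated with a short Python sketch. It assumes that the systematic part of the latent position is $\zeta_c(\boldsymbol{\theta}_{ic}^\top \boldsymbol{\beta}_{jc} + \gamma_{ij})$ and that $\sigma_e$ scales the error; both are assumptions of this illustration, as are the function names and all numeric values.

```python
import math

def normal_cdf(x):
    """Cumulative distribution function of the standard normal."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

def response_probs(theta, beta, gamma, cutpoints, zeta=1.0, sigma=1.0):
    """Ordered probit probabilities for K = len(cutpoints) + 1 categories.

    theta, beta : sequences of length D (party position, discrimination)
    gamma       : idiosyncratic shock for this party and issue
    cutpoints   : increasing thresholds alpha_1 < ... < alpha_{K-1}
    zeta, sigma : country scale and expert error scale (assumed roles)
    """
    mu = zeta * (sum(t * b for t, b in zip(theta, beta)) + gamma)
    bounds = [-math.inf] + list(cutpoints) + [math.inf]
    cdf = [normal_cdf((b - mu) / sigma) for b in bounds]
    return [cdf[k + 1] - cdf[k] for k in range(len(bounds) - 1)]

# Two hypothetical parties with identical ideology but different shocks.
theta = [0.5, -0.2]
beta = [1.0, 0.0]
cuts = [-1.0, 0.0, 1.0]
p_no_shock = response_probs(theta, beta, gamma=0.0, cutpoints=cuts)
p_shock = response_probs(theta, beta, gamma=1.0, cutpoints=cuts)
```

Comparing `p_no_shock` with `p_shock` shows how a positive shock moves probability mass toward higher response categories even though the two parties share the same ideological position.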
3.2. Priors
We estimate our IRT model using Bayesian inference, as implemented in Stan (Carpenter et al., 2017), with the following priors. Parties’ positions on latent ideological dimensions are drawn from a standard normal distribution: $\theta_{icd} \sim \mathcal{N}(0, 1)$. Further, $\gamma_{ij} \sim \mathcal{N}(0, \sigma_{\gamma})$. Throughout, the hyperparameters $\sigma$ for priors of the idiosyncratic shocks ($\sigma_{\gamma}$), the random effects on the discrimination parameters ($\sigma_{\beta}$) and cutpoints ($\sigma_{\alpha}$), as well as expert-specific errors ($\sigma_{\sigma}$) and country-specific scale parameters ($\sigma_{\zeta}$) are drawn from a truncated normal distribution, $\sigma \sim \mathcal{N}_{+}(0, 1)$, to ensure that $\sigma > 0$.
The mean item parameters are also drawn from normal distributions: (1) $\beta_{jd0} \sim \mathcal{N}(0, 1)$ and (2) $\alpha_{jk0} \sim \mathcal{N}(0, 1)$ (plus an ordering constraint). The random effects for both sets of parameters are drawn from multivariate normal distributions, specifically $\boldsymbol{\beta}_{cd1} \sim \mathcal{MVN}(\mathbf{0}, \boldsymbol{\Sigma}_{\boldsymbol{\beta}_d})$ and $\boldsymbol{\alpha}_{c1} \sim \mathcal{MVN}(\mathbf{0}, \boldsymbol{\Sigma}_{\boldsymbol{\alpha}})$. The covariance matrices here are estimated from the data by first multiplying the scale parameters $\sigma_{\beta}$ and $\sigma_{\alpha}$ with two vectors of length $J$ drawn from uniform Dirichlet distributions ($\mathcal{D}(1)$). This yields vectors $\boldsymbol{\tau}_{\beta}$ and $\boldsymbol{\tau}_{\alpha}$ whose elements sum to $\sigma_{\beta}$ and $\sigma_{\alpha}$, respectively, because each scale parameter is distributed across the elements of a vector constrained to sum to 1. Then, diagonal matrices with diagonal elements consisting of these vectors are multiplied with correlation matrices, $\boldsymbol{\Omega}$, drawn from LKJ priors with shape = 4 so that, e.g., $\boldsymbol{\Sigma}_{\boldsymbol{\alpha}} = \operatorname{diag}(\boldsymbol{\tau}_{\alpha})\,\boldsymbol{\Omega}_{\alpha}\,\operatorname{diag}(\boldsymbol{\tau}_{\alpha})$ (Barnard et al., 2000). The country-specific parameters for the ordered cutpoints are drawn from a standard normal distribution: $\alpha_c \sim \mathcal{N}(0, 1)$.
Finally, the scale of expert-specific errors is drawn from a symmetric Dirichlet distribution, where the hyperparameter $1/\sigma_{\sigma}$ is equal across experts: $\sigma_e \sim \mathcal{D}(1/\sigma_{\sigma})$. This constitutes an uninformative prior over the resulting vector and ensures that the resulting estimates cannot be negative. The same holds for the country-specific scale parameters: $\zeta_c \sim \mathcal{D}(1/\sigma_{\zeta})$.
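The construction of the covariance matrices can be sketched in a few lines of Python. The sketch draws the simplex vector from a uniform Dirichlet distribution, distributes a scale parameter across it, and forms diag(τ) Ω diag(τ). The correlation matrix is fixed by hand rather than drawn from an LKJ prior, and the function names and numeric values are hypothetical.

```python
import random

def simplex_draw(n, rng):
    """Draw from a uniform Dirichlet(1, ..., 1) by normalising unit-rate exponentials."""
    g = [rng.expovariate(1.0) for _ in range(n)]
    total = sum(g)
    return [x / total for x in g]

def covariance_from_scales(sigma, omega, rng):
    """Return tau and Sigma = diag(tau) * Omega * diag(tau).

    tau is the scale parameter sigma distributed across a simplex draw,
    so its elements sum to sigma.
    """
    n = len(omega)
    tau = [sigma * w for w in simplex_draw(n, rng)]
    cov = [[tau[r] * omega[r][c] * tau[c] for c in range(n)] for r in range(n)]
    return tau, cov

rng = random.Random(1)
# Hand-picked correlation matrix standing in for an LKJ draw (assumption).
omega = [[1.0, 0.3, 0.0],
         [0.3, 1.0, 0.2],
         [0.0, 0.2, 1.0]]
tau, cov = covariance_from_scales(sigma=0.5, omega=omega, rng=rng)
```

Because the correlation matrix has a unit diagonal, the diagonal of `cov` holds the squared elements of `tau`, while the off-diagonal entries inherit their sign and relative size from `omega`.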
3.3. Identification
Our IRT model is not identified without further constraints, which pertain to location, scale, and rotation of the latent dimensions (see Bafumi et al., 2005). We fix the scale and location by re-scaling estimated ideological positions $\boldsymbol{\theta}$ to a standard normal distribution after each iteration of the sampling procedure. We address the rotational invariance problem by constraining the discrimination parameters to 0 for selected issues on selected dimensions (see below) and by setting starting values for party positions based on prior knowledge.
Adding idiosyncratic preferences and random effects on item parameters creates the potential for novel identification problems regarding location. We solve this by fixing the mean of those parameters to 0 for each issue and parameter. Introducing $\sigma_e$ and $\zeta_c$ raises potential identification issues for the scale of the estimated latent parameters. Thus, we impose a mean of 1 for those parameters.
We assess convergence of our model via the $\hat{R}$ convergence diagnostic, which is below 1.1 for all parameters (Gelman and Rubin, 1992).
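As a minimal illustration of the location and scale constraint, the following Python sketch standardizes a set of positions on one dimension to mean 0 and standard deviation 1. Treating the re-scaling as a per-dimension z-score is an assumption of this sketch, and the input values are hypothetical.

```python
import statistics

def standardise(values):
    """Rescale values to mean 0 and (population) standard deviation 1."""
    m = statistics.fmean(values)
    s = statistics.pstdev(values)
    return [(v - m) / s for v in values]

# Hypothetical draws of party positions on one dimension in one iteration.
theta_draw = [2.1, 0.4, -1.3, 3.0, 0.8]
theta_std = standardise(theta_draw)
```

The ordering of parties and their relative distances are unchanged by this transformation; only the arbitrary origin and unit of the latent scale are pinned down.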
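Both constraints can be sketched in Python. Centering enforces a mean of 0, and multiplying a simplex vector (such as a Dirichlet draw) by its length yields positive values with a mean of exactly 1. Whether the model imposes the mean-1 constraint in this particular way is an assumption of the sketch; the helper names and inputs are hypothetical.

```python
def centre_to_zero(values):
    """Subtract the mean so that the values average to 0."""
    m = sum(values) / len(values)
    return [v - m for v in values]

def scale_to_mean_one(simplex):
    """Turn a vector that sums to 1 into positive values with mean 1."""
    n = len(simplex)
    return [n * w for w in simplex]

# Hypothetical shocks for one issue and a hypothetical simplex draw.
gamma_issue = centre_to_zero([0.4, -0.1, 0.9, -0.6])
sigma_experts = scale_to_mean_one([0.2, 0.3, 0.5])
```

In this toy example an expert with `sigma_experts` above 1 would be noisier than average and one below 1 less noisy, while the average scale stays fixed.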
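For readers unfamiliar with the diagnostic, the following Python sketch computes the classic potential scale reduction factor for one parameter from several equal-length chains. It is a textbook version written for illustration; Stan reports a refined split-chain variant, so values may differ slightly from those produced by the software. The chains shown are hypothetical.

```python
import statistics

def r_hat(chains):
    """Classic potential scale reduction factor for equal-length chains."""
    n = len(chains[0])
    means = [statistics.fmean(c) for c in chains]
    # Average within-chain variance.
    within = statistics.fmean([statistics.variance(c) for c in chains])
    # Between-chain variance, scaled by the chain length.
    between = n * statistics.variance(means)
    var_plus = (n - 1) / n * within + between / n
    return (var_plus / within) ** 0.5

# Chains that agree versus chains stuck in different regions.
agree = r_hat([[0.0, 1.0, 2.0, 3.0], [0.0, 1.0, 2.0, 3.0]])
disagree = r_hat([[0.0, 1.0, 2.0, 3.0], [10.0, 11.0, 12.0, 13.0]])
```

Values close to 1 indicate that the chains have mixed, whereas values well above the 1.1 threshold, as in the second example, signal that the chains disagree.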
4. Application
4.1. Data
In our application, we use expert-level CHES data (2019), which covers 15 Western-European countries, 21 issues, 129 political parties, and 191 experts (Jolly et al., 2022). For each issue, experts position the party on an ordered response scale. The issues are selected to tap into three distinctive areas of political conflict, namely the economy (five items), social/cultural issues (ten items), and the European Union (six items). For more on the data, see Appendix 1 in the SI.
One major advantage of CHES is that it contains multiple observations per party and issue from different experts. On the one hand, this allows us to disentangle ideology from idiosyncrasy. On the other, this allows us to estimate $\sigma_e$. Other data sources that have this feature are voter placements of parties or party manifestos, if there are multiple coders.
We impose a two-dimensional ideological structure to explain variation in party preferences. One dimension captures conflict over the economy, specifically the role of the state, whereas the second dimension captures conflict over culture with poles that have sometimes been described as green-alternative-libertarian (GAL) and traditional-authoritarian-nationalist (TAN) (Bakker et al., 2014a). For identification purposes, we set the discrimination parameter of the issue “economic intervention” to 0 for the cultural dimension. We also set the discrimination parameter for “immigration policy” on the economic dimension equal to 0. Substantively, this specifies that economic intervention is entirely unrelated to the cultural dimension and that immigration policy is entirely unrelated to the economic dimension.
4.2. The ideological component
How do parties place on the ideological components? What patterns do the item discrimination parameters reveal? Figure 1 displays the correlation between the standardized mean expert placements on economic left–right and GAL–TAN (horizontal axes) and estimates of $\boldsymbol{\theta}_{ic}$ from our model (vertical axes). Overall, the values strongly correlate: r = 0.93 for the economic dimension and r = 0.90 for the cultural dimension. We interpret this as high face validity of our model.
Figure 2 is a box plot of the mean item discrimination parameters, $\boldsymbol{\beta}_{jc}$, with the boxes reflecting cross-country variation. The absolute size of the parameters indicates how strongly issues connect with the ideological dimensions (i.e., how well does an issue distinguish parties on either side of a dimension?), whereas the sign indicates the nature of the ideological politicization of an issue (e.g., does the economic right or the economic left favor a policy?). The figure highlights that the core issues associated with the economic dimension are very similar across Western Europe (visible in the first row of the figure). For deregulation, economic intervention, income redistribution, and spending versus tax cuts, there is hardly any variation in the parameters.
On the social/cultural issues, there is more heterogeneity (visible in the second row). This is particularly true for urban–rural policy emphases, the role of religious principles in politics, and the status of regions. Of the latter issue, it could be said that it is salient in only a few countries (e.g., Belgium, Spain, and the United Kingdom) and can be framed in both economic and cultural terms. Considering crime, the environment, immigration, nationalism, and social lifestyle, there is considerably less cross-national variation in the discrimination parameters. Those issues appear to define the cultural GAL–TAN dimension uniformly in Western Europe.
The greatest variation in item discrimination parameters can be found for the issues related to European integration (visible in the third and final row). Here the inter-quartile distances (the box lengths) and the whiskers are sizable compared to the other issues. In some countries, such as the United Kingdom, the European Union mostly seems to be a cultural issue, showing a much larger discrimination parameter for that than for the economic dimension (see Figures 4 and 5 in the SI). In other countries, such as Greece, Europe appears to be contested mainly on the economic dimension.
The payoff of our modeling strategy is that we separate the discrimination parameters from the scaling factor $\zeta_c$. Without this separation, at least part of the variation in the association between items and latent variables might be due to differences in cross-national response behavior. That is not the case with our setup.
4.3. The idiosyncratic component
Our model allows for party-specific shocks, $\gamma_{ij}$, relative to the ideological component structuring parties’ issue preferences. Figure 3 shows the size of those shocks for the various issues. The issues are sorted by the standard deviations of the shocks.
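The sorting step can be sketched in Python as follows. The issue labels and shock values are invented for illustration, and sorting in descending order is an assumption, since the text does not state the direction.

```python
import statistics

def sort_issues_by_shock_sd(shocks):
    """Order issues by the standard deviation of their party-specific shocks."""
    sds = {issue: statistics.pstdev(vals) for issue, vals in shocks.items()}
    return sorted(sds.items(), key=lambda kv: kv[1], reverse=True)

# Hypothetical shocks for three parties on two made-up issues.
toy_shocks = {
    "issue_a": [-0.1, 0.1, 0.0],
    "issue_b": [-1.0, 1.2, -0.2],
}
ranking = sort_issues_by_shock_sd(toy_shocks)
```

An issue with a large standard deviation, such as `issue_b` here, is one on which party positions depart most from what ideology alone would predict.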
The first thing to observe is that the shocks can be sizable, meaning that for certain issues party preferences are driven not by ideology alone, but also by idiosyncratic preferences. If the image so far has been one of relative agreement about the issues that discriminate in the two-dimensional space, the current image is one of distinct heterogeneity.
The largest shocks appear for five issues: (1) the role of religious principles in politics; (2) the urban–rural focus of parties; (3) their regional policy stances; (4) the environment; and (5) social lifestyle. All of those issues discriminate on the cultural dimension. It is clear, however, that some parties take positions that deviate from their cultural ideology. Further analysis shows this to be true of the agrarian party family for urban–rural concerns, the confessional family for religious principles and social lifestyle, the green family for the environment, and the regionalist family for regional issues (see Figure 6 in the SI).
4.4. Other results
In our discussion thus far, we have focused on parameters associated with party-specific factors influencing parties’ issue positions, namely parties’ latent ideological placement ($\boldsymbol{\theta}_{ic}$), discrimination parameters ($\boldsymbol{\beta}_{jc}$), and idiosyncratic shocks ($\gamma_{ij}$). In doing so, we have highlighted the face validity of this model, variation in the politicization of issues across Western Europe, and the extent to which parties’ positions are determined by factors other than ideology. These are the quantities most clearly associated with our primary interest, namely substantive sources of DIF.
For brevity, we present the resulting estimates of other parameters regarding measurement-specific sources of DIF from this model in the SI: the country-issue-specific shocks to difficulty parameters ($\alpha_{jc}$; Figure 7), the country-specific scale and shift parameters ($\zeta_c$ and $\alpha_c$; Figures 8 and 9), as well as expert-level errors ($\sigma_e$; Figure 10). In Appendix 3 in the SI, we show via model comparisons that the single component that accounts for the largest increase in model performance relative to a conventional IRT model is the inclusion of idiosyncratic preferences. Overall, the results show that our model disentangles different sources of DIF that represent both substantive and measurement-specific variation in the association between manifest and latent variables across countries.
5. Conclusions
In this paper, we have outlined a method that allows scholars to parse DIF into a variety of sources, distinguishing substantive factors (i.e., idiosyncratic preferences and differential meanings of ideology) from measurement-related factors. The approach speaks to the nature of political conflict, issue politicization, and the role of agency in politics, i.e., parties positioning themselves in unique ways. In general, the IRT model uncovers political space in all its local nuance. Since the concept of space plays a crucial role in voting behavior and political representation, it is essential that scholars have the tools to understand the full complexity of its nature.
Our approach is general. While we focused on expert ratings of party positions, one could easily use other measurement instruments such as voter placements or placements by coders of party manifestos or other relevant texts. Key for the comprehensive model outlined here is that there are multiple observations per country, party, rater, and issue; but a more restricted model can also be estimated with more lenient data requirements. Of course, our insights were premised on a specific choice of two dimensions. For country comparisons to work, it is necessary to impose a specific dimensionality of the space. Differences in that dimensionality are another source of DIF. As long as we can identify subsets of at least two countries that abide by a particular dimensionality, our approach can be applied to those subsets. For instance, at the mass level, Wheatley and Mendez (2021) found that two dimensions were needed in some countries, whereas others required three. Thus, we could estimate separate models for both groups of countries.
Supplementary material
The supplementary material for this article can be found at https://doi.org/10.1017/psrm.2023.16 and replication materials at https://doi.org/10.7910/DVN/98ZAOG.
Acknowledgements
The authors are listed in alphabetical order. We thank Tom Paskhalis, Sarah Engler, Matthias Enggist, and the two anonymous reviewers for their insightful feedback. Earlier versions were presented at the 2021 Annual Meetings of the European Political Science Association, the Swiss Political Science Association, and the Political Studies Association of Ireland.