Policy preferences of members of parliaments (MPs) are a central concept in comparative research. They are core to the study of MPs’ individual behavior, help understand the relationships between representatives and their electorate, and are also an essential factor driving public policies. Starting with the scaling of legislative roll-calls (Poole, 2005), several methodological advancements have improved our understanding of the ideological placement of political actors. However, a particularly thorny problem remains in estimating individual positions in legislatures with strong partisan discipline and rare opportunities for unconstrained voting. In particular, a roll-call vote is more likely to express government-opposition dynamics than policy positions (Spirling and McLean, 2007; Dewan and Spirling, 2011; Hix and Noury, 2016). Alternative measurement strategies have therefore relied on data generated outside of parliament, for example, campaign finance (Bonica, 2014) or social media (Barberá, 2015).
This paper follows these advances and proposes a novel and low-cost approach for estimating an individual MP's ideological position in large legislatures. Our design consists of four features: First, we survey the leadership of parties’ youth organizations because they possess detailed knowledge about elected representatives’ ideological stances. Second, we designed a sampling algorithm to efficiently select informative pairs, which minimizes the cost of the survey without compromising the results. Third, these experts compare pairs of legislators along a left-right dimension in a simple online survey. Finally, we utilize these comparisons and estimate a Davidson model that generates an ideological position and its accompanying uncertainty for each legislator. After discussing the costs and benefits of the design, we illustrate our design with an estimation of the ideological position of the 709 members across the six parties represented in the 19th German Bundestag.
The German Bundestag is not only one of the world's largest parliaments, but it also consists of homogeneous and disciplined parties. Hence, it constitutes a challenging case for measuring the individual ideology of its members. Twenty-four respondents produced over 10,000 unique comparisons. The resulting estimates of individual ideological positions display evident variation across and within political parties. These estimated positions follow expected partisan differences. We demonstrate our estimates’ face and convergent validity, which mirror well-known differences between party wings and correspond with legislators’ self-placement. In conclusion, we explore the feasibility and flexibility of our design. Our survey technique and subsequent estimation are simple to implement. They can be extended easily—by using common anchors, such as heads of governments—across jurisdictions and over time.
1. Measuring individual ideological positions
Since the seminal work of Downs (1957), political preferences have been conceptualized using spatial models. The number of dimensions on which these positions are evaluated is typically small, and they often correspond to a single left-right ideological dimension (Poole, 2005). Spatial models are especially compelling when preferences are linked to institutional rules (Plott, 1991), e.g., the electoral system, in order to explain collective outputs, such as the formation of a government or the adoption of a policy. For instance, Proksch and Slapin (2012) investigate the determinants of floor access in legislatures and show that in mixed-member proportional electoral systems, party leaders prevent MPs with extreme positions from accessing the floor. In a different setting, Hix (2004) shows that the voting behavior of members of the European Parliament is determined by the distance between the position of the members, their European parliamentary group, and their national party. Such models populate the field of legislative politics, where MPs’ positions play a crucial role in the analysis of legislative behaviors and outcomes. Testing these models requires an accurate measure of individual MPs’ ideological positions. Substantive research that relies on MPs’ positions goes beyond legislative and coalition politics. These works extend to constituency preferences and their representation (for a review, see Canes-Wrone, 2015). Ultimately, they all inquire about how democratic politics works.
Separating preferences from behaviors sheds light on electoral and legislative politics. On the aggregate level, the intra-partisan distribution of core preferences affects a party's ability to adopt policies, negotiate coalition agreements, or represent its electorates. On the individual level, MPs’ level of sincerity reflects their capacity to represent their constituents and is likely to vary across contexts and considered behaviors. In parliamentary systems characterized by strong partisan discipline, speeches and social media posts are, for instance, likely to be more sincere than roll-call votes. A central challenge for this research is the distinction between MPs’ individual preferences and their “revealed preferences” as legislative behavior (for the larger debate, see Knox et al., 2022). Put simply, an MP's roll-call vote, parliamentary speech, or public communication reflects preferences but also responds to contextual and other strategic factors (Hix and Noury, 2016). Behavior can vary despite stable core preferences, so the two concepts must be kept apart theoretically and empirically. Our goal, therefore, is to determine MPs’ preferences independent of their behavior.
In short, three reasons for measuring individual preferences of legislators exist: (1) elected representatives are the basic unit of political action, (2) much of politics organizes around a single left-right ideological dimension, and (3) measures of MPs’ preferences reveal important aspects of individual behaviors and collective choice.
Measuring legislators’ ideological positions is a formidable task. Laver (2014) and Carmines and D'Amico (2015) summarize the state of the art. Laver (2014) identifies several challenges when measuring ideology. First, it is necessary to choose between discovering the substantive meaning of the ideological dimension inductively or developing it deductively. Second, ideology is a latent concept whose empirical reality varies across space and time. For example, holding a left-leaning position amounts to different policy preferences in Germany and France. A deductive approach makes the measure more adaptive to the context but less comparable across contexts. On the other hand, proceeding inductively helps to measure a comparable phenomenon across contexts but raises the risk of locally applying an inaccurate definition of ideology. Finally, the numeric scale onto which actors’ preferences are projected needs to stay stable across actors.
Research on the measurement of individual positions can be classified into two groups: survey and behavioral approaches. We introduce these and briefly discuss their respective advantages and drawbacks. Turning first to survey approaches, the usually large number of representatives renders classic expert surveys impractical. Assessing the left-right position of an actor requires in-depth knowledge of this actor. Only a small handful of experts enjoy this sort of knowledge of all legislators. MPs themselves can offer such depth and breadth and can be asked to place themselves on an ideological scale. Directly surveying legislators has been carried out in the American states (Maestas et al., 2003) and across different democracies. The Comparative Candidate Survey (CCS) is an international effort that asks many legislative candidates to place themselves on an 11-point left-right scale. This basic form of self-placement follows typical voter surveys and assumes that candidates know their own position and conceive the left-to-right space in the same way.
Individual self-placement has three drawbacks. First, its scope is limited. Despite the impressive size of the CCS sample, its compliance rate is low, as already noted for previous elite surveys (Bailer, 2014). In Germany in 2017, 803 of the 4828 candidates (about 17 percent) took the survey. Of these respondents, 182 were eventually elected, which represents only about a quarter of all German legislators. Second, self-placement questions expect respondents to perceive the numeric scale and its association with the underlying dimension in the same way (Lesschaeve, 2017). If two candidates respond with the same position, it is tempting to conclude they hold the same view. Yet, they might simply differ in how they translate their position onto the numeric scale. Finally, it is impossible to prevent strategic answers. Such misrepresentation might occur partly because the population is small and anonymity is hard to uphold. Fearing potential backlash, respondents might consequently adopt the position of their constituency or leadership.
A recent study by Hopkins and Noel (2022) offers a valuable advancement of survey-based measures. To construct the ideological positions of US Senators, the authors ask politically engaged citizens to compare pairs of legislators on an ideological scale. Then, they leverage these comparisons and estimate the underlying ideological positions. Their design overcomes most issues related to scaling actors’ positions. There are still two potential disadvantages that we remedy here. First, it assumes that politically active citizens know all considered political actors well enough. As chamber size increases, familiarity with legislators is likely to decline. In this context, obtaining a complete ideological picture of a chamber becomes complicated. To solve this issue, our approach relies on sets of respondents whose daily work brings them into proximity to legislators, such as members of the parties’ executives, political journalists, or parliamentary staff. Second, as the number of actors increases, the comparison space becomes huge. Random exploration of that comparison space, as in Hopkins and Noel (2022), also becomes very inefficient.Footnote 1 One does not learn much from pairing a very conservative member with a very progressive member of the same party. Again, their design works well for small and active legislatures, such as the US Senate with its 10,000 possible comparisons, but is of limited use in larger chambers.
The second group of strategies consists of behavioral measures. These measures posit that ideological positions can be derived from observable behavioral patterns. The more similar the behaviors of actors, the closer their ideological positions are. In the context of legislators, two types of behaviors have been scrutinized extensively: roll-call votes (Poole and Rosenthal, 1985; Carroll and Poole, 2014) and speeches (Proksch and Slapin, 2010; Lauderdale and Herzog, 2016; Rheault and Cochrane, 2020). At first sight, it seems reasonable to expect MPs sharing an ideological view to vote together and to deliver similar speeches. However, the extensive use of both measurement approaches in the last 20 years has shown that estimating positions from behaviors is more complicated than expected.
Roll-calls are often discretionary (Ainsley et al., 2020) and suffer from three limitations. First, they are specifically triggered when MPs have incentives not to vote sincerely (Carrubba et al., 2008). Second, roll-call analyses cluster together fringe representatives without necessarily distinguishing between different ideological orientations (Spirling and McLean, 2007). Finally, roll-call analyses were developed in the American context, where partisan constraints on voting are lower than in most other legislatures (Hix and Noury, 2016). In other cases, especially in Western European democracies, partisan discipline is high, and unconstrained voting is rare (Spirling and McLean, 2006; Dewan and Spirling, 2011; Hix and Noury, 2016). After carefully designing roll-call models to estimate the ideal points of German MPs, Bräuninger et al. (2016) conclude that non-spatial factors irregularly but considerably influence roll-call votes. “Off-the-shelf estimates may be biased in various ways, and we should instead turn to more complex behavioral models to arrive at valid point estimates” (p. 191).
Measures based on speeches, such as Wordscores (Laver et al., 2003), Wordfish (Proksch and Slapin, 2008), or Wordshoal (Lauderdale and Herzog, 2016), address some of these drawbacks. Floor speeches are less affected by partisan discipline and more likely to reflect individual preferences. In addition, speeches contain more information than discrete roll calls. The speech of a fringe left-leaning MP is unlikely to be confused with the speech of a fringe right-leaning MP, even if they both oppose the same bill. Yet, extracting positions from speeches is not as straightforward as it seems. Text-scaling methods aim to project high-dimensional data (word frequencies) onto a few dimensions. A transcribed speech contains precise information about ideology, but it also contains a lot of non-ideological content. In this context, systematically linking word patterns with ideological positions is challenging. Even once speeches are located along one dimension, it is necessary to validate the interpretation of the obtained dimension and its mapping onto the desired latent left-right ideology. This validation is complicated without an actual gold standard accurately measuring left-right positions. Lauderdale and Herzog (2016) show, for instance, that the first dimension structuring debates in the Irish Dáil amounts to the divide between the government and the opposition and, hence, does not match the left-right dimension as previous studies suggested.
With these two approaches of measurement and their accompanying trade-offs in mind, the following section presents the design of an expert survey that overcomes the limitations of existing surveys. In a nutshell, we ask national experts to repeatedly compare pairs of MPs. In doing so, we overcome issues related to the subjective and potentially varying interpretations of a numeric scale. We can provide point estimates and uncertainties for all legislators within a parliament, independent of behavior.
2. Measurement strategy
Our measurement strategy consists of a simple expert survey using pairwise comparisons and a Davidson model to fit those responses. We propose to run the expert survey with politically active party members who work for or are closely associated with legislators. In the illustration below, we rely on leaders of the youth wings of political parties in Germany.Footnote 2 Instead of the classical scale-placement question, we take advantage of simple pairwise comparisons (Carlson and Montgomery, 2017). In our case, each respondent compared 500 pairs of MPs according to an ideological criterion.Footnote 3 This number seems low compared to the more than 500,000 possible pairs (the Bundestag has over 700 members). Yet, our approach uses each respondent's previous answers to identify the most informative pairs of legislators, hence compressing as much information as possible into these 500 pairs. This approach allows an efficient exploration of the comparison space.
2.1 National experts
Measuring the position of individual legislators with an expert survey requires participants to be able to distinguish between as many legislators as possible.Footnote 4 Importantly, only a limited number of potential participants know backbenchers relatively well. We believe that leaders of the German youth parties are a good source of expertise. First, these organizations are highly institutionalized and work hand in hand with their mother organizations. They are part of the daily routine of the party: they hold executive positions at the local level, they commonly work as parliamentary assistants, they participate in grass-root activities such as campaigning or rallying, they also have a voice in their party's national executive board and even get some of their members elected as representatives. Their daily contact with political parties makes them ideal subjects for estimating the positions of both prominent and inconspicuous legislators. Beyond youth leaders, we considered using parliamentary staffers, parliamentary journalists, or MPs themselves. We focused on youth leaders for practical reasons, too: youth leaders are very accessible, and their participation was easier to incentivize.Footnote 5 As shown in Appendix A, compliance was indeed very high.
A crucial aspect of this inductive measurement strategy is the absence of a clear definition of left and right. Respondents are presented with two MPs and must identify which MP holds the most left-leaning position. We did not provide any further explanation on how left-leaning should be understood. All respondents perceived the task as straightforward and did not ask for more details. In doing so, we rely on their subjective interpretation of left and right, which is relatively homogeneous within a given country at a given point in time (Huber, 1989). This is particularly true among politically active respondents, who are unlikely to misconceive left and right when comparing two legislators. Respondents were very consistent in their answers, supporting the hypothesis of a shared left-right definition.
The potential drawback of subjectivity is low compared to its advantages. The absence of a fixed definition of left and right-leaning improves the flexibility of the resulting measure, which can be applied across contexts. If, in a given context, left-leaning positions are about defending state intervention in the economy, respondents will be aware of it and compare MPs accordingly. If such positions are instead related to a decentralization debate, respondents will instead compare MPs concerning decentralization. This inductive approach may be problematic if the definition of left and right-leaning varies across the ideological spectrum. If left-leaning respondents view ideology in terms of social policy while right-leaning respondents approach it in terms of immigration policy, the ideological composition of the pool of respondents may affect the measure's accuracy.Footnote 6 This highlights the importance of recruiting a sample of respondents as representative as possible of the ideological distribution within the considered context.Footnote 7 Furthermore, the inductive approach relaxes all behavioral assumptions. There is no need to link an ideological position with a specific behavior (voting with or against the party or holding a specific speech). When comparing MPs, respondents rely on their personal knowledge of the legislator, their work, voting record, network, agenda, etc. The resulting estimates are consequently not tied to one type of behavior. Instead, they reflect how respondents perceive the global behavior of an MP.
Assuming the respondents know the meanings of left and right, two types of biases still threaten our measure's validity: sympathetic bias and collective non-ideological heuristics. First, sympathetic bias may lead respondents to assess sympathetic MPs (i.e., from their own party or belonging to the same social group) differently than other MPs (Benoit and Laver, 2006). Accordingly, they would rate sympathetic MPs closer to their own ideological position. For instance, they might project their own position on MPs of the same party to prevent cognitive dissonance. The German Young Greens are known to be much more left-leaning than their older counterparts. Sympathetic bias might encourage Young Greens to systematically label Green legislators as more left-leaning, resulting in a misstated position. To limit the potential effect of sympathetic bias, we implemented three safeguards. First, we recruited members from each major German party, so that our sample of respondents is representative of the German political landscape. Second, each participant had to classify members from all parties and not only from their own. Third, our models took into account respondent heterogeneity and modeled it explicitly.
The second type of bias arises when respondents mobilize external cues instead of their personal knowledge to estimate the ideology of an MP. There is a trade-off between providing respondents with enough information on the MPs for identification and cueing their answers by providing too many or particular pieces of information. We settled on offering two pieces of information: a name and an official portrait taken from the parliamentary website. Most notably, we did not disclose MPs’ party affiliation and explicitly asked respondents not to look for more information, such as the MP's Wikipedia page. In addition, respondents were encouraged to declare an MP as unknown when they had no or limited knowledge about the MP. Providing a picture is debatable because visual cues, including gender, race, facial expression, background, etc., can influence opinion formation (Olivola and Todorov, 2010). In a pre-test, we only showed names, and respondents complained about the difficulty of identifying an MP on the mere basis of the name.Footnote 8 By providing the respondents with the name and picture of an MP and allowing them to declare an MP as unknown, we tried, as much as possible, to minimize the use of external cues.
2.2 Pairwise comparisons
Pairwise comparisons constitute a simple and valuable tool to measure latent traits (Benoit et al., 2019). Classical scaling approaches ask respondents to place MPs on an absolute scale. But as Carlson and Montgomery (2017, p. 836) note, “completing such tasks requires workers to continuously maintain in their memory how previous [MPs] were coded and remember detailed rules dictating how stimuli are placed into categories.” Respondents have to come up with a rule system differentiating numerical values for each step on a scale, e.g., a “4” from a “5.” These rule systems are likely to vary across time and respondents, as “individuals understand the ‘same’ question in vastly different ways” (Brady, 1985). Instead, pairwise comparisons express an MP's ideology in relative terms. It does not matter whether an MP is moderate or radical in absolute terms; only the relative position of the two MPs matters. The task is consequently more reliable across coders and easier to perform because of the binary nature of the decisions.
In our illustration, respondents were asked to compare 500 pairs with the following task description: “Which of these two MPs holds a more leftist position?” They could choose between three answers: “A is more leftist than B,” “B is more leftist than A,” or “A and B defend a similar position.”Footnote 9 On average, respondents needed 103 min (13 s/comparison) to complete the full survey and were rewarded with €75.
The 709 German MPs generate over 500,000 possible comparisons. Drawing randomly from this set would be inefficient, as it would include many uninformative comparisons. For instance, there is no need to compare far-right members with far-left members. A random draw would accordingly increase the number of comparisons required to estimate accurate and precise positions. After each comparison, we automatically use the new information provided to detect the most informative pairs. To do so, our sampling algorithm focuses on the pairs that would be undecided if we were to assume complete transitivity of preferences. If a respondent declared A to be more leftist than B and B to be more leftist than C, it would be redundant to compare A and C, as A is much more likely to be more leftist than C. Instead, the algorithm would focus on introducing another MP, e.g., D, as the comparison between D and any other MP is thought to be more informative than comparing A and C. Again, if D was rated more leftist than B, it would be uninformative to compare it with C. Instead, the next pair would either compare D and A (as both are more leftist than B) or introduce a fifth MP. This focus on the most informative pairs renders the exploration of the comparison space more efficient. It reduces the number of comparisons required to obtain stable estimates and the direct costs of the survey, as shown in Appendix B.
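The selection logic described above can be sketched in a few lines of Python. This is an illustrative sketch only, not the survey's actual implementation: the function names `implied_order` and `undecided_pairs` are hypothetical, ties and "unknown" answers are ignored, and the real algorithm may rank candidate pairs rather than merely list them.

```python
from itertools import combinations


def implied_order(answers):
    """Return all (left, right) pairs implied by transitivity.

    `answers` is an iterable of (a, b) tuples meaning
    "a is more leftist than b" (one respondent's answers so far).
    """
    reach = {}
    for a, b in answers:
        reach.setdefault(a, set()).add(b)
        reach.setdefault(b, set())
    # Naive transitive closure: propagate until nothing changes.
    changed = True
    while changed:
        changed = False
        for a in reach:
            new = set()
            for b in reach[a]:
                new |= reach[b]
            if not new <= reach[a]:
                reach[a] |= new
                changed = True
    return {(a, b) for a, targets in reach.items() for b in targets}


def undecided_pairs(mps, answers):
    """List the pairs whose order is not yet implied by earlier answers."""
    decided = implied_order(answers)
    return [
        (i, j)
        for i, j in combinations(mps, 2)
        if (i, j) not in decided and (j, i) not in decided
    ]


mps = ["A", "B", "C", "D"]
answers = [("A", "B"), ("B", "C")]     # A left of B, B left of C
print(undecided_pairs(mps, answers))   # only pairs involving D remain
answers.append(("D", "B"))             # D is rated left of B
print(undecided_pairs(mps, answers))   # A-D remains; D-C is now implied
```

In this toy run, the A and C comparison is never proposed, and once D is placed to the left of B the only open pair among the four MPs is A and D, which mirrors the example in the text.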
Our sampling algorithm focuses on pairs that would stay “undefined” if total transitivity were assumed. Most crucially, it only affects the exploration of the comparison space but does not affect the actual estimation, for which no transitivity needs to be assumed. If a respondent were to mistakenly rate a pair, it would only affect the efficiency of the exploration but not the accuracy of the pairs rated after the mistake.Footnote 10 There are reasons to believe that such mistakes constitute a trivial threat. First, final ideological scores rely on all respondents’ comparisons. Idiosyncratic mistakes committed by one respondent are corrected by other respondents. Second, both the probability of such mistakes and their negative influence on efficiency decrease as the ideological distance between the two MPs increases. If such mistakes happen, they are likely to have limited consequences. Finally, the survey is initiated with prominentFootnote 11 MPs to help the respondent get familiar with the task.
To safeguard our procedure, we use two tests to estimate the potential impact of mistaken ratings. First, we estimate the agreement between coders who rated the same pairs. To estimate the inter-coder reliability (ICR), we compute Krippendorff's alpha and obtain an ICR of 0.77 (1319 comparisons with 318 unique pairs), which suggests substantial agreement between the respondents even when accounting for chance agreement. Second, we used a jackknife sampling scheme and re-estimated the model after removing all ratings from one respondent at a time. The results remain extremely stable.Footnote 12 Unless the same mistake was committed by several respondents, we are confident in ruling out the hypothesis that mistakes have been amplified by our sampling method.
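To make the agreement statistic concrete, the following sketch computes Krippendorff's alpha for nominal labels in Python. It assumes that the three possible answers are treated as unordered categories (the text does not state which distance metric was used); the function name and the toy data are hypothetical.

```python
from collections import Counter
from itertools import permutations


def krippendorff_alpha_nominal(units):
    """Krippendorff's alpha for nominal labels.

    `units` is a list of lists; each inner list holds the answers that
    different coders gave to the same pair of MPs. Units with fewer
    than two answers carry no agreement information and are skipped.
    """
    units = [u for u in units if len(u) >= 2]
    n = sum(len(u) for u in units)  # total number of pairable answers
    totals = Counter(label for u in units for label in u)

    # Observed disagreement: mismatching ordered answer pairs within
    # each unit, weighted by 1 / (m_u - 1).
    observed = 0.0
    for u in units:
        mismatches = sum(1 for a, b in permutations(u, 2) if a != b)
        observed += mismatches / (len(u) - 1)
    observed /= n

    # Expected disagreement if answers were paired at random.
    expected = sum(
        totals[a] * totals[b] for a in totals for b in totals if a != b
    ) / (n * (n - 1))
    if expected == 0:
        return float("nan")  # only one category used: alpha is undefined
    return 1.0 - observed / expected


# Toy data: each inner list is one MP pair rated by several coders.
ratings = [
    ["first", "first"],
    ["second", "second"],
    ["similar", "similar", "similar"],
]
print(krippendorff_alpha_nominal(ratings))  # perfect agreement gives 1.0
```

Values near 1 indicate agreement well beyond chance, while values near 0 indicate agreement no better than chance.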
2.3 Estimating latent positions from pairwise comparisons
We use a Davidson model to organize the comparisons and estimate MPs’ ideological positions. Our model accounts for the nested structure of the data, with respondent-specific random effects and standard errors clustered at the level of the respondent. The model we used has three important characteristics that contrast with the traditional modeling of pairwise comparison: it incorporates ties in ratings, accounts for multiple comparisons by each rater, and estimates the model in a Bayesian framework.
Statistical models for pairwise comparison are widespread and appeared in psychology as early as the 1920s. A well-known variant with applications in political science is the Bradley–Terry modelFootnote 13 (Bradley and Terry, 1952; Loewen et al., 2012; Agresti, 2013; Benoit et al., 2019). It aims at providing an ordering of empirical units based on simple pairwise comparisons. In our empirical example, we compare pairs of politicians to identify their ideological positions within a legislative chamber.
Typical models for pairwise comparisons and their estimation are well-established (see Cattelan, 2012, for a review) and can, for our purpose, be summarized as follows. $Y_{sij}$ is a random variable containing the rating of the legislator pair $(i, j)_s$, comparing legislators $i$ and $j$, made by rater $s = 1, \ldots, S$. In the model, we denote by $\lambda = (\lambda_1, \ldots, \lambda_n)$ the vector of individual ideological positions for a set of $n$ MPs. Following conventions, $\lambda_i > \lambda_j$ means that $i$ is “more right-leaning” than $j$. Consequently, higher and positive scores mean right-leaning positions, while lower and negative scores mean left-leaning positions. For each pair of MPs $(i, j)$, there is a probability $\pi_{ij}$ that respondents rate $i$ as more right than $j$. This probability is linked to the ideological scores $\lambda_i$ and $\lambda_j$ with a logistic function:
$$\pi_{ij} = \frac{\exp(\lambda_i)}{\exp(\lambda_i) + \exp(\lambda_j)}, \qquad \text{i.e.,} \quad \mathrm{logit}(\pi_{ij}) = \lambda_i - \lambda_j.$$
Our first extension considers that legislators might be rated as “holding similar positions.” The classical Bradley–Terry model only allows for strict comparisons and forces observers to discard pairs of objects judged to be similar. We explicitly allow respondents to judge two legislators as similar and wish to incorporate this information in the model (about 17 percent of the pairs of MPs were rated as similar). Following Davidson (1970), we can incorporate these ties by adding a parameter $\nu \in \mathbb{R}$. Adding this information to the model, we obtain a parametrization in terms of (1) $\pi_{ij \mid i \neq j}$, i.e., the probability that respondents rate $i$ as more right than $j$ given that $i$ and $j$ are not rated as holding similar positions, and (2) $\pi_{i = j}$, i.e., the probability that respondents rate $i$ and $j$ as having similar positions.
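For reference, a common log-scale form of the Davidson (1970) model is given below. It is a sketch consistent with the definitions above, not necessarily the exact parametrization that was estimated; the unconditional probability $\pi_{i > j}$ is notation introduced here for illustration only.

$$\pi_{i > j} = \frac{e^{\lambda_i}}{e^{\lambda_i} + e^{\lambda_j} + e^{\nu + (\lambda_i + \lambda_j)/2}}, \qquad \pi_{i = j} = \frac{e^{\nu + (\lambda_i + \lambda_j)/2}}{e^{\lambda_i} + e^{\lambda_j} + e^{\nu + (\lambda_i + \lambda_j)/2}}$$

Under this form, conditioning on the absence of a tie recovers the Bradley–Terry probability:

$$\pi_{ij \mid i \neq j} = \frac{\pi_{i > j}}{1 - \pi_{i = j}} = \frac{e^{\lambda_i}}{e^{\lambda_i} + e^{\lambda_j}}$$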
Here, $\nu$ can be interpreted as the degree to which the probability of $i = j$ is affected by the relative difference in ideological scores of $i$ and $j$. Notably, when $\nu \rightarrow -\infty$, $i$ and $j$ never have the same position, but when $\nu \rightarrow +\infty$, $i$ and $j$ are systematically rated as holding similar positions.
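The role of ν can be checked numerically. The sketch below assumes the common log-scale Davidson form in which the tie term is exp(ν + (λ_i + λ_j)/2); the function name `davidson_probabilities` is hypothetical, and the numbers are toy values rather than estimates from the survey.

```python
import math


def davidson_probabilities(lam_i, lam_j, nu):
    """Return (P[i more right], P[j more right], P[similar]) for one pair.

    Assumed parametrization: each outcome has weight exp(lam_i),
    exp(lam_j), or exp(nu + (lam_i + lam_j) / 2) for a tie.
    """
    weight_i = math.exp(lam_i)
    weight_j = math.exp(lam_j)
    weight_tie = math.exp(nu + (lam_i + lam_j) / 2.0)
    total = weight_i + weight_j + weight_tie
    return weight_i / total, weight_j / total, weight_tie / total


# Toy positions: i is clearly to the right of j.
lam_i, lam_j = 1.0, -0.5
for nu in (-10.0, 0.0, 10.0):
    p_i, p_j, p_tie = davidson_probabilities(lam_i, lam_j, nu)
    # The tie probability rises from near 0 to near 1 as nu increases.
    print(f"nu={nu:+.0f}  P(i right)={p_i:.3f}  P(j right)={p_j:.3f}  P(tie)={p_tie:.3f}")
```

Whatever the value of ν, the ratio of the two strict outcomes stays fixed at exp(λ_i − λ_j) under this form, so ties absorb probability mass without changing the implied ordering.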
The second extension to a simple Bradley–Terry model acknowledges that our observations, i.e., ratings by each youth group leader, are not independent of each other. Each rater $s$ makes multiple comparisons. To account for multiple judgments, we use an extension of the Davidson model, proposed by Böckenholt (2001), and decompose the prediction into a fixed and a random component. The fixed effect component estimates each legislator's average (log) position, while the random component accounts for respondent-specific effects. Given a set of $S$ raters, $\lambda_{is} = \lambda_i + U_{is}$, where $U_{is}$ refers to the random effect on the ideological score of MP $i$ when rated by $s \in \{1, \ldots, S\}$. This extension can be incorporated in the parametrization above.
The third consideration pertains to the Bayesian estimation of the outlined Davidson model. Instead of detailing identification and estimation (Cattelan, 2012), we concentrate on two aspects. First, the full identification of the model requires the constraint $\sum_{i=1}^{n} \lambda_i = 1$. Second, different estimation methods have been proposed to approximate the resulting likelihood. Many raters provided comparisons involving a very large group of legislators, and it is important to reflect the nested structure of the data. Although the sample is large in absolute terms, it is small relative to the number of estimated parameters, which justifies the use of a Bayesian approach. For our implementation, we used weakly informative priors for $\lambda_i$, $\nu$, and $U_{is}$ (normally distributed, centered around 0 and with variance 3.0). Our model is estimated in R using bpcs (Mattos and Ramos, 2021), which uses Stan and its No-U-Turn (NUTS) Hamiltonian Monte Carlo sampler (Hoffman and Gelman, 2014) for estimating the parameters $\lambda$, $\nu$, and $U$. We present these estimates and the accompanying credible intervals visually.
3. Individual positions in the 19th Bundestag in Germany
We illustrate this new research design using data from the 19th Bundestag in Germany. With 709 members across six different parties, the German parliament is a challenging environment for measuring MPs’ positions. The ideological space populated by German parties is reasonably narrow, especially among governing parties, and essentially structured along a single left-right dimension. A mixed-member electoral system and a strong second chamber set incentives for German political actors to cultivate a personal vote and pursue consensual positions.
For the survey, we recruited members of executive committees among six German youth party organizations: Junge Alternative für Deutschland (AfD), Junge Union (CDU), Junge Liberale (FDP), Jusos (SPD), Grüne Jugend (Bündnis 90/Die Grünen), and Linksjugend Solid (Die Linke). We emailed each member of these executives, asking them to contact us if they were interested in taking the survey, and selected participants on a first-come, first-served basis. All participants were asked to complete 500 comparisons, and those who did so were rewarded with €75. Considering our budget constraint, we could afford up to five participants for each organization. The survey was taken by 24 participants between March 30, 2020, and June 15, 2020, for a total cost of €1,800.
The survey produced 11,453 comparisons for the 709 members of the Bundestag. The most prominent MP, Angela Merkel, was compared 1353 times, and the average representative was compared 32 times. Using these data, we fit a Davidson model and present the estimation results visually.
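The reported survey figures can be cross-checked with a few lines of arithmetic. The sketch below assumes that every comparison involves exactly two distinct MPs; the pair coverage it computes is an upper bound, because the same pair may have been shown to more than one rater. The variable names are illustrative.

```python
from math import comb

n_mps = 709            # members of the 19th Bundestag
n_comparisons = 11_453 # comparisons collected in the survey
n_raters = 24          # participants who took the survey
reward_eur = 75        # reward per participant, in euros

# Number of distinct pairs that could in principle be compared.
possible_pairs = comb(n_mps, 2)

# Each comparison involves two MPs, so appearances = 2 * comparisons.
mean_appearances = 2 * n_comparisons / n_mps

# Upper bound on the share of all possible pairs that were rated.
max_pair_coverage = n_comparisons / possible_pairs

total_cost_eur = n_raters * reward_eur

print(possible_pairs, round(mean_appearances, 1), total_cost_eur)
print(f"{max_pair_coverage:.1%}")
```

Under these assumptions, the survey touches fewer than 5 percent of all possible pairs, which is why the choice of which pairs to present matters for the precision of the estimates.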
As a first step, Figure 2 presents the ideological distribution of the MPs’ point estimates for each party. When aggregated at the partisan level, the ideological positions of the six parties correspond to their well-known and commonly accepted positions. Within each party, centrists are more common than extremists, as the close-to-normal distribution of MPs in each party attests. Going from the left to the right, we observe the Left (Die Linke), the Greens (Bündnis 90/Die Grünen), the SPD, the FDP, the CDU/CSU, and finally the AfD. As one would expect, the AfD is more distant from the CDU/CSU than the FDP is. MPs from the CDU/CSU and FDP are, in aggregate, ideologically very similar to each other. On the left, the Greens have an average ideological score very similar to that of the SPD, but have a less pronounced right tail. Most members of the Left are more leftist than the median members of the SPD and Greens.
Taking a closer look at the ideological estimate for each legislator, Figure 3 presents these point estimates (symbolized by a dot) and their uncertainties (as a shade) for each representative. We group these estimates by party and move from ideologically left to right. Based on the rankings by our national experts, the most left-leaning and most right-leaning members of the 19th Bundestag are Tobias Pflüger (Die Linke) and Frank Magnitz (AfD), respectively. The graphic labels some of the most prominent members of each party. For the Left, Jan Korte appears to be close to the median member of his party (Harald Weinberg). Some prominent legislators circumscribe the ideological range of the Green party: Claudia Roth on the left and Cem Özdemir on the right, while the median member is Lisa Paus. Karl Lauterbach, a professor of health economics and an illustrious voice during the coronavirus crisis, is slightly to the left of the median (Lothar Binding) within the Social Democratic Party. The FDP is estimated to have a relatively wide ideological range, and its party leader, Christian Lindner, is placed slightly to the right of the party's median MP (Marco Buschmann). Angela Merkel, the chancellor at the time, is among the most centrist legislators in the 19th Bundestag. This position places her among the more leftist members of her party, the CDU (median member is Erwin Rüddel). Philipp Amthor, who expressed a strong disagreement with his own party's progressive immigration policy, is an example of his party's right wing. According to our estimates, Alexander Gauland is one of the most right-leaning parliamentarians of the AfD (median member is Jürgen Pohl). All in all, this illustration provides a realistic and detailed picture of the ideological composition of the German Bundestag.
3.1 Establishing the estimates’ validity
The illustration above provides basic face validity to our estimates. Political parties and some well-known legislators are placed appropriately on the ideological scale. Two additional benchmarks add further validity: our estimates behave as expected when compared with (1) MPs’ membership in ideologically distinct party wings and (2) MPs’ own ideological placement.
German parties organize around wings and factions (for a review, see Sältzer, Reference Sältzer2020). Because these wings hold different and homogeneous ideological positions, we can verify our estimates by matching members of party wings to our estimates. For example, Jürgen Trittin, Claudia Roth, and Anton Hofreiter are known to belong to the “fundamentalist” (FUNDI) faction of the Greens, which can be distinguished from the “realist” (REALO) faction of members like Franziska Brantner or Cem Özdemir. Likewise, for the SPD, ideological differences prevail among party wings. The co-party leader, Saskia Esken, was a vocal critic of the decision to enter a coalition with the CDU and is recognized as more left-leaning than the rest of her party. By contrast, representatives from the “Seeheimer Kreis” like Heiko Maas, Thomas Oppermann, or Johannes Kahrs belong to the right segment of their party. Given these differences, we expect the CSU (Bavarian conservative), the Seeheimer Circle (economically liberal democrats), and the Realists (economically liberal greens) to be to the right of their respective parties.
While no official listing of faction membership exists, Sältzer (2020) compiled data on MPs’ memberships. Mapping faction membership to our ideology scores, we can assess whether our estimates accurately placed factions beyond the few prominent individuals mentioned above for two parties in the Bundestag. These comparisons cover 442 legislators. Figure 4 groups legislators of the SPD and the Greens according to their faction. As expected, REALOs are further to the right than, and distinguishable from, the FUNDIs among the Greens. For the SPD, legislators belonging to the “Seeheimer Kreis” are placed to the right of the rest of the SPD. In a similar vein, we can affirm that conservative members from Bavaria (CSU) are, on average, more right-leaning than other conservatives (CDU). We, therefore, recover well-known factional differences within parties.
In the absence of a gold standard capturing the individual position of representatives, an alternative external measure for comparison and validation comes from non-behavioral data. The Comparative Candidate Survey (CCS) is, to our knowledge, the only possible measure here. Before a legislative election, the CCS asks candidates to identify their ideological position on an 11-point scale. For the 2017 German election, 803 candidates took the survey, of whom 182 were elected. We use these data to investigate the convergent validity of our estimates. As seen in Figure 5, our expert-based estimates correlate highly with the self-placement from the CCS (r = .85). There seems to be a slight mismatch among MPs of the AfD who place themselves as centrists but are estimated to be among the far-right legislators. Overall, these figures and statistics deliver face validity for individual estimates and partisan aggregates, as well as convergent validity when compared to self-placement.
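The convergent-validity check amounts to a Pearson correlation between two measures for the same MPs. The following minimal sketch shows the computation; the numbers are made up for illustration and are not the survey or CCS data, and the function name `pearson_r` is hypothetical.

```python
import math

def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return sxy / math.sqrt(sxx * syy)

# Hypothetical values: expert-based scores on the latent scale and
# self-placements on an 11-point scale for the same six MPs.
expert_scores = [-4.1, -2.0, -0.5, 0.8, 2.3, 5.0]
self_placement = [1, 3, 4, 6, 7, 6]

r = pearson_r(expert_scores, self_placement)
print(f"r = {r:.2f}")
```

Because the correlation is invariant to linear rescaling, the two measures do not need to share a scale; only MPs present in both sources can enter the comparison.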
3.2 No respondent bias
Finally, we investigate the possibility of a systematic respondent bias by our raters. For each respondent-MP combination (s, i), the fitted model estimated a random effect $U_{is}$, capturing the systematic bias of a given respondent s against a given MP i. A bias of 1 of respondent s against MP i is easy to interpret: respondent s perceives MP i as more right-leaning than the rest of the respondents do.
Respondents are, most of the time, not systematically biased against specific MPs. Figure 6 illustrates this assertion and shows that the respondent bias is, for most respondents, concentrated around zero. More precisely, only 2.5 percent of all posterior distributions did not include 0 in their 95 percent credible interval, and 95 percent of the estimated respondent-specific random effects fall within the interval [−0.99, 0.92], which is narrow considering the width of the overall scale [−10, 10].
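The first of these summaries can be computed directly from posterior draws. The sketch below is a minimal illustration, assuming the draws for each respondent-MP effect are available as a plain list; it uses equal-tailed intervals with linear interpolation, which may differ from the interval type used for the reported figures. The function names and toy draws are hypothetical.

```python
import math

def quantile(sorted_xs, q):
    """Quantile of an already sorted list, with linear interpolation."""
    pos = q * (len(sorted_xs) - 1)
    lo = int(math.floor(pos))
    hi = min(lo + 1, len(sorted_xs) - 1)
    frac = pos - lo
    return sorted_xs[lo] * (1.0 - frac) + sorted_xs[hi] * frac

def credible_interval(draws, level=0.95):
    """Equal-tailed credible interval from posterior draws."""
    xs = sorted(draws)
    tail = (1.0 - level) / 2.0
    return quantile(xs, tail), quantile(xs, 1.0 - tail)

def share_excluding_zero(draws_by_pair, level=0.95):
    """Share of respondent-MP effects whose interval does not contain 0."""
    flagged = 0
    for draws in draws_by_pair.values():
        lo, hi = credible_interval(draws, level)
        if lo > 0 or hi < 0:
            flagged += 1
    return flagged / len(draws_by_pair)

# Toy draws for two (respondent, MP) pairs: one centered on zero,
# one clearly positive.
toy_draws = {
    ("rater_1", "mp_a"): [x / 10 for x in range(-50, 51)],
    ("rater_1", "mp_b"): [1 + x / 100 for x in range(101)],
}
print(share_excluding_zero(toy_draws))
```

A small share of intervals excluding zero is expected even without any bias, so the quantity is best read alongside the spread of the estimated effects.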
4. Conclusion
Measuring individual positions of political actors is a fundamental task for political science. Especially in settings with high partisan discipline like the German parliament (Sieberer et al., Reference Sieberer, Saalfeld, Ohmura, Bergmann and Bailer2020), obtaining ideological placements of legislators is challenging. In this paper, we introduce a simple research design for measuring the ideological positions of legislators. The measurement strategy comprises four components. First, we recruit members of the executive committee of the youth wings of parties. These experts are easy to reach and familiar with both MPs in leadership positions and backbenchers. Second, we ask these respondents to compare the ideological positions of many pairs of legislators. These relative comparisons are quick and reliable, and they avoid differences in interpretations of numerical scales among respondents. Third, we leverage a sampling algorithm that identifies the most informative zones of the huge comparison space and increases the efficiency of the survey without compromising its precision. Fourth, we estimate a Davidson model based on all pairwise comparisons.
Naturally, the resulting estimates can be used substantively to improve our understanding of parliamentary processes. Additionally, our study enables new validation strategies for behavioral measures. For instance, our estimates can be used to understand what factors influence the validity of speech-based estimates so that the latter can be systematically deployed where appropriate. Our survey-based measure thus contributes to making behavioral measures of ideology more robust.
One important application relates to the measurement of ideology in the past. Although the proposed method cannot be applied directly to non-contemporaneous settings, it can help us calibrate behavioral measures and improve our ability to study ideology in past contexts. Behavioral measures such as roll-call votes or text-scaling are now established and accepted, but the scope of their validity remains unclear. Using measures based on pairwise comparison can help us understand the conditions driving the validity of behavioral measures and eventually help us adequately use these measures to go back in time.
In addition to the methodological aims, our article offers a substantive contribution by identifying the ideological positions of individual legislators in the 19th German Bundestag. The German parliament is a large national assembly with strong partisan discipline. Employing our design, we estimate the ideological position of its 709 members. These estimates coincide with the common perceptions of prominent MPs and with ideological demarcations within party wings. We show that our estimates are valid and insensitive to potential biases among surveyed respondents. Overall, the proposed design is easy to implement and delivers accurate estimates and associated measures of uncertainty for legislators’ ideological stance.
Our method is applicable in any political system. Pairwise comparisons are a simple and robust psychometric tool for scaling preferences. As this approach becomes more popular, there is a growing need for an in-depth methodological understanding of this tool. The sampling strategy presented here is one step in this direction, but the best practices are far from being settled. For instance, there are no guidelines on how to appropriately pick a well-suited number of respondents or the number of pairs that each respondent should answer. These two quantities depend on both the size of the target (i.e., how many MPs are being compared) and the signal-to-noise ratio (i.e., how knowledgeable the respondents are). Future work should answer these questions to help scholars adequately use pairwise comparisons.
Furthermore, one big challenge lies in the identification and recruitment of experts. Young partisan leaders might not always be the ideal choice. In other countries, parliamentary journalists, parliamentary staffers, or even political actors themselves might provide informative comparisons. A common strategy is to contact country experts and academics directly, but several other options exist. For the American case, Hopkins and Noel (Reference Hopkins and Noel2022) used a polling company to recruit political activists and screen them before the survey. The central concern in recruitment is identifying highly knowledgeable political observers who are willing to participate in a long survey.
We also believe that the design is flexible and can be extended easily. Two avenues seem particularly worthwhile to explore. First, one can expand the substantive scope of inquiry. In our application, we focus on a single left-right ideological dimension. Instead, a survey might ask a different set of questions altogether or let respondents decide whether they would like to rate legislators on more than one dimension (e.g., post-materialist values). Second, a common problem with spatial estimates is that they are based on a latent scale, making comparisons across political units and time difficult. Our design offers a simple solution to this problem. One might use a common anchor, such as a head of government, or even a fictional anchoring vignette to project individuals from different units, such as different branches of government, jurisdictions, or even countries, onto a common scale.
Data
Replication code for this article is available on GitHub. Replication material for this article can be found at https://doi.org/10.7910/DVN/0ULECU.
Supplementary material
The supplementary material for this article can be found at https://doi.org/10.1017/psrm.2024.68.
Acknowledgements
We are deeply grateful to Michael Herrmann, Marius Sältzer, Susumu Shikano, Resul Umit, Jens Wäckerle, and reviewers for their insightful comments and valuable feedback on this manuscript. We extend our special thanks to Simon Roth for his involvement in this project. Earlier versions of this project were presented at the American Political Science Association conference in 2020, and we are thankful for the constructive discussions and suggestions received there.
The project is funded by the Deutsche Forschungsgemeinschaft (DFG—German Research Foundation) under Germany's Excellence Strategy—EXC-2035/1 – 390681379.
Competing interests
None.