1 Introduction
Spatial models of voting and electoral competition (Downs 1957; Davis, Hinich, and Ordeshook 1970; Enelow and Hinich 1990) have become tremendously important and powerful tools for understanding the interplay between voter preferences and spatial party strategies. In the search for a better understanding of how citizens form their preferences to arrive at an electoral decision and how parties or candidates [Footnote 1] strategically adopt policy platforms in response to the electorate, a huge literature has emerged (see Dewan and Shepsle 2011; Jackson 2014, for reviews). The still growing ‘neo-Downsian agenda’ (Grofman 2004, 40) spans many different subfields, addressing questions such as: Are voters solely driven by policy preferences, or do they have nonpolicy motivations? What role do nonpolicy characteristics of parties, considered as valence, play, and how do these dimensions of competition interact? And how do all these factors affect optimal party locations? The discrete choice framework seems to provide a promising modeling approach for translating the theory into empirical models that allow addressing these and many more questions in real-world elections.
Based on random utility maximization, discrete choice models rely on an underlying microfoundation that can be directly linked to the logic of probabilistic voting (e.g., Enelow and Hinich 1989; Coughlin 1992; Adams 1999; Lin, Enelow, and Dorussen 1999; Thurner 2000). Discrete choice models based on a logit or probit link function have become well-established tools in the rational choice modeling of voting behavior and party competition. In particular, the use of conditional logit models (McFadden 1974) has become common practice. [Footnote 2] Although they differ in the operational model they employ and in how the underlying latent policy dimensions are constructed, empirical applications share one common approach: they rely on the model parameters to understand both voter choice and party policy strategies. Based on survey data, they first estimate voter choice models. The estimated parameters are then used to calculate equilibrium constellations and to explore their nature, conditions, and characteristics. Work in this genre has produced comprehensive findings on equilibria in spatial competition (e.g., Adams and Merrill III 2000, 2009; Merrill III and Adams 2001; Adams et al. 2013).
While in the traditional spatial framework voter choice is solely a function of spatial proximity on a single dimension, neo-Downsian models are multidimensional and incorporate both spatial proximity and voters’ nonpolicy motivations. The latter are typically represented by sociodemographic voter attributes to account for differences in voting behavior among the electorate. Several probabilistic voting models also include the parties’ nonspatial qualities, considered as party valences, in voter utility functions. One prominent and influential line of such probabilistic models, Schofield’s valence model (e.g., Schofield 2003, 2004; Schofield and Sened 2005a,b, 2006; Schofield and Miller 2007), relies on the estimated party intercepts in discrete choice models to measure the valence qualities and advantages of parties empirically. These party-specific constants are assumed to capture the average weight voters attach to the parties’ valence qualities. The respective estimates are ordered based on their absolute values [Footnote 3] to assess how quality differences affect the nature of spatial competition.
I show here that using this valence ranking, the key factor in Schofield’s valence model, to calculate and analyze equilibrium strategies does not allow us to draw generalizable conclusions about the impact of valence on parties’ spatial strategies. Under the discrete choice framework, only differences in utility matter and the absolute values are irrelevant. Due to this central property and the specific nature of party intercepts within this framework, the valence ranking is highly sensitive to the coding of voter attributes. Depending on arbitrary coding decisions, the valence ranking differs, as changes in the coding lead to changes in absolute values. As a consequence, it is impossible to represent valence with the constants and to learn something substantive from the resulting valence ranking. I recommend that researchers not rely on party intercepts, which reflect the relative average effect of unmeasured sources of utility (which might or might not be nonspatial party qualities), but instead make the valence qualities measurable and incorporate them into voter utility functions as observable components.
2 Neo-Downsian Spatial Voting Models
Neo-Downsian models usually operate in a probabilistic framework where both observed and unobserved factors determine voter utility. The random component is assumed to be independent of spatial factors and is formulated in various ways in the literature (see, e.g., Enelow and Hinich 1989; Coughlin 1992; Burden 1997; Adams 1999; Lin, Enelow, and Dorussen 1999). Another central aspect and major model extension relates to the importance of voters’ nonpolicy motivations in addition to policy in the voting calculus (e.g., Adams and Merrill III 1999a,b, 2000; Thurner 2000; Merrill III and Adams 2001; Adams, Merrill III, and Grofman 2005). By accounting for nonspatial motivations, a behavioral perspective was incorporated into the spatial voting model in which psychological or sociological factors also influence voter choice. The nonpolicy considerations enter empirical models through measured variables that represent, for example, the voters’ socioeconomic characteristics, class, or religious ties. The incorporation of such voter attributes allows accounting for segment-specific evaluations of parties.
Several probabilistic voting models posit that voters also evaluate and value the nonspatial qualities of parties. In these models, party competition is characterized by an additional dimension, namely valence. The concept of valence, going back to Stokes (1963, 1992) and his fundamental critique of the Downsian framework, has received much scholarly attention in both formal theoretical and empirical work. In the electoral research literature, valence is understood in very diverse ways, resulting in many different debates and subfields with varied definitions, subcategories, and empirical measurement or operational strategies (for an overview, see Green and Jennings 2017, and references therein). Within the spatial voting literature, valence is very broadly defined and generally conceptualized as a nonpolicy-related characteristic of a party (or a party leader) that distinguishes parties from one another. The literature proposes several spatial models that connect the two dimensions of competition to study their interplay, yielding a whole class of models in which a valence term enters voters’ utility functions. The formal work has produced many theoretical results regarding equilibrium constellations and properties of spatial competition (e.g., Enelow and Hinich 1982; Ansolabehere and Snyder Jr. 2000; Groseclose 2001; Aragonès and Palfrey 2002; Ashworth and de Mesquita 2009; Serra 2010). This prominent stream of research demonstrates that party quality differences affect the nature of political competition (for a recent review of formal models, see Evrenk 2019).
The ways in which valence is quantified and modeled are as diverse as the understanding of the concept in the different theoretical debates. Empirical probabilistic voting models typically assume the party valence component to be unrelated to spatial factors. The main difference in how valence is operationalized in empirical work is whether the party quality term enters the systematic or the unobserved component of voter utility functions when discrete choice models are used to arrive at an operational model. Some studies model it directly by observed variables using survey or expert data designed to capture valence qualities, so that the valence term enters utility functions as measured components (Stone, Maisel, and Maestas 2004; Stone and Simas 2010; Adams et al. 2011; Buttice and Stone 2012; Franchino and Zucchini 2015). Other works consider valence as a dichotomous concept consisting of measured and unmeasured attributes (Adams, Merrill III, and Grofman 2005; Adams and Merrill III 2009; Adams et al. 2013).
One prominent line of probabilistic models, which became an influential strategy for incorporating valence considerations into empirical models, is Schofield’s spatial valence model of politics. The model and extensions thereof were applied to many different countries, such as the United States (e.g., Schofield, Miller, and Martin 2003; Schofield and Miller 2007; Gallego and Schofield 2016), Britain (e.g., Schofield 2004, 2005; Schofield, Gallego, and Jeon 2011), Germany (e.g., Kurella and Pappi 2015; Schofield and Kurella 2015), Israel (e.g., Schofield 2004; Schofield and Sened 2005b), or Russia (Schofield and Zakharov 2010). To measure the valence advantages empirically, the model exclusively relies on unobserved information by using the party intercepts in discrete choice models. The estimated party-specific constants are assumed to reflect “the average perception, among the electorate, of the ‘quality’ of the party leader” (e.g., Schofield 2005, 348). These valence characteristics are defined as a nonpolicy-related term $\lambda_{ij}$ that is “interpreted as the weight that voter $i$ gives to the competence of candidate $j$” (e.g., Schofield 2004, 448). [Footnote 4] The valence terms are ranked based on the absolute constant estimates so that $\lambda_{p}\geqslant \lambda_{p-1}\geqslant \cdots \geqslant \lambda_{1}$, where $p$ denotes the number of parties and 1 the party with the lowest valence.
The average valence of the parties other than the lowest ranked is given by $\lambda_{av\,(1)}=[1/(p-1)]\sum_{j=2}^{p}\lambda_{j}$, and $\Lambda=\lambda_{av\,(1)}-\lambda_{1}$ defines the valence difference for the lowest valence party. The valence ranking is then the crucial factor to investigate how valence differences influence parties’ spatial strategies. Whereas the formal model does not account for sociodemographic voter characteristics (e.g., Schofield and Sened 2006, 33), which are “regarded as sociodemographic valences, generated by common perceptions of the parties by different societal subgroups” (Schofield and Zakharov 2010, 179), empirical models incorporate them to avoid exaggerating the valence terms.
However, empirical applications of the formal model do not yield unique results in fully specified voter models (i.e., when both voters’ spatial and nonpolicy motivations are incorporated into voter utility functions) due to central properties of discrete choice models and the specific role party intercepts play in these models. The next section summarizes these fundamental properties to identify the problem.
3 Spatial Voting Models within the Discrete Choice Framework
Within the spatial voting framework, we assume voter $i\in \{1,\ldots ,n\}$ faces a choice among $J$ parties and each party $j\in \{1,\ldots ,J\}$ provides a specific amount of utility, denoted by $U_{ij}$. Following the principle of utility maximization, voter $i$ chooses party $j$ if and only if $U_{ij}>U_{ir},\forall r\neq j$. Probabilistic voting models incorporate a random element in the voter’s calculus so that both observed and unobserved factors determine voter utility. This rationale is in accord with discrete choice theory and can be related to the concept of random utility maximization (Manski 1977), which is the most widely used derivation of discrete choice models (see, e.g., Louviere, Hensher, and Swait 2009; Train 2009; Hensher, Rose, and Greene 2015). Under this approach, a probabilistic choice mechanism is introduced so that the true utilities of the alternatives are random variables. The randomness is caused by the researcher’s lack of full information about the decision process, i.e., by the nonobservability of all relevant decision criteria due to unobserved characteristics of the alternatives and individuals, and to measurement or specification errors.
The limitation on the side of the researcher is captured by dividing the overall utility $U_{ij}$ into two parts so that $U_{ij}=V_{ij}+\epsilon_{ij}$. The first part represents the utility contributions that are measured, and the second one contains the utility sources we cannot observe. Relating this perspective to voter choice and after rearranging, the probability that voter $i$ chooses party $j$ equals the probability that the difference in the unobserved factors induced by any other party $r$ compared to party $j$ is below the difference in the observed factors of these two parties: $Pr(\epsilon_{ir}-\epsilon_{ij}<V_{ij}-V_{ir}),\forall r\neq j$.
As the choice probabilities are only a function of differences, the utilities derived from discrete choice models are based on ordinal utility theory. Under this framework, utilities express the preference ordering of parties, and the absolute values are irrelevant. Ordinal utilities, therefore, indicate relative utility differences which are maintained by order-preserving changes in the absolute values. The fact that the choice probabilities only depend on utility differences has major implications for model specification and identification, parametrization, and parameter interpretation. Before we discuss these issues in more detail, let us consider the components that typically enter voter utility functions in neo-Downsian spatial models.
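The irrelevance of absolute utility levels can be checked numerically. The following is a minimal sketch, assuming a logit link and three hypothetical systematic utilities (the values and the helper name `choice_probabilities` are illustrative and not taken from the data): adding the same constant to every party's utility leaves all choice probabilities unchanged.

```python
import math

def choice_probabilities(utilities):
    """Logit choice probabilities for a list of systematic utilities V_j."""
    m = max(utilities)  # subtract the maximum for numerical stability
    weights = [math.exp(v - m) for v in utilities]
    total = sum(weights)
    return [w / total for w in weights]

# Hypothetical systematic utilities for three parties (illustrative only).
v = [0.4, -0.3, 1.1]

# Add the same constant to every party: utility differences are untouched.
shifted = [x + 5.0 for x in v]

p_original = choice_probabilities(v)
p_shifted = choice_probabilities(shifted)

for a, b in zip(p_original, p_shifted):
    print(f"{a:.6f}  {b:.6f}")
```

Both columns agree up to floating-point error, which is why the level of utility has to be fixed by a normalization before parameters tied to party-invariant variables can be estimated.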
I specify a simple model that is multidimensional and incorporates both a policy component and nonspatial factors into the choice rule. The spatial term contains voter-party proximity measures. For simplicity, I assume a city-block voter policy metric based on individual perceptions of party positions. Let $z_{ijk},k\in \{1,\ldots ,K\}$, represent the proximity between the voters’ self-placements $x_{ik}$ and the voter-specific party placements $p_{ijk}$ on each issue $k$ so that $z_{ijk}=-|x_{ik}-p_{ijk}|$. [Footnote 5] Sociodemographic variables typically represent voters’ nonpolicy motivations. Let $s_{im},m\in \{1,\ldots ,M\}$, contain these voter characteristics. For $i\in \{1,\ldots ,n\}$ and $j\in \{1,\ldots ,J\}$, define $V_{ij}$ as linear predictors that accumulate the voter utility functions in scalar quantities:
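A linear predictor consistent with the components just defined and with the parameter definitions that follow can be written as below. This is a reconstruction from the surrounding notation; the equation number is assumed from the later references to Equation (1).

$$V_{ij}=\beta_{j0}+\mathbf{z}_{ij}^{T}\boldsymbol{\alpha}+\mathbf{s}_{i}^{T}\boldsymbol{\beta}_{j}=\beta_{j0}+\sum_{k=1}^{K}\alpha_{k}z_{ijk}+\sum_{m=1}^{M}\beta_{jm}s_{im}.\qquad (1)$$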
The parameters $\beta_{10},\ldots ,\beta_{J0}$ are intercepts specific to each party $j$. [Footnote 6] These constants play a central role in the utility expression of discrete choice models. They capture the average effect of unobserved sources of utility associated with each party $j$. [Footnote 7] Thus, the constant terms represent the average role of all factors that are central to the electoral choice but are not measurable, i.e., cannot be related to observable party or voter attributes. $\boldsymbol{\alpha}^{T}=(\alpha_{1},\ldots ,\alpha_{K})$ is the parameter vector related to voter-party proximity measures $\mathbf{z}_{ij}$ on $K$ different policy dimensions. These coefficients indicate the weight voters attach to the policy issues. [Footnote 8] $\boldsymbol{\beta}_{j}^{T}=(\beta_{j1},\ldots ,\beta_{jM})$ is the coefficient vector related to the $M$-dimensional individual-specific covariate vector $\mathbf{s}_{i}$. These parameters are specific to each party $j$ and indicate segment-specific evaluations of parties. To arrive at the choice probabilities of the conditional logit model (McFadden 1974), the workhorse in empirical applications, we assume a logit link function that connects the $V_{ij}$ with the choice probabilities, yielding
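In the notation used here, the standard conditional logit choice probabilities take the following form. This is a reconstruction; the conditioning set is written out as an assumption, and the equation number is taken from the later references to Equation (2).

$$\pi_{ij}=P(Y=j\mid \mathbf{z}_{i1},\ldots ,\mathbf{z}_{iJ},\mathbf{s}_{i})=\frac{\exp (V_{ij})}{\sum_{r=1}^{J}\exp (V_{ir})}=\frac{\exp (\beta_{j0}+\mathbf{z}_{ij}^{T}\boldsymbol{\alpha}+\mathbf{s}_{i}^{T}\boldsymbol{\beta}_{j})}{\sum_{r=1}^{J}\exp (\beta_{r0}+\mathbf{z}_{ir}^{T}\boldsymbol{\alpha}+\mathbf{s}_{i}^{T}\boldsymbol{\beta}_{r})},\qquad (2)$$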
where $\pi_{ij}$ denotes the $J$-dimensional vector of choice probabilities and $Y\in \{1,\ldots ,J\}$ the $J$-categorical, probabilistic response variable.
Since only differences in utility are relevant, and the absolute utility values are meaningless in discrete choice models, particular parameters in Equation (2) need special consideration. Here it is important to note that two different types of predictors represent the spatial and nonspatial components of the voter utility functions: variables that are specific to parties and voters (voter-party proximity measures $\mathbf{z}_{ij}$), and variables that are specific to voters only (voter attributes $\mathbf{s}_{i}$). This distinction is fundamental as only those parameters can be estimated freely that relate to variables which capture differences across parties. As the voter attributes $\mathbf{s}_{i}$ are invariant across parties, the level of utility has to be set to an arbitrary value for at least one pair of the parameters $\boldsymbol{\beta}_{j}^{T}=(\beta_{j1},\ldots ,\beta_{jM})$ to make the remaining ones estimable. The same applies to the constants $\beta_{10},\ldots ,\beta_{J0}$. As a result, Equation (2) refers to the conditional logit model in its generic form where at most $J-1$ intercepts and $(J-1)\times M$ parameters for the voter attributes are identifiable. The standard side constraint to identify the model is to select one party $l$ whose constant and coefficients related to voter attributes are normalized by setting $\beta_{l0}=0$ and $\boldsymbol{\beta}_{l}^{T}=(0,\ldots ,0)$. The choice probabilities for $j\in \{1,\ldots ,J-1\}$ and for the reference party $l$ are given by
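Imposing the side constraint on the generic conditional logit form gives the following expressions. They are a reconstruction from the stated normalization, with the equation number inferred from the neighboring references.

$$\pi_{ij}=\frac{\exp (\beta_{j0}+\mathbf{z}_{ij}^{T}\boldsymbol{\alpha}+\mathbf{s}_{i}^{T}\boldsymbol{\beta}_{j})}{\exp (\mathbf{z}_{il}^{T}\boldsymbol{\alpha})+\sum_{r\neq l}\exp (\beta_{r0}+\mathbf{z}_{ir}^{T}\boldsymbol{\alpha}+\mathbf{s}_{i}^{T}\boldsymbol{\beta}_{r})},\qquad \pi_{il}=\frac{\exp (\mathbf{z}_{il}^{T}\boldsymbol{\alpha})}{\exp (\mathbf{z}_{il}^{T}\boldsymbol{\alpha})+\sum_{r\neq l}\exp (\beta_{r0}+\mathbf{z}_{ir}^{T}\boldsymbol{\alpha}+\mathbf{s}_{i}^{T}\boldsymbol{\beta}_{r})}.\qquad (3)$$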
Since $\beta_{l0}=0$ and $\boldsymbol{\beta}_{l}^{T}=(0,\ldots ,0)$, the log odds for $j\in \{1,\ldots ,J-1\}$, giving the utility differences associated with party $j$ as compared to the reference party $l$, simplify to
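Taking the ratio of the two normalized choice probabilities cancels the common denominator, so the log odds reduce to a difference of linear predictors. The display below is a reconstruction under that normalization, numbered to match the later reference to Equation (4).

$$\log \left(\frac{\pi_{ij}}{\pi_{il}}\right)=V_{ij}-V_{il}=\beta_{j0}+(\mathbf{z}_{ij}-\mathbf{z}_{il})^{T}\boldsymbol{\alpha}+\mathbf{s}_{i}^{T}\boldsymbol{\beta}_{j}.\qquad (4)$$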
The relative utility differences between the parties remain the same under order-preserving changes in absolute values. Changes in the coding of voter attributes $\mathbf{s}_{i}$ induce changes in the absolute parameter values for the constants. Depending on how we code voter attributes to account for differences in the decision-making process among members of particular social segments, which is an arbitrary choice we make, the absolute constant values differ. For instance, assume $\mathbf{s}_{i}$ only contains a dummy-coded variable:
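A short worked derivation makes the mechanism explicit. It assumes a single binary attribute $s_{i}\in \{0,1\}$ with party-specific coefficient $\beta_{j1}$ and its reversed coding $s_{i}^{\ast }=1-s_{i}$; the starred symbols are introduced here for illustration only.

$$\beta_{j0}+\beta_{j1}s_{i}=\beta_{j0}+\beta_{j1}(1-s_{i}^{\ast })=\underbrace{(\beta_{j0}+\beta_{j1})}_{\beta_{j0}^{\ast }}+\underbrace{(-\beta_{j1})}_{\beta_{j1}^{\ast }}\,s_{i}^{\ast }.$$

Under the reversed coding, the slope changes sign and each party's intercept absorbs that party's own slope. Because the shift $\beta_{j1}$ differs across parties, the ordering of the intercepts need not be preserved, although every utility difference is.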
When we solely reverse the coding of one single voter attribute, all constants are affected. The respective parameters adjust so as to leave the utility differences, and therefore the corresponding choice probabilities, unchanged. By contrast, the parameters related to the policy component are unaffected, as the party-specific variables $\mathbf{z}_{ij}$ define differences between parties.
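The effect of reversing a dummy variable can be illustrated with a minimal sketch. All numbers are hypothetical, the party labels A, B, C and the helper names `probs` and `ranking` are invented for this example, and the policy component is omitted because it is unaffected by the recoding.

```python
import math

def probs(intercepts, slopes, s):
    """Choice probabilities for a voter whose dummy attribute equals s."""
    v = [b0 + b1 * s for b0, b1 in zip(intercepts, slopes)]
    denom = sum(math.exp(x) for x in v)
    return [math.exp(x) / denom for x in v]

def ranking(values, labels=("A", "B", "C")):
    """Party labels ordered from highest to lowest intercept."""
    return [lab for _, lab in sorted(zip(values, labels), reverse=True)]

# Hypothetical estimates; party A is the reference (intercept and slope 0).
b0 = [0.0, -0.2, -0.9]
b1 = [0.0, -1.0, 0.5]

# Reversed coding s* = 1 - s: intercepts absorb the slopes, slopes flip sign.
b0_rev = [i + s for i, s in zip(b0, b1)]
b1_rev = [-s for s in b1]

rank_original = ranking(b0)      # ordering of intercepts, original coding
rank_reversed = ranking(b0_rev)  # ordering of intercepts, reversed coding

for s in (0, 1):
    # Same voter, described once by s and once by s* = 1 - s.
    print(probs(b0, b1, s), probs(b0_rev, b1_rev, 1 - s))
print(rank_original, rank_reversed)
```

In this toy setting the choice probabilities coincide under both codings, while parties B and C swap places in the intercept ordering.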
4 Empirical Illustration
For illustration purposes, I use survey data on voter choices over the five major parties in the 1998 German parliamentary election: the Christian Democrats (CDU), the Social Democratic Party (SPD), the Liberal Party (FDP), the Greens, and the Left Party (PDS). The 1998 German national election study (Falter, Gabriel, and Rattinger 2012) includes three policy issues and the ideological Left–Right scale on which respondents were asked to locate themselves and the parties. These placements are used to construct the voter-party proximity variables $\mathbf{z}_{ij}$. As voter-specific nonpolicy attributes $\mathbf{s}_{i}$, six standard sociodemographics were selected: working class, union membership, religious denomination, age, gender, and region. [Footnote 9]
Source: 1998 German national election study (Falter, Gabriel, and Rattinger 2012).
Note: Entries are to be read columnwise. Numbers in parentheses show standard errors. $N=715$, $df=4$, $\text{AIC}=1864.86$, $\text{BIC}=1883.15$, $\text{Log}\,L=-928.43$.
I begin by illustrating the important role the party intercepts play in discrete choice models. The estimates not only capture the average effects of unobserved sources of utility, but also reproduce the observed vote shares in the sample within the conditional logit framework. The parameters should be interpreted with that in mind. To demonstrate this, I fitted constant-only models which do not contain any voter choice predictors but only use the information on the vote shares in the data to estimate the choice probabilities. The deterministic part of utility associated with each party $j$ simplifies to $V_{j}=\hat{\beta}_{j0}$. Since no covariates enter the models, the utility functions and the choice probabilities $\hat{\pi}_{j}$ are not subscripted by voter $i$ and are therefore identical for each voter in the sample. Table 1 compares the resulting utility functions and choice probabilities. The first column gives the vote shares as observed in the sample. [Footnote 10] The remaining columns report the party intercept estimates where each time the constant associated with one of the parties is normalized to zero to ensure model identification. The values give the utility functions as compared to the respective reference party. The last column reports the estimated vote shares. Since the models based on differing reference parties contain the same information, all $J-1$ combinations of the constant terms are functionally equivalent and yield the observed vote shares in the sample. When defining the SPD as the reference party, for instance, and plugging the vote shares for the CDU and the SPD into Equation (4), one obtains the log odds. These give the relative difference in utilities that equals the intercept associated with the CDU:
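In generic form, with the SPD as the reference party and the subscripts $C$ and $S$ denoting the CDU and the SPD, the relation reads as follows. This is a reconstruction from the constant-only setup; the numerical vote shares from Table 1 are not reproduced here.

$$\log \left(\frac{\hat{\pi}_{C}}{\hat{\pi}_{S}}\right)=\hat{\beta}_{C0}-\hat{\beta}_{S0}=\hat{\beta}_{C0},\qquad \text{since }\hat{\beta}_{S0}=0.$$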
The utility difference between these two parties, and between any other pair of parties, is preserved regardless of which party is defined as the reference because the relative utility differences are maintained by order-preserving changes in the absolute values. [Footnote 11]
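The equivalence of the normalizations can be sketched with a few lines of code. The vote shares below are hypothetical placeholders, not the values from Table 1, and the function names are invented for this example; the sketch uses the fact that maximum likelihood intercepts in a constant-only logit model equal the log odds of the sample shares relative to the reference party.

```python
import math

# Hypothetical sample vote shares (NOT the values reported in Table 1).
shares = {"CDU": 0.35, "SPD": 0.40, "FDP": 0.07, "Greens": 0.10, "PDS": 0.08}

def intercepts(shares, reference):
    """Constant-only intercepts: log odds of each party vs. the reference."""
    return {p: math.log(s / shares[reference]) for p, s in shares.items()}

def implied_shares(b0):
    """Choice probabilities implied by a set of intercepts."""
    denom = sum(math.exp(b) for b in b0.values())
    return {p: math.exp(b) / denom for p, b in b0.items()}

by_spd = intercepts(shares, "SPD")  # SPD normalized to zero
by_cdu = intercepts(shares, "CDU")  # CDU normalized to zero

# Absolute intercept values differ across normalizations ...
print(by_spd)
print(by_cdu)
# ... but both sets reproduce the same vote shares.
print(implied_shares(by_spd))
print(implied_shares(by_cdu))
```

The difference between any two intercepts, for example CDU minus FDP, is the same under both normalizations, while the absolute values are not.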
Next, I specify a voter choice model that adds substantive information so that the relative differences in unobserved utilities between the parties should decrease significantly. In accord with Equation (1), the model is fully specified and contains four spatial voter-party proximities (immigration, European unification, nuclear energy, Left–Right) and six voter attributes. The model uses the CDU as the reference party and is based on 32 degrees of freedom: $4$ issue weights, $5-1$ intercepts, and $(5-1)\times 6$ parameters related to voter attributes. The maximum likelihood point estimates and associated standard errors for the policy component are as follows: $\hat{\boldsymbol{\alpha}}^{T}=(\hat{\alpha}_{1},\hat{\alpha}_{2},\hat{\alpha}_{3},\hat{\alpha}_{4})=(0.09,0.20,0.31,0.39)$; $\hat{\boldsymbol{\sigma}}_{\boldsymbol{\alpha}}^{T}=(\hat{\sigma}_{1},\hat{\sigma}_{2},\hat{\sigma}_{3},\hat{\sigma}_{4})=(0.05,0.06,0.04,0.03)$. The party intercept estimates are $\hat{\boldsymbol{\beta}}_{0}^{T}=(\hat{\beta}_{S0},\hat{\beta}_{F0},\hat{\beta}_{G0},\hat{\beta}_{L0})=(-0.06,-1.73,-2.44,-0.37)$; $\hat{\boldsymbol{\sigma}}_{\boldsymbol{\beta}_{0}}^{T}=(0.28,0.57,0.58,0.38)$.
When we rank the constant terms from $p$ to $1$, where $p$ denotes the number of parties and $1$ the lowest valence party, as proposed in Schofield’s valence model, by $\hat{\boldsymbol{\beta}}_{0}^{T}=(\hat{\beta}_{[p]0},\ldots ,\hat{\beta}_{[1]0})=(0,-0.06,-0.37,-1.73,-2.44)$, we get the ordering $(\hat{\beta}_{C0},\hat{\beta}_{S0},\hat{\beta}_{L0},\hat{\beta}_{F0},\hat{\beta}_{G0})$. [Footnote 12] The valence difference for the lowest valence party, that is, the average valence of the parties other than the lowest ranked minus the valence of the lowest ranked party, is $\Lambda=1.90$. According to this framework, the result suggests that the CDU is the party with the highest valence, followed by the SPD, whereas the Greens are the lowest valence party. [Footnote 13] The valence ranking reflects the one that empirical applications of the valence model typically obtain: the major parties have a high valence, and parties gaining small vote shares have low valences. Under the discrete choice framework, however, such a result indicates that our empirical voter choice models are good at capturing utility differences between large parties but not between large and small parties.
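The valence difference can be recomputed directly from the intercept estimates reported in the text. The sketch below is only an arithmetic check of the definition $\Lambda=\lambda_{av\,(1)}-\lambda_{1}$; the variable names are chosen for illustration, and the Left Party is labeled PDS.

```python
# Intercept estimates as reported in the text; the CDU is the reference
# party and therefore fixed at zero.
b0 = {"CDU": 0.0, "SPD": -0.06, "PDS": -0.37, "FDP": -1.73, "Greens": -2.44}

# Rank parties from highest to lowest intercept.
ranked = sorted(b0.items(), key=lambda kv: kv[1], reverse=True)
lowest_party, lowest_value = ranked[-1]

# Average valence of all parties except the lowest ranked one.
others = [value for _, value in ranked[:-1]]
lambda_av = sum(others) / len(others)

# Valence difference for the lowest valence party.
valence_gap = lambda_av - lowest_value

print([party for party, _ in ranked])
print(lowest_party, round(lambda_av, 2), round(valence_gap, 2))
```

With these inputs, the average of the four higher-ranked intercepts is about $-0.54$ and the gap to the Greens rounds to the reported $1.90$.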
Next, I demonstrate that the valence ranking is highly sensitive to the coding of voter attributes. Figure 1 illustrates how arbitrary coding decisions affect the estimates for the party intercepts and the valence ranking. Each time I reversed the coding of one of four dummy-coded covariates and re-estimated the models. From the left panel in Figure 1, we observe that the signs of all estimates related to these covariates are reversed, but the sizes remain the same. The coefficients for all other voter attributes and the voter-party issue proximities, which are not displayed (see Section D in the Supplementary Materials for estimation tables), are unaffected by the reversed coding, but the constants are not. Since only utility differences matter, the party intercepts adjust to the sign change to reproduce these differences and to maintain the respective choice probabilities. [Footnote 14]
As shown in the right panel of Figure 1, the reversed coding of the standard variable gender would lead us to conclude that the FDP is the lowest valence party instead of the Greens, as we would infer under the original coding. When we reverse the coding of the variable worker, the SPD is the highest valence party instead of the CDU. The reversed coding of the region variable leads to the Left as the lowest valence party, which was ranked in the middle under the original coding. The most dramatic change in valence ranking is observed when we recode the variable trade union: the Left becomes the highest valence party, ranking ahead of the major parties SPD and CDU.
As a consequence, when we rely on the estimated party intercepts to measure the valence advantages of parties empirically, a single change in the coding of voter attributes affects the results and their substantive interpretation. Depending on how we code voter attributes to account for differences in the decision-making process among members of particular social segments, the valence ordering of parties differs. Using the valence ranking to calculate and analyze equilibrium positions will therefore lead to different conclusions from the model. To illustrate this, let us consider how the convergence coefficient [Footnote 15] is defined in the valence model:
$(\hat{\alpha}_{1},\ldots ,\hat{\alpha}_{K})$ are the estimated issue weights for the $K$ different policy dimensions, and $\tilde{\alpha}$ is a $K\times K$ diagonal matrix of issue weights. $\nabla_{0}^{\ast }=(1/n)\nabla_{0}$ is the $K\times K$ electoral covariance matrix of $n$ voters’ ideal points. The smallest valence party comes into play by the term $\rho_{1}$, which defines the probability that the voter chooses the lowest valence party when all parties are at the joint electoral mean (e.g., Schofield and Zakharov 2010, 180):
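Under a logit specification in which only the valence terms differ across parties at the joint electoral mean, this probability takes the form below. It is a reconstruction from the verbal definition and the ranking notation $\lambda_{p}\geqslant \cdots \geqslant \lambda_{1}$ introduced earlier, under the assumption that each $\lambda_{j}$ is measured empirically by the corresponding intercept estimate $\hat{\beta}_{j0}$.

$$\rho_{1}=\left[1+\sum_{j=2}^{p}\exp (\lambda_{j}-\lambda_{1})\right]^{-1}.$$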
As $\mathbf{z}_{ij}^{T}=(0,\ldots ,0)$ when all parties are located at the joint electoral mean, the vote shares are assumed to be independent of voter $i$’s ideal points. Since the policy component drops out of the equation, the choice probability of the party with the smallest valence is only a function of differences in absolute constant estimates. As demonstrated above, depending on how voter attributes are coded, a different lowest valence party can result. That is, depending on arbitrary coding decisions, a different party would enter the calculation of the convergence coefficient each time.
The intercept values are also sensitive to the variables included in the model. When we delete or add covariates, the intercept estimates, and therefore the resulting valence ranking can change as well. The variable selection affects the information contained in the observed part of utility, and consequently, the unobserved utility sources represented by the constants. This equally applies to unstandardized and standardized variables.
One might argue that the solution to the variability of party intercepts might be to specify policy-only models as the voter-party proximities $\mathbf{z}_{ij}$ are unaffected by the coding of voter attributes. However, by neglecting the importance of segment-specific evaluations of parties, we run the risk of missing crucial parts of how voters form their preferences to arrive at an electoral choice. A model comparison based on performance measures (Log-Likelihood, AIC, BIC) shows that the full model clearly outperforms the policy-only model: Voter attributes are essential for the specification of a realistic model of voting and party competition in mass elections. [Footnote 16]
It is also not a remedy to look at predicted probabilities (e.g., by calculating choice probabilities for a particular combination of voter attributes or for the ‘average voter’). Under this empirical measurement strategy, no variable enters the model that explicitly measures valence and that can be increased to examine how a change in that variable affects the choice probabilities. Since such computations condition the predicted probabilities on voter attributes and spatial proximities, they would not help us to understand how much weight or importance voters attach to valence considerations, broadly understood as a positive nonspatial characteristic of a party. Valence is a concept that relates first and foremost to parties, not voters, even though valence perceptions might vary across voters.
5 Discussion
The preceding empirical illustration highlights the analytical challenge in studying valence and the need for alternative ways to measure valence empirically. It raises the question of how we can deal with the concept within the spatial voting framework. What is the exact meaning and definition of valence? That is the most important question to be addressed first, as conceptual work must lay the foundation for its empirical analysis. In his critique of the Downsian voting framework, Stokes (1963, 363) defines valence issues as “those that merely involve the linking of the parties with some condition that is positively or negatively valued by the electorate”. Within the spatial voting paradigm, valence is very broadly understood and is generally defined as any positive nonpolicy-related attribute of a party (or party leader) that voters value. But what are the desirable qualities, traits, or skills that form an additional base, next to spatial considerations, on which voters evaluate how attractive the different parties are to them? Within the spatial voting literature, a very promising conceptualization of valence is employed in Stone, Maisel, and Maestas (2004), Stone and Simas (2010), and Adams et al. (2011). This work proposes to differentiate between two types or dimensions of valence: character or personal qualities and strategic qualities. Character qualities refer to attributes that define abilities and skills such as competence, job performance, leadership, integrity, trustworthiness, or diligence. By contrast, strategic qualities are instrumental in nature and comprise, for instance, name recognition and campaign or fundraising abilities.
Voters are expected to respond first and foremost to character-based valence because strategic qualities are ones that parties may or may not possess but that voters do not intrinsically value in parties (or party leaders) and therefore might not take into account (see Stone and Simas 2010; Stone 2017).
The next concern is how to translate the theoretical concept into empirically quantifiable units. I believe it is worth considering how the main competing model to the spatial framework, the ‘valence politics model of party choice’, deals with this task (e.g., Clarke et al. 2004, 2009, 2011; Sanders et al. 2011; Whiteley et al. 2013). Core components of the model, which directly resulted from Stokes’s critique of the spatial voting framework, are party leader images, problem-solving capacities, and party identification. This strand of literature arrives at empirical measures of valence by relying on mass survey responses to questions such as who is best able to handle the most important issue a country is facing, feelings about party leaders (e.g., competence, responsiveness, and trust), or performance evaluations. Candidate trait measures might best capture how valence is understood in Schofield’s valence model. However, one might argue that relying on performance evaluations runs contrary to the prospective logic and the concept of expected utility within the spatial voting paradigm. An additional difficulty arises when we operate in a multiparty competition environment where such valence measures are necessary for several parties. Another critical challenge is how to empirically separate valence factors from spatial considerations, which might be correlated (see, e.g., Feldman and Conover 1983; Granberg and Brown 1992). Spatial and nonspatial considerations might not be separable on the basis of survey data alone, so that valence judgments independent of individual perceptions are necessary (see, e.g., Stone 2017).
6 Concluding Remarks
Based on a discussion of central properties of discrete choice models and the specific nature of the estimates related to party intercepts within this framework, my objective here was to demonstrate why and how empirical applications of Schofield’s formal valence model, which is based on the ordering of the party intercepts, do not yield unique results in fully specified voter models. Arbitrary changes in the coding of voter attributes affect the valence ranking and, therefore, the conclusions we would draw from the model. It is impossible to depict valence with the constants and to ascertain anything substantial from ordering absolute constant estimates. I recommend that researchers not rely on party intercepts but instead make the valence qualities of parties (or party leaders) measurable and include them in voter utility expressions as observed variables. Future research should seek to quantify valence in a way that is in accord with the decision-theoretical logic of spatial voting.
Acknowledgments
I thank M. Socorro Puy, Paul W. Thurner, Susumu Shikano, Oliver Pamp, Anna-Sophie Kurella, Thomas Bräuninger, Thomas Däubler, and the participants of the MZES Colloquium and the Panel ‘New Developments in Spatial Models of Party Competition’ at the 2018 ECPR General Conference. I especially thank Micha Schneider, the editor, and the reviewers for their highly valuable comments, and Hannah Reitz and Daniel Krähmer for superb assistance in preparing the tables.
Data availability
Supplementary materials for this article are available on the Political Analysis website. For Dataverse replication materials, see Mauerer (2019).
Supplementary material
For supplementary material accompanying this paper, please visit https://doi.org/10.1017/pan.2019.41.