1. Introduction
A synesthetic metaphor is a description of a percept in terms of a different sensory modality. For example, the phrase soft brightness is a description of a visual percept in tactile terms. As their name implies, synesthetic metaphors are instances of metaphorical mapping, from a source domain to a target domain. In the case of soft brightness, these are the domains of touch and sight, respectively.
Perhaps the most widely discussed issue in the literature on synesthetic metaphor is that of directional preferences. The idea behind this is that certain types of mappings are somehow ‘better’ than their opposites, for example, that touch-to-sound is better than sound-to-touch. Ullmann (1945, 1957) famously proposed that directional preferences conform to a hierarchy of the senses, such that mappings ‘upward’ on the hierarchy, that is, from low modalities (touch, taste) to high modalities (sound, sight), are better than their opposite, ‘downward’ mappings. Thus, phrases like soft brightness (touch-to-sight), tasty noise (taste-to-sound), and chilled scent (touch-to-smell), are expected to be more frequent in discourse, judged as more natural, recalled better, etc., than their opposites: bright softness (sight-to-touch), noisy taste (sound-to-taste), and scented chill (smell-to-touch).
Ullmann (1945, 1957) pioneered the empirical study of synesthetic metaphors with a quantitative analysis of 19th-century English, French, and Hungarian literary corpora. He collected several hundred synesthetic metaphors in various grammatical forms, disregarding what he judged to be ‘stale’ or conventionalized metaphors, and annotated their target and source domains.
From his findings, Ullmann draws three principal generalizations. First, the majority of transfers (i.e., mappings) are directed from lower toward higher levels of the sensorium. This generalization relies on the assumption, adopted from classical philosophy, that the senses are ordered hierarchically as in (1) below. Ullmann’s second generalization is that most of the transfers are taken from the sphere (i.e., domain) of touch, and the third is that most of the transfers are directed toward the sphere of sound. A corollary of the two latter generalizations is that the single most frequent type of transfer is from touch to sound.
Ullmann explicitly raises the possibility that these generalizations represent a semantic law, although there are exceptions to this proposed law even in his own data, namely in mappings between sight and sound. Thus, downward mappings from sight to sound are more frequent in his corpora than the opposite upward mappings. As a possible explanation for this exception, Ullmann remarks that sound has fewer words associated with it than sight, and hence is more likely to ‘recruit’ descriptors from other domains.
Following Ullmann’s work, a great deal of research on synesthetic metaphors has been devoted to corroborating, extending, refining, or explaining some or all of Ullmann’s generalizations regarding directional preferences (Day, 1996; Dombi, 1974; Shen & Cohen, 1998; Shen & Eisenman, 2008; Shen & Gadir, 2009; Shen & Gil, 2008; Shinohara & Nakayama, 2011; Werning, Fleischhauer & Beseoglu, 2006; Williams, 1976; Wise, 1997; Yu, 2003). These include corpus studies reproducing Ullmann’s generalizations in various languages and genres, for example, Hungarian poetry (Dombi, 1974), English and German literature (Day, 1996), Hebrew poetry (Shen, 1997), and Chinese literature (Yu, 2003). Williams (1976) is a diachronic study of English and Japanese sensory adjectives, extending Ullmann’s generalizations from novel metaphorical mappings to fully conventional ones. Later works include experimental studies conducted in various languages, extending Ullmann’s generalizations from analyses of naturally occurring data to the interpretation of novel experimental materials (e.g., Shen, 1997; Shen & Eisenman, 2008; Shen & Gadir, 2009; Shen & Gil, 2008; Shinohara & Nakayama, 2011).
This entire research paradigm, however, has recently come under criticism. Winter (2016, 2019a,b; see also Ronga et al., 2012) critiques many of its underlying assumptions and methodological practices, as well as the theorizing behind it. First, many of the studies following Ullmann assume a clear delineation of sensory domains. Although they differ in the ways they ‘carve up’ the sensory conceptual space, for example, whether they separate touch from heat (Ullmann, 1945) or color from dimension (Williams, 1976), they generally agree that human sensory experience can be delineated into five to eight independent domains. As Winter (2016) notes, the various delineations often reflect a particular researcher’s cultural framework rather than any established psychophysical theory.
Second, studies within this paradigm tend to adopt a categorical approach to sensory words, whereby each such word is taken to evoke a single sensory domain, and all associations between sensory words and domains are considered equal. This assumption contrasts with experimental and corpus evidence that words may be associated with several sensory domains and to different degrees (Lynott & Connell, 2013; Winter, 2016). Moreover, it is usually the researchers themselves who code each sensory word as associated with one domain or another, instead of relying on more reproducible methods. For example, Shen & Gadir (2009) use Hebrew words they translate as honey and form to evoke taste and sight, respectively, in their experiment. Yet the participants in Lynott & Connell’s (2013) study rated English honey as strongly experienced through sight (M = 4.12, SD = 0.93) and smell (M = 3.76, SD = 1.15) in addition to taste (M = 4.76, SD = 0.56), and in fact as more strongly experienced through sight than form (M = 3.24, SD = 1.75).
Similar to the above is the assumption that sensory words are always used to evoke their associated sensory domains, rather than with other conventional, non-sensory meanings. Winter (2019b) argues that many of the examples discussed in the literature on synesthetic metaphors are conventionalized, to the extent that they might not be considered metaphorical at all. For example, English sweet could be argued to have a fully conventional affective meaning, which no longer depends on its meaning as a taste word. Hence, phrases like sweet melody might be interpreted as straightforward affective evaluations, rather than metaphorical mappings across sensory domains. Even when researchers address the difference between conventional and novel metaphors, as when Ullmann (1945, 1957) sets aside ‘stale’ metaphors, they tend to rely on their own judgments in doing so.
Finally, Winter critiques the theoretical accounts advanced in many of these studies, particularly the way they draw causal conclusions from correlational data. Various factors have been proposed in the literature as potential causes for directional preferences, but only a few of these factors have been tested directly. Such testing requires either an experiment, where one factor is manipulated while other, potentially confounding factors, are controlled for; or careful statistical analyses of corpus data, revealing whether one or more factors reliably predict the occurrence of synesthetic metaphors.
Broadly speaking, the factors previously proposed as contributing to directional preferences fall into two camps (cf. Winter, 2019a). The first consists of perceptual factors, that is, properties of the sensory modalities themselves, which are taken to have a direct effect on directional preferences. For example, the preference for soft brightness over bright softness might be due to differences between how we perceive light as opposed to how we perceive texture, perhaps that texture perception is more embodied than light perception. Crucially, the effect of perceptual factors does not depend on the word choice in a given synesthetic metaphor, but only on the modalities involved.
The second camp consists of lexical factors, that is, properties of the words associated with the different sensory modalities. For example, the preference for soft brightness over bright softness might be due to differences between the words soft and bright, perhaps that soft is more frequent or more affectively loaded than bright. Such lexical factors might still ultimately be traced back to perceptual factors, in that the properties of each sensory modality influence the makeup of its associated lexical field. Thus, the properties of soft and bright might be typical of touch-words and sight-words, respectively, because of differences between how we perceive light as opposed to how we perceive texture. However, this is an indirect effect of perceptual factors on directional preferences, contingent on the particular word choice in a given synesthetic metaphor.
An example of a perceptual factor is degree of embodiment, invoked in a speculative account by Shen and colleagues (Shen, 1997; Shen & Eisenman, 2008; Shen & Gadir, 2009). They propose that touch and taste, as the only modalities that require direct contact between perceiver and stimulus, are more embodied than smell, sound, and sight, and are therefore more cognitively accessible. This account subsumes directional preferences in synesthetic metaphors under the general principles of conceptual metaphor theory (e.g., Lakoff & Johnson, 1980, 1999), which posits that metaphorical mappings generally occur from more accessible to less accessible domains. Degree of embodiment, defined in this way, is a perceptual factor, not a lexical factor; whether or not a stimulus is perceived via direct contact depends on the sensory modality involved, not on the words used to describe the stimulus. To my knowledge, the effect of degree of embodiment on directional preferences has never been tested directly.
Shibuya, Nozawa & Kanamaru (2007) attempt to explain directional preferences specifically between touch and sight, by appealing to a notion of sensory association. They propose that associations between the senses, which allow for interpretable synesthetic mappings, are based on co-occurrences between sensory stimuli in daily experience. The relative frequency of the co-occurrence determines the strength of the association. For example, tactile stimuli almost always co-occur with visual stimuli, whereas only a minority of visual stimuli co-occur with tactile stimuli. Therefore, the association of touch with sight is stronger than the association of sight with touch, making mappings from touch to sight more interpretable than their opposites. Sensory association is another perceptual factor, as the co-occurrence of stimuli does not depend on the words used to describe the stimuli. Like embodiment, the effect of sensory association on directional preferences has not been tested directly in previous literature.
Popova (2005) as well as Petersen et al. (2007) discuss the notion of gradability in this connection, particularly antonymic, unbounded gradability, which is a lexical factor. They propose that metaphorical mapping of antonymic, unbounded gradable features is more natural than that of other features, because the former can be mapped onto abstract, modality-general scales. For example, softness is antonymic (the opposite of soft is hard) and unbounded (there is no maximal degree of either softness or hardness). It can therefore be mapped onto an abstract scale such as intensity or affectivity, allowing soft in soft brightness to be naturally reinterpreted as, say, faint or pleasing. In contrast, redness is not antonymic (there is no unique opposite to red) and silence is bounded (one thing cannot be literally more silent than another). Therefore, red and silent cannot be mapped onto those same abstract scales, and do not receive a natural reinterpretation. Popova (2005) further argues that touch (and to a lesser extent, taste) is associated with this kind of gradability more so than sound or sight, which explains why it makes a better source domain. Nevertheless, gradability is a lexical factor: a given percept, in any sensory modality, can be described using either gradable or non-gradable features, for example, quiet (unbounded) vs. silent (bounded), or light (antonymic) vs. white (non-antonymic). Petersen et al. (2007) present experimental evidence for the effect of gradability on directional preferences. They show that in German synesthetic metaphors with sight as their source domain, antonymic gradable features like bright lead to higher accessibility than non-antonymic features like red.
Strik Lievers (2015) and Winter (2016) both discuss affectivity and lexical distribution, among other lexical factors which may influence directional preferences. They both relate metaphorical usage to affectivity, and propose that certain modalities, particularly taste and smell, are more affectively loaded than others. In other words, one of the points of using metaphors is affective evaluation, and since taste and smell are affectively loaded, they make better source domains. Like gradability, affectivity is a lexical factor; any given percept can be described with either high-affectivity or low-affectivity words. Winter (2016) presents a corpus study showing that a word’s affectivity is a reliable predictor of its use in naturally occurring synesthetic metaphors, as are its frequency and iconicity.
Strik Lievers (2015) and Winter (2016) also note that lexical coding in different modalities is not distributed evenly across lexical categories. For example, in English, there are relatively few lexical adjectives associated with sound, and relatively many associated with sight (Strik Lievers & Winter, 2018). Therefore, it is statistically more likely that we find a sight adjective modifying a sound noun than a sound adjective modifying a sight noun. This point echoes the early suggestion by Ullmann (1957), that mappings from sight to sound are more frequent than ones from sound to sight because the domain of sight is lexically richer than that of sound. While this explanation accounts well for corpus findings, additional assumptions are required for it to account for experimental findings. A priori, there is no reason to expect general facts of lexical distribution to influence judgments about specific pairs of concepts, like the stimuli presented to participants in experiments. Naturally, lexical distribution is also a lexical factor.
Summarizing, several lexical factors have been shown to have an effect on directional preferences in synesthetic metaphors (Petersen et al., 2007; Strik Lievers & Winter, 2018; Winter, 2016). As of yet, there is no comparable evidence for an independent, direct effect of any perceptual factor. We may then entertain the possibility that there is no property of the sensory modalities, in and of themselves, which directly causes directional preferences. In other words, there might not be any criterion relevant for directional preferences by which the senses are ordered hierarchically. The conclusion would be that what has so often been referred to as Ullmann’s hierarchy of the senses may turn out to be descriptively adequate, but explanatorily inert. The so-called hierarchy might be ‘explained away’, partially or entirely, as an artifact of independent, idiosyncratic lexical factors.
Before we relegate Ullmann’s hierarchy to an artifact, we might want to ask why perceptual factors have not been tested directly, let alone established empirically, in previous research. One reason for this may be that perceptual factors are more difficult to operationalize and manipulate than lexical factors, whether one is annotating naturally occurring synesthetic metaphors, or constructing experimental stimuli (see Winter, 2019a). A second reason is that the effects of perceptual factors may be difficult to isolate, given the ubiquity of confounding lexical factors. Of course, all synesthetic metaphors are limited by the inventory of sensory words in the relevant language. If a lexical field is sparse, as is the case for smell in English for instance, this limitation can be quite severe (Majid & Burenhult, 2014). But even a rich lexical field is limiting, because sensory words tend to have complex, idiosyncratic meanings. Finding a set of words that are associated with different senses but are otherwise comparable, that is, not differentiated by lexical category, affectivity, gradability, morphological complexity, or other lexical factors, can be a formidable task.
To illustrate, consider intensity, one of the few candidates for a dimension that is straightforwardly analogous across the senses (Levinson & Majid, 2014). As such, we might expect to find comparable lexical means for expressing high and low intensity in different sensory domains, yet we do not. In English, dim, quiet, and bland can mean low intensity of light, sound, and flavor, respectively, but they are not truly comparable because their meanings are more complex than that, as can be seen from their antonymy relations. Thus dim, in addition to being an antonym of bright, is also an antonym of clear, which is orthogonal to intensity. The two obvious antonyms of quiet, namely loud and noisy, both mean high intensity, but the latter also means something like erratic or disturbing. And the obvious antonym of bland is tasty, which means positive evaluation rather than high intensity.
The preceding paragraph focuses on adjectives, because adjective–noun phrases, such as soft brightness, are the most frequent and best-studied kind of synesthetic metaphor. But the difficulties in controlling for lexical factors in sensory words are not limited to adjectives. An exception to the ubiquitous focus on adjective–noun phrases in the literature is Shen & Gadir’s (2009) experimental study of the Hebrew genitive construction X shel Y, which included concrete nouns with a salient sensory feature (e.g., sukar shel bosem ‘sugar of perfume’), as well as abstract nouns derived from adjectives (e.g., melixut shel digdugiut ‘saltiness of ticklishness’). Although these materials do not use adjectives, they show conspicuous lexical semantic differences: contrast the highly affective siraxon ‘stench’ with the low-affectivity taam ‘flavor’, and the antonymic kshixut ‘rigidity’ with the non-antonymic tsehivut ‘yellowness’.
In light of the above, my goal here is to experimentally investigate directional preferences in synesthetic metaphors, while controlling for lexical factors which previous studies have not accounted for, and which have thus potentially warped the empirical picture. To my knowledge, this is the first attempt to take on this methodological challenge, and as a result, the first investigation of a potential direct effect of perceptual factors on directional preferences.
The results of the experiment reported below show that some directional preferences do surface when lexical factors are controlled for. They thus provide unprecedented evidence that perceptual factors play a role in determining directional preferences in synesthetic metaphors. However, the directional preferences found here do not add up to an overarching preference for mappings either upward or downward on Ullmann’s hierarchy of the senses.
2. Experimental study
This study was designed to test whether directional preferences in synesthetic metaphors arise in the absence of lexical factors. To that end, I use synesthetic metaphors in a verbal analogy construction, wherein the target and source domains are each evoked by a copulative perception verb (CPV): look, sound, smell, taste, or feel. (2) lists naturally occurring examples of synesthetic metaphors in verbal analogies, retrieved from the enTenTen15 corpus at www.sketchengine.eu (Kilgarriff et al., 2014).
Using verbal analogy to test directional preferences in synesthetic metaphors crucially relies on the assumption that this construction may actually involve metaphorical mapping. This is a non-trivial assumption, which ties into the long-running debate on the relationship between metaphors and comparisons in general (e.g., Bowdle & Gentner, 2005; Chiappe & Kennedy, 2001; Croft & Cruse, 2004; Glucksberg & Haught, 2006; Glucksberg & Keysar, 1990). I assume, following Gil & Shen (2021), Steen et al. (2010), Wolff & Gentner (2011), inter alia, that there is such a thing as a metaphorical comparison, in the sense that it involves unidirectional mapping of properties or inferences from one domain to another (though see Steen et al., 2010, pp. 92–96, for challenges in delineating domains in these cases). Pertinent evidence in support of this position is that comparisons between concepts in different domains exhibit directional preferences parallel to other metaphors, such as a preference for abstract and concrete concepts in subject and complement positions, respectively, rather than vice versa (Ortony, 1979; Porat & Shen, 2017; Shen, 1997).
By using CPVs to explore synesthetic metaphors, I control for a number of lexical factors which have potentially muddled the results of previous studies, where sensory domains were evoked using nouns and adjectives. The first advantage of using CPVs is circumventing the issue of differential lexical distribution. This is because English CPVs comprise a closed set of lexemes, which stand in a one-to-one relation to the five Aristotelean senses. That is, each of the five sensory domains can be evoked using exactly one CPV, making them all equally encoded. As such, none of the domains is more or less likely to require the ‘recruitment’ of descriptors from another domain or to be recruited in the description of another domain.
Second, using CPVs makes it possible to control for lexical semantic factors such as affectivity and gradability. This is because CPVs – on their attributary reading (see below) – have directly comparable and very lean semantic contributions, which amount to specifying the sensory domain to which a description applies. The actual substance of the description, including its affectivity and gradability, is determined by the verb’s complement, which, in an experimental setting, can be kept constant across verbs.
Third, using CPVs justifies some of the assumptions criticized by Winter (2016, 2019a) in earlier studies. Since there are exactly five English CPVs and each evokes a single sensory modality, the question of how to delineate the senses is resolved straightforwardly: I assume the five senses for which there are CPVs. Likewise, each CPV can be assumed to be categorically associated with the modality it evokes, and not associated with any other modality. Lastly, synesthetic metaphors in the verbal analogy construction can be assumed to be categorically novel and not conventional, given their infrequency in natural usage.
At the same time, CPVs are polysemous, and their other readings, besides the aforementioned attributary reading, are not necessarily comparable. Moreover, even on their attributary reading, CPVs probably do differ in their connotative meanings. But careful material design can block undesired readings, as well as override differences in connotative meanings.
CPVs are a subset of phenomenon-based perception verbs, that is, the class of perception verbs that take a stimulus rather than a perceiver as their grammatical subject (Viberg, 1983). Phenomenon-based perception verbs can be divided into (i) predicates, which take a stimulus subject (e.g., glow, buzz, stink, tickle) and sometimes a stimulus object (e.g., reflect) or perceiver object (e.g., dazzle, Swedish synas ‘be visible’; cf. Viberg, 2019); and (ii) copulatives, which take a stimulus subject and a predicate complement, which in English may be an adjective, a noun phrase, or a comparative construction headed by like or as if.
All English CPVs are polysemous, with (at least) two logically independent readings (Gisborne, 2010). The relevant reading here is the attributary one, on which a CPV takes its stimulus subject and its predicate complement as logical arguments. The CPV specifies that the predicate holds of a particular percept of the stimulus, that is, of the stimulus as perceived via a particular sensory modality. For example, (3a) expresses that wonderful holds of the smell of the wine. The attributary reading can be paraphrased using a possessive construction, with the stimulus as the possessor, the sensory modality as the possessed, and the predicate modifying the latter, as in (3b). Examples marked with ~ are constructed.
Since the attributary reading assigns a description to a percept, it can be used to express a synesthetic metaphor, namely when the verb’s complement evokes a different sensory domain than the verb itself. For example, (4a) expresses that a visual description applies to an olfactory percept, a clear synesthetic mapping.
However, phrases in which a CPV’s complement evokes a different domain than the verb itself are often naturally interpreted with a different reading of the CPV, namely the evidential reading, where no synesthetic mapping occurs. The availability of evidential readings in such cases is discussed by Petersen & Gamerschlag (2014). In addition, if a CPV’s complement is a lexical predicate associated with a particular sensory domain, as is the case with purple, we again run into issues of lexical coding and lexical semantics.
To control for lexical factors, and to encourage an attributary reading rather than an evidential reading, I use the verbal analogy construction in (5). In essence, the construction expresses that some implicit description which applies to stimulus b as perceived via modality y (the source domain), also applies to stimulus a as perceived via modality x (the target domain). For example, (6a) expresses that an auditory description of the speaker’s music also applies to the visual percept of the painting. This meaning can be paraphrased as in (6b).
2.1. Materials
The experimental materials consisted of 80 short passages, each containing a synesthetic metaphor in the verbal analogy construction. Each passage consisted of (i) an explicit value judgment, I (don’t) like how this noun verbs; followed by (ii) the phrase In a way; and finally (iii) a verbal analogy containing two different inanimate nouns, two different CPVs, and a modality-general adjective. The template for the passages is given in (7), with an example in (8).
The explicit value judgment and the abstract adjective were included as contextual cues for the interpretation of the verbal analogy, following a pilot experiment in which participants found bare verbal analogies difficult to interpret. The explicit value judgment also served to override differences in affective connotations between the CPVs. The phrase In a way was included to encourage a nonliteral interpretation of the verbal analogy.
Each synesthetic metaphor contained two different verbs, two different nouns, and one adjective. In total, 5 verbs, 40 nouns, and 12 adjectives were used in the experiment. The verbs were the five English CPVs: look, sound, smell, taste, and feel. For each verb, 8 inanimate, concrete nouns were chosen from among the 50 most frequent subjects of that verb occurring as a CPV (i.e., tagged as a verb, preceded by a word tagged as a noun, and followed by either a word tagged as an adjective or the word like), in the Sketch Engine corpus enTenTen15 (Kilgarriff et al., 2014). Each noun was only used as the subject of a single verb. For example, car was only used as a subject of sound, despite also being a frequent subject of look and smell.
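The part-of-speech pattern described above can be illustrated with a minimal sketch. It is not the query actually run in Sketch Engine: the input format (a list of word, lemma, tag triples) and the tag labels NOUN, VERB, and ADJ are hypothetical placeholders chosen for illustration.

```python
from collections import Counter

CPV_LEMMAS = {"look", "sound", "smell", "taste", "feel"}


def cpv_subjects(tokens):
    """Count nouns directly preceding a CPV that is followed by an adjective or 'like'.

    `tokens` is a list of (word, lemma, tag) triples. The tag labels
    'NOUN', 'VERB', and 'ADJ' are hypothetical, not the corpus's own tagset.
    """
    counts = {verb: Counter() for verb in CPV_LEMMAS}
    # Slide a three-token window over the text: subject, verb, complement.
    for prev, cur, nxt in zip(tokens, tokens[1:], tokens[2:]):
        is_cpv = cur[2] == "VERB" and cur[1] in CPV_LEMMAS
        has_noun_subject = prev[2] == "NOUN"
        has_complement = nxt[2] == "ADJ" or nxt[0].lower() == "like"
        if is_cpv and has_noun_subject and has_complement:
            counts[cur[1]][prev[1]] += 1
    return counts
```

Calling `cpv_subjects(tokens)["sound"].most_common(50)` on a tagged corpus would then give a candidate list of the kind from which the nouns were chosen by hand.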
The adjectives were chosen to represent five modality-general dimensions, three of which correspond to Osgood, Suci, & Tannenbaum’s (1957) affective components: good/bad for valence, interesting/boring for arousal, and strong/weak for dominance. Half of the occurrences of strong/weak were substituted with huge/tiny to avoid phrases which were otherwise difficult to interpret literally, such as strong painting and weak house. The remaining two dimensions were abstract: familiar/strange for familiarity, and expensive/cheap for price.
Nouns, verbs, and adjectives were combined pseudo-randomly to create 40 synesthetic metaphors, such that each noun appeared in two metaphors with two different, non-antonym adjectives. From these 40 metaphors, another 40 metaphors were generated by flipping the order of the nouns and verbs. For example, this coat feels like an expensive soup tastes was flipped to create this soup tastes like an expensive coat feels. The 80 metaphors were embedded in the template in (7) above, with the adjective’s polarity determining whether the value judgment was positive or negative. The full list of metaphors is available online at <https://osf.io/2hmcb/>.
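The flipping step can be sketched as follows. This is only an illustration under stated assumptions, not the procedure used to build the actual materials: the `Analogy` class, the `render` and `flip` helpers, and the first-letter rule for choosing a or an are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Analogy:
    """A verbal analogy: 'this <target noun> <verb>s like a(n) <adjective> <source noun> <verb>s'."""
    target_noun: str
    target_verb: str
    adjective: str
    source_noun: str
    source_verb: str


def article(word: str) -> str:
    # Crude a/an choice from the first letter; an assumption adequate for this sketch only.
    return "an" if word[0].lower() in "aeiou" else "a"


def render(m: Analogy) -> str:
    # All five CPVs take a plain -s in the third person singular.
    return (f"this {m.target_noun} {m.target_verb}s like "
            f"{article(m.adjective)} {m.adjective} {m.source_noun} {m.source_verb}s")


def flip(m: Analogy) -> Analogy:
    # Swap the two noun-verb pairs; the adjective keeps its position.
    return Analogy(m.source_noun, m.source_verb, m.adjective,
                   m.target_noun, m.target_verb)
```

With `m = Analogy("coat", "feel", "expensive", "soup", "taste")`, `render(m)` and `render(flip(m))` give the two sentences quoted above, and flipping twice returns the original analogy.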
2.2. Participants
Forty-eight monolingual English speakers (29 female and 17 male; 2 did not disclose their gender) between the ages of 21 and 69 (M = 37, SD = 11.5) were recruited using Prolific (www.prolific.co). One participant showed zero variance in their responses, so their responses are excluded below, leaving 47 participants. The sample size was chosen prior to conducting any statistical analyses, based on sample sizes in comparable previous studies (Shen & Eisenman, 2008; Shen & Gadir, 2009; Shen & Gil, 2008).
2.3. Procedure
Four lists were created, each consisting of 20 passages, with each of the 40 nouns appearing once per list, and the number of positive and negative value judgments counterbalanced between lists. The order of the passages in each list was randomized. The lists were uploaded to an online survey platform (www.qualtrics.com), and participants were randomly assigned to one of the four lists.
Participants were told they would see figurative sentences expressing opinions about things and comparing them to other, possibly very different things. They were instructed to rate how natural or unnatural each sentence was. A natural sentence was defined as ‘one that makes sense, that you would not be surprised to hear in conversation’, an unnatural sentence as ‘one that doesn’t make sense, and sounds awkward or foreign’, and an intermediate sentence as ‘one you could make sense of, perhaps with some difficulty, though you might not expect to hear it in conversation’. Participants rated the naturalness of each sentence on a 7-point scale, with 7 labeled ‘very natural’ and 1 labeled ‘very unnatural’.
Prior to the experiment proper, participants saw two practice questions designed to establish benchmarks of naturalness and unnaturalness. The first practice question included a conventional metaphorical mapping, and the second included an anomalous metaphorical mapping. The two practice questions were followed by explanations tying them to the instructions and suggesting how their naturalness might be rated:
The results were analyzed with a mixed-effects ordinal model. Analysis was conducted in the R software environment (R version 3.6.3; R Development Core Team, 2020), with the packages ‘ordinal’ (Christensen, 2018) and ‘tidyverse’ version 1.3.0 (Wickham et al., 2019). Data were entered into a cumulative link model (i.e., ordinal regression model) with fixed effects for mapping direction (upward mapping and downward mapping), senses (each of the 10 possible 2-sense combinations), and value judgment (positive and negative), all of which were sum-coded. The analysis also included an interaction term for direction × senses, and a random effect for participants. The scripts and the data are available online at <https://osf.io/2hmcb/>.
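To make the interpretation of sum-coded coefficients in a cumulative link model more concrete, the following Python sketch computes rating probabilities under the logit parameterization logit P(Y ≤ j) = θ_j − η. It is an illustration under stated assumptions, not the analysis script: the threshold values are hypothetical, the random effect for participants is set to zero, and all other predictors are held at the grand mean. Only the value-judgment estimate (−0.541) is taken from the results reported below.

```python
import math


def logistic(x: float) -> float:
    return 1.0 / (1.0 + math.exp(-x))


def category_probs(thresholds, eta):
    """Probabilities of ratings 1..K, given logit P(Y <= j) = theta_j - eta."""
    cumulative = [logistic(theta - eta) for theta in thresholds] + [1.0]
    probs, previous = [], 0.0
    for c in cumulative:
        probs.append(c - previous)
        previous = c
    return probs


def expected_rating(probs):
    return sum((k + 1) * p for k, p in enumerate(probs))


# HYPOTHETICAL thresholds for a 7-point scale (six cut points, increasing).
THRESHOLDS = [-2.0, -0.8, 0.0, 0.7, 1.5, 2.6]

# Reported estimate for negative value judgments relative to the grand mean.
VALUE_JUDGMENT = -0.541

# Under sum coding, the two levels are coded +1 and -1.
negative = category_probs(THRESHOLDS, +1 * VALUE_JUDGMENT)
positive = category_probs(THRESHOLDS, -1 * VALUE_JUDGMENT)

print([round(p, 3) for p in negative], round(expected_rating(negative), 2))
print([round(p, 3) for p in positive], round(expected_rating(positive), 2))
```

With any increasing set of thresholds, a negative coefficient shifts probability mass toward the lower end of the scale, which is the sense in which negative judgments were rated as less natural than the grand mean.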
3. Results
Fig. 1 presents the overall distribution of the naturalness ratings. The distribution is centered below the middle of the naturalness scale (M = 3.1, SD = 1.63, median = 3, mode = 2), indicating that the participants generally found that the synesthetic metaphors did not make much sense or were difficult to make sense of. This result is not surprising, given that the experimental materials were novel metaphorical mappings presented with little supporting context.
Fig. 2 presents the means and interquartile ranges of the naturalness ratings grouped by sense combination and mapping direction. Of the 10 sense combinations, naturalness was by far highest in smell + taste (M = 4.479, SD = 1.664), which was also the only combination for which the mean as well as the median (= 5) were above the middle of the naturalness scale. For all nine other sense combinations, both the mean and the median were below the middle of the scale. Further setting smell + taste apart from the other combinations, the mean difference between this pair and the next highest combination, smell + feel (M = 3.436, SD = 1.707), was greater than the difference between the second highest and the very lowest-rated combination, look + sound (M = 2.585, SD = 1.629).
Turning to mapping direction, naturalness across sense combinations was slightly higher in upward mappings (M = 3.155, SD = 1.686) than in downward mappings (M = 3.040, SD = 1.580). Within the 10 sense combinations, mean naturalness was higher in upward mappings than in downward mappings in 6 combinations, but lower in the remaining 4. The mean difference between upward and downward mappings was greatest in sound + feel, where upward mappings were preferred (upward: M = 3.617, SD = 1.726; downward: M = 2.872, SD = 1.439). The next greatest difference was in look + sound, where the opposite direction was preferred (upward: M = 2.383, SD = 1.497; downward: M = 2.878, SD = 1.744).
The mixed-effects model coefficients are provided in Table 1. The effect of mapping direction on naturalness was minor, and not statistically significant (Estimate = −0.089, SE = 0.060, p = 0.135). There was a significant effect of value judgment, with naturalness for negative judgments lower than the grand mean (Estimate = −0.541, SE = 0.066, p < 0.001). There were also multiple significant effects of sense combination on naturalness: naturalness was considerably higher than the grand mean in smell + taste (Estimate = 1.955, SE = 0.192, p < 0.001), and lower, to varying degrees, in look + sound (Estimate = −1.117, SE = 0.191, p < 0.001), look + taste (Estimate = −0.833, SE = 0.189, p < 0.001), and sound + smell (Estimate = −0.589, SE = 0.177, p = 0.001).
*** p < 0.001, ** p < 0.01, * p < 0.05.
The interaction between mapping direction and sense combination had a noticeable and statistically significant effect for two sense combinations: in look + sound, naturalness in downward mappings was higher than the mean (Estimate = 0.385, SE = 0.186, p = 0.038). Conversely, in sound + feel, naturalness in downward mappings was lower than the mean (Estimate = −0.347, SE = 0.176, p = 0.049).
4. Discussion
The results above suggest that localized directional preferences exist for synesthetic metaphors, even when lexical factors are controlled for. Specifically, there was a noticeable preference for downward mappings in look + sound, and an opposite preference in sound + feel. These two opposite preferences align with two of Ullmann’s (1945, 1957) early observations: first, touch-to-sound mappings are the single most frequent type of synesthetic mapping; and second, mappings between sight and sound are the single consistent exception to the general preference for mappings ‘upward’ on the hierarchy of the senses.
At the same time, the results provide no evidence that synesthetic mappings upward on Ullmann’s hierarchy of the senses are consistently preferred over downward mappings. First, the overall effect of mapping direction was minor and not statistically significant. Second, upward mappings actually received lower mean naturalness ratings than downward mappings in 4 out of 10 possible sense combinations. These results do not align with the findings of numerous earlier experimental studies (Shen & Cohen, 1998; Shen & Eisenman, 2008; Shen & Gadir, 2009; Shen & Gil, 2008). The design of the present experiment fundamentally differs from previous designs in two ways: (i) it evokes sensory domains using CPVs rather than adjectives or nouns; and (ii) it relates sensory domains using analogy rather than modification or predication. In the next section, I consider how these differences may be responsible for the contrast between present and past findings.
Next, the results indicate that mappings between certain senses, regardless of direction, are more natural than others. Particularly, mappings between smell and taste received considerably higher naturalness ratings than all other possible sense combinations, and were the only ones for which mean and median naturalness were higher than the midpoint of the scale.
A possible explanation for the gap between smell + taste and all the other sense combinations is that comparisons between smell and taste percepts are not actually metaphorical at all (see Fishman, 2020, for comparable differences in ratings of literal and metaphorical comparisons). That is, the sensory domains of smell and taste may be so similar, or intersect to such an extent, that comparisons between them are naturally taken as intra-domain rather than cross-domain mappings. This idea is obliquely supported by the strong positive relationship between gustatory and olfactory measures of words (Lynott & Connell, 2013; see also Winter, 2016), as well as by neurocognitive evidence for integration between the gustatory and olfactory systems (Verhagen & Engelen, 2006).
To test the reliability of the experimental results, I conducted a corpus study. I ran a search for the verbal analogy construction with CPVs in the Sketch Engine corpus enTenTen15 (Kilgarriff et al., 2014), using the following query:
[lemma="look|sound|smell|taste|feel" & tag="V.*"] []{0,2} [lemma="like"] []{1,3} [lemma="look|sound|smell|taste|feel" & tag="V.*"]
The above query returns results which include the following five elements, in order: (i) one of the five lexemes look, sound, smell, taste, and feel, tagged as a verb; (ii) a sequence of 0 to 2 words (for a potential modifier, e.g., a bit, very much, exactly, or a potential perceiver argument, e.g., to me); (iii) the word like; (iv) a sequence of 1 to 3 words (for the subject of the second verb and a potential auxiliary verb, e.g., have, would); and again (v) one of the five lexemes look, sound, smell, taste and feel tagged as a verb.
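The five-element pattern can be approximated outside Sketch Engine with a short Python sketch. This is not the corpus query itself: it assumes plain untagged text, replaces lemmatization with a small hand-written table of inflected forms, matches only the form like, and ignores part-of-speech tags, so it over-generates (e.g., it would accept look used as a noun). The function name and the form table are hypothetical.

```python
import re

# Hand-written inflected forms standing in for lemma matching (assumption).
CPV_FORMS = {
    "look": {"look", "looks", "looked", "looking"},
    "sound": {"sound", "sounds", "sounded", "sounding"},
    "smell": {"smell", "smells", "smelled", "smelt", "smelling"},
    "taste": {"taste", "tastes", "tasted", "tasting"},
    "feel": {"feel", "feels", "felt", "feeling"},
}
FORM_TO_LEMMA = {form: lemma
                 for lemma, forms in CPV_FORMS.items()
                 for form in forms}


def find_verbal_analogies(sentence: str):
    """Return (first_verb_lemma, second_verb_lemma) pairs matching the pattern."""
    tokens = re.findall(r"[a-z']+", sentence.lower())
    hits = []
    for i, token in enumerate(tokens):
        if token not in FORM_TO_LEMMA:           # (i) first verb
            continue
        for gap1 in range(0, 3):                 # (ii) 0 to 2 intervening words
            k = i + 1 + gap1
            if k >= len(tokens) or tokens[k] != "like":   # (iii) 'like'
                continue
            for gap2 in range(1, 4):             # (iv) 1 to 3 intervening words
                j = k + 1 + gap2
                if j < len(tokens) and tokens[j] in FORM_TO_LEMMA:  # (v)
                    hits.append((FORM_TO_LEMMA[token], FORM_TO_LEMMA[tokens[j]]))
    return hits


print(find_verbal_analogies("This soup tastes like an expensive coat feels."))
print(find_verbal_analogies("It smells a bit like coffee tastes."))
```

Because the sketch has no access to part-of-speech or argument-structure information, its output would still require the kind of manual filtering described in the next paragraph.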
I extracted a random sample of 10,000 hits, which I then manually inspected to filter out false positives and duplicate hits, leaving 869 unique instances of the verbal analogy construction. Next, I hand-coded the sample, filtering out occurrences of the verbs as experiencer verbs rather than CPVs (i.e., with a perceiver rather than a stimulus as grammatical subject), as well as instances where one or both CPVs could be interpreted as evidential rather than attributary. This left 413 instances, presented in Table 2. Of these, the majority were instances containing the same CPV in both verbal positions, meaning they were literal intra-domain comparisons rather than synesthetic metaphors. Only 90 instances were actual cross-domain mappings, indicating that synesthetic metaphors in the verbal analogy construction are quite rare in natural usage. This finding aligns with the low mean naturalness ratings elicited in the experiment.
Note. Upward and downward mappings are in the top right and bottom left halves, respectively. Literal comparisons are in the diagonal, in parentheses.
The small number of relevant instances in the sample makes it impossible to draw statistically reliable conclusions. Perhaps the most conspicuous finding is the high number of mappings between smell and taste (n = 48), which account for over half of the cross-domain mappings. This fits with the substantial difference in naturalness ratings in the experiment, between smell + taste on the one hand, and all other sense combinations on the other hand. However, downward mappings from smell to taste were far more frequent in the corpus (43 of 48), whereas there was no clear preference for either direction in the experimental results.
Upward mappings from touch to sound were somewhat more frequent than the opposite (9 of 13), as were downward mappings from look to sound (10 of 16). These findings are compatible with the directional preferences in the experiment, but the numbers are far too small to be reliable. The entirety of the corpus data is available online at <https://osf.io/2hmcb/>.
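To give a rough sense of why counts of this size are too small to be reliable, the following sketch computes an exact two-sided binomial test against an even split for the two counts just mentioned. This is an illustration only and is not part of the reported analysis; the function name is hypothetical, and the null hypothesis of equal frequency for both directions is an assumption introduced here.

```python
from math import comb


def two_sided_binomial_p(successes: int, n: int) -> float:
    """Exact two-sided binomial p-value against a 50/50 split."""
    k = max(successes, n - successes)
    upper_tail = sum(comb(n, i) for i in range(k, n + 1)) / 2 ** n
    # The null distribution is symmetric, so the two-sided value is twice the tail.
    return min(1.0, 2 * upper_tail)


print(round(two_sided_binomial_p(9, 13), 3))    # touch-to-sound vs. sound-to-touch
print(round(two_sided_binomial_p(10, 16), 3))   # look-to-sound vs. sound-to-look
```

Under these assumptions, both values come out well above 0.05 (approximately 0.27 and 0.45), so neither count distinguishes a directional preference from an even split.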
5. Conclusions
The experiment described above is, to my knowledge, the first experiment to probe directional preferences in synesthetic metaphors while controlling for lexical factors. As such, it constitutes the first attempt to directly test the effect of perceptual factors on directional preferences in synesthetic metaphors. This testing is made possible by focusing on an oft-overlooked set of perception verbs and using a novel construction: CPVs and the verbal analogy construction, respectively. This study thus circumvents a crucial methodological limitation of previous research into synesthetic metaphors, while also broadening the empirical scope of the phenomenon.
The results reveal directional preferences in verbal analogies with CPVs, namely in mappings between touch and sound, and between sight and sound. Although more localized than the overarching preference for upward mappings observed in earlier studies, these preferences do align with previous findings. Mappings from touch to sound, and from sight to sound, have consistently been found to be preferred over their opposites, and generally rank among the top possible mappings in frequency, accessibility, and comprehensibility (Shen & Gil, 2008; Shinohara & Nakayama, 2011; Strik Lievers, 2015; Ullmann, 1945; Williams, 1976; Winter, 2016). Importantly, however, the present findings are the first that cannot be attributed to differences in lexical semantics or lexical coding. I would go further and venture that this is the first evidence for a direct effect of perceptual factors, that is, properties of the sensory modalities themselves, on directional preferences in synesthetic metaphors. The experiment was not designed to explore which perceptual factors these may be, so I refrain from speculating on the matter. Nonetheless, these findings place new and important restrictions on any future theory of synesthetic metaphors, and, I believe, also point to exciting new avenues for future empirical research.
Perhaps the more striking finding arising from the present results is the lack of an overarching effect of mapping direction. In this, the present study diverges from decades of empirical research into synesthetic metaphors, comprising corpus and experimental studies in various languages, and consistently showing a preference for mappings ‘upward’ on Ullmann’s hierarchy of the senses. How can we account for this divergence?
I propose that the preference for upward mappings observed in previous studies is due to one or more factors that are not in effect, or are somehow mitigated, in the present study. Given the goals of this study, some ‘immediate suspects’ are lexical semantic factors, such as affectivity and gradability, along with differences in lexical coding. I have argued that these factors do not differentiate CPVs, and hence are rendered inert in the experiment reported here. At the same time, such factors have previously been shown to reliably predict the frequency and acceptability of synesthetic metaphors (Petersen et al., 2007; Winter, 2016). It is not a huge leap to posit that the overarching preference for upward mappings is due to the accumulated effects of several such lexical factors, some of which we may not yet know about. If that is indeed the case, we might conclude that mapping direction with respect to Ullmann’s hierarchy of the senses is an artifact, with no independent effect on synesthetic metaphors. Put another way, what appears to be a preference for upward mappings may actually turn out to be a conflation of several independent lexical factors, which just so happens to (roughly) fit the ideas of classical philosophers about the senses.
It is also possible that certain factors relevant to directional preferences were unintentionally mitigated in the present study. Here I consider three such factors: the (im)possibility of metaphorical mapping in comparisons, the inherent directionality of the grammatical form, and the interpretability of the synesthetic metaphor.
As noted earlier in the article, I assume here that comparisons in general, and the verbal analogy construction in particular, may involve metaphorical mapping. I take the directional preferences revealed in the present study, which parallel two of the directional preferences most consistently observed with other synesthetic metaphors, as further evidence in support of this assumption. However, let us consider the alternative, that metaphorical mapping is fundamentally impossible in comparisons. Proponents of this view might argue that the reason the present study did not find additional directional preferences, for example, an overarching preference for upward metaphors, is that comparison and metaphor are subject to influence by different perceptual factors. More specifically, it would seem that comparisons are influenced by a subset of the perceptual factors which influence metaphors: those that drive preferences for touch-to-sound and sight-to-sound, but not those that drive the general preference for upward metaphors. This then raises questions regarding which perceptual factors influence which figures of speech, and why they influence one but not the other.
The directionality of a grammatical form is the degree to which the form constrains the direction of metaphorical mapping (see Fishman & Shen, in press; Gil & Shen, 2021; Porat & Shen, 2017). Adjectival modification (e.g., soft brightness) and nominal predication (e.g., brightness is softness) both exemplify high directionality, with strict mapping from adjective to noun and from predicate to subject, respectively. As such, preferences in mapping direction can be clearly detected using naturalness ratings of these constructions. Conversely, genitive constructions (e.g., a softness of a brightness) and comparisons in intransitive collective constructions (e.g., softness and brightness are alike) exemplify low directionality, with mapping direction virtually unconstrained. As such, preferences in mapping direction might be entirely obfuscated in naturalness ratings of these constructions, though they can be revealed using other experimental tasks (Shen & Gadir, 2009). The directionality of comparisons with a subject and a complement (e.g., brightness is like softness) is a matter of debate (e.g., Chiappe & Kennedy, 2001; Glucksberg & Keysar, 1990; Wolff & Gentner, 2011), but plausibly falls somewhere between those two extremes. The verbal analogies used here are such comparisons, and therefore might be less inherently directional than other frequent forms of synesthetic metaphor, especially adjective–noun phrases. Hence, it is possible that there was a preference for upward mappings in the experiment reported here after all, but it went undetected due to the verbal analogy’s relatively low directionality and the nature of the experimental task.
Another factor that may have stymied directional preferences in the experiment reported here is interpretability. A study by Fishman and Shen (in press) suggests that interpretability has an independent effect contributing to directional preferences. Fishman and Shen conducted an experiment testing preference between two grammatical forms of comparisons: an intransitive collective construction (A and B are alike; low directionality) and a construction with a subject and a complement (A is like B; higher directionality). They reasoned that speakers would choose the more directional form when they had a clearer preference for a particular mapping direction. They found a greater preference for the more directional form in interpretable metaphorical comparisons (e.g., Salesmen are like bulldozers) relative to anomalies, that is, uninterpretable metaphorical comparisons (e.g., Deserts are like bulldozers). They conclude that interpretability, though not a necessary condition for directional preferences, plays a role independently of factors like concreteness and typicality. The present findings, namely the low observed frequency of synesthetic metaphors in the verbal analogy construction, and the overall low naturalness ratings elicited for the experimental materials, indicate that synesthetic metaphors in this construction are quite difficult to interpret. It may be that this difficulty stymies the preference for upward mappings relative to more frequent and more interpretable synesthetic metaphors, for example, adjective–noun phrases. This is especially true if many of the latter rely on conventionalized meanings, as argued by Winter (2019b).
In spite of the present study’s limitations with regard to directionality and interpretability, as discussed above, the experiment did find empirical evidence for some directional preferences. This is not to say that the issues of directionality and interpretability should be brushed off. On the contrary, in future research, I intend to address these issues directly, by investigating the interplay between these two factors and directional preferences, not only in synesthetic mapping but in metaphorical mapping more generally. I believe further exploration of these notions is crucial for advancing our understanding of metaphor.
Acknowledgments
I thank Yeshayahu Shen, Mira Ariel, Bodo Winter, Klaus von Heusinger, audiences at the University of Latvia and FTL 5, and the editor and two anonymous reviewers for insightful comments on earlier versions of this article.
Funding statement
This research was supported by the Israel Science Foundation (grant no. 1398/20 to Mira Ariel and grant no. 1196/12 to Yeshayahu Shen).
Competing interests
I declare that I have no competing interests.
Data availability statement
The experimental materials and data, the corpus data, and the scripts for the statistical analysis are available at https://osf.io/2hmcb/.