1. Introduction
A cheesy way of inviting people to smile for a picture is to ask them to ‘say cheese’. The reason is that articulating the vowel in cheese, /i/, leads to a facial expression that resembles smiling (especially when the /i/ is lengthened). This resemblance is consistent with associations between vowels and valence. Participants associate pseudo-words containing /i/ rather than /o/ with positive meaning (Rummer et al., 2014). In the present research, we examine the cross-linguistic generalizability of this phenomenon by comparing associations between vowels and emotional valence for participants speaking unrelated languages: German and Japanese.
1.1. Sound symbolism
In words like slurp or bang, the phoneme sequences imitate the denoted meaning. Although such imitative words are comparatively rare in Indo-European spoken languages, they are more prevalent in other language families (Vigliocco et al., 2014). Many spoken languages contain ideophones, a word class depicting sensory qualities (Dingemanse, 2012). In Japanese ideophones, for example, consonant voicing depicts mass, so that koro denotes a small object rolling and goro denotes a large object rolling. The more general property of language that sublexical features of word forms (e.g., phonemes) are associated with word meaning is called sound symbolism or iconicity (for reviews, see Dingemanse et al., 2015; Lockwood & Dingemanse, 2015; Nuckolls, 1999; Perniss et al., 2010). In addition to its prevalence in lexicons, sound symbolism has also been demonstrated experimentally. Vowels, for example, have been found to be associated with size (Sapir, 1929). To denote large (vs. small) objects, participants tend to choose pseudo-words that contain /a/, such as MAL, over pseudo-words that contain /i/, such as MIL (Newman, 1933; Thompson & Estes, 2011).
In addition to size, sound symbolism has been demonstrated for various other dimensions, for example, shape (Ćwiek et al., 2022; Köhler, 1929), color (Cuskley et al., 2019; Simner et al., 2005), speed (Kuehnl & Mantau, 2013; Monaghan & Fletcher, 2019), taste (Motoki et al., 2020; Pathak et al., 2020), personality (Sidhu et al., 2019), and complexity (Lewis & Frank, 2016).
The present research examines valence sound symbolism, the association between valence of the referent and vowels in the word denoting the referent. Specifically, /i/ (as in English meet) has been found to be associated with positive valence compared with /o/ (as in French rose; Crockett, 1970; Rummer & Schweppe, 2019), /u/ (as in English blue; Crockett, 1970; Garrido & Godinho, 2021), /y/ (as in French tu; Körner & Rummer, 2022a), and /ʌ/ (as in American English gut; Yu et al., 2021). Additionally, associations between consonants and emotional properties have been observed (e.g., Adelman et al., 2018; Aryani et al., 2018; Auracher et al., 2011; Kambara & Umemura, 2021; Körner & Rummer, 2022b; Whissell, 2000; for associations of valence with both vowels and consonants in brand names, see, e.g., Motoki et al., 2022). The association between vowels and valence has been found when participants were asked to judge the valence of words (Yu et al., 2021), guess the meaning of pseudo-words (Körner & Rummer, 2022a), invent pseudo-words when in positive (vs. negative) mood (Rummer et al., 2014), judge the warmth and competence of people with mock user names (Garrido & Godinho, 2021), and give names to valenced faces (Körner & Rummer, 2022a; Rummer & Schweppe, 2019) and valenced objects (Rummer & Schweppe, 2019).
Valence sound symbolism can be explained by an articulatory mechanism relating to facial muscle tension (Körner & Rummer, 2022a). The facial muscle activity involved in articulation overlaps with that involved in emotional expressions, so that the zygomaticus major muscle is active both when articulating /i/ and when smiling (Hardcastle, 1976; see also Rummer et al., 2014; Whissell, 2003). The association between zygomaticus activity and positive valence could have extended, via proprioceptive feedback during articulation, to the vowel /i/, so that the articulation of /i/ is associated with positive valence. In contrast, the articulation of rounded vowels entails contracting muscles that are antagonistic to the ones responsible for lip spreading (Leanderson et al., 1971). Lip rounding could therefore be associated with negative valence or less positive valence (Rummer et al., 2014). Empirically, articulatory similarity (specifically facial muscle tension) rather than acoustic similarity predicts vowel–valence associations (Körner & Rummer, 2022a). Thus, valence sound symbolism – at least for /i/ versus rounded vowels – seems driven by articulatory vowel properties.
If valence sound symbolism is caused by this articulatory mechanism, it should occur for all languages that use vowels whose articulation resembles smiling and ones whose articulation inhibits smiling. As yet, however, valence sound symbolism has been mainly examined for Indo-European languages (English, European Portuguese, German, and Russian), with Mandarin Pinyin (Yu et al., 2021) as the only exception (other studies lacked critical comparisons [e.g., Miron, 1961], employed some vowels that did not occur in the examined languages [Taylor & Taylor, 1962], or examined vowels in isolation, i.e., without word context [Ando et al., 2021]). The present research makes a first step toward testing the cross-linguistic generalizability of valence sound symbolism by comparing speakers of two unrelated languages: German and Japanese.
1.2. Language comparisons
Psychological research in general (Henrich et al., 2010) and also in sound symbolism (Motoki & Pathak, 2022) is biased toward examining Western participants and languages. Among the studies that did compare sound symbolism for unrelated languages, both similarities and differences have been observed. The largest study, examining sound symbolism in the basic vocabulary of more than 4,000 languages, observed, for example, that a large portion of languages use nasal sounds in words for nose (Blasi et al., 2016; see also Johansson et al., 2020). Additionally, size sound symbolism (Huang et al., 1969; Shinohara & Kawahara, 2010; see also Blasi et al., 2016) and shape sound symbolism (Ćwiek et al., 2022) have been observed across many languages (for other cross-linguistic similarities, see, e.g., Dingemanse et al., 2013; Winter et al., 2022). However, differences between languages have also been observed, for example, concerning valence associations. Nasal consonants at word beginnings have been found to be associated with positive valence in speakers of some Germanic languages but with negative valence in speakers of Chinese (Louwerse & Qu, 2017; for other differences between languages, see, e.g., Taylor & Taylor, 1962; for a mixture of similarities and differences, see Athaide & Klink, 2012).
In sum, there is evidence for cross-linguistic generalization for some sound symbolic associations, but also evidence for language-specific associations (see also Imai & Kita, 2014).
In the present experiment, we compare emotional valence sound symbolism across two unrelated languages, German (an Indo-European language) and Japanese (a Japonic language). Although several sound symbolism phenomena have been demonstrated in both languages, for example, associations with size (Shinohara & Kawahara, 2010), color (Asano & Yokosawa, 2011), and shape (Ćwiek et al., 2022; Kawahara et al., 2019), some studies observe different and especially more sound symbolic associations for speakers of Japanese compared with Indo-European languages (e.g., Iwasaki et al., 2007; see also Saji et al., 2019). Similarly, ideophones are underdeveloped in German as well as other languages from the Indo-European language family (see, e.g., Dingemanse & Majid, 2012), but very prevalent in Japanese, so that, according to Kakehi and colleagues (Kakehi et al., 1996, xi), in Japanese “the occurrence of iconic words […] is anything but marginal. Such forms are indispensable to daily communication.” Thus, although some associations are similar, German and Japanese differ in their prevalence of sound symbolism. Judging from previous research, therefore, it is unclear whether or not to predict the same vowel–valence associations across the two languages.
Judging from theoretical considerations, however, we make the same predictions for /i/ and rounded vowels. As smiling is universally used to express joy (e.g., Scherer & Wallbott, 1994), the muscle tension to express positive affect and the muscle tension to articulate /i/ should overlap in all languages where /i/ occurs and involves activity of the zygomaticus major muscle. Accordingly, the proposed mechanism for valence sound symbolism – overlapping muscle activity for articulation and emotional expressions – predicts that /i/ is universally associated with more positive valence than vowels whose articulation is incongruent with smiling, specifically rounded vowels.
To test this hypothesis, we employed the experimental paradigm from Rummer and Schweppe (2019) in which participants are asked to invent pseudo-words to denote specific objects or people (for similar paradigms, see Berlin, 2006; Shinohara et al., 2016; Vinson et al., 2021; Whissell, 2000). This paradigm contains fewer constraints than typically employed paradigms where participants have to rate or match experimenter-selected pseudo-words. When using experimenter-selected pseudo-words, any aspect of the pseudo-word might influence judgments. For example, position of the target letter in pseudo-words has been found to influence judgments (e.g., Maschmann et al., 2020; Nielsen & Rendall, 2013). Moreover, when comparing speakers of different languages, such incidental pseudo-word features might influence speakers of different languages differently. Therefore, using a paradigm where linguistic stimuli are as unconstrained as possible, as is the case when participants invent pseudo-words, is likely to be least biased, which seems especially important for cross-linguistic studies.
In the present study, participants invented pseudo-names for faces that differed in emotional expression. Specifically, we compared vowel usage for faces with positively valenced emotional expressions against vowel usage for faces with neutral expressions and with two negatively valenced emotional expressions: anger and sadness. Comparing the two negative expressions enables us to explore whether, in addition to emotional valence, arousal (high for anger and low for sadness) also influences vowel usage.
As participants typed in the pseudo-words, we examined vowel usage at the grapheme (instead of phoneme) level. Both languages have a close grapheme-to-phoneme mapping, so that each vowel grapheme corresponds to one phoneme (in Japanese) or to one of a few similar phonemes (in German). We examined how frequently the vowels A, E, I, O, and U, which constitute all Japanese vowels, were used in invented names depending on both participant language and emotional expression of the depicted face. We predicted the vowel I to be associated with positive emotional valence and O and U with negative emotional valence, for both Japanese-speaking and German-speaking participants. We had no specific predictions for A and E but included these vowels for exploratory purposes.
2. Method
2.1. Participants
Participants were recruited through social media or approached in person and invited to participate. Those who were recruited online received a link to the study; those who agreed to participate when asked in person participated on site (mostly in cafés) using their own or the experimenter’s laptop. A total of 134 participants, 76 German-speaking and 58 Japanese-speaking, completed the study (8 additional participants, 4 from each country, started the study but terminated less than 10% into the study). Of these, participants who reported a native language other than the expected language (German in Germany and Japanese in Japan; N = 18) and participants who provided existing names or existing words instead of self-invented words in more than 50% of the trials (N = 17) were excluded from all analyses, resulting in a final sample size of 99 participants (49 Japanese-speaking, 50 German-speaking; 39 female, 58 male, 2 other gender; $M_{\mathrm{age}}$ = 35, $SD_{\mathrm{age}}$ = 12).
This yields a power of 1 − β = .80 (with α = .05) for finding an effect of $d_z$ = 0.28 for within-participants effects of emotional expressions. For the exploratory question whether there is an interaction between participant origin and emotional expression, the study has a power of 1 − β = .80 (with α = .05) for finding an effect of $\eta_{\mathrm{p}}^2$ = .02. Final sample size depended on logistical constraints and was determined before any data analysis was performed. We report all data exclusions, all manipulations, and all measures. Materials, data, and analysis code are available at https://osf.io/bdrsh/.
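As a rough plausibility check of the within-participants sensitivity figure, the following sketch approximates the power of a two-sided paired test with a normal approximation. It is not the procedure used in the study (which is not described here and was presumably based on an exact noncentral t distribution); the function name `paired_power_normal_approx` is hypothetical.

```python
from math import sqrt
from statistics import NormalDist


def paired_power_normal_approx(d_z: float, n: int, alpha: float = 0.05) -> float:
    """Approximate two-sided power for a paired comparison.

    Uses the normal approximation: the test statistic is treated as
    N(d_z * sqrt(n), 1) under the alternative hypothesis.
    """
    std_normal = NormalDist()
    z_crit = std_normal.inv_cdf(1 - alpha / 2)  # two-sided critical value
    shift = d_z * sqrt(n)                       # approximate noncentrality
    upper_tail = std_normal.cdf(shift - z_crit)
    lower_tail = std_normal.cdf(-shift - z_crit)
    return upper_tail + lower_tail


# Values taken from the text: final N = 99, d_z = 0.28, alpha = .05.
power = paired_power_normal_approx(d_z=0.28, n=99)
print(f"approximate power: {power:.3f}")
```

Under this approximation the result lands close to the reported .80; an exact t-based calculation would give a slightly lower value.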
2.2. Materials
Participants were asked to invent names for faces taken from the Karolinska Directed Emotional Faces (Lundqvist et al., 1998; for European faces), and from the Taiwanese Facial Expression Image Database (Chen & Yen, 2007; for East Asian faces). From each database, eight male and eight female persons were selected, and for each person one picture of each of four facial expressions (happy vs. neutral vs. angry vs. sad) was selected. The faces were cropped to show only the face (chin to hair line and ear to ear) and were converted to gray scale (where necessary, brightness was adjusted). Each participant saw one randomly selected picture per face.
2.3. Procedure
After providing informed consent, participants were asked to invent a name for each of 32 ensuing faces. The names should not exist in a language they knew and should be at least two syllables long. Faces were presented separately and in random order (two for each combination of gender, cultural background, and emotional expression). For each face, participants were to invent a name, then to articulate this name, and finally to type it; however, some participants did not consent to have their voice recorded and some participants preferred to have the experimenter type in their responses. As additionally the quality of many audio files was poor, a phonemic transcription of the spoken pseudo-words was infeasible, which is why we report only grapheme-based analyses. All phases of the experiment were self-paced.
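The text does not state how the random assignment of expressions to faces was implemented. The sketch below shows one way to satisfy the stated constraint (32 persons, each shown once, with two faces per combination of gender, cultural background, and emotional expression). The person IDs and the function name `assign_expressions` are hypothetical.

```python
import random
from collections import Counter
from itertools import product

GENDERS = ("female", "male")
BACKGROUNDS = ("European", "East Asian")
EXPRESSIONS = ("happy", "neutral", "angry", "sad")


def assign_expressions(rng: random.Random) -> list[dict]:
    """Build one participant's trial list of 32 faces.

    Each gender x background cell holds 8 persons; within a cell, every
    expression is assigned to exactly two randomly chosen persons.
    """
    trials = []
    for gender, background in product(GENDERS, BACKGROUNDS):
        # Hypothetical person identifiers, 8 per cell.
        persons = [f"{background}-{gender}-{i}" for i in range(1, 9)]
        expressions = list(EXPRESSIONS) * 2  # two faces per expression
        rng.shuffle(expressions)
        for person, expression in zip(persons, expressions):
            trials.append({
                "person": person,
                "gender": gender,
                "background": background,
                "expression": expression,
            })
    rng.shuffle(trials)  # random presentation order
    return trials


trials = assign_expressions(random.Random(0))
cell_counts = Counter(
    (t["gender"], t["background"], t["expression"]) for t in trials
)
print(len(trials), "trials;", len(cell_counts), "design cells")
```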
After inventing names for 32 faces, participants were asked, as a manipulation check, to rate the same faces (in new random order) for valence. Specifically, for each face, they answered the question What in your opinion describes the facial expression of this person? (translated) responding by clicking one of five numbers (1 = very positive; 2 = positive; 3 = neutral; 4 = negative; 5 = very negative). Finally, participants provided demographic information and could comment on the study.
2.4. Statistical analysis
As the experiment included nonindependence due to both repeated measures within participants and repeated measures of stimuli (for different participants), we report linear mixed-effects analyses using R (R Core Team, 2021; version 4.1.2) and the packages lme4 (version 1.1-30; Bates et al., 2015) and lmerTest (version 3.1-3; Kuznetsova et al., 2017). We used a maximal random effects structure (Barr et al., 2013). When this resulted in negative eigenvalues, random effects were removed until the issue was resolved. As significance tests, we report Type III Analysis of Variance with the Satterthwaite method for calculating degrees of freedom (for more information on this method, see Kuznetsova et al., 2017). As there is no generally accepted effect size measure for linear mixed-effects analyses, we report $\eta_{\mathrm{p}}^2$ and $d_z$, calculated from participant-level data.
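For readers unfamiliar with $d_z$, the following minimal sketch computes it under its standard definition: the mean of the paired differences divided by their standard deviation. The participant-level values are made up for illustration, and the function name `cohens_dz` is hypothetical; the study's own analysis code is in the linked repository.

```python
from statistics import mean, stdev


def cohens_dz(condition_a: list[float], condition_b: list[float]) -> float:
    """Standardized mean difference for paired (within-participant) data."""
    if len(condition_a) != len(condition_b):
        raise ValueError("both conditions need one value per participant")
    differences = [a - b for a, b in zip(condition_a, condition_b)]
    # Sample standard deviation (n - 1 in the denominator).
    return mean(differences) / stdev(differences)


# Hypothetical per-participant mean occurrences of a vowel per word.
happy_faces = [0.9, 0.7, 0.8, 0.6, 0.75]
sad_faces = [0.5, 0.6, 0.4, 0.55, 0.5]
print(f"d_z = {cohens_dz(happy_faces, sad_faces):.2f}")
```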
3. Results
For the manipulation check, valence evaluations were entered into a 4 (emotional expression: happy vs. neutral vs. sad vs. angry; within participants) × 2 (participant language; between participants) factorial linear mixed-model analysis. There was no main effect of participant native language on valence evaluations (F(1, 97) = 0.59, p = 0.445, $\eta_{\mathrm{p}}^2$ = .006, 90% CI = [.000, .056]). However, confirming the validity of the manipulation, the emotional expression did influence valence judgments (F(3, 3,063) = 3,604.73, p < 0.001, $\eta_{\mathrm{p}}^2$ = .932, 90% CI = [.921, .940]). Specifically, faces with the two negative emotional expressions did not significantly differ in valence, whereas all other pairwise comparisons show significant differences (see Table 1). In addition to this main effect of emotional expression, the interaction between participant language and emotional expression was also significant (F(3, 3,063) = 25.41, p < 0.001, $\eta_{\mathrm{p}}^2$ = .088, 90% CI = [.037, .137]; see Fig. 1). For simple comparisons, see the Supplementary Material.
Table 1. Descriptive statistics and pairwise comparisons for the manipulation check, consisting of mean valence evaluations depending on facial expression of the depicted face

Note. The tests refer to linear mixed models, comparing the valence ratings for faces with happy, neutral, angry, and sad emotional expressions. The test statistics are supplemented by effect sizes (and 95% confidence intervals) calculated from participant-level data. The values in the diagonals are the mean (and SE) values of these valence ratings (from 1 = very positive to 5 = very negative).

Fig. 1. Results from the manipulation check: mean valence evaluations depending on facial expression and on participant native language. Note. The figure depicts mean valence ratings (from 1 = very positive to 5 = very negative) depending on emotional expression and participant language. The black dots with error bars represent means with 95% confidence intervals. The shapes are density plots.
For the main analysis, examining how participant language and emotional expression influenced which vowels were used when inventing pseudo-words, each pseudo-word was coded by a native speaker blind to condition. Real words in the target language or in English as well as words that were repeated more than twice were excluded from analyses (8.2% of the words). Hiragana and Katakana moras were transliterated using the Hepburn system (using the R package stringi, version 1.7.6; Gagolewski, 2022) and accents were removed. Repeated consecutive vowel graphemes were replaced by single graphemes (e.g., Obaata was changed to Obata), and the number of occurrences for each vowel grapheme per word was calculated (in the preceding example, the value for A is 2, for O it is 1, and for all other vowels, it is 0).
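The grapheme-counting step can be illustrated with a short sketch. It assumes the word has already been transliterated to Latin script, and it removes accents by dropping all combining marks, which is only one possible reading of "accents were removed" (it would also turn a German umlaut into its base vowel). The function names are hypothetical, and the original analysis was carried out in R.

```python
import re
import unicodedata
from collections import Counter

VOWELS = "AEIOU"


def strip_accents(word: str) -> str:
    """Drop combining marks (assumption: 'accents' means all diacritics)."""
    decomposed = unicodedata.normalize("NFD", word)
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))


def collapse_repeated_vowels(word: str) -> str:
    """Replace runs of the same vowel grapheme with a single grapheme."""
    return re.sub(r"([aeiou])\1+", r"\1", word, flags=re.IGNORECASE)


def count_vowel_graphemes(word: str) -> dict[str, int]:
    """Count occurrences of A, E, I, O, and U after cleaning the word."""
    cleaned = collapse_repeated_vowels(strip_accents(word)).upper()
    counts = Counter(ch for ch in cleaned if ch in VOWELS)
    return {vowel: counts.get(vowel, 0) for vowel in VOWELS}


# Example from the text: Obaata becomes Obata, so A = 2 and O = 1.
print(collapse_repeated_vowels("Obaata"))
print(count_vowel_graphemes("Obaata"))
```

Averaging these per-word counts within each participant and emotional expression would give the mean vowel occurrences that enter the mixed model.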
The mean vowel occurrence per invented word was then entered into a 5 (grapheme: A vs. E vs. I vs. O vs. U; within participants) × 4 (emotional expression: happy vs. neutral vs. sad vs. angry; within participants) × 2 (participant language: Japanese vs. German; between participants) factorial linear mixed model. The three-way interaction was significant, indicating that participant language and emotional expression influenced the usage of different graphemes differently (F(12, 13,541) = 4.00, p < 0.001, $\eta_{\mathrm{p}}^2$ = .030, 90% CI = [.008, .038]; for lower-order effects, see the Supplementary Material). Frequencies of grapheme occurrences depending on emotional expression and participant language were then analyzed separately for each vowel.
For the vowel I, there was no main effect of language (F(1, 96) = 0.01, p = 0.940, $\eta_{\mathrm{p}}^2$ < .001, 90% CI = [.000, .014]). Moreover, the interaction of language and emotional expression also failed to reach significance (F(3, 211) = 2.55, p = 0.057, $\eta_{\mathrm{p}}^2$ = .029, 90% CI = [.001, .061]). However, I occurrences did differ depending on emotional expression (F(3, 104) = 22.49, p < 0.001, $\eta_{\mathrm{p}}^2$ = .346, 90% CI = [.273, .409]; see Fig. 2). I was used more frequently in pseudo-words for people with happy facial expressions than for people with other facial expressions. Among the other three emotions, I occurrences did not differ significantly (see Table 2). Thus, replicating previous valence sound symbolism findings (e.g., Rummer & Schweppe, 2019), I was associated with positive emotional valence.

Fig. 2. Results of the main analysis: mean frequencies of the vowel I depending on participant language and emotional expression on the named face. Note. The figure depicts mean occurrences of the grapheme I per word depending on emotional expression and participant language. The black dots with error bars represent means with 95% confidence intervals. The shapes are density plots.
Table 2. Descriptive statistics and pairwise comparisons for the frequency of occurrence for the vowels I, O, and U depending on emotional expression

Note. Values in the diagonals are the mean (and SE) occurrences of the target vowel per word for the emotional expression. Tests are linear mixed-model comparisons of the target vowel usage when creating pseudo-words for faces with the specific emotional expressions. Effect sizes (and 95% confidence intervals) are calculated from participant-level data.
For the vowel O, there was also no influence of participants’ native language, neither as a main effect (F(1, 95) = 0.96, p = 0.330, $\eta_{\mathrm{p}}^2$ = .011, 90% CI = [.000, .069]), nor as an interaction of language and emotional expression (F(3, 142) = 1.72, p = 0.167, $\eta_{\mathrm{p}}^2$ = .026, 90% CI = [.000, .055]). However, O occurrences did differ depending on emotional expression (F(3, 73) = 11.82, p < 0.001, $\eta_{\mathrm{p}}^2$ = .135, 90% CI = [.075, .192]; see Fig. 3). O was used less frequently in names for people with happy facial expressions than for people with other facial expressions. Among the other three emotional expressions, there were no significant differences (see Table 2). Thus, replicating the findings of Rummer and Schweppe (2019), O was associated less with positive emotional valence than with negative or neutral emotional valence.

Fig. 3. Results of the main analysis: mean frequencies of the vowel O depending on participant language and emotional expression on the named face. Note. The figure depicts mean occurrences of the grapheme O per word depending on emotional expression and participant language. The black dots with error bars represent means with 95% confidence intervals. The shapes are density plots.
For the vowel U, there was also neither a significant main effect of language (F(1, 96) = 1.43, p = 0.235, $\eta_{\mathrm{p}}^2$ = .015, 90% CI = [.000, .078]), nor a significant interaction of language and emotional expression (F(3, 130) = 2.10, p = 0.103, $\eta_{\mathrm{p}}^2$ = .030, 90% CI = [.001, .062]). However, U occurrences did differ depending on emotional expression (F(3, 90) = 17.29, p < 0.001, $\eta_{\mathrm{p}}^2$ = .194, 90% CI = [.126, .256]; see Fig. 4). Qualitatively identical to O, U was used less frequently in names for people with happy facial expressions than for people with other facial expressions, while among the other three emotions, U occurrences did not differ significantly (see Table 2).

Fig. 4. Results of the main analysis: mean frequencies of the vowel U depending on participant language and emotional expression on the named face. Note. The figure depicts mean occurrences of the grapheme U per word depending on emotional expression and participant language. The black dots with error bars represent means with 95% confidence intervals. The shapes are density plots.
For the vowel E, there were no significant effects. Specifically, neither the main effect of language (F(1, 99) = 3.79, p = 0.054, $\eta_{\mathrm{p}}^2$ = .041, 90% CI = [.000, .123]), nor the main effect of emotional expression (F(3, 80) = 0.90, p = 0.443, $\eta_{\mathrm{p}}^2$ = .015, 90% CI = [.000, .038]), nor the interaction of language and emotional expression was significant (F(3, 116) = 0.50, p = 0.686, $\eta_{\mathrm{p}}^2$ = .008, 90% CI = [.000, .023]; see Fig. 5).

Fig. 5. Results of the main analysis: mean frequencies of the vowel E depending on participant language and emotional expression on the named face. Note. The figure depicts mean occurrences of the grapheme E per word depending on emotional expression and participant language. The black dots with error bars represent means with 95% confidence intervals. The shapes are density plots.
For the vowel A, there was no main effect of language (F(1, 98) = 0.00, p = 0.999, $\eta_{\mathrm{p}}^2$ < .001, 90% CI = [.000, .020]). However, there was a main effect of emotional expression (F(3, 115) = 11.89, p < 0.001, $\eta_{\mathrm{p}}^2$ = .094, 90% CI = [.042, .145]; for simple comparisons, see Table 3). In contrast to the other vowels, for A, there was a significant interaction of language and emotional expression (F(3, 2798) = 4.04, p = 0.007, $\eta_{\mathrm{p}}^2$ = .045, 90% CI = [.009, .084]; see Fig. 6). Whereas, for German-speaking participants, there was no significant influence of emotional expression on A occurrences (F(3, 73) = 1.00, p = 0.400, $\eta_{\mathrm{p}}^2$ = .021, 90% CI = [.000, .057]), for Japanese-speaking participants, the influence of emotional expression was significant (F(3, 1,417) = 14.70, p < 0.001, $\eta_{\mathrm{p}}^2$ = .209, 90% CI = [.110, .296]). Specifically, for Japanese-speaking participants, A was more frequently used for people with happy emotional expressions compared with all other examined expressions. Additionally, A was used more frequently for people with neutral expressions than for people with negative expressions. For the two negative expressions, the frequency of A did not differ (see Table 3). In sum, in contrast to German-speaking participants, for Japanese-speaking participants, A was associated more with positive and less with negative emotional valence compared with neutral valence.
Table 3. Descriptive statistics and pairwise comparisons for the frequency of occurrence for the vowel A depending on emotional expression for all participants and for Japanese participants separately

Note. The values in the diagonals are the mean (and SE) occurrences of the target vowel per word for the emotional expression. The tests are linear mixed-model comparisons of the target vowel usage when creating pseudo-words for faces with the specific emotional expressions. Effect sizes (and 95% confidence intervals) are calculated from participant-level data.

Fig. 6. Results of the main analysis: mean frequencies of the vowel A depending on participant language and emotional expression on the named face. Note. The figure depicts mean occurrences of the grapheme A per word depending on emotional expression and participant language. The black dots with error bars represent means with 95% confidence intervals. The shapes are density plots.
4. Discussion
The aim of the present work was to gain a deeper understanding of valence sound symbolism by comparing participants speaking two unrelated languages. Using the articulation-based explanation of valence sound symbolism (Körner & Rummer, 2022a; built on Rummer et al., 2014; Rummer & Schweppe, 2019), we predicted that I, because its muscle tension overlaps with smiling, would be associated with positive emotional valence, whereas vowels that involve antagonistic muscle tension (the rounded vowels O and U) would be associated with less positive emotional valence. When inventing names for people with different facial expressions, Japanese-speaking and German-speaking participants preferentially used I in names for people with happy facial expressions compared with both neutral and negative (angry and sad) facial expressions. Conversely, O and U were used less for people with happy facial expressions compared with neutral and negative expressions. None of these results were moderated by participant language, indicating that valence sound symbolism generalizes across the two employed languages: German and Japanese.
Another extension compared to previous research, which mostly examined two (e.g., Rummer & Schweppe, 2019; Yu et al., 2021) or three vowels (Körner & Rummer, 2022a), was that the present research examined occurrences of all five Japanese vowels. Exploratory analyses indicated that E is not strongly associated with emotional valence in either language, as the usage of E did not differ across emotional expressions. However, A was associated with positive emotional valence for Japanese-speaking participants but not for German-speaking participants. Thus, except for A, the present results indicate that emotional valence associations in these two languages are similar.
The association of A with positive emotional valence for Japanese speakers (although not German speakers) might seem surprising because, in previous research, /i/ has been contrasted with another a-type vowel, /ʌ/, and the latter seemed to be a negatively associated vowel (Yu et al., 2021; for a similar result using syllables and /a/ instead of /ʌ/, see Tarte, 1982). However, the previously employed paradigm did not test whether /ʌ/ is associated with negative valence more strongly than with positive valence. In Yu et al. (2021), the task consisted in indicating whether a word containing /i/ compared with /ʌ/ was more positive. Therefore, it is possible that both vowels are associated with positive rather than negative valence, only /i/ more strongly than /ʌ/. Testing this reasoning in the present data, we find an interaction between vowel (I vs. A) and emotional expression (see the Supplementary Material), indicating that, for Japanese-speaking participants, I is more strongly associated with positive (compared with other) emotions than A. Thus, although A is more strongly associated with positive than neutral or negative emotional valence, this valence association is less strong than for I. In sum, both findings can be reconciled; /a/ and /ʌ/ could be sound symbolically less positive than /i/, but /a/ need not be associated with negative valence but instead could be neutral or somewhat positive in its valence association.
In general, the present results seem driven by positive (rather than negative) emotional valence. That is, vowel usage for faces with positive expressions was different from the rest, whereas there were significant differences neither between neutral and negative faces nor between the two types of negative faces. In particular, vowel usage was not influenced by arousal in the present research, as it did not differ for anger (a high-arousal emotion) and sadness (a low-arousal emotion). The finding that positive instead of negative emotional valence drives valence sound symbolism is similar to the results reported in Rummer and Schweppe (2019), where the simple comparisons also generally resulted in significant differences between positive and both neutral and negative valence, but no significant differences between the latter two. Thus, rounded vowels were not specifically associated with negative valence but rather less strongly with positive valence. Although early research on valence sound symbolism postulated an association with negative valence rather than a weaker association with positive valence for rounded vowels (Rummer et al., 2014), the described mechanism is more consistent with the present findings. This mechanism rests on a facilitation (vs. inhibition) of smiling, specifically activation (vs. inhibition) of the zygomaticus major muscle. Rounded vowels, by involving the contraction of zygomaticus antagonists, are associated with less positive valence than other vowels but not with negative valence. In other words, valence sound symbolism seems driven by the contraction (vs. inhibition) of smiling muscles, so that positive valence drives the observed valence sound symbolism effect for I compared to rounded vowels.
The major caveat of the present research is that we examined vowels only on the (Latinized) grapheme level. Both examined languages have the five vowels /a/, /e/, /i/, /o/, and /u/. The Japanese vowel system comprises these five vowels. Although the German vowel system is larger, the German vowels /a/, /e/, /i/, /o/, and /u/ are similar to the respective Japanese vowels. The only exception is /u/, which involves slightly different articulation; in German, /u/ is a close-back rounded vowel ([uː]), whereas in Japanese, /u/ is also close-back but unrounded ([ɯ̟]) or compressed ([ɯ̟ᵝ]). Accordingly, when taking only the coarse five-vowel grapheme distinction into account, Japanese and German vowels can be compared. Still, for a more complete examination of vowel–valence associations, future research should examine vowels on the phoneme instead of the grapheme level. This would be useful for German and other languages that contain more vowel phonemes than graphemes, and it is imperative for languages, such as English, where there is no close grapheme-to-phoneme mapping.
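The grapheme-level measure described above (occurrences of a vowel grapheme per invented name) can be illustrated with a minimal Python sketch. This is not the authors' analysis code; the function names `vowel_counts` and `mean_occurrences` and the example pseudo-names are hypothetical. The sketch assumes that names are already Latinized and that only the plain graphemes A, E, I, O, and U are counted, so umlauts such as Ä, Ö, and Ü are ignored here; that is a simplification of this illustration, not a statement about the original coding.

```python
from collections import Counter

VOWELS = "AEIOU"  # the coarse five-grapheme distinction


def vowel_counts(word: str) -> dict:
    """Count how often each of the five vowel graphemes occurs in one word."""
    counts = Counter(ch for ch in word.upper() if ch in VOWELS)
    return {v: counts.get(v, 0) for v in VOWELS}


def mean_occurrences(words: list) -> dict:
    """Mean occurrences of each vowel grapheme per word across a list of words."""
    if not words:
        raise ValueError("need at least one word")
    totals = Counter()
    for word in words:
        totals.update(vowel_counts(word))  # adds the per-word counts
    return {v: totals[v] / len(words) for v in VOWELS}


# Hypothetical pseudo-names, invented for this illustration only.
names_for_one_face = ["Mirika", "Linis"]
print(mean_occurrences(names_for_one_face))
```

A phoneme-level analysis, as suggested for future research, would need a transcription step before counting, which this sketch does not attempt.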
The present manipulation uses pictures of emotional facial expressions. Positive facial expressions entail smiling so that facial mimicry might have led to participants’ smiling when looking at positive expressions, which might in turn have facilitated I usage in pseudo-names for positive expressions. Although we cannot rule out this possibility in the present study, previous research has observed valence sound symbolism for /i/ compared to rounded vowels when mimicry was inhibited, for example, when participants invented names while holding a pen between their lips (which blocks contraction of the zygomaticus major; Rummer & Schweppe, 2019, Exp. 2); and when mimicry was impossible because no faces were presented, for example, when participants invented words for valenced objects (e.g., coffin vs. dolphin; Rummer & Schweppe, 2019, Exps. 3 and 4), or when participants judged the competence of a person known only by user name (Garrido & Godinho, 2021). Thus, although in the present study it might have increased the effect size, mimicry is not necessary for valence sound symbolism.
Although sound symbolism is a vibrant research area, the psychological mechanisms that drive sound symbolism are frequently unclear (Sidhu & Pexman, 2018). Probably the broadest distinction of mechanisms is between associations whose origin is incidental co-occurrence (also called conventional sound symbolism; Hinton et al., 1994) and associations that are driven by psychologically meaningful processes (synesthetic sound symbolism; Hinton et al., 1994). Incidental associations could stem from accidental clustering of specific sublexical features for related meanings. Statistical learning could then lead to associations between these word form features and the depicted meaning, which might, in turn, lead to an increasing number of words being coined and persisting that fit this association. Incidental clustering might be similar in related languages but should be less similar in unrelated languages.
In contrast, psychologically meaningful sound symbolism phenomena rest on general psychological processes that can result from ecological or embodied experiences (Körner et al., 2022). For example, pitch height overlaps between sounds emitted by small objects and high vowels (Ohala, 1984). Accordingly, size sound symbolism might originate from ecological co-occurrences between (small vs. large) object size and (high vs. low) auditory pitch elicited by objects or animals. Whenever these experiences are universal (independent of, say, geographic and cultural aspects), they should have a similar probability of leading to sound symbolic associations across unrelated language families. Conversely, finding that a sound symbolism phenomenon occurs in unrelated language families can be seen as an indication of a psychologically meaningful association. Thus, although statements about wider prevalence of valence sound symbolism require evidence from a much larger number of unrelated languages, the present result lends initial support to the argument that valence sound symbolism could reflect a psychologically meaningful association.
Supplementary Materials
To view supplementary material for this article, please visit http://doi.org/10.1017/langcog.2022.39.
Data availability statement
All data, analysis scripts, and materials can be found at https://osf.io/bdrsh/.
Acknowledgement
The authors are very grateful to Hikaru Watanabe for translating the instructions and recruiting the participants.
Competing interests
The authors declare no competing interests exist.