1. Introduction
This paper deals with phonetic differences between information-seeking questions (ISQs) and rhetorical questions (RQs) in Icelandic, specifically voice quality (VQ) and speaking rate/global duration, focusing on polar and wh-questions. The prosody of the two illocution types (ISQs, RQs) has recently been compared for several languages, among them English (Dehé & Braun 2020b), German (Braun et al. 2019), Standard Chinese (Zahner et al. 2021), French (Beyssade & Delais-Roussarie, to appear), Estonian (Asu, Sahkai & Lippus 2020), Italian (Sorianello 2018, 2019), Cantonese (Lo, Kiss & Tulling 2019), Japanese (Miura & Hara 1995) and Icelandic (Dehé, Braun & Wochner 2018, Dehé & Braun 2020a); see Dehé et al. (2022) for an overview. These studies show that speakers make use of the same prosodic parameters to indicate rhetorical meaning across languages: f0, constituent duration/speaking rate, and VQ. The ways in which they are used vary across languages, with most variation for f0 modification. Some of the f0-related variation follows from prosodic typology (e.g. intonation languages vs. tone languages) and language-specific pitch accent inventories. Regarding VQ, non-modal VQ often signals rhetorical meaning. For example, breathy VQ occurs in sentence-initial position in German polar and wh-RQs (Braun et al. 2019) and English wh-RQs (Dehé & Braun 2020b). In Chinese, glottal VQ occurs more frequently in RQs than in ISQs in both initial and final positions (Zahner et al. 2021).
In German, VQ also distinguishes between questions and statements in general (e.g. more breathy voice in declarative questions than in declarative statements, Niebuhr et al. 2010). In several African languages, breathiness has been associated with questionhood (Rialland 2009 for utterance-final breathiness in polar questions in languages of the Gur family). Constituent durations are generally longer, or speaking rate slower, in RQs than in ISQs across languages. A faster speaking rate has also been observed in declarative questions than in string-identical statements (van Heuven & van Zanten 2005 for Manado Malay, Orkney English and Dutch, Niebuhr et al. 2010 for German). This is potentially relevant because RQs have been argued to be assertion-like (e.g. Han 2002), in that eventually, all discourse participants are committed to the propositional content of the utterance, and RQs are thus closer in meaning to statements than to questions.
For Icelandic specifically, Dehé & Braun (2020a) show that ISQs and RQs differ in nuclear pitch accent types and in the type and frequency of prenuclear accents. The default boundary tone is low (L%) across question and illocution types. Regarding duration, the first word of the utterance (finite verb in polar questions, wh-word in wh-questions) and the nuclear syllable (first syllable of object noun) are longer in RQs than in ISQs (Dehé et al. 2018). VQ and global durational parameters have not yet been included in the prosodic comparison of RQs and ISQs in Icelandic. The present paper addresses these research gaps, showing that Icelandic exploits both VQ and speaking rate/duration to distinguish between RQs and ISQs.
2. Methodology
The current study is a post hoc analysis of Dehé & Braun’s (2020a) data. While Dehé & Braun (2020a) focus on the intonation of ISQs vs. RQs, the present paper investigates VQ and speaking rate/duration. The data were elicited in a production experiment mimicking dialogue situations; materials consisted of 21 pairs of polar and 21 pairs of wh-interrogatives (e.g. (1)). All wh-questions started with the wh-pronoun hver ‘who’; the subject in all polar questions was einhver ‘anybody’.
Data of 17 native speakers of Icelandic were analysed (average age 26.9 years; age range 20–32 years; 11 female, six male). Overall, 645 target interrogatives were analysed, 313 polar (156 ISQs, 157 RQs) and 332 wh-questions (166 ISQs, 166 RQs), exactly the same utterances as in Dehé & Braun (2020a). They were annotated in Praat (Boersma & Weenink 2018). Following Braun et al. (2019), VQ was annotated on a perceptual basis, at four positions. VQ was annotated by the second author, with 7% of the data also annotated by a research assistant. Interrater reliability (Cohen’s Kappa, Cohen 1960) showed substantial agreement (90%, κ = 0.71) (Landis & Koch 1977). In wh-questions, the four positions were (i) the sentence-initial wh-word (hver /khvεːr/), (ii) the initial, stressed syllable of the finite verb (e.g. /pɔr/ in borðar /ˈpɔr.ðar/ ‘eats’), (iii) the initial, stressed syllable of the object noun (e.g. /liː/ in límónur /ˈliː.mo͡u.ˌnʏr/ ‘limes’), and (iv) the offset of the utterance (e.g. last syllable of límónur). In polar questions, the four positions were (i) the stressed syllable of the sentence-initial verb (e.g. /pɔr/ in borðar), (ii) the initial, stressed syllable of the subject einhver (/e͡in/ in /ˈe͡in.khvεr/ ‘anybody’), (iii) the stressed syllable of the object noun, and (iv) the offset of the sentence. Three types of VQ were perceptually classified: modal (neutral mode of phonation, Laver 1980), breathy (audible friction of the air) and glottalized (low frequency irregular vocal fold vibrations, Braun et al. 2021).
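The interrater reliability measure mentioned above can be illustrated with a minimal Python sketch of Cohen's kappa for two annotators. The function name `cohens_kappa` and the label lists are hypothetical and serve illustration only; they are not the study's annotations or the authors' script.

```python
from collections import Counter

def cohens_kappa(rater_a, rater_b):
    """Return (observed agreement, Cohen's kappa) for two equal-length label lists."""
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("label lists must be non-empty and of equal length")
    n = len(rater_a)
    # Observed agreement: share of items that received the same label from both raters.
    p_o = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    # Chance agreement: sum over labels of the product of the raters' marginal proportions.
    counts_a, counts_b = Counter(rater_a), Counter(rater_b)
    p_e = sum(counts_a[label] * counts_b[label] for label in counts_a) / (n * n)
    if p_e == 1.0:
        return p_o, 1.0  # both raters used one and the same label throughout
    return p_o, (p_o - p_e) / (1 - p_e)

# Hypothetical VQ annotations for ten tokens (NOT data from the study).
annotator_1 = ["modal", "modal", "breathy", "glottalized", "modal",
               "breathy", "modal", "glottalized", "modal", "modal"]
annotator_2 = ["modal", "modal", "breathy", "modal", "modal",
               "breathy", "modal", "glottalized", "modal", "modal"]

agreement, kappa = cohens_kappa(annotator_1, annotator_2)
print(f"agreement = {agreement:.2f}, kappa = {kappa:.2f}")
```

Kappa corrects the raw agreement for the agreement expected by chance. When one category (here, modal) dominates, kappa can therefore be noticeably lower than the percentage agreement.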
Speaking rate was operationalized as the number of syllables per second. The actual sentence duration served as the frame of reference for the calculation, to which the number of assumed syllables for each target sentence was set in relation, i.e. any segmental reductions or deletions were disregarded. However, reductions were minimal given the short length of the utterances and the laboratory setting.
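The operationalization of speaking rate can be sketched as follows, assuming durations measured in milliseconds and a canonical (assumed) syllable count per target sentence. The function name `speaking_rate` and the example values are hypothetical and are not taken from the data set.

```python
def speaking_rate(n_syllables, duration_ms):
    """Syllables per second, from the canonical syllable count and the measured duration in ms."""
    if duration_ms <= 0:
        raise ValueError("duration must be positive")
    # Convert milliseconds to seconds before dividing.
    return n_syllables / (duration_ms / 1000.0)

# Hypothetical utterance: 7 canonical syllables produced in 1250 ms.
rate = speaking_rate(7, 1250.0)
print(f"{rate:.1f} syllables per second")
```

Because the syllable count is fixed per target sentence, any difference in this measure between two renditions of the same sentence reflects a difference in their measured durations only.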
For statistical analysis, we used a series of linear mixed effects regression models (lmers) with illocution type (ISQ, RQ) and question type (wh, polar) as fixed factors and participants and items as crossed random factors (random intercepts). Random slopes were added and retained if they improved the model fit (Bates et al. 2015, Matuschek et al. 2017), as indicated by the anova() function in R (R Core Team 2013). For the analysis of VQ (categorical variable), we used generalized linear mixed models (glmers). For the analysis of a specific VQ, the relevant type of VQ was coded as 1, the other two as 0. The effect of the fixed factors was calculated for these modified dependent variables (Agresti 2002). Model fitting followed the same procedure as for lmers. P-values were calculated using the Satterthwaite approximation in the R package lmerTest and adjusted (padj) by means of the Benjamini–Hochberg correction (Benjamini & Hochberg 1995).
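Two steps of this procedure can be sketched in Python: the binary recoding of VQ for the analysis of one specific VQ, and the Benjamini–Hochberg adjustment of p-values. The mixed models themselves are not fitted here. The function names, the labels and the p-values are hypothetical, and which p-values were corrected together as one family is an assumption, since the text does not specify it.

```python
def one_vs_rest(labels, target):
    """Code the VQ of interest as 1 and the other two VQ types as 0."""
    return [1 if label == target else 0 for label in labels]

def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted p-values in the original order."""
    m = len(p_values)
    # Indices of the p-values sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, scaling each by m / rank and
    # enforcing monotonicity with a running minimum (capped at 1).
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, p_values[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted

# Hypothetical VQ labels and p-values (NOT values from the study).
breathy_coded = one_vs_rest(["modal", "breathy", "glottalized", "breathy"], "breathy")
p_adj = benjamini_hochberg([0.01, 0.04, 0.03, 0.20])
print(breathy_coded)
print([round(p, 4) for p in p_adj])
```

The recoding turns the three-way categorical variable into a binary outcome that a logistic mixed model can take as its dependent variable. The adjustment controls the false discovery rate across the set of tests treated as one family.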
3. Results
3.1 Voice quality
The results for VQ are plotted in Figure 1 (top: wh-questions; bottom: polar questions). In the first three positions, there were no interactions between illocution type and question type (p > .5); we therefore report main effects. Breathy VQ occurred more often in RQs than in ISQs in all positions, but in the first three positions, observations for breathy VQ were too few to calculate statistical models.
We report the results for the four positions in turn. In sentence-initial position (verb in polar, wh-word in wh-questions), RQs were more often realized with glottalized VQ than ISQs, although the difference was not statistically significant (p > .1, p adj = .4). There was a main effect of question type; polar questions were significantly more often realized with glottalized VQ than wh-questions (β = 0.91, SE = 0.25, z = 3.68, p < .001, p adj < .01). For modal VQ, there were main effects of both illocution type and question type. RQs were less often realized with initial modal VQ than ISQs (β = −0.74, SE = 0.23, z = −3.21, p = p adj < .01), and polar questions were less often realized with modal VQ than wh-questions (β = −0.66, SE = 0.23, z = −2.81, p < .001, p adj < .05).
In second position (einhver in polar, verb in wh-questions), RQs were realized with glottalized VQ more often than ISQs (β = 1.1, SE = 0.35, z = 3.11, p = p adj < .01); there was no main effect of question type (p > .05, p adj < .07). For modal VQ, there was a main effect of illocution type; across question types, ISQs were produced with modal VQ more often than RQs (β = −1.09, SE = 0.35, z = −3.1, p = p adj < .01).
In third position (first syllable of object noun), there were no main effects of illocution type on glottalized VQ (p = p adj > .1) or modal VQ (p = p adj > .1). Overall, there was a higher occurrence of glottalized voice in RQs as compared to ISQs, and more ISQs than RQs were realized with modal VQ.
Finally, at the offset of the utterance, all three VQs occurred frequently enough to allow for statistical analysis. For breathy VQ, no interaction between illocution type and question type was observed (p = p adj > .7) and there was no effect of question type (p = p adj < .3). There was a main effect of illocution type: RQs showed significantly more occurrences of breathiness than ISQs (β = 1.88, SE = 0.22, z = 8.65, p = p adj < .001). There was an interaction between illocution type and question type for glottalized VQ (β = 0.63, SE = 0.22, z = 2.9, p < .01, p adj < .05). In polar questions, ISQs were more often realized with final glottalized VQ than RQs. Conversely, glottalized VQ was more frequent in wh-RQs than in wh-ISQs. There were no main effects (p > .3). An interaction between illocution type and question type was also observed for modal VQ (β = 1.09, SE = 0.29, z = 3.8, p = p adj < .001), suggesting a stronger difference for modal VQ in wh-questions than in polar questions in this position. A main effect of illocution type was observed such that RQs were significantly less often realized with modal VQ than ISQs (β = −3.04, SE = 0.13, z = −5.6, p = p adj < .001).
Note that two non-modal VQs may occur on different syllables of the object noun. Specifically, of all items with breathy VQ at the offset, 17.2% were glottalized on the first syllable of the noun (78.5% modal, 4.3% breathy), with generally more glottal VQ in RQs (see above).
Figures 2 and 3 illustrate sentence-final breathy VQ and sentence-final glottalized VQ, respectively, in wh-RQs. Figure 4 shows sentence-final modal VQ in a wh-ISQ.
3.2 Speaking rate/duration
The results for speaking rate are plotted in Figure 5. There was no significant interaction between illocution type and question type (p = p adj > .3), but main effects of illocution type and question type were observed. RQs had a slower speaking rate (significantly fewer syllables per second) than ISQs (polar: 5.2 (RQs) vs. 6.4 (ISQs) syllables per second; wh: 4.3 (RQs) vs. 5.7 (ISQs); β = −1.24, SE = 0.1, t = −11.92, p = p adj < .0001). Wh-questions were produced with a significantly slower speaking rate than polar questions (β = −0.74, SE = 0.08, t = −9.29, p = p adj < .0001).
The results for global duration, plotted in Figure 6, show an interaction between illocution type and question type (β = 53.63, SE = 21.57, t = 2.48, p = p adj < .05), suggesting stronger durational differences in wh- than in polar questions. A main effect of illocution type was observed across question types: RQs have significantly longer average durations than ISQs (polar: 1325.2 ms (RQs) vs. 1076.9 ms (ISQs); wh: 1315.8 ms (RQs) vs. 1029.8 ms (ISQs); β = 267.93, SE = 24.27, t = 11.04, p = p adj < .001). Wh-questions have slightly shorter durations than polar questions, but this difference is not significant (p = .0614, p adj > .1).
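The direction of the interaction can be checked descriptively from the mean durations reported above. The following Python sketch computes the RQ minus ISQ difference per question type. The variable and function names are hypothetical. Differences between raw means need not match the model estimates, which also reflect the random-effects structure and the unequal cell sizes.

```python
# Mean utterance durations in ms, as reported in the text.
mean_duration_ms = {
    ("polar", "RQ"): 1325.2, ("polar", "ISQ"): 1076.9,
    ("wh", "RQ"): 1315.8, ("wh", "ISQ"): 1029.8,
}

def rq_minus_isq(question_type):
    """Difference between mean RQ and mean ISQ duration for one question type."""
    return (mean_duration_ms[(question_type, "RQ")]
            - mean_duration_ms[(question_type, "ISQ")])

polar_diff = rq_minus_isq("polar")
wh_diff = rq_minus_isq("wh")
print(f"polar: {polar_diff:.1f} ms, wh: {wh_diff:.1f} ms")
```

On these means, the lengthening in RQs is about 248 ms in polar questions and about 286 ms in wh-questions, in line with the stronger durational difference in wh-questions.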
4. Discussion
The analysis reveals that RQs in Icelandic differ from ISQs in terms of VQ and speaking rate/duration. First, RQs are generally longer than ISQs and are realized with a slower speaking rate. This is in line with results for other languages (see Section 1 above), suggesting that temporal cues are used cross-linguistically to distinguish between RQs and ISQs in prosody. A reviewer suggests that RQs may come with paralinguistic attitudes such as anger, exasperation or scornfulness, of which the slower speaking rate would be a correlate, rather than of the rhetorical meaning. However, Neitsch (2018) compared RQs with and without strong speaker attitudes, showing that both types of RQs exhibit the same prosodic parameters, among them longer durations than ISQs, with stronger magnitude for RQs with strong speaker attitudes.
Second, like German and English, Icelandic makes use of breathy VQ in the production of RQs. However, there are also differences between the languages. In German, both polar and wh-RQs often have breathy voice in sentence-initial position (Braun et al. 2019). In English, only wh-questions show differences in VQ (breathy voice in initial position in wh-RQs; Dehé & Braun 2020b). In Icelandic RQs, initial breathiness is rare. Instead, breathy voice mainly occurs in utterance-final position. We interpret this positional difference between German and English on the one hand and Icelandic on the other as an interaction between VQ and intonation. In German and English, boundary tones distinguish between utterance types (e.g. questions vs. statements; see von Essen 1964, Brinckmann & Benzmüller 1999 for German; Bartels 1999 for English). This is not the case in Icelandic, where the default boundary tone for all utterance types is L%, including both polar and wh-questions (Árnason 2005, 2011; Dehé & Braun 2020a). Moreover, in German and English, polar RQs are distinguished from polar ISQs by means of boundary tones (high rising boundary tone in ISQs vs. high plateau in RQs; Braun et al. 2019 for German, Dehé & Braun 2020b for English), which is not the case in Icelandic (Dehé & Braun 2020a). In German, boundary tones also distinguish between wh-ISQs and wh-RQs (mandatory fall in wh-RQs, high number of rising movements in wh-ISQs, Braun et al. 2019).
While this is not the case in English (L% in both RQs and ISQs), constituent duration steps in, with longer duration of the final object in wh-RQs than ISQs (Dehé & Braun 2020b). In Icelandic, despite the general availability of the high boundary tone (H%) to mark special aspects of meaning (Árnason 2005, 2011), this cue is not used for the expression of rhetorical meaning. Both ISQs and RQs end in L% (Dehé & Braun 2020a). It is therefore conceivable that Icelandic speakers exploit the manipulation of VQ in final position as a compensation strategy. As an apparent general boundary marker, used in both ISQs and RQs to a considerable extent, glottalized voice is not the preferred option to signal rhetorical meaning. Instead, breathy voice marks the offset of RQs. Breathiness has also been found at the terminus of polar questions in African languages of the Gur family. Interestingly, polar questions in those languages typically end in a fall, too (Rialland 2009). This further supports the assumption that final VQ may replace boundary tones as a cue to illocution type. Future research will show whether Icelandic also makes use of VQ as a cue to pragmatic meaning in other utterance types (e.g. exclamatives, see Wochner 2021 for German) or in statements with specific pragmatic connotations (e.g. emphasis, see Niebuhr et al. 2010, or obviousness, see Wochner 2021). Generally speaking, Icelandic fits in with previous studies showing that non-modal VQ is cross-linguistically used as a cue to rhetorical meaning in questions, although the particular ways in which non-modal VQ is used are language-specific.
Acknowledgements
We thank Bettina Braun and Katharina Zahner-Ritter for statistical advice, Johanna Schnell and Gloria Sigwarth for data annotations, and three reviewers for NJL for their comments. The research was funded by the DFG, grant numbers DE 876/3-1 and DE 876/3-2.