1 Introduction
Laterals have long interested linguists because of their complex articulatory and acoustic properties across languages, dialects and speakers. Although laterals are common, occurring in about 82% of the 317 sample languages in UPSID (the UCLA Phonological Segment Inventory Database), the retroflex lateral is rare and accounts for less than 7% of all laterals in this database. By comparison, about 87% of all laterals are produced in the dental/alveolar region, with alveolar laterals probably more frequent than dental ones (Maddieson 1984). The comparative scarcity of the retroflex-dental/alveolar contrast for laterals has motivated research on data from different languages, for example the Dravidian languages of India and the indigenous languages of Australia. In this study, we examine the acoustic differences between the retroflex lateral /ɭ/ and the non-retroflex alveolar lateral /l/ in the Zibo dialect of Chinese, contributing to the acoustic description of this typologically rare contrast from an underdescribed language.
The Zibo dialect of Chinese is a member of the northern Mandarin Chinese family (ISO 639-3: [cmn]) spoken in Zibo, a city with an area of 5,965 km² and a population of 4.70 million (China Discovery 2022) located in central Shandong province, People’s Republic of China (see Figure 1). In classification, it belongs to Jilu Mandarin, one of the eight subgroups of the Mandarin family, together with Northeast, Beijing (Standard), Zhongyuan, Jiaoliao, Lanyin, Jianghuai, and Southwest (Wurm et al. 1987). Besides the rarity of the retroflex lateral and its underdocumentation in dialects of China, the retroflex lateral in the Zibo dialect is interesting for three main reasons. First, different acoustic findings are reported for the alveolar-retroflex contrast in laterals in studies of dialects in China and other languages in the world, especially for the F3 and duration of the two laterals. Second, the phonetic context of the retroflex lateral /ɭ/ in the Zibo dialect is different from that in the more studied Dravidian languages of India and the indigenous languages of Australia. Third, there is some controversy as to the phonemic contrast between the retroflex and non-retroflex alveolar laterals in the Zibo dialect. In the remainder of this section, we provide an introduction to previous phonetic studies of the retroflex and non-retroflex lateral contrast as well as an overview of Zibo phonology to further explain these three aspects.
1.1 Acoustic and phonotactic characteristics of retroflex laterals in previous studies of other languages in the world
Both the Dravidian languages of India and the indigenous languages of Australia are well known for having an alveolar versus retroflex contrast in the consonant system (Bhat 1973; Ladefoged & Maddieson 1996). Studies on retroflex laterals are reported for Dravidian languages spoken in South Asia (India, Pakistan and Sri Lanka), including Tamil (McDonough & Johnson 1997; Narayanan & Kaun 1999; Narayanan et al. 1999), Malayalam (Punnoose et al. 2013; Scobbie et al. 2013; Tabain & Kochetov 2018) and Kannada (Tabain & Kochetov 2018), and in Australian Aboriginal languages, such as Arrernte, Pitjantjatjara and Warlpiri (Tabain et al. 2016; Tabain et al. 2020a; Tabain et al. 2020b). Other fragmentary reports on retroflex laterals are found in Gujarati (Dave 1977) among the Indo-Aryan languages and in East Norwegian (Hamann 2003a; Moen et al. 2003) among the North Germanic languages. The retroflex lateral [ɭ] in East Norwegian does not have a phonemic contrast with its non-retroflex counterpart, and it is a result of a retroflexion rule that merges /r/ and dento-alveolars across morpheme or word boundaries (Kristoffersen 2000).
Previous studies on this contrast in laterals have investigated articulatory and acoustic differences. Among Dravidian languages, Tamil is reported to have a dento-alveolar /l/ vs. retroflex lateral /ɭ/ contrast. When producing /ɭ/ in Tamil, the tongue is curled back so that contact with the palate is made with the underside of the tongue, and the narrowest tongue constriction appears in the palatal region (Narayanan et al. 1999). No consistent differences are found between /ɭ/ and /l/ in F1. A lower F3 is found in /ɭ/ than /l/, with a slightly higher F2 in /ɭ/ than /l/, and the duration of /ɭ/ is found to be considerably shorter than that of /l/ (McDonough & Johnson 1997; Narayanan et al. 1999). While the Tamil /ɭ/ has sublingual articulation with palatal constriction, /ɭ/ in Malayalam has a considerable tongue root retraction and a substantial tongue blade raising and retraction (Scobbie et al. 2013). Acoustic studies show a higher F1, lower F2 and lower F3 in /ɭ/ than in /l/ (Punnoose 2011; Punnoose et al. 2013; Tabain & Kochetov 2018). The lower F2 in /ɭ/ is the opposite of the finding for F2 in Tamil. The duration of /ɭ/ in Malayalam is also reported to be shorter than that of /l/ (Punnoose 2011; Tabain & Kochetov 2018), which is consistent with the duration finding for Tamil. As for Kannada, descriptive phonetic accounts disagree on the exact place of articulation of its non-retroflex coronal lateral /l/: some describe it as ‘dental’ (Bright 1958), others as ‘alveolar’ (Upadhyaya 1972), but they unanimously characterize /ɭ/ as retroflex, with some noting its subapical articulation.
Tabain and Kochetov (2018) show in their acoustic investigation that Kannada retroflex /ɭ/ has a higher F1 and lower F3 than alveolar /l/, which is consistent with the findings for Malayalam. Unlike in Malayalam, however, Kannada retroflex /ɭ/ has a higher F2 than its alveolar counterpart /l/. In addition, echoing the findings for Tamil and Malayalam, the duration of the retroflex lateral in Kannada is found to be shorter than that of the alveolar lateral.
In some Central Australian languages, the alveolar and retroflex laterals are both apical (Tabain et al. 2020a). In an acoustic study of lateral consonants in three Central Australian languages, Arrernte, Pitjantjatjara and Warlpiri (Tabain et al. 2016), analyses of formants at the temporal midpoint of the laterals show that the retroflex /ɭ/ is not significantly different from the alveolar /l/ for F1 or F2, but there is a significantly lower F3 in the retroflex lateral. No significant difference is found for the duration of the two laterals.
Overall, discrepancies in formant patterns of the retroflex and non-retroflex (dental/alveolar) lateral contrasts reflect variations in realization in different languages, especially when producing retroflex laterals, caused by differences in place of contact of the tongue, degree of retraction of tongue tip/root, tongue tip flexion/raising, etc. Despite these variations, there is a consistently lower F3 for /ɭ/ than for /l/ in the Dravidian languages and the three Central Australian languages mentioned above, which suggests that F3 is a robust cue for the retroflex-dental/alveolar contrast in laterals. F3 is related to front cavity resonances (Fant 1970) and F3 in laterals can be at least partly suppressed by the lateral anti-resonance: some loss of energy resulting from a side-branch (a pocket of air above the tongue) (Kochetov, Petersen & Arsenault 2020). For retroflexes, Stevens (1998) finds a lower F3 during the constricted interval, due to a sublingual cavity. A lower F3 as a main correlate of retroflexes is also found in studies of vowels and other consonants (Ladefoged & Bhaskararao 1983; Ladefoged & Maddieson 1996; Hamann 2003b; Hussain et al. 2017).
Regarding the results for duration, the retroflex lateral /ɭ/ is reported to have a shorter duration than the non-retroflex /l/ in Tamil, Malayalam and Kannada. In the three Australian Aboriginal languages (Tabain et al. 2016), there is no significant difference in duration between the two apical laterals /ɭ/ and /l/, with both having shorter durations than the laminal laterals.
In addition to the articulatory and acoustic characteristics of the retroflex and non-retroflex lateral contrast, the phonotactics of /ɭ/ and /l/ are also discussed in previous studies. Findings show that retroflex laterals occur at different positions in a word in different languages. In Tamil, both /l/ and /ɭ/ occur in intervocalic and word-final positions, but the phonotactics of Tamil bar retroflex sounds from initial position (McDonough & Johnson 1997). Narayanan et al. (1999) have also noted that /ɭ/ in syllable-initial cases in Tamil is realized as a flap and often may not involve complete (subapical) palatal contact in fluent speech, unlike the complete linguapalatal closure for /ɭ/ in syllable-final position. This may also explain the shorter duration of the retroflex lateral /ɭ/ than its dental/alveolar counterpart /l/ in Tamil. Similarly, in Kannada, both /l/ and /ɭ/ occur frequently intervocalically; /l/ also occurs word-initially, but neither lateral occurs word-finally (Tabain & Kochetov 2018). Laterals /l/ and /ɭ/ in Malayalam occur in a wider range of contexts, contrasting word-medially, word-finally, and rather infrequently, word-initially (Tabain & Kochetov 2018). For the three Central Australian languages (Arrernte, Pitjantjatjara and Warlpiri), the contrast between the alveolar and the retroflex also mainly occurs intervocalically and is neutralized in the word-initial position (Tabain et al. 2016; Tabain & Beare 2018; Tabain et al. 2020a). It can be seen that the retroflex lateral /ɭ/ in the languages mentioned above most frequently occurs intervocalically, less often in the word-final position, and is rare in the word-initial position.
This tendency is found to be common in previous studies, and it is also noted that retroflex consonants in CV structure are not as well distinguished as in VC structure (Steriade 1995; Ohala & Ohala 2001; Hamann 2003b; Tabain et al. 2020a). In addition, the retroflex-alveolar contrast is found to be most clearly realized in the context of an /a/ vowel, and least clearly realized in the context of an /i/ vowel (Tabain et al. 2020a).
The literature also shows that inter-speaker variation of laterals is high, especially of the retroflex lateral (Stevens 1998; Nance 2014; Kochetov et al. 2020; Tabain et al. 2020a), due to individual speaker differences, language change, modes of acquisition in revitalization contexts, positional and contextual factors and task effects, etc.
In summary, the research review on the retroflex lateral and its dental/alveolar contrasts shows that there are different realizations of the retroflex lateral /ɭ/ in different languages, but some characteristics of this phoneme are shared by most of the languages mentioned above: a lower F3 and shorter duration than its dental or alveolar counterpart /l/, its infrequent occurrence (even barred or neutralized in some languages above) in the word-initial position as well as a large variation of its realizations.
1.2 Acoustic and phonotactic characteristics of retroflex laterals in previous studies of dialects of Chinese
The retroflex lateral is also reported in dialects of Chinese. A syllabic retroflex lateral is found in Susong Gan dialect of Anhui province in Northeast China (Tang 2005) and in Jincheng dialect of Shanxi province in North China (Zhu & Jiao 2006). [ɭ] in Susong Gan dialect is non-phonemic as a variant of the alveolar lateral /l/, and in Jincheng dialect, /ɭ/ is a phonemic syllabic consonant. In addition, the retroflex lateral is recorded as a phoneme in records of dialects in some areas of Shandong province (see Figure 1), for example in Zibo, Lijin, Qingdao, Linyi and Heze (Meng & Luo 1994; Yang 1990; Qingdao Municipal Archive 1997; Ma & Wu 2003; Qi 2019). In these dialects, the retroflex lateral is recorded to only occur syllable-initially, followed by a shorter and higher schwa to form the syllable /ɭə/.
Compared with studies of retroflex laterals in other languages of the world, studies and records on retroflex laterals in Chinese dialects are few and mainly just report the existence of this phoneme without detailed articulatory or acoustic analyses. To the best of our knowledge, there are three acoustic studies of /ɭ/ in dialects of Chinese: one on the syllabic retroflex phoneme /ɭ/ in Jincheng dialect with one female speaker (Zhu & Jiao 2006), a second on the retroflex consonant /ɭ/ in the dialects of some areas of Heze and Linyi in Shandong province with one to two male speakers for each area (Qi 2019), and a third, a case study by Dong and Liang (2019), with one 61-year-old female speaker of the Zibo dialect. For the first two studies, only descriptive data analyses are given without statistical data analyses. It is found that the syllabic /ɭ/ in Jincheng dialect has a lower mean F1 and higher mean F2 than its alveolar counterpart /l/, but no difference is found in F3 of the two laterals. The acoustic analysis of the phonemic consonant /ɭ/ by Qi (2019) finds no difference in F1 for the two laterals, but for data from most areas in the research, F2 for /ɭ/ is lower than that for /l/; as for F3, the study does not find a consistently lower F3 in /ɭ/ than /l/ through F3 trajectory observation. Regarding the duration of the laterals, Qi (2019) reports that the retroflex lateral /ɭ/ has a longer duration than the alveolar /l/, with an average of 127 ms for /ɭ/ and 52 ms for /l/. The case study on /ɭ/ and /l/ in the Zibo dialect has investigated the acoustic characteristics of the two laterals and the schwas following them (Dong & Liang 2019).
In that case study, the retroflex lateral /ɭ/ in the Zibo dialect is found to have a significantly lower F1 and a significantly higher F2 than /l/; the duration of /ɭ/ is significantly longer than that of /l/ with a large difference. No significant difference is found in F3 of the two laterals. In addition, the schwa following /ɭ/ has a significantly lower F1 and shorter duration than the schwa following /l/.
It can be seen that a consistently lower F3, the main correlate for a retroflex lateral compared with its dental/alveolar counterpart in other languages, is not found in the few acoustic studies on the retroflex lateral in dialects of Chinese. The duration of /ɭ/ is longer than that of /l/ in dialects of Zibo, Heze and Linyi (Dong & Liang 2019; Qi 2019), which is the opposite of the findings for Tamil, Malayalam and Kannada and different from those for the three Australian Aboriginal languages (McDonough & Johnson 1997; Narayanan et al. 1999; Punnoose 2011; Tabain et al. 2016; Tabain & Kochetov 2018). As for the phonetic context, in the Zibo dialect, /ɭ/ occurs in a restricted phonetic environment: only in the syllable-initial position followed by schwa, which is considered an infrequent or even barred position for /ɭ/ in other languages.
1.3 Zibo phonology and its lateral contrasts
The Zibo dialect differs in a number of important ways from the best-known Mandarin variety (i.e., Standard Mandarin), not only in its tonal system but also in several segmental properties. The Zibo dialect has three lexical tones, marked here with the numerical superscripts 1–3 representing three tonal categories. Using the tone characters of Chao (1930) recommended in the International Phonetic Alphabet (IPA), we can transcribe the tones as T1, T2 (˥) and T3 (˧˩). Phonetically, the pitch height is represented by numbers from 1 (lowest) to 5 (highest). Marked examples of the citation forms of three tones in the Zibo dialect are presented in Table 1. Besides the full lexical tones, many Mandarin dialects have a neutral tone, commonly known as T0. Neutral-tone syllables are generally short and prosodically weak, and they receive tone values from the preceding full-tone syllable (Chen & Gussenhoven 2015). In this study, the neutral tone in the Zibo dialect is marked as T0.
The Zibo dialect has 24 phonemic consonants, as shown in Table 2. All consonants only occur syllable-initially, except /ŋ/ which can occur in both syllable-initial and syllable-final positions (e.g., /ŋɛ1/ ‘sad’ vs. /ləŋ1/ ‘to throw’). The vowel inventory (including monophthongs, diphthongs and triphthongs) is also shown in Table 2. There are seven monophthongs in the Zibo dialect /i y u ə ɑ ɛ ɔ/, all of which can appear in open syllables. Only five of them (/i y u ə ɑ/) can appear in closed syllables with a nasal coda /ŋ/.
Note: vl is short for ‘voiceless’ and vd for ‘voiced.’
The syllable structure in the Zibo dialect is (C)V(/ŋ/), where C stands for consonant and V for vowel. The obligatory element V can be a monophthong, a diphthong or a triphthong. The following examples show the possible syllable structures in the Zibo dialect: /i1/ ‘clothes’, /iŋ2/ ‘success’, /li2/ ‘pear’, /liŋ2/ ‘bell’ and /liɑŋ2/ ‘food’, with superscript numbers denoting the tone category of the monosyllabic words, see Table 1.
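The (C)V(/ŋ/) template and the closed-syllable restriction described above can be written down as a small well-formedness check. The sketch below is illustrative only: the function name `is_wellformed` and the onset/nucleus/coda string representation are hypothetical, the onset is not validated because the 24-consonant inventory of Table 2 is not reproduced in the running text, and the closed-syllable restriction is applied only to monophthongs, since the text states it only for them.

```python
# Monophthongs of the Zibo dialect as listed in the text.
MONOPHTHONGS = {"i", "y", "u", "ə", "ɑ", "ɛ", "ɔ"}
# Monophthongs that the text says can precede the nasal coda /ŋ/.
CLOSED_SYLLABLE_MONOPHTHONGS = {"i", "y", "u", "ə", "ɑ"}


def is_wellformed(onset: str, nucleus: str, coda: str) -> bool:
    """Check a syllable against the (C)V(/ŋ/) template.

    onset   -- optional consonant; accepted as given (assumption: not
               checked against the consonant inventory)
    nucleus -- obligatory vowel (monophthong, diphthong or triphthong)
    coda    -- either empty or the nasal /ŋ/
    """
    if not nucleus:
        return False  # V is obligatory
    if coda not in ("", "ŋ"):
        return False  # /ŋ/ is the only possible coda
    if (
        coda == "ŋ"
        and nucleus in MONOPHTHONGS
        and nucleus not in CLOSED_SYLLABLE_MONOPHTHONGS
    ):
        return False  # /ɛ/ and /ɔ/ occur only in open syllables
    return True


# Examples from the text: /i/, /iŋ/, /li/, /liŋ/, /liɑŋ/ (tones omitted).
examples = [("", "i", ""), ("", "i", "ŋ"), ("l", "i", ""),
            ("l", "i", "ŋ"), ("l", "iɑ", "ŋ")]
all_ok = all(is_wellformed(*syllable) for syllable in examples)
```

Tone is deliberately left out of this representation; it would be an additional attribute of the syllable rather than part of the segmental template.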
Like many varieties in the Mandarin family, the Zibo dialect shows a two-way (alveolar and retroflex) coronal contrast in fricatives /s, ʂ/ (e.g., /sɑ1/ ‘to let go’ vs. /ʂɑ1/ ‘sand’) and affricates /ts, tʂ, tsʰ, tʂʰ/ (e.g., /tsɑ2/ ‘complex’ vs. /tʂɑ2/ ‘to fry’, and /tsʰɑ1/ ‘to rub’ vs. /tʂʰɑ1/ ‘to insert’). Of particular interest is the alveolar-retroflex contrast in laterals. In the Zibo dialect, the alveolar lateral /l/ can be followed by most vowels (including nasalized vowels), while the retroflex lateral /ɭ/ can only be followed by the mid-vowel /ə/. In this context, minimal pairs of monosyllabic words show the phonemic contrast of the two laterals: /lə2/ ‘to bother’ vs. /ɭə2/ ‘son’, /lə3/ ‘heat’ vs. /ɭə3/ ‘two’ (see more examples in Tables A1 and A2 in Appendix A). The diachronic development of the retroflex lateral in Mandarin dialects is considered to be related to the development of a retroflex high vowel after an initial retroflex affricate (Zhang 1999; Gao 2013). In some dialects of Shandong province where Zibo is located, this combination is pronounced as a retroflex vowel /ɚ/, as in Standard Mandarin (Lee & Zee 2003) and most northern dialects in China, while in some other dialects of Shandong, it is pronounced as /ɭə/, such as in dialects of Zibo, Linyi and Zhucheng (Qian, Zhang & Luo 2001).
Another point of interest presented by Zibo laterals involves their somewhat contradictory descriptions in the literature. It is noted that the schwa following the retroflex lateral /ɭ/ in the Zibo dialect is recorded to be higher than that following the alveolar lateral /l/, which is also confirmed by the finding of a lower F1 in the schwa following the retroflex lateral /ɭ/ in Dong and Liang (2019). Regarding the more obvious difference between the two schwas following /ɭ/ and /l/ in auditory impression, Zhang (1999) suggests an /ɨ/ phoneme to denote the higher schwa following the retroflex lateral and considers that the two laterals belong to the same /l/ phoneme since the difference in auditory impression between the two laterals is minor (Zhang 1999; Qi 2019). This is inconsistent with Meng and Luo’s (1994) records of the Zibo dialect, which report an alveolar-retroflex phonemic contrast of the two laterals in the Zibo dialect.
All the discrepancies found in the acoustic and phonotactic characteristics of the two laterals in the Zibo dialect compared with other languages or dialects, together with the controversy as to the phonemic status of the retroflex lateral and its following schwa, need further exploration and explanation.
1.4 Domain and aim of the present paper
In all varieties and dialects of Chinese, most morphemes are exactly one syllable long (Wang 1973). In addition, most morphemes are also monosyllabic words in Chinese and it is fair to say that most Chinese syllables are monosyllabic words; there are also polysyllabic words, usually compounds (Duanmu 2011). Therefore, in the Zibo dialect, the syllable-initial position of /ɭ/ and /l/ is also the word-initial position of the monosyllabic words /ɭə/ and /lə/.
This study focuses on the acoustic analyses of the differences between the retroflex lateral /ɭ/ and the alveolar lateral /l/, as well as the two schwas following the two laterals in the Zibo dialect. The minimal pair /ɭə/ and /lə/ (monosyllabic words) will be studied. The research questions are as follows:
(1) What are the acoustic differences between /ɭ/ and /l/ in the Zibo dialect?
(2) What are the acoustic differences between the schwas following the two laterals?
(3) What is the phonemic status of the two laterals and the schwas following them?
2 Methods
2.1 Speakers
This study presents data from 22 native speakers of the Zibo dialect in two gender groups: 11 male speakers (five aged 55–70 and six aged 30–45) and 11 female speakers (six aged 55–70 and five aged 30–45). These speakers were chosen because all of them had lived and worked in Zibo since birth. None of them were teachers or in a profession where Standard Mandarin was needed at work. All subjects speak the Zibo dialect in their daily communication, with very few chances of speaking Standard Mandarin. Among these speakers, those aged over 55 can speak the Zibo dialect and very little Standard Mandarin, and most younger speakers (aged 30–45) can speak both the Zibo dialect and Standard Mandarin.
2.2 Materials and procedure
The wordlist for this study contained 80 compound words with two syllable-initial laterals in the Zibo dialect, namely 40 words for /ɭə/ and 40 words for /lə/. The 80 words were carefully chosen from the records of the Zibo dialect (Meng & Luo 1994) and from words native speakers were familiar with. Before the recording, the wordlist was checked by three native speakers to make sure words on the list were frequently used in their lives.
All 80 compound words in the wordlist were read by each of the 22 subjects in random order in the carrier sentence. Since the English translation does not reflect the Chinese word order in the carrier sentence, here is an example of a complete morpheme-by-morpheme gloss as well as the translation of the carrier sentence for ease of readers’ understanding: (‘son and grandson’) of (‘son’) three times’, translated as ‘I read /ɭə2/ as in /ɭə2suə̃1/ three times’. Then the 80 sentences were read a second time in the same order. The carrier sentence is designed to help speakers produce more natural pronunciations (Müller 2015) with a sentence focus on the monosyllabic word /ɭə/ or /lə/ (the second /ɭə/ or /lə/ in the carrier sentence, shown in bold in the example above). Laterals and schwas of the monosyllabic word /ɭə/ or /lə/ in the carrier sentence were extracted as target tokens. The second recording was used for analysis because speakers were by then more practiced and relaxed, so their speech was more fluent and natural, with fewer mistakes due to nervousness.
The recordings were made onto a laptop computer by the first author (a native of Zibo) in 2019 in a local sound-proof recording booth, using a Sennheiser GSP 602 headset microphone at a sampling rate of 44.1 kHz. Some of the words in the recording are provided in Table 3 as examples and the whole wordlists are attached in Tables A1 and A2 in Appendix A. (Note: the target tokens extracted for analysis are not from the compounds in the wordlist, but from the monosyllabic /ɭə/ and /lə/ in the focus position of the carrier sentence.)
In order to have a detailed understanding of the two schwas regarding their positions in the vowel chart, we also recorded monosyllabic words with the alveolar lateral /l/ followed by other monophthongs in the Zibo dialect, i.e., /li, ly, lu, lɑ, lɛ, lɔ/ by the same 22 speakers after the recording of /ɭə/ and /lə/ in the recording procedure. The same carrier sentence was used in the recording, and for each vowel of /i, y, u, ɑ, ɛ, ɔ/, 30 tokens from 30 monosyllabic words were extracted.
2.3 Annotation and analyses
Annotation and acoustic analysis were performed using Praat (Boersma & Weenink 2021). All tokens were manually annotated based on auditory impression and on the waveform and spectrogram of the target sounds. The boundaries of laterals were mainly determined by a relatively low overall amplitude and weaker formant structure (lighter in the spectrogram), and lower F1 with respect to adjacent schwas (cf. Machač & Skarnitzl 2009; Kochetov et al. 2020). The following schwas were annotated based on the onset and offset of a similar periodic waveform and inspections of the end of regular glottal pulses with stable vowel formants in the spectrogram. Figure 2 presents a sample annotated token /ɭə/ in Tone 3 by one of the female speakers.
Formant and duration information for each segment was extracted with a Praat script (using the LPC Burg method). The first three formant values in Hz were extracted at 11 evenly distributed points during each lateral and each vowel with the default window settings (a 25 ms Gaussian window with a 5 ms step). The maximum formant value used in the Burg analysis was set by gender (5000 Hz for males and 5500 Hz for females) to find five formants. Data for the temporal midpoint (point 6) of each segment were used for the static analysis of the laterals and schwas (Tabain et al. 2016; Tabain & Kochetov 2018; Kochetov et al. 2020). Since the presence of antiformants and lower spectral intensities can affect the accuracy of automatic formant tracking in lateral approximants (Punnoose et al. 2013; Kochetov et al. 2020), we performed a manual re-check of all lateral tokens through an inspection of spectrograms and FFT spectra in order to correct for potential errors (cf. Johnson 2012; Punnoose et al. 2013; Kochetov et al. 2020). Altogether 3520 target tokens from the monosyllabic words /ɭə/ and /lə/ (including 880 /ɭ/, 880 /l/, 880 [əɭ] and 880 [əl]) were extracted. In addition, 660 tokens of the monophthongs /i, y, u, ɑ, ɛ, ɔ/ following /l/ (30 tokens from each of the 22 speakers) were extracted, and the F1 and F2 values at their temporal midpoints were measured in Praat to calculate the mean F1 and F2 for each vowel, shown in the vowel chart of the Zibo dialect in Figure 3. In order to maintain the consistency of auditory and acoustic analyses, laterals produced as syllabic consonants were removed from data analysis.
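The sampling scheme and token counts in this subsection can be illustrated with a short sketch. It is not the Praat script used in the study; the function names `sample_times` and `midpoint_index` and the example boundary times are hypothetical. It assumes the 11 points include both segment boundaries, which follows from the later statement that the eleventh point of the lateral coincides with the first point of the schwa.

```python
def sample_times(start: float, end: float, n_points: int = 11) -> list[float]:
    """Return n_points evenly spaced times from start to end, inclusive."""
    if n_points < 2:
        raise ValueError("need at least two points")
    step = (end - start) / (n_points - 1)
    return [start + i * step for i in range(n_points)]


def midpoint_index(n_points: int = 11) -> int:
    """0-based index of the temporal midpoint (point 6 of 11 is index 5)."""
    return (n_points - 1) // 2


# Token bookkeeping implied by the design described in the text.
SPEAKERS = 22
WORDS_PER_LATERAL = 40           # 40 /ɭə/ words and 40 /lə/ words
VOWEL_TOKENS_PER_SPEAKER = 30    # tokens of /i, y, u, ɑ, ɛ, ɔ/ after /l/

tokens_per_category = SPEAKERS * WORDS_PER_LATERAL   # e.g. all /ɭ/ tokens
total_targets = tokens_per_category * 2 * 2          # 2 laterals x (lateral + schwa)
vowel_tokens = SPEAKERS * VOWEL_TOKENS_PER_SPEAKER

# Hypothetical lateral spanning 0.100 s to 0.200 s.
times = sample_times(0.100, 0.200)
mid_time = times[midpoint_index()]
```

These counts describe the tokens extracted; the syllabic realizations excluded afterwards are not subtracted here because the text does not give their number.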
Two analyses were conducted on the laterals and the following schwas: a static acoustic analysis and a dynamic acoustic analysis. For the descriptive part of the static analysis, F1, F2 and F3 values (Hz) of laterals and schwas in the two gender groups as well as the duration of each segment are presented in order to facilitate comparisons with studies of other languages and dialects. The measures used for the statistical analyses to capture the spectral differences between the two laterals and between the two schwas were F1, F2 and F3 in Bark (Traunmüller 1990). In order to normalize for the speech rate of different speakers (Miller 1981; Port & Dalby 1982), the C/V duration ratio (DR: the duration of the lateral divided by the duration of the schwa) was used in the statistical analyses to reflect the temporal information of the lateral in relation to that of the following schwa in the monosyllabic words.
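As a rough illustration of the two derived measures, the sketch below converts Hz to Bark with the commonly cited Traunmüller (1990) approximation, z = 26.81 f / (1960 + f) − 0.53, and computes the C/V duration ratio. The function names are hypothetical, and the text does not say whether the low- and high-frequency corrections that accompany this formula were applied, so they are omitted here as an assumption.

```python
def hz_to_bark(f_hz: float) -> float:
    """Convert frequency in Hz to Bark (Traunmüller 1990 approximation).

    Assumption: the edge corrections for very low and very high
    frequencies are left out of this sketch.
    """
    if f_hz < 0:
        raise ValueError("frequency must be non-negative")
    return 26.81 * f_hz / (1960.0 + f_hz) - 0.53


def cv_duration_ratio(lateral_dur: float, schwa_dur: float) -> float:
    """C/V duration ratio: lateral duration divided by schwa duration."""
    if schwa_dur <= 0:
        raise ValueError("schwa duration must be positive")
    return lateral_dur / schwa_dur


# Hypothetical token: 120 ms lateral followed by a 240 ms schwa.
dr = cv_duration_ratio(120.0, 240.0)
bark_at_1960 = hz_to_bark(1960.0)   # 26.81 * 0.5 - 0.53
```

Because the ratio is dimensionless, it is unchanged when both durations are scaled by the same factor, which is why it serves as a simple normalization for overall speech rate.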
2.4 Statistics
Linear Mixed Effects (LME) models were fitted to the overall dataset to examine each formant measure (F1, F2 and F3 in Bark) at the temporal midpoints (point 6) of the two laterals and the schwas following them, as well as the C/V duration ratio of the two laterals in the monosyllabic words, with the lmer() function in the lme4 package (Bates et al. 2015) in R (version 4.1.1; R Core Team 2021). The models were:
FiL ∼ Consonant + (Consonant|Speaker) + (1|Item), where FiL = Lateral F1, F2, F3 (Bark)
DR ∼ Consonant + (Consonant|Speaker) + (1|Item), where DR = Lateral Duration Ratio
FiV ∼ Vowel + (Vowel|Speaker) + (1|Item), where FiV = Vowel F1, F2, F3 (Bark)
The fixed factor was ‘Consonant’ (/ɭ/, /l/) or ‘Vowel’ (/ə/ following /ɭ/, /ə/ following /l/) in the models. ‘Speaker’ and ‘Item’ were the random factors. Tests were conducted separately for each formant measure for both the laterals and the following schwas, as well as for the C/V duration ratio of laterals. In addition, a paired sample t-test was conducted on the two laterals and the two schwas for each speaker to see individual variation with significance.
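For readers less familiar with lme4 formula syntax, the lateral models above can be read as the following mixed model. This is an interpretive sketch rather than notation from the study: the symbols are introduced here for illustration, and the 0/1 coding of the predictor assumes R's default treatment contrasts. Here $y_{si}$ is the response (a formant in Bark, or DR) for speaker $s$ and item $i$, and $x_{si}$ indicates the lateral category.

```latex
\begin{aligned}
y_{si} &= (\beta_0 + u_{0s} + w_i) + (\beta_1 + u_{1s})\,x_{si} + \varepsilon_{si},\\
(u_{0s},\, u_{1s})^{\top} &\sim \mathcal{N}\!\left(\mathbf{0},\, \Sigma_{\mathrm{Speaker}}\right),\qquad
w_i \sim \mathcal{N}\!\left(0,\, \sigma^{2}_{\mathrm{Item}}\right),\qquad
\varepsilon_{si} \sim \mathcal{N}\!\left(0,\, \sigma^{2}\right).
\end{aligned}
```

The term (Consonant|Speaker) corresponds to the by-speaker random intercept $u_{0s}$ and random slope $u_{1s}$, which are allowed to correlate through $\Sigma_{\mathrm{Speaker}}$, and (1|Item) corresponds to the by-item random intercept $w_i$. The fixed effect $\beta_1$ is the estimated difference between the two laterals. The vowel models have the same structure with Vowel in place of Consonant.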
Dynamic formant analyses were conducted in order to present a clearer picture of the differences between the two laterals and between the schwas following them across the whole syllable interval. Each segment was time-normalized to 11 points, giving 21 points in total for each syllable /ɭə/ or /lə/, since the eleventh point of the consonant segment overlapped the first point of the following vowel. We used 11 time-normalized points for each segment in the CV syllable rather than 21 time-normalized points across the entire CV interval as in Nance (2014), because the durations of the two laterals and of the schwas following them differed, as shown in the static data analysis, and we wanted to capture equivalent points for each phoneme for comparison. Formant trajectories were analyzed using Smoothing Spline ANOVAs with 95% Bayesian confidence intervals and further visualized in R, similar to Nance (2014) and Kirkham (2017); see also Davidson (2006) for a general discussion of SS-ANOVA as related to speech research.
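The per-segment time normalization can be sketched as follows. The function names and example values are hypothetical, and the choice to keep the consonant's boundary value rather than the vowel's is an assumption, since the text only states that the two points overlap.

```python
def join_cv_trajectory(consonant: list[float], vowel: list[float]) -> list[float]:
    """Join two equal-length segment tracks into one syllable track.

    The last consonant point and the first vowel point fall on the same
    C/V boundary, so only one of them is kept (assumption: the
    consonant's), giving 2 * n - 1 points for n points per segment.
    """
    if len(consonant) != len(vowel) or len(consonant) < 2:
        raise ValueError("segments need the same number (>= 2) of points")
    return consonant + vowel[1:]


def normalized_axis(n_points: int = 11) -> list[float]:
    """Normalized time: 0 to 1 spans the consonant, 1 to 2 spans the vowel."""
    return [i / (n_points - 1) for i in range(2 * n_points - 1)]


# Hypothetical F3 values (Hz) at 11 points in each segment.
lateral_f3 = [2600.0 + 10.0 * i for i in range(11)]
schwa_f3 = [2700.0 + 5.0 * i for i in range(11)]

track = join_cv_trajectory(lateral_f3, schwa_f3)
axis = normalized_axis()
```

Normalizing each segment separately means that index 10 always marks the C/V boundary, regardless of how long the lateral is relative to the schwa, so the same index refers to the same phase of the same phoneme in every token.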
3 Results
The static and dynamic acoustic analyses are presented separately in the following sections. [əɭ] and [əl] are used to distinguish the schwas following /ɭ/ and /l/ for ease of readability: [əɭ], with a subscript retroflex lateral, denotes the schwa following /ɭ/, and [əl], with a subscript alveolar lateral, denotes the schwa following /l/.
3.1 Results of the static acoustic analysis
3.1.1 Formant frequencies of laterals and schwas as well as the C/V duration ratio of laterals
Descriptive statistics (means and SDs) for the first three formants in Hz and for the durations of the laterals and the following schwas in the two gender groups are given in Table 4, to facilitate comparison with studies of other languages and dialects.
Note: Duration ratio* here refers to the duration of schwa divided by the duration of lateral in the monosyllabic word.
It can be seen in Table 4 that, for both groups, the mean F1 of the retroflex lateral is lower than that of the alveolar lateral. No obvious difference is found for F2 between /ɭ/ and /l/ in either gender group. Compared with its non-retroflex counterpart, the retroflex lateral does not show an obviously lower F3 (a mean of 31 Hz lower for the male group and 32 Hz lower for the female group), unlike in other languages: for example, a mean of 230 Hz lower for Malayalam males (Tabain & Kochetov 2018) and a mean of 600 Hz lower for a Tamil male (Narayanan et al. 1999). Additionally, the mean duration of /ɭ/ in the Zibo dialect is longer than that of /l/ in monosyllabic words in both groups.
As for [əɭ] and [əl], the major differences are found in their F1 values and durations, with [əɭ] having a lower average F1 and a shorter duration than [əl] in both groups. In addition, like Standard Mandarin (Mok 2009), the Zibo dialect has a syllable-timed rhythm; the shorter duration of [əɭ] could therefore be predicted from the longer duration of /ɭ/ in the monosyllabic word, and the same reasoning applies to [əl]. These results imply that although both are marked /ə/ in the records of the Zibo dialect (Meng & Luo 1994), there are acoustic differences between the two schwas. Figure 3 shows the data for the monophthongs in the Zibo dialect. In the vowel charts of both gender groups, [əl] is below the central part on the vertical axis (F1), and [əɭ] occurs around the central part of the vowel chart; in both groups, [əɭ] and [əl] have very similar F2 values and [əɭ] has a lower F1 than [əl].
Formant frequencies were then converted to the Bark scale for the statistical analyses. Significant results from the LME models are presented in Table 5, with the alveolar lateral /l/ and its following schwa [əl] set as the baseline. Boxplots comparing F1, F2 and F3 as well as the C/V duration ratio of the two laterals are presented in Figure 4.
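For readers who wish to reproduce the conversion, a minimal Python sketch is given below. The section above does not state which Hz-to-Bark formula was used, so the Traunmüller (1990) approximation here is an assumption, and the formant values in the usage lines are invented for illustration.

```python
def hz_to_bark(freq_hz):
    """Convert a frequency in Hz to Bark (Traunmüller 1990 approximation).

    Assumption: this formula is used only for illustration; the conversion
    actually applied in the study may differ.
    """
    if freq_hz <= 0:
        raise ValueError("frequency must be positive")
    z = 26.81 * freq_hz / (1960.0 + freq_hz) - 0.53
    # Traunmüller's corrections at the low and high ends of the scale.
    if z < 2.0:
        z += 0.15 * (2.0 - z)
    elif z > 20.1:
        z += 0.22 * (z - 20.1)
    return z


# Invented formant values in Hz for a single hypothetical token.
formants_hz = {"F1": 400.0, "F2": 1500.0, "F3": 2800.0}
formants_bark = {name: hz_to_bark(value) for name, value in formants_hz.items()}
```

Under this approximation, 1000 Hz corresponds to roughly 8.5 Bark.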
The LME results in Table 5 show that the retroflex lateral /ɭ/ has a significantly lower F1 than the alveolar lateral /l/ (β = −0.23, SE = 0.03, t = −6.54, p < .0001). No significant difference is found for F2 (β = 0.03, SE = 0.04, t = 0.85, p = .4070) or F3 (β = −0.07, SE = 0.04, t = −1.86, p = .0775) of the two laterals. As for the C/V duration ratio of laterals (DR), /ɭ/ has a significantly higher duration ratio than /l/ (β = 1.08, SE = 0.11, t = 10.19, p < .0001). These results suggest that the major acoustic differences between the two laterals in the Zibo dialect are a significantly lower F1 and a significantly larger duration ratio for the retroflex lateral; see the boxplots comparing formants and duration ratio of the two laterals in Figure 4.
Generally, a lower F1 is an acoustic correlate of a raised tongue body or of an anterior occlusion such as rounded lips or a raised tongue tip (Stevens 1998; Johnson 2012). The significantly lower F1 found in the retroflex lateral could be an intrinsic property of the sound, but it should be noted that the schwa following /ɭ/, namely [əɭ] in the present study, is recorded as being higher than schwas following other consonants in the Zibo dialect (Meng & Luo 1994). If we take coarticulation into consideration, the lower F1 in the retroflex lateral could also be caused by the lower F1 of its following schwa. This will be further discussed in the dynamic analysis in Section 3.2.
Note: DR is the abbreviation of C/V duration ratio.
The result for F2 of the two laterals differs from the previous case study in Dong and Liang (2019), where a higher F2 was found in /ɭ/ in the data of a native female speaker; this might be due to individual variability in articulation. The result for F3 of the two laterals is consistent with previous studies of some Chinese dialects (Zhu & Jiao 2006; Dong & Liang 2019; Qi 2019), but differs from the findings for /ɭ/ in other languages, such as Tamil (McDonough & Johnson 1997; Narayanan et al. 1999), Malayalam (Punnoose 2011; Punnoose et al. 2013; Tabain & Kochetov 2018), Gujarati (Dave 1977), Kannada (Tabain & Kochetov 2018) and several Australian Aboriginal languages (Tabain et al. 2016). In acoustic phonetics, retroflexes are considered to have a low F3 (Stevens 1998), which has been shown to be the primary acoustic difference between retroflexes and other coronal sounds (Hamann 2003b). In this study, no significant difference in F3 is found between the two laterals, although there is a trend towards a lower F3 for the retroflex lateral (p = .0775). Individual variation in the two laterals and the following schwas will be shown in detail in Section 3.1.2.
A significantly larger duration ratio is found for /ɭ/ than for /l/. This differs from studies of other languages, where /ɭ/ is shorter than /l/ in Tamil (McDonough & Johnson 1997), Malayalam and Kannada (Tabain & Kochetov 2018), and where no duration difference is found in the three Australian languages Arrernte, Pitjantjatjara and Warlpiri (Tabain et al. 2016). The larger duration ratio shows that the retroflex lateral in the Zibo dialect holds its lateral closure longer before releasing into the schwa than /l/ does: /ɭ/ takes up an average of 72% of the duration of the monosyllabic word /ɭə/, compared with an average of 59% for /l/ in /lə/.
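The relation between the lateral's share of the syllable and a C/V ratio can be worked through with a short Python sketch. It assumes that the C/V duration ratio is the lateral duration divided by the schwa duration, which is the reading consistent with the higher ratio reported for /ɭ/, and it applies the conversion to the two average percentages quoted above. Because a mean of per-token ratios need not equal the ratio computed from mean shares, the figures are indicative only.

```python
def share_to_cv_ratio(share):
    """Convert the consonant's share of the syllable (0-1) to a C/V ratio."""
    if not 0.0 < share < 1.0:
        raise ValueError("share must lie strictly between 0 and 1")
    return share / (1.0 - share)


def cv_ratio_to_share(ratio):
    """Convert a C/V duration ratio back to the consonant's share (0-1)."""
    if ratio <= 0.0:
        raise ValueError("ratio must be positive")
    return ratio / (1.0 + ratio)


# Average shares quoted in the text: 72% for the retroflex, 59% for the alveolar.
ratio_retroflex = share_to_cv_ratio(0.72)  # about 2.57
ratio_alveolar = share_to_cv_ratio(0.59)   # about 1.44
```

On this reading, the retroflex lateral is on average roughly two and a half times as long as its schwa, whereas the alveolar lateral is roughly one and a half times as long as its schwa.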
Boxplots comparing F1, F2 and F3 of the two schwas are presented in Figure 5, in which the most obvious difference is in F1. The LME results in Table 5 show that the schwa following the retroflex lateral ([əɭ]) has a significantly lower F1 than that following the alveolar lateral ([əl]) (β = −1.18, SE = 0.05, t = −23.70, p < .0001), and that F3 is significantly higher for [əɭ] than for [əl] (β = 0.14, SE = 0.06, t = 2.42, p = .0246). No significant difference is found in F2 of the two schwas (β = 0.11, SE = 0.07, t = 1.64, p = .1170). The difference in F1 of the schwas could serve as phonetic cue enhancement for the distinction between the two laterals in the Zibo dialect, which will be explained in detail in Section 4.2.
3.1.2 Individual variation
In this study, we have treated ‘Speaker’ as a random variable given the relatively large number of speakers involved. Data exploration shows that the individual speakers' data in the LME models vary considerably, especially for F3 of the laterals (see List S1 for individual data modeled using LME models in the supplementary materials), suggesting that speakers employ a wide range of strategies in realizing the retroflex and non-retroflex lateral contrast. Inter-speaker variability has been reported in previous studies on both laterals (Nance 2014; Kirkham 2017; Kochetov et al. 2020) and retroflexes (Simonsen et al. 2008; Tabain 2009; Tabain et al. 2020b; Lorenc et al. 2023). Detailed results of the paired-samples t-tests of individual data (showing the significance of the differences between laterals and between schwas for each speaker) and illustrative boxplots of individual variability in F2 and F3 of laterals and schwas are given in Table A3 and Figures A1–A4 in Appendix B, respectively. Analysis of individual production patterns reveals variation in the strategies used to produce the retroflex-alveolar lateral contrast: speakers F01, F06, F08 (female, aged over 55), speaker F02 (female, aged under 45), speakers M01, M02 (male, aged over 55) and speakers M06, M07, M10 (male, aged under 45) show a significantly lower F3 in their production of retroflex laterals than of alveolar laterals, while speaker F10 (female, aged over 55), speakers F04, F11 (female, under 45), speaker M04 (male, over 55) and speakers M05, M09 (male, under 45) show exactly the opposite pattern for F3 in /ɭ/ and /l/.
Despite the variation among native speakers, for all 22 speakers in this study the duration ratio of laterals is significantly higher for /ɭ/, and F1 of [əɭ] is significantly lower than that of [əl]; for 19 speakers (all except F02, F06 and M04), F1 is significantly lower for /ɭ/ than for /l/, although the difference is small. This reveals that the strongest and most consistent acoustic cues distinguishing /ɭəɭ/ from /ləl/ are the larger duration ratio of the retroflex lateral and the lower F1 of the schwa following it.
3.2 Dynamic acoustic analysis and results
The results of the SS-ANOVAs are visualized in Figure 6. Since the temporal information is normalized, the figure does not show any duration information and only compares the formant trajectories of F1, F2 and F3. The part from Point 1 to Point 11 is the lateral interval and that from Point 11 to Point 21 is the schwa interval. In Figure 6, the formant trajectories obtained from the SS-ANOVA analyses are shown as colored lines with 95% confidence intervals in gray (the intervals are not visible where they are very narrow). Values on the x-axis are time-normalized, with C1 marking the onset of the lateral, and C11/V1 marking the end of the lateral and the onset of the following schwa.
First, the F1 curve of the retroflex lateral is slightly lower than that of the non-retroflex lateral throughout the lateral interval (from C1 to C11), and the F1 curve of [əɭ] is notably lower than that of [əl] throughout the schwa interval (from V1 to V11). This result is in line with the findings for F1 at the temporal midpoints of both laterals and schwas in the static analysis. It further shows that the lower F1 of /ɭəɭ/ relative to /ləl/ holds not only at the formant-stable midpoints of the laterals and schwas but over the whole syllable interval. This suggests that the lower F1 found in /ɭ/ is more likely to be an intrinsic acoustic characteristic of this lateral than an effect of coarticulation, since a lower F1 in /ɭ/ is present from the very beginning of the lateral segment (C1). Perceptual evidence from further experiments is needed to show whether the statistically significant F1 difference between the laterals in production is reflected in perception.
Second, the F2 trajectories of the two laterals (from C1 to C11) almost overlap; for the schwas, the F2 curve of [əɭ] overlaps with that of [əl] for about the first 20% of the schwa interval (approximately from V1 to V3) and becomes slightly higher for the rest of the interval. This difference is small and is not significant at the midpoint of the schwa interval (V6), as shown by the results of the LME models in Table 5.
Third, the F3 trajectories of the laterals (from C1 to C11) overlap at both the initial and final positions but diverge in the middle part of the lateral segment, with /ɭ/ having a slightly lower F3 than /l/. This small difference in F3 of the laterals is not significant at the midpoint of the lateral interval (C6), as shown in Table 5. For the schwas (from V1 to V11), the F3 curve of [əɭ] overlaps with that of [əl] at the beginning (from V1 to V3), but the two curves begin to diverge at around V3, and the F3 curve of [əɭ] is higher than that of [əl] for the rest of the schwa interval.
4 Discussion
This section aims to discuss the results reported above by analysing the observed phenomena and providing possible explanations.
4.1 Laterals and the following schwas in the Zibo dialect
In summary, as already reported for the Zibo dialect (Meng & Luo 1994), there are two laterals in this dialect, namely the retroflex lateral /ɭ/ and the non-retroflex lateral /l/. /ɭ/ appears in a limited phonemic context: it is only followed by the mid vowel /ə/, while /l/ can be followed by most vowels (including nasalized ones) in the vowel inventory. In this production experiment, since /ɭə/ and /lə/ are recorded in the same phonetic context in the carrier sentence, F3 of /ɭ/ is expected to be lower than that of /l/, as has been reported for the retroflex laterals of other languages (Tamil, Malayalam, Kannada and some Australian Aboriginal languages), and the following schwas might show different phonetic realizations due to coarticulation with the different initial laterals. Some of the findings, however, are unexpected.
Overall, the findings of the static acoustic analysis reveal several acoustic cues produced by all speakers in the production experiment: /ɭ/ has a significantly larger C/V duration ratio in the monosyllabic word, and the F1 of [əɭ] is significantly lower than that of [əl]. This is in line with Qi’s (2019) study of the two laterals in some other dialects in Shandong province. In contrast, a further acoustic cue found in the Zibo dialect is that F1 of /ɭ/ is significantly lower than that of /l/; individual data analyses show that 19 out of 22 speakers have a significantly lower F1 in /ɭ/ than in /l/, while no significant difference is found in the data of the other three speakers. The difference in F1 between the two laterals is small and is not consistent across all speakers, which suggests that F1 of the laterals may not be an important acoustic characteristic. No significant difference is found for F2 or F3 of the two laterals. The finding for F3 of the retroflex lateral in the Zibo dialect is consistent with previous studies of dialects of Chinese (Zhu & Jiao 2006; Dong & Liang 2019; Qi 2019), but differs from studies of retroflex laterals in other languages of the world, e.g., Tamil, Malayalam, Kannada and some Australian Aboriginal languages. The acoustic data also show that the following schwas, i.e. [əɭ] and [əl] in this study, have obvious formant differences, especially in F1.
4.2 A phonemic /ɭ/ with evidence from the production experiment
In this study, we support the view that there are two lateral phonemes (/ɭ/ and /l/) in the Zibo dialect and the schwas following the two laterals could be allophones of a phonemic /ə/. The explanations are provided as follows.
4.2.1 Duration ratio
In the acoustic findings for the two laterals, an obvious distinction is the duration ratio of the lateral in the syllable. Segment duration is not phonologically contrastive in the Zibo dialect: there is neither a long/short vowel contrast as in Japanese (Hirata 2004) and German (Hertrich & Ackermann 1997), nor consonantal gemination as reported for Malayalam (Tabain & Kochetov 2018). Despite this, segment duration is still important, since it may affect the clarity with which a segment is perceived and even compensate for a lack of spectral distinction, as in some Russian vowels (Kouznetsov 2003).
As the retroflex and non-retroflex contrast occurs infrequently or is often neutralized in word-initial position (Ohala & Ohala 2001; Hamann 2003b; Tabain et al. 2016; Tabain & Kochetov 2018), this study assumes that the longer duration of /ɭ/ in the Zibo dialect may serve to maintain its distinctiveness from the alveolar lateral and make this segment more perceptually distinct in word-initial position. Perceptual evidence is needed to (dis)confirm this point.
In some special cases, several productions of the retroflex lateral by speakers M01, M02 and M04 (all of them aged above 55) are even syllabic, as shown in Figure 7. This echoes the syllabic /ɭ̩/ reported in the Jincheng dialect (Zhu & Jiao 2006). The reason might be that some older male speakers occasionally produce the retroflex lateral with no obvious release of the lateral constriction into the following schwa, for ease of articulation. These could be seen as extreme cases in which /ɭ/ is held even longer and takes up the whole syllable. In this case, an atypical [əɭ] with a higher tongue position would not be necessary to maintain the distinction between /ɭ/ and /l/.
Moreover, it is recorded in Zhang (1999), and also reported by speakers in this study, that when the pronunciation of /ɭəɭ/ is prolonged, speakers tend to lengthen the /ɭ/ part of the syllable, whereas when the pronunciation of /ləl/ is prolonged, speakers are more likely to lengthen the schwa. The distinction between /ɭəɭ/ and /ləl/ is thus made by increasing the duration of different parts of the syllable. This could also be evidence for a phonemic /ɭ/ in the Zibo dialect.
4.2.2 Phonetic enhancement and maximum perceptual contrasts
From the analyses above, /ɭ/ and /l/ in the Zibo dialect appear to belong to different phonemes. However, the acoustic results, especially the dynamic formant curves of /ɭə/ and /lə/, show that the two schwas, which are considered allophones of the same phoneme, differ more in their spectra than the two laterals do (see Figure 6).
The contradiction can be explained by the theory of phonetic enhancement. This theory refers to the employment of secondary phonetic parameters to facilitate the perception of the primary phonetic feature in the implementation of a phonological contrast (Stevens et al. 1986; Stevens & Keyser 1989, 2010; Keyser & Stevens 2006). According to Stevens et al. (1986), groups of distinctive features tend to be implemented simultaneously to form segments, and a given distinctive feature can be represented in a sound with varying degrees of strength, which in turn can be enhanced by its co-occurrence with other features. For example, in English the auditory distinctiveness of intervocalic /b/ and /p/ is signaled by a shorter duration of a silent interval for /b/, while by lengthening the duration of the preceding vowel when producing a /b/, speakers co-vary silent durations with longer preceding vowels to produce a clearer auditory distinction between /b/ and /p/ (Lotto & Holt 2016). The enhancement is more effective when the distinction occurs domain-finally, where the primary distinctiveness is weak (Stevens et al. 1986).
As already noted, the retroflex lateral /ɭ/ is rare, and even rarer in the word-initial position (Steriade 1995; Ohala & Ohala 2001; Hamann 2003b; Tabain et al. 2020a). However, according to Zibo phonology, laterals only occur syllable-initially (also word-initially in monosyllabic words). In order to keep such a weak acoustic contrast in the word-initial position or CV structure perceptible, the contrast is expected to be enhanced by the features of its phonetic context.
Strategies for maintaining maximal perceptual contrast vary across languages, speakers and contexts (Stevens & Keyser 2010). The phonetic enhancement reported for the retroflex versus non-retroflex contrast in an Australian language (Tabain et al. 2020a) is that retroflexes rely on the preceding vowel for correct identification, since the differences between retroflex and alveolar nasals are compromised by vowel nasalization. For the retroflex lateral in the Zibo dialect, the place where phonetic enhancement most likely occurs for maximal phonetic contrast is the following vowel, /ə/ in this case. We will explain this from both the temporal and the spectral aspects of the segments.
The literature has shown that the frequency response of the human auditory system is not linear: the auditory system is more sensitive to frequency changes at the low end (between 100 and 1000 Hz) of the audible frequency range than at the high end (between 1000 and 10,000 Hz) (Ladefoged 1996; Johnson 2012). Thus, phonetic variation below 1000 Hz in the formants of the following schwa can better enhance the distinction between the preceding laterals. For /ɭəɭ/ and /ləl/ in the Zibo dialect, since /ɭ/ has a lower F1 than /l/, a lower F1 in its following schwa [əɭ] can co-occur with it to produce a robust contrast enhancement.
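To make the non-linearity concrete, the Python sketch below compares the auditory size of the same 100 Hz shift at a low and at a high frequency. The Bark approximation (Traunmüller 1990, without its end-of-scale corrections) and the example frequencies are assumptions chosen for illustration, not values from this study.

```python
def hz_to_bark(freq_hz):
    """Approximate Hz-to-Bark conversion (Traunmüller 1990, uncorrected form)."""
    return 26.81 * freq_hz / (1960.0 + freq_hz) - 0.53


def bark_distance(freq_a_hz, freq_b_hz):
    """Absolute auditory distance in Bark between two frequencies in Hz."""
    return abs(hz_to_bark(freq_a_hz) - hz_to_bark(freq_b_hz))


# The same 100 Hz shift in an F1-like region and in an F3-like region
# (hypothetical frequencies).
low_region = bark_distance(500.0, 600.0)     # roughly 0.83 Bark
high_region = bark_distance(2500.0, 2600.0)  # roughly 0.26 Bark
```

Under this approximation, a 100 Hz change around 500 Hz is about three times as large in Bark as the same change around 2500 Hz, which is consistent with the greater weight given here to F1 differences in the schwa.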
The vowel charts in Figure 3 also help to explain why, as a variant, [əɭ] has a lower F1 than [əl]. In both the male and the female vowel charts, [əl] is lower in the space, while [əɭ] occurs around the central part of the chart, which is less dense. According to vowel dispersion theory (Liljencrants & Lindblom 1972; Lindblom 1986; ten Bosch 1987), vowel systems generally tend to be optimal, in the sense that vowels are spread evenly in the available vowel space so as to maximize contrasts. Therefore, [əɭ] is a reasonable variant of [əl]: it occupies a region of the vowel space where vowel density is low, maximizing its contrast with surrounding vowels, while keeping an F2 similar to that of [əl].
4.2.3 Geographic and typological factors
Apart from the above explanations, the retroflex lateral /ɭ/ is also reported in some dialects in other cities in Shandong province, such as Lijin, Qingdao, Linyi and Heze (Yang 1990; Qingdao Municipal Archive 1997; Ma & Wu 2003; Qi 2019); for example, in Ma and Wu (2003), /ɭ/ is recorded as being followed by a weak schwa in Linyi dialect. In addition, Gao (2013) has stated that some Shandong dialects have a non-typical weak schwa following the retroflex lateral /ɭ/, and the realizations of this schwa may vary across dialects. Qian et al. (2001) also report that phonologically, a rich inventory of initials (e.g., consonants), a simple system of finals (e.g., vowels) and a simple system of tones are the characteristics of Shandong dialects. Therefore, the typological characteristics of dialects in Shandong support the existence of /ɭ/ in the Zibo dialect. Taking /ɭ/ and /l/ as two phonemes with [əɭ] and [əl] as two allophones of the phoneme /ə/ may be more convincing.
4.2.4 Individual variation and a possible change in progress
Although the difference is not significant in the LME models for the pooled data, a significantly lower F3 in /ɭ/ is still found for some speakers (five out of 11 speakers in the male group and four out of 11 speakers in the female group); see Figures A1–A4 and Table A3 in Appendix B.
Retroflexion is said to affect mainly higher formants, which are generally lowered (Ladefoged & Maddieson 1996), and specifically the third formant shows a characteristic lowering (Stevens & Blumstein 1975; Hamann 2003b). This is found in language-specific investigations of retroflex consonants (e.g., stops, laterals, rhotic trills) in many languages, such as some Australian languages (Hamilton 1996; Tabain et al. 2016) and Malayalam (Dart & Nihalani 1999; Tabain & Kochetov 2018). Hamilton (1996) claims that the lowered trajectory of F3 is what actually distinguishes retroflexes from other coronals, and the general consensus suggests that a lower F3 is sufficient to distinguish retroflexes from their non-retroflex counterparts. Previous studies of the retroflex fricatives in Standard Mandarin find that this retroflex does not necessarily involve the tongue tip in its articulation and thus shows no bending backwards of the tip at all, which is very different from the same segment class in Tamil (Ladefoged & Wu 1984; Ladefoged & Maddieson 1996; Lee 1999). Its place of constriction is the post-alveolar region, but it differs from the traditional laminal post-alveolar [ʃ] and from the retroflex fricative in Tamil in the shape of its tongue body, which is flatter.
Although Lee (1999) refrains from referring to this segmental class as ‘retroflex’ due to its non-typical retroflex articulation, his finding of a lower frequency range for retroflex fricatives conforms to the phonetic literature that treats the Mandarin post-alveolar fricative as retroflex (Chao 1948, 1968; Ladefoged & Wu 1984; Ladefoged & Maddieson 1996).
Similarly, since the retroflex lateral in the Zibo dialect is not articulated with a subapical articulation and does not necessarily involve bending backwards of the tongue tip, it is not a typical retroflex lateral like that in Tamil or in the other languages mentioned above. In the present study, taking into account the fact that averaging results might blur important individual differences (Lorenc et al. 2023), we consider that although no significant difference is found in F3 between the two laterals in the LME models, a lower F3 may still be important for some speakers in distinguishing /ɭ/ from /l/ in their production (see Section 3.1.2). In articulation, retroflexion is traditionally described as an articulation with bending backwards of the tongue (Trask 1996); Hamann (2003b), by contrast, has proposed four articulatory characteristics of retroflexion, namely apicality, sublingual cavity, posteriority and retraction. Not all of them are required to occur to the same degree in all instances of retroflex segments, but the more of these properties a segment has, the more it is like a prototypical retroflex. In this study, the variation in individual data could suggest variability in the degree of retroflexion, which would confirm an inherent variability of retroflexes as a sound category. This may also be indicative of a change in progress towards the disappearance of retroflexion in /ɭ/ and the merging of the two laterals.
5 Conclusion
The study has presented an investigation of the acoustic differences between the retroflex and non-retroflex (alveolar) laterals as well as the schwas following the two laterals in the Zibo dialect.
For the first two research questions in Section 1.4, it is found that F1 is significantly lower in /ɭ/ than in /l/, and that the duration ratio of the lateral in /ɭə/ is significantly larger than that in /lə/. No significant difference is found for F2 or F3 of the two laterals. For the schwas, F1 of the schwa following /ɭ/ is significantly lower than that following /l/, and F3 of the schwa following /ɭ/ is significantly higher than that following /l/. In addition, the LME results and the inspection of individual data indicate that a larger C/V duration ratio and a lower F1 in the following schwa are the most robust acoustic differences between /ɭəɭ/ and /ləl/. Some inter-speaker variation is observed in the realization of /ɭ/, especially in F3, which may be indicative of a change in progress. The answer to the third research question in Section 1.4 is that the evidence suggests that both the retroflex lateral /ɭ/ and the alveolar lateral /l/ are phonemes, given the phonetic differences between them as well as the cue enhancement from the following schwa.
This study has provided acoustic data contributing to the understanding of lateral sounds in general. The results in this study are preliminary and further perceptual studies on the two laterals as well as two schwas are needed to (dis)confirm the results in production. In addition, articulatory research on the retroflex lateral in the Zibo dialect and other dialects in China is also necessary in order to investigate the position of constriction, the tongue shape and movement of the tongue during its articulation.
Acknowledgments
We would like to thank Dafydd Gibbon, Yang Xiaohu, Yu Jue and Wang Ting for their valuable comments on this work, as well as Marzena Zygis and three anonymous reviewers for their constructive feedback and suggestions on previous versions of this article. We are also thankful to Adele Gregory for kindly answering all our questions through the submission process.
Supplementary material
To view supplementary material for this article, please visit https://doi.org/10.1017/S0025100324000094
Appendix A. Wordlists
Note: All the tone values in the Phonemic column are citation tones. The tones of some compounds are realized with tone sandhi in the actual productions, which is not presented here. The target tokens used in this study are extracted where they appear as a monosyllabic word in the carrier sentence (the second /ɭə/ or /lə/ in the carrier sentence), not directly from the compound words. During the recording, speakers read the words shown in random order.
Appendix B. Paired sample t-test results of individual data grouped by gender and age
Note: * means p < .05, ** means p < .01, *** means p < .001; the t-value and p-value are provided where no significant difference is found.