1. Introduction: Imilike Igbo
In this paper, we examine the acoustics of vowels in the Imilike [ìmìlìkè] dialect of Northern Igbo (Igboid, Niger-Congo) (Manfredi 1989). Imilike is spoken in several towns and villages in Enugu State, Nigeria (Nweya 2013, 2015). Standard Igbo has eight vowels (Emenanjo 1987; Ikekeonwu 1991; Emenanjo 2016), but auditorily-based research has identified eleven vowels in Imilike, as shown in Figure 1 (Nweya 2015). Like Standard Igbo, Imilike contrasts vowels in height, backness, and tongue-root advancement (ATR). The three additional vowels that do not exist in Standard Igbo are the RTR mid front vowel [ɛ] and both the ATR and RTR versions of a nonlow central vowel, represented as [ə̘] and [ə̙] respectively. ATR and RTR schwas have also been identified in other dialects of Northern Igbo (Ikekeonwu et al. 1999; Mbah & Mbah 2010). In all documented dialects of Igbo, the low vowel [a] is occasionally central but slightly forward (e.g., Uguru 2015 on Ika Igbo, Emenanjo 1987: 5 on Standard Igbo), as illustrated in Figure 1.
Using X-ray cinematography, Ladefoged and Maddieson (1996) show that tongue-root retraction alongside the raising of the larynx distinguishes RTR vowels from their ATR counterparts in Standard Igbo. The labels RTR and ATR are meant to distinguish the two series and do not signify which articulatory structures cause the movements; with further data, these labels could change. Nevertheless, based on these results, a distinction in the language has been recognized by both speakers and linguists. It has been called ATR/RTR based on initial observations, and it could equally well be called raised/lowered larynx based on current evidence. This paper addresses the underlying phonetic cues in order to provide further evidence about the mechanism. Studies hypothesize that these same articulatory gestures distinguish the ATR and RTR schwas in Northern dialects of Igbo, including Imilike (e.g., Ikekeonwu et al. 1999; Mbah & Mbah 2010; Nweya 2015). However, this claim has not been investigated either acoustically or articulatorily. The studies that identify ATR and RTR schwas in Northern Igbo are auditorily based and were carried out mostly by speaker-linguists of the dialects (e.g., Ikekeonwu et al. 1999; Mbah & Mbah 2010; Nweya 2015). Consequently, their findings also rest on their intuitions as native speakers of the language.
Unlike the other vowels in the language, the two schwas do not occur in V-initial syllables but can occur in other environments. With respect to the ATR/RTR feature, all the vowels exhibit contrastive distribution, as shown in the minimal pairs in (1).
(1) Minimal and near-minimal pairs with vowels
The vowel [ɛ] is contrastive, but it occurs in free variation with the vowel [a] in some words, as shown in (2). While [ɛ] varies freely with [a], there are instances of [a] which do not vary with [ɛ]. This suggests that [ɛ] and [a] are distinct vowels.
(2) [ɛ] in free variation with [a] (Nweya 2015: 157)
The other vowels that occur in free variation with other vowels are the schwas. In this case, the two schwas in most words occur in free variation with high and low vowels, as shown in (3a). Whether the allophones of the schwas are high or low vowels depends on the word, but the schwas and their corresponding peripheral vowels in free variation have the same ATR feature. As a result of this, Nweya (2015) suggests that the schwas are not contrastive. That said, there are cases of words, such as the examples in (3b), where the schwas are invariant.
(3) Words with schwas in Imilike
Similar to all varieties of Igbo and many African languages (Pulleyblank 1986 on Okpẹ (Edoid, Nigeria), Archangeli and Pulleyblank 2002 on Kinande (Bantu, DR Congo), Morton 2012 on Anii (Kwa, Ghana/Togo), Noske 1996 on Turkana (Nilotic, Kenya), Angsongna and Akinbo 2022; Angsongna 2023 on Dagaare (Mabia, Ghana), etc.), the grouping of vowels into ATR and RTR in Imilike is based in part on their phonological patterning in tongue-root harmony, which involves vowels in a specific domain obligatorily agreeing in the feature ATR/RTR. In this case, all the vowels in a specific domain are either ATR or RTR, as shown in (4a). Nonlow vowels only cooccur with vowels of the same ATR/RTR value, but the vowel [a] can cooccur with either ATR or RTR vowels, as shown in (4b).
(4) Root-internal vowel cooccurrence
Similar to Standard Igbo (Zsiga 1992), ATR harmony also occurs across stem-affix boundaries in Imilike. The root vowels trigger harmony which targets prefixal and suffixal vowels, as illustrated in (5) where the targets are in boldface. All the ATR and RTR vowels, including the schwas, can be triggers and targets of tongue-root harmony. The low vowel [a] consistently triggers RTR harmony, as in (5b). While the vowel [a] can occur with ATR and RTR vowels, there are instances, such as the prefix in (5d), where the low vowel [a] alternates with [e] to agree in tongue-root feature with the adjacent stem vowel.
(5) Tongue-root harmony across stem-affix boundary
In this paper, we conduct an acoustic study of Imilike Igbo, which has not previously been done. We confirm the eleven vowels acoustically and examine the phonetic correlates of the ATR contrast in the language. We find that the eleven vowels are most reliably distinguished by F1, B1, the energy (dB) of voiced sound below 500Hz and duration.
The paper is organized as follows: Section 2 gives some background on the feature ATR, including its typical phonetic correlates. Section 3 presents the research methodology, including the number of participants and the acoustic measurements. The results of the investigations are presented in Section 4. In Section 5, we present the discussion and conclusion of our investigation.
2. Background: The ATR feature
Advanced Tongue Root, or ATR, is a property of vowels that is very common in the languages of Sub-Saharan Africa, including both the Niger-Congo and Nilo-Saharan language families, along with certain other families outside of Africa (e.g., Tungusic). Vowels in such languages are either advanced (ATR) or retracted (RTR). Many languages with ATR contrasts also have ATR harmony, in which vowels within a domain (typically a word) must be either all ATR or all RTR.
A wide variety of phonetic correlates have been attributed to ATR, with variation both across languages and between speakers of the same language. For example, an X-ray study of Luo showed that different speakers differentiated ATR from RTR vowels using different articulatory mechanisms (Jacobson 1979). While it has been repeatedly shown that F1 is the usual acoustic correlate of ATR across a variety of languages (with RTR vowels having higher F1 than their ATR counterparts), the reliability of other properties like F2 and F3 varies from one language to another (Przezdziecki 2005; Starwalt 2008). Studies have also shown that the bandwidth of the first formant (B1) can distinguish ATR/RTR contrasts in some languages (Hess 1992; Ladefoged & Maddieson 1996; Fulop et al. 1998). According to Ladefoged and Maddieson (1996: 301–302), the narrower bandwidths of ATR vowels are ‘probably because there is greater tension of the vocal tract walls and fewer acoustic losses in the region of the resonances’. However, Ladefoged and Maddieson (1996) did not confirm this claim in their study. An increased B1 has also been shown to correlate with breathiness (Klatt & Klatt 1990; Silverman et al. 1994). However, ATR vowels, which have breathiness as a characteristic in certain languages, are shown to have lower B1 (Reh 1996; Guion et al. 2004).
Breathy phonation as a property of ATR vowels can be understood if we consider the observation that many factors contribute to B1 values, including glottal configuration and subglottal pressure (Fant 1979; Silverman et al. 1994; Park 2002). Recent articulatory studies of the lower vocal tract observe that voice quality as an additional property of vowels in the ATR system is a result of the ‘physiological states that synergize with epilaryngeal constriction’ including tongue retraction (Moisik et al. 2021: 173; see also Edmondson et al. 2007; Esling et al. 2019; Moisik 2013).
While tongue-root advancement and retraction are considered the articulatory gestures of ATR and RTR vowels respectively, studies suggest that the vertical movement of the larynx also distinguishes the pairs (Esling 1996; Ladefoged & Maddieson 1996; Elgendy 2001; Edmondson et al. 2007). Research also indicates that there is a negative correlation between vowel height and larynx height (Hoole & Kroos 1998), and that the raising of the larynx shortens the length of the vocal tract (Sundberg & Nordström 1976; Lee et al. 1999; Fitch & Reby 2001). The results of laryngoscopic studies on the ATR/RTR contrast in Kabiye (Gur, Togo) and Akan (Kwa, Ghana) show that, in the RTR set, the aryepiglottic folds come together medially but leave room antero-posteriorly for continuant vowel production (Edmondson & Esling 2006b; Edmondson et al. 2007; Esling 2012). Such other articulatory properties also have measurable acoustic correlates. For instance, acoustic studies indicate that aryepiglottic compression correlates with an increase in the values of spectral centre of gravity, which represents the average frequency across the entire frequency domain (Edmondson & Esling 2006a; Starwalt 2008).
Analyses of ATR in some languages have found voice quality to differentiate ATR vowels from their RTR counterparts. This is notably the case in many Nilotic and Nilo-Saharan languages, in which ATR vowels are breathy voiced when compared to their RTR counterparts (Tucker & Mpaayei 1955; Halle & Stevens 1969; Reh 1996; Guion et al. 2004). In contrast, voice quality is not generally considered a reliable cue for ATR/RTR contrasts in Niger-Congo languages, though it has only been investigated acoustically in very few of them.
Finally, ATR/RTR contrasts may be associated with durational differences in some languages, including ones with contrastive vowel length. For instance, in their work on Tugen (a variety of Kalenjin), Local and Lodge (2004) find that RTR vowels are shorter than ATR vowels. Olejarczuk et al. (2019) also find that RTR vowels are shorter than their ATR counterparts in Komo (Nilo-Saharan, Sudan and Ethiopia). To our knowledge, duration as a cue to ATR/RTR contrasts has not been investigated in Niger-Congo languages.
The phonetics of ATR/RTR contrasts have been investigated in a limited number of languages. In this paper, we examine a large number of potential phonetic correlates of ATR in Imilike, in order to thoroughly document the phonetic cues in this system.
3. Methodology
3.1 Stimuli, participants and procedure
The stimuli in this work are based on the Swadesh 100-word list. The data come from two speakers of Imilike, one male (age 40) and one female (age 28), in Nsukka, Nigeria. Both speakers pronounced the same word list. The data were elicited in a quiet room with a Zoom H5 recorder, at a sampling rate of 48 kHz and a bit depth of 16 bits. The vowels of the words were manually annotated in Praat for acoustic analysis (Boersma & Weenink 2021). Using the scripts written by Riebold (2013) and Xu (2013), acoustic measurements were extracted from the annotations. Two scripts were used in order to obtain all the measurements.
Given that the data is from a word-list elicitation, we could not control for tone, consonant types and word positions. Furthermore, the vowels in the word list are not evenly distributed, as shown in Table 1. None of the words in the list were repeated in the elicitation by the two speakers, and about eighty words on the list have at least two vowels. However, variations of some words were produced by the speakers. In the next section, we present the statistical models that are utilized in this work.
3.2 Statistical analysis
The values of the acoustic parameters were z-score normalized for each speaker in order to increase comparability across the two speakers. To indicate that the values of the parameters have been normalized, the prefix ‘norm’ is attached to the label of the parameters (i.e. normF1, normDuration, etc.). Linear mixed effects (LME) models were fitted for each normalized acoustic parameter, with a fixed effect of tongue root (i.e. ATR or RTR) and a random effect of participant. This was done using the package ‘lmerTest’ in R (Kuznetsova et al. 2017). In order to identify the specific differences between a given pair of vowels with different ATR and RTR features, we did posthoc tests repeating the same statistical model within subsets of each relevant vowel pair (i.e., [i, ɪ], [u, ʊ], [e, ɛ], [o, ɔ], [ə̘, ə̙], [ə̘, a] and [a, ə̙]). The null hypothesis in each case is that tongue root has no effect on the distribution of the acoustic parameters. To calculate the correspondence between acoustic parameters, we used the Kendall rank correlation coefficient, which measures the strength and direction of the monotonic (rank-order) association between two variables. In the next section, we present the results of the statistical analysis.
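The per-speaker normalization described above can be sketched as follows. This is a minimal illustration rather than the scripts used in the study: the function name `zscore_by_speaker`, the dictionary layout, and the token values are all hypothetical, and the use of the sample standard deviation is an assumption, since the text does not say which estimator was applied.

```python
from collections import defaultdict
from statistics import mean, stdev


def zscore_by_speaker(rows, key):
    """Return copies of rows with an added 'norm<key>' field.

    Each value is centred on its own speaker's mean and divided by that
    speaker's sample standard deviation (an assumption, see lead-in).
    """
    values = defaultdict(list)
    for row in rows:
        values[row["speaker"]].append(row[key])

    stats = {}
    for speaker, vals in values.items():
        if len(vals) < 2:
            raise ValueError(f"speaker {speaker!r} needs at least two tokens")
        sd = stdev(vals)
        if sd == 0:
            raise ValueError(f"speaker {speaker!r} has zero variance in {key}")
        stats[speaker] = (mean(vals), sd)

    normalized = []
    for row in rows:
        centre, sd = stats[row["speaker"]]
        new_row = dict(row)
        new_row["norm" + key] = (row[key] - centre) / sd
        normalized.append(new_row)
    return normalized


# Hypothetical F1 values in Hz, not data from the study.
tokens = [
    {"speaker": "M", "vowel": "i", "F1": 300.0},
    {"speaker": "M", "vowel": "e", "F1": 500.0},
    {"speaker": "M", "vowel": "a", "F1": 700.0},
    {"speaker": "F", "vowel": "i", "F1": 400.0},
    {"speaker": "F", "vowel": "a", "F1": 800.0},
]
normalized = zscore_by_speaker(tokens, "F1")
```

After this step each speaker's values have mean 0 and unit standard deviation, which is what makes normF1 comparable across the male and the female speaker.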
4. Results
Ten acoustic parameters were measured for the annotated vowels in order to detect the acoustic correlates of ATR features in Imilike. The parameters are three vowel formants, formant bandwidth, formant dispersion, energy (dB) of voiced sound below 500Hz, cepstral peak prominence, harmonics, duration and centre of gravity. The parameters were selected based on evidence in previous studies (Ladefoged & Maddieson 1996; Fitch 1997; Fulop et al. 1998; Starwalt 2008; Edmondson et al. 2007, etc.).
The formant plot of the vowels in Imilike is presented in Figure 2. As shown in the plot, the schwas and the low vowel [a] are central, but the ATR schwa is further front than the RTR schwa.
The other acoustic measurements are presented in boxplots in Figures 3–21. In each graph, the y-axis contains the (normalized) acoustic measurements such as F1, and the x-axis shows the vowels, presented in ATR/RTR pairs. We focus first on paired vowel qualities, before discussing the unpaired vowel [a] in comparison with the other central vowels. In all cases, we treat significance as meaning p<0.05, which is indicated with a star ‘*’.
F1 is significantly higher for RTR vowels compared to their respective ATR counterparts, as shown in Figure 3. The result of F1 in our study is consistent with the findings in previous studies.
The results of F2 and F3 are presented in Figure 4 and Figure 5, respectively. The result in Figure 4 shows that F2 is higher in ATR vowels than in their respective RTR counterparts. However, F2 only significantly differentiates ATR from RTR in the pairs [u, ʊ], [e, ɛ] and [ə̘, ə̙], and not in the pairs [i, ɪ] and [o, ɔ] (p>0.05). As shown in Figure 5, the values of F3 are higher in RTR vowels than in their respective ATR counterparts. However, F3 only significantly differentiates ATR from RTR in the pairs [e, ɛ] and [o, ɔ], and not in the pairs [i, ɪ], [u, ʊ], and [ə̘, ə̙]. The observed pattern of ATR–RTR distinction in our study, as reflected in the F2 and F3 values, aligns with what previous research has found.
We turn to the results of B1, as shown in Figure 6. In line with the observation of Ladefoged and Maddieson (1996), B1 is higher in RTR vowels when compared to their respective ATR counterparts in our study. A higher B1 means a wider F1 bandwidth. The effect of tongue root on B1 is statistically significant for all ATR/RTR pairs (p<0.05).
As noted in previous studies, larynx raising and lowering synergize with tongue-root retraction and advancement respectively, but the rate of larynx raising and lowering varies by vowel (Moisik 2013; Moisik et al. 2021). However, the acoustic correlate of larynx raising has not been previously investigated for RTR vowels. To fill this gap, we measured formant dispersion, which is the average distance between adjacent formants, here computed over the formants up to F3. Lower formant dispersion means that the formants are closer together. Given that there is a negative correlation between formant dispersion and vocal-tract length (Fitch 1997; Feinberg et al. 2005; Evans et al. 2006), and that larynx raising shortens the vocal tract, the prediction is that RTR vowels will have higher formant dispersion when compared to their ATR counterparts. However, as shown in Figure 7, the value of formant dispersion is higher when the vowel is ATR but lower when the vowel is RTR. With the exception of the pair [ə̘, ə̙], this effect is significant for all pairs of vowels (p<0.05). This result is contrary to the prediction; we expected formant dispersion to be higher in RTR vowels, not vice versa.
Our contrary result is probably due to the fact that formant dispersion, as used in Fitch (1997), only considers the length of the vocal tract, without taking into account lingual and laryngeal constrictions. It is also worth mentioning that formant dispersion has not previously been used for vowel identification, let alone for ATR–RTR distinctions. Comparing across pairs (e.g. [i] vs. [u]), we observe that the value of formant dispersion is higher in front vowels than in back vowels. We therefore examined whether our unexpected result on formant dispersion could have come from other vowel parameters, such as height or frontness, by examining the correlation of formant dispersion with F1 and F2. We present the results of the correlation coefficient in Figure 8.
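The reasoning behind the formant dispersion prediction can be made concrete with a small sketch. With only F1 to F3, the average gap between adjacent formants reduces to (F3 − F1)/2. The second function uses the textbook idealization of the vocal tract as a uniform tube closed at the glottis, where the k-th resonance is (2k − 1)c/(4L) and adjacent resonances are c/(2L) apart; that idealization, the speed of sound of 350 m/s, the tube lengths, and both function names are assumptions introduced here for illustration and are not taken from the study.

```python
def formant_dispersion(formants_hz):
    """Mean distance in Hz between adjacent formants, lowest first."""
    if len(formants_hz) < 2:
        raise ValueError("need at least two formants")
    gaps = [hi - lo for lo, hi in zip(formants_hz, formants_hz[1:])]
    return sum(gaps) / len(gaps)


def uniform_tube_formants(length_m, n_formants=3, speed_of_sound=350.0):
    """Resonances of an idealized uniform tube closed at one end.

    This is a simplifying assumption: real vowels involve constrictions
    that move individual formants away from these values.
    """
    return [
        (2 * k - 1) * speed_of_sound / (4 * length_m)
        for k in range(1, n_formants + 1)
    ]


# Hypothetical tube lengths: a longer and a shorter vocal tract.
longer = formant_dispersion(uniform_tube_formants(0.175))
shorter = formant_dispersion(uniform_tube_formants(0.150))
print(f"dispersion at 17.5 cm: {longer:.1f} Hz")
print(f"dispersion at 15.0 cm: {shorter:.1f} Hz")
```

Under this idealization a shorter tube always yields a larger dispersion, which is the basis for expecting higher values in vowels produced with a raised larynx. The measure is driven entirely by the spacing of F1 to F3, so changes in vowel height or backness also move it.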
The results of the correlation coefficient indicate that formant dispersion has a significant negative correlation with F1 and a significant positive correlation with F2 (p<0.05). The correlation between F2 values and formant dispersion is in line with the findings that the vocal tract is shorter in the production of front vowels (see Wood 1986; Hoole & Kroos 1998; Janssen et al. 2019; Moisik et al. 2021).
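For readers unfamiliar with the Kendall coefficient, the following sketch shows how such a rank correlation is computed from concordant and discordant pairs. It implements the tie-corrected variant usually called tau-b; whether the study used this variant is not stated, so that choice is an assumption, as are the function name and the toy values.

```python
from math import sqrt


def kendall_tau_b(x, y):
    """Kendall rank correlation with the tau-b correction for ties."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("x and y must have the same length, at least 2")
    concordant = discordant = tied_x = tied_y = 0
    n = len(x)
    for i in range(n):
        for j in range(i + 1, n):
            dx = x[i] - x[j]
            dy = y[i] - y[j]
            if dx == 0:
                tied_x += 1
            if dy == 0:
                tied_y += 1
            if dx * dy > 0:
                concordant += 1
            elif dx * dy < 0:
                discordant += 1
    total_pairs = n * (n - 1) // 2
    denominator = sqrt((total_pairs - tied_x) * (total_pairs - tied_y))
    if denominator == 0:
        raise ValueError("correlation undefined when a variable is constant")
    return (concordant - discordant) / denominator


# Toy values, not data from the study.
tau_mixed = kendall_tau_b([1, 2, 3, 4, 5], [3, 1, 4, 2, 5])
tau_monotone = kendall_tau_b([1, 2, 3, 4], [1, 8, 27, 64])
print(tau_mixed, tau_monotone)
```

Because the coefficient depends only on the ordering of the values, any strictly increasing relationship gives a value of 1 whether or not it is linear, and a negative value indicates that one measure tends to fall as the other rises.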
As noted earlier, an increase in the centre of gravity values, which is the average frequency across the entire frequency domain, is considered a correlate of aryepiglottic compression found in RTR vowels. To determine whether the distinction between ATR and RTR involves aryepiglottic compression, we also measured the centre of gravity for all the annotated vowels in our Imilike data, as shown in Figure 9. We find that centre of gravity significantly distinguishes all pairs of vowels, except [ə̘, ə̙], with values higher in RTR vowels than in ATR ones (p<0.05). This is consistent with the finding that RTR vowels are produced with aryepiglottic compression (Edmondson & Esling 2006a; Starwalt 2008).
Like formant dispersion, energy (dB) of voiced segments below 500Hz has not been previously used for examining ATR/RTR distinctions. Previous research suggests that vocal cord tensing might be a by-product of aryepiglottic constriction and larynx raising in the production of RTR vowels (Esling 1996: 81; Moisik 2013; Moisik et al. 2021). Given that vocal cord tensing has been shown to lower the energy (dB) of voiced segments below 500Hz (Tolkmitt et al. 1982; Scherer et al. 2002, 2017; Johnstone et al. 2005), we expect the RTR vowels in Imilike to have lower values for energy (dB) of voiced segments below 500Hz. To test this prediction, we measured energy (dB) of voiced segments below 500Hz. The results of our measurements are presented in Figure 10. The results show that the RTR vowels have lower energy (dB) below 500Hz when compared to their respective ATR counterparts. The difference between ATR and RTR vowels for energy (dB) below 500Hz is statistically significant for all pairs (p<0.05). This result is as expected, and it provides acoustic support for the claim that vocal cord tensing is an articulatory property of RTR vowels.
The findings in Nilo-Saharan languages indicate that RTR vowels may be shorter in duration than their ATR counterparts (Local & Lodge 2004; Olejarczuk et al. 2019). However, in Niger-Congo languages, RTR vowels may be longer than their ATR counterparts (Przezdziecki 2005). Consequently, vowel duration is a less consistent acoustic correlate of ATR across languages. Our results show that most RTR vowels in Imilike have longer durations relative to their ATR counterparts, with the exception of the pair [i, ɪ], as shown in Figure 11. This effect of ATR/RTR on duration is statistically significant (p<0.05) for all vowels, including the high front vowels.
Voice quality is said to be a reliable acoustic correlate of ATR/RTR contrasts in many languages, particularly in the Nilo-Saharan family. As such, although voice quality has not previously been shown to differentiate ATR/RTR pairs in Niger-Congo languages, it is worth investigating whether it plays a role in the ATR/RTR contrast in Imilike. Moreover, breathy voice does not seem to be auditorily apparent, but we wanted to confirm this acoustically. For this reason, we extracted a measurement of voice quality, specifically cepstral peak prominence (CPP) (Hillenbrand et al. 1994). As an acoustic correlate of voice quality, CPP values are lower in breathy and aspirated sounds when compared to modal voice (Hillenbrand et al. 1994; Blankenship 2002; Esposito & Khan 2012; Khan 2012; Seyfarth & Garellek 2018; Berkson 2019). As shown in Figure 12, ATR vowels have lower CPP than their RTR counterparts, but CPP only significantly distinguishes the pairs [e, ɛ] and [u, ʊ]. For the remaining pairs, the trend is in the expected direction, with ATR vowels having lower CPP, but the difference is not significant.
As an additional measurement of voice quality, we extracted the difference between the first and second harmonics (H1*−H2*), which has been corrected for the effects of the estimated formant filter on the harmonics’ amplitudes (Hanson 1997). Higher H1*−H2* values are associated with breathier sounds (Gordon & Ladefoged 2001; Garellek & Keating 2011; Abramson et al. 2015; Seyfarth & Garellek 2018). If the ATR vowels in Imilike are breathy voiced, we would expect them to have higher H1*−H2*. As shown in Figure 13, the direction of the difference is not consistent for H1*−H2*, and the difference is only significant for the pair [ə̘, ə̙]. As such, our results of CPP and H1*−H2* indicate that breathy voice does not appear to be used to differentiate ATR versus RTR in Imilike. In this regard, Imilike is comparable to Kabiye (Padayodi 2008).
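As an illustration of what the centre of gravity measures, the sketch below computes it as a weighted mean of the frequencies in a spectrum. The weighting of each frequency by its squared amplitude (power weighting) is an assumption, since the text does not state which weighting was used, and the function name and the three-point spectra are hypothetical.

```python
def centre_of_gravity(freqs_hz, amplitudes, power=2.0):
    """Weighted mean frequency of a spectrum.

    Each frequency is weighted by |amplitude| ** power. The default of 2
    (power weighting) is an assumption made for this sketch.
    """
    if len(freqs_hz) != len(amplitudes):
        raise ValueError("freqs_hz and amplitudes must have the same length")
    weights = [abs(a) ** power for a in amplitudes]
    total = sum(weights)
    if total == 0:
        raise ValueError("spectrum has no energy")
    return sum(f * w for f, w in zip(freqs_hz, weights)) / total


# Toy spectra, not data from the study: same bins, different balance.
bins = [100.0, 200.0, 300.0]
flat = centre_of_gravity(bins, [1.0, 1.0, 1.0])
tilted_up = centre_of_gravity(bins, [1.0, 1.0, 2.0])
print(flat, tilted_up)
```

Moving energy from low to high frequencies raises the value, which is why a higher centre of gravity is read as a sign of relatively more high-frequency energy in the vowel.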
So far, we have postponed the discussion of the vowel [a]. Based on its F1, the vowel [a] is clearly RTR, but we want to ensure it is significantly distinct from the schwas. Given that the vowel [a] is a central vowel like the vowels [ə̘] and [ə̙], we conducted pairwise comparisons for the three vowels by focusing on the acoustic parameters that are significant for most ATR/RTR pairs. As shown in Figure 14, the F1 value is higher in [a] than in [ə̘] and [ə̙], and this difference is significant (p<0.05). F2 is also able to significantly distinguish [a] from the other central vowels (p<0.05).
Figure 16 shows that the B1 value is significantly higher in [a] than in the schwas (p<0.05). This finding is expected based on the fact that [a] is RTR and based on the fact that it is lower than both schwa vowels. Note that the F1 and B1 values of the RTR vowels [ə̙] and [a] are higher than those of the ATR vowel [ə̘], which is consistent with the acoustic distinction in ATR/RTR pairs throughout the language. The F1, B1 and F2 measurements of [a] fit with expectations for auditorily distinguishing it from schwa.
As shown in Figure 17, formant dispersion is significantly lower for [a] than for RTR schwa, which is in turn significantly lower than ATR schwa (p<0.05).
The values of energy below 500Hz for the central vowels are presented in Figure 18. As shown in the figure, the energy below 500Hz is significantly higher in the ATR schwa [ə̘] than in the RTR counterpart [ə̙], which is in turn significantly higher than in the vowel [a] (p<0.05).
In Figure 19, the centre of gravity is significantly higher in the RTR vowel [a] than in the RTR schwa [ə̙], which in turn has a significantly higher centre of gravity than the ATR schwa [ə̘] (p<0.05). As such, [a] is significantly distinguished from the other central vowels on most acoustic measurements. This means that neither schwa is [a].
As mentioned in Section 1, the schwas occur in free variation with high and low vowels in some words. To understand whether the distribution of the schwas is consistent with their allophones in free variation, we also analyzed the schwas based on their allophones in free variation. Based on the distribution of the F1 and F2 values in Figures 20 and 21 respectively, the distribution of the schwas is consistent with that of their respective allophones in free variation. F1 and F2 values of the ATR and RTR schwas vary significantly depending on the vowels that they are in free variation with (p<0.05). This means that the schwas are different depending on what vowel they are in free variation with.
To investigate whether the amount of acoustic variation in the schwa is within the range of other pairs of vowels, we looked at the range and standard deviation of F1 for all the vowels. The results are presented in Table 2. The results show that the ranges of F1 for the schwas are comparable to those of other pairs of ATR and RTR vowels.
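The comparison of variability across vowels amounts to computing a range and a standard deviation of F1 for each vowel category. A minimal sketch of that tabulation follows; the function name `f1_spread`, the choice of the sample standard deviation, and the token values are assumptions for illustration and do not reproduce Table 2.

```python
from collections import defaultdict
from statistics import stdev


def f1_spread(tokens):
    """Map each vowel to (range, sample standard deviation) of its F1.

    tokens is an iterable of (vowel_label, f1_value) pairs. Vowels with
    fewer than two tokens are skipped because no deviation can be estimated.
    """
    grouped = defaultdict(list)
    for vowel, f1 in tokens:
        grouped[vowel].append(f1)
    return {
        vowel: (max(values) - min(values), stdev(values))
        for vowel, values in grouped.items()
        if len(values) >= 2
    }


# Hypothetical F1 values in Hz, not data from the study.
sample = [
    ("i", 300.0), ("i", 320.0), ("i", 340.0),
    ("a", 700.0), ("a", 800.0),
    ("o", 450.0),
]
spread = f1_spread(sample)
for vowel, (f1_range, f1_sd) in spread.items():
    print(f"{vowel}: range={f1_range:.1f} sd={f1_sd:.1f}")
```

If the schwas showed a much larger range or deviation than the peripheral vowels in such a table, that would suggest each schwa label covers more than one category; comparable values are consistent with a single category per schwa.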
In Table 3, we present the summary of our findings and a compatible articulatory model. For this work, we adopt the Phonological Potentials Model (PPM), which holds that physiological states in speech production are synergistic and anti-synergistic (Esling et al. 2019; Moisik et al. 2021). For example, under this model, the physiological state of tongue fronting has an anti-synergistic relation with tongue retraction but a synergistic relation with larynx lowering, which, in turn, synergizes with vocal fold opening and decreased vocal fold tension. While it has an anti-synergistic relation with tongue fronting, tongue retraction synergizes with epilaryngeal constriction, larynx raising, vocal fold closure, and increased vocal fold tension. However, the strength of the physiological states varies based on the segment they associate with. As Moisik et al. (2021: 174) illustrate, the sound [ɛ] involves greater tongue fronting than retraction. Similarly, the sound [o] involves greater tongue retraction than fronting.
In sum, we have been able to show that six of the acoustic parameters in this work were able to distinguish most of the vowels in Imilike, but the parameters F1, B1, and energy (dB) of voiced segments below 500Hz are the most reliable cues to the ATR/RTR distinction in the language. Other measures may also be effective, but further articulatory data are needed to motivate new acoustic studies.
5. Discussion and conclusion
The results here show that the most reliable acoustic cues for the ATR–RTR distinction in Imilike are F1, B1, energy (dB) below 500Hz and duration. In addition, F2, formant dispersion and centre of gravity also contribute to the distinction between ATR and RTR vowels but are not as reliable as F1, B1, energy (dB) below 500Hz and duration. Most importantly, the results are consistent with the existence of ATR and RTR schwas (Nweya 2013, 2015). Another significant aspect of this study is that formant dispersion, which is considered an acoustic correlate of vocal-tract length, can also distinguish all ATR/RTR pairs, except the ATR and RTR schwas. The results of this study also suggest that RTR vowels in Imilike might involve the constriction of the laryngeal articulator but the compression has no significant effect on the breathiness of the vowels.
Previous research on Northern dialects of Igbo suggests that the RTR vowels are pharyngealized (Obi 1979; Mbah & Mbah 2010; Nweya 2015). We are confident that pharyngealization is indeed a characteristic of the RTR set, considering that the results of our investigation coincide with many other acoustic patterns reported for ATR/RTR and other ATR languages with similar series of vowels. That the RTR vowels in Igbo are pharyngealized is further established if we consider that the values of centre of gravity and energy (dB) of voiced sound below 500Hz for the RTR vowels in Imilike are consistent with pharyngealization. Considering that the values of these two parameters vary based on vowel quality, it is plausibly the case that the degree of pharyngealization is higher in some RTR vowels than others.
Breathiness-type voice quality, which is said to accompany ATR vowels in other languages, does not appear to be a strong cue to ATR in Imilike, as the acoustic measurements for breathiness do not significantly differentiate most ATR/RTR vowel pairs.
The phonetic results suggest that there are two nonlow central vowels in Imilike, an ATR one and an RTR one. Assuming that these nonlow central vowels are reduced forms of their peripheral counterparts, this result has implications for our understanding of reduced vowels and how they behave in harmony systems. The phonetic data show clearly that schwa can operate across the ATR/RTR boundary like any other vowel pair (e.g., Tiede 1996, on ATR/RTR vowels in Akan). Further research should examine the theoretical implications of languages with ATR and RTR central and reduced vowels.
Given that the standard deviation (sd) of F1 in the schwas is within the range of variability that we see in other vowels (see Table 2), it is safe to assume that they are categorized into two vowels as claimed, despite the variation that we see depending on which vowel a given schwa is in free variation with. Within-category differences based on free variation of this nature are worth exploring in future research, but they do not affect the current results.
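The comparison of variability described above can be sketched as follows. The F1 values are invented for illustration and are not the measurements in Table 2; the helper name is hypothetical, and the check simply asks whether the sample standard deviation of one category falls between the smallest and largest standard deviations of the reference categories.

```python
from statistics import stdev


def sd_within_range(target_values, reference_groups):
    """Compare the SD of one category with the SDs of reference categories.

    Returns (target_sd, reference_sds, within), where `within` is True when
    the target SD lies between the smallest and largest reference SD.
    """
    target_sd = stdev(target_values)
    reference_sds = {label: stdev(values)
                     for label, values in reference_groups.items()}
    lowest, highest = min(reference_sds.values()), max(reference_sds.values())
    return target_sd, reference_sds, lowest <= target_sd <= highest


# Hypothetical F1 values (Hz); NOT data from this study.
schwa_f1 = [480, 510, 495, 530, 470]
other_vowels_f1 = {
    "i": [300, 310, 295, 305, 290],
    "a": [750, 800, 700, 780, 720],
}

target_sd, reference_sds, within = sd_within_range(schwa_f1, other_vowels_f1)
print(round(target_sd, 2), within)
```

A range check of this kind is only a descriptive heuristic; it does not replace a statistical test of whether the two schwas form separate categories.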
In Imilike, we found that the correlates of ATR are in line with what has been found in other West African languages, such as F1 distinguishing most pairs (Fulop et al. 1998; Starwalt 2008). However, correlates of ATR that have been found in some Nilotic languages do not appear to play a role in Imilike; there are no measurable phonation distinctions (cf. Reh 1996; Guion et al. 2004, on the Nilotic language Maa), and the duration results are opposite to those that were found in phonetic studies of ATR in Nilo-Saharan languages such as Komo (Olejarczuk et al. 2019) and Tugen (Local & Lodge 2004). As noted, for instance, by Rose (2018), ATR harmony may pattern differently in Niger-Congo and Nilo-Saharan languages. While too few languages have been analyzed phonetically to make firm generalizations, it would be worth investigating whether differences in the phonological patterning of ATR correlate with these kinds of differences in phonetic realization. If we take into account that vowel duration directly correlates with tongue-body height, the results in this work also point to tongue-body raising and lowering distinguishing ATR and RTR vowels respectively.
Our results for Imilike are consistent with RTR vowels being pharyngealized, as has been suggested in previous auditorily-based studies (Nweya 2015). This result is interesting because in other (non-African) languages where RTR is equated with pharyngealization, the RTR vowels are the marked vowels that trigger harmony (Zhang 1995; Casali 2003, 2008). As Rice (2007: 80) notes, ‘early loss’ in sound change is one of the diagnostics for marked segments. The fact that RTR vowels in Igbo (and, analogously, in other African languages with tongue-root harmony) are historically lost earlier than their ATR counterparts is consistent with RTR vowels being marked (Elugbe 1983; Casali 1995; Emenanjo 2016). The loss of [ɛ], already complete in Standard Igbo, might also be imminent in Imilike, given that the vowel presently occurs in free variation with [a] in certain words.
Casali (2003, 2008) considers Igbo to be an ATR-dominant system, and ATR is considered the marked feature value in ATR-dominant languages. In this case, there would be a distinction, at least in Imilike, between which class is phonologically marked (ATR) and the one that is phonetically marked (RTR, given the pharyngealization). However, given that harmony in Igbo is root-controlled (Zsiga 1992, 1997), the evidence for ATR-dominance in the system is somewhat unclear. Future research should look at the phonology of harmony in Imilike, to determine whether the phonetic markedness of the pharyngealized RTR vowels is reflected in any way in the phonological patterning of the system. Such studies, if done on multiple languages, could help disentangle the cross-linguistic differences in how ATR is realized and how it patterns phonologically.
As a future direction, we will examine whether there is a twelfth vowel, an allophonic counterpart of /a/, in ATR contexts. The vowel /a/ can behave in different ways in many languages with ATR harmony, such as Kabiye (Padayodi 2008, 2011) and Akan (O’Keefe 2004). In some cases, /a/ is neutral, while in others, it has an allophonic counterpart or harmonizes by pairing with a mid vowel. This can vary both within and between languages (Casali 2008). Our results showed a high level of variance for /a/ compared to other vowels for several of the measures, suggesting that there is greater variability in its pronunciation than in that of other vowels. While it is possible that this result is due to fewer vowels occupying the space around /a/, it is worth examining how /a/ behaves in the harmony system and whether it participates allophonically. As shown in Section 1, the vowel [a] in Imilike sometimes harmonizes to [e], which indicates that [a] is not always neutral in Imilike and other dialects of Igbo (Emenanjo 2016).
Before we conclude, it is important to note the limitations of this study. First, the research is based on data from two speakers of Imilike. A second limitation is that we did not control for consonant types and tones in our elicitation. Third, the number of tokens for each vowel is not evenly distributed. Considering that other Northern varieties of Igbo have the ATR and RTR central vowels like Imilike, we also hope to replicate this study with other varieties. Given that glottalized consonants are restricted to the environments of RTR vowels in Igbo dialects such as Mbaise and Owerri (Anoka 1985; Manfredi 1989), future research should investigate whether some Imilike consonants also have ATR features.
In sum, this study finds that Imilike has eleven vowels, in line with previous auditorily-based research on the vowel system. Various acoustic parameters distinguish the ATR and RTR pairs of vowels in Imilike. The results of our study on ATR and RTR vowels in Imilike are in line with the Phonological Potentials Model (Esling et al. 2019; Moisik et al. 2021), which suggests that the articulatory distinction between ATR and RTR vowels involves synergistic and anti-synergistic relations between lingual, pharyngeal, and laryngeal articulators.