1 Introduction
Modern Standard Arabic (MSA) has a vowel system with six pure vowels: long /aː iː uː/ and short /a i u/.¹ Historically, long and short vowels were believed to occupy the same positions at the extremes of the vowel triangle and to differ only in duration (Al-Ani 1970, Watson 2002). Local varieties of Arabic, however, reveal patterns that deviate from MSA, suggesting ongoing or completed sound changes. Egyptian Arabic, for example, demonstrates fronting and raising of the low vowels to the front open or open-mid position /æ ∼ ɛ/ (Watson 2002). Similar raising and fronting of short /a/ is observed in Iraqi Arabic (Fathi & Qassim 2020). Palestinian Arabic reveals lowering of short high vowels /i u/ toward a more central position in the vowel space (Saadah 2011). In Syrian Arabic, all short vowels demonstrate a more central location as compared to long vowels: high vowels /i u/ are lowered toward the close-mid position, and low /a/ is raised toward the center of the vowel space (Almbark 2012). In the dialects of the Gulf (e.g. Qatari Arabic), short high vowels /i/ and /u/ have lower and more central quality than their long counterparts, and short low /a/ has more front quality than long /aː/ (Johnstone 1967, Bukshaisha 1985).
The patterns that result from these sound changes are not uncommon across languages (Labov 1994). One common scenario includes development of qualitative distinctions between long and short vowels: (at least some) long vowels are tense if they are articulated with ‘considerable muscular effort’ of the gesture (Chomsky & Halle 1968: 324), often resulting in higher tongue position (Ladefoged & Maddieson 1990) compared to (at least some) short vowels, which are described as lax in such a system. The tense/lax distinction often triggers different patterns of change for the two subsystems of vowels. Languages with a tense/lax distinction often develop a pattern in which tense vowels are located at the periphery of the vowel space, occupying more extreme positions, and lax vowels are non-peripheral, or more centralized (Lindau 1978).
Until recently, research on the development of vowel systems has been limited to English and a few European languages (e.g. Labov 1994, Labov, Ash & Boberg 2005). The patterns reported for modern vernacular Arabic dialects, however, indicate that the Arabic vowel system may be developing new phonemic distinctions, e.g. peripheral/non-peripheral, in addition to the traditional long/short distinction. To the best of our knowledge, no previous study of Arabic vowel systems has addressed this issue. Including non-Indo-European languages in the scope of this research paradigm will contribute to our understanding of language.
The structure of the paper is as follows. First, we present background information about the typology of vowel systems in Arabic and other languages and about the acoustics of long and short vowels. Then we present the methodology and results of the experimental study, which determines the acoustical properties and patterns of variation in long and short vowels in Qatari Arabic. We conclude with a summary and discussion of our findings.
2 Typology of vowel systems
Although vowel systems with a phonological distinction between long and short vowels are common in languages (19.6%, according to Maddieson 1984), the vowel system with three vowel qualities and a long/short distinction (Figure 1A), commonly reported for Quranic Arabic (e.g. Al-Ani 1970, Newman & Verhoeven 2002, Watson 2002), is typologically rare. Maddieson (1984: 127–128) claims that only 5.4% of languages have a system with three vowel qualities, and that no language would have a phonological distinction in vowel quantity if the qualities of long and short vowels were identical. Instead, duration in such a system would be treated as a predictable metrical property superimposed on a segment.
Qualitative differences between long and short vowels in languages are typically described as differences in height and/or backness or as differences in tenseness and peripherality (e.g. in English, Labov 1994). A typical language with a phonological distinction in vowel quantity would have five to eight vowel qualities, but only some of them would be opposed in duration. The most typical patterns include long high vowels /iː/ and /uː/, close-mid or open-mid short vowels, and short and long low vowels /a aː/, as shown in Figure 1B. This pattern (with minor additions or modifications) is found in many languages, including Persian (Kambuziya, Ghorbanpour & Mahdipour 2017), Pahari (Khan 2014), and many varieties of modern English (Ladefoged & Maddieson 1990, Labov et al. 2005).
Getting from System A to System B primarily involves changes in short vowels. Previous research showed that vernacular Arabic dialects have systems that employ principles consistent with System B rather than System A. Al-Ani (1970), who used his own pronunciation of an educated variety of Iraqi Arabic, demonstrated a pattern that deviates from System A. In his pronunciation, short /a/ showed a tendency to be realized as a more front vowel than long /aː/, as shown in Figure 2C. A very recent study of Iraqi Arabic by Fathi & Qassim (2020) showed a further development of this pattern, in which short /a/ started to raise in addition to fronting.
Levantine varieties of Arabic demonstrate further development of the vowel system in Figure 2C. Saadah (2011), who investigated the vowel system of Palestinian Arabic, found that short /i/ and /u/ are lowered from the positions for /iː/ and /uː/ to the positions of [ɪ] and [ʊ]. A more advanced stage in the development of this vowel system has been found in Syrian Arabic (Almbark 2012) and Jordanian Arabic (Jongman et al. 2011, Kalaldeh 2018), with centralization of all short vowels. High vowels /i u/ are lowered and low /a/ is raised toward more central positions in the vowel space, as shown in Figure 2D.
The system of the Cairene dialect of Egypt (Figure 2E) demonstrates additional changes from the system in Figure 2D, in which both low vowels moved to the front position (Alghamdi 1998, Watson 2002). The extreme stage of centralization of short vowels is found in some Arabic dialects of Mesopotamia, in which short /i/ merged with short /u/ or with short /a/ (Watson 2002). Finally, Arabic dialects in the Gulf combine features of the Levantine and Iraqi dialects. Bukshaisha (1985) reported that the vowel system in Qatari Arabic has lowered short /i/ and /u/ and fronted short /a/, as shown in Figure 2F.
A case of extreme differentiation in quality between long and short vowels is found in Modern Persian (Rahbar 2008). Historically, short /i/ and /u/ lowered from high position and are realized as mid vowels /e/ and /o/ in Modern Persian, while low short /a/ was fronted to /æ/ (Miller 2012), yielding the system in Figure 3G. Although it may seem that a synchronic description of long /ɑː iː uː/ and short /æ e o/ does not require a distinctive feature [±long] (Rahbar 2008), quantitative differences are still present in the phonetic realization of Persian vowels, and the feature [long] is active in synchronic phonological alternations (Kambuziya et al. 2017).
Other languages, e.g. various dialects of Standard English (Labov 1994) and Pahari (Khan 2014), present a situation where qualitative differences between long and short vowels are kept within the same height range. In such systems, non-low long vowels tend to be slightly higher and occupy more extreme positions than short vowels, as shown in Figure 3H. The distinction between English long /iː uː/ and short /ɪ ʊ/ in this system is captured by the feature [±tense], with tense vowels being longer and articulated with more muscular tension (Chomsky & Halle 1968). Labov (1994) extends the use of the [±tense] feature to the nuclei of the English diphthongs. In English, both tense monophthongs and tense nuclei of diphthongs are located at the extremes of the vowel space, which Labov (1994: 177) defines as a peripheral track. Short, or lax, segments, in contrast, are centralized and occupy the non-peripheral track (marked as the area inside the dashed square in Figure 3H). It is of note that changes in peripheral and non-peripheral vowels tend to occur along the same track. For example, peripheral, or tense, vowels tend to raise along the peripheral track, but non-peripheral, lax, vowels tend to lower along the non-peripheral track.
The long/short, tense/lax, and peripheral/non-peripheral distinctions are not mutually exclusive. They show different degrees of abstractness and overlap in the phonological systems of languages. While the feature [±long] is derived from physical differences in duration, the features [±tense] and [±peripheral] are more abstract in nature. The relation between tenseness and peripherality is also not straightforward. Lindau (1978) argues that the tense/lax distinction in English or German is best described by a feature [±peripheral], capturing the observation that lax vowels are typically located between tense vowels in the middle of the vowel space. However, tense or lax vowels may not necessarily become peripheral/non-peripheral in a language. For example, [ATR] tense vowels in West African languages are not located at the periphery of the vowel space, and their higher tongue position is a result of the ATR (Advanced Tongue Root) gesture (Stewart 1967). Oral vowels in French are located at the extremes of the vowel space, but their nasal counterparts, which retain all other temporal and physical properties, are more central (Lennig 1978, cited from Labov 1994). According to Labov (1994: 285), ‘peripherality is defined relative to the vowel system as a whole’. Membership in the peripheral or non-peripheral class is based on the behavior of a vowel rather than on the vowel’s qualitative or quantitative characteristics. The long mid central vowel in English words bird or term is tense, but unlike other long tense English vowels, it is not peripheral. Finally, peripherality of a vowel in chain shifts can change without changes in duration.
Therefore, peripheral/non-peripheral status of long and short vowels in vernacular Arabic dialects can be established based on two types of evidence: (i) location of vowels in relation to the center point of the vowel space, and (ii) variation in vowel articulation within each type of long or short vowels that can indicate direction of change along a peripheral or non-peripheral track in the vowel space.
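The first type of evidence, location relative to the center point of the vowel space, can be illustrated with a minimal Python sketch. It assumes that the center point is the centroid of the six vowel means and that distance is Euclidean in the normalized F1/F2 plane; neither choice is specified in the text, and the numeric values below are hypothetical, for illustration only.

```python
import math

def centroid(points):
    """Return the mean (F1, F2) point of a collection of (F1, F2) pairs."""
    n = len(points)
    return (sum(p[0] for p in points) / n, sum(p[1] for p in points) / n)

def distance_from_center(vowel_means):
    """Map each vowel label to its Euclidean distance from the centroid.

    Assumption: the 'center point' is the centroid of all vowel means.
    """
    center = centroid(list(vowel_means.values()))
    return {v: math.dist(p, center) for v, p in vowel_means.items()}

# Hypothetical normalized (F1, F2) means, NOT values from the study.
example = {
    "i:": (-1.0, 1.5), "u:": (-1.0, -1.5), "a:": (1.5, -0.3),
    "i": (-0.4, 0.8), "u": (-0.4, -0.8), "a": (0.7, 0.3),
}
dists = distance_from_center(example)

# A vowel pair fits the peripheral/non-peripheral pattern if the long
# member lies farther from the center than its short counterpart.
for short in ("i", "u", "a"):
    print(short, dists[short + ":"] > dists[short])
```

Under this sketch, a larger distance for every long vowel than for its short counterpart would be consistent with a peripheral/non-peripheral split; it does not address the second type of evidence, the direction of variation along a track.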
3 Acoustics of long and short vowels
Instrumental studies of long and short vowels usually refer to three acoustical cues: duration of a vowel and frequencies of the first (F1) and second (F2) formants. In duration, long vowels in Arabic dialects are almost twice as long as short vowels. This ratio is quite stable across Arabic dialects, although the actual duration of vowels differs from study to study depending on the method of elicitation, speech tempo, and syllable structure of words. Al-Ani (1970) found that the ratio between the duration of short and long vowels in monosyllables was 0.50 in Iraqi Arabic. A similar ratio of 0.48 was reported for Iraqi Arabic in Fathi & Qassim (2020). Alghamdi (1998) reported 0.41–0.45 ratios for Egyptian and Saudi Arabic vowels. This ratio was found to be 0.42 in MSA produced by Jordanian speakers (Kalaldeh 2018), 0.42 in Palestinian Arabic (Saadah 2011), and 0.50 in Syrian Arabic (Almbark 2012). Bukshaisha (1985) reported that the ratio averaged 0.47 in Qatari Arabic.
Long and short vowels in other languages reveal a similar difference in duration. Khan (2014) reported that the duration ratio in Pahari was 0.45, which was quite similar to the ratios found in the Arabic dialects. Acoustical studies of English and other languages, however, show that this difference can be smaller. The short-to-long ratio in American English varies from 0.75 (Peterson & Lehiste 1960) to 0.79 (Hillenbrand et al. 1995), and a recent study of Persian reports a ratio of 0.82 (Mokari, Werner & Talebi 2017).
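As a small worked illustration, the short-to-long ratio discussed above can be computed as the mean duration of short vowels divided by the mean duration of long vowels. The sketch below assumes this ratio-of-means definition (individual studies may compute it differently), and the durations are hypothetical placeholders rather than measurements from any cited study.

```python
from statistics import mean

def short_to_long_ratio(short_ms, long_ms):
    """Ratio of mean short-vowel duration to mean long-vowel duration."""
    return mean(short_ms) / mean(long_ms)

# Hypothetical durations in milliseconds, for illustration only.
short_durations = [80, 90, 100]
long_durations = [190, 200, 210]

ratio = short_to_long_ratio(short_durations, long_durations)
print(round(ratio, 2))
```

A ratio near 0.5 means that long vowels are about twice as long as short ones; ratios closer to 1 indicate a smaller durational contrast, as reported for American English and Persian.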
Qualitative differences between vowels are represented as differences in F1 and F2 frequencies (Lindau 1978). As F1 is inversely related to vowel height and F2 is inversely related to vowel backness, studies that find qualitative differences between long and short vowels report higher F1/lower F2 values for front vowels, and higher F1/higher F2 values for back vowels when they move toward the center of the vowel space (e.g. Bukshaisha 1985, Alghamdi 1998, Saadah 2011, Almbark 2012, Kalaldeh 2018 for Arabic dialects; Hillenbrand et al. 1995, Clopper, Pisoni & de Jong 2005, Labov et al. 2005 for English; Aronow, McHugh & Molnar 2017, Mokari et al. 2017, Hemmatnia, Ghazanfari & Nourbakhsh 2019 for Persian, among many others).
Vowel quality is sensitive to phonetic context. Co-articulatory effects of consonants on an adjacent vowel can considerably affect vowel articulation (Flemming 2011). As a result, particular groups of consonants in English often provide important contexts that feed or facilitate historical or ongoing sound changes (Clopper et al. 2005, Labov et al. 2005). As alveolars tend to raise the second formant frequency of the following vowel (Hillenbrand, Clark & Nearey 2001), fronting of [uː] after alveolar segments, registered in many modern varieties of English, is in fact an expected result of co-articulation (Labov et al. 2005). Some of these co-articulatory effects may cause auditory enhancement in vowels (Diehl et al. 1991). For example, tensing and raising of [æ] in northern and eastern American dialects is enhanced by F1 lowering before a nasal. A chain shift ‘a → o → u’ in back vowels in the Southern American dialect occurs before [r]. Although some studies of vowel articulation use a ‘placeless’ glottal context in order to minimize the effect of flanking consonants and obtain ‘pure’ vowel qualities (e.g. Clopper et al. 2005, Kalaldeh 2018), adding contextual variation to the data (see e.g. Hillenbrand et al. 2001) can, in fact, be beneficial, as it can provide insights about directions of potential changes in the vowel system.
Several contextual factors have been reported to affect acoustic properties of vowels in Arabic. First, formant frequencies can be affected by stress, as a vowel tends to be more centralized in response to a decrease in the degree of stress (de Jong & Zawaydeh 2002). Another major factor that influences vowel quality is the coarticulatory effect of an adjacent consonant. So-called ‘emphatic’ (pharyngealized or uvularized) coronal consonants in Arabic are known to cause a considerable retraction of an adjacent vowel within a syllable or a word. This change, known as emphasis spread, is pervasive in all Arabic dialects and well documented (Al-Ani 1970, Ghazeli 1977, Bukshaisha 1985, Jongman et al. 2011, Zawaydeh & de Jong 2011, among many others).
The effects of other categories of consonants on the articulation of vowels in Arabic are less studied. Similar to emphatic consonants, uvulars produce a backing effect on the following vowel, but the degree of backing is much smaller. Back quality of a following vowel, manifested as F2 lowering in acoustics, extends no further than vowel midpoint (Yeou 1997, Zawaydeh & de Jong 2011, Al-Ansari & Kulikov 2023) and never spreads to the neighboring syllable. Pharyngeal consonants raise the frequency of F1 in an adjacent vowel (Klatt & Stevens 1969, McCarthy 1994, Yeou 1997, Shosted, Fu & Hermes 2018, among others). As a result, high and mid vowels are often lowered next to pharyngeals in Arabic dialects (McCarthy 1994). It is of note that the effect of pharyngeal fricatives /ħ, ʕ/ on a vowel is not identical to the effect of emphatic, or pharyngealized, coronals. While emphatics cause considerable backing of the vowel by significantly lowering its F2 and optionally raising F1, pharyngeals cause considerable lowering of the vowel by significantly raising F1 and optionally raising or lowering F2 (Klatt & Stevens 1969, Laufer & Baer 1988, Yeou 1997).² Finally, coronal consonants are known to cause some fronting in a following vowel (Yeou 1997), but the magnitude of this influence on a vowel in Arabic is not clear.
4 Goal and questions
The two-fold goal of the present paper is to fill the gaps in the acoustical description of Qatari Arabic vowels and to determine whether Arabic long and short vowels are developing the qualitative peripheral/non-peripheral distinction in addition to the quantitative long/short distinction. The following three questions are addressed in the current study:
- Do long and short vowels in Qatari Arabic (QA) differ in quality in addition to being different in quantity?
- Do long and short vowels occupy the peripheral and non-peripheral positions in the vowel space?
- What are the patterns of variation in production of vowels among speakers?
Previous research also raises questions about the effect of phonetic context on variation in vowel production. Unlike studies that investigate ‘pure’ vowel quality in CV syllables next to a glottal stop (e.g. Clopper et al. 2005), we use a variety of contexts to capture possible differences in vowel quality. Although both onset and coda consonants can be the source of contextual variation of a vowel, we use only onset flanking consonants as a controlled experimental factor. A preliminary look at the effect of a coda consonant on vowel formants in Arabic showed that it was not as strong as the effect of an onset consonant (Kulikov, Mohsenzadeh & Syam, published online 2 November 2021). Therefore, the use of a coda consonant is not controlled in this study; it is a random factor due to selection of lexical items for the study.
The acoustic analysis looks into duration and formant frequencies (F1 and F2) of monophthongs /aː iː uː a i u/ in QA. We predict that short vowels in this dialect form a subgroup of non-peripheral vowels. Short vowels /a i u/ are expected to be lowered and/or centralized, whereas their long counterparts are expected to be located at the periphery of the vowel space.
5 Method
5.1 Materials and procedure
Twenty-one native speakers of Qatari Arabic participated in the study. Their age ranged between 18 and 28 years (mean age = 22.4 years); seventeen were female and four were male. They did not report any speech or hearing impairments and were not aware of the purposes of the study. Like most Qataris, they learned English in high school and were conversant in this language.
The materials included words from the Qatari dialect (n = 120) with word-initial consonants that belonged to one of the four places of articulation: labial (n = 30), alveolar (n = 30), uvular (n = 30), and pharyngeal (n = 30).³ Labial context was used as a ‘control category’ because it does not involve a tongue gesture. The three other categories are characterized by distinct tongue movement in three directions. Alveolar articulation moves the tongue forward and upward, uvular articulation moves it backward and upward, and pharyngeal articulation moves it downward (e.g. Yeou 1997, Jongman et al. 2011).
Most words were single CVC syllables, in which the target vowel is invariably stressed. In a few cases (n = 6) when it was not possible to have a monosyllabic word, a disyllabic word was used. The target vowel in disyllabic words was in the initial stressed syllable. In order to minimize the effect of vowel-to-vowel coarticulation, the second vowel was a schwa or the same vowel as the target. Half of the words (n = 60) had short vowels /i/, /a/, /u/, and the other half had long vowels /iː/, /aː/, /uː/. Each vowel type (‘i’, ‘a’, ‘u’) was evenly distributed (n = 20) within the subsets of short and long vowels (Table 1).
The stimuli were presented to the participants in standard Arabic orthography. The participants were asked to pronounce (read) target words in their native dialect as a list. They were instructed to read the words at a comfortable tempo using a falling intonation, and to make a distinct pause after each word.⁴ The participants read the list twice to ensure habituation with the tokens, but only the second trial was used in the analysis. A total of 2,520 items (21 speakers × 120 words) were submitted to an acoustic analysis.
5.2 Acoustic analysis
Readings were digitally recorded into .wav sound files and downsampled to 22,050 Hz for analysis. The recordings were manually marked in PRAAT (Boersma & Weenink 2021) using waveforms and wideband spectrograms (Figure 4). Vowel onset was defined as emergence of F1; vowel offset was defined as cessation or disappearance of F2 from the spectrogram. Formant frequencies (F1, F2) of the target vowels were taken from LPC spectra obtained with a 25 ms Hamming window at vowel midpoint.
To minimize speaker-specific variation due to physiological or anatomical differences of the speaker’s vocal tract (Traunmüller 1988, Maurer et al. 1991), raw F1 and F2 values were normalized prior to the statistical analysis. We tried two speaker-intrinsic vowel-extrinsic methods of normalization frequently used in acoustical and sociophonetic studies: log-transform (Morrison & Nearey 2006) and z-transform (Lobanov 1971). The log-transform method is a procedure that subtracts the mean of a log-transformed formant frequency for the speaker from a log-transformed formant value, as shown in the following:
$$F_i^{\log} = \log(F_i) - \mu_{\log F}$$
The z-transform method normalizes formant values by calculating z-scores for each formant value for the speaker, as shown in the following:
$$F_i^{z} = \frac{F_i - \mu}{\sigma}$$
In the formulas, $F_i$ is a raw formant value (F1 or F2) in Hz, $\mu$ is the speaker’s mean value for the formant ($\mu_{\log F}$ being the mean of the log-transformed values), and $\sigma$ is the speaker’s standard deviation of the same formant. Each formant was normalized separately.
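The two normalization procedures described above can be sketched in a few lines of Python. This is an illustrative sketch, not the authors' script: it assumes the natural logarithm for the log-transform and the sample standard deviation for the z-transform (the text specifies neither), and the function names and formant values are hypothetical.

```python
import math
from statistics import mean, stdev

def log_mean_normalize(formant_hz):
    """Subtract the speaker's mean log-formant from each log-formant value.

    Assumption: natural logarithm; the base is not stated in the text.
    """
    logs = [math.log(f) for f in formant_hz]
    mu = mean(logs)
    return [x - mu for x in logs]

def z_normalize(formant_hz):
    """Lobanov-style z-scores: (F_i - mean) / standard deviation.

    Assumption: sample standard deviation (n - 1 in the denominator).
    """
    mu = mean(formant_hz)
    sigma = stdev(formant_hz)
    return [(f - mu) / sigma for f in formant_hz]

# Hypothetical F1 values (Hz) for one speaker; each formant is
# normalized separately, one speaker at a time.
f1_values = [300.0, 500.0, 700.0]
f1_z = z_normalize(f1_values)
f1_log = log_mean_normalize(f1_values)
print(f1_z)
```

Both functions are applied per speaker and per formant, so that the normalized values express each token relative to that speaker's own formant distribution.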
The model with z-scores achieved a better fit (i.e., had a smaller Log-likelihood value) in the vowel identification task, Log-likelihood = 1442.8, χ2(10) = 7587.6, p < .0001, than the log-transform model, Log-likelihood = 1573.0, χ2(10) = 7451.9, p < .0001, and was used for the analysis of formant frequencies.
5.3 Data analysis
The acoustic data were submitted to several linear mixed-effects models using the lmer function of the lme4 package (Bates et al. 2015) in R (R Core Team 2021). Each acoustic cue was used as a dependent variable in a separate mixed-effects model. Fixed effects in the model are independent variables whose effect is investigated (e.g. vowel type or phonetic context). Some of the fixed effects in this study (e.g. vowel type and vowel quantity) were within-subject, as each speaker produced all categories of vowels. Some fixed effects (e.g. gender) were between-subject, as each speaker belonged to one group, that is, was either male or female. When a fixed effect had more than two levels, it was first evaluated using a likelihood-ratio (chi-square) test by comparing the model fit with and without the factor. The three vowel categories were coded by simple codes, with low vowel /a/ as a reference category. The four phonetic contexts were also coded by simple codes, with labial place of articulation as a reference category. Additional comparisons between levels were performed using the emmeans package (Lenth 2020).
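The ‘simple codes’ mentioned above can be illustrated with a short sketch. It assumes the common definition of simple coding, in which each non-reference level receives (k − 1)/k in its own contrast column and −1/k elsewhere, so that each coefficient compares a level with the reference level while the intercept corresponds to the grand mean. The function name is hypothetical, and this is not the authors' R code.

```python
from fractions import Fraction

def simple_codes(levels, reference):
    """Return a simple-coding contrast row for each factor level.

    One column per non-reference level: (k - 1)/k for that level's own
    column, -1/k otherwise. The reference level is -1/k in every column.
    """
    k = len(levels)
    others = [lv for lv in levels if lv != reference]
    return {
        lv: [Fraction(k - 1, k) if lv == o else Fraction(-1, k) for o in others]
        for lv in levels
    }

# Vowel type with /a/ as the reference category, as in the text.
vowel_codes = simple_codes(["a", "i", "u"], reference="a")
# Phonetic context with labial as the reference category.
context_codes = simple_codes(
    ["labial", "alveolar", "uvular", "pharyngeal"], reference="labial"
)
for level, row in vowel_codes.items():
    print(level, [str(x) for x in row])
```

Each column sums to zero across levels, and the difference between a level and the reference within that level's column is exactly 1, which is what allows a coefficient to be read as a difference from the reference category.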
Random effects in the model are sources of variance due to random selection of a subset of the population (e.g. speakers or items). Following Barr et al. (2013), we started selecting the optimal model with the most saturated one, which included both random intercepts and random slopes. A random intercept captures mean differences between speakers or items; a random slope captures additional variation in a fixed effect in relation to a given random effect. For example, the effect of vowel type may vary from one speaker to another due to individual differences. Similarly, the effect of phonetic context may vary from one word to another due to the influence of non-target adjacent consonants in a syllable. When adding some effects did not improve the model’s performance, the simpler model was selected for the benefit of better convergence (Matuschek et al. 2017). The p-values for factor levels were calculated using the lmerTest package (Kuznetsova, Brockhoff & Christensen 2017).
6 Results
The boxplots for durations and raw formant frequencies are presented in Figure 5. Table 2 presents means and standard deviations for the cues. Observations of vowel durations revealed that speakers produced the two types of vowels as intended. Short vowels were less than half as long as their long counterparts (short-to-long ratio = 0.44). The analysis of vowel quality included several stages. First, we analyzed formant frequencies of the target vowels for the effects of vowel type, vowel quantity, and phonetic context. Then we configured the vowel space and analyzed variation in long and short vowels. We present the results of the analysis in the following sections.
6.1 Formant frequencies
For the analysis of formant frequencies, normalized F1 and F2 values were evaluated for fixed effects of vowel type (a, i, u), vowel quantity (long, short), and phonetic context (bilabial, alveolar, uvular, pharyngeal). Although gender significantly affected raw formant frequencies, F1: χ2(1) = 15.57, p < .0001, F2: χ2(1) = 19.84, p < .0001, it did not affect normalized frequencies or improve the model fit, F1: χ2(1) < 1, F2: χ2(1) = 1.57, p = .210. Therefore, it was excluded from the final model. Random effects included speaker and word as random intercepts, vowel type and vowel quantity as random slopes for speaker, and phonetic context as a random slope for word. The summary of the model is given in Table 3; fixed effects are plotted in Figures 6 and 7.
Note for Table 3: * = p < .05; ** = p < .01; *** = p < .001
For F1, the effect of vowel type was significant, χ2(23) = 485.5, p < .0001. As expected, F1 values were lower for high vowels /uː/ (β = –2.08, SE = .05, p < .0001) and /iː/ (β = –2.21, SE = .06, p < .0001) compared to low vowel /aː/. The effect of vowel quantity was also significant, χ2(16) = 369.2, p < .0001. Positive estimate coefficient for vowel quantity indicated that F1 was higher in short vowels, and positive coefficients for vowel quantity-by-type interactions indicated that the distances between short /a/ and /u/ and /i/ were smaller, suggesting the short vowels were produced closer to the center of the vowel space as compared to long vowels.
A significant effect of phonetic context, χ2 (18) = 138.7, p < .0001, indicated that F1 changed in response to the place of articulation of the preceding consonant. F1 was higher after uvular and pharyngeal consonants (p < .001) compared to bilabial and alveolar contexts. Positive vowel quantity-by-context interaction coefficients indicated that the effect of uvular and pharyngeal contexts was stronger in short vowels.
For F2, the effect of vowel type was significant, χ2(23) = 851.9, p < .0001. As expected, F2 values were lower for back /uː/ (β = –0.55, SE = .04, p < .0001) and higher for front /iː/ (β = 2.40, SE = .05, p < .0001) compared to /aː/. The effect of vowel quantity was also significant, χ2(16) = 473.8, p < .0001. Positive estimate coefficient for vowel quantity indicated that F2 was higher in short /a/, and negative coefficients for vowel quantity-by-type interactions indicated that the differences between short /a/ and /u/ and /i/ were smaller, suggesting the short vowels were produced closer to each other as compared to long vowels.
A significant effect of phonetic context, χ2 (18) = 104.2, p < .0001, indicated that F2 also changed as a function of the place of articulation of the preceding consonant. F2 was higher after alveolar consonants, but lower after uvular consonants. No significant effect was found for pharyngeal context. The negative vowel quantity-by-context coefficient indicated that the effect of uvular context was stronger in short vowels.
6.2 Vowel space
We plotted normalized formant frequencies to configure the vowel space for the Qatari dialect of Arabic. Figure 8 represents mean normalized F1 and F2 values and standard deviations (ellipses) for each of the six vowels in the vowel space. The results suggest that long and short vowels form two distinct subsystems in the vowel space.
Long vowels are articulated in the extreme positions and can be characterized as follows:
/iː/ – high front long vowel, suggested IPA symbol: [iː]
/uː/ – high back long vowel, suggested IPA symbol: [uː]
/aː/ – low back long vowel, suggested IPA symbol: [ɑː]
Short vowels are articulated closer to the center of the vowel space and can be characterized as follows:
/i/ – close/close-mid front short vowel, suggested IPA symbol: [ɪ]
/u/ – close/close-mid back short vowel, suggested IPA symbol: [ʊ]
/a/ – low front short vowel, suggested IPA symbol: [æ]
6.3 Vowel patterns
Finally, we looked into articulation patterns within a vowel space. Observations of individual scatterplots of the vowel space and subsequent hierarchical cluster analysis of F1 and F2 values in IBM SPSS Statistics for Windows, Version 27, using the Ward method revealed two patterns of articulation in long vowels and three patterns of articulation in short vowels. The summary for each cluster is shown in Table 4.
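To make the clustering step more concrete, the sketch below implements a naive agglomerative clustering with Ward's criterion in plain Python. It is not the SPSS procedure used in the study: the function names are hypothetical, the input points are invented speaker-level (F1, F2) values for illustration only, and the merge cost used is the standard Ward increase in within-cluster sum of squares.

```python
def ward_cost(a, b):
    """Increase in within-cluster sum of squares if clusters a and b merge."""
    na, nb = len(a), len(b)
    ca = [sum(col) / na for col in zip(*a)]  # centroid of cluster a
    cb = [sum(col) / nb for col in zip(*b)]  # centroid of cluster b
    sq_dist = sum((x - y) ** 2 for x, y in zip(ca, cb))
    return na * nb / (na + nb) * sq_dist

def ward_clusters(points, n_clusters):
    """Merge the cheapest pair of clusters until n_clusters remain."""
    clusters = [[tuple(p)] for p in points]
    while len(clusters) > n_clusters:
        pairs = (
            (i, j)
            for i in range(len(clusters))
            for j in range(i + 1, len(clusters))
        )
        i, j = min(pairs, key=lambda ij: ward_cost(clusters[ij[0]], clusters[ij[1]]))
        clusters[i] = clusters[i] + clusters[j]
        del clusters[j]  # j > i, so index i is unaffected
    return clusters

# Hypothetical normalized (F1, F2) means of one vowel for five speakers.
speakers = [(0.9, -0.2), (1.0, -0.1), (0.95, -0.15), (0.3, 0.1), (0.35, 0.05)]
groups = ward_clusters(speakers, n_clusters=2)
print(sorted(len(g) for g in groups))
```

In this toy example the three speakers with higher normalized F1 end up in one cluster and the two with lower F1 in the other, which mirrors the kind of split between a lower and a raised articulation described for long /aː/.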
Articulation of long vowels in the two groups differed by the degree of raising of long /aː/. Speakers in Cluster A (n = 14) articulated the low back vowel in a relatively low position (M F1 = 722 Hz), as shown in Figures 9A and 9B. Speakers in Cluster B (n = 7), in contrast, demonstrated raising of the low back vowel indicated as lower F1 (M F1 = 638 Hz), as shown in Figure 9C.
Articulation of short vowels revealed three patterns. Cluster 1 (n = 7) is characterized by clear separation of all short vowels: short /i/ and /u/ are relatively high and do not overlap; short /a/ is articulated in a low front position (Figure 9A). Cluster 3 (n = 6) has the most central quality of short vowels in the vowel space, which results in a considerable overlap between them, as shown in Figure 9B. Cluster 2 (n = 8) can be described as a transitional stage between Clusters 1 and 3. It differs from Cluster 1 in that it has more central quality of high vowels, and from Cluster 3 in that it has a greater degree of fronting/lowering of /a/ (Figure 9C).
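The logic of Ward's method can be sketched in a few lines of Python. This is not the SPSS procedure used in the study: the function `ward_cluster` is a hypothetical, naive implementation, and the per-speaker (F1, F2) means for long /aː/ are invented for illustration only. At each step the sketch merges the two clusters whose union produces the smallest increase in the within-cluster sum of squares.

```python
from itertools import combinations

def ward_cluster(points, k):
    """Agglomerate points into k clusters using Ward's criterion."""
    clusters = [[i] for i in range(len(points))]
    dims = len(points[0])

    def centroid(members):
        n = len(members)
        return [sum(points[i][d] for i in members) / n for d in range(dims)]

    def merge_cost(a, b):
        # Increase in within-cluster sum of squares if a and b are merged.
        ca, cb = centroid(a), centroid(b)
        dist2 = sum((x - y) ** 2 for x, y in zip(ca, cb))
        return len(a) * len(b) / (len(a) + len(b)) * dist2

    while len(clusters) > k:
        a, b = min(combinations(clusters, 2), key=lambda pair: merge_cost(*pair))
        clusters.remove(a)
        clusters.remove(b)
        clusters.append(a + b)
    return sorted(sorted(c) for c in clusters)

# Hypothetical speaker means (F1, F2) in Hz for long /aː/, NOT data from the study.
speakers = [(725, 1210), (718, 1190), (730, 1225),
            (640, 1180), (632, 1205), (645, 1195)]

groups = ward_cluster(speakers, 2)
print(groups)
```

With these invented values, the speakers with a lower /aː/ (higher F1) and those with a raised /aː/ (lower F1) fall into separate groups, which mirrors the kind of split described for Clusters A and B.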
7 Discussion and conclusion
The goal of the present study was to investigate qualitative differences between long and short vowels in Qatari Arabic. As we showed above, vowel systems across languages tend to enhance differences in duration with qualitative differences in height and/or backness (Maddieson Reference Maddieson1984). The acoustic analysis focused on vowel duration and F1 and F2 at vowel midpoint. Our findings were consistent with the findings in Bukshaisha (Reference Bukshaisha1985) for Qatari Arabic and other eastern dialects of Arabic (e.g. Fathi & Qassim Reference Fathi and Qassim2020 for Iraqi Arabic, Saadah Reference Saadah2011 for Palestinian Arabic, Almbark Reference Almbark2012 for Syrian Arabic, and Kalaldeh Reference Kalaldeh2018 for Jordanian Arabic). The results of the current study indicated that short vowels /a i u/ differ in quality from their long counterparts. The location of the vowels suggests that the vowel system in Qatari Arabic can be described in terms of peripheral–non-peripheral distinctions (Labov Reference Labov1994).
Long vowels are located at the extremes of the vowel triangle and can be described as peripheral vowels. Short vowels, in contrast, are located closer to the center of the vowel space and can be described as non-peripheral vowels (Figure 10). Short low vowel /a/ is articulated as a front vowel [æ]. Short high vowels /i/ and /u/ are articulated as close or close-mid vowels [ɪ] and [ʊ] respectively. Given that the means of the high short vowels /i/ and /u/ are located above the central point of the vowel space, we are using the notation [ɪ] and [ʊ] to indicate that these vowels are counterparts of long vowels /iː/ and /uː/.
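One simple way to quantify this contrast is to measure each vowel's Euclidean distance from the center of the vowel space. The sketch below is illustrative only: the normalized (F1, F2) values are hypothetical and are not the means plotted in the figures, and distance from the centroid is an assumed operationalization of peripherality rather than the measure used in the study.

```python
import math
from statistics import mean

# Hypothetical normalized (F1, F2) means, NOT values from the study.
LONG = {"iː": (-1.2, 1.6), "uː": (-1.1, -1.3), "aː": (1.5, -0.4)}
SHORT = {"i": (-0.3, 0.7), "u": (-0.2, -0.8), "a": (1.0, 0.3)}

def centroid(points):
    """Mean F1 and mean F2 over a collection of (F1, F2) pairs."""
    pts = list(points)
    return tuple(mean(p[d] for p in pts) for d in range(2))

def mean_distance(vowels, center):
    """Average Euclidean distance of a vowel subsystem from the center."""
    return mean(math.dist(v, center) for v in vowels.values())

center = centroid(list(LONG.values()) + list(SHORT.values()))
long_dist = mean_distance(LONG, center)
short_dist = mean_distance(SHORT, center)

# A peripheral subsystem should lie farther from the center on average.
print(long_dist > short_dist)
```

Under this operationalization, a larger mean distance for the long vowels than for the short vowels would correspond to the peripheral versus non-peripheral description given above.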
Differences in location of long and short vowels in Qatari Arabic can be interpreted in terms of peripherality: long vowels occupy a peripheral track, and short vowels occupy a non-peripheral track. The peripheral status of long vowels is supported by the characteristic movement pattern:
- Peripheral vowels are likely to rise along the peripheral track.
(Labov Reference Labov1994: 116)
Our findings reveal that long low /aː/ demonstrates evidence for backing and raising in speakers within Cluster B.
The non-peripheral status of short vowels in Qatari Arabic is also supported by two characteristic chain movement patterns:
- High non-peripheral vowels are likely to move downward along the non-peripheral track, but low peripheral vowels are likely to rise along the peripheral track.
- Back non-peripheral vowels are likely to be fronted.
(Labov Reference Labov1994: 116)
Our results show evidence of both patterns. Synchronic variation of the vowel system in Qatari Arabic indicates (i) lowering of short high vowels from the high position toward the close-mid position, and (ii) fronting of short low /a/ from the central position.
It is of note that each of these movements seems to be an independent process. We found three patterns of variation in short vowels in our data. Speakers in Cluster 1 revealed the pattern with separation of short vowels, in which short /i/ and /u/ do not overlap, but short /a/ is fronted. Speakers in Cluster 2 showed the pattern in which short /a/ is also fronted, but short /i/ and /u/ are close to each other. These patterns are not unique to QA, as they are found in many eastern Arabic dialects of the Levant, Jordan, Iraq, and Saudi Arabia (Johnstone Reference Johnstone1967, Alghamdi Reference Alghamdi1998, Jongman et al. Reference Jongman, Wendy Herd, Sereno and Combest2011, Saadah Reference Saadah2011, Almbark Reference Almbark2012, Kalaldeh Reference Kalaldeh2018, Fathi & Qassim Reference Fathi and Qassim2020). Finally, speakers in Cluster 3 revealed the pattern with maximal centralization of all short vowels. This pattern is similar to the pattern found in vernacular Bedouin Arabic dialects of Mesopotamia (Watson Reference Watson2002), in which short /a/ or short /i/ merged with short /u/.
It is intriguing that the vowel pattern in Cluster 1 with maximal separation of /i u/ and fronting of /a/ is very similar to the pattern of the Modern Persian vowel system (Figure 3G), in which the non-low short vowels are lowered to mid vowels /e/ and /o/ and the low vowels have a [±back] contrast. Acoustic studies of Persian vowels (Aronow et al. Reference Aronow, McHugh and Molnar2017, Mokari et al. Reference Mokari, Werner and Talebi2017, Hemmatnia et al. Reference Hemmatnia, Ghazanfari and Nourbakhsh2019) report mean F1 values of Persian short vowels (/æ/: M = 800 Hz; /e/: M = 500 Hz; /o/: M = 550 Hz) that are very close to mean F1 values of short vowels found in the current study (/a/: M = 698 Hz; /i/: M = 535 Hz; /u/: M = 550 Hz). Diagrams of vowel spaces for individual speakers of Persian (Mokari et al. Reference Mokari, Werner and Talebi2017: 13) look very similar to diagrams for speakers of Qatari Arabic in the current study. Therefore, mid quality of short /i, u/ and front quality of short /a/ in Qatari Arabic is very plausible from the typological perspective. Persian short vowels /æ e o/ and Qatari Arabic short vowels /a i u/ have very similar quality, and the difference in their representation is merely a matter of tradition.
It remains an open question whether the peripheral/non-peripheral distinction observed in the data coincides with the tense/lax distinction, replicating the pattern reported for languages like English. Although the quality of short vowels in Arabic is similar to the quality of lax vowels in English or Persian, Arabic short vowels differ from lax vowels in other languages in that they exhibit shorter duration. The 0.44 short-to-long ratio found in the current study is consistent with the 0.41–0.50 ratios previously found in the studies of Arabic dialects (Al-Ani Reference Al-Ani1970, Bukshaisha Reference Bukshaisha1985, Alghamdi Reference Alghamdi1998, Saadah Reference Saadah2011, Almbark Reference Almbark2012, Kalaldeh Reference Kalaldeh2018, Fathi & Qassim Reference Fathi and Qassim2020). Other languages with a tense/lax distinction in vowels, e.g. English or Persian, reveal smaller durational differences between the two vowel subsets, with ratios between 0.72 and 0.82 (Peterson & Lehiste Reference Peterson and Lehiste1960, Hillenbrand et al. Reference Hillenbrand, Getty, Clark and Wheeler1995 for English; Aronow et al. Reference Aronow, McHugh and Molnar2017, Mokari et al. Reference Mokari, Werner and Talebi2017, Hemmatnia et al. Reference Hemmatnia, Ghazanfari and Nourbakhsh2019 for Persian). The differences in vowel duration might explain why vowel duration is a more salient feature in Arabic, but vowel quality is more salient in English or Persian.
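The short-to-long ratio is simply the mean duration of short vowels divided by the mean duration of long vowels. The sketch below illustrates the arithmetic and the comparison with the two ranges cited above. The durations of 70 ms and 160 ms are hypothetical placeholders chosen to yield a ratio near 0.44; they are not the durations measured in the study, and the function names are likewise assumptions.

```python
# Ratio ranges as cited in the text.
ARABIC_RANGE = (0.41, 0.50)     # previous studies of Arabic dialects
TENSE_LAX_RANGE = (0.72, 0.82)  # English and Persian

def short_to_long_ratio(short_ms, long_ms):
    """Mean short-vowel duration divided by mean long-vowel duration."""
    if long_ms <= 0:
        raise ValueError("long-vowel duration must be positive")
    return short_ms / long_ms

def within(value, bounds):
    """True if value lies inside the closed interval given by bounds."""
    low, high = bounds
    return low <= value <= high

# Hypothetical mean durations in ms, NOT measurements from the study.
ratio = short_to_long_ratio(70.0, 160.0)

print(round(ratio, 2), within(ratio, ARABIC_RANGE), within(ratio, TENSE_LAX_RANGE))
```

A ratio in the Arabic range means that a short vowel lasts less than half as long as its long counterpart, whereas in the tense/lax languages cited the shorter vowel retains roughly three quarters or more of the longer vowel's duration.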
Finally, we looked into contextual variation in vowel articulation. Our findings revealed that short vowels are more prone to influence of the preceding consonant than long vowels. The effect of a consonant was very weak at the midpoint of long vowels, but it was very noticeable in short vowels. It is likely that the strong effect of context in short vowels was a direct consequence of the very short duration of these segments – a tendency also observed in schwa (Flemming Reference Flemming and Minkova2009). Due to the very short duration of schwa in English (around 60 ms), the tongue gesture simply does not have enough time to reach the target, which causes great contextual variation observed as patterns of consonant-to-vowel and vowel-to-vowel co-articulation. In the current study, we found that F1 in short vowels was considerably higher next to pharyngeal consonants, F2 was lower next to uvular consonants, and F2 was higher next to alveolar consonants. As a result, vowel retraction next to uvulars and vowel lowering next to pharyngeals were the two common modifications in short vowels. Pharyngeal co-articulation had the strongest effect on short vowels in all speakers, with the difference in F1 reaching 100 Hz. Previous studies described this difference in quality of short high vowels as allophonic variation between phones [i u] in non-pharyngeal context and phones [e o] in pharyngeal context (Bukshaisha Reference Bukshaisha1985, Almbark Reference Almbark2012).
Our findings provide some insights into the process of short vowel lowering in Arabic. The pharyngeal context, which is quite common in Arabic (Newman & Verhoeven Reference Newman and Verhoeven2002), may be considered a driving force of the historical sound change in short vowels. The producer-interpreter mismatch mechanism in Labov (Reference Labov1994: 586–587) explains individual deviations of frequency in some vowel tokens as a starting point in the process of changing category boundaries for vowels. When a single vowel token falls into the range of another vowel category, speakers may want to shift the category boundary in order to avoid misunderstanding. Lowering of the short vowels may be a push-chain shift in response to lowering in long vowels produced in the pharyngeal context. It triggered the process of recalculation of category boundaries in short vowels and their subsequent movement away from the high long vowels.
Due to the absence of evidence of historical changes in Qatari Arabic (to the best of our knowledge), we could not compare the patterns that we found to the vowel system that existed a few generations earlier. We see this as a limitation of our study. However, since every generation of speakers has examples of both more conservative and more innovative pronunciation (Labov Reference Labov1994), we believe that even synchronic evidence for between-speaker variation can reveal patterns of possible movements (Labov et al. Reference Labov, Ash and Boberg2005).
To conclude, the results of the current study suggest that the vernacular Arabic dialect of Qatar has a vowel system in which long and short vowels demonstrate considerable differences in vowel quality in addition to differences in duration. The two subsystems of vowels can be described in terms of a peripheral/non-peripheral distinction. The quality of Arabic short vowels /a i u/ may be represented by IPA symbols [æ ɪ ʊ]. Contextual variation, e.g. pervasive lowering of short vowels next to pharyngeal consonants, may account for the results of historical development of the vowel system. Further studies should focus on the coarticulatory effects of uvular and pharyngeal consonants as compared to the effect of emphasis spread. It is also of interest whether long, peripheral Arabic vowels show evidence of tensing. An ultrasound study of Qatari Arabic vowels might clarify the precise location of the tongue in vowel articulation and the details of the diphthongization process in long vowels.
Acknowledgements
The authors are grateful to the participants of this study, without whom this paper would not be possible. We would also like to thank the Associate Editor Dr. Oliver Niebuhr and three anonymous reviewers for their valuable comments and insights, which greatly improved the quality of the manuscript. All remaining errors are our own.