This article has two goals: descriptive and theoretical. On the descriptive side, the article presents a grammar of gliding and epenthesis of Upper Sorbian, a language for which literature in the generative framework is virtually nonexistent.Footnote 2 The descriptive goal is worthy because Upper Sorbian is a Paradebeispiel of a complex but regular and productive system of gliding and epenthesis. Upper Sorbian stands out from a typological point of view because it has ten [sic] different strategies to satisfy Onset, a constraint that enforces the derivation of CV syllables. What we see is a complex example of an Onset-driven conspiracy. Upper Sorbian is also interesting because its syllable structure is governed by different principles in the domain of the root and in the domain of the word. On the theoretical side, the question is whether Optimality Theory (henceforth OT; Prince & Smolensky [1993] 2004; McCarthy & Prince 1995), which was designed to solve conspiracies, can deal with the complexities of Upper Sorbian. The answer is that it cannot unless it is modified to admit derivational levels. That is, the analysis argues for a derivational model of phonology and against parallel computation. A point of interest is that level 1 in Upper Sorbian must be defined as the bare root level, not as the expected stem level that includes roots and affixes. Further, it is demonstrated that Itô & Mester’s (1999) Crisp Edge constraint makes wrong predictions for Upper Sorbian and *Multi, a new constraint, needs to be postulated. Also, the analysis bears on the issue of positional markedness versus positional faithfulness and the question of whether, counter to McCarthy (1999), Duke of York derivations should be admitted in phonology.
Upper Sorbian, a minority Slavic language spoken in the eastern part of Germany near the border with Poland, exhibits a complex pattern of disparate processes that are united in their goal to satisfy Onset. The goal is achieved in several different ways. First, a vowel may turn into a glide, thereby providing an onset to the syllable that would otherwise be onsetless, ˈjara Footnote 3 ‘very’, //iara// → [ja.ra].Footnote 4 Second, a vowel may glide into the coda, raj ‘paradise’, //rai// → [raj], where gliding precludes the occurrence of *[ra.i] with an onsetless second syllable. Third, post-vocalically, //u// turns into a labial approximant [ʋ], ˈsawna ‘sauna’, //sauna// → [saʋ.na]. Fourth, an onset can be derived by j-insertion: diaˈlekt ‘dialect’, //dialɛkt// → [di.ja.lɛkt]. The preference for insertion over gliding in //dialɛkt// is driven by the constraint against complex onsets, banning the candidate *[dja.lɛkt]. But, fifth, this generalization is contradicted by the occurrence of complex onsets in morphologically derived words such as ˈRomjan ‘inhabitant of Rome’, //rom+ian// → [ro.mjan]. Sixth, in some contexts, Upper Sorbian inserts [ʋ] rather than [j], ˈkanu+wa ‘canoe’ gen.sg., //kanu+a// → [ka.nu.ʋa]. Seventh, j-insertion may be spawned not only by high vowels but also by mid vowels, ˈstereo ‘stereo’, //stɛrɛɔ// → [stɛrɛjɔ]. Eighth, mid vowels may also induce the insertion of the approximant, ˈSamoa ‘Samoa’, //samo+a// → [samoʋa]. Ninth, an onset can be provided by initial ʔ-insertion, ˈabo ‘but’, //abɔ// → [ʔabɔ] (word-initial ʔ-insertion), and, tenth, glottal stop insertion can repair stressed syllables that would otherwise be onsetless (stressed syllable ʔ-insertion), kokaˈin ‘cocaine’, //kɔkain// → [kɔkaʔin]. 
It would appear that this plethora of ten different processes eliminating onsetless syllables would leave no trace of hiatus in Upper Sorbian phonology, but this is not true: there are words that admit onsetless syllables, as in ˈdual ‘dual’, //dual// → [du.al], geoˈgraf ‘geographer’, //gɛɔgraf// → [gɛ.ɔ.graf] and arˈchai+sk+i ‘archaic’, whose root is //arxai// → [ar.xa.i].
The processes leading to Onset satisfaction have not been described in the literature on Upper Sorbian before, so this article is the first piece of work stating the relevant generalizations. The data come primarily from Prawopisny słownik hornjoserbskeje rěče (Völkel & Meškank 2005), a large dictionary that I have studied in great detail, and from the materials of the Sorbian Language Commission concerning an ongoing debate with regard to the orthographic reforms and the accommodation of borrowings (Maćijowa 2007).
These sources have been complemented by interviews with native speakers that were conducted by me personally during the Sorbian Language and Culture Summer Schools in Bautzen in 2008 and 2014. Summer schools are an excellent opportunity to conduct interviews because all the instructors, lecturers, and the administrative staff are native speakers of Sorbian. During the fieldwork, informants were asked to read word lists. I transcribed what I heard and I made informal recordings so that I could return to the data when needed. The speakers were Sorbian students and instructors in the age bracket 22–50. All of them were born and raised in Lužica ‘Lausitz’, which is the region of Germany where Upper Sorbian still survives as a minority language.
While transcribing and judging the data may be difficult, my task was facilitated in three ways.
First, it helps if the transcriber has the relevant types of data in their native language. This is the situation here: my language has both glides derived from vowels and glottal stops, though the latter are obligatory only under emphasis. For example, I can tell without difficulty if the Upper Sorbian word ˈklient ‘customer’ is pronounced with [i.jɛ], [i.ɛ] or [i.ʔɛ] because my native language has exactly the same word and it is known that the word is pronounced with [i.jɛ], so any other pronunciation would strike me as different from that in my native language.
Second, it also helps if the orthography of a language is closely, or relatively closely, phonetic. Compare the spelling and pronunciation of the same words in Polish and Upper Sorbian.
Polish uses the letter i for both [i] and [j] while Upper Sorbian makes the distinction in the spelling.
Third, Upper Sorbian exhibits alternations in which the contrast zero–glide is reflected in the spelling.
All the data used in my research were checked with two professional linguists who are native speakers of Upper Sorbian. Before submission, one of them read an earlier version of this paper to make sure that there are no errors in the data.Footnote 6
This article is organized as follows. Section 1 reviews the basic data. Section 2 provides a preliminary analysis. Section 3 adduces arguments for level distinction. Section 4 studies Upper Sorbian gliding and insertion from the point of view of Stratal/Derivational OT. Section 5 concludes with a summary of the results.
1. Data
The goal of this section is to present the data and state the descriptive generalizations. The inventory of surface contrastive segments (phonemes) has been studied in several traditional grammars, including Michalk (1955), Wowčerk (1955), Schuster-Šewc (1968), Stone (1993), and Schaarschmidt (2002). The most recent study is due to Jocz (2011).
Important for this paper is the vowel system, which I cite from Jocz (2011).
The vowel [i] and the glide [j] are in complementary distribution, so I will assume the tenet of autosegmental phonology that [j] is represented as [i] at the melodic tier and is different from the vowel [i] in that it is mora-less and occurs in syllable onsets or codas (Clements & Keyser 1983; Levin 1985; Hayes 1989; and others). To clarify further: underlying representations contain //i// rather than //j// because [j] is predictable and can be derived from //i// by the gliding rules in (4). These rules are perfectly general statements known from a number of languages.
The vowel [i] is the source of [j] in pre-vocalic and post-vocalic contexts, where [j] is an effect of gliding either into the onset (4a) or into the coda (4b).
Gliding is inhibited in the morphological root of the word in CiV contexts. Onset is then satisfied by glide insertion.Footnote 9
While gliding into the coda shown in (4b) is the default strategy of resolving a /Vi/ hiatus, it should be noted that there is a small class of words (10 or so) that resist this process.
The words in (6) are clearly exceptional because the absence of gliding is unpredictable – compare the contrast raj [raj] ‘paradise’ but arˈchaiski [xa.i-].Footnote 12
As will become clear later, the domain of the word plays an important role in the phonology of Upper Sorbian. The domain is defined as the morphological root plus affixes, an expected definition. The vowel //i// shows two patterns of behavior in this domain. First, it triggers j-insertion (7a), a parallel to what we saw in (5). Second, //i// glides to [j] if it is part of an affix (7b).
The data in (7b) present a new pattern because gliding creates a complex onset: [ro.mjan], [du.bja.nɨ] and [dɔ.mja.tsɨ]. This is a contrast to the data in both (5) and (7a).
While the gliding in (7b) is a special case, gliding into the coda is the same in the domain of the word as in the domain of the root illustrated in (4b).
The insertion of [j] is spawned not only by the high front vowel //i//, as in (5) and (7a), but also by the mid front vowel, as shown in (9).
The pattern shown in (9) is contradicted by vowel clusters in the root-internal position where the same configuration of vowels as in (9) fails to trigger insertion. If the vowel is unstressed, the surface representation exhibits hiatus.Footnote 14
The data using the labial approximant [ʋ] to avoid hiatus belong to two classes of cases. First, [ʋ] comes from //u// and, second, [ʋ] occurs in insertion contexts that parallel those in (7a).
In parallel to the data in (9), [ʋ] is generated not only by the high back vowel, as in (11), but also by the mid back vowel, as in (12). The process occurs in the domain of the word.
In contexts other than those enumerated in (11) and (12), [ʋ] cannot be generated, so the surface form exhibits hiatus.
Finally, onsetless syllables can be repaired by glottal stop insertion.Footnote 16
The data in (14a) show that a glottal stop is inserted to provide an onset to word-initial syllables. The same strategy is used when the syllable is stressed (14b). The words in (14c) combine the contexts in (14a) and (14b) because the syllable is both initial and stressed. The alternations in (14d) demonstrate that word-internal ʔ-insertion is indeed sensitive to stress: stressed syllables, but not unstressed syllables, have [ʔ].
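The descriptive generalization about (14) can be stated procedurally. The following Python sketch is illustrative only: it assumes that words arrive already syllabified, that the index of the stressed syllable is supplied by hand (the parameter `stressed` is a hypothetical name), and that the hiatus in question has not been repaired by gliding or glide insertion, which take priority where they are available.

```python
# Sketch of word-initial and stressed-syllable glottal stop insertion.
# Assumption: the vowel set below is a simplified inventory used for illustration.
VOWELS = set("aɛɔiuɨeo")

def insert_glottal_stops(syllables, stressed):
    """Prefix [ʔ] to onsetless syllables that are word-initial or stressed."""
    output = []
    for index, syllable in enumerate(syllables):
        onsetless = syllable[0] in VOWELS
        if onsetless and (index == 0 or index == stressed):
            syllable = "ʔ" + syllable
        output.append(syllable)
    return output

# ˈabo: the initial syllable is onsetless
print(".".join(insert_glottal_stops(["a", "bɔ"], 0)))
# kokaˈin: the stressed final syllable is onsetless
print(".".join(insert_glottal_stops(["kɔ", "ka", "in"], 2)))
# ˈdual: the onsetless syllable is neither initial nor stressed, so hiatus survives
print(".".join(insert_glottal_stops(["du", "al"], 0)))
```

The third call leaves the form unchanged, which corresponds to the surface hiatus in [du.al].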
A reviewer points out that the occurrence of glottal stops is typically conditioned by style, speech rate, and token frequency. These factors, as well as emphasis and prominence, have been argued to play a role in the use of glottal stops (Schwartz 2012). The data used in this paper have not been investigated for style, speech rate, and register. They all come from the reading of word lists. A point to note is that there was consistency across speakers with regard to glottal stops and glides. Given this consistency, theoretical phonology has the responsibility to construct a model for interacting generalizations that is able to generate the data produced by the speakers. Facts referring to style, speech rate, frequency, and so forth are of primary concern in usage-based grammars. That is, theoretical modeling and usage-based analyses are independent perspectives in linguistic investigation. This paper looks at Upper Sorbian from a theoretical perspective only, leaving usage-based investigation for future research.
A reviewer points out that glottalization occurs with varying degrees of volume and intensity. Phonetically, glottalization is a continuum ranging from a few irregular glottal pulses to fully fledged glottal stops of varying strengths (Balas 2011; Schwartz 2012; Żygis, Brunner & Moisik 2012). The question therefore is at what point glottalization can be regarded as sufficient to constitute a glottal stop. A larger point here is how gradient phonetic reality can be interpreted as categorical and binary, which is what phonology requires.Footnote 17 The answer lies with phonology, not with phonetics. The issue and the solution to the issue are illustrated with a familiar example of how the phonological distinctive feature [±back] is defined by its function in the classification of central vowels.
It has been widely agreed, ever since The Sound Pattern of English (Chomsky & Halle 1968), that [±back] is the only feature for making categorical distinctions on the front-back axis in the articulation of vowels. Specifically, the claim is that [±central] does not exist as a distinctive feature in phonology. The consequence is that all vowels must be classified as either [-back] or [+back]. Front vowels, such as [i], are uncontroversially [-back], back vowels, such as [u], are naturally [+back], but central vowels, such as [ɨ], [ə], and [a], appear to create a problem. The question is where we decide to draw the line on the front-back axis: before central vowels, in which case central vowels would be [+back], or after central vowels, in which case [ɨ] (as well as [ə] and [a]) would be [-back]. The answer is phonological, not phonetic. The argument is that [ɨ] aligns itself with back vowels from the point of view of palatalization. This is exemplified by the presence of palatalization in Russian stol [ɫ] ‘table’, stol+e [lʲɛ] loc.sg., stol+ik [lʲik] dimin. and its absence in stol+u [ɫu] dat.sg., stol+om [ɫɔ] instr.sg. and stol+y [ɫɨ] nom.pl. I conclude that //ɨ// is [+back].
Returning to glottalization in Upper Sorbian, there are two phonological tests that come to mind: palatalization and glide insertion. Descriptively, there are two palatalization rules, C → Cʲ / – V[-back], and t d → ʧʲ ʤʲ / – V[-back], as the following data illustrate (Schuster-Šewc 1968).
The first rule, t d → tʲ dʲ, applies in non-derived environments while the second, t → ʧʲ, is restricted to derived environments. The point is that neither of these rules applies across word boundaries, so płót ˈinternata ‘the fence of the boarding school’ is pronounced with [t], not with [tʲ] or [ʧʲ]. The absence of palatalization before word-initial [i] follows automatically if we assume that initial glottalization in ˈinternata is a glottal stop because then [t] is not adjacent to [i] in [t ʔi].
Another phonological argument for glottal stops is drawn from the application of glide insertion. Historically, Upper Sorbian, but not Polish, had a very productive rule inserting [j] and [w] before word-initial high and mid vowels. Compare the Polish and the corresponding Upper Sorbian data in (16).
At some point word-initial glide insertion became inert and borrowings did not develop initial glides.
The absence of glide insertion in (17) is understandable if we assume that Upper Sorbian changed its strategy of filling word-initial onsets and ʔ-insertion was added as a rule. ʔ-insertion can do a better job of providing an onset than glide insertion because it is independent of the quality of word-initial vowels, so also words beginning with [a] receive an onset, for example, ale [ʔalɛ] ‘but’. A reviewer points out that development of ʔ-insertion is not surprising for yet another reason: Upper Sorbian is surrounded by languages that have rules of ʔ-insertion: German, Czech, and Polish.
In sum, there are two phonological rules that constitute arguments for regarding glottalization on vowels as a glottal stop: palatalization and glide insertion. I conclude that next to gliding, i → j, and j-insertion/w-insertion, ʔ-insertion is a strategy to provide onsets to onsetless syllables in Upper Sorbian.
2. Preliminary analysis
This section provides a preliminary analysis of the data adduced in Section 1. Subsection 2.1 considers simple patterns while Subsection 2.2 debates the status of [ʋ].
2.1 Simple patterns
The tools of the analysis are the familiar OT constraints (Prince & Smolensky [1993] 2004; McCarthy & Prince 1995) that are stated in a simplified form in (18).
These constraints are for the most part self-explanatory, but a comment is in order regarding Max-μ and Onset. The job of Max-μ is to penalize gliding because gliding is a process that deletes a mora. This is the tenet of autosegmental phonology which holds that, first, the difference between a glide and a vowel is made in terms of syllable structure and, second, syllable nuclei have a mora (Clements & Keyser 1983; Levin 1985; Hayes 1989; and others). The consequence is that a vowel that turns into a glide loses its mora and syllabifies into the onset or into the coda.Footnote 19 The evaluation in (19), which looks at ˈjara ‘very’, //iara// → [jara], makes this point clear. To save space, this evaluation and the others below do not display full syllable structure, focusing on whether the vowel is linked to a mora and hence is a syllable nucleus or whether it is not linked to a mora and hence belongs to the onset or to the coda. Solid lines denote ranking, and the right-pointing hand marks the correct winner.
In what follows, I will simplify representations by omitting reference to moras and writing the vowel [i] that is the nucleus of the syllable as [i] and the glide from //i// as [j], that is, [j] will stand for the melodic segment [i] that has lost its mora and hence is a glide.Footnote 20
Returning to the evaluation in (19), generating the correct surface form [ja.ra] requires the ranking: Onset >> Max-μ. In (20), I extend the list of output candidates to include [ji.ja.ra], a candidate that violates Dep-Seg, a constraint banning insertion. In order to uphold the result obtained in (19), it is necessary to rank Dep-Seg above Max-μ, as shown in (20).
The ranking of Dep-Seg >> Max-μ expresses the generalization that gliding is preferred to insertion as a strategy to resolve hiatus.
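The effect of these rankings can be checked mechanically. The following Python sketch is illustrative only: the violation counts are reconstructed from the prose rather than copied from tableau (20), and the total order Onset >> Dep-Seg >> Max-μ is an assumption, since the argument so far establishes only that Onset and Dep-Seg each dominate Max-μ.

```python
# Minimal OT evaluation sketch for //iara// 'very'.
# Assumption: violation counts are reconstructed from the prose, not from tableau (20).
RANKING = ["Onset", "Dep-Seg", "Max-mu"]  # assumed total order

CANDIDATES = {
    "i.a.ra":   {"Onset": 2},    # faithful candidate: two onsetless syllables
    "ja.ra":    {"Max-mu": 1},   # gliding deletes one mora
    "ji.ja.ra": {"Dep-Seg": 2},  # two inserted glides
}

def evaluate(ranking, candidates):
    """Return the candidate with the best violation profile under strict domination."""
    def profile(name):
        # Violations listed from the highest-ranked constraint down;
        # tuples compare lexicographically, which models strict domination.
        return tuple(candidates[name].get(constraint, 0) for constraint in ranking)
    return min(candidates, key=profile)

winner = evaluate(RANKING, CANDIDATES)
print(winner)
```

Swapping Onset and Dep-Seg leaves the winner unchanged, whereas promoting Max-μ to the top selects the insertion candidate, which is the point of ranking Max-μ below both.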
Gliding into the coda exemplified earlier in (4b) shows that No-Coda must be ranked below Onset, //rai// → [raj] raj ‘paradise’. Furthermore, Upper Sorbian never uses deletion as a strategy to satisfy Onset, so *//rai// → [ra] is not an option. This generalization is expressed by assuming that Max-Seg is an undominated constraint, as shown in (21). The absence of ranking is indicated by a broken line.
The generalization in (21) that gliding is preferred to insertion is challenged by the data in (5), such as diaˈlekt ‘dialect’, which give preference to insertion over gliding: //dialɛkt// → [di.ja.lɛkt]. The difference between ˈjara ‘very’ in (4) and diaˈlekt ‘dialect’ is that in the latter but not in the former, gliding would lead to the creation of a complex onset: [ja.ra] versus *[dja.lɛkt]. The undesired candidate *[dja.lɛkt] loses to the desired winner [di.ja.lɛkt] if Complex-Onset is ranked above Dep-Seg, the constraint prohibiting insertion.
The interaction between Dep-Seg and No-Coda is illustrated in (23), which looks at //radiatɔr// → [ra.di.ja.tɔr] radiˈator ‘radiator’.
The evaluation works correctly if Dep-Seg is ranked below No-Coda as then [ra.di.ja.tɔr] (23d) wins over [rad.ja.tɔr] (23c), the desired result.
The constraint system developed thus far needs further improvement. First, as I show below, the ranking in (23) generates the wrong syllabification for words with an intervocalic consonant cluster. Second, Upper Sorbian admits both j-insertion and ʔ-insertion (Section 1), so it must be determined why j-insertion rather than ʔ-insertion applies in (23). The question is relevant since the [a] syllable in radiˈator is stressed and Upper Sorbian has ʔ-insertion in stressed syllables, hence [ra.di.ʔa.tɔr] is certainly a viable contender.
The ranking *Complex-Onset >> No-Coda in (23) predicts that VCCV should syllabify as VC.CV rather than as V.CCV because it is more important to obey *Complex-Onset than to obey No-Coda. However, the facts are different: Upper Sorbian maximizes onsets, hence Madrid ‘Madrid’, dobry ‘good’, wotrubny ‘cordial’, and krasny ‘beautiful’ are syllabified Ma.drid, do.bry, wo.tru.bny, and kra.sny, respectively. This pattern can be obtained if No-Coda outranks *Complex-Onset.
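Onset maximization can be illustrated with a short procedure. The Python sketch below is a simplification under stated assumptions: it works on orthographic forms, it treats the letters in `VOWELS` as the only nuclei, and it assumes that any intervocalic consonant cluster is a possible onset, which is sufficient for the four words cited here but is not a claim about Upper Sorbian phonotactics in general.

```python
# Sketch of onset-maximizing syllabification over orthographic forms.
# Assumption: every intervocalic cluster is treated as a legal onset.
VOWELS = set("aeiouyěó")

def syllabify(word):
    """Place all intervocalic consonants in the onset of the following syllable."""
    nuclei = [i for i, letter in enumerate(word) if letter in VOWELS]
    if not nuclei:
        return [word]
    syllables = []
    start = 0
    for k, nucleus in enumerate(nuclei):
        # A non-final syllable ends at its vowel; the last one takes the rest of the word.
        end = len(word) if k == len(nuclei) - 1 else nucleus + 1
        syllables.append(word[start:end])
        start = end
    return syllables

for word in ["madrid", "dobry", "wotrubny", "krasny"]:
    print(".".join(syllabify(word)))
```

Only word-final consonants end up in a coda, which mirrors the effect of ranking No-Coda above *Complex-Onset.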
The second question raised by the evaluation in (23) is why radiˈator undergoes j-insertion rather than ʔ-insertion. As noted earlier, the latter is an option because the [a] syllable in radiˈator [ra.di.ja.tɔr] is stressed and Upper Sorbian has a process of ʔ-insertion in stressed syllables.
The generalization is that whenever j-insertion and ʔ-insertion compete over the same string, it is the former that wins over the latter. This generalization is captured by the ranking of the segment inventory constraint banning glottal stops *ʔ higher than the constraint banning the glide *j. From the point of view of radiˈator in (23), these constraints can be ranked anywhere, for example, as the lowest in the hierarchy, but the evaluation of //rai// → [raj] raj ‘paradise’, with the added candidate containing a glottal stop requires that *ʔ must outrank No-Coda, as the following tableau documents.
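The ranking argument for *ʔ >> No-Coda can be made concrete by evaluating the same two candidates under both orders. The Python sketch below is illustrative: it assumes that the added glottal stop candidate is [ra.ʔi], and the violation counts are reconstructed from the prose rather than taken from the tableau.

```python
# Sketch: why *ʔ must outrank No-Coda in the evaluation of //rai// 'paradise'.
# Assumption: [ra.ʔi] is the glottal stop candidate; counts are reconstructed.
CANDIDATES = {
    "raj":   {"No-Coda": 1, "Max-mu": 1, "*j": 1},  # gliding into the coda
    "ra.ʔi": {"*ʔ": 1, "Dep-Seg": 1},               # glottal stop insertion
}

def evaluate(ranking, candidates):
    """Return the candidate with the best violation profile under strict domination."""
    def profile(name):
        return tuple(candidates[name].get(constraint, 0) for constraint in ranking)
    return min(candidates, key=profile)

# *ʔ placed at the bottom of the hierarchy: the wrong candidate wins.
LOW_GLOTTAL = ["Onset", "No-Coda", "Dep-Seg", "Max-mu", "*ʔ", "*j"]
# *ʔ placed above No-Coda: the attested form wins.
HIGH_GLOTTAL = ["Onset", "*ʔ", "No-Coda", "Dep-Seg", "Max-mu", "*j"]

print(evaluate(LOW_GLOTTAL, CANDIDATES))
print(evaluate(HIGH_GLOTTAL, CANDIDATES))
```

Because No-Coda already dominates Dep-Seg, the only constraint left to rule out the glottal stop candidate is *ʔ, so it has to sit above No-Coda.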
With *ʔ ranked as required and No-Coda >> *Complex-Onset argued for in (24), we need to make sure that these rankings do not have adverse effects on the evaluation of radiator ‘radiator’. They do not, as (26) shows.
To conclude, in situations of conflict, when both j-insertion and ʔ-insertion are applicable, it is j-insertion that must win, so [ra.di.ja.tɔr] must win over [ra.di.ʔa.tɔr].
This generalization is flatly contradicted by the data in (27).
Given the ranking in (26) and, specifically, the fact that *ʔ outranks *j, we would expect [j], not [ʔ], to fill the onsets in (27). The data are perfectly clear: [j] is never inserted word-initially, see (27a). Neither is it inserted word-medially if the configuration is //Vi//, that is, //Vi// is never turned into [V.ji]. At first glance, it appears that these restrictions can be handled by Distinct Glide, a constraint that bans ji syllables (Kawasaki 1982; Rubach 2002). What makes this analysis suspect is the fact that Upper Sorbian freely admits [ji] syllables, for example, jich [jix] ‘them’, idej+i ‘idea’ gen.sg. The explanation that it is *ji that blocks glide insertion in (27) collapses when we look at the data in (28).
We know from the data in (9) and (12) that mid vowels trigger glide insertion, for example, ˈstere+o ‘stereo’ nom.sg., //stɛrɛ+ɔ// → [stɛrɛjɔ], and that glide insertion is preferred to glottal stop insertion, so we expect to see initial glides in (28). However, what we find is glottal stops and not glides. Similarly, the input //stɛrɛ+ɔ// should be able to resolve hiatus in two ways: first, by generating [j] from //ɛ//, //stɛrɛ+ɔ// → [stɛrɛjɔ], and, second, by generating [w] from //ɔ//, *//stɛrɛ+ɔ// → /stɛrɛwɔ/,Footnote 26 just as /w/ (ultimately [ʋ]) is generated in ˈSamo+a, //samɔ+a// → /samɔwa/ ‘Samoa’. But this is not what we find: stereo has [j] and the option of generating [w] is not attested.
While the data in (27a), (27b), and (28a) as well as (28b) look quite different, they can in fact be reduced to a single denominator: the glide cannot be in the same syllable as the vowel that generates it. This is a directionality effect, specifically, the restriction that the glide cannot be to the left of the vowel that spawns it. So, in a V1V2 configuration, V2 cannot generate a glide. This is exactly what we see in kokaˈin ‘cocaine’: [i] cannot generate a glide because it is a V2 and [a] cannot generate a glide because only high and mid vowels are able to produce glides. Since the hiatus in [ai] cannot be resolved by glide insertion, the grammar moves to the next best option, which is ʔ-insertion. This option is available because the i of kokaˈin is stressed. The ia configuration in radiˈator is the reverse of the ai configuration of kokaˈin, but in ia the i is a V1 vowel, so it is free to spawn a glide: //radiatɔr// → [ra.di.ja.tɔr]. In //stɛrɛ+ɔ// → [stɛrɛjɔ] stereo, we see [j] because //ɛ//, the spawning vowel, is a V1. In contrast, the //ɔ// of //stɛrɛ+ɔ// is a V2, and, consequently, cannot generate a glide, so *sterewo is not attested.
The directionality of glide insertion has been noted as a problem for OT by Itô & Mester (1999). Their solution is to postulate a new constraint called Crisp Edge: ‘multiple linking between prosodic categories is prohibited’. Crisp Edge bans rightward insertion from V1 in a //V1V2// cluster, but the facts of Upper Sorbian require exactly the opposite: the insertion from V1 is attested while the insertion from V2 is not. Since the inserted glide must share the features with the spawning vowel, the feature tree of the glide and the vowel is either the same or partly the same. The former occurs if the glide is a full copy of the vowel, as in //Vi//→ [V.ji]. The latter happens when the glide is a partial copy of the vowel, as in //Vɛ// → [V.jɛ]. I illustrate the point in (29), where (29a) shows the undesired candidate *[kɔ.ka.jin] for kokaˈin ‘cocaine’ and (29b) displays the desired winner [di.ja.lɛkt] for dialekt ‘dialect’. I focus on the relationships between the Root nodes (marked RT) and their melodic content. Syllables are enclosed in parentheses.
The distinctive property of the banned configuration in (29a) is the occurrence of multiple linking (an effect of spreading) inside one syllable, hence I propose the following constraint.Footnote 27
Given the ranking *Multi >> *ʔ, it is predicted, correctly, that kokaˈin must be [kɔ.ka.ʔin] because *[kɔ.ka.jin] violates *Multi. A further, beneficial consequence is that word-initial syllables beginning with high and mid vowels cannot spawn glides, so ˈIrka ‘Irene’ and ˈecho ‘echo’ do not develop [j]. This is predicted because the putative [j] would have to be in the same syllable as the spawning vowel: [ji] and [jɛ], which is a violation of *Multi. Deprived of their ability to spawn a glide, initial syllables fall prey to the next best option and undergo ʔ-insertion, which generates the attested surface forms [ʔir.ka] and [ʔɛxɔ].
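The configuration that *Multi rules out can be stated as a simple predicate. The Python sketch below is a schematic rendering and not the formal definition of the constraint: it assumes that each candidate is annotated by hand with the syllable that contains the inserted glide and the syllable that contains the spawning vowel (the field names `glide_syll` and `source_syll` are hypothetical).

```python
# Schematic check for *Multi: an inserted glide may not share a syllable
# with the vowel that spawns it. Annotations are supplied by hand.
from dataclasses import dataclass

@dataclass
class Candidate:
    syllables: list    # surface syllables, e.g. ["kɔ", "ka", "jin"]
    glide_syll: int    # index of the syllable containing the inserted glide
    source_syll: int   # index of the syllable containing the spawning vowel

def violates_multi(candidate):
    """True if the inserted glide and its spawning vowel are tautosyllabic."""
    return candidate.glide_syll == candidate.source_syll

kokajin = Candidate(["kɔ", "ka", "jin"], glide_syll=2, source_syll=2)
dijalekt = Candidate(["di", "ja", "lɛkt"], glide_syll=1, source_syll=0)
jirka = Candidate(["jir", "ka"], glide_syll=0, source_syll=0)

for candidate in (kokajin, dijalekt, jirka):
    print(".".join(candidate.syllables), violates_multi(candidate))
```

The candidates for kokaˈin and ˈIrka are flagged because the glide sits in the syllable of its source vowel, while [di.ja.lɛkt] passes because the glide belongs to the following syllable.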
As noted in (16), historically *Multi must have been ranked lower than it currently is, which permitted j-insertion to create [ji] syllables.Footnote 29 This is attested in a class of six words and a few proper names, such as jich ‘their’ and Jitk (name), which had [i] rather than [ji] in Old Upper Sorbian (Schuster-Šewc 1983). The current pattern is not to create [ji] syllables, a generalization that is evidenced by Irka ‘Irene’ in (27) and many similar examples. The [j] in jich can still be derived but it must come from gliding rather than from insertion, that is, the underlying representation of jich is //iix// rather than //ix//.
The discussion of *Multi is summarized by looking at the evaluation of kokaˈin ‘cocaine’. Since *Multi is surface-true, I will assume that it is undominated.
The result is correct, but (31) has not considered one important candidate: *[kɔ.kajn] with gliding, i → j. This candidate would have won in (31) because Max-μ, the constraint penalizing gliding, is below Dep-Seg, which militates against insertion. The creation of a complex coda in *[kɔkajn] cannot be the explanation here because the gen.sg. form kokaˈin+a would not have a complex coda and yet the candidate *[kɔ.kaj.na] must lose to the attested surface form [kɔ.ka.ʔi.na]. Similarly, it is irrelevant that the [i] in kokaˈin is stressed because gliding into the coda can be inhibited also when the [i] is unstressed, as in arˈchaiski [xa.i] ‘archaic’, so stress plays no role. Further, notice that the [xa.i] of arˈchaiski ‘archaic’ and the [aj] of raj ‘paradise’ constitute a near minimal pair in that they contrast in the treatment of i: //xai// → [xa.i] versus //rai// → [raj].
The default pattern is the one represented by raj, that is, gliding is the norm. As observed in Section 1, the absence of gliding is found in a small class of morphemes (10 or so) which are simply exceptions. A further question is how this fact should be encoded in the underlying representation. I follow Rubach (2000a) and assume that the vowel which escapes gliding is prespecified with a sigma, that is, it is prespecified as a syllable nucleus.Footnote 30 Ident-Nuc is then responsible for the blocking of gliding.
Since Ident-Nuc is never violated in the surface forms of Upper Sorbian, I will assume that it is undominated and ranked above Onset.
The evaluation of kokaˈin from (31) is now continued in (33). I mark the prespecified nucleus with N and ignore other aspects of the representation: the reference to moras and the complete syllable trees. Instead, I mark syllable boundaries with a dot. To keep the tableau within reasonable bounds, I omit Complex-Onset because it is not violated by any candidate and hence plays no role.
The evaluation in (33) yields the attested surface form.Footnote 32
Finally, the preliminary analysis of gliding and epenthesis that this section has undertaken needs to account for the preference for ʔ-insertion over j-insertion in vowel clusters involving mid vowels, specifically //ɛV//. An illustrative example is the word oceˈan ‘ocean’, whose final syllable is stressed, //ɔtsɛan// → [ʔɔʦɛʔan]. The problem is how to exclude the candidate *[ʔɔʦɛjan], which is a viable contender because, as remarked in Section 1 and discussed further in Section 4, mid vowels may spawn glides.
The crucial observation leading to the exclusion of *[ʔɔʦɛjan] is that a glide spawned by a mid vowel is deficient because it cannot draw the feature [+high] from the spawning vowel. That is, if the glide from //ɛ// were a copy of the vowel, we would generate a mid glide rather than a high glide. However, mid glides are prohibited in Upper Sorbian, a generalization that is captured by an undominated constraint on glide well-formedness: glides must be [+high]. In order to obey this constraint, the glide from //ɛ// would have to acquire the feature [+high], which violates the feature markedness constraint stated in (18i): *[+high].Footnote 33
The analysis of oceˈan is now straightforward: *[+high] must outrank *ʔ. In (34), I ignore the first syllable of oceˈan, which has a glottal stop, and postpone the discussion of initial ʔ-insertion until Section 4.
To conclude, the constraints discussed in this section are listed in (35), which provides a summary of the rankings.
2.2 The status of [ʋ]
The goal of this section is to present evidence that the labial approximant [ʋ] occurring in the processes of gliding and epenthesis is best analyzed as derived from an intermediate /w/ by a process of consonantization: w → ʋ. Footnote 35 I present four arguments in favor of this analysis.
2.2.1 Argument 1
As shown in Section 1, [ʋ] is derived from //u// when //u// is preceded by a vowel, as in ˈsawn+a ‘sauna’: //saun+a// → [saʋna]. From the structural point of view, this derivation is parallel to the derivation of [j] from //i// in words such as ˈfajf+a ‘pipe’: //faif+a//→ [fajfa].
The structural parallel between (36a) and (36b) holds on the condition that //u// turns into a glide, exactly as //i// turns into [j]. The glide has the melodic representation of the vowel, a standard assumption in autosegmental phonology. A further process turns the /w/ from (36b-ii) into a labial approximant consonant: /w/ → [ʋ]. This simple analysis is possible only if /w/ is admitted as an intermediate representation in the derivation of [ʋ] from //u//.
2.2.2 Argument 2
A similar argument derives from the observation that //uV// clusters do not change in the sense that //u// is never turned into [ʋ]. For example, the name Ued+a //uɛd+a// retains //u// in the surface representation.Footnote 36 The absence of [ʋ] is easily accounted for if we assume that //u// would first need to glide to /w/ before ultimately yielding [ʋ] by consonantization. The argument here is that the gliding to /w/ is banned by Onset-w, a constraint that is well known from the study of languages other than Upper Sorbian. For example, Onset-w plays an important role in the analysis of Polish and Slovak (Rubach 2000a).
All that is required is that Onset-w outrank Onset, as shown in (38).
Candidate (38c) shows that *Multi and Onset-w are independent, as (38c) violates the latter, but not the former, constraint (see also Section 4.2).
2.2.3 Argument 3
In surface terms, what appears to be ʋ-insertion and what we know to be j-insertion differ systematically as processes in their operation inside the root morpheme (but not elsewhere, see below). Specifically, the configuration //CiV// spawns a glide but the configuration //CuV// does not.
The absence of *[duʋal] is accounted for if we make the assumption that [ʋ] is derived from /w/ by consonantization. On this assumption, the unattested representation *[duʋal] would have to derive from the intermediate representation /duwal/ by w-insertion, a parallel to j-insertion in dialekt. However, the intermediate representation /duwal/ can never be the winner in the evaluation of //dual// because Onset-w prohibits /w/ in the onset. In the same vein, the candidates /sawuna/ and /uwɛda/ could never be the successful contenders because they violate Onset-w.
2.2.4 Argument 4
The final argument for intermediate /w/ comes from the directionality of glide insertion. This came up as a problem in the discussion of kokain ‘cocaine’ in (31) and (33). Recall that the constraint system was unable to exclude the candidates containing [ji], so *[kɔ.ka.jin]. The problem is more general than the absence of [ji] where the glide comes from insertion. This is exemplified in (40).
As argued in the preceding section, see (31), the absence of *[kɔ.ka.jin] is accounted for by *Multi, which outlaws glides occurring in the same syllable as the spawning vowel, as in *[kɔ.ka.jin]. The absence of *[ra.bi.ʋɔm] in (40b) can be accounted for in the same way if we admit an intermediate representation with a glide: */ra.bi.wɔm/. This candidate is excluded by *Multi because /w/ and the spawning vowel /ɔ/ are in the same syllable. If the intermediate /w/ does not exist and hence the candidate has the approximant /ʋ/ in /ra.bi.ʋɔm/, the analysis fails: *Multi is inapplicable because it has jurisdiction over the structure that comes from spreading (multiple linking), not from independent insertion. That is, the candidate /ra.bi.wɔm/ with /w/ from spreading, but not the candidate /ra.bi.ʋɔm/ with /ʋ/ from insertion, is within the purview of *Multi. I conclude that it is beneficial to admit intermediate /w/ and derive [ʋ] at a later point by consonantization, w → ʋ.
To conclude, I have adduced four different arguments showing that surface [ʋ] from //u// coming from gliding or from insertion must go through an intermediate stage at which it is the glide /w/. Building on this conclusion, among other things, the following section argues that the correct analysis of Upper Sorbian must admit level distinction.
3. Level distinction
The goal of this section is to adduce evidence for a derivational analysis. Given OT, derivationality is implemented as the framework of Stratal/Derivational Phonology. I argue for an analysis based on derivational levels and present seven different arguments for it, supported by the data from Upper Sorbian.Footnote 39
3.1 Argument 1
A powerful argument for level distinction was made in the preceding section. The argument is that we need an intermediate stage with the glide /w/ in the derivation of [ʋ] from back vowels by gliding or insertion. For this analysis to work, the consonantization process w → ʋ must take place at a later level.
3.2 Argument 2
The domains of morphological roots and words (roots plus affixes) are systematically different for w-insertion spawned by //u//: w-insertion occurs in the domain of words, but not in the domain of roots.Footnote 40
Notice that the words in (41) constitute near minimal pairs in the sense that the same vowel configuration //ua// exhibits w-insertion if the vowels straddle a morpheme boundary, but not if the vowels are root-internal.
3.3 Argument 3
Mid vowels spawn glides in the domain of the word (42a), but not in the domain of the root (42b).
These examples show that the cluster of //ɛ// and //ɔ// spawns a glide in the domain of the word, but not in the domain of the root: //stɛrɛ+oʋ+ɨ// → [stɛrɛjoʋɨ] (42a) versus //gɛɔgraf// = [gɛɔgraf] (42b).
3.4 Argument 4
The string //CiV// induces glide insertion in roots but not in affixes, which exhibit gliding. The examples in (43) are near minimal pairs.
3.5 Argument 5
Whether the root vowel glides under affixation depends on whether the affix is a suffix or a prefix. There is no gliding when a suffix is added, as in rabi ‘rabbi’ (nom.sg.) – rabij+a (gen.sg.): [rabija], not *[rabja]. In contrast, the addition of a prefix does not inhibit gliding, as in ˈz+jednać ‘bring peace’: //z+iɛdn+a+ʧʲ// → [zjɛdnaʧʲ], not *[zijɛdnaʧʲ]. That is, [rabija] and [zjɛdnaʧʲ] display different patterns of behavior.
3.6 Argument 6
As mentioned before, ʔ-insertion repairs root-initial onsetless syllables. Root-internal syllables (if unstressed) remain unaffected and exhibit hiatus. This is exemplified by the following near minimal pairs.
The problem is how to make sure that ʔ-insertion applies root-initially but not root-internally. A similar problem appears when we consider //V1V2// strings of which V2 is stressed, as I explain below.
3.7 Argument 7
As noted in (14b–c) in Section 1, a glottal stop is inserted to provide an onset for a stressed syllable. This generalization accounts for the alternations in (45).
The problem is how to guarantee that ʔ-insertion applies in stressed syllables and leaves unstressed syllables unscathed.
A heavy-handed analysis of word-initial ʔ-insertion in (44) and stressed syllable ʔ-insertion in (45) would be to posit new constraints such as those in (46).
The problem with these constraints is that they simply state the descriptive facts. An insightful analysis would be one that generates surface representations from an interaction of independent generalizations. Further, Onset [initial σ] and Onset [stressed σ] are flawed also from a theoretical point of view: they designate two new loci for markedness constraints because markedness is now relativized to word-initial syllables and stressed syllables. The effect is that all markedness constraints are tripled in number. Looking at Onset, for example, we have the generic Onset stated in (18a) in Section 2.1 and the two specific onset constraints in (46). The theory predicts such tripling for every markedness constraint, a formidable increase in the power of the grammar.Footnote 41 This situation is made worse by the fact that we already have such triple editions of faithfulness constraints because, in accordance with the tenets of positional faithfulness, faithfulness is relativized to word-initial syllables and to stressed syllables. That is, an identity constraint represented symbolically as Ident-X appears in three shapes: Ident-X (generic), Ident-X[initial σ] and Ident-X[stressed σ].
With regard to positional faithfulness, the motivation and the rationale for such distinctions have been argued for in a convincing way, notably by Beckman (1997, 1999) and Casali (1997). I will therefore assume positional faithfulness in the analysis that follows and argue that the addition of the positional markedness constraints in (46) is unnecessary. The argument is built on the assumption that the grammar admits level distinction, an assumption that follows naturally from the seven independent arguments presented earlier in this section.
The theoretical framework of the analysis is that of Stratal/Derivational Optimality Theory (Kiparsky 1997, 2000; Rubach 1997, 2000a,b; Bermúdez-Otero 1999, 2013, 2018; and others). The idea is that evaluation proceeds at three levels or strata: the stem level, the word level, and the post-lexical (post-syntactic) level.Footnote 42 The input to the first level is the underlying representation of the stems. The optimal output from the first level is the input to the second level (a new ‘underlying representation’) and, likewise, the winner candidate from the second level is the input to the third level. Inside a level/stratum, evaluation is fully parallel, as in classic OT. Constraints may be reranked between levels but the reranking must be minimal.
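The architecture just described can be sketched as a pipeline in which each level runs a parallel evaluation and passes its winner on as the next input. The sketch below is only an illustration: the candidate sets and violation marks for ˈdual are hand-assigned assumptions, the rankings anticipate those argued for in Section 4, and the relative order of Dep-Seg and *ʔ at level 3 is assumed. The names `STRATA`, `evaluate`, and `derive` are hypothetical.

```python
# Sketch of a three-level stratal derivation. Evaluation inside a level is
# parallel (classic OT); the winner of one level is the input to the next.
# Candidate sets and violation marks are illustrative assumptions for ˈdual.

def evaluate(tableau, ranking):
    """Pick the candidate with the smallest violation profile in ranking order."""
    return min(tableau, key=lambda cand: tuple(tableau[cand].get(c, 0) for c in ranking))

# Each stratum: (label, ranking, candidate sets keyed by input form).
STRATA = [
    ("level 1", ["Onset-w", "Onset", "*ʔ"],
     {"dual": {"du.al": {"Onset": 1},
               "du.wal": {"Onset-w": 1},
               "du.ʔal": {"*ʔ": 1}}}),
    ("level 2", ["Onset", "Max-Seg", "*ʔ", "Onset-w"],
     {"du.ʔal": {"du.ʔal": {"*ʔ": 1},
                 "du.al": {"Onset": 1, "Max-Seg": 1}}}),
    ("level 3", ["Dep-Seg", "*ʔ", "Onset", "Max-Seg"],
     {"du.ʔal": {"du.ʔal": {"*ʔ": 1},
                 "du.al": {"Onset": 1, "Max-Seg": 1},
                 "du.wal": {"Dep-Seg": 1, "Max-Seg": 1}}}),
]

def derive(underlying, strata):
    """Run the form through every stratum and return the full derivation."""
    form, history = underlying, [underlying]
    for _label, ranking, candidate_sets in strata:
        form = evaluate(candidate_sets[form], ranking)
        history.append(form)
    return history

print(" → ".join(derive("dual", STRATA)))
```

The only differences between the three strata in this sketch are the rankings and the inputs they receive; the evaluation procedure itself is the same at every level.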
While the three levels are part of the general model of Stratal/Derivational OT, the determination of what constitutes a stem, a word, or a clitic phrase and a sentence is a language-specific matter.
The stem is a general concept: stems are bare roots and roots expanded by affixation; schematically: [Stem[Stem[Stem[Stem Root]Stem + Suffix 1]Stem + Suffix 2]Stem + Suffix 3]Stem, and so forth. The interest of this paper is that Upper Sorbian requires defining level 1 inputs as bare roots. In a typical situation, level 1 inputs are larger stems. For example, it has been shown by Bermúdez-Otero (2013) that Spanish does not admit bare roots as level 1 inputs. Level 1 inputs in Polish are roots complete with suffixes. Polish prefixes are analyzed at the word level or at the clitic level (Rubach 2016).
A further question asked by a reviewer is whether WFRs (word formation rules) apply at particular levels, as in classic Lexical Phonology. In this view, English //ɪn//, a classic class 1 prefix, would be added by a WFR at level 1 while English //ʌn//, a classic class 2 prefix, would be added at level 2. This is an attractive assumption but it is potentially problematic because it may lead to affix ordering paradoxes, whereby a level 2 affix might need to be added before a level 1 affix (Kiparsky 1985). This problem goes away if we assume that the default is to do all word formation before phonology. Phonological processing is then guided by the designation of affixes as being level 1 or level 2. The default is that all affixes are processed at level 1. Only designated affixes are processed at level 2. Sometimes the level 2 designation can be predicted by a generalization and hence need not be stipulated. This is the situation in Polish: all prefixes are level 2 (Rubach 2016).
The novelty of Upper Sorbian is that bare roots rather than roots plus affixes are level 1 stems and suffixation comes at level 2. I demonstrate in the subsequent sections that the model of Stratal/Derivational OT is correct and sufficient for an analysis of the complex patterns of generalizations in Upper Sorbian.
4. Analysis: Stratal/Derivational OT
This section presents a grammar of Upper Sorbian gliding and epenthesis processes that are active in the Onset conspiracy. Section 4.1 is an overview of where the analysis is heading. Sections 4.2, 4.3, and 4.4 demonstrate how the analysis runs at levels 1, 2, and 3, respectively.
4.1 Overview
There are three kinds of processes that are active at level 1: gliding, j-insertion, and ʔ-insertion. The input //i// is glided into the onset or into the coda, depending on whether it is pre-vocalic (gliding into the onset) or post-vocalic (gliding into the coda), as in jara ‘very’, //iara// → [jara], cited in (4a) and raj ‘paradise’, //rai// → [raj], cited in (4b).
The force of gliding is diminished in two ways. First, in a small class of morphemes, underlying //i// is prespecified as a nucleus, the consequence being that it cannot glide because it would offend the undominated Ident-Nuc constraint. For example, kokaˈin ‘cocaine’, with a prespecified nucleus on //i//, cannot claim *[kɔkajn] as the winner (see Section 2). Second, *Complex-Onset thwarts gliding in instances in which it would create a complex onset, so the candidate *[djalɛkt] is not the optimal output from the input //dialɛkt//, diaˈlekt ‘dialect’, as shown in (22) in Section 2. Further, No-Coda makes sure that the input //radiatɔr// radiˈator ‘radiator’, cannot evade *Complex-Onset by syllabifying [d] into the coda, so the candidate [rad.ja.tɔr] is doomed, as shown in (26). In sum, words such as diaˈlekt and radiˈator share the generalization that gliding cannot occur with //CiV// inputs. This being the case, //CiV// inputs obey Onset by activating j-insertion, so diaˈlekt and radiˈator have [di.ja.lɛkt] and [ra.di.ja.tɔr] as their optimal outputs.
From the point of view of Onset, the hiatus in diaˈlekt and radiˈator could be avoided by either j-insertion or ʔ-insertion, but the generalization is that Upper Sorbian gives preference to j-insertion over ʔ-insertion. A glottal stop is inserted only if the glide cannot be inserted. This happens in three situations. First, given that [j] must be spawned by a front vowel, it could not be inserted in roots such as ˈLaos ‘Laos’ which do not have a front vowel. Second, the vowel e in roots such as geoˈgraf ‘geographer’ cannot generate [j] either because, through the action of *[+high], mid vowels are not permitted to spawn glides at level 1. Third, *Multi prohibits the insertion of [j] before [i], hence kokaˈin ‘cocaine’ cannot have *[kɔ.ka.jin] as its optimal output. Since j-insertion is blocked, the job of filling the onset is passed on to ʔ-insertion, hence ˈLaos, geoˈgraf and kokaˈin have [laʔɔs], [gɛʔɔgraf],Footnote 43 and [kɔkaʔin] as their optimal outputs at level 1.
The derivation of /w/ at level 1 is severely constrained: it is limited to gliding into the coda, as in ˈsawna ‘sauna’ in (36b): //saun// → [sawn].Footnote 44 The derivation of /w/ in the onset position is blocked by the undominated Onset-w, so ˈUed+a (name) and ˈdual ‘dual’ cannot have *[wɛd], *[wuɛd] or *[uwɛd] and *[dwal] or *[duwal] as their optimal outputs. The consequence is that they fall prey to ʔ-insertion and leave level 1 with the representations /ʔuʔɛd/ and /duʔal/. In the case of mid vowels, for example, the //ɔ// in poˈet ‘poet’, the derivation of [w] is blocked by Onset-w: *[pɔ.wet]. The development of [j], however, is thwarted by *Multi. In configurations including //ɛ// as the first vowel of the cluster, as in oceˈan ‘ocean’, j-insertion is barred from applying by *[+high]. The generalization is that mid vowels cannot spawn glides at level 1. Consequently, the satisfaction of Onset is achieved via ʔ-insertion, so poˈet and oceˈan leave level 1 with [pɔʔet] and [ʔɔ.ʦɛ.ʔan] as the winning candidates, which happen to be the attested surface forms.
The ranking of *ʔ below Onset makes sure that no root can leave level 1 without an onset. The desirable consequence is that all vowel-initial roots pick up a glottal stop, which is exactly what the surface facts of Upper Sorbian require: recall (see Section 1) that the language has an exceptionless process of word-initial ʔ-insertion, hence Aˈmerika ‘America’ and ˈIndian ‘Indian’ have the syllables [ʔa] and [ʔi] in the phonetic representation: [ʔamɛrika] and [ʔindijan].
The resolution of hiatus via ʔ-insertion in root-internal vowel clusters delivers the correct result in the cases where the syllable is stressed, such as poˈet [pɔʔet] and kokaˈin [kɔkaʔin]. In the case of unstressed root-internal syllables such as ˈdual ‘dual’, geoˈgraf ‘geographer’ and arˈchaiski ‘archaic’, ʔ-insertion overgenerates at level 1, yielding /duʔal/, /gɛʔɔgraf/ and /arxaʔiski/Footnote 45 as the winning candidates. The attested phonetic representations are [dual], [gɛɔgraf], and [arxaiski], so the superfluous /ʔ/ must be deleted at a later level. The question is whether this deletion should occur at level 2 or at level 3. There is no doubt that the deletion of /ʔ/ in unstressed syllables is a level 3 operation. The /ʔ/ plays an important role at level 2 because it blocks glide insertion root-internally.
Level 2 has the word as its domain, which means that the evaluation is extended to strings that include roots plus affixes. The level 1 restrictions on gliding, first, no [CjV] outputs and, second, no [w] in the onset, are lifted at level 2. The input /rom+ian/, ˈRomjan ‘inhabitant of Rome’, has [ro.mjan] as its winning candidate, which is the correct surface representation.Footnote 46 Recall that suffixes (here the //ian// of ˈRomjan) are not available at level 1 because Upper Sorbian limits level 1 to roots. Consequently, the input to level 2 is /rom+ian/, where //o//, but not //i//, was syllabified at level 1 and is designated as the nucleus. Therefore, the /i/ of the suffix /ian/ does not fall within the purview of Ident-Nuc and is free to glide at level 2: /rom+ian/ → [ro.mjan].
There is no danger that /rabi+a/, the gen.sg. of ˈrabi ‘rabbi’, can follow the same path as ˈRomjan at level 2 and undergo gliding. The reason is that i → j in /rabi+a/ is blocked by the undominated Ident-Nuc, a constraint that prohibits the gliding of a vowel that has been specified as the nucleus. Given that Onset is ranked high and *ʔ remains ranked above Dep-Seg, /rabi+a/ has no option but to undergo j-insertion, yielding [rabija], the correct surface form. For the same reason, /kanu+a/, the gen.sg. of ˈkanu ‘canoe’, cannot undergo gliding or ʔ-insertion. With Onset-w reranked below Onset at level 2, /kanu+a/ undergoes w-insertion and leaves level 2 with the glide /w/: /kanu+a/ → /kanuwa/. The attested surface representation [kanuʋa] is derived at level 3 by consonantization, w → ʋ.
In contrast to level 1, level 2 is open to the derivation of glides spawned by mid vowels because *[+high] >> *ʔ is reranked to *ʔ >> *[+high]. This means that it is better to add the feature [+high], as required when the glide is spawned by a non-high vowel, than to insert [ʔ]. Thus, /stɛrɛ+ɔ/, ˈstereo ‘stereo’, goes to [stɛrɛjɔ] and /bo+a/ ˈboa ‘boa’ turns into /bowa/ at level 2 and further to [boʋa] at level 3. The absence of w-insertion as well as j-insertion in unstressed root-internal vowel clusters containing mid vowels is explained by the fact that these clusters do not exhibit hiatus at level 2. This is so because ʔ-insertion overgenerated at level 1, yielding the intermediate representations with a glottal stop: /du.ʔal/, /gɛ.ʔɔ.graf/, and /ar.xa.ʔi.ski/. The clean-up operation deleting /ʔ/ takes place at level 3, where *ʔ is reranked above Onset and Max-Seg.Footnote 47
The deletion of /ʔ/ must not occur in initial syllables and in stressed syllables. This is no problem, however. The positional faithfulness constraints Max-Seg[initial σ] (no deletion in initial syllables) and Max-Seg[stressed σ] (no deletion in stressed syllables) are ranked above *ʔ, so /ʔ/ survives the clean-up operation and occurs in the surface representations of words such as Aˈmerika [ʔamɛrika] ‘America’ and poˈet [pɔʔet] ‘poet’.Footnote 48
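The interaction of the clean-up operation with positional faithfulness can be sketched as follows. This is a simplified illustration under stated assumptions: the candidate set is restricted to forms that delete some subset of the glottal stops, each syllable carries at most one /ʔ/ (at its left edge), stress is supplied as a syllable index, and the function names are hypothetical.

```python
from itertools import combinations

# Sketch of the level 3 clean-up: *ʔ outranks Onset and Max-Seg, but the
# positional faithfulness constraints outrank *ʔ.
# Assumptions: candidates differ only in which glottal stops are deleted.

VOWELS = set("aeiouɛɔɨ")
RANKING = ["Max-Seg[initial σ]", "Max-Seg[stressed σ]", "*ʔ", "Onset", "Max-Seg"]

def candidates(syllables):
    """Yield every set of ʔ-initial syllables that could lose their glottal stop."""
    sites = [i for i, s in enumerate(syllables) if s.startswith("ʔ")]
    for size in range(len(sites) + 1):
        for deleted in combinations(sites, size):
            yield frozenset(deleted)

def profile(syllables, stressed, deleted):
    """Return (violation marks, output string) for one deletion pattern."""
    out = [s[1:] if i in deleted else s for i, s in enumerate(syllables)]
    marks = {
        "Max-Seg[initial σ]": int(0 in deleted),
        "Max-Seg[stressed σ]": int(stressed in deleted),
        "*ʔ": sum(s.count("ʔ") for s in out),
        "Onset": sum(s[0] in VOWELS for s in out),
        "Max-Seg": len(deleted),
    }
    return marks, ".".join(out)

def level3(syllables, stressed):
    """Return the optimal level 3 output for a syllabified input."""
    best = min(
        (profile(syllables, stressed, d) for d in candidates(syllables)),
        key=lambda pair: tuple(pair[0][c] for c in RANKING),
    )
    return best[1]

print(level3(["du", "ʔal"], 0))             # unstressed, non-initial /ʔ/
print(level3(["pɔ", "ʔet"], 1))             # /ʔ/ in the stressed syllable
print(level3(["ʔa", "mɛ", "ri", "ka"], 1))  # /ʔ/ in the initial syllable
```

In this sketch the glottal stop is lost only where neither positional constraint protects it, which is the pattern described for ˈdual as opposed to poˈet and Aˈmerika.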
The details of the analysis are presented in the ensuing sections. Section 4.2 examines level 1 evaluations. Section 4.3 lays out the analysis at level 2. Section 4.4 completes the presentation by explaining the changes that occur at level 3.
4.2 Level 1
The evaluations in Section 2 should be understood now as level 1 evaluations, so most of the phonological processes that are active on level 1 have already been discussed. However, there are two types of cases that require further scrutiny because the formal apparatus was not complete with all the relevant constraints at the time when they were discussed in Section 2. The two cases in point are j-insertion originating from mid vowels and the role of ʔ-insertion at level 1.
The interaction between glide insertion spawned by mid vowels and ʔ-insertion unveils a clear generalization, see (42): mid vowels cannot spawn glides at level 1, so Onset is satisfied by ʔ-insertion. As explained earlier, this generalization is captured by the ranking *[+high] >> *ʔ. The constraint *[+high] prohibits the addition of [+high], an operation that is necessary if a glide comes from a non-high vowel. In (48), I repeat the evaluation from (34) for oceˈan //ɔʦɛan// ‘ocean’, now extended to include the initial vowel, and add an evaluation for poˈet //pɔet// ‘poet’. Irrelevant constraints have been omitted.
The words in (48) are the attested surface forms because ʔ-insertion has supplied [ʔ] to the root-initial syllable and to the stressed syllables, which is where [ʔ] is found in the surface representations.
The situation is different when root-internal syllables are unstressed. This happens in words such as arˈchaiski ‘archaic’, ˈdual ‘dual’, and geoˈgraf ‘geographer’. The grammar of level 1 predicts that these syllables will obtain a glottal stop, as shown in (49). The outputs are the intermediate representations, which I enclose in single slashes. Irrelevant constraints that are not violated by any of the candidates or are violated in exactly the same way by all candidates have been omitted. The evaluation in (49a) considers the relevant part of the word arˈchaiski ‘archaic’. Recall from the discussion in Section 2 that the //i// in archaiski is prespecified as a nucleus, so Ident-Nuc excludes the candidate with gliding.
To clarify, the constraint *[+high] is violated by any occurrence of [+high] (see Note 33). In candidates (49a-iii) and (49b-ii), it is violated once, not twice, because the glides come from spreading, so they share the feature tree with the spawning vowel.
Glottal stops in unstressed syllables in (49) are deleted at level 3 (see Section 4.4) because the attested surface forms exhibit hiatus: [xa.i], [du.al], and [gɛ.ɔ.graf]. The presence of a glottal stop at the intermediate stage, specifically at the output of level 1, is a Duke of York situation (Pullum 1976). This is not a problem for two reasons. First, as documented in Rubach (2003, 2014, 2019b), OT must admit Duke of York derivations and, second, the intermediate representations with a glottal stop play an important role at level 2 because they account for the absence of glide insertion root-internally. Paradoxically then, the Duke of York derivation here is an asset rather than a drawback. I clarify this reasoning further in the following section. At this point, I conclude that all winning outputs from level 1 have an onset. In most cases, the onset generated at level 1 occurs in the attested surface representation. In some cases, the onset (invariably a glottal stop) exists only in the intermediate representations that are processed further at level 3.
A reviewer asks for an independent example that would motivate Duke of York derivations in OT. A clear example is found in Polish (Rubach 2003). Polish soft labials (underlying or derived by palatalization) are decomposed into a labial and a glide, as in //karpʲ+a// → /karpja/ → [karpʲj+a] ‘carp’ gen.sg. The decomposition process is driven by the segment inventory constraint *Soft-Labial. Faithfulness to the soft //pʲ// is satisfied by breaking up the input into two output segments: a hard /p/ and a palatal glide /j/, where both are correspondents of //pʲ//. The correspondents collectively preserve the properties of the input, with [-back, +high] now located on /j/ rather than on the labial. An independent post-lexical process of surface palatalization applying to all consonants before /i/ and /j/ repalatalizes the labial: //pʲ// → /pj/ → [pʲj]. Thus, soft pʲ becomes hard /p/ only in order to revert to soft [pʲ] in the surface representation: a classic Duke of York derivation.Footnote 50
4.3 Level 2
Level 2 is the domain of the word, which means that structures involving roots and affixes are within the purview of level 2 phonology. In this section, I review the processes which operate in a different way on level 2 than on level 1. The constraints are exactly the same but their ranking may be different.
In some cases, the processes on level 2 are exactly the same as on level 1. Gliding into the coda is a case in point.
The constraint hierarchy from level 1 delivers the correct result, as shown by the evaluation of ˈnan+aj /nan+ai/ ‘father’ nom.dual in (51). Irrelevant constraints have been omitted.
Gliding into complex onsets highlights a point of distinction between levels 1 and 2 (see the data in (7b) in Section 1). Specifically, hiatus in CiV strings is resolved by insertion at level 1 but by gliding at level 2: //dialɛkt// → [di.ja.lɛkt] at level 1 (see (22) in Section 2) versus /rom.+ian/ → [ro.mjan] ˈRomjan ‘inhabitant of Rome’ at level 2. This change of strategy is expressed as the reranking of Dep-Seg above No-Coda.
It is this reranking that accounts for the syllabification of radiˈator ‘radiator’, //radiatɔr// → [ra.di.ja.tɔr] at level 1 versus ˈRom+jan ‘inhabitant of Rome’, /rom.ian/ → [ro.mjan] at level 2. To put it simply: due to the low ranking of Dep-Seg at level 1, it is ‘cheaper’ to insert [j] than to have a complex onset, so //dia// → [di.ja] is better than //dia// → *[dja] in radiˈator. At level 2, Dep-Seg is reranked high, which thwarts glide insertion. The next best option is to have a complex onset because No-Coda is ranked higher than *Complex-Onset, so /mia/ → [mja] is better than /mia/ → *[mi.ja] in ˈRom+jan. Importantly, ˈRom+jan as a word is not available on level 1 because level 1 is a root level, so all that is available is the root Rom.
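The effect of promoting Dep-Seg can be sketched with constraints computed directly from syllabified candidate strings. This is an illustration under stated assumptions: the candidate lists are hand-picked, Dep-Seg is approximated by the number of added segments, and the level 1 order of No-Coda and *Complex-Onset is assumed to be the same as at level 2, so that Dep-Seg is the only constraint that moves. The function names are hypothetical.

```python
# Sketch: the same constraints, two rankings. Only Dep-Seg changes position.
# Assumption: No-Coda >> *Complex-Onset holds at both levels.

VOWELS = set("aeiouɛɔ")
LEVEL1 = ["Onset", "No-Coda", "*Complex-Onset", "Dep-Seg"]
LEVEL2 = ["Onset", "Dep-Seg", "No-Coda", "*Complex-Onset"]

def onset_size(syllable):
    """Number of segments before the first vowel of the syllable."""
    for i, seg in enumerate(syllable):
        if seg in VOWELS:
            return i
    return len(syllable)

def marks(candidate, underlying):
    """Compute violation marks from a dot-syllabified candidate."""
    syllables = candidate.split(".")
    return {
        "Onset": sum(onset_size(s) == 0 for s in syllables),
        "No-Coda": sum(s[-1] not in VOWELS for s in syllables),
        "*Complex-Onset": sum(onset_size(s) > 1 for s in syllables),
        # Added segments only; a simplification of Dep-Seg.
        "Dep-Seg": max(0, len(candidate.replace(".", "")) - len(underlying)),
    }

def winner(underlying, candidates, ranking):
    """Return the candidate with the best violation profile under the ranking."""
    return min(candidates, key=lambda c: tuple(marks(c, underlying)[k] for k in ranking))

print(winner("dialɛkt", ["di.a.lɛkt", "dja.lɛkt", "di.ja.lɛkt"], LEVEL1))
print(winner("romian", ["ro.mi.an", "rom.jan", "ro.mi.jan", "ro.mjan"], LEVEL2))
```

Under the level 1 order, insertion beats the complex onset; under the level 2 order, the complex onset beats both insertion and resyllabification of the root-final consonant into the coda.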
In sum, the evaluation of ˈRom+jan ‘inhabitant of Rome’ at level 2 is as follows.
The level 2 preference for gliding, /CiV/ → [CjV] as in (53), appears to create a problem for the data cited in (7) in Section 1, such as ˈrabi ‘rabbi’ nom.sg. – ˈrabij+a gen.sg. and ˈprofi ‘professional’ nom.sg. – ˈprofij+a gen.sg. – ˈprofij+ow+y adj nom.sg. The problem is how to exclude the undesired candidate *[rabja] from the input /rabi+a/ without losing the insight that ˈRomjan uses gliding as a hiatus resolution strategy: /rom+ian/ → [ro.mjan] in (53).
The solution to this dilemma is already in place and does not require changes in the constraint ranking beyond the Dep-Seg reranking in (52) that is necessary independently. The examples under discussion exhibit a structural difference and that is the key to the problem. The ˈrabi part of the level 2 input /rabi+a/ is a root, so it was processed at level 1 where it was syllabified as /ra.bi/. Crucially, the /i/ is a nucleus when it enters level 2. The constraint Ident-Nuc, which mandates the retention of the nucleus in the output, thwarts gliding at level 2 in exactly the same way as it thwarts gliding of prespecified inputs at level 1, such as kokaˈin ‘cocaine’, as in (33) in Section 2. This constraint is undominated in Upper Sorbian and, consequently, ranked above Onset at all levels. The evaluation of /rabi+a/, ˈrabij+a, the gen.sg. of ˈrabi ‘rabbi’ proceeds as follows.
In contrast to the //i// in ˈrabij+a, the /i/ of //ian// in ˈRom+jan was not available at level 1 because it is part of the suffix, not of the root, and suffixes are processed at level 2 because level 1 is the bare root level. Consequently, it was not syllabified as a nucleus and hence escapes the jurisdiction of Ident-Nuc at level 2. Without the protection of Ident-Nuc, the /i/ of /ian/ falls prey to gliding, exactly as presented in (53). The distinction between ˈrabij+a and ˈRom+jan is a classic cyclic effect: we need to process the internal constituent (here the root //rabi//) before we process the external constituent (here the suffixed structure //rabi+a//).
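The cyclic effect can be sketched by recording which segments were syllabified as nuclei at level 1 and consulting that record at level 2. The sketch is a simplification under stated assumptions: only three candidates are considered (gliding, j-insertion, and the faithful form with hiatus), *Complex-Onset is omitted because it ranks below Dep-Seg at level 2, and the function names are hypothetical.

```python
# Sketch of the cyclic effect: root vowels are nuclei after level 1 and are
# protected by Ident-Nuc at level 2; suffix vowels enter level 2 unprotected.
# Assumption: the candidate set is limited to gliding, j-insertion, and hiatus.

VOWELS = set("aeiouɛɔ")
RANKING = ["Ident-Nuc", "Onset", "Dep-Seg"]

def level1_nuclei(root):
    """Positions of root vowels, which are syllabified as nuclei at level 1."""
    return {i for i, seg in enumerate(root) if seg in VOWELS}

def level2(root, suffix):
    """Resolve the first /i/ + vowel hiatus in the root + suffix string."""
    form = root + suffix
    nuclei = level1_nuclei(root)  # the suffix was not visible at level 1
    k = next(i for i in range(len(form) - 1)
             if form[i] == "i" and form[i + 1] in VOWELS)
    candidates = {
        form[:k] + "j" + form[k + 1:]: {"Ident-Nuc": int(k in nuclei)},  # gliding
        form[:k + 1] + "j" + form[k + 1:]: {"Dep-Seg": 1},               # j-insertion
        form: {"Onset": 1},                                              # hiatus
    }
    return min(candidates,
               key=lambda c: tuple(candidates[c].get(x, 0) for x in RANKING))

print(level2("rabi", "a"))   # root /i/ is a nucleus: gliding is blocked
print(level2("rom", "ian"))  # suffix /i/ is not a nucleus: gliding is free
```

The same ranking yields insertion for the root-final vowel and gliding for the suffix-initial vowel; the only difference between the two inputs is what level 1 has already done to the root.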
The analysis of //kanu+a// → [kanuʋa] ‘canoe’ (gen.sg.) mirrors that of /ra.bi+a/ → [ra.bi.ja] ‘rabbi’ gen.sg., but there is one difference: w-insertion is no longer blocked by Onset-w due to the following reranking.
Level 2 phonology tells a new story about mid vowels. From the point of view of glide insertion, mid vowels were inactive at level 1 because the high-ranking of *[+high] prohibits the addition of [+high], leaving spreading from an existing [+high] on high vowels as the only option. Consequently, clusters such as //ɛɔ// in geoˈgraf ‘geographer’ resolve their hiatus at level 1 by inserting [ʔ] rather than [j]: //gɛɔgraf// → /gɛʔɔgraf/. The same clusters derived at level 2 by affixation spawn [j], which is what we find in the paradigm for ˈstere+o [stɛrɛjɔ] ‘stereo’ nom.sg., ˈsterej+a [stɛrɛja] gen.sg., ˈsterej+om [stɛrɛjɔm] instr.sg., and so forth. In terms of the constraint system, this means that the ranking *[+high] >> *ʔ from level 1 has changed to *ʔ >> *[+high] at level 2, so adding [+high] is now better than inserting a glottal stop. Glide insertion does not affect /gɛʔɔgraf/ at level 2 because there is no hiatus. The glottal stop is ultimately deleted at level 3, yielding the attested surface form [gɛɔgraf]. The hiatus produced by ʔ-deletion is not repaired because at level 3 Onset is reranked below Dep-Seg, thwarting insertion.
As noted, relevant for the derivation of stere+o ‘stereo’ (nom.sg.) is the reranking in (56).
The evaluation of ˈstere+o nom.sg. provides evidence for this change. Recall from Section 2 that glides from mid vowels must add the feature [+high] in order to comply with the requirement that glides must be high. The addition of [+high] is penalized by *[+high]. Irrelevant constraints have been omitted.
The absence of j-insertion in words such as reˈal ‘real’ and oceˈan ‘ocean’ is accounted for as follows. Words containing vowel clusters undergo ʔ-insertion at level 1, generating //rɛal// → [rɛ.ʔal] and //ɔʦɛan// → [ʔɔ.ʦɛ.ʔan]. These are the attested surface forms of reˈal ‘real’ and oceˈan ‘ocean’. In contrast, ˈstere+o did not have a vowel cluster at level 1 because the root is //stɛrɛ//, so the race was won by the faithful candidate /stɛ.rɛ/ which does not violate Onset. The hiatus problem first occurs at level 2, at which suffixes enter into the game: the o //ɔ// of the nom.sg. suffix creates a hiatus: /stɛrɛ+ɔ/. But now the constraint ranking is *ʔ >> *[+high], so j-insertion is preferred to ʔ-insertion, as shown in (57).
The question is how words such as geoˈgraf [gɛ.ɔ.graf] whose surface forms exhibit hiatus escape j-insertion at level 2. The solution to this problem lies with the overgenerating power of ʔ-insertion at level 1. Recall that [ʔ] is inserted by default if a syllable has no onset (see (49c) earlier in this section). Consequently, geoˈgraf emerges from level 1 as /gɛ.ʔɔ.graf/. Ultimately, the /ʔ/ must be deleted, but if we delay the deletion till level 3, we have an answer to the question of why the eo in geoˈgraf does not generate a glide: Onset is not violated because geoˈgraf retains /ʔ/ at level 2. It may appear that this analysis does not really explain the absence of j-insertion because the problem reemerges at level 3, at which /ʔ/ is deleted and [ɛɔ] constitutes a hiatus. The problem is only apparent. I argue in Section 4.4 that level 3 phonology does not admit any insertion at all, so the hiatus in geoˈgraf is tolerated.
Level 2 bears witness to w-insertion, a process that was prohibited at level 1 by the high-ranking Onset-w. The facts and the analysis are parallel to those presented for j-insertion spawned by mid vowels and by /i/ in stere+o nom.sg. and rabij+a gen.sg.
We see w-insertion in words such as kanu ‘canoe’ nom.sg. – kanuw+a gen.sg., exemplified in (11b) in Section 1. Relevant here is the gen.sg. form. Its representation at the input to level 2 is /ka.nu.+a/, where /ka.nu/ is the winner from level 1 and a is the gen.sg. ending first processed on level 2. The vowel cluster in /ka.nu.+a/ triggers w-insertion, yielding /ka.nu.wa/. This is a different strategy from that exhibited at level 1, where the same vowel cluster //ua// triggered ʔ-insertion, as in manuˈal ‘manual’: //manual// → [ma.nu.ʔal]. The change of the strategy—ʔ-insertion at level 1 but w-insertion at level 2—is expressed as the reranking of Onset-w >> *ʔ at level 1 to *ʔ >> Onset-w at level 2.
The reason why /kanu+a/ did not receive a glottal stop at level 1 is the same as in the case of /rabi+a/ and /stɛrɛ+ɔ/ analyzed earlier in this section: at level 1 the evaluation is limited to roots and the root //kanu// does not violate Onset. The violation of Onset becomes an issue at level 2, at which suffixes, here the gen.sg. a, are within the purview of the constraint system. But at level 2, the ranking is *ʔ >> Onset-w, so w-insertion rather than ʔ-insertion is the strategy for hiatus resolution. The details of this analysis are shown in (59). Recall from the evaluation of /rabi+a/ in (54) that Ident-Nuc bans the candidate exhibiting gliding. I omit *Multi because it is irrelevant.
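The effect of the reranking can be illustrated with a minimal sketch of strict-domination evaluation. The violation marks below are simplifying assumptions made for the illustration, not a reproduction of the tableaux in (49) or (59). In particular, Onset-w is assumed to be violated by any candidate with an inserted [w], and the constraints are treated as totally ordered even where the analysis leaves them unranked.

```python
# Sketch of OT evaluation under strict domination.
# All violation counts are illustrative assumptions, not the author's tableaux.

def evaluate(candidates, ranking):
    """Return the optimal candidate.

    candidates: dict mapping a candidate string to a dict
                {constraint name: number of violations}
    ranking:    list of constraint names, highest-ranked first
    Strict domination is modelled as lexicographic comparison of
    violation tuples ordered by the ranking.
    """
    def profile(cand):
        return tuple(candidates[cand].get(c, 0) for c in ranking)
    return min(candidates, key=profile)

# Level 1: the hiatus is inside the root //manual//.
manual = {
    "ma.nu.al":  {"Onset": 1},      # faithful, onsetless syllable
    "ma.nu.ʔal": {"*ʔ": 1},         # glottal stop insertion
    "ma.nu.wal": {"Onset-w": 1},    # w-insertion (assumed to violate Onset-w)
    "ma.nwal":   {"Ident-Nuc": 1},  # gliding of the nucleus
}

# Level 2: the hiatus arises across the root-suffix boundary /ka.nu.+a/.
kanua = {
    "ka.nu.a":  {"Onset": 1},
    "ka.nu.ʔa": {"*ʔ": 1},
    "ka.nu.wa": {"Onset-w": 1},
    "ka.nwa":   {"Ident-Nuc": 1},
}

LEVEL1 = ["Ident-Nuc", "Onset", "Onset-w", "*ʔ"]   # Onset-w >> *ʔ
LEVEL2 = ["Ident-Nuc", "Onset", "*ʔ", "Onset-w"]   # *ʔ >> Onset-w

print(evaluate(manual, LEVEL1))  # ma.nu.ʔal
print(evaluate(kanua, LEVEL2))   # ka.nu.wa
```

Under these assumptions the two inputs have the same candidate set and the same marks; only the relative order of Onset-w and *ʔ decides between ʔ-insertion and w-insertion.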
The final question is how to avoid w-insertion in words that exhibit an [ua] hiatus in the surface representation, such as ˈdual ‘dual’ nom.sg. The answer here is the same as with the absence of j-insertion in geoˈgraf: ˈdual enters level 2 with a glottal stop due to the overgenerating power of ʔ-insertion at level 1 (see the evaluation in (49b) in Section 4.2). The input to level 2, /du.ʔal/, has its faithful output /du.ʔal/Footnote 52 as the winner because /du.ʔal/ does not violate Onset and hence there is no incentive to make changes to this representation. At level 3, /du.ʔal/ loses its glottal stop but then all insertion is prohibited, so [du.al] is the predicted surface representation, the correct result.
The analysis of w-insertion triggered by mid vowels, as in ˈbo+a ‘boa’, /bo+a/ → /bo.wa/, uses the mechanisms and the arguments familiar from the analyses of stere+o and kanuw+a. Examples of roots with ʔ-insertion contrasting with w-insertion in ˈboa are also parallel: in kreˈola ‘Creole’, the glottal stop is retained in the surface representation [krɛ.ʔɔ.la] because the syllable is stressed, whereas in meteoˈrit ‘meteorite’, where eo is unstressed, the representation with a glottal stop /mɛ.tɛ.ʔɔ.rit/ is repaired at level 3 to yield the attested surface form [mɛ.tɛ.ɔ.rit].
The evaluation of ˈbo+a in (60) shows the details of the analysis. Irrelevant constraints have been omitted.
The winners with /w/ in (59) and (60) are processed further at level 3, where /w/ undergoes consonantization: w → ʋ.
A reviewer points out that the feature characterization of approximants is controversial. In my analysis, [w] is represented as [u] on the melodic tier, but the [u] is not linked to a mora and hence is not a nucleus. The derivation w → ʋ occurs on the melodic tier only. Since the melodic segment [u] is vocalic and [ʋ] is not, I assume that the change is from [-cons] to [+cons] and hence I call this process consonantization.
4.4 Level 3
Level 3 phonology undertakes two actions. First, it repairs the inputs that have a glottal stop in unstressed syllables by deleting the glottal stop and, second, it turns the glide /w/ into an approximant by consonantization: w → ʋ.
The glottal stops inherited from level 2 are deleted in unstressed syllables in (61a) but retained in stressed syllables in (61b) and word-initially in (61c).
The deletion in (61a) requires that *ʔ be reranked above Max-Seg.
Given the reranking, /xa.ʔi/ → [xa.i] in arˈchaiski ‘archaic’ is optimal, which is the correct result. The /i/ does not glide to [j] because Ident-Nuc is undominated at all levels, so it outranks Onset at level 3.
Finally, the words that lose their glottal stop cannot be permitted to satisfy Onset by inserting some other segment, for instance, [w] in ˈdual and [j] in geoˈgraf. That is, the candidates *[du.wal] and *[gɛ.jɔ.graf] must lose to the attested surface forms [du.al] and [gɛ.ɔ.graf]. This result is obtained if Dep-Seg is reranked above Onset, thwarting all insertion.
The evaluations at level 3 for arˈchaiski, ˈdual and geoˈgraf are displayed in (64). Since, as just said, level 3 does not admit glottal stops, gliding or insertion, I will assume at this point that the constraints *ʔ and Dep-Seg are undominated (but see (66), where I modify this claim).
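The level 3 ranking described above can be sketched in the same way. The candidate sets and violation marks are assumptions made for the illustration and do not reproduce (64). The relative order of Max-Seg and Onset is not stated in the text, so the order used here is arbitrary, and the positional faithfulness constraints are left out.

```python
# Sketch of level 3 evaluation: *ʔ and Dep-Seg are treated as undominated,
# so the glottal stop is deleted and no other segment may be inserted.
# Violation marks are illustrative assumptions, not the tableaux in (64).

def evaluate(candidates, ranking):
    """Pick the candidate with the lexicographically smallest violation tuple."""
    def profile(cand):
        return tuple(candidates[cand].get(c, 0) for c in ranking)
    return min(candidates, key=profile)

# Ident-Nuc, *ʔ and Dep-Seg on top; Max-Seg and Onset below them.
LEVEL3 = ["Ident-Nuc", "*ʔ", "Dep-Seg", "Max-Seg", "Onset"]

dual = {
    "du.ʔal": {"*ʔ": 1},                     # faithful, keeps the glottal stop
    "du.al":  {"Max-Seg": 1, "Onset": 1},    # deletion, hiatus tolerated
    "du.wal": {"Max-Seg": 1, "Dep-Seg": 1},  # deletion plus w-insertion
}

geograf = {
    "gɛ.ʔɔ.graf": {"*ʔ": 1},
    "gɛ.ɔ.graf":  {"Max-Seg": 1, "Onset": 1},
    "gɛ.jɔ.graf": {"Max-Seg": 1, "Dep-Seg": 1},  # deletion plus j-insertion
}

# Relevant substring of arˈchaiski.
xai = {
    "xa.ʔi": {"*ʔ": 1},
    "xa.i":  {"Max-Seg": 1, "Onset": 1},
    "xaj":   {"Max-Seg": 1, "Ident-Nuc": 1},  # deletion plus gliding
}

for tableau in (dual, geograf, xai):
    print(evaluate(tableau, LEVEL3))
# du.al
# gɛ.ɔ.graf
# xa.i
```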
The retention of glottal stops in the surface representations (61b–c), such as [kɔ.ka.ʔin] and [ʔa.prɨl], is an effect of the following constraints.
The constraints in (65a–b) are not stipulated for the purposes of this analysis. They exist anyway because they are positional faithfulness constraints relativized to the two well-known loci of privilege (Trubetzkoy Reference Trubetzkoy1939): stressed syllables and initial syllables (see Beckman Reference Beckman1997, Reference Beckman1999; Casali Reference Casali1997). Ranked above *ʔ, Max-Seg[stressed σ] and Max-Seg[initial σ] thwart ʔ-deletion, as shown in (66). The example is oceˈan [ʔɔ.ʦɛ.ʔan] ‘ocean’, which exhibits glottal stops in both positions of privilege. The stressed syllable is indicated with an accent ˈ.
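The distribution that results from ranking Max-Seg[stressed σ] and Max-Seg[initial σ] above *ʔ can be restated procedurally. The sketch below is not an OT evaluation; it only states the predicted outcome. The function name, the representation of the input as a list of syllables, and the stress index are assumptions introduced for the illustration.

```python
# Procedural restatement of the level 3 outcome for glottal stops:
# /ʔ/ survives in the initial syllable and in the stressed syllable,
# and is deleted elsewhere. Names and data layout are hypothetical.

GLOTTAL = "ʔ"

def level3_glottal_stops(syllables, stressed_index):
    """Return the level 3 output for a syllabified level 2 input.

    syllables:      list of syllable strings, e.g. ["gɛ", "ʔɔ", "graf"]
    stressed_index: 0-based index of the stressed syllable
    """
    output = []
    for i, syl in enumerate(syllables):
        protected = (i == 0) or (i == stressed_index)
        if syl.startswith(GLOTTAL) and not protected:
            syl = syl[len(GLOTTAL):]  # delete the unprotected glottal stop
        output.append(syl)
    return ".".join(output)

print(level3_glottal_stops(["ʔɔ", "ʦɛ", "ʔan"], 2))       # ʔɔ.ʦɛ.ʔan
print(level3_glottal_stops(["kɔ", "ka", "ʔin"], 2))       # kɔ.ka.ʔin
print(level3_glottal_stops(["gɛ", "ʔɔ", "graf"], 2))      # gɛ.ɔ.graf
print(level3_glottal_stops(["mɛ", "tɛ", "ʔɔ", "rit"], 3)) # mɛ.tɛ.ɔ.rit
print(level3_glottal_stops(["du", "ʔal"], 0))             # du.al
```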
In addition to regulating the distribution of glottal stops, level 3 phonology spells out the glide /w/ as a labial approximant. The driver for the change is the segment inventory constraint *w.
This constraint plays no role at levels 1 and 2 because it is bottom-ranked. Importantly, *w is ranked below Ident[-cons], so consonantization cannot occur in the winning candidate.
At level 3, the constraints are reranked to *w >> Ident[-cons], opening the way to /w/ changing into a different segment.
The mechanics of the w → ʋ spell-out need to be specified because /w/ should not surface as, for example, [z] or [r]. The spell-out is controlled by Ident constraints, specifically, by Ident-Lab, Ident[+sonor] and Ident[+contin].
The ranking of these constraints is not relevant as long as they outrank Ident[-cons]. For that matter, they could be the undominated constraints. The details of the evaluation for ˈkanuw+a ‘canoe’ gen.sg., /ka.nu.wa/ → [ka.nu.ʋa], are laid out in (71).
Ident-Lab narrows down the pool of acceptable outputs to labials. Ident[+contin] (or Ident[-nas]) excludes candidate (71d) since nasals are [-contin]. Ident[+sonor] makes sure that /w/ does not change into a labio-dental obstruent [v], which is what we see in (71c).
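The way the Ident constraints filter the candidate segments can be sketched with a small feature table. The binary feature values and the candidate set (including [z]) are assumptions made for the illustration and do not reproduce (71). The Ident constraints are modelled as penalizing only a change away from the input value named in the constraint.

```python
# Sketch of the w → ʋ spell-out as a choice among candidate segments.
# Feature values are simplified assumptions, not the author's representations.

FEATURES = {
    "w": {"lab": True,  "son": True,  "cont": True,  "cons": False},
    "ʋ": {"lab": True,  "son": True,  "cont": True,  "cons": True},
    "v": {"lab": True,  "son": False, "cont": True,  "cons": True},
    "m": {"lab": True,  "son": True,  "cont": False, "cons": True},
    "z": {"lab": False, "son": False, "cont": True,  "cons": True},
}

def violations(inp, out):
    """Violation marks incurred by mapping input segment inp to output out."""
    fi, fo = FEATURES[inp], FEATURES[out]
    return {
        "*w": int(out == "w"),
        "Ident-Lab": int(fi["lab"] != fo["lab"]),
        "Ident[+sonor]": int(fi["son"] and not fo["son"]),
        "Ident[+contin]": int(fi["cont"] and not fo["cont"]),
        "Ident[-cons]": int((not fi["cons"]) and fo["cons"]),
    }

def spell_out(inp, ranking):
    """Return the optimal output segment under strict domination."""
    def profile(out):
        marks = violations(inp, out)
        return tuple(marks[c] for c in ranking)
    return min(FEATURES, key=profile)

# Levels 1 and 2: *w is bottom-ranked, so /w/ surfaces unchanged.
LEVELS_1_2 = ["Ident-Lab", "Ident[+sonor]", "Ident[+contin]", "Ident[-cons]", "*w"]
# Level 3: *w >> Ident[-cons], with the other Ident constraints above Ident[-cons].
LEVEL_3 = ["*w", "Ident-Lab", "Ident[+sonor]", "Ident[+contin]", "Ident[-cons]"]

print(spell_out("w", LEVELS_1_2))  # w
print(spell_out("w", LEVEL_3))     # ʋ
```

With these feature values, [v] is excluded by Ident[+sonor], [m] by Ident[+contin], and [z] by Ident-Lab, which leaves [ʋ] as the only candidate that satisfies *w while violating nothing but Ident[-cons].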
To conclude, level 3 phonology regulates the distribution of glottal stops and spells out /w/ as [ʋ].Footnote 54
5. Conclusion
Upper Sorbian exhibits a complex Onset-driven conspiracy that involves a number of disparate processes: gliding into the onset, gliding into the coda, j-insertion, ʔ-insertion, and w-insertion. In some contexts, none of these processes can apply, so the optimal output is the one exhibiting hiatus. This is an impressively large number of surface configurations that the grammar is required to generate. The added difficulty is that, first, the processes in question operate differently in different constituents and, second, the derivation can be sensitive to properties that extend beyond segmental phonology by including crucial reference to the initial position and the occurrence in a stressed syllable. This conundrum is easily solved in Stratal/Derivational OT, which admits three levels of evaluation: the stem level, the word level and the post-lexical level. The arguments for levels are drawn not only from opacity, a classic source for such arguments, but also from cyclic effects and from the derivation of types of segments that are different at different levels.
The processing of words such as ˈrabij+a [ija] gen.sg. and ˈrabij+at+n+y [ija] adj must be cyclic because it requires that syllable structure be assigned first in the domain of the root and only then in the domain of the word (see (54) in Section 4). The cyclic effect is derived because Upper Sorbian defines the stem at level 1 as the morphological root. This is a new finding, not reported for any other language, so Upper Sorbian is of interest from a typological perspective.Footnote 55
Segment inventories are different at different levels of derivation. Specifically, at levels 1 and 2, the epenthetic sonorant segments are /j/ and /w/. The /w/ turns into a consonant at level 3, a situation that is easily represented in Stratal/Derivational OT but is impossible to represent in classic OT.
Directionality of glide insertion requires postulating a new constraint, *Multi, that assigns a violation to candidates containing the glide and its spawning vowel inside one syllable. This is exactly the reverse of what Itô & Mester (Reference Itô, Mester, Kager, van der Hulst and Zonneveld1999) meant to achieve with their Crisp Edge constraint.
Positional faithfulness is superior to positional markedness. The analysis has argued for Max-Seg relativized to stressed syllables and to word-initial syllables, exactly as predicted by positional faithfulness. Finally, counter to McCarthy (Reference McCarthy1999), Duke of York derivations are attested in OT and need to be recognized as legitimate, as is done by Stratal/Derivational OT.