1. Introduction
Modern Icelandic is legendary in the syntactic literature for having non-nominative subject verbs of different types. This includes verbs which select for dative subjects and nominative objects, so-called Dat-Nom verbs. What is less well known is that Dat-Nom verbs in Icelandic divide into two classes with respect to argument structure and the syntactic behaviour of the arguments. One class of Dat-Nom verbs consistently occurs in the Dat-Nom argument structure construction, while another class of verbs alternates between the Dat-Nom and the Nom-Dat argument structure construction (cf. Bernódusson 1982, Jónsson 1997–1998, Barðdal 1999, 2001, 2023:Ch. 3, Platzack 1999, Sigurðsson 2006a, Rott 2013, 2016, Wood & Sigurðsson 2014, Barðdal, Eythórsson & Dewey 2014, 2019). The difference in behaviour between alternating and non-alternating verbs is illustrated by means of the verbs nægja ‘find/be sufficient’ in (1) and líka ‘like’ in (2) below. The verb nægja, being an alternating verb, allows both verbal arguments to take clause-initial position, thus confirming their status as syntactic subjects (cf. 1a–b). At the same time, the other argument is realised in the postverbal slot, which is reserved for objects (for a list of the accepted subject tests in Icelandic, see Section 2 below). [Footnote 1]
In contrast, in the examples in (2), only the dative of líka ‘like’ may occupy the preverbal position and the nominative the postverbal position (2a), and not vice versa (2b), as is the case with nægja ‘find/be sufficient’ in (1).
By applying a host of accepted subject tests in Icelandic, Barðdal (1999, 2001) was the first to show that either argument of alternating verbs may indeed function as the syntactic subject or the syntactic object. Since then, further work has been carried out on the nature of alternating Dat-Nom/Nom-Dat verbs in Icelandic, including a systematic comparison between the syntactic behaviour of the arguments of classical Dat-Nom verbs and the alternating Dat-Nom/Nom-Dat verbs in Icelandic, also compared to German (cf. Barðdal, Eythórsson & Dewey 2014, 2019, Barðdal 2023, Somers & Barðdal 2023). This work further corroborates the dichotomy between classical Dat-Nom verbs and alternating Dat-Nom/Nom-Dat verbs in Icelandic.
However, what is missing from the literature is a systematic study of how frequently alternating verbs instantiate the Nom-Dat construction and the Dat-Nom construction, respectively, in Icelandic texts. In other words, do all alternating Dat-Nom/Nom-Dat verbs instantiate the two argument structure constructions to the same degree or are the frequencies skewed in favour of one of the argument structure constructions over the other? Further, which factors determine the speakers’ choice of one of the two argument structure constructions, Dat-Nom or Nom-Dat, over the other? One hypothesis is that, other things being equal, the Dat-Nom construction is selected when the dative is topical and that the reverse Nom-Dat construction is selected when the nominative is topical (Barðdal 2001:65; Barðdal, Eythórsson & Dewey 2014, 2019). We return to this point in Sections 4.3 and 5.3 below.
A first attempt at an investigation of this type was carried out by Rott (2013), who extracted data for eight verbs, i.e. four classical Dat-Nom verbs and four alternating Dat-Nom/Nom-Dat verbs. Rott’s study is certainly commendable in that it is the first to lend corpus-based support to the ‘alternating predicate puzzle’, but it nevertheless suffers from several drawbacks. First, Rott only harvests 50 tokens per verb, and his full dataset only comprises 372 observations. In the present study, however, the number of tokens per verb is increased to 200. Second, Rott also includes clausal nominatives, which are de facto considerably longer than nominal arguments, thus being more prone to occurring later in the clause than nominal arguments. In fact, this is exactly what Rott’s results show, as 82 out of 87 clausal nominatives occur in postverbal position. This skewness, in turn, greatly inflates the number of Dat-Nom attestations in his sample.
Third, Rott’s (2013) study does not specify word order distributions per verb lemma. He thus posits a verb class effect without actually demonstrating that such an effect exists in the first place. Finally, Rott also does not elaborate on any basic interactions between the argument slots. At least for alternating predicates, he specifies per word order pattern (i.e. Dat-Nom or Nom-Dat) how often each argument is realised as either a full NP, a pronoun, or a clause. However, he fails to disclose how often each of these co-occur with one another, which also makes it difficult to properly assess the scope of his results.
One study that has found homogeneous results for Icelandic alternating verbs, thus corroborating their status as an actual verb class with uniform properties, is that of Bornkessel-Schlesewsky et al. (2011). They have been able to show that alternating verbs consistently trigger a different brain response compared to non-alternating Dat-Nom verbs. However, as was the case for Rott (2013), it is unclear which exact verb types this study is based on, so it is difficult to gauge the scope of these findings. Nevertheless, the uniform electrophysiological response Bornkessel-Schlesewsky et al. have been able to elicit clearly confirms the status of alternating verbs as a syntactically uniform verb class.
The goal of this article is to provide a systematic study of the degree to which the two argument structure constructions are instantiated by alternating verbs in Icelandic. This entails a study which compares nouns with nouns, pronouns with pronouns, and nouns with pronouns. It is also important that both arguments be (pro)nominally realised as opposed to one of the arguments being realised as a clause. Such a study is better designed to control for different factors that may determine the speakers’ choice of one argument structure construction over the other.
In the remainder of this article we present a corpus-based study of alternating Dat-Nom/Nom-Dat verbs in Icelandic texts, extracted from the Icelandic Web 2020 corpus (isTenTen20, Jakubíček et al. 2013), which consists of 520 million words. In order to establish a baseline with which our findings for alternating verbs may be compared, we first present results for both ordinary Nom-Dat verbs and non-alternating Dat-Nom verbs in Icelandic. Our research is based on 15 different verb types, five for each verb class under study. For these, 200 eligible instances are extracted for each lemma, resulting in a total of 3,000 observations. We then proceed to model the data statistically for four out of five alternating verbs, leaving out one outlier. Such an in-depth analysis of the data is crucial in understanding the factors steering the alternation.
This article is organised as follows. In Section 2 we present our object of study, including an overview of the three syntactic verb classes, each selecting for a different argument structure, i.e. the Nom-Dat construction, the Dat-Nom construction, and the alternating Dat-Nom/Nom-Dat constructions. Section 3 gives an overview of the methodology applied, whereas Section 4 presents the results from our study: a baseline for ordinary Nom-Dat verbs and classical Dat-Nom verbs, and the statistics for alternating Dat-Nom/Nom-Dat verbs in relation to these baselines. In Section 5 we single out a set of four alternating verbs, whose behaviour we describe using a logistic regression model. Section 6 summarises the main content and conclusions of the article.
2. Object of study
It is a well-established fact of Icelandic that the subject status of a verbal argument is not necessarily associated with nominative case marking (Andrews 1976, Thráinsson 1979, Zaenen, Maling & Thráinsson 1985, Sigurðsson 1989, Jónsson 1996, Barðdal 2001, inter alia). For these so-called oblique subjects, at least the following nine subjecthood diagnostics have been identified (Andrews 1976, Thráinsson 1979, Zaenen, Maling & Thráinsson 1985, Sigurðsson 1989, Jónsson 1996, Barðdal 2001, 2006, 2023, Barðdal, Eythórsson & Dewey 2019, inter alia):
- first position in declarative clauses
- subject–verb inversion
- first position in subordinate clauses
- subject-to-object raising
- subject-to-subject raising
- long-distance reflexivisation
- clause-bound reflexivisation
- conjunction reduction
- control infinitives
It has been demonstrated that Icelandic oblique subjects pass all of the aforementioned tests, usually referred to in the literature as behavioural tests, as opposed to coding tests (cf. Keenan 1976). Note that the coding test involving subject–verb agreement is not applicable to oblique subjects, as is well known in the literature (Sigurðsson 1990–1991, 2004, inter alia), since agreement is only found with nominative arguments in Icelandic, Germanic, and the Indo-European languages in general (cf. Barðdal 2023:97–98). Moreover, the tests in the bulleted list above confirm the status of oblique subjects as behavioural subjects in Icelandic. In this article we intend to lend corpus-based support to the first and the third behavioural test, i.e. word order distribution in main and subordinate clauses, applying them to Dat-Nom and Dat-Nom/Nom-Dat verbs in Icelandic.
It has already been mentioned above that Dat-Nom verbs come in two different guises: non-alternating Dat-Nom verbs, and alternating Dat-Nom/Nom-Dat verbs. The latter class, which allows for two diametrically opposed case frames, was first discovered by Bernódusson (1982), and it has since been the subject of several studies (Jónsson 1997–1998, Barðdal 1999, 2001, Platzack 1999, Sigurðsson 2006a, Rott 2013, 2016, Barðdal, Eythórsson & Dewey 2014, 2019, Wood & Sigurðsson 2014, Somers & Barðdal 2022, 2023, inter alia). In this article, we refer to them either as ‘alternating Dat-Nom/Nom-Dat verbs’, or as nægja-verbs.
Verbs of the nægja type allow both the dative as well as the nominative to take on the role of subject, yet not at the same time. This is manifested in the fact that each of the aforementioned arguments independently passes the subject tests mentioned above, so that, when the dative behaves as the subject, the nominative takes on the role of object, and vice versa (cf. Barðdal 1999, 2001, 2023:Ch. 3, Barðdal, Eythórsson & Dewey 2014, 2019, where it is shown that either argument passes all the subject tests in Icelandic). Examples (1a–b), here repeated as (3a–b), illustrate this phenomenon, in that they show that both arguments may take initial position in declarative clauses without there being a change in meaning or focus.
It is clear, however, that the two word orders reflect two different construals of the same event, namely that an experiencer directs his or her attention to a stimulus in (3a) and that a stimulus affects an experiencer in (3b) (cf. Barðdal 2001, 2023:Ch. 3, Barðdal, Eythórsson & Dewey 2019). We take this to be a consequence of the fact that the relevant verbs are force-dynamically neutral in the sense of Talmy (1976), meaning that there is no causal chain found in the event structure of these verbs, as neither of the participants acts upon the other. In other words, there is no causation involved (see also Croft 2012:233, Barðdal 2023:Ch. 3).
Because of their dyadic nature, Barðdal (2001, 2023) and Barðdal, Eythórsson & Dewey (2019) have suggested that alternating verbs of this type in fact instantiate two different argument structure constructions: a Nom-Dat construction that licenses a nominative subject and a dative object, i.e. the stimulus affects the experiencer construal, and a Dat-Nom construction that licenses a dative subject and a nominative object, i.e. the experiencer directs his/her attention towards a stimulus construal. Our approach is fully in line with this analysis, as we subscribe to the view that the subject is the first argument of the argument structure. This is a theory-neutral definition of subject, as all theoretical frameworks employ argument structure or subcategorisation frames in their machinery.
Returning to the examples in (3a–b) above, what speaks against a simple topicalisation analysis is the positioning of the verbal arguments relative to the conjugated verb hafði ‘had’. In Icelandic the subject must be adjacent to the conjugated verb (unless it is either indefinite or heavy): that is, it must either precede or follow the verb. This is because of the so-called verb-second constraint, which also operates in other Germanic languages (cf. Eythórsson 1995, Axel 2007:27–67, Harbert 2007:398–415, Thráinsson 2007:40–45, inter alia). Had either (3a) or (3b) been a topicalisation of the other, then the nominative in (3a) and the dative in (3b) would have been realised in between the conjugated verb hafði ‘had’ and the past participle nægt ‘sufficed’. This is not the case, though, since both the nominative in (3a) and the dative in (3b) are realised after the non-finite verb, which is an object position.
Consider now the examples in (4a–b) below, which show that attempts at topicalising the object with alternating verbs result in ungrammatical structures in Icelandic (for one exception to this, see (17a–c) below). As already stated above, this involves an inversion of the subject and the verb. Hence, the intended subject argument immediately follows the verb in these examples and the intended object argument occurs in first position.
Interestingly, not all Dat-Nom verbs allow for the type of alternation shown in (3a–b), as is already mentioned above. Some, such as líka ‘like’, only license dative subjects; their nominative argument invariably behaves as an object with regard to word order distribution. The fact that, for these verbs, subject status is unequivocally associated with the dative case is illustrated by examples (5a–b).
Recall that (5b) is ungrammatical because the subject barninu ‘the child’ and the conjugated verb hafði ‘had’ have been separated from one another by the past participle líkað ‘liked’. If the nominative is realised preverbally for information-structural reasons, the dative, being the syntactic subject, breaks open the verbal group and is once again reunited with the conjugated verb, as shown in (6).
Hence, the example in (6) represents topicalisation and not neutral word order; that is, it is an example of topicalisation that fronts a non-subject constituent to initial position for emphasis (Thráinsson 2007:342). Since the dative subject and the conjugated verb have now been reunited, the example is grammatical. Verbs, like líka, that only allow their dative argument to pass the aforementioned subject tests are henceforth called ‘non-alternating Dat-Nom verbs’, but we will also refer to them as líka-verbs in the remainder of this article. Thus, the default argument structure construction that líka-verbs instantiate is the Dat-Nom construction. The linear nominative-first order with líka-verbs is only used for information-structural purposes (Barðdal, Eythórsson & Dewey 2019). In essence, this means that líka-verbs have lexicalised only one of the two construals mentioned above, namely the construal where the experiencer directs his/her attention to a stimulus. [Footnote 2]
Both alternating Dat-Nom/Nom-Dat verbs and non-alternating Dat-Nom verbs should be distinguished from ordinary Nom-Dat verbs, or – as we will also be calling them – hjálpa-verbs. These are also two-place predicates requiring a nominative and a dative argument, but, crucially, it is the nominative argument that behaves as the syntactic subject and the dative as the object (Barðdal, Eythórsson & Dewey 2019:158), as is evident from the grammaticality of (7a) and the ungrammaticality of (7b).
Thus, hjálpa-verbs constitute the mirror counterpart of the aforementioned líka-verbs, in that they exclusively occur in the Nom-Dat argument structure construction, which is the opposite of the Dat-Nom argument structure construction. Also, hjálpa-verbs only allow for preposed datives in cases where the dative is topicalised, as is shown in (7c).
In this study we lend corpus-based statistical support to the analysis that the dative and the nominative arguments of nægja-verbs are indeed syntactic subjects. This we do by comparing the frequency of topicalised arguments in first position to the frequency of subjects in first position. In other words, if an oblique argument behaves as a subject, it can be expected to be strongly associated with first position in declarative clauses (diagnostic test 1) and first position in subordinate clauses (diagnostic test 3), while topicalised objects would not show the same association. Moreover, since word order in Icelandic is understood to be quite rigid (Thráinsson 2007:342), topicalisation can be expected to be relatively rare, and even less common in subordinate clauses than in main clauses. This is confirmed by Angantýsson’s (2020:261) study, although his study is based on acceptability judgements and not corpus frequencies. Nevertheless, empirical studies on how frequent topicalisation actually is are quite scarce for Icelandic.
One study that does include frequency counts is that of Callegari & Ingason (2021). In their diachronic investigation of matrix-clause ditransitive constructions, they explore object topicalisation in Icelandic texts from the twelfth to twenty-first centuries, drawing their data from the IcePaHC corpus (Wallenberg et al. 2011). Callegari & Ingason include both pronominal and nominal objects in their study, i.e. objects realised as both pronouns and full NPs. Out of a total of 1,100 hits, they find 128 instances of object topicalisation, of which 89 have the direct object topicalised (8%), and 39 the indirect object (3.5%). Thus, topicalisation affects approximately 11.5% of the tokens under study, and direct object topicalisation turns out to be more than twice as common as indirect object topicalisation. Callegari & Ingason do not include an unambiguous overview of object topicalisation per century, but a summary graph seems to reveal that, for the twenty-first-century data, both direct objects as well as indirect objects are each topicalised approximately 6% of the time.
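The proportions cited from Callegari & Ingason can be retraced with simple arithmetic. The following Python sketch is purely illustrative: the variable and function names are our own, and only the three counts (1,100, 89, and 39) are taken from the text above.

```python
# Counts as cited above from Callegari & Ingason (2021).
# Variable and function names are illustrative assumptions.
total_hits = 1100
direct_topicalised = 89
indirect_topicalised = 39


def share(count, total):
    """Return count as a percentage of total."""
    return 100 * count / total


direct_pct = share(direct_topicalised, total_hits)
indirect_pct = share(indirect_topicalised, total_hits)
overall_pct = share(direct_topicalised + indirect_topicalised, total_hits)
ratio = direct_topicalised / indirect_topicalised

print(f"direct: {direct_pct:.1f}%  indirect: {indirect_pct:.1f}%  overall: {overall_pct:.1f}%")
print(f"direct-to-indirect ratio: {ratio:.2f}")
```

Computed from the raw counts, the overall share is about 11.6%, marginally above the figure obtained by adding the two rounded percentages (8% and 3.5%).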
Another study worth mentioning in this respect is that of Barðdal & Eythórsson (2012), who map out the word order patterns for monotransitive verbs licensing nominative subjects. Barðdal & Eythórsson’s data have also been drawn from the IcePaHC corpus, although an earlier version than that of Callegari & Ingason. The main difference between the two versions is the size of the corpus, with the more recent version containing nearly 60% more material (1,002,390 vs. 632,000, Einar Freyr Sigurðsson p.c.). With all else being equal, and on the assumption that the smaller version of the corpus is large enough, a comparison of topicalisation frequencies between these two studies appears justified. Thus, zooming in on Barðdal & Eythórsson’s (2012) results for verb-second clauses (i.e. SVO vs. OVS structures), it turns out that nominative subjects occur 2,327 times (or 80%) in initial position, and 578 times (or 20%) in postverbal position. [Footnote 3] Therefore, object topicalisation is clearly much more frequent with monotransitives than with ditransitives, at least diachronically. See also Booth & Beck’s (2021) study where it is statistically documented that the clausal initial position in Modern Icelandic is a topic position.
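The size difference between the two corpus versions and the subject-position shares can be verified in the same way. The sketch below is an illustration only; all names are our own, and the figures are those quoted above.

```python
# Figures as quoted above; names are illustrative assumptions.
newer_version_size = 1_002_390
older_version_size = 632_000
growth_pct = 100 * (newer_version_size - older_version_size) / older_version_size

# Nominative subjects in verb-second clauses (Barðdal & Eythórsson 2012).
subject_initial = 2327      # SVO
subject_postverbal = 578    # OVS, i.e. object topicalisation
v2_total = subject_initial + subject_postverbal

initial_pct = 100 * subject_initial / v2_total
postverbal_pct = 100 * subject_postverbal / v2_total

print(f"corpus growth: {growth_pct:.1f}%")
print(f"subject-initial: {initial_pct:.1f}%  subject-postverbal: {postverbal_pct:.1f}%")
```

The postverbal share of roughly 20% for monotransitives is the figure to be set against the topicalisation rate of roughly 11.5% reported for ditransitives.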
It is unclear if the predicates in our study are equally permissive of topicalisation as Barðdal & Eythórsson’s (2012) monotransitives and Callegari & Ingason’s (2021) ditransitives. For that reason, we map out word order preferences for both the hjálpa and líka classes and use these counts as a baseline against which word order preferences for the nægja class will be measured. We now turn to a description of our methodology, before we present our findings in Sections 4–5 below.
3. Methodology
This study is based on 15 simple verbs that fall into one of three categories: (i) ordinary Nom-Dat verbs (the hjálpa type), (ii) non-alternating Dat-Nom verbs (the líka type), and (iii) alternating Dat-Nom/Nom-Dat verbs (the nægja type). Our aim was to follow Rott (2013) in our selection of verbs, but some of the verbs he used were too infrequent in the corpus to yield enough eligible tokens. Thus, we complemented the dataset with additional known non-alternating Dat-Nom and alternating Dat-Nom/Nom-Dat verbs (cf. Jónsson 1997–1998, Barðdal 1999:89, 2001:53–58, 2023:81–83). Each category contains five verbs:
- (i) Ordinary Nom-Dat verbs: hjálpa ‘help’, líkjast ‘resemble’, mótmæla ‘contradict’, treysta ‘trust’, and þakka ‘thank’.
- (ii) Non-alternating Dat-Nom verbs: áskotnast ‘receive’, blöskra ‘be shocked, be horrified’, leiðast ‘be bored’, líka ‘like’, and þykja gott/slæmt/… ‘think, find, seem good/bad/…’.
- (iii) Alternating Dat-Nom/Nom-Dat verbs: duga ‘suffice, be enough’, dyljast ‘be hidden to somebody, be aware’, endast ‘last’, henta ‘suit, befit’, and nægja ‘be enough, be sufficient’.
We follow Rott (2013:103) in using blöskra ‘be shocked, be horrified’, leiðast ‘be bored’ and líka ‘like’ in the class of non-alternating Dat-Nom verbs, and henta ‘suit, befit’ and dyljast ‘be hidden to somebody, be aware’ in the alternating Dat-Nom/Nom-Dat class.
The analysis is based on a data collection from the Icelandic Web 2020 corpus (isTenTen20, Jakubíček et al. 2013), which consists of approximately 520 million words. The corpus itself has been accessed through the Sketch Engine interface. For each of the aforementioned verbs, a lemmatised search query has been carried out targeting the verb’s bare infinitival form. That is also true for the etymologically reflexive -st-verbs, as the search engine considers -st-forms to be instantiations of the unsuffixed base form. Thus, líkjast, áskotnast, leiðast, dyljast, and endast were run as líkja, áskotna, leiða, dylja, and enda, respectively.
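The mapping from verb to query lemma can be sketched as follows. This is a minimal illustration in Python under our own assumptions: the function name is hypothetical, and stripping a final -st is a shortcut that happens to work for the 15 verbs of this study, not a general rule of Icelandic morphology or a description of how the search engine lemmatises.

```python
# The 15 verbs of the study, as listed above.
VERBS = [
    "hjálpa", "líkjast", "mótmæla", "treysta", "þakka",      # ordinary Nom-Dat
    "áskotnast", "blöskra", "leiðast", "líka", "þykja",      # non-alternating Dat-Nom
    "duga", "dyljast", "endast", "henta", "nægja",           # alternating
]


def query_lemma(verb):
    """Return the lemma under which the verb is searched (hypothetical helper).

    Assumption: for this verb list, every form ending in -st is an
    -st-verb whose base form is obtained by dropping the suffix.
    """
    return verb[:-2] if verb.endswith("st") else verb


queries = {verb: query_lemma(verb) for verb in VERBS}
for verb, lemma in queries.items():
    if verb != lemma:
        print(f"{verb} -> {lemma}")
```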
The material has subsequently been extracted in one or more files containing 10,000 randomised tokens per verb type, depending on how abundant the data were. In contrast to Rott, who also includes middle field tokens, we only focus on tokens in which the main verb is flanked by either a nominal or a pronominal element. Thus, only instances of the type [Nom-V-Dat] or [Dat-V-Nom] have been taken into account, regardless of clause type. As a consequence, there are no tokens in our dataset of any other kinds of topicalised elements, which in turn excludes, for instance, adverbials.
In contrast to the Mainland Scandinavian languages, Icelandic is a so-called symmetric V2-language, which means that the conjugated verb takes second position both in main clauses as well as in subordinate clauses (Thráinsson 2007:41, Angantýsson 2020:243). Eligible tokens are therefore not restricted to main clauses only but also include subordinate structures. Per verb type, the first 200 eligible tokens have been retained for study. Hence, the total number of collected tokens equals 3,000, and the number of collected tokens per verb class equals 1,000.
Per token, all arguments, dative and nominative, have been manually annotated for the following variables: case, (pro)nominality, pronoun type (if applicable), referentiality, person, number, definiteness, animacy, and length. The choice of variables is motivated by two considerations, namely that each of these (a) is well known in the field for affecting word order, and (b) may serve as a proxy for discourse prominence or topicality (cf. the discussion in Sections 4.3 and 5.3 below). For each variable in boldface, the relevant values are rendered in small caps, followed by examples in brackets. The nine variables are discussed below.
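The selection procedure described above (keep only tokens in which the verb is flanked by a nominative and a dative, then take the first 200 per verb) can be sketched as follows. This is a hedged illustration, not the procedure actually used: the `Hit` record, its fields, and the function names are hypothetical, and eligibility is assumed here to be decidable from two pre-assigned case labels.

```python
from dataclasses import dataclass
from itertools import islice
from typing import Iterable, List, Optional


@dataclass
class Hit:
    """One randomised corpus hit (hypothetical representation)."""
    verb: str
    left: Optional[str]   # case of the (pro)nominal directly before the verb, if any
    right: Optional[str]  # case of the (pro)nominal directly after the verb, if any


# Only [Nom-V-Dat] and [Dat-V-Nom] tokens are eligible.
ELIGIBLE_FRAMES = {("nom", "dat"), ("dat", "nom")}


def is_eligible(hit: Hit) -> bool:
    return (hit.left, hit.right) in ELIGIBLE_FRAMES


def select_tokens(hits: Iterable[Hit], limit: int = 200) -> List[Hit]:
    """Keep the first `limit` eligible hits, preserving the randomised order."""
    eligible = (hit for hit in hits if is_eligible(hit))
    return list(islice(eligible, limit))


# Toy input: 450 hits, of which 300 are eligible.
sample = [
    Hit("nægja", "dat", "nom"),
    Hit("nægja", None, "nom"),   # e.g. an adverbial in first position: excluded
    Hit("nægja", "nom", "dat"),
] * 150
retained = select_tokens(sample)
print(len(retained), "tokens retained;", 15 * 200, "tokens expected for 15 verbs")
```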
- Case: nominative (þessi sími ‘this phone’ nom.sg, mín eigin föt ‘my own clothes’ nom.pl) or dative (hundinum ‘the dog’ dat.sg, unglingunum ‘the youngsters’ dat.pl).
- (Pro)nominality: pronoun (þú ‘you’ sg, ykkur ‘you’ pl, einhverjum ‘some’) or full NP (Ísland ‘Iceland’, ýmsir þingmenn ‘some congressmen’ nom.pl, bókin ‘the book’ nom.sg).
- Pronoun type: personal (ég ‘I’, hann ‘he’, þeir ‘they’ 3.m), demonstrative (þessi ‘this’, hinum ‘the other’, slíkur ‘such’), indefinite (öllum ‘all’, engum ‘no-one’, báðum ‘both’), or reciprocal (hvert öðru ‘each other’ sg.n, hver annarri ‘each other’ sg.f). Reflexives are excluded from study, as they are hypothesised to prefer the postverbal slot. In line with Heylen (2005:103), conjoined pronouns are also excluded, as they arguably lose their pronominal status.
- Referentiality: referential or correlative. Icelandic allows for the third person personal pronoun það ‘it’ to be used as an expletive or a correlate. Expletives are wholly absent from our dataset, but correlates, which have a clause-anticipating function, show up approximately 300 times. It is hypothesised that such placeholders, given their impoverished semantic status, are inclined to follow the verb, rather than precede it. All remaining arguments, either nominal or pronominal, have been annotated as referential.
- Person: first person (mér ‘me’ dat.sg, við ‘we’, okkur feðgum ‘us, father and son’), second person (þú ‘you’ sg, ykkur ‘you’ pl, yður ‘you’ pl.hon), or third person (þeim ‘them’ dat.pl, henni ‘her’ dat.sg.f, augum manna ‘people’s eyes’ dat.pl).
- Number: singular (rigningarvatn ‘rainwater’ nom.sg, stúlkunni ‘the girl’ dat.sg) or plural (stúlkum ‘girls’ dat.pl, Hauknum og frú ‘Haukur and his wife’ dat.pl).
- Definiteness: definite or indefinite. Icelandic pronouns are always definite, except for indefinite pronouns (báðar ‘both’ pl.f, manni ‘one’ dat.sg) and indefinite demonstratives (slíkt ‘such a thing’ sg.n). NPs are considered to be definite if they are preceded by a definite demonstrative pronoun (þessi hey ‘this hay’ nom.pl) or a possessive pronoun (þitt fyrirtæki ‘your company’ nom.sg). Constituents that are followed by a cliticised definite article (blaðinu ‘the paper’ dat.sg), a possessive pronoun (hlátur hans ‘his laughter’ nom.sg), or a postposed genitive [Footnote 4] (stjórn félagsins ‘the company board’ nom.sg) also receive a definite reading. Names of people (Þorsteini dat.sg), institutions (Fjölmiðlanefnd ‘the Media Committee’ nom.sg), places (Keflavík nom.sg), and population groups (Reykvíkingum ‘the people of Reykjavík’ dat.pl) are inherently definite. If a conjoined constituent exhibits conflicting definiteness, in that one conjunct is definite but the other is indefinite, the string is coded for the first conjunct. Thus, the string 4x4 klúbburinn og allir sem elska hálendið í sinni hráustu mynd ‘the 4x4 club and all who love the highlands in their rawest form’ is coded as definite, because of the definite status of the first conjunct (4x4 klúbburinn ‘the 4x4 club’).
- Animacy: individual, collective, inanimate, non-inferable or NA. The label ‘individual’ is used to index constituents referring to humans (bræðurnir ‘the brothers’ nom.pl), animals (fuglum ‘birds’ dat.pl), and what Bresnan & Ford (2010:175) call ‘humanoid beings’ (Guð ‘God’ nom.sg, heilagur andi ‘the Holy Spirit’ nom.sg). This includes cells (krabbameinsfrumur ‘cancer cells’ nom.pl). Groups of individuals are annotated as collective (fólkið ‘the people’ nom.sg, Landsbankanum ‘National Bank’ dat.sg). All other constituents, including plants (þessi jurt ‘this plant’ nom.sg), fungi [Footnote 5] (myglusveppum ‘mould’ dat.pl), and dead animals, are labelled ‘inanimate’. When animacy cannot unequivocally be determined, we have resorted to the label ‘non-inferable’. This, for instance, applies to lántakanda ‘borrower’ dat.sg, which, in the given context, can both refer to an individual as well as to a corporate borrower. Pronouns that serve as placeholders for a subclause are annotated as ‘NA’, since their referent is linguistic, and not extra-linguistic. If a conjoined constituent exhibits conflicting animacy, in that one conjunct complies with one label but the other complies with another label, the string is coded for the first conjunct. Thus, the string bæði einstaklingum og fyrirtækjum ‘both individuals as well as companies’ is coded as ‘individual’, because that is the label that captures the animacy status of the first conjunct (bæði einstaklingum ‘both individuals’).
- Length: constituent weight measured in words.
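Two of the coding conventions above lend themselves to a compact illustration: the first-conjunct rule for conjoined constituents with conflicting labels, and length as a word count. The Python sketch below is ours and purely illustrative; the function names are hypothetical, and counting whitespace-delimited strings as words is an assumption about how length was measured.

```python
from typing import List


def code_conjoined(labels: List[str]) -> str:
    """Code a conjoined constituent after its first conjunct.

    `labels` holds one label per conjunct, in linear order
    (hypothetical representation of the manual annotation).
    """
    if not labels:
        raise ValueError("at least one conjunct label is required")
    return labels[0]


def length_in_words(constituent: str) -> int:
    """Constituent weight as a count of whitespace-delimited words (assumption)."""
    return len(constituent.split())


# The definiteness example from the text: definite first conjunct,
# indefinite second conjunct.
example = "4x4 klúbburinn og allir sem elska hálendið í sinni hráustu mynd"
definiteness = code_conjoined(["definite", "indefinite"])
length = length_in_words(example)
print(definiteness, length)
```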
We now turn to our findings and a discussion thereof.
4. Results and discussion
The current section details the results of our study. Section 4.1 establishes a baseline by mapping out word order preferences for both ordinary Nom-Dat verbs, i.e. verbs of the hjálpa type, and non-alternating Dat-Nom verbs, or líka-verbs. In Section 4.2 we compare the statistics for alternating nægja-verbs with the baseline established for hjálpa- and líka-verbs in Icelandic. Section 4.3 discusses the main implications and conclusions.
4.1 Establishing a baseline: hjálpa- and líka-verbs
In this section we discuss our findings for both hjálpa- and líka-verbs. We single out two configurations, i.e. contexts in which both arguments are full NPs (Section 4.1.1) and contexts in which both arguments are pronouns (Section 4.1.2). We leave out a comparison of contexts involving only one pronoun, since Somers & Barðdal (2022:91–92, 95–97) have shown that those frequencies exhibit the same tendencies as documented below. We summarise our conclusions in Section 4.1.3.
4.1.1 Word order variation in the [NP-V-NP] configuration
Table 1 presents an overview of the word order distributions for both hjálpa- and líka-verbs in clauses where both arguments are full NPs. Starting with hjálpa-verbs, the general rule is that the dative is realised postverbally: as many as 334 tokens (or 99%) across verbs instantiate the Nom-Dat word order, as opposed to a mere two (or 1%) instantiating the reverse Dat-Nom order. Two examples of the unmarked nominative-before-dative pattern, one with hjálpa ‘help’ and one with líkjast ‘resemble’, are given in (8a–b).
The sole hjálpa-verb which (marginally) allows datives in initial position is mótmæla ‘contradict’ with the two tokens shown in (9a–b).
Both of these are topicalisations, with the dative occurring in initial position for information-structural purposes. Both tokens also display a discrepancy in definiteness, in that the fronted dative is definite, whereas the postposed nominative is indefinite. Since definites tend to precede indefinites, this asymmetry is undoubtedly conducive to an inversion of the canonical order of constituents (cf. Siewierska 1993, Lambrecht 1994, 2000, Gregory & Michaelis 2001, inter alia).
Moving on to our findings for líka-verbs, it is striking that the acquired figures constitute the mirror image of those obtained for hjálpa-verbs, as 193 clauses (or 99%) assign the preverbal slot to the dative. This corroborates the existing analysis of these as being non-alternating Dat-Nom verbs. Two examples of non-alternating Dat-Nom verbs occurring in their neutral Dat-Nom order are presented in (10a–b).
Only þykja returns one token in which the canonical order of constituents is inverted. This example is shown in (11), where the nominative is topicalised to first position, while the dative subject inverts with the finite verb.
Observe that, across all ten verbs presented in Table 1, nominal frequencies are generally very high: there are never fewer than 24 attestations per verb, and their total number across all ten verbs amounts to 530. Thus, our findings for both hjálpa- and líka-verbs in the double-NP configuration can be considered to be very robust.
Finally, both hjálpa- and líka-verbs show not only a strong verb effect in the [NP-V-NP] configuration but also a robust verb class effect, since all verbs prefer either the Nom-Dat or the Dat-Nom order in equal manner.
4.1.2 Word order variation in the [Pro-V-Pro] configuration
Table 2 summarises the results for hjálpa- and líka-verbs in the [Pro-V-Pro] configuration. For hjálpa-verbs, word order preferences in the [Pro-V-Pro] configuration constitute a near-perfect copy of the results presented in Table 1 above. With the exception of mótmæla, all hjálpa-verbs tend entirely towards the Nom-Dat linear order. Interestingly, the only two attestations of the topicalised Dat-Nom linear order contain a dative demonstrative pronoun in combination with the nominative personal pronoun ég ‘I’. Both of these examples are given in (12a–b).
In (12a) it is the dative demonstrative þessu ‘this’ which occurs in clause-initial position while the nominative ég ‘I’ inverts with the verb. In (12b) a similar pattern surfaces, this time with the topicalised dative demonstrative því ‘that’ in first position.
Verbs of the líka type show considerably more word order variation in the double-pronoun configuration than hjálpa-verbs: the Dat-Nom linear order is attested 183 times (or 81%) and the Nom-Dat order 44 times (or 19%). An example of each pattern is provided in (13a) and (13b) respectively.
Remarkably, the Nom-Dat pattern for líka-verbs is almost exclusively associated with nominative demonstratives: 40 out of 44 tokens occurring with the Nom-Dat linear order are headed by the pronouns það ‘that’ or þetta ‘this’. This finding is reminiscent of the tendency discussed above for the verb mótmæla ‘contradict’, which is marginally found in the Dat-Nom linear order, yet only when the dative object is a demonstrative pronoun.
Given the fact that demonstratives convey highly topical information, it is clear that topicality, especially in combination with effects of definiteness and pronominality, may cause changes in the linear order from the neutral Dat-Nom to the topicalised Nom-Dat order. However, the extent to which the word order of different argument structures can be inverted also seems to be dependent on the verb itself. For a more detailed discussion of the effect of nominative demonstratives, see Somers & Barðdal (2022:98–99).
4.1.3 Interim conclusions
The evidence presented in this section is fully in line with the prediction that Icelandic possesses both a class of Nom-Dat verbs as well as a class of Dat-Nom verbs. The former, also referred to as hjálpa-verbs, are associated with a nominative-before-dative order, whereas the latter, or líka-verbs, instantiate the reverse dative-before-nominative order. Our findings essentially confirm that subjects in Icelandic, regardless of case marking, are very strongly inclined to occupy the preverbal slot (cf. Andrews 1982:428, Sigurðsson 1989:205–206, Jónsson 1996:115, Thráinsson 2007:21, Schätzle 2018, inter alia).
What is especially informative about our results for the [NP-V-NP] configuration, is that Dat-Nom verbs occur with the Dat-Nom linear order to the same degree as ordinary Nom-Dat verbs of the hjálpa ‘help’ type occur with the Nom-Dat linear order. That is, both verb classes realise their syntactic subjects in clause-initial position 99.5% of the time, the nominative for Nom-Dat verbs and the dative for Dat-Nom verbs.
The overwhelming preference of líka-verbs for dative-first structures refutes the claim made by Roehm et al. (2007) that non-alternating Dat-Nom verbs in Icelandic are a category in flux, in that they have started adopting the behaviour of alternating Dat-Nom/Nom-Dat verbs. Roehm et al.’s conclusion is based both on an acceptability judgement task as well as on ERP data, but it is unclear exactly which verbs they included in their study. In all likelihood, the situation was exactly the opposite, with Dat-Nom verbs being derived from alternating Dat-Nom/Nom-Dat verbs, through the loss of the Nom-Dat alternant (cf. Barðdal 2023:133–137).
Finally, our data show that topicalisation of this type is very rare in Icelandic. The only verbs found with object topicalisation in the double-NP configuration are mótmæla (two tokens) and þykja (one token). As for clauses with double pronouns, topicalisation is markedly more frequent with Dat-Nom verbs (44 out of 227 tokens, or 19%) than with Nom-Dat verbs (two out of 240 tokens, or 1%). However, it turns out that almost all fronted nominatives with líka-verbs are nominative demonstratives.
4.2 Alternating Dat-Nom/Nom-Dat verbs
In this section we present our findings for the class of alternating Dat-Nom/Nom-Dat verbs, also referred to here as nægja-verbs. The organisation of this subsection is as follows: we first discuss the general findings, i.e. the results across all four configurations (Section 4.2.1), after which we turn to word order variation in the [NP-V-NP] configuration (Section 4.2.2), the [Pro-V-Pro] configuration (Section 4.2.3), and finally we discuss configurations where one of the arguments is a pronoun (Section 4.2.4). The results are compared to the baseline set by Nom-Dat hjálpa-verbs and Dat-Nom líka-verbs.
4.2.1 General findings
The results for the class of nægja-verbs, which are presented in Table 3, generally confirm the alternating nature of these predicates: in total, the Nom-Dat linear order is attested 747 times, i.e. ca. 75%, and the Dat-Nom linear order 253 times, i.e. approximately 25% of the time on average across all five predicates. The alternating nature of nægja-verbs is also supported by either argument passing the subject tests, see Section 2.
Upon closer inspection, the data in Table 3 reveal three remarkable tendencies. First, the Nom-Dat linear order is generally more common than the Dat-Nom linear order. Secondly, there are notable differences between verbs, in that some seem to allow for word order alternation more readily than others. And, thirdly, it is also remarkable that henta, a verb discussed by Barðdal (1999, 2001) as a prime member of the class of alternating verbs, does not yield a single Dat-Nom token.
Our results are generally also less evenly distributed than the ones Rott (2013) documents. He gathered corpus frequencies for the alternating predicates dyljast ‘be hidden’, henta ‘suit, befit’, veitast ‘find (hard/easy)’, and þóknast ‘satisfy, please’, and found that these verbs instantiate the Nom-Dat linear order 76 times, i.e. 51%, and the Dat-Nom linear order 72 times, i.e. 49%. Interestingly, the verb henta is included in Rott’s dataset, but it is unclear what its frequency distribution is, as he does not display any frequency counts for individual verbs. And, as is already stated in Section 1 above, Rott also includes clausal arguments in his investigation, which makes it even more difficult to compare his findings with ours.
The results most similar to the ones we have obtained here are probably the ones attained by Roehm et al. (2007). Their acceptability judgement task reveals that alternating verbs can be used equally felicitously in both case frames, but participants seemed to prefer the nominative-first structure. In their subsequent ERP-study, alternating verbs even elicited a violation response in the dative-before-nominative configuration, but since it is not made explicit which verbs Roehm et al. actually studied, that claim cannot be verified. In any case, it seems rather unexpected that all alternating verbs should elicit the same response, as the within-class variation is quite substantial, as we document here.
4.2.2 Word order variation in the [NP-V-NP] configuration
In total, alternating verbs are attested 217 times in the [NP-V-NP] configuration; 157 tokens (72%) instantiate the Nom-Dat linear order, and 60 tokens (28%) the Dat-Nom linear order. A more detailed overview of the frequencies per verb can be found in Table 4.
The frequencies in Table 4 are indicative of several different tendencies. First, frequencies in the [NP-V-NP] configuration are much less skewed than for ordinary Nom-Dat verbs or non-alternating Dat-Nom verbs, thereby confirming the generally alternating nature of Dat-Nom/Nom-Dat verbs. A chi-square goodness-of-fit test comparing the two word orders attested with nægja-verbs across all five verbs yields a highly significant result with a large effect size (χ² = 80.14; df = 4; p two-tailed < .001; Cramér’s V = .61), which should be interpreted as a statistical indication that the distribution of the two word orders cannot be attributed to chance. A more in-depth analysis of the factors driving the alternation is presented in Section 5.
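The reported effect size can be recovered from the reported test statistic. The following is a minimal sketch, assuming the standard definition of Cramér’s V, i.e. the square root of χ² divided by n · min(r − 1, c − 1), and assuming that the test was run on a table of five verbs by two word orders, which is what df = 4 suggests. The function name is ours and purely illustrative.

```python
import math


def cramers_v(chi2: float, n: int, n_rows: int, n_cols: int) -> float:
    """Cramér's V for an n_rows x n_cols contingency table."""
    return math.sqrt(chi2 / (n * min(n_rows - 1, n_cols - 1)))


# Values reported in the text: chi-square = 80.14 and n = 217 tokens.
# Assumed table shape: five verbs (rows) by two word orders (columns).
degrees_of_freedom = (5 - 1) * (2 - 1)
v = cramers_v(80.14, 217, 5, 2)

print(degrees_of_freedom, round(v, 2))
```

Under these assumptions, both the degrees of freedom and the effect size agree with the values given above.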
One of these factors, it seems, is verb type: with the exception of henta, all verbs are attested at least 21% of the time in either the Dat-Nom or the Nom-Dat linear order, but the degree to which they do so is verb-dependent. The verb duga, for instance, is clearly more permissive of clause-initial nominatives, whereas the opposite is true of dyljast and endast. The verb nægja is the most evenly balanced type, favouring a dative-first structure about as often as a nominative-first structure. One example of each word order is given in (14a–b).
Turning to henta, the generally skewed frequencies for that verb presented in Table 3 are evidently replicated in the [NP-V-NP] configuration in Table 4, and since nominal frequencies for this verb are very high (86 tokens), its tendency towards the Nom-Dat linear order can be taken to be very robust, which makes this result all the more intriguing. Recall that previous research has confirmed henta’s status as an alternating verb, as both the nominative as well as the dative independently pass the subjecthood tests presented in Section 2, as is documented by Barðdal (1999, 2001). Clearly, further research is needed to better understand henta’s behaviour as an outlier with respect to the word order test.
Also, it is striking how frequencies in the [NP-V-NP] configuration differ from the general frequencies presented in Table 3. For some verbs, like duga and nægja, the alternation is less skewed in the [NP-V-NP] configuration than it is in general, since the proportional frequencies move closer towards a 50–50 distribution. Other verbs, like dyljast and endast, tend more towards the Dat-Nom linear order in the [NP-V-NP] configuration.
Finally, our findings for alternating verbs in the [NP-V-NP] configuration tie in nicely with Allen’s (1995:108) study of Old English Dat-Nom verbs. Allen (1995) shows that the [NP-V-NP] configuration displays a symmetric distribution between the Nom-Dat linear order and the Dat-Nom linear order (21 vs. 19 attestations). This certainly confirms Allen’s (1995:116) claim that her Dat-Nom verbs are indeed alternating verbs in Old English, precisely like nægja-verbs in the present study. Unfortunately, exactly like Rott (2013), Allen does not specify how each individual verb weighs in on the alleged verb class effect, so it is unclear (i) whether all verbs in her sample can actually be regarded as alternating, and (ii) if so, whether they are all equally attracted to both argument structure constructions.
4.2.3 Word order variation in the [Pro-V-Pro] configuration
Table 5 shows that in the [Pro-V-Pro] configuration alternating predicates almost invariably occur in the Nom-Dat linear order: out of 337 attestations, only 19, i.e. 6%, contain a dative in clause-initial position.
Some examples of Dat-Nom word orders involving pronouns are given in (15a–c), while examples of the more abundant Nom-Dat word order are given in (16a–c).
Table 5 also shows that the Nom-Dat linear order is not disproportionately associated with any one verb in particular, as frequencies are consistently higher than, or equal to, 92% per verb. In other words, these numbers clearly point towards an overarching verb class effect and not towards individual verb effects.
The findings for the [Pro-V-Pro] configuration also explain at least part of the skewness for alternating predicates in general, as the [Pro-V-Pro] configuration is not only heavily biased towards the Nom-Dat construction but is also very frequent in general, since it accounts for about one-third of all the data collected for nægja-verbs (337 tokens out of 1,000, 318 of which instantiate the Nom-Dat order).
Given the skewed frequencies in the [Pro-V-Pro] configuration, it should not come as a surprise that tokens containing two personal pronouns show an equal bias: 81 out of 88, or 92%, instantiate the Nom-Dat order (not singled out in Table 5). These findings again mirror Allen’s (1995:109) results for 12 Old English alternating verbs, which, in the double personal pronoun configuration, also show a clear tendency towards the Nom-Dat order. This sets them apart from non-alternating líka-verbs in configurations with two personal pronouns, as these overwhelmingly tend towards the Dat-Nom order (63 tokens, or 95%) and not to the reverse Nom-Dat order (three tokens, or 5%).
This pronominal skewness with alternating verbs raises the question of whether occurrences with pronouns are perhaps unevenly distributed across the three verb classes in terms of frequency and whether that may possibly explain the high proportion of the Nom-Dat construction here. However, out of 3,000 observations in total for all 15 verbs (1,000 for each verb class) there are 664 Nom-Dat observations, 806 Dat-Nom observations, and 783 alternating Dat-Nom/Nom-Dat observations including at least one pronoun. This shows that alternating verbs are not particularly more frequent with pronouns in general, even though they yield most tokens in the [Pro-V-Pro] configuration (337 for alternating verbs, 227 for classical Dat-Nom verbs, and 240 for ordinary Nom-Dat verbs).
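The three pronoun counts follow from subtracting each class’s double-NP tokens from its 1,000 observations. The sketch below reproduces that arithmetic from the counts reported in Sections 4.1.1 and 4.2.2; the variable names are ours and purely illustrative.

```python
TOKENS_PER_CLASS = 1000  # observations collected per verb class

# Double-NP tokens per verb class: the two attested word orders summed,
# using the counts reported for the [NP-V-NP] configuration.
double_np = {
    "Nom-Dat (hjálpa)": 334 + 2,
    "Dat-Nom (líka)": 193 + 1,
    "alternating (nægja)": 157 + 60,
}

# Every token that is not double-NP contains at least one pronoun.
with_pronoun = {cls: TOKENS_PER_CLASS - n for cls, n in double_np.items()}

for cls, n in with_pronoun.items():
    share = n / TOKENS_PER_CLASS
    print(f"{cls}: {n} tokens with at least one pronoun ({share:.1%})")
```

On these figures, the share of pronominal tokens with alternating verbs falls between the shares of the two non-alternating classes.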
4.2.4 Configurations with one pronoun and one full NP
The current section zooms in on the two remaining configurations, i.e. contexts containing a nominative pronoun and a dative full NP and contexts with a dative pronoun and a nominative full NP. The results for the former are laid out in Table 6, which shows that nægja-verbs are strongly skewed towards the Nom-Dat order when the nominative is pronominal and the dative is a full NP: as many as 73 tokens across verbs (or 92%) allocate the preverbal slot to the nominative. The remaining six observations (or 8%) prefer the Dat-Nom order. This nominative-before-dative skewness is not an idiosyncrasy of individual verbs: it is a commonality of all nægja-verbs, thus pointing towards a verb class effect. The results presented in Table 6 are highly reminiscent of our findings for alternating verbs in the double-pronoun configuration (cf. Section 4.2.3 above). Recall that we found alternating verbs to instantiate the Nom-Dat order 94% of the time when both arguments were pronouns.
Our findings for the current configuration also raise the question of how non-alternating líka-verbs fare in contexts with a nominative pronoun and a dative full NP, as one might perhaps assume that the skewness found in Table 6 is a general effect of pronouns, not specific to nægja-verbs. It turns out that, across verbs, líka-verbs would much rather have the dative NP precede the nominative pronoun (47 tokens, or 82%) than the other way around (ten tokens, or 18%) (not shown in any table here). Exactly as in the [Pro-V-Pro] configuration, the bulk of Nom-Dat attestations is due to the effect of nominative demonstratives (eight out of ten, or 80%). Once again, this comparison shows that líka-verbs are quite distinct in behaviour from nægja-verbs: the former strongly adhere to the dative-before-nominative order irrespective of lexical specifications, while the latter are much more susceptible to pronominal influence. Thus, the tendency of nominative pronouns to occupy first position with nægja-verbs is not due to a general property of pronouns but is specific to nægja-verbs.
Let us now explore the results of the most widely attested configuration for alternating verbs, i.e. one in which a dative pronoun enters into competition with a nominative full NP. These numbers are presented in Table 7. In configurations with dative pronouns and nominative NPs, the results show a relatively even distribution across the Dat-Nom and Nom-Dat word order patterns: the former occurs 46% of the time and the latter 54%. As soon as henta is removed from the dataset, the Dat-Nom order becomes even slightly more common than the Nom-Dat order, reaching a prevalence of 54% (vs. 46% Nom-Dat).
It is striking how well the inter-verb differences uncovered for the current configuration map onto the differences found in the double-NP configuration. That is, the extent to which individual verbs tend to alternate in both of these configurations is nearly identical, at least in relative terms.
Finally, the statistics obtained for alternating Dat-Nom/Nom-Dat verbs, when the dative is a pronoun and the nominative a full NP, deviate considerably from those of their non-alternating Dat-Nom counterparts: in configurations with a dative pronoun and a nominative full NP, líka-verbs opt for the Dat-Nom order 508 times (or 97%), yet only 14 times (or 3%) for the Nom-Dat order. Once more, this underscores the split of the overarching class of Dat-Nom verbs into alternating nægja-verbs and non-alternating líka-verbs. It also confirms that the behaviour of nominative pronouns in the [Pro-V-Pro] configuration is not a general effect of pronouns.
4.3 Interim conclusions
The findings presented in this section confirm that Icelandic indeed possesses a class of alternating Dat-Nom/Nom-Dat verbs, as we have here, for the first time in the literature, established with statistics that both arguments of nægja-verbs pass the word order test (excluding henta). This is evident from the fact that, in the [NP-V-NP] configuration, the Nom-Dat linear order is attested 72% of the time, and the Dat-Nom linear order 28% of the time. This is very different from both hjálpa- and líka-verbs, as 99.5% of all instances involving full NPs show up with the Nom-Dat vs. the Dat-Nom linear order, respectively, for the two verb classes, as is reiterated in Table 8.
We base our conclusions of neutral word order on attestations where both arguments are lexically realised as full NPs, as pronouns clearly impose an information-structural bias on word order, for instance, inducing topicalisation. Furthermore, Table 8 also shows that the results are all the more powerful once henta, the outlier, is excluded from the statistics, yielding 54% Nom-Dat and 46% Dat-Nom linear order.
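The proportions with henta excluded can be reconstructed from counts given earlier for the [NP-V-NP] configuration. This is a minimal sketch which assumes that all 86 double-NP tokens of henta instantiate the Nom-Dat order, in line with the observation that henta does not yield a single Dat-Nom token; the variable names are illustrative.

```python
# Counts reported for the [NP-V-NP] configuration (all five nægja-verbs).
nom_dat_all, dat_nom_all = 157, 60
henta_tokens = 86  # assumption: every one of these is Nom-Dat

# Removing henta only affects the Nom-Dat count.
nom_dat = nom_dat_all - henta_tokens
dat_nom = dat_nom_all
total = nom_dat + dat_nom

share_nom_dat = round(100 * nom_dat / total)
share_dat_nom = round(100 * dat_nom / total)

print(total, share_nom_dat, share_dat_nom)
```

The resulting total also corresponds to the number of double-NP observations that remain for the four verbs once henta is set aside.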
The fact that henta consistently occurs in the nominative-before-dative linear order is a compelling result in itself. Its word order bias can be explained in two ways: (i) our sample is off, or (ii) henta is not an alternating verb. The former would be indicative of a discrepancy between what is theoretically possible and what is actually attested, the latter of a potential linguistic change, but both hypotheses warrant further investigation.
We now summarise the results obtained for the remaining three configurations, which all involve at least one pronoun. These essentially show that alternating nægja-verbs swing towards the Nom-Dat order whenever nominative pronouns are involved. Both the double-pronoun configuration as well as the configuration involving a nominative pronoun and a dative full NP instantiate the Nom-Dat order in more than 92% of all cases. One might perhaps believe that this pronominal effect with nægja-verbs is a derivative of animacy, as pronouns tend to refer to animate entities, which in turn tend to be subjects, thus occurring clause-initially (cf. Du Bois 1987). This, however, is not the case for our dataset, as 99% of pronominal nominatives, even excluding correlates, are inanimate. Thus, the pronominal skewness towards the Nom-Dat order cannot be attributed to animacy.
Non-alternating líka-verbs are not altogether immune to the influence of nominative pronouns (and especially nominative demonstratives), but they only allow for 19% topicalisation in the [Pro-V-Pro] configuration and 18% in configurations with a nominative pronoun and a dative full NP. Alternating nægja-verbs in configurations with a dative pronoun and a nominative NP virtually mimic the frequencies obtained for the [NP-V-NP] configuration. Thus, our results confirm the status of nægja-verbs as a class in their own right, different from non-alternating líka-verbs.
In the next section, we investigate the factors underlying each word order pattern of alternating verbs by appealing to a set of variables that are known to influence linearisation, including animacy, definiteness, referentiality, pronoun type, person, number, and length. Conveniently, several of these may also serve as a proxy for topicality, as topics tend to be animate, definite, referential, pronominal, and short (Givón 1976:152, Arnold et al. 2000:34, Croft 2003:178–179, Rosenbach 2008:156, Arnold et al. 2013:406, Cristofaro 2013:74, 2019:28, Booth & Beck 2021:11, inter alia). It has indeed been argued in the literature that, other things being equal, alternating predicates allocate the preverbal slot to the argument that is most topical in the discourse (Barðdal 2001, Barðdal, Eythórsson & Dewey 2014, 2019). By modelling the word order variation statistically, we aim to uncover which factors have a direct bearing on linearisation. Additionally, we hypothesise that the results will converge towards those values that have been shown to correlate with topicality (cf. supra). That way, we bring into the equation a variable that has not been explicitly factored in, but that may still have an influence on the alternation under study.
5. Statistical modelling
In what follows, we investigate the factors guiding the word order variation in the set of alternating Dat-Nom/Nom-Dat verbs by means of two logistic regression models. This involves a comparison across all configurations, on the one hand, and across double NPs, on the other. We single out double NPs because pronouns are well known to skew word order preferences (Du Bois 1987, Croft 2012, Cristofaro 2013, 2019, Booth & Beck 2021, inter alia). Importantly, since henta behaves as a clear outlier, it is excluded from the remainder of the analysis. As such, the results presented in the current section are only based on our findings for the verbs duga, dyljast, endast, and nægja.
The purpose of the logistic regression analyses is to identify, and quantify, nuanced empirical interactions between the factors that we hypothesise are involved in the alternation. These include the variables discussed in Section 3, for which the dataset is annotated, namely case marking, pronominality, pronoun type (if applicable), referentiality, person, number, definiteness, animacy, and length. The dependent variable is argument position, either first or second. Binary logistic regression is a probabilistic algorithm that models the outcome as a probability, conditional on the value of the predictor variables (Harrell 2015:219). Although binary logistic regression makes relatively few assumptions, as with any regression model, collinearity (i.e. correlation) among the predictors can be a concern (Harrell 2015:255).
We have chosen to use ordinary logistic regression rather than mixed-effect/multilevel models. The reason for this is simple: the natural group-level (or random effect) in such a model would be verbs, but a much larger number of verbs would need to be included to defend the added complexity of mixed-effect models (Gelman & Hill 2007:247). When the random effect variable has few levels, mixed-effects logistic models reduce to ordinary logistic models with only fixed effects, in the absence of meaningful group-level variation.
For evaluating the logistic regression models, we rely on a combination of inspecting model residuals (Gelman & Hill 2007:97–101) and measures of predictive capability, in particular the c-index, since formal tests of fit are often inappropriate for logistic regression (Harrell 2015:236). The c-index measures the concordance between the predictions of the model and the observed values in the dataset, i.e. the proportion of all pairs of observations with different outcomes in which the model assigns the higher predicted probability to the observation that actually shows the outcome being modelled (Harrell 2015:257). While several other measures exist (Harrell 2015:256–257), we have chosen to report the c-index since it has a reasonably intuitive interpretation. A c-index value of .5 indicates random choice, 1.0 indicates perfect prediction, and .8 and above is often taken as an indication of good predictive capability (Baayen 2008:284, Harrell 2015:257). However, it is worth noting that there is some arbitrariness in these thresholds and in medicine, for example, the threshold for an acceptable model is usually taken to be .7 (Hartman et al. 2023, White et al. 2023). Importantly, over-reliance on a single measure can be detrimental, as noted by White et al. (2023), which is why we use the c-index alongside inspection of model residuals, bearing in mind that although a higher c-index is better than a lower one, it does not tell the whole story.
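To make the interpretation of the c-index concrete, the following sketch computes it as a concordance probability over all pairs of observations with different outcomes, counting ties as one half. It is an illustration in plain Python, not the routine used by the rms package in R, and the toy outcomes and predicted probabilities are hypothetical.

```python
from itertools import product


def c_index(y_true, y_prob):
    """Share of (positive, negative) pairs in which the positive case
    receives the higher predicted probability; ties count as one half."""
    pos = [p for y, p in zip(y_true, y_prob) if y == 1]
    neg = [p for y, p in zip(y_true, y_prob) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one observation of each outcome")
    score = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p, n in product(pos, neg)
    )
    return score / (len(pos) * len(neg))


# Hypothetical toy data: 1 = first position, 0 = second position.
y = [1, 1, 1, 0, 0]
p = [0.9, 0.7, 0.4, 0.5, 0.2]

print(c_index(y, p))
```

In this toy example, five of the six positive-negative pairs are ranked correctly, so the value lies between random choice (.5) and perfect prediction (1.0).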
The logistic regression models have all been fitted in R using the rms package. A close inspection of the binned model residuals shows no signs of structural problems. Due to the skewed (i.e. non-symmetrical) distribution in the underlying data set, the length variable has been transformed by taking the natural logarithm of the observed data, which reshapes the data by adjusting the scale, resulting in a more symmetrical distribution. Although other logarithm bases would work equally well for the data transformation, the natural logarithm benefits from being directly interpretable in the model as proportional differences (Gelman & Hill 2007:60–61). A positive regression coefficient indicates that a variable is associated with the first argument position, while a negative value signals association with the second argument position.
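The effect of the logarithmic transformation can be illustrated with a small sketch. The constituent lengths below are hypothetical and serve only to show that, on the log scale, a doubling in length corresponds to the same increment regardless of the starting point, which is what makes differences interpretable as proportional.

```python
import math

# Hypothetical constituent lengths in words (right-skewed on purpose).
lengths = [1, 1, 2, 2, 3, 4, 8, 16]
log_lengths = [math.log(n) for n in lengths]

# A doubling in length yields the same step on the log scale,
# whether it is from one to two words or from eight to sixteen.
step_1_to_2 = math.log(2) - math.log(1)
step_8_to_16 = math.log(16) - math.log(8)

print([round(x, 2) for x in log_lengths])
print(round(step_1_to_2, 3), round(step_8_to_16, 3))
```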
Section 5.1 presents the results of the logistic regression analysis modelling all the data obtained across configurations, whereas Section 5.2 singles out the tokens instantiating the double-NP configuration. As such, the first model is based on all 800 observations and the second on 131 observations. In Section 5.3 we discuss how our findings tie in with the concept of topicality.
5.1 Across configurations
The output of the first logistic regression model, which builds on all 200 observations per verb type, is presented in Table 9. Recall that henta, the outlier, has been excluded. The c-index of the model is .794, a value that is only decimals away from what is commonly taken as a good predictive capability (Baayen 2008:204, Harrell 2015:257).
Table 9 shows the logistic regression coefficient (𝛽), standard error (SE), z-score (Z) and p-value (p) for eight variables, seven of which exert a significant influence on the alternation. Only animacy (value: inanimate) does not have any predictive power, presumably because it strongly correlates with nominative case. Positive regression coefficients indicate an association with the first argument slot, while negative regression coefficients indicate an association with the second slot. As such, animacy (value: individual), animacy (value: NA), case (value: nominative), and person (modelled numerically, see below) are associated with the first argument position, whereas definiteness (value: indefinite), length, and number (value: singular) are tied to the second argument position. Importantly, factors generating a significant effect do not necessarily correlate with one another. As an example, the second argument is very often either indefinite or long, but it is not necessarily associated with both properties at the same time. All the logistic regression coefficients represent the log-odds ratio of switching from second to first position.
Starting with the variables tied to the first argument position, the model attributes a major effect to nominative case marking and animacy (value: NA), which generate a coefficient of 2.26 and 1.33, respectively (corresponding to a 56.5% and 33.2% increase in the likelihood of switching from second to first argument slot). The fact that the former, nominative case, accounts for such a large portion of the variation is hardly surprising, as approximately two thirds of all tokens (i.e. 547 out of 800) with duga, dyljast, endast, and nægja instantiate the Nom-Dat order (cf. Section 4.2.1 above). Nevertheless, the question remains what exactly this means. That is, are alternating verbs, in fact, more strongly drawn to the Nom-Dat order than they are to the reverse Dat-Nom order? Or is the Nom-Dat bias in our sample merely the result of chance? We return to this point in Section 5.2 below.
The second variable closely tied to the first argument position, i.e. animacy (value: NA), captures all instantiations of correlative það ‘it’, which is a third person personal pronoun anticipating a subclause. Correlative það, exactly like expletive and existential það, is indeed well known to occur clause-initially in Icelandic (Rögnvaldsson Reference Rögnvaldsson2002, Thráinsson Reference Thráinsson2007:366–367, inter alia). Still, the fact that this correlate so willingly takes initial position is remarkable, as it goes against the expectation that semantically impoverished units should take a less prominent position than arguments referring to extralinguistic entities (cf. Siewierska Reference Siewierska1993:831). Somers & Barðdal (Reference Somers and Barðdal2023:19–21), for instance, have shown that the strong inclination of Icelandic correlates towards the nominative-before-dative order is something that sets these apart from their German counterparts, since the latter are much more permissive of alternation.
The third variable whose connection with the first argument position is significant is person. We decided to encode this variable as numeric, not categorical, and every unit increase in person (i.e. from first to second, and from second to third) yields an increase in association with the first slot. The effect is quantified by the coefficient as .74. This means that the likelihood of an argument being realised preverbally increases by approximately 18.5% for every one-unit increase along the scale. Thus, constituents are overall more likely to be allotted the first slot if their referent is non-local, third person, as opposed to local, first and second person. Interestingly, this constitutes a violation of the person hierarchy, which stipulates that local pronouns should take precedence over both non-local pronouns and full NPs (Silverstein Reference Silverstein and Dixon1976, Siewierska Reference Siewierska1993:831, Croft Reference Croft2003:130, Haude & Witzlack-Makarevich Reference Haude and Witzlack-Makarevich2016, inter alia).
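To make the numeric coding concrete, the following Python sketch treats person as the values 1, 2 and 3 and applies the reported coefficient of .74. Only the coefficient comes from the text. The integer coding and the function name are assumptions for illustration, and the model's intercept and other predictors are left out.

```python
import math

PERSON_COEFFICIENT = 0.74  # reported log-odds change per unit increase in person


def person_contribution(person: int) -> float:
    """Contribution of person to the log-odds of first position.

    Assumes person is coded 1, 2, 3 (a hypothetical coding; the text only
    states that the variable is numeric).
    """
    if person not in (1, 2, 3):
        raise ValueError("person must be 1, 2 or 3")
    return PERSON_COEFFICIENT * person


# With a numeric coding, each step adds the same amount to the log-odds,
# so the odds are multiplied by the same factor at every step.
step_first_to_second = person_contribution(2) - person_contribution(1)
step_second_to_third = person_contribution(3) - person_contribution(2)
odds_factor_per_step = math.exp(PERSON_COEFFICIENT)

print(step_first_to_second, step_second_to_third, odds_factor_per_step)
```

A categorical coding would instead estimate separate coefficients for second and third person, allowing the two steps to differ.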
We believe that non-local referents, i.e. third person, are more likely to occur in the first slot, despite the person hierarchy, because of the effect of correlates, which nearly always occur preverbally (140 out of 148 cases, or 95%), as opposed to postverbally (eight out of 148 cases, or 5%). The number of local pronouns in our dataset is quite limited (a mere 250 across argument positions). One reason why local pronouns are so rare is that they are never nominative in our dataset (however, see below). Local pronouns are also more strongly tied to the second slot (197 out of 250 cases, or 79%). Interestingly, 118 out of 197 postverbal (dative) local pronouns compete with a nominative pronoun, and nominative pronouns show a strong tendency to take first position in any case.
Moreover, in accordance with the person hierarchy, if a nominative is a first or second person pronoun, i.e. a local pronoun, only the Nom-Dat word order is acceptable with alternating predicates in Icelandic. One such attested example, cited from Barðdal & Eythórsson (Reference Barðdal and Eythórsson2003), is given in (17a) with the nominative við ‘we’ in first position. As is shown in (17b), if the nominative is a local pronoun, the Dat-Nom construction is ungrammatical, since the nominative við ‘we’ cannot occur in the object position immediately following the non-finite verb. This analysis is further confirmed by the example in (17c), which shows that if the dative occurs in first position, the local pronoun must invert with the finite verb, which is a clear-cut subject property. For obvious pragmatic reasons, the constructed examples in (17b–c) render the dative fólki ‘people’ as a definite NP instead of an indefinite one.
Recall that at the beginning of this article (see examples 3–4), we provided evidence for the analysis that alternating verbs may instantiate two diametrically opposite argument structure constructions, Dat-Nom and Nom-Dat, and that neither structure is a topicalisation of the other. This is invariably true in all cases except when the nominative is a local pronoun, as in (17) above.
Returning now to the logistic regression analysis, another variable that shows a mild preference for the preverbal slot is animacy (value: individual), with a coefficient of .77. Evidently, this is fully in line with the expectation that animate beings should take precedence over both collectives and inanimates (Allan Reference Allan1987, Siewierska Reference Siewierska1993, Dahl & Fraurud Reference Dahl, Fraurud, Fretheim and Gundel1996, inter alia). Rott (Reference Rott2013) also found an effect for animacy in his study of four Icelandic alternating verbs. More specifically, he has shown that the nominative is hardly ever animate, but that it invariably precedes the dative when it is. Our data are indicative of a similar trend: out of 800 nominative constituents, a mere 14 are animate. Of these, 11 (or 79%) are attested in the Nom-Dat order, and three (or 21%) in the reverse Dat-Nom order. The same holds, mutatis mutandis, for the dative: it is hardly ever inanimate (19 out of 800 tokens), but when it is, it shows a very strong preference for the Nom-Dat order (15 tokens, or 79%) and not the Dat-Nom order (four tokens, or 21%).Footnote 6 Observe that we hereby dispel the myth that the dative of Dat-Nom verbs is animate by definition (see Kutscher Reference Kutscher2009:24, Verhoeven Reference Verhoeven2009, Reference Verhoeven2015, Rott Reference Rott2013:93). The tendency for the dative of such verbs to be animate is indeed very strong, but it is by no means an absolute.
Two variables that associate with the postverbal slot are length and definiteness (value: indefinite) (coefficients −.83 and −1.08, respectively). Again, these facts are consistent with what is known in the literature, namely that indefinite and heavy constituents tend to occur later in the clause (see Behaghel Reference Behaghel1909/10, Allan Reference Allan1987, Siewierska Reference Siewierska1993, Arnold et al. Reference Arnold, Losongco, Wasow and Ginstrom2000, inter alia). Specifically with regard to the Dat-Nom/Nom-Dat alternation, Rott (Reference Rott2013) has also found length to exert an effect, but his evidence for this factor only relates to the distribution of clausal nominatives, which greatly prefer the Dat-Nom order (82 out of 87 tokens, or 94%) to the Nom-Dat order (five out of 87 tokens, or 6%). Our own study conclusively confirms that length is a factor even when both arguments are (pro)nominal.
Rott also finds an effect for definiteness, but the results are again clouded by the high number of clausal nominatives in his dataset. Starting with the Nom-Dat order, he reports that 70 out of 73 nominatives (or 96%) are definite, compared to 55 out of 78 datives (or 71%). Thus, the Nom-Dat order clearly correlates with definite nominatives. As for the Dat-Nom order, 48 out of 94 datives (or 51%) are definite, compared to ten out of 12 nominatives (or 83%). However, since the Dat-Nom order so strongly correlates with clausal nominatives already (cf. supra), the speaker’s choice for this alternant has already been accounted for.
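The percentages attributed to Rott above follow directly from the counts given. As a quick arithmetic check, the sketch below recomputes them in Python. The dictionary layout and names are illustrative, and only the counts are taken from the text.

```python
# (definite, total) counts as reported in the text for Rott's data.
ROTT_DEFINITENESS = {
    ("Nom-Dat", "nominative"): (70, 73),
    ("Nom-Dat", "dative"): (55, 78),
    ("Dat-Nom", "dative"): (48, 94),
    ("Dat-Nom", "nominative"): (10, 12),
}


def percent(part: int, total: int) -> int:
    """Share of `part` in `total`, rounded to the nearest whole percent."""
    return round(100 * part / total)


definite_share = {
    key: percent(definite, total)
    for key, (definite, total) in ROTT_DEFINITENESS.items()
}

for (order, case), share in definite_share.items():
    print(f"{order}, {case}: {share}% definite")
```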
Our model also connects singular constituents with the second argument position, but this effect is considerably weaker (coefficient −.30) than those of the remaining variables.
5.2 Double-NP configuration
The results of the second logistic regression model, which is based solely on the 131 tokens containing double NPs, are shown in Table 10. The c-index is .7, indicating a lower predictive power than the previous model (whose c-index is .794), although still in a range that would be considered acceptable in some fields. A lower c-index is not unexpected given that the dataset on which the model is trained is smaller. We nevertheless consider the model worth reporting, especially since the model residuals do not indicate any particular issues.
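For readers unfamiliar with the c-index, the following Python sketch shows one common way of computing it for a binary outcome. It is the share of (first-position, second-position) token pairs in which the model assigns the higher predicted probability to the first-position token, with ties counted as one half. The function name and the toy predictions are hypothetical and do not come from the models reported here, and the text does not state how its own c-index values were computed.

```python
from itertools import product


def concordance_index(predicted, observed):
    """Concordance (c-index) for a binary outcome.

    predicted: predicted probabilities of the outcome coded 1
    observed:  observed outcomes, coded 1 or 0
    """
    positives = [p for p, y in zip(predicted, observed) if y == 1]
    negatives = [p for p, y in zip(predicted, observed) if y == 0]
    if not positives or not negatives:
        raise ValueError("both outcome classes are needed")

    score = 0.0
    for pos, neg in product(positives, negatives):
        if pos > neg:
            score += 1.0  # concordant pair
        elif pos == neg:
            score += 0.5  # tied pair counts as half
    return score / (len(positives) * len(negatives))


# Toy data (hypothetical): 1 = first argument slot, 0 = second slot.
toy_predicted = [0.9, 0.7, 0.4, 0.6, 0.3, 0.2]
toy_observed = [1, 1, 1, 0, 0, 0]
toy_c = concordance_index(toy_predicted, toy_observed)
print(toy_c)
```

In the toy data, eight of the nine pairs are concordant. A value of .5 corresponds to chance-level discrimination and a value of 1 to perfect discrimination.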
A comparison with the first model yields fascinating results. First, the current model singles out three variables that were also identified as strong predictors by the first model, i.e. nominative case marking, indefiniteness, and length (coefficients 1.16, −1.15, and −.54, respectively, corresponding to a 29% increase, a 28.8% decrease, and a 13.5% decrease in the likelihood of switching from second to first argument slot). The effects of case marking and length in particular seem to be somewhat mitigated compared to the first model, but both still generate highly significant p-values. Second, the weaker predictors pertaining to both animacy and number no longer appear to have any predictive power, and third, the person variable has been levelled out, as full NPs are self-evidently always third person.
The second logistic regression analysis essentially reveals two tendencies. First, as with the first analysis, it demonstrates the importance of nominative case marking for the first argument slot. Recall that the four alternating verbs under study yield a total of 71 Nom-Dat tokens and 60 Dat-Nom tokens. The difference between these two subsets is taken to be statistically significant, but, again, it remains to be investigated (i) whether a different sample would equally single out nominative case marking as a significant predictor, and (ii) whether other alternating verbs are equally sensitive to the effect of nominative case marking.
Second, the analysis shows that both indefinite and lengthy constituents have a proclivity for the second argument slot. The model reveals neither whether these variables are interrelated nor whether they correlate with case marking. We discuss these issues further in the following subsection.
5.3 Interim conclusions
The current section has provided an in-depth statistical analysis of the alternating verbs duga, dyljast, endast, and nægja by means of two logistic regression models, one scrutinising the results across configurations (n = 800), another exploring the results for the double-NP configuration (n = 131). We have identified several well-known predictors from studies of word order as playing a role here, most notably animacy, indefiniteness, and length. Both models also single out nominative case marking as a predictor for the first argument slot.
Starting with the first three predictors, we have found animate (dative) constituents to prefer the first argument slot and indefinite and long constituents to prefer the second argument slot. These findings are evidently interesting in themselves, but the key question is whether there is a greater generalisation to be made. In fact, as is already mentioned in Sections 3 and 4.3 above, these three factors, animacy, indefiniteness, and length, are all proxies for topicality. That is, topical constituents tend to be animate rather than inanimate, and non-topical constituents tend to be indefinite and long rather than definite and short (Givón Reference Givón and Li1976:152, Croft Reference Croft2003:178–179, Arnold et al. Reference Arnold, Kaiser, Kahn and Kim.2013:406, Cristofaro Reference Cristofaro, Bakker and Haspelmath2013:74, Booth & Beck Reference Booth and Beck2021:11, inter alia). Thus, apart from their relevance as individual predictors, the factors in question also suggest that word order with alternating verbs is a derivative of discourse prominence (cf. Barðdal Reference Barðdal2001, Barðdal, Eythórsson & Dewey Reference Barðdal, Eythórsson and Kim Dewey2014, Reference Barðdal, Eythórsson and Kim Dewey2019).
Turning now to the last predictor correlating with the first argument slot, nominative case marking, the question arises whether the nominative is a factor in itself or whether the case marking is an epiphenomenon of other factors, such as, for instance, pronominality. Out of 331 nominative pronouns in first position, 42%, or 140 instances, are correlates, which show a clear preference for first position in Icelandic in any case (Rögnvaldsson Reference Rögnvaldsson2002, Thráinsson Reference Thráinsson2007:366–367, inter alia). A closer inspection of the [Pro-V-Pro] configuration, including correlates, reveals that nominative pronouns in first position show an average length of 1.1 words, while dative pronouns in second position have an average word length of 1.9. This suggests that the nominative-first effect with two pronouns is a consequence of length.
However, such a length effect with nominatives in first position is not found for the double-NP configuration. Instead, nominatives in first position turn out to be definite in 43 out of 71 instances (61%), which clearly makes them topical. What is more, a closer look at the 28 indefinite examples of nominatives in first position reveals that, despite being indefinite, they are either specific or simply more topical than the dative, as is evident from example (18) below.
Observe that the indefinite nominative, rigningarvatn ‘rainwater’, in the Nom-Dat example above reiterates information mentioned earlier in the discourse, with both the former (accusative) rigningarvatn ‘rainwater’ and rignir ‘rains’ rendering the latter (nominative) rigningarvatn ‘rainwater’ highly topical, despite it being indefinite. This not only shows that topicality may not simply be reduced to one (or more) of its proxies but also that it has considerable explanatory power of its own. At least for the [NP-V-NP] configuration, it seems promising to explicitly factor in topicality as a predictor variable, because the effect of nominative case uncovered in the present study appears to be an epiphenomenon of a topic-first effect rather than a veritable nominative-first effect in itself.
6. Summary and conclusions
In this article we have succeeded in lending empirical support to the claim that behavioural subjects in Modern Icelandic are strongly tied to clause-initial position, regardless of whether these are marked in the nominative or the dative case. For this purpose, we have extracted 200 examples each of 15 verbs from the Icelandic Web 2020 corpus, thus amounting to a total of 3,000 tokens, all occurring with a dative and a nominative. The verbs fall into three classes: the first consists of five ordinary Nom-Dat verbs like hjálpa ‘help’, the second of five classical Dat-Nom verbs like líka ‘like’, and the third of five alternating Dat-Nom/Nom-Dat verbs like nægja ‘find/be sufficient’.
The dataset has been annotated for nine variables: case marking, (pro)nominality, type of pronoun, referentiality, person, number, definiteness, animacy, and length. Our goal has been to provide statistical evidence of our alternating analysis for nægja-verbs, namely that these verbs alternate between two word orders, dative-before-nominative and nominative-before-dative, due to the fact that they instantiate two diametrically opposite argument structures, i.e. Dat-Nom and Nom-Dat.
We first establish a baseline with the help of ordinary Nom-Dat verbs, or hjálpa-verbs, and non-alternating Dat-Nom verbs, or líka-verbs, in configurations with two full NPs. It turns out that both these verb classes, hjálpa- and líka-verbs, realise the syntactic subject clause-initially 99.5% of the time. In contrast, for alternating Dat-Nom/Nom-Dat verbs, i.e. nægja-verbs, our findings generally confirm that the subject is the first argument of the argument structure, be that the dative or the nominative.
When nægja-verbs occur with two full NPs, their distribution is considerably less skewed towards one of the two argument structure constructions than with either hjálpa- or líka-verbs. There are, however, substantial differences found across verbs, with the Nom-Dat case frame attested more frequently than the Dat-Nom case frame, or in 72% vs. 28% of the cases. This number of 28% Dat-Nom is considerably higher than the 0.5% baseline for topicalisation documented with hjálpa- and líka-verbs above, and it is also noticeably higher than the 8% topicalisation documented by Callegari & Ingason (2021). This, in turn, rules out a topicalisation analysis of dative-before-nominative word orders with nægja-verbs. As a matter of fact, there is one particular verb, henta, that behaves unexpectedly in that it occurs consistently with the Nom-Dat linear order, irrespective of whether the two arguments are realised as full NPs or as pronouns. Thus, when recalculating the numbers for full NPs without the outlier, henta, the distribution amounts to 54% Nom-Dat vs. 46% Dat-Nom. Again, this rules out a topicalisation analysis of the dative-before-nominative order altogether.
Our analysis of nægja-verbs has also shown that their word order distributions are considerably more prone to pronominal influence than the ones attested for either hjálpa- or líka-verbs. More specifically, in contexts where the nominative is pronominal, nægja-verbs strongly prefer the nominative to precede the dative. However, contexts in which a dative pronoun enters into competition with a nominative full NP show the same word order distributions as the [NP-V-NP] configuration. These findings confirm the status of alternating Dat-Nom/Nom-Dat verbs as a syntactic class in their own right, distinct from non-alternating Dat-Nom verbs.
Finally, we have modelled the word order variation of nægja-verbs statistically. Recall that we removed henta from our dataset, as its frequencies were unexpectedly skewed. Across configurations, word order patterns are prone to a host of factors, including nominative case marking, indefiniteness, length, animacy, and person. For the double-NP configuration, the logistic regression analysis has identified nominative case marking, indefiniteness, and length as the most important predictors.
As it turns out, the factors underlying the variation in word order, both across configurations and in the double-NP configuration, converge nicely in that all of them appear to reflect topicality in one way or another. After all, topicality is highly interwoven with animate, pronominal, definite, and short constituents. The only two exceptions to the topical-first trend that we have uncovered involve nominative case marking and person. Starting with person, third person arguments are generally relatively equally divided across the two positions, except for correlates, which occur in first position 95% of the time. Thus, we believe that the third person effect, detected in the logistic regression analysis for first position, is an epiphenomenon of the behaviour of correlates.
Turning to nominative case marking, we have also shown that in the double-pronoun configuration, which favours the Nom-Dat word order, the preverbal nominative pronoun is considerably shorter than the postverbal dative pronoun, suggesting that the real issue here is length rather than case marking. Regarding the configuration with double NPs, 61% of the nominatives in first position are definite, again confirming the role of topicality. The remaining 39% of the preverbal nominatives are indefinite, yet an initial inspection of these instances shows that the majority are topical, although some are specific. This again validates the role of topicality, also for double NPs, confirming that the strong nominative-first effect observed is an artefact of topicality.
To conclude, comparing nægja- and líka-verbs, we have shown that the former, but not the latter, have a choice between two alternating constructions, Dat-Nom and Nom-Dat. It turns out that well-worn pragmatic factors such as topicality govern the choice between the two diametrically opposite constructions with nægja-verbs. In contrast, with líka-verbs, the grammar does not provide this option to begin with, meaning that this verb class is confined to the Dat-Nom argument structure construction.
Regarding future research, the most pressing issue at this point is a comparison of the behaviour of alternating Dat-Nom/Nom-Dat verbs across the languages where such a class has been shown to exist, for instance Russian, Lithuanian, Romanian, Latin, and Ancient Greek (cf. Barðdal Reference Barðdal2023:Ch. 3 and the references therein). A particularly promising comparison is one between Modern Icelandic and Modern German, due to their close kinship. For a first attempt at such a venture, see Somers & Barðdal (Reference Somers and Barðdal2023), although a more fine-grained analysis of the relevant data is needed to improve our understanding of the factors at play.
Acknowledgements
This is a heavily revised version of Somers & Barðdal (Reference Somers and Barðdal2022) in Working Papers in Scandinavian Syntax (WPSS 107). For comments and/or discussions, we thank Johan Brandtler, Ludovic De Cuypere, Torsten Leuschner, the editor, Marit Julien, three anonymous reviewers of NJL, and the audiences at Constructions in the Nordics 3 in Kiel in September 2022, at the Belgian Taaldag in Liège in October 2022, at the North by Northwest seminar at Lyon University in November 2022, at the VII CONECT Internacional in Brazil in November 2022, at the Amazonicas IX in Bogotá, Colombia, in June 2023, and the audience at the Forschungskaleidoskop seminar at Hamburg University in June 2023. This research is a part of a larger project on Language Productivity at Work (Co-PI Jóhanna Barðdal), generously funded by Ghent University’s Special Research Fund’s Concerted Research Action Scheme (BOF-GOA grant no. 01G01319).
Authors’ contributions
All three authors designed the research and planned and wrote the manuscript; JB and JS gathered the data; JS coded the data; GBJ performed the computational analysis; all authors contributed equally to the discussion and the interpretation of the results.