Introduction
A classic topic in the field of Second Language Acquisition (SLA), and the cognitive sciences at large, concerns the role of age of acquisition for nativelike attainment in a second language (L2). Since Lenneberg's (1967) formulation of the Critical Period Hypothesis (CPH), well over a hundred studies have sought to ferret out the effects that timing of exposure exerts on L2 acquisition, showing that those who start learning the L2 in childhood in the long run outperform those who start in adulthood. As classic a topic as age of acquisition effects is, it is also highly controversial, having instigated vigorous discussions throughout the decades. The debate has largely focused on the ultimate cause of age effects – that is, whether they are biological, experiential, socio-psychological, cognitive, etc. in nature – rather than on their actual existence.
Recently, however, the finding that individuals who acquired the L2 during childhood do not always converge fully with native speakers has called into question age of acquisition as the cause of such near-native (rather than fully nativelike) attainment. As an alternative explanation, it has been suggested that, rather than age of acquisition, bilingualism – in the sense of either bilingual acquisition, bilingual use, or both – accounts for the subtle non-native features in early-learner ultimate attainment, and, by inference, also the near-nativeness of exceptionally advanced adult L2 learners (e.g., Birdsong, 2018; Birdsong & Quinto-Pozos, 2018; de Leeuw, 2014; Ortega, 2010, 2013; Pfenninger & Singleton, 2017). This suggestion relates to the fact that most studies on nativelike attainment compare L2 speakers who have retained their first language (L1), and therefore are functionally bilingual, with native speakers who are functionally monolingual, thus effectively confounding age of acquisition effects with bilingualism effects.
The methodological practice of comparing bilingual L2 speakers with monolingual L1 speakers becomes particularly problematic in the light of frameworks suggesting that the linguistic behavior of bilinguals inherently differs from that of monolinguals (e.g., Cook, 1999, 2016; Flege, 1999; Grosjean, 1998), as this may ultimately render any observations on age effects inconclusive. However, despite various iterations of the notion of bilingualism effects on L2 ultimate attainment, few studies have actually attempted to address this question empirically. Thus, while it is indeed an intriguing possibility that bilingualism, rather than age of acquisition, underlies the subtle non-nativelikeness of many childhood (as well as exceptionally advanced adult) learners, this suggestion largely remains at the level of speculation due to the absence of solid empirical data.
The current study aims to address this gap by assessing the relative impact of age of acquisition and bilingualism on L2 ultimate attainment. To achieve this, the study introduces a unique experimental design, which, in addition to an L2 bilingual group and an L1 functionally monolingual group, includes simultaneous bilinguals and international adoptees. In this design, the two variables, age of acquisition (from birth vs. after birth) and language status (monolingualism vs. bilingualism), are fully crossed.
Background
The nativelikeness paradigm in CPH research
The notion that biologically scheduled changes in brain plasticity underlie child-adult differences in L2 ultimate attainment would seem to find support in research showing that non-maturational variables such as length of L2 exposure, educational level, and motivation, while important in (especially adult) L2 acquisition, only exert marginal impact compared to age of acquisition (AoA). Indeed, studies using partial correlations or regression analyses have repeatedly shown that the contributions of experiential and socio-psychological variables drop considerably (often to non-significant levels) when the AoA variable is partialled out, whereas the impact of AoA remains strong and relatively unaffected when the contributions from these variables are removed (e.g., Abrahamsson, 2012; DeKeyser, 2000; DeKeyser, Alfi-Shabtay & Ravid, 2010; DeKeyser & Larson-Hall, 2005; Granena & Long, 2013; Johnson & Newport, 1989). To this end, then, the maturation of the brain would still seem a strong explanatory candidate for AoA effects. However, despite some promising explanatory frameworks, such as the scheduled process of myelination of language-related cortical areas (e.g., Pulvermüller & Schumann, 1994) or the age-related switch from (predominantly) implicit/procedural memory to (predominantly) explicit/declarative memory in language development (e.g., Paradis, 2004, 2009; Ullman, 2004, 2015), any operationalizable neurophysiological correlates to maturation that can be closely associated with AoA are still lacking.
Moreover, the AoA variable may well disguise the effects of any as yet unmeasurable (or hitherto ill-measured) non-maturational factor(s) – adult speakers’ retrospectively self-assessed motivation for acquiring the L2 in childhood being just one of many examples. With the substance of the AoA variable still shrouded in darkness, the correlational approach in CPH research thus finds itself in an unfortunate deadlock.
Therefore, an alternative way of addressing the impact of maturational constraints has been to look exclusively for individual counterexamples to the hypothesis that only child learners are capable of attaining nativelike L2 proficiency and behavior. We refer to this approach as the ‘nativelikeness paradigm’. The Popperian rationale behind the approach, as originally presented in detail by Long (1990, 1993; see also Long, 2007, 2013), is that, if at least one such individual post-critical period learner could be identified who, even after broad and detailed scrutiny, can be shown to exhibit the same linguistic knowledge and behavior as native speakers, then the CPH can be safely rejected, and the well-documented average adult disadvantage should instead be ascribed to factors other than neurobiology.
Long (1990, 1993) moreover recommended that researchers should use only linguistic tasks and structures that highly advanced learners potentially do not command; that the level of cognitive demand, item difficulty, and linguistic scrutiny in nativelikeness studies should be significantly higher than in studies of beginner or intermediate L2 proficiencies; and that a broad range of language abilities (rather than narrowly selected linguistic features of a limited language domain) should be scrutinized in these learners’ ultimate attainment (for similar arguments and elaborations of this last point, see, e.g., Abrahamsson & Hyltenstam, 2009; DeKeyser, 2012; Granena & Long, 2013; Sorace & Robertson, 2001; Veríssimo, 2018; Veríssimo, Heyer, Jacob & Clahsen, 2018).
A previous project from the Stockholm lab (reported in Abrahamsson & Hyltenstam, 2008, 2009; see also Bylund, 2011; Bylund, Abrahamsson & Hyltenstam, 2010, 2012; Hyltenstam, Bylund, Abrahamsson & Park, 2009; Stölten, Abrahamsson & Hyltenstam, 2014, 2015) aimed to follow Long's (1990, 1993) recommendations as closely as possible. The focus was set exclusively on L2 speakers who passed for native speakers in everyday oral interaction, the rationale being that there is no point in subjecting obviously non-nativelike speakers to extensive linguistic scrutiny just to declare them non-nativelike. A total of 195 candidates, who self-reported as potentially nativelike L2 speakers of Swedish (AoA 1–47 y/o), were first screened through naïve native listener judgments of their spontaneous speech. Out of these, 41 speakers were eventually selected, all of whom were perceived as native speakers by a majority of the judges (minimally 6 out of 10), and were subjected to detailed linguistic scrutiny through a challenging test battery. Thirty-one of these were early learners (AoA 1–11 y/o), and ten were late learners (AoA 13–19 y/o).
The results revealed that every late (seemingly nativelike) learner, and many of the early learners, were in fact near-native (as opposed to nativelike) when scrutinized in detail. For example, when the production and the categorical perception of voice onset time (VOT) were combined, for all three (i.e., bilabial, dental, and velar) places of articulation, as predicted, none of the 10 late learners fell within the native-speaker range, while, at the same time, only 16 of the 31 early learners did so (see Stölten et al., 2014). When these same learners’ performance on 10 different accuracy and processing measures within various domains and modes of their L2 Swedish (phonology, morphosyntax, lexis, perception through different types of noise, etc.) was analyzed, the pattern was even clearer: again, none of the adult learners, and only a handful of the early learners, performed within the range of native-speaker controls on a majority of the measures (for details, see Abrahamsson & Hyltenstam, 2009; for similar patterns in other advanced-learner samples, see Abrahamsson, 2012; Hyltenstam, 1992; Hyltenstam & Abrahamsson, 2003a; Hyltenstam et al., 2009).Footnote 1
The findings reported in Abrahamsson and Hyltenstam (2009) were then taken as evidence that even short delays of language exposure may have minor but scientifically detectable consequences for L2 ultimate attainment, potentially indicative of the brain's decreasing capacity for nativelike language acquisition already at early AoAs. Such a conclusion is on par with accounts of atypical L1 development where small delays in L1 exposure compromise ultimate attainment, as seen both in congenitally deaf children with delayed sign-language exposure (see, e.g., Mayberry & Kluender, 2018; Morford & Mayberry, 2000) and in children with severe otitis media during their first year of life (e.g., Mody, Schwartz, Gravel & Ruben, 1999; Ruben, 1999) (for an overview, see Werker & Hensch, 2015).
That brain maturation is a potential cause of childhood learners’ less than nativelike L2 ultimate attainment, is not, however, an interpretation that has been embraced by everyone. Instead, results such as those above have been re-interpreted by several scholars as evidence that bilingualism, not maturation, is what lies behind the less than nativelike ultimate attainment of both early and exceptionally advanced late learners. This argument will be reviewed next.
Monolingual bias, bilingualism effects, and the ‘bi/multilingual turn’ in CPH research
The status of the L2 learner's L1, and the role of cross-linguistic influence generally, has fluctuated considerably over time in SLA theory building. From having been given an absolute role during the behaviorist (pre-modern SLA) era, via a next to negligible role during the first decades of interlanguage theory development and (mainly) nativist SLA, learners’ L1 and their bilingualism at large have been gradually resurrected as central components in recent (notably, connectionist/emergentist) SLA theorizing. Several modern-day cognitivist theorists would argue that the successive, age-related entrenchment of the L1 and/or the active use of two languages are the major reasons why nativelike L2 competence and behavior are not attained (e.g., Flege, 1999; Herschensohn, 2007; Pallier, 2007; Vanhove, 2013). Accordingly, the theoretical account currently gaining interpretative prerogative in the CPH debate holds that less than nativelike ultimate attainment is to be expected even in very advanced (be it early or late) L2 learners, simply because “nonmonolingual-likeness in terms of proficiency /…/ is a defining characteristic of bilingualism” (Birdsong, 2014, p. 377).
In line with Grosjean's (1989) statement that the bilingual is not two monolinguals in one person, various theoretical approaches to SLA, such as the Multicompetence framework (e.g., Cook, 1991, 2003, 2016), the Competition Model (e.g., MacWhinney, 1999, 2016), the Speech Learning Model (e.g., Flege, 1999), and the Interference Hypothesis (Pallier, Dehaene, Poline, LeBihan, Argenti, Dupoux & Mehler, 2003; Ventureyra, Pallier & Yoo, 2004), all point to the inherent difference between monolingual competence and the unique linguistic competence that emerges from the existence of two language systems in one mind (for a contrasting view, see e.g., Meisel, 2008, 2017; also Montrul & Ionin, 2010).
In view of this reasoning, various reinterpretations have been suggested for the results of the Abrahamsson and Hyltenstam (2009) study, along with general criticisms of Long's nativelikeness paradigm. On this latter point, Birdsong (2018) argues that, because of “coactivation and bidirectional effects, neither the first nor the second language of bilinguals can be expected to resemble under scrutiny that of monolinguals in either language” (p. 6), thus making it “unreasonable to hold up a standard of ‘across-the-board monolingual nativelikeness’ in the L2 as a criterion for falsifying the CPH” (ibid.) (see also Birdsong, 2005, 2006; Birdsong & Gertken, 2013; Birdsong & Quinto-Pozos, 2018). In a similar fashion, Vanhove (2013) holds that “the linguistic repertoires of mono- and bilinguals differ by definition and differences in the behavioural outcome will necessarily be found, if only one digs deep enough” (p. 2), and he warns us against raising the bar for highly accomplished L2 learners “to Swiftian extremes” (ibid.).Footnote 2 Consequently, and in line with what has been launched as “the bi/multilingual turn in SLA” (Ortega, 2010, 2013), the very comparison with monolingual speakers has been deemed theoretically misguided and it has been recommended that it should be abandoned in CPH (or, even, all SLA) research; since ‘nativelike’ is considered synonymous with ‘monolingual-like’, the expected maximal ‘bilingual-like’ ultimate attainment should be equivalent to what has hitherto been (mis)taken for ‘near-native’ proficiency, regardless of learners’ AoA.
Accordingly, it has been suggested by several authors (e.g., Birdsong, 2005, 2018; de Leeuw, 2014; Ortega, 2010, 2013; Cook, 1999, 2003, 2016; Muñoz & Singleton, 2011) that the comparative standard should be shifted from monolingual language proficiency to the simultaneously acquired bilingual ultimate attainment of ‘crib bilinguals’. For example, de Leeuw (2014) sees the conclusions in Abrahamsson and Hyltenstam (2009) as premature, as the study potentially suffered from monolingual-speaker bias. According to her, the inclusion of an additional participant group, consisting of simultaneous bilinguals who acquired two languages from birth, would have been necessary, because
only if the /…/ simultaneous bilinguals performed according to monolingual proficiency levels, whereas the /…/ non-native speakers /…/ did not, would it have been possible to ascertain that biologically determined maturational constraints impede L2 acquisition. If, on the other hand, both simultaneous and late consecutive bilinguals performed deviantly to monolingual norms, an alternative explanation would be required. (de Leeuw, 2014: 35)
That bilingualism, rather than brain maturation, might be the best candidate for explaining any subtle differences between native and near-native ultimate attainment is indeed a theoretically intriguing hypothesis that, in our view, merits thorough empirical testing. When considering the past decades’ explosion of research suggesting that bilingualism brings about cognitive advantages (in terms of divergent thinking, enhanced executive control, delayed symptoms of dementia, etc.), as well as linguistic costs (particularly in terms of a so-called bilingual lexical deficit; for overviews, see, e.g., Bialystok, 2009, 2016, 2017), the hypothesis seems well-motivated. However, the widespread reliance on this research is actually what constitutes the core problem of the current CPH debate, as the bilingualism-effects argument largely rests on indirect inferencing from non-CPH/non-ultimate attainment research. For example, when Singleton and Pfenninger (2018) assume that “[t]he reason for the slight differences between native speakers and native-like non-natives /…/ almost certainly has to do with the effects of multi-competence /…/ rather than age” (p. 260; emphasis added), they are certainly not the only ones to engage in guesswork based on research that set out to investigate something other than the relative roles of AoA and bilingualism for ultimate attainment. This is clearly problematic for a number of reasons.
To begin with, it should be noted that the bilingual cognitive advantage has been seriously challenged, both in a comprehensive meta-analysis (Lehtonen, Soveri, Laine, Järvenpää, de Bruin & Antfolk, 2018), and in a recent large-scale study (Dick, Garcia, Pruden, Thompson, Hawes, Sutherland, Riedel, Laird & Gonzalez, 2019), showing that there is no robust evidence of enhanced executive functioning in bilinguals. Secondly, and more importantly for the current argument, the majority of studies claiming to show a lexical deficit in bilinguals have actually ignored the AoA dimension or disregarded the crucial distinction between simultaneous and sequential bilingualism. Because of this, it is notoriously difficult to tell whether the lexical behavior attested in those bilingual samples is an artefact of bilingualism or of L2 status. Indeed, a recent study showed that when AoA is taken carefully into account, the alleged bilingual lexical deficit turns out to predominantly be an L2 effect (Bylund, Abrahamsson, Hyltenstam & Norrman, 2019). Taken together, these findings seriously undermine several assumptions on which arguments of bilingualism effects rest.
Moreover, when ultimate attainment studies have indeed included simultaneous bilinguals, results do not necessarily indicate consistent differences in proficiency or neurophysiology between monolinguals and simultaneous bilinguals (e.g., Berken, Gracco & Klein, 2017; Klein, Mok, Chen & Watkins, 2014; Reetzke, Lam, Xie, Sheng & Chandrasekaran, 2016; Veríssimo et al., 2018). In those instances where different proficiency scores are indeed documented between monolinguals and simultaneous bilinguals (e.g., Hartshorne, Tenenbaum & Pinker, 2018; Sundara, Polka & Baum, 2006), it is unclear whether the language under scrutiny was the participants’ dominant or non-dominant language, and whether it was the majority language or a heritage language – both of which are absolutely crucial factors to control for when performing group comparisons with monolingual majority-language speakers.
A logical extension of the bilingualism-effects argument is that the less L1 knowledge there is (and consequently, the lower the L2 speaker's degree of bilingualism), the greater the possibility of attaining nativelike/monolingual-like L2 proficiency – the extreme situation of total L1 loss offering the likeliest prospect for such attainment. This reasoning is captured in the Interference Hypothesis, the empirical basis of which is a series of studies (Pallier et al., 2003; Ventureyra et al., 2004; see also Pallier, 2007) showing that international adoptees seem to have completely forgotten their childhood L1, as evidenced both through behavioral tests and through fMRI responses (however, for counterevidence, see Choi, Cutler & Broersma, 2017b; Park, 2014; Pierce, Klein, Chen, Delcenserie & Genesee, 2014; Singh, Liederman, Mierzejewski & Barnes, 2011), while at the same time having attained (allegedly) nativelike proficiency in the L2. The conclusions drawn by these researchers were that if the L1 is lost at some point in childhood, the neural network can “reset” (Ventureyra et al., 2004: 89), which will allow for monolingual acquisition and a nativelike ultimate attainment. Conversely, the reason why some childhood L2 learners who maintain their L1 (such as immigrant children) do not attain nativelike L2 proficiency is that their L1 “acts as a filter that distorts the way in which a second language can be acquired” (Pallier et al., 2003: 160).
The problem, however, is that these studies performed no systematic linguistic assessment of the adoptees’ L2 proficiency. Instead, the claim about L2 nativelikeness was based on the test administrators’ impressions of the adoptees’ L2 speech. Subsequent studies examining the L2 of international adoptees with proper experiments have instead found that this group exhibits the same levels of (non-nativelike) proficiency as L2 speakers who have retained their L1 (Gauthier & Genesee, 2011; Hyltenstam et al., 2009; Norrman & Bylund, 2016; see also Gauthier, Genesee & Kasparian, 2012; Pierce, Chen, Delcenserie, Genesee & Klein, 2015). Moreover, several studies have shown that international adoptees often display L1 remnants (e.g., Choi, Broersma & Cutler, 2017a; Choi et al., 2017b; Hyltenstam et al., 2009; Park, 2014; Pierce et al., 2014; Singh et al., 2011). Yet, the ideas of complete L1 loss, ‘neural resetting’, and monolingualism as prerequisites for nativelike L2 acquisition seem to be treated as facts in the CPH debate.
Aims, design, and hypotheses of the present study
Given that the current debate on AoA and bilingualism effects in L2 acquisition is characterized by a scarcity of hard evidence, the current study set out to empirically assess the relative impact of AoA vs. bilingualism on ultimate attainment in seemingly nativelike L2 speakers. To do so, we introduced a novel methodological design, in which the issue of monolingual-likeness is addressed through the addition of, first, simultaneous bilinguals as a comparison group, as advised by proponents of the bi/multilingual turn in SLA, and second, sequential monolingual L2 learners (here in the form of adult L2 speakers who were internationally adopted in early childhood), as per the notion that L1 loss increases the likelihood of nativelike L2 attainment. This yielded a 2 (AoA: from birth vs. after birth) × 2 (monolingualism vs. bilingualism) factorial design (see Table 1), for which the following two alternative hypotheses were postulated:
1. The AoA-effects hypothesis
‘Nativelikeness’ is made possible by language exposure beginning at birth; ‘non-nativelikeness’ is the result of language exposure beginning later than birth. This hypothesis predicts a stand-alone main effect of AoA.
2. The bilingualism-effects hypothesis
‘Nativelikeness’ is made possible by monolingual language acquisition and use (and should instead be labeled ‘monolingual-likeness’); ‘non-nativelikeness’ is the result of bilingual language acquisition and use (and should instead be called ‘bilingual-likeness’). This hypothesis predicts a stand-alone main effect of bilingualism.
In addition to these potential outcomes, alternative results may also be attested, manifested as an interaction between, or a confluence of, bilingualism and AoA.
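The crossed design and the two hypotheses can be restated as a short sketch. The Python below is illustrative only: the `Group` type, the group labels, and the function names are assumptions introduced here, and the "predictions" are simply the two hypotheses written as rules, not results.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Group:
    label: str                 # descriptive label (hypothetical naming)
    exposure_from_birth: bool  # AoA factor: Swedish from birth vs. after birth
    monolingual: bool          # bilingualism factor: mono- vs. bilingual


# The four participant groups, one per cell of the 2 x 2 design.
GROUPS = [
    Group("monolingual L1 speakers", True, True),
    Group("simultaneous bilinguals", True, False),
    Group("international adoptees", False, True),
    Group("childhood immigrants (L2 bilinguals)", False, False),
]


def is_fully_crossed(groups):
    """True if every combination of the two binary factors is represented."""
    cells = {(g.exposure_from_birth, g.monolingual) for g in groups}
    return len(cells) == 4


def nativelike_under_aoa_hypothesis(group):
    """AoA-effects hypothesis: nativelikeness requires exposure from birth."""
    return group.exposure_from_birth


def nativelike_under_bilingualism_hypothesis(group):
    """Bilingualism-effects hypothesis: nativelikeness requires monolingualism."""
    return group.monolingual


if __name__ == "__main__":
    print("fully crossed:", is_fully_crossed(GROUPS))
    for g in GROUPS:
        print(
            f"{g.label}: AoA hypothesis -> {nativelike_under_aoa_hypothesis(g)}, "
            f"bilingualism hypothesis -> {nativelike_under_bilingualism_hypothesis(g)}"
        )
```

Written this way, the two hypotheses agree on the monolingual L1 speakers and the childhood immigrants and disagree only on the simultaneous bilinguals and the adoptees, which is why these two added groups carry the weight of the comparison.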
Method
Participants
The following 80 participants took part in the study.
Monolingual L1 speakers of Swedish (n = 20)
The speakers in this group (M age = 29.8) were ‘crib monolinguals’. They were born in Sweden to L1 Swedish parents, and had acquired Swedish from birth as their only language. They had grown up in Sweden, and used Swedish in their everyday lives for communicative purposes. These participants were recruited through word of mouth and flyers distributed throughout Stockholm.
Simultaneously bilingual L1 speakers of Swedish and Spanish (n = 20)
These speakers (M age = 32.2) were ‘crib bilinguals’. They were born in Sweden to one Swedish-speaking parent and one Spanish-speaking parent. They had acquired both Swedish and Spanish from birth, and used both languages for everyday communication. These participants were recruited through newspaper advertisements.
Sequentially monolingual L2 speakers of Swedish (n = 20)
This group (M age = 33.7) comprised childhood adoptees who were born in Spanish-speaking countries in Latin America and adopted to Sweden between 3 and 7 years of age (M age of arrival = 4.3). According to self-reports, they had lost proficiency altogether in their L1 Spanish shortly after adoption, and had not engaged in relearning activities. In Sweden, they were brought up in L1 Swedish-speaking families, and consequently acquired Swedish as an L2. They used only Swedish for everyday communicative purposes. These participants were recruited through newspaper advertisements, adoption associations, adoption agencies, and social media.
Sequentially bilingual speakers of L1 Spanish and L2 Swedish (n = 20)
The participants in this group (M age = 28.8) were born in Latin American countries to L1 Spanish-speaking parents and thus had acquired Spanish from birth. Together with their families, they immigrated to Sweden between the ages of 3 and 8 years (M age of arrival = 5.2), which was when their acquisition of L2 Swedish commenced. These individuals had continued using their Spanish since arrival, and reported using both Spanish and Swedish in their everyday lives. These participants were recruited through newspaper advertisements.
As seen in Table 2, the L2-speaker groups (i.e., the ‘childhood adoptees’ and the ‘childhood immigrants’) did not differ significantly in terms of age of L2 acquisition. The bilingual groups (i.e., the ‘crib bilinguals’ and the ‘childhood immigrants’) did not differ in terms of Spanish language knowledge, as measured by their performance on a Spanish cloze test (Bylund et al., 2019), or in their everyday use of Spanish and Swedish. Through schooling in Sweden, all participants had acquired foreign language skills in English and at least one other modern language, such as French or German. All participants spoke Swedish without any noticeable phonological, grammatical, or lexical deviations, as impressionistically judged by a linguistically trained, Swedish native-speaker research assistant. Groups were also matched in terms of education and gender.
Note to Table 2. Standard deviations are reported in brackets. (a) The difference in current age between the L1 bilinguals and L2 monolinguals was not significant after Bonferroni alpha correction.
Materials and procedure
Data was elicited on speech production and perception, morphosyntax (accuracy and response latencies), and formulaic language, thus covering a fairly broad range of language competence and processing abilities. The linguistic instruments were identical to 7 of the 10 instruments used by Abrahamsson and Hyltenstam (2009).Footnote 3
Instruments 1 and 2: production and perception of voice onset time (VOT)
The time interval between the release of a stop consonant and the onset of periodicity of the following vowel is generally referred to as voice onset time, or VOT. Spanish and Swedish differ as to where on the voicing continuum the voiced/voiceless categories separate: Spanish category boundaries are located at low, usually negative VOT values, whereas Swedish boundaries are found at higher, usually positive values (see, e.g., Lisker & Abramson, 1964).
In the production task (Instrument 1), the participants’ reading aloud of the Swedish words par (‘pair’), tal (‘number’), and kal (‘naked’) was recorded. Each word was read in isolation 10 times, yielding a total of 2,398 data points (3 words × 10 readings × 80 participants − 2 unmeasurable tokens). Spectral analyses of the VOT of /p/, /t/, and /k/ were made in Praat (Boersma, 2002), measuring the time interval between the onset burst of the stop and the onset of vowel periodicity. Because VOT duration varies as a function of speech rate (e.g., Johnson & Wilson, 2002; Schmidt & Flege, 1996; Volaitis & Miller, 1992), VOT values in milliseconds were converted into relative VOT values, calculated as percentages of word duration (for further detail on such a procedure, see, e.g., Stölten et al., 2015). Word duration was operationalized as the interval spanning from the onset of the release burst to the end of the periodicity of the final /l/ or /r/. The production task took approximately 5 minutes to complete.
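As a minimal sketch of the two calculations described above, the Python below converts an absolute VOT into a relative VOT (percentage of word duration) and reproduces the token count. The function and constant names are assumptions made for illustration, and the example measurement (60 ms VOT in a 400 ms word) is hypothetical, not data from the study.

```python
def relative_vot(vot_ms: float, word_duration_ms: float) -> float:
    """Return VOT as a percentage of word duration.

    word_duration_ms is taken to run from the onset of the release burst
    to the end of periodicity of the final consonant, as described in the text.
    """
    if word_duration_ms <= 0:
        raise ValueError("word duration must be positive")
    return 100.0 * vot_ms / word_duration_ms


# Token count for the production task.
WORDS = ("par", "tal", "kal")
READINGS_PER_WORD = 10
PARTICIPANTS = 80
UNMEASURABLE_TOKENS = 2

total_tokens = len(WORDS) * READINGS_PER_WORD * PARTICIPANTS - UNMEASURABLE_TOKENS

if __name__ == "__main__":
    # Hypothetical token: 60 ms VOT in a word lasting 400 ms.
    print(relative_vot(60.0, 400.0))  # 15.0 (percent of word duration)
    print(total_tokens)               # 2398
```

Normalizing by word duration means that a slow and a fast reading of the same word can be compared on the same scale, which is the stated motivation for using relative rather than absolute values.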
The categorical perception test (Instrument 2) included the minimal pairs par-bar (‘pair’-‘bar’), tal-dal (‘number’-‘valley’), and kal-gal (‘naked’-‘crow(s)’ (Vpres)). Each word had been recorded in an anechoic chamber by a native female speaker of Swedish, and a 5-msec-step VOT continuum ranging from −60 to +90 msec was then created for all word pairs (for details on the preparation of stimuli, see Stölten et al., Reference Stölten, Abrahamsson and Hyltenstam2014). The stimulus items were presented in E-Prime v.2.0 (Psychology Software Tools, Inc.; Schneider, Eschman & Zuccolotto, Reference Schneider, Eschman and Zuccolotto2002a, Reference Schneider, Eschman and Zuccolotto2002b) through PC-350 headphones in different randomized orders for all participants. Each word was preceded by the carrier phrase Nu hör du… (‘Now you will hear…’), and the participants’ task was to indicate by pressing one of two buttons whether they heard a word beginning with a voiceless stop, /p, t, k/, or a voiced stop, /b, d, ɡ/. The perception test took approximately 5 minutes to complete.
Instruments 3 and 4: grammaticality judgment accuracy and latency
Morphosyntactic knowledge and processing ability were measured through a comprehensive and demanding auditory grammaticality judgment test. The test consisted of 80 sentences representing four morphosyntactic features of Swedish: (1) subject-verb inversion (V2); (2) reflexive possessive pronouns; (3) placement of sentence adverbs in relative clauses; and (4) gender and number agreement. Half of the sentences were grammatically incorrect, containing one grammatical error each. All sentences contained subordinate clauses, so as to increase syntactic processing demands. The sentences had been recorded in an anechoic chamber by a female native speaker and were presented in E-Prime through PC-350 headphones in different random orders for all participants. The participants indicated whether they perceived each sentence as grammatically correct or incorrect by pressing a green or a red button, respectively, at any point during or after the sentence presentation. Along with accuracy (Instrument 3), response latencies were also recorded (Instrument 4). The test took 15–20 minutes to complete.
Instrument 5: grammatical, lexical, and semantic inferencing
A global measure of L2 Swedish proficiency was obtained through a cloze test. The cloze test technique (Taylor, Reference Taylor1953) mobilizes a speaker's grammatical, lexical, contextual, and pragmatic knowledge in the perception and comprehension of spoken and written language. The present test was an untimed pen-and-paper task consisting of a 300-word text where every seventh word had been replaced by a blank. The task was to fill in each of the 42 blanks with a word that would fit into the context, structurally and semantically. Responses other than those in the original text were evaluated for lexical, grammatical, and semantic appropriateness with respect to their linguistic context; encyclopedic errors or spelling errors were not scored as errors. The test took 15–20 minutes to complete.
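The fixed-ratio deletion procedure behind the cloze test can be sketched as follows. The sketch is illustrative only: the function name and the placeholder text are hypothetical, and it assumes that deletion starts with the seventh word of the text, which the description above does not specify.

```python
def make_cloze(words, step=7, blank="_____"):
    """Replace every `step`-th word with a blank.

    Returns the gapped word list and the list of deleted words,
    which serve as the original-text answer key.
    """
    gapped, answers = [], []
    for position, word in enumerate(words, start=1):
        if position % step == 0:
            gapped.append(blank)
            answers.append(word)
        else:
            gapped.append(word)
    return gapped, answers


# Placeholder 300-word text (w1 ... w300) standing in for the real passage.
words = [f"w{i}" for i in range(1, 301)]
gapped, answers = make_cloze(words)
print(len(answers))  # 42
```

Under this assumption a 300-word text yields 42 blanks, which matches the number of items reported for the test.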
Instruments 6 and 7: formulaic language
Even though L2 learners (as well as L1 learners) rely on prefabricated linguistic chunks in early language development, the idiomatic use of formulaic language has been shown to be one of the greatest difficulties for (even very advanced) L2 speakers (e.g., Erman, Forsberg Lundell & Lewis, Reference Erman, Forsberg Lundell, Lewis, Hyltenstam, Bartning and Fant2018; Foster, Bolibaugh & Kotula, Reference Foster, Bolibaugh and Kotula2014; Granena & Long, Reference Granena and Long2013; Wray, Reference Wray2005). The present study included one test of idioms (Instrument 6) and one test of proverbs (Instrument 7). Both tests were created and run in E-Prime; they were identical in design and procedure, and included 50 items each, presented on a screen (one at a time and in the same order for all participants) with a blank that was to be filled in with a missing word or phrase. Participants were given 10 seconds to complete each item, and their oral responses were recorded and later analyzed. Responses that did not correspond to the standard formulaic expression or any established variant thereof were scored as erroneous (see Footnote 4). Each test took 7–8 minutes to complete.
Testing and data collection were performed individually with each participant in a sound-attenuated room by a male native speaker of Swedish. Normal hearing was confirmed with an OSCILLA SM910 screening audiometer, and the entire language testing session (including instructions and breaks) then lasted approximately 2.5 hours. Participants received a remuneration of SEK 500 (approximately €50).
Statistical analyses
In the current study, AoA is defined as a categorical variable, that is, acquisition from birth (i.e., L1) versus additional language acquisition (i.e., L2), commencing in this case between 3 and 8 years of age. The study is, in other words, not designed to assess AoA as a continuous variable, because the AoA range is too narrow and only covers early childhood (a period during which pronounced differences in AoA effects are typically not attested).
Performance on the grammaticality judgment test (accuracy), the cloze test, the idioms test, and the proverbs test was analyzed using logit mixed model regressions with response accuracy as the dependent variable. AoA (i.e., at birth vs. at 3–8 years of age) and bilingualism (monolingualism vs. bilingualism) were entered as categorical fixed effects, sum coded as −1 and 1. Subject and item were added as random effects, and bilingualism, AoA, and their interaction were added as random slopes, as justified by the maximal structure that converged.
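To make the sum coding concrete, the following sketch lists the predictor values for the four cells of the 2 × 2 design. Which level of each factor receives −1 and which receives 1 is an assumption made here for illustration, since the text does not state it; the interaction column is simply the product of the two coded main effects.

```python
from itertools import product

# Assumed assignment of signs (not specified in the text).
AOA = {"L1": -1, "L2": 1}
BILINGUALISM = {"monolingual": -1, "bilingual": 1}


def design_rows():
    """Return one row per design cell:
    (AoA label, bilingualism label, AoA code, bilingualism code, interaction code).
    """
    rows = []
    for (aoa, a), (bil, b) in product(AOA.items(), BILINGUALISM.items()):
        rows.append((aoa, bil, a, b, a * b))
    return rows


for row in design_rows():
    print(row)
```

With this coding, each column sums to zero across the four cells and the columns are mutually orthogonal in a balanced design. Each main-effect coefficient is therefore evaluated at the grand mean and corresponds to half the difference between the two levels of that factor (on the logit scale for the accuracy models).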
Linear mixed model regressions were conducted to analyze the performance on the VOT production test and the reaction times on the grammaticality judgment test. Again, AoA and bilingualism were entered as categorical fixed effects. Subject and item were added as random effects, and bilingualism, AoA, and their interaction were added as random slopes, as justified by the maximal model that converged. All mixed model regressions were carried out using the lme4 package (Bates, Mächler, Bolker & Walker, Reference Bates, Mächler, Bolker and Walker2014) in R (R Core Team, 2014).
Categorical perception of VOT was analyzed using probit analysis (Finney, Reference Finney1947), which generates estimates of the 50% crossover points of binary response curves using maximum likelihood estimation (see also Caramazza, Yeni-Komshian, Zurif & Carbone, Reference Caramazza, Yeni-Komshian, Zurif and Carbone1973; Hazan & Boulakia, Reference Hazan and Boulakia1993). The generated probit values (one per place of articulation for each participant) were entered as the dependent variable into linear models, with AoA and bilingualism as fixed effects (because there was only one data point, i.e., the probit value, per participant per place of articulation, random effects and random slopes were not computable).
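How a 50% crossover point can be obtained by maximum likelihood is sketched below for a single hypothetical listener. The sketch fits P(voiceless) = Φ(a + b · VOT) by Fisher scoring and returns −a/b; it is not the software used in the study, and the response counts (10 presentations per continuum step) are invented for illustration.

```python
from statistics import NormalDist

STD_NORMAL = NormalDist()


def probit_crossover(vot_ms, n_voiceless, n_trials, max_iter=100, tol=1e-10):
    """Fit P(voiceless) = Phi(a + b * VOT) by Fisher scoring and
    return the 50% crossover point, -a / b, in msec."""
    a, b = 0.0, 0.0
    for _ in range(max_iter):
        u0 = u1 = i00 = i01 = i11 = 0.0
        for x, k, n in zip(vot_ms, n_voiceless, n_trials):
            eta = a + b * x
            # Clamp fitted probabilities away from 0 and 1 for stability.
            p = min(max(STD_NORMAL.cdf(eta), 1e-12), 1 - 1e-12)
            dens = STD_NORMAL.pdf(eta)
            score = (k - n * p) * dens / (p * (1 - p))
            weight = n * dens * dens / (p * (1 - p))
            u0 += score          # score for the intercept
            u1 += score * x      # score for the slope
            i00 += weight        # Fisher information entries
            i01 += weight * x
            i11 += weight * x * x
        det = i00 * i11 - i01 * i01
        step_a = (i11 * u0 - i01 * u1) / det
        step_b = (i00 * u1 - i01 * u0) / det
        a, b = a + step_a, b + step_b
        if abs(step_a) < tol and abs(step_b) < tol:
            break
    return -a / b


# Continuum from -60 to +90 msec in 5-msec steps (31 steps).
vot_steps = list(range(-60, 95, 5))
# Invented "voiceless" counts out of 10 presentations per step.
counts = [0] * 13 + [1, 2, 3, 5, 7, 8, 9] + [10] * 11
trials = [10] * len(vot_steps)

boundary = probit_crossover(vot_steps, counts, trials)
print(f"Estimated category boundary: {boundary:.1f} msec")
```

One such boundary estimate per participant and place of articulation is the kind of value that would then enter the linear models as the dependent variable.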
Results
Production and perception of VOT
Starting with production, the elicited VOT values of /p/ showed no significant effects of either AoA or bilingualism, β = 0.035, SE = 0.029, t = 1.192, p = .237, and β = −0.016, SE = 0.029, t = −0.565, p = .574, respectively, and no interaction, p = .913. A similar result was obtained for /t/: AoA, β = 0.036, SE = 0.025, t = 1.431, p = .157; bilingualism, β = −0.006, SE = 0.025, t = −0.255, p = .799; and non-significant interaction, p = .739. Likewise, no significant main effects or interactions were found for /k/: AoA, β = 0.024, SE = 0.031, t = 0.794, p = .430; bilingualism, β = −0.020, SE = 0.031, t = −0.650, p = .518; and non-significant interaction, p = .915. All groups were thus found to produce stop intervals of similar proportions between stop release and vowel periodicity onset. These results are depicted in Figure 1a–c.
However, in terms of categorical perception, a significant main effect of AoA was found for bilabial stops, β = 4.801, SE = 1.200, p < .001, suggesting that speakers with AoA at birth were more likely to place the category boundary of /p/–/b/ towards the positive end of the voicing continuum than speakers with later AoA (see Figure 1d). This was further confirmed in planned pairwise comparisons revealing significant differences in the same direction between monolingual L1 and L2 speakers (p < .01) and between bilingual L1 and L2 speakers (p = .04). No statistically significant main effect was noted for bilingualism, β = 2.112, SE = 1.200, p = .082, nor was there any significant interaction between the predictor variables, β = 1.047, SE = 1.200, p = .385. For dental stops (Figure 1e), AoA was again found to exert a main effect on category boundary, β = 2.082, SE = 0.663, p = .002, with L2 speakers being prone towards negative /t/–/d/ boundaries. No main effect of bilingualism was detected, β = 0.742, SE = 0.663, p = .267, but a significant interaction between bilingualism and AoA was found, β = 1.809, SE = 0.663, p = .008. Bonferroni post hoc tests revealed that L1 monolinguals and L1 bilinguals differed from one another (p = .029), but not L2 monolinguals and L2 bilinguals (p = 1). Likewise, monolingual L1 speakers differed significantly from monolingual L2 speakers (p = .001), whereas bilingual L1 speakers did not differ from bilingual L2 speakers (p = 1). Lastly, for velar /k/–/ɡ/ crossover, there was no significant main effect for AoA, β = 0.106, SE = 0.823, p = .898, or for bilingualism, β = 1.436, SE = 0.823, p = .09, nor any interaction effect, β = 0.404, SE = 0.823, p = .625 (Figure 1f) (see Footnote 5).
Grammaticality judgment accuracy and latency
Participants’ performance on correctly judging the grammaticality of Swedish sentences revealed a significant main effect of AoA, β = −0.538, SE = 0.092, p < .001, but no main effect of bilingualism, β = 0.011, SE = 0.092, p = .906. However, there was also a marginally significant interaction between these two variables, β = −0.741, SE = 0.369, p = .055. As a follow-up, a series of post hoc tests (Bonferroni) was conducted. These showed no significant difference within the L1 groups (L1 monolinguals vs. L1 bilinguals, p = 1) nor within the L2 groups (L2 monolinguals vs. L2 bilinguals, p = .714). There were differences, though, within the monolingual groups, with L1 monolinguals attaining higher scores than L2 monolinguals (p < .001), as well as within the bilingual groups, with L1 bilinguals attaining higher scores than L2 bilinguals (p = .019). Lastly, the L1 monolinguals were found to outperform the L2 bilinguals (p < .001), and the L1 bilinguals the L2 monolinguals (p < .01). In other words, these results indicate a robust effect of AoA on grammatical intuition. Accuracy scores are presented in Figure 2a.
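The Bonferroni follow-up logic can be illustrated with a brief sketch. It assumes that the six pairwise comparisons among the four groups form one family and that each raw p value is multiplied by the number of comparisons and capped at 1, which is the standard form of the adjustment. The raw p values below are invented for illustration and are not taken from the present data.

```python
from itertools import combinations

GROUPS = ["L1 monolingual", "L1 bilingual", "L2 monolingual", "L2 bilingual"]


def bonferroni(raw_p_values):
    """Multiply each p value by the number of tests and cap the result at 1."""
    m = len(raw_p_values)
    return [min(1.0, p * m) for p in raw_p_values]


# All pairwise comparisons among four groups: six in total.
pairs = list(combinations(GROUPS, 2))

# Invented raw p values, one per comparison, for illustration only.
raw = [0.40, 0.00005, 0.0001, 0.001, 0.003, 0.12]

for (g1, g2), p_adj in zip(pairs, bonferroni(raw)):
    print(f"{g1} vs. {g2}: adjusted p = {p_adj:.3f}")
```

An adjusted value of exactly 1, as reported for some of the comparisons above, would be consistent with this kind of capping.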
In terms of latency (log-transformed), a main effect was again documented for AoA, β = 0.048, SE = 0.010, t = 4.66, p < .001, showing that L1 (monolingual and bilingual) speakers exhibited overall shorter reaction times than L2 (monolingual and bilingual) speakers (corroborated in pairwise comparisons, according to which L1 speakers were significantly faster than L2 speakers, monolingual L1 vs. monolingual L2, p < .001; bilingual L1 vs. bilingual L2, p = .014). No significant main effect of bilingualism (β = −0.009, SE = 0.010, t = −0.88, p = .377) or interaction (β = 0.011, SE = 0.010, t = 1.153, p = .253) was found. Latencies are presented in Figure 2b.
Grammatical, lexical, and semantic inferencing
The cloze test scores revealed a significant main effect of AoA, β = −0.678, SE = 0.099, p < .001, but no main effect of bilingualism, β = 0.068, SE = 0.099, p = .493. However, a significant interaction was also found, β = −0.251, SE = 0.099, p = .011. Bonferroni post hoc tests revealed no differences between the L1 groups (L1 monolinguals vs. L1 bilinguals, p = 1) or between the L2 groups (L2 monolinguals vs. L2 bilinguals, p = .148). Significant differences were, however, found within the monolingual groups, with L1 monolinguals attaining higher scores than L2 monolinguals (p < .001), and within the bilingual groups, with L1 bilinguals attaining higher scores than L2 bilinguals (p = .025). Finally, the L2 bilinguals were found to obtain lower scores than the L1 monolinguals (p < .001), and the L1 bilinguals higher scores than the L2 monolinguals (p < .001). These comparisons thus show a significant advantage of L1 speakers over L2 speakers (irrespective of the mono-/bilingualism in either group) on this test. Cloze test scores are depicted in Figure 2c.
Formulaic language
On the test assessing proficiency with idioms, a significant main effect of AoA was documented, β = −0.467, SE = 0.149, p = .002, showing that L1 speakers in general attained higher scores than L2 speakers (the effect was consistent for both monolingual L1 vs. monolingual L2, p = .028, and for bilingual L1 vs. bilingual L2, p = .026). However, a significant main effect was also found for bilingualism, β = 0.473, SE = 0.157, p = .003 (confirmed in comparisons between monolingual and bilingual L1 speakers, p = .013, and between monolingual and bilingual L2 speakers, p = .012), suggesting that bilinguals overall scored significantly lower than monolinguals on this test. There was no significant interaction between AoA and bilingualism, β = 0.065, SE = 0.149, p = .664. Scores on the idioms test are presented in Figure 3a.
The analysis of the performance on the proverbs test revealed a significant main effect of AoA, β = −0.467, SE = 0.149, p = .002, suggesting again that, on average, L1 speakers were more proficient with the proverbs under scrutiny than L2 speakers (consistent across both monolingual L1 and L2 speakers, p = .016, and bilingual L1 and L2 speakers, p = .025). A significant main effect was also found for bilingualism, β = 0.473, SE = 0.157, p = .003, with monolingual L1 speakers in general attaining higher scores than bilingual L1 speakers (p = .048); for monolingual and bilingual L2 speakers this difference was only obtained at trend level (p = .069). No significant interaction was attested, β = 0.065, SE = 0.149, p = .664. Proverb scores are presented in Figure 3b.
Discussion
The current findings raise a number of important points for discussion, concerning not only the existence of AoA effects in ultimate attainment, but also the interpretation of AoA effects and bilingualism effects in general.
The impact of age of acquisition on ultimate attainment
The results revealed that out of the seven instruments used to assess ultimate attainment, none showed a standalone effect of bilingualism. Rather, when main effects of bilingualism were attested (for formulaic language), they always occurred in conjunction with main effects of AoA. Conversely, effects of AoA were found for six out of the seven instruments (the only exception being VOT production, where no effects of either predictor variable were documented). In those instances where an interaction between AoA and bilingualism was found (VOT perception, grammaticality judgment accuracy, and cloze test performance), follow-up tests indeed confirmed consistent AoA effects and minimal bilingualism effects (if any at all). These results offer strong support for the AoA-effects hypothesis postulated above, and only limited support for the bilingualism-effects hypothesis. Moreover, the findings are consistent with previous research that has set out to examine the relative impact of AoA and bilingualism on L2 ultimate attainment using less comprehensive research designs than the current one (e.g., Bylund et al., Reference Bylund, Abrahamsson and Hyltenstam2012; Bylund et al., Reference Bylund, Abrahamsson, Hyltenstam and Norrman2019; Norrman & Bylund, Reference Norrman and Bylund2016; Veríssimo et al., Reference Veríssimo, Heyer, Jacob and Clahsen2018).
Is it possible, though, that the AoA effects attested here are in some way covert effects of bilingualism? The study has sought to disentangle these two variables in L2 speakers by including a group of international adoptees who reported having undergone complete L1 loss. However, there is by now ample evidence showing that international adoptees may unconsciously retain some sort of L1 knowledge, even after decades of non-exposure (e.g., Choi et al., Reference Choi, Broersma and Cutler2017a; Hyltenstam et al., Reference Hyltenstam, Bylund, Abrahamsson and Park2009; Park, Reference Park2014; Pierce et al., Reference Pierce, Klein, Chen, Delcenserie and Genesee2014). Such knowledge is often manifested as a heightened sensitivity and/or distinct neurophysiological activation patterns to L1-phonetic contrasts in particular. It could, in other words, be argued that such L1 residual knowledge may give rise to L2 non-nativelikeness. It does, however, seem far-fetched that such traces would produce similar levels of bilingualism effects as would a fully functional L1. In fact, such a claim would entail that there is no proportion of bilingualism effects relative to L1 activation and proficiency, but that the mere existence of some kind of L1 knowledge, be it as a latent phonetic sensitivity or a full-fledged language, exerts an absolute effect on L2 attainment. While we have no desire to rule out the possibility that the adoptees in our study might have retained some L1 knowledge (despite self-reports to the contrary), we consider the idea of absolute bilingualism effects to be neither probable nor in line with previously reported findings on L1-L2 proficiency interactions (e.g., Bylund et al., Reference Bylund, Abrahamsson and Hyltenstam2012; Yeni-Komshian, Flege & Liu, Reference Yeni-Komshian, Flege and Liu2000).
Thus, the current findings have far-reaching consequences for the ongoing debate on bilingualism effects in L2 ultimate attainment, which to date has been characterized by a shortage of empirical evidence. The robust effects of AoA attested in the current design are at odds with the interpretation that bilingualism, rather than age of acquisition, gives rise to near-native (as opposed to nativelike) ultimate attainment in early learners. As such, the findings speak against, first, the idea that bilingualism per se automatically results in non-nativelike/non-monolingual-like linguistic behavior (for a similar point, see Meisel, Reference Meisel2017), and second, the notion that L1 loss brings about nativelike/monolingual-like L2 attainment (e.g., Ventureyra et al., Reference Ventureyra, Pallier and Yoo2004). Seeing that the current study has assessed language proficiency with the same type of instruments as several previous CPH studies (e.g., Abrahamsson & Hyltenstam, Reference Abrahamsson and Hyltenstam2009; Bialystok & Miller, Reference Bialystok and Miller1999; Birdsong & Molis, Reference Birdsong and Molis2001; DeKeyser, Reference DeKeyser2000; DeKeyser et al., Reference DeKeyser, Alfi-Shabtay and Ravid2010; Granena & Long, Reference Granena and Long2013; Johnson & Newport, Reference Johnson and Newport1989) while controlling for bilingualism effects, one could argue that the findings on age of acquisition generated here are indeed not unique to a particular experimental paradigm, but may account for – and crucially, confirm – previously reported AoA effects.
Because our instruments are identical to those in one of these previous studies, Abrahamsson and Hyltenstam (Reference Abrahamsson and Hyltenstam2009), we are in a unique position to re-assess the incidence of L2 nativelikeness in that study. As mentioned earlier, out of 41 potentially nativelike learners, 3 (with AoA ≤ 8) were in the range of native speakers on all instruments used by Abrahamsson and Hyltenstam (Reference Abrahamsson and Hyltenstam2009). Since the current results showed main effects of bilingualism on the tests of formulaic language (idioms and proverbs), these instruments should be considered as tapping into bilingualism effects – in addition to effects of AoA – and could therefore be removed from the test battery in Abrahamsson and Hyltenstam's study. This removal results in the inclusion of two additional participants (AoA 1 and 4 years) who previously did not exhibit nativelike performance because of the formulaic language tests. This changes the number of learners who performed like native speakers on the relevant instruments from three to five, corresponding to a 5-percentage-point increase in the nativelikeness rate in the sample (from 7% to 12% of the learners). In other words, the removal of instruments sensitive (also) to bilingualism effects had but marginal effects on the original findings reported by Abrahamsson and Hyltenstam (Reference Abrahamsson and Hyltenstam2009). As a consequence, any suggestion that the low incidence of nativelike L2 speakers in that study was an artefact of bilingualism would find only scant support.
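The change in the nativelikeness rate reported above can be checked with a short calculation. The figures (3 and 5 nativelike learners out of 41) come from the text; the function name is merely illustrative.

```python
def rate(nativelike: int, total: int) -> float:
    """Return the share of nativelike learners as a percentage."""
    return 100.0 * nativelike / total


before = rate(3, 41)   # all instruments included
after = rate(5, 41)    # formulaic language tests removed

# Rounded rates and their difference in percentage points.
print(round(before), round(after), round(after - before))  # 7 12 5
```

The difference between the two rates is about 5 percentage points. In relative terms the number of nativelike learners rises by two thirds (from three to five), but the rate remains low in absolute terms, which is the point made in the paragraph above.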
Implications for interpreting the ultimate cause of AoA effects
What, then, are the consequences of the current findings for interpreting the mechanisms that underlie AoA effects in L2 ultimate attainment? De Leeuw (Reference de Leeuw, Thomas and Mennen2014, p. 35) suggests that should AoA effects be detected in a design that controls for bilingualism, such as the current one, then this can be taken as evidence for maturational constraints on L2 learning. We believe, however, that it is better not to overestimate this design when interpreting the ultimate cause of the attested effects: while the results allow us to reject the generic claim that L2 nativelike attainment is impossible due to bilingualism effects, they do not necessarily reveal the specific locus of AoA. That said, it should be emphasized that the stand-alone effects of AoA documented in the present study in no way rule out a maturational constraints-based explanation. In fact, they are consistent with such an explanation. The exact nature of maturationally induced AoA effects is, however, yet to be uncovered, concerning both the actual changes in the mechanisms of language acquisition and in the resulting learned linguistic representations, as well as the type of sensitive period (nested sensitive period with cascading effects, or independent multiple sensitive periods). Relatedly, it is necessary to ask whether the same mechanisms may really account for the whole range of behaviors studied here, or whether different explanations are needed for different linguistic behaviors (which is certainly not inconceivable, cf. Johnson, Reference Johnson2005).
Implications for research on bilingualism effects
While the current study is primarily concerned with the potential role of bilingualism for nativelike attainment in an L2, the findings have important implications for the understanding of bilingualism effects on verbal behavior in general. As mentioned in the background section, there has been a tendency in some research areas (e.g., the bilingual lexical deficit literature) not to systematically factor into the study design a distinction between simultaneous bilingualism and sequential bilingualism, but to instead lump together participants with different bilingual acquisition trajectories and test them in the societally dominant language. In such a design, a large part of the bilinguals may in fact be L2 speakers, who are then compared with monolingual L1 speakers. In view of the present findings, it is clear that while bilingualism may have a certain effect on some linguistic domains (e.g., lexis), AoA exerts more consistent effects across the board (including lexis; see Bylund et al., Reference Bylund, Abrahamsson, Hyltenstam and Norrman2019). Thus, a design that sets out to assess bilingualism effects on linguistic behavior but confounds bilingualism with L2 status runs the risk of inflating any differences between the bilinguals and the monolingual comparison group, ultimately compromising their observations on bilingualism effects. In conclusion, just as inattention to the bilingualism of L2 speakers may be problematic for assessing AoA effects, as suggested by proponents of the bi-/multilingual turn in SLA, we argue that inattention to AoA and to the fact that bilinguals may be L2 speakers is equally problematic for assessing bilingualism effects.
Conclusions
The aim of the present study was to address empirically the notion that bilingualism, rather than age of acquisition, underlies less than nativelike attainment in early (and, by inference, also exceptionally successful adult) L2 learners. In a factorial design where the variables of AoA at birth/after birth vs. monolingualism/bilingualism were fully crossed, the results from a comprehensive battery of previously used tests (see, e.g., Abrahamsson & Hyltenstam, Reference Abrahamsson and Hyltenstam2009) showed minimal effects of bilingualism, but major effects of AoA. These findings were discussed in relation to the ongoing debate on AoA vs. bilingualism effects on L2 ultimate attainment, and also in terms of their implications for interpreting AoA effects and bilingualism effects in general.
There is a risk that sweeping arguments about bilingualism underlying non-nativeness in early learners may in the end backfire, as they inflate the expectations of the explanatory potential of this variable. As shown by the current study, the effects of bilingualism on L2 attainment are more limited and selective in scope than previously thought. Ultimately, it is not rhetoric, but empirical assessments, along with conceptual analyses of the notions of mono- and bilingualism (see Bylund, Hyltenstam & Abrahamsson, Reference Bylund, Hyltenstam, Abrahamsson, Granena and Long2013), that will further our knowledge in this area. We choose to believe Birdsong (Reference Birdsong2018) when he asserts that “no researchers claim that bilingualism effects alone are responsible for all divergences from monolingual-likeness in bilingualism” (p. 6). At the same time, however, we sense that the alleged negative effects of using monolingual native speakers as baseline may have been exaggerated; according to the present data and previously reported findings (e.g., Bylund et al., Reference Bylund, Abrahamsson and Hyltenstam2012; Reference Bylund, Abrahamsson, Hyltenstam and Norrman2019; Hyltenstam et al., Reference Hyltenstam, Bylund, Abrahamsson and Park2009; Norrman & Bylund, Reference Norrman and Bylund2016), there may be no urgent need for making a general shift in SLA research to the use of simultaneous bilinguals as the gold standard for every linguistic domain. In order to bring further clarity into this issue, we encourage future studies not only to assess different types of monolingual and bilingual L1 and L2 speakers (including simultaneous bilinguals, sequential monolinguals, and others), but also to factor into their designs test instruments that allow for a systematic targeting of linguistic domains and structures that exhibit different degrees of likelihood to elicit bilingualism effects (e.g., based on typological analysis). This will be crucial for understanding the differential effects of age of acquisition and bilingualism, and any potential interaction between the two, on ultimate attainment.
Acknowledgements
This research was funded by Riksbankens Jubileumsfond (The Swedish Foundation for Humanities and Social Sciences), grant no. M2005-0459 (to K.H.) and grant no. SAB16-0051:1 (to N.A.). We are grateful to João Veríssimo and two anonymous reviewers for providing insightful comments on a previous version of the manuscript, to Gunnar Norrman for collecting and organizing the data, to Robyn Berghoff for statistical advice, and to Lamont Antieau for editing our English writing. Needless to say, any remaining errors are entirely our own.