Highlights
• We chart bilingual adaptations in Similar versus Distant Language Bilinguals finding strong evidence for a distance effect in bilingualism.
• We highlight the lack of ecologically accepted metrics of language distance cross-linguistically.
• We present a set of factors that may mitigate the effect of language distance.
1. Introduction
The consequences of bilingualism on (language) processing and neurocognition have been a topic of interest for decades. The most contemporary accounts maintain that bilinguals/multilinguals, under certain conditions of dual/multiple language engagement, can experience cognitive and neuroanatomical adaptations. Such adaptations are argued to stem from and calibrate to the degree of cognitive resource allocation needed to effectively manage an individual’s two or more linguistic systems (e.g., Bialystok, 2024; DeLuca et al., 2019; Titone & Tiv, 2023; see Lehtonen et al., 2023 for a recent overview). Although the neuroimaging literature provides somewhat clearer supportive evidence across a wide range of bilingual populations with distinct trajectories, ages and settings (Blanco-Elorrieta & Pylkkänen, 2017; Costa & Sebastián-Gallés, 2014; DeLuca et al., 2024; DeLuca & Voits, 2022; Pliatsikas & Luk, 2016) – with recent fMRI studies showing brain activation in response to unfamiliar languages that are similar to languages in which people have high-to-moderate proficiency, thus suggesting that the language network’s response magnitude scales with the degree of engagement of linguistic operations (Malik-Moraleda et al., 2024) – the behavioral literature presents some contradictory findings.
Such findings fall into three categories. First, some studies find evidence in favor of bilingual/multilingual adaptations (understood as effects that encompass both enhancements and compensations; Dentella et al., 2024), evidenced either by comparing monolingual and bilingual/multilingual aggregates or, increasingly so, by regressing composite scores related to degree of bilingual language experience and engagement as continuous variables within diverse groupings of bilinguals/multilinguals. This literature links (significant degrees of) bilingual experience to adaptations that could be framed as enhancements, mainly in executive function (EF) measures and the brain areas that subserve them (Bialystok, 2001, 2007; Costa et al., 2008; DeLuca et al., 2024; DeLuca & Voits, 2022; Perovic et al., 2023; Prior & MacWhinney, 2010), but also in other domains of language learning and processing (Kaushanskaya & Marian, 2009; Leivada et al., 2021; Marsh et al., 2019). In parallel to this first body of literature, the second category consists of work showing trade-off effects where bilingualism/multilingualism has consequences veering in the opposite direction: for example, negative effects on lexical access and semantic fluency have been noted (Baus et al., 2020; Gollan et al., 2002; Ivanova & Costa, 2008).
The third category finds evidence for null effects in EF adaptations: In some experiments, observed differences between monolinguals and bilinguals are statistically indistinguishable from zero. In other words, being bilingual is not a sufficient condition to systematically entail differences from monolinguals (Duñabeitia et al., 2014; Lehtonen et al., 2018; Paap et al., 2015; Paap & Greenberg, 2013). Such data have led to reasonable questioning: If bilingual effects are inconsistently shown, happening perhaps only under specific conditions, then one might ponder the generalizability of the claim that bi-/multilingualism has an impact on the mind/brain (Paap, 2023). Reconciling the evidence is non-trivial. Given the present state of the discipline, it should be uncontroversial to state that no one subscribes to the simplistic position where bilingualism per se is taken to be a sufficient condition for neurocognitive adaptations, or that any cognitive effects bilingualism entails are solely advantageous ones. Rather, the challenge is to determine the precise conditions under which bilingual/multilingual adaptations are to be predicted.
What mechanisms could be responsible for such adaptations? Three accounts have been proposed. The first argues for an inhibition mechanism that suppresses the activation of the language that the speaker does not use at a given moment (Green & Abutalebi, 2013). The second relies on the highest activation level of linguistic elements without assuming any need for suppression, employing a general monitoring system that allows production in the intended language (Blanco-Elorrieta & Caramazza, 2021). The third account relates bilingual/multilingual adaptations to the attentional control system; essentially to differences in the efficiency and deployment of attentional control (Bialystok, 2024; Bialystok & Craik, 2022). While these accounts offer distinct coverage of available data and make different or only partially overlapping predictions as to where bilingual effects on neurocognition stem from, they share an important assumption: a level of representation exists where the different language systems must be kept cognitively distinct in the mind of the speaker/signer. Thus, one open question that is relevant for all accounts concerns language distance and how keeping increasingly overlapping representations distinct may contribute to the magnitude of bilingual adaptations. Language distance (also referred to as language similarity or proximity in the literature) can be defined as the overlap between sets of linguistic features (be it lexical items, morphosyntactic features, phonemic inventories, etc.), where each set corresponds to a different language. Thus, a comprehensive measure of language distance amounts to a distance metric that charts the set overlap between lexical, morphological, syntactic, phonetic, phonological and orthographic features.
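As a rough illustration of the set-overlap definition above, the following Python sketch computes a Jaccard distance between two feature sets. The choice of Jaccard distance is an assumption made here for illustration, since no specific formula is given in the text, and the feature labels are hypothetical placeholders rather than real typological data.

```python
def jaccard_distance(features_a: set, features_b: set) -> float:
    """Return 1 - |A ∩ B| / |A ∪ B|.

    0.0 means the two feature sets are identical; 1.0 means they share nothing.
    """
    union = features_a | features_b
    if not union:
        return 0.0  # two empty sets are treated as identical
    return 1.0 - len(features_a & features_b) / len(union)


# Hypothetical toy feature sets (placeholders, not real linguistic data).
lang_a = {"feature_1", "feature_2", "feature_3", "feature_4"}
lang_b = {"feature_3", "feature_4", "feature_5"}

# Two shared features out of five distinct ones: distance = 1 - 2/5.
print(jaccard_distance(lang_a, lang_b))
```

A comprehensive measure in the sense described above would require such a comparison at each linguistic level; how the levels should be weighted against each other is left open in this sketch.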
While it has been noted that similarity across representations at the lexical, grammatical and phonological levels plays a significant role in modulating the degree of recruitment of cognitive control mechanisms during bilingual language processing (Costa et al., 2006; Rodriguez-Fornells et al., 2006), the details of this modulation are un(der)specified and largely unknown. Language distance has so far occupied a distinctively unclear role in the relevant literature (DeLuca, 2024; Lee, 2022), to the extent that targeted predictions are hard to form with respect to how it plays out in relation to bilingual adaptations.
On the one hand, it is possible that speakers of closely related languages require more resources for handling them than speakers of typologically distant languages, because it may be harder (i.e., more cognitively demanding) to suppress a subset of representations that are very similar, compared to typologically distant ones (Costa et al., 2006; Rothman, 2015). If monitoring lexical variants becomes more effortful depending on how similar competing alternatives are (Blanco-Elorrieta & Caramazza, 2021; Roelofs, 1992), speakers of closely related varieties would exert more effort; hence, they would show more pronounced bilingual adaptations, if the similarity of language representations indeed plays a role.
On the other hand, it is also possible that speakers of distant languages exert more cognitive effort when handling their languages, because they are constrained as to how much they can transfer from one language system to the other. Language transfer is accelerated when a strong similarity at the lexical and structural levels occurs, as argued in several models of multilingual language acquisition (e.g., González Alonso et al., 2020; Mitrofanova et al., 2023; Rothman, 2013, 2015; Rothman et al., 2019; Westergaard, 2021a, 2021b). Speakers of distant languages have a bigger pool of non-shared representations that they must keep monitoring. Given that the pool of competing alternatives grows substantially when there are fewer (or no) overlaps between the two languages, this more widespread competition may lead to more pronounced adaptations in speakers of distant languages.
One factor that makes the role of language similarity hard to spell out in a concise way is the absence of a universal metric for measuring degree of language (dis)similarity. While the labels “language similarity,” “language proximity” and “language distance” are often used interchangeably, they lack an ecologically valid, unambiguous definition (Mitrofanova et al., 2023). As Eden (2018, 23) puts it, “a ‘language’ is more or less similar to other languages – but what does that mean? Is it the percentage of shared cognates which is important (e.g., Lees, 1953, see also Otwinowska, 2015), or the phonemic inventory (e.g., Bardel & Lindqvist 2007; Bartelt, 1989)? Is it a matter of overlapping grammatical representations? All of these factors together?” Answering how language distance is to be conceptualized “is a fundamental question that has yet to be discussed seriously in the realm of bilingualism” (Lee, 2022, 3336).
Languages are often categorized as “similar/close” or “dissimilar/distant” with little to no reference to how close/distant they are and in what sense. Employing phylogenetic relationships to evaluate language distance may hinder detailed analysis of contemporary similarities and differences between languages. Hence, quantitative metrics have turned to analyzing the degree of overlap in specific linguistic features in order to measure language distance. Languages have been compared independently at the lexical (Brown et al., 2008; Downey et al., 2008; Gallo et al., 2023; Gooskens, 2007; Kepinska et al., 2023; Petroni & Serva, 2010), morphosyntactic (Jeong et al., 2007a, 2007b) and orthographic (Dong et al., 2021; Kim et al., 2016) levels.
Other studies have adopted combined distance measures that aggregate features from different levels of linguistic analysis, tapping into lexical and morphosyntactic similarities (Floccia et al., 2018), phonological, morphological and lexical similarities (Schepens et al., 2020) or genealogical classification with measures of morphosyntactic similarities (Laketa et al., 2021; Studenica et al., 2022). These metrics provide more fine-grained details of the similarities and differences between languages than phylogenetic relations alone, which produce dichotomous comparisons and are often unable to quantify the distance between languages. However, as they define distance in non-overlapping and highly constrained ways (e.g., often measuring a small number of features from one level of linguistic analysis), their conclusions need further validation. Although the lack of an overall metric of language distance that enjoys consensus limits our understanding of language distance effects in bilingual cognition, in recent years, there has been an upsurge in studies that adopt a comparison of different bilingual populations with an explicit aim to understand distance.
In this context, this systematic review and quantitative analysis aims to shed light on the role of language distance in neurocognitive (including language processing) adaptations to bilingualism. We combine Bayesian analyses with the PRISMA protocol (Page et al., 2021) to identify and analyze studies that investigate the role of language distance by comparing different groups of bilingual/multilingual populations. The analysis is guided by three Research Questions: (RQ1) In behavioral studies that compare bilingual groups of typologically different languages, is there evidence for a modulatory role of language distance on bilingual neurocognition? (RQ2) Are adaptations to bilingualism more pronounced in Similar Language Bilinguals (SLB), Distant Language Bilinguals (DLB), or is the evidence mixed? (RQ3) Do neural correlates align with the behavioral evidence?
The predictions for the behavioral and the neural outcomes are the following:
Hypothesis 1. If similar representations are putatively more effortful to manage, bilingual adaptations will be more pronounced in SLB. Consequently, such adaptations boil down to the effort devoted to managing the competition between a small pool of very similar alternatives.
Hypothesis 2. If distant representations are putatively more effortful to manage, the prediction is that bilingual adaptations will be more pronounced in DLB. In this case, bilingual adaptations stem from managing a big pool of non-similar alternatives.
Hypothesis 3. If language switching due to speaker-external, communicative constraints requires a cognitive effort (Blanco-Elorrieta & Pylkkänen, 2018), bilingual adaptations will be more pronounced in DLB than SLB. The reason is that SLB speakers tend to mix more and switch less, because closely related languages often give rise to hybrid varieties (Auer, 1999; Grohmann et al., 2021; Leivada et al., 2017): The more similar the two languages are, the more likely people will understand both – even if they do not actively use both to the same degrees – which will thus reduce the need for language-switching; a need that arises from the desire to not jeopardize effective communication (Costa et al., 2006).
Hypothesis 4. Alternatively, under the assumption that the brain does not “see” switching (i.e., if it is insensitive to switching and switch effects disappear in executive control regions during comprehension of natural conversation, Blanco-Elorrieta & Pylkkänen, 2017; Phillips & Pylkkänen, 2021), the different patterns of language use behind switching versus mixing across typologically distant or similar languages will not lead to pronounced differences between DLB and SLB. Table 1 summarizes the predictions.
Table 1. Summary of hypotheses and predictions

Note: The different domains refer to the origin of the relevant literature from which each hypothesis stems.
Of course, these predictions are predicated on the assumption that all other contributory variables behind bilingual adaptations (e.g., degree of bilingual engagement, sociolinguistic variables related to language use), save for the one in focus (in this case, language distance), are either controlled or equally distributed in the samples that comprise each set: SLB and DLB. As this systematic review addresses studies that have already been carried out, possible bias and variation due to inadequately controlled confounding variables can be neither confirmed nor disconfirmed for each of the studies in our dataset. Systematic reviews and meta-analyses by default entail some degree of heterogeneity of the compared studies; therefore, differences between individual studies are likely lost when aggregating the data for analysis, effectively diluting the role of possible confounds in the overall sample.
2. Methods
We conducted a systematic review and a Bayesian quantitative analysis of the literature on language similarity and bilingual cognition. The review was conducted according to the PRISMA Statement (Page et al., 2021), which is a reporting guideline designed to assist authors of systematic reviews and meta-analyses in describing the purpose and the methodology of their work in a transparent way. The PRISMA checklist is provided in the Appendix. Data were plotted and analyzed using R (R Core Team, 2021) and jamovi, version 2.2.5 (The jamovi project, 2024).
A systematic search of the literature was conducted in the following databases: Scopus and PubMed. The searches were conducted in May 2024. The following search terms were used: “proximity” OR “distance” OR “similarity” AND “bilingual*.” A filter was used to obtain only studies in English published after 2000. The number of records retrieved was 574, out of which 128 duplicates were removed. The remaining reports were screened and 384 were excluded for various reasons (i.e., relevance, not featuring bilingual/multilingual populations that use different languages, not presenting novel results; see Figure 1). The final database consists of 47 studies, 30 of which were obtained through the PRISMA protocol, and the remaining through following up individual references in all articles that were assessed for eligibility. First, one researcher (EL) independently searched the databases, selected the relevant studies and extracted the data, following the aforementioned predefined criteria. In cases of doubt, two other researchers (LKI and CM) independently evaluated the study in question for inclusion. In all cases, consensus was eventually reached among all coding authors. The dataset that was created and analyzed for this review is available at https://osf.io/fqx9m/.

Figure 1. PRISMA flow chart.
For the Bayesian analyses, we coded the results in the following way, based on previous studies on bilingual adaptations that performed similar quantitative analyses (Dentella et al., 2024; Grundy, 2020; Yurtsever et al., 2023): If a study found evidence for a distance effect, it was coded as 1. If a study did not find evidence for a distance effect, it was coded as −1. If a study produced mixed or spurious results that provided some evidence but did not clearly indicate a reliable distance effect, it was coded as 0.
In Bayesian analyses, the Bayes factor (BF) quantifies how much more likely the analyzed data are under one hypothesis than under the other; BF10 expresses the likelihood of the data under the alternative hypothesis relative to the null hypothesis (Wetzels & Wagenmakers, 2012). A BF10 of 10–30 is typically considered strong support for the alternative hypothesis, while a BF10 > 100 can be interpreted as extremely strong evidence in favor of the alternative. In contrast to the frequentist p value, the BF also allows researchers to quantify the evidence in favor of the null hypothesis, by determining that the results are X times more likely under the null than under the alternative hypothesis.
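The thresholds above can be turned into a small lookup, sketched below in Python. Only the 10–30 ("strong") and > 100 ("extreme") bands are stated in the text; the remaining cut-offs (1–3 anecdotal, 3–10 moderate, 30–100 very strong) are assumptions that follow a commonly used classification scheme, and the function name is hypothetical.

```python
def describe_bf10(bf10: float) -> tuple:
    """Map a BF10 value to (evidence label, favored hypothesis).

    Values below 1 are inverted (BF01 = 1 / BF10), so the same bands
    describe evidence for the null hypothesis.
    """
    if bf10 <= 0:
        raise ValueError("a Bayes factor must be positive")
    if bf10 == 1:
        return ("none", "neither")

    favored = "alternative" if bf10 > 1 else "null"
    strength = bf10 if bf10 > 1 else 1.0 / bf10

    # Cut-offs other than 10-30 and > 100 are assumed, not taken from the text.
    if strength < 3:
        label = "anecdotal"
    elif strength < 10:
        label = "moderate"
    elif strength < 30:
        label = "strong"
    elif strength <= 100:
        label = "very strong"
    else:
        label = "extreme"
    return (label, favored)


print(describe_bf10(12.0))   # a value inside the 10-30 band
print(describe_bf10(0.25))   # BF01 = 4, i.e. evidence for the null
```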
Two analyses were run. First, a Bayesian one-sample t-test targeted the presence or absence of a distance effect (Analysis 1). Second, a Bayesian binomial proportion test focused on those studies that claim to find robust evidence for a distance effect (i.e., the studies coded as 1) and analyzed the direction of the effect (Analysis 2). In this case, the results were recoded as SLB > DLB or DLB > SLB, depending on whether the observed bilingual effects were more pronounced in the SLB or DLB group (see Table 1 for concrete predictions about the SLB versus DLB differences). We used the original labeling of the tested populations as SLB or DLB, as found in the analyzed studies. Table 2 represents the input used for the two analyses.
Table 2. Input for Bayesian analyses

Note: More details about the reviewed studies (e.g., number of participants, tested language pairs, domain of testing, main finding, type of metric used to determine distance) are given in the expanded version of this table available at https://osf.io/fqx9m/.
Notably, in the entire dataset, only 10 studies employ a language distance metric that empirically justifies their attribution of labels such as “similar” or “distant” to different populations. It is worth highlighting that only two of these studies found mixed or spurious results; others either found a language distance effect (n = 6) or no effect at all (n = 2). The remaining 37 studies, which did not measure language distance, obtained a variety of effects: evidence for a language distance effect (n = 25), no evidence for a language distance effect (n = 6) and mixed or spurious results (n = 6). Table 3 classifies the analyzed studies by the effect they found and the presence (or lack thereof) of a language distance metric.
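The tallies reported above can be cross-checked with a few lines of Python. The counts are taken directly from the paragraph; the dictionary layout and variable names are illustrative choices.

```python
# Study counts as reported in the text, split by whether a distance metric was used.
counts = {
    "with_metric": {"effect": 6, "no_effect": 2, "mixed": 2},
    "without_metric": {"effect": 25, "no_effect": 6, "mixed": 6},
}

# Totals per outcome across both groups, and the overall number of studies.
totals = {
    outcome: sum(row[outcome] for row in counts.values())
    for outcome in ("effect", "no_effect", "mixed")
}
n_studies = sum(totals.values())

# Share of each outcome in percent, rounded to two decimals.
shares = {outcome: round(100 * k / n_studies, 2) for outcome, k in totals.items()}

print(totals)
print(n_studies)
print(shares)
```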
Table 3. Language distance effects and the presence or absence of a language distance metric

3. Results
Our findings show that 65.96% of the studies in our dataset found evidence for a modulatory effect of language distance on bilingual adaptations, 17.02% reported no effect and 17.02% obtained mixed or unclear evidence, which did not allow them to reliably support a language distance effect. Figure 2 summarizes these findings, while Figure 3 presents the studies that report both behavioral and neuroimaging data. As Figure 3 shows, almost all brain studies find an effect of distance (6/7 studies) and, with one exception, it is always in the SLB > DLB direction (5/6 studies). Given the small size of this subset of studies, which does not permit a separate analysis, as well as the fact that they all have a behavioral component, we include these studies in the analyses presented below.

Figure 2. Distribution of language distance effects.

Figure 3. Results of studies including both the neural and the cognitive levels of analysis.
3.1. Language distance effect (Analysis 1)
In the first analysis, we explored whether there was a modulatory effect of language distance in our pool of data (RQ1). The Bayesian one-sample t-test revealed extreme evidence that the data are more likely under the alternative hypothesis (BF10 = 278.008; see Footnote 1), which supports the presence of a language distance effect. The robustness of our results is shown in Figure 4, in the panels “Bayes Factor Robustness Check,” which displays a stable BF across various priors, and “Sequential Analysis,” which illustrates stronger evidence favoring the alternative hypothesis as each new study is incorporated into the analysis.

Figure 4. Strong evidence for the modulatory effect of language distance. In the “Effect” panel, the circle indicates the mean across studies reporting a language distance effect (1) or not (−1), while the error bar represents standard error. In the “Prior and Posterior” panel, the prior shows the initial probability before data introduction, while the posterior shows the updated probability after incorporating the data. The “Bayes Factor Robustness Check” panel displays how the BF varies with different priors. The “Sequential Analysis” panel illustrates the BF progression as each study is added.
Given the importance of weighting results according to sample size in order to factor in all contributions fairly (Grundy, 2020), we reran Analysis 1 using a sample size correction. Specifically, each study was assigned a proportional weight by dividing its sample size by the total number of participants included in our database. The final weighted score was obtained by multiplying the weight of each study by the effect score assigned to it (i.e., −1: no LD effect; 1: LD effect; 0: spurious results). It is worth noting that one study in our database (Schepens et al., 2020) accounted for almost 90% of the total sample size (weighted value = 0.898418). Importantly, its sample consisted of second-language learners of Dutch with 62 different L1s, resulting in an extremely heterogeneous population not only in terms of language pairs but also in terms of factors related to speakers’ sociolinguistic backgrounds. These features make this study an outlier both in sample size (n = 48219) and participant profile. Thus, we decided to run two separate weighted analyses: one including Schepens et al. (2020) and one excluding it. A Bayesian one-sample t-test run on weighted studies showed moderate evidence in favor of the null hypothesis (i.e., no modulation of LD), with BF10 = 0.264. However, the same analysis without the outlier showed moderate evidence for the alternative hypothesis (BF10 = 3.777), in line with the results of our original analysis on non-weighted data, although with a clear decrease in the strength of the evidence (i.e., going from extreme support in favor of the alternative hypothesis to moderate support). For the sake of completeness, we ran an additional Bayesian one-sample t-test on non-weighted data, mirroring the original analysis but excluding the outlier.
Once again, the results confirmed what we found in the first analysis of this section and revealed extreme evidence for the alternative hypothesis (BF10 = 166.825).
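The sample-size correction described above can be sketched as follows. The three studies and their sample sizes are hypothetical values chosen only to show the arithmetic; they are not taken from the dataset, and the function name is illustrative.

```python
def weighted_scores(studies):
    """Compute sample-size-weighted effect scores.

    studies: list of (sample_size, effect_code) pairs, with codes in {-1, 0, 1}.
    Each study's weight is its sample size divided by the total number of
    participants; the weighted score is that weight times the effect code.
    """
    total = sum(n for n, _ in studies)
    if total <= 0:
        raise ValueError("total sample size must be positive")
    return [(n / total) * code for n, code in studies]


# Hypothetical input: two small studies and one very large one.
toy_studies = [(50, 1), (150, -1), (800, 1)]
print(weighted_scores(toy_studies))
```

In this toy input the largest study carries 80% of the total weight, which mirrors on a small scale why a single very large study can dominate a weighted analysis.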
3.2. Direction of language distance effect (Analysis 2)
This second analysis focused on the studies which reported an effect of language distance on bilingual neurocognition (n = 31). We aimed to determine whether this effect was more pronounced in bilinguals with similar versus distant language pairs (RQ2). While the SLB > DLB category is numerically more plentiful, the results of a Bayesian binomial proportion test show that there is no evidence to strongly support either hypothesis over the other (BF10 = 0.793). This means that we cannot conclude that the language distance effect is more pronounced in SLB than in DLB or vice versa (Figure 5); the results are mixed.
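For readers who want to see how a Bayes factor for a binomial proportion can be obtained, a minimal Python sketch follows. It assumes a test value of 0.5 and a uniform Beta(1, 1) prior on the proportion; neither setting is stated in the text, so both are assumptions. The 20-versus-11 split used as input is likewise a hypothetical illustration, since only the total of 31 studies is reported here.

```python
from math import exp, lgamma, log


def log_beta(a: float, b: float) -> float:
    """Natural log of the Beta function B(a, b)."""
    return lgamma(a) + lgamma(b) - lgamma(a + b)


def binomial_bf10(k: int, n: int, theta0: float = 0.5,
                  a: float = 1.0, b: float = 1.0) -> float:
    """Bayes factor for H1: theta ~ Beta(a, b) against H0: theta = theta0.

    k successes out of n trials. The binomial coefficient appears in both
    marginal likelihoods and cancels, so it is omitted.
    """
    if not 0 <= k <= n:
        raise ValueError("k must lie between 0 and n")
    log_m1 = log_beta(k + a, n - k + b) - log_beta(a, b)  # marginal under H1
    log_m0 = k * log(theta0) + (n - k) * log(1 - theta0)  # likelihood under H0
    return exp(log_m1 - log_m0)


# Hypothetical split of the 31 studies (assumed for illustration only).
print(binomial_bf10(k=20, n=31))
```

Under these assumed inputs the function returns a BF10 of roughly 0.79, which falls in the anecdotal range.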

Figure 5. Anecdotal evidence for the direction of the modulatory effect of language distance, after recoding the studies finding an effect of LD as SLB > DLB or DLB > SLB. In the “Prior and Posterior” panel, it is shown that there is a 95% probability that the population proportion lies between 0.2 and 0.5. The “Sequential Analysis” panel illustrates the BF progression as each study is added, showing in this case weak evidence for the null hypothesis.
To complement our Bayesian analysis with a frequentist one, we ran two further tests. For Analysis 1, a one-sample t-test confirms our finding that there is strong evidence that an effect of language distance exists (p = 0.00008254, Cohen’s d = 0.63). For Analysis 2, a chi-square test suggests that there is no evidence that one direction of the effect (SLB > DLB versus DLB > SLB) is stronger than the other (p = 0.106).
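The effect size reported for Analysis 1 can be approximated from the coded outcomes alone. The sketch below rebuilds the vector of codes from the counts given earlier (31 studies coded 1, 8 coded −1 and 8 coded 0) and computes Cohen’s d and the t statistic with the Python standard library; the p value is omitted because it would require a t distribution from an external package. Function and variable names are illustrative.

```python
from math import sqrt
from statistics import mean, stdev


def one_sample_summary(codes, mu0=0.0):
    """Return (mean, Cohen's d, t statistic) for a one-sample test against mu0."""
    m = mean(codes)
    s = stdev(codes)                 # sample standard deviation (n - 1)
    d = (m - mu0) / s                # Cohen's d for a one-sample design
    t = d * sqrt(len(codes))         # equivalent to (m - mu0) / (s / sqrt(n))
    return m, d, t


# Codes rebuilt from the reported counts: effect, no effect, mixed.
codes = [1] * 31 + [-1] * 8 + [0] * 8

m, d, t = one_sample_summary(codes)
print(round(m, 3), round(d, 2), round(t, 2))
```

With these counts, d rounds to 0.63, in line with the value reported above.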
4. Discussion
This systematic review and quantitative analysis addresses our three RQs:
RQ1 asked whether there is evidence for a modulatory role of language distance on bilingual neurocognition, defined in the broad sense, including outcomes from different cognitive domains, and including language processing. The answer is positive, confirming early claims about the role of distance (Costa et al., 2006; Rodriguez-Fornells et al., 2006). To some extent, this finding contradicts that of the big meta-analysis of Lehtonen et al. (2018), which found no evidence that language distance predicts bilingual adaptations in executive control. However, one critical difference exists between the two studies: Lehtonen et al. (2018) focused on the executive component of cognition, whereas – although the executive component was strongly featured in our dataset – it was not our sole focus, as we did not exclude studies on word learning, speech recognition, attention, memory, metalinguistic awareness or other cognitive domains. Overall, our results agree with Werker and Byers-Heinlein (2008) and Borragan et al. (2021), who underscore the influence of the specific L1-L2 pairs in bilingual language processing from the early stages of linguistic development.
RQ2 relates to the directionality of the effect: If language distance indeed plays a role, are the observed effects that distinguish monolinguals from bilinguals more pronounced in SLB, DLB, or is the evidence mixed? Our analysis suggests that the evidence is largely mixed (in agreement with several studies in our dataset; see Oschwald et al., 2018 and Laketa et al., 2021). Overall, there are numerically more studies in our dataset that report stronger bilingual adaptations in SLB, but our analyses suggest that this amounts to anecdotal evidence; hence, from a statistical point of view, we cannot confidently reject the null hypothesis, which is that there are no differences in the occurrence of bilingual adaptations in DLB versus SLB.
Addressing the lack of discernible directionality, we neither preclude nor delve into the reasonable possibility that both positions (i.e., DLB > SLB and SLB > DLB) could have ecological validity simultaneously. In other words, it is possible that both extreme similarity and sufficient language distance can tax relevant underlying mechanisms – the same, partially overlapping or distinct ones – resulting in similar or indistinguishable (behavioral) performance effects. If so, neuroimaging might prove especially useful to tease out underlying differences depending on the degree of overlap of the implicated mechanisms in each case. If this turns out to be a tenable possibility, we would still expect thresholding to apply along a spectrum of language distance whereby the extreme ends of distance, despite showing similar performance outcomes, would differ from the middle (i.e., in-between language pairings). Moreover, we do not preclude that any discernible role of relative language distance might have distinct effects depending on its interaction with other variables and/or at distinct stages of the processes of (becoming) bi-/multilingual; for example, languages in (less) close proximity might confer opposite effects when applied to language learning stages as compared to maintenance stages (i.e., language use after sufficient proficiency has been attained and learning has stopped), which would be in line with some neurocognitive theories of brain adaptations that distinguish between language learning and maintenance periods after learning is complete (e.g., Pliatsikas, 2020). For ease of exposition in the present nascent discussion, we leave these considerations for future hypothesizing and purposefully designed empiricism.
RQ3 concerns neural correlates and whether they align with the behavioral evidence. Again, since the evidence is mixed, and given that our sample of neuroimaging studies is too small to allow for separate analyses, we cannot unambiguously interpret the direction of the effect in the context of how switching versus mixing differently engages the brain. Our explanatory power is thus extremely limited, and we refrain from inferring neural effects from the behavioral data; hence, we can only speculate in our discussion of RQ3. If the origin of bilingual adaptations were neural training due to involuntary switching, we would expect to find more robust evidence for H3, which predicts that adaptations would be more pronounced in DLB versus SLB (i.e., the opposite of the pattern we see in Figure 3). The reason has to do with language intelligibility. If two varieties are closely related, the need to keep them cognitively distinct may be relaxed, because mixing or voluntary switching would not interfere (much) with effective communication. This would explain why speakers of closely related varieties often mix instead of switch. Mixing entails incorporating elements from different varieties into one code, possibly giving rise to fused lects (Auer, Reference Auer1999; Grohmann et al., Reference Grohmann, Kambanaros, Leivada and Pavlou2021; Leivada et al., Reference Leivada, Papadopoulou and Pavlou2017). Overall, our results in relation to RQ3 are best interpreted in the context of Blanco-Elorrieta and Pylkkänen’s (Reference Blanco-Elorrieta and Pylkkänen2017) findings: voluntary switching does not engage the prefrontal cortex or elicit behavioral switch costs (see also Gollan & Ferreira, Reference Gollan and Ferreira2009). If switch effects do not emerge during comprehension of natural conversation – the real-life situation bilinguals face – this may explain why we fail to find pronounced differences between DLB and SLB.
While this is not the only explanation (a topic to which we return below), it would readily capture the fact that almost all the studies in our dataset featuring very closely related varieties which may allow some degree of mutual intelligibility (e.g., Swiss German and Standard German versus Swiss German and Turkish in Oschwald et al., Reference Oschwald, Schättin, von Bastian and Souza2018; Spanish and Catalan versus Spanish and Basque in Borragan et al., Reference Borragan, de Bruin, Havas, de Diego-Balaguer, Vulchanova, Vulchanov and Duñabeitia2021) either offer support for SLB > DLB or provide unclear/mixed evidence. A notable exception is Von Grebmer Zu Wolfsthurn et al. (Reference Von Grebmer Zu Wolfsthurn, Gupta, Pablos and Schiller2023), whose results support DLB > SLB: their typologically similar group (Italian–Spanish) showed lower inhibitory control performance compared to the typologically dissimilar group (Dutch–Spanish).
Based on our findings, the hypotheses and predictions offered in Table 1 can be updated as in Table 4.
Table 4. Evaluation of hypotheses and predictions

Returning to the complex issue of why the results are mixed to the degree of making the evidence for H1 versus H2 and H3 versus H4 weak, we consider five different, non-mutually exclusive explanations that could contribute to our findings. First, language distance does not work alone; its interaction with other factors such as proficiency and degree/intensity of dual/multiple language engagement may mitigate its influence. For instance, if it turns out that people in the SLB studies engage less with bilingual experience (e.g., because they live in an environment where everybody understands both varieties), then the effect of language distance could be blurred, even if relevant. This could be why several studies in our dataset (e.g., Gallo et al., Reference Gallo, Myachykov, Nelyubina, Shtyrov, Kubiak, Terekhina and Abutalebi2023; Persici et al., Reference Persici, Vihman, Burro and Majorano2019) find that language distance effects may tail off as L2 proficiency and length of experience increase. In other words, the variation we observe could be due to the confluence of different developmental stages. If we mix DLBs who may have greater cognitive involvement at the learning stage, followed by decreasing effort at the maintenance stage, with SLBs who presumably may show the opposite pattern, there could be a washing-out effect. This means that we need more precise predictions about the role of language distance at various stages along the emerging and sustained nature of (becoming) bi-/multilingual (DeLuca, Reference DeLuca2024), predictions that are commensurable with how neuroplasticity works in general (e.g., in the Dynamic Restructuring Model; Pliatsikas, Reference Pliatsikas2020).
Furthermore, some studies in our database (Laketa et al., Reference Laketa, Studenica, Chrysochoou, Blakey and Vivas2021; Lu et al., Reference Lu, Liu, Zhang, Zhang, Song, Wang and Wang2023) underscore the potential role of cultural differences on both the subjective perception of language distance and bilingual language practices. These explanations highlight the need to transition from approaching lab-bilingualism as a categorical yes/no variable to measuring the interaction of different variables, such as language distance, proficiency, degree of switching and sociolinguistic norms, simulating real-life-bilingualism and accurately factoring in the complexity of different bilingual experiences (Blanco-Elorrieta & Pylkkänen, Reference Blanco-Elorrieta and Pylkkänen2017; Kubota & Rothman, Reference Kubota and Rothman2024; Leivada et al., Reference Leivada, Rodríguez-Ordóñez, Parafita Couto and Perpiñán2023; Luk, Reference Luk2023; Masullo et al., Reference Masullo, Dentella and Leivada2023; Navarro et al., Reference Navarro, DeLuca and Rossi2022; Rothman et al., Reference Rothman, González Alonso and Puig-Mayenco2019, Reference Rothman, Bayram, DeLuca, Di Pisa, Dunabeitia, Gharibi and Wulff2023; Titone & Tiv, Reference Titone and Tiv2023).
The second consideration has to do with the terminology employed and the absence of an overall metric. The terms “language similarity,” “language proximity” and “language distance” lack an ecologically valid, unambiguous definition (Mitrofanova et al., Reference Mitrofanova, Leivada and Westergaard2023). Different studies attribute to them different meanings, often taking one level of linguistic analysis in isolation – instead of overall L1-L2 similarity – as the basis for claiming that one language pair is more similar than another. Yet, this practice is not unproblematic. We cannot reliably claim that bilingual adaptations are more or less pronounced in similar versus distant languages if we cannot reliably measure what counts as (dis)similar. Two languages may show a great overlap of phonological features, while having little similarity in morphosyntax (e.g., Basque and Spanish) or without clustering together in the phylogenetic tree (e.g., Basque and Greek). Due to this inability to define what counts as similar in a global way, existing studies tapping into the relation between language similarity and cognitive adaptations resort to extreme cases for comparison that involve language isolates or languages from different language families.
This approach suffers from several challenges that may contribute to the mixed results we observe. First, it ignores that quite often the usual candidates for extreme distance (e.g., Basque, Chinese) entail testing speakers who use more than one regional variety; hence, they are at least trilingual. The possible interaction of this added variable with language distance is currently understudied and largely unknown (DeLuca, Reference DeLuca2024). Second, it often takes language family as a proxy for language similarity: languages from the same family are taken for granted to be more similar than languages from different families. Indeed, most studies in our dataset use language family as a proxy without measuring distance in any further way (but see, among others, Floccia et al., Reference Floccia, Delle Luche, Lepadatu, Chow, Ratnage and Plunkett2020; Kepinska et al., Reference Kepinska, Caballero, Oliver, Marks, Haft, Zekelman, Kovelman, Uchikoshi and Hoeft2023; Malik-Moraleda et al., Reference Malik-Moraleda, Jouravlev, Taliaferro, Mineroff, Cucu, Mahowald, Blank and Fedorenko2024; Schepens et al., Reference Schepens, Van Hout and Jaeger2020 for the use of different metrics). However, the practice of using language family as a proxy may lead to unwarranted assumptions of L1-L2 similarity (Eden, Reference Eden2018). For example, the phonemic inventory analysis in Eden (Reference Eden2018) suggests that Spanish and Greek are very similar, considerably more so than Spanish and Portuguese, which are both Ibero-Romance varieties. In sum, the direction of the effect of language distance in bilingualism is hard to predict because distance itself is described in broad strokes more often than not.
Third, comparing “similar” languages from the same language family with “similar” languages assessed by a language distance measure might entail inaccurate comparisons, re-introducing the issue of what “similar” refers to. Spurious results might arise due to the lack of a comprehensive definition of language distance. Establishing different dimensions of language distance relevant to bilingual cognition and how to measure them can lead to the operationalization of a proximity metric to thoroughly explore the relationship between specific bilingual adaptations and language distance.
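To illustrate why a single level of analysis can misrepresent overall similarity, the following minimal Python sketch computes a per-level overlap profile and a naive global index. The two languages, the placeholder feature labels, the choice of Jaccard overlap and the unweighted mean are all assumptions made for illustration; they do not correspond to real language data or to the metric of any study in our dataset.

```python
from statistics import mean


def jaccard(a: set, b: set) -> float:
    """Jaccard overlap of two feature sets (1.0 = identical, 0.0 = disjoint)."""
    if not a and not b:
        return 1.0  # two empty inventories are treated as identical
    return len(a & b) / len(a | b)


def similarity_profile(lang_a: dict, lang_b: dict) -> dict:
    """Overlap per level of analysis, for the levels coded in both languages."""
    shared_levels = sorted(lang_a.keys() & lang_b.keys())
    return {lvl: jaccard(lang_a[lvl], lang_b[lvl]) for lvl in shared_levels}


# Hypothetical placeholder features (p = phonological, m = morphosyntactic,
# w = lexical); these are NOT real inventories of any language.
lang_a = {
    "phonology": {"p1", "p2", "p3", "p4"},
    "morphosyntax": {"m1", "m2", "m3", "m4"},
    "lexicon": {"w1", "w2", "w3", "w4"},
}
lang_b = {
    "phonology": {"p1", "p2", "p3", "p5"},
    "morphosyntax": {"m1", "m5", "m6", "m7"},
    "lexicon": {"w1", "w2", "w5", "w6"},
}

profile = similarity_profile(lang_a, lang_b)
overall = mean(profile.values())  # naive, unweighted global index

for level, score in profile.items():
    print(f"{level}: {score:.2f}")
print(f"overall: {overall:.2f}")
```

With these placeholder inventories, the phonological overlap (0.60) is far higher than the morphosyntactic one (about 0.14), so a phonology-only comparison would classify the pair as similar, whereas the global index (about 0.36) would not. How the individual levels should be weighted for bilingual cognition remains an open question.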
A fourth explanation for the mixed results could depend on psychotypology, that is, subjective perceptions of language distance. Psychotypology may affect transfer during L2/L3 acquisition (Kellerman, Reference Kellerman1979; Singleton, Reference Singleton1987; Xia, Reference Xia2017, but see Rothman et al., Reference Rothman, González Alonso and Puig-Mayenco2019) and might even override an objective measure of language distance when considering the likelihood of transfer (Odlin, Reference Odlin1989). It is possible for bilinguals to perceive similarities between unrelated languages (Ringbom, Reference Ringbom2006), and such similarities can accelerate transfer. Phylogenetically “distant” languages can present lexical similarities (e.g., cognates) due to language contact and borrowing. The source of such similarities might be unknown to bilinguals with lower metalinguistic awareness, who can perceive “distant” languages as “close,” and therefore transfer knowledge from L1 to L2 (or vice versa). We still do not know what L1-L2 similarities and differences matter when it comes to bilingualism, whether objective (language distance), subjective (psychotypology) or both. Exploring both metrics in parallel can provide insights into the relationship between L1-L2 similarities and bilingualism across the lifespan, focusing on stable (language distance) and unstable (psychotypology) metrics, as psychotypology can change over time, depending on a bilingual’s proficiency and metalinguistic awareness. Considering subjective metrics alongside objective measures of language distance could be particularly useful in contexts involving minority languages or sociolinguistic dynamics influenced by language prestige (Calamai et al., Reference Calamai, Piccardi and Nodari2022).
A last point to consider in relation to the present mixed results relates to the null-result bias. It is well established that almost all fields of research suffer from a bias that makes null results harder to publish (Hubbard & Armstrong, Reference Hubbard and Armstrong1997). It has been argued that bilingualism research is not any different (de Bruin et al., Reference de Bruin, Treccani and Della Sala2015), although recent estimates have shown prior claims regarding the magnitude of a publication bias in bilingualism were exaggerated due to selective representation of the relevant studies (Leivada, Reference Leivada2023). In any case, while we find that this explanation is less likely to hold good explanatory power compared to the others, we cannot exclude the possibility that researchers submit for publication only or predominantly those significant results in either direction: SLB > DLB or DLB > SLB. Therefore, many null results that could help disambiguate the role of language distance across different language pairs remain in the drawer. If so, the studies we analyze correspond to the visible tip of the iceberg, and further research is needed to fully determine the role of language distance in bilingualism.
5. Outlook
The findings of the present study reveal a modulatory effect of language distance on bilingual neurocognition. This means that the typological distance between language pairs could be one of the factors that underlie (the degree of) bilingual adaptations, explaining the different sets of results found in research on bilingual effects on cognition. Regarding the direction of the effect and the presence of more robust adaptations in either similar or distant language pairs, our results show mixed findings. Possible explanations for these diversified results concern the lack of a standardized and global language distance index, and the interaction of distance with other variables such as proficiency in the second language, language practices and the potential role of psychotypology. Future studies should start to define language distance, both in terms of terminology and through using global distance measures, as a step towards a better understanding of the effects of bilingualism on the mind/brain. If employing comprehensive measures of language distance becomes common practice, further comparisons across languages and studies might lead to more reliable generalizations about language distance and its relationship with different aspects of bilingual cognition.
Future research on language distance should recognize and investigate the dynamic nature of this moderator, seeking to uncover its interaction with other factors. In this respect, one factor of interest is timing. It is possible that more distant language pairs exert maximum effects at initial stages of language learning because parsing with less cross-linguistic bootstrapping conveys a differential cognitive challenge. Conversely, closer language pairs might take on this same role after language learning is over and bilingualism needs to be maintained because inhibiting intrusions of more closely related languages taxes underlying cognitive control more. Another factor to consider is the degree of switching, as involuntary switching increases the exerted cognitive effort (Blanco-Elorrieta & Pylkkänen, Reference Blanco-Elorrieta and Pylkkänen2018). Yet another factor worthy of serious consideration is the age of onset. The exerted control may become less of an intensive mental gym as linguistic monitoring progressively transitions from a heavily controlled process to a far more automated one at later stages of L2 learning (Paap, Reference Paap, De Houwer and Ortega2018). Measuring the interaction of these and other potentially co-occurring factors behind the bilingual experience, in combination with developing precise metrics of language distance, will likely help us to understand how similarity in mental, linguistic representations affects the mind and brain.
Data availability statement
The datasets generated and analyzed in the current study, as well as the R code used for the analysis, are available at the OSF project “The unpredictable role of language distance in bilingual cognition: A systematic review from brain to behavior” (https://osf.io/fqx9m/).
Acknowledgements
We thank Esti Blanco-Elorrieta for feedback on an early version of this work.
Funding statement
EL acknowledges funding by the Spanish Ministry of Science and Innovation (MCIN/AEI/10.13039/501100011033) under the research project no. PID2021-124399NA-I00. MW and JR acknowledge funding by the Trond Mohn Foundation, grant no. TMS2023UiT01.
Competing interests
The authors declare none.

