Introduction
The number of individuals who speak more than one language has been increasing worldwide (Ansaldo et al., 2015). Being bilingual has inherent effects on various aspects of an individual's life. Some studies have shown that bilingualism affects cognitive abilities in positive ways (Bialystok et al., 2004). Over time, however, the literature has shown considerable disagreement about how bilingualism affects cognitive functioning. In this study, we used an approach that is novel to the field of bilingualism, a latent variable approach, to test for the cognitive consequences of bilingual language experience. Confirmatory Factor Analysis (CFA) stands out as a valuable method for separating genuine Executive Function (EF) effects from non-executive demands. We explain in more detail below why this approach is appropriate to address the question of a possible relationship between bilingualism and executive functioning. This study uses the CFA approach to explore the structure of EF in bilinguals and monolinguals, as well as to examine the potential consequences of bilingualism on EF among young adults. By using this lens, we can gain a better understanding of the complex relationship between bilingualism and EFs. A cognitive consequence of being bilingual would result from experiences using two languages. In processing their two languages, bilinguals have to selectively attend to the target language and context and reduce interference from the non-target language. 
Bilinguals gain a lot of practice using executive functioning to reduce the interference of the non-target language, thereby potentially leading to an advantage over monolinguals (Antón et al., 2016; Bialystok, 2011; Bialystok et al., 2005; Pelham & Abrams, 2014; Poulin-Dubois et al., 2011). Executive functions (EFs), also known as executive control or cognitive control, refer to a set of top-down cognitive abilities that are required to control individuals' thoughts and actions and that underlie goal-directed behavior (Diamond, 2013; Miyake et al., 2000). They are therefore essential to the coordination and regulation of cognition, emotion, and behavior and are critical to each individual's functioning (Strauss et al., 2006). EFs play an important role in mental and physical health and in social and psychological development (Diamond, 2013).
Cognitive consequences of bilingualism
The literature offers a variety of perspectives on bilingualism's relationship with EF. Many studies have found that bilinguals performed better on tasks tapping EF (Bialystok et al., 2008; Brito et al., 2016; Grundy & Timmer, 2017; Hernández et al., 2010; Morales et al., 2013). However, other studies have not found these cognitive consequences of bilingual language experience (Antón et al., 2016; Blumenfeld & Marian, 2014; Desjardins et al., 2020; Filippi et al., 2020; Hilchey & Klein, 2011; Kousaie et al., 2014; Lee Salvatierra & Rosselli, 2011; Massa et al., 2020; Mor et al., 2014; Morrison & Taler, 2020; Morton & Harper, 2007; Papageorgiou et al., 2019). 
Further, a series of large-scale studies (e.g., Dick et al., 2019; Nichols et al., 2020) and exhaustive meta-analyses (e.g., Anderson et al., 2020; Donnelly et al., 2019; Gunnerud et al., 2020; Lehtonen et al., 2018; Lowe et al., 2021) have found minimal or null effects of bilingualism on cognitive functioning. Moreover, a few studies have shown a bilingual disadvantage (e.g., Folke et al., 2016; Gonzalez, 2017; Kousaie & Phillips, 2012; Samuel et al., 2018). Paap et al. (2018), for instance, conducted a meta-analysis of 99 studies reporting reaction times (RTs) for 177 language-group comparisons. Of the 174 comparisons that reported a statistical test of the interference effects between monolinguals and bilinguals, only 26 (14.7% of all 177 comparisons) showed that bilinguals outperformed their monolingual counterparts; interestingly, all of those comparisons came from studies with small sample sizes. A further 144 (81.4%) of the comparisons yielded null results and 4 (2.3%) showed a bilingual disadvantage. Additionally, van den Noort et al. (2019) reviewed the results of 46 studies comparing bilingual and monolingual children and adults on tasks tapping executive functioning. 
They found that 54.3% of the selected studies supported a positive effect of bilingualism on EFs, 28.3% showed mixed results, and 17.4% showed a bilingual disadvantage. They reported that the positive consequences of bilingualism on cognitive abilities were more evident in the earlier studies in the period between 2004 and 2012, while studies showing null results and bilingual disadvantage were conducted more recently from 2013 until late 2018. One key explanation is the improved methodology, including the use of larger samples and different experimental tasks, which has been used more in recent studies than earlier ones.
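As a quick arithmetic check on the Paap et al. (2018) figures quoted above, the following minimal Python sketch recomputes the proportions. It assumes, as the reported values suggest, that the percentages are computed relative to all 177 comparisons rather than the 174 that reported a statistical test; the variable and function names are illustrative only.

```python
# Counts quoted in the text for Paap et al. (2018).
TOTAL_COMPARISONS = 177  # all language-group comparisons
counts = {
    "bilingual advantage": 26,
    "null": 144,
    "bilingual disadvantage": 4,
}

def percent(count, total):
    """Return count/total as a percentage rounded to one decimal place."""
    return round(100 * count / total, 1)

# Number of comparisons that reported a statistical test.
tested = sum(counts.values())

# Share of each outcome, assuming the denominator is all 177 comparisons.
shares = {outcome: percent(n, TOTAL_COMPARISONS) for outcome, n in counts.items()}

print(tested)
print(shares)
```

Under this assumption the three counts sum to 174 and the recomputed shares match the reported 14.7%, 81.4%, and 2.3%.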
Challenges in EF measurement
A significant part of the variance in results between studies is due to the inherent challenges of researching EF. In particular, EF conceptualization has been hindered by two major issues: the structure of EF and the composition of the tasks that are used to measure EF (Baggetta & Alexander, 2016; Barkley, 2012; Morse, 2022; Stålnacke et al., 2019). First and foremost, the question remains as to whether EF is a unitary construct or a heterogeneous set of dissociable processes (Garon et al., 2008; Jurado & Rosselli, 2007). The most common method of addressing this issue has been to create comprehensive neuropsychological test batteries and use principal components analysis (PCA) or exploratory factor analysis (EFA) to determine whether manifest variables can be reduced to a smaller number of underlying factors. In the studies using these approaches, the factorial solutions differed in terms of the number, composition, and interpretation of the extracted factors, thus limiting the conclusions that could be drawn regarding the nature of EF. It is possible that these inconsistencies are caused by the use of different test batteries and by differences in the age range of participants (van der Sluis et al., 2007).
EF is a term that was introduced formally in the 1970s, but discussions have been ongoing since the 1840s (Goldstein et al., 2014). In 1973, Pribram introduced the term “executive” in the context of prefrontal cortex functions (Pribram, 1973). Early definitions (e.g., Lezak, 1983; Welsh & Pennington, 1988), and nearly all definitions that followed (Baggetta & Alexander, 2016; Barkley, 2012; Jurado & Rosselli, 2007), defined EF as a multidimensional construct. With the growth of the conceptual understanding of EF, researchers began to delineate its functional attributes and underlying mechanisms in depth. Broadbent's (1958) filter model theorized a buffer for conscious awareness, selecting relevant information while filtering out the irrelevant. Posner and Snyder (2004) expanded on this with a cognitive control model, while Baddeley et al. (1996) described the executive as a unified system guiding multiple functions. Baddeley and Hitch (1994) proposed that a ‘central executive’ was responsible for managing lower-level cognitive processes in the context of working memory, whereas others applied the concept to a system in which attention is controlled consciously (i.e., the Supervisory Attentional System [SAS]; Norman & Shallice, 1986). According to the SAS model, attention is processed in two main ways: automatic processes, which are unconscious and respond to familiar stimuli, and controlled processes, which require conscious effort for unique situations. 
Barkley (2012) framed EF as self-regulation, underlying components like working memory and emotion management. Over the years, research has robustly examined correlations between tests of executive functions using a factor analytic approach (Royall et al., 2002). Miyake et al. (2000) put forth one of the most prominent models of EF, suggesting interconnected yet separate components: inhibition (the process of managing attention purposely), switching (shifting between different concepts concurrently), and working memory (keeping and processing information in mind for a short time). In their study, they administered nine EF tasks, three for each component, and used a combination of confirmatory and exploratory factor analysis to test the individual components and to assess how those components are connected and load onto a common factor. They used a Stroop task, an Antisaccade task, and a Stop-signal task to measure inhibition; a Keep-track task, a Letter memory task, and a Tone-monitoring task to assess working memory; and a Plus–minus task, a Number–letter task, and a Local–global task to measure shifting ability. The results confirmed the three-factor model. This model allows an understanding of EFs at a behavioral level (rather than a neural level).
A second major difficulty in studying EF is the task impurity problem: tasks designed to measure EF often involve more than one type of executive processing (Hughes & Graham, 2002), and they may also involve a variety of non-executive processes (e.g., perceptual processing) that contribute to an individual's performance (Miyake et al., 2000). Because latent variable approaches parse task variance into latent (shared) and residual (task-specific) variance, these types of studies are well suited to dealing with the task impurity problem, which may have contributed to the inconsistencies observed across studies.
Confirmatory factor analysis
In order to address the measurement issues associated with EF, confirmatory factor analysis (CFA) is a useful methodological approach. A CFA analyzes correlations between unmeasured latent variables, each of which is indicated by two or more observed manifest variables. CFA evaluates the fit of a theory-driven factor model to the data, as opposed to PCA or EFA, which are data-driven methods (Wiebe et al., 2007). In this manner, one can compare competing theory-driven factor models on the basis of how well they fit the data. As opposed to PCA, where the new component variables are functions of the manifest variables, EFA and CFA treat manifest variables as functions of latent factors. Therefore, both EFA and CFA are capable of establishing reliable associations between latent factors and manifest variables by identifying and isolating unique sources of variance in the manifest variables (Bryant & Yarnold, 1995). Contrary to PCA and EFA, where each manifest variable loads on every latent factor in the analysis, CFA allows for the specification of the loadings for each latent factor in order to better satisfy a priori hypotheses. As a result, CFA provides a powerful approach for evaluating different hypotheses regarding the structure of EF. Furthermore, CFA addresses the task impurity issue by extracting only the common variance shared by the different EF tasks that are intended to measure the same latent factor, which results in a purer measure of the EF construct (Miyake et al., 2000).
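The measurement logic described above, in which each task score is a function of a shared latent factor plus task-specific variance, can be illustrated with a small simulation. The sketch below is not CFA estimation and is not the analysis used in this study; the loadings (0.7 and 0.6), the sample size, and all function names are hypothetical choices made for illustration. With standardized scores, two tasks that share a single factor are expected to correlate at roughly the product of their loadings.

```python
import math
import random

def simulate_tasks(n, loadings, seed=0):
    """Simulate standardized task scores driven by one latent factor.

    Each score is loading * factor + residual, where the residual stands in
    for task-specific (non-shared) variance and is scaled so that every
    task has unit variance.
    """
    rng = random.Random(seed)
    factor = []
    tasks = [[] for _ in loadings]
    for _ in range(n):
        f = rng.gauss(0, 1)
        factor.append(f)
        for scores, lam in zip(tasks, loadings):
            residual_sd = math.sqrt(1 - lam ** 2)
            scores.append(lam * f + residual_sd * rng.gauss(0, 1))
    return factor, tasks

def pearson(x, y):
    """Pearson correlation between two equal-length sequences."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

# Hypothetical loadings for two tasks meant to tap the same construct.
factor, (task_a, task_b) = simulate_tasks(20000, [0.7, 0.6], seed=1)

r_tasks = pearson(task_a, task_b)        # expected near 0.7 * 0.6 = 0.42
r_factor_a = pearson(factor, task_a)     # expected near the loading, 0.7

print(round(r_tasks, 2), round(r_factor_a, 2))
```

The correlation between the two observed tasks is noticeably lower than either task's correlation with the factor, which is the attenuation that task-specific variance introduces and that a latent variable model is meant to separate out.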
Recently, CFA has been increasingly employed in executive function measurement. While Miyake et al. (2000) initially suggested a three-factor model, subsequent CFAs have generated varied models across the lifespan, ranging from one-dimensional structures to nested-factor models (i.e., bifactor without inhibition). According to a recent systematic review (Karr et al., 2018), for adults, there were roughly three types of models: a two-factor model (33.33%; Klauer et al., 2010; McVay & Kane, 2012; Was, 2008); a three-factor model (22.22%; Klauer et al., 2010; Miyake et al., 2000); and a nested factor model (22.22%; Fleming et al., 2016; Ito et al., 2015). One study favored a four-factor model (11.11%; Chuderski et al., 2012) and another showed a five-factor model (11.11%; Fournier-Vicente et al., 2008). 
Among older adults, the majority of studies supported a two-factor model (62.5%; Bettcher et al., 2016; de Frias et al., 2009; Frazier et al., 2015; Hedden & Yoon, 2006; Hull et al., 2008), whereas a smaller but substantial percentage supported a three-factor model (37.5%; Adrover-Roig et al., 2012; de Frias et al., 2009; Vaughan & Giovanello, 2010). For example, Klauer et al. (2010) found that inhibition and updating could not be separated, whereas a few studies (Hull et al., 2008; van der Sluis et al., 2007) have failed to find an inhibition factor, primarily due to the inability to determine a latent variable of the inhibition tasks in these studies.
Cognitive consequences of bilingualism across the lifespan
Several studies have examined the cognitive consequences of bilingual language experiences across different age groups. Particularly noteworthy are findings in children and older adults, in whom bilingual individuals appear to exhibit enhanced cognitive skills compared to their monolingual counterparts. Age is a significant factor in cognitive development and might provide a plausible explanation for varying outcomes across studies.
Although recent studies have found mixed results, many have found a connection between bilingualism and EFs among children (e.g., Barac & Bialystok, 2012; Barac et al., 2016; Park et al., 2018; Yang & Yang, 2016; Yurtsever et al., 2023). In young adults, particularly when EF is at its peak, the cognitive consequences of bilingual language experience are often not apparent, as demonstrated in Bialystok's foundational study (Bialystok et al., 2008). Skill-learning theory posits a framework where repetitive practice turns processes into routine, driven more by automatic processes and less by the general executive system (Taatgen, 2013). This distinction between automaticity and executive control is fluid, with even well-practiced tasks potentially requiring conscious oversight or the engagement of EFs like inhibition, especially when confronted with unexpected challenges. As per this theory, when a task is new, it heavily engages broad, effortful top-down control processes. However, as familiarity with the task grows, there is a reduced dependence on these general executive resources, as task-specific skills take over (Lehtonen et al., 2023). In contrast, older adults present another aspect of bilingualism. As cognitive decline becomes more evident with age (e.g., Deng et al., 2023), studies of older bilingual adults have reported results in favor of a delayed onset of dementia and other age-related cognitive impairments (e.g., Kousaie et al., 2014). 
Thus, future research should examine how the cognitive consequences of bilingual language experience change over time, considering a wide age range, in order to identify where bilingualism has more prominent effects on EF.
Although all the constructs in Miyake's model of EF have been extensively studied in the literature on bilingualism, to our knowledge the current study is the first to use confirmatory factor analysis to test the EF model among bilinguals. Given this gap, our research aims to explore the largely uncharted structure of EF in bilinguals. The present study therefore has two goals: first, to assess the EF structure among adults using different EF tasks, and second, to compare the best-fitting EF model between monolinguals and bilinguals.
Methods
Participants
Participants of this study were 320 students at the University of Alberta. All participants (214 females, 106 males, Mage = 19.52, SD = 2.57, range = 18–38 years) were recruited from the Psychology Research Participation Pool. They were all undergraduate students and received one course credit for their participation and completion of this study. This study was conducted in a part of Canada where English is the majority language. To participate, individuals were required to be either English-speaking monolinguals or bilinguals who speak a first language other than English and English as their second language. They classified themselves into monolingual or bilingual groups by answering the following questions: “Do you consider yourself monolingual or bilingual?” and “What is your second language?”. The monolingual group consisted of 162 participants (105 female, Mage = 19.81, SD = 3.21) and the bilingual group consisted of 158 participants (106 female, Mage = 20.58, SD = 3.77). Bilinguals rated their second language (English) proficiency on a Likert scale from 0 (beginner) to 7 (typical native speaker), a measure adopted from Paap and Greenberg (2013). Self-ratings of language proficiency have been used widely in bilingualism studies, and different studies have shown that self-reported ratings are highly associated with standardized measures of language proficiency (e.g., Marian et al., 2007; Sörman et al., 2019).
Executive Control tasks
Six computerized tasks were chosen for the present study. All of the tasks have good validity and reliability, and previous studies using these tasks showed a bilingual advantage. For each task, the stimuli were presented at the center of the screen in 36-point font. More detail about the stimuli is provided in each task description. The PsyToolkit and Qualtrics websites were used to manage the experiments. Both are freely available online programs in which researchers can build surveys and create modified versions of cognitive tests. The timing reliability of PsyToolkit has been examined in previous work. A recent study showed that PsyToolkit is valid and reliable for administering both general and psycholinguistic experiments using response choice and response time (Kim et al., 2019). The authors found that PsyToolkit's timing is on par with that of a widely used lab-based software package, E-Prime. PsyToolkit uses standard JavaScript technology, which is the same in all browser-based RT measurements. The Qualtrics website was used to conduct the digit span tasks to measure verbal working memory.
Inhibitory control
We used a modified version of the classic Stroop color-naming task (Stroop, 1935) to measure inhibition. This computerized task has two types of trials, congruent and incongruent, with four colors: red, yellow, green, and blue. In congruent trials, a color word is written in the same color (e.g., the word “RED” printed in red), whereas in incongruent trials, a color word is written in another color (e.g., the word “RED” printed in yellow). The words were presented in a 36-point Chicago font and the letters were lower case. There were 8 practice trials, during which the participants received feedback on their performance, followed by 48 experimental trials with no feedback. Four keys on the keyboard were labeled and assigned to the colors (i.e., “r” for red, “y” for yellow, “g” for green and “b” for blue), and participants were asked to respond based on the font color. Each trial started with a centered white fixation cross displayed on a black background for 1000 ms, and the stimulus stayed on the screen until the participant responded or for 5000 ms. The dependent variable that we considered for this study was the RT for the incongruent condition.
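To make the dependent variable concrete, the following Python sketch classifies Stroop trials as congruent or incongruent and averages RT over the incongruent trials. It is an illustration under stated assumptions, not the scoring script used in the study: the trial record layout is hypothetical, and the text does not say whether incorrect trials were excluded, so that choice is exposed as an explicit option.

```python
from statistics import mean

def stroop_condition(word, ink_color):
    """A trial is congruent when the color word matches the ink color."""
    return "congruent" if word.lower() == ink_color.lower() else "incongruent"

def mean_incongruent_rt(trials, correct_only=True):
    """Mean RT (ms) over incongruent trials.

    correct_only is an assumption: excluding error trials is a common
    convention, but the source text does not state which rule was used.
    """
    rts = [
        t["rt_ms"]
        for t in trials
        if stroop_condition(t["word"], t["ink"]) == "incongruent"
        and (t["correct"] or not correct_only)
    ]
    return mean(rts) if rts else None

# Hypothetical trial records for one participant.
trials = [
    {"word": "red", "ink": "red", "rt_ms": 600, "correct": True},
    {"word": "red", "ink": "yellow", "rt_ms": 800, "correct": True},
    {"word": "blue", "ink": "green", "rt_ms": 900, "correct": False},
    {"word": "green", "ink": "blue", "rt_ms": 700, "correct": True},
]

rt_correct_only = mean_incongruent_rt(trials)
rt_all_trials = mean_incongruent_rt(trials, correct_only=False)
print(rt_correct_only, rt_all_trials)
```

The same function applies to the Flanker task if the condition label is derived from whether the flanking letters match the target.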
We also used a computerized version of the Flanker task (Eriksen & Eriksen, 1974). In the Flanker test, a row of five horizontally arranged stimuli (letters) is displayed at the center of the screen. The test used in the present study has only two conditions: congruent and incongruent. This task involves 32 trials of each condition, and the number of correct responses and reaction time were recorded. Participants were asked to press the assigned key on the left (A) or right (L) of the keyboard based on the target stimulus, which is flanked by other stimuli. For instance, the target letters X and V may require left and right responses, respectively. In congruent trials, the flanking stimuli match the target stimulus (e.g., XX X XX), whereas in incongruent trials they do not (e.g., VV X VV). Each trial was preceded by a fixation cross displayed for 300 ms in the middle of the screen. The dependent variable considered was the RT for the incongruent condition.
Shifting
For measuring shifting ability, the Letter-Number task was used (Rogers & Monsell, 1995). In this task, participants saw a combination of one digit (even or odd) and one letter (vowel or consonant). The targets of this task included five consonants (f, k, s, n, p), five vowels (a, e, i, o, u), five odd digits (1, 3, 5, 7, 9) and five even digits (2, 4, 6, 8, 0), which were printed in 36-point Chicago font. Each letter was randomly combined with a digit. Each trial started with a centered fixation cross shown on the screen for 1000 ms, then the task cue (the word LETTER or NUMBER) was shown for 200 ms. After that, a letter-number combination was shown on the screen for a maximum of 5000 ms. The test started with two single blocks consisting of 80 trials. In the number task, participants were asked to respond by pressing O on the keyboard when the digit was odd and P when it was even. In the letter task, O was for consonants and P was for vowels. In the mixed block of 40 trials, half were switch trials, in which the current task differed from that of the previous trial (e.g., NUMBER-LETTER), and half were non-switch trials, in which the current task was the same as that of the previous trial (e.g., NUMBER-NUMBER). There were 4 practice trials before each single block and 8 practice trials before the mixed block. The variable of interest in this task was RT for the switch trials.
In addition, a modified version of the Color-Shape Switching task was used to measure switching (Prior & MacWhinney, 2010). This task consists of three block types: pure color blocks, pure shape blocks, and mixed blocks. In each trial, a target appeared on the screen that was either a circle or a triangle and either blue or yellow. For the color trials, participants were asked to respond with two fingers (index and middle) of one hand by pressing the assigned buttons on the keyboard. They were asked to press N if the target was blue and B if the target was yellow. Likewise, in the shape blocks, they were asked to press the designated keys with the index and middle fingers of the other hand to indicate whether the target was a triangle or a circle. In the single blocks, all trials required either a shape or a color decision, whereas in the mixed block the required decision could change from trial to trial. Each trial began with a fixation cross for 350 ms. Following that, the screen turned blank for 150 ms and then the block cue was shown on the screen for 250 ms; next, the target appeared and remained on the screen until the participant responded or for a maximum of 5000 ms. Participants started with a single block of 8 practice and 16 experimental trials, followed by 38 trials in the mixed block. The screen turned blank for 250 ms between blocks. Each trial in a mixed block was either a repeat (same dimension as the previous trial) or a switch (different dimension from the previous trial); that is, 50% of trials were switch trials (shape-color or color-shape) and 50% were non-switch trials (shape-shape or color-color). The variable of interest was the RT for switch trials.
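The switch/repeat distinction used in both shifting tasks depends only on whether the current trial's cue differs from the previous trial's cue. A minimal Python sketch of that labeling, and of the resulting mean switch RT, follows. The data layout and function names are hypothetical, and leaving the first trial of a block unlabeled (it has no predecessor) is an assumption rather than something the text specifies.

```python
from statistics import mean

def label_transitions(cues):
    """Label each mixed-block trial as 'switch' or 'repeat'.

    The first trial has no preceding trial, so it is labeled None
    (an assumption; the source does not say how it was treated).
    """
    labels = []
    previous = None
    for cue in cues:
        if previous is None:
            labels.append(None)
        elif cue == previous:
            labels.append("repeat")
        else:
            labels.append("switch")
        previous = cue
    return labels

def mean_switch_rt(cues, rts_ms):
    """Mean RT (ms) over trials labeled 'switch'."""
    labels = label_transitions(cues)
    switch_rts = [rt for label, rt in zip(labels, rts_ms) if label == "switch"]
    return mean(switch_rts) if switch_rts else None

# Hypothetical cue sequence and RTs from a short mixed block.
cues = ["color", "color", "shape", "shape", "color"]
rts_ms = [640, 610, 820, 650, 790]

labels = label_transitions(cues)
switch_rt = mean_switch_rt(cues, rts_ms)
print(labels, switch_rt)
```

For the Letter-Number task the same functions apply with the cues NUMBER and LETTER in place of color and shape.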
Working memory
The Digit Span task was used to measure verbal working memory (i.e., linguistic working memory). The task was administered through the Qualtrics website: a researcher who was a native speaker read a list of digits aloud, and participants then recalled the digits and typed out the sequence. One practice trial was included to make sure that participants were ready to start the test and that their sound and speakers worked well. In the forward version, the first phase started with two digits, and the list of digits increased in length with each phase until it reached nine digits by the end of the test. In the backward version, the first phase included two digits and the list reached six digits. The same sequences of digits were presented to all participants. There was no feedback for this task, and participants had to complete both the forward and backward versions of the test, from two digits to nine or six digits, respectively. Participants gained one point for each trial if they recalled the sequence of digits completely correctly; they scored 0 if they recalled at least one digit of the list incorrectly. The highest possible score was 8 for the forward digit span and 5 for the backward version. The variable of interest in this task was accuracy.
The Corsi block task was also used to assess visual working memory (i.e., non-linguistic working memory). A sequence of blocks was highlighted on the screen and participants were asked to recall the sequence and press the blocks in the same or the reverse order. Each phase consisted of 2 different trials of the same sequence length. In order to move on to the next phase, participants needed to complete both trials correctly. The highest possible score was 12. For the backward Corsi block task, in which participants needed to recall the sequence in the reverse order, the highest possible score was 9.
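The all-or-nothing scoring rule for the Digit Span task can be sketched in a few lines of Python. This is an illustration of the rule as described (one point per sequence recalled completely correctly, one sequence per length), not the script used in the study; the function name, the string representation of digit sequences, and the example responses are hypothetical.

```python
# One sequence per length: 2-9 digits forward, 2-6 digits backward.
FORWARD_LENGTHS = range(2, 10)
BACKWARD_LENGTHS = range(2, 7)

def score_digit_span(trials, backward=False):
    """Count trials whose typed response matches the target exactly.

    trials is a list of (presented, typed) digit strings. For the
    backward version the target is the presented sequence reversed.
    """
    score = 0
    for presented, typed in trials:
        target = presented[::-1] if backward else presented
        if typed == target:
            score += 1
    return score

# Hypothetical responses: the last forward and backward items contain an error.
forward_trials = [("27", "27"), ("583", "583"), ("9142", "9124")]
backward_trials = [("27", "72"), ("583", "385"), ("9142", "2491")]

forward_score = score_digit_span(forward_trials)
backward_score = score_digit_span(backward_trials, backward=True)
print(forward_score, backward_score)
```

With one sequence per length, the maximum attainable scores follow directly from the number of lengths: 8 forward and 5 backward, as stated above.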
Procedure
All steps of the data collection were done remotely. All of the participants used their own computers to perform the tasks. They completed all tasks in a single experimental session taking approximately 45 minutes. In the consent form, participants were asked to complete the experiment in a Google Chrome browser and to make sure that their computer speakers worked well. They were also informed that they would need a keyboard. Task instructions were printed in white 18-point Arial on a black screen. Participants completed a demographic questionnaire and then performed the Flanker, Color-Shape, Stroop, Letter-Number, forward and backward Corsi block, and forward and backward Digit Span tasks, in that order. The tasks were presented in the same order to all participants to keep task order constant.
Statistical methods
Confirmatory factor analyses were conducted using the lavaan package (version 0.6-12) in R version 4.2.1 (R Development Core Team, 2005). Initially, a set of candidate models was specified based on previous research; the model that provided the best fit was then selected on the basis of the fit statistics. The chi-square (χ²) test indicates the fit of a model; a nonsignificant p value indicates that the model fits the data well. Cutoff criteria were used to interpret the RMSEA and CFI. An RMSEA value between .05 and .08 signifies a reasonable fit between the model and the data, according to previous research (Browne & Cudeck, 1993). As a measure of comparative improvement in fit, the CFI (Bentler, 1990) compares the baseline model to the postulated model. Earlier studies proposed that, for the CFI, the threshold for acceptable fit is > .90 (Bentler & Bonett, 1980) or, more strictly, > .95 (Hu & Bentler, 1999; Kline, 2011).
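For readers unfamiliar with these indices, the following Python sketch shows the standard textbook formulas for RMSEA and CFI computed from chi-square values and degrees of freedom. It is not the lavaan implementation used in this study, and software packages differ in small conventions (for example, whether N or N − 1 appears in the RMSEA denominator). The chi-square values in the example are invented for illustration and are not results from this paper; the cutoff helper is a simplification of the criteria cited above.

```python
import math

def rmsea(chi2, df, n):
    """RMSEA = sqrt(max(chi2 - df, 0) / (df * (n - 1))).

    Uses the N - 1 convention; some software uses N instead.
    """
    if df <= 0 or n <= 1:
        raise ValueError("df must be positive and n must exceed 1")
    return math.sqrt(max(chi2 - df, 0.0) / (df * (n - 1)))

def cfi(chi2_model, df_model, chi2_base, df_base):
    """CFI = 1 - max(d_model, 0) / max(d_model, d_base, 0), with d = chi2 - df."""
    d_model = max(chi2_model - df_model, 0.0)
    d_max = max(d_model, chi2_base - df_base, 0.0)
    return 1.0 if d_max == 0 else 1.0 - d_model / d_max

def meets_cutoffs(rmsea_value, cfi_value, cfi_cutoff=0.90):
    """Simplified check: RMSEA no larger than .08 and CFI above the chosen cutoff."""
    return rmsea_value <= 0.08 and cfi_value > cfi_cutoff

# Invented example values (not results from this study).
example_rmsea = rmsea(chi2=30.0, df=17, n=320)
example_cfi = cfi(chi2_model=30.0, df_model=17, chi2_base=400.0, df_base=28)
print(round(example_rmsea, 3), round(example_cfi, 3))
print(meets_cutoffs(example_rmsea, example_cfi, cfi_cutoff=0.95))
```

When the model chi-square does not exceed its degrees of freedom, both formulas return their best possible values (RMSEA of 0 and CFI of 1), which is why neither index can distinguish among models that already fit that closely.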
An examination of the measurement properties of a factor or factors across groups is known as factor invariance testing. Widaman and Reise (1997) proposed four main steps for testing measurement invariance: configural, which serves as a prerequisite for the other tests and refers to the presence of the same number of factors in each group, as well as the same pattern of fixed and free parameters; weak factorial (also known as metric), which tests for equal loadings of the tasks across groups; strong factorial (also known as scalar), which tests for equal intercepts across groups; and strict factorial, which additionally tests for equal residual variances across groups. If there is evidence for strong factorial invariance, it is possible to compare latent factor means across groups.
Results
Mean scores and standard deviations for performance on each EF task for the total sample in addition to monolinguals and bilinguals are summarized in Table 1.
Note. For the Flanker and Stroop tasks, the measure was RT in incongruent conditions; for the Color-Shape and Letter-Number tasks, RT in switching conditions; and for the WM tasks, accuracy (number of correct answers).
According to the t-tests, monolinguals were faster than bilinguals in both inhibition tasks, t(306) = −3.20, p = .001 and t(316) = −4.42, p < .001 for the RT in the Flanker and Stroop incongruent conditions, respectively. Monolinguals also performed the switch trials in the shifting tasks faster than bilinguals, t(315) = −1.99, p = .04 and t(291) = −1.34, p = .05 for the Color-Shape and Letter-Number tasks, respectively. There was no significant difference between groups in verbal working memory as measured with FDS, t(312) = −1.18, p = .23, and BDS, t(317) = −.88, p = .37. Likewise, monolinguals and bilinguals performed statistically equivalently on the FCB and BCB tasks, t(302) = .69, p = .49 and t(310) = 1.97, p = .05, respectively.
Correlations between tasks for each group are presented in Table 2. As can be seen, the two inhibition tasks are highly correlated with each other, as are the two shifting tasks, in both groups. However, there is no significant correlation between the WM tasks and the other EF tasks.
Note: Values above the diagonal are the correlation coefficients for monolinguals; values below the diagonal are the correlation coefficients for bilinguals.
Finc: Flanker RT for incongruent conditions, Sinc: Stroop RT for incongruent conditions, LN: Letter-Number task, CS: Color-Shape task, FDS: Forward Digit Span, BDS: Backward Digit Span, FCB: Forward Corsi Block, BCB: Backward Corsi Block
** Correlation is significant at the .01 level
* Correlation is significant at the .05 level
Next, confirmatory factor analysis was used to select the best-fitting model. The model parameters and fit statistics are presented in Table 3, and the model fit comparisons are presented in Table 4. In general, two models fit the data properly: Model 4, a two-factor model with a combined shifting and inhibition factor plus a WM factor, and Model 5, a three-factor model with shifting, inhibition and WM. The two-factor model was preferred for reasons of parsimony; notably, the correlation between shifting and inhibition in the three-factor model was .83.
After selecting the best-fitting model, we tested factorial invariance across the monolingual and bilingual groups. Tests of invariance are provided in Table 5. The 2-factor model did not converge when testing configural invariance, so the 3-factor model was tested instead (illustrated in Figure 1, Model 5). There was evidence for metric invariance. This indicates that the factor loadings of each task on the three EF factors could be constrained to be equal in the monolingual and bilingual groups. Results also supported scalar invariance; that is, the intercepts for the indicators could be constrained to be equal across monolinguals and bilinguals.
Note: Baseline = no invariance constraints, M1 = configural invariance, M2 = metric invariance, M3 = scalar invariance, M4 = equal factor means, M5 = equal means for WM only, M6 = equal means for inhibition only, M7 = equal means for shifting only, M8 = equal correlation between WM and inhibition, M9 = equal correlation between WM and shifting, M10 = equal correlation between shifting and inhibition
The presence of scalar invariance allowed us to test for mean differences in the three latent variables. Fit was significantly poorer when all means were constrained to be equal across monolinguals and bilinguals (M4 comparison in Table 5). We then constrained each of the latent variable means in turn to be equal across groups. For the chi-square difference tests, the scalar model served as the baseline. Constraining the means to be equal for WM produced no change in fit, whereas holding the means equal for inhibition and for shifting resulted in significant chi-square difference tests, indicating poorer fit. Therefore, the final model we interpreted was the model with the WM mean constrained to equality across groups. In examining the differences between mean factor scores in this final model, the referent group was bilinguals, with factor scores of 0 for each factor, while monolinguals’ mean factor scores were 0, −50.04 and −62.39 for WM, inhibition and shifting, respectively.
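The nested-model comparisons described here rest on the chi-square difference test. The sketch below illustrates the arithmetic with the standard library only; the function names and the example chi-square values are hypothetical and are not taken from Table 5. It assumes ordinary maximum-likelihood chi-square values, since robust (scaled) statistics require a corrected difference test.

```python
import math


def chi2_sf(x: float, df: int) -> float:
    """Upper-tail probability P(X > x) for a chi-square variable with integer df.

    Uses the closed-form series for even and odd degrees of freedom.
    """
    if df < 1:
        raise ValueError("df must be a positive integer")
    if x <= 0.0:
        return 1.0
    half = x / 2.0
    if df % 2 == 0:
        # exp(-x/2) * sum_{k=0}^{df/2 - 1} (x/2)^k / k!
        term, total = 1.0, 1.0
        for k in range(1, df // 2):
            term *= half / k
            total += term
        return math.exp(-half) * total
    # Odd df: erfc(sqrt(x/2)) + sqrt(2/pi) * exp(-x/2) * sum_r x^(r - 1/2) / (1*3*...*(2r-1))
    root = math.sqrt(x)
    term, total = root, 0.0
    for r in range(1, (df - 1) // 2 + 1):
        if r > 1:
            term *= x / (2 * r - 1)
        total += term
    return math.erfc(root / math.sqrt(2.0)) + math.sqrt(2.0 / math.pi) * math.exp(-half) * total


def chi2_difference_test(chi2_constrained: float, df_constrained: int,
                         chi2_baseline: float, df_baseline: int):
    """Compare a constrained model with the less constrained model it is nested in."""
    delta_chi2 = chi2_constrained - chi2_baseline
    delta_df = df_constrained - df_baseline
    if delta_df <= 0:
        raise ValueError("the constrained model must have more degrees of freedom")
    return delta_chi2, delta_df, chi2_sf(max(delta_chi2, 0.0), delta_df)


if __name__ == "__main__":
    # Hypothetical fit values: a scalar baseline and a model with one latent mean constrained.
    d_chi2, d_df, p = chi2_difference_test(70.0, 45, 60.0, 44)
    print(f"delta chi2 = {d_chi2:.2f}, delta df = {d_df}, p = {p:.4f}")
```

A significant difference means the added equality constraint worsens fit, which is the pattern reported for the inhibition and shifting means; a nonsignificant difference, as for WM, means the constraint can be retained.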
We also tested for group differences in the correlations between latent variables. Follow-up analyses showed that constraining the correlation between the WM and shifting components made the model fit the data more poorly, whereas the WM-inhibition and inhibition-shifting correlations could be constrained without poorer model fit. This indicated that only the correlation between WM and shifting differed across groups (see Table 5), with a correlation of −.49 for bilinguals and a correlation of −.03 for monolinguals.
The best-fitting models for monolinguals and bilinguals are illustrated in Figures 2 and 3, respectively.
Discussion
It has been shown that bilingualism can have cognitive consequences, especially on EF tasks measuring inhibition and interference control (Bialystok, Reference Bialystok2001; Bialystok et al., Reference Bialystok, Craik, Klein and Viswanathan2004), shifting (Garbin et al., Reference Garbin, Sanjuan, Forn, Bustamante, Rodriguez-Pujadas, Belloch, Hernandez, Costa and Avila2010; Prior & MacWhinney, Reference Prior and MacWhinney2010) and WM (Bialystok et al., Reference Bialystok, Craik and Luk2008; Luo et al., Reference Luo, Craik, Moreno and Bialystok2013). This bilingual advantage has been reported in studies with participants of different age ranges, including children and adults. The notion of cognitive consequences of bilingual language experience has been criticized recently, and some studies and meta-analyses have suggested that there is no difference between bilinguals and monolinguals (Duñabeitia et al., Reference Duñabeitia, Hernández, Antón, Macizo, Estévez, Fuentes and Carreiras2014; Donnelly, Reference Donnelly2016; Paap et al., Reference Paap, Johnson and Sawi2015; Paap & Sawi, Reference Paap and Sawi2014; Shokrkon & Nicoladis, Reference Shokrkon and Nicoladis2021). While rare, a handful of studies have also reported a bilingual disadvantage in EF tasks (e.g., Folke et al., Reference Folke, Ouzia, Bright, De Martino and Filippi2016). The first purpose of this study was to test for the cognitive consequences of bilingualism using CFA with a large sample and tasks tapping all three EF components. The results did not support the bilingual advantage hypothesis. In fact, the present study found a bilingual disadvantage on the RT-based latent variables of inhibition and shifting.
The second aim of this study was to test and compare the structure of EF across monolinguals and bilinguals. The results support a 2- or 3-factor structure when all participants were considered together. To our knowledge, this is one of the first studies testing whether the structure of EF differs between monolinguals and bilinguals. The findings of this study are also in line with previous studies that failed to support a unitary structure among adults and supported multicomponent models (e.g., Fleming et al., Reference Fleming, Heintzelman and Bartholow2016; Ito et al., Reference Ito, Friedman, Bartholow, Correll, Loersch, Altamirano and Miyake2015; Klauer et al., Reference Klauer, Schmitz, Teige-Mocigemba and Voss2010; Miyake et al., Reference Miyake, Friedman, Emerson, Witzki, Howerter and Wager2000). Notably, the results supported the same EF structure for monolingual and bilingual adults; however, the correlation between the WM and shifting components differed across groups, suggesting that WM and shifting are more integrated for bilinguals. In other words, shifting between two languages on a daily basis could possibly lead to greater cross-modal integration between these two components. Further research is needed to determine the correlations between EF components among monolinguals and bilinguals using different EF tasks and considering switching costs.
The results of the present study are in line with previous studies reporting a bilingual disadvantage in some EF tasks (Bialystok et al., Reference Bialystok, Craik and Luk2008; Gollan et al., Reference Gollan, Montoya, Fennema-Notestine and Morris2005; Kaushanskaya & Marian, Reference Kaushanskaya and Marian2007), in which monolinguals perform tasks better than bilinguals, even when bilinguals are tested in their dominant or native language. Nonetheless, the results of the present study conflict with the claim that bilingualism leads to enhanced executive functions (e.g., Paap & Greenberg, Reference Paap and Greenberg2013; Segal et al., Reference Segal, Stasenko and Gollan2019).
Different factors can potentially explain the results of this study. One possible explanation is that the cognitive consequence of bilingual language experience depends on the specific tasks used to measure EFs. This explanation was supported by a recent meta-analysis of 170 studies that examined whether the cognitive consequence of bilingual language experience depends on the EF tasks (Ware et al., Reference Ware, Kirkovski and Lum2020). Other researchers, including Lowe et al. (Reference Lowe, Cho, Goldsmith and Morton2021), Lehtonen et al. (Reference Lehtonen, Soveri, Laine, Järvenpää, de Bruin and Antfolk2018), and Gunnerud et al. (Reference Gunnerud, Ten Braak, Reikerås, Donolato and Melby-Lervåg2020), have also highlighted that cognitive consequences of bilingualism can be observed only in a limited number of specific domains. Similarly, Mas-Herrero et al. (Reference Mas-Herrero, Adrover-Roig, Ruz and de Diego-Balaguer2021) found a cognitive consequence of bilingual language experience only for complex EF and non-linguistic tasks. According to Miyake and Friedman (Reference Miyake and Friedman2012), because the CFA method uses latent variables instead of individual tasks, it is a useful approach to the problems of EF task impurity and task specificity. Even using CFA, the results of this study were still not in favor of the bilingual advantage hypothesis. Future studies are needed to test the cognitive consequence of bilingual language experience, particularly in EF, by using more tasks tapping each component.
Moreover, it is possible that the bilingual language control process recruits different brain regions than the EF components. In other words, the overlap in brain activation between bilingual language control (including inhibition, switching and WM) and the EF components could be only partial in frontal and posterior parietal areas. Some studies found supporting evidence for this explanation. For instance, Jiao et al. (Reference Jiao, Meng, Wang, Schwieter and Liu2022) conducted a meta-analysis of neuroimaging studies to investigate the neural mechanisms underlying bilingual language control and EF. Using contrast analyses, they found stronger convergence of activation in the left fusiform gyrus and occipital gyrus in language switching compared to task switching and, conversely, stronger convergence of activation in the left dorsolateral prefrontal cortex in task switching. They concluded that, despite some similarities between the brain regions for language conflict resolution and EF, there are some unique mechanisms for task and language switching.
Still another possible explanation for the inconsistency between the results of this study and previous studies showing a cognitive consequence of bilingual language experience is publication bias (most journals favor results confirming the bilingual advantage), which can lead researchers not to publish studies with null or negative effect sizes for the relationship between bilingualism and EFs. Moreover, researchers’ confirmation bias may explain the results supporting a positive cognitive consequence of bilingual language experience. Paap et al. (Reference Paap, Mason, Zimiga, Ayala-Silva and Frost2020) argued that the small but significant effect sizes showing a bilingual advantage in the recent meta-analysis might vanish once researcher confirmation bias is controlled for. Consistent with this explanation, de Bruin et al. (Reference de Bruin, Bak and Della Sala2015) claimed that publication bias is a possible factor behind the reported cognitive consequences of bilingual language experience, especially in EFs. Some recent meta-analyses have directly addressed this explanation of publication bias (Lehtonen et al., Reference Lehtonen, Soveri, Laine, Järvenpää, de Bruin and Antfolk2018; Lowe et al., Reference Lowe, Cho, Goldsmith and Morton2021). To estimate bias-adjusted language-status effects, Lowe et al. (Reference Lowe, Cho, Goldsmith and Morton2021), for instance, employed the precision-effect test (PET) and the precision-effect estimate with standard errors (PEESE). They found that when publication bias is corrected, small effects of language status on EF are eliminated.
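For readers unfamiliar with PET and PEESE, the following is a minimal sketch of the general technique, not a reconstruction of the analysis by Lowe et al. Both are weighted least-squares meta-regressions with inverse-variance weights: PET regresses each study's effect size on its standard error, PEESE regresses it on the squared standard error, and the intercept is read as the effect expected from a hypothetical study with no sampling error. The function names and the data are hypothetical.

```python
def weighted_linear_fit(x, y, w):
    """Weighted least-squares fit of y = intercept + slope * x with weights w."""
    if not (len(x) == len(y) == len(w)) or len(x) < 2:
        raise ValueError("x, y and w must have the same length (at least 2)")
    sw = sum(w)
    x_bar = sum(wi * xi for wi, xi in zip(w, x)) / sw
    y_bar = sum(wi * yi for wi, yi in zip(w, y)) / sw
    sxx = sum(wi * (xi - x_bar) ** 2 for wi, xi in zip(w, x))
    if sxx == 0.0:
        raise ValueError("the predictor has no variance")
    sxy = sum(wi * (xi - x_bar) * (yi - y_bar) for wi, xi, yi in zip(w, x, y))
    slope = sxy / sxx
    return y_bar - slope * x_bar, slope


def pet(effects, std_errors):
    """PET: regress effect sizes on standard errors, weighting by 1 / SE^2."""
    weights = [1.0 / se ** 2 for se in std_errors]
    return weighted_linear_fit(std_errors, effects, weights)


def peese(effects, std_errors):
    """PEESE: regress effect sizes on squared standard errors, weighting by 1 / SE^2."""
    weights = [1.0 / se ** 2 for se in std_errors]
    return weighted_linear_fit([se ** 2 for se in std_errors], effects, weights)


if __name__ == "__main__":
    # Hypothetical effect sizes that grow with the standard error, the pattern
    # expected when small studies with small effects go unpublished.
    std_errors = [0.05, 0.10, 0.20, 0.30]
    effects = [0.08, 0.22, 0.37, 0.65]
    pet_intercept, pet_slope = pet(effects, std_errors)
    peese_intercept, peese_slope = peese(effects, std_errors)
    print(f"PET intercept = {pet_intercept:.3f}, slope = {pet_slope:.3f}")
    print(f"PEESE intercept = {peese_intercept:.3f}, slope = {peese_slope:.3f}")
```

When effect sizes shrink toward zero as precision increases, the intercept falls well below the unadjusted average effect, which is the sense in which a small effect can be eliminated after correction.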
Findings of this study should be interpreted with caution because of the study's limitations. First, all of the tasks in this study were computerized and all participants used their own computers to perform the tasks online. Their RTs could therefore have varied with their equipment, because some participants may have had access to high-speed computers and internet connections while others did not, although Kim et al. (Reference Kim, Gabriel and Gygax2019) determined that the timing of PsyToolkit is comparable to that of one of the most widely used lab-based software packages, E-Prime. Second, this study did not control for all of the variables that could lead to group differences in EF, such as IQ and neurodevelopmental disorders like ADHD. Third, we used the RT for incongruent conditions instead of the Flanker and Stroop effects and switching costs. Some studies indicate that difference scores can be unreliable: if the mean RTs on incongruent trials and on congruent trials are each reliable at .90 and correlate with each other at .80, the RT interference effect (difference score) will have a reliability of only .50 (see Draheim et al., Reference Draheim, Mashburn, Martin and Engle2019). Nevertheless, many studies using the CFA approach have used interference effects and switching costs. Lastly, when interpreting the results of this study, it is essential to emphasize a significant aspect of bilingual cognitive research. The cognitive consequences of bilingual language experience in young adults, whose EF is functioning optimally, are not always apparent, as Bialystok et al. (Reference Bialystok, Craik and Luk2008) demonstrate. This research, therefore, is limited by its reliance on this specific demographic, young adults, in whom determining bilingualism's effects on EFs has historically been difficult.
Researchers might benefit from applying CFA to bilingualism and EF, considering a wider age range or populations where bilingualism has more prominent effects on EF in future studies.
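The reliability concern about difference scores raised in the limitations can be checked with the classical formula for the reliability of a difference between two measures. The sketch below assumes equal variances for the two condition means and reads the .80 figure as the correlation between incongruent and congruent RTs, which is what the formula needs to arrive at .50; the function name is hypothetical.

```python
def difference_score_reliability(rel_x: float, rel_y: float, r_xy: float) -> float:
    """Classical reliability of the difference X - Y, assuming equal variances.

    r_dd = ((rel_x + rel_y) / 2 - r_xy) / (1 - r_xy)
    """
    if not -1.0 < r_xy < 1.0:
        raise ValueError("r_xy must lie strictly between -1 and 1")
    return ((rel_x + rel_y) / 2.0 - r_xy) / (1.0 - r_xy)


if __name__ == "__main__":
    # Incongruent and congruent mean RTs each reliable at .90 and correlated at .80.
    print(round(difference_score_reliability(0.90, 0.90, 0.80), 2))
```

The more strongly the two conditions correlate, the less reliable their difference becomes, which is one motivation for using condition RTs rather than interference effects or switching costs as indicators.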
In conclusion, we tested for differences between monolinguals and bilinguals on EF components including inhibition, WM capacity, and shifting. We found no differences between groups in WM capacity. There were, however, differences between groups on the RTs: monolinguals performed the tasks tapping inhibition and shifting ability faster than bilinguals. Moreover, the three-factor EF model was evident for both monolingual and bilingual groups, although the correlation between the WM and shifting components differed across groups. We argue that behavioral studies might only paint part of the picture. We believe that a multi-disciplinary approach, which may include techniques such as longitudinal tracking and neuroimaging, could offer a more comprehensive understanding of the cognitive consequence of bilingual language experience. This argument stems from the intricate nature of the bilingual cognitive experience, which may not solely manifest in behavior but might also involve complex neurological processes. Therefore, while behavioral studies are critical, they could be complemented by other methodologies for a more complete picture of the phenomena under investigation. If there are cognitive consequences of bilingual language experience, future studies are needed to test the generalizability of these findings and to include measures of possible mitigating variables.
Acknowledgments
We would like to thank all participants in the study for dedicating their valuable time to our research. This study was supported by a Discovery Grant (#2018–04978) of the Natural Sciences and Engineering Research Council of Canada. We are grateful to the editors and anonymous reviewers of BLC for their insightful and constructive comments on our manuscript.