Introduction
Vocabulary in an unfamiliar language can be learned using paired associations of the new unfamiliar words with their translations in a known language. For example, in paired-associate learning of Swahili vocabulary, an English speaker would study an unfamiliar word in Swahili (mbwa) presented together with its translation equivalent in English (dog) and attempt to commit the pairings to memory. Learning can be tested using cued recall or associative-recognition tests. In a cued-recall test, one word in the pair is presented and the other must be generated. Typically, the cued-recall test requires receptive retrieval, in that the unfamiliar word is presented as a cue to recall the familiar word. For example, mbwa could be presented and the expected response would be dog. The cued-recall task can be made more difficult by requiring productive retrieval, in which the familiar word is presented as a cue to recall the unfamiliar word (e.g., de Groot & Keijzer, 2000; Griffin & Harley, 1996; Schneider et al., 2002). For example, dog would be presented, and the expected response would be mbwa. In an associative-recognition test, pairs of words are presented, and each pair is to be verified if it is a correct pairing or rejected if it is an incorrect pairing. For example, if mbwa-dog were presented, the response would be yes; if mbwa-table were presented, the response would be no.
In the present study, we investigated three aspects of paired associate learning of unfamiliar vocabulary: (1) monolingual-bilingual differences in learning; (2) associations of language dominance and proficiency with learning; and (3) the possible role of learning strategies. Spanish–English bilinguals learned Swahili words through both of their known languages, and learning was measured using cued recall tests in both directions and associative recognition tests.
Bilingual-monolingual comparisons
Several studies have reported that cued recall and associative recognition of foreign vocabulary were more accurate for bilinguals using their more proficient language than for monolinguals (Bogulski et al., 2018; Kaushanskaya, 2012; Kaushanskaya & Marian, 2009; Kaushanskaya & Rechtzigel, 2012; Van Hell & Mahn, 1997). One study similarly showed more accurate performance in multilinguals than in bilinguals (Papagno & Vallar, 1995). Several explanations for these effects have been offered (see Tsuboi & Francis, 2020 for a review). Among these are that bilinguals may have greater phonological working memory or phonological long-term knowledge (Kaushanskaya, 2012; Papagno & Vallar, 1995; Van Hell & Mahn, 1997), enhanced selective attention or executive function (Kaushanskaya, 2012; Kaushanskaya & Marian, 2009), more effective encoding strategies (Kaushanskaya & Rechtzigel, 2012), and greater experience in associating novel labels with concepts (Kaushanskaya, 2012). Other possible reasons to expect bilinguals to learn vocabulary more efficiently include more efficient association of previously meaningless information to familiar meaningful information and a greater vocabulary to which associations can be made (Tsuboi & Francis, 2020).
Although group differences were large (ηp² ≥ .15) and consistent, five studies had one or more features that would bias performance in favor of the bilingual (or multilingual) participants (see Tsuboi & Francis, 2020 for a more detailed review). First, multilingual self-selected language majors were compared to bilinguals whose less proficient language was a standard academic requirement (Papagno & Vallar, 1995). Second, bilingual dominant-language speakers took 60% longer per item to study than monolinguals or nondominant-language speakers (Bogulski et al., 2018). Third, the bilinguals were first-language English speakers in the U.S. self-selected for K-12 education enriched with second-language classroom immersion and had English vocabulary or working memory spans greater than or equal to the monolingual participants (Kaushanskaya, 2012; Kaushanskaya & Marian, 2009; Kaushanskaya & Rechtzigel, 2012). Matching monolingual and bilingual groups on these factors results in non-representative bilingual samples with particularly high language-learning ability (Prior & MacWhinney, 2010). These five studies also had small bilingual samples (n ≤ 25). In the only study with adequate sample size, monolingual and bilingual groups learned different foreign languages which may not have had equal difficulty (Van Hell & Mahn, 1997). In this last study and some of the others, group matching on age, education, and/or SES was inadequate or not reported. None of these studies had the rigor necessary to draw definitive conclusions about monolingual-bilingual differences.
In contrast, we recently reported a larger and more rigorously controlled study in which monolingual, English-dominant, and Spanish-dominant young adults were matched on age, education, nonverbal cognitive ability, and SES (Tsuboi & Francis, 2020). With 48 participants per group, cued recall and associative recognition accuracy for Japanese-English word pairs showed no group differences. The use of associative strategies (e.g., mediator, sentence) to learn the words was reported more often by bilingual than by monolingual participants (74% vs. 46%), but participants who reported using associative strategies did not have more accurate performance.
As in previous comparisons of monolingual and bilingual cued recall, the Tsuboi and Francis (2020) study used a receptive retrieval task in which the unfamiliar words were used to cue recall of the familiar words. However, it is also informative to consider productive retrieval tasks in which familiar words are presented as cues to recall the corresponding unfamiliar words. Cued recall with productive retrieval is more difficult and depends to a greater degree on phonological memory (e.g., de Groot & Keijzer, 2000; Griffin & Harley, 1996; Schneider et al., 2002). Learning strategies may also be more important for productive retrieval. The greater dependency on phonological memory or the greater potential impact of associative strategy use might be expected to favor bilingual relative to monolingual learners. Published studies using this reversed direction for cued recall include no comparisons of monolingual and bilingual performance. There is one multilingual vs. bilingual study that used productive retrieval (Papagno & Vallar, 1995), but the findings were inconclusive because of self-selection bias and small sample sizes (ns = 10). The present study includes both receptive and productive cued-recall tasks to determine whether group differences emerge in the more difficult productive retrieval task.
Language dominance and proficiency
Language dominance and language proficiency are related but distinct constructs. In general, language proficiency is skill in a particular language, whereas language dominance refers to the relative skill in two languages within an individual speaker. These factors were examined in only two of the reported bilingual studies of foreign vocabulary learning. First, dominant-language speakers had more accurate performance than nondominant-language speakers (Bogulski et al., 2018), but the comparison was compromised by the longer study times of the dominant-language speakers. In the same study, limited unbiased evidence was obtained, in that subjective proficiency ratings in a small sample of Spanish–English bilinguals correlated with accuracy. Specifically, bilinguals who were more English-dominant had more accurate learning of Dutch–English paired associates than bilinguals who were more Spanish-dominant (Bogulski et al., 2018).
In a recent study with a substantially larger sample (Tsuboi & Francis, 2020), objectively assessed proficiency in the known language that was paired with the unfamiliar language correlated reliably with accuracy in cued recall and associative recognition. Specifically, in Spanish–English bilinguals learning Japanese vocabulary through Japanese-English translation pairs, objective English proficiency scores were positively correlated with accuracy on cued recall and associative recognition tests. Similarly, in Japanese-English bilinguals learning Spanish vocabulary through Japanese-Spanish translation pairs, objective Japanese proficiency scores were positively correlated with accuracy on cued recall and associative recognition tests. In both samples, associations of proficiency in the known language not used for the learning task (Spanish and English, respectively) were less conclusive, with some significant effects in the Spanish–English bilinguals but not in the Japanese-English bilinguals. These findings suggest that proficiency in the language paired with the unfamiliar language was an important factor in learning. However, learning of Japanese-English translation pairs did not differ between English-dominant and Spanish-dominant participants (Tsuboi & Francis, 2020), possibly because of the substantial overlap in English proficiency distributions between the two groups. In the present study, Spanish–English bilingual participants learned Swahili vocabulary through both English and Spanish.
In cued recall with receptive retrieval, generating responses in a less proficient language requires retrieval of words that are more difficult to access. The relative difficulty of access to the known language words would therefore be a plausible explanation of the proficiency effects observed in cued recall with receptive retrieval. Even the proficiency effects observed in associative recognition could be explained in this way, if we adopt a recall-to-reject account of associative recognition (Anderson, 1974; Clark, 1992; Mandler, 1980). In contrast, with productive retrieval, the words to be retrieved would be equally unfamiliar across levels of proficiency in the cue language. Therefore, if proficiency effects were to extend to this direction of cued recall, they could not be attributed to differences in difficulty accessing the unfamiliar response words. Because cued recall requires retrieval of both the association formed between the two words of the pair and the appropriate response word (e.g., Madan et al., 2010), any such effects would have to be based on the strength of the association formed between the two words in each pair. To distinguish between these explanations, the present study included cued recall tests with productive retrieval.
Present study
The overarching goal of the present study was to better understand the effects of bilingualism and bilingual proficiency in an experiment that followed the rigor of the Tsuboi and Francis (2020) study but enhanced the likelihood that effects would be detected in two ways. First, the study included cued-recall tests in the more difficult direction, where monolingual-bilingual differences might be magnified and have not previously been compared. Second, each bilingual participant learned vocabulary through both of their languages. Specifically, English-dominant and Spanish-dominant Spanish–English bilinguals learned Swahili–English and Swahili-Spanish pairings in separate phases of the experiment. (English-speaking monolinguals learned Swahili vocabulary through Swahili–English pairings only.)
The first goal was to rigorously compare monolingual and bilingual learning of vocabulary in an unfamiliar language and explore the plausibility of associative strategy use as a mechanism to explain any differences observed. Based on our previous result (Tsuboi & Francis, 2020), we did not expect to find differences between monolingual and bilingual accuracy on cued recall tests with receptive retrieval or on associative recognition tests. In the more difficult cued recall test with productive retrieval, we hypothesized that bilinguals would retrieve the unfamiliar words more efficiently due to a higher rate of associative strategy use or a larger vocabulary (e.g., Gollan et al., 2008) with which to make such associations. Therefore, we predicted that bilingual performance would be enhanced relative to monolingual performance when recalling Swahili words in response to English or Spanish cues.
To address the possible role of associative strategies in any group differences observed, we analyzed self-reported learning strategies. We assessed group differences in the reported use of associative strategies and whether the use of these strategies was associated with learning. We predicted that bilinguals would report using more associative strategies than monolinguals, as in our previous research (Tsuboi & Francis, 2020). Although the reported use of these strategies in that study was not associated with cued recall performance with receptive retrieval, we hypothesized that associative strategies would be helpful for productive retrieval. We therefore predicted that participants who reported using associative strategies would have greater accuracy in the difficult direction of cued recall than participants who did not.
The second goal was to clarify the relationships of language dominance and proficiency to learning, as measured by accuracy in cued recall and associative recognition tests. For each participant, the dominant language was considered to be the language with the higher proficiency score. We hypothesized that learning would be more efficient when unfamiliar words were paired with the dominant language, because of having a larger vocabulary in that language with which to associate the novel Swahili words. Specifically, we predicted that cued recall with receptive retrieval and associative recognition would be more accurate when Swahili words were paired with their translations in the dominant language. Similarly, we hypothesized that higher proficiency in the language through which unfamiliar vocabulary was learned would be associated with greater learning, also because of having a larger vocabulary with which to associate the novel Swahili words. We also hypothesized that proficiency in other languages would not be associated with learning. Specifically, for cued recall with receptive retrieval and associative recognition, we predicted that English performance would be positively correlated with English proficiency but not Spanish proficiency. Similarly, we predicted that Spanish performance would correlate with Spanish but not English proficiency. We also hypothesized that the effects of dominance and proficiency do not depend on the relative retrievability of familiar response words. Instead, bilinguals with higher proficiency in the task language would form stronger associations at encoding that would be easier to retrieve at test. Thus, in the more difficult cued recall test with productive retrieval, if the effects of dominance or proficiency are in the encoding of new associations, we would expect the same pattern of effects as for receptive retrieval. 
In contrast, if proficiency effects are based on the retrievability of known words, then no effects of dominance or proficiency would be expected in productive retrieval.
Method
Power and sample size
Reported monolingual-bilingual differences in previous studies were very large effects (cued recall: Mean d = 1.11; associative recognition: Mean d = 1.05; averaged across effect sizes cited in Table 1 of Tsuboi and Francis (2020), excluding that study and comparisons in which bilinguals learned through their nondominant language). Even the smallest reported difference was a medium-to-large effect (d = .63; Kaushanskaya & Rechtzigel, 2012). Nevertheless, we wanted to have 80% power to detect medium-sized pairwise group differences (d = .50) tested as single-df comparisons in a three-group design. A power analysis in G*Power showed that this would require 128 participants in total, at least 43 participants for each of the three language groups. (Because power analyses for logistic mixed-effects regression are not well developed, we used a power analysis based on ANOVA.) Due to counterbalancing considerations requiring a multiple of 8 in each group, we tested 48 participants in each language group, which yields power of 85% to detect a medium group difference.
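The sample-size reasoning in this paragraph can be checked roughly with a few lines of arithmetic. The sketch below is not the G*Power computation itself: it assumes the usual two-group conversion f = d / 2 and replaces the noncentral F distribution with a normal approximation, so its values should only land close to the reported 80% and 85%. The function names are illustrative.

```python
from math import ceil, sqrt
from statistics import NormalDist


def approx_power_single_df(d, n_total, alpha=0.05):
    """Approximate power of a single-df contrast in a between-subjects ANOVA.

    Assumptions (not from the paper): Cohen's f = d / 2, and a normal
    approximation in place of the noncentral F distribution.
    """
    nd = NormalDist()
    f = d / 2.0
    delta = f * sqrt(n_total)          # square root of noncentrality f^2 * N
    z_crit = nd.inv_cdf(1 - alpha / 2)  # two-sided critical value
    return nd.cdf(delta - z_crit) + nd.cdf(-delta - z_crit)


def round_up_to_multiple(n, k):
    """Smallest multiple of k that is at least n (counterbalancing constraint)."""
    return ceil(n / k) * k


if __name__ == "__main__":
    print("N = 128:", round(approx_power_single_df(0.50, 128), 3))
    print("N = 144:", round(approx_power_single_df(0.50, 144), 3))
    print("per-group n:", round_up_to_multiple(43, 8))
```

Rounding 43 per group up to the next multiple of 8 gives the 48 per group that were tested.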
Note: Values indicate means or percentages unless otherwise indicated; numbers in parentheses are standard deviations.
a. Some percentages do not add to 100%, because several questionnaire responses were missing for three participants.
b. Both indicates that reported ages of acquisition for the two languages were one year apart or less, indicating simultaneous acquisition.
c. 2% biweekly, 6% monthly, 10% every few months, 15% once or twice per year, 19% less than once or twice per year.
d. Mean age-equivalency scores for performance on the WMLS-R (Woodcock et al., 2005). Age equivalency is the normative age in years at which performance would be average for a monolingual speaker.
e. Scores indicate mean age-equivalency levels for performance on the WJ-III (Woodcock et al., 2001).
We also anticipated examining the effects of language dominance and proficiency in a continuous manner. In previous studies, the correlations of dominance scores and accuracy were medium to large. In the most similar study (Tsuboi & Francis, 2020, Experiment 1), correlations of oral language scores with cued recall and associative recognition scores in Spanish–English bilinguals ranged from .311 to .447. With 96 bilingual participants, the present experiment has 85% power to detect a medium correlation (r = .30).
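The power figure for the correlational analyses can be approximated in the same spirit. The sketch below uses the Fisher z transformation with a two-sided test at alpha = .05. This is a standard approximation, assumed here for illustration, and not necessarily the procedure the power software applied.

```python
from math import atanh, sqrt
from statistics import NormalDist


def approx_power_correlation(r, n, alpha=0.05):
    """Approximate two-sided power to detect a population correlation r with n cases.

    Uses the Fisher z approximation: atanh(r) has standard error 1 / sqrt(n - 3).
    """
    nd = NormalDist()
    shift = atanh(r) * sqrt(n - 3)      # expected test statistic under the alternative
    z_crit = nd.inv_cdf(1 - alpha / 2)
    return nd.cdf(shift - z_crit) + nd.cdf(-shift - z_crit)


if __name__ == "__main__":
    # 96 bilingual participants, medium correlation
    print(round(approx_power_correlation(0.30, 96), 3))
```

With r = .30 and n = 96, the approximation comes out close to the 85% stated above.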
Participants
The participants were English-speaking monolinguals, English-dominant Spanish–English bilinguals, and Spanish-dominant Spanish–English bilinguals (N = 144), with 48 participants in each language group. All participants were recruited at the University of Texas at El Paso (UTEP), and they received course credits or $10 per hour for research participation. Seven participants with complete experimental and assessment data were excluded: 1 because of low English proficiency not detected until after the experiment was complete, 1 because of extremely low cognitive scores, and 5 because they were inadvertently run on the wrong form. All excluded participants were replaced to preserve counterbalancing. Participant self-reported characteristics and objective assessment scores are summarized in Table 1.
The objective assessments of proficiency in English and Spanish were the Picture Vocabulary and Verbal Analogies subtests of the Woodcock-Muñoz Language Survey-Revised (WMLS-R; Woodcock et al., 2005). This instrument is an objective and standardized assessment normed with monolingual English and monolingual Spanish speakers in the Americas. Scores for each subtest were entered into the scoring program, which uses the scores to compute oral language composite scores for each language, which were then converted to age-equivalency scores. The age-equivalency score represents the average age at which a monolingual speaker achieves the same performance level and is reported in Table 1. However, for inferential analyses, calibrated scores (W) provided by the WMLS-R scoring program were used, because they are closer to normally distributed (see Schrank et al., 2010 for technical information). It is important to note that these two subtests rely primarily on knowledge of individual words in the language and not grammar or connected language, which limits their scope as an index of proficiency.
The bilingual participants were students who had extensive experience using English and Spanish for substantial communication and had little to no difficulty conversing fluently with the experimenter in both languages. Nearly all of the students (96%) reported either learning Spanish first or learning English and Spanish simultaneously from early childhood. Current regular usage is evidenced by 98% reporting that they used both English and Spanish either every day or several days per week. Bilingual participants had to score at least 8 years on the Oral Language composite in both English and Spanish or demonstrate clear spoken proficiency in conversing with the experimenter with a score of at least 7 years. Bilingual participants whose English score was higher than their Spanish score were classified as English-dominant bilinguals; those whose Spanish score was higher than their English score were classified as Spanish-dominant bilinguals. Their age-equivalency scores on the Oral Language composite averaged 16.3 years in their more proficient language and 11.2 in their less proficient language. Based on calibrated W scores, English-dominant bilinguals scored higher in English than the Spanish-dominant bilinguals, t(94) = 6.500, p < .001, and Spanish-dominant bilinguals scored higher in Spanish than the English-dominant bilinguals, t(94) = 6.859, p < .001. Also, the degree of dominance did not differ reliably across groups, t(94) = 1.080, p = .283.
The monolingual group consisted of students who were functionally monolingual, in that they did not have fluent conversational skills in Spanish and had relatively brief and infrequent use of the little Spanish they knew; they were not able to converse fluently with the experimenters. The vast majority (83%) learned English as their only first language, and 12% were exposed to both languages early in life but did not become proficient in Spanish. English monolingual participants had to obtain an age-equivalency score of at least 8 years on the Oral Language composite in English, have no reported proficiency in any other languages, not be able to engage in Spanish conversation with the experimenter, and obtain scores less than 8 years in Spanish to qualify to participate in the present experiment. Their age-equivalency scores on the Oral Language composite averaged 19.0 in English (significantly higher than the English-dominant group, t(94) = 2.29, p = .024) and 3.2 in Spanish (significantly lower than the English-dominant group, t(94) = 17.01, p < .001).
As shown in Table 1, the three language groups were well matched on age, education, and socio-economic status, as indicated by the highest parent education level (Galobardes et al., 2006). Age differences among groups were not significant, F(2, 140) = .488, MSE = 30.74, p = .615, ηp² = .007. Education for participants and parents was measured on a six-point ordinal scale (less than 8th grade, some high school, graduated high school, some college, graduated college, advanced degree). All participants were in the “some college” category. Reported parent education levels had a median of “graduated college” for all groups. Two subtests from the Woodcock-Johnson III Test of Cognitive Abilities (WJ-III; Woodcock et al., 2001) were administered: Spatial Relations and Picture Recognition. The scoring program used these subtests to compute a composite visuo-spatial reasoning score and convert all scores to age-equivalency scores. Here, although the English monolinguals scored somewhat lower numerically than the bilinguals, the difference was not statistically significant, F(1, 142) = 2.864, p = .093, ηp² = .02.
Materials
A set of 68 Swahili–English word pairs was selected from Nelson and Dunlosky (1994). The pairs were selected according to three criteria: 1) There was a clear Spanish translation equivalent; 2) The Swahili word did not resemble the English or Spanish translation equivalent; 3) The English and Spanish translation equivalents were deemed likely to be known in the less proficient language of a bilingual. The stimulus set included English, Spanish, and Swahili translation equivalents for 68 word concepts (see Appendix A). The mean letter length was 5.5 for the Swahili words, 5.6 for the English words, and 6.3 for the Spanish words. These words were randomly divided into two sets matched on difficulty (based on norms from Nelson & Dunlosky, 1994) and on the proportion of concrete and abstract words (82% concrete, 18% abstract). The assignment of item sets to languages was counterbalanced across participants. Also, the half of each set that was assigned to be correctly or incorrectly paired in the associative recognition test was counterbalanced across participants. Incorrect pairings for this test were randomly generated.
Procedure
At the beginning of the session, following informed consent, the WMLS-R was administered in English and Spanish. In the main experiment, participants learned two different sets of 34 Swahili words. As illustrated in Figure 1, bilingual participants studied and were tested on Swahili–English translation pairs and Swahili-Spanish translation pairs in separate trial blocks, with the order of languages counterbalanced across participants. Monolingual participants studied and were tested on two sets of Swahili–English translation pairs in separate trial blocks. Between blocks, participants completed cognitive assessments and questionnaires.
Each language block began with three study-test cycles with Swahili cues and responses in the known language. A series of 34 Swahili–English (or Swahili-Spanish) word pairs were presented one at a time in random order for 8 sec each, during which participants were to say the pair aloud three times. The screen went blank for 1 sec between word pairs. Immediately after studying the last word pair, the cued recall task began. The 34 Swahili words were presented one at a time in random order, and the participant attempted to recall and say the corresponding English (Spanish) word aloud. The experimenter used a worksheet that contained the expected responses to mark correct and incorrect responses. Upon completion of the first cued recall test, the study phase for the next cycle began. The second and third study test cycles were the same as the first except that the items were presented in different random orders. Upon completion of the third study-test cycle, an additional cued recall test was given in which the 34 English (Spanish) words were presented one at a time in random order, and the participant was to recall the corresponding Swahili words.
Next, participants completed an associative recognition test. On this test, 34 word pairs were presented, 17 of the originally studied pairs and 17 incorrect pairings of studied Swahili and English (Spanish) words. Participants were to indicate whether each pairing was correct or incorrect by pressing a yes (/) or no (z) key on the computer keyboard. Upon completion of this final test, participants were given a questionnaire (adapted from Tsuboi & Francis, 2020) that asked them to explain the strategies that they used to commit the words to memory and later recall them.
The procedure as described was used for in-person administration of the protocol. About halfway through data collection, the protocol was adapted for remote administration during the pandemic, and 41 participants completed the remote version of the experiment in scheduled Zoom appointments with the experimenters. Participants were instructed to put Zoom into full-screen mode to avoid being distracted by other applications. The only change to the language assessments is that the picture vocabulary test items were displayed on the shared computer screen. Questionnaires were programmed and administered through QuestionPro. In the computerized experiment proper, PsyScope was used to present the stimuli on a shared screen. The only change was that on the associative recognition test, participants simply said yes or no aloud, and the experimenter entered their responses.
Approach to analysis
The primary dependent variables were cued recall accuracy, associative recognition accuracy, and associative strategy use. The first set of analyses examined group differences in accuracy, with a focus on comparisons between monolinguals and bilinguals on learning of Swahili–English word pairs only. The second set of analyses, which include data only from the bilingual participants, examined the effects of language dominance on accuracy using both Swahili–English and Swahili-Spanish word pairs. Here, the dominant language was operationalized as the language with the higher language proficiency score. The third set of analyses examined the associations of language proficiency scores with accuracy separately for bilingual and monolingual participants. Within each of these sets of analyses, logistic mixed-effects regression was used to analyze cued recall accuracy, and ANCOVAs on signal-detection measures were used for associative recognition accuracy. The final set of analyses focused on the use of associative strategies. Reported usage of the strategies was compared across groups using chi-square tests, and the associations of reported strategies with accuracy on the learning measures were analyzed using ANOVA.
Results
Accuracy
Mean accuracy scores are given in Table 2 for all groups, conditions, and measures. Cued recall data were analyzed using logistic mixed-effects regression in Jamovi (The jamovi project, 2022). We report the effects of interest in this section, and detailed results are provided in tables in Appendix B. In each analysis, we used the maximal random-effects structure that converged. Associative recognition data were analyzed using ANCOVAs on the signal detection measure d′. In the mixed-effects and ANCOVA analyses comparing groups, we included cognitive scores as a covariate to remove the association of cognitive scores with accuracy from the error variance. (The results of more traditional analyses using ANOVA are provided in the data repository for comparison with previous studies.)
Note: Cued recall cycles 1, 2, and 3 refer to recall with Swahili cues; Swahili cued recall refers to recall with Swahili responses. FA = false alarm.
Monolingual-bilingual comparisons
To compare the performance of the monolingual and bilingual groups, only data from English trial blocks were included in the analyses. Cued recall results are illustrated in Figure 2. For cued recall with receptive retrieval (Swahili cues, English responses), the fixed factors included language group, recall cycle (within-subjects), and their interaction. Cognitive assessment scores (W scores) were included as a covariate. The model included random intercepts for participants and items, random slopes across recall cycle for participants and items, and random slopes across language groups for items. Planned comparisons of groups showed that recall was more accurate for English monolinguals relative to Spanish-dominant bilinguals, b = .773, SE = .277, z = 2.796, p = .005, but not relative to English-dominant bilinguals, b = .465, SE = .273, z = 1.705, p = .088. Accuracy of the two bilingual groups did not differ (p > .2). Accuracy improved across the three study-test cycles, as indicated by a significant linear trend, b = 2.408, SE = .078, z = 30.834, p < .001. A significant interaction showed that the difference between monolingual and Spanish-dominant bilingual performance changed across the three cued recall cycles, b = .364, SE = .152, z = 2.391, p = .017. Higher cognitive assessment scores were associated with higher accuracy, b = .021, SE = .009, z = 2.343, p = .019 (see Table B.1 for additional details).
For cued recall with productive retrieval (English cues, Swahili responses), the fixed factor was language group. Cognitive assessment scores were included as a covariate. The model included random intercepts for participants and items and random slopes across language groups for items. Planned comparisons showed that English monolinguals had more accurate productive cued recall than English-dominant, b = .658, SE = .310, z = 2.125, p = .034, and Spanish-dominant bilinguals, b = .801, SE = .314, z = 2.550, p = .011. The accuracy of productive cued recall did not differ for the two bilingual groups (p > .5). Cognitive assessment scores were not significantly associated with accuracy (p > .2) (see Table B.2 for additional details.)
For the associative recognition data, hit rates and false alarm rates were obtained and used to compute the signal detection statistic d’ (illustrated in Figure 3). The d’ scores were submitted to a one-way between-subjects ANCOVA across language groups, with cognitive assessment scores as a covariate. No group differences were detected (English monolingual vs. English dominant: F(1, 140) = .01, MSE = 1.097, p = .928, ηp2 < .01; English monolingual vs. Spanish dominant: F(1, 140) = .91, MSE = 1.097, p = .342, ηp2 < .01; English dominant vs. Spanish dominant: F(1, 140) = 1.56, MSE = 1.097, p = .213, ηp2 = .01). Higher cognitive scores were associated with greater accuracy, F(1, 140) = 4.148, MSE = 1.097, p = .044, ηp2 = .029.
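For readers unfamiliar with the measure, d’ is the difference between the z-transformed hit rate and the z-transformed false alarm rate. The sketch below illustrates the computation from trial counts. The log-linear correction for rates of exactly 0 or 1 is an assumption made for this illustration; the text does not say how extreme rates were handled. The function name `d_prime` is hypothetical.

```python
from statistics import NormalDist


def d_prime(hits, misses, false_alarms, correct_rejections):
    """Compute d' = z(hit rate) - z(false alarm rate) from trial counts.

    A log-linear correction (add 0.5 to each count, 1 to each total) keeps
    the rates away from 0 and 1. This correction is an assumption of the
    sketch, not a procedure taken from the study.
    """
    hit_rate = (hits + 0.5) / (hits + misses + 1.0)
    fa_rate = (false_alarms + 0.5) / (false_alarms + correct_rejections + 1.0)
    z = NormalDist().inv_cdf  # inverse of the standard normal CDF
    return z(hit_rate) - z(fa_rate)


# Made-up counts for one participant: 18 hits on 20 correct pairings,
# 4 false alarms on 20 incorrect pairings.
print(round(d_prime(18, 2, 4, 16), 2))
```

A value of 0 indicates no discrimination between correct and incorrect pairings; larger values indicate better discrimination.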
Bilingual language dominance
To examine the effects of language dominance, only data from the bilingual participant groups were included. For cued recall with receptive retrieval (Swahili cues), the fixed factors included language group, the within-subjects factors of response language and recall cycle, and all possible interactions among the factors. The model included random intercepts for participants and items, random slopes for response language and recall cycle across participants and items, and random slopes for language group across items. On this task, the overall accuracy of the two language groups did not differ (p > .5). Overall, cued recall was more accurate with English than with Spanish responses, b = .326, SE = .107, z = 3.050, p = .002. As shown in Figure 2, these factors interacted, b = .632, SE = .195, z = 3.248, p = .001, such that English-dominant bilinguals had more accurate recall in English than Spanish, b = .642, SE = .145, z = 2.523, p < .001, but Spanish-dominant participants did not show this effect (p > .5). Accuracy improved across the three study-test cycles, as indicated by a significant linear trend, b = 2.323, SE = .078, z = 29.659, p < .001. Higher cognitive assessment scores were associated with greater accuracy, b = .020, SE = .010, z = 1.979, p = .048 (see Table B.3 for additional details.)
For cued recall with productive retrieval (Swahili responses), the fixed factors were language group, the within-subjects factor of cue language, and their interaction. The model included random intercepts for participants and items, random slopes for cue language across participants and items, and random slopes for group and the interaction of group and cue language across items. Overall accuracy of the two groups did not differ (p > .5). Cued recall was more accurate overall with English than with Spanish cues, b = .508, SE = .105, z = 4.858, p < .001. These factors interacted, b = .550, SE = .188, z = 2.927, p = .003, such that English-dominant bilinguals had more accurate recall with English cues than with Spanish cues, b = .783, SE = .146, z = 5.350, p < .001, but Spanish-dominant participants did not show this effect, b = .233, SE = .135, z = 1.734, p = .083. Cognitive scores were not significantly associated with accuracy (p > .1) (see Table B.4 for additional details.)
For associative recognition, the d’ scores (illustrated in Figure 3) were submitted to a 2 (bilingual group) x 2 (known word language) mixed ANCOVA with cognitive assessment scores as a covariate. Bilingual associative recognition was not significantly more accurate in English than in Spanish, F(1, 93) = 1.513, MSE = .447, p = .222, ηp2 = .016, and there was no main effect of bilingual group (F < 1). However, there was a significant interaction of bilingual group and known word language, F(1, 93) = 5.097, MSE = .447, p = .026, ηp2 = .052, such that the English-dominant group had an advantage for English over Spanish, F(1, 47) = 8.913, MSE = .368, p = .004, ηp2 = .159, but the Spanish-dominant group did not, F(1, 47) = .160, MSE = .532, p = .691, ηp2 = .003. Higher cognitive assessment scores were associated with greater accuracy, F(1, 93) = 4.333, MSE = 2.089, p = .040, ηp2 = .045.
Language proficiency
Simple correlations of proficiency scores with cued recall and associative recognition accuracy are given in Table 3. The two bilingual groups were pooled to make the proficiency scores closer to normally distributed. For recall with Swahili cues, the average recall score across blocks was used. A Bonferroni correction was applied to each set of six correlations with a familywise error rate of .05 (α = .0083). In both monolingual and bilingual participants, English proficiency scores were significantly correlated with all measures from the English block. For bilinguals, Spanish proficiency scores were significantly associated with all measures from the Spanish block. No correlations of Spanish proficiency scores with English performance or English proficiency scores with Spanish performance survived the Bonferroni correction. However, the nonsignificant correlations are not so low as to provide strong evidence that only proficiency in the involved known language is associated with learning. Note that proficiency scores in English and Spanish were not correlated in bilingual participants, r(94) = -.026, p = .803, or in monolingual participants, r(46) = -.153, p = .300. Therefore, any positive associations between language proficiency in the uninvolved known language and accuracy cannot be an artifact of such a correlation.
Note. Swahili Cue (Overall) refers to the mean of the three recall cycles with Swahili cues. Associative recognition correlations are with d’ scores. Proficiency scores are Oral Language composite (calibrated W) scores from the WMLS-R (Woodcock et al., Reference Woodcock, Muñoz-Sandoval, Ruef and Alvarado2005)
*p < .05, **p < .0083, ***p < .001
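The per-test threshold used for each set of six correlations follows directly from dividing the familywise error rate by the number of tests. A minimal sketch of that arithmetic is given here; the function names are hypothetical, and the example p values are illustrative rather than taken from Table 3.

```python
def bonferroni_alpha(familywise_alpha, n_tests):
    """Per-test alpha that holds the familywise error rate at the given level."""
    return familywise_alpha / n_tests


def survives_correction(p_value, familywise_alpha=0.05, n_tests=6):
    """True if a p value falls below the Bonferroni-corrected threshold."""
    return p_value < bonferroni_alpha(familywise_alpha, n_tests)


alpha = bonferroni_alpha(0.05, 6)
print(round(alpha, 4))             # per-test threshold for six correlations
print(survives_correction(0.004))  # illustrative p value below the threshold
print(survives_correction(0.016))  # illustrative p value above the threshold
```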
For more rigorous inferential assessments of the associations of proficiency with cued recall, we conducted logistic mixed-effects regression analyses with proficiency scores (W scores on the WMLS-R) as a covariate predictor. For cued recall with receptive retrieval (Swahili cues), the fixed factors were response language, recall cycle (within-subjects), and their interaction, and proficiency scores in the response language was the covariate. Note that this means that English proficiency scores were entered for English response trials, and Spanish proficiency scores were entered for Spanish response trials. The model included random intercepts for participants and items and random slopes for response language and recall cycle across participants and items. Higher proficiency scores in the response language were associated with greater accuracy in receptive retrieval, b = .024, SE = .004, z = 5.354, p < .001 (see Table B.5 for detailed results.)
For cued recall with productive retrieval (Swahili responses), the fixed factor was cue language, and the covariate was proficiency scores in the cue language. The model included random intercepts for participants and items and random slopes for cue language across both participants and items. As expected, higher proficiency scores in the cue language were associated with greater accuracy in productive retrieval, b = .024, SE = .004, z = 5.620, p < .001 (see Table B.6 for detailed results.)
We also conducted proficiency analyses within the monolingual participant group. For receptive retrieval, the fixed factor was recall cycle (within-subjects), and the covariate was English proficiency scores. The model included random intercepts for items and participants and random slopes for recall cycle across participants and items. Participants with higher English proficiency scores had more accurate receptive retrieval, b = .024, SE = .011, z = 2.284, p = .022. For productive retrieval, the only predictor was the covariate English proficiency scores, and the model included random intercepts for items and participants. Participants with higher English proficiency scores had more accurate productive retrieval, b = .069, SE = .023, z = 3.044, p = .002. When these analyses were conducted using Spanish proficiency scores, no significant associations were observed (ps > .2) (see Tables B.7, B.8, B.9, and B.10 for detailed results.)
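Because these models are logistic, each coefficient b is a change in the log odds of correct recall per one-point increase in the W score. The sketch below converts the rounded coefficients reported above into odds ratios; the 10-point increment is an arbitrary illustration and not a unit used in the study, and the function name `odds_ratio` is hypothetical.

```python
from math import exp


def odds_ratio(b, delta=1.0):
    """Multiplicative change in the odds of a correct response when the
    predictor increases by `delta` units, given a logistic coefficient b."""
    return exp(b * delta)


# Rounded coefficients reported above for proficiency (W score) predictors.
for label, b in [("bilingual, receptive", 0.024),
                 ("monolingual, productive", 0.069)]:
    print(f"{label}: OR per point = {odds_ratio(b):.3f}, "
          f"OR per 10 points = {odds_ratio(b, 10):.3f}")
```

Under this reading, a coefficient of .024 corresponds to roughly 27% higher odds of correct recall for each additional 10 points of proficiency, holding the other predictors constant.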
Strategy use
We compared self-reported learning strategies across language groups. Responses to the strategy use questionnaires were coded according to several possible categories, mostly drawn from previous research (Bangert & Heydarian, Reference Bangert and Heydarian2017; Tsuboi & Francis, Reference Tsuboi and Francis2020). The proportion of participants who reported each strategy is given in Table 4 for each language group (note that 2 monolingual participants did not complete the strategy questionnaire.) Reported study strategies were coded into the following categories: (1) word association or mediator, (2) sentence or phrase formation, (3) imagery, (4) rote repetition, (5) structural similarities, (6) focus on spelling, and (7) focus on pronunciation. Individual responses were coded into as many categories as were reported. Inter-rater agreement (proportion consistent) and reliability (Cohen's κ; Cohen et al., Reference Cohen, MacWhinney, Flatt and Provost1993) were computed for each category (see Table 4). Two raters coded strategies for 22 participants together and then independently coded strategies for the remaining 120 participants (2 participants did not complete the strategy questionnaire). All disagreements were resolved by consensus. We considered the word association or mediator, sentence or phrase formation, and imagery strategies to be associative in nature, and the proportion of participants who reported at least one of these associative strategies was computed. As shown in Table 4, monolingual and bilingual participants used at least one associative strategy at similar rates, χ2(1, N = 142) = .172, p = .679, ϕ = .035. Bilinguals were more likely than monolinguals to report using imagery, χ2(1, N = 142) = 7.541, p = .006, ϕ = .230, but no language-group differences were observed for the other strategies (ps > .1).
Note. Each proportion listed for individual strategy categories includes participants who reported multiple strategies.
aProportion of participants who reported at least one of the three associative learning strategies listed above.
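Two of the statistics used in the strategy analysis can be illustrated briefly: Cohen's κ for agreement between two raters on one strategy category, and the ϕ effect size for a 2 x 2 chi-square test, which equals the square root of χ2 divided by N. The sketch below is illustrative only. The rater codes are invented, and the function names are hypothetical; only the χ2 and N values passed to the second function come from the text.

```python
from math import sqrt


def cohen_kappa(rater_a, rater_b):
    """Return (proportion agreement, Cohen's kappa) for two raters' codes.

    Kappa corrects observed agreement for the agreement expected by chance
    from each rater's marginal proportions. Undefined if chance agreement
    equals 1 (both raters used a single identical category throughout).
    """
    n = len(rater_a)
    observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    categories = set(rater_a) | set(rater_b)
    expected = sum((rater_a.count(c) / n) * (rater_b.count(c) / n)
                   for c in categories)
    return observed, (observed - expected) / (1.0 - expected)


def phi_from_chi2(chi2, n):
    """Phi effect size for a 2 x 2 table: sqrt(chi-square / N)."""
    return sqrt(chi2 / n)


# Invented codes (1 = strategy reported, 0 = not reported) for 10 responses.
a = [1, 1, 0, 0, 1, 0, 1, 1, 0, 0]
b = [1, 1, 0, 0, 1, 0, 0, 1, 0, 1]
agreement, kappa = cohen_kappa(a, b)
print(round(agreement, 2), round(kappa, 2))

# Chi-square values reported above, with N = 142.
print(round(phi_from_chi2(0.172, 142), 3), round(phi_from_chi2(7.541, 142), 3))
```

The ϕ values recovered this way agree with those reported for the associative strategy and imagery comparisons.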
To examine whether the use of associative strategies was associated with higher accuracy on the memory measures, a 3 (language group) x 2 (use of associative strategies) ANOVA was performed for each of the six memory measures. To correct for multiple comparisons, a Bonferroni correction with a familywise error rate of .05 was applied (α = .0083). Participants who reported using at least one associative strategy had higher accuracy on cued recall with Swahili cues and Spanish responses than those who did not, F(1, 94) = 8.701, MSE = .034, p = .004, ηp2 = .085. However, none of the other measures yielded a significant associative strategy effect (ps > .05). Reported use of imagery in particular also was not associated with higher accuracy on any of the learning measures (ps > .05).
Discussion
The first goal of the study was to rigorously compare monolingual and bilingual learning of vocabulary in an unfamiliar language and explore the plausibility of strategic factors as a mechanism for any group differences. When Swahili cues were used to retrieve English words and in associative recognition, there were no significant differences in accuracy between English monolingual and English-dominant bilingual performance, consistent with our previous research (Tsuboi & Francis, Reference Tsuboi and Francis2020) and in contrast to other previous research (e.g., Bogulski et al., Reference Bogulski, Bice and Kroll2018; Kaushanskaya & Marian, Reference Kaushanskaya and Marian2009). If anything, the monolingual scores were numerically higher. We had predicted that English-dominant bilinguals would outperform English-speaking monolinguals in the more difficult productive retrieval task in which English words were used to cue the unfamiliar Swahili words. However, the opposite pattern was observed, suggesting that bilinguals did not have an advantage based on their larger total vocabulary (e.g., Gollan et al., Reference Gollan, Montoya, Cera and Sandoval2008) or greater phonological long-term knowledge (e.g., Kaushanskaya, Reference Kaushanskaya2012). Across learning measures involving English, English monolingual performance was more accurate than Spanish-dominant bilingual performance. Based on the bilingual results to follow, we interpret these differences as proficiency-related effects.
Note that the three groups were well matched on age, education, and parent education level. If anything, the English monolingual group scored lower on the measure of nonverbal cognitive ability, so there is no evidence that the bilinguals in the present study were cognitively disadvantaged. We intentionally did not match groups on language proficiency scores, because bilinguals on average have lower vocabulary in each of their languages due to splitting their usage between languages (e.g., Gollan et al., Reference Gollan, Montoya, Cera and Sandoval2008). Therefore, matching monolingual and bilingual vocabulary scores would result in non-representative bilingual samples with particularly high language-learning ability (as pointed out by Prior & MacWhinney, Reference Prior and MacWhinney2010; Tsuboi & Francis, Reference Tsuboi and Francis2020) or non-representative monolingual samples with particularly low language-learning ability.
We examined whether the reported use of associative strategies differed across groups and whether it was predictive of learning. Bilingual and monolingual participants reported the use of associative strategies for learning the word pairs at similar rates (about 74%), although bilinguals reported the use of imagery more frequently than monolinguals. The use of associative strategies was not consistently associated with more efficient learning. This result converges with previous findings that self-reported use of particular strategies was not associated with cued recall of words in an unfamiliar language with productive retrieval (Bangert & Heydarian, Reference Bangert and Heydarian2017; Tsuboi & Francis, Reference Tsuboi and Francis2020).
The second goal of the study was to clarify the relationships of language dominance and proficiency in known languages to the efficiency of learning vocabulary in an unfamiliar language. The three learning measures (recall with Swahili cues, recall with Swahili responses, and associative recognition) showed consistent patterns of performance with respect to language dominance group and task language. In each case, there was a significant interaction of language dominance group with the nominal language of the task, such that performance was more accurate when the task was performed in the dominant language. Overall, accuracy was higher in English than in Spanish. Given that the degree of dominance of the two bilingual groups was similar (as shown by WMLS-R scores in Table 1), this effect indicates simply that the task was more difficult in Spanish. Thus, it is a property of the materials, not a characteristic of the participants.
In bilinguals, proficiency in English or Spanish was correlated with learning measures that involved the same language (with the exception of associative recognition in English), but proficiency was not reliably correlated with learning measures in the other language (a pattern consistent with Tsuboi & Francis, Reference Tsuboi and Francis2020). Contrary to our previous study (Tsuboi & Francis, Reference Tsuboi and Francis2020), monolingual English proficiency was also significantly correlated with learning measures. Note, however, that Spanish proficiency scores in monolingual participants were not correlated with the learning measures. Therefore, the absence of a bilingual advantage cannot be explained by the monolingual participants’ knowledge of Spanish vocabulary.
The language-dominance effects and proficiency correlations extended to the more difficult cued recall task that required productive retrieval, where known words were given as cues to recall the unfamiliar Swahili words. These correlations were of comparable strength to those observed for cued recall with receptive retrieval. Therefore, the proficiency effects observed cannot be attributed to the relative accessibility of the known words. Instead, the locus of the effects of proficiency must be in the strength of the new associations formed at encoding. In the following section we consider possible explanations for why higher proficiency would be associated with the formation of stronger associations.
Possible mechanisms for proficiency effects
We considered several possible mechanisms for the observed associations between language proficiency and the learning measures. First, we evaluated explanations that involve a causal relationship between language proficiency and the processes of learning. One possibility was that the higher vocabulary of more proficient speakers would give them more possible mediator words. If this were based on total vocabulary (summed across languages), we would have expected more efficient learning in bilinguals. In contrast, if the important factor were vocabulary in the specific language with which the novel Swahili words were presented, we would have expected consistently more efficient learning in monolinguals. A related possibility was that because higher proficiency is associated with stronger links between words and their concepts, mediators could be identified more rapidly, but this explanation would be expected to favor monolingual participants.
We also considered measured third variables that might underlie both proficiency and learning of vocabulary in a new language, including cognitive ability, socio-economic status, experience with the known language, and language learning ability, and relevant correlations are given in Table 5. A Bonferroni correction with a familywise error rate of .05 was applied for each predictor across the six learning measures (α = .0083). With respect to cognitive ability, non-verbal cognitive ability scores (WJ-III visuo-spatial ability) were not correlated with any of the learning measures in monolinguals or bilinguals. Socio-economic status scores (parent education scores) were not correlated with the learning measures in monolinguals or bilinguals. For bilinguals, experience with English was estimated by subtracting the age of acquisition from the age of the participant, and these scores also were not correlated with the learning measures.
Note. Swahili Cue (Overall) refers to the mean of the three recall cycles with Swahili cues.
aW scores from the Woodcock-Johnson III Test of Cognitive Abilities (Woodcock, McGrew, & Mather, 2001)
bParent education scored on 1-6 scale (1 = less than 8th grade; 2 = some high school; 3 = graduated high school; 4 = some college; 5 = graduated college; 6 = advanced degree); data were missing for 3 monolingual participants.
cExperience with English obtained by subtracting age of English acquisition from age of participant; English experience was missing for 1 bilingual participant.
*p < .05, **p < .0083, ***p < .001
To examine the possible influence of language learning skill, we computed the partial correlation between English proficiency scores and English learning measures in bilinguals, controlling for years of English experience. These positive correlations were highly significant for all three learning measures. Similarly, the partial correlations between Spanish proficiency scores and the Spanish learning measures controlling for years of English experience were significant. These partial correlations suggest that higher language-learning skill underlies both higher proficiency attainment in known languages and more efficient learning on the new task. While this relationship is not particularly surprising, the mechanisms by which higher language-learning skill might lead to more efficient learning of translation paired associates are not obvious. That is, this relationship alone does not tell us what individuals with higher language learning skill do differently when learning the translation pairs. One possibility was a greater reliance on associative strategy use. Indeed, when controlling for years of English experience, bilinguals with higher English proficiency scores were more likely to report using associative strategies, r(92) = .248, p = .016. However, we did not obtain reliable evidence that the use of associative strategies enhanced learning.
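The logic of a first-order partial correlation can be sketched from the three pairwise correlations involved, together with the t statistic used to test it. The pairwise correlations in the example are invented for illustration. The sample size of 95 is an inference rather than a reported value: it is the N that yields the reported 92 degrees of freedom with one covariate (df = N - 3), and it is consistent with one bilingual participant missing English experience data. The function names are hypothetical.

```python
from math import sqrt


def partial_r(r_xy, r_xz, r_yz):
    """Correlation between x and y with z held constant (first-order)."""
    return (r_xy - r_xz * r_yz) / sqrt((1.0 - r_xz ** 2) * (1.0 - r_yz ** 2))


def t_for_partial_r(r, n, n_covariates=1):
    """Return (t, df) for testing a partial correlation against zero."""
    df = n - 2 - n_covariates
    return r * sqrt(df / (1.0 - r ** 2)), df


# Invented pairwise correlations: proficiency-learning, proficiency-experience,
# and learning-experience.
print(round(partial_r(0.50, 0.30, 0.40), 3))

# Reported partial correlation of .248; N = 95 is an assumption (see above).
t, df = t_for_partial_r(0.248, 95)
print(df, round(t, 2))
```

With these inputs the degrees of freedom match the reported r(92), and the t statistic of about 2.46 is in line with the reported two-tailed p of .016.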
We also considered the possible role of phonological working memory, although we had no direct measures of this construct. Previous research has shown that in both monolingual and bilingual children, better phonological working memory is associated with more extensive vocabulary in their first-learned languages (e.g., Gathercole et al., Reference Gathercole, Service, Hitch, Adams and Martin1999; Lanfranchi & Swanson, Reference Lanfranchi and Swanson2005; Swanson et al., Reference Swanson, Orosco and Lussier2011) and later-learned languages (Lanfranchi & Swanson, Reference Lanfranchi and Swanson2005; Swanson et al., Reference Swanson, Orosco and Lussier2011). In young adulthood these patterns persist for bilinguals (Kaushanskaya et al., Reference Kaushanskaya, Blumenfeld and Marian2011), but are less consistent for monolinguals (e.g., Kaushanskaya et al., Reference Kaushanskaya, Blumenfeld and Marian2011; Kemper & Sumner, Reference Kemper and Sumner2001), perhaps because other factors have a greater influence (Gathercole et al., Reference Gathercole, Service, Hitch, Adams and Martin1999). Learning vocabulary in an unfamiliar language is more efficient for monolingual (Kaushanskaya, Reference Kaushanskaya2012) and bilingual/multilingual adults (Papagno & Vallar, Reference Papagno and Vallar1995) with higher phonological working memory. Thus, given that phonological working memory is associated with more extensive vocabulary in known languages and more efficient learning of vocabulary in unfamiliar languages, more extensive vocabulary should be associated with more efficient learning of vocabulary in unfamiliar languages. Indeed, this association was evident in both the bilingual and monolingual samples. Thus, higher phonological working memory is a plausible causal factor underlying this association.
External validity considerations
It is important to note that the bilingual participants in the present study generally learned English out of necessity while living and going to school in the U.S. Thus, their bilingualism did not arise out of a particular interest in or ability to learn a second language, nor were they participants in an enriched education program targeting students with high academic performance or high socio-economic status. Thus, the present results are not expected to generalize to conditions in which learning a second language is based on self-selection or targeted educational opportunities, where bilinguals might well be expected to outperform monolinguals on the paired-associate learning task.
Because the monolingual participants lived in an environment where they could be exposed to Spanish on a regular basis, we considered whether monolingual participants with more knowledge of Spanish might have learned more efficiently, perhaps masking a bilingual advantage. However, in the monolingual group, Spanish proficiency scores were not correlated with performance on the learning tasks (consistent with Tsuboi & Francis, Reference Tsuboi and Francis2020). Also, in a previous study with monolingual participants, ambient language diversity was not associated with enhanced learning of vocabulary in a new language (Bice & Kroll, Reference Bice and Kroll2019).
With respect to generalizing the results of the present study, it should be noted that vocabulary acquisition is only one of many important aspects of learning a new language. Thus, the present results do not have implications for other critical components of language learning. Also, studying translation pairs is not the only way to learn vocabulary in a new language. Note, however, that the advantage for the dominant language and the proficiency effects in the present study are consistent with the results of a previous study in which unfamiliar words in a known language were learned by inferring meanings from sentence contexts (Lauro et al., Reference Lauro, Schwartz and Francis2020).
Because none of the stimuli were cognates or non-cognate homographs, which make learning easier (de Groot & Keijzer, Reference de Groot and Keijzer2000), bilinguals may not have been able to take full advantage of their bilingual experience and knowledge. Knowing two or more languages increases the probability that words in a new language will have cognates in a known language, and a bilingual person would likely take advantage of such similarities. Therefore, in more natural learning contexts, we might indeed expect bilingual-monolingual learning differences.
The small number of available normed stimuli meant that the experiment could not be designed to directly compare independent measures of receptive and productive retrieval within participants along with the language manipulation, and it is possible that the three cycles of receptive retrieval practice influenced the pattern of later productive retrieval and/or associative recognition across groups.
Finally, learning was tested using a relatively short retention interval (within a single experimental session), and it is uncertain whether the results would generalize to the longer retention intervals associated with more natural learning contexts. Indeed, factors that affect verbal learning performance at shorter and longer intervals are not always the same (e.g., Schmidt & Bjork, Reference Schmidt and Bjork1992).
Conclusions
Bilinguals learning vocabulary in an unfamiliar language through their more proficient language did not outperform matched monolinguals, even in a difficult task in which they had to respond in the unfamiliar language. Thus, neither a larger total vocabulary nor greater phonological long-term knowledge conferred a benefit to the bilingual participants. The reported use of associative strategies was similar for monolingual and bilingual participants and did not reliably predict learning. Bilinguals learned vocabulary more efficiently through their more proficient language, as indicated by more accurate performance on all learning measures. Higher proficiency in the known language involved in the learning task was associated with higher accuracy even when recalling the unfamiliar Swahili words. Thus, in this learning context, higher proficiency is associated with more efficient or stronger learning of associations rather than more efficient retrieval of the known words. The association between proficiency in known languages and performance on the learning measures may arise from variation in language learning skill, for which phonological working memory is a plausible mechanism of influence.
Data availability
The data that support these findings are available at https://data.mendeley.com/datasets/wbwsk44pwb/1.
Acknowledgements
This research was supported in part by NSF Grant 1632283 to Francis and by research internships awarded to Nájera, including RISE (funded by NIH Grant 2R25GM069621) and SURPASS (funded jointly by the UTEP Vice President of Student Affairs and Campus Office for Undergraduate Research Initiatives). We gratefully acknowledge the assistance of Naoko Tsuboi in implementation and training of research assistants and the assistance of Rebekah Villaverde, Luis Urrea, Isabela Blanco Almeida, Ileen Gurrola-Meza, and Kimberly Rivera in data collection and data entry.
Competing Interests
There are no competing interests to declare.
Appendix A: Word Stimuli
Appendix B: Tables of Logistic Mixed-Effects Regression Results