1. Introduction
The multilingual community has grown over the years, making it necessary to explore whether language representation is the same or different when a language is used as a first or a foreign language. In every language, emotion plays an essential role in grounding semantic connotations (Kousta et al., 2011) and functions indispensably in human communication. Therefore, an increasing body of research has investigated how emotional stimuli modulate the processing of the non-dominant language, especially among late bilinguals. A late bilingual is defined as one who was not immersed in a bilingual environment from birth and did not acquire a second language until later stages of language development (Toivo & Scheepers, 2019), and the boundary age is normally around six years old (Brothers et al., 2021; X. Liu et al., 2017). This late bilingual population has been found to feel the emotional meanings of words in a second language (L2) less intensely (Ferré et al., 2022) and to use fewer high-arousal emotional words when facing moral dilemmas in the L2 context (Kyriakou et al., 2022). These characteristics of L2 emotion processing have implications in real-life settings, such as the Foreign Language Effect in decision making (Keysar et al., 2012; L. Liu et al., 2021), the clinical intervention and rehabilitation of language use (Monaco et al., 2019), and the pedagogy of foreign language teaching.
As such, investigating emotion processing among the bilingual population in different scenarios has both practical and theoretical significance.
Some consensus has been reached in this research field, but disagreements still exist. Many studies have investigated the differences in emotional density between the dominant and non-dominant languages by presenting L1 and L2 readers with the same set of emotional words and directly comparing their responses. One line of studies has collected affective ratings of a set of emotional words from L1 and L2 populations, and has mostly reported attenuated or less extreme emotional feelings in the L2 context (Ferré et al., 2022; Garrido & Prada, 2021). Similarly, studies that recorded participants’ physiological responses also seem to have identified weaker responses to emotional content in the L2 group, including weaker pupillary effects (Toivo & Scheepers, 2019) as well as decreased facial motor resonance and skin conductance responses (Baumeister et al., 2017; Jankowiak & Korpal, 2018). However, findings conflict in studies that examined how participants responded to emotional content in their L1 or L2 using psychological and cognitive tasks. Some studies have reported weaker memory facilitation effects elicited by emotional connotations in the L2 group (Baumeister et al., 2017), whereas others have observed comparable affective priming effects (Degner et al., 2012), emotion-memory effects (Ayçiçeği-Dinn & Caldwell-Harris, 2009), emotional Stroop effects (Ahn & Jiang, 2022) and lexical decision response speed (Ponari et al., 2015) in the L2 population.
According to the Emotional Contexts of Learning Theory (Harris et al., 2006), late bilinguals usually acquire their L2 without sufficient involvement of emotional experiences, so their emotional embodiment tends to be less grounded and they may process L2 words more semantically, with reduced automaticity of affective activation (Pavlenko, 2012). However, given the inconsistent findings of previous studies, further investigations are still needed to examine the similarities and differences between the L1 and L2 populations.
Despite the intense interest in comparing L1 and L2 emotional processing, few studies have touched on the nuances introduced by L2 proficiency within the bilingual population. As proposed by the Revised Hierarchical Model (Kroll & Stewart, 1994), higher L2 proficiency or exposure is accompanied by stronger conceptual connections of L2 words, so bilinguals with higher L2 proficiency can directly access the meaning of L2 words without the mediation of L1 translation. This model suggests that less proficient bilinguals cannot directly link L2 lexical nodes to conceptual nodes, and instead have to rely on the lexical-level links from L2 words to their L1 translation equivalents. Proficient bilinguals, by contrast, can establish direct connections between L2 lexical nodes and conceptual nodes, which parallel the semantic representations of L1 speakers. Based on this model, it can be inferred that highly proficient bilinguals would show more native-like behaviours and responses while processing L2 words, compared to less proficient L2 learners who primarily rely on the lexical connections of L1 counterparts. In a similar vein, the Lexical Quality Hypothesis (Perfetti, 2007) holds that higher-quality representations of a word's identity are more fully specified and stable across the orthographic, phonological and semantic constituents. This hypothesis therefore predicts that word identification and processing would be more efficient and less cognitively demanding when language proficiency is high. Hence, as suggested by Imbault et al. (2021), a prediction about proficiency effects on L2 processing can be made: less proficient L2 learners would demonstrate stronger neutralization while processing the emotional connotations of L2 words, compared to more proficient L2 speakers.
Based on these theoretical frameworks, bilinguals with different proficiency levels could be predicted to show dissimilar behaviours or responses during L2 word processing, due to different semantic representations. Some previous studies have explored proficiency effects on L2 emotion processing, but they have reported inconsistent findings. Imbault et al. (2021) collected valence and arousal ratings for 2,628 L2 English words in a large-scale affective rating experiment and compared the valence scores across five L2 proficiency levels, finding that more proficient bilinguals tended to report native-like affective ratings. Caldwell-Harris et al. (2011) recorded the skin conductance responses of bilinguals during lexical processing, and observed that participants with higher language proficiency showed elevated physiological responses to English endearments. However, a few prior studies using behavioural tasks did not seem to identify such proficiency effects. For example, Degner et al. (2012) reported similar affective priming effects in the L1 and L2 populations, and did not observe significant differences related to self-rated language proficiency. Similarly, Ahn and Jiang (2022) adopted a classic Stroop paradigm to investigate the automaticity of emotion activation in L2 readers, but did not observe a main effect of L2 proficiency on the emotional Stroop effect. As such, L2 proficiency might not be predictive of different processing patterns for emotional connotations among L2 readers.
Considering the discrepant and sparse empirical evidence on the association between L2 proficiency and L2 emotion processing, further psychophysiological and neuroimaging paradigms are needed to investigate the differences in affective processing among bilinguals with different proficiency levels.
The effects of a few influential factors on L2 emotion processing have been explored. In terms of the lexical materials, emotion polarity has shown certain modulating effects. Specifically, positive words seemed to elicit stronger processing advantages among L2 readers (Shenaut & Ober, 2021), while negative words, including swear and taboo words, might in turn be processed similarly to neutral words (Arriagada-Mödinger & Ferreira, 2022) or even demonstrate processing disadvantages (Ferré et al., 2018) in the L2 context. In addition to the behavioural observations, some neuroimaging studies have also identified differences in processing positive and negative L2 words. Jończyk et al. (2016) asked Polish–English bilinguals to read sentences ending with positive, negative and neutral words, and found incomplete and suppressed semantic access to L2 negative words, as indicated by reduced N400 amplitudes. Such an early suppression and later stronger reevaluation of negative semantics, rather than positive or neutral semantics, corroborated previous results (Y. J. Wu & Thierry, 2012), indicating an involvement of additional brain structures and more effort in negative word processing (Sulpizio et al., 2019). Therefore, it seems that positive words might show processing advantages over negative words in the L2 population. However, prior studies have mainly observed the influence of emotional valence in L2 with single-word stimuli, and such an effect also needs to be examined above the word level.
Another concern is the use of emotion-label and emotion-laden words as experimental materials, which in recent years have been reported to possibly involve different processing procedures. Pavlenko (2008) has provided specific definitions for these two word types. Emotion-label words are words that straightforwardly describe particular affective states (e.g., “happy”, “sad”) or processes (e.g., “to surprise”, “to worry”), while emotion-laden words are words that indirectly express emotions (e.g., “loser”, “champion”) or elicit them from the interlocutors (e.g., “birthday”, “funeral”). By definition, compared to emotion-laden words, emotion-label words should normally have more salient emotional content and direct conceptual links to the core emotions they represent (Haro et al., 2023; Knickerbocker & Altarriba, 2013). Previous studies have found facilitated processing of emotion-label words in a few behavioural tasks, such as the lexical decision task (Kazanas & Altarriba, 2015). Other research using event-related potentials also observed that emotion-label words elicited an enhanced N170 compared to emotion-laden words (Zhang et al., 2017).
In the L2 population, similar processing facilitation for emotion-label words has also been reported. Specifically, bilinguals showed higher accuracy and shorter response times while processing L2 emotion-label words in emotion categorization tasks (D. Tang et al., 2023) and priming tasks (C. Wu et al., 2022), and they showed less brain activation during emotion perception primed by L2 emotion-label words (C. Wu et al., 2022). According to the “mediated account” (Altarriba & Basnight-Brown, 2011), the processing of emotional connotations in emotion-laden words needs to be mediated by the labelled conceptual meanings and associated affective experiences. Since emotion-label words directly and explicitly represent emotions, however, the processing of their emotional meanings would be easier and more automatic. Similarly, the “emotion duality model” suggests that emotional responses activated by emotion-label words are automatic and biologically rooted, while the processing of emotion-laden words requires more cognitive effort through a reflective system (Imbir et al., 2019; D. Tang et al., 2023). Besides the difficulty and higher cognitive demands of processing emotion-laden words, the establishment of emotional connections to emotion-laden words also seems to be more complex, with larger individual and cross-cultural variations. In an extreme case illustrated by Pavlenko (2008), some emotion-laden words that are commonly perceived as insults, such as swearwords, may appear as friendly terms of affection in other languages, while words like “liberal” or “elite” may be perceived as insults in certain cultures.
In fact, while constructing the cross-culturally universal “colexification” networks of emotional expressions (Jackson et al., 2019), researchers only considered words that could be directly felt as efficient emotion concepts. Therefore, considering the less demanding processing requirements, higher emotional salience and more cross-culturally common emotional connotations of emotion-label words, we used emotion-label words as our experimental materials for this preliminary eye-tracking exploration of L2 emotion processing.
In addition to the factors related to lexical materials, L2 emotion processing might also be influenced by individual emotional states. Previous studies have mostly acknowledged that the processing of emotional information is abnormal in mood disorders (Panchal et al., 2019; E. Tang et al., 2022). Even in the healthy population, individuals’ emotional states have been reported to cause differences in word learning and emotional word processing. In a novel word learning paradigm (Guo et al., 2018), converging results were obtained from a semantic category judgment task and a word-picture semantic consistency judgment task, suggesting that negative emotional states could lead to worse performance in word learning. However, during the processing of emotional words, negative mood could narrow participants’ attention and enhance their distinction of words within the positive, negative or neutral clusters (Sereno et al., 2015). A recent review article (Naranowicz, 2022) summarized that positive moods could activate global processing while negative moods strengthen detailed and local processing during language comprehension. In addition to the effects of individual mood on language processing, anxiety states also seemed to influence people's comprehension of context information. Using a spelling task and a lexical decision task, Blanchette and Richards (2003) found that, compared to the control group, participants with state anxiety were more likely to adopt an emotional interpretation of ambiguous homophones under the emotional context and a neutral interpretation under the neutral context.
Hence, these previous studies have demonstrated the influence of individuals’ mood and anxiety state on the processing of emotion-related information. Knickerbocker et al. (2015) analyzed the associations between participants’ eye movement measures in L1 processing and measures of their depression and anxiety traits, and reported significant interactions between participants’ anxiety and the emotion processing effects for negative words. However, few studies have explored how individual emotional states influence emotional word processing in the L2 context. Therefore, the current study also evaluated the depression and anxiety states of the L2 readers, and associated these measures with their eye movements in L2 word processing.
Despite substantial explorations of L2 emotion processing, the eye-tracking method has not yet been widely used in this field to investigate the facilitation effects of L2 emotional connotations using passive sentence reading tasks. Earlier research among native speakers (Scott et al., 2012) showed that the emotional qualities of words can shorten reading times during natural sentence reading, which justified the use of eye-tracking experiments to investigate reading patterns as a function of the emotional meanings of words. Sheikh and Titone (2013, 2016) selected 52 triplets of English words, embedded them in sentence frames, and recorded the eye movements of native and bilingual readers during reading. Results indicated that emotion facilitation effects occurred for both positive and negative low-frequency words among L1 readers (Sheikh & Titone, 2013), whereas, among L2 readers, benefits were only identified for positive words, and negative word processing was modulated by concreteness, frequency and L2 proficiency similarly to neutral words (Sheikh & Titone, 2016). These two experiments, which compared the reading behaviours of L1 and L2 readers with the same set of materials, suggested a joint influence of linguistic, emotional and sensorimotor information on the early stages of word processing in natural sentence reading. Moreover, they seemed to indicate that negative words might not be grounded in emotional experiences in the L2 context (Monaco et al., 2019). Hence, positive words could be predicted to show greater processing advantages than negative words in L2 word processing. Knickerbocker et al. (2015) separately tracked reading behaviours for positive and negative words in native readers, and corroborated that both positive and negative emotion-label words can facilitate L1 reading at both early and late processing stages, compared to neutral words. The materials in that study have not yet been used with bilinguals to examine whether similar effects also hold in the non-dominant language environment. A few extant bilingual studies recorded participants’ pupillary responses to emotional and neutral words presented in isolation, rather than in sentence frames, and they consistently reported attenuated pupil responses for emotional words in L2 (Toivo & Scheepers, 2019; Yao et al., 2023). Nevertheless, there is still a dearth of direct eye-tracking evidence to support the claim that bilinguals process L2 emotional words differently from L1 readers, especially at the sentence level.
The present study primarily aimed to investigate whether the emotion facilitation effects observed in L1 also occur during L2 sentence reading, and thus to add new evidence to the limited body of eye-tracking experiments on L2 emotion word processing. Taking existing evidence into consideration, we addressed the following research questions: (1) whether and how L2 emotion-label words influence the reading behaviours of bilinguals; (2) whether bilinguals with high and low L2 proficiency show similar or different patterns of L2 emotional word processing; and (3) whether depressive and anxious symptoms are associated with bilinguals’ eye movements during emotional word processing. Based on previous findings and the theoretical frameworks, we predicted that bilinguals would not demonstrate facilitation effects of emotional connotations in L2 processing, but that positive words might be processed more easily than negative words. Hereafter, we use the term “emotion processing effect” to represent the ease of processing emotional words compared to neutral words. Meanwhile, highly proficient bilinguals should process L2 words faster than less proficient ones in general, and they might also show stronger emotion processing effects given their more advanced semantic representations and lexical quality for L2 words. Last, since the associations between individual emotional states and eye movements in L1 processing have been relatively unstable (Knickerbocker et al., 2015), such associations could be even weaker in L2 processing considering the increased cognitive burden.
However, as there is little evidence concerning how individual depression and anxiety states influence the processing of L2 emotional words, we could only tentatively predict that few eye movement measures would be correlated with individuals’ emotional states, and any observed associations should be interpreted with caution and require further validation.
The current study offers several novelties and contributions. Methodologically, tracking eye movements during L2 reading can capture bilinguals’ automatic physiological responses to emotional words that are not under voluntary control, and the recorded data can objectively reflect the effects elicited by the emotional connotations of words. Moreover, due to factors such as language acquisition contexts and L1-L2 interaction, the L2 population tends to show prominent individual variations (Fricke et al., 2019), even among L2 readers with similar proficiency levels. As such, the comparability of emotional processing effects obtained in separate L2 groups might be reduced in a traditional between-subject design, even when demographic background and language use experience are matched. Therefore, we adopted a within-subject design to minimize unexpected individual variance, so that the emotion processing effects of L2 positive and negative words could be more reasonably compared. As for the selected materials, we used sentences rather than isolated words as experimental stimuli, which were more ecologically valid and could better approximate emotion processing effects in authentic settings. It has been suggested that contextual information is more accurately predictive of lexical representation than isolated words (Garrido & Prada, 2021; Snefjella & Kuperman, 2016), and sentence materials could be more sensitive to differences in context interpretation and lexical processing across the L1 and L2 populations. In terms of statistical analyses, we analyzed 13 eye-tracking measures for all emotional words and examined the influences of L2 proficiency and individual emotional states on L2 emotion processing.
Hence, the current study not only supplements the limited eye-tracking evidence in the field of L2 emotion processing, but also provides new statistics for some factorial effects that have been insufficiently investigated in previous studies. Last, we have drawn on several theoretical frameworks of bilingual research, including the Emotional Contexts of Learning Theory, the Lexical Quality Hypothesis and the Revised Hierarchical Model, to derive our hypotheses, and in turn use our empirical data to test the relevant theoretical premises.
2. Methods
2.1. Participants
Forty-two Chinese–English bilinguals from Shanghai Jiao Tong University participated in this experiment. All participants were native speakers of Mandarin Chinese and reported English as their second language. They all had normal or corrected-to-normal vision, with no history of language impairments, reading difficulty, or learning disorders. Six participants were excluded from later analyses: three wore thick glasses and their eye movements could not be successfully tracked; two withdrew from the experiment without finishing all trials due to fatigue or urgent personal issues; and one had undergone crystalline lens resection and failed to pass the calibration. As a result, the eye-tracking data from 36 participants (18 males) aged between 18 and 28 (M = 22.14, SD = 2.45) were analyzed. The final set of bilingual participants were all born and raised in mainland China with Chinese being the dominant language in daily use, and they were all late L2 learners who were not exposed to English until the age of six (mean age of first exposure = 8.89, SD = 2.49, range = 6–13). Based on self-rated English proficiency on a 1–7 scale, the included bilinguals had, on average, intermediate proficiency in reading (M = 5.51, SD = 1.04), writing (M = 4.92, SD = 0.98), listening (M = 4.78, SD = 1.48) and speaking (M = 4.68, SD = 1.40). Only one participant had lived in an English-speaking country (the UK), for 10 months during the undergraduate period, while the others reported no overseas living experience longer than six months. As for language history, 10 participants reported knowing a third language but with limited proficiency (Japanese: n = 6; French: n = 3; Russian: n = 1). All participants gave written informed consent before the formal experiment, and received monetary compensation for their participation.
2.2. Materials
The present study constructed the experimental lexicon and sentence frames based on Knickerbocker et al. (2015), where the effects of affective meanings on sentence reading were examined in native speakers of English. In total, 36 positive emotion-label words, 36 negative emotion-label words, and 36 neutral controls were selected. Each of these three word sets comprised 12 nouns, 12 verbs, and 12 adjectives. To ensure that all selected words were emotionally appropriate for the current study, their emotional characteristics were re-examined in the L2 population before our formal experiment.
The influence of emotional connotations on L2 reading was evaluated in two separate blocks, with positive or negative emotional words compared against the neutral words in each block. Within each block, 72 sentence frames were designed to host the emotional and neutral words in pairs, so that each sentence frame could grammatically and semantically fit one emotional word and its paired neutral word. The target words never appeared among the first three or the last three words of the sentence. As illustrated in Table 1, two sentence frames were designed for each word pair, and each word appeared in one of the two sentence frames for each participant. Moreover, each target word and sentence frame appeared only once within one block, and the word-sentence matching was counterbalanced across participants. As a result, each participant read 72 sentences in each block. All experimental sentences can be accessed in the two Appendices. The mean sentence length was 11.36 words (SD = 1.50) in the block containing positive words, and 13.63 words (SD = 2.04) in the block containing negative words. To ensure that the sentences were appropriate for L2 reading, their difficulty, understandability and predictability were also re-assessed in the L2 population.
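The word-to-frame assignment described above can be sketched as follows. This is a minimal illustration, not the authors' procedure: the parity-based alternation rule, the function name `build_block`, and the placeholder items are all assumptions introduced here. It only shows one way to guarantee that each target word and each sentence frame appears exactly once per participant while the word-frame pairing alternates across participants.

```python
import random

def build_block(pairs, participant_id):
    """Build one 72-trial block for one participant.

    pairs: list of (emotional_word, neutral_word, frame_a, frame_b), where
    each frame is a string containing one '{}' slot for the target word.
    The alternation rule below is an assumed counterbalancing scheme.
    """
    trials = []
    for idx, (emo, neu, frame_a, frame_b) in enumerate(pairs):
        # Swap the word-frame pairing by participant and item parity, so
        # consecutive participants see opposite pairings for a given item.
        if (participant_id + idx) % 2 == 0:
            emo_frame, neu_frame = frame_a, frame_b
        else:
            emo_frame, neu_frame = frame_b, frame_a
        trials.append({"word": emo, "type": "emotional",
                       "sentence": emo_frame.format(emo)})
        trials.append({"word": neu, "type": "neutral",
                       "sentence": neu_frame.format(neu)})
    # Randomize presentation order reproducibly for this participant.
    random.Random(participant_id).shuffle(trials)
    return trials

# Hypothetical placeholder items: 36 word pairs, two frames per pair.
demo_pairs = [
    (f"emo{i}", f"neu{i}", f"A{i} {{}} end", f"B{i} {{}} end")
    for i in range(36)
]
demo_block = build_block(demo_pairs, participant_id=0)
```

With 36 word pairs and two frames per pair, every participant receives 36 emotional and 36 neutral sentences, matching the 72 sentences per block reported above.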
2.3. Normative examination
First, the affective properties of the selected lexical items were re-evaluated in L2 to ensure that these experimental words could effectively convey the positive, negative or neutral meanings as intended. Twenty Chinese–English bilinguals who did not participate in the formal eye-tracking experiment were enrolled to rate the familiarity, valence and arousal of all included words. On the seven-point scale, 1 referred to the least familiar, the most negative and the least arousing, while 7 denoted the most familiar, positive and arousing. As shown in Table 2, all included words were highly familiar to the L2 population, and there was no significant difference in familiarity between neutral words and either positive (t(70) = 0.24, p = 0.81) or negative (t(70) = 0.24, p = 0.81) words. Compared to the neutral words, positive words received significantly higher valence (t(70) = 9.07, p < 0.001) and arousal (t(70) = 8.85, p < 0.001) scores, while negative words received significantly lower valence (t(70) = –13.52, p < 0.001) and higher arousal (t(70) = 6.51, p < 0.001) scores. In summary, the selected emotional words carried prominent emotional meanings with moderately high arousal, and the neutral words were emotionally neutral with low arousal in L2. Hence, these words were suitable for the present experiment. Other parameters, including word length, frequency and orthographic neighbourhood size, were matched in Knickerbocker et al. (2015), and these objective measures were regarded as similar in the L1 and L2 conditions.
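The reported 70 degrees of freedom are consistent with a pooled-variance independent-samples t-test computed over items (36 emotional words against 36 neutral words, 36 + 36 − 2 = 70). The text does not state the test variant, so the following is only a sketch under that assumption; the function name `pooled_t` and the toy ratings in the usage line are hypothetical and are not the study's data.

```python
import math
import statistics

def pooled_t(sample_a, sample_b):
    """Student's independent-samples t statistic with pooled variance.

    Returns (t, df), where df = len(sample_a) + len(sample_b) - 2.
    """
    n_a, n_b = len(sample_a), len(sample_b)
    df = n_a + n_b - 2
    # Pooled variance weights each sample variance by its own df.
    pooled_var = ((n_a - 1) * statistics.variance(sample_a)
                  + (n_b - 1) * statistics.variance(sample_b)) / df
    std_err = math.sqrt(pooled_var * (1 / n_a + 1 / n_b))
    t_value = (statistics.mean(sample_a) - statistics.mean(sample_b)) / std_err
    return t_value, df

# Toy, made-up ratings purely to show the call signature.
t_value, df = pooled_t([1, 2, 3], [2, 3, 4])
```

For two lists of 36 item-level mean ratings each, `df` comes out as 70, as in the statistics reported above.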
As the experimental materials of our current study closely followed those of Knickerbocker et al. (2015), we further examined whether the distributions of valence and arousal ratings of these experimental words were matched between the L1 and L2 populations. Since our study adopted a seven-point scale and the previous study a nine-point scale, the average raw scores on each affective metric were converted into percentages of the full scale, which ensured cross-study comparability. To show clearly how far the resulting percentages deviated from the midpoint of the scales, we subtracted 50% from the percentage for each metric. As depicted in Fig. 1, the valence ratings of the three word types followed the same pattern in both populations, although those of the L1 population seemed to be more extreme than those of the L2 population. As for the arousal dimension, although the overall pattern was roughly the same across the two populations, the selected negative words seemed to be more arousing than the positive words in the L1 context but less arousing than the positive words in the L2 context. Despite these minor differences, the emotional words could be clearly distinguished from the neutral words in both populations, so the selected words satisfied our current research objectives. However, we acknowledge that the differences observed here might impose certain limitations on the interpretation of our findings.
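The rescaling described above can be illustrated with a short sketch. The exact formula is not given in the text, so this version assumes a min-max conversion, under which the scale midpoint maps to exactly 50% and therefore to 0 after centring; a simple ratio of the raw score to the scale maximum is another possible reading of the description and would not centre the midpoint at zero. The function name and the assumption that the nine-point scale runs from 1 to 9 are introduced here for illustration.

```python
def centred_percentage(mean_rating, scale_min, scale_max):
    """Express a mean rating as a percentage of the scale range, minus 50.

    Assumed min-max rescaling: scale_min maps to -50, the midpoint to 0,
    and scale_max to +50, which makes 7-point and 9-point ratings comparable.
    """
    percentage = 100.0 * (mean_rating - scale_min) / (scale_max - scale_min)
    return percentage - 50.0

# The midpoints of both scales land on zero under this assumption.
seven_point_mid = centred_percentage(4, 1, 7)
nine_point_mid = centred_percentage(5, 1, 9)
```

Positive values then indicate ratings above the scale midpoint (more positive valence or higher arousal), and negative values indicate ratings below it, regardless of the original scale length.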
Second, properties at the sentence level, especially difficulty, understandability, and predictability, were also assessed on the seven-point scale in the L2 population. Specifically, difficulty was defined as how difficult the sentences were for L2 reading, which focused on the use of unfamiliar words and the complexity of grammatical structure. Understandability was intended to measure whether the sentence meanings were clear and acceptable, as a reflection of how well the target words fit the sentence structures. Predictability measured whether the target words could be predicted by the preceding sentence frames. Another group of 20 participants who did not participate in the formal eye-tracking experiment provided the difficulty and understandability norms for the experiment sentences, with score 1 representing extremely difficult/not understandable and 7 representing extremely easy/very understandable. Sentences were counterbalanced so that each rater evaluated each sentence frame and each target word only once. As shown in Table 3, all experimental sentences were easy and natural to read, and there were no significant differences between the sentences containing emotional or neutral words under each condition. Another 15 bilinguals were given the incomplete sentence frames preceding the target words, and were asked to write down one English word for each sentence that was the most likely to occur at the next position. Results showed that the predictability was low for positive (M = 0.00%), negative (M = 0.83%), and neutral (M = 0.00%) target words in these sentence contexts.
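The predictability measure from the completion task can be sketched as the percentage of raters whose written continuation matches the target word. The scoring rule is not specified in the text, so exact matching after trimming whitespace and lower-casing is an assumption here, and the function name and sample responses are hypothetical.

```python
def cloze_predictability(responses, target):
    """Percentage of completion responses that match the target word.

    Assumed scoring rule: exact match after stripping whitespace and
    lower-casing; spelling variants and inflections count as mismatches.
    """
    normalized_target = target.strip().lower()
    hits = sum(r.strip().lower() == normalized_target for r in responses)
    return 100.0 * hits / len(responses)

# Made-up responses from three hypothetical raters for one sentence frame.
example_score = cloze_predictability(["Happy ", "table", "sad"], "happy")
```

Averaging this per-sentence percentage over all sentences of a condition would give condition-level values of the kind reported above.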
Note: SE: sentences containing emotional target words; SN: sentences containing neutral target words; AS: all sentences; DI: difficulty; UN: understandability
2.4. Apparatus and procedure
Eye movements were recorded with an EyeLink 1000 eye-tracker (SR Research, ON, Canada) at a sampling rate of 1000 Hz. Experiment sentences were displayed 73 cm from participants’ eyes on a 24-inch CRT screen (resolution: 1,024 × 768; refresh rate: 100 Hz) of a DELL computer, and each sentence was presented on a single line in 12-point Courier New font, with two characters equalling 1° of visual angle. Although viewing was binocular, all calibrations and recordings of the eye movements were based on the right eye.
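The relation between viewing distance and visual angle can be made concrete with a short Python sketch based on the standard geometric formula. The function names are hypothetical; only the 73 cm distance and the ratio of two characters per degree come from the text.

```python
import math


def width_for_visual_angle(angle_deg, distance_cm):
    """Physical width (cm) that subtends a given visual angle at a viewing distance."""
    return 2 * distance_cm * math.tan(math.radians(angle_deg) / 2)


def visual_angle(width_cm, distance_cm):
    """Visual angle (degrees) subtended by a stimulus of a given width."""
    return math.degrees(2 * math.atan(width_cm / (2 * distance_cm)))


one_degree_cm = width_for_visual_angle(1.0, 73.0)  # width of 1 degree at 73 cm
char_width_cm = one_degree_cm / 2                  # two characters per degree
print(round(one_degree_cm, 3), round(char_width_cm, 3))
```

On this geometry, one degree at 73 cm spans roughly 1.27 cm on the screen, so each character would occupy a little over 0.6 cm.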
On arrival, participants reported their language history with an adapted version of the LEAP-Q questionnaire (Marian et al., Reference Marian, Blumenfeld and Kaushanskaya2007). Participants were then seated comfortably in front of the eye-tracker, with their heads positioned securely on the chin and forehead rest. Before the eye movements were recorded, a nine-point grid calibration was carried out, which was accepted if the average error was < 0.50° of visual angle (approximately 1 character) and the maximum error was < 1°. Following an acceptable calibration, a fixation cross appeared before each trial at the position of the first word of the upcoming sentence, and drift calibration was conducted. Re-calibration was performed when necessary. To become familiar with the experiment tasks, participants first completed 10 practice trials, none of which appeared again in the formal experiment, followed by 72 experiment trials presented in randomized order. During the experiment, participants were instructed to read the single-line sentences presented on the screen silently in a natural manner and to press the space bar on the keyboard when they had finished. A third of the sentences were followed by a yes-or-no comprehension question that was related to the meaning of the preceding sentence but was irrelevant to the target words and the post-target regions (two-word region after the target words). Participants were asked to press one of two buttons representing “yes” or “no” as a response, and the left-right mapping between answers and buttons was counterbalanced. Half of the questions appeared after the sentences containing emotional words and the other half after the neutral sentences, and the numbers of questions requiring a “yes” answer and a “no” answer were also balanced. Only the data of participants whose average accuracy for comprehension questions was at least 1.5 times the chance level (75%; Z. Wu & Wang, Reference Wu and Wang2022) were valid for further analyses. The formal experiment session had two blocks, one comparing positive and neutral words and the other comparing negative and neutral words. The presentation order of the two blocks was counterbalanced, and the whole experiment lasted around 40 minutes.
After the eye-tracking experiment, participants continued to complete a set of measures for affective state and L2 proficiency. The symptoms of depression were assessed by the Beck Depression Inventory Short Form (BDI-SF; Beck & Steer, Reference Beck and Steer1993), the 13-item cognitive-affective subscale of BDI, which has been found effective and valid in depression detection (Furlanetto et al., Reference Furlanetto, Mendlowicz and Romildo Bueno2005; Stukenberg et al., Reference Stukenberg, Dura and Kiecolt-Glaser1990). The Spielberger State Trait Anxiety Inventory (STAI; Spielberger, Reference Spielberger1983) was used to measure the state anxiety and trait anxiety of participants, which consists of the STAI-S and STAI-T subscales to respectively measure the transient status and general propensity of the anxious disposition.
In the last step, English proficiency was measured by the LexTALE and C-test. The LexTALE is a quick and valid predictor of English vocabulary knowledge (Lemhöfer & Broersma, Reference Lemhöfer and Broersma2012), and has been used for L2 vocabulary assessment (Zhang et al., Reference Zhang, Wu, Yuan and Meng2020). As an alternative to the traditional cloze test, the C-test (Raatz & Klein-Braley, Reference Raatz, Klein-Braley, Culhane, Klein-Braley and Stevenson1981) normally consists of five independent short texts, and the second half of every second word in each text is deleted so that 20 blanks need to be completed. It outperforms some vocabulary tests as a valid and reliable predictor of receptive language skills, L2 academic achievements, or even general language proficiency (Daller et al., Reference Daller, Müller and Wang-Taylor2020; Harsch & Hartig, Reference Harsch and Hartig2015). Following previous L2 research that used the C-test to discriminate L2 proficiency groups (Qiu, Reference Qiu2022), we compiled five excerpts from prior studies (Babaii & Ansary, Reference Babaii and Ansary2001; Dörnyei & Katona, Reference Dörnyei and Katona1992), whose reliability, validity and sensitivity were satisfactory in the L2 population. Based on the median split of average language proficiency test scores (Qian et al., Reference Qian, Lee, Lu and Garnsey2018), L2 participants were divided into the higher proficiency group (higher than 65) and the lower proficiency group (lower than 65).
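The grouping step can be sketched in Python as follows. The sketch assumes that both tests are expressed on a 0-100 scale, that the composite is their simple average, and that a participant scoring exactly at the median would be placed in the lower group; the tie rule and all scores are hypothetical and are not taken from the study.

```python
from statistics import mean, median


def split_by_proficiency(scores):
    """Median split on the average of two proficiency tests.

    scores: dict mapping participant id -> (lextale, c_test), both assumed
    to be percentages. Ties at the median go to the lower group (assumption).
    """
    composite = {pid: mean(pair) for pid, pair in scores.items()}
    cutoff = median(composite.values())
    groups = {
        pid: "higher" if value > cutoff else "lower"
        for pid, value in composite.items()
    }
    return composite, cutoff, groups


# Hypothetical scores for four participants.
scores = {"P1": (80, 70), "P2": (60, 50), "P3": (70, 66), "P4": (62, 62)}
composite, cutoff, groups = split_by_proficiency(scores)
print(cutoff, groups)
```

The additional models that treat proficiency as a continuous variable would use the composite score directly rather than the group label.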
Table 4 reports the statistics of the affective and L2 proficiency assessments in the current experiment. There were significant differences in LexTALE (t(34) = 4.21, p < 0.001), C-test (t(34) = 8.70, p < 0.001), and general L2 proficiency (t(34) = 7.78, p < 0.001) between the two groups, while no significant between-group difference was found for STAI (t(34) = 1.36, p = 0.18) or BDI (t(34) = 0.93, p = 0.36).
Note: EngProf = English proficiency reflected by the average score of LexTALE and C-test.
2.5. Data processing and statistical analysis
The eye-tracking data were automatically cleaned. Short fixations (< 80 ms) were combined with nearby fixations within one character space. Extremely short (< 80 ms) and long (> 1000 ms) isolated fixations were removed from further analyses. Trials with a blink or pupil absence in the pre-target, target, or post-target (two-word region following the target word) regions were deleted. In addition, trials without any fixations were also deleted. In total, 3.04% of data on the target region and 2.90% of data on the post-target region were trimmed in the positive emotion block, while 2.44% of data on the target region and 2.34% of data on the post-target region were trimmed in the negative emotion block.
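The fixation-level cleaning rules can be illustrated with a simplified Python sketch. It is not the algorithm of the eye-tracking software; it assumes that a short fixation is pooled with the temporally adjacent fixation when the two lie within one character of each other, and that any fixation still shorter than 80 ms or longer than 1000 ms is then discarded. The data format and function name are hypothetical.

```python
def clean_fixations(fixations, short_ms=80, long_ms=1000, merge_chars=1.0):
    """Merge short fixations into close neighbours, then drop outliers.

    fixations: list of (x_position_in_characters, duration_ms) tuples in
    temporal order. Returns a new list; the input is left untouched.
    """
    merged = []
    for x, duration in fixations:
        if merged:
            prev_x, prev_duration = merged[-1]
            close = abs(x - prev_x) <= merge_chars
            if close and (duration < short_ms or prev_duration < short_ms):
                # Pool the durations and keep the position of the longer fixation.
                keep_x = prev_x if prev_duration >= duration else x
                merged[-1] = (keep_x, prev_duration + duration)
                continue
        merged.append((x, duration))
    # Remove isolated fixations that remain too short or too long.
    return [(x, d) for x, d in merged if short_ms <= d <= long_ms]


# Hypothetical trial: one mergeable short fixation, one isolated short
# fixation, and one overly long fixation.
trial = [(3.0, 210), (3.5, 60), (9.0, 50), (12.0, 240), (20.0, 1300)]
print(clean_fixations(trial))
```

Trial-level exclusions (blinks, missing pupil, no fixations) would be applied separately and are not covered by this sketch.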
Following Knickerbocker et al. (Reference Knickerbocker, Johnson and Altarriba2015, Reference Knickerbocker, Johnson, Starr, Hall, Preti, Slate and Altarriba2019), the current study analyzed a few early and late measures in the target and post-target regions. Early measures included: (1) first fixation duration (the duration of the first fixation on the target word); (2) single fixation duration (the duration of the first fixation on the target word when there was only one fixation on it); (3) gaze duration (or first-pass fixation duration, the total duration of all first-pass fixations on the target word before exiting it); (4) landing position (the number of characters from the horizontal position of the first fixation landing on the target word to its left edge); (5) skipping rate (the percentage of trials where the target word had no fixation during first-pass reading). Late measures included: (1) total dwell time (the total duration of all fixations on the target word); (2) regressions in (the percentage of trials where the participant returned to the target word from the regions to its right side); (3) second pass time (the total duration of all second-pass fixations on the target word). Besides the target words, five measures were also analyzed in the post-target regions, including (1) spillover (the duration of the first fixation after leaving the target word); (2) first fixation duration; (3) gaze duration; (4) total dwell time; (5) regressions out (the percentage of trials where regressions were made from the post-target regions to earlier interest areas before leaving the post-target regions in the forward direction).
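To make the definitions of the target-word measures concrete, the Python sketch below derives several of them from a temporally ordered fixation sequence. It is an illustration under stated assumptions, not the software used in the study: fixations are represented as hypothetical (word index, duration) pairs, second pass time is taken as all time on the target after the first pass has ended, and landing position and the post-target measures are omitted.

```python
def target_measures(fixations, target):
    """Summarise reading measures for one target word in one trial.

    fixations: list of (word_index, duration_ms) tuples in temporal order.
    target: index of the target word within the sentence.
    """
    first_pass = []          # durations of first-pass fixations on the target
    in_first_pass = False    # currently inside the first pass on the target
    first_pass_over = False  # the eyes have left the target after first pass
    rightmost_seen = -1      # rightmost word fixated so far
    previous_word = None
    regression_in = False

    for word, duration in fixations:
        if word == target:
            first_entry = not first_pass_over and rightmost_seen < target
            if in_first_pass or first_entry:
                in_first_pass = True
                first_pass.append(duration)
            elif previous_word is not None and previous_word > target:
                # Target re-entered (or first reached) from its right side.
                regression_in = True
        elif in_first_pass:
            in_first_pass = False
            first_pass_over = True
        rightmost_seen = max(rightmost_seen, word)
        previous_word = word

    total = sum(d for w, d in fixations if w == target)
    gaze = sum(first_pass)
    return {
        "skipped": not first_pass,
        "first_fixation": first_pass[0] if first_pass else None,
        "single_fixation": first_pass[0] if len(first_pass) == 1 else None,
        "gaze": gaze if first_pass else None,
        "total": total,
        "second_pass": total - gaze,
        "regression_in": regression_in,
    }


# Hypothetical trial with the target at word index 2.
trial = [(0, 200), (1, 180), (2, 220), (2, 150), (3, 210), (2, 190), (4, 230)]
print(target_measures(trial, 2))
```

In this example the target receives two first-pass fixations, so single fixation duration is undefined, and the later return from word 3 counts as a regression in.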
Eye-tracking data were analyzed in R version 4.0.3 (R Core Team, 2020) using linear mixed-effects models (LMMs) for continuous dependent measures and generalized linear mixed-effects models (GLMMs) for categorical measures. These two types of models were fitted using the lmer() function and the glmer() function, respectively, of the lme4 1.1-26 package (Bates et al., Reference Bates, Mächler, Bolker and Walker2015). The p-values and pairwise comparisons were calculated with the lmerTest 3.1-3 package (Kuznetsova et al., Reference Kuznetsova, Brockhoff and Christensen2017) and the lsmeans 2.30-0 package (Lenth, Reference Lenth2016). The remaining data were processed before fitting the statistical models. Metrics measuring fixation duration were lg-transformed to approximate a normal distribution, while other binomial metrics were analyzed by means of proportions. For each dependent variable, two contrast-coded factors and their interaction were entered as fixed effects: the target word type (neutral: -0.5 vs. positive: +0.5 or neutral: -0.5 vs. negative: +0.5) and English proficiency (lower proficiency group: -0.5 vs. higher proficiency group: +0.5). As English proficiency is fundamentally a continuous variable, it was also entered as a continuous predictor in additional models to reexamine the observed effects of L2 proficiency. As we also intended to investigate whether depression and anxiety states affected emotion-related sentence reading, either the BDI or the STAI score was entered as an additional fixed effect in two separate formulas for each dependent variable. The random-effects structure comprised by-subject and by-item random intercepts. Starting with the full model, the backward model selection procedure followed the Akaike information criterion (AIC; Akaike, Reference Akaike1974) and significance tests (Matuschek et al., Reference Matuschek, Kliegl, Vasishth, Baayen and Bates2017).
Models that failed to converge were excluded. To assess the fit of each model, the r.squaredGLMM() function of the MuMIn 1.43.17 package (Bartoń, Reference Bartoń2020; Nakagawa et al., Reference Nakagawa, Johnson and Schielzeth2017) was used to summarize the explanatory power of the fixed effects (R2marginal) and the combined power of the fixed and random effects (R2conditional). The partial eta-squared effect size was also computed for significant findings using the eta_squared() function of the effectsize 0.6.0.1 package (Ben-Shachar et al., Reference Ben-Shachar, Lüdecke and Makowski2020). Following Cohen (Reference Cohen1988), the benchmarks for small (η2 = 0.01), medium (η2 = 0.06) and large (η2 = 0.14) effects were adopted.
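The data preparation that precedes model fitting can be sketched in Python. The sketch only illustrates the ±0.5 contrast coding, the interaction term, the duration transform, and the AIC formula used for model comparison; it does not fit mixed models, which in the study was done with lme4 in R. Reading "lg" as the base-10 logarithm is an assumption, and all names and values are hypothetical.

```python
import math

# Contrast codes as described in the text; "emotional" stands for the
# positive or the negative condition, depending on the block.
WORD_TYPE = {"neutral": -0.5, "emotional": 0.5}
PROFICIENCY = {"lower": -0.5, "higher": 0.5}


def design_row(word_type, proficiency_group, duration_ms):
    """Build the fixed-effect predictors and transformed outcome for one trial."""
    w = WORD_TYPE[word_type]
    p = PROFICIENCY[proficiency_group]
    return {
        "word_type": w,
        "proficiency": p,
        "interaction": w * p,                     # product of the two codes
        "log_duration": math.log10(duration_ms),  # assumption: lg = log base 10
    }


def aic(log_likelihood, n_parameters):
    """Akaike information criterion: 2k - 2 ln(L); lower values are preferred."""
    return 2 * n_parameters - 2 * log_likelihood


row = design_row("emotional", "higher", 100)
print(row)
# Hypothetical log-likelihoods for a full and a reduced model.
print(aic(-100.0, 5), aic(-101.5, 4))
```

With ±0.5 coding, each main-effect coefficient corresponds to the difference between the two levels of that factor averaged over the other factor, which is one common reason for preferring it to 0/1 coding when an interaction is included.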
3. Results
3.1. Effects of presentation sequence on neutral word processing
Before conducting the formal statistical analyses, we examined the effects of presentation sequence on the processing of neutral words. In the current within-subject design, the same set of neutral words was presented twice, once in each of the two blocks, whereas each emotional word appeared only once. Although possible familiarity effects arising from the repeated presentation were addressed by counterbalancing, averaging the eye-tracking data for neutral words across the first and second presentations remained a potential concern. Therefore, for each of the 13 eye-tracking measures in neutral word processing, we entered presentation sequence as the fixed effect, with by-subject and by-item random effects, in LMMs or GLMMs. As shown in Table 5, no significant differences were identified in any of the eye-tracking measures (all ps > 0.32). These results indicated that repeated presentation did not significantly influence the processing of neutral words in the current L2 sample, which provides modest justification for the following statistical analyses.
3.2. Positive emotional words vs. neutral words
Overall, the comprehension accuracy for the positive block was acceptable for all participants (M = 87.15%, SD = 8.23%, range = 75% to 100%), indicating that they were attentive to the reading task. Table 6 presents the means and standard deviations of reading times concerning the eye movements at both target and post-target regions. Details of the LMM and GLMM models were provided in the supplementary data.
Note. Reading times are in milliseconds.
In the early stage of L2 sentence reading, positive words showed a numerical trend toward extra processing cost, and this trend appeared more pronounced in the lower proficiency group. In general, when participants read positive emotional words, they tended to show longer first-pass gaze duration (β = 0.03, SE = 0.02, t = 1.58, χ2(1) = 2.50, p = 0.11) and a lower skipping rate (β = −0.16, SE = 0.39, t = −0.42, χ2(1) = 0.00, p = 0.95). The landing position (β = -0.01, SE = 0.03, z = −0.39, χ2(1) = 0.16, p = 0.69) was similar in the neutral and positive conditions. However, none of these early-stage measures showed a significant main effect of emotion. As for the comparison between the two proficiency groups, the higher proficiency group was overall faster to read the L2 words than the lower proficiency group, but the interaction between English proficiency and target word type reached significance only for landing position (β = 0.07, SE = 0.03, t = 1.94, χ2(1) = 3.75, p = 0.05, R2Marginal = 0.006, R2Conditional = 0.102). However, in an additional model where L2 proficiency was coded as a continuous variable, the significant interaction in landing position disappeared (χ2(1) = 0.27, p = 0.60), suggesting that the L2 proficiency effects observed in the early processing stage were not robust.
With respect to the late measures, reading times were also numerically longer for positive emotional words, and this cost appeared greater in the lower proficiency group. Participants showed longer total time (β = 0.02, SE = 0.02, t = 0.90, χ2(1) = 0.82, p = 0.37) and second pass time (β = 0.01, SE = 0.02, t = 0.85, χ2(1) = 0.73, p = 0.39) while reading positive words, and they made fewer regressions back to the target word region (β = -0.05, SE = 0.14, z = −0.33, χ2(1) = 0.14, p = 0.71) under this condition, although none of these main effects was significant. The interaction between English proficiency and target word type was significant in the measure of total time (β = 0.04, SE = 0.02, t = 2.06, χ2(1) = 4.25, p = 0.04, R2Marginal = 0.033, R2Conditional = 0.377). Post-hoc analyses showed that, compared to L2 readers in the lower proficiency group, those in the higher proficiency group spent significantly less total time reading positive words (β = −0.12, SE = 0.05, t = −2.38, p = 0.02), whereas the two groups’ reading times for neutral words were comparable (β = −0.08, SE = 0.05, t = -1.64, p = 0.11). When L2 proficiency was coded as a continuous variable, this significant interaction was also identified (β = 0.001, SE = 0.001, t = 2.02, χ2(1) = 4.09, p = 0.04, R2Marginal = 0.045, R2Conditional = 0.377). As such, it could be cautiously inferred that higher L2 proficiency might be associated with greater processing ease for positive words at the late processing stage (Fig. 2).
In the post-target region, neither the main effect of emotion nor the interaction between English proficiency and emotion reached significance in the categorical models. Interestingly, the spillover measure did not show a significant interaction effect in the categorical model (χ2(1) = 1.30, p = 0.26), but this interaction reached significance in the continuous model (β = 0.001, SE = 0.001, t = 1.98, χ2(1) = 3.90, p = 0.05, R2Marginal = 0.014, R2Conditional = 0.185).
In separate runs of the models examining the effects of individuals’ emotional states on L2 reading, BDI scores did not show significant main effects (all ps > 0.24) or interaction effects (all ps > 0.15) in any of the eye movement measures. As for the STAI scores, although the main effects were not significant for all eye-tracking measures (all ps > 0.28), significant interactions between STAI scores and emotion were recognized in the early measures of positive word processing, including the first fixation (χ2(1) = 5.89, p = 0.02, R2Marginal = 0.018, R2Conditional = 0.136) and single fixation (χ2(1) = 3.92, p = 0.05, R2Marginal = 0.021, R2Conditional = 0.211). Hence, the anxious mood, but not the depressive states, was correlated with the L2 processing of positive emotional words in the early stage.
In summary, L2 readers showed no significant differences in eye movements between neutral and positive words. However, bilinguals with higher L2 proficiency showed significantly stronger emotion processing effects for L2 positive words at the late processing stage, as reflected by the robust interaction effect on total reading time. Other eye-tracking measures, such as landing position and spillover, also displayed significant interaction effects in one of the two models, but these should be interpreted with caution given their low robustness. Additionally, only the measures of first fixation and single fixation demonstrated significant interactions between STAI scores and emotion in the positive block.
3.3. Negative emotional words vs. neutral words
Overall, the comprehension accuracy for the negative block was acceptable for all participants (M = 86.92%, SD = 6.84%, range = 75% to 100%), suggesting that they read the sentences carefully and followed our instructions. Table 7 presents the means and standard deviations of reading times. Details of the LMM and GLMM models were provided in the supplementary data.
Note. Reading times are in milliseconds.
Across the early-stage measures, reading times for negative emotional words were numerically longer than those for neutral words. When participants read negative emotional words, they tended to show longer first fixation duration (β = 0.02, SE = 0.01, t = 1.93, χ2(1) = 3.68, p = 0.06), single fixation duration (β = 0.01, SE = 0.01, t = 1.16, χ2(1) = 1.36, p = 0.24), and first-pass gaze duration (β = 0.02, SE = 0.02, t = 0.87, χ2(1) = 0.76, p = 0.38). The landing position (β = 0.02, SE = 0.03, t = 0.48, χ2(1) = 0.23, p = 0.63) and skipping rate (β = 0.03, SE = 0.29, z = 0.11, χ2(1) = 0.06, p = 0.80) were also numerically higher for the negative words than for the neutral words. None of these main effects reached significance. As for the proficiency effects on L2 emotional reading, a significant interaction was identified for single fixation duration (β = 0.04, SE = 0.02, t = 2.04, χ2(1) = 4.17, p = 0.04, R2Marginal = 0.008, R2Conditional = 0.166). However, this interaction did not reach significance in the models where L2 proficiency was coded as a continuous variable (χ2(1) = 2.52, p = 0.11), suggesting that the proficiency effects identified in the categorical models might not be robust.
As for the late measures, L2 participants also showed numerically longer reading times for negative words than for neutral words. Specifically, the total time (β = 0.01, SE = 0.03, t = 0.53, χ2(1) = 0.28, p = 0.59), second pass time (β = 0.00, SE = 0.02, t = 0.18, χ2(1) = 0.03, p = 0.85), and regressions into the target regions (β = 0.02, SE = 0.16, z = 0.10, χ2(1) = 0.02, p = 0.90) were all slightly higher for negative words, but none of these differences was significant. No significant interactions between L2 proficiency and emotion were found in any of the eye-tracking measures at the late stage of negative word processing.
In the post-target region, a significant main effect of emotion was observed only in the first fixation measure. When reading the post-target words following the negative targets, L2 participants showed significantly longer first fixation duration (β = 0.02, SE = 0.01, t = 2.55, χ2(1) = 6.50, p = 0.01, R2Marginal = 0.009, R2Conditional = 0.114). The emotion effect on spillover also reached marginal significance (β = 0.02, SE = 0.01, t = 1.83, χ2(1) = 3.32, p = 0.07, R2Marginal = 0.010, R2Conditional = 0.149). These two eye-tracking measures seemed to suggest that the early processing of the words immediately following negative words might require a longer time in the L2 population. English proficiency, however, was not predictive of any emotion processing effects here.
Two separate models were run to examine the effects of participants’ emotional states, as reflected by the BDI and STAI scores. We observed no significant main effects of BDI scores (all ps > 0.15) or STAI scores (all ps > 0.06) in any of the eye movement measures. As for the interaction effects, only one significant interaction was identified between BDI scores and emotion in the late measure of regressions out (χ2(1) = 4.08, p = 0.04, R2Marginal = 0.022, R2Conditional = 0.265), whereas no other interaction involving BDI was significant (all ps > 0.18). No significant interaction effects were found for STAI scores (all ps > 0.23).
In summary, emotion processing effects for negative words emerged only briefly among L2 readers, in the early processing stage at the post-target region. However, as none of the eye-tracking measures at the target region and most measures at the post-target region did not demonstrate significant main effects of emotion, these effects might not be reliable for negative words. As for the interaction between L2 proficiency and emotion, only the single fixation measure showed a significant result, and only when participants were divided into two proficiency groups. Additionally, only one interaction between BDI scores and emotion was found, for the measure of regressions out. These interaction effects need to be interpreted with caution given their low robustness.
4. Discussion
The present study investigated whether Chinese–English bilinguals benefit from emotional information while processing L2 emotion-label words, and measured participants’ eye movements at the early and late processing stages in passive sentence reading tasks. Results suggested that the emotion processing effects reported for L1 did not occur in our L2 population, and that these emotional words possibly even imposed a burden on L2 processing, as reflected by the numerically longer reading times for both positive and negative emotional words. The interaction between L2 proficiency and emotion was stably significant for positive words at the late processing stage. Moreover, since significant interactions involving depressive and anxious states were identified in only a few eye-tracking measures, these states might not robustly correlate with eye movement measures in L2 emotional word processing.
Emotional words have been claimed to facilitate lexical processing and sentence reading in L1, but such an emotional advantage was not found in our L2 sample. In fact, L2 readers in our study even showed a reversed tendency of longer reading time for L2 emotional word processing, indicating that there might be disadvantages in this process. Similar to our eye movement measures, a few studies have presented bilinguals with emotional and neutral words in their L1 and L2 conditions, and recorded their pupillary responses. These experiments reported less pronounced contrasts of pupil size in the L2 condition only (Toivo & Scheepers, Reference Toivo and Scheepers2019), and observed similar weaker pupil responses for emotional words in L2 even across two cognate languages (Yao et al., Reference Yao, Connell and Politzer-Ahles2023), thus reflecting reduced emotional responses for L2 words. Some studies using other physiological measures have also reported similar findings. For instance, bilinguals showed significantly decreased activities of the corrugator muscle for L2 emotional words (Baumeister et al., Reference Baumeister, Foroni, Conrad, Rumiati and Winkielman2017), and their skin conductance levels did not differ significantly between emotionally charged and neutral words (Eilola & Havelka, Reference Eilola and Havelka2010). Our eye movement data supplemented existing evidence and demonstrated that bilinguals might not be sensitive to the emotional messages in L2 so that they read the L2 emotional and neutral words in similar ways.
One possible explanation for the absence of emotion processing effects in L2 reading is that bilinguals had less emotional engagement in the L2 context (Harris et al., Reference Harris, Gleason, Ayçiçeǧi and Pavlenko2006), with greater emotional distance and reduced emotional resonance (Costa et al., Reference Costa, Foucart, Arnon, Aparici and Apesteguia2014; Degner et al., Reference Degner, Doycheva and Wentura2012) in L2 processing. Several factors, such as age of acquisition, context of acquisition, language dominance, L2 proficiency, and cross-language emotional memory (Caldwell-Harris, Reference Caldwell-Harris2015; Degner et al., Reference Degner, Doycheva and Wentura2012; Yao et al., Reference Yao, Connell and Politzer-Ahles2023), have been argued to influence the formation of emotionality in L2. In line with the Emotional Contexts of Learning Theory (Harris et al., Reference Harris, Gleason, Ayçiçeǧi and Pavlenko2006), late bilinguals usually acquire their second language through formal classroom instruction, and they may not be able to integrate the L2 lexical meanings with emotionally relevant autobiographical experiences and everyday interactions (Pavlenko, Reference Pavlenko2017). Due to the lack of sensory involvement, the emotional connotations of L2 words and the ensuing sensorimotor activation are not as strong as those in the first language (Dudschig et al., Reference Dudschig, de la Vega and Kaup2014; Eilola & Havelka, Reference Eilola and Havelka2010). Affective rating experiments have shown that bilinguals were inclined to evaluate L2 emotional words less intensely (Ferré et al., Reference Ferré, Guasch, Stadthagen-Gonzalez and Comesaña2022; Imbault et al., Reference Imbault, Titone, Warriner and Kuperman2021), and they reported feeling less emotional and natural in L2 despite high proficiency (Dewaele & Nakano, Reference Dewaele and Nakano2013). 
In other words, late bilinguals may clearly know the emotional meaning of L2 words, but they either find it difficult to personally feel the embedded emotion or perceive the L2 emotional words as insincere and artificial (Pavlenko, Reference Pavlenko2012). As a result, their reading behaviours for emotional and neutral words did not differ significantly overall, and the facilitation effects of emotional connotations did not emerge.
Nevertheless, the eye-tracking data from our L2 readers showed numerically longer reading times for emotional words, which seemed to indicate a disadvantage for L2 emotion processing rather than merely an attenuation of the advantage found in L1. The overall longer reading time in the L2 population might be caused by the extra cognitive load required for L2 processing. For bilinguals, the limited attentional and cognitive resources are preferentially occupied by prioritized orthographic, semantic, grammatical and phonetic processing in L2 (Franconeri et al., Reference Franconeri, Alvarez and Cavanagh2013; Hinojosa et al., Reference Hinojosa, Méndez-Bértolo and Pozo2010; Jiang, Reference Jiang2021; Lieder & Griffiths, Reference Lieder and Griffiths2019), so the additional processing of emotional connotations may lead to cognitive overload and working memory burden. Neuroimaging evidence has also corroborated that L2 brains favoured semantic processing over emotion decoding and that more cognitive resources were needed to process the lexical-semantic components of L2 words (L. Liu et al., Reference Liu, Margoni, He and Liu2021), which hindered or at least attenuated access to the emotional aspects of L2 words and led the bilinguals to overlook emotional information (L. Liu et al., Reference Liu, Schwieter, Wang and Liu2022). As a result, when our bilingual participants read the target words and sentences in the cognitively demanding L2 context, they needed more time to comprehend the semantic meanings, and spent even longer decoding the emotional connotations associated with the emotional words in a suppressed or delayed manner.
Valence effects were also observed in the current study. Previous studies have mostly revealed processing disadvantages for negative words and processing advantages for positive words, compared to neutral words (Barriga-Paulino et al., Reference Barriga-Paulino, Guerreiro, Faísca and Reis2022; Bromberek-Dyzman et al., Reference Bromberek-Dyzman, Jończyk, Vasileanu, Niculescu-Gorpin and Bąk2021; Unkelbach et al., Reference Unkelbach, von Hippel, Forgas, Robinson, Shakarchi and Hawkins2010). In our study, similar disadvantages in negative word processing were also reflected by bilinguals’ eye movements in the L2 context. This could be explained by the Automatic Vigilance Theory (Estes & Adelman, Reference Estes and Adelman2008; Estes & Verges, Reference Estes and Verges2008; Pratto & John, Reference Pratto and John1991), which ascribes such effects to the increased engagement and delayed disengagement of attention for negative stimuli. According to this theory, external inputs are first automatically evaluated as positive or negative, and extended attention and cognitive resources are then allocated to negative stimuli because they signal potential danger, so that responses to negative stimuli are prolonged compared to those to neutral stimuli. A more pertinent interpretation of the negative disadvantage in our L2 sample might be the suppressed and reevaluated processing of L2 negative stimuli. Jończyk et al. (Reference Jończyk, Boutonnet, Musial, Hoemann and Thierry2016) observed reduced N400 amplitudes only when bilinguals read L2 sentences ending with negative words. It has thus been proposed that the semantic access to L2 negative information in bilinguals was incomplete and partially suppressed at the early processing stage, which might further trigger stronger reevaluation in later processing stages, as reflected by the greater amplitudes in the late positive complex (LPC) range (Y. J. Wu & Thierry, Reference Wu and Thierry2012). Because of this early suppression and later reevaluation, it is plausible that the reading time for L2 negative words was longer than that for the neutral words.
However, the disadvantages of positive word processing in our study were not expected, which seemed to be inconsistent with previous evidence (Sheikh & Titone, Reference Sheikh and Titone2013, Reference Sheikh and Titone2016). In this case, the effects of the arousal dimension, which have been found to function on the word recognition duration independently from the valence effects (Kever et al., Reference Kever, Grynberg, Szmalec, Smalle and Vermeulen2019), might have influenced the processing of emotional words in our study. Kuperman et al. (Reference Kuperman, Estes, Brysbaert and Warriner2014) tried to determine the precise nature of the effects of valence and arousal on word recognition by analyzing 12,658 words, and they observed that words with higher arousal were recognized more slowly than calming (low-arousal) words. As illustrated in Fig. 1, our included positive words were perceived as the most arousing category in the L2 population, compared to the neutral and negative words. Therefore, the higher arousal traits of our positive words might elicit extra difficulties in word recognition, which further caused the processing disadvantages of longer reading time. If the valence-arousal interaction is taken into further consideration, our included lexical materials seem to have caused extra cognitive burdens in L2 processing. In an extended version of the Approach-Withdrawal Theory (Robinson et al., Reference Robinson, Storbeck, Meier and Kirkeby2004), it is proposed that negative high-arousal stimuli suggest potential threat (withdrawal) while positive low-arousal stimuli represent safety (approach). In our study, however, the experimental materials instead encompassed positive high-arousal and negative low-arousal words. 
According to the Valence-Arousal Conflict Theory (Robinson et al., Reference Robinson, Storbeck, Meier and Kirkeby2004), positive high-arousal and negative low-arousal stimuli tend to be more difficult to process because they violate the traditional approach-withdrawal relationships. Therefore, due to the mismatched valence-arousal relationships of our emotional words, these positive words might have imposed more processing difficulties at both the pre-attentive level and the subsequent response stage, thereby prolonging the reading time for these words. Another possible reason for this processing disadvantage might be the individual psychological states of our participants. Recent studies have claimed that reading behaviours for positive words are highly susceptible to individual differences, and found that individuals with a higher need for affect read positive words more slowly (Lei et al., Reference Lei, Willems and Eekhof2023). However, since the current study collected only the BDI and STAI scores as reflections of participants’ emotional states, it could not be determined whether other factors related to individual differences (e.g., need for affect) modulated eye movements during L2 reading.
L2 proficiency was also included as a key factor in our subgroup analyses, which distinguished the reading behaviours of the higher and lower proficiency groups, especially for positive words. Based on the Revised Hierarchical Model and the Lexical Quality Hypothesis, these two proficiency groups might have different semantic representations of L2 words. Hence, we hypothesized that bilinguals with different L2 proficiency would show different eye movement patterns while processing L2 words with various emotional connotations. Indeed, we found one stable, significant proficiency-emotion interaction effect on total reading time at the late processing stage of positive words. For other eye-tracking measures, we also identified some unstable interaction effects, which reached significance in only one of the two models that coded L2 proficiency as either a categorical or a continuous variable. These measures included the early landing position and the late spillover metrics in the positive block, and the single fixation of target words in the negative block. Our findings suggest that the effect of proficiency, reflected by the faster reading speed of more proficient L2 readers, was larger for positive words than for neutral words in the late processing stage. In fact, semantic and emotional processing in L2 may take place via separate channels. Compared to the less proficient bilinguals, highly proficient bilinguals can automatically activate the L2 affective connotations (Degner et al., Reference Degner, Doycheva and Wentura2012) with a more integrated word processing procedure (Sianipar et al., Reference Sianipar, Middelburg and Dijkstra2015), and they tend to show prioritized emotion perception during L2 processing (Ponari et al., Reference Ponari, Rodríguez-Cuadrado, Vinson, Fox, Costa and Vigliocco2015). 
Lower proficiency bilinguals, by contrast, seem to incur additional cognitive costs (Hasegawa et al., Reference Hasegawa, Carpenter and Just2002; Stiller & Schworm, Reference Stiller and Schworm2019), which would further affect emotional access (L. Liu et al., Reference Liu, Margoni, He and Liu2021) and lead to the down-regulation of emotional circuits (L. Liu et al., Reference Liu, Schwieter, Wang and Liu2022; Van Dillen et al., Reference Van Dillen, Papies and Hofmann2013). However, a prior eye-tracking experiment (Sheikh & Titone, Reference Sheikh and Titone2016) found that L2 proficiency predicted only the concreteness advantage in L2 processing, not the emotional advantage. As such, since proficiency effects on L2 emotion processing have not yet been sufficiently investigated, our findings on these effects need to be interpreted with caution and require further examination.
We also investigated the associations between depressive or anxious symptoms and reading behaviours for L2 emotional words. However, only a few measures were significantly correlated with our bilinguals’ affective states, indicating that these findings might not be stable or robust enough to generalize to a larger population. Previous evidence has shown facilitated processing of negative target words under a more anxious mood (Knickerbocker et al., Reference Knickerbocker, Johnson and Altarriba2015). Our results, by contrast, showed processing ease for positive words as a function of higher anxiety level, likewise at the early processing stage. In addition to the anxious states, we also found that trait depression was associated with the percentage of regressions out in the post-target regions for negative words. Recent research has mostly inspected the relationship between anxiety and negative emotion processing, finding that anxiety symptoms were associated with higher avoidant attention allocation and greater pupil dilation (Shechner et al., Reference Shechner, Jarcho, Wong, Leibenluft, Pine and Nelson2017). Nevertheless, negative moods in general, such as anxiety and depression, have been found to promote detail-oriented L2 lexical processing and temporarily suppress complete L2 semantic integration (Naranowicz et al., Reference Naranowicz, Jankowiak, Kakuba, Bromberek-Dyzman and Thierry2022), so the seemingly different correlations here might simply be manifestations of the same effect in two emotional word categories. Overall, research on the influence of depression and anxiety symptoms on the emotional access of words and on sentence reading, especially in the bilingual context, is still scant and remains open for discussion, so the findings reported here also warrant further examination.
Several limitations of the present study should be acknowledged. First, the within-subject design might have influenced the interpretation of our findings. For instance, certain unexpected effects might have been introduced by the repeated reading of the neutral sentences. To address this problem, we tried to minimize such effects through counterbalancing, and found no significant effects of presentation sequence on neutral word processing. We therefore consider it unlikely that repeated reading compromised the statistical analyses of the current study. Future studies could adopt more sophisticated experimental materials and designs in the exploration of L2 emotion processing. Second, the affective perception of the experimental materials showed some differences between the L1 and L2 populations. For instance, compared to the positive words, negative words were perceived as more arousing by the L1 readers but less arousing by the L2 readers. Although these differences did not change the fact that both positive and negative words could be clearly distinguished from neutral words in both populations, they might still have modulated our eye-tracking observations and call for more caution in the interpretation of our results. Third, we only enrolled late bilinguals with similar language dominance, and more diversified L2 populations need to be included to examine the emotion processing effects in bilingual and multilingual contexts. Previous research has indicated that the construction and development of semantic representations were largely different in early and late bilinguals (Gathercole & Moawad, Reference Gathercole and Moawad2010), whose language networks also manifested different patterns (X. Liu et al., Reference Liu, Tu, Wang, Jiang, Gao, Pan, Li, Zhong, Zhu, Niu, Li, Zhao, Chen, Liu, Lu and Huang2017). 
These two types of bilinguals may therefore show different activation and response patterns when reading the same set of emotional words in the L2 context. Last, the emotional density of emotion-label and emotion-laden words can be different in bilinguals’ lexicon (Pavlenko, Reference Pavlenko2008), and it is still unclear whether these two emotion-related word types cause identical or discrepant physiological responses in L2 reading. According to previous behavioural and neuroimaging studies, the processing of emotion-laden words required longer response times and greater cognitive effort than that of emotion-label words (D. Tang et al., Reference Tang, Fu, Wang, Liu, Zang and Kärkkäinen2023; C. Wu et al., Reference Wu, Zhang and Yuan2022). In our current study, we did not identify significant facilitation effects using emotion-label words, so it might be predicted that facilitation effects driven by emotional connotations would not occur in the reading of L2 emotion-laden words either. Because emotion-laden words have indirect emotional connections and are associated with various personal experiences, their processing might display more prominent disadvantages and larger individual variation than that of emotion-label words. Hence, future studies that analyze eye movements in the L2 reading of emotion-laden words are needed to illustrate L2 emotion processing effects more comprehensively.
5. Conclusions
To conclude, the facilitation effects of affective connotations in emotional words were not observed in the L2 context. Instead, processing disadvantages for emotional words were observed in our L2 sample. Our findings suggest that bilinguals activate fewer emotional connotations when reading non-dominant L2 words in natural sentences. Moreover, readers with higher L2 proficiency showed greater ease in processing positive words at the late stage compared to those with lower L2 proficiency. As only a few interaction effects of individuals’ emotional states were identified in our study, it appears that the depressive and anxious states of bilinguals might not robustly influence their processing of L2 emotional words. The current study adds new evidence to the limited number of eye-tracking experiments on L2 emotion processing, and supports several theoretical frameworks in the field of bilingual language processing.
Supplementary Material
For supplementary material accompanying this paper, visit https://doi.org/10.1017/S1366728923000718
Conflict of interest
We have no known conflict of interest to disclose.
Acknowledgements
This work was supported by a grant from the Major Program of the National Social Science Foundation of China (Grant No. 18ZDA293).
Competing interests
The authors declare none.
Data availability
The essential data of this research are available in the electronic supplementary file.