1. Introduction
The word frequency effect (FE) refers to the phenomenon that words with a higher frequency of occurrence are processed faster than those that appear less often. It has been well studied in monolinguals and bilinguals of alphabetic languages (e.g., English; Gollan et al., 2011) and is one of the strongest factors affecting word processing (Brysbaert et al., 2016). The effect is also evident in reading Chinese (e.g., Li et al., 2014), a writing system that systematically differs from that of alphabetic languages in terms of spelling and pronunciation. However, recent evidence shows that although there is an overall FE in Chinese paragraph reading, the effect decreases and eventually disappears as character frequency, a language-specific factor, increases (Sui et al., submitted). Consequently, the magnitude of the Chinese FE in natural reading may differ from that in alphabetic languages. In addition to such between-group comparisons across languages, an interesting line of research on alphabetic languages has compared FEs within readers, between first language (L1) and second language (L2). Typically, the FE is found to be larger in L2 than in L1 (Cop et al., 2015; Duyck et al., 2008; Mor & Prior, 2022; Whitford & Joanisse, 2018; Whitford & Titone, 2012, 2017).
However, no study has compared L1 and L2 FEs in natural reading among bilinguals with different L1 writing systems. Given the lack of empirical evidence, the existing theoretical explanations of the FE based on data from alphabetic languages, such as the learning, lexical entrenchment and rank hypotheses (infra), have yet to be verified for their applicability to Chinese–English bilinguals. This work, therefore, aims to compare FEs in L1 and L2 between readers with distinct L1s (i.e., Chinese and Dutch) and the same L2 (i.e., English) as well as between L1 and L2 within Chinese–English bilinguals. First, investigating these questions allows us to examine theories based primarily on research into alphabetic reading (e.g., the learning hypothesis) and to assess their universality. Second, it sheds light on the similarities and differences between L1 writing systems and on whether the nature of the L1 writing system affects the processing of the L2.
Besides the above cross-lingual complications, it is also important to consider the (experimental) context in which words appear. Word recognition in natural text is subject to a wide range of contextual influences (e.g., syntactic or semantic expectations) and differs from reading isolated words (Dirix et al., 2019; Kuperman et al., 2013). Moreover, the FE observed in isolated word reading (e.g., the lexical decision task) appears larger than that for words embedded in sentences (e.g., in eye-tracking research; Dirix et al., 2019). Studying the FEs of words in sentences, which closely resembles reading in everyday life, is therefore essential for understanding language processing, especially in Chinese reading. This is because words are important units in Chinese reading, and their boundaries are often not clearly defined, making word segmentation essential for reading sentences but not for isolated words. In the current study, we therefore investigate FEs by comparing two eye-tracking corpora, GECO (Ghent Eye-tracking COrpus; Cop et al., 2017a) and GECO-CN (Ghent Eye-tracking COrpus for Chinese–English bilinguals; Sui et al., 2023), which recorded eye-movement data for L1 and L2 paragraph reading in Dutch– and Chinese–English bilinguals, respectively.
Eye-tracking is a popular method for studying the processes underlying sentence reading by monitoring the reader's eye movements. This approach provides a range of eye-movement measures (Rayner, 2009), based on saccades (rapid movements of the eyes to a new point) and fixations (periods during which the eyes rest on a specific point). There are multiple fixation duration measures, including (a) first-fixation duration (FFD), the duration of the initial fixation on a word; (b) gaze duration (GD), the summed duration of fixations on a word in the first pass and (c) total reading time (TRT), the summed duration of all fixations and refixations on a word. The first two are generally viewed as early measures (reflecting the initial stages of word identification, such as lexical access), whereas the last one, incorporating second-pass time, is considered a late measure (reflecting later stages, such as verification and integration; e.g., Boston et al., 2008; Clifton et al., 2007). Skipping probability refers to whether the word is skipped during the reading, not just in the first pass.
Some studies have investigated word frequency as a categorical variable, although it naturally occurs as a continuous variable (e.g., Li et al., 2014). However, categorizing continuous variables can result in reduced statistical power and reliability, inappropriate rejection of the null hypothesis and failure to capture the variation of the effect (Balota et al., 2004). Here, the large number of target words in the two eye-tracking corpora allows us to assess word frequency as a continuous variable. In the following sections, we begin with brief summaries of the existing findings on the L1 FE for distinct writing systems, i.e., alphabetic languages and Chinese. Then, we review the key results on L2 FEs and discuss theoretical issues regarding FEs in bilinguals. Finally, we report our analyses and discuss the main findings.
1.1 L1 FE in alphabetic languages
FE, the difference in processing times between low-frequency (LF) and high-frequency (HF) words, has been studied extensively in L1 reading of alphabetic languages (e.g., Cop et al., 2015; Rayner & Raney, 1996; Whitford & Titone, 2017). The effect is one of the most potent phenomena (explaining over 30% of the variance in lexical decision mega-studies; Brysbaert et al., 2016; Ferrand et al., 2010; Keuleers et al., 2010b, 2012; Yap & Balota, 2009) and is robust in both monolingual and bilingual adults and children (Cop et al., 2015; Whitford & Joanisse, 2018). Numerous reading experiments have shown that when reading in the first or dominant language, alphabetic language readers spend more time fixating on LF words and are less likely to skip them than HF words (e.g., Cop et al., 2015; Duyck et al., 2008; Whitford & Joanisse, 2018; see Rayner, 2009, for a review). The FE appears to be modulated by the degree of language exposure: readers with more language exposure exhibit a smaller FE (e.g., Ashby et al., 2005; Cop et al., 2015; Whitford & Titone, 2012, 2017). Some studies have shown that L1 and L2 fixation durations decrease as L1 exposure increases, unaffected by L2 exposure (Cop et al., 2015).
However, others found that L2 exposure affects FEs in young adults: as L2 exposure increases, the FE decreases for L2 and increases for L1 (Whitford & Titone, 2012, 2017). Furthermore, English monolinguals and alphabetic language bilinguals (e.g., Dutch–English) exhibit comparable FEs in L1 reading (or dominant language; Cop et al., 2015 in sentence reading; Diependaele et al., 2013 in lexical decision; but see Whitford & Joanisse, 2018, for a larger L1 FE in English–French children compared to English monolinguals).
Furthermore, unskilled readers exhibited larger FEs compared to skilled readers, with steeper curves at LF words (Kuperman & Van Dyke, 2013). Limited exposure to a language appears to reduce exposure to LF words in particular, since such readers are likely to have a limited vocabulary and may opt for easier materials (i.e., with fewer LF words; Brysbaert et al., 2017). Consequently, their exposure to HF words should be similar to that of readers with extensive language exposure, but their exposure to LF words should be considerably lower. As a result, the difference in reading times between HF and LF words decreases with increased language exposure, leading to a reduced FE, congruent with the existing findings.
1.2 L1 FE in Chinese writing systems
Chinese is a logographic language that is qualitatively distinct from alphabetic languages. Chinese characters are written in strokes and are the components of words. There are about 5,000 commonly used characters in Chinese, which combine into more than 56,000 words. The most frequently encountered word type is the two-character word, whereas the most common word tokens are one-character words, i.e., the characters themselves. One- and two-character words account for the majority of commonly used Chinese words (97.2%; Li & Pollatsek, 2020). Chinese words are therefore, on average, much shorter than those of alphabetic languages. Another major difference between Chinese and alphabetic languages is that the words of the former are not visually separated in sentences, whereas the latter contain spaces between words. That is, a character in a Chinese sentence might be a single-character word or form a word with its preceding or following character. Since the word is an important processing unit in Chinese reading (for the discussion, see Li et al., 2014; Sui et al., 2023), word segmentation is challenging but necessary for Chinese sentence reading.
Despite the lack of visual demarcation between words in Chinese sentences, most research has observed conventional word FEs (e.g., Li et al., 2014; Yan et al., 2006). Findings for Chinese single-character words are inconsistent, with some studies showing a significant main effect of word frequency (Zang et al., 2016) and others failing to find it (Liversedge et al., 2014). Liversedge et al. (2014) did not observe a main effect of word frequency but did observe a significant interaction between frequency and word (character) complexity (i.e., number of strokes). That is, the fixation duration was longer for LF, complex words. In multi-character words, the main effect of word frequency was consistently observed, with shorter reading times (e.g., Li et al., 2014; Ma et al., 2015) and higher skip rates for HF words (e.g., Cui et al., 2021; Liu et al., 2019; Yan et al., 2006). In general, the FEs found in Chinese sentence reading are concordant with those reported in alphabetic languages.
Note that most studies investigating the FE on eye movements in Chinese reading only studied target words (primarily two-character content words) embedded in a single manipulated low-constraint sentence (e.g., Li et al., 2014; but see Sui et al., submitted, in paragraphs), with many even using the same sentence frames that differed only in the target words (e.g., Cui et al., 2013, 2021), in order to minimize sentence context effects. Furthermore, most research investigated word FEs using dichotomous frequency categories (i.e., categorizing continuous frequencies, e.g., Cui et al., 2021; Li et al., 2014; but see Sui et al., submitted, who analysed them as continuous variables). Hence, the word FEs observed in some existing research may be adversely affected or even biased by these manipulations. In addition, the effect of language exposure on the FE does not seem to apply to Chinese readers, unlike readers of alphabetic languages. So far, only Sui et al. (submitted) have considered language proficiency (a proxy of language exposure) when studying the Chinese word FE. Surprisingly, we did not find an effect of language proficiency on the word FE.
1.3 L2 frequency effect
An increasing number of studies have investigated whether an FE also occurs in L2 reading. Evidence has shown that unbalanced bilinguals usually have a larger FE in L2 than in L1 reading (e.g., Cop et al., 2015; Duyck et al., 2008; Mor & Prior, 2020; Whitford & Titone, 2012, 2017). When language exposure (often measured by its proxy vocabulary size; Brysbaert et al., 2017) is included as a predictor in the analyses, the difference between the FEs in L1 and L2 reading becomes negligibly small in the lexical decision tasks (Brysbaert et al., 2017) but not in eye-movement studies (Cop et al., 2015), where FEs remain larger in the L2 than in the L1. Cop et al. (2015) explained that the distinct results observed in different experiments may be due to the usage of disparate methods. The eye-movement measures are, not surprisingly, more complex and time-sensitive than the reaction times obtained in lexical decisions.
Notably, however, the findings of a larger FE in L2 originated primarily from the exploration of alphabetic language pairs (e.g., Cop et al., 2015). Only a few studies have explored FEs in within-group comparisons of bilinguals with non-alphabetic and alphabetic language pairs, namely Hebrew–English (e.g., Mor & Prior, 2020). Mor and Prior (2020) found a larger FE in L2 than in L1 and a negative correlation between L2 proficiency and the size of the L2 FE among unbalanced Hebrew–English bilinguals, treating word frequency as a continuous variable in a lexical decision task. Still, these pioneering findings need to be further explored both in different scripts (such as Chinese) and with different experimental paradigms (such as natural reading). Furthermore, due to the lack of empirical evidence from bilinguals with disparate L1 writing systems, it remains to be verified whether the existing FE hypotheses, which were proposed on the basis of findings from alphabetic languages, also hold for the L2 FE of such bilinguals. Next, we will discuss the existing hypotheses regarding FEs.
1.4 FE hypotheses
The learning hypothesis is generally considered to explain the FE. It suggests that repeated exposure to an item could lower the recognition threshold (e.g., the logogen model of Morton, 1970) or raise baseline activation (e.g., Monsell, 1991, cited from Cop et al., 2015). Hence, HF words, which have a higher rate of exposure, are processed faster than LF words. In addition, this hypothesis involves the asymptotic learning function, which posits that as the occurrences of a word increase (i.e., as word frequency increases), the facilitation effect of learning on its processing gradually diminishes, so that processing time decreases ever more slowly until it remains constant (also see Duyck et al., 2008; Murray & Forster, 2004). Therefore, word recognition times should correlate negatively with word frequency in a nonlinear, logarithmic way.
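As a rough illustration of this logarithmic relation, the sketch below assumes that recognition time falls linearly with log10 frequency. The intercept and slope are hypothetical values chosen for the example and are not estimates from any of the studies cited above. Under this assumption, the same absolute gain in frequency saves much less time for an HF word than for an LF word.

```python
import math

def predicted_time(freq_per_million, intercept=300.0, slope=20.0):
    """Hypothetical recognition time (ms), linear in log10 frequency.

    intercept and slope are illustrative assumptions, not fitted values.
    """
    return intercept - slope * math.log10(freq_per_million)

# Time saved by nine extra occurrences per million for an LF word (1 -> 10)
low_gain = predicted_time(1) - predicted_time(10)

# Time saved by the same nine extra occurrences for an HF word (1000 -> 1009)
high_gain = predicted_time(1000) - predicted_time(1009)

print(round(low_gain, 2), round(high_gain, 2))
```

The facilitation per additional exposure shrinks as frequency grows, which is the diminishing-returns pattern the asymptotic learning function describes.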
The FE can also be explained by the lexical entrenchment hypothesis (e.g., Diependaele et al., 2013), which highlights the strength of lexical representations in memory. Frequent exposure to a word leads to more entrenched representations, resulting in faster and more accurate processing compared with LF words. Given that unbalanced bilinguals are generally less exposed to their L2, the subjective frequency of their L2 words should be lower than that of their L1 words. Both theoretical hypotheses predict a larger FE in L2 than in L1, consistent with the existing findings (for detailed discussion, see Duyck et al., 2008). Interestingly, they also predict that once the L1 and/or L2 exposure is similar in balanced bilinguals, the L1 and/or L2 FEs of the two groups should be similar in size, regardless of their writing systems (e.g., Chinese and Dutch).
Another possible explanation, the rank hypothesis (Murray & Forster, 2004), later extended to bilinguals (Duyck et al., 2008), suggests that the lexicon is organized into frequency-ordered bins that are searched sequentially, starting with HF words. For bilinguals, the bins are either language-specific (i.e., L1 or L2) with specific scanning speeds (slower scanning in L2) or shared by all known languages, with processing time increasing nonlinearly with decreasing word frequency (Duyck et al., 2008). Nevertheless, regardless of the lexicon type (i.e., language-specific or shared), bilinguals with similar word frequency rankings should have comparable FEs, irrespective of language dominance or writing systems. The lexicon type might also be assumed to interact with cross-lingual similarity, being shared only when L1 and L2 employ the same writing system (e.g., Dutch–English) but separate for those with different scripts (e.g., Chinese–English; for a discussion, see Duyck et al., 2008). In this case, different-script bilinguals might exhibit smaller L1 and L2 FEs than bilinguals with shared bins (because search times increase nonlinearly in shared bins). If they scan more slowly in L2, as assumed above, their L2 FEs may be larger than their L1 FEs and similar in size to those of same-script bilinguals.
To summarize, all the above assumptions predict that language exposure moderates the word FE, regardless of writing systems or language dominance (L1 or L2). Bilinguals with comparable L1 exposure (or whose L1 proficiency is included in the analysis) should exhibit similar L1 FEs (except under the version of the extended rank hypothesis that posits an interaction between lexicon type and cross-lingual similarity, as it also predicts a smaller FE in different-script bilinguals). Balanced bilinguals with similar exposure to both languages should have similar FEs in their L1 and L2. Conversely, unbalanced bilinguals with less L2 exposure should have a larger FE in L2 reading, due to less complete (pre-asymptotic) learning of their LF words, relatively weak lexical representations, or the late position of LF L2 words and slower L2 scanning in the frequency-ranked bins. In addition, the hypotheses generate different predictions regarding the L2 FEs of bilinguals with distinct L1 writing systems. The learning, lexical entrenchment and frequency-ranked hypotheses (the latter in the versions that assume bins are either language-specific or shared among all languages) suggest that the L2 FE should not be affected by the L1 writing system. In contrast, the version of the frequency-ranked hypothesis that assumes bins are shared only by alphabetic languages indicates that the L2 FE may vary with the L1 writing system and that different-script bilinguals should exhibit smaller L2 effects than those with the same script.
2. Current study
Comparing FEs in natural reading between Chinese–English and alphabetic language bilinguals in their L1 and L2, and comparing L1 and L2 FEs within Chinese–English bilinguals, is of theoretical importance. First, such comparisons are necessary to evaluate the universality of the assumptions and predictions of FE and reading theories. Second, they can shed light on whether word processing differs between L1s with diverse writing systems and whether L2 reading is affected by the L1 writing system. By doing so, one can provide a plausible explanation for seemingly counterintuitive results that may be found in different groups of bilinguals.
However, to date, no studies have compared the FEs of Chinese–English bilinguals with those of same-script (alphabetic) bilinguals. Indeed, studying FEs in natural reading across groups of bilinguals is a considerable challenge. One reason is that data collection among bilinguals with disparate L1s is challenging (e.g., preparing materials) and time-consuming, especially when aiming for a dataset with sufficient power. In addition, cross-experiment comparisons are generally not convincing in investigating FE differences in reading across bilinguals unless the experiments are carefully matched, mainly because differences in materials affect reading performance, as discussed above. Yet, studying this effect with isolated words is not ideal, as the observed phenomenon cannot fully reflect performance in natural reading, especially for Chinese–English bilinguals, who need to perform word segmentation, which may affect word recognition in Chinese sentence reading (see discussion above).
Hence, the present study aims to investigate the FEs of bilinguals with different L1 writing systems and the same L2, i.e., Chinese– and Dutch–English bilinguals, in L1 and L2 reading by measuring their eye movements. Our first interest is to understand whether the L1 FE of a non-alphabetic language (i.e., Chinese) is comparable to that of an alphabetic language (i.e., Dutch) and whether language exposure explains the variation in FEs. Our second interest is to compare the L2 FE between the two bilingual groups and to establish whether it differs depending on the L1 writing system. Our third interest is to verify whether the FE in L2 is larger than that in L1 for unbalanced Chinese–English bilinguals (note that Cop et al., 2015, have explored the within-group comparison for Dutch bilinguals). We will further consider language exposure, which is known to influence FEs in alphabetic languages (e.g., Brysbaert et al., 2017), by examining whether this influence applies to different writing systems and whether it can explain group differences across bilinguals.
We will compare eye-movement data from two large corpora, GECO (Dutch–English bilinguals; Cop et al., 2017a) and GECO-CN (Chinese–English bilinguals; Sui et al., 2023), in which unbalanced bilinguals read different language versions of an entire novel in paragraphs. The corpora shared identical experimental procedures and used the same reading materials. In the experiments, readers read half of the novel in their L1 and the other half in their L2. The novel has approximately 5,000 sentences and contains a wide range of word stimuli, and thus word frequencies, in each language. The linguistic properties of the two datasets should therefore be comparable and should not interfere with the comparison between the bilingual groups (e.g., their frequency distributions do not seem to differ significantly, see Figure S.1 in the Supplementary materials). In addition, both corpora provide proficiency scores (LexTALE for Dutch and English; Lemhöfer & Broersma, 2012; HSK [Chinese Proficiency Test, n.d.] score for Chinese), which reflect language proficiency by examining vocabulary size and which we will use as a proxy of language exposure. Notably, there was no significant difference in the L2 LexTALE scores between the bilingual participants of the two corpora (see Sui et al., 2023).
3. Method
3.1 Participants and materials
3.1.1 GECO
GECO (Cop et al., 2017a) is an eye-movement corpus where 19 Dutch–English bilinguals (average age: 21.2; SD = 2.2; undergraduate and master students; also see Table S.1 in the Supplementary materials) and 14 British English monolinguals read an entire novel (The Mysterious Affair at Styles by Agatha Christie) while their eye-movement behaviour was measured. The participants read the novel in four self-paced sessions that each contained a fixed number of chapters. After each chapter, multiple choice questions were presented to ensure participants were, as instructed, reading for comprehension. Dutch natives read half of the novel in Dutch and the other half in English, whereas monolinguals read the entire book in English. For the present study, only the bilingual data were used. For further information on the corpus, we refer the reader to Cop et al. (2017a, 2017b).
3.1.2 GECO-CN
GECO-CN is a dataset consisting of eye-movement data from 30 Chinese–English bilinguals (average age: 25.3; SD = 2.60; undergraduate, master and PhD students). It follows the identical experimental procedure and uses the same reading materials as the original GECO (Cop et al., 2017a). Participants read half of the novel in Chinese and the other half in English. They also completed a series of language proficiency tests in both languages (see Table S.1). For more details, we refer the reader to Sui et al. (2023).
3.2 Analysis
This study only investigated content words (for Chinese–English bilinguals, 511,157 data points in Chinese and 442,638 in English; for Dutch–English bilinguals, 275,458 data points in Dutch and 264,634 in English) and excluded all cognates, as these orthographically and semantically overlapping translation equivalents may confound the investigation of the FE. A word was classified as a cognate if the normalized Levenshtein distance between its orthographic forms in the two languages (here a score of orthographic overlap) was greater than or equal to .7 (5.19% of words in Dutch and 7.29% in English; also see Da Silveira & van Leussen, 2015). Cognates were only present in the Dutch–English texts. Furthermore, the first and last words of a line and fixations shorter than 100 ms were removed from the analysis. The former could reflect the sentence wrap-up effect (e.g., Rayner et al., 1989; 10.31% in Chinese and 16.97% in English for Chinese–English bilinguals; 17.3% in Dutch and 16.8% in English for Dutch–English bilinguals), while the latter are considered too short to reflect word processing (e.g., Sereno & Rayner, 2003).
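The cognate criterion can be sketched as follows. This is a minimal illustration that assumes the overlap score is computed as one minus the Levenshtein edit distance divided by the length of the longer word. The exact normalization used by the tool cited above is not stated in the text, so the formula and the function names here are assumptions.

```python
def levenshtein(a: str, b: str) -> int:
    """Minimum number of single-character edits that turn a into b."""
    previous = list(range(len(b) + 1))
    for i, char_a in enumerate(a, start=1):
        current = [i]
        for j, char_b in enumerate(b, start=1):
            cost = 0 if char_a == char_b else 1
            current.append(min(previous[j] + 1,          # deletion
                               current[j - 1] + 1,       # insertion
                               previous[j - 1] + cost))  # substitution
        previous = current
    return previous[-1]

def orthographic_overlap(a: str, b: str) -> float:
    """Assumed normalization: 1 means identical spelling, 0 means no overlap."""
    longest = max(len(a), len(b))
    if longest == 0:
        return 1.0
    return 1.0 - levenshtein(a, b) / longest

def is_cognate(a: str, b: str, threshold: float = 0.7) -> bool:
    """Apply the .7 criterion from the text to a translation pair."""
    return orthographic_overlap(a, b) >= threshold

print(is_cognate("probleem", "problem"))  # one edit over eight letters
print(is_cognate("fiets", "bicycle"))     # little shared spelling
```

Under this assumed normalization, near-identical translation pairs pass the threshold, whereas pairs with little shared spelling do not.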
We used R software (Version 494) to fit linear mixed-effects models (for fixation durations) and generalized linear mixed-effects models (for skipping probability) with the lme4 package (Version 1.1-26). We conducted separate analyses for L1 and L2 and for the different eye-movement measures, and included important psycholinguistic predictors as control variables. In each model, the predictors were group (categorical, Chinese vs. Dutch bilinguals), word frequency (continuous), word length (continuous), proficiency in the language being modelled (continuous; e.g., L1 proficiency when L1 FEs were investigated) and the sequential number of each word repetition within sessions (continuous; see FFD in Table 1 for the full model). The dependent variables were FFD, GD and TRT (e.g., Clifton et al., 2007) and skipping probability. Participant and word token were included as random effects. The predictors were all centred, whereas the dependent variables were Box-Cox transformed. This transformation normalized the distribution without changing its functional relationship. For each reading time measure, fixation durations that differed from the mean by more than 2.5 standard deviations (SDs), computed per individual and per language, were discarded.
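The trimming and centring steps can be illustrated with a short sketch. It is written in Python for self-containment rather than in R, and it rests on two assumptions that the text does not spell out: the 2.5 SD criterion is applied around the mean of each participant-by-language cell, and the sample SD is used. The function names and the toy data are hypothetical.

```python
from collections import defaultdict
from statistics import mean, stdev

def trim_outliers(rows, criterion=2.5):
    """Keep rows whose duration lies within `criterion` SDs of the cell mean.

    rows: (participant, language, duration_ms) tuples.
    """
    cells = defaultdict(list)
    for participant, language, duration in rows:
        cells[(participant, language)].append(duration)

    limits = {}
    for key, durations in cells.items():
        if len(durations) < 2:
            continue  # SD is undefined for one value; such cells stay untrimmed
        cell_mean, cell_sd = mean(durations), stdev(durations)
        limits[key] = (cell_mean - criterion * cell_sd,
                       cell_mean + criterion * cell_sd)

    unbounded = (float("-inf"), float("inf"))
    kept = []
    for row in rows:
        low, high = limits.get((row[0], row[1]), unbounded)
        if low <= row[2] <= high:
            kept.append(row)
    return kept

def centre(values):
    """Subtract the mean so that the predictor has a mean of zero."""
    m = mean(values)
    return [v - m for v in values]

# Toy data: one extreme fixation among otherwise uniform L1 durations
rows = [("p1", "L1", 200)] * 19 + [("p1", "L1", 2000)]
rows += [("p1", "L2", 250), ("p1", "L2", 260), ("p1", "L2", 255)]
trimmed = trim_outliers(rows)
```

Computing the limits within each cell means that a slow reader, or a slower L2, is judged against its own distribution rather than against the grand mean.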
Estimate, estimates; Std. error, standard errors; t value, t-values; Pr(>|t|), p-values (calculated using the lmerTest package); VIF, variance inflation factor; Bold values indicate p < .0125.
*p < .0125, **p < .0025, ***p < .00025 (corrected significance level according to Von der Malsburg & Angele, 2017).
Word length is one of the important factors affecting reading times and the FE. Yet, the average length of Chinese words is much shorter than that of alphabetic language words. Thus, we rescaled word length in Chinese and Dutch as a proportion of the longest word in each language. For example, the longest Chinese words in GECO-CN were six-character words. The length of a one-character word then became 1/6, and the length of a three-character word became 1/2. The same method was used to rescale Dutch word length.
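The rescaling described above amounts to dividing each word's length by the length of the longest word in the same language. A minimal sketch follows, using exact fractions so that the two worked values from the text can be checked directly. The function name is hypothetical, and the maximum Dutch word length is not given in the text, so only the Chinese case is shown.

```python
from fractions import Fraction

def rescale_length(length: int, longest: int) -> Fraction:
    """Express a word length as a proportion of the longest word's length."""
    if not 1 <= length <= longest:
        raise ValueError("length must lie between 1 and the maximum length")
    return Fraction(length, longest)

LONGEST_CHINESE_WORD = 6  # six-character words, as stated in the text

one_character = rescale_length(1, LONGEST_CHINESE_WORD)
three_characters = rescale_length(3, LONGEST_CHINESE_WORD)
print(one_character, three_characters)
```

After rescaling, word lengths in both languages fall on the same scale from just above 0 to 1, which makes the word length predictor comparable across the two L1s.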
Notably, since both bilingual groups had English as their L2 and completed the English LexTALE, L2 word lengths were not rescaled. In addition, this work used the same log10-transformed Zipf (frequency) based on SUBTLEX-CH (Cai & Brysbaert, Reference Cai and Brysbaert2010), SUBTLEX-NL (Keuleers et al., Reference Keuleers, Brysbaert and New2010a) and SUBTLEX-UK (Van Heuven et al., Reference Van Heuven, Mandera, Keuleers and Brysbaert2014) as frequencies for Chinese, Dutch and English words, respectively. We also employed the car package (Version 3.0-12) to calculate the variance inflation factor (VIF) to estimate the multicollinearity of coefficients in each regression model. A VIF greater than 5 or 10 was considered as moderate or severe multicollinearity, respectively (also see Dirix & Duyck, Reference Dirix and Duyck2017).
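For readers unfamiliar with the Zipf scale, the sketch below shows its basic definition: the base-10 logarithm of a word's frequency per million words, plus 3. This is a simplified form without the smoothing that published frequency norms may apply, and the counts and corpus size are made-up values. Whether the Zipf values in this study were taken precomputed from the SUBTLEX databases or derived from raw counts is not stated in the text.

```python
import math

def zipf(count: int, corpus_tokens: int) -> float:
    """Simplified Zipf value: log10(frequency per million words) + 3."""
    per_million = count * 1_000_000 / corpus_tokens
    return math.log10(per_million) + 3

# Hypothetical counts in a corpus of 50 million tokens
rare = zipf(50, 50_000_000)        # 1 occurrence per million
common = zipf(50_000, 50_000_000)  # 1,000 occurrences per million
print(rare, common)
```

Because the scale is logarithmic and independent of corpus size, values from the Chinese, Dutch and English norms can be entered into the models on a common footing.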
4. Results
4.1 Bilingual L1: Chinese versus Dutch
4.1.1 First-fixation duration
The FFD in L1 reading did not differ significantly between the Chinese and Dutch bilingual groups (see Table 1). Both bilingual groups showed an overall FE, with shorter fixations for HF words than for LF ones, and the effect was significantly larger in Dutch than in Chinese bilinguals. However, the word length effect was not significant in either group. Frequency and word length interacted significantly in both groups: the FE became larger as word length increased, and this increase was significantly greater in Dutch than in Chinese bilinguals (see Figure 1A). Language proficiency did not influence fixation durations or the FE in either group, nor was there a repetition effect.
4.1.2 Gaze duration
Chinese and Dutch bilinguals showed no significant difference in GD (see Table 1) but differed in their frequency and word length effects. Both groups exhibited frequency and word length effects, with shorter GDs for higher frequency or shorter words. However, Dutch bilinguals showed significantly steeper effects compared to Chinese bilinguals. The interaction between frequency and word length observed in Chinese bilinguals differed significantly from that in Dutch bilinguals. The FE increased with word length and was more pronounced in Dutch bilinguals (see Figure 1A). Similar to what was observed in FFD, the language proficiency of Chinese and Dutch bilinguals did not affect GD or the FE. Furthermore, neither Dutch nor Chinese bilinguals exhibited a word repetition effect.
4.1.3 Total reading time
Overall, there was no significant difference in TRTs between Chinese and Dutch bilinguals (see Table 1). The frequency and word length effects were significant in Chinese bilinguals, with a significantly smaller frequency effect and a significantly larger word length effect than in the Dutch group. The interaction between frequency and word length was not evident in Chinese bilinguals, differing significantly from that observed in Dutch bilinguals (see Figure 1A). In Dutch bilinguals, word length exhibited a greater effect on LF words than on HF ones, showing a larger FE in long words. L1 proficiency did not appear to affect TRTs or FEs in either group. In contrast to the findings in FFD and GD, the repetition effect was significant in both groups, showing an inhibitory effect in Chinese bilinguals and a facilitatory pattern in Dutch bilinguals.
4.1.4 Skipping probability
Chinese bilinguals demonstrated a significantly higher skipping probability than Dutch bilinguals (see Table 1). Their frequency and word length effects were significant, with higher frequency or shorter words being more likely to be skipped; compared to Dutch bilinguals, their frequency effect was significantly smaller and their word length effect significantly larger. The interaction between frequency and word length did not yield significance in Chinese bilinguals (see Figure 1A), differing significantly from the Dutch group, where the FE was larger in short words. In addition, Chinese bilinguals exhibited an inhibitory repetition effect, with a low skipping probability for words that were repeated more often, significantly differing from the performance of Dutch bilinguals, who showed a facilitative effect, with higher word repetition related to higher skip rates.
4.2 Bilingual L2: English versus English
4.2.1 First-fixation duration
FFDs were not significantly different between Chinese and Dutch bilinguals, with a tendency for longer FFDs in the former group (see Table 2). Chinese bilinguals exhibited frequency and word length effects, which were significantly larger than those observed in Dutch bilinguals. The interaction between frequency and word length found in Chinese bilinguals significantly differed from that in Dutch bilinguals (see Figure 1B). As the word length reduced, the decrease in FFDs was greater for HF than for LF words in Chinese bilinguals, whereas it was greater for LF words in Dutch bilinguals. Language proficiency had no effect in Chinese bilinguals; in Dutch bilinguals, it did not affect FFD itself but interacted with frequency (see Figure 2). The FE decreased as language proficiency increased, with HF words being affected more than LF words. In addition, both groups spent more time reading frequently repeated words than infrequently repeated words, and the effect was similar between them.
Estimate, estimates; Std. error, standard errors; t value, t-values; Pr(>|t|), p-values (calculated using the lmerTest package); VIF, variance inflation factor; Bold values indicate p < .0125.
*p < .0125, **p < .0025, ***p < .00025 (corrected significance level according to Von der Malsburg & Angele, Reference Von der Malsburg and Angele2017).
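The corrected thresholds in the table note are consistent with dividing the conventional levels (.05, .01, .001) by the four dependent measures analysed (FFD, GD, TRT and skipping probability). The sketch below reproduces them under that assumption; the Bonferroni-style division is our reading of the note rather than a procedure spelled out in the text, and the names `corrected_alphas` and `stars` are hypothetical.

```python
N_MEASURES = 4  # FFD, GD, TRT, skipping probability
BASE_ALPHAS = (0.05, 0.01, 0.001)  # conventional significance levels


def corrected_alphas(base=BASE_ALPHAS, n_tests=N_MEASURES):
    """Divide each conventional alpha by the number of dependent measures."""
    return tuple(alpha / n_tests for alpha in base)


def stars(p: float, n_tests: int = N_MEASURES) -> str:
    """Return the significance marker a p-value would receive after correction."""
    a1, a2, a3 = corrected_alphas(n_tests=n_tests)
    if p < a3:
        return "***"
    if p < a2:
        return "**"
    if p < a1:
        return "*"
    return ""


print(corrected_alphas())
print(repr(stars(0.02)))  # below .05 but not below the corrected level
```

Under this scheme, a p-value of .02 would count as significant for a single measure but not after correction across the four measures.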
4.2.2 Gaze duration
Different from the findings in FFD, Chinese bilinguals spent more time on GD than Dutch bilinguals (see Table 2). The observed frequency and word length effects in Chinese bilinguals were statistically larger than those in Dutch bilinguals (see Figure 1B). Both groups exhibited a significant interaction between frequency and word length, with Chinese bilinguals showing a statistically smaller decrease in the FE as word length reduced. Language proficiency showed no effect on fixation duration but did affect the FE in both groups. As language proficiency increased, Chinese bilinguals showed a significantly more pronounced decrease in FEs than Dutch bilinguals (see Figure 2), particularly in GD for LF words, whereas Dutch bilinguals showed a greater decrease in GD for HF than LF words. Additionally, Chinese and Dutch bilinguals showed repetition effects of comparable size, both spending more time reading frequently repeated words compared to infrequently repeated ones.
4.2.3 Total reading time
Chinese bilinguals spent more time on TRT than their Dutch counterparts (see Table 2). Frequency and word length in the two groups were negatively and positively correlated with TRTs, respectively, with smaller effect sizes in Dutch bilinguals. There was a significant interaction between frequency and word length in Chinese and Dutch bilinguals. The FE increased with word length and to a greater extent in Dutch bilinguals, especially in LF words (see Figure 1B). The effect of L2 proficiency on fixation duration was not evident in the two groups, as in FFD and GD (see Figure 2). However, it interacted with frequency in Chinese but not in Dutch bilinguals. Highly proficient readers exhibited a smaller FE, mainly because proficiency had a greater influence on the fixation duration of LF words. Both groups spent more time reading the more repeated words and to a similar extent.
4.2.4 Skipping probability
Chinese bilinguals skipped fewer words than Dutch bilinguals, different from findings in the L1 reading (see Table 2). Chinese bilinguals were affected by frequency and word length, not language proficiency, indicating a higher skipping probability for HF or short words. Their word length effect was statistically smaller than that of Dutch bilinguals, but their frequency and language proficiency effects were comparable. Interactions between frequency and word length or between frequency and language proficiency were significant in Chinese bilinguals and differed significantly from those observed in Dutch bilinguals. As word length increased, the larger FE in short words decreased more in Dutch than in Chinese bilinguals (see Figure 1B). As language proficiency increased, the FE decreased in Chinese and increased in Dutch bilinguals, manifesting in higher skipping probability for LF and HF words, respectively (see Figure 2). The word repetition effect was significant and did not differ between the two groups. The more times a word was repeated, the lower its skipping probability.
4.3 Chinese–English bilingual: L1 versus L2
4.3.1 First-fixation duration
The FFD was significantly shorter in L1 than in L2 (see Table S.2 in the Supplementary materials). The FE was significant in L1, whereas the word length effect was not. Both were statistically smaller than those in L2. HF or shorter words were processed faster than LF or longer ones. The FE did not increase with word length in L1 but did in L2, showing a significant difference between the two languages (see Figure S.1 in the Supplementary materials). Both languages exhibited a word repetition effect positively correlated with FFDs, with no significant difference between them. In addition, the results showed that FFD increased with language proficiency.
4.3.2 Gaze duration
Readers spent less time reading in L1 than in L2 (see Table S.2). Frequency and word length effects were observed in both languages, with statistically larger effects in L2. GDs were negatively correlated with frequency and positively correlated with word length. Frequency and word length interacted in both languages. The FE increased with word length and increased significantly more in L2 than in L1 (see Figure S.2 in the Supplementary materials). Language proficiency was negatively correlated with GD, showing that highly proficient readers had shorter GDs than low-proficient ones. Language proficiency also interacted with frequency, with the FE decreasing as proficiency increased. In addition, the repetition effect was significant, with no major differences between the two languages. The higher the number of word occurrences, the longer the GDs.
4.3.3 Total reading time
The TRTs were significantly longer in L2 than in L1 (see Table S.2). Frequency and word length effects were significant in L1, with longer TRTs for LF or long words. These effects were statistically smaller in L1 than in L2. The interaction between frequency and word length was not significant in L1 and differed statistically from that in L2 (see Figure S.2). The FE in L2 increased with word length. Language proficiency affected GD and the FE, with highly proficient readers having shorter GDs and a smaller FE. The repetition effect was significant and positively correlated with TRTs, with no differences between languages.
4.3.4 Skipping probability
The skipping probability in L1 was higher than that in L2 (see Table S.2). The frequency and word length effects were significant in L1, with statistically smaller frequency and larger word length effects than those in L2. The higher the frequency or the shorter the word, the higher the probability of skipping it. Frequency and word length did not interact in L1, which differed significantly from the pattern in L2 (see Figure S.2). The results showed that the reverse FE increased with decreasing word length. The language proficiency effect was significant: readers with higher proficiency had a higher skipping probability. It also interacted with the FE, showing that the reverse FE decreased with increasing language proficiency. The repetition effect was significant, with no difference between L1 and L2. The more often a word was repeated, the lower the probability of skipping it.
5. Discussion
This work compared FEs between bilinguals with the same and different scripts in L1 and L2 reading and between L1 and L2 in Chinese bilinguals. Our three objectives were to examine (a) whether the L1 FEs are similar in size across writing systems, (b) whether L2 FEs differ across readers with distinct L1 writing systems and (c) whether the L2 FE is larger than that of L1 in Chinese–English bilinguals. Language proficiency, known to affect the FE, was also taken into account to ensure that any potential differences in FEs between the two groups were not due to variations in language proficiency. Below, we will discuss the comparative results of FEs in L1 and L2 reading between the Chinese and Dutch groups. Within-group comparisons of Chinese–English bilinguals (L1 vs. L2) will be discussed briefly in the first subsection to avoid repetition. Following that, we will relate these empirical observations to the predictions of the different theoretical accounts of the FE.
5.1. Bilingual L1: Chinese versus Dutch
In contrast to previous studies reporting longer fixation durations for Chinese readers in single-sentence reading (Liversedge et al., Reference Liversedge, Drieghe, Li, Yan, Bai and Hyönä2016; Rayner et al., Reference Rayner, Li, Juhasz and Yan2005; but see Sui et al., Reference Sui, Dirix, Woumans and Duyck2023), this work shows that they read their L1 as quickly as alphabetic language readers and have a much higher skipping probability. That is, Chinese readers have a much higher reading speed than Dutch bilinguals for texts of comparable length in L1. Divergent findings from previous studies may be due to the different nature of the reading material (e.g., single sentences vs. paragraphs; controlled sentences vs. natural sentences; for a discussion, see Sui et al., Reference Sui, Dirix, Woumans and Duyck2023). Here, we used a very natural form of reading, with meaningful, contextualized materials (a book).
Chinese words, although generally much shorter than alphabetic ones, show a reliable length effect in GD, TRT and skipping probabilities. Fixation duration increased with word length, deviating from the U-shaped pattern found in previous research using a lexical decision task (Ferrand et al., Reference Ferrand, New, Brysbaert, Keuleers, Bonin, Méot, Augustinova and Pallier2010; Tsang et al., Reference Tsang, Huang, Lui, Xue, Chan, Wang and Chen2017). The discrepancy may stem from methodological differences, as the limited shared variance between lexical decision task and eye-tracking data (Dirix et al., Reference Dirix, Brysbaert and Duyck2019; Kuperman & Van Dyke, Reference Kuperman and Van Dyke2013) shows that lexical decision differs substantially from the natural reading process. Word length interacts with FEs in all reading measures for Dutch bilinguals but only in FFD and GD for Chinese bilinguals. In Dutch reading, lexical access is prolonged more for LF words than for HF words as word length decreases, and the same logic applies to later word processing stages. In Chinese reading, however, the prolongation effect is limited to the earlier stages of word recognition.
The L1 FEs of alphabet readers appear to be influenced by language proficiency rather than language quantity (monolinguals or bilinguals; Cop et al., Reference Cop, Keuleers, Drieghe and Duyck2015; Diependaele et al., Reference Diependaele, Lemhöfer and Brysbaert2013) or language (Dutch or French; Diependaele et al., Reference Diependaele, Lemhöfer and Brysbaert2013). Based on this logic, readers with similar proficiency levels should exhibit comparable L1 FEs. Yet, Dutch bilinguals exhibited larger FEs than Chinese bilinguals in all reading measures and skipping probabilities in this study (see Figures 1A and 3), inconsistent with certain discussed FE hypotheses, which will be explored further below. In addition, the FEs in the L1 were significantly smaller than those in the L2 across all reading time measures and the skip probability in Chinese–English bilinguals, congruent with previous findings (e.g., Cop et al., Reference Cop, Keuleers, Drieghe and Duyck2015; Mor & Prior, Reference Mor and Prior2020). HF words had shorter fixation times or higher skip probability.
Interestingly, when examining a single reading measure, the impact of language proficiency on FEs was significant in FFD and TRT for Dutch bilinguals and marginally significant in GD for Dutch and Chinese bilinguals. Proficient readers spent less time on LF words than those with lower language proficiency, consistent with Kuperman and Van Dyke's (Reference Kuperman and Van Dyke2013) explanation that readers with less language exposure show a larger FE because they have encountered LF words less often. Yet, when investigating multiple eye-tracking measures, the interaction became non-significant for Dutch and Chinese bilinguals after adjusting the significance level to avoid an increase in false-positive probability (Von der Malsburg & Angele, Reference Von der Malsburg and Angele2017). Nevertheless, language proficiency appears to affect FEs for Dutch bilinguals to some extent, although the effect did not survive the corrected significance threshold.
The smaller FE for Chinese readers in skipping probabilities could be due to a ceiling effect, as their skip rate reaches a surprisingly high .6. The relatively smaller FE in the time measures may have several possible explanations: firstly, certain language-specific factors, namely, character complexity, may affect FE. However, previous research did not find an interaction between character complexity and word frequency, arguing against the assumption (Sui et al., Reference Sui, Woumans, Duyck and Dirixsubmitted). Secondly, the number of words in languages may affect the FE. If Chinese has significantly fewer words than Dutch, Chinese words are likely to occur more often, resulting in a reduced FE. However, the number of commonly used words in Chinese and Dutch (about 56,000 words in Chinese and 54,319 in Dutch; Brysbaert et al., Reference Brysbaert, Keuleers and Mandera2019; Li & Su, Reference Li and Su2022) and the frequency distribution of the analysed data (see Figure S.1) do not differ significantly, collectively arguing against this possibility.
Thirdly, the FE on Chinese reading may be more limited than on Dutch. Given that word frequency interacts with the frequency of its constituent characters in Chinese reading (for a discussion, see Sui et al., Reference Sui, Woumans, Duyck and Dirixsubmitted), its impact on word recognition could be attenuated by character frequency. If, as previous work suggested, HF characters have a greater inhibitory effect on HF words but a greater facilitative effect on LF words, this can explain the small FEs and the limited effect of language proficiency on FE in Chinese reading. However, the interaction of character and word frequencies has primarily been explored in two-character words, leaving unclear its applicability to other word lengths, especially single-character words with high collinearity between them and multi-character words with varying numbers of characters and character frequencies. Further well-designed research is needed to investigate whether the influence of character frequency is responsible for the smaller word FE in Chinese.
Fourthly, the shorter word length constrains the degree of variation in FE. Since FE decreases with word length, which positively correlates with visual complexity, and the fact that Chinese words are generally much shorter than alphabetic language words, it is not surprising that their FEs are smaller and less affected by language proficiency, given the limited variations in FE. However, Chinese characters are composed of strokes, and the visual complexity of a short Chinese word may not necessarily be lower than that of a long alphabetic language word (i.e., number of letters). Hence, whether the effect of word length on FEs is similar in Chinese and Dutch and whether it can explain the smaller FEs in Chinese reading require further verification.
5.2. Bilingual L2: English versus English
In L2 reading, Chinese bilinguals exhibit longer fixation durations and larger FEs across reading time measures than Dutch bilinguals, differing from findings in L1. Yet, their skipping probabilities are lower than those of Dutch bilinguals, and the smaller FE observed in this measure could be explained by a floor effect. As Figures 2 and 3 illustrate, Chinese bilinguals read somewhat slower than their Dutch counterparts, even for HF words. This implies that even when they have similar L2 proficiency and read the same material in the same L2, bilinguals whose languages are from different writing systems are less efficient at visual word processing than those whose languages share one writing system.
One possible explanation for the findings could be the relatively limited exposure of Chinese bilinguals to the alphabetic writing system (i.e., letters and their specific combinations in orthographic structures such as bigrams or trigrams). Indeed, LexTALE scores indicated comparable English proficiency for Dutch and Chinese bilinguals (see Sui et al., Reference Sui, Dirix, Woumans and Duyck2023), and so should their exposure to the L2. Yet, the two languages of Dutch–English bilinguals use the same Latin alphabet and share some underlying orthographic structures, exhibiting more similarities in writing than those of Chinese–English bilinguals. Thus, in this particular context, the reported FE differences between the Dutch and Chinese groups might be explained by the exposure to alphabetic languages but not to English (i.e., L2). The facilitation effect may be greater on LF than on HF words, as the latter may already be approaching the ceiling effect. This possibility could explain longer English reading times overall, even for HF words, and larger FEs in Chinese compared to Dutch bilinguals in both early and late measures.
Another possible explanation is cross-lingual lexical interactions. The languages of bilinguals are well-known to co-activate even in unilingual reading (Brysbaert & Duyck, Reference Brysbaert and Duyck2010; Dijkstra & Van Heuven, Reference Dijkstra and Van Heuven2002; Schwartz & Kroll, Reference Schwartz and Kroll2006). Word recognition in a target language is influenced by non-target language words, explaining cognate and cross-language neighbourhood effects, etc. (e.g., Cop et al., Reference Cop, Dirix, Van assche, Drieghe and Duyck2017b; Dirix et al., Reference Dirix, Cop, Drieghe and Duyck2017; Whitford & Joanisse, Reference Whitford and Joanisse2021). Previous studies have shown that the greater the within- and/or cross-language neighbourhood density, the smaller the L2 FE of the LF words (Dirix et al., Reference Dirix, Cop, Drieghe and Duyck2017; Whitford & Titone, Reference Whitford and Titone2019). Chinese characters, however, differ fundamentally from the Latin alphabet, resulting in a much more limited cross-linguistic effect than that between languages of the same writing system. Thus, Chinese–English bilinguals should have slower reading speeds and a larger FE than bilinguals whose languages share a script, compatible with what we found.
One may argue that the larger FE observed in Chinese bilinguals could be due to a specific language proficiency test or lower language proficiency. Indeed, despite the comparable LexTALE scores between the two groups, Chinese bilinguals scored lower than Dutch bilinguals in the WRAT4 and Lexical decision task and should, therefore, show larger FEs. To investigate this possibility, we employed the same procedures and models, replacing LexTALE scores with WRAT4 and Lexical decision task scores, respectively (see Table S.3 in the Supplementary materials). The results show that L2 FEs for Chinese–English bilinguals remain significantly larger than those for Dutch–English bilinguals, arguing against these possibilities.
Another argument is that different language exposure environments explain the varied L2 FEs between groups. Chinese–English bilinguals in the study may have a greater exposure to academic English. They may know HF words but need to comprehend LF words through context, resulting in a larger FE and longer fixation duration in L2. If so, word frequency would not affect fixation durations beyond a certain frequency level. Additionally, the effect of language proficiency on FE should disappear since it primarily affects LF words. Yet, evidence shows a linear negative correlation between FE and fixation duration, with slower growth at lower frequencies and interactions between language proficiency and FE, arguing against this hypothesis.
The interactions between frequency and language proficiency were observed in GD and TRT in Chinese bilinguals and at the early processing stage in Dutch (i.e., FFD and GD), with significant differences between the groups. The absence of interaction in the very early measure of Chinese bilinguals may be due to the multiple fixations they adopted (Chinese bilinguals fixate more on LF long words than on HF or short words; also see Figure S.3 in the Supplementary materials) or the fact that language proficiency does not affect the earliest stages of word recognition in Chinese bilinguals, such as the sub-lexical orthographic stage. The disproportionate effect of language proficiency on word processing was greater for LF words in Chinese bilinguals, compatible with previous findings (e.g., Mor & Prior, Reference Mor and Prior2020; Whitford & Titone, Reference Whitford and Titone2012).
Unexpectedly, Dutch bilinguals showed different patterns. Language proficiency had a greater impact on the processing of HF words, with negative correlations in reading measures and a positive correlation in skipping probability. Considering trends in L1 and the observed patterns in Chinese bilinguals in L2 are congruent with previous findings (Whitford & Titone, Reference Whitford and Titone2012 in French–English bilinguals; Mor & Prior, Reference Mor and Prior2020 in Hebrew–English bilinguals), we speculate that it is due to the diverse language environment to which bilinguals are exposed. Some highly proficient bilinguals may encounter LF L2 words as frequently as less-proficient ones but have more exposure to HF words. Then, their language proficiency should particularly affect the recognition of HF words, consistent with what we found.
Failing to find the effect of language proficiency on LF words in Dutch bilinguals may also be due to the influence of L1 proficiency, as obtained previously (Whitford & Titone, Reference Whitford and Titone2019). The L1, generally with greater exposure, should have a stronger impact on the L2 frequency than vice versa. Consequently, Dutch–English bilinguals, but not Chinese bilinguals, may exhibit different results in their two languages, as proficiency in distinct languages is unlikely to influence each other (Mor & Prior, Reference Mor and Prior2020). To examine this possibility, we performed an additional analysis, employing the same analysis procedure and models to investigate the effect of L1 proficiency on L2 frequency. Given the high correlation between L1 and L2 proficiencies in Dutch bilinguals (r = .69), we included only L1 proficiency instead of both in the analysis. Results show that L1 proficiency has a greater effect on LF words in GD (β = .000156, SE = .000030, t-value = 5.234; also see Figure S.4 in the Supplementary materials), congruent with Cop et al. (Reference Cop, Keuleers, Drieghe and Duyck2015). In this case, the L1 proficiency of Dutch bilinguals should increase, rather than reduce, the difference in fixation duration of LF words between proficient and less-proficient bilinguals, arguing against this possibility.
Chinese bilinguals exhibited larger word length effects than Dutch bilinguals, possibly due to not being accustomed to reading long words, as English words are usually longer than Chinese and shorter than Dutch words. Word length affects FE more in Dutch than in Chinese bilinguals. Frequency curves shifted upwards with word length, more with word length in Chinese bilinguals and more in the LF ranges in Dutch bilinguals. Note that at the earliest stages of word recognition (i.e., FFD), the FE of Chinese bilinguals decreases as word length increases, especially for HF words. We speculated that this was due to refixations and conducted a further analysis using the same analysis procedure, taking fixation counts as the dependent variable. FE interacted with word length in the Chinese group (β = −.0009618, SE = .0002217, t-value = −4.338), significantly different from that in the Dutch group (β = −.001433, SE = .000352, t-value = −4.07). Chinese bilinguals refixated LF long words more frequently than they did short or HF words, and more frequently than Dutch readers did (see Figure S.3), explaining why the word length effect in LF words was absent in FFD but present in GD and TRT.
5.3. Theoretical discussion
We discussed three theoretical hypotheses explaining FEs in bilingual reading, all formulated for alphabetic language reading. The learning hypothesis indicates that learning becomes progressively smaller with increased word occurrences. The lexical entrenchment hypothesis states that lexical representations strengthen as word occurrences increase. The frequency-ranked hypothesis suggests that words are frequency-ordered in bins, with serial searching beginning with the highest-frequency word. Its extension for bilinguals assumes that bins are either language-specific, with potential different search speeds, or shared across languages.
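To make the frequency-ranked account and its bilingual extension concrete, the toy sketch below treats lexical access as a serial search through a frequency-ordered list, so that the number of comparisons equals a word's rank. It is a deliberately simplified illustration, not an implementation of any published model; the words, counts and the function name `rank_of` are all made up for the example.

```python
def rank_of(word: str, lexicon: dict) -> int:
    """1-based position of `word` when the lexicon is searched from the
    highest-frequency entry downwards (equals the comparisons needed)."""
    ordered = sorted(lexicon, key=lexicon.get, reverse=True)
    return ordered.index(word) + 1


# Hypothetical exposure counts for one bilingual reader.
l1_lexicon = {"huis": 900, "boom": 400, "egel": 20}
l2_lexicon = {"house": 300, "tree": 100, "hedgehog": 5}

# Language-specific bins: each language is searched on its own.
print(rank_of("house", l2_lexicon), rank_of("hedgehog", l2_lexicon))

# Shared bin: both languages are merged into one frequency-ordered list,
# so L2 words with lower exposure fall behind higher-exposure L1 words.
shared = {**l1_lexicon, **l2_lexicon}
print(rank_of("house", shared), rank_of("hedgehog", shared))
```

In this toy case, the gap between the HF and LF L2 words is two ranks with language-specific bins and three ranks with a shared bin, which illustrates why shared bins are expected to enlarge the L2 FE when L2 exposure is lower than L1 exposure.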
These models do not differentiate in their assumptions regarding distinct writing systems. They all predict that the FE becomes smaller with increasing language exposure. That is, bilinguals with greater language exposure should have smaller FEs than those with limited exposure and similar FEs for those with similar exposure or proficiency, regardless of language writing systems and dominance (L1 or L2). In the present study, Chinese bilinguals reported much smaller L1 FEs than Dutch ones in all reading time measures. Their L1 FEs were unaffected by language exposure, while Dutch bilinguals exhibited trends. Interestingly, Chinese bilinguals showed larger L2 FEs despite similar L2 proficiency levels (assessed by English LexTALE), i.e., similar exposure to the L2, between them and Dutch–English bilinguals. These findings cannot be explained by the aforementioned FE hypotheses, challenging their applicability to logographic writing systems.
Another extended frequency-ranked hypothesis suggests language-specific bins for non-alphabetic and alphabetic language pairs and shared bins for alphabetic language pairs. That is, if Dutch–English bilinguals have much less L2 exposure than L1 (with all L2 words ranking behind L1 words), their L1 FEs may be comparable with those of Chinese–English bilinguals; otherwise, they should exhibit a larger L1 FE, consistent with the findings obtained. Yet, this hypothesis fails to explain why the FE of Chinese words is unaffected by language exposure. More importantly, it predicts that Dutch bilinguals should have a larger L2 FE than Chinese bilinguals, contradicting the current findings that Dutch bilinguals showed much smaller, rather than larger, FEs in their L2 than Chinese bilinguals.
Alternative theoretical accounts for the present FE findings are the BIA+ model (Dijkstra & van Heuven, Reference Dijkstra and Van Heuven2002) and the “weaker links” hypothesis (Gollan et al., Reference Gollan, Montoya, Fennema-Notestine and Morris2005). These models suggest that exposure influences the activation speed and strength of links between word forms and lexical representations, respectively. Consequently, bilinguals, who often have less exposure to their L2 than L1, should exhibit larger L2 FEs. Yet, for the moment, they seem unable to explain the larger L2 FEs in Chinese than in Dutch bilinguals despite similar L2 proficiency. Another explanation for FE is the language-competition hypothesis (Diependaele et al., Reference Diependaele, Lemhöfer and Brysbaert2013; also see Dijkstra & Van Heuven, Reference Dijkstra and Van Heuven2002), which suggests that co-activated representations compete for selection across languages. In this case, the FE in L2 may be larger than in L1 as the interference from dominant L1 representations is likely greater than vice versa. Additionally, Chinese–English bilinguals should exhibit smaller FEs than same-alphabet bilinguals due to less cross-language competition between different writing systems. This hypothesis can explain the L1 FE found in this study but fails to explain its L2 findings as well as the differences in FEs within monolinguals and bilinguals of alphabetic languages. For instance, FEs in the same L2 do not vary with the orthographic similarity between the first- and second-alphabetic languages (e.g., Dutch–English vs. German–English; Diependaele et al., Reference Diependaele, Lemhöfer and Brysbaert2013). Clearly, all the existing hypotheses on word FEs fail to account for the current findings.
5.4. The word frequency hypotheses: implementation and limitations
So far, existing FE hypotheses have generally been considered universal across languages. Indeed, the effect has been taken as evidence of the similarity between Chinese and alphabetic writing systems in the underlying processes it involves (e.g., Li et al., Reference Li, Bicknell, Liu, Wei and Rayner2014; but see Sui et al., Reference Sui, Woumans, Duyck and Dirixsubmitted). Yet, this study shows that due to differences in L1 writing systems, the FEs vary between L1s and even between the same L2s, suggesting potential variations in the underlying processes involved in FE between Chinese– and Dutch–English bilinguals.
One key point in explaining the present findings is the word component effect (e.g., characters or letters), as the constituents of a word inevitably affect word recognition and FEs. With this extended assumption, the learning and lexical entrenchment accounts, but not the rank hypothesis, could explain the findings. That is, word frequency may affect activation thresholds or baselines, or the entrenchment of lexical representations. In Chinese, character frequency moderates the word FE. HF characters may facilitate the recognition of LF words and cause interference with HF words (Sui et al., Reference Sui, Woumans, Duyck and Dirixsubmitted). Since Chinese characters often appear as single-character words with word frequency, they may only provide additional activation rather than having a similar influence as words.
In contrast, languages within the same writing system share word components (e.g., bigrams, trigrams), which affect word processing (e.g., Kuperman et al., 2008; New & Grainger, 2011). These components may have facilitative effects similar to those of word frequency or provide extra activation for accessing target words. They should affect both L1 and L2 FEs, with a more prominent influence on LF words and a limited one on HF words, which are already close to the threshold. Thus, since Chinese characters have not only facilitative but also interfering effects, the L1 word FE in Chinese reading is expected to be smaller than in Dutch. Additionally, Chinese–English bilinguals, lacking components that are morphologically shared across their languages, are expected to exhibit longer overall fixation durations and larger FEs in L2 than Dutch–English bilinguals, who have greater exposure to language-shared word components. This would explain the variation in L2 FEs with L1 writing systems.
To date, there are no (computational) models of reading that simulate reading across languages with different scripts, or the interactions between those languages. Dominant models of bilingualism such as BIA+ (Dijkstra & Van Heuven, 2002) have been validated almost exclusively for same-script bilingualism. At present, it is unclear to what extent BIA+'s assumption about cross-language similarity holds for different-script bilingualism. At the very least, our observed differences between languages suggest that processing differs across such languages in various respects. Future research could examine this with more diverse language pairs.
In addition, word frequency typically explains about 30–40% of the variance in reaction times in isolated word recognition, both in Chinese (e.g., Tsang et al., 2017; Tse et al., 2017) and in different alphabetic scripts (e.g., Ferrand et al., 2010, 2018; Keuleers et al., 2010a). This is inconsistent with the smaller FE observed in the current study for Chinese than for Dutch reading. The discrepancy could be explained by the use of different methodologies (also see Dirix et al., 2019; Kuperman et al., 2013). Word-level variables are the only factors that affect reading times in isolated word recognition, whereas word recognition in context is also affected by top-down factors such as contextual content. Consequently, the FE in contextual reading should be smaller than in isolated word recognition (Dirix et al., 2019). Future research should take the use of different research methods into account, depending on the research purpose. Future studies could also examine whether effects observed in context-free conditions generalize across paradigms.
6. Conclusion
This work examined the word FEs of Chinese and Dutch bilinguals in L1 and L2 reading. It showed that, even after considering language proficiency, Chinese bilinguals still have much smaller FEs in L1 reading and larger FEs in L2 reading than Dutch bilinguals. These results further confirm that the underlying processes that give rise to word FEs are indeed different in Chinese and alphabetic languages. Furthermore, they indicate that the L1 writing system affects L2 reading but that some phenomena are constant. The results of this study fill an important gap in the empirical evidence on bilingual natural reading of alphabetic and non-alphabetic languages.
Supplementary material
The supplementary material for this article can be found at https://doi.org/10.1017/S136672892400035X.
Data availability statement
Materials and data are from GECO (Dutch–English bilinguals; Cop et al., 2017a) and GECO-CN (Chinese–English bilinguals; Sui et al., 2023), while the analysis code for this study is available at: https://osf.io/bg7nd/.
Acknowledgements
We would like to thank Professor Titus von der Malsburg for his valuable suggestions on appropriately correcting the significance level, and two anonymous reviewers for their valuable input on earlier versions of this paper.
Competing interests
None.