Introduction
When reading sentences, information from different aspects of words becomes available during lexical access. Critically, sound-related properties of orthographic patterns are activated automatically during readers’ lexical processing, even when reading silently. Much of the theoretical debate about this has been driven by cross-language comparative research, especially between the reading of logographic scripts like Chinese and alphabetic scripts like English and German. According to the dual route model of word reading (e.g., Coltheart, Rastle, Perry, Langdon & Ziegler, 2001), semantics can be accessed either directly from orthography or indirectly via phonological mediation. While phonological mediation is often found in English (Van Orden, 1987), Chinese is well-known for its optimization of fast and direct access to meaning (Hoosain, 1991; Yan & Kliegl, 2023), as evidenced by early foveal (Chen & Shu, 2001; Zhou & Marslen-Wilson, 1999, 2000) and parafoveal lexical processing of semantics (Yan, Richter, Shu & Kliegl, 2009; Yan, Zhou, Shu & Kliegl, 2012) bypassing the mediation of phonology. As such, studies of Chinese reading are important not only to document language-specific aspects of reading but also to establish universal reading principles. The present study tested late Cantonese–Mandarin bilinguals (see Footnote 1) who were native to Cantonese and spoke Mandarin as their second language (L2) in order to investigate how the two phonological representation systems contributed to lexical access during their silent reading of Chinese sentences. In the following, we first review prior work on phonological activation during reading that employed the error disruption paradigm and the eye-tracking technique. We then focus on cross-language studies, which shed light on the mechanism of mapping multiple phonological representations onto one written form in the lexicon. Finally, we elaborate on the characteristics of the two (spoken) languages involved, Cantonese and Mandarin.
The error disruption paradigm and eye-tracking experiments
One piece of evidence for the importance of phonology in sentence-reading comprehension comes from the error disruption paradigm (Doctor & Coltheart, 1980). In their study, the participants were presented with sentences containing certain typographic errors and were instructed to read silently and judge the meaningfulness of these sentences. All of the sentences should have been rated as meaningless due to the errors; the participants, however, showed a higher likelihood of falsely accepting the sentences with homophonic errors as meaningful (e.g., “He ran threw the street” for “He ran through the street”) than those containing unrecoverable words (e.g., “He ran sew the street”). This homophone recovery effect has been replicated in later studies (e.g., Coltheart, Avons & Trollope, 1990; Treiman, Freyd & Baron, 1983), suggesting that alphabetic readers achieve lexical access via phonological decoding.
Phonological activation has also been explored in a sentence-reading comprehension task with readers’ eye movements recorded. Eye-tracking allows measurement of reading in a relatively natural scenario and provides psychologists with a powerful tool to understand implicit cognitive processing at high temporal and spatial resolutions. There is consistent evidence that not only phonemes but also detailed articulation-specific sub- and supra-phonemic features are used early during visual recognition of English words (Ashby & Clifton, 2005; Ashby, Treiman, Kessler & Rayner, 2006). A combination of the eye-tracking technique and the error disruption paradigm provides more fine-grained measurements of lexical processing at the individual word level. The rationale for this paradigm is that erroneous substitutions that preserve critical linguistic features, and thus allow readers to recover the intended word, should be less disruptive to reading than non-recoverable substitutions. Specifically, as shown in previous studies, a longer fixation duration on a word indicates a greater processing effort and more difficulty in recovery (Inhoff & Topolski, 1992; Jared, Ashby, Agauas & Levy, 2016). These experiments provide evidence that phonology plays an important role in lexical activation during silent sentence reading (Daneman & Reingold, 1993; Rayner, Pollatsek & Binder, 1998).
Notably, the error disruption paradigm has revealed that the role of phonological decoding in lexical access varies as a function of reading skill. Doctor and Coltheart (1980) showed that the false-acceptance rate of sentences containing homophonic errors decreased as readers’ ages increased. This finding converges with other evidence suggesting that beginning readers are more likely to rely on phonological information than more skilled and advanced readers, who, in contrast, rely on a more direct and orthography-based procedure (Ehri, 1992; Frith, 1985; Harm & Seidenberg, 2004; Seymour, 1997). Similar eye-tracking evidence from the error disruption paradigm has been reported in Chinese (Zhou, Shu, Miller & Yan, 2018). Chinese children showed a recovery effect during their fixations on pre-target words caused by homophone targets, whereas this effect did not emerge in adults until they had accomplished lexical access, and it appeared only on post-target words. The results of their study, therefore, suggest that phonological decoding in lexical access is mediated by reading skill, even among readers of Chinese, a writing system in which the spelling-sound correspondence is rather opaque.
As children learn to read, they start learning to associate written characters/words with their oral vocabularies (Harm & Seidenberg, 2004). An interesting question, then, is how phonological routes would function if more than one spoken system were involved. Would the predominant spoken system always be activated because it is used more often, or would the specific phonological system activated be situation-dependent? Before elaborating on this question, we review below cross-language evidence of phonological activation in the specific situation of processing cognates, where words have common meanings and forms in two languages.
Cross-language phonological activation in bilinguals
Many studies on cross-language phonological activation have focused on bilinguals’ lexical access of translation equivalents (i.e., cognates) that share meaning and form properties. Evidence from alphabetic languages has revealed a priming effect for cognates during the processing of word lists, even when the two languages are cross-scripted (English–Hebrew: Gollan, Forster & Frost, 1997; Korean–English: Kim & Davis, 2003; Japanese–English: Nakayama, Sears, Hino & Lupker, 2012), suggesting that lexical phonology is cross-linguistically integrated and represented for bilinguals (Dijkstra & Van Heuven, 2002; Dijkstra, Wahl, Buytenhuijs, Van Halem, Al-Jibouri, De Korte & Rekké, 2019). Specifically, Nakayama, Verdonschot, Sears, and Lupker (2014) highlighted the influence of phonological similarity between L1 (Japanese) primes and L2 (English) targets with Japanese–English cognates. They found that phonologically similar cognates were responded to more rapidly than dissimilar ones.
However, for Chinese, a logographic script, the evidence on the role of phonology in cognate processing is less clear and less consistent. On the one hand, for late Chinese–English bilinguals, Chinese words can phonologically prime English targets that are similar in pronunciation (e.g., Zhou, Chen, Yang & Dunlap, 2010). On the other hand, priming effects have been found not to differ between phonologically similar and dissimilar Chinese–Japanese cognate word pairs among late bilinguals, suggesting little phonological facilitation (Liu, Lupker & Nakayama, 2022; Liu, Wanner-Kawahara & Nakayama, 2019). These results may hint at a late role of phonology in Chinese lexical access.
The Chinese language and phonological activation
Chinese is known for its logographic nature. The basic writing units, characters, are disconnected square-shaped units occupying the same horizontal and vertical extents irrespective of visual complexity. Importantly, one character usually maps to one morpheme when combining with other character(s) to form a word or a phrase. Unlike in alphabetic languages, a character’s pronunciation, which is monosyllabic with a lexical tone, cannot be obtained transparently from its visual form. Visually similar Chinese characters can have fundamentally different pronunciations.
Chinese varies in both its written and spoken forms, as it is a language with a long history that has developed differently in different areas. Relevant to this study, in Macau, Chinese characters are written in accordance with Traditional Chinese (as opposed to Simplified Chinese, which is used in Mainland China). While the majority of the population in Macau speaks Cantonese, the number of Mandarin speakers is increasing. Although Cantonese and Mandarin share a largely common vocabulary, they are mutually unintelligible in their spoken forms, because the characters have different pronunciations in the two languages. As such, bilingualism in Cantonese and Mandarin utilizes one script in writing and two phonological systems in speaking. At the same time, a large number of words in Cantonese and Mandarin, although differing in the degree of phonological similarity, share common meanings, orthographies and even syntactic functions. In this sense, these words can be considered cognates from the bilingual perspective, just like Chinese–Japanese cognates.
It should be noted that, despite the similar phonological hierarchy of Cantonese and Mandarin, there are nine lexical tone categories and 700 meaningful syllables in Cantonese while there are five tone categories and 400 meaningful syllables in Mandarin (Tsou, 1976). Given that there are over 50,000 Chinese characters in total, with about 8,000 commonly used ones (Shen & Bear, 2000), there are many homophones in both languages. Importantly, unlike English words, Chinese homophones can be visually dissimilar. For instance, 施氏食獅史, a group of five visually distinct Mandarin homophones with an identical pronunciation of /shi/, means “the story of Mr. Shi eating lions”. Similarly in Cantonese, 余與汝遇於雨, with all characters pronounced as /jyu/, translates as “I encountered you on a rainy day”. This high homophone density offers a unique opportunity to explore phonological processing independent of orthographic overlap. An interesting phenomenon to note is that a pair of homophones in Mandarin can be either homophonic or non-homophonic in Cantonese and vice versa (Chu & Taft, 2010), which allows an orthogonal manipulation of homophony in the two languages using a within-item design. For instance, the character 習 (Mandarin: /xi2/ and Cantonese: /zaap6/) has a Mandarin-only homophone 席 (Mandarin: /xi2/ and Cantonese: /zik6/) and a Cantonese-only homophone 雜 (Mandarin: /za2/ and Cantonese: /zaap6/). In the present study, we made use of this phenomenon to test cross-language phonological processing independent of visual similarity among native Cantonese readers who spoke Mandarin as L2.
Studies of Chinese reading suggest that phonological activation occurs during lexical access. Isolated character/word recognition involves activation of phonological properties (Tan & Perfetti, 1997, 1999), although such homophonic priming effects have been demonstrated mainly under long stimulus onset asynchrony (SOA) and thus may hint at a relatively late phonological process in Chinese (Chen & Shu, 2001; Zhou & Marslen-Wilson, 1999, 2000; Zhou, Marslen-Wilson, Taft & Shu, 1999). Considering the unique properties of Chinese, the question also arises about the degree to which detailed phonological features are activated during visual word recognition. For instance, variety-specific tonal characteristics can affect processing during silent reading in Chinese, leading to shorter viewing durations and fewer refixations on neutral-tone words than on full-tone words (Yan, Luo & Inhoff, 2014). In a later study, Luo, Yan, Yan, Zhou and Inhoff (2016) further recorded electrophysiological activity and showed that, in comparison to full-tone words, neutral-tone words elicited smaller N100 (i.e., a negative-going potential that peaks around 100 ms after stimulus onset) and anterior N250 amplitudes and a larger N400 amplitude. Testing a different tone-change phenomenon in Chinese with native Mandarin speakers, Pan, Zhang, Huang, and Yan (2021a) reported that sandhi-tone target words elicited longer viewing durations than base-tone target words when the words were infrequent, suggesting a more direct lexical access route for frequent words and a more phonology-based route for infrequent ones.
Previous studies of phonological activation during the silent reading of Chinese sentences focused almost exclusively on Mandarin, with the majority using the Simplified Chinese script. Therefore, little is known about the role of phonology in Cantonese written in Traditional Chinese script. The most relevant study was conducted by Lam, Perfetti, and Bell (1991). They measured Cantonese–Mandarin bilinguals’ reaction times in a homophone judgment task. Four types of critical word pairs were presented: homophones in both languages, in Cantonese only, in Mandarin only, and in neither language. They found that the participants were slower in responding to word pairs that differed in their homophone status in either language. Although the study made an important step towards understanding the effect of L2 phonology on L1 lexical access, several methodological limitations should be considered. First, the results were based on small samples of participants and items, when evaluated according to the current standard. There were only 16 native Cantonese participants and only 30 items for each of the four experimental conditions. Second, a between-item experimental design was adopted, in which each condition had a different word list. Consequently, the study suffered from possibly uncontrolled confounding factors, reducing its reliability. Third, the homophone judgment task was rather explicit, encouraging readers’ effortful activation of phonological representations. Fourth, their critical comparison of Mandarin homophone pairs that were either homophonic or non-homophonic in Cantonese was based on “yes” responses versus “no” responses, because the participants had to make different responses to these two groups of word pairs when judging Cantonese pronunciation. Finally, the readers’ reaction times in the task were quite long. Therefore, the results only hinted at a late processing stage of phonology in lexical access. Nevertheless, this study provided us with a direction for studying cross-language phonological activation with homophones in Cantonese and Mandarin.
The role of phonology during Chinese sentence reading has also been examined using the error disruption paradigm. A study by Wong and Chen (1999) is another rare example that focused on Cantonese phonology. They manipulated the type of erroneous first character within a two-character target word (i.e., visually similar, homophonic, and unrelated substitution characters) and found a recovery effect from the visually similar substitutions in first-fixation duration (FFD; duration of the first fixation on a word irrespective of the number of fixations) and gaze duration (GD; the cumulative duration of all fixations during the first-pass reading of the word). However, no homophone recovery effect was found in either of these two fixation measures. Experimental effects that emerge in FFD are assumed to take place at an earlier temporal stage than those that appear only in GD, when a target word is refixated. Likewise, effects shown only in second-pass reading measures such as total reading time (TRT, sum of all fixations on a word, including regressive fixations) reflect a late processing stage (Inhoff, 1984; Inhoff & Radach, 1998). In this sense, the results from Wong and Chen (1999) agree with previous studies of Chinese isolated word recognition and suggest that phonological activation may show up late. Two recent eye-tracking experiments focusing on Mandarin homophones (Pan, Laubrock & Yan, 2021b; Pan, Yan, Laubrock & Shu, 2019) did not find evidence for early homophone recovery in FFD or GD, either. Such an effect only emerged in TRT, supporting the view of late phonological activation in Chinese.
Perhaps the disparity in the roles of phonology in Chinese and English has been best captured by Feng, Miller, Shu, and Zhang (2001), in a cross-language study. They compared how skilled English and Chinese readers rely on word shape and phonology for lexical recovery during silent sentence reading. Their English readers showed an early phonological recovery effect, whereas the Chinese readers only had a late effect.
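The three viewing-duration measures defined above (FFD, GD and TRT) can be made concrete with a small computation. The following is a minimal sketch, not the procedure used in any of the cited studies: it assumes a simplified, hypothetical data format in which each fixation is a (word index, duration in ms) pair listed in chronological order, and the function name `reading_measures` is introduced here for illustration only.

```python
from typing import Dict, List, Optional, Tuple

# Hypothetical format: (word_index, duration_ms), in chronological order.
Fixation = Tuple[int, int]


def reading_measures(fixations: List[Fixation], target: int) -> Dict[str, Optional[int]]:
    """Compute FFD, GD and TRT for the word at index `target`."""
    ffd: Optional[int] = None   # first-fixation duration
    gd = 0                      # gaze duration (first-pass fixations only)
    trt = 0                     # total reading time (all fixations on the word)
    entered = False             # has the target been fixated at all?
    first_pass_open = True      # False once first-pass reading is over

    for word, dur in fixations:
        if word == target:
            trt += dur
            if first_pass_open:
                if not entered:
                    ffd = dur
                    entered = True
                gd += dur
        elif entered or word > target:
            # Leaving the target, or passing it without fixating it,
            # ends the first pass; later fixations count only for TRT.
            first_pass_open = False

    return {"FFD": ffd, "GD": gd if entered and ffd is not None else None, "TRT": trt}


# Two first-pass fixations on word 3, then a regression back to it.
example = [(1, 200), (2, 220), (3, 250), (3, 180), (4, 210), (3, 300), (5, 190)]
measures = reading_measures(example, target=3)
```

In this example the homophone recovery effects discussed above would be "early" if they appeared in the FFD or GD values and "late" if they appeared only in TRT, which additionally includes the 300 ms regressive fixation.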
The present study
As reviewed above, the identical written form of Chinese is mapped to several very different spoken systems, of which the most widely used are Cantonese and Mandarin. Research on Cantonese–Mandarin homophones offers a unique opportunity to understand phonological representation among bilingual readers, free of confounding caused by script familiarity, because the same written forms of target words and sentences are used in both. We aimed to incorporate the research ideas reviewed above to explore how L1 and L2 phonological knowledge is used during late Cantonese–Mandarin bilinguals’ silent reading of Chinese sentences. To study readers’ implicit phonological activation, we adopted a natural reading comprehension task with the error disruption paradigm, in which no explicit response to the errors was required. To activate their specific phonological representations for L1 or L2 processing, our participants were required to read aloud a paragraph in either Cantonese or Mandarin (i.e., the primed language) before they read the experimental sentences for eye-movement recording. In addition, a within-item design was chosen, in which each target word was paired with substitutions under different conditions, to achieve better experimental control. The manipulation of dual-language and single-language homophones allowed us to examine whether, for bilinguals, lexical access of a word is facilitated by strengthened phonological cues from both languages. Finally, for more reliable results, we used larger samples of participants and items than were used in previous related studies.
Our predictions were as follows. First, based on previous studies (Pan et al., 2019, 2021b; Wong & Chen, 1999), we hypothesized that phonological information is processed at a relatively late stage in Chinese word recognition. Therefore, we expected an overall late homophone recovery effect in Chinese. Second, we hypothesized that different prime languages would activate different language modes, leading to different reliance on phonological decoding for lexical access. As a rule of thumb, skilled Chinese readers are known to have more direct lexical access than less-skilled readers. Therefore, since our participants were late bilinguals living in an L1-dominant environment, after being primed for L1 and as skilled readers of Cantonese, they were expected to show relatively less phonology-based recovery. In contrast, when primed for L2, our participants were expected to behave as less-skilled readers of Mandarin and thus to rely more on phonological decoding. As a result, we anticipated that L2 phonological activation would be more likely to appear when the readers were primed for their L2 mode, resulting in an overall stronger phonological activation when primed in L2 than in L1. Note that the phonological activation during the L2 mode likely involves both L1 and L2 representations (Oppenheim, Wu & Thierry, 2018).
Method
Participants
Sixty-five participants, with a mean age of 20.9 years (SD = 2.7, 40 females), were tested in the eye-tracking experiment. To ensure their language dominance, we carefully chose local students who had undertaken their education in Macau (where Cantonese is the official and the most-used language) since primary school. Two independent samples, of 30 and 40 participants, were recruited for norming studies for target-word predictability and plausibility, respectively. All participants were university students with normal or corrected-to-normal vision and were native Chinese readers of traditional characters and Cantonese speakers. All experimental procedures were reviewed and approved by the Human Research Ethics Committee of the Education University of Hong Kong (No.2017-2018-0195) and approved by the Ethics Committee of the Department of Psychology, University of Macau (SONA-2020-05). The participants gave their written informed consent prior to the experiment, which conformed to the tenets of the Declaration of Helsinki.
The participants filled out a brief adapted version of the language-history questionnaire created by Li, Sepanski, and Zhao (2006). All participants were born into native Cantonese-speaking families. They all indicated that they spoke Cantonese with their mothers and all but two with their fathers. All participants used Cantonese as their daily communication language and therefore were not asked to report their L1 language proficiency. The participants reported late acquisition of Mandarin (M age of acquisition = 6.6, SD = 2.6) and had learned it formally for an average of 14.3 years (SD = 3.3). Their self-evaluations of their Mandarin language skills indicated high proficiency in reading (M = 5.7, SD = 1.1), writing (M = 5.4, SD = 1.2), oral communication (M = 5.0, SD = 1.4) and listening (M = 5.3, SD = 1.3), all rated on 7-point scales.
Design and materials
We adopted a 2 × 5 factorial within-subject and within-item design. The first factor was language mode. We collected the participants’ eye movements in two testing sessions; each session started with their reading aloud a short passage in Cantonese or Mandarin to activate the respective phonological mode. The second factor was substitution type. Each target character was paired with three different homophonic characters and an unrelated one. The three homophone conditions were bilingual homophone (C+M+), Cantonese-only homophone (C+M-) and Mandarin-only homophone (C-M+). In the identical (no substitution) condition, the participants saw the correct target character itself, and in the baseline condition they saw an unrelated character (C-M-). Therefore, 10 different reading lists were created and each participant silently read two of them containing two different sets of sentences, with one list in a pre-activated language mode of Cantonese and the other one in Mandarin.
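One way such counterbalanced lists could be built is with a Latin-square rotation. The sketch below is illustrative only and is not taken from the study materials: it assumes that the 10 lists result from crossing two sentence-frame sets (labelled "A" and "B" here, hypothetically) with five rotations of the substitution conditions over 75 item sets. The function name `build_lists` and the list keys are likewise hypothetical.

```python
from collections import Counter

CONDITIONS = ["identical", "C+M+", "C+M-", "C-M+", "C-M-"]
N_ITEM_SETS = 75  # quintuplets of critical characters reported in the text


def build_lists(n_item_sets: int = N_ITEM_SETS):
    """Return a dict mapping (frame_set, rotation) to a list of trials.

    Each trial is (item_set, frame_set, condition). Rotating the condition
    index by item ensures that, across the five rotations, every item set
    appears once in every substitution condition (within-item design).
    ASSUMPTION: 2 frame sets x 5 rotations = 10 lists.
    """
    lists = {}
    for frame_set in ("A", "B"):
        for rotation in range(len(CONDITIONS)):
            lists[(frame_set, rotation)] = [
                (item, frame_set, CONDITIONS[(item + rotation) % len(CONDITIONS)])
                for item in range(n_item_sets)
            ]
    return lists


all_lists = build_lists()

# A participant would then read one "A" list in one language mode and
# one "B" list in the other, so that no sentence frame is repeated.
per_condition = Counter(cond for _, _, cond in all_lists[("A", 0)])
```

Under these assumptions every list contains 75 sentences with 15 sentences per substitution condition, and a participant who reads one list from each frame set sees 150 different sentences in total.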
We selected 75 quintuplets of critical characters for the identical, C+M+, C+M-, C-M+ and C-M- substitutions. The critical characters were embedded in the position of the first character in two-character target words. Therefore, only the correct target character formed a real word with the following character. The substitution characters were matched strictly for frequency [F(4, 296) = 1.304, p = .269; RIH-CUHK, 2001] and number of strokes [F(4, 296) = .412, p = .800; Table 1]. For each set of the critical characters, two target words and two different sentence frames were constructed, resulting in a total of 150 experimental sentences. Pre-target and target word regions, which were always two characters in length, were never among the first or last three words in the sentences. The target-preceding sentence frames, including the pre-target words, were constructed to be non-predictive for different types of substitution characters, in order to minimize top-down processing. In the cloze test for predictability, each participant was presented with half of the sentence frames up to the pre-target words and asked to complete the sentences. As expected, the non-identical substitution characters were equally unpredictable [F(3, 447) = 1.60, p = .189]. In addition, we conducted a plausibility rating using a 5-point Likert scale (1 = not plausible at all and 5 = highly plausible). The participants were presented with sentence frames up to and including the substitutions and were asked to rate how plausibly the sentences could be completed meaningfully. Plausibility did not differ significantly among the non-identical substitutions [F(3, 447) = 1.249, p = .291].
Table 1. An example set of critical characters, with their pronunciations in Cantonese provided in Jyutping (formally known as the Linguistic Society of Hong Kong Cantonese Romanization Scheme, a Romanization system for Cantonese) and pronunciations in Mandarin provided in Pinyin. See the example sentence in Figure 1, in which the example substitution characters were embedded. Means (and standard deviations in parentheses) of log-transformed character frequency (number of occurrences per million), number of strokes (count), plausibility rating (5-point scale) and predictability (percentage) of the substitution are shown.
Apparatus
The participants’ eye movements were recorded with an Eyelink Desktop system running at a sampling rate of 1000 Hz. Each sentence was presented in a single line on a 24-inch Dell E2416H monitor (resolution: 1920 x 1080 pixels; frame rate: 60 Hz) using the Song font. The participants were seated 65 cm from the monitor and were tested individually with their heads placed on a chin-and-forehead rest. Each character subtended 0.9° of visual angle. All recordings and calibrations were done monocularly based on the right eye; viewing was binocular.
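The relation between the reported visual angle and the physical character size follows from basic trigonometry: an object of size s viewed from distance d subtends an angle of 2·atan(s / 2d). The sketch below applies this to the values given above (0.9° at 65 cm). The conversion to pixels is an assumption-laden illustration: it treats the display as having a nominal 24-inch diagonal with a 16:9 aspect ratio, which may differ from the actual visible area of the monitor used; the function names are hypothetical.

```python
import math


def size_from_visual_angle(angle_deg: float, distance_cm: float) -> float:
    """Physical size (cm) of an object subtending `angle_deg` at `distance_cm`."""
    return 2.0 * distance_cm * math.tan(math.radians(angle_deg) / 2.0)


def screen_width_cm(diagonal_inch: float, aspect_w: int = 16, aspect_h: int = 9) -> float:
    """Screen width in cm from the diagonal and aspect ratio."""
    diagonal_cm = diagonal_inch * 2.54
    return diagonal_cm * aspect_w / math.hypot(aspect_w, aspect_h)


# Values reported in the text: 0.9 degrees per character at 65 cm.
char_cm = size_from_visual_angle(0.9, 65.0)

# ASSUMPTION: nominal 24-inch diagonal, 16:9, 1920 horizontal pixels.
pixels_per_cm = 1920 / screen_width_cm(24.0)
char_px = char_cm * pixels_per_cm
```

With these inputs each character is about 1.02 cm wide, which corresponds to roughly 37 pixels under the nominal-diagonal assumption.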
Procedure
The experiment was completed in two testing sessions. The participants were first instructed to read a short passage aloud, in either Cantonese or Mandarin, to activate their respective language modes, after which their eye movements during sentence reading were collected. The second session followed the same procedure and tested the other language mode. The order of the two sessions was counterbalanced across the participants.
Before eye-movement data collection started, the participant's gaze position was calibrated with a 9-point grid (maximum errors < 0.5°). Prior to each sentence, an additional calibration was performed if a participant's gaze was not detected on the initial fixation-point. Fixation on the initial fixation-point initiated presentation of the next sentence, with its first character occupying the fixation-point. The participants were instructed to read the sentences silently for comprehension, then fixate on a point in the lower right corner of the monitor, and finally press a keyboard button to signal completion of a trial. We used a silent sentence-reading comprehension task to test implicit phonological activation. The participants were also told that there might be typographical errors in the sentences and that they should try to ignore them and understand the sentence meaning. They received 12 practice trials before reading the experimental sentences. We randomly selected 48 experimental sentences (32% of all sentences), each to be followed by an easy yes-no comprehension question, to encourage the participants’ engagement with the reading task. Data from three participants with accuracy lower than 70% were discarded from the analysis. The remaining 62 participants, on average, answered 85.4% of the questions correctly (SD = 4.9% and range: 75% to 95%).
Data analysis
Fixations were determined with an algorithm for saccade detection (Engbert & Kliegl, 2003). For fixation-duration analyses, we screened our data at several levels, as described below. Overall, 450 (4.8%) trials were removed due to participants’ blinks, coughing or body movements during reading, or to tracker errors. In total, 516 target words (6.7% of all fixated target words) with FFDs shorter than 60 ms or longer than 800 ms, or GDs longer than 1000 ms, or TRTs longer than 1600 ms were removed. Additionally, using an a priori criterion (Briihl & Inhoff, 1995), 325 target words (4.2% of all fixated target words) with regressions from them were discarded because they may reflect incomplete lexical processing. The remaining 4899 observations were distributed largely evenly across the conditions.
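The duration-based and regression-based exclusion criteria described above can be expressed as a simple filter. This is a minimal sketch using the cut-offs stated in the text; the record format (a dict with the keys `ffd`, `gd`, `trt` and `regressed_out`) and the function name `passes_screening` are hypothetical and are not taken from the original analysis scripts.

```python
FFD_MIN_MS, FFD_MAX_MS = 60, 800   # first-fixation duration bounds
GD_MAX_MS = 1000                   # gaze duration upper bound
TRT_MAX_MS = 1600                  # total reading time upper bound


def passes_screening(obs: dict) -> bool:
    """Return True if a fixated target word survives all screening steps."""
    if obs["ffd"] < FFD_MIN_MS or obs["ffd"] > FFD_MAX_MS:
        return False
    if obs["gd"] > GD_MAX_MS or obs["trt"] > TRT_MAX_MS:
        return False
    if obs["regressed_out"]:
        # Regressions out of the target may reflect incomplete processing.
        return False
    return True


# Hypothetical observations for illustration only.
observations = [
    {"ffd": 240, "gd": 410, "trt": 650, "regressed_out": False},   # kept
    {"ffd": 45, "gd": 300, "trt": 300, "regressed_out": False},    # FFD too short
    {"ffd": 260, "gd": 1100, "trt": 1300, "regressed_out": False}, # GD too long
    {"ffd": 230, "gd": 230, "trt": 500, "regressed_out": True},    # regression out
]
kept = [obs for obs in observations if passes_screening(obs)]
```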
Estimates were based on (generalized) linear mixed models (GLMMs/LMMs) using the lme4 package (Version 1.1-23; Bates, Maechler, Bolker & Walker, 2015) in the R environment (Version 3.6.3; R Development Core Team, 2020). The dependent variables were the viewing duration measures explained earlier for LMMs, as well as skipping probability (SP, the probability of a word not being fixated on during first-pass reading) and refixation probability (RP, the probability of a word receiving multiple fixations during first-pass reading) for GLMMs. Language mode, substitution type, and their interactions were the fixed effects (i.e., independent variables) in the (G)LMMs. We specified a sum contrast for language mode and a treatment contrast with the unrelated condition as the reference baseline for substitution type. The first level of the treatment contrast was between the no-substitution condition and the unrelated condition and indicated an effect of word legality. Analogously, the other three levels of the contrast, between the three homophone substitution conditions and the unrelated condition, reflected effects of bilingual, Cantonese and Mandarin homophony, respectively. We report parsimonious LMMs to ensure successful convergence (Bates, Kliegl, Vasishth & Baayen, 2015; Matuschek, Kliegl, Vasishth, Baayen & Bates, 2017). Additionally, we calculated p-values using the lmerTest package (Version 3.1-2; Kuznetsova, Brockhoff & Christensen, 2017). The viewing duration measures were log-transformed in the LMMs (Kliegl, Masson & Richter, 2010). Analyses for untransformed and log-transformed durations yielded the same patterns of significance.
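The contrast specification described above determines how each design cell enters the fixed-effects part of the model. The sketch below spells out that coding for illustration; it is not the authors' R code. It assumes a +1/−1 sum coding for language mode (the form produced by R's `contr.sum` for a two-level factor), and the assignment of +1 to Cantonese is an arbitrary assumption, as are the function names.

```python
# Reference (baseline) level first, then the four levels compared against it.
SUBSTITUTION_LEVELS = ["C-M-", "identical", "C+M+", "C+M-", "C-M+"]

# ASSUMPTION: +1/-1 sum coding; which mode receives +1 is not stated in the text.
MODE_CODES = {"Cantonese": 1, "Mandarin": -1}


def treatment_code(level: str) -> list:
    """Four 0/1 indicators, one per non-reference substitution level."""
    if level not in SUBSTITUTION_LEVELS:
        raise ValueError(f"unknown substitution level: {level}")
    return [1 if level == other else 0 for other in SUBSTITUTION_LEVELS[1:]]


def design_row(mode: str, level: str) -> list:
    """Fixed-effects row: intercept, mode, 4 substitution contrasts, 4 interactions."""
    m = MODE_CODES[mode]
    t = treatment_code(level)
    return [1, m] + t + [m * x for x in t]


baseline_row = design_row("Cantonese", "C-M-")
bilingual_l2_row = design_row("Mandarin", "C+M+")
```

Because the mode codes sum to zero, each substitution coefficient in such a model compares a condition with the unrelated baseline at the midpoint of the two language modes, while the interaction coefficients capture how that difference changes between modes. The 10 columns correspond to the 10 cells of the 2 × 5 design.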
Results
Overall, the readers skipped the target regions more often (b = 0.368, SE = 0.100, z = 3.685, p < 0.001) and refixated on them less often (b = −0.964, SE = 0.112, z = −8.619, p < 0.001) in the no-substitution condition than in the baseline condition. There were no statistically significant differences between the two reading modes, or between the homophone substitution conditions and the unrelated condition, in skipping or refixation probabilities (p-values > 0.1; Table 2). Our traditional-Chinese readers spent less time processing the target region when the correct word was presented (FFD: b = −0.132, SE = 0.014, t = −9.095, p < 0.001; GD: b = −0.271, SE = 0.022, t = −12.403, p < 0.001; and TRT: b = −0.367, SE = 0.024, t = −15.577, p < 0.001). As expected, the identical condition did not introduce any disruption and led to shorter viewing times than the substitution conditions, indicating that our data were reliable. More relevant to the core research question of the present study, the participants fixated on the bilingual homophones (C+M+) more briefly than on the baseline substitutes (C-M-; FFD: b = −0.034, SE = 0.015, t = −2.328, p = 0.020 and TRT: b = −0.099, SE = 0.019, t = −5.206, p < 0.001, with a marginally significant effect in GD: b = −0.035, SE = 0.020, t = −1.754, p = 0.080). The main effect of L1 phonological recovery from Cantonese homophones (C+M-) appeared only in TRT (b = −0.053, SE = 0.019, t = −2.809, p = 0.005). In contrast, there was no reliable main effect of L2 phonological recovery (p > 0.1).
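Because the duration models were fitted to log-transformed values, each coefficient can be read as an approximate proportional change relative to the baseline. The sketch below back-transforms the word-legality estimates reported above. It assumes the natural logarithm was used, and the helper name `percent_change` is hypothetical.

```python
import math


def percent_change(b):
    """Percent change in duration implied by a coefficient on the
    natural-log scale (an assumption about the transformation)."""
    return (math.exp(b) - 1.0) * 100.0


# Word-legality estimates (no-substitution vs. unrelated baseline).
reported = {"FFD": -0.132, "GD": -0.271, "TRT": -0.367}

if __name__ == "__main__":
    for measure, b in reported.items():
        print(f"{measure}: {percent_change(b):.1f}% relative to baseline")
```

Under this reading, the FFD estimate of −0.132 corresponds to first fixations that are roughly 12% shorter on the correct word than on an unrelated substitute.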
Means (and standard deviations in parentheses) for skipping probability (SP) and refixation probability (RP) in percent, first-fixation duration (FFD), single-fixation duration (SFD), gaze duration (GD), and total reading time (TRT) in ms. Values are computed across participant means.
In addition to the main effects reported above, we also observed significant interactions in TRT between language mode and bilingual homophony (C+M+; b = 0.080, SE = 0.040, t = 2.031, p = 0.042), between language mode and Cantonese homophony (C+M-; b = 0.097, SE = 0.039, t = 2.483, p = 0.013), and a marginally significant interaction between language mode and Mandarin homophony (C-M+; b = 0.065, SE = 0.039, t = 1.679, p = 0.093). Figure 2 shows that, in general, in TRT our readers exhibited stronger homophone recovery effects in their L2 (Mandarin) mode (C+M+: b = −0.137, SE = 0.030, t = −4.572, p < 0.001; C+M-: b = −0.100, SE = 0.028, t = −3.626, p < 0.001; C-M+: b = −0.056, SE = 0.027, t = −2.056, p = 0.040) than in their L1 (Cantonese) mode, where the only significant, although weaker, effect was found in the C+M+ condition (b = −0.064, SE = 0.028, t = −2.309, p = 0.021). In contrast, the interaction between language mode and the contrast between the no-substitution and baseline conditions was non-significant (p > 0.1), indicating a language mode-independent effect of word legality.
Discussion
The present study explored how native Cantonese readers make use of phonological information for lexical recovery during the reading of traditional Chinese sentences. To date, little is known about Cantonese readers’ phonological processing in L1 and L2 modes. The use of eye-tracking methodology during online sentence reading allows us to understand lexical processing in a more natural scenario than the isolated word recognition tasks adopted in many previous studies. Additionally, thanks to their high temporal and spatial resolution, eye-tracking indices provide fine-grained measurements that capture moment-to-moment cognitive processes. One novel finding of the present study is that readers give different priorities to phonological processing in different language modes, even during the silent reading of Chinese. As reflected by an interaction between bilingual homophony and language mode, the readers showed more phonology-based recovery when their L2 (Mandarin) mode was pre-activated than when their L1 (Cantonese) mode was, suggesting that they may generally rely more on phonological cues in attempting to recover from lexical errors when reading in their L2 mode. In contrast, the readers may employ a more direct lexical access route in their native language mode. In addition, a phonological recovery effect from the L2 homophone was found only in the L2 mode, indicating that the readers had a higher degree of activation of the Mandarin phonological coding system after reading aloud a short passage in that language. Interestingly, this language mode pre-activation procedure seemed to have a long-lasting effect throughout the whole testing session. Finally, effective recovery from the L1 homophone was observed, with even more salient benefits in the readers’ L2 mode, implying a robust activation of the readers’ L1 phonological representation that overrides their current language mode.
Below we focus our discussion on three interrelated aspects of psycholinguistic research to provide implications for lexical access in Chinese, bilingualism and second language learning.
Our results agree with several previously established critical findings. Chinese reading studies have consistently shown activation of phonological knowledge during visual word recognition (e.g., Tan & Perfetti, 1997). The present study also showed that phonology is among the most important aspects of lexical access in reading Chinese. Overall, the present study has unveiled phonological activation in a relatively natural reading task in which readers comprehend sentences that may or may not contain errors, and it appeared mainly in a late processing measurement of eye movement as reflected by TRT. Chinese is considered a logographic script, optimized for semantics but less so for phonology (Hoosain, 1991). Although lexical access in Chinese involves activation of orthography, phonology and semantics just like in alphabetic scripts (Zhou et al., 1999), most of the experimental evidence from both isolated priming paradigms (Chen & Shu, 2001; Zhou & Marslen-Wilson, 1999, 2000) and sentence reading paradigms (Pan, Yan & Yeh, 2022; Tsai, Kliegl & Yan, 2012; Yan et al., 2009) generally favors a direct lexical access route for Chinese adults. For instance, Yan et al. (2009) reported, in Simplified Chinese, a larger semantic than phonological priming effect from parafoveally presented priming characters. A similar pattern has been reported during horizontal (Tsai et al., 2012) and vertical reading (Pan et al., 2022) in Traditional Chinese. Nevertheless, activation of phonological information in Chinese reading may shift to an earlier temporal stage due to specific task demands.
Isolated character-naming experiments showed that phonological codes of Chinese characters can be activated early during character identification when explicit naming is involved (Pollatsek, Tan & Rayner, 2000; Shen & Forster, 1999; Zhou & Marslen-Wilson, 2000). During sentence reading, Pan, Laubrock, and Yan (2016) examined how Chinese readers adjusted their relative weighting of phonological and semantic information processing when reading silently and aloud. They found that these readers showed earlier and stronger phonological activation in oral reading than in silent reading and attributed the effect to an articulatory demand of phonological production when reading aloud. In contrast, semantic activation was robust and independent of task. According to the empirical evidence reviewed above, Chinese readers can adjust their processing priorities flexibly and put more weight on phonology when required by the current task. In the present study, our participants, who were late bilinguals living in an L1-dominant environment, clearly demonstrated more reliance on phonological decoding in their non-dominant L2 mode. From this perspective, the present study has provided a novel piece of evidence for Chinese readers’ enhanced phonological activation in their non-dominant spoken language mode.
This study also took the first step to explore native Cantonese readers’ L1 (Cantonese) and L2 (Mandarin) phonological activation during online reading of sentences written in traditional Chinese. The results add to our knowledge of cross-language phonological activation of cognates in bilinguals. In their influential work on bilingual visual word recognition, Dijkstra, Van Heuven and their colleagues proposed the Bilingual Interactive Activation model (BIA: Dijkstra & Van Heuven, 1998; Van Heuven, Dijkstra & Grainger, 1998; BIA+: Dijkstra & Van Heuven, 2002; Multilink: Dijkstra et al., 2019), arguing for an integrated lexicon and a language non-selective lexical access in comprehension. Specifically, co-activated orthography and shared semantics of the cognates – that is, resonance between orthographic and semantic representations – directly and indirectly activate their linked phonological representations. In line with this model, besides word comprehension, naming and translation tasks, the results from the present study among late bilinguals further demonstrate that multiple phonological representations of words in different languages can be activated automatically during natural sentence reading. Specifically, our results also generally agree with previous findings that late bilinguals automatically activate L1 knowledge when they are not using it (Oppenheim et al., 2018). Given the high degree of visual similarity between Chinese and Japanese Kanji and the large number of cognates in these two languages, it is of great theoretical and practical importance to explore bilingual phonological representation and activation among Chinese–Japanese bilingual readers.
Moreover, the activation asymmetry of the phonologically engaged route in Chinese reading can be taken as a form of task-dependent adjustment in bilinguals. Prior works have captured several types of asymmetries in the influence between L1 and L2. For instance, some studies reported enhanced cross-language cognate facilitation effects for L1 prime words over L2 prime words (Gollan et al., 1997; Nakayama et al., 2012; Voga & Grainger, 2007) and shorter production latencies in L2-to-L1 translation than in L1-to-L2 translation (see Kroll, van Hell, Tokowicz & Green, 2010 for a review; but see Christoffels, De Groot & Kroll, 2006). In the current study, as our participants were late L2 learners, such an asymmetry also contributed to the different recovery effects observed. Additionally, the existing literature has shown a task-dependent effect of phonology: Chinese readers process phonological information more efficiently when reading sentences aloud (Pan et al., 2016, 2019, 2021b) and in naming or production tasks (Liu et al., 2022). Dijkstra et al. (2019) incorporated a task/decision system in their computational model, which accounts for these findings through the system's capability to check and tune the degree of orthographic, phonological, or semantic activation, depending on the task and stimulus list at hand. It is possible that Cantonese speakers, like Chinese readers in general, do not rely heavily on orthographic-phonological connections when reading in their native language. They nevertheless may set a different parameter for the phonological activation threshold when they are in an L2 Mandarin mode, in which they are not as efficient as in their L1 Cantonese mode.
Although our findings suggest that activation of both L1 and L2 phonology mainly happens in a late temporal stage, it is worth noting that a weak yet significant early recovery effect of the bilingual homophone, as reflected by FFD, was observed in the L1 reading mode. We tend to interpret this as reflecting an extra benefit in retrieving the correct word caused by the double overlap of phonological representations, an approach of “walking on two legs”. A follow-up study on this topic is needed to confirm this speculation. For instance, the gaze-contingent boundary paradigm (Rayner, 1975) adopts a priming logic and has been considered a “gold standard” for measuring lexical access during sentence reading. Indeed, the paradigm has been used widely to explore the types of information involved in lexical processing and their priorities in a number of orthographies, especially in Chinese.
From a practical perspective, the comparison between Cantonese and Mandarin in the present study provides a reference for educational policy makers and classroom teachers with regard to Mandarin education in Cantonese-speaking areas. Our results suggest that, for bilinguals, the procedure of reading a passage aloud in a language introduces a long-lasting and effective activation of its phonological representation. Therefore, school teachers may consider focusing instruction on their students’ oral reading of the target language as early as possible, preferably within the very first few minutes of the lesson, for a better learning effect.
As a limitation, the conclusion from the present study is restricted to the lexical processing of foveated characters/words. However, studies of perceptual span (i.e., the effective area of vision during sentence reading; McConkie & Rayner, 1975) have shown that Chinese readers can obtain useful information from up to four upcoming characters beyond the current fixation (e.g., Inhoff & Liu, 1998; Yan, Li, Su, Cao & Pan, 2020; Yan, Zhou, Shu & Kliegl, 2015). In other words, lexical processing typically starts parafoveally before a word is fixated on. To understand bilingual phonological activation in an earlier (i.e., parafoveal) processing stage in sentence reading, it would be desirable to use a more sensitive experimental paradigm such as the gaze-contingent boundary paradigm (Rayner, 1975). Following earlier work on parafoveal phonological processing in English (Chace, Rayner & Well, 2005; Pollatsek, Lesch, Morris & Rayner, 1992) and in Chinese (e.g., Liu, Inhoff, Ye & Wu, 2002; Tsai, Lee, Tzeng, Hung & Yen, 2004), future studies are needed to determine early phonological access among native Cantonese readers. In addition, the current study tested native Cantonese speakers who had been living in a Cantonese-dominant environment and learned Mandarin as their L2 at school age. As proficiency and dominance are important factors modulating cross-language activation (Costa, Pannunzi, Deco & Pickering, 2017), future studies are needed to investigate how bilingual phonological activation is affected by other factors such as language proficiency and age of acquisition.
To conclude, the present results consolidate our current understanding of the language-universal importance of the phonological code, even in the logographic Chinese writing system. More generally, from the perspective of bilingual cognition, the results provide novel evidence for the notion that the human mind can adapt flexibly to the current language environment and access lexical information accordingly.
Acknowledgements
This research was supported by a Multi-Year Research Grant from the University of Macau (MYRG2020-00120-FSS), by a FDCT grant from the Macao Science and Technology Development Fund (Project code: 0015/2021/ITP), by the CASS Innovation Program, and by the Research Grants Council of Hong Kong Special Administrative Region, China (EdUHK ECS 28606818). The authors thank Yuqi Hao for her efforts during data collection.