Introduction
Communication between native and non-native speakers of a language is increasingly common in our globalized society. Within the United States, nearly 10% of school-aged students are classified as English Language Learners and, in some states, almost one in five students are English Language Learners (e.g., 19.2% in California and 18% in Texas; Department of Education, 2020). Outside of the educational system, 20.5% of the U.S. population speaks a language other than English at home, and nearly 40% of that population reports that they speak English at a level below “very well” (American Communities Survey, 2018). English serves as a “lingua franca” or common language of communication in many contexts internationally, including business and trade (Brutt-Griffler, 2005; Rogerson-Revell, 2007). There are nearly twice as many non-native speakers of English globally as native English speakers, a gap that continues to grow (Kachru, 1986). Further, conversation in English often occurs among parties who do not share a native language background. Previous research has demonstrated that non-native speech is more difficult for native speakers to understand than native speech (Munro and Derwing, 1995). However, individuals are able to improve their comprehension of accented speakers with relatively limited exposure (Baese-Berk et al., 2013; Bradlow and Bent, 2008). Investigations of this adaptation have focused exclusively on properties of speech that underlie this adaptation (e.g., how the acoustic properties of speech modulate adaptation; Xie & Myers, 2017) or on cognitive factors that may impact adaptation (e.g., working memory; Rönnberg et al., 2013).
Here, we examine how direct incentives to perform modulate adaptation, or learning, and thereby contribute along multiple dimensions. For example, we offer an established literature support for the external validity of its existing results. Existing laboratory studies have found that subjects adapt to non-native speech (Baese-Berk et al., 2013; Bradlow and Bent, 2008), but these findings come from experimental environments without direct performance incentives. Yet, outside of the laboratory, many of the relevant environments (those in which conversations occur among parties who do not share a native language background) are fundamentally incentivized. For example, in professional or academic environments, there is clear benefit to listeners in communicating both accurately and efficiently. That is, in natural communication situations where participants may not share language backgrounds, one could imagine that participants would be inherently incentivized to communicate well, both in their own speech clarity and in exerting more listening effort when they are less familiar with a specific talker or accent. However, this incentive is indirect, rather than a direct performance incentive.
This experiment measures whether directly incentivizing participants’ performance, by attaching monetary rewards to how well they identify non-native speech, impacts their performance and whether it accelerates adaptation to unfamiliar speech. In Section 2, we provide some context and background to the larger literatures we implicate. In Section 3, we describe the experiment and summarize the data-generating process, before discussing our empirical analysis in Section 4. We follow the analysis with a brief discussion of the policy implications in Section 5.
Background
Speech perception
Speech perception is a notoriously difficult task, requiring listeners to generalize over substantial variation within and across speakers in order to successfully perceive the intended message from a speaker. That is, multiple acoustic signals could map onto a single word: the way Person A produces the word “cat” will result in a different speech signal than the way Person B produces the same word. Listeners are, generally, extraordinarily good at handling this variability and typically understand speech with relatively little effort. However, in some circumstances, speech perception is less successful. For example, perceiving speech in noisy situations is more challenging than understanding speech in quiet (Cherry, 1953). Similarly, understanding speech from an unfamiliar talker, especially from a speaker with an unfamiliar accent, is more challenging than listening to familiar speakers or accents (e.g., Nygaard, Sommers, & Pisoni, 1994; van Wijngaarden, 2001).
The hallmark case of unfamiliar speech is non-native speech. Non-native speech deviates from native speech on a variety of dimensions, but one of the most salient dimensions is the accent (or features of pronunciation) that differs from native speech. These pronunciation differences may occur at the segmental level (i.e., individual speech sounds) or at a suprasegmental level (overall pitch, rhythm, etc.). Taken together, these features create a distinct acoustic profile that is often challenging for native listeners to understand. Substantial previous work has investigated the sources of these challenges and has asked why some listeners succeed more than others on the task of understanding non-native speech. Cognitive factors including vocabulary size have been shown to predict a listener’s ability to understand non-native speech (Banks et al., 2015; Bent et al., 2016; McLaughlin et al., 2018). Further, in a matched-guise task where the same speech sample from a native English speaker is matched with either an Asian face or a Caucasian face, listeners report that speech is more accented when paired with the Asian face (Rubin, 1992). Moreover, listeners transcribe speech more accurately when the race of the speaker matches the accent of the speech (e.g., a Chinese face and Chinese-accented English; McGowan, 2015). In addition to these factors, attitudinal factors also impact perception (see, e.g., Kutlu et al., 2022a, 2022b). Listeners with more negative attitudes toward non-native speakers report the speech as being more challenging to understand, even if they are equally able to transcribe the speech (Sheppard, Elliott, & Baese-Berk, 2017).
While it is clear that a variety of factors impact baseline perception of non-native speech, it is also the case that listeners are able to improve their perception of non-native speech with some practice. Some previous studies suggest that initial adaptation can be relatively quick (i.e., within a few sentences; Clarke and Garrett, 2004), while other work has demonstrated that longer periods of exposure (e.g., 30 minutes of training over the course of two days) result in significant improvements to perception of non-native speech (Bradlow and Bent, 2008). Listeners can improve their perception of a specific accented talker, of a variety of talkers from a single accent background, or of talkers from a variety of accent backgrounds, depending on the speakers they are exposed to during training (e.g., Baese-Berk, Bradlow, & Wright, 2013; Bradlow & Bent, 2008; Sidaras, Alexander, & Nygaard, 2009). Thus, while we know that listeners can improve at understanding unfamiliar, accented speech, it is unclear what factors impact this adaptation.
The consequences of having a non-native accent extend far beyond communication specifically, as non-native speakers face myriad biases (Gluszek and Dovidio, 2010). Individuals with non-native accents are often viewed as less employable than native speakers (Carlson and McHenry, 2006) and are less likely to be recommended for a promotion or to receive entrepreneurial investments (Huang, Frideger, & Pearce, 2013). Further, non-native speakers may be judged as less credible (Lev-Ari and Keysar, 2010), a bias that might emerge in early childhood (Kinzler, Corriveau, & Harris, 2011). These judgments are often tied to judgments about the speaker’s language and challenges for the listener when understanding that speech. Indeed, some work suggests that having more experience with speakers from a variety of language backgrounds impacts both a listener’s attitude about the speaker and their ability to understand the speaker’s speech (Kutlu et al., 2022b). Therefore, it is critically important to understand how listeners can best improve their ability to perceive non-native speech.
Pay for performance
That incentives matter to human behavior is so foundational to economics that many introductory lectures start with lessons from history. Many modern textbooks (e.g., Cowen and Tabarrok, 2018) retell stories of convict ships in the 1700s, for example, when the British government paid sea captains to take felons to Australia. Yet, many would not survive the voyage. In response, the government tried to fix the problem with myriad solutions (e.g., mandating that captains bring medical personnel on the voyage or requiring them to bring lemons to prevent scurvy). However, nothing worked until the practice of paying for each prisoner that walked onto the ship in Great Britain was abandoned and replaced by a system that paid captains for each prisoner that walked off the ship in Australia. The change in incentives aligned the self-interest of the captains with the self-interest of the convicts, and the captains responded to the incentives.
In the most general terms, an incentive is anything that motivates a person to do something. As we approach our research question, it will also serve well to distinguish two types of incentives. Intrinsic incentives come from within: a person with an intrinsic motivation wants to do something for its own sake, without an outside pressure or reward. The sources of intrinsic incentive are many and varied, from feeling personal fulfillment and satisfaction in doing certain things to learning a new skill just for the fun of it. On the other hand, extrinsic incentives involve providing a material reward for accomplishing a task (a positive incentive) or threatening a punishment for failure to do so (a negative incentive).
Thus, the absence of extrinsic incentives (we will use money to incentivize subject performance in our experiment) does not leave experimental subjects without incentive at all; this is true here and in any laboratory study of human subjects. Rather, our design will allow us to “difference out” the intrinsic incentives that are likely to be common to both treated and control subjects, leaving the extrinsic incentive provided only to the treated group as the implicated mechanism explaining the difference in performance between the two groups (Footnote 1).
That human subjects respond to incentives is well established, across a variety of environments (e.g., Haley, 2003; Lazear, 2000; Seiler, 1984; Shearer, 2004). The role of motivation, a concept closely related to incentive, has been implicated in previous theories of speech perception in challenging listening situations, namely in the Framework for Understanding Effortful Listening (FUEL; Pichora-Fuller et al., 2016). This model integrates multiple factors ranging from cognitive to interpersonal that may impact how a person understands speech in challenging listening situations, such as listening to a speaker with an unfamiliar accent. The present study builds on this proposal by explicitly manipulating extrinsic incentives to participants to investigate this component of a broader construct of motivation. Further, while some previous work has investigated the role of monetary reward on listening effort during speech perception in noise (e.g., Koelewijn et al., 2018, 2021), this study directly investigates how monetary reward impacts both performance and, crucially, improvement in performance over time.
Method
Positionality statement
The three authors of this work are affiliated with the University of Oregon. Chasen Jaleh Afghani identifies as an Iranian-American woman. She identifies as speaking standardized American English, having Farsi as a heritage language and Spanish as an L2. Her interest in linguistics production and perception stems from her background. Melissa Michaud Baese-Berk identifies as a white, educated, cisgender woman, and identifies as speaking a standardized form of American English. While she has also identified as a learner of a variety of languages and has spent time as a “non-native” speaker in environments where those languages are dominant (e.g., a Spanish learner living in Spain), she has spent most of her life in environments where her language variety is the dominant form across institutions of power. She uses behavioral methods to investigate speech perception, speech production, and language learning across a wide array of language varieties. Glen Waddell identifies as a behavioral social scientist, data scientist, and economist. He speaks a standardized form of American English and has spent most of his life in environments where his language variety is the dominant form.
Participants
In January 2020, we recruited adult native English speakers from the student population at the University of Oregon. All participants were adults (ages 18–28; mean = 22, SD = 2.2; 30 female, 20 male). We recruited participants using a recruitment flyer sent by e-mail and shown in various classes. Subjects were chosen to participate in the study if they self-identified as native, monolingual English speakers with limited experience with non-native accented speech (i.e., did not have a family member, close friend, or roommate who is a non-native speaker of English). Further, no subject reported having a history of speech, language, or hearing disorder; however, demographic characteristics of participants were not collected. A total of 50 subjects were recruited; 25 were assigned at random to the treatment group and 25 to the control group.
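The even split described above can be illustrated with a short sketch. The paper does not report how the randomization was implemented, so the function name, subject IDs, and seed below are hypothetical; the sketch only shows one standard way to draw 25 of 50 subjects at random.

```python
import random

def assign_groups(subject_ids, n_treated, seed=None):
    """Randomly split subjects into treatment and control groups."""
    rng = random.Random(seed)          # seeded only so the example is repeatable
    shuffled = list(subject_ids)
    rng.shuffle(shuffled)              # every ordering is equally likely
    treated = sorted(shuffled[:n_treated])
    control = sorted(shuffled[n_treated:])
    return treated, control

# Hypothetical subject IDs 1..50; the seed value is arbitrary.
treated, control = assign_groups(range(1, 51), n_treated=25, seed=2020)
print(len(treated), len(control))
```

Shuffling the full roster and cutting it at 25 guarantees equal group sizes, which independent coin flips per subject would not.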
Methods
Participants completed a sentence transcription task (i.e., an intelligibility task) and a questionnaire. All tasks were administered via PsychoPy (Peirce, 2007). Listeners heard sentences over headphones. At the conclusion of each sample, they were asked to type exactly what they heard. Listeners also completed a questionnaire about their experience with other languages and accents of English.
Participants were assigned to one of two groups. In the treatment group, subjects were rewarded for their ability to correctly identify the words they were presented with (i.e., intelligibility). Subjects in this group were paid a $9 “show-up” fee plus $2 for each word they correctly identified in one of the 104 samples. Subjects received payment for one such sample, which was determined randomly, with equal chance of it being any of the samples they experienced. In the end, the mean payment to subjects in the treatment group was $15.88, with minimum and maximum payments of $9 and $19. In the control group, subjects were paid a flat fee of $14 for their participation and no other direct incentive was given (Footnote 2). Experiments were performed in the Spoken Language Research Laboratories, where it took participants roughly 60 minutes to complete all experimental tasks.
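The payment rule above can be sketched in a few lines. The dollar amounts come from the text; the function names and the example scores are hypothetical, and the code is an illustration of the rule rather than the software used in the experiment.

```python
import random

SHOW_UP_FEE = 9      # dollars, paid to every treated subject
RATE_PER_WORD = 2    # dollars per correctly identified word
CONTROL_FEE = 14     # dollars, flat fee for control subjects

def treatment_payment(words_correct, rng):
    """Pay the show-up fee plus the per-word rate on one randomly drawn sentence."""
    paid_sentence = rng.randrange(len(words_correct))  # each sentence equally likely
    return SHOW_UP_FEE + RATE_PER_WORD * words_correct[paid_sentence]

def expected_treatment_payment(words_correct):
    """Expected payment before the paid sentence is drawn."""
    mean_correct = sum(words_correct) / len(words_correct)
    return SHOW_UP_FEE + RATE_PER_WORD * mean_correct

# Hypothetical subject: 104 scores cycling through 2, 3, 4, 5 words correct.
scores = [2 + (i % 4) for i in range(104)]
payment = treatment_payment(scores, random.Random(0))
expected = expected_treatment_payment(scores)
print(payment, expected)
```

Because only one randomly chosen sentence is paid, every sentence carries the same expected stake, so a subject has no reason to concentrate effort on particular trials. Under this rule, the reported mean payment of $15.88 corresponds to about (15.88 − 9) / 2 ≈ 3.4 words correct on the paid sentence.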
Materials
Stimuli for the experiment were drawn from the Hearing in Noise Test subsection (Nilsson, Soli, & Sullivan, 1994) of the Archive of L1 and L2 Scripted and Spontaneous Transcripts and Recordings (ALLSSTAR corpus; Bradlow, Kim, & Blasingame, 2017), a publicly available corpus of native and non-native speech (Footnote 3). Stimuli were chosen from six native Mandarin talkers, three men and three women. In Table 1, we reproduce the 104 target sentences employed in this experiment. Sentences are between five and seven words long. All subjects experienced the same 104 sentences, though their order was randomized across subjects. Following previous work, these stimuli were embedded in speech-shaped noise at a 1:1 (i.e., 0 dB) signal-to-noise ratio to avoid ceiling effects (Bradlow and Bent, 2008). Each response was scored for the number of words correctly transcribed by the listener using Autoscore (Borrie, Barrett, & Yoho, 2019). Words had to be entirely correct and partial credit was not given.
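To make the all-or-nothing scoring concrete, the following is a simplified sketch of exact-match word scoring. It is not Autoscore and does not reproduce that tool's own scoring rules; the function names and the example sentences are hypothetical.

```python
import re
from collections import Counter

def tokenize(sentence):
    """Lowercase a sentence and split it into words, dropping punctuation."""
    return re.findall(r"[a-z']+", sentence.lower())

def count_correct_words(target, response):
    """Count target words reproduced exactly in the response, ignoring word order."""
    # Multiset intersection: a repeated target word is credited at most
    # as many times as it appears in the response.
    matched = Counter(tokenize(target)) & Counter(tokenize(response))
    return sum(matched.values())

# Hypothetical target sentence and responses, for illustration only.
target = "The boy fell from the window"
print(count_correct_words(target, "the boy fell from a window"))      # 5
print(count_correct_words(target, "the boys fell from the windows"))  # 4
```

In the second response, "boys" and "windows" earn no credit because a near miss counts as an error, which matches the no-partial-credit rule stated above.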
Results
We begin by describing the pattern of results we observe in the data. In Figure 1, we plot the average number of correctly identified words in each of the 104 target sentences. We separately identify (in orange) the average performance among subjects in the treatment group (n = 25), and (in blue) the average performance among subjects in the control group (n = 25). We also fit both groups to a third-order polynomial. Here, we first see a clear suggestion that there is a level increase in performance among the treated subjects—those who were given extrinsic monetary incentives to understand speech. However, the implied shape parameters also suggest a different pattern of learning emerges in the treatment and control groups—across the order of sentences, treated subjects not only start ahead, but gain over control subjects. Following these observations, we fit a series of regression models using R. We describe our statistical analyses of the data in detail below.
Do people listen better with incentive to do so?
In Table 2, we report estimates from a series of models. In each, our objective is to measure the effect of monetary incentive on the number of words identified correctly in each of 104 target sentences presented to subjects—these will form our baseline specifications. Specifically, we model responses as:
where $\text{Number Correct}_{is}$ captures the number of words subject $i$ correctly identifies in target sentence $s$, and $(\text{Treated}_i = 1)$ captures the treatment status of $i$. Subjects are known to learn with experience (Bradlow and Bent, 2008); we model the systematic component of learning in $f(\text{Sentence Order}_{is})$. As subjects experience target sentences in random order, there is variation across subjects in when a particular target sentence is drawn. As such, throughout our analysis we will control for unobservable time-invariant heterogeneity specific to the 104 individual target sentences in $\delta_s$. As the treatment varies at the subject level, the anticipated level at which errors will cluster is with subjects. That said, in Table 2, we report estimated standard errors allowing for clustering at the subject level, at the sentence level, and at the subject + sentence level. Inference is not sensitive to this distinction.
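Read together, the definitions in this paragraph describe a specification of roughly the following form. This is a sketch: the intercept $\alpha$, the treatment coefficient $\tau$, and the error term $\varepsilon_{is}$ are symbols assumed here for illustration and may differ from the authors' own notation for Equation (1).

```latex
\begin{equation}
\text{Number Correct}_{is}
  = \alpha
  + \tau \, \mathbb{1}(\text{Treated}_i = 1)
  + f(\text{Sentence Order}_{is})
  + \delta_s
  + \varepsilon_{is}
\tag{1}
\end{equation}
```

In this form, $\tau$ is the average difference in words correctly identified between treated and control subjects, holding fixed the learning profile $f(\cdot)$ and the sentence-specific effects $\delta_s$.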
In estimating (1), we identify the effect of monetary incentive as the average difference (across the 104 target sentences) between the performance of treated subjects and that of control subjects (Footnote 4). However, we approach the modeling of the number of words subjects correctly identify in two distinct ways. In the first three columns of Table 2, we allow outcomes to change according to a third-order polynomial; that is, we estimate the effect of treatment having fit outcomes to $f(\text{Order}_{is}) = \beta_1 \text{Order}_{is} + \beta_2 \text{Order}_{is}^2 + \beta_3 \text{Order}_{is}^3$. In columns (4) through (6), we instead absorb any differences in average performance at each of the 104 sentence-order positions (i.e., the first, second, third sentences a subject hears, and so on) (Footnote 5). This non-parametric approach is less restrictive than assuming a cubic functional form for learning. Yet, across both approaches, we see similar point estimates and only slightly different confidence intervals.
Treated subjects significantly outperform control subjects—we find 11.6% higher performance with monetary incentive (i.e., 0.294 additional words identified correctly in the average sentence). Incentivizing performance at a rate of $2 per word increased average performance among treated subjects the equivalent of 15% of a standard deviation (i.e., 0.15σ). In all cases, we reject that the average number of words correctly identified among treated subjects is equal to that among control subjects. In subsequent tables, we will adopt the specification of Column (4) of Table 2 as our preferred model—this represents the most conservative approach to inference, where we include sentence-order fixed effects, and estimate standard errors that allow for errors that may be correlated (across sentences) within subjects.
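As a back-of-the-envelope check, the three reported figures can be combined to recover the scale of the outcome. This assumes that the 11.6% is measured relative to the control-group mean and that σ is the standard deviation of words correct per sentence; both are interpretations rather than values stated in the text, and the inputs are rounded, so the results are approximate.

```python
effect_words = 0.294    # additional words per sentence, as reported
relative_gain = 0.116   # 11.6% higher performance, as reported
effect_in_sd = 0.15     # effect in standard-deviation units, as reported

# Implied quantities under the assumptions named above.
implied_control_mean = effect_words / relative_gain  # words per sentence
implied_sd = effect_words / effect_in_sd             # words per sentence

print(round(implied_control_mean, 2))  # 2.53
print(round(implied_sd, 2))            # 1.96
```

Under these assumptions, control subjects identify roughly 2.5 words per sentence, with a standard deviation of about 2 words.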
In Table 3, we demonstrate the robustness of the experimental results to the number of words in each sentence. As our preferred specification will include sentence-level fixed effects, we do not fear that unobserved heterogeneity across sentences drives treatment-effect estimates in Table 2. However, the opportunity for treated subjects to outperform control subjects may still differ with sentence length. Indeed, we see the largest gaps between treated and control subjects on the seven-word sentences, where subjects facing monetary incentives correctly identify 29% more words (0.41σ). As longer sentences may better reflect the realities of language and communication in the field, we see this increase in effect size as an indication that the potential improvements in performance we document in the laboratory are suggestive of meaningful improvements externally.
This methodological approach differs from that of previous studies (Baese-Berk et al., 2013) that have used a “training and test” approach to examining learning. In earlier experiments, two groups of participants are compared: one group that has been exposed to non-native speech during training and another that has been exposed to native speech during the training period. That is, both groups have experience with the task and with the laboratory setting before their performance is examined. As such, all participants are tested on novel talkers, and performance is only assessed at test. In these experiments, participants who have been previously exposed to non-native speech typically identify 0.31 more words than those without the earlier exposure, or roughly 16% of a standard deviation in performance. These magnitudes are similar to those in the present study.
Do incentives to perform induce faster learning, too?
In Table 4, we stratify our baseline results by the order of sentences (Footnote 6). We first estimate the model of Equation (1) on a sample we restrict to the first 15 sentences (Column 1). Here, treated subjects correctly identify 0.201 more words on average, relative to control subjects, which is equivalent to roughly 11% of a standard deviation in performance. Over the first 52 of the 104 sentences (Column 2), the gap between treated and control subjects increases to 0.277 additional words (0.14σ). In Column (3), we replicate our preferred specification on the full sample, where the gap in the pooled model is 0.294 additional words (0.15σ). As a general rule, treated subjects correctly identify differentially more words later in the experiment than they do early in the experiment, consistent with performance incentives not only increasing performance but inducing more-efficient learning.
This is further evidenced as we discard sentences experienced early in the experiment. For example, in Column (4), we restrict the sample to the last half of sentences, where the gap between treated and control subjects is higher still, at 0.312 additional words, or roughly 16% of a standard deviation in performance. In the last 15 sentences, the gap is highest, increasing to 0.330 additional words, or roughly 18% of a standard deviation (Column 5). In the end, performance incentives increase performance by roughly 64% more in the last 15 sentences than in the first 15 sentences.
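The 64% figure follows directly from the two end-point estimates reported for Table 4, as this short calculation shows. The variable names are illustrative; the values are those quoted in the text.

```python
# Treated-minus-control gaps in words per sentence, as quoted in the text.
gaps = {
    "first 15": 0.201,
    "first 52": 0.277,
    "all 104": 0.294,
    "last 52": 0.312,
    "last 15": 0.330,
}

# Relative growth of the gap from the first 15 to the last 15 sentences.
relative_increase = gaps["last 15"] / gaps["first 15"] - 1
print(f"{relative_increase:.0%}")  # 64%
```

The gap rises at each step from the earliest to the latest subsample, which is the pattern the text interprets as faster learning under incentives.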
In Table 5, we offer a different approach to identifying the dynamics of learning in the treated and control groups. In columns (1) and (2), we separately fit outcomes to a “linear-learning” restriction (Footnote 7). Under such a restriction, we see no evidence of differential learning across treatment and control groups: point estimates on their linear slopes are statistically indistinguishable. However, as was suggested in Figure 1, relaxing the linear restriction on learning in favor of a “cubic learning” technology reveals a richer story (Footnote 8). Not only do treated subjects correctly identify more words generally, they learn more quickly early and again late. Cubic learning cannot be rejected in either treated or control subjects. However, all three components are significantly different across treatment and control groups; in particular, the linear and cubic components are significantly more positive among those subjects facing direct incentives to identify words correctly. Performance incentives increase subjects’ ability to correctly identify words spoken and increase the rate of learning. It is as though learning is itself more productive with direct incentive.
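The contrast between a linear and a cubic learning curve can be illustrated with a minimal least-squares sketch. The authors report fitting their models in R, with fixed effects and clustered standard errors that are not reproduced here. The data below are synthetic, the coefficients in `true_coefs` are hypothetical, and sentence order is rescaled to the unit interval purely for numerical stability.

```python
def fit_polynomial(x, y, degree):
    """Ordinary least squares fit of y on powers of x via the normal equations."""
    n = degree + 1
    # Build X'X and X'y for a design matrix with columns x**0 ... x**degree.
    xtx = [[sum(xi ** (r + c) for xi in x) for c in range(n)] for r in range(n)]
    xty = [sum((xi ** r) * yi for xi, yi in zip(x, y)) for r in range(n)]
    # Solve the linear system by Gaussian elimination with partial pivoting.
    aug = [row + [rhs] for row, rhs in zip(xtx, xty)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, n):
            factor = aug[r][col] / aug[col][col]
            for c in range(col, n + 1):
                aug[r][c] -= factor * aug[col][c]
    coefs = [0.0] * n
    for r in reversed(range(n)):
        tail = sum(aug[r][c] * coefs[c] for c in range(r + 1, n))
        coefs[r] = (aug[r][n] - tail) / aug[r][r]
    return coefs  # coefs[k] multiplies x**k

def sum_squared_residuals(x, y, coefs):
    """Sum of squared differences between y and the fitted polynomial."""
    fitted = [sum(b * xi ** k for k, b in enumerate(coefs)) for xi in x]
    return sum((yi - fi) ** 2 for yi, fi in zip(y, fitted))

# Synthetic, noise-free learning curve over 104 sentence positions.
true_coefs = [2.0, 1.5, -2.0, 1.2]         # hypothetical values
order = [s / 104 for s in range(1, 105)]   # sentence order rescaled to (0, 1]
words = [sum(b * x ** k for k, b in enumerate(true_coefs)) for x in order]

linear = fit_polynomial(order, words, degree=1)
cubic = fit_polynomial(order, words, degree=3)
ssr_linear = sum_squared_residuals(order, words, linear)
ssr_cubic = sum_squared_residuals(order, words, cubic)
print(ssr_linear > ssr_cubic)
```

When the underlying curve bends, a single linear slope averages over the early and late phases and can hide differences that the quadratic and cubic terms pick up, which is the sense in which the linear restriction can mask differential learning.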
Discussion
Learning to listen to non-native speech has previously been understood as being an issue of exposure. The more non-native speech a listener has heard, the better they are able to understand new speech from new talkers (Baese-Berk et al., 2013; Bradlow and Bent, 2008). The results of the current study offer a critique of those accounts along two dimensions. First, incentive alone results in a shift to initial performance. That is, individuals who are told they will be paid more based on their performance begin the experiment at a higher level of performance than individuals who are told they will be paid a flat rate, suggesting some aspects of performance may not be tied to exposure at all, but may instead be tied to motivation. Second, individuals who are incentivized to perform well on a task demonstrate more robust learning during the course of exposure than individuals who are not. Together, the results suggest that exposure alone does not result in the most robust learning. Instead, it appears that motivation, as indexed here by monetary incentive, can further improve learning, above and beyond exposure alone. Instead of theories of learning that rely primarily on exposure and modulating factors of this exposure, we must provide explanations for learning that are more nuanced, including exposure, motivation, and other social or attitudinal factors. Some prior studies have examined whether listeners process accented speech differently depending on their expectations about that speech signal (e.g., the source of the accent; Lev-Ari, 2015; McGowan, 2015), and whether these expectations can impact how listeners are able to adapt to this speech (Vaughn, 2019), which also suggests that exposure alone cannot sufficiently account for improvement in perception of non-native speech.
To our knowledge, no prior work has examined whether listeners’ attention and motivation to adapt to non-native speech can be explicitly modulated by attaching a reward to performance.
Performance incentives have been used as a proxy for motivation or encouraging shifts in attentional resource allocation in other auditory tasks. For example, increased motivation (also through providing monetary incentive) can improve performance on some auditory tasks for both hearing-impaired and normal-hearing listeners (Mirkovic et al., 2019). Interestingly, physiological measures (e.g., cardiovascular reactivity and pupil dilation) have been linked to the provision of monetary incentive (Koelewijn et al., 2018, 2021; Richter, 2016). These physiological markers are also thought to correlate with subjective measures of how much effort a listener is exerting while understanding the speech signal (Peelle, 2018). While effortful listening is often used in understanding speech perception by hearing-impaired populations, similar issues are likely at work when listening to speech from an unfamiliar accent (Van Engen and Peelle, 2014).
Interestingly, while some previous work has demonstrated that participants show increased physiological responses in conditions of monetary incentive, they do not demonstrate improvement in behavioral outcomes (Koelewijn et al., 2018, 2021). The contrast between those previous results and those presented here is interesting given that in some ways one would expect similar performance across these projects. However, the work here differs from the previous work on a few dimensions that may be relevant for understanding these diverging results. First, the task in this study was to transcribe speech produced by an unfamiliar talker with an unfamiliar accent. While the repetition task in Koelewijn and colleagues’ work is similar, the speech-in-noise task used there is, in some ways, more challenging than the task used here. Further, we were interested in adaptation across the course of an experiment; therefore, our measures were slightly different from those used in the previous work. Future studies could consider why the performance here diverges from this previous work.
Still, it is likely that the improvement in performance we see here is driven by increased attention or effort during listening. Therefore, while no current models of accent adaptation can fully account for the modulation in performance as a function of incentive we observe here, models of listening effort that are designed for understanding speech perception from hearing-impaired populations do include modulation of motivation. For example, the Framework for Understanding Effortful Listening (FUEL; Pichora-Fuller et al., 2016) integrates issues of motivational intensity (Brehm and Self, 1989) with other cognitive factors known to impact listening effort. This demonstrates the need for more-sophisticated models of accent adaptation that include not only aspects of the target speech (Baese-Berk et al., 2013; Xie and Myers, 2017) but also other factors known to influence listening effort, including motivation.
Finally, it is crucial to understand the real-world implications for these results. As a reviewer notes, “We can’t go around paying people to motivate them while they’re engaging in conversations.” Indeed, it is crucial to determine how both external and internal motivations might modulate these results. In a series of planned experiments, we will investigate how occupational hierarchy (i.e., someone being your superior vs. your subordinate at work) might impact these results and how places of employment could incentivize adaptation to unfamiliar accents. In other work, we are investigating how various types of training could help listeners adapt in educational settings. While the real-world implications are not straightforward, it is critical that future work addresses this area.
Conclusion
We open a new avenue to investigate how listeners can improve their understanding of non-native speech. In doing so, we shift the burden of communication from being something that solely rests with the speaker toward a more-equitable sharing across the two parties engaged in communication. Our experimental results point clearly to an opportunity for improving native and non-native communication—we find immediate and sustained performance differentials induced by incentivizing listeners. Moreover, listeners learn faster in the presence of incentives, leaving unincentivized control subjects further behind over the length of the experiment. As the consequences of having a non-native accent likely extend far beyond communication, we anticipate the longer-run welfare benefits associated with improving listeners’ communication to be large—more-effective business decisions, richer personal and professional relationships, and improved attitudes toward non-native speakers.
Acknowledgements
This work was partially supported by a grant from the National Science Foundation to [removed for review] and a grant from the Undergraduate Research Opportunities Program to [removed for review].
Competing interests
The authors declare none.