1. Introduction
Immersive learning technologies, including virtual reality (VR), augmented reality (AR), mixed reality (MR), and 360-degree videos, are reshaping English language education by providing authentic linguistic contexts (Bendeck Soto et al., 2020) and facilitating practical application of English skills (Huang, He & Wang, 2020). VR offers an interactive, computer-generated world in which users actively participate (Kim, Park, Lee, Yuk & Lee, 2001; Schmidt et al., 2023), enhancing learners’ involvement and English practice through avatar interactions (Liaw, 2019; Lin & Wang, 2021). AR, which blends virtual elements into real-life settings, enriches learning experiences by bridging gaps in background knowledge (Pribeanu, Balog & Iordache, 2017; Santos et al., 2016). MR fuses these elements, presenting a real-world view with 3D avatars and objects for immersive cultural and linguistic interactions (Parveau & Adda, 2018). Lastly, 360-degree videos offer a spherical view of the real world, elevating authenticity in language learning (Ozkeskin & Tunc, 2010). Collectively, these affordances make immersive technologies a promising resource for English language learning.
Beyond offering innovative ways to engage learners, these technologies have been examined empirically for their specific impacts on English language learning (ELL). Studies have focused on skill development such as writing (Koç, Altun & Yüksel, 2021), listening (Lan, Fang, Hsiao & Chen, 2018), vocabulary (Alfadil, 2020; Chen et al., 2020; Tsai, 2020), and pronunciation (Alemi & Khatoony, 2020). Additionally, studies have assessed students’ affective variables (Wu & Hung, 2022), collaboration, communication, critical thinking, and engagement in these technology-enhanced settings (Hsu, 2017; Kruk, 2014).
Immersive learning technologies have clearly gained the attention of researchers and practitioners as a potential tool for language learning, as underscored by numerous literature reviews of their educational use, which we enumerate here. First, Parmaxi (2020) conducted a content analysis of 26 scholarly manuscripts published from 2015 to 2018 and found that VR can serve as a useful tool in language classrooms, but that learning effectiveness can be challenged by technical configuration demands and insufficient pedagogical grounding. Second, Lin and Lan (2015) analyzed 29 articles published from 2004 to 2013 and found that the most popular research topics were interactive communication, behaviors, affect, beliefs, and task-based instruction. Their research highlighted the need for more studies on how teachers can influence the impact of immersive learning interventions. Third, Parmaxi and Demetriou (2020) conducted a systematic review of 54 publications from 2014 to 2019 on the use of AR in language learning and found that while mobile-based AR appears popular for supporting vocabulary, reading, speaking, writing, and other generic language skills, many of the included studies failed to sufficiently consider theory in their approaches. Fourth, Dhimolea, Kaplan-Rakowski and Lin (2022) conducted a systematic review of 32 peer-reviewed studies published between 2015 and 2020. They found some evidence of efficacy, for example, that VR is beneficial for contextual vocabulary learning and that perceptions of language learning in VR tend to be positive, but also that its overall effectiveness is inconclusive and that multiple exposures to VR are necessary for effective learning.
Fifth, Hein, Wienrich and Latoschik (2021) conducted a review on immersive technology’s role in foreign language learning, emphasizing how VR can influence student behavior and attitudes, enhancing language learning. They also noted high motivation and acceptance of immersive tools in language education. Sixth, Peixoto, Pinto, Melo, Cabral and Bessa (2021) found that VR allows learners to recreate authentic environments, enhancing participation and leading to optimal learning. Seventh, and finally, Raju and Joshith (2020) highlighted the benefits of AR in English learning, suggesting that AR interaction boosts enjoyment, motivation, and positive attitudes towards the language. In sum, while the potential of immersive technologies in language learning is evident, the nuances of their application and effectiveness remain subjects of rigorous academic exploration and debate.
Research on the effectiveness of immersive technologies for language learning remains limited and inconclusive (Govender & Arnedo-Moreno, 2021), and merely learning a language for its utility can diminish motivation. This accentuates the importance of active learning mechanisms and underscores the pressing need for further research to discern the true impact of these technologies on language learning outcomes. One area where research on immersive technology for language learning is particularly limited is K–12 environments. K–12 English as a second language (ESL) learners were therefore selected as the target group for this study, because age has long been recognized as a critical factor influencing second language acquisition (Oyama, 1976). Younger children have been found to consistently perform better than adolescents and adult learners (Sang, 2017). Additionally, the characteristics of immersive technologies appeal to and benefit young learners whose “understanding comes through hands and eyes and ears” (Scott & Ytreberg, 1990: 13). Because VR involves real-time simulations and interactions experienced through multiple sensorial channels, immersive technology-enhanced environments can stimulate learners’ sense of physical presence and enhance their real-life sensory experience (Burdea & Coiffet, 2003). These channels are primarily visual and auditory, but some VR systems also activate touch, smell, and taste, which could support young learners in learning through all senses in a way that is highly representative of the real world (Burdea & Coiffet, 2003). Taken together, these insights suggest that the unique attributes of immersive technologies align well with the inherent learning tendencies of younger K–12 students, which emphasizes the need for more focused research in this domain.
Given the potential of immersive technologies and the limited research on their impact on language learning in K–12 settings, exploring their impact on K–12 ESL learners is essential. Therefore, the following research questions guided the current systematic literature review:
1. How is the effectiveness of K–12 students’ English learning in immersive technology contexts operationalized and evaluated in empirical studies?
2. How do the design elements of immersive technologies identified contribute to English learning effectiveness?
3. What role does theory play in guiding and explaining immersive interventions?
2. Methods
This systematic review follows PRISMA guidelines (Moher, Liberati, Tetzlaff, Altman & The PRISMA Group, 2009) for transparency, accuracy, and completeness (Shamseer et al., 2015). This section outlines our search strategy, database selection, and initial findings. We set criteria to filter studies, assessed the reliability of data screening through interrater agreement, and summarized our results. The following subsections detail the process.
2.1 Search strategy and databases
Using search criteria and databases (Table 1), we conducted an initial comprehensive screening. We focused on immersive learning technologies like VR, AR, MR, and 360-degree videos from peer-reviewed publications in the last 10 years. Keywords included subject and learning field, targeting abstracts, titles, and topics. We used “AND” for keyword coordination, “OR” for synonyms, and “*” for morphological variations. Only journal articles were considered, excluding formats like posters and videos.
Electronic databases hosting journals focused on language learning, computer-assisted language learning, and educational technology were searched: ERIC, Web of Science, Linguistics and Language Behavior Abstracts, PsycINFO (EBSCO), JSTOR, ACM Digital Library, BEI, and ProQuest. All searches were performed separately. Search results were transferred to Zotero (Idri, 2015).
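The Boolean conventions described in this subsection ("OR" for synonyms, "AND" to coordinate keyword groups, "*" for morphological variants) can be illustrated with a minimal sketch. The keyword groups below are hypothetical placeholders, not the actual terms listed in Table 1, and the helper `build_query` is introduced here purely for illustration; individual databases may require their own field tags and quoting rules.

```python
def build_query(concept_groups):
    """Join synonyms with OR inside each group, then join the groups with AND."""
    clauses = []
    for terms in concept_groups:
        # Quote multi-word phrases so they are searched as exact phrases.
        quoted = [f'"{term}"' if " " in term else term for term in terms]
        clauses.append("(" + " OR ".join(quoted) + ")")
    return " AND ".join(clauses)


# Hypothetical keyword groups (assumptions for illustration only).
technology = ["virtual reality", "augmented reality", "mixed reality"]
subject = ["English", "ESL", "EFL"]
learning = ["learn*", "teach*"]  # "*" matches morphological variants

query = build_query([technology, subject, learning])
print(query)
```

A string of this shape would then be pasted into each database's advanced search, restricted to titles, abstracts, and topics where the interface allows.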
2.2 Selection criteria
After the literature search, the inclusion and exclusion criteria (Table 2) were set to assess articles. Considering the research questions, articles were filtered based on subject, age group, language use, immersive technology role, English learners, and research design.
2.3 Reliability assessment and data extraction
Two researchers independently searched the literature based on set criteria, analyzing 919 records and removing 72 duplicates. They then assessed the reliability of codes for 271 of the 809 articles using Cohen’s kappa, achieving a substantial agreement score of 0.77 (Cohen, 1960). After this, they reviewed the full articles and reached a consensus on coding.
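For readers unfamiliar with the statistic, Cohen's kappa corrects the observed agreement between two coders for the agreement expected by chance: kappa = (p_o − p_e) / (1 − p_e). The sketch below shows the calculation on a small set of invented include/exclude decisions; the data and the function name `cohens_kappa` are assumptions for illustration and do not reproduce the coding data of this review.

```python
from collections import Counter


def cohens_kappa(rater_a, rater_b):
    """Cohen's kappa for two raters who labelled the same items."""
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("Both raters must label the same non-empty set of items.")
    n = len(rater_a)
    # Observed agreement: share of items on which the raters gave the same label.
    observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    # Chance agreement: product of each rater's marginal proportions, summed over labels.
    freq_a, freq_b = Counter(rater_a), Counter(rater_b)
    labels = set(freq_a) | set(freq_b)
    expected = sum(freq_a[label] * freq_b[label] for label in labels) / (n * n)
    if expected == 1.0:
        return 1.0  # both raters used one identical label throughout
    return (observed - expected) / (1 - expected)


# Invented screening decisions for ten records (illustration only).
coder_1 = ["include", "exclude", "exclude", "include", "exclude",
           "exclude", "include", "exclude", "exclude", "exclude"]
coder_2 = ["include", "exclude", "exclude", "exclude", "exclude",
           "exclude", "include", "exclude", "include", "exclude"]

kappa = cohens_kappa(coder_1, coder_2)
print(round(kappa, 2))
```

In this toy example the coders agree on 8 of 10 records (p_o = 0.80) while chance agreement is p_e = 0.58, which gives a kappa of about 0.52. Values between 0.61 and 0.80 are conventionally described as substantial agreement.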
2.4 Search results and findings
Of the 919 articles, 809 were excluded by the researchers based on title, abstract, language, age group, study type, or technology relevance. After a full-text review of the remaining articles, 33 met the criteria. Figure 1 illustrates adherence to PRISMA guidelines.
Of the 33 included studies, 15 were identified as AR-related empirical research and 18 as VR-related. No mixed- or cross-reality studies were included. No studies focusing on 360-degree videos were included due to inappropriate target groups or lack of access.
3. Results
The 33 articles identified on the basis of selection criteria were coded by two researchers for further analysis. The results of the analysis are presented as follows. Each section focuses on one of the three research questions.
3.1 Operationalization and evaluation of learning effectiveness in included studies
The first research question considers the operationalization and evaluation of the effectiveness of immersive technology on K–12 students’ English learning.
3.1.1 The operationalization of learning effectiveness in identified empirical studies
Learning effectiveness can be defined as “the degree to which the learning outcomes are achieved” (Blicker, 2005: 102), in which “learning outcomes are statements of what a learner is expected to know, understand and/or be able to demonstrate after completion of a process of learning” (European Communities, 2009: 47). To explore the operationalization and evaluation of learning effectiveness, we mapped the evaluated variables and resources, English skills, and corresponding data collection methods (Table 3).
Note. VR = virtual reality; AR = augmented reality; a = test; b = questionnaire; c = interviews; d = observation recordings; e = observation notes; f = survey; g = reflective notes; h = evaluation sheets; i = feedback; 1 = 3D modeling; 2 = 2D graphics; 3 = VR simulation; 4 = digital sound; 5 = input function; 6 = game elements; 7 = camera function; 8 = sensor displays; 9 = AR-generated video.
Two levels of the Kirkpatrick and Kirkpatrick (2006) model, which characterizes the effectiveness of training and educational programs, were used to examine learning effectiveness. This model contains four levels of assessment: (1) reaction, (2) learning, (3) behavior, and (4) results. Specifically, levels 1 and 2 were used to map how researchers evaluated learning effectiveness in terms of reaction and learning variables. Reaction variables show how learners respond to immersive technology interventions; learning variables refer to students’ increased knowledge and change of attitude (Kirkpatrick & Kirkpatrick, 2006). Table 3 shows that 19 studies evaluated changes in students’ reactions in response to immersive technology intervention. Eight studies measured and analyzed students’ behavior during the learning experience. To analyze learning outcomes, test and questionnaire scores, feedback (including students’ performance and response in class), recordings, and interview transcripts were used. Further, multiple data collection methods were used to measure the effectiveness of immersive technologies (Figure 2), including testing (n = 27), questionnaires (n = 22), interviews (n = 12), observation recordings (n = 8), observation notes (n = 2), surveys (n = 1), reflective notes (n = 2), and evaluation sheets (n = 1).
Tests were the most frequently used instrument in the identified studies; they evaluate students’ language knowledge and performance and offer insight into the effectiveness of different teaching methods (Fokides & Zampouli, 2017). Pre-test/post-test designs compared English learners’ knowledge before and after the immersive learning intervention, and differences in scores provided evidence for changes that might be attributed to the use of immersive technologies.
Questionnaires were used for diverse purposes such as to analyze student enjoyment and interest levels when using immersive learning technologies (Dalim, Sunar, Dey & Billinghurst, 2020), provide background information about the research participants’ learning history (Fokides & Zampouli, 2017; Kruk, 2014), collect data on learning motivation for further quantitative analysis (Chen & Wang, 2015; Tsai, 2020), explore attitudes towards immersive technologies (Chen et al., 2020; Limsukhawat, Kaewyoun, Wongwatkit & Wongta, 2016; Morton, Gunson & Jack, 2012), and understand how immersive technologies promote specific English skills (Tai & Chen, 2021).
Interviews with students aimed to gather feedback on their experiences, attitudes, and learning outcomes in the VR- or AR-supported classroom. In interviews with teachers, feedback tended to concentrate on the use of immersive technologies, benefits to English language learners, and the usability of immersive learning systems (e.g. Vedadi, Abdullah & Cheok, 2019).
Researchers observed and analyzed learners’ behavior through video recordings and observation notes to understand learners’ feelings and experiences during interventions. Results from pre-test/post-tests and questionnaires indicated positive learning effects. Qualitative data (e.g. reflective notes) were used to examine learners’ perceptions of immersive technology-based courses and to assess the usability of immersive technology.
3.1.2 Research design and methodology
All eligible studies reported findings suggesting that immersive technologies can facilitate the target group’s ESL learning, despite the different methods used to measure learning effectiveness across a range of English language knowledge and skills. The total records (N = 33) consist of 10 articles using quantitative methods, 20 using mixed methods, and 3 using purely qualitative methods.
More than half of the studies (n = 19) adopted an experimental or quasi-experimental design with a control and experimental group to investigate learning effectiveness. Such studies include VR versus video material (Dooly & Sadler, 2016; Tai, Chen & Todd, 2020), VR versus personal computer (Lai & Chen, 2021), real and physical body versus the 3D avatar versus non-embodied learning (Lan et al., 2018), VR versus traditional teaching methods (Chang, Chen & Liao, 2020; Khatoony, 2019; Kruk, 2014, 2015; Morton et al., 2012), VR with different teaching methods versus traditional teaching methods (Fokides & Zampouli, 2017), English learners with high versus low proficiency in AR context (Chen & Wang, 2015), AR with different teaching methods (Hsu, 2017), AR versus traditional classroom teaching (Dalim et al., 2020; Koç et al., 2021; Redondo, Cozar-Gutierrez, Gonzalez-Calero & Ruiz, 2020; Tsai, 2020), AR contexts with different variables such as English proficiency and caption scaffolding (Chen et al., 2020), AR versus video materials (Chen, 2020), and AR with different media conditions (Vedadi et al., 2019).
3.2 Design elements of immersive technologies and improved learning outcomes
The second research question concerned how immersive technologies’ design elements can facilitate K–12 ESL. Eighteen studies focused on VR-enhanced contexts and 15 on AR-enhanced contexts. Studies are categorized by design elements and learning outcomes in Table 3.
VR-related studies mainly used VR-simulated contexts, game elements, 3D models (such as avatars and models with digital sound), and 2D and 3D graphical scaffolding such as videos and images. Almost all identified studies used VR design elements in virtual learning contexts, contributing to the main learning outcomes of increased motivation, engagement, and attention.
AR-related studies (n = 15) are summarized in Table 3. Design elements of AR incorporated in the K–12 context mainly included 3D interactive models, images, videos, and text accompanied by audio. Learners interacted with 3D models and text/audio to increase contextualization of learning, which influenced learning effectiveness and attitudes. For example, 360-degree photos can bring learners to virtual yet authentic situations to practice language (Koç et al., 2021). AR-based learning systems also offer a camera function for scanning target images or AR markers (n = 4) to access supplementary materials such as tutorial videos, and for capturing images (Hsu, 2017). For example, individual students scanned insect specimens with mobile devices; video clips of the specimens then appeared in AR on their devices, which they could watch and zoom in on (Chen, 2020).
Table 3 shows the design elements used in AR and VR interventions. In addition, the studies report learning outcomes enhanced by immersive technologies that contribute to ELL effectiveness. All learning outcomes were extracted based on learners’ answers from interviews and questionnaires. Figure 3 shows the distribution of learning outcomes.
Learners’ attitudes and emotions were investigated in 33 studies. Because of the different types of AR and VR studies, it is meaningful to compare and analyze the more divergent learning outcomes. The AR-focused studies suggested that AR more effectively builds learners’ problem-solving and communicative skills and increases satisfaction, a sense of novelty, and interest in their learning experience. VR appeared to be more productive in enhancing learners’ curiosity, imagination, cognition, awareness, attention, and enthusiasm. Both AR and VR interventions increased motivation, enjoyment, and engagement in learning English.
However, immersive technologies did not work for all learners. Four studies reported negative feedback on immersive tools due to technical issues such as failure to show the images (Dalim et al., 2020), inability of immersive tools to be manipulated simultaneously (Tai & Chen, 2021), unsatisfactory display speed (Fokides & Zampouli, 2017), and unavailability of technology because of high cost and lack of professional training (Liu, Liu, Yang, Guo & Cai, 2018). In addition, negative immersive learning experiences were found in eight studies, including mental overload and learning anxiety (Hsu, 2017), eye strain (Alemi & Khatoony, 2020; Tsai, 2020), distraction (Tai & Chen, 2021; Tsai, 2020; Urueta & Ogi, 2020), poor adaptation to the immersive tools (Fan & Antle, 2020), and longer course design preparation and class time (Chen, 2018). Consideration of learner issues is critical in the design of learning interventions (Schmidt et al., 2022); hence, further research in this area is warranted.
3.3 Role of theories in the included articles
This section is divided into the theories that informed the design of immersive tools and the theories that explained the experimental results. Approximately 45% of the identified studies (n = 15) used relevant theories to support design and explain the results of empirical studies. As shown in Table 4, theories were identified as either (1) informing design, such as design of evaluation, design of data collection method, design of immersive interventions, and class design, or (2) explaining and corroborating empirical findings.
3.3.1 Theories informing design
The Attention, Relevance, Confidence, and Satisfaction (ARCS) model (Keller, 1983) is an approach to instructional design using multimedia technology based on a synthesis of motivational concepts. It was used to examine whether the AR-enhanced learning environment could improve students’ attitudes, interests, behavior, and satisfaction (Chang et al., 2020). Fan and Antle (2020) used it to design items in the questionnaire on students’ motivation to learn with an AR app. One study mentioned the ARCS model in the abstract but did not actually apply it in empirical research (Vedadi et al., 2019).
The interaction hypothesis (Long, 1996) claims that conversational interaction between a learner and, for example, a native speaker can facilitate the learner’s development since it affords negotiated interaction providing comprehensible input in the target language. The interaction hypothesis was used to simulate interactive scenarios where English learners negotiated with the computer rather than with a real speaker (Morton et al., 2012).
In the eco-dialogical model (Zheng, 2012), the linguistic perspective of communication as a negotiation of meaning between two actors is extended systematically to consider how objects in the environment and sociocultural factors can influence meaning-making and the realization of values. Zheng, Schmidt, Hu and Liu (2017) used this model to explore whether eco-dialogical learning can facilitate the development of translanguaging abilities of ELL secondary school students in a virtual world.
The Assessment, Pedagogy, Technology (APT) method developed by Osborne (2014) adapts principles of ecological psychology to align the needs of teachers and learners through the affordances of digital technologies. Huang, Han, He, Du and Liang (2018) used it to design and develop VR educational games to improve English learners’ learning performance and engagement.
Content and Language Integrated Learning (CLIL) focuses on appropriate and effective language usage, particularly the interrelationship between content, communication, cognition, and culture (awareness of self and others) to build on the synergies of integrated learning (content and cognition) and language learning (communication and cultures) (Coyle, 2008). Constructivist learning theory argues that people construct their comprehension and knowledge of the world by experiencing things and reflecting on those experiences (Bereiter, 1994). CLIL, together with constructivism, was used in the empirical study by Fokides and Zampouli (2017) to develop a multi-user virtual learning environment. In addition, constructivism combined with inquiry-based learning strategies was incorporated in Liu et al.’s (2018) AR intervention study to make the classroom more engaging and motivating and to develop learners’ ability to collaborate and self-regulate. Inquiry-based learning emphasizes active participation and learner responsibility for discovering new knowledge (De Jong & Van Joolingen, 1998).
Total physical response (TPR) (Asher, 1969) is applied to concentrate language learners’ attention on listening and reacting to oral commands. In short, TPR is built around the coordination of speech and action and attempts to teach language through physical (motor) activity (Widodo, 2005). TPR was used by Lan et al. (2018) to investigate how VR learning influences young students’ listening performance.
Jolly Phonics focuses on letter-sound associations and the importance of training children to better comprehend letter-sound correspondence (Lloyd, 1998). Based on this approach, Limsukhawat et al. (2016) developed an AR-supported mobile game application that gives learners practice in blending, decoding, and encoding words for reading and writing, and found improvement in students’ learning efficiency and attitudes.
Webb’s (2019) work on incidental vocabulary learning indicated that an essential research direction is the extent to which words can be learned through different input types. Lai and Chen (2021) drew on this work to design a VR-enhanced learning environment to improve students’ vocabulary acquisition.
The socio-constructivist principle proposed by Vygotsky (1968) suggests that learning and culture are the frameworks through which humans experience, communicate, and understand reality (Akpan, Igwe, Mpamah & Okoro, 2020). It informed the project-based language learning of the VR intervention system in the study by Dooly and Sadler (2016).
3.3.2 Theories explaining the results
Only two studies were found to explain research findings from theoretical perspectives. Dual coding theory (DCT) (Clark & Paivio, 1991), a cognitive theory, claims that a learner’s memory consists of two separate but interrelated verbal and visual codes for processing information. Dalim et al. (2020) used DCT to explain that the AR-supported learning environment presents words both visually and aurally, so that learners’ memory of vocabulary can be enhanced and other information processing skills, such as associating features with previous knowledge, can be stimulated. The Hypothetical Model of Immersive Cognition (HMIC) (Ladendorf et al., 2019) was used by Tai and Chen (2021) to corroborate the finding that a sense of presence in VR interventions can enhance learning effectiveness.
Beyond the explicit use of theories to inform design and explain results in empirical studies, two studies implicitly indicated the use of theories. Ou Yang, Lo, Hsieh and Wu (2020) indicated that the benefits of using VR in facilitating English learners’ communicative ability are supported by the theories of constructivist learning, contextualized learning, and immersive learning. In Chen’s (2018) study, constructivism, situated learning theory, self-determination theory, and flow theory were mentioned to elaborate on how the affordances of AR could enhance learning. However, neither study explicitly referenced theories to inform design or explain results.
4. Discussion and implications
For RQ1, we described how current research operationally defines learning effectiveness in immersive technology intervention contexts. We also analyzed the research designs used in the selected empirical studies and the different English knowledge and skills found to be influenced by immersive technologies. Our findings show that mixed methods were used most frequently and that studies focused on vocabulary teaching and learning more than on other skills. This is possibly due to the large number of vocabulary-teaching cases. Few studies adopted only qualitative methods, and those that did mainly focused on general English learning, including comprehensive skills with science (Liu et al., 2018) and socio-pragmatic competencies in communication (Dooly & Sadler, 2016). Besides methodologies, we also analyzed the evaluation methods used in empirical research, with findings suggesting that tests and questionnaires were used most frequently. Given the predominant focus on vocabulary, future studies should examine the effectiveness of immersive technology interventions for other ELL skills. This would provide a more comprehensive understanding of the potential benefits of immersive technology in ELL contexts.
Regarding RQ2, the target groups’ attitudes and emotions enhanced by immersive technologies were found to be crucial for ELL effectiveness, which echoes Krashen’s (1986) affective filter hypothesis. Affective variables, including motivation, self-confidence, anxiety, and personality traits, are crucial factors facilitating second language acquisition (Schütz, 2007). Cross-curricular skills facilitated by immersive learning, including collaboration, communication, awareness, and attention, helped to create a learner-centered climate, which also aligns with Krashen’s acquisition-learning hypothesis regarding acquired (unconscious acquisition of knowledge) and learned (formal instruction) language performance systems. Immersive technology interventions afford a virtual environment with real-life contexts and flexible interactions with 3D models and characters, allowing students to set their own pace for language input and output and encouraging spontaneous learning behaviors such as communication, heightened awareness, and attention. The identified studies used immersive technology interventions in class design and instruction to make the learning process novel and interactive, further contributing to learning achievement. From a pedagogical standpoint, these findings imply that future work should prioritize a learner-centered approach that lets students navigate their own learning and emphasizes spontaneous behaviors such as communication and heightened awareness.
In terms of RQ3, only half of the studies cited relevant theories to support the design framework and the explanation of results. This indicates a general lack of theoretical grounding in the design of immersive tools and interpretation of results in empirical studies of immersive technology interventions in the K–12 ESL context, a finding that echoes Huang and Schmidt’s (Reference Huang, Schmidt, Peterson and Jabbari2023) systematic review of theory-informed digital game-based language learning. Designers should therefore consider placing greater emphasis on integrating learning theories into their designs.
Moving beyond the research questions that guided this research, the current study also summarized studies that used treatment and control group methodology to compare the effectiveness of immersive technology in language learning. This methodology bears similarities with the media comparison study. Media comparison may be popular because researchers can easily run studies and explain the comparative achievements of different media. However, this methodology has a long history of intense critique, in part because media alone cannot influence learning outcomes (Clark, Reference Clark1994; Jonassen, Campbell & Davidson, Reference Jonassen, Campbell and Davidson1994). Indeed, many scholars agree that media is most appropriate as a vehicle for delivering learning experiences rather than as a conduit for improving learning. Furthermore, design and methodological problems frequently occur in empirical studies, in line with Reeves’s (Reference Reeves1995) arguments about pseudoscience; thus, interpreting learning outcomes as being influenced by media interventions is challenging (Bryant & Hunton, Reference Bryant and Hunton2000). Given these insights, researchers must exercise caution when employing media comparison studies, ensuring rigorous design and methodology while being wary of attributing learning outcomes solely to the media used. Researchers might also consider alternative study designs, such as case studies, design-based research, and mixed-methods designs, or more contemporary approaches, such as learning experience design (Schmidt & Huang, Reference Schmidt and Huang2022; Schmidt, Tawfik, Jahnke & Earnshaw, Reference Schmidt, Tawfik, Jahnke and Earnshaw2020).
4.1 Limitations
Our findings should be interpreted in light of the following limitations. First, our approach concentrated on how the affordances and features of immersive technologies relate to English learners’ learning outcomes but did not consider how learning theories might have influenced their design. Synthesizing why researchers used certain design theories and how their designs will improve ESL effectiveness warrants future research, as does the exploration of how learning theories are implemented in the design and development of emerging technology learning affordances and how they influence ESL as a whole. Additionally, we only looked at papers in English (the “Tower of Babel” bias), which could have excluded important studies. Including so-called gray literature may have uncovered additional studies. Likewise, examining adverse effects resulting from immersive technology would be meaningful.
The findings presented here suggest that immersive technologies hold great promise for ESL; however, further research is warranted. In addition, while our research sheds light on some of the gaps and challenges associated with the empirical research in this area, as well as the influence of immersive learning designs on learning outcomes, the current paper did not focus on pedagogical implications. Given the broad range of empirical research in K–12 contexts that we identified in our systematic review, it is clear that ELL instructors see value in the use of immersive learning for promoting language learning outcomes. What is not clear, however, is how K–12 instructors might effectively integrate immersive learning interventions into their own teaching practices so as to promote highly effective learning outcomes. We see this as an area of critical need for future research. Finally, another limitation of our study is the potential for positive publication bias. It is possible that studies with positive results were more likely to be published than studies with negative results. This bias could have influenced our findings and may have resulted in overestimating the effectiveness of immersive technology interventions in K–12 ELL contexts. Future studies should consider the potential for publication bias and take steps to mitigate its effects, such as conducting a comprehensive search of gray literature and including unpublished studies.
4.2 Conclusion
This systematic literature review delves into immersive technology’s role in K–12 ELL settings, emphasizing the evaluation of learning effectiveness and the need for theory-driven research designs. While the study recognizes the potential of technologies like VR in enhancing motivation and cross-curricular skills, it also advocates for exploring other immersive tools, such as 360-degree videos. Despite limitations like potential bias and unexplored adverse effects, the review underscores the significance of further research in this domain to benefit English learners.
This paper addressed three research questions, mapping out evaluations from identified literature. Tests predominantly assessed English skills, while questionnaires and interviews gauged learners’ attitudes and perceptions of immersive technology. Table 3 indicates that immersive technology’s design elements, backed by second language acquisition theories, enhance affective variables and cross-curricular skills. However, while theories informed design and results interpretation, their application in the research was found lacking.
This paper enriches scholarly discourse by precisely defining learning effectiveness, moving beyond ambiguous “good learning” definitions. It maps evaluation methods to a learning effectiveness framework, offering insights for future research. The study underscores the need for a robust theoretical foundation in design and interpretation, suggesting frameworks like self-determination theory for VR-supported individual learning or social constructivism for multi-user VR designs. The research also identifies trends in immersive technology for ESL, noting an underutilization of 360-degree videos and mixed reality, possibly due to cost and technical challenges, but encourages researchers to diversify their technological tools for richer educational outcomes.
Supplementary material
To view supplementary material referred to in this article, please visit https://doi.org/10.1017/S0958344024000041
Ethical statement and competing interests
No funding was received to assist in the preparation of this manuscript. This article does not contain any studies with human participants performed by any of the authors. The authors declare no competing interests.
About the authors
Yueqi Weng is a PhD student in learning design and technology at the University of Georgia. Yueqi’s research focuses on learning analytics, learner and user experience design, and designing and applying emerging learning technologies, such as immersive technology and artificial intelligence, in health education and language learning.
Matthew Schmidt is an Associate Professor at the University of Georgia in the Learning, Design, and Technology department. His primary research interests include the design and development of innovative educational courseware and computer software, with a particular focus on individuals with disabilities, their families, and their providers. His secondary research interests include learning in extended reality (inclusive of virtual reality, augmented reality, and mixed reality) and learning experience design.
Wanju Huang is a Clinical Associate Professor of Learning Design and Technology at Purdue University. Her research focuses on various areas, including online learning, professional development in STEM, augmented reality/virtual reality, and the integration of artificial intelligence in education.
Yuanyue Hao is currently a PhD student in the Department of Education, University of Oxford. His research interests include automated language assessment, pronunciation assessment, and individual differences. He is interested in using methods such as systematic review and meta-analysis, latent variable modelling, Bayesian statistics, and machine learning in applied linguistic research.
Author ORCIDs
Yueqi Weng, https://orcid.org/0009-0009-8420-2243
Matthew Schmidt, https://orcid.org/0000-0002-8110-4367
Wanju Huang, https://orcid.org/0000-0001-5965-2597
Yuanyue Hao, https://orcid.org/0000-0001-7557-6133