Since its emergence in December 2019, 1,Reference Green2 coronavirus disease 2019 (COVID-19) has upended global systems and fundamentally changed the way that humans live and interact around the world. The acuity and rapid spread of SARS-CoV-2, as well as the uncertainty provoked by COVID-19, led the public to turn to the Internet for information and real-time guidance. Previous studies have shown that 50% of the world population has access to the Internet, 3 and that 70% of American adults use the Internet to find health information. Reference Fox and Duggan4 As the Internet is a key tool for delivering health information, understanding the quality of health-related content posted online during the recent global health crisis is particularly pressing and relevant. Reference Cuan-Baltazar, Muñoz-Perez and Robledo-Vega5 While sharing accurate, reliable information online is an effective way to reduce public panic and to provide updates and direction, circulating misinformation or holding back credible information can have devastating consequences. The importance of this issue was underscored by the Director General of the World Health Organization (WHO) on February 15, 2020, at the Munich Security Conference; he asserted that “[w]e’re not just fighting an epidemic; we’re fighting an infodemic” because “[f]ake news spreads faster and more easily than this virus, and is just as dangerous.” Reference Ghebreyesus6
YouTube has more than 2 billion monthly users and is accessed by one-third of people using the Internet globally, making it the second largest social networking website. 7,Reference Mitchell, Gottfried and Kiley8 As such, this platform plays host to a variety of health-related media. The quality of YouTube videos about health topics, however, has been called into question. For example, the quality of information on YouTube regarding gastroesophageal reflux disease was found to be low in a study by Aydin and Aydin. Reference Aydin and Aydin9 Furthermore, studies of YouTube content during the H1N1, Ebola, and Zika epidemics revealed that up to a quarter of videos about these topics shared misleading information. Reference Li, Bailey and Huynh10–Reference Bora, Das and Barman13 This phenomenon has also been observed during earlier stages of the coronavirus pandemic. In late March 2020, Li et al. found that 27.5% of COVID-19 videos posted on YouTube contained inaccurate information, and that content from reputable sources was under-viewed. These data are particularly alarming, as they suggest that health misinformation from YouTube is reaching more individuals than during previous public health crises. Reference Li, Bailey and Huynh10
As the world continues to grapple with the consequences of the COVID-19 pandemic, vaccine development has been touted as an opportunity for long-term prevention and control of this global health crisis. Public opinion on vaccines is largely influenced by health communications on this subject, 14 with YouTube being a major source of information and misinformation. Reference Basch, Zybert and Reeves15 Data have shown that there are more anti-vaccine than pro-vaccine videos on YouTube (65.5% vs 25.3%), and that more than one-third of the most popular videos about vaccines on YouTube did not include any scientific evidence. Reference Basch, Zybert and Reeves15 In a recent study examining 100 widely viewed YouTube videos on COVID-19 vaccinations, nearly three-quarters of videos were uploaded by news sources; however, the quality and accuracy of the video content were not assessed. Reference Basch, Hillyer and Zagnit16
As global attention has turned to COVID-19 vaccine development and distribution Reference Chen17 and to the global exacerbation of pre-existing health disparities by the COVID-19 crisis, Reference Macias Gil, Marcelin and Zuniga-Blanco18–Reference Wang and Tang26 it is important to explore these topics in more depth. In an effort both to build on existing literature and to update the academic community’s understanding of this issue in parallel with the progression of the pandemic, this study investigates the most viewed COVID-19 content on YouTube during the first 6 mo of 2020 and assesses the quality and accuracy of the vaccine-related information in that content.
Methods
Search Protocol
The study performed a YouTube (www.youtube.com) query on June 27, 2020, using the search terms “COVID-19,” “Coronavirus,” and “Coronavirus pandemic.” To avoid user-based video recommendations, the search was run in “incognito mode,” which clears tracking data such as cookies and viewing history. Filters were applied, so as to collect only English-language results from the past year. Results were sorted by “views” to obtain the most viewed, and, therefore, most popular videos, at the time of the search. The hyperlinks for the 150 most viewed videos were collected, assigned a unique video identification number, and organized in a spreadsheet. This method has been validated in previous studies across disciplines of YouTube’s online library. Reference Li, Bailey and Huynh10,Reference Pandey, Patni and Singh11,Reference Kunze, Krivicich and Verma27,Reference Singh, Singh and Singh28
Data Extraction
Non-English language videos, live-streams, and content not related to COVID-19 were excluded. English-language videos that met the search criteria were included in the study. The following video characteristics were extracted for each YouTube video: (1) view count, (2) length of video, (3) number of comments, (4) number of “likes”/“dislikes,” and (5) uploading source.
Uploading source was defined as the uploading user as listed below the YouTube video. Uploading source was further organized into the following categories: Independent Users/Consumers, Professional Organization, University Channels, Entertainment News, Network News, Internet News, Government, Newspaper, Education, or Medical Advertisements/For-Profit Companies.
Closed Captioning Data Review
YouTube offers automatically generated closed caption files, containing the text of what is said in a video. Additionally, YouTube contributors can upload their own closed caption files generated by a third-party transcriber. For the 150 most viewed videos, closed captioning data were downloaded as a text file, and a quantitative analysis was performed to determine the most used terms across the full 6-mo period.
To determine how many of the 150 total videos discussed vaccines, closed captioning data for all videos were searched for the following terms: Vaccine/Vaccines; Vaccination/Vaccinations; Immunization/Immunize/Immunizations; Vaccine-preventable; Live-attenuated; Inactivated vaccines; Subunit, recombinant, polysaccharide, and conjugate vaccine(s); Toxoid vaccine(s); and Placebo. This list of terms was created using reliable sources including the Centers for Disease Control and Prevention and the World Health Organization and was vetted by 3 practicing physicians across 3 specialties. 29,30 Of the 150 total videos, 50 videos included these terms in the closed caption search. These 50 videos were further narrowed to include only videos that included the term “vaccine,” yielding 32 videos that went on to be reviewed for quality (Figure 1).
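The two-stage caption screen described above can be sketched in a few lines of Python. This is an illustration only, not the study's actual procedure: the function names, the dictionary of caption text keyed by video identification number, and the use of case-insensitive whole-word matching are all assumptions, and the combined phrase "Subunit, recombinant, polysaccharide, and conjugate vaccine(s)" is expanded into separate phrases for matching.

```python
import re

# Search terms transcribed from the list in the text. Expanding the combined
# "subunit, recombinant, polysaccharide, and conjugate vaccine(s)" entry into
# separate phrases is an assumption made for this sketch.
SEARCH_TERMS = [
    "vaccine", "vaccines", "vaccination", "vaccinations",
    "immunization", "immunize", "immunizations",
    "vaccine-preventable", "live-attenuated", "inactivated vaccines",
    "subunit vaccine", "recombinant vaccine", "polysaccharide vaccine",
    "conjugate vaccine", "toxoid vaccine", "placebo",
]


def build_pattern(terms):
    """Compile one case-insensitive, whole-word pattern for all terms."""
    # Longest terms first so that multiword phrases are tried before shorter ones.
    ordered = sorted(terms, key=len, reverse=True)
    alternation = "|".join(re.escape(term) for term in ordered)
    return re.compile(r"\b(?:" + alternation + r")\b", re.IGNORECASE)


def screen_any_term(captions, terms=SEARCH_TERMS):
    """Stage 1: return {video_id: matched terms} for videos with any hit."""
    pattern = build_pattern(terms)
    hits = {}
    for video_id, text in captions.items():
        found = {m.group(0).lower() for m in pattern.finditer(text)}
        if found:
            hits[video_id] = found
    return hits


def screen_vaccine_only(captions):
    """Stage 2: keep only videos whose captions contain the word 'vaccine'."""
    word = re.compile(r"\bvaccine\b", re.IGNORECASE)
    return [vid for vid, text in captions.items() if word.search(text)]


# Hypothetical caption snippets, for illustration only.
captions = {
    "v001": "Researchers say a vaccine may take at least a year.",
    "v002": "Wash your hands and keep your distance.",
    "v003": "Half of the volunteers received a placebo.",
}
stage_one = screen_any_term(captions)
stage_two = screen_vaccine_only({v: captions[v] for v in stage_one})
```

Whether plural forms such as "vaccines" should count toward the second stage is a design choice that the text does not specify; the sketch matches the singular word only.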
Quality, Accuracy, and Reliability Review
The 32 videos including the term “vaccine” in the closed captioning search were further evaluated for quality, accuracy, and reliability using a rubric that incorporated tools from prior studies, including Global Quality Scale (GQS), JAMA Benchmark Criteria, and DISCERN. Reference Li, Bailey and Huynh10,Reference Pandey, Patni and Singh11,Reference Singh, Singh and Singh28,Reference Silberg, Lundberg and Musacchio31–Reference Stellefson, Chaney and Ochipa33 The “JAMA Benchmark Criteria” was created in 1997 and focuses on authorship, attribution, disclosure, and currency through 4 yes/no questions. The “DISCERN tool,” validated by 28 health information providers, was adapted to evaluate reliability through a series of 8 yes/no questions. Reference Cassidy and Baker34 The “Global Quality Scale” offers a 5-point scale evaluating quality and coverage of appropriate content.
At least 3 reviewers from the research team independently reviewed and analyzed all 32 videos according to the rubric (Table 1). To start, 4 reviewers evaluated the same 4 videos using the rubric and discussed how they interpreted the results. Once the process was standardized, 3 reviewers were randomly assigned to each video. Numeric scores were averaged across reviewers. For yes/no questions, the majority answer was selected. Last, videos were scored according to the criteria described in Table 2.
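The per-video aggregation rule described above (numeric scores averaged, yes/no items decided by majority) might be expressed as follows. This is a minimal sketch with hypothetical function names and made-up reviewer answers; how item-level results map onto the final scores in Table 2 is not reproduced here.

```python
from statistics import mean


def aggregate_numeric(scores):
    """Average a numeric item (e.g., a 1-to-5 GQS rating) across reviewers."""
    return mean(scores)


def majority_answer(answers):
    """Return the majority answer for a yes/no item (True means 'yes').

    With 3 reviewers a tie cannot occur; the check guards against an even
    number of reviewers, which this sketch does not try to resolve.
    """
    yes = sum(1 for answer in answers if answer)
    no = len(answers) - yes
    if yes == no:
        raise ValueError("tie between reviewers; no majority answer")
    return yes > no


# Hypothetical ratings from 3 reviewers for a single video.
gqs = aggregate_numeric([3, 4, 4])
authorship_listed = majority_answer([True, False, True])
```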
Study data were collected and managed using REDCap electronic data capture tools hosted at George Washington University and Children’s National Hospital. REDCap (Research Electronic Data Capture) is a secure, Web-based software platform designed to support data capture for research studies, providing (1) an intuitive interface for validated data capture, (2) audit trails for tracking data manipulation and export procedures, (3) automated export procedures for seamless data downloads to common statistical packages, and (4) procedures for data integration and interoperability with external sources. Reference Harris, Taylor and Minor35,Reference Harris, Taylor and Thielke36 The study was not deemed human subject research and thus did not require Institutional Review Board review.
Results
Most Viewed 150 Videos
The 150 most viewed videos between January 1, 2020, and June 27, 2020, had over 618 million views. The largest proportion of these videos (40%) were uploaded in March and the lowest proportion (5%) were uploaded in January (Figure 2). March also had the highest engagement numbers, with nearly 350 million views represented. The average video length across all 150 videos was 11 min and 26 s (Table 3).
a As of 06/07/2020.
Uploading Sources
The 2 major uploading sources included Network News (27%) and Entertainment News (25%). Less than 2% of the 150 most viewed videos were from professional organizations.
Quality, Accuracy, and Reliability of Videos With the Term “Vaccine”
The 32 reviewed videos were uploaded between January 1, 2020, and May 31, 2020, by a variety of sources (Figure 3). Collectively, these videos had received 139,764,188 views at the time of data collection (Table 4). The majority of videos were rated useful by the research review team (n = 29), while 3 were rated misleading. For videos rated misleading, reviewer comments were captured and analyzed. Comments reflected the videos’ use of anecdote and hearsay over evidence-based, scientific information.
a Quality (scored 1 to 5), Accuracy (scored 0 to 4), Reliability (scored 0 to 5)
Aggregate scores for quality, accuracy, and reliability domains were then generated for the 32 videos containing the term “vaccine.” An overall mean score was then calculated for each domain of quality, accuracy, and reliability for all videos (Table 4). The average quality score for all 32 videos was 3.63 of 5 (standard deviation [SD] = 0.83), the average accuracy score was 1.28 of 4 (SD = 0.81), and the average reliability score was 3.69 of 5 (SD = 1.12).
Additionally, after grouping videos by month, a monthly mean score was calculated for each of these domains for January through May (Figure 4). The scores in each domain demonstrated little fluctuation from month to month, with the exception of May, which showed a decrease in quality and reliability. Of note, the 3 videos that were rated as misleading by reviewers before aggregate scoring were also uploaded in May. No month’s videos showed a considerable increase in the scoring of any domain, despite growing scientific knowledge regarding COVID-19.
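The overall and monthly summaries described above are simple descriptive statistics. The sketch below shows one way to compute them; the function names and the score values are hypothetical, and the use of the sample standard deviation (rather than the population form) is an assumption, as the text does not state which was reported.

```python
from collections import defaultdict
from statistics import mean, stdev


def summarize(scores):
    """Return (mean, sample SD) for one domain, rounded to 2 decimals."""
    return round(mean(scores), 2), round(stdev(scores), 2)


def monthly_means(records):
    """Group (month, score) pairs by month and average each group."""
    by_month = defaultdict(list)
    for month, score in records:
        by_month[month].append(score)
    return {month: mean(values) for month, values in by_month.items()}


# Hypothetical quality scores, for illustration only (not study data).
records = [("Jan", 3.0), ("Jan", 5.0), ("May", 2.0), ("May", 3.0)]
overall = summarize([score for _, score in records])
per_month = monthly_means(records)
```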
Last, each video’s aggregate scores for quality, accuracy, and reliability were compared against that same video’s number of views. This was done to assess if higher or lower rated videos received considerable engagement. There was no discernable pattern identified between ratings and viewership.
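The text does not state how ratings were compared with viewership. One common option for this kind of comparison, shown here purely as an illustrative assumption and not as the study's method, is Spearman's rank correlation, which is less sensitive than Pearson's correlation to the heavy skew typical of view counts. The helper names and sample values below are hypothetical.

```python
def average_ranks(values):
    """Assign 1-based ranks, giving tied values the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j to cover the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        tied_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = tied_rank
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / (var_x * var_y) ** 0.5


def spearman(x, y):
    """Spearman's rho: Pearson correlation computed on the ranks."""
    return pearson(average_ranks(x), average_ranks(y))


# Hypothetical quality scores and view counts, for illustration only.
quality = [2.0, 3.0, 3.0, 4.5]
views = [900_000, 150_000, 4_000_000, 300_000]
rho = spearman(quality, views)
```

A value of rho near 0 would be consistent with the absence of a discernible pattern; with only 32 videos, any such estimate would carry wide uncertainty.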
Limitations
This study and its assessment of videos have potential limitations. The ratings were done by members of the study team; thus, the rating scales are subject to human interpretation. We offset this limitation by building a scoring instrument that combines multiple scales used in prior studies and by having at least 3 reviewers for each video. Next, we used auto-generated closed captioning to identify videos using the word “vaccine.” It is possible that the auto-generated closed captioning software missed some pertinent videos. Last, as this study examined videos in a retrospective manner, we highlight the possibility of hindsight bias in reviewers. To minimize potential bias, reviewers adhered to the objective scale.
Discussion
This analysis graded the accuracy, quality, and reliability of the most viewed YouTube COVID-19 videos that included content on vaccines over a 6-mo period, thereby offering insight on the type of information viewed by the general public. Of note, the majority of videos received low scores based on a scoring instrument developed from 3 widely cited tools, with network news sources receiving some of the lowest scores overall.
Other studies examining YouTube communication during the COVID-19 pandemic reviewed videos from 1 distinct time point, with an inability to assess trends over time. Reference Gao, Zheng and Jia37 This study builds on prior work; our data provide insight into how patterns in uploading source, engagement, and quality have evolved over the course of the pandemic. Additionally, by examining the 150 most viewed videos over the course of 6 mo, this sample size is larger than what is currently reported in the literature. To our knowledge, no other study used closed captioning data to further focus review on quality of information regarding vaccinations.
Evaluation of Accuracy, Quality, and Reliability
Based on the study analysis, scores from the 32 videos with mention of vaccines were consistently low across all 3 rubrics.
In looking at scores over time, scores dropped across all 3 rubrics in May. Additionally, the 3 videos that were rated as “misleading” appeared in the analysis in May. All videos rated as misleading included political commentary or coverage of political events. The decreased scores during this time could reflect a political shift and polarization of scientific information.
In reviewing uploading sources, network news often scored below the 50% threshold, whereas uploading sources from the education category consistently received the highest scores. As the scoring rubrics were designed by the academic community, they may be best equipped to evaluate academic sources, leading to inaccurate scores among network news. In the age of social media, as a large share of the population gathers information through platforms like YouTube, it will be important to adapt scoring instruments to appropriately evaluate quality, reliability, and accuracy across a variety of media channels.
Based on the analysis, the number of views did not correlate with rubric scores. It is important to note that there was no association between lower scoring videos and higher levels of viewer engagement. This contradicts previous evidence that controversy may increase interest and lead to increased engagement. Reference Madathil, Rivera-Rodriguez and Greenstein38
The 3 scoring tools were designed before YouTube and social media evolved as primary sources of information for the general public. Reviewers acknowledged difficulty with the binary (yes/no) scale, as it did not accommodate the spectrum of information provided within the videos. For example, the JAMA Benchmark Criteria asks “Is copyright information clearly listed AND are resources/sources for content stated or listed?” Reviewers recognized that some videos offered 1 or the other, but the “AND” in this question prevented the analysis from distinguishing levels of attribution. In the DISCERN Tool, a question states “Are additional sources of information listed for patient reference?” After discussion, reviewers agreed to answer this question based upon the specific video and comments; however, YouTube provided a blanket disclaimer with a link to the Centers for Disease Control and Prevention updates on any video discussing COVID-19. In further iterations of these scoring instruments, whether the platform or the specific media item is being assessed should be clarified.
Conclusions
As the COVID-19 vaccine starts to be administered globally, it is more important now than ever to assess both how the public is receiving health information and the quality of these messages. Furthermore, with an overflow of information, there is a growing need to expand educational efforts so the general public can accurately distinguish reliable sources of information. This study highlights the challenges of applying currently existing evidence-based rubrics (GQS, JAMA, and DISCERN) to evolving health information on YouTube. These tools, while the best currently available, are more appropriate for print media and may not capture the complexities of social media; the authors would argue that new tools may allow for a better understanding of the modern landscape of health communications. Suggestions include designing a new tool to assess health information across all media platforms, or developing a unique rubric for each social media platform, so as to more accurately capture the way in which information is shared and how the public interacts with it. In developing a guideline to assess the quality of health communications on YouTube, analysis of categories such as video thumbnails, comments, advertisements, and video currency should be considered. In addition, future rubrics can be improved by avoiding the use of stacked questions and by defining key terms, so as to reduce the potential for heterogeneity in research outcomes and improve comparability between studies.
Author Contributions
Harleen Marwah (Study Lead + Author + Analysis); Kyle Carlson (Corresponding Author + Study Design + Analysis); Natalie A. Rosseau (Author + Literature Review); Katherine C. Chretien (Author + Editor); Terry Kind (Author + Editor); and Hope T. Jackson (Study Design + Senior Author + Editor).