Impact statement
The overrepresentation of genomic data from individuals of Northern-European descent in biobanks worldwide is now a well-recognised issue. Despite global efforts to improve the representation of individuals from other ancestry groups, this skew persists, and various populations remain underrepresented and underserved in commonly used repositories. Addressing this issue is crucial because research and technologies can inherit biases from skewed data, which can lead to inequities in genomic medicine and, ultimately, to health inequalities. This article synthesises evidence from the literature on the complex historical, social and ethical terrain in which attempts to diversify data are located. It highlights that merely diversifying genomic data is not sufficient: diversification must be carried out to a high ethical standard if it is ultimately to reduce inequities in genomic medicine.
Introduction
This research is situated within the wider body of studies that explore ethical considerations surrounding genomic technologies and practices, as well as the ethical issues related to diversity across broader health studies (Duster, 2003, 2015; M’Charek, 2005; Fullwiley, 2007; Hammonds and Herzig, 2008; Fujimura and Rajagopalan, 2011; Nelson, 2016). We start from the premise that the majority of genomic data repositories have been sourced from individuals of Northern-European ancestry, which has created a significant gap in our understanding of the role of genetics in health and disease for a global population (Aicardi et al., 2016; Popejoy and Fullerton, 2016; Sirugo et al., 2019; Mills and Rahal, 2020). The impact of the overrepresentation of Northern-European ancestral groups in well-established data repositories, which are often used more readily in research (because of the years of linked data they contain), is far-reaching.
It may reduce the generalisability of findings because less is understood about which variants are common or rare across the underrepresented populations (Petrovski and Goldstein, 2016; Caswell-Jin et al., 2018; Kurian et al., 2018), or it may limit our ability to gain insights about genetic variation in specific ancestries, which in turn can lead to erroneous conclusions about disease pathogenicity (Need and Goldstein, 2009; Bustamante et al., 2011; Petrovski and Goldstein, 2016). For example, Manrai et al. (2016) demonstrated that genetic variants in hypertrophic cardiomyopathy were wrongly classified as disease-causing because they were rare in predominantly European datasets, whereas their prevalence in a global population made disease causation unlikely.
Recognition of the bias in genomic datasets has led to calls to improve diversity in genomic data (Green et al., 2011; Hindorff et al., 2018; Popejoy et al., 2018; Fatumo et al., 2022). The word diversity is used variably to denote a range in ethnicity, racial categories, ancestral groups, age, gender, sexual orientation, language, education, access to care, socioeconomic status, social class, disabilities, geography or any other shared characteristics in underrepresented populations. However, in the context of calls for diversity in genomics, diversity is often used in relation to genetic ancestry (and how our ancestors migrated across the globe over millions of years).
The calls for diversity present a range of challenges related to the social, political and historical terrain in which they are situated (Ilkilic and Paul, 2009; George et al., 2014; Reardon, 2017). In this article, we aimed to identify the ethical issues associated with diversifying data in order to develop new approaches to address them.
Methods
We conducted a qualitative evidence synthesis to investigate the ethical issues surrounding the diversification of genomic data, specifically the inclusion of individuals from historically underserved populations, ethnic and racial minoritised groups, and those experiencing ongoing racial and/or intersectional disadvantage in genomic and wider health studies. An interdisciplinary team with backgrounds ranging from sociology, science and technology studies, sociology of race and ethnicity, philosophy and anthropology, to clinical genetics and genomics medicine statistics undertook the study between March and May 2022 and synthesised evidence in three stages.
Rapid review
We drew on methods of systematic reviews to search for eligible empirical studies on electronic databases, across academic and grey literature (including editorials and conference presentations). We conducted the search using OVID Embase, The Social Science Premium Collection and Web of Science databases (see thesaurus and free text search terms in Supplementary material S1). We applied date and language filters to include English-language articles that were published between 1st January 2000 and 26th February 2022 and were readily available electronically through institutional subscriptions or directly from the author. We outlined the inclusion criteria (Table 1) using Strech et al.’s (2008) Methodology, Issues, Participants (MIP) model and Butler et al.’s (2016) guide; the criteria were developed iteratively, with two researchers piloting 30 abstracts to test and adjust eligibility.
In total, 100 articles were included in the rapid review (see Figure 1 for the process, and Supplementary material S2 for the full list). The PRISMA-S checklist was used to guide the literature search and reporting on the process (Rethlefsen et al., 2021).
We collaboratively designed and piloted data extraction forms, and analysed the extracted data in meetings using thematic analysis methods (Braun and Clarke, 2012; Terry et al., 2017). The extracted data included any participant concerns [Footnote 1] about participation in health and genomics studies that were discussed in the findings, discussion or conclusion sections of the articles, as well as authors’ ethical concerns raised in any section of the articles.
Diverse data ethics workshop
We presented the preliminary themes generated during the rapid review at an online expert workshop in May 2022. The workshop was attended by seven international academics across the fields of medical ethics and bioethics, women’s studies and health promotion, sociology and law, most of whom have been involved in past or current initiatives that attempt(ed) to diversify genomic data. The workshop aimed to consult with key academic experts in the field about the preliminary findings of the review and to identify gaps in the literature. Experts were all female academics affiliated with universities in the United States of America, the United Kingdom, and Australia. The workshop inherited the weakness of the rapid review, in that the invited academics were from English-speaking countries and were people whose work in the field we were familiar with through the rapid review or beyond. The findings of the review, therefore, mainly stem from authors and workshop experts located in a few countries from the Global North.
Other attendees included four members of Genomics England’s Diverse Data initiative, colleagues from the PHG Foundation, colleagues from the University of Oxford with research expertise at the intersection of health/genomics and ethics, and the members of the review team (n = 17). The workshop explored the themes generated during the rapid review, focusing on the complexity of the topic, especially because some of the issues we anticipated did not appear in empirical literature and may be embedded and hidden within research practices or wider social structures and systems. Conversations were recorded, transcribed and analysed collaboratively by team members to generate key themes.
Post-workshop narrative review
We conducted a post-workshop narrative review to supplement the rapid review and workshop discussions. As Greenhalgh et al. (2018) argue, systematic reviews are focused and have summative value, whilst narrative reviews focus on the more interpretative and critical stances designed to enhance understanding. Our rapid review drew on elements of systematic reviews and therefore we considered that our synthesis would benefit from an additional narrative review. Moreover, the search strategy of the rapid review was limited to articles that had genomics and related words in their title and abstract. However, during the screening, we realised that some of the expected ethical issues were only discussed in the wider health research literature.
The narrative review built on key themes from the workshop and from our research group’s knowledge base that were missing from the rapid review. We analysed the workshop transcripts, including the contributions of researchers within our research group, to identify key themes, and then compared these with the themes that emerged from the literature review. Where themes were similar, we incorporated any additional issues emerging from the workshop; where themes were new, we added them to the literature review. We also searched Google Scholar for the themes generated from the workshop discussions in the wider health studies literature, and used snowballing from relevant literature supplied by the workshop participants to expand on the newer themes.
Findings
Analysing themes from the rapid review, the expert workshop and the narrative review, we found that ethical issues are interconnected across structural factors and research practices. Structural issues include those related to the politics of knowledge production, existing inequities, and their effects on how the harms and benefits of genomics are distributed. Issues related to research practices include those around reflexivity, exploitative dynamics and prioritising meaningful co-production. In what follows we start by detailing structural issues.
Structural issues
Our synthesis identified two key themes related to the structure of the research from which ethical issues may arise: the politics of knowledge production and the implications of existing inequities.
Politics of knowledge production
Our findings showed how the ethical issues related to the structure of research might arise from a failure to recognise and engage with the politics of knowledge production – that is to say, the ways in which knowledge is produced, validated and disseminated, and how these processes are influenced by social, economic, political and cultural factors. Ethical issues may arise from overlooking the politics of knowledge production in different ways:
(1) Data, categorisation and neutrality
The perception of data and technologies as neutral and objective was discussed during the workshop. This perception could prevent researchers from interrogating classification systems, categorisation methods and research designs. Such interrogation is key to unpacking the societal values embedded in technologies; if it is neglected, research risks perpetuating social biases and inequalities. The narrative review echoed these concerns, emphasising that data and technologies cannot be separated from their social context and tend to reflect biases and social inequalities (Bowker and Star, 2000; Gitelman, 2013; Benjamin, 2019; Ruppert and Scheel, 2021). For example, classification systems and technical tools used for categorising populations are not neutral and need to be closely examined (Bowker and Star, 2000) [Footnote 2]. This includes common racial and ethnic categories used for recruiting individuals from underserved groups (Popejoy, 2022), as well as the concept of genetic ancestry used for genomic analysis (Lewis et al., 2022). Whilst self-reported racial and ethnic categories can be helpful for studying health inequalities [Footnote 3], they should not be used as mappings based on genetic variation (Shim et al., 2014) [Footnote 4], and therefore may not help in studying genetic variation across populations [Footnote 5]. The narrative review highlighted the need to consider the political implications of such commonly used methods in research.
For example, research design might reflect methodological “whiteness,” which fails to acknowledge the role of race in the structuring of the world and knowledge construction (Bhambra, 2017, cited in Rai et al., 2022, p. 4).
(2) Misconceptions of race as a biological category
The rapid review stressed that using social categories in genetic research without considering their contingent and complex nature can lead to misconceptions that race and ethnicity are biological constructs, which in turn can perpetuate the stereotyping and objectification of certain groups (Ali-Khan and Daar, 2010, pp. 26–27; Singh and Steeves, 2020). Similarly, the narrative review included arguments advocating the need to critically evaluate the use of race in genetic research, explaining that human genetic variation is not adequately captured by social classifications such as race and ethnicity, as there is often greater genetic variation within groups than between them (Lewontin, 1972; Tishkoff and Kidd, 2004). It was highlighted that, despite anti-racist agendas, genomic research can inadvertently reinforce race as a biological concept when social categories are employed to diversify genomic data (Wade et al., 2015, p. 777). For example, clustering genetic ancestry by continent can contribute to the reification of racial categories or increase the likelihood of stereotyping (Lewis et al., 2022). It is therefore important to be aware of the potential consequences of using social categories in genetic research and to strive for more equitable approaches to understanding genetic variation (Lewis et al., 2022).
Existing inequities
The effect of underlying power imbalances and existing inequities on the distribution of harms and benefits of research was identified as a theme in both reviews and workshop discussions. Socioeconomic factors like race, ethnicity, social class, citizenship and cultural capital affect participants’ ability to access research benefits (Schulz et al., 2003), whilst the organisational structure of healthcare services may exclude underserved groups (Halford et al., 2019), and curtail targeted health interventions from genomic research for these groups (Hammonds and Reverby, 2019). Moreover, people from underserved groups may endure specific harms, such as those arising from structural racism and legacies of colonialism. These issues can be grouped into three subthemes.
(1) Legacies of colonialism and structural racism
The workshop and narrative review highlighted the influence of historical trajectories of structural racism, legacies of colonialism and unethical conduct on current experiences of participating in biomedical studies (Harry and Dukepoo, 1998; Bowekaty and Davis, 2003; Strickland, 2006; Washington, 2006; Christopher et al., 2011; Harding et al., 2012; Hodge, 2012; Kelley et al., 2013; Morton et al., 2013). The study of genetics has itself played a part in perpetuating racism (Roberts, 2011) and has been used to support racist ideologies (ASHG, 2018). Sometimes this has been explicit; for instance, white nationalists have attempted to use genetic ancestry testing to advance their claims of racial superiority (Harmon, 2017; Panofsky and Donovan, 2019). However, colonial practices have also been perpetuated more inadvertently: the Human Genome Diversity Project (HGDP), which aimed to explore global human genetic diversity, was criticised for resembling activities of European colonialists and had long-lasting implications for trust in researchers (Dodson and Williamson, 1999; Greely, 2001; TallBear, 2007; Roberts, 2011; Claw et al., 2018).
(2) Barriers to participating in and benefiting from research
The rapid review highlighted that trust issues can be worsened if participants’ healthcare needs are deprioritised in research, especially if genomic services are limited or unaffordable to certain groups (Hiratsuka et al., 2020). Low participation rates of underserved groups in biomedical research were understood in the narrative review and workshop discussions as not solely due to mistrust in institutions or researcher–participant relations (Katz et al., 2007, 2008; Fisher and Kalbaugh, 2011). Rather, structural issues associated with limited access to healthcare services, biased assumptions by healthcare professionals and the need for translation services were considered as potential contributors (Fisher and Kalbaugh, 2011; Shim et al., 2022). Ongoing efforts were deemed necessary to establish trustworthiness (Strickland, 2006; Reverby, 2009).
(3) Diversity in the workforce
Both reviews and the workshop discussions highlighted that underrepresentation of diverse ethnic groups in the genomic workforce and lack of diversity amongst genomic researchers (Bentley et al., 2020; Lewis-Fernández et al., 2018) play their own part in perpetuating inequities. A diverse workforce was considered crucial for reducing inequities in healthcare and scientific research and realising the promise of genomics (Aviles-Santa et al., 2017; Atkins et al., 2020; Hiratsuka et al., 2020; Bonham and Green, 2021), as well as for enhancing the innovation and creativity that result from more varied lived experiences and perspectives (Lee et al., 2019). The absence of diversity in the workforce has the potential to lead to a loss of voices in developing hypotheses and leading research (Bentley et al., 2020; Bonham and Green, 2021). A supportive environment and management were perceived as necessary for sustaining this diversity.
Studies warned about tokenistic attempts at diversification whereby existing power structures and hierarchies remain unchallenged, leading to staff from underserved groups being overburdened with addressing diversity issues (Taylor and de Mendoza, 2018; Ahsan, 2022; Jeske et al., 2022).
Issues surrounding research practices
Our synthesis identified three key themes related to the practice of research from which ethical issues may arise: (a) reflexivity, (b) exploitative practices, and (c) co-production and engagement.
Reflexivity
Our findings highlighted how ethical issues related to research practice might arise from a lack of researcher reflexivity. This can occur in four main ways.
(1) Cultural humility
Cultural factors can affect people’s attitudes towards biobanking and the sharing of genomic data (Abadie and Heaney, 2015; Anie et al., 2021; Canedo et al., 2020; Haring et al., 2018; Hiratsuka et al., 2020; Lysaght et al., 2020), as well as access to medical help (Atkins et al., 2020) and health outcomes more generally (Aviles-Santa et al., 2017).
Incorporating cultural values in research practices was perceived as necessary for improving diversity (Jacobs et al., 2010; Aviles-Santa et al., 2017; Haring et al., 2018; Kraft et al., 2018; Bentley et al., 2020; Hiratsuka et al., 2020; Hendricks-Sturrup and Johnson-Glover, 2021; Fatumo et al., 2022). However, some highlighted that using cultural factors for stereotyping and blaming patients for mismanaging disease (Bell et al., 2019) should be avoided. Others aspired to integrate cultural factors in their research practices. For example, Beaton et al. (2017) described a framework for incorporating cultural values in the design of genomic research, and Bonham et al. (2009) discussed how deliberation and participatory research methods can be culturally tailored to empower participants to generate policy recommendations.
The workshop discussions and narrative review confirmed the significance of cultural context in research (Arbour and Cook, 2006; Ilkilic and Paul, 2009), and in clinical practice (Warren and Wilson, 2013), and advocated prioritising local cultural values [Footnote 6] and improving cultural humility (Sabatello, Blake, et al., 2019). Cultural humility refers to the practice of self-reflection (Tervalon and Murray-García, 1998), and “learning our own biases, being open to others’ cultures, and committing ourselves to authentic partnership and redressing power imbalances” (Minkler, 2012, p. 6). It emphasises the importance of reflexivity, active listening and taking responsibility for interactions on the side of researchers and research institutions (Minkler, 2012; Isaacson, 2014; Sabatello, Blake, et al., 2019). Many also advocated prioritising local cultural values and accommodating collective considerations, in addition to individual autonomy, in research practices (Emanuel and Weijer, 2005; Tsosie et al., 2019) [Footnote 7].
(2) Accessibility [Footnote 8]
Both reviews and workshop discussions emphasised the importance of adapting research practices to the needs of different groups and designing accessible communication strategies that ensure critical information is conveyed clearly and effectively (Kobayashi et al., 2013; Campbell et al., 2017; Kraft and Doerr, 2018; Sabatello, Blake, et al., 2019; Hendricks-Sturrup and Johnson-Glover, 2021; Uebergang et al., 2021; Garofalo et al., 2022). Such communication strategies were thought to improve the trustworthiness of research (Blanchard et al., 2020). However, it was also reported that critical information on genomic health research is sometimes communicated in ways that can cause confusion and misunderstandings for participants, posing barriers for participation in genomic research (Garofalo et al., 2022). Inaccessible facilities, information, transportation and other systematic and institutional factors were reported as barriers to access and participation for people with disabilities (Sabatello, Chen, et al., 2019; Garofalo et al., 2022).
(3) Contextualising participants’ concerns
The rapid review reported concerns about the assumptions made regarding non-participation in genomic studies. Participants’ concerns included those related to privacy (Buseh et al., 2013; Abadie and Heaney, 2015; Simon et al., 2017; Garrison et al., 2019; Lee et al., 2019; Reddy et al., 2020; De Ver Dye et al., 2021; Hendricks-Sturrup and Johnson-Glover, 2021), stigmatisation (Marsh et al., 2013; Abadie and Heaney, 2015; Faure et al., 2019), commodification of data leading to dispossession (Abadie and Heaney, 2015) and re-use of data beyond the scope of the original research [Footnote 9] (de Vries et al., 2014), for example through commercialisation of the research and unjust corporate profiteering (Lee et al., 2019). It was noted that whilst such concerns may be common amongst other groups, they might be heightened for those from underserved groups due to experiences of stigmatisation, discrimination and prejudicial judgement (Abadie and Heaney, 2015) [Footnote 10], particularly in cases of disease-related stigma (Ali-Khan and Daar, 2010; Faure et al., 2019). For example, Schulz et al. (2003, p. 165) described that “concerns…included the risk that the racial or ethnic group as a whole would become identified with one or more genetic condition and that this identification would lead to discrimination and further inequality.” The potential harms from stigmatisation may be felt immediately within groups, whereas the benefits of genomic research may take much longer to materialise (Beaton et al., 2017). Furthermore, even when the benefits of the research are more immediate, wider socioeconomic factors may affect people’s ability to access those benefits (Schulz et al., 2003).
(4) Conceptual clarity
The workshop discussions and the narrative review highlighted the difficulty of measuring diversity, and of using any such measurements in different contexts. When discussing the need for diversity in genomic data, it is often implied that we are talking about ancestral diversity (Popejoy et al., 2018; Mills and Rahal, 2020). However, there is a lack of conceptual clarity in the language of race, ethnicity and ancestry in genomic studies (Bonham et al., 2009; Bonham et al., 2018; Birney et al., 2021; Khan et al., 2021). While the use of these terms is evolving (Flanagin et al., 2021; Khan et al., 2021), differences in when, where and how they are used remain (Hunt and Megyesi, 2008). There is a tendency to use genetic (biogeographical) ancestry and ethnicity/race interchangeably, leading to conflation between socially constructed notions of race and ethnicity that are tied to identity and biological categories of ancestry (Armitage, 2020). Similarly, terms such as “population” and “community” are also often used without interrogating how they are conceptualised. For example, community might be used to refer to a group of people with geographic proximity, shared characteristics or shared lived experiences (M’charek, 2000).
Exploitative practices
The history of medical research is rife with scandals that harmed individuals and groups.Footnote 11 The narrative review found concerns about “ethics dumping” – where privileged researchers outsource ethically questionable research activity to lower-income or less-privileged settings with less oversight (Nature Editorial, 2022). Concerns were raised about exploitative and inequitable dynamics when researchers from high-income countries work with participants from lower-income countries (Igbe and Adebamowo, Reference Igbe and Adebamowo2012; de Vries et al., Reference de Vries, Williams, Bojang, Kwiatkowski, Fitzpatrick and Parker2014) and in the absence of adequate and culturally appropriate oversight (Tiffin, Reference Tiffin2019). Specifically, without commitment to capacity building, researchers may take advantage of funding and programs from developing regions without contributing to the larger objectives of local communities (Mulder et al., Reference Mulder, Abimiku, Adebamowo, de Vries, Matimba, Olowoyo, Ramsay, Skelton and Stein2018), or passing on to them the full benefit of the research (Bentley et al., Reference Bentley, Callier and Rotimi2020).
Co-production and engagement
The narrative review highlighted that a reductionist approach to participant engagementFootnote 12 – one that prioritises, or is limited to, recruitment – can worsen existing inequalities and create new ones (Moodley and Beyer Reference Moodley and Beyer2019). In their critical reflections about a study that formed part of a randomised controlled trial, Rai et al. (2022) point to the ways in which standard approaches to participant recruitment prioritise speed and volume of recruitment, with little scope for investing time in more community-based approaches centred on relationship building.Footnote 13 Instead, engagement must be long term and regularly evaluated (US National Academy of Medicine, 2022). Furthermore, limiting engagement to the recruitment stage and applying market research tools and strategies in recruitment such as demographic targeting (Epstein, Reference Epstein2008; Cooper and Waldby, Reference Cooper and Waldby2014) can overlook the fact that barriers to participation are often more structural. Conflating recruitment with engagement can lead to further alienation of groups that are already impacted by historical injustices and, consequently, have implications for trust (Ferryman and Pitcan, Reference Ferryman and Pitcan2018).
The workshop discussions highlighted the significance of acknowledging participants as active researchers and knowledge producers, and emphasised the need for co-production of research together with potential participants. This was suggested to help identify and avoid potential problems around data diversification. The narrative review also revealed the role of academic journals in driving change, as many now take a stand against research practices that only involve local researchers in the research process during recruitment (Nature Editorial, 2022).
Various studies in both reviews advocated community engagement throughout research processes (Boyer et al., Reference Boyer, Dillard, El Woodahl, Thummel and Burke2011; Chadwick et al., Reference Chadwick, Copeland, Daniel, Erb-Alvarez, Felton, Khan, Saunkeah, Wharton and Payan2014; Beans et al., Reference Beans, Saunkeah, Woodbury, Ketchum, Spicer and Hiratsuka2019; Tsosie et al., Reference Tsosie, Yracheta and Dickenson2019; Blanchard et al. Reference Blanchard, Hiratsuka, Beans, Lund, Saunkeah, Yracheta, Woodbury, Blacksher, Peercy, Ketchum, Byars and Spicer2020; Hiratsuka et al., Reference Hiratsuka, Hahn, Woodbury, Hull, Wilson, Bonham, Dillard, Avey, Beckel-Mitchener, Blome, Claw, Ferucci, Gachupin, Ghazarian, Hindorff, Jooma, Trinidad, Troyer and Walajahi2020; Hudson et al., Reference Hudson, Garrison, Sterling, Caron, Fox, Yracheta, Anderson, Wilcox, Arbour, Brown, Taualii, Kukutai, Haring, Te Aika, Baynam, Dearden, Chagné, Malhi, Garba, Tiffin, Bolnick, Stott, Rolleston, Ballantyne, Lovett, David-Chavez, Martinez, Sporle, Walter, Reading and Carroll2020; Kaladharan et al., Reference Kaladharan, Vidgen, Pearson, Donoghue, Whiteman, Waddell and Pratt2021), and some incorporated it in the design, development and implementation of their studies (Hiratsuka et al., Reference Hiratsuka, Brown and Dillard2012). The rapid review touched upon the lessons learnt from research initiatives that aspired to prioritise co-production. For example, Kowal (Reference Kowal2019) outlined the ethical issues involved in co-producing the “first Indigenous-governed genome facility in the world” − the National Center for Indigenous Genomics (NCIG), with biosamples held at the Australian National University (ANU).
Limitations
We noted some limitations to our review. Firstly, the rapid review search resulted in papers that were mostly from the USA. Furthermore, the search mainly focused on underrepresentation that was based on gender, race and ethnicity, leaving out other (sometimes) underserved groups such as children, older people, people with mental health conditions, prisoners and so on. Secondly, whilst those invited to the workshop were experts in the field, other key voices, such as those from low- and middle-income countries and non-English speakers, were missing from the workshop due to time and budget limitations. In this sense, the workshop inherited the weakness of the rapid review, in that the invited academics were from English-speaking countries whose work in the field we were familiar with through the rapid review or beyond. The findings of the review, therefore, mainly stem from authors and workshop experts located in a few countries from the Global North and were not first-hand experiences of underserved individuals.
Conclusion
The evidence synthesis identified a number of ethical issues arising from the structure and practice of research. Although structural issues partially inhibit researchers and participants from ethically diversifying genomic data, researchers can and should develop new approaches that improve current practices. Mistrust due to past unethical research conduct, different definitions of knowledge and a tendency to seek technical solutions, amongst other factors, contribute to the lack of diversity in current genomic repositories. Incorporating cultural humility can help improve the inclusivity and diversity of health and genomic studies. Co-production approaches can also help mitigate some of the ethical issues, and their absence can worsen existing power imbalances. Improving reflexivity of practices by researchers and research institutions can also help avoid exacerbating existing issues.
Our findings demonstrate that diversifying the data on its own is not enough for addressing health inequities, and diversity must be approached holistically to confront unethical practices by researchers, academic institutions, funding bodies, academic journals and policymakers. Therefore, efforts to diversify data must be accompanied by the empowerment of underserved groups and engagement with structural issues to address wider inequities. We conclude it is essential to co-create knowledge with potential participants and ensure that the benefits of that knowledge are fed back to diverse populations. To diversify genomics as an enterprise, ethical preparedness must be valued and facilitated, and research cultures established that encourage engagement with ethical issues. Cross-fertilisation of ideas between researchers, participants and theorists is essential for facilitating ethical preparedness (Farsides and Lucassen, Reference Farsides and Lucassen2023). Moreover, interdisciplinary collaborations that accommodate working with different knowledge systems can help go beyond diverse data and towards diverse knowledge making.
In conclusion, it is necessary to broaden the scope of diversity beyond data, and engagement beyond recruitment, to encompass all stages of research, from forming the research questions, to analysis, dissemination and governance.
Open peer review
To view the open peer review materials for this article, please visit http://doi.org/10.1017/pcm.2023.20.
Supplementary material
The supplementary material for this article can be found at https://doi.org/10.1017/pcm.2023.20.
Acknowledgements
We wish to thank Prof Jenny Reardon, Dr. Alice Popejoy, Prof Emma Kowal, Dr. Krystal S. Tsosie, Dr. Jenny Douglas, Dr. Maya Sabatello, Dr. Colin Mitchell, Alison Hall, Prof Catherine Pope, Prof Donna Dickenson and Dr. Arzoo Ahmed for participating in the workshop; many of them also commented on the early stages of this work. We also wish to thank Dr. Helena Carley, Dr. Natalie Banner, Prof Karoline Kuchenbaecker, the Diverse Data team at Genomics England and Bana Alamad for their comments on the earlier drafts, and Vicky Fenerty for their comments on the rapid review’s search strategy.
Author contribution
F.H., K.L., R.H., G.S., S.W., L.B. and A.L. contributed to conceiving the research. F.H., G.S., L.B., R.T., L.V.D.P.T., J.D.G.U., D.K., T.J., N.T-W., E.R.H., F.R.A., Y.E., E.H. and A.L. contributed to the rapid review. All contributed to the workshop. F.H., K.L., R.H., G.S., S.W., L.B. and A.L. contributed to the narrative review. F.H., K.L., R.H., G.S., S.W., L.B., M.M. and A.L. were involved in drafting and editing the paper. K.S.T., P.D., M.M., L.N. and A.L. commented on the earlier drafts that shaped the paper. A.L. oversaw the project.
Financial support
A review on the ethical, legal and social issues around diversifying genomic data was commissioned by Genomics England. Additional financial support came through work funded by Wellcome Trust grant numbers 205339/A/16/Z and 208053/B/17/Z.
Competing interest
This review was commissioned by the Diverse Data initiative at Genomics England in January 2022 to gather evidence and learnings from previous data diversification efforts, to inform the initiative’s design. M.M. is the Programme Lead, and L.N. is the Ethics Lead for the initiative. M.M. and L.N. took part in the workshops and commented on drafts of the paper.