The transparency revolution has swept across the social sciences. Within political science, qualitative researchers have been divided about the value of increasing forms of transparency for qualitative data derived from interviews, focus groups, and participant observations. These discussions often talk past one another and do not systematically analyze current practices. Furthermore, these debates have not acknowledged that the costs and benefits of transparency initiatives within qualitative research differ across data types. The result has been a mischaracterization of the status quo, a failure to provide a blueprint for what transparency practices are feasible, and a lack of specificity about the costs and benefits of changing standards for research openness. In this article, we take the case of text-based sources, a data type used by scholars across the social sciences, to show that current transparency practices are insufficient, and we provide actionable guidelines for how to improve them.
Discussions about increasing transparency, or research openness, in political science began when Gary King called for the discipline to be held to the standard of replicability (1995, 444). Various initiatives followed, including the American Political Science Association’s (APSA) guidelines for the adoption of Data Access and Research Transparency (DA-RT) (APSA 2012, 9–10), new transparency technologies and institutions (Moravcsik 2014, 50–52; Kapiszewski and Karcher 2021, 473–76), and the adoption of the Journal Editors’ Transparency Statements (JETS) (2015) by at least twenty-seven political science journals.
Qualitative researchers are divided over the merits of the transparency movement.Footnote 1 Some have been supportive and experimented with emerging transparency technologies (e.g., Saunders 2014, 696–97; Mayka 2021; Herrera 2015; Myrick 2021; Siewert 2021; Herrera 2017). Proponents have argued that greater transparency improves research evaluation and assessment and the research process itself; bridges diverse research communities; and facilitates knowledge building (Elman, Kapiszewski, and Lupia 2018, 39–41; Jacobs et al. 2021, 177–79).
Other qualitative researchers have pushed back against the transparency movement, questioning whether replicability can be applied to qualitative research and even the concept of transparency itself (Jacobs et al. 2021, 179–82). They have raised concerns about transparency initiatives that generate ethical issues (e.g., Monroe 2018, 143–45; Tripp 2018, 730–35) and that reflect a narrow understanding of political science scholarship, especially for qualitative-interpretive traditions (e.g., Isaac 2015, 277; Schwartz-Shea and Yanow 2016, 6–8).
While recent discussions of transparency in political science have centered on dissuading scholarly misconduct or augmenting research’s replicability, we argue that research transparency can provide more tools for evaluating claims and produce better qualitative research. Strategies for strengthening transparency come with costs—some trivial, others more substantial—and scholars often disagree about whether these costs outweigh the benefits. We evaluate the costs and benefits of enhanced transparency, analyze scholarly debates, and provide a practical blueprint to enhance transparency in text-based research.
We advance a five-point framework for improving transparency practices for text-based sources, premised on more explicitly specifying source location, production, selection, analysis, and access. Next, in order to explain current practices—and researchers’ hesitation in changing them—we review 1,120 articles in leading political science journals. We then analyze online commentary from political scientists who participated in the Qualitative Transparency Deliberations (QTD), an online forum organized by APSA’s Qualitative and Multi-Method Section from 2016 to 2018. Using posts as a primary source, we identify a set of common concerns scholars raised about research openness. These concerns are then juxtaposed against three illustrative article examples that showcase different qualitative research traditions implementing transparency in practice. We conclude by recommending a series of norms and practices for augmenting transparency for text-based sources.
A Framework for Evaluating Transparency in Text-Based Sources
Defining Text-Based Sources
Qualitative research depends, in part, on analyzing text-based sources. Text-based sources can, for example, include documents from government archives, records from political parties or social movements, correspondence, speeches, diaries, court rulings, media transcriptions, and secondary sources. Textual evidence may also incorporate multimedia sources such as photographs, videos, and websites. Because text-based sources are inanimate and not created by researchers, they are less prone to respondent bias (e.g., acquiescence or social desirability bias) and researcher bias (e.g., moderator or wording bias) than more interactive qualitative data collection methods such as interviews (Kapiszewski, MacLean, and Read 2015, 151–60). However, the act of selecting and interpreting text-based sources for descriptive or inferential claims can introduce other types of collection and analysis bias.
Existing Approaches for Text-Based Sources from Other Disciplines
We now turn to related disciplines to examine best practices concerning text-based sources. In history, preliminary source analysis often begins with understanding source type. For example, a source may be a relic (a physical specimen) or testimony (oral or written report). A source may have been produced intentionally (to serve as an official record) or unintentionally. Furthermore, written sources are often categorized as being either narrative (chronicles or tracts of opinion), diplomatic/juridical (documenting a legal situation), or social (products of recordkeeping by bureaucratic agencies) (Howell and Prevenier 2001, 17–28).
Marc Bloch notes that “the struggle with documents” is what defines the professional historian (1954, 86). The challenge begins with text selection. There is rarely one authoritative source, and historians must adjudicate between sources since “the majority of sources are in some ways inaccurate, incomplete and tainted by prejudice and self-interest” (Tosh 2002, 98). Political scientists encounter these challenges too, since transcripts capturing interactions between dominant and subaltern groups are imbued with power disparities where “it is frequently in the interest of both parties to tacitly conspire in misrepresentation” (Scott 1990, 2). To determine a source’s authoritativeness, historians consider a document’s genealogy: its genesis, its originality, and the author’s trustworthiness (Howell and Prevenier 2001, 61–68). They draw inferences from a text by determining “how, when and why it came into being” (Tosh 2002, 86).
Understanding the production of sources also involves interrogating how archives are curated: whose agenda is reflected and what is absent. Archivists and state officials are responsible for how records are categorized, described, and made accessible. Context matters. Amid political uncertainty or violence, records are often lost or destroyed. In post-authoritarian countries, archival gatekeepers may fear uncovering truths about the past and decide to mislead or intimidate researchers (Tesar 2015, 105–9). State archives may be inscrutable because of post-war gutting, financial constraints, or self-censorship (Daly 2017, 314–16); some public figures do not keep records (Saunders 2014, 693). When working with historical personal records, researchers may face ethical dilemmas about whether to publish findings based on sensitive materials (Tesar 2015, 110–13).
Researchers may also face ethical issues at the archives, such as sharing research on endangered languages that may offend speech communities being studied (Innes 2010, 199–202) or privacy concerns when working with psychiatric records (Taube and Burkhardt 1997, 61–63). In countries marked by economic and political instability, the best documentary evidence may be found outside of state archives (Daly 2017, 312). Informal archives can be invaluable evidentiary sources, but accessing informal archives requires the researcher to locate them and gain the owner’s trust (Auerbach 2018, 345–46). We use these valuable interdisciplinary insights to inform our understanding of how to work with text-based sources.
Transparency Principles for Text-Based Sources in Political Science Research
Scholars engaged in debates about qualitative research transparency often talk past one another because there remains considerable disagreement about what research transparency means. Although the dominant typology emerging from DA-RT initiatives champions “data access, analytic transparency and production transparency” (APSA 2012, 9–10), there is little consensus on how these goals should apply to qualitative work.
We propose five transparency-enhancing principles for text-based sources. These involve increasing specificity about 1) where sources are located, 2) how sources were produced, 3) why the researcher chose the source, 4) how the source provides evidence for the scholar’s claim, and 5) how to access the source material. The overarching goal of transparency is to help others evaluate a researcher’s key claims. These standards will not apply to every statement made in a manuscript, but rather to key analytical, descriptive, or causal claims (see also Jacobs et al. 2021, 21).Footnote 2
Source location. In identifying a source’s location, authors should provide enough information that readers can locate a source themselves. If a document is privately held by the author or not publicly available, this should be noted. Qualitative researchers typically situate evidentiary claims by citing and sometimes quoting a particular source, but often do not provide enough information to allow readers to actually find it. This problem is pronounced in the case of secondary source citations with missing page numbers.
Source production. All text-based sources are produced in contexts outside of a researcher’s control. These conditions present unique challenges for scholars seeking to evaluate a source’s evidentiary value. When was the source created? Who was involved in its production? What were the contextual factors around its use? Was the source created by a state-sponsored organization, a media outlet with a particular ideological orientation, a paid consultant, or a political dissident? Answers to these questions can fundamentally alter scholars’ interpretations. Production-related information allows readers to evaluate a source’s evidentiary value through the broader context in which it was formed (see also Elman, Kapiszewski, and Lupia 2018, 33–34).
Source selection. As Scott notes, sources that document the “open interaction between subordinates and those who dominate” portray very different accounts than those that occur “offstage” (1990, 2–5); thus, reliance on one type of source would provide different insights than reliance on another. Qualitative researchers working in positivist traditions are often warned against selection bias (Moravcsik 2014, 49; Thies 2002, 355). Sources may be imbued with a variety of biases, which can vary depending on the nature of the research project. With these issues in mind, authors can help readers evaluate the quality and applicability of evidence by explicitly discussing why they selected specific sources to support particular evidentiary claims. Why was one source privileged over another? Is it because it provides more detail, is authored by someone with more knowledge, or simply because it is the only option available? Answers to these questions will help readers evaluate the credibility of sources and authors’ use of them.
Source analysis. Another transparency measure for text-based sources is providing information on how a source supports an author’s claims, otherwise known as “analytic transparency” (Moravcsik 2014, 48–49). As Elman, Kapiszewski, and Lupia note, “the goal of analytic transparency is to help others understand how we know what we claim to know” (2018, 34). This approach helps readers assess how the author is drawing inferences from a source or mix of sources.Footnote 3 Political scientists’ record on analytic transparency is mixed. For example, some scholars are more likely than others to include discussion of a source’s analytical value in a “meaty footnote,” although journals’ varying word-count limits and differing subfield norms lead to inconsistency across the discipline. Yet source analysis is fundamental for research transparency because it allows readers to better understand how a source supports an author’s claims and why a particular document holds evidentiary value.
Source access. The final transparency-enhancing measure involves sharing an excerpt of a source or the entire source itself. Under APSA guidelines, this technique is referred to as “data access” (APSA 2012, 9–11). There are several dimensions of research explicitness that do not involve making full or partial sources publicly available and thus greater transparency can be achieved even without data source dissemination. However, we emphasize that sharing source excerpts can help readers gauge a source’s authorial intent and meaning.
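As one illustration of how the five principles might be recorded for a single source, for instance in a transparency appendix, the following minimal sketch uses a small Python record. The class name `SourceRecord`, its field names, and the example entry are hypothetical; they are not drawn from APSA guidelines or from any journal's requirements.

```python
from dataclasses import dataclass, fields
from typing import Optional


@dataclass
class SourceRecord:
    """Hypothetical per-source entry covering the five principles."""
    location: Optional[str] = None    # where the source can be found
    production: Optional[str] = None  # who created it, when, and in what context
    selection: Optional[str] = None   # why this source was chosen over others
    analysis: Optional[str] = None    # how the source supports the claim
    access: Optional[str] = None      # excerpt, link, or note on restrictions

    def missing(self) -> list[str]:
        """Return the names of the principles left undocumented."""
        return [f.name for f in fields(self) if not getattr(self, f.name)]


# Invented example entry, for illustration only.
record = SourceRecord(
    location="National archive, Box 12, Folder 3 (hypothetical)",
    production="Internal ministry memo, 1962 (hypothetical)",
    selection="Only surviving record of the meeting (hypothetical)",
)
print(record.missing())  # ['analysis', 'access']
```

A record of this kind only flags which principles remain undocumented for a key claim; it does not imply that every source must satisfy all five, since confidentiality or copyright may rule out some forms of access.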
Considering Interpretivism and Research Openness
Scholars in the interpretivist tradition, which centers the use of text-based sources on “meaning-making,” have views that differ from the ones we present. Interpretivists and positivist qualitative researchers “increasingly do not travel under the same philosophical umbrella when it comes to … knowability of their subjects of inquiry” (Yanow and Schwartz-Shea 2015, xvii). Yet text-based sources are central to many types of interpretivist research (e.g., Hansen 2006; Lynch 1999; Tidy 2017), and scholars rely on diverse sources such as films, postage stamps, and political cartoons as “communicators of meaning” (Yanow and Schwartz-Shea 2015, 155). Some interpretivists argue that increased research transparency can lead to greater engagement with methodological positivists, a stronger basis for interdisciplinary work, and a better understanding of how researcher positionality affects accessing sources and generating and analyzing data (Yanow and Schwartz-Shea 2015, xv).
While interpretivists and positivists have different ideas about research openness, we think most would agree that, bracketing confidentiality concerns, clearly stating a source’s location and the circumstances of its production are important goals. These two elements are crucial for scholars concerned with intertextuality, where texts represent a lived experience (Yanow and Schwartz-Shea 2015, 156). Source location and production help establish the context in which texts (and their interpretations) are “coproduced in and through field-based interactions rather than as objectified, free-standing entities available (‘given’) for ‘collection’ divorced from their field setting” (Yanow and Schwartz-Shea 2015, xix).
There may be less consensus between positivists and interpretivists regarding the remaining components of text-based research transparency, such as source selection. The representativeness of a text-based source, for example, would not necessarily factor into an interpretivist’s decision to employ it, since providing evidence of a causal claim is not the goal of interpretivist research. Additionally, interpretivists do not view underlying texts as data until particular texts are brought into the research process; texts are devoid of meaning before a scholar’s schema converts a text-as-source into a source with meaning (Yanow and Schwartz-Shea 2015, xxi). Still, interpretivists view source selection as an important component of the research process; they carefully consider the principles underlying the selection process (Hansen 2006, 73–78). Even though interpretivists and positivists have different goals, enhanced transparency in source selection might be beneficial for certain types of interpretivist work.
Analytic transparency, meanwhile, is already an established part of the interpretivist process. For interpretivists, analysis “commences when one begins to conceive of a research project, to frame one’s research question, read others’ writings on the subject, and design one’s study,” and is not simply the “penultimate step in the research process” (Yanow and Schwartz-Shea 2015, 158). Discourse on how a source supports a given perspective or interpretation is at the core of every argument; analytic transparency is achieved through the act of interpretation itself. Thus, interpretivists who provide detailed source analysis within the body of an article might not need to include further information on analytic transparency in footnotes or appendices. These spaces may instead engage other goals such as discussing researcher positionality.
Finally, in terms of source access, interpretivists may reject the idea that different readers could come to the same conclusion by accessing the same original texts. Indeed, an interpretivist’s data is “processed, not ‘raw,’ data—‘cooked’ and filtered through the initial researcher’s interpretive schema;” this understanding “renders problematic the creation of databases of interpretive data for other researchers to use” (Yanow and Schwartz-Shea 2015, xxi). This does not mean that interpretivists would necessarily oppose sharing sources when possible; however, sharing sources might not indicate acceptance of expectations of a “common norm” or of the JETS transparency mandate. The sharing of texts can serve different purposes for different epistemological communities.
How and Why Do Political Scientists Practice Transparency for Text-Based Sources?
The Status Quo
To examine the status quo in transparency practices within political science, we reviewed five leading journals during six years of publication: American Political Science Review (APSR), American Journal of Political Science (AJPS), World Politics (WP), Perspectives on Politics (PoP), and Security Studies (SS). We initially reviewed APSR, AJPS, WP, and PoP; subsequently, we selected SS precisely because it has historically been receptive to qualitative work. Publication bias may limit the range of printed scholarship; APSR and AJPS tend to publish fewer articles using qualitative methods, whereas WP, PoP, and particularly SS publish more qualitative research (Teele and Thelen 2017, 440–41). Yet these five journals represent prestigious outlets for new work; if we are going to see extensive transparency practices in political science journals, it would likely be in these publications.
We reviewed every article in these five journals published every other year from 2008 to 2018 (a total of 1,120 articles). We first identified articles that used qualitative methods for analysis, excluding those that used solely quantitative methods and mixed-methods articles in which qualitative methods played a very minor role. From these, we selected empirical articles that used text-based sources as the foundation for their claims, yielding 160 empirical articles that substantially used both qualitative methods and text-based sources. That fewer than 15% of the articles surveyed made it into our sample is indicative of the low rate of qualitative research published in top political science journals.
Through this analysis, we identified a wide range of text-based sources used by political scientists conducting qualitative research. Table 1 shows that researchers routinely use archival material, autobiographies, court documents, government archives, pamphlets, NGO reports, newspaper articles, and secondary sources as a basis for empirical evidence. Materials employed less frequently include church documents, religious iconography, film, novels, company reports, speeches, and protest signs. The frequency with which different types of text-based sources appear across the journals varies. During the six years in our sample, SS published qualitative research articles that used text-based sources in 218 instances, whereas the count for WP was 126, PoP was 69, APSR was 31, and AJPS was 8. Taken together, these 160 articles used text-based sources a total of 452 times.
* Secondary sources in this table refer mostly to historical interpretations (e.g., historical treatises, military histories, etc.). Refer to online appendix A, n3, “A Note On Secondary Sources.”
Source: Authors’ compilation based on articles published every other year from 2008–2018. See Appendix A.
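The counts reported above can be checked with simple arithmetic. The short sketch below uses only figures stated in the text; the variable names are ours.

```python
# Instances of text-based source use by journal, as reported in the text.
instances = {"SS": 218, "WP": 126, "PoP": 69, "APSR": 31, "AJPS": 8}

total_instances = sum(instances.values())

# Articles reviewed and articles retained in the qualitative, text-based sample.
articles_reviewed = 1120
articles_in_sample = 160
sample_share = articles_in_sample / articles_reviewed

print(total_instances)        # 452
print(f"{sample_share:.1%}")  # 14.3%
```

The journal counts sum to the 452 instances reported, and the sample share of roughly 14.3% is consistent with the statement that fewer than 15% of surveyed articles entered the sample.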
How explicit are researchers about their use of text-based sources? For each of the five forms of transparency that we coded, an article could receive either a 1 or a 0. Table 2 summarizes our coding for transparency practices across our five dimensions, excluding articles that only relied on secondary sources; online appendices A and C provide more information.
Source: Authors’ compilation based on articles published every other year from 2008–2018. See Appendix A. Cells include raw counts and percentages in parentheses.
When coding for source location, we focused on findability: how easily others could locate the source (if publicly available) or understand where the author had found it. Here we coded only articles whose primary text-based sources were not secondary sources. For example, did citations of newspaper articles provide full titles, dates, and working HTML links? Did archival citations include the location of the archive and additional identifying information (box number, etc.)? In our sample, 78 articles (49%) provided some information about where sources were located.
The remaining forms of transparency were even less frequent in our sample. Only 11% of articles provided information about how their sources were produced, 9% explained how sources were selected, 12% provided analysis of how sources supported claims being made, and 14% provided access to partial or full selections of sources. While it is not surprising that few articles provided full data access, other forms of transparency (such as providing identifying information from archives or indicating why particular sources were selected) were routinely missing.
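The binary coding scheme described above, in which each article receives a 1 or a 0 on each of the five dimensions, can be sketched as a simple tally. This is a minimal illustration rather than the procedure used to produce table 2: the dimension labels, the function name `tally`, and the three coded articles are invented toy values.

```python
from collections import Counter

DIMENSIONS = ("location", "production", "selection", "analysis", "access")

# Invented toy codes for three hypothetical articles (1 = practice present).
coded_articles = [
    {"location": 1, "production": 0, "selection": 0, "analysis": 1, "access": 0},
    {"location": 1, "production": 1, "selection": 0, "analysis": 0, "access": 0},
    {"location": 0, "production": 0, "selection": 0, "analysis": 0, "access": 0},
]


def tally(articles):
    """Return, per dimension, the count and share of articles coded 1."""
    counts = Counter()
    for codes in articles:
        for dim in DIMENSIONS:
            counts[dim] += codes[dim]
    n = len(articles)
    return {dim: (counts[dim], counts[dim] / n) for dim in DIMENSIONS}


# Print raw counts with percentages in parentheses, one line per dimension.
for dim, (count, share) in tally(coded_articles).items():
    print(f"{dim}: {count} ({share:.0%})")
```

Reporting raw counts alongside percentages, as in this sketch, mirrors the cell format described for the coding summary.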
To further analyze source location practices, we studied norms regarding the inclusion of page numbers for secondary sources (refer to online appendix A, n3, for details about our classification of secondary sources). Table 3 confirms that omitting page numbers for secondary source citations has become the status quo. Our analysis shows that of the 20,894 total citations of scholarly sources found in the 160 articles we surveyed, only 43% provided page numbers for in-text citations or notes. The average masks significant heterogeneity across journals, however. In SS, which we selected anticipating stronger transparency norms in qualitative research, 67% of citations contained page numbers for in-text citations or notes; excluding SS, in the other four journals, only 22% of citations contained page numbers. In addition, 39% of citations with page numbers were in cases of directly quoted text; this proportion rises to 51% when SS is excluded from the analysis. These findings suggest that the norm in political science has become to not use page numbers unless citing a direct quotation. That said, we find evidence that journals with stronger qualitative commitments, such as SS, have pointed a way forward by publishing more articles employing enhanced source location transparency standards (see table 3 and online appendix A, n3).
Source: Authors’ compilation based on articles published every other year from 2008-2018. See Appendix A.
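To illustrate what counting page numbers in author-date citations involves, the sketch below flags citations that carry a page locator and computes the share that do. It is a rough heuristic under stated assumptions, not the coding procedure used for table 3: the regular expression, the function names, and the four-item sample are ours, and the pattern would misclassify unusual citation formats.

```python
import re

# Rough heuristic: a four-digit year, a comma, then an arabic or
# lowercase roman page number (e.g., "2002, 98" or "2015, xvii").
PAGE_LOCATOR = re.compile(r"\b\d{4}[a-z]?,\s*(?:\d+|[ivxlcdm]+)\b")


def has_page_number(citation: str) -> bool:
    """Return True if the citation appears to include a page locator."""
    return PAGE_LOCATOR.search(citation) is not None


def page_number_share(citations: list[str]) -> float:
    """Return the share of citations that include a page locator."""
    if not citations:
        return 0.0
    return sum(has_page_number(c) for c in citations) / len(citations)


# Small sample of citation strings in the style used in this article.
sample = [
    "Tosh 2002, 98",
    "Hansen 2006",
    "Yanow and Schwartz-Shea 2015, xvii",
    "Lynch 1999",
]
print(page_number_share(sample))  # 0.5
```

An automated check of this kind could only supplement manual coding, since it cannot tell whether a page number accompanies a direct quotation or a paraphrased claim.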
Our review suggests that there are currently real deficiencies in qualitative researchers’ methods of citing text-based sources. Studies have shown the ubiquity of “reference rot” in APSR (Gertler and Bullock 2017, 167). We frequently encountered broken URLs and missing source material on researchers’ personal webpages across the journals we surveyed. Moreover, our sample presented few patterns or norms of how transparency was carried out when it did occur.
Researchers’ Concerns about Changing Practices
To understand how political scientists have articulated concerns about greater transparency in research utilizing text-based sources, we draw on the Qualitative Transparency Deliberations (QTD) online discussion forum (see Jacobs et al. 2021, 171–76).Footnote 4 The QTD boards were a space for political scientists to discuss what they understood to be the costs and benefits of augmenting transparency practices in qualitative research. Participants were asked to comment on the advantages and disadvantages of increasing transparency standards for text-based sources from their current vantage point. Participants on the QTD boards were broadly representative of “typical” qualitative researchers, few of whom were active proponents or early critics of DA-RT. Participants’ concerns thus provide a valuable “temperature check” on how scholars are grappling with the practical and ethical issues surrounding changing transparency norms. Concerns raised in the QTD forum allowed us to establish a clear set of research issues that we take up in the final section of this paper.
1. How to navigate copyright law? QTD discussion suggested that issues surrounding copyright infringement were unclear. Gelbman noted that “creating an expectation that source materials will be digitized opens a whole can of worms with respect to permissions and copyright.”Footnote 5 If sharing entire documents becomes equated with greater transparency, how would scholars navigate this norm if they are unable to legally or ethically share entire sources? While a transparency appendix could include small excerpts (up to 150 words) from a copyrighted work containing the most relevant information, the question of sharing entire copyrighted works worried some participants. Particular types of text-based data may face even stricter copyright restrictions. Harkness reports that in order to access archival British colonial maps, she “had to agree not to give public access to the digitized maps according to crown copyright law.”Footnote 6 How would scholars, who are not trained in copyright law, determine who holds a copyright for something like photographs published in newspapers, or when a copyright expires? Researchers working internationally encounter contexts where copyright rules may be opaque, changing, or nonexistent.
2. What about archival rules and restrictions? Some archives—especially in developing countries—limit the number of pages researchers can scan or ban any reproduction of their materials. Even if an archive permits some forms of reproduction, making source materials available may be onerous and expensive, particularly for researchers with fewer resources. QTD forum participants noted that graduate students and junior faculty may struggle to scan and process large quantities of sources. Harkness reports from her experience working in African archives that “obtaining photocopies is usually possible … although there is often a bit of an expensive racket around doing so. Those costs cannot always be well-anticipated up front.”Footnote 7 Archives might also disapprove of broad dissemination of their records. Gelbman was concerned that a requirement to share textual data, in addition to creating work for the researcher, could make archives less cooperative. She notes that “[It] can make it harder for researchers to gain access in the first place if [archives] come to believe that facsimiles of documents will be shared in the peer review and publication process.”Footnote 8 Gelbman also writes that “some archive[s] also change their policies, sometimes very suddenly and without maintaining a record of the old rules.”Footnote 9 Would researchers be able to keep up with these changes?
3. What about privacy concerns? While ethical issues may be less common in text-based research than in human-subjects research, researchers using textual evidence may also face ethical dilemmas. Even if archives permit researchers to copy documents, it could be a violation of trust with the archive to publicize its documents. Hymans writes that the owner of a private archive he worked with “did not want anyone else to see the papers, and in fact he did not want anyone else to know that he had the papers.”Footnote 10 Concerns could also arise regarding the privacy of individuals described in personal communications, such as diaries or letters. Thurston notes that she uses “letters from private citizens to advocacy groups in [her] research. This raises questions about whether and how to protect their identities when it comes to citation.”Footnote 11 For some research projects, efforts to de-identify documents may not sufficiently protect research subjects.
4. Can single documents be separated from their broader context? Some QTD posters worried that certain research transparency requirements could place an undue burden on qualitative researchers in cases where providing context for isolated notes requires sharing extensive background information closely linked to the research site. Several commenters were concerned that transparency measures would require separating documents from the broader context that only the researcher understands. One guest poster said that in their archival work, “it would be impossible for a second researcher to understand the significance of document #10 without reading documents #1–9, and #11–50.”Footnote 12 If transparency always requires sharing reproduced documents in their entirety (a point we dispute and take up in the final section of this article), how might authors discuss their consultation of adjacent documents for someone who had not consulted those same documents? This challenge is especially acute for interpretivist, immersive, and ethnographic scholars. Keck writes, “I am perfectly comfortable with the idea that someone who does not speak the languages I speak, does not know the history I know, and does not have the kinds of social and intellectual networks I have … would not be able from my notes or appropriately archivable interviews to come to the same conclusions I have come to.”Footnote 13
5. What about right of first use? Another hesitation in immediately sharing original data stems from a desire to protect one’s “right of first use.” QTD contributors have suggested that journals requiring data sharing should also guarantee an embargo period where newly-collected data is reserved for the collector’s exclusive use. Capoccia writes that “the question of embargoes for original data seems to deserve more attention” from journals and research transparency advocates. Otherwise, researchers will be incentivized to use “off-the-shelf” data instead of collecting their own original sources.Footnote 14 How much lag time should be implemented when sharing data and how can such restrictions be practically implemented?
6. How much time does transparency take? An additional cost of making qualitative research more transparent is time. Researchers may encounter a tradeoff between preparing a transparency appendix and working on other projects. Mansbridge remarks that “the work of producing a TRAX [transparency appendix] is undoubtedly good for the researcher and for the reader. But how good, compared to starting work on another important subject? We have limited lives and very limited research time.”Footnote 15 Some commenters feared that adding time-consuming transparency requirements to qualitative textual research could disincentivize researchers from pursuing it. Greitens observes that “the uncertainty and increased transaction costs around qualitative research seem already to be leading many [graduate students] to conclude that the attempt is simply not worth the risk.”Footnote 16 Poteete notes that “[A TRAX] would certainly slow the time to publication, potentially significantly, and that is an important cost. Slower time to publication implies fewer publications when on the job market or up for review, tenure, or promotion. Will expectations shift to account for changing practices?”Footnote 17
7. How will transparency appendices be viewed in journal and promotion reviews? Would creating a detailed transparency appendix make one’s work vulnerable to reviewers? Handlin finds that “a TRAX is not really like a quantitative appendix in this regard, which (de facto at least) tends to present a bunch of information and additional statistical results that are carefully selected by the author to be relatively bullet proof. Presenting big sections of text from your sources in a TRAX, in contrast, will always open you up to potential criticism from others about the interpretation of the text.”Footnote 18 Would reviewers nitpick and use the material against authors to sink a paper? Others noted that transparency appendices for qualitative research would likely be longer than quantitative article appendices; would reviewers be expected to wade through this material? Finally, how would transparency requirements affect junior faculty under pressure to publish quickly before tenure reviews?
The QTD boards broached important questions about the ethical, legal, and practical issues related to increasing transparency for qualitative research based on text-based sources, some of which are easier to negotiate than others. We take up these issues in the final recommendations section by engaging with established literature and reviewing new digital technologies that can mitigate some QTD concerns.
Illustrative Examples of Transparency for Text-Based Sources
We now draw on three examples of political science scholarship from across the epistemological spectrum to showcase how scholars have advanced transparency in text-based sources along the five dimensions that we previously identified. These examples illustrate how authors using diverse text-based sources and encompassing variant research goals (e.g., theory-building, identifying causal mechanisms, critical discourse analysis and interpretive method, and qualitative replication) have already incorporated transparency practices in their qualitative research; Appendix D provides additional examples.
Example #1: Theorizing International Legitimacy with Archival Sources
Blending a process tracing and social-constructionist approach, Musgrave and Nexon (Reference Musgrave and Nexon2018) set out to explain why states routinely invest in expensive endeavors that do not appear to yield military or economic gain. The authors argue that when states face legitimacy concerns, they carve out authority in symbolic spheres to project international leadership. This argument is advanced through analysis of diverse text-based sources, including governmental archive records, declassified intelligence documents, presidential statements, and U.S. cabinet minutes during the Cold War. The authors also consult specialist tracts drawn from archaeological, documentary, and other primary sources from the Ming Dynasty in the early fifteenth century.
Source location. When introducing archival materials, the authors provide specific location information down to the last identifier, facilitating “findability” and external assessment. For example, to tie U.S. investments in its Apollo space program to a perceived challenge to national competence after the USSR launched its Sputnik satellite, the article draws on contemporaneous viewpoints of American officials. Musgrave and Nexon provide detailed identifier information, for example: “US Information Agency, Office of Research and Analysis, ‘Impact of US and Soviet Space Programs on World Opinion,’ 7 July 1959, US President’s Committee on Information Activities Abroad (Sprague Committee) Records, 1959–1961, Box 6, A83-10, Dwight D. Eisenhower Library, Abilene, Kansas” (Musgrave and Nexon Reference Musgrave and Nexon2018, 610, fn 107).
Source production. In a data overview to the digital transparency appendix (or ATI, discussed in the next section) accompanying this article, Musgrave (Reference Musgrave2018) provides considerable detail regarding source production related to text-based evidence used to study state intentions in the Ming era. Most available primary sources used in the article (e.g., “Veritable Records of the Ming Dynasty,” a set of official records compiled by scholar-officials after the deaths of emperors) are themselves secondary source accounts. Many primary sources had also been lost, both unintentionally (in the midst of dynastic successions) and intentionally (due to specific bureaucratic sabotage in the later Ming era). The authors therefore decided to base their analysis on secondary sources. They eschewed “standard popularizations” and instead consulted a vast body of specialist tracts (e.g., Edward Dreyer’s 2007 tome, Zheng He: China and the Oceans in the Early Ming Dynasty, 1405–1433) that examined archaeological, documentary, and primary records. Their “capacious selection” of secondary sources permitted them to “better survey disputes over interpretations of the voyages’ meaning and impact” (as cited in Musgrave Reference Musgrave2018, data overview). Contextualizing the universe of sources relied upon allows readers to evaluate sources’ evidentiary value and gain insights into excluded texts.
Source selection. How did the authors select sources to evaluate their claim that the Ming naval expeditions to the Indian Ocean (“treasure-fleet voyages”) sometimes displayed force when seeking to purchase loyalty for the emperor from local potentates? To support the claim regarding the use of coercion, the authors introduce a quote, “the Treasure-ships were intended not only to dazzle foreign peoples with their wealth and majesty but to overawe potential opposition with their might and firepower” (Robert Finlay 1991, 12, as quoted in Musgrave and Nexon Reference Musgrave and Nexon2018, 608). The supporting transparency appendix discusses the authors’ decision to privilege this source, stating that a different scholar (Joseph Needham 1972, 489) provides an alternate account of the voyages, depicting them in noncoercive terms (as cited in Musgrave Reference Musgrave2018, annotation 20). Yet Musgrave and Nexon argue that Needham had a “Sinophilic outlook,” which might have led him to provide an overly peaceful interpretation of the voyages and ignore facts presented in Finlay 1991 regarding the voyages’ militaristic nature (Musgrave Reference Musgrave2018, annotation 20).
Source analysis. In the Cold War case, Musgrave and Nexon’s argument hinges on the claim that U.S.–Soviet space competition was a matter of prestige, leading to large American investments in space technologies that had no overt military or economic rationales. As part of the evidence offered to support this contention, they contrast the “romantic language of [President Kennedy’s] public speeches about space” with transcripts of tape recordings of his private conversations with NASA officials, in which he states that “everything we do ought really to be tied to getting on the Moon ahead of the Russians … [W]e ought to be clear, otherwise we shouldn’t be spending this kind of money because I’m not that interested in space” (Musgrave and Nexon Reference Musgrave and Nexon2018, 616–17). The authors analyze this evidence—attaching greater weight to the tape recording of Kennedy’s private conversation—to show that space exploration was motivated by prestige, consistent with the article’s theoretical argument.
Source access. To substantiate their claim that an obsession with scientific prestige quickly permeated the U.S. establishment and motivated subsequent space investments, the authors introduce a declassified CIA document, “A Comparison of US and USSR Capabilities in Space,” written in January 1960 following the launch of Sputnik. An excerpt is presented as a figure in the article and the entire document is made available in a transparency appendix. The evidentiary value of the document is strong; it rates the United States and USSR on several specific dimensions, presenting what the authors argue is a “formal statement setting out how scientific capital was exchanged into prestige” (Musgrave Reference Musgrave2018, annotation 30; Musgrave and Nexon Reference Musgrave and Nexon2018, 610–11).
Example #2: Critiquing Dominant War Narratives with Multi-Media Sources
For an example of transparency from a different epistemic community, consider the interpretivist scholarship of Tidy (Reference Tidy2017). Does our understanding of state-sponsored violence during wartime change when we interpret killings through the lens of ordinary citizens? Tidy’s article sets out to challenge dominant narratives of Western warfare, which are created and written by those in power to advance self-serving goals and which obscure the perspectives of subordinate actors. The author draws on video footage, photographs, written narratives, media accounts, and testimony related to the killing of civilians by American forces in Iraq in 2007 to show that wartime killings can take on a political vocabulary of either “collateral damage” or “collateral murder” depending on whether they are interpreted from the perspectives of those “from above” or “from below.”
Source location. The preponderance of evidence Tidy consulted is from WikiLeaks, which dedicated a website to host videos related to an American Apache helicopter’s air-to-ground attack in New Baghdad during 2007, along with press coverage, still images, transcripts, and other materials surrounding the attack. In an accompanying transparency appendix, the author provides references to WikiLeaks’ online repository, along with specific links to embedded videos, stills, photographic evidence, transcripts, timelines, and additional resources (e.g., U.S. military rules of engagement; news commentary related to the event; and photos and information about civilians killed in the attack). Links to archived copies of defunct WikiLeaks websites are also shared (Tidy Reference Tidy2018, annotation 1).
Source production. Tidy carefully interrogates sources’ production in order to support the article’s argument that broader power structures frame popular representations of contemporary warfare. For example, the author situates videos of the killings taken from the Apache helicopter (the “view from above”) within a set of dominant technologies and actors (e.g., drones and bombers) that have become the hegemonic social representation of modern warfare in the United States (Tidy Reference Tidy2017, 102). This source is contrasted with the “view from below,” captured in photographs taken by a ground-level Reuters photojournalist, Namir Noor-Eldeen, until the very moment he was killed in the attack. Here, the perspective documented by a subaltern actor both physically and symbolically subordinate to the Apache crew is selected to offer a glimpse into the lives of civilians who are at a permanent disadvantage in wartime settings (Tidy Reference Tidy2017, 105).
Source selection. To document the “view from below,” the author selects photographs captured by Noor-Eldeen through his long-lens camera, which is significant because the Apache crew mistook the camera for an enemy missile; the source “becomes the literal and metaphorical visual mode through which the war experience of the commonly elided receivers of military violence are written into the narrative” (Tidy Reference Tidy2017, 105). This source is selected because it creates a Rashomon effect when juxtaposed with the “view from above,” and with a third perspective, “the view from on the ground” (captured by testimony and materials from a ground-level U.S. soldier who witnessed the killings); each presents a different perspective on the same military attack, but Noor-Eldeen’s visuals have a strikingly different encounter with violence.
Source analysis. The author leverages the text and visual elements in these different sources as the basis of the article’s discursive analysis. Consider the video from the Apache helicopter. As the author argues, state-sanctioned depictions of war are typically orchestrated as short clips demonstrating the mastery, precision, and rationality of military combatants; instead, this 39-minute video footage documents the Apache crew tracking targets, “interpreting an often-ambiguous feed of images,” and attacking vehicles that were revealed to have children (Tidy Reference Tidy2017, 104). It is important to understand the weight of this source, which was classified and never intended to be public. The evidence is damning, and Tidy uses it to make a compelling critical analysis with the very tools and weapons of those whom she is criticizing.
Source access. Tidy consulted sources that were readily accessible in the public domain and provides links to them. Even though the sources are public, the article fastidiously documents pertinent information; source materials were qualitatively annotated in order to make underlying evidentiary materials easier to locate, and a transparency appendix was provided to show how specific sections of the article rely on distinct source elements (Tidy Reference Tidy2018).
Example #3: Qualitative Replication with Historical Documents
Our third example focuses on qualitative replication: Kreuzer (Reference Kreuzer2010) revisits underlying historical sources used in a quantitative study of the origins of proportional representation systems to replicate its findings. Kreuzer argues that differences in how sources are selected, interpreted, and translated into numerical datasets substantively alter the conclusions of quantitative analyses.
Source location. Kreuzer illustrates why it is important to be able to locate sources utilized in historical research. When assessing the reliability of the Labor Market Index, a key quantitative measure used in Cusack, Iversen, and Soskice (2007) [“CIS”], Kreuzer finds that the “absence of citations for 31 of the 90 cells (34.4%) in Table 1 makes it extremely difficult to replicate” CIS’s codings (Reference Kreuzer2010, 373). In Appendix B, Kreuzer documents how the sources referenced by CIS provide evidence for only certain constituent parts of the Labor Market Index, leaving other components unsubstantiated. The inability to locate sources fundamentally matters; Kreuzer’s qualitative replication presents alternate sources that disprove CIS’s findings.
Source production. Kreuzer also disputes several of CIS’s coding decisions by calling into question sources’ production value. For example, Kreuzer (Reference Kreuzer2010, 372) argues that CIS use Colin Crouch (1993) as their primary source for the strength of guilds, employer associations, and union centralization, yet fail to explain how cases were coded. In Appendix A, meanwhile, Kreuzer (Reference Kreuzer2010, 385) draws on a number of alternate country-specific historical sources to question these coding choices (as cited in Kreuzer Reference Kreuzer2010, 372). For example, he relies on evidence in Hans Ulrich Jost (1990, 284–86) and Bernhard Ebbinghaus (1995, 73) to recode Switzerland as not having centralized unions because Swiss unions were divided by language and religion (as cited in Kreuzer Reference Kreuzer2010, 385).
Source selection. CIS’s finding that economic determinants led to the adoption of proportional representation systems derives from an analysis of eighteen cases. Kreuzer conducts an in-depth historical examination into each of the eighteen cases, consulting political, labor and economic history sources (Reference Kreuzer2010, 375). The replication from this expanded set of sources casts considerable doubt on the causal mechanisms in CIS’s argument. This exercise underlines how different types of sources can lead to very different conclusions; thus, more information on source selection can enhance transparency practices considerably.
Source analysis. In order to dispute CIS’s coding that France did not have widespread rural cooperatives, the author provides historical evidence regarding France’s rural cooperatives. Kreuzer (Reference Kreuzer2010, 372) directs the reader to Appendix A, where he cites two studies (M.C. Cleary 1989, 40–50; Isabela Mares 2003, 133–35) that confirm that France “experienced a rapid growth of agricultural associations from the 1890s onward … Their growth continued throughout the interwar period.” This evidence supports Kreuzer’s decision to reverse CIS’s coding.
Source access. After examining the qualitative evidence, Kreuzer recodes thirteen out of CIS’s ninety coding choices (Reference Kreuzer2010, 371–72). For example, CIS codes Austria as having a large skill-based export sector; in Appendix A, Kreuzer provides evidence to show that exports constituted only a small percentage of Austria’s output. He provides a source excerpt from Peter Katzenstein (1985, 138) to note that in the late nineteenth century, “Austrian producers by and large eschewed the specialization for exports typical of the other small European countries” and provides evidence from several other sources to support this claim (as cited in Kreuzer 2010, 385). Taken together, Kreuzer makes a strong case for why enhanced transparency practices can advance replication and social scientific enquiry.
Recommendations for Researchers
The types of transparency practices showcased in our three illustrative examples are unfortunately uncommon among the articles we surveyed in our review of political science journals. Instead, most articles provided little information about their sources. We now offer recommendations aimed at improving this state of affairs. We surveyed existing literature to generate recommendations for work using text-based sources that include 1) detailed citations, 2) transparency appendices, and 3) data access. In doing so, we differentiate between practices that should be required, ones we strongly encourage, and those we view as optional. In these recommendations, we address concerns raised in the QTD boards and provide practical solutions for authors.
1. Detailed Citations Should Be Just Like Real Estate: “Location, Location, Location”
Requiring detailed citations specifying the location of cited evidence so that others may easily find referenced works should become a foundational practice in qualitative political science. Our review, which revealed low page number citation rates in top disciplinary journals, brings systematic data to bear in confirming previous studies that decried imprecise citation practices (e.g., Lustick Reference Lustick1996, 6; Trachtenberg Reference Trachtenberg2015, 13–14; Moravcsik Reference Moravcsik2010, 30). Political science journals replaced discursive footnotes with parenthetical references in a shift to mirror quantitatively oriented fields (Lustick Reference Lustick1996, 6; Trachtenberg Reference Trachtenberg2015, 14); furthermore, the APSA Style Manual is vague about when to use page numbers (APSA 2018, 39). The result has been that citations have become mere ornaments as opposed to facilitators of scholarly interchange. We cannot improve transparency practices without addressing this fundamental problem.
1A. Specify page numbers for scholarly sources (require). We propose that page numbers be required for most in-text citations for journal submissions when citing scholarly sources such as articles, books, and book chapters. For any scholarly source used within the text or notes, manuscripts should cite the precise page or page ranges where evidence is being drawn. When referencing the main argument of a work, citing page numbers guides readers toward a summary of the main argument, but citing entire books or articles does not help others easily consult sources. We propose that journals require authors to report the percentage of their citations that have page numbers and set a benchmark that submissions have to meet (e.g., 75%) or otherwise be desk rejected.
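As a rough illustration of how the share of citations with page numbers might be computed, the following Python sketch counts author-year parenthetical citations and checks the result against a benchmark. The regular expressions, the function names citation_page_rate and meets_benchmark, and the sample sentence are our own assumptions for illustration; a real manuscript would need handling for narrative citations, footnotes, and non-numeric locators.

```python
import re

# Assumption: citations appear in parentheses in author-year form,
# e.g. "(Lustick 1996, 6; Trachtenberg 2015, 13-14)" or "(APSA 2018)".
PAREN = re.compile(r"\(([^()]*)\)")
YEAR = re.compile(r"\b(?:1[89]|20)\d{2}[a-z]?\b")
# A year followed by a comma and a number is treated as a page locator.
PAGES = re.compile(r"\b(?:1[89]|20)\d{2}[a-z]?,\s*(?:fn\s*)?\d+")


def citation_page_rate(text):
    """Return (total citations, citations with pages, share with pages)."""
    total = with_pages = 0
    for group in PAREN.findall(text):
        # Several works may share one set of parentheses, separated by ";".
        for piece in group.split(";"):
            if not YEAR.search(piece):
                continue  # parenthetical without a year is not a citation
            total += 1
            if PAGES.search(piece):
                with_pages += 1
    rate = with_pages / total if total else 0.0
    return total, with_pages, rate


def meets_benchmark(rate, benchmark=0.75):
    """Compare the share against a journal-set benchmark (0.75 as in the text's example)."""
    return rate >= benchmark


sample = (
    "States invest in prestige (Musgrave and Nexon 2018, 610). "
    "Others disagree (Lustick 1996; Trachtenberg 2015, 13-14). "
    "See the style manual (APSA 2018)."
)
total, with_pages, rate = citation_page_rate(sample)
print(total, with_pages, rate, meets_benchmark(rate))
```

A count of this kind is only a screening aid: it cannot tell whether a citation to an entire work was appropriate, which remains a judgment for authors and editors.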
1B. Specify location of primary sources—where a source resides and where evidence resides within a source (require when ethically and legally possible). Other text-based sources, such as those identified in table 1, are what historians call “primary sources” or sources that serve as an original foundation of information about a topic under study, often from firsthand witnesses. We urge scholars to specify the location of such primary sources whenever ethically and legally possible. This includes information from all publicly available archives, policy reports, etc. We strongly encourage qualitative researchers to cite like historians, who reference archival material down to the last identifier so that others can “pull the box from the shelf” (Trachtenberg Reference Trachtenberg2015, 14). Detailed location citations would include 1) citing the location where the source resides, and 2) specifying the location within the source itself where evidence is being pulled to support claims. The default practice wherever possible should be to cite the location within the source itself where an evidentiary claim is being pulled to the last available identifier (e.g., with a time stamp for an audiovisual, not just the title and date). If the source originates from the private archive of someone who wants to remain anonymous, the author should say so and provide de-identified information where ethically and legally possible. Online-only sources are likely to have different enumeration, but authors should simply include any available identifying location, such as a chapter number or section heading (see also APSA 2018, 45). We urge editors to specify the practice of location “findability” as “best practice” in their submission protocol and for reviewers of manuscripts using text-based sources to ask authors to make necessary revisions during the revise and resubmit period.
2. Transparency Appendices: New Technologies in a Changing World
As political science journals have adopted scientific notation practices and tight length limits, the practice of providing extended source annotation in footnotes has waned. We argue that manuscripts should provide information about their text-based sources, including how they were selected and produced, and most importantly, how they provide evidence for key claims that support their argument. We discuss two practices that scholars can adopt to improve transparency standards for text-based sources: 1) a text-based source overview and 2) source-based annotation.
2A. Text-Based Source Overview (strongly encourage). Most journal articles based on text-based sources provide only a cursory methods statement within the text itself. Our review revealed that political science manuscripts rarely treat text-based sources as a form of data that requires extended explanation. We recommend that authors provide a methods narrative in an appendix that includes a statement about the types of text-based sources used and relevant information about how they were produced and selected. Here authors can speak in a holistic way about their data generation, because data citations cannot capture the large amount of material that was used to draw inferences and generate conclusions. Moreover, some research traditions rely on evidence gathered over a long period of time, or depend on deep background knowledge and inferential paths that defy step-by-step descriptions (Elman, Kapiszewski, and Lupia Reference Elman, Kapiszewski and Lupia2018, 34). Source overviews allow authors to specify how particular documents were interpreted in light of other documents consulted.
Authors should decide the length of text-based source overviews—from a paragraph to a few pages—and withhold information as necessary to respect any legal and ethical constraints about their sources. For example, an author can specify 1) the overall universe of information relevant to the central research question, 2) how she decided what to consult, record, and what quotes to select for analysis, 3) what relevant data (e.g., boxes in archives or secondary sources) were not consulted or were omitted and the potential impact on the analysis, and 4) how key sources were produced and came into the author’s possession (Elman et al. Reference Elman, Kapiszewski, Moravcsik and Karcher2017, 7; Elman, Kapiszewski, and Lupia Reference Elman, Kapiszewski and Lupia2018, 33).Footnote 19 Methods appendices are common for top university press books; we argue that such a standard should also be adopted for journal articles that publish research based on text-based sources. Importantly, this type of appendix would not count against an article’s word limit.
All research, and particularly qualitative research, relies on interpretation—different people may come to different conclusions when looking at the same source. Replication is not necessarily feasible, or even desirable, for many types of qualitative research (Jacobs et al. Reference Jacobs, Buthe, Arjona, Arriola, Bellin, Bennett and Bjorkman2021, 171–85). Yet providing more information about text-based sources is not really about replication, but rather helping others understand the evidentiary basis of an argument and facilitating scholarly debate. The form and content of such an overview would reflect the diversity of the many epistemological communities within our discipline.
2B. Source-Based Annotation: From meaty footnotes to Annotation for Transparent Inquiry (ATI) (encourage). Individual annotations linked to referenced evidence are the hallmark of excellent qualitative scholarship. This practice is ubiquitous in history, legal scholarship, and social sciences scholarship published in historically oriented journals like Studies in American Political Development (SAPD). Discursive annotations allow authors to explain why the cited reference undergirds key claims. Annotations can also provide a space to explain source selection, production, and relevant bias within the source.
We recommend that source-based annotation in political science be strengthened by adopting two practices. First, annotations in the form of footnotes or endnotes should not count against the journal word limit, as this practice discourages transparency and scholarly exchange, particularly for research that relies on text-based sources. Journals should adopt the practice of “transparency footnoting,” where word limits exclude footnotes or endnotes (e.g., see Politics & Society). Alternatively, journals with longer word limits, such as International Security (20,000 words) or SAPD (no official limit), provide authors with more space to augment transparency practices in footnotes and endnotes. We agree with Gerring and Cojocaru (Reference Gerring, Cojocaru, Elman, Gerring and Mahoney2020, 98) that length limits are arbitrary and counterproductive.
Second, we encourage the use of ATI, a digital annotation infrastructure hosted by the Qualitative Data Repository. The ATI is automatically available for all Cambridge University Press journals, but is compatible with all journals. Popular press articles, newspapers, and social media now use hyperlinks to draw readers to digital information that provides supporting evidence for claims, illustrates examples, or facilitates further reading. As scholars we should not hesitate to do the same. Based on Moravcsik’s pioneering active citation work (Reference Moravcsik2014, Reference Moravcsik2010), an ATI is based on “open annotation” for generating and sharing digital annotation across the web that allows any digitally published manuscript to be annotated (Kapiszewski and Karcher Reference Kapiszewski and Karcher2021, 473–74). ATIs allow authors to share excerpts from sources and provide more detailed information relating to how specific passages within a journal article support key claims.
The ATI design provides a good blueprint for how we might think about annotating sources regardless of whether we use ATI, a different future technology, or simply a meaty footnote. When deciding what to annotate, the ATI architects suggest considering 1) the centrality of the evidence-based claims, 2) the importance of the data source, 3) whether a claim is contested or controversial, and 4) whether a source is contested or controversial (Elman et al. Reference Elman, Kapiszewski, Moravcsik and Karcher2017, 11–12). ATI architects recommend providing a full citation, an analytic note discussing how the data was generated and analyzed to support the claim, a source excerpt (typically 100–150 words), and the data source itself if it can be shared legally and ethically (Elman et al. Reference Elman, Kapiszewski, Moravcsik and Karcher2017, 3). ATIs are flexible: authors can choose to provide annotations for as little or as much of their article as they like. They can provide excerpts without annotations, annotations without excerpts, and links to full data sources as appropriate. ATI architects urge annotations for only a subset of passages, and explicitly state that it “is unnecessary, potentially counter-productive, and almost certainly time-consuming” to annotate all passages that involve descriptive or causal inference (Elman et al. Reference Elman, Kapiszewski, Moravcsik and Karcher2017, 11).
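For authors who draft annotations alongside the manuscript, the four recommended components can be kept in a simple structured record. The Python sketch below is a hypothetical drafting aid, not part of the ATI software: the Annotation class, its field names, and the review_notes helper are our own assumptions. Because ATIs are flexible, the checks are advisory and only flag an excerpt that falls outside the typical 100–150 word range or an annotation that carries no content at all.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Annotation:
    """One annotation, mirroring the components the ATI architects recommend."""
    full_citation: str
    analytic_note: str = ""            # how the data was generated and analyzed
    source_excerpt: str = ""           # typically 100-150 words
    source_link: Optional[str] = None  # only if shareable legally and ethically

    def excerpt_word_count(self) -> int:
        return len(self.source_excerpt.split())

    def review_notes(self, min_words: int = 100, max_words: int = 150) -> List[str]:
        """Return advisory reminders; an empty list means nothing was flagged."""
        notes = []
        if not self.full_citation.strip():
            notes.append("missing full citation")
        has_excerpt = bool(self.source_excerpt.strip())
        if not (self.analytic_note.strip() or has_excerpt or self.source_link):
            notes.append("no analytic note, excerpt, or source link")
        if has_excerpt:
            n = self.excerpt_word_count()
            if n < min_words or n > max_words:
                notes.append(f"excerpt has {n} words; typical range is {min_words}-{max_words}")
        return notes


# Illustrative use with placeholder text standing in for a real excerpt.
draft = Annotation(
    full_citation="Author, 'Title,' date, collection, box, archive",
    analytic_note="Explains why this document supports the claim.",
    source_excerpt=" ".join(["word"] * 120),
)
print(draft.review_notes())
```

Keeping such records while drafting fits the observation that integrating annotation into the writing process reduces the time it takes.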
Although ATIs are relatively new, users have identified numerous benefits. Annotations allow authors to engage directly with inevitable contradictions that emerge in data by providing more space to adjudicate between conflicting pieces of evidence (Mayka 2021, 480; Myrick 2021, 493; Siewert 2021, 489). In addition, ATIs can serve multiple qualitative research communities. For Qualitative Comparative Analysis, ATIs can help corroborate coding decisions by connecting original sources to their final assessment and make interpretations of evidence more transparent (Siewert 2021, 488–89). For multi-method qualitative and quantitative research, annotations can supplement technical material to make the paper’s quantitative analysis more accessible (Myrick 2021, 493). For process tracing studies, ATIs allow researchers to showcase the logic of inductive or deductive analysis of causal process observations or applications of process tracing tests (Mayka 2021, 480). Authors can use ATIs to share participants’ own words to better illustrate the lived experience of their interlocutors and how they engage in meaning-making (Mayka 2021, 480). ATIs even allow authors to share digital content such as maps, posters, songs, or videos.
ATI users have also noted opportunities for improving ATI uptake, including considerations of when and what to annotate and how to reduce the primary cost of annotating: time. The time it takes to compile transparency appendices was a key concern on the QTD boards. In our experience (Herrera 2015), creating an ATI does take time, but integrating ATIs into the article drafting process can greatly reduce the time required for annotations (Herrera 2017; Mayka 2021, 481). Also, more proactive messaging from journals about how to integrate ATI into the peer review process (a QTD concern) is needed (Myrick 2021, 495).
QTD posters sometimes worried that qualitative transparency appendices would need to be very lengthy and recreate the entire research process. ATI users suggest just the opposite, emphasizing the utility of concise and tailored annotations. Bombarding readers with too many annotations is counterproductive and burdensome for both readers and authors (Myrick 2021, 494; Siewert 2021, 490). Authors should fit annotations tightly to the top priorities of the manuscript. Mayka suggests potentially self-imposing a maximum number of annotations to safeguard against wading “aimlessly in [a] sea of qualitative data” (2021, 481). Siewert repeatedly interrogated specific information to determine whether it was “essential” before annotating (2021, 490).
ATI users will have to decide what to annotate in the digital ATI format and what to leave for a footnote in the journal text; the two should complement each other rather than overlap. ATIs are most useful for annotating key claims and sources, as they allow an author to discuss a source’s location, production, and selection process in detail and outside the scope of an article’s word count. ATIs should be viewed as a new approach that allows researchers to maintain control over their research agenda, epistemological commitments, and the ethical and legal considerations specific to each individual research project.
3. Data Access: Excerpts versus Archiving
Data access—and the thorny question of data sharing—was the most contentious component of the QTD debate but, in our estimation, also the most misunderstood. Proponents of qualitative data sharing have consistently maintained that data sharing should be optional and adhere to legal, ethical, and logistical constraints (e.g., Kapiszewski and Karcher 2021, 473, 477; Elman et al. 2017, 1, 2, 4; Elman, Kapiszewski, and Lupia 2018, 42). We distinguish between two options that were often conflated in the QTD boards: 1) sharing excerpts of text-based sources, which we encourage, and 2) sharing an entire text-based source, which we view as optional.
Our section on ATIs discussed sharing data excerpts; here we clarify how it compares to sharing entire text-based sources. Proponents of qualitative data sharing frequently refer to the sharing of excerpts (typically 50–150 words), which can be accomplished with ATI technology, meaty footnotes, or a supplementary appendix. Sharing excerpts is little more than sharing an extended quotation of a source. QTD participants expressed concerns about navigating copyright law. Written excerpts of fewer than 150 words are subject to fair use copyright law, which allows for the reproduction of short excerpts for scholarly purposes (Karcher, Kirilova, and Weber 2016, 295; APSA 2018, 7).Footnote 20 If copyright rules of non-text-based sources such as photographs or maps are unattainable or face restrictions outside of U.S. law, authors can simply decide to not post the source.
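As a practical aid, the excerpt-length guidance above can be checked mechanically during drafting. The following is a minimal sketch in Python, not part of any ATI tool: the `Excerpt` class, its field names, and the whitespace-based word count are illustrative assumptions, and the 150-word threshold is simply the figure discussed in the text.

```python
from dataclasses import dataclass

# Threshold taken from the discussion above ("fewer than 150 words").
MAX_EXCERPT_WORDS = 150


@dataclass
class Excerpt:
    """A quoted passage from a text-based source (hypothetical structure)."""

    source_label: str  # assumed free-text label, e.g. an archive box or a URL
    text: str

    def word_count(self) -> int:
        # Assumption: words are whitespace-delimited tokens.
        return len(self.text.split())

    def within_limit(self, limit: int = MAX_EXCERPT_WORDS) -> bool:
        # "Fewer than" is read as a strict inequality.
        return self.word_count() < limit


def flag_long_excerpts(excerpts: list[Excerpt]) -> list[str]:
    """Return the labels of excerpts that meet or exceed the word limit."""
    return [e.source_label for e in excerpts if not e.within_limit()]
```

An author could run `flag_long_excerpts` over a draft's annotations before submission to see which excerpts might warrant trimming or a closer look at the applicable copyright rules.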
QTD boards also broached the subject of privacy concerns when sharing text-based sources. Sharing excerpts of a text-based source will not be controversial for the majority of pre-existing sources (e.g., constitutions or organizational charters). Scholars working with sensitive documents or privacy concerns can choose to share de-identified information or simply not share. While de-identified data and analytic utility may at times be at odds (e.g., Kapiszewski and Karcher 2020, 209, n10), researchers always retain authority over what excerpts to share and how to present them to others as de-identified sources.
Alternatively, researchers may choose to archive entire research projects, although this option will likely only appeal to those interested in facilitating future data use by others. While qualitative data archiving is relatively new in political science, there are examples of existing archived qualitative projects in other fields (Mannheimer et al. 2019, 645–46). Several repositories exist for archiving social science data that offer data curation and preservation, but the Qualitative Data Repository at Syracuse University is an ideal choice because it was specifically designed to accommodate the heterogeneity of qualitative data source types and is curated by political scientists (Kapiszewski and Karcher 2020, 205–12; Mannheimer et al. 2019, 647).
Data repositories can help guide authors on how to legally and ethically share sources and even help researchers write data management plans. Institutional repositories can counsel researchers on how to de-identify data to preserve anonymity, work with university Institutional Review Board procedures and standards for informed consent, create differential or restricted access (so sensitive material can be shared only by “request”), and preserve the right of “first use”Footnote 21 (Kapiszewski and Karcher 2020, 205; Mannheimer et al. 2019, 649–53). If a particular de-identified document would be rendered analytically useless or would insufficiently address legal and ethical concerns, it can simply not be shared. As for the QTD board concern regarding archival policies, authors who are bound by archival rules that do not permit sharing would simply not share. If scholars have an organized data management plan, time spent depositing files will likely be manageable (Saunders 2014, 695; Kapiszewski and Karcher 2020, 201–2). Sharing data through a dedicated repository as opposed to author websites makes the data more accessible over time, and repositories are also better equipped than journals to archive large quantities of documentsFootnote 22 (Kapiszewski and Karcher 2020, 205–6; Mannheimer et al. 2019, 647).
While data archiving may sometimes incur fees and does take time, the benefits are multiple. Data archiving can boost authors’ visibility when their data is reused and cited (Kapiszewski and Karcher 2020, 217–18), serve pedagogical purposes (Kapiszewski and Karcher 2020, 198), promote organized workflow, facilitate analysis and writing (Mannheimer et al. 2019, 650), and allow others to more easily engage with one’s scholarship. Archiving is a promising, optional choice for some research projects.
Conclusion
Qualitative researchers frequently engage in careful, painstaking work with difficult-to-access texts in demanding research settings. They do so in different languages and in politically volatile contexts. Yet publishing trends have shrunk the space available for scholars to explain which sources were consulted, why they were selected, and how they support key claims. As a result, citations have become symbolic placeholders rather than facilitators of scholarly exchange, and scholars are confused about how and whether to modify existing practices.
We contribute a practical element to this complex debate by focusing on how transparency can be augmented in research that relies on text-based sources.Footnote 23 Expanding Moravcsik’s original formulation (2014, 48–49), we specify five dimensions of transparency for text-based sources (location, production, selection, analysis, and access). We provide a first-of-its-kind assessment of existing practices, reviewing 1,120 articles published between 2008 and 2018, and find that fewer than 15% of articles provided information about source production, selection, analysis, and access. Most articles fail to provide information about where sources are located. Page number use is extremely low at 22% for APSR, AJPS, World Politics, and Perspectives on Politics, and of these citations, half are for quoted text. Our review provides evidence that there is considerable room for improvement.
We use the QTD deliberations, a multiyear forum for online discussion about transparency practices for qualitative research, to identify researchers’ key concerns—such as navigating copyright law, archival rules, privacy concerns, right of first use, time burdens, and hiring and promotion impacts. We spell out recommendations for augmenting transparency in the use of text-based sources organized around 1) detailed citations, 2) transparency appendices, and 3) data access, using illustrative examples and established literature to showcase how others have navigated challenges. Our recommendations range from those we see as most urgent, such as mandating the use of page numbers, to those we view as optional, such as data archiving.
We also outline the benefits of new technologies, such as ATIs, that allow scholars from diverse epistemological traditions to annotate and share their text-based sources via excerpts or full sources where legally and ethically feasible. ATIs will not solve challenges inherent to qualitative research, such as divergent evidence, noncomparable data types, “the absence of evidence as evidence,” and the entanglement between data collection and analysis that often makes the research process difficult. However, ATIs provide more space to wrestle with these issues, even if they are not a panacea.
Transparency technologies for qualitative research are unlikely to advance if departments do not count ATIs or data archiving towards hiring, tenure, and promotion. Data files could count as half or one-third of a peer-reviewed article, and should be showcased on CVs, findable and citable. Letter writers can highlight their value for a hiring or promotion file. We see valuing ATIs or data archives in the same light as counting a research note in a promotion file—not as a means to punish scholars who have not adopted such measures but rather to provide professional recognition for those who do.
Incorporating more training about transparency technologies into qualitative research in graduate school would help scholars mitigate potential costs. Graduate qualitative methods courses can instill transparency norms in the research apparatus of early-stage scholars. Students can gain experience with the five dimensions of transparency by studying the practices highlighted in our illustrative examples or by incorporating similar practices in their own research. Paying attention to transparency in the early stages of research projects (e.g., gathering source excerpts during data collection or drafting with potential annotations in mind) reduces time burdens. Learning transparency technologies as an early-stage qualitative researcher is also likely to produce other benefits, such as enhanced training in working with documentary evidence and archives, and increased proficiency in multimedia data organization and project management.
The practical guidelines we propose have implications for other types of research. Although QTD participants disagreed fervently about the benefits and costs of data sharing, our paper dispels the myth that data sharing (in particular, uploading entire documents) is a required and necessary part of research openness. Instead, greater transparency can be achieved through the use of excerpts or through other forms of transparency such as providing information about sources’ location, selection, production, and analysis. Other types of qualitative research, such as those based on elite interviews, process tracing, or participant observation, could develop a similar approach that deemphasizes the need for data sharing and clarifies a path forward for enhancing transparency within particular epistemological traditions. There are implications for quantitative studies as well. Quantitative scholars frequently choose among data sets that vary in quality; applying the principles of transparency in “source selection” and “source production,” scholars should provide information on why they selected particular datasets and discuss potential biases associated with them.
Enhanced transparency measures such as those we outline here are beneficial and frequently possible, and can be adopted in ways that retain authorial authority and epistemological commitments. New transparency technologies are promising because they allow qualitative researchers to more easily provide more context, present complexity, and unpack relevant contradictions about politics.
Supplementary Materials
To view supplementary material for this article, please visit http://doi.org/10.1017/S153759272200069X.
Appendix A: Coding Procedures for Review of Existing Transparency Practices
Appendix B: Full Citations for the Qualitative Transparency Deliberations Board Posts
Appendix C: Additional Results from Review of Published Articles
Appendix D: Additional Illustrative Examples of Transparency in Practice
Appendix E: Copyright and Fair Use Resources
Appendix F: Reading Lists on Transparency Issues for Qualitative Researchers
Acknowledgments
The authors thank Josh Hooper, Willow Wei, Aaron Christensen, Aura Gonzalez, and Jenny Xiao for excellent research assistance. They are grateful to Rob Mickey for comments and collaboration, and for his contributions to the ideas in the section on interpretivism. This paper draws on discussions with Alan Jacobs, Tim Büthe, Kimberly Morgan, and participants from the Qualitative Transparency Deliberations forum hosted by the Qualitative and Multi-Method Section of the American Political Science Association.