Although the importance of popular ignorance of politics is disputed, few scholars dispute that such ignorance is widespread (e.g., Converse 2000; Kinder 1998, 784–89; Luskin 2002). But some do. For example, Boudreau and Lupia (2011) argue that conventional measures do not capture relevant knowledge. And both Robison (2015) and Gibson and Caldeira (2009) note that many measures of knowledge rely on open-ended questions. They find that people appear to know more when they answer closed-ended questions, and they argue that open-ended questions far understate what the public knows about politics.
We build on this body of research, showing that little-noticed features of question design can make a large difference to estimates of political knowledge. We find that ordinary closed-ended questions sometimes suggest as little knowledge as the corresponding open-ended questions. That is, contrary to previous arguments, closed-ended questions have no inherent property that leads them to suggest higher knowledge levels than those suggested by open-ended questions. Instead, estimated knowledge levels depend much more on two other, less-noticed features of question design: the number and difficulty of response options.
Theories of survey response offer reasons to expect both features of question design to affect estimated knowledge levels, but before we can explore those reasons, conceptual preliminaries are in order. We take knowledge to be confident belief in true statements (Bullock and Lenz 2019). Knowledge may be measured by questions, and from an individual's perspective, questions are difficult to the extent that one is unsure of the correct answers. The aggregate implication is that questions are difficult if few respondents answer them correctly. Knowledge is thus a more fundamental concept than difficulty: difficulty is a property of both questions and people, and it is linked to knowledge through questions.
We follow Nadeau and Niemi (1995, 325) and Bullock and Lenz (2019, 328–29) in holding that the process by which people construct answers to knowledge questions is, in many cases, not fundamentally different from the consideration-sampling process by which they construct answers to attitude questions (Tourangeau, Rips and Rasinski 2000; Zaller and Feldman 1992). Any given person may thus find a question difficult for multiple reasons. In this article, we attend to two reasons – two ways in which changing response options may make questions harder to answer. The first reason is purely mechanical. If people are guessing blindly, adding more response options makes a question more difficult. And even if people are making educated guesses, adding response options will make a question more difficult, provided that the new options are more plausible than the correct option. Again, the process is mechanical: we need not invoke satisficing or other cognitive phenomena to reach this conclusion.
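To make the mechanical point concrete: if a respondent who knows nothing about the question guesses uniformly at random among $k$ substantive response options, the probability of a correct answer is simply

$$\Pr(\text{correct} \mid \text{blind guess}) = \frac{1}{k},$$

so expanding a question from three options to five lowers the expected correct-response rate from about 33 to 20 per cent before any knowledge enters the picture. If the respondent instead selects whichever option seems most plausible, an added incorrect option changes the outcome only when it seems more plausible than the correct one – the condition stated above.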
The second reason that changing the response options may make questions harder to answer is cognitive. Adding response options to a question, or making them more plausible, may increase the effort required to comprehend the question (including its response options). To fully comprehend it, one must canvass one's memory for related considerations – and if the response options are more numerous or more plausible, the relevant considerations will be more numerous. They may also be more complex. Of course, one may not be willing to invest the effort required to comprehend the question and the considerations that it calls to mind. In this case, one may satisfice, settling on a response that is ‘good enough’ rather than maximizing the probability of a correct response (Krosnick 1991).
There is already a vast body of research on the design of test items. Even the body of research on the ‘best’ number of response options to use in closed-ended questions is large (e.g., Rodriguez 2005). This literature is informative – but only tangentially relevant, as recommendations about the design of questions are beyond the scope of this letter. Our point is not to judge the merits of different kinds of items or to promulgate recommendations about question design. Our point is more fundamental: it is to call attention to the major but little-recognized effects of response options on estimates of the public's level of knowledge. And in turn, it is to call attention to the ambiguity of claims about ‘levels of knowledge’ in the population, whether the knowledge in question is general or of particular facts. Some of these ideas have been raised before in passing (e.g., Krosnick, Visser and Harder 2010, 1291), but they deserve a sustained treatment and a wider airing.Footnote 1
We begin by reviewing the distinction between closed- and open-ended knowledge questions and by reviewing all prior data on a prominent question about the name of the chief justice of the United States. Noting the great variation in answers to this question across surveys, we proceed by describing a new study that permits us to identify the effects of asking closed-ended knowledge questions in different ways, and to compare different kinds of closed-ended knowledge questions to their open-ended counterparts.
Knowledge of the Court in Open- and Closed-Ended Questions
To fix ideas, we focus on knowledge of the US Supreme Court. What Americans seem to know about it depends a lot on the questions that we ask and the ways in which we ask them. For example, the chief justice may be the most prominent member of the most prominent court, but few can recall the exact job of ‘William Rehnquist’ or ‘John Roberts’ when asked in an open-ended question. Answers to this question in the American National Election Studies are illustrative: only 8 per cent of respondents in the 2000 ANES and 13 per cent in the 2012 ANES gave an answer that included ‘chief justice’ and either ‘Supreme Court’ or ‘United States’. But this is both a tough question and a tough standard for what counts as correct.Footnote 2 By our reading, an additional 25 per cent of respondents in 2000 and 18 per cent of respondents in 2012 gave answers that were nearly correct.Footnote 3
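As an illustration of how such a strict coding rule operates – a sketch of our own, not the ANES coders' actual procedure – the standard can be written as a simple string check:

```python
def strictly_correct(answer: str) -> bool:
    """Illustrative coding rule: an open-ended answer counts as exactly correct
    only if it mentions 'chief justice' and either 'Supreme Court' or
    'United States'. This mimics the strict standard described in the text."""
    text = answer.lower()
    return ("chief justice" in text
            and ("supreme court" in text or "united states" in text))

print(strictly_correct("Chief Justice of the United States"))           # True
print(strictly_correct("He is the Chief Justice of the Supreme Court")) # True
print(strictly_correct("A judge on the Supreme Court"))                 # False: nearly correct at best
```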
Perhaps it is unsurprising, then, that many more people can choose the correct name when asked to identify the chief justice in a closed-ended question. In one 2003 survey, for example, 63 per cent of respondents correctly identified Rehnquist as the current chief justice in response to a closed-ended question (Prior 2005); eight years later, 69 per cent of people answering a closed-ended ANES question correctly identified Roberts as the chief justice. Results like these led some to suspect that differences in correct-response rates to open- and closed-ended questions stem from fundamental differences between the two types of questions (e.g., Gibson and Caldeira 2009, 431–35). In particular, open-ended questions test ability to recall correct answers, while closed-ended questions instead test ability to recognize the correct answers in a limited set of options (Tedin and Murray 1979). And because open-ended questions lack response options, they offer no frame of reference to anchor respondents’ thinking. Coupled with research that suggests the superiority of closed-ended knowledge measures (e.g., Mondak 2001), results like these have led scholars to infer that the public knows more about the Supreme Court – and perhaps much else – than many have suggested (Gibson and Caldeira 2009, 432–33).
To further explore the extent to which conventional open-ended questions indicate lower levels of political knowledge than their closed-ended counterparts, we collected the universe of national-sample studies, conducted in or before 2017, that included closed-ended questions about the name of the chief justice.Footnote 4 Table 1 shows, for each study, the percentage of respondents who identified Rehnquist or Roberts as the chief justice. In every case, the percentage correct is larger than the ‘exactly correct’ response rate from the corresponding open-ended ANES question. To better understand the sources of these differences, a close look at Table 1 is in order.Footnote 5
Note: EGSS = Evaluations of Government and Society Study. The table has two tiers: in the first tier (top two rows), the correct answer is ‘William Rehnquist’; in the second tier, it is ‘John Roberts’. Within each tier, studies are sorted by the percentage of respondents who answered correctly.
The table reports data from closed-ended questions, and it shows that the variation in correct-response rates is great – even though these questions have all been designed to measure knowledge of the same fact. For example, across the six surveys conducted during John Roberts' term, the correct-response rates range from 28 per cent to 69 per cent. This 41-point difference is comparable to the 39-point difference that Gibson and Caldeira (2009, 435) find between correct-response rates for open- and closed-ended questions about John Roberts.Footnote 6
What explains this wide variation among questions that all use a closed-ended format? Increased public knowledge of the relatively new chief probably plays some role. In the two Pew surveys, more respondents correctly identified Roberts as the chief in 2012 (34 per cent) than in 2010 (28 per cent). But this within-survey-firm difference is dwarfed by the within-year difference between the 2012 Pew survey (34 per cent) and the 2012 ANES study (69 per cent).
Survey mode may also play a role. But the size of its role is hard to determine, partly because it is confounded with the year in which the interview was conducted. Moreover, the within-mode differences are large. In particular, the studies that generated the lowest and highest percentages of correct responses were both conducted by telephone.
Instead, Table 1 suggests that variation in correct-response rates may have more to do with the number and difficulty of the response options that respondents receive. In some sense, this is not news: much has been written about the role of response options in survey questions, and no scholar will deny that ‘response options matter’. But for all that has been written about levels of political knowledge, the extent to which measurement of those levels depends on the choice of response options has rarely been mentioned. Even the most prominent works in this field typically do not broach the topic (e.g., Delli Carpini and Keeter 1996; Luskin 1987). Instead, previous studies have focused on inherent differences between closed- and open-ended questions: for example, their implications for guessing and the way that open-ended questions test recall while closed-ended questions test recognition (e.g., Tedin and Murray 1979).
While Table 1 is suggestive, the questions it describes differ in many ways – not just with respect to response options. For example, the questions were asked in different years, and they appeared at different positions within different kinds of surveys. To isolate the effects of the number and difficulty of response options, we conducted a new experiment in which we held question wording constant while randomly assigning some people to see an open-ended question and others to see closed-ended versions of the same question. No previous published studies of political knowledge have isolated the effects of response options in this way.Footnote 7
A New Experiment: Exploring the Role of Response Options
We fielded an experiment from 14–30 March 2017. Our subjects were a national sample of 2,080 US adults. Following our pre-analysis plan, we restrict our analyses to the 1,961 subjects for whom we have complete demographic information. The Appendix shows that relaxing this restriction makes no substantive difference. Subjects were Survey Sampling International panelists, and they were asked six questions: about the number of justices who serve on the court, the length of their terms, the number of women who serve on the court, and the names of the chief justice and Senate majority leader.Footnote 8
For each question, subjects were assigned to one of five conditions: ‘short easy’, ‘short difficult’, ‘long easy’, ‘long difficult’, and open-ended. In the ‘long’ conditions, subjects saw five response options: the correct option and four others. In the ‘short’ conditions, they saw the correct option and two others that were randomly selected from the full set of four incorrect options. In both the long and short conditions, subjects also saw a ‘don't know’ response option (Cor and Sood 2016, esp. 229n6; Jessee 2017; Luskin and Bullock 2011; Sturgis, Allum and Smith 2008). Assignment varied across questions; for example, a subject might be in the short-easy condition for one question, but in the long-difficult condition for another.
We classified response options as easy or difficult by considering the extent to which they were likely to seem correct. For example, our difficult-and-incorrect response options for the chief justice question were William Rehnquist, Earl Warren, Clarence Thomas, and Antonin Scalia. Two of these men served as chief justice before John Roberts; the two others served on the court with Roberts. By contrast, our easy-and-incorrect response options were Theodore Olson, Mark Rockefeller, Homer Stille Cummings, and J. Harvie Wilkinson, III. None of these men have ever been prominent figures in public life.
All of these response options have been used in other research. And for all other questions, we used response options that might plausibly be used in other surveys.
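As a rough sketch of the randomization logic for a single item – using the chief-justice option pools just described, with a hypothetical uniform assignment to conditions rather than our actual survey instrument – the procedure looks like this:

```python
import random

# Option pools for the chief-justice item, as described above. The correct
# option and a 'don't know' option are always shown in closed-ended conditions.
CORRECT = "John Roberts"
EASY_INCORRECT = ["Theodore Olson", "Mark Rockefeller",
                  "Homer Stille Cummings", "J. Harvie Wilkinson, III"]
DIFFICULT_INCORRECT = ["William Rehnquist", "Earl Warren",
                       "Clarence Thomas", "Antonin Scalia"]

CONDITIONS = ["short easy", "short difficult",
              "long easy", "long difficult", "open-ended"]

def build_options(condition: str) -> list[str]:
    """Return the response options shown under a given experimental condition."""
    if condition == "open-ended":
        return []  # no options: respondents type an answer
    length, difficulty = condition.split()
    pool = EASY_INCORRECT if difficulty == "easy" else DIFFICULT_INCORRECT
    n_incorrect = 2 if length == "short" else 4  # 'short' = 3 options, 'long' = 5
    options = [CORRECT] + random.sample(pool, n_incorrect)
    random.shuffle(options)  # option order randomized here for illustration only
    return options + ["Don't know"]

condition = random.choice(CONDITIONS)  # hypothetical uniform assignment
print(condition, build_options(condition))
```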
We asked six additional questions for which we manipulated only the number of response options. These questions were about judicial review, the limits of Supreme Court power, how justices are chosen and removed from office, the consequences of tie votes, and public access to oral arguments. For each question, some subjects were assigned to see five response options, and others were assigned to see three options. The incorrect responses in the three-option condition were randomly selected from those in the five-option condition.
We use these data to examine three hypotheses. Two are simple: we anticipate that correct-response rates decline as the number and difficulty of response options increase. But our main hypothesis is more complex, as it involves comparing two differences:
(a) the difference between correct-response rates to questions that have a short-easy set of response options and questions that have a long-difficult set of response options, and
(b) the difference between correct-response rates to questions that have a long-difficult set of response options and questions that are open-ended.
We expected that difference (a) would be greater than difference (b). That is, we expected that correct-response rates would depend more on the contrast between different types of closed-ended questions than on inherent differences between open- and closed-ended questions. This hypothesis is contrary to much thinking about open- and closed-ended questions. It is especially contrary to the claim of Gibson and Caldeira (2009, 435), who attribute the large differences in correct-response rates that they observe to the inherent difference between closed- and open-ended questions, and who ‘strongly doubt’ that variation in response options alone can account for such differences. Our hypotheses were preregistered, and our pre-analysis plan can be found at https://doi.org/10.7910/DVN/HU04WI.
Results
Figure 1 displays the correct-response rates from each experimental condition, on average (black lines) and for each question (grey lines). And Table 2 reports estimates from regressions of answers, coded correct (1) or incorrect (0), on indicator variables that represent different experimental conditions.
Note: cell entries are OLS estimates and standard errors. The outcomes are answers to the questions: correct = 1, incorrect = 0. The listed predictors are also coded 0 or 1. All regressions include fixed effects for each question. The baseline condition for Column 1 is ‘three response options’; for Column 2, ‘easy response options’; and for Columns 3 and 4, ‘open-ended questions’. Standard errors are clustered at the respondent level.
Panels 1 and 2 of the figure show that correct-response rates decline with both the number and the difficulty of response options. And in Table 2, the first regression shows that offering five response options, rather than three, reduces the correct-response rate by 12 percentage points. The second regression shows that offering difficult rather than easy response options has an even larger effect: it reduces correct-response rates by 24 percentage points.Footnote 9
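For readers who want to see the shape of these specifications, here is a minimal sketch of the first regression in Python, with hypothetical variable and file names; our actual estimation code is in the replication archive cited below.

```python
import pandas as pd
import statsmodels.formula.api as smf

# Hypothetical long-format data: one row per respondent-question pair, with
# columns 'correct' (1/0), 'five_options' (1 if the subject saw five response
# options, 0 if three), 'question' (item identifier), and 'respondent' (id).
df = pd.read_csv("knowledge_experiment.csv")  # assumed file name

# Linear probability model with question fixed effects; standard errors are
# clustered at the respondent level, as in Table 2.
model = smf.ols("correct ~ five_options + C(question)", data=df)
fit = model.fit(cov_type="cluster", cov_kwds={"groups": df["respondent"]})

# The coefficient on 'five_options' corresponds to the roughly 12-point drop
# reported in the first column of Table 2.
print(fit.params["five_options"], fit.bse["five_options"])
```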
To generalize, one may imagine two batteries that include the ‘same’ questions but different response options. All of the response options might plausibly be used in actual surveys – but how much people seem to know may depend hugely on which battery one chooses. As best we can tell, this point has not been demonstrated before in studies of political knowledge. (See Ahler and Goggin 2017 for the most closely related work that we know of.)
The third regression in Table 2 speaks to our main concern: the difference between different kinds of closed-ended questions, relative to the difference between open-ended questions and difficult closed-ended questions. With this regression, we estimate a model of the form:

$$\text{correct}_{ij} = \beta_1(\text{three easy})_{ij} + \beta_2(\text{five difficult})_{ij} + \gamma_j + \varepsilon_{ij},$$

where (three easy)_{ij} and (five difficult)_{ij} indicate the condition to which subject i was assigned for question j, and γ_j is a fixed effect for question j. The reference category in this regression is the open-ended condition. Per our pre-analysis plan, data from the ‘three difficult’ and ‘five easy’ conditions were excluded from this analysis. Appendix Table A13 shows that including these conditions makes no material difference to the results.
The results show that, relative to the open-ended condition, assignment to the ‘five difficult’ condition increases correct-response rates by 6 percentage points. But relative to the same open-ended condition, assignment to the ‘three easy’ condition increases correct-response rates by 35 points. The difference between correct-response rates in the ‘three easy’ and ‘five difficult’ conditions, 35 − 6 = 29 points, far exceeds the 6-point difference between correct-response rates in the ‘five difficult’ and open-ended conditions. The difference of differences is 29 − 6 = 23 points (95 per cent CI: [17 per cent, 26 per cent]). Given the evidence presented here, estimates of knowledge levels depend far more on differences between different kinds of closed-ended questions than on inherent differences between closed- and open-ended questions.
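Expressed in terms of the coefficients of the model above, and using the rounded point estimates reported in the text, the comparison is:

$$(a) = \hat{\beta}_1 - \hat{\beta}_2 = 35 - 6 = 29, \qquad (b) = \hat{\beta}_2 - 0 = 6,$$

$$(a) - (b) = \hat{\beta}_1 - 2\hat{\beta}_2 = 35 - 2(6) = 23 \text{ percentage points}.$$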
But perhaps it seems that we have stacked the deck in favor of such a finding. After all, the comparison of differences that we have just described involves the difference between open-ended questions and the most difficult closed-ended questions. One might find another difference of differences more revealing – a comparison of:
(a) the difference between correct-response rates to questions that have a short-easy set of response options and questions that have a long-difficult set of response options, to
(c) the difference between correct-response rates to all closed-ended questions and questions that are open-ended.
Table 2 speaks to this comparison as well. We know from its third regression that (a) is 35 − 6 = 29 percentage points. And the final regression shows us that (c) is 21 percentage points. Thus, the difference in correct-response rates between different kinds of closed-ended questions exceeds even the average difference between correct-response rates to closed- and open-ended questions. The difference of differences is 29 − 21 = 8 percentage points (95 per cent CI: [3 per cent, 10 per cent]). This result illustrates even more strongly that differences in estimated knowledge levels depend less on the choice of open- or closed-ended questions than on the number and difficulty of response options.Footnote 10
These results also bear on the recall-recognition distinction in knowledge measurement (e.g., Tedin and Murray 1979). Prior research suggests that open-ended questions measure recall while closed-ended questions measure recognition. Table 2 shows that although this distinction is sensible, it is not always substantial. When the response options are difficult, the proportion of people who can recognize the correct answer is similar to the proportion who can recall it.
Conclusion
Imagine the conclusions that one might draw about the public's knowledge of politics from two different knowledge batteries – batteries that comprise the same closed-ended questions but different response options. From the first battery:
The Supreme Court's decisions affect citizens, businesses, and American political institutions. To determine what people know about the court, we asked twelve closed-ended questions about its justices, its procedures, and the scope of its power. On average, only 39 per cent of respondents knew the answers to our questions. At the extreme, we find that only 28 per cent of respondents knew that three women now serve on the court.
And from the second battery:
On average, fully 68 per cent of respondents knew the answers to our questions. At the extreme, 74 per cent knew that three women now serve on the court.
There is nothing hypothetical about these conclusions or the vast difference between them. Both the negative and the positive conclusions follow from the study that we have just reported, in which questions were held constant but response options were varied.Footnote 11 In the abstract, everyone will grant that ‘easy questions are easy’ and that ‘hard questions are hard’, but no one has previously shown that the choice of some plausible response options instead of others makes such a difference to inferences about the public's level of knowledge. On the contrary, some have expressed strong doubt that response options can make such a difference (Gibson and Caldeira 2009, 435), and the lack of attention to the subject suggests that this doubt may be widely shared.
Of course, scholars have long attended to ways in which the forms of knowledge questions affect estimates of popular knowledge of politics. Perhaps the essential contrast in this research is between open- and closed-ended questions. Unlike closed-ended items, open-ended items offer no response options and thus no frame of reference to anchor respondents’ thinking. They allow estimated knowledge levels to be confounded with differing propensities to say ‘don't know’. They test recall rather than recognition – and recall seems to be the more difficult task (Tedin and Murray 1979). In light of these claims, a common conclusion is that open-ended questions lead us to understate levels of knowledge.
We have extended this line of inquiry by assigning subjects to answer either open- or closed-ended questions, and by further varying the response options of the closed-ended questions. This approach is novel; no previously published studies of political knowledge have randomly assigned some people to see an open-ended question and others to see closed-ended versions of the same question. We find immense variation in correct-response rates across different kinds of closed-ended questions. Indeed, the difference between different kinds of closed-ended questions is greater than the difference between closed-ended questions and their open-ended counterparts. Contrary to prior suggestions, closed-ended questions have no inherent property that leads them to suggest higher knowledge levels.
As the vignette that starts this section suggests, our results point to a deep limitation of efforts to characterize the public's level of knowledge. Characterizations based on any single question – whether open-ended or closed-ended with a single set of response options – never warrant claims of the form ‘X% of people know Y’. Of course, empirical claims are always conditional on other things, and often in minor ways. But the evidence reviewed here suggests that, where political knowledge is concerned, the little-appreciated effects of response options are not minor at all. Scholars may do better to study answers to multiple versions of any given question, rather than to use only a single version of each question. Adopting this practice will produce a wide range of findings about the public's level of knowledge, but that wide range will simply reflect the complexity of the topic.
Supplementary material
An online appendix is available at https://doi.org/10.1017/S0007123421000120. It reports further details about our survey, sample, and screening procedures. It also includes a variety of robustness checks, our pre-analysis plan, details of our analysis of the ANES ‘Chief Justice’ question, and a general discussion of the importance of measuring levels of knowledge.
Data availability statement
Data, code, and replication instructions can be found in Harvard Dataverse at: https://doi.org/10.7910/DVN/YLEY0R.
Acknowledgements
We thank Matthew DeBell for providing some of the ANES data that we use. We also thank Carlos Campos, Carlos Diaz, Noah Howe, and Leila Murphy for research assistance, Markus Prior and Maya Sen for sharing data, and Deborah Beim, Greg Caldeira, Scott Clifford, Jim Gibson, Jennifer Jerit, John Kastellec, Jeff Lax, and Celia Paris for helpful suggestions.
Ethical standards
Our pre-analysis plan was pre-registered at the Political Science Registered Studies Dataverse: see https://doi.org/10.7910/DVN/HU04WI.