
Ordering Effects in Stereotype Scales

Published online by Cambridge University Press:  09 December 2025

L.J. Zigerell*
Affiliation:
Associate Professor of Politics and Government, Illinois State University, Schroeder Hall 404, Normal, IL 61790-4540, USA

Abstract

Stereotypes about groups are commonly measured by asking participants to rate the groups on a scale. However, the percentage of participants who stereotype a group can be affected by the order in which participants are asked to rate the groups. Data from a randomized experiment in the American National Election Studies 2022 Pilot Study indicated that a group was more frequently positively stereotyped relative to another group when the group was asked about first in the pair of groups, compared to when the other group in the pair was asked about first. Researchers are therefore advised to randomize the order of groups in a stereotype battery to evenly spread this ordering effect across groups and are also advised to design stereotype items to minimize this ordering effect.

Information

Type: Short Report
Creative Commons
This is an Open Access article, distributed under the terms of the Creative Commons Attribution-NonCommercial-NoDerivatives licence (https://creativecommons.org/licenses/by-nc-nd/4.0/), which permits non-commercial re-use, distribution, and reproduction in any medium, provided that no alterations are made and the original article is properly cited. The written permission of Cambridge University Press or the rights holder(s) must be obtained prior to any commercial use and/or adaptation of the article.
Copyright
© The Author(s), 2025. Published by Cambridge University Press on behalf of The American Political Science Association

Stereotypes about a group are often measured by asking participants to rate the group on a scale in which one end indicates that most group members possess a positive trait such as “hardworking” and the other end indicates that most group members possess the opposite trait, such as “lazy.” These ratings can then be used as a measure of participant stereotypes about the group, in isolation (e.g., Newman et al. 2021: 1146; Filindra et al. 2022: 967; O’Connell 2025: 225) or relative to other groups (e.g., DeSante and Smith 2020: 974; Jardina and Ollerenshaw 2022: 581; Yadon and Piston 2019: 801). Stereotype ratings have been used to measure phenomena such as prejudice (e.g., Hopkins 2021: 672) and ethnocentrism (e.g., Kinder and Kam 2010: 45; Thompson 2022: 36).

However, if a participant is asked to rate multiple groups, the participant’s stereotype ratings might be affected by the order in which the groups are asked about. For example, if a participant rates the first group asked about at the positive end of the stereotype scale, the participant cannot rate a later group more positively, even if the participant’s stereotype about the later group is more positive than their stereotype about the first group. The present study tested for such an ordering effect in data from an experiment that randomized the order in which participants were asked to rate four groups on a stereotype scale.
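The restriction-of-range mechanism described above can be illustrated with a small simulation. The sketch below is a toy model, not the ANES design: it assumes each participant holds a latent impression of each group, maps the first-asked group's impression onto the 7-point scale, and rates the second group relative to the first, clipped to the scale's ends. The latent distributions and the anchoring rule are assumptions made purely for illustration.

```python
import random

random.seed(0)

LO, HI = 1, 7      # 7-point scale; 1 = positive end (e.g., "hard-working"), 7 = "lazy"
N = 20_000         # simulated participants

def clip(x):
    """Restrict a rating to the scale's range."""
    return max(LO, min(HI, x))

def shares(latents_first, latents_second):
    """Under a simple anchoring assumption -- the second rating equals the
    first rating plus the rounded latent difference, clipped to the scale --
    return the share rating the first-asked group strictly more positively
    (a lower number) and the share rating the second-asked group strictly
    more positively."""
    first_wins = second_wins = 0
    for a, b in zip(latents_first, latents_second):
        r_first = clip(round(a))
        r_second = clip(r_first + round(b - a))
        first_wins += r_first < r_second
        second_wins += r_second < r_first
    n = len(latents_first)
    return first_wins / n, second_wins / n

# Groups A and B have IDENTICAL latent distributions near the positive end,
# so any asymmetry below comes purely from question order.
latent_A = [random.gauss(2.0, 1.5) for _ in range(N)]
latent_B = [random.gauss(2.0, 1.5) for _ in range(N)]

a_first, _ = shares(latent_A, latent_B)   # share rating A more positively, A asked first
_, a_second = shares(latent_B, latent_A)  # share rating A more positively, B asked first
```

Because a participant whose first rating lands on the positive endpoint has no room to rate the second group more positively, `a_first` exceeds `a_second` even though the two groups are, by construction, identical.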

Research design

For this analysis, I used data from the American National Election Studies (ANES) 2022 Pilot Study (American National Election Studies 2022), which YouGov fielded on the internet from 14 through 22 November 2022 with an opt-in sample of 1,585 U.S. citizens aged 18 or older. The survey had exactly two sets of stereotype items, which asked participants to rate, in random order, “Whites,” “Blacks,” “Hispanic-Americans,” and “Asian-Americans” on a scale in which 1 was hard-working and 7 was lazy. Participants were then asked to rate the same groups in the same order on a scale in which 1 was intelligent and 7 was unintelligent. The four groups were presented in a static matrix on large-screen devices and in a dynamic matrix on small-screen devices, but the data did not indicate device type, so I do not report results by matrix type. The data also did not permit comparison of results in which multiple stereotypes were measured on the same webpage (as in the ANES 2022 Pilot Study) to results in which multiple stereotypes were measured on different webpages. I conducted the analysis in Stata 15 (StataCorp 2017) and produced the figure in R (R Core Team 2024) using the tidyverse package (Wickham et al. 2019). Sampling weights were provided for 1,500 participants and were applied to the main text estimates to better match the population of adult U.S. citizens. See the supplement for item text and more information.

Results

Figure 1 illustrates the ordering effect on the frequency of stereotyping for each pair of groups. For example, the estimates in the top row of the left panel indicate that, when Asian-Americans were the first group asked about, 50% of participants rated Asian-Americans as more hard-working than Blacks, but, when Blacks were the first group asked about, only 41% rated Asian-Americans as more hard-working than Blacks, for an ordering effect of 9 percentage points. Across all pairs and stereotypes, the ordering effect ranged from 0 to 18 percentage points, with a median of 8 percentage points.
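The comparison behind each estimate can be sketched as a weighted share computed within each ordering condition. The records, field layout, weights, and values below are hypothetical illustrations, not ANES variable names; the sketch only shows the weighted-share logic.

```python
# Hypothetical per-participant records: which group in the pair was asked
# about first, the two stereotype ratings (1 = most positive), and a
# sampling weight. These names and values are illustrative only.
rows = [
    ("A", 1, 3, 1.2),   # A asked first; rated A more positively than B
    ("A", 4, 4, 0.9),   # A asked first; tie
    ("B", 2, 2, 1.0),   # B asked first; tie
    ("B", 5, 2, 0.8),   # B asked first; rated B more positively than A
    ("B", 1, 3, 1.1),   # B asked first; rated A more positively than B
]

def weighted_share_A_more_positive(rows, first_group):
    """Weighted share rating group A strictly more positively than group B,
    among participants who were asked about `first_group` first."""
    sub = [(ra, rb, w) for g, ra, rb, w in rows if g == first_group]
    total = sum(w for _, _, w in sub)
    wins = sum(w for ra, rb, w in sub if ra < rb)
    return wins / total

# The ordering effect is the gap between the two conditions.
effect = (weighted_share_A_more_positive(rows, "A")
          - weighted_share_A_more_positive(rows, "B"))
```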

Figure 1. Stereotype ordering effects.

Note: Points indicate the percentage of participants who rated the first group in the comparison (listed before the “>”) more positively than the second group when the first group was asked about first among the four groups (black dots) or when the second group was asked about first (white dots). Error bars indicate 83.4% confidence intervals (Payton et al. 2003).
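The 83.4% level is chosen so that two such intervals for independent estimates fail to overlap approximately when a two-sided 5%-level test of the difference would reject; the implied critical value is about 1.96/√2. A minimal normal-approximation sketch for an unweighted proportion follows (the article's weighted intervals would differ; the sample size below is illustrative):

```python
from math import sqrt
from statistics import NormalDist

# Critical value for a two-sided 83.4% interval, roughly 1.96 / sqrt(2),
# so non-overlap of two independent intervals approximates a 5%-level test.
z = NormalDist().inv_cdf(1 - (1 - 0.834) / 2)

def proportion_ci(p_hat, n):
    """Normal-approximation 83.4% confidence interval for a proportion
    (a sketch; assumes simple random sampling, no weights)."""
    se = sqrt(p_hat * (1 - p_hat) / n)
    return p_hat - z * se, p_hat + z * se

lo50, hi50 = proportion_ci(0.50, 1500)   # e.g., a 50% estimate from n = 1,500
```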

Discussion

The ordering effect detected in this analysis is plausibly caused by a restriction of range in which a participant who rates the first group asked about at the end of a stereotype scale has no options remaining on the scale to rate a later group even more extremely. For stereotype batteries that do not randomize the order of all groups, this ordering effect can plausibly cause analyses to mismeasure stereotyping. This mismeasurement of stereotyping can then cause misestimation of the effect of stereotyping when stereotypes are used to predict outcomes such as policy preferences and vote choice.

Randomizing the order in which groups are presented in a stereotype battery has the benefit of spreading the ordering effect evenly across groups. But other designs can reduce the ordering effect itself by reducing the percentage of participants who select an end of the stereotype scale. One option is to increase the number of scale response options (see Chyung et al. 2020) from the traditional seven to eleven or more. Another is to change the scale labels. In the ANES 2022 Pilot Study, the positive ends of the stereotype scales (“hard-working” and “intelligent”) were more commonly selected than the negative ends (“lazy” and “unintelligent”), and some participants might reasonably interpret the midpoints as indicating a negative stereotype about a group, such as being less than intelligent. If the scale ends were instead labeled “very unintelligent” and “very intelligent,” the midpoint could be reasonably interpreted, or explicitly labeled, as indicating that most group members have average intelligence. A third method is to ask participants to compare groups directly, such as indicating whether, compared to Asians, Whites are on average more, less, or as intelligent. For a given stereotype, this would require three items to compare each pair among three groups and six items to compare each pair among four groups, but it could permit clearer inferences about stereotypes because the items are direct.

Supplementary material

The supplementary material for this article can be found at https://doi.org/10.1017/XPS.2025.10026.

Data availability

Data and code to replicate all analyses in this article are available at the Journal of Experimental Political Science Dataverse (Zigerell Reference Zigerell2025) within the Harvard Dataverse Network at: https://doi.org/10.7910/DVN/FYGUWO.

Acknowledgements

The author thanks the peer reviewers for their comments and for ideas such as methods to reduce ordering effects in stereotype scales and ways to better present the results visually.

Competing interests

The ANES 2022 Pilot Study was funded by a National Science Foundation grant SES-2209438 to the University of Michigan. The author did not receive a specific grant for funding this research and reports that there are no competing interests to declare.

Ethics statement

The author’s Institutional Review Board does not require review or approval of research that reports on deidentified data.

Footnotes

This article has earned badges for transparent research practices: Open Data and Open Materials. For details see the Data Availability Statement.

References

American National Election Studies. 2022. ANES 2022 Pilot Study [dataset and documentation]. December 14, 2022 version. www.electionstudies.org
Chyung, Seung Youn, Hutchinson, Douglas, and Shamsy, Jennifer A. 2020. “Evidence-Based Survey Design: Ceiling Effects Associated with Response Scales.” Performance Improvement 59(6): 6–13. doi: 10.1002/pfi.21920
DeSante, Christopher D., and Smith, Candis Watts. 2020. “Less is More: A Cross-Generational Analysis of the Nature and Role of Racial Attitudes in the Twenty-First Century.” The Journal of Politics 82(3): 967–80. doi: 10.1086/707490
Filindra, Alexandra, Kaplan, Noah J., and Buyuker, Beyza E. 2022. “Beyond Performance: Racial Prejudice and Whites’ Mistrust of Government.” Political Behavior 44(2): 961–79. doi: 10.1007/s11109-022-09774-6
Hopkins, Daniel J. 2021. “The Activation of Prejudice and Presidential Voting: Panel Evidence from the 2016 US Election.” Political Behavior 43(2): 663–86. doi: 10.1007/s11109-019-09567-4
Jardina, Ashley, and Ollerenshaw, Trent. 2022. “The Polls—Trends: The Polarization of White Racial Attitudes and Support for Racial Equality in the US.” Public Opinion Quarterly 86(S1): 576–87. doi: 10.1093/poq/nfac021
Kinder, Donald R., and Kam, Cindy D. 2010. Us against Them: Ethnocentric Foundations of American Opinion. University of Chicago Press.
Newman, Benjamin, Merolla, Jennifer L., Shah, Sono, Lemi, Danielle Casarez, Collingwood, Loren, and Karthick Ramakrishnan, S. 2021. “The Trump Effect: An Experimental Investigation of the Emboldening Effect of Racially Inflammatory Elite Communication.” British Journal of Political Science 51(3): 1138–59. doi: 10.1017/S0007123419000590
O’Connell, Heather A. 2025. “Confederate Monuments and Anti-Black Stereotypes in the US South.” Sociology of Race and Ethnicity 11(2): 221–36. doi: 10.1177/23326492241264234
Payton, Mark E., Greenstone, Matthew H., and Schenker, Nathaniel. 2003. “Overlapping Confidence Intervals or Standard Error Intervals: What Do They Mean in Terms of Statistical Significance?” Journal of Insect Science 3(1): 34. doi: 10.1673/031.003.3401
R Core Team. 2024. R: A Language and Environment for Statistical Computing. R Foundation for Statistical Computing, Vienna, Austria. https://www.R-project.org/
StataCorp. 2017. Stata Statistical Software: Release 15. College Station, TX: StataCorp LLC.
Thompson, Jack. 2022. “Ethnonationalism and White Immigration Attitudes.” Nations and Nationalism 28(1): 31–46. doi: 10.1111/nana.12754
Wickham, Hadley, Averick, Mara, Bryan, Jennifer, Chang, Winston, McGowan, Lucy D’Agostino, François, Romain, Grolemund, Garrett, Hayes, Alex, Henry, Lionel, Hester, Jim, Kuhn, Max, Pedersen, Thomas Lin, Miller, Evan, Bache, Stephan Milton, Müller, Kirill, Ooms, Jeroen, Robinson, David, Seidel, Dana Paige, Spinu, Vitalie, Takahashi, Kohske, Vaughan, Davis, Wilke, Claus, Woo, Kara, and Yutani, Hiroaki. 2019. “Welcome to the Tidyverse.” Journal of Open Source Software 4(43): 1686. doi: 10.21105/joss.01686
Yadon, Nicole, and Piston, Spencer. 2019. “Examining Whites’ Anti-Black Attitudes after Obama’s Presidency.” Politics, Groups, and Identities 7(4): 794–814. doi: 10.1080/21565503.2018.1438953
Zigerell, L.J. 2025. Replication Data for: “Ordering Effects in Stereotype Scales.” Harvard Dataverse. doi: 10.7910/DVN/FYGUWO
