Recognizing Sample-Selection Bias in Historical Data
Published online by Cambridge University Press: 06 July 2020
Abstract
Recent research has ignited a debate in social science history over whether and how to draw conclusions for whole populations from sources that describe only select subsets of these populations. The idiosyncratic availability and survival of historical sources create a threat of sample-selection bias, an error that arises when there are systematic differences between the observed sample and the population of interest. This danger is common in studying trends in health as measured by average stature: scholars can often observe these trends only for soldiers and other similar groups, but whether these patterns are representative of those of the broader population is unclear. This article illustrates which simple patterns in a potentially selected sample can be used to recognize the presence of sample-selection bias in a source, and to understand how such bias might affect conclusions drawn from this source. Applying this intuition to the use of military data to describe stature in the antebellum United States, I present several simple empirical exercises based on these patterns. Finally, I use the results of these exercises to describe how sample-selection bias might affect the use of these data in testing for differences in average stature between the Northeast and the Midwest.
- Type: Special Issue Article
- Information: Social Science History, Volume 44, Special Issue 3: Selection Bias and Social Science History, Fall 2020, pp. 525-554
- Copyright: © The Author(s), 2020. Published by Cambridge University Press on behalf of the Social Science History Association