Sampling complex social and behavioral phenomena
Published online by Cambridge University Press: 05 February 2024
Abstract
We comment on the limits of relying on prior literature when constructing the design space for an integrative experiment; the adaptive nature of social and behavioral phenomena and the implications for the use of theory and modeling when constructing the design space; and on the challenges of measuring random errors and lab-related biases in measurement without replication.
- Type: Open Peer Commentary
- Copyright: © The Author(s), 2024. Published by Cambridge University Press
References
Breznau, N., Rinke, E. M., Wuttke, A., Nguyen, H. H., Adem, M., Adriaans, J., … Van Assche, J. (2022). Observing many researchers using the same data and hypothesis reveals a hidden universe of uncertainty. Proceedings of the National Academy of Sciences of the United States of America, 119, e2203150119.
Brunswik, E. (1944). Distal focussing of perception: Size-constancy in a representative sample of situations. Psychological Monographs, 56, 1–49.
Dawes, R. M. (1989). Statistical criteria for establishing a truly false consensus effect. Journal of Experimental Social Psychology, 25, 1–17.
Galesic, M., Olsson, H., & Rieskamp, J. (2018). A sampling model of social judgment. Psychological Review, 125, 363–390.
Gigerenzer, G., Hoffrage, U., & Kleinbölting, H. (1991). Probabilistic mental models: A Brunswikian theory of confidence. Psychological Review, 98, 506–528.
Hertwig, R., Barron, G., Weber, E. U., & Erev, I. (2004). Decisions from experience and the effect of rare events in risky choice. Psychological Science, 15, 534–539.
Juslin, P. (1994). The overconfidence phenomenon as a consequence of informal experimenter-guided selection of almanac items. Organizational Behavior and Human Decision Processes, 57, 226–246.
Juslin, P., Olsson, H., & Björkman, M. (1997). Brunswikian and Thurstonian origins of bias in probability assessment: On the origin and nature of stochastic components of judgment. Journal of Behavioral Decision Making, 10, 189–209.
Juslin, P., Winman, A., & Olsson, H. (2000). Naive empiricism and dogmatism in confidence research: A critical examination of the hard-easy effect. Psychological Review, 107, 384–396.
Krueger, J. I., & Funder, D. C. (2004). Towards a balanced social psychology: Causes, consequences, and cures for the problem-seeking approach to social behavior and cognition. Behavioral and Brain Sciences, 27, 313–327.
Lee, E., Karimi, F., Wagner, C., Jo, H.-H., Strohmaier, M., & Galesic, M. (2019). Homophily and minority-group size explain perception biases in social networks. Nature Human Behaviour, 3, 1078–1087.
Lejarraga, T., & Hertwig, R. (2021). How experimental methods shaped views on human competence and rationality. Psychological Bulletin, 147(6), 535–564.
Lerman, K., Yan, X., & Wu, X. Z. (2016). The “majority illusion” in social networks. PLoS ONE, 11(2), e0147617.
Wulff, D. U., Mergenthaler-Canseco, M., & Hertwig, R. (2018). A meta-analytic review of two modes of learning and the description-experience gap. Psychological Bulletin, 144, 140–176.
We welcome this thoughtful and creative set of ideas for improving experimentation in the social sciences. We offer several points for discussion that might further clarify and strengthen the authors’ arguments.
First, how should the design space be constructed? The authors suggest that the design space from which researchers can sample various aspects of the phenomena of interest can be constructed mostly by reviewing past literature. However, past studies are often a biased sample of the phenomena of interest, driven by the implicit or explicit theories their authors held at the time, by methodological limitations, or by adherence to a particular experimental paradigm.
An example from the judgment and decision-making literature is the phenomenon of overconfidence. The assumption that an experimenter can choose “good general knowledge items” led to results suggesting that people almost always show overconfidence. But using the Brunswikian ideas of representative design, later studies (Gigerenzer, Hoffrage, & Kleinbölting, 1991; Juslin, 1994) showed that the items that had been previously selected were not representative of the whole population of items people experience in the real world. By randomly sampling from the whole population of items, which approximates representative design, studies showed that the overconfidence effect is not as general as previously thought (Juslin, Olsson, & Björkman, 1997; Juslin, Winman, & Olsson, 2000).
Another example is research on risky choices, where traditionally participants have been presented with summary descriptions of different options. Later research has shown that risky choices can be very different when people sample from the options themselves rather than relying on a description (Hertwig, Barron, Weber, & Erev, 2004; Lejarraga & Hertwig, 2021; Wulff, Mergenthaler-Canseco, & Hertwig, 2018). Relying solely on earlier psychological studies to understand risky choice would not have revealed these insights.
Of course, new dimensions can always be added to the design space as they are discovered by new research, but this poses a practical problem: the number of experiments that could potentially be conducted grows rapidly. We therefore propose two ideas for a more exhaustive construction of the design space. One is to sample the phenomenon of interest directly. For example, Brunswik would sample participants’ behavior at random intervals over several weeks, recording the behavior of interest as it occurs in the participants’ natural environments (Brunswik, 1944). With today's technological developments, such experience-based sampling becomes easier to do and might be a way toward a more exhaustive grasp of the phenomenon of interest.
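As a rough illustration of sampling behavior at random intervals, the following sketch draws random prompt times within a daily waking window over a study period. The function name, the waking window of 09:00 to 21:00, and the number of prompts per day are our own illustrative assumptions, not details of Brunswik's procedure.

```python
import random
from datetime import datetime, timedelta


def random_prompt_times(start, days, prompts_per_day,
                        day_start_hour=9, day_end_hour=21, seed=None):
    """Draw random prompt times for each day of a study.

    `start` is assumed to be midnight of the first study day.
    All parameter names and defaults are hypothetical.
    """
    rng = random.Random(seed)
    window_minutes = (day_end_hour - day_start_hour) * 60
    times = []
    for day in range(days):
        day_begin = start + timedelta(days=day, hours=day_start_hour)
        # Sample distinct minute offsets within the daily window.
        offsets = sorted(rng.sample(range(window_minutes), prompts_per_day))
        times.extend(day_begin + timedelta(minutes=m) for m in offsets)
    return times


if __name__ == "__main__":
    for t in random_prompt_times(datetime(2024, 1, 1), days=2,
                                 prompts_per_day=3, seed=1):
        print(t.isoformat())
```

In practice, a schedule of this kind would be delivered through a smartphone or wearable device, and the behavior recorded at each prompt would then feed into the design space.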
The other way to improve the construction of the design space is to do it collectively, across many labs, in particular labs situated in different disciplines. For example, decades of research in social psychology suggest many different biases in human social cognition, which are often contradictory (Krueger & Funder, 2004). A tighter integration of psychology and network science has made it possible to recognize how some of these biases in fact reflect well-adapted cognition in specific social network structures (Dawes, 1989; Galesic, Olsson, & Rieskamp, 2018; Lee et al., 2019; Lerman, Yan, & Wu, 2016).
Second, how should we deal with the adaptive nature of complex social systems? As the authors point out, social and behavioral phenomena are typically caused by many interacting factors that can be hard to pin down. An additional, often overlooked property of these social-cognitive systems is that they are adaptive: They change over time in response to internal and external factors. As a consequence, even the most detailed static picture of these systems would not provide a full understanding of the underlying dynamics. This is of course a problem for both one-shot and integrative experiments, and it can be addressed by conducting longitudinal studies of these systems, coupled with theoretical development. For integrative experiments, however, it introduces the additional complication and cost of longitudinal studies, which multiplies the already large number of dimensions of the design space.
This explosion of potentially important dimensions in integrative experiment design could be tamed by assigning a stronger role to theory and modeling. The article focuses mostly on their role in interpreting the results of samples taken from an already constructed design space. However, theory and computational models seem essential already in the construction of the design space. In particular, an integrative theoretical framework constructed through the collective, strongly interdisciplinary effort mentioned above could be a useful starting point for developing the initial design space. Such a collective effort could also help recognize parts of the space that are implausible and would hardly be expected to occur in the real world. Then, computational modeling could be used to further narrow down the space by investigating which of the dimensions could have a meaningful influence on the results. Such models could show that some apparently important dimensions have only a marginal influence on system performance. Recognizing this could significantly narrow the otherwise vast space of possible experiments that could be run.
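To make the idea of model-based narrowing concrete, here is a minimal sketch of one simple way to screen dimensions: vary each dimension across its range while holding the others at a baseline, and keep only those whose variation changes the model output by more than a threshold. The toy model, the dimension names, and the threshold are hypothetical stand-ins for a real computational model. One-at-a-time screening of this kind ignores interactions between dimensions, so it should be read as a first pass only.

```python
def toy_model(group_size, incentive, time_pressure):
    """Hypothetical outcome model; a placeholder for a real computational model."""
    return 0.5 * incentive + 0.3 * (group_size / 10) + 0.001 * time_pressure


def screen_dimensions(model, baseline, ranges, threshold):
    """Return dimensions whose one-at-a-time variation shifts the output
    by at least `threshold`, mapped to the size of that shift."""
    influential = {}
    for name, values in ranges.items():
        # Vary one dimension; hold all others at their baseline values.
        outputs = [model(**{**baseline, name: v}) for v in values]
        spread = max(outputs) - min(outputs)
        if spread >= threshold:
            influential[name] = spread
    return influential


if __name__ == "__main__":
    baseline = {"group_size": 5, "incentive": 0.5, "time_pressure": 0.5}
    ranges = {
        "group_size": [2, 5, 10],
        "incentive": [0.0, 0.5, 1.0],
        "time_pressure": [0.0, 0.5, 1.0],
    }
    print(screen_dimensions(toy_model, baseline, ranges, threshold=0.05))
```

In this toy setup, time pressure barely moves the output and would be dropped from the initial design space. A more careful analysis would use a global sensitivity method that also captures interactions.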
Third, what does it mean when results of experiments at particular points in the design space fail to generalize to other points? The authors suggest that this might point to an important missing dimension or even a fundamental limit of explanation of a particular phenomenon. However, it is also possible that the reason is more prosaic and merely reflects inevitable random measurement error. This suggests that integrative experiments, just like one-at-a-time experiments, should be replicated. Replication would allow researchers to approximate confidence intervals around each of the samples from the design space and recognize which apparent differences between points can be expected by chance. Moreover, it is likely that beyond random error, experiments conducted by any single lab will have some systematic biases stemming from lab-specific practices that can be hard to recognize without explicitly comparing labs. Different data analysts are also likely to reach different conclusions even from exactly the same data, so different labs conducting experiments from the same design space could reach different conclusions (Breznau et al., 2022). To the extent that integrative experiments require resources that will limit them to a few larger labs, these biases could go unnoticed.
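The role of replication can be sketched as follows: given several replicated effect estimates at each of two design points, approximate a confidence interval around each mean and check whether the intervals overlap. The simulated effects, the noise level, and the function names are illustrative assumptions. The normal approximation is crude for a handful of replications, and interval overlap is a heuristic rather than a formal test of the difference.

```python
import random
import statistics


def simulate_replications(true_effect, noise_sd, n_replications, rng):
    """Simulate effect estimates from independent replications at one design point."""
    return [rng.gauss(true_effect, noise_sd) for _ in range(n_replications)]


def confidence_interval(estimates, z=1.96):
    """Approximate 95% interval for the mean estimate (normal approximation)."""
    mean = statistics.mean(estimates)
    sem = statistics.stdev(estimates) / len(estimates) ** 0.5
    return mean - z * sem, mean + z * sem


def intervals_overlap(ci_a, ci_b):
    """True if two (lower, upper) intervals share at least one point."""
    return ci_a[0] <= ci_b[1] and ci_b[0] <= ci_a[1]


if __name__ == "__main__":
    rng = random.Random(42)
    # Two design points with the same hypothetical true effect.
    point_a = simulate_replications(0.30, 0.10, 5, rng)
    point_b = simulate_replications(0.30, 0.10, 5, rng)
    ci_a, ci_b = confidence_interval(point_a), confidence_interval(point_b)
    print(ci_a, ci_b, intervals_overlap(ci_a, ci_b))
```

Without the replications, a single estimate at each point would offer no way of telling a chance difference from a genuine failure to generalize.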
Competing interest
None.