Sampling complex social and behavioral phenomena

Henrik Olsson; Mirta Galesic

doi:10.1017/S0140525X23002327

Published online by Cambridge University Press: 05 February 2024

Henrik Olsson: Complexity Science Hub, Vienna, Austria; Santa Fe Institute, Santa Fe, NM, USA. olsson@santafe.edu; https://www.santafe.edu/people/profile/henrik-olsson
Mirta Galesic*: Complexity Science Hub, Vienna, Austria; Santa Fe Institute, Santa Fe, NM, USA. galesic@santafe.edu; https://www.santafe.edu/people/profile/mirta-galesic
*Corresponding author.

Abstract

We comment on the limits of relying on prior literature when constructing the design space for an integrative experiment; on the adaptive nature of social and behavioral phenomena and its implications for the use of theory and modeling when constructing the design space; and on the challenges of distinguishing random measurement error from lab-related biases without replication.

Type: Open Peer Commentary
Information: Behavioral and Brain Sciences, Volume 47, 2024, e55

DOI: https://doi.org/10.1017/S0140525X23002327
Copyright: Copyright © The Author(s), 2024. Published by Cambridge University Press

We welcome this thoughtful and creative set of ideas for improving experimentation in the social sciences. We offer several points for discussion that might further clarify and strengthen the authors’ arguments.

First, how should the design space be constructed? The authors suggest that the design space from which researchers sample aspects of the phenomena of interest can be constructed mostly by reviewing past literature. However, past studies are often a biased sample of the phenomena of interest, driven by the implicit or explicit theories their authors held at the time, by methodological limitations, or by adherence to a particular experimental paradigm.

An example from the judgment and decision-making literature is the phenomenon of overconfidence. The assumption that an experimenter can choose “good general knowledge items” led to results suggesting that people almost always show overconfidence. But using the Brunswikian ideas of representative design, later studies (Gigerenzer, Hoffrage, & Kleinbölting, 1991; Juslin, 1994) showed that the items that had been previously selected were not representative of the whole population of items people experience in the real world. By randomly sampling from the whole population of items, which approximates representative design, studies showed that the overconfidence effect is not as general as previously thought (Juslin, Olsson, & Björkman, 1997; Juslin, Winman, & Olsson, 2000).

Another example is research on risky choices, where traditionally participants have been presented with summary descriptions of different options. Later research has shown that risky choices can be very different when people sample from the options themselves rather than relying on a description (Hertwig, Barron, Weber, & Erev, 2004; Lejarraga & Hertwig, 2021; Wulff, Mergenthaler-Canseco, & Hertwig, 2018). A design space built solely from the earlier description-based studies of risky choice would not have revealed these insights.

Of course, new dimensions can always be added to the design space as they are discovered by new research, but this poses a practical problem: the number of experiments that could potentially be conducted grows rapidly. We therefore propose two ideas for a more exhaustive construction of the design space. One is to sample the phenomenon of interest directly. For example, Brunswik would sample participants’ behavior at random intervals over several weeks, recording the behavior of interest as it occurs in the participants’ natural environments (Brunswik, 1944). With today's technological developments, such experience-based sampling becomes easier to do and might be a way toward a more exhaustive grasp of the phenomenon of interest.
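The random-interval sampling described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name `random_sampling_schedule`, the 09:00 to 21:00 waking window, and the number of prompts per day are hypothetical choices, not details taken from Brunswik (1944) or from the target article.

```python
import random
from datetime import datetime, timedelta


def random_sampling_schedule(start, days, prompts_per_day,
                             day_start_hour=9, day_end_hour=21, seed=None):
    """Draw random prompt times within a daily window (hypothetical helper).

    `start` is assumed to be midnight of the first study day.
    """
    rng = random.Random(seed)
    window_minutes = (day_end_hour - day_start_hour) * 60
    schedule = []
    for day in range(days):
        day_begin = start + timedelta(days=day, hours=day_start_hour)
        # Sample distinct minute offsets without replacement, then sort them
        # so prompts within a day are in chronological order.
        offsets = sorted(rng.sample(range(window_minutes), prompts_per_day))
        schedule.extend(day_begin + timedelta(minutes=m) for m in offsets)
    return schedule


# Illustrative settings: two weeks, five prompts per day.
schedule = random_sampling_schedule(datetime(2024, 1, 1), days=14,
                                    prompts_per_day=5, seed=1)
```

Each entry in `schedule` is a moment at which a participant would be asked to record the behavior of interest. Because the times are drawn at random rather than chosen by the experimenter, the recorded situations approximate a representative sample of the participant's day.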

The other way to improve the construction of the design space is to do it collectively by many labs, in particular labs situated in different disciplines. For example, decades of research in social psychology suggest many different biases in human social cognition, which are often contradictory (Krueger & Funder, 2004). A tighter integration of psychology and network science has enabled recognizing how some of these biases in fact reflect a well-adapted cognition in specific social network structures (Dawes, 1989; Galesic, Olsson, & Rieskamp, 2018; Lee et al., 2019; Lerman, Yan, & Wu, 2016).

Second, how should we deal with the adaptive nature of complex social systems? As the authors point out, social and behavioral phenomena are typically caused by many interacting factors that can be hard to pin down. An additional, often overlooked property of these social-cognitive systems is that they are adaptive: They change over time in response to internal and external factors. As a consequence, even the most detailed static picture of these systems would not provide a full understanding of the underlying dynamics. This is of course a problem for both one-shot and integrative experiments, and it can be addressed by conducting longitudinal studies of these systems, coupled with theoretical development. For integrative experiments, however, longitudinal studies introduce additional complication and cost: time becomes one more dimension that multiplies the already large design space.

This explosion of potentially important dimensions in integrative experiment design could be tamed by assigning a stronger role to theory and modeling. The article focuses mostly on their role in interpreting the results of samples taken from an already constructed design space. However, theory and computational models seem essential already in the construction of the design space. In particular, an integrative theoretical framework, constructed by the collective, strongly interdisciplinary effort mentioned above, could be a useful starting point for developing the initial design space. Such a collective effort could also help recognize parts of the space that are implausible and would hardly be expected to occur in the real world. Then, computational modeling could be used to further narrow down the space by investigating which of the dimensions could have a meaningful influence on the results. Such models could show that some apparently important dimensions have only a marginal influence on system performance. Recognizing this could significantly narrow the otherwise vast space of possible experiments that could be run.
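A minimal sketch of how a computational model might be used to screen dimensions is given below. Everything in it is an assumption made for illustration: `toy_model` is a hypothetical stand-in for a real computational model, the dimension names and levels are invented, and the threshold is arbitrary. The screening varies one dimension at a time around a baseline, so it would miss dimensions that matter only through interactions; a real application would need a richer sensitivity analysis.

```python
def toy_model(group_size, noise_level, incentive):
    """Hypothetical stand-in for a computational model of the phenomenon."""
    return 0.5 * group_size + 10.0 * noise_level + 0.01 * incentive


def screen_dimensions(model, levels, baseline, threshold):
    """One-at-a-time screening: vary each dimension, hold the rest at baseline."""
    influence = {}
    for name, values in levels.items():
        outcomes = []
        for value in values:
            settings = dict(baseline)   # copy so the baseline is not mutated
            settings[name] = value
            outcomes.append(model(**settings))
        # Influence = spread of model outcomes across this dimension's levels.
        influence[name] = max(outcomes) - min(outcomes)
    kept = [name for name, spread in influence.items() if spread >= threshold]
    return influence, kept


# Invented dimensions and levels, for illustration only.
levels = {
    "group_size": [2, 5, 10, 20],
    "noise_level": [0.0, 0.25, 0.5],
    "incentive": [0, 5, 10],
}
baseline = {"group_size": 5, "noise_level": 0.25, "incentive": 5}

influence, kept = screen_dimensions(toy_model, levels, baseline, threshold=1.0)
```

In this toy setting the `incentive` dimension changes the outcome by only about 0.1 across its range and falls below the threshold, so it would be dropped from the design space, while `group_size` and `noise_level` are retained.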

Third, what does it mean when results of experiments at particular points in the design space fail to generalize to other points? The authors suggest that this might point to an important missing dimension or even a fundamental limit of explanation of a particular phenomenon. However, the reason may be more prosaic: inevitable random measurement error. This suggests that integrative design experiments, just like one-at-a-time experiments, should be replicated. Replication would allow researchers to approximate confidence intervals around each of the samples from the design space and to recognize which apparent differences between points can be expected by chance. Moreover, it is likely that beyond random error, experiments conducted by any single lab will have some systematic biases stemming from lab-specific practices that can be hard to recognize without explicitly comparing labs. Different data analysts are also likely to reach different conclusions even from exactly the same data, so different labs conducting experiments from the same design space could reach different conclusions (Breznau et al., 2022). To the extent that integrative design experiments require resources that will limit them to a few larger labs, these biases could go unnoticed.
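The role of replication can be illustrated with a short sketch. The effect estimates below are made-up numbers for three hypothetical design points, each replicated five times; they are not data from any study. The interval uses a normal approximation (z = 1.96), which is rough for so few replications (a t-based interval would be wider), and checking whether intervals overlap is a heuristic rather than a formal test.

```python
import statistics


def mean_ci(values, z=1.96):
    """Approximate 95% confidence interval for the mean (normal approximation)."""
    mean = statistics.mean(values)
    standard_error = statistics.stdev(values) / len(values) ** 0.5
    return mean - z * standard_error, mean + z * standard_error


def intervals_overlap(a, b):
    """True if two (low, high) intervals share at least one point."""
    return a[0] <= b[1] and b[0] <= a[1]


# Made-up effect estimates from five replications at each design point.
replications = {
    "point_A": [0.52, 0.48, 0.55, 0.50, 0.47],
    "point_B": [0.51, 0.56, 0.49, 0.53, 0.50],
    "point_C": [0.71, 0.69, 0.74, 0.70, 0.72],
}

intervals = {name: mean_ci(values) for name, values in replications.items()}

# A and B differ by an amount that chance alone could plausibly produce;
# A and C do not.
a_vs_b = intervals_overlap(intervals["point_A"], intervals["point_B"])
a_vs_c = intervals_overlap(intervals["point_A"], intervals["point_C"])
```

With a single run per design point, the difference between `point_A` and `point_B` could be mistaken for a failure to generalize. With replications, their intervals overlap, whereas the interval for `point_C` is clearly separated from both.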

Competing interest

None.

References

Breznau, N., Rinke, E. M., Wuttke, A., Nguyen, H. H., Adem, M., Adriaans, J., … Van Assche, J. (2022). Observing many researchers using the same data and hypothesis reveals a hidden universe of uncertainty. Proceedings of the National Academy of Sciences of the United States of America, 119, e2203150119.

Brunswik, E. (1944). Distal focussing of perception: Size-constancy in a representative sample of situations. Psychological Monographs, 56, 1–49.

Dawes, R. M. (1989). Statistical criteria for establishing a truly false consensus effect. Journal of Experimental Social Psychology, 25, 1–17.

Galesic, M., Olsson, H., & Rieskamp, J. (2018). A sampling model of social judgment. Psychological Review, 125, 363–390.

Gigerenzer, G., Hoffrage, U., & Kleinbölting, H. (1991). Probabilistic mental models: A Brunswikian theory of confidence. Psychological Review, 98, 506–528.

Hertwig, R., Barron, G., Weber, E. U., & Erev, I. (2004). Decisions from experience and the effect of rare events in risky choice. Psychological Science, 15, 534–539.

Juslin, P. (1994). The overconfidence phenomenon as a consequence of informal experimenter-guided selection of almanac items. Organizational Behavior and Human Decision Processes, 57, 226–246.

Juslin, P., Olsson, H., & Björkman, M. (1997). Brunswikian and Thurstonian origins of bias in probability assessment: On the origin and nature of stochastic components of judgment. Journal of Behavioral Decision Making, 10, 189–209.

Juslin, P., Winman, A., & Olsson, H. (2000). Naive empiricism and dogmatism in confidence research: A critical examination of the hard-easy effect. Psychological Review, 107, 384–396.

Krueger, J. I., & Funder, D. C. (2004). Towards a balanced social psychology: Causes, consequences, and cures for the problem-seeking approach to social behavior and cognition. Behavioral and Brain Sciences, 27, 313–327.

Lee, E., Karimi, F., Wagner, C., Jo, H.-H., Strohmaier, M., & Galesic, M. (2019). Homophily and minority-group size explain perception biases in social networks. Nature Human Behaviour, 3, 1078–1087.

Lejarraga, T., & Hertwig, R. (2021). How experimental methods shaped views on human competence and rationality. Psychological Bulletin, 147(6), 535–564.

Lerman, K., Yan, X., & Wu, X. Z. (2016). The “majority illusion” in social networks. PLoS ONE, 11(2), e0147617.

Wulff, D. U., Mergenthaler-Canseco, M., & Hertwig, R. (2018). A meta-analytic review of two modes of learning and the description-experience gap. Psychological Bulletin, 144, 140–176.