The future of experimental design: Integrative, but is the sample diverse enough?

Sakshi Ghai; Sanchayan Banerjee

doi:10.1017/S0140525X23002212

Published online by Cambridge University Press: 05 February 2024

Sakshi Ghai: Department of Psychology, Cambridge University, Cambridge, UK. sg912@cam.ac.uk; https://www.psychol.cam.ac.uk/staff/sakshi-ghai
Sanchayan Banerjee*: Environmental Economics, Institute for Environmental Studies, Vrije Universiteit Amsterdam, Amsterdam, Netherlands. S.Banerjee@vu.nl; https://research.vu.nl/en/persons/sanchayan-banerjee
*Corresponding author.

Abstract

Almaatouq et al. propose an “integrative approach” to increase the generalisability and commensurability of experiments. Yet their metascientific approach has one glaring omission (and misinterpretation of) – the role of sample diversity in generalisability. In this commentary, we challenge false notions of subsumed duality between contexts, population, and diversity, and propose modifications to their design space to accommodate sample diversity.

Type: Open Peer Commentary
Information: Behavioral and Brain Sciences, Volume 47, 2024, e42

DOI: https://doi.org/10.1017/S0140525X23002212
Copyright: Copyright © The Author(s), 2024. Published by Cambridge University Press

Almaatouq et al. propose an “integrative approach” to increase the generalisability and commensurability of experiments. They suggest systematically sampling and testing a subset of experiments – chosen randomly from a design space – to draw inferences about the population of all potential experiments. Yet their metascientific approach, which loosely translates the “potential outcomes framework” underlying experiments (Holland, 1986; Neyman, 1923; Rubin, 1977) to the science of experimentation itself, has one glaring omission (and misinterpretation): the role of sample diversity in reasoning about generalisability. Their suggestion that “an explicit, systematic mapping of research designs to points in the design space (research cartography) ensures commensurability” (target article, sect. 3.1, para. 6) is, at best, an ex-post exercise to ensure validity, based on the myopic assumption that “there is nothing special about the subjects…in principle, what goes for subjects also holds for contexts” (target article, sect. 2.2, para. 4). In this commentary, we challenge these false notions of subsumed duality between contexts, population, and sample diversity, and highlight the importance of diversity in ensuring commensurability. We propose modifications to their design space to incorporate sample diversity.

Almaatouq et al. outline their design space as a Cartesian product of two factors – population, which is “a set of measurable attributes that characterizes the sample of participants,” and context, which is a “set of independent variables hypothesized to moderate the effect in question as well as the nuisance parameter” (target article, sect. 2.2, para. 2). Their conceptualisation, however, fails to account for myriad factors arising from the sample and the sampling technique itself, which affect the scope of any experiment. We outline three challenges resulting from this.

First, as defined, “population” does not account for representativeness, such as cultural outliers in social and behavioural science experiments, a point argued extensively by critics of metatheories of behaviour (Arnett, 2009; Henrich, Heine, & Norenzayan, 2010). This evident oversight on diversity underplays the role that sample features can play in introducing biases to experiments, invariably leading to methodological narrowness and generating spurious and misleading results (Gurven, 2018; Rad, Martingano, & Ginges, 2018). An integrative framework, therefore, must measure diversity, both between and within countries (Ghai, Fassi, Awadh, & Orben, 2023). Without careful consideration of the representativeness of the selected sampling approaches and the match of samples to population – be it through crowdsourcing platforms, distributed collaborative networks of different laboratories, or sophisticated machine-learning algorithms – integrative experimental techniques will continue to yield noisy and biased results that are inapplicable beyond specific population samples.

Second, integrative techniques must grapple with the limitations of not only imperfect sampling approaches but also the limiting assumptions in current disciplinary theories (Medin, Ojalehto, Marin, & Bang, 2017). This is particularly important in the context of Almaatouq et al., who state that the “ultimate goal” of experiments is to arrive at a comprehensive theoretical understanding of experimental insights. Here, however, the authors assume (falsely) that metatheories emerging from the design space will naturally lead to heterogeneity and guarantee commensurability. While mapping theoretical boundaries and engaging in meta-metatheoretical reflection when applying integrative experimental approaches can be valuable for understanding the generalisability of existing theories, this does not necessarily address the underlying structural issues contributing to the lack of theoretical diversity (Haeffel & Cobb, 2022). Our critique speaks broadly to the need for the behavioural sciences to see and reason about complex adaptive systems with diverse samples (see Banerjee & Mitra, 2023; Hallsworth, 2023).

Third, acknowledging diversity is important since the costs of running integrative metaexperiments are largely unequal, thereby excluding researchers and relevant stakeholders in the Global South from generating integrated experimental insights. Since Almaatouq et al. suggest that an “integrative approach would start by identifying the dimensions… as suggested … by prior research” (target article, sect. 3.1, para. 4), it is likely that this space will then suffer from publication biases. Their claim that an integrative approach “will actually broaden the range of people involved in behavioral research” (target article, sect. 5.7, para. 1) is, at best, misguided, given this drawback. Integrative methods may be transferable, but such initiatives are expected to be concentrated and accepted mostly in Western contexts (Singh, 2022). Thus, while we share the optimism for large-scale collaborative science, we are less confident in the ability to draw robust, generalisable conclusions by relying on integrative approaches only. Overcoming the epistemic marginalisation of underrepresented groups in integrative experimental designs is arguably important to achieve this.

In view of these critiques, we propose a modification to their design space that we think is necessary to unlock the power of the integrative approach.

Our proposition relates to explicitly measuring (sample) diversity to quantify the heterogeneity within the sample, advancing the goals of Almaatouq et al. One approach might be to use a scalar measure of representativeness for all population and contextual characteristics, for any given experimental point. This scalar measure can then be used to transform and reweight the design space for generalisability. For example, the authors' conceptualisation of the population space merely accounts “for a set of measurable attributes” rather than a rich and diverse set of measurable attributes “that characterizes the sample of participants” (target article, sect. 2.2, para. 2). As such, their original design space is unduly influenced by certain population subsamples more than others. Here, sampled experimental points cannot be fully representative of all potential experiments. By contrast, our approach of first measuring diversity as a scalar index and then using it to transform these factors of the design space, much like a weighted sampling approach, increases the reliability of integrative experiments (Deffner, Rohrer, & McElreath, 2022). One limitation of this approach is that scalar measures of diversity may vary depending on the population, as what even counts as a diverse sample will differ widely between Global North and Global South contexts (Ghai, 2022). Here we call on the field to develop new ways to increase global diversity and analyse which of these might work best in optimising the design space (Tang, Suganthan, & Yao, 2006).
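The proposal above can be made concrete with a minimal sketch. The commentary does not specify which scalar index to use or how the reweighting should be carried out, so everything here is an assumption made for illustration: normalised Shannon entropy stands in for the diversity index, the population and context labels and the group counts are hypothetical, and experimental points are drawn with probability proportional to the diversity of their sample.

```python
import itertools
import math
import random


def normalized_shannon_diversity(group_counts):
    """Shannon entropy of group proportions, scaled to [0, 1].

    Assumption: entropy is only one possible scalar diversity index;
    the commentary does not prescribe a specific measure.
    """
    total = sum(group_counts)
    n_groups = len(group_counts)
    if total == 0 or n_groups < 2:
        return 0.0
    entropy = 0.0
    for count in group_counts:
        if count > 0:
            p = count / total
            entropy -= p * math.log(p)
    return entropy / math.log(n_groups)


# Hypothetical design space: Cartesian product of population and context.
populations = ["students", "online_panel", "community_sample"]
contexts = ["low_stakes", "high_stakes"]
design_space = list(itertools.product(populations, contexts))

# Hypothetical participant counts per demographic group in each population.
sample_composition = {
    "students": [90, 5, 5],
    "online_panel": [60, 25, 15],
    "community_sample": [34, 33, 33],
}


def diversity_weights(points, composition):
    """Give each design point a weight proportional to its sample diversity."""
    scores = [normalized_shannon_diversity(composition[pop]) for pop, _ in points]
    total = sum(scores)
    if total == 0:
        # Fall back to uniform weights if no sample shows any diversity.
        return [1.0 / len(points)] * len(points)
    return [score / total for score in scores]


def sample_experiments(points, weights, n, seed=0):
    """Draw n design points (with replacement) using the diversity weights."""
    rng = random.Random(seed)
    return rng.choices(points, weights=weights, k=n)


weights = diversity_weights(design_space, sample_composition)
chosen = sample_experiments(design_space, weights, n=4)
```

Under these assumptions, points built on the homogeneous "students" sample receive less weight than points built on the more balanced "community_sample". As the limitation noted above implies, the grouping behind `group_counts` would itself need to be defined differently across Global North and Global South contexts.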

Ultimately, given that many new sources of knowledge are likely to emerge from the Global South and that these are likely to deviate from Western-centric behavioural insights (Adetula, Forscher, Basnight-Brown, et al., 2022), accounting for the sample's diversity will truly enhance the scope of integrative experimental methods. The future of experimental design must not only be integrative but also diverse and inclusive.

Financial support

This research received no specific grant from any funding agency, commercial, or not-for-profit sectors.

Competing interest

None.

References

Arnett, J. J. (2009). The neglected 95%, a challenge to psychology's philosophy of science. American Psychologist, 64(6), 571–574. https://doi.org/10.1037/a0016723

Banerjee, S., & Mitra, S. (2023). Behavioural public policies for the social brain. Behavioural Public Policy, 1–23.

Deffner, D., Rohrer, J. M., & McElreath, R. (2022). A causal framework for cross-cultural generalizability. Advances in Methods and Practices in Psychological Science, 5(3). https://doi.org/10.1177/25152459221106366

Ghai, S. (2022). Expand diversity definitions beyond their Western perspective. Nature, 602(7896), 211. https://doi.org/10.1038/d41586-022-00330-0

Ghai, S., Fassi, L., Awadh, F., & Orben, A. (2023). Lack of sample diversity in research on adolescent depression and social media use: A scoping review and meta-analysis. Clinical Psychological Science, 0(0). https://doi.org/10.1177/21677026221114859

Gurven, M. D. (2018). Broadening horizons: Sample diversity and socioecological theory are essential to the future of psychological science. Proceedings of the National Academy of Sciences of the United States of America, 115(45), 11420–11427.

Haeffel, G. J., & Cobb, W. R. (2022). Tests of generalizability can diversify psychology and improve theories. Nature Reviews Psychology, 1(4), 186–187. https://doi.org/10.1038/s44159-022-00039-x

Hallsworth, M. (2023). A manifesto for applying behavioural science. Nature Human Behaviour, 7(3), 1–13.

Henrich, J., Heine, S. J., & Norenzayan, A. (2010). The weirdest people in the world? Behavioral and Brain Sciences, 33(2–3), 61–83.

Holland, P. W. (1986). Statistics and causal inference. Journal of the American Statistical Association, 81(396), 945–960.

Adetula, A., Forscher, P. S., Basnight-Brown, D., Azouaghe, S., & IJzerman, H. (2022). Psychology should generalize from – not just to – Africa. Nature Human Behaviour, 1, 370–371. https://doi.org/10.1038/s44159-022-00070-y

Medin, D., Ojalehto, B., Marin, A., & Bang, M. (2017). Systems of (non-) diversity. Nature Human Behaviour, 1(5), 88.

Neyman, J. (1923). On the application of probability theory to agricultural experiments. Essay on principles. Annals of Agricultural Sciences, 1–51.

Rad, M. S., Martingano, A. J., & Ginges, J. (2018). Toward a psychology of Homo sapiens: Making psychological science more representative of the human population. Proceedings of the National Academy of Sciences of the United States of America, 115(45), 11401–11405.

Singh, L. (2022). Navigating equity and justice in international collaborations. Nature Human Behaviour, 1, 372–373. https://doi.org/10.1038/s44159-022-00077-5