
The future of experimental design: Integrative, but is the sample diverse enough?

Published online by Cambridge University Press:  05 February 2024

Sakshi Ghai
Affiliation:
Department of Psychology, Cambridge University, Cambridge, UK sg912@cam.ac.uk; https://www.psychol.cam.ac.uk/staff/sakshi-ghai
Sanchayan Banerjee*
Affiliation:
Environmental Economics, Institute for Environmental Studies, Vrije Universiteit Amsterdam, Amsterdam, Netherlands S.Banerjee@vu.nl; https://research.vu.nl/en/persons/sanchayan-banerjee
*Corresponding author.

Abstract

Almaatouq et al. propose an “integrative approach” to increase the generalisability and commensurability of experiments. Yet their metascientific approach has one glaring omission: it overlooks, and misinterprets, the role of sample diversity in generalisability. In this commentary, we challenge false notions of subsumed duality between contexts, population, and diversity, and propose modifications to their design space to accommodate sample diversity.

Type
Open Peer Commentary
Copyright
Copyright © The Author(s), 2024. Published by Cambridge University Press

Almaatouq et al. propose an “integrative approach” to increase the generalisability and commensurability of experiments. They suggest systematically sampling and testing a subset of experiments, chosen randomly from a design space, to deduce inferences about the population of all potential experiments. Yet their metascientific approach, which loosely translates the “potential outcomes framework” underlying experiments (Holland, 1986; Neyman, 1923; Rubin, 1977) to the science of experimentation itself, has one glaring omission: it overlooks, and misinterprets, the role of sample diversity in reasoning about generalisability. Their suggestion that “an explicit, systematic mapping of research designs to points in the design space (research cartography) ensures commensurability” (target article, sect. 3.1, para. 6) is, at best, an ex-post exercise to ensure validity, based on the myopic assumption that “there is nothing special about the subjects…in principle, what goes for subjects also holds for contexts” (target article, sect. 2.2, para. 4). In this commentary, we challenge these false notions of subsumed duality between contexts, population, and sample diversity, and highlight the importance of diversity in ensuring commensurability. We propose modifications to their design space to incorporate sample diversity.

Almaatouq et al. outline their design space as a Cartesian product of two factors: population, which is “a set of measurable attributes that characterizes the sample of participants,” and context, which is a “set of independent variables hypothesized to moderate the effect in question as well as the nuisance parameter” (target article, sect. 2.2, para. 2). Their conceptualisation, however, fails to account for myriad factors arising from the sample and the sampling technique itself, which affect the scope of any experiment. We outline three challenges resulting from this.

First, as defined, “population” does not account for representativeness, such as cultural outliers in social and behavioural science experiments, a point that has been argued extensively by critics of metatheories of behaviour (Arnett, 2009; Henrich, Heine, & Norenzayan, 2010). This evident oversight of diversity underplays the role that sample features can play in introducing biases to experiments, invariably leading to methodological narrowness and generating spurious and misleading results (Gurven, 2018; Rad, Martingano, & Ginges, 2018). An integrative framework, therefore, must measure diversity both between and within countries (Ghai, Fassi, Awadh, & Orben, 2023). Without careful consideration of representativeness in the selected sampling approaches and of the match between samples and population (be it through crowdsourcing platforms, distributed collaborative networks of different laboratories, or sophisticated machine-learning algorithms), integrative experimental techniques will continue to yield noisy and biased results that are inapplicable beyond specific population samples.

Second, integrative techniques must grapple with the limitations of not only imperfect sampling approaches but also the limiting assumptions in current disciplinary theories (Medin, Ojalehto, Marin, & Bang, 2017). This is particularly important in the context of Almaatouq et al., who state that the “ultimate goal” of experiments is to arrive at a comprehensive theoretical understanding of experimental insights. Nonetheless, here, the authors assume (falsely) that metatheories emerging from the design space will naturally lead to heterogeneity and guarantee commensurability. While mapping theoretical boundaries and engaging in meta-metatheoretical reflections when applying integrative experimental approaches can be valuable for understanding the generalisability of existing theories, this does not necessarily address the underlying structural issues contributing to the lack of theoretical diversity (Haeffel & Cobb, 2022). Our critique speaks broadly to the need for the behavioural sciences to see and reason about complex adaptive systems with diverse samples (see Banerjee & Mitra, 2023; Hallsworth, 2023).

Third, acknowledging diversity is important since the costs of running integrative metaexperiments are largely unequal, thereby excluding researchers and relevant stakeholders in the Global South from generating integrated experimental insights. Since Almaatouq et al. suggest that an “integrative approach would start by identifying the dimensions… as suggested … by prior research” (target article, sect. 3.1, para. 4), it is likely that this space will then suffer from publication biases. Their claim that an integrative approach “will actually broaden the range of people involved in behavioral research” (target article, sect. 5.7, para. 1) is, at best, misguided, given this drawback. Integrative methods may be transferable, but such initiatives are expected to be concentrated and accepted mostly in Western contexts (Singh, 2022). Thus, while we share the optimism for large-scale collaborative science, we are less confident in the ability to draw robust, generalisable conclusions by relying on integrative approaches only. Overcoming the epistemic marginalisation of underrepresented groups in integrative experimental designs is arguably important to achieve this.

In view of these critiques, we propose a modification to their design space that we think is necessary to unlock the power of the integrative approach.

Our proposition relates to explicitly measuring (sample) diversity to quantify the heterogeneity within the sample, advancing the goals of Almaatouq et al. One approach might be to use a scalar measure of representativeness for all population and contextual characteristics, for any given experimental point. This scalar measure can then be used to transform and reweight the design space for generalisability. For example, the authors' conceptualisation of the population space merely accounts “for a set of measurable attributes” rather than a rich and diverse set of measurable attributes “that characterizes the sample of participants” (target article, sect. 2.2, para. 2). As such, their original design space is unduly influenced by certain population subsamples more than others. Here, sampled experimental points cannot be fully representative of all potential experiments. Nonetheless, our approach of first measuring diversity as a scalar index and then using it to transform these factors of the design space, much like a weighted sampling approach, increases the reliability of integrative experiments (Deffner, Rohrer, & McElreath, 2022). One limitation of this approach is that scalar measures of diversity may vary depending on the population, as what even counts as a diverse sample will differ widely between Global North and Global South contexts (Ghai, 2022). Here we call on the field to develop new ways to increase global diversity and to analyse which of these might work best in optimising the design space (Tang, Suganthan, & Yao, 2006).
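
To make the idea of a scalar diversity index more concrete, the following is a minimal sketch in Python, not a definitive implementation. It rests on several assumptions that go beyond the text: the index is taken to be normalised Shannon entropy over a single grouping variable (one of many possible diversity measures), the function names `diversity_index` and `reweight` are hypothetical, and the three experimental points and their country labels are invented purely for illustration.

```python
import math
from collections import Counter


def diversity_index(labels, n_groups):
    """Shannon entropy of the group labels, scaled by log(n_groups).

    ASSUMPTION: this is one possible scalar index, not one prescribed
    by the commentary. `n_groups` is the number of groups in the target
    population, so the index lies in [0, 1] when all labels belong to it.
    """
    if not labels or n_groups < 2:
        return 0.0
    n = len(labels)
    counts = Counter(labels)
    # Sum of -p * log(p) over the groups observed in the sample.
    entropy = sum(-(c / n) * math.log(c / n) for c in counts.values())
    return entropy / math.log(n_groups)


def reweight(points, n_groups):
    """Return weights proportional to each point's diversity index.

    Falls back to uniform weights when every point scores zero.
    """
    scores = [diversity_index(p["sample_groups"], n_groups) for p in points]
    total = sum(scores)
    if total == 0:
        return [1 / len(points)] * len(points)
    return [s / total for s in scores]


# Hypothetical experimental points; the labels are illustrative only.
points = [
    {"name": "A", "sample_groups": ["US", "US", "US", "US"]},
    {"name": "B", "sample_groups": ["US", "IN", "NG", "BR"]},
    {"name": "C", "sample_groups": ["US", "US", "IN", "IN"]},
]
weights = reweight(points, n_groups=4)
```

Under these assumptions, the homogeneous point A receives zero weight and the most heterogeneous point B receives the largest weight. Such weights could, for instance, be passed to `random.choices(points, weights=weights)` to sample experimental points in proportion to their diversity. Whether points with homogeneous samples should be down-weighted this strongly, and which grouping variables count as relevant, are design choices that the limitation noted above leaves open.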

Ultimately, given that many new sources of knowledge are likely to emerge from the Global South and that these are likely to deviate from Western-centric behavioural insights (Adetula, Forscher, Basnight-Brown, et al., 2022), accounting for the sample's diversity will truly enhance the scope of integrative experimental methods. The future of experimental design must not only be integrative but also diverse and inclusive.

Financial support

This research received no specific grant from any funding agency, commercial, or not-for-profit sectors.

Competing interest

None.

References

Arnett, J. J. (2009). The neglected 95%, a challenge to psychology's philosophy of science. American Psychologist, 64(6), 571–574. https://doi.org/10.1037/a0016723
Banerjee, S., & Mitra, S. (2023). Behavioural public policies for the social brain. Behavioural Public Policy, 1–23.
Deffner, D., Rohrer, J. M., & McElreath, R. (2022). A causal framework for cross-cultural generalizability. Advances in Methods and Practices in Psychological Science, 5(3). https://doi.org/10.1177/25152459221106366
Ghai, S. (2022). Expand diversity definitions beyond their Western perspective. Nature, 602(7896), 211. https://doi.org/10.1038/d41586-022-00330-0
Ghai, S., Fassi, L., Awadh, F., & Orben, A. (2023). Lack of sample diversity in research on adolescent depression and social media use: A scoping review and meta-analysis. Clinical Psychological Science, 0(0). https://doi.org/10.1177/21677026221114859
Gurven, M. D. (2018). Broadening horizons: Sample diversity and socioecological theory are essential to the future of psychological science. Proceedings of the National Academy of Sciences of the United States of America, 115(45), 11420–11427.
Haeffel, G. J., & Cobb, W. R. (2022). Tests of generalizability can diversify psychology and improve theories. Nature Reviews Psychology, 1(4), 186–187. https://doi.org/10.1038/s44159-022-00039-x
Hallsworth, M. (2023). A manifesto for applying behavioural science. Nature Human Behaviour, 7(3), 1–13.
Henrich, J., Heine, S. J., & Norenzayan, A. (2010). The weirdest people in the world? Behavioral and Brain Sciences, 33(2–3), 61–83.
Holland, P. W. (1986). Statistics and causal inference. Journal of the American Statistical Association, 81(396), 945–960.
Adetula, A., Forscher, P. S., Basnight-Brown, D., Azouaghe, S., & IJzerman, H. (2022). Psychology should generalize from – not just to – Africa. Nature Human Behaviour, 1, 370–371. https://doi.org/10.1038/s44159-022-00070-y
Medin, D., Ojalehto, B., Marin, A., & Bang, M. (2017). Systems of (non-) diversity. Nature Human Behaviour, 1(5), 88.
Neyman, J. (1923). On the application of probability theory to agricultural experiments. Essay on principles. Annals of Agricultural Sciences, 1–51.
Rad, M. S., Martingano, A. J., & Ginges, J. (2018). Toward a psychology of Homo sapiens: Making psychological science more representative of the human population. Proceedings of the National Academy of Sciences of the United States of America, 115(45), 11401–11405.
Singh, L. (2022). Navigating equity and justice in international collaborations. Nature Human Behaviour, 1, 372–373. https://doi.org/10.1038/s44159-022-00077-5
Rubin, D. B. (1977). Assignment to treatment group on the basis of a covariate. Journal of Educational Statistics, 2(1), 1–26.
Tang, E. K., Suganthan, P. N., & Yao, X. (2006). An analysis of diversity measures. Machine Learning, 65, 247–271.