We agree that family-based genome-wide association studies (GWASs) are an improvement on traditional GWASs in their ability to rule out confounding common causes. We are, however, sceptical that family-based GWASs will guide research aimed at identifying interventions on non-genetic second-generation variables that can be put to practical use in a manner akin to randomised controlled trials (RCTs). Our scepticism stems from an overlooked disanalogy between family-based GWASs and RCTs – the heterogeneity of the causal stimulus – and its impact on non-uniformity.
In most RCTs, individuals in the treatment group receive the same, or as similar as possible, treatment or causal stimulus, such as a drug or educational intervention (causal stimulus homogeneity). The same is true of Mendelian randomisation trials (which perhaps inspired Madole & Harden's [M&H's] arguments) – the causal variable(s) being investigated are relatively homogeneous exposures. In contrast, family-based GWASs make claims about the average treatment effects of thousands of genetic variations distributed across a population. This aggregation is assigned a single variable, “genes,” which can be demonstrated as causal to an outcome to some degree, but nonetheless shows high causal stimulus heterogeneity.
Causal stimulus heterogeneity differs from the heterogeneity of treatment effects – a feature of both RCTs and GWASs – which M&H discuss in the article. The heterogeneity of treatment effects concerns the non-uniform effects of causal stimuli because of interactions with background factors like physiology and environment. This type of non-uniformity is seen in both GWASs and RCTs, but is particularly challenging for GWASs because of causal stimulus heterogeneity. In GWASs, the complex role of the environment in the expression of the phenotype is amplified because the causal stimulus is varied and heterogeneous between individuals (Lynch, 2021). This non-uniformity means that the associations GWASs uncover between phenotypes and large aggregates of gene variants are very difficult to connect to mechanisms and function (Matthews & Turkheimer, 2022). A similar challenge occurs in microbiome research: Significant within-population physiological and environmental variations make it difficult to track pathways between microbes and outcomes, limiting the scope for causal inference (Lynch, Parke, & O'Malley, 2019).
Causal stimulus heterogeneity increases non-uniformity and hampers the tracing of mechanisms. This is because “the same” treatment varies in nature from one subject to the next. A simple hypothetical illustrates this well. Consider three different drug treatments: First, a drug with a single active ingredient (e.g., lithium). Second, a drug with thousands of ingredients of small efficacy. Third, a drug with thousands of ingredients of small efficacy, where each pill has one ingredient or an alternative at random according to a defined chance procedure. In all three cases, an RCT can determine whether the treatment drug has an average effect compared to a control, and thereby generate first-generation causal knowledge. This is possible even in the face of non-uniformity because of the heterogeneity of treatment effects. The prospect for these results to advance second-generation causal knowledge diminishes, however, across the three cases. The high causal stimulus heterogeneity in the third case is likely to produce non-uniform causal pathways from the very first steps, thus making it difficult or impossible to trace mechanisms from particular drug ingredients given only associations between treatments and outcomes.
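The three-drug hypothetical can be sketched as a toy simulation. This is only an illustration under assumptions of our own: the function name `run_trial`, the effect sizes, the 50/50 chance procedure, and the sample sizes are all hypothetical and are not drawn from any study. The sketch shows that a two-arm trial recovers an average effect in all three cases, while the number of distinct stimuli actually received in the treatment arm rises from one to (almost certainly) one per participant in the third case.

```python
import random
import statistics


def run_trial(n_ingredients, randomised_pills, n_per_arm=1000,
              total_effect=1.0, seed=0):
    """Simulate one two-arm RCT (illustrative assumptions throughout).

    Returns (estimated average effect, number of distinct pill
    compositions received in the treatment arm).
    """
    rng = random.Random(seed)
    # Hypothetical effect sizes: each listed ingredient contributes an equal
    # share of total_effect; its alternative contributes half of that share.
    primary = total_effect / n_ingredients
    alternative = 0.5 * primary

    # Control outcomes are pure noise (standard normal).
    control = [rng.gauss(0.0, 1.0) for _ in range(n_per_arm)]

    treated = []
    compositions = set()
    for _ in range(n_per_arm):
        if randomised_pills:
            # Drug 3: each slot holds the ingredient or its alternative,
            # chosen by a fair coin flip.
            pill = tuple(rng.random() < 0.5 for _ in range(n_ingredients))
        else:
            # Drugs 1 and 2: every pill has the same composition.
            pill = (True,) * n_ingredients
        compositions.add(pill)
        dose = sum(primary if has_primary else alternative
                   for has_primary in pill)
        treated.append(dose + rng.gauss(0.0, 1.0))

    effect = statistics.fmean(treated) - statistics.fmean(control)
    return effect, len(compositions)


if __name__ == "__main__":
    drugs = {
        "single ingredient": (1, False),
        "fixed mixture": (200, False),
        "randomised mixture": (200, True),
    }
    for name, (n_ingredients, randomised) in drugs.items():
        effect, n_stimuli = run_trial(n_ingredients, randomised)
        print(f"{name}: effect ~ {effect:.2f}, distinct stimuli = {n_stimuli}")
```

The average effect answers the first-generation question in each case. What changes is how many different things "the treatment" refers to, which is the quantity that matters for tracing mechanisms from particular ingredients.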
A high degree of causal stimulus heterogeneity is typical for GWASs, including family-based ones. To analogise with our hypothetical drug case: The first drug is akin to a single-gene cause, the second a specific aggregate of genes, and the third an aggregate of many genes, which at the individual level is summarised by a polygenic risk score. Polygenic risk scores are highly heterogeneous causal stimuli with non-uniform effects that make it extremely difficult to trace mechanisms from particular genetic ingredients in the causal stimulus, in all but the simplest cases of gene expression (where GWASs are unnecessary). Even if we could hold environments fixed between individuals (thereby reducing the potential for non-uniformity because of background conditions), in GWASs there is typically too much variation between individuals in how the causal stimulus works at the “lower” biological level for effective “bottom-up” investigations of intermediate causes through biological mechanisms. In our view, this largely precludes these studies from providing the sort of causal knowledge required to identify mechanisms and intermediaries for investigation with second-generation studies.
A work-around might be Harden's proposal of phenotypic annotation, which rests on the statistical investigation of intermediate causes through mediation analysis (Belsky & Harden, 2019; see, e.g., Belsky et al. [2016]). Mediation analysis tests variables correlated with the stimulus to determine whether (and to what extent) they mediate the causal paths from stimulus to outcome. Such intermediaries could be possible targets for intervention in second-generation studies. In the simplest case, the genetic stimulus would act as an instrumental variable on the potential intermediary (as in Mendelian randomisation; see Davey Smith & Ebrahim, 2003), allowing for measurement of the intermediary's causal effect. However, mediation analysis is tricky at the best of times (see Pearl [2014] for a principled approach). The possibility of confounding common causes between intermediary and outcome is a serious challenge. Common causes (such as other genetic or environmental causes) may account for the relationship between potential intermediary and outcome. In this case, intervening on the potential intermediary will not change the outcome. A heterogeneous causal stimulus, such as a polygenic risk score, effectively carries a potential common cause within itself: The different ways that the causal stimulus may be realised. The use of an average causal stimulus (by definition) precludes control of this common cause. To determine if a potential intermediary is indeed a cause of the outcome, one would need to do an RCT or another kind of study.
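The worry about a common cause carried within the stimulus can be made concrete with a minimal simulation. This is a sketch under strong assumptions of our own (a linear model with Gaussian noise, and a score that is the sum of two hypothetical components, `part_a` and `part_b`); it is not a model of any real polygenic score. Only `part_a` affects the candidate intermediary and the outcome, along separate paths, and the intermediary has no effect on the outcome at all.

```python
import random
import statistics


def corr(xs, ys):
    """Pearson correlation between two equal-length sequences."""
    mx, my = statistics.fmean(xs), statistics.fmean(ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / (sxx * syy) ** 0.5


def residuals(ys, xs):
    """Residuals of ys after simple least-squares regression on xs."""
    mx, my = statistics.fmean(xs), statistics.fmean(ys)
    slope = (sum((x - mx) * (y - my) for x, y in zip(xs, ys))
             / sum((x - mx) ** 2 for x in xs))
    return [(y - my) - slope * (x - mx) for x, y in zip(xs, ys)]


def simulate(n=5000, seed=1):
    rng = random.Random(seed)
    # Two hypothetical components of the aggregate score.
    part_a = [rng.gauss(0, 1) for _ in range(n)]
    part_b = [rng.gauss(0, 1) for _ in range(n)]
    score = [a + b for a, b in zip(part_a, part_b)]

    # part_a drives both variables; the outcome does NOT depend on the
    # intermediary in this assumed model.
    intermediary = [a + rng.gauss(0, 1) for a in part_a]
    outcome = [a + rng.gauss(0, 1) for a in part_a]

    # A randomised intermediary: values assigned independently of everything.
    # The outcome is unchanged because there is no causal path to it.
    assigned = [rng.gauss(0, 1) for _ in range(n)]

    return {
        "observed": corr(intermediary, outcome),
        "adjusted_for_score": corr(residuals(intermediary, score),
                                   residuals(outcome, score)),
        "randomised": corr(assigned, outcome),
    }


if __name__ == "__main__":
    for label, value in simulate().items():
        print(f"{label}: {value:.2f}")
```

In this toy model the intermediary and outcome remain correlated even after adjusting for the aggregate score, because the score does not reveal which component realised it. The association disappears only when the intermediary is randomised, which is the kind of further study the argument above calls for.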
In conclusion, we agree that family-based GWASs provide one kind of causal information that has been missing from traditional heritability and GWASs (see Lynch [2017] for the limitations of causal heritability claims). Unfortunately, the generally heterogeneous nature of the genetic variation studied means that this information will not translate easily into second-generation causal knowledge.
Financial support
KEL was supported under the Australian Research Council's Discovery Projects funding scheme (project number FL170100160); RLB was supported by funding through the ANU VC Futures Scheme.
Competing interest
None.