Madole & Harden's (M&H's) argument that genetic effects derived from within-family genome-wide association studies (GWASs) are equivalent to average treatment effects from randomized controlled trials (RCTs) rests on the assertion that both methods are “non-unitary, non-uniform, and non-explanatory.” This contention is misleading because these three “non-” dimensions are not binary, but very much a matter of degree. Double-blind RCTs are more likely to prevent the entry of other variables whose effects are confounded with those of the treatment, and thereby isolate the treatment as the most likely cause of outcomes. Thus, well-designed RCTs severely limit non-unitariness. RCTs of treatments that prove to be highly efficacious directly demonstrate their greater uniformity of therapeutic effects. Moreover, RCTs that examine treatment effects on not just clinical outcomes but mediating variables like brain activity and physiological arousal can provide mechanistic explanations of therapeutic effects (Horga, Kaur, & Peterson, 2014), and can select between alternative mechanisms for these effects (e.g., Siegel, Cohen, & Warren, 2022).
Even with more than a million randomized trials of alleles, within-family GWASs are much more non-unitary than a well-designed RCT. Besides the indirect sibling-to-sibling genetic effects that the authors address, there are other, multiple environmental sources of variation in phenotypes that cannot be ruled out and thus render within-family GWASs entirely non-unitary. Taking one of many possibilities, differential parenting of siblings, even when stemming from the siblings’ genetic differences, has developmental effects that can amplify phenotypic differences over time. Such unshared environmental influences are very difficult to measure (Turkheimer & Waldron, 2000), and it cannot be assumed that they will wash out with large enough samples (McCarthy et al., 2008).
The non-uniformity of GWASs is demonstrated by the need for very large samples (thousands of cases) and for replication because of limited statistical power (McCarthy et al., 2008). Both of these limitations stem from the need to correct for approximately 1 million independent tests of allele-outcome regressions in a typical GWAS (Visscher et al., 2017). For this reason, the journal Behavior Genetics requires replication to consider any GWAS for publication (Hewitt, 2012).
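To make the arithmetic behind this point concrete, the following is a minimal sketch, not a procedure taken from any of the cited studies. It assumes a simple Bonferroni correction over 1 million independent tests and a standard normal approximation to a two-sided single-variant test; the function names `bonferroni_threshold` and `approx_required_n`, and the illustrative figure of 0.1% of trait variance explained, are hypothetical choices made only for this example.

```python
from statistics import NormalDist


def bonferroni_threshold(alpha: float, n_tests: int) -> float:
    """Per-test threshold that holds the family-wise error rate at alpha."""
    return alpha / n_tests


def approx_required_n(variance_explained: float, alpha: float,
                      power: float = 0.8) -> float:
    """Approximate sample size needed to detect one variant.

    Assumption: normal approximation to a two-sided, 1-df association
    test for a variant explaining `variance_explained` of trait variance.
    Illustrative only; real GWAS power depends on design and phenotype.
    """
    z = NormalDist()
    z_alpha = -z.inv_cdf(alpha / 2)   # critical value for the threshold
    z_power = z.inv_cdf(power)        # quantile for the desired power
    return (z_alpha + z_power) ** 2 * (1 - variance_explained) / variance_explained


if __name__ == "__main__":
    threshold = bonferroni_threshold(0.05, 1_000_000)
    print(f"genome-wide threshold: {threshold:.1e}")
    # Hypothetical variant explaining 0.1% of trait variance.
    print(f"N at corrected threshold: {approx_required_n(0.001, threshold):,.0f}")
    print(f"N at uncorrected 0.05:    {approx_required_n(0.001, 0.05):,.0f}")
```

Under these assumptions, the corrected threshold is 0.05 / 1,000,000 = 5 × 10⁻⁸, and the sample needed to detect the same small effect is roughly five times larger than it would be for a single uncorrected test, which is one way of seeing why GWASs need thousands of cases and independent replication.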
GWAS non-uniformity also results from the need to control for genetic ancestry and thus potentially confounding genetic variants that differ across populations (McCarthy et al., 2008). As a result, virtually all GWASs have been of white Europeans, the most widely appraised ancestry. Therefore, GWAS findings may not apply to other ethnic groups, a non-uniformity that may exacerbate health disparities (Martin et al., 2019).
Another factor that grants RCTs more unitary causal inference than GWASs is the level of validity of outcome assessment. RCTs require accurate, or relatively “deep,” assessments of behavioral outcomes to establish treatments as robust causes of changes in those outcomes. Because GWASs require very large samples that are often gathered from biobanks or by consortia across studies, phenotypes are typically assessed superficially to ensure standardization (Friedman, Banich, & Keller, 2021). For example, assessment of depression may be as simplistic as self-reported ratings. Although simulations indicate that the large samples of GWASs have sufficient power to discern genetic effects despite the large error associated with such minimal assessments (Border et al., 2019), these typical GWASs yield considerably lower heritability estimates, and identify single-nucleotide polymorphisms (SNPs) with much less specificity, than GWASs with more valid phenotype assessments (Cai et al., 2020).
The above non-unitary and non-uniform factors may explain why GWASs have repeatedly yielded small genetic effects for traits that twin and family studies previously estimated to be large, which has been termed the problem of “missing heritability” (Maher, 2008). GWAS heritability estimates are typically 40–80% lower than those yielded by twin and family studies (Friedman et al., 2021). For example, twin and family studies estimate genetic effects for schizophrenia to be about 60% (e.g., Lichtenstein et al., 2009). In contrast, a large GWAS (N = 3,322 cases and 3,587 controls without the illness) conducted around the same time identified ~74,000 genetic variants on a single chromosome that accounted for as little as 3%, or as much as 30%, of the variance in schizophrenia, depending on the analytic approach employed (Purcell et al., 2009).
M&H introduce the within-family GWAS as having the potential to address the limitations of traditional heritability approaches like twin studies, which cannot “specify which genes or, crucially, how those genes are responsible for producing phenotypic differences” (target article; sect. 3.1, para. 2). Yet they conclude that the identified SNP has an “intermediate level of resolution, encompassing all alleles in LD [linkage disequilibrium] with the measured SNP” (target article; sect. 3.2.1, para. 10). In other words, the identified SNP is highly correlated with – a marker of – the causal variant, an implicit acknowledgment that the within-family GWAS has the same, aforementioned limitations as twin studies. Critically, modern arrays of genotyped SNPs miss certain variants that are not in LD with the imputed SNP. While rare, these missed genetic variants can nonetheless have a large effect on variation in phenotypes (Visscher et al., 2017), which is also thought to underlie the aforementioned “missing heritability” (Friedman et al., 2021).
By identifying SNPs highly correlated with behavioral traits, the within-family GWAS brings us closer to identifying genetic causes. Whether it will alter the status of behavior genetics as a causal science, however, is a wide-open question.
Financial support
This research received no specific grant from any funding agency, commercial, or not-for-profit sectors.
Competing interest
None.