The fetal origins hypothesis, also known as the Barker hypothesis, states that low birth weight (BW), an indication of poor fetal nutrition, is associated with increased risk of developing adult-onset diseases. The British epidemiologist David Barker and his colleagues first suggested a potential role of fetal nutrition in the etiology of ischemic heart disease after they observed that regions in England and Wales with increased fetal mortality between 1921 and 1925 also showed increased mortality rates from ischemic heart disease several decades later (Barker & Osmond, 1986). In follow-up epidemiological studies using public records from Hertfordshire and Preston, Barker found that low BW was associated with mortality from ischemic heart disease, and also with type 2 diabetes and hypertension (Hales & Barker, 1992).
Attempts to explain the association between low BW and diabetes implicated insulin and other metabolic mediators in the causal mechanism. Subsequent studies investigating this relationship advanced the thrifty phenotype hypothesis, which proposed that poor fetal nutrition ‘programs’ the fetus, causing long-lasting effects on health. Accordingly, the prenatal period is critical for subsequent development; poor nutrition during this period causes structural and functional adaptations, such as decreased insulin sensitivity, that enhance survival and prepare the fetus for postnatal life in a nutrient-poor environment. However, such changes would have negative effects in affluent environments, predisposing to diabetes and metabolic syndrome (Hales & Barker, 1992, 2001).
Barker's hypothesis gained momentum in the following decades. Results were supported by several epidemiological and animal studies (Skogen & Øverland, 2012). The hypothesis also expanded to include other diseases, such as psychiatric disorders, and it was suggested that the role of poor prenatal environment might explain the associations between psychiatric and cardiometabolic diseases (Schlotz & Phillips, 2009).
Importantly, and in line with this hypothesis, the Dutch famine studies provided the earliest evidence for the role of prenatal nutrition in mental disorders. While an extreme case that reflects complex effects, those studies showed that individuals conceived during the famine had an increased risk of several psychiatric illnesses, particularly schizophrenia and depression (Räikkönen et al., 2012; Schlotz & Phillips, 2009). Subsequent case-control and cohort studies, however, yielded mixed results. A recent systematic review (Wojcik et al., 2013) indicated that the association between low BW and depression is rather weak. For schizophrenia, studies to date have produced contradictory results (Abel et al., 2010; Gunnell et al., 2005; Schlotz & Phillips, 2009). The strongest evidence was provided for attention deficit hyperactivity disorder (ADHD), with several epidemiological studies consistently associating low BW with hyperactivity and inattention (O'Donnell & Meaney, 2016; Schlotz & Phillips, 2009).
Proponents of the Barker hypothesis suggest that the supporting evidence is strong, and that attention should shift toward revealing the causal mechanisms underlying the observed associations. However, the primary criticism of the Barker hypothesis is that the evidence for it comes from observational studies: observational evidence does not suffice for inferring causality, because the observed associations might reflect the effects of confounding variables (Skogen & Øverland, 2012). For instance, a meta-analysis of 55 studies investigating the relationship between BW and blood pressure suggested that the reported association could well be attributable to random error, reporting bias, or other confounders (Huxley et al., 2002). Similarly, a systematic review, which established a weak association between BW and depression, also noted several limitations of the pool of studies, including publication bias and lack of adjustment for potential confounders (Wojcik et al., 2013).
Demonstrating that BW causally affects psychiatric traits, and estimating the size of such effects, would provide an impetus to health policies and would establish the prenatal period as a crucial therapeutic window. Alternatively, refuting the hypothesis that the association is causal will help us to avoid basing interventions on incorrect causality models. However, causal inference presents a challenge when tight experimental control is unfeasible or impossible. The question of whether BW causes adult psychiatric disorders cannot be addressed using a randomized controlled trial, as one cannot assign individuals randomly to different BW conditions to study its effects on psychiatric traits. Mendelian randomization (MR), a well-established method to demonstrate causality using genetic data, has created an unprecedented opportunity to probe the causality in the association between BW and mental disorders (Davey Smith & Ebrahim, 2003; Evans & Davey Smith, 2015; Pingault et al., 2018). The fact that genes assort randomly and independently (Mendel's first and second laws) serves a function similar to that of randomization in a randomized controlled trial. Specifically, one can form random groups of individuals based on their genes, given that the genes assort randomly and independently of other traits and of environmental factors that may typically confound observational studies (Burgess & Thompson, 2015; Evans & Davey Smith, 2015; Pingault et al., 2018). Therefore, by using genetic variants that are robustly associated with the risk factor/exposure instead of using the exposure itself, MR makes it possible to study causal relationships isolated from the effects of confounders.
This allows us to draw conclusions about causality from observational studies. In short, MR is a very suitable technique to test for effects of BW.
Recent MR results have demonstrated a causal effect of BW on risk factors for coronary artery disease (Zanetti et al., 2018) and on type 2 diabetes (Wang et al., 2016). However, no study to date has employed this approach to study the causal effect of BW on psychiatric disorders. Psychiatric disorders are leading causes of disability worldwide, forming an enormous burden on the individual, family, and society. Reliably establishing a role of the prenatal environment in their pathogenesis may be relevant to public health policies (Skogen & Øverland, 2012). The present aim is to use MR to test whether BW has a causal effect on depression, schizophrenia, and ADHD. Our focus on these disorders is motivated by recent large genome-wide association studies (GWASs), which have established an increasing number of robust genetic associations. These provide reliable instruments that facilitate the application of the MR analysis.
Methods
We carried out two-sample MR analyses to test the causal effects of BW on the risk of major depressive disorder (MDD), schizophrenia, and ADHD. We used publicly available summary statistics from GWASs conducted by the Psychiatric Genomics Consortium (https://www.med.unc.edu/pgc/results-and-downloads) and by the Early Growth Genetics Consortium (https://egg-consortium.org/).
MR allows one to probe causality in the relationships between risk factors/exposures and outcomes like mental health in non-experimental data (Davey Smith & Ebrahim, 2003). MR, similar to a randomized controlled trial, entails a ‘randomization procedure’ as it uses genetic variants (randomly inherited at conception) to group individuals with different levels of exposure. In fact, MR is a form of instrumental variable (IV) analysis; IVs are variables directly associated with an exposure, with effects on the outcome assumed to be entirely mediated by the exposure (Lawlor et al., 2008). In MR, genetic variants (usually single nucleotide polymorphisms; SNPs) feature as IVs (Davey Smith & Ebrahim, 2003). To feature as a valid IV, a genetic variant should (a) be strongly associated with the exposure, (b) be uncorrelated with confounders (influences common to exposure and outcome), and (c) affect the outcome exclusively via the exposure (i.e., pleiotropic effects of the instrument on the outcome should be absent; Davey Smith & Ebrahim, 2003).
Power Analyses
We estimated statistical power for each MR analysis using the tool by Burgess (https://sb452.shinyapps.io/power/). In calculating the power, we used the following settings: a significance level of 0.05 and a coefficient of determination of 0.02 (in the regression of the exposure on the genetic variants used to instrument the analysis), and we considered two effect sizes — 20% and 10%, respectively, per standard deviation change in the exposure (i.e., OR = 1.2 and OR = 1.1).
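As an illustration of how such a power figure can be approximated, the sketch below uses the standard asymptotic normal approximation for an MR analysis with a binary outcome. Whether the online tool uses exactly this expression is an assumption on our part, and the function name `mr_power_binary`, the sample size, and the case fraction are hypothetical values chosen for illustration, not the figures reported in Table 1.

```python
from math import log, sqrt
from statistics import NormalDist


def mr_power_binary(n, case_fraction, r2, odds_ratio, alpha=0.05):
    """Approximate power of an MR analysis with a binary outcome.

    Assumes the causal log odds ratio estimate is approximately normal with
    variance 1 / (n * r2 * case_fraction * (1 - case_fraction)).
    """
    z = NormalDist()
    beta = abs(log(odds_ratio))  # causal log OR per SD change in the exposure
    noncentrality = beta * sqrt(n * r2 * case_fraction * (1 - case_fraction))
    return z.cdf(noncentrality - z.inv_cdf(1 - alpha / 2))


# Hypothetical sample: 100,000 participants, 30% cases, R^2 = 0.02.
for odds_ratio in (1.2, 1.1):
    power = mr_power_binary(100_000, 0.30, 0.02, odds_ratio)
    print(f"OR = {odds_ratio}: power = {power:.2f}")
```

With these illustrative inputs, the approximation gives power above 80% for OR = 1.2 but not for OR = 1.1, which shows how quickly power falls as the assumed effect size shrinks when the instruments explain only 2% of the variance in the exposure.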
Mendelian Randomization
The inverse variance weighted (IVW) procedure was employed to probe causality (Burgess et al., 2013). The IVW causal effect estimate is calculated based on multiple SNPs and on the ratio of coefficients method. We used genetic variants from the largest GWASs of BW (Horikoshi et al., 2016), depression (Wray et al., 2018), schizophrenia (Ripke et al., 2013), and ADHD (Demontis et al., 2017). The 60 SNPs associated with BW at a genome-wide significance level in the offspring genotype analysis (p < 5 × 10⁻⁸) were used as instruments. For each genetic variant $i$ ($i = 1, \ldots, N_{\mathrm{SNP}}$), we computed the Wald ratio estimate as (Burgess et al., 2013)
$$\hat{\beta}_{\mathrm{IV}_i} = \frac{\beta_{O,\mathrm{SNP}_i}}{\beta_{E,\mathrm{SNP}_i}},$$
where $\beta_{O,\mathrm{SNP}_i}$ is the regression coefficient in the outcome on SNP$_i$ regression, and $\beta_{E,\mathrm{SNP}_i}$ is the regression coefficient in the exposure on SNP$_i$ regression. As one does in a meta-analysis, we combined these ratios of coefficients by weighting them by their inverse variance (Burgess et al., 2013). Effectively, using the IVW procedure, one regresses the vector of $N_{\mathrm{SNP}}$ associations with the outcome on the vector of associations with the exposure, while fixing the intercept to zero and employing inverse variance weighting. The IVW produces unbiased estimates of causal effects as long as the SNPs employed are valid IVs (Bowden et al., 2015). Assumptions (b) and (c) stated above are not empirically testable, and are unlikely to hold when many variants are used as instruments (as this increases the probability of pleiotropy; Bowden et al., 2016). To avoid the bias due to potential violation of the ‘no horizontal pleiotropy’ assumption, we used the following MR methods, which are known to be robust in the presence of invalid IVs: median- (Bowden et al., 2016) and mode-based methods (Hartwig et al., 2017), and the MR-Egger regression (Bowden et al., 2015). Furthermore, we used forest plots (Wickham, 2010) to visualize the causal estimates based on each individual instrument, and the combined causal estimates (Hartwig et al., 2017).
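The ratio-of-coefficients and IVW steps described above can be sketched as follows. This is a minimal illustration, not the software used in the analysis: the function names are hypothetical, the summary statistics are toy values, and the standard error of each ratio uses the common first-order approximation (outcome standard error divided by the absolute SNP-exposure coefficient), which is an assumption here.

```python
from math import sqrt


def wald_ratios(beta_exp, beta_out, se_out):
    """Per-SNP Wald ratio estimates and their first-order standard errors."""
    ratios = [bo / be for be, bo in zip(beta_exp, beta_out)]
    ses = [so / abs(be) for be, so in zip(beta_exp, se_out)]
    return ratios, ses


def ivw_estimate(beta_exp, beta_out, se_out):
    """Fixed-effect inverse variance weighted mean of the Wald ratios."""
    ratios, ses = wald_ratios(beta_exp, beta_out, se_out)
    weights = [1.0 / s ** 2 for s in ses]
    estimate = sum(w * r for w, r in zip(weights, ratios)) / sum(weights)
    return estimate, sqrt(1.0 / sum(weights))


# Toy summary statistics for three hypothetical SNPs.
beta_exp = [0.10, 0.20, 0.05]   # SNP-exposure coefficients
beta_out = [0.02, 0.04, 0.01]   # SNP-outcome coefficients (log odds)
se_out = [0.01, 0.01, 0.01]     # standard errors of beta_out

estimate, se = ivw_estimate(beta_exp, beta_out, se_out)
print(f"IVW estimate = {estimate:.3f}, SE = {se:.3f}")
```

With these weights, the weighted mean is algebraically identical to the slope of a weighted regression of the SNP-outcome coefficients on the SNP-exposure coefficients with the intercept fixed at zero, which is the regression view of the IVW procedure given in the text.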
The simple median estimator is calculated as the median of the set of ratio coefficients from each SNP selected to instrument the analysis. Even if up to half of these SNPs are invalid instruments (i.e., pleiotropic), the simple median method will produce unbiased estimates of the causal effect (Bowden et al., 2016). Unlike the simple median, where the ratio estimates from all instruments receive equal weights, the weighted median estimator weights the coefficients by their inverse variance to place more importance on instruments with more precise estimates. The resulting causal estimate is valid as long as at least half the weights are based on valid instruments (Bowden et al., 2016). The mode-based estimator is calculated as the most frequent estimate (i.e., the mode) of the set of ratio coefficients estimated from each genetic instrument. In this approach, the strong ‘no pleiotropy’ assumption is replaced with the assumption that the largest group of SNPs yielding similar estimates of the causal effect includes solely non-pleiotropic instruments (Hartwig et al., 2017). The weighted mode is similar to the simple mode, except that individual ratio estimates are weighted (Hartwig et al., 2017).
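To make the difference between the simple and the weighted median concrete, the sketch below computes both from a set of per-SNP ratio estimates. The weighted median is obtained by linear interpolation of the weighted empirical distribution at its 50th percentile; the function name `weighted_median` and the toy values are hypothetical, and the sketch omits the standard errors that are typically obtained by bootstrapping.

```python
from statistics import median


def weighted_median(ratios, weights):
    """Weighted median of ratio estimates by linear interpolation."""
    order = sorted(range(len(ratios)), key=lambda i: ratios[i])
    r = [ratios[i] for i in order]
    w = [weights[i] for i in order]
    total = sum(w)
    # Position of each ordered estimate in the weighted distribution:
    # cumulative weight up to the estimate, minus half of its own weight.
    positions, running = [], 0.0
    for wi in w:
        running += wi
        positions.append((running - 0.5 * wi) / total)
    if positions[0] >= 0.5:  # one estimate carries at least half the weight
        return r[0]
    below = max(j for j, p in enumerate(positions) if p < 0.5)
    frac = (0.5 - positions[below]) / (positions[below + 1] - positions[below])
    return r[below] + (r[below + 1] - r[below]) * frac


# Three concordant SNPs and one imprecise outlier (toy values).
ratios = [0.2, 0.2, 0.2, 5.0]
weights = [10.0, 10.0, 10.0, 1.0]  # inverse variances of the ratio estimates
print("simple median:", median(ratios))
print("weighted median:", weighted_median(ratios, weights))
```

In this toy example the outlying estimate carries little weight, so it has no influence on the weighted median, whereas it would pull an inverse variance weighted mean away from the value shared by the other three instruments.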
We also used MR-Egger to correct for potential horizontal pleiotropic effects (Bowden et al., 2015). MR-Egger, similar to the IVW method, performs a weighted linear regression using inverse variance weights; yet unlike the IVW method, it freely estimates the intercept to capture potential pleiotropic effects. As such, under the weaker assumption that the effects of the SNPs on the exposure are uncorrelated with their direct (pleiotropic) effects on the outcome, MR-Egger is expected to provide unbiased causal estimates even when all of the SNPs display pleiotropy (Bowden et al., 2015; see also Burgess & Thompson, 2017, for more details on this procedure). Finally, we used funnel plots as a visual test for horizontal pleiotropy, where symmetry is indicative of lower probability of pleiotropy (Bowden et al., 2015).
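The contrast with the IVW regression can be sketched as a weighted least squares fit in which the intercept is estimated rather than fixed at zero. The code below is an illustrative sketch only: the function name `mr_egger` and the data are hypothetical, the SNPs are first oriented so that all SNP-exposure coefficients are positive (a common convention that we assume here), and standard errors, which are needed to test the intercept, are omitted.

```python
def mr_egger(beta_exp, beta_out, se_out):
    """Weighted regression of SNP-outcome on SNP-exposure coefficients.

    Returns (intercept, slope): the intercept captures average directional
    pleiotropy, the slope is the pleiotropy-adjusted causal estimate.
    """
    # Orient every SNP so that its association with the exposure is positive.
    bx, by = [], []
    for be, bo in zip(beta_exp, beta_out):
        sign = 1.0 if be >= 0 else -1.0
        bx.append(sign * be)
        by.append(sign * bo)
    w = [1.0 / s ** 2 for s in se_out]  # inverse variance weights
    sw = sum(w)
    mean_x = sum(wi * x for wi, x in zip(w, bx)) / sw
    mean_y = sum(wi * y for wi, y in zip(w, by)) / sw
    sxx = sum(wi * (x - mean_x) ** 2 for wi, x in zip(w, bx))
    sxy = sum(wi * (x - mean_x) * (y - mean_y) for wi, x, y in zip(w, bx, by))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope


# Toy data generated with a causal slope of 0.2 and a pleiotropic offset of 0.01.
beta_exp = [0.05, 0.10, -0.15, 0.20]
beta_out = [0.02, 0.03, -0.04, 0.05]
se_out = [0.01, 0.02, 0.01, 0.015]
intercept, slope = mr_egger(beta_exp, beta_out, se_out)
print(f"intercept = {intercept:.3f}, slope = {slope:.3f}")
```

An intercept that differs from zero indicates that the instruments, on average, affect the outcome through pathways other than the exposure; fixing it at zero, as the IVW regression does, would then bias the slope.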
Tests of Heterogeneity
The causal estimates yielded by multiple instruments are expected to vary only by chance, if the SNPs satisfy the IV assumptions and the SNPs have the same causal effect size. Large inter-instrument heterogeneity may be indicative of pleiotropic effects (Bowden et al., 2017). To assess heterogeneity, we used Cochran's Q test. Cochran's Q test is commonly employed in meta-analyses to test whether the observed discrepancy between individual estimates is consistent with sampling variation; the test is also useful to assess heterogeneity in the IVW model (Greco et al., 2015). Forest plots were also used to visually examine the degree of heterogeneity in causal estimates based on each individual instrument (where non-overlapping confidence intervals indicate heterogeneous effects).
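For reference, Cochran's Q in this setting is the weighted sum of squared deviations of the per-SNP ratio estimates from their inverse variance weighted mean. The sketch below computes the statistic and its degrees of freedom; the function name `cochran_q` and the numbers are hypothetical, and the p-value (from a chi-square distribution with the returned degrees of freedom) is left out because it requires a statistics library.

```python
def cochran_q(ratios, ses):
    """Cochran's Q for per-SNP ratio estimates and its degrees of freedom."""
    weights = [1.0 / s ** 2 for s in ses]
    pooled = sum(w * r for w, r in zip(weights, ratios)) / sum(weights)
    q = sum(w * (r - pooled) ** 2 for w, r in zip(weights, ratios))
    return q, len(ratios) - 1


# Toy ratio estimates: the last SNP disagrees with the other three.
ratios = [0.10, 0.12, 0.09, 0.60]
ses = [0.05, 0.05, 0.05, 0.05]
q, df = cochran_q(ratios, ses)
print(f"Q({df}) = {q:.2f}")
```

Under homogeneity, Q is expected to be close to its degrees of freedom; values far above that, as produced by the single discordant SNP in this toy example, indicate more variation between instruments than sampling error alone would explain.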
Results
Power Analyses
Statistical power in a binary outcome MR analysis depends on sample size, cases-to-controls ratio, and on the coefficient of determination of exposure on the instruments (R²; Bowden et al., 2015). The SNPs used to instrument the MR analysis explained ~2% of the variance in BW (Horikoshi et al., 2016). Details on power, sample sizes and data sources are provided in Table 1.
Note: BW = birth weight; MDD = major depressive disorder; ADHD = attention deficit hyperactivity disorder.
The power analyses showed that our analyses were adequately powered (>80%) to detect a causal effect of OR = 1.2 given an alpha of 0.05. The power dropped below this level when considering smaller effect sizes (OR = 1.1).
Mendelian Randomization Analyses
Figure 1 displays the MR results.
None of the MR methods showed evidence for a causal effect of BW on any of the outcomes (all 95% confidence intervals include an odds ratio of 1). Figure S1 shows the MR estimates based on the individual instruments, and the combined causal estimates produced using the different two-sample MR estimators.
Tests of Horizontal Pleiotropy: MR-Egger and Funnel Plots
The results of MR-Egger regression, which corrects for horizontally pleiotropic effects, also showed no evidence for a causal effect of BW on any of the psychiatric disorders. Additionally, we could not reject the null hypothesis of no horizontal pleiotropy, as the test of the MR-Egger intercept was not significant; this result is consistent with the assumption of ‘no horizontal pleiotropy’. The same pattern is seen in the funnel plots shown in Figure 2.
The plots were mostly symmetrical, which is consistent with the results of the MR-Egger pleiotropy test.
Tests of Heterogeneity
Cochran's Q test indicated that the estimates based on the individual instruments are heterogeneous (MDD: Cochran's Q(57) = 97.554, p = 7 × 10⁻⁴; schizophrenia: Cochran's Q(57) = 231.647, p = 6.46 × 10⁻²³; ADHD: Cochran's Q(52) = 97.72, p = .0001). The large inter-instrument variation is also evident in the forest plots in Figure S1.
Discussion
In this study, we used the two-sample MR procedure to test the Barker hypothesis concerning the fetal origins of adult mental disorders. We assessed whether the observed epidemiological associations between BW, on the one hand, and ADHD, MDD, and schizophrenia, on the other, are causal. We used several methods that are robust to a certain degree to violation of the ‘no pleiotropy’ assumption, hence providing a good means to check the validity of our results. Furthermore, we used several diagnostic tests that can detect bias resulting from potential assumption violation.
Our findings do not support a causal effect of BW on any of the outcomes that we considered. The results were consistent across the methods. However, the interpretation of these results hinges upon the tenability of all MR assumptions. The first assumption (that the instrument associates robustly with the exposure variable) holds true, as the SNPs that we used as IVs explained a significant proportion of the variance in BW (around 2%; Horikoshi et al., 2016). The second assumption (that the instrument is independent of confounders) cannot be tested rigorously because of the many potential confounders. According to Schlotz and Phillips (2009), socio-economic status, education, and maternal smoking may lead to an association between BW and mental disorders. To get an indication of whether the SNPs employed to instrument the current analyses associate with these potential confounders, we used summary statistics obtained from the GWASs of educational attainment (Okbay et al., 2016) and smoking behavior (Furberg et al., 2010). There were 26 and 59 BW-associated variants that passed the quality control checks in the GWAS of smoking and in the GWAS of educational attainment, respectively. Of these, we identified only one SNP that passed the significance threshold of 0.05 in the GWAS of smoking, and 12 SNPs that associated with educational attainment (see Table S1 for details). We note that this alpha threshold is overly liberal given the large number of comparisons; we chose it to maximize the power to identify even very small genetic associations with the potential confounders.
To check the effects of those correlations on our results, we re-ran the MR analyses after removing any SNP that showed a significant association with any of the confounders (13 SNPs removed). Excluding these SNPs from the analyses did not change the results and the current conclusions (see Figure S2). Davey Smith and colleagues (Smith, 2011; Smith et al., 2007) also provided empirical support for the assumption that the genetic variants are distributed in the population independent of behavioral, social, and physiological factors that might confound epidemiological studies.
The MR-Egger pleiotropy test found no evidence for pleiotropic effects, although our results showed a high degree of heterogeneity (which might be indicative of horizontal pleiotropy). To further test this assumption, we used as an alternative test the recently developed MR-PRESSO (MR pleiotropy residual sum and outlier; Verbanck et al., 2018). MR-PRESSO conducts a global test to detect overall pleiotropy; next, variants yielding outlying causal effects are removed, as such SNPs likely have pleiotropic effects. While the test demonstrated that such outliers are present, correcting for them still produced no evidence for a causal effect (see Table S2).
Another important consideration when interpreting the results is statistical power. Our power analyses indicated that we had relatively good power to detect an effect as small as an odds ratio of 1.1 in the BW–MDD study, probably owing to the large sample used in the GWAS of MDD (N = 173,005 individuals). The power was lower in the other analyses. The MR studies testing the causal association between BW and ADHD and schizophrenia had adequate power to detect effects as large as an odds ratio of 1.2 but not an effect of an odds ratio of 1.1.
To put these results into perspective, we note that Thompson et al. (2001) demonstrated an inverse association between BW and depression in adult men, while Gale and Martyn (2004) showed that this association is observed in both men and women with very low BW (<2.5 kg). Similarly, in a recent meta-analysis of 14 observational studies, De Mola et al. (2014) found an association between low BW (<2.5 kg) and depression in adults (OR = 1.39, 95% CI [1.21, 1.60]). They further showed that the strength of the effect varied across studies as a function of (a) the threshold used to define BW categories, (b) gender composition of the sample, (c) presence or absence of adjustment for potential confounders (e.g., socio-economic status, gestational age), (d) type of study design, and (e) age of the participants. Conversely, a meta-analysis of 18 studies by Wojcik et al. (2013) found a weak effect (OR = 1.15, 95% CI [1.00, 1.32]) of low BW on depression or psychological distress (they obtained a similar effect when restricting the analysis to the 15 studies that reported only depression as the outcome). Yet, correction for publication bias rendered the weak association observed by Wojcik et al. no longer significant. Similar null associations were observed in samples restricted either to women (Inskip et al., 2008) or to men (Osler et al., 2005). The results concerning the relationship between BW and schizophrenia are also inconsistent.
Several studies supported an association in individuals weighing less than 2.5 kg at birth (Cannon et al., 2002; Gunnell et al., 2003), while subsequent well-powered observational studies conducted in the large Scandinavian databases showed that the negative association remains significant when considering the normal BW range (up to 4.5 kg or more; Abel et al., 2010; Eide et al., 2013). There is also evidence of no association between BW and schizophrenia (see, e.g., Gunnell et al., 2005), as well as evidence supporting a reverse J-shaped association, with the largest odds of developing schizophrenia observed in individuals with very small weight at birth (<2.5 kg; see, e.g., Gunnell et al., 2003; Moilanen et al., 2010). Unlike the present study, studies to date investigating the relationship between BW and mental disorders like schizophrenia and MDD were based on observational evidence and therefore cannot infer causality. It is likely that the previously reported associations reflect the effects of confounders, as adjustment for known confounders does not suffice (Pingault et al., 2018); specifically, interpreting these associations as causal requires exhaustive adjustment for all possible confounders, as well as the untestable assumption of absence of reverse causation.
Our study, as the first to use the MR approach, which has distinct advantages such as increased ecologic validity and absence of confounding, does not support a causal effect of BW on mental disorders. Our results, however, do not exclude the possibility of a causal association between (extremely) low BW (Lærum et al., 2017) and schizophrenia and more severe forms of depression (i.e., recurrent severe depressive symptoms; Colman et al., 2007; Wojcik et al., 2013). Alternatively, it is possible that there is a small effect that went undetected in our study due to insufficient statistical power to detect relatively weak causal associations.
On the other hand, the evidence for the association between low BW and ADHD produced by studies to date is more robust (Momany et al., 2018; Wiles et al., 2006). Importantly, there are also several studies that probed the causality in this association using the co-twin control method; these studies consistently demonstrated an effect of BW on ADHD symptoms and attention problems (Ficks et al., 2013; Groen-Blokhuis et al., 2011; Pettersson et al., 2015). The co-twin method probes the causal effect of a risk factor on an outcome or disorder by comparing the within-pair mean differences/relative risk for developing the disorder between unrelated participants, and monozygotic and dizygotic twin pairs discordant for exposure to the risk factor (Hart et al., 2013; Kendler et al., 1993; Middeldorp et al., 2008).
While these studies employed a continuous measure of attention problems, our study was based on summary statistics obtained in the GWAS of ADHD (Demontis et al., 2017), which employed a dichotomous outcome (the presence or absence of ADHD, assessed based on a diagnostic interview or reported by parents or teachers). Using a continuous outcome confers larger statistical power relative to using a dichotomous measure (Groen-Blokhuis et al., 2011). Again, our results do not exclude the possibility that there is a small causal effect that was not detected in our study due to weak instruments and the use of a dichotomous phenotype. Another possible explanation of the differences in the results is that the co-twin control method makes strong assumptions concerning the environmental influences not shared within a twin pair (Hart et al., 2013). It is plausible that there are intrauterine factors not controlled for by this design, such as the difference in blood flow between twins, that could confound the relationship between BW and attention problems.
The Barker hypothesis suggests that poor fetal environment is linked to increased risk of adult diseases. BW is often used as a proxy for fetal development; yet, it might be possible that BW is a poor indicator of intrauterine factors contributing to the later development of mental disorders. To explore this possibility, we tested for causal effects using another indicator of fetal development — gestational age — by employing the same MR methods. These analyses also produced no evidence for a causal effect of gestational age on mental disorders (see Figure S3). These findings do provide additional support for our results; however, it must be noted that these new analyses had limitations, such as the small number of available instrumental SNPs, low proportion of variance explained in the exposure (weak instruments), and low statistical power.
Strengths and Limitations
To our knowledge, this is the first study to evaluate the effect of BW on mental disorders by using new statistical methods that can test causal hypotheses on the basis of genetic data. MR methods provide several advantages over observational analysis, such as producing a causal effect estimate free from the effects of confounders (Lawlor et al., 2008; Pingault et al., 2018). Furthermore, the study applied various methods, each employing different assumptions. The fact that these different approaches yielded consistent results increases the likelihood that our results are robust. In addition, this study is the first to make use of summary statistics of the recently published GWASs to test the fetal origins of mental disorders.
There are three potential limitations that require attention in the interpretation of the current results. First, the SNPs we used as IVs explained only 2% of the variance in BW, which means that the instruments were relatively weak. Although we used multiple SNPs to test the causal hypotheses, these variants were employed individually; hence, the approach may still be vulnerable to weak instrument bias relative to an approach that uses a polygenic score to instrument the analysis. It is known that in two-sample analyses, weak instruments bias the estimate toward the null (Evans & Davey Smith, 2015). Hence, it is worth readdressing the Barker hypothesis as novel GWAS summary statistics become available (particularly the BW-ADHD and BW-schizophrenia relationships, as the current study had low power to detect small effects), and by employing alternative approaches that allow for the use of strong instruments in the form of polygenic scores (Minică et al., 2018). A second limitation concerns the relatively low power of MR-Egger to identify pleiotropic effects (Verbanck et al., 2018). Although MR-PRESSO is more powerful, its power depends heavily on the proportion of pleiotropic variants among those that make up the instrument. MR-PRESSO has good power if at least 10% of the variants have pleiotropic effects (Verbanck et al., 2018). However, a single pleiotropic variant is sufficient to bias the MR results. One final issue is that the assessment of outcomes was not always based on a clinical diagnosis; this might potentially affect the accuracy of the results.
Conclusion
Based on the current findings, we found no support for the Barker hypothesis concerning the fetal origins of mental disorders. To account for lack of power to identify small effects, it is important to re-run the analysis when more SNPs associated with BW are identified, resulting in stronger instruments. One way to further explore pleiotropy is to combine MR with twin models using the MR-direction of causation model (Minică et al., 2018). In addition, indicators for fetal development other than BW can be used as proxies, such as height at birth and head circumference.
Acknowledgments
Camelia C. Minică was supported by the National Institute on Drug Abuse (grant number DA-018673). Subhi Arafat was supported by The European Union Education, Audiovisual and Culture Executive Agency.
Conflict of Interest
The authors have no conflict of interest to declare.
Supplementary material
To view supplementary material for this article, please visit https://doi.org/10.1017/thg.2018.65