- AD: Alzheimer's disease
- GWAS: genome-wide association study
- FTO: fat mass and obesity associated
- MTHFR: methylenetetrahydrofolate reductase
The use of genetic information for the detection of disease risk and the provision of tailored therapeutic strategies or lifestyle advice holds enormous potential for population health benefits. It is hoped that a comprehensive understanding of nutrigenetics, which refers to the interactive impact of genetic variation (genotype) and diet composition in defining health and risk of disease (phenotype), may lead to the partial replacement of current generic ‘one size fits all’ dietary guidelines with more efficacious stratified dietary advice based in part on personal genetic information. However, to date, only a relatively small fraction of the estimated total genetic contribution to phenotype has been identified. This raises the critical question of whether our initial estimates of heritability are inflated or whether our current methods for detecting genetic variation or interpreting data have been insensitive or misleading. Overestimated heritability may occur as a result of non-additive genetic effects, gene–environment interactions or an environment shared by family members( Reference Zuk, Hechter and Sunyaev 1 ); although the possibility cannot be discounted, there is little evidence to date that this is the case( Reference Manolio, Collins and Cox 2 , Reference Visscher, Medland and Ferreira 3 ). It appears more likely that a large proportion of the genetic contribution to phenotype is as yet undiscovered.
The interaction between diet, genetics and health is complex and occurs at multiple levels (Fig. 1). Firstly, the influence of genotype on phenotype is not homogeneous and can be more or less pronounced depending on the diet composition and nutrition status of the individual. Genetic variation influences food preference, appetite and satiety and therefore overall diet composition. Once a dietary component is consumed, the amount absorbed and its subsequent metabolism, tissue uptake and elimination from the body are under genetic regulation. The influence of tissue concentrations of a dietary component on metabolism is again influenced by genotype, for example via genetic variability in cell signalling mechanisms. Adding to the complexity, the influence of genotype on homoeostasis and on the metabolism and bioactivity of dietary components is itself not homogeneous and is modified by a range of variables such as sex, ethnicity, drug use and other lifestyle variables (see section on Genome-wide association studies).
The current paper will provide an update of an earlier review entitled ‘Nutrigenetics and personalised nutrition: how far have we progressed and are we likely to get there’ published in 2009( Reference Rimbach and Minihane 6 ). In the interim, a large body of data from genome-wide association studies (GWAS) has been published. Their contribution to our understanding of disease pathology and its genetic component will be considered, along with the strengths and limitations of GWAS approaches. The likely advances associated with emerging sequencing technologies will be briefly discussed.
Although there are a small number of notable exceptions, it is becoming apparent that the individual impact of the most common type of genetic variation, namely SNP, is small, with effect sizes in the 0–10% range. Therefore, it is unlikely that personalisation of risk of disease or dietary advice will be based on a small number of common variants with individual large effects. An alternative hypothesis as will be discussed is that a large proportion of variability is accounted for by rarer variants with large biological impacts, information not currently captured by GWAS.
Although the situation is rapidly changing, to date, due to a lack of dietary data collection, GWAS have contributed little to our nutrigenetic understanding, which is largely derived from candidate-gene studies. Candidate-gene approaches have been used to identify and quantify the impact of variants such as the APOE epsilon and the methylenetetrahydrofolate reductase (MTHFR) C677T genotypes on disease risk and response to diet. An update on the recent literature for these gene loci will be provided.
Although the authors recognise that, in addition to variation in the DNA code itself, the epigenetic status of genes influences phenotype, diet × epigenetic × phenotype interactions are outside the scope of the current paper and have been reviewed extensively elsewhere( Reference Burdge, Hoile and Lillycrop 7 – Reference Caslake, Miles and Kofler 9 ).
Genome-wide association studies: how have they informed and misinformed us
In contrast to candidate-gene studies, which focus on variants in genes with a known metabolic role, GWAS are not hypothesis driven. The advantage (but also the challenge) of the hypothesis-free approach used in GWAS is that it has the power to identify novel biological pathways associated with a phenotype or dietary component of interest. An example of this paradigm is the large research interest in uncovering the biological action of the fat mass and obesity associated (FTO) protein following the identification, in a 2007 GWAS, of an association between a SNP in the FTO gene and BMI( Reference Tung and Yeo 10 ), with the authors, Frayling et al., concluding that ‘FTO is a gene of unknown function in an unknown pathway’( Reference Frayling, Timpson and Weedon 11 ).
In GWAS, genetic variability is quantified in a group of cases v. matched controls in order to establish disease-associated variants. Standard SNP arrays typically carry 300,000 to 2 × 10⁶ tagging SNP (derived from HapMap( Reference Altshuler, Gibbs and Peltonen 12 )), which are correlated with, and so provide information on, 80–90% of common variation (frequency >5%), but capture far less of the low-frequency variants (0·01–5%) and virtually none of the rare variants( Reference Marian and Belmont 13 ).
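As a rough illustration of what it means for a tagging SNP to be 'correlated with' another variant, the sketch below computes the standard linkage disequilibrium statistic r² between two biallelic loci from haplotype and allele frequencies. The function name and all frequencies are hypothetical values chosen for illustration; they are not taken from any array or from HapMap.

```python
def ld_r_squared(p_ab: float, p_a: float, p_b: float) -> float:
    """Return r^2 between two biallelic loci.

    p_ab: frequency of the haplotype carrying allele A (locus 1) and allele B (locus 2)
    p_a:  frequency of allele A at locus 1 (e.g. the tagging SNP)
    p_b:  frequency of allele B at locus 2 (e.g. an untyped variant)
    """
    if not (0 < p_a < 1 and 0 < p_b < 1):
        raise ValueError("allele frequencies must lie strictly between 0 and 1")
    d = p_ab - p_a * p_b  # raw disequilibrium coefficient D
    return d * d / (p_a * (1 - p_a) * p_b * (1 - p_b))


# Hypothetical case 1: two common alleles that usually travel together.
well_tagged = ld_r_squared(p_ab=0.28, p_a=0.30, p_b=0.30)

# Hypothetical case 2: a 2% variant that always sits on the common tag allele.
# Every copy of B is on an A haplotype, yet the correlation is weak because
# the allele frequencies are so different.
poorly_tagged = ld_r_squared(p_ab=0.02, p_a=0.30, p_b=0.02)

print(f"common variant, common tag:   r^2 = {well_tagged:.2f}")
print(f"low-frequency variant, common tag: r^2 = {poorly_tagged:.2f}")
```

Under these assumed frequencies the second case shows why a low-frequency variant can be poorly captured by a common tagging SNP even when it always occurs on the same haplotype background.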
The first GWAS output was published in 2005 and identified a polymorphism in complement factor H linked with age-related macular degeneration( Reference Klein, Zeiss and Chew 14 ). This led the way to an explosion of activity, and to date, GWAS have identified over 1600 variants associated with 250 traits( Reference Hindorff, MacArthur and Wise 15 ) and have had some success in identifying a large component of the genetic basis of particular phenotypes. For example, as reviewed by Manolio et al., five loci have been identified for age-related macular degeneration which collectively explain 50% of total heritability( Reference Manolio, Collins and Cox 2 ). In 2010, a meta-analysis of forty-six cohort studies confirmed ninety-five loci as predictors of plasma lipids (total cholesterol, HDL-cholesterol, LDL-cholesterol and TAG), which explained 25–30% of the genetic component of these traits( Reference Teslovich, Musunuru and Smith 16 ). However, for the majority of polygenic traits, the identified variants account for a much lower proportion of the total estimated heritability, which varies from 20 to 80% depending on the phenotype of interest( Reference Manolio, Collins and Cox 2 , Reference Lander 17 ). For BMI and obesity, thirty-two individual loci have been identified and confirmed, but the effect size of each individual variant is small. The largest effect is evident for FTO, with each risk allele increasing BMI by on average 0·39 kg/m² and obesity risk 1·20-fold( Reference Loos 18 ). However, collectively the thirty-two loci explain only 1·45% of the phenotypic variation in BMI, equivalent to 2–4% of heritability( Reference Speliotes, Willer and Berndt 19 ).
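The link between phenotypic variance explained and heritability explained in the BMI example is simple arithmetic, sketched below. The function name is hypothetical, and the heritability values of 0·40 and 0·70 are assumptions chosen only because they give figures close to the 2–4% range quoted for BMI; they are not estimates taken from the cited studies.

```python
def fraction_of_heritability_explained(variance_explained: float,
                                       heritability: float) -> float:
    """Share of the heritable variance accounted for by identified loci.

    variance_explained: fraction of total phenotypic variance explained (0-1)
    heritability:       fraction of phenotypic variance that is genetic (0-1)
    """
    if not 0 < heritability <= 1:
        raise ValueError("heritability must be in (0, 1]")
    return variance_explained / heritability


bmi_variance_explained = 0.0145  # 1.45% of phenotypic variation, as quoted in the text

# Assumed (illustrative) heritability values for BMI.
for assumed_h2 in (0.40, 0.70):
    share = fraction_of_heritability_explained(bmi_variance_explained, assumed_h2)
    print(f"assumed heritability {assumed_h2:.2f}: {share:.1%} of heritability explained")
```

The same division shows why a small percentage of phenotypic variance always corresponds to a somewhat larger percentage of heritability, since heritability is at most 1.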
There are likely to be a number of reasons why GWAS has resulted in only modest capture of heritability( Reference Gibson 20 ). Firstly, although the gene variant may have been identified by GWAS, it may not have emerged as significant or its effect size may be underestimated for a number of reasons which include:
1. Use of a stringent P value threshold (typically P < 1 × 10⁻⁸) to compensate for multiple testing and eliminate false positive results. This may result in failure to detect many true signals and we may be ‘correcting away the hidden heritability’( Reference Williams and Haines 21 ). Alternative approaches to the use of strict P values, such as multi-stage confirmation of significant SNP in subsequent datasets( Reference Easton, Pooley and Dunning 22 , Reference Consortium 23 ), have led to the identification of further variants of interest.
2. Imprecise phenotyping( Reference Marian and Belmont 13 ). For disease outcomes, patients in the ‘case group’ often present with a range of related conditions with variable genetic aetiology. For example, in the cardiovascular field, myocardial infarction, ischaemic heart disease and coronary stenosis are often pooled, although they have both common and separate aetiological components. For many outcomes, such as blood pressure, the precision of the measurement is problematic, while for others, such as pro-inflammatory cytokines and C-reactive protein, there is large intra-individual variability, which means the trait is imprecisely captured.
3. Control group of questionable quality( Reference McCarthy, Abecasis and Cardon 24 ). Often an individual in the control group does not have a clinical diagnosis of the primary outcome but is a registered patient for an alternative condition whose risk may also be affected by the identified gene variants, which leads to underestimation of the effect size of the variant.
4. True causal variants may be incompletely surveyed because they are not in full linkage disequilibrium with the tagging SNP.
5. A number of causal variants may exist in one locus, with only one tagging SNP chosen, which may result in an underestimation of the total heritability accounted for by that particular region.
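The stringency described in point 1 follows directly from the number of tests performed. A minimal sketch of a Bonferroni-style correction is given below; the family-wise error rate of 0·05 and the test counts are assumptions for illustration, and individual studies may use other correction methods or thresholds.

```python
def bonferroni_threshold(alpha: float, n_tests: int) -> float:
    """Per-test significance threshold that keeps the family-wise error rate at alpha."""
    if n_tests < 1:
        raise ValueError("n_tests must be at least 1")
    return alpha / n_tests


# Illustrative test counts spanning the array sizes mentioned in the text.
for n_tests in (300_000, 1_000_000, 2_000_000):
    threshold = bonferroni_threshold(0.05, n_tests)
    print(f"{n_tests:>9,d} tests -> P < {threshold:.1e}")
```

Because the threshold shrinks in proportion to the number of SNP tested, variants with real but small effects can fall short of significance, which is the concern raised above.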
Secondly, it is plausible and increasingly demonstrated that rarer single nucleotide changes, or structural variants such as copy number variations( Reference Redon, Ishikawa and Fitch 25 , Reference Almal and Padh 26 ), which are far more common than originally thought, could make a significant contribution to hidden variability (Fig. 2)( Reference Almal and Padh 26 – Reference Jakobsson, Scholz and Scheet 28 ). Next generation sequencing is becoming increasingly feasible and affordable and is in more widespread use as a research tool( Reference Manolio, Collins and Cox 2 , Reference Bras, Guerreiro and Hardy 29 , Reference Clarke, Zheng-Bradley and Smith 30 ). This technology provides a complete map of an individual's genome, overcoming limitations associated with SNP tagging in GWAS and allowing the detection of less common variants. The 1000 Genomes Project, which will include far more than 1000 genomes, aims to capture all variants with <1% frequency and >0·1% in protein coding regions (exome)( Reference Clarke, Zheng-Bradley and Smith 30 ). It is hoped that this technology will detect much of the hidden heritability.
Thirdly, the broad-sense heritability model posits that, once ‘unveiled’, neither common variants with modest impact nor rare alleles with high penetrance are likely to explain away missing heritability. It theorises that interactions involving known genetic variation, namely between allele pairs (dominance), between alleles in different genes (epistasis) and between genotype and environment (including diet composition) or physiological variables, explain a large proportion of inheritance.
Impact of physiological variables on genotype–phenotype associations
Currently, in genetic studies, populations are considered as single entities. It is becoming increasingly apparent that population genetic associations often under- or over-estimate the effect in subgroups. At this stage, it is too early to say how much differential penetrance in population subgroups contributes to missing heritability, but the contribution is likely to be significant.
For biological processes with a known influence of sex, such as adiposity and plasma lipids, it is plausible to assume that the impact of genetic variation on these phenotypes may vary between sexes, with numerous demonstrations of this now available. In the Framingham Heart Offspring cohort, no variant showed genome wide significance for measures of obesity. However, sex dimorphism was evident with four polymorphisms in the lysophospholipase-like protein 1 (LYPLAL1) locus which encodes for a lipase/esterase in adipose tissue, having divergent effects in men and women( Reference Benjamin, Suchindran and Pearce 31 ). In a meta-analysis of the available GWAS, and using the waist:hip ratio as a measure of body fat topography, along with the LYPLAL1 signal, thirteen new loci for the waist:hip ratio were evident. Seven of these displayed dimorphism with a stronger effect in women( Reference Heid, Jackson and Randall 32 ). Using a candidate-gene approach we have reported a number of significant associations between common SNP and the postprandial lipaemic response, a CVD determinant of ever increasing prevalence( Reference Jackson, Poppitt and Minihane 33 ). For the leptin receptor (Gln223Arg, rs1137101) and APOA5 (−1131T > C, rs662799) variants, the effect of genotype was only evident in men( Reference Olano-Martin, Abraham and Gill-Garrison 34 , Reference Jackson, Delgado-Lista and Gill 35 ). For example, for leptin receptor, a 20% lower postprandial TAG response was evident in ArgArg v. GlnGln homozygotes with men and women combined, with a 35% difference in the men only group and no effect of genotype in women (Table 1)( Reference Jackson, Delgado-Lista and Gill 35 ).
Table 1 note: * Values are group mean postprandial TAG area under the curves (mmol/l × 480 min) following consumption of test meals containing 49 g (0 min) and 29 g (330 min) total fat.
There is also evidence of racial/ethnic differences in the physiological impact of particular variants. Results from the Population Architecture Using Genomics and Epidemiology Consortium( Reference Fesinmeyer, North and Ritchie 36 ) and a meta-analysis of forty-six individual GWAS( Reference Teslovich, Musunuru and Smith 16 ), which aimed to identify genome-wide signals for BMI and plasma lipids, respectively, showed considerable overlap between associations in those of European and Asian ancestry, with more modest replication in other populations such as African Americans and American Indians. Linkage disequilibrium patterns suggest that tagging SNP used in GWAS for Europeans may not adequately capture the genetic variation in other ethnic groups. Apparent ethnic differences in genotype × phenotype associations from GWAS may also be in part attributable to differences in habitual diet between populations.
Genome-wide association studies in the study of nutrigenetic interactions
Although GWAS methodology and modelling do not lend themselves well to the direct study of genotype × diet × disease associations, in part because the sample size needed would be enormous( Reference Thomas 37 ), an increasing number of GWAS have nutrient status as their primary endpoint( Reference Lemaitre, Tanaka and Tang 38 , Reference Wang, Zhang and Richards 39 ). Using a genome-wide approach, Wang et al. identified three loci near the genes for cholesterol synthesis, hydroxylation and vitamin D transport as significant predictors of vitamin D status, which could potentially be used to set vitamin D intake recommendations( Reference Wang, Zhang and Richards 39 ). The output from GWAS has also informed the choice of variants for more focused study of genotype × diet interactions in human epidemiology and intervention studies and in targeted replacement animal models. The identification of the FTO genotype by GWAS has led to a flurry of activity examining its impact on food intake, satiety and appetite and its interaction with macronutrient composition in determining BMI and risk of obesity( Reference Tung and Yeo 10 , Reference McCaffery, Papandonatos and Peter 40 , Reference Razquin, Marti and Martinez 41 ). In the RISCK randomised controlled trial, the impact of forty GWAS-identified lipid-associated SNP on the response of plasma lipids to a low-saturated fat diet was determined( Reference Walker, Loos and Olson 42 ). The relatively recent availability of GWAS data for cohorts in which participants have detailed dietary data, such as the Nurses' Health Study, the Framingham Heart Study and EPIC, is likely to make a significant contribution to our nutrigenetic understanding in the near future.
To date, the majority of nutrigenetic information is derived from candidate-gene studies. Two of the most widely researched variants using this approach are APOE epsilon and MTHFR SNP, with approximately 6000 and 3000 associated published articles, respectively. However, despite extensive research focus there remains considerable uncertainty regarding the relative impact of these common genotypes on health and response to dietary change, which demonstrates the complexity of the interactions.
The renowned APOE epsilon genotype
As its name suggests, apoE is an important modulator of many stages of lipoprotein metabolism and is the main lipid transporter in the central nervous system. Since its original identification, its pleiotropic nature has been realised, with apoE now known to regulate immunity and inflammation, oxidative status and β-amyloid metabolism in the central nervous system. Two non-synonymous SNP in the APOE gene result in three specific apoE protein isoforms, namely apoE2, apoE3 and apoE4. The APOE genotype was originally described as a genetic contributor to CVD, with APOE4 carriers at increased risk. Over time, and with ever larger meta-analyses, it has become apparent that at a population level the impact on CVD risk is marginal( Reference Bennet, Di Angelantonio and Ye 43 , Reference Song, Stampfer and Liu 44 ), and APOE often does not emerge as a significant signal in GWAS (Table 2), although in individual population subgroups such as smokers, the APOE genotype remains a highly significant risk factor. In both the Northwick Park and Framingham Offspring cohorts, risk of CVD was about 2-fold higher in smokers who were E4 carriers than in those who were wild-type E3/E3 (Table 3)( Reference Humphries, Talmud and Hawe 46 , Reference Talmud, Stephens and Hawe 47 ).
Table 2 notes: CAD, coronary artery disease; AD, Alzheimer's disease.
* E2 carriers include E2/E2 and E2/E3; E4 carriers include E3/E4 and E4/E4.
† OR (95% CI) with the wild-type E3/E3 genotype as the reference.
Table 3 notes:
* E2 carriers include E2/E2 and E2/E3; E4 carriers include E3/E4 and E4/E4.
† Hazard ratio (95% CI) with all genotypes combined, never smokers as reference.
‡ Hazard ratio (95% CI) with E3/E3 never smokers as reference.
More recently there has been much interest in the APOE genotype as a longevity gene. Although not fully consistent, the APOE4 allele has emerged as being associated with a shorter life-span( Reference McKay, Silvestri and Chakravarthy 48 , Reference Murabito, Yuan and Lunetta 49 ).
Perhaps the most consistent and consequential common genotype-disease association described to date is the impact of the APOE genotype on risk of age-related cognitive decline and Alzheimer's disease. As summarised in Table 2, APOE3/E4 and APOE4/E4 individuals are at approximately 3–4- and 12–16-fold increased risk of Alzheimer's disease, respectively, and have a much earlier age of onset( Reference Bertram, McQueen and Mullin 45 ). The clinical significance of the genotype is demonstrated by the fact that almost 50 and 10% of Alzheimer's disease patients are APOE3/E4 and APOE4/E4, whereas these genotype subgroups represent approximately 20 and 2%, respectively, of the general population( Reference Minihane, Jofre-Monseny and Olano-Martin 50 , Reference Ward, Crean and Mercaldi 51 ). Interestingly, when James D. Watson was presented with his genetic information, his being the first genome sequenced by next-generation sequencing technologies, he elected not to know his APOE genotype( Reference Wheeler, Srinivasan and Egholm 52 ).
The aetiological basis of this association is likely to be multi-factorial with APOE4 carriers having altered central nervous system lipid metabolism, vascular dysfunction, increased neuroinflammation and oxidative stress, β-amyloid deposition, synaptic dysfunction and impaired neurogenesis( Reference Hauser, Narayanaswami and Ryan 53 ). This is of wide interest in the quest for further establishment of the Alzheimer's disease pathological process and its treatment and prevention.
APOE genotype and its response to dietary change
Given the association between the APOE4 allele and cognitive decline and CVD risk in particular individuals, there is wide interest in the identification of dietary strategies to reduce disease risk in this large genotype subgroup. Research into the impact of the APOE genotype in response to diet has almost exclusively focused on plasma lipid response to altered dietary fat composition.
Overall, the evidence suggests that APOE4 carriers are most responsive to the plasma cholesterol modulating impact of total fat, cholesterol, saturated fat and long chain n-3 PUFA (EPA and DHA) intake( Reference Minihane, Jofre-Monseny and Olano-Martin 50 ). However, with a few exceptions, the impact of genotype has been studied using retrospective genotyping, where lack of power has often led to inconclusive findings. In our original study, using retrospective genotype analysis, we observed an LDL-cholesterol raising effect of high-dose fish oil supplementation (3 g EPA + DHA per d) in APOE4 carriers, which may in part negate the cardioprotective benefits( Reference Minihane, Khan and Leigh-Firbank 54 ). In a subsequent, adequately powered study with prospective recruitment on the basis of genotype, we confirmed these earlier findings and demonstrated that it is likely to be the DHA rather than the EPA in fish oil which raises cholesterol( Reference Olano-Martin, Anil and Caslake 55 ). In more recent publications, also using prospective recruitment, we have reported no significant APOE genotype × DHA × LDL-cholesterol interaction following lower intakes of fish oils (<2 g EPA + DHA per d)( Reference Caslake, Miles and Kofler 9 ) or against a background of high saturated fat intake( Reference Carvalho-Wells, Jackson and Lockyer 56 ).
Interestingly, there is also some, albeit inconsistent, evidence to suggest that the purported cognitive benefits of increased EPA + DHA status may be APOE genotype dependent, with no benefit in APOE4 carriers( Reference Barberger-Gateau, Samieri and Feart 57 ). This genotype × diet interaction requires substantiation but may reflect differential long-chain n-3 PUFA uptake and partitioning in the central nervous system.
Owing to the low population prevalence of the homozygous APOE4/E4 genotype (2%), studies to date have largely compared the response in APOE4 carriers (largely APOE3/E4 individuals) v. non-carriers. Although there is limited supporting evidence, it is likely that APOE4/E4 individuals are the most responsive to dietary fat manipulation. Quantification of the response in this genotype group is important given its impact on risk of cognitive decline.
The often cited methylenetetrahydrofolate reductase C677T variant
MTHFR is an important enzyme in folate/homocysteine metabolism. It provides a clear demonstration of how genetic information could be used to provide targeted dietary advice in an at-risk population subgroup. The homozygous mutant genotype (TT, rs1801133), which has a frequency of approximately 10% worldwide( Reference Wilcken, Bamforth and Li 58 ), is associated with reduced enzyme activity( Reference Frosst, Blom and Milos 59 ). Its subsequent impact on homocysteine concentrations, blood pressure and risk of diseases such as cancer and CVD is variable and has been shown to be dependent on factors such as sex and ethnicity( Reference Holmes, Newcombe and Hubacek 60 , Reference Xuan, Bai and Gao 61 ). The penetrance of the genotype is also dependent on vitamin B status (folate, riboflavin, vitamin B6 and vitamin B12)( Reference Holmes, Newcombe and Hubacek 60 , Reference McNulty, Pentieva and Hoey 62 ). In two complementary intervention trials, Scott and colleagues elegantly demonstrated that riboflavin (a co-factor of MTHFR) lowered blood pressure in patients with the MTHFR TT genotype, an effect that was independent of background use of prescribed drugs( Reference Horigan, McNulty and Ward 63 , Reference Wilson, Ward and McNulty 64 ). Overall, there is considerable evidence to indicate that adequate vitamin B status is likely to abrogate the negative physiological impact of this genotype on disease risk.
Conclusion: the road ahead
The first draft of the majority (about 90%) of the sequence of the human genome was published in a Nature article entitled ‘Initial sequencing and analysis of the human genome’ a little over a decade ago (February 2001)( Reference Lander, Linton and Birren 65 ), with the complete sequence (about 99·7%) available in 2004( 66 ). At the time, such information was considered by many to be a panacea and one of the greatest ever medical achievements. Ten years on, many consider progress based on the human genome to be limited, but as reviewed by Eric Lander (who was lead author of the original 2001 publication), this may be a rather harsh assessment( Reference Lander 17 ). Of the 3000 Mendelian (monogenic) disorders whose genetic basis is known, the loci for the vast majority have been identified since 2001. Furthermore, GWAS and HapMap approaches have led to the identification of 1600 variants associated with 250 traits, which has established numerous novel pathological pathways and contributed to the development of novel therapies. Although this progress is sufficient to indicate the potential of the technology, there has been limited success in the use of genetic information for disease prediction and the personalisation of therapeutics or preventative advice, with much of the estimated heritable component of disease risk and response to diet unaccounted for. Based on the available information, it appears that rather than being overestimated, the heritability is dark matter (i.e. it is real but we cannot see it yet), attributable to as yet undetected rare variants or to underestimation of the impact of known variants. The wider use of sequencing will provide information on variants with a frequency of <5%. More detailed and precise characterisation of study participants in genetic studies and more sophisticated modelling will undoubtedly lead to the detection of variants of particular importance in population subgroups.
Most of the benefits of genetics in public health remain to be realised and we undoubtedly have a long way to go. In the words of Churchill, it feels like the ‘end of the beginning’ rather than the ‘beginning of the end’.
Acknowledgements
The author declares no conflict of interest.