Genome-Wide Association Study for Ovarian Cancer Susceptibility Using Pooled DNA

Yi Lu; Xiaoqing Chen; Jonathan Beesley; Sharon E. Johnatty; Anna deFazio; Australian Ovarian Cancer Study (AOCS) Study Group; Sandrina Lambrechts; Diether Lambrechts; Evelyn Despierre; Ignace Vergotes; Jenny Chang-Claude; Rebecca Hein; Stefan Nickels; Shan Wang-Gohrke; Thilo Dörk; Matthias Dürst; Natalia Antonenkova; Natalia Bogdanova; Marc T. Goodman; Galina Lurie; Lynne R. Wilkens; Michael E. Carney; Ralf Butzow; Heli Nevanlinna; Tuomas Heikkinen; Arto Leminen; Lambertus A. Kiemeney; Leon F.A.G. Massuger; Anne M. van Altena; Katja K. Aben; Susanne Krüger Kjaer; Estrid Høgdall; Allan Jensen; Angela Brooks-Wilson; Nhu Le; Linda Cook; Madalene Earp; Linda Kelemen; Douglas Easton; Paul Pharoah; Honglin Song; Jonathan Tyrer; Susan Ramus; Usha Menon; Alexandra Gentry-Maharaj; Simon A. Gayther; Elisa V. Bandera; Sara H. Olson; Irene Orlow; Lorna Rodriguez-Rodriguez; Stuart Macgregor; Georgia Chenevix-Trench

doi:10.1017/thg.2012.38

Published online by Cambridge University Press: 13 July 2012

Yi Lu: Affiliation:
Queensland Institute of Medical Research, Brisbane, Australia
Xiaoqing Chen: Affiliation:
Queensland Institute of Medical Research, Brisbane, Australia
Jonathan Beesley: Affiliation:
Queensland Institute of Medical Research, Brisbane, Australia
Sharon E. Johnatty: Affiliation:
Queensland Institute of Medical Research, Brisbane, Australia
Anna deFazio: Affiliation:
Department of Obstetrics and Gynaecology, University of Sydney, Sydney, Australia Westmead Institute for Cancer Research, Westmead Hospital, Sydney, Australia
Sandrina Lambrechts: Affiliation:
Division of Gynaecological Oncology, Department of Obstetrics and Gynaecology, University Hospital Leuven, University of Leuven, Leuven, Belgium
Diether Lambrechts: Affiliation:
Vesalius Research Center, VIB, Leuven, Belgium Vesalius Research Center, University of Leuven, Belgium
Evelyn Despierre: Affiliation:
Division of Gynaecological Oncology, Department of Obstetrics and Gynaecology, University Hospital Leuven, University of Leuven, Leuven, Belgium
Ignace Vergotes: Affiliation:
Division of Gynaecological Oncology, Department of Obstetrics and Gynaecology, University Hospital Leuven, University of Leuven, Leuven, Belgium
Jenny Chang-Claude: Affiliation:
Division of Cancer Epidemiology, German Cancer Research Center (DKFZ), Heidelberg, Germany
Rebecca Hein: Affiliation:
Division of Cancer Epidemiology, German Cancer Research Center (DKFZ), Heidelberg, Germany
Stefan Nickels: Affiliation:
Division of Cancer Epidemiology, German Cancer Research Center (DKFZ), Heidelberg, Germany
Shan Wang-Gohrke: Affiliation:
Department of Obstetrics and Gynecology, University of Ulm, Ulm, Germany
Thilo Dörk: Affiliation:
Gynaecology Research Unit, Hannover Medical School, Hannover, Germany
Matthias Dürst: Affiliation:
Department of Gynaecology, Friedrich Schiller University, Jena, Germany
Natalia Antonenkova: Affiliation:
Byelorussian Institute for Oncology and Medical Radiology, Aleksandrov NN, Minsk, Belarus
Natalia Bogdanova: Affiliation:
Byelorussian Institute for Oncology and Medical Radiology, Aleksandrov NN, Minsk, Belarus Clinics of Radiation Oncology, Hannover Medical School, Hannover, Germany
Marc T. Goodman: Affiliation:
Cancer Epidemiology Program, University of Hawaii Cancer Center, Honolulu, HI, USA
Galina Lurie: Affiliation:
Cancer Epidemiology Program, University of Hawaii Cancer Center, Honolulu, HI, USA
Lynne R. Wilkens: Affiliation:
Cancer Epidemiology Program, University of Hawaii Cancer Center, Honolulu, HI, USA
Michael E. Carney: Affiliation:
Department of Obstetrics and Gynecology, John A Burns School of Medicine, University of Hawaii, Honolulu, HI, USA
Ralf Butzow: Affiliation:
Department of Obstetrics and Gynecology, Helsinki University Central Hospital, Helsinki, Finland
Heli Nevanlinna: Affiliation:
Department of Obstetrics and Gynecology, Helsinki University Central Hospital, Helsinki, Finland
Tuomas Heikkinen: Affiliation:
Department of Obstetrics and Gynecology, Helsinki University Central Hospital, Helsinki, Finland
Arto Leminen: Affiliation:
Department of Obstetrics and Gynecology, Helsinki University Central Hospital, Helsinki, Finland
Lambertus A. Kiemeney: Affiliation:
Department of Urology, Radboud University Nijmegen Medical Centre, Nijmegen, The Netherlands Department of Epidemiology, Biostatistics and HTA, Nijmegen Medical Centre, Radboud University, Nijmegen, The Netherlands Comprehensive Cancer Center, Nijmegen, The Netherlands
Leon F.A.G. Massuger: Affiliation:
Department of Gynaecology, Nijmegen Medical Centre, Radboud University, Nijmegen, The Netherlands
Anne M. van Altena: Affiliation:
Department of Gynaecology, Nijmegen Medical Centre, Radboud University, Nijmegen, The Netherlands
Katja K. Aben: Affiliation:
Department of Epidemiology, Biostatistics and HTA, Nijmegen Medical Centre, Radboud University, Nijmegen, The Netherlands Comprehensive Cancer Center, Nijmegen, The Netherlands
Susanne Krüger Kjaer: Affiliation:
Department of Viruses, Hormones and Cancer, Institute of Cancer Epidemiology, Danish Cancer Society, Copenhagen, Denmark
Estrid Høgdall: Affiliation:
Department of Viruses, Hormones and Cancer, Institute of Cancer Epidemiology, Danish Cancer Society, Copenhagen, Denmark
Allan Jensen: Affiliation:
Department of Gynecology, Juliane Marie Centre, Rigshospitalet, University of Copenhagen, Copenhagen, Denmark
Angela Brooks-Wilson: Affiliation:
Department of Biomedical Physiology and Kinesiology, Simon Fraser University, Burnaby, British Columbia, Canada Genome Sciences Centre, BC Cancer Agency, Vancouver, British Columbia, Canada
Nhu Le: Affiliation:
Cancer Control Research, BC Cancer Agency, Vancouver, British Columbia, Canada
Linda Cook: Affiliation:
Division of Epidemiology and Biostatistics, University of New Mexico, Albuquerque, NM, USA
Madalene Earp: Affiliation:
Genome Sciences Centre, BC Cancer Agency, Vancouver, British Columbia, Canada
Linda Kelemen: Affiliation:
Alberta Health Services — Cancer Care, Calgary, Alberta, Canada
Douglas Easton: Affiliation:
Departments of Oncology and Public Health and Primary Care, University of Cambridge, Cambridge, UK
Paul Pharoah: Affiliation:
Departments of Oncology and Public Health and Primary Care, University of Cambridge, Cambridge, UK
Honglin Song: Affiliation:
Departments of Oncology and Public Health and Primary Care, University of Cambridge, Cambridge, UK
Jonathan Tyrer: Affiliation:
Departments of Oncology and Public Health and Primary Care, University of Cambridge, Cambridge, UK
Susan Ramus: Affiliation:
Department of Preventive Medicine, Keck School of Medicine, University of Southern California, Los Angeles, CA, USA
Usha Menon: Affiliation:
Gynaecological Oncology Unit, UCL EGA Institute for Women's Health, University College London, London, UK
Alexandra Gentry-Maharaj: Affiliation:
Gynaecological Oncology Unit, UCL EGA Institute for Women's Health, University College London, London, UK
Simon A. Gayther: Affiliation:
Department of Preventive Medicine, Keck School of Medicine, University of Southern California, Los Angeles, CA, USA
Elisa V. Bandera: Affiliation:
The Cancer Institute of New Jersey, New Brunswick, NJ, USA Memorial Sloan-Kettering Cancer Center, New York, NY, USA
Sara H. Olson: Affiliation:
Memorial Sloan-Kettering Cancer Center, New York, NY, USA
Irene Orlow: Affiliation:
Memorial Sloan-Kettering Cancer Center, New York, NY, USA
Lorna Rodriguez-Rodriguez: Affiliation:
The Cancer Institute of New Jersey, New Brunswick, NJ, USA
Stuart Macgregor*: Affiliation:
Queensland Institute of Medical Research, Brisbane, Australia
Georgia Chenevix-Trench: Affiliation:
Queensland Institute of Medical Research, Brisbane, Australia
*: address for correspondence: Stuart Macgregor, Queensland Institute of Medical Research, Locked Bag 2000, Herston, Queensland 4029, Australia. E-mail: stuart.macgregor@qimr.edu.au

Abstract

Recent Genome-Wide Association Studies (GWAS) have identified four low-penetrance ovarian cancer susceptibility loci. We hypothesized that further moderate- or low-penetrance variants exist among the subset of single-nucleotide polymorphisms (SNPs) not well tagged by the genotyping arrays used in the previous studies, which would account for some of the remaining risk. We therefore conducted a time- and cost-effective stage 1 GWAS on 342 invasive serous cases and 643 controls genotyped on pooled DNA using the high-density Illumina 1M-Duo array. We followed up 20 of the most significantly associated SNPs that are not well tagged by the lower density arrays used by the published GWAS, genotyping them on individual DNA. Most of the top 20 SNPs were clearly validated by individually genotyping the samples used in the pools. However, none of the 20 SNPs replicated when tested for association in a much larger stage 2 set of 4,651 cases and 6,966 controls from the Ovarian Cancer Association Consortium. Given that most of the top 20 SNPs from pooling were validated in the same samples by individual genotyping, the lack of replication is likely to be due to the relatively small sample size in our stage 1 GWAS rather than due to problems with the pooling approach. We conclude that there are unlikely to be any moderate or large effects on ovarian cancer risk untagged by less dense arrays. However, our study lacked power to make clear statements on the existence of hitherto untagged small-effect variants.

Keywords

GWAS; DNA pooling; ovarian cancer risk; Nanodrop spectroscopy

Type: Articles
Information: Twin Research and Human Genetics, Volume 15, Issue 5, October 2012, pp. 615-623

DOI: https://doi.org/10.1017/thg.2012.38
Copyright: Copyright © The Authors 2012

Genome-Wide Association Studies (GWAS) have been an unprecedented success in identifying common alleles with moderate to small effects associated with different diseases and phenotypes. In particular, more than 100 common, low-penetrance loci for different cancers have been uncovered by GWAS (Varghese & Easton, 2010). The discovery of susceptibility loci will provide significant insights into cancer etiology and an improved understanding of the mechanisms of tumor biology. In addition, loci associated with tumor progression after treatment will offer targets for therapeutic intervention, and risk predictions based on accumulated knowledge of cancer genetics, together with environmental risk factors, will help to identify individuals with an elevated risk of cancer (Fletcher & Houlston, 2010). Although each of the common loci identified through GWAS accounts for only a small proportion of risk, collectively more than 20% of familial risk of prostate cancer has been explained, and ~7%, ~6%, and ~5% of familial risk of lung, colorectal, and breast cancers, respectively, can now be explained by GWAS results (Varghese & Easton, 2010). These estimates are likely to be conservative, as the effects of causal variants are typically larger than the associations detected through tag single-nucleotide polymorphisms (SNPs; Fletcher & Houlston, 2010).

Globally, ovarian cancer is the seventh leading cause of cancer mortality among women. Despite its relatively rare incidence, it has the same pattern of familial aggregation as other major cancers. Early twin studies have shown that most of the excess familial risk of ovarian cancer is due to genetic factors rather than shared environmental factors (Lichtenstein et al., 2000). It is well established that although rare mutations in BRCA1 and BRCA2, identified originally by linkage studies, are the most important genetic risk factors in terms of their high penetrance, they do not fully account for the excess ovarian cancer risk seen in families. To date, one large GWAS has been conducted on ovarian cancer susceptibility, aiming to identify some of the remaining unexplained familial risk. This GWAS used relatively low-density Illumina 610K and 550K arrays for cases and controls, respectively (Song et al., 2009; Bolton et al., 2010; Goode et al., 2010). The confirmed susceptibility loci reaching genome-wide significance level (p < 5 × 10⁻⁸) uncovered by this GWAS are at 9p22.2 near BNC2 (Song et al., 2009), 19p13.11 near C19orf62 (also known as MERIT40) (Bolton et al., 2010), 8q24, and 2q31; two other borderline significant loci at 3q25 and 17q21 were also identified (Goode et al., 2010).
In addition, a candidate gene study has implicated the TERT locus, which has been found to contain susceptibility SNPs by many other cancer GWAS (Johnatty et al., 2010). All these loci confer small risks (per-allele relative risk less than 1.3), supporting the concept of a polygenic architecture underlying ovarian cancer susceptibility. Like other cancer types, ovarian cancer shows histological subtype variation. The associations identified so far are stronger for serous tumors than for other histological subtypes. To identify histology-specific risk loci, separate GWAS on different subtypes will be more powerful than a single GWAS that includes all subtypes. However, the high cost of GWAS limits the desirability of carrying out studies with individual genotyping for the less common subtypes, such as endometrioid, mucinous, and clear cell ovarian cancers, which may not be well powered.

GWAS genotyping on pooled DNA has proved to be a time- and cost-effective alternative to conventional GWAS, in which all study subjects are individually genotyped (Craig et al., 2009; Macgregor et al., 2006, 2008; Norton et al., 2004; Sham et al., 2002; Visscher & Le Hellard, 2003). In this study we conducted a pooled GWAS on serous ovarian cancer risk, followed by validation of the pooled results and genotyping of SNPs of interest in an independent large dataset from the Ovarian Cancer Association Consortium. We used the high-density Illumina 1M-Duo array containing 1.2 million SNPs for our pooled GWAS, because it has superior coverage, with 93% of common SNPs in the CEU population tagged at r² ≥ 0.8. The aim of this study was two-fold: to test the hypothesis that common SNPs with moderate or low risks, which are not well tagged by the lower density arrays (Illumina 550K and 610K arrays used in the previous GWAS), also account for some of the residual ovarian cancer risk; and to determine whether the pooled GWAS can be effectively carried out on DNA quantified by spectrophotometry, as opposed to Picogreen absorption, which we have used previously.

Materials and Methods

Ethics Statement

This study was conducted according to the principles expressed in the Declaration of Helsinki. The study was approved by the human ethics committee of Queensland Institute of Medical Research. All participants provided written informed consent.

Samples

We used samples from the Australian Ovarian Cancer Study (AOCS) for the pooled GWAS. AOCS ascertained ovarian cancer cases through the surgical treatment centers in Australia, and from the Cancer Registries of Queensland, South and Western Australia, New South Wales, and Victoria, while controls were population-based and drawn from the Commonwealth electoral roll (Burkey & Kanetsky, 2009). We selected 342 invasive serous cases and 643 controls for the pooled GWAS. All the study subjects self-reported as White of non-Hispanic origin. Age at diagnosis was recorded for cases and age at interview for controls. Detailed clinical information was also available for ovarian cancer patients, including primary site of tumor, stage and grade, and overall survival time. Most of the DNAs had been isolated using salt extraction (Chang et al., 2009), but a subset had been isolated with Qiagen columns, and so these DNAs were kept separate.

DNA concentration was measured by spectrophotometry using a Nanodrop before the pools were made, and the samples were adjusted through serial dilution to 48–52 ng/μL. To form each pool, 2 μL of each DNA sample was combined, and the final concentration was verified by Nanodrop. The salt-extracted DNAs from 303 invasive serous cases were further divided into tertiles according to their overall survival time. We made a separate pool of 39 DNAs from unselected cases that were isolated with Qiagen columns. The controls were randomly placed in seven pools, each with a size of 90–91 samples. We then matched each of the three salt-extracted case sets with two of the control sets. The smaller Qiagen-extracted DNA pool was matched with the remaining control pool. Thus, we had four comparisons of case versus control pools, where each individual case was matched with approximately two individual controls (Table 1).

TABLE 1 Design of Case-Control Pool Comparisons

^aWe stratified the cases in three large pools by overall survival time. A small subset of 39 case DNAs isolated by Qiagen columns was kept together in one pool.

We used samples from 12 sites in the Ovarian Cancer Association Consortium (OCAC) in the replication stage (Table 2). In all, we genotyped 13,779 samples, including 6,966 non-Hispanic White controls and 4,651 non-Hispanic White invasive cases, among which 2,245 were of serous histology.

TABLE 2 Summary of OCAC Samples Used for the Replication Study

^aCases eligible for secondary analysis in the replication.

^bCases eligible for primary analysis in the replication.

^cAOCS cases and controls included in the stage 1 pools were excluded from analysis in the replication study.

Genotyping and Quality Control

All the DNA pools were genotyped on Illumina Human 1M-Duo arrays using standard protocols. All pools were genotyped in triplicate, with the exception of one control pool, which was genotyped in quadruplicate. A number of quality control (QC) steps described elsewhere (Lu et al., 2010) were also applied here: (1) SNPs must have less than 10% negative intensity values on each pool; (2) the number of working probes for a SNP on each pool must be larger than 20; (3) the sum of raw red and green intensity values must be more than 1,200; (4) minor allele frequency (MAF) in the HapMap CEU samples must be over 5%; (5) the SNP must not show a significant variance difference between case and control pools. A number of additional checks were also applied. (6) The differential amplification parameter of the SNP must be between 1/3 and 3. 'Differential amplification' refers to the phenomenon in which the alleles at a locus are unequally amplified; in these cases the allele frequency estimates are biased because of the imbalanced raw intensity values. However, the differential amplification cancels out to a good approximation when we assess the allele frequency difference between case and control pools. We discarded SNPs with very extreme differential amplification (<1/3 or >3). This additional check is equivalent to discarding SNPs with estimated allele frequencies that are very different from the reference samples, for example, the HapMap CEU samples used here. (7) SNPs that passed quality control for more than two pool pairs out of four were kept, because in general the more working pool pairs there are, the more reliable the results. (8) For the SNPs of interest, the proxies (linkage disequilibrium (LD) r² > 0.7) must have association results similar to those of the underlying SNP.
We applied stringent quality controls to limit false positive results arising from the pooling design. After the full series of QC steps, 914,948 SNPs were retained.
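The threshold-based checks (1) to (4), (6) and (7) can be sketched as simple predicates. This is only an illustration and not the authors' pooling program: the class and field names (`PoolSnpMetrics`, `neg_frac`, `n_probes`, `intensity_sum`, `diff_amp`) are hypothetical, and only the numeric thresholds come from the text. Checks (5) and (8) are omitted because the text does not give enough detail to reproduce them.

```python
from dataclasses import dataclass


@dataclass
class PoolSnpMetrics:
    """Hypothetical per-SNP summary for one pool; field names are illustrative."""
    neg_frac: float       # fraction of negative intensity values on the pool
    n_probes: int         # number of working probes for the SNP on the pool
    intensity_sum: float  # sum of raw red and green intensity values


def passes_pool_qc(m: PoolSnpMetrics) -> bool:
    """Pool-level checks (1)-(3) with the thresholds quoted in the text."""
    return m.neg_frac < 0.10 and m.n_probes > 20 and m.intensity_sum > 1200


def passes_snp_qc(ceu_maf: float, diff_amp: float, pool_pair_ok: list) -> bool:
    """SNP-level checks (4), (6) and (7).

    pool_pair_ok holds one boolean per case/control pool pair (four here),
    saying whether the SNP passed QC in that pair.
    """
    maf_ok = ceu_maf > 0.05              # (4) HapMap CEU MAF over 5%
    amp_ok = 1 / 3 <= diff_amp <= 3      # (6) discard if <1/3 or >3
    pairs_ok = sum(pool_pair_ok) > 2     # (7) more than two working pairs
    return maf_ok and amp_ok and pairs_ok
```

Keeping the pool-level and SNP-level checks separate mirrors the way the text applies some filters per pool and others across pool pairs.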

Individual genotyping for 20 SNPs selected from the pooled GWAS was performed using MALDI-TOF spectrophotometric mass determination of allele-specific primer extension products using Sequenom's MassARRAY system and iPLEX technology (Sequenom Inc.). The design of oligonucleotides was carried out according to the guidelines of Sequenom and performed using MassARRAY Assay Design software (version 4.0). Multiplex PCR amplification of amplicons containing SNPs of interest was performed using the Qiagen HotStart Taq Polymerase on a Perkin Elmer GeneAmp 2400 thermal cycler with 5 ng of genomic DNA. Primer extension reactions were carried out according to the manufacturer's instructions for iPLEX chemistry. Assay data were analyzed using Sequenom TYPER software (version 3.4). SNPs were required to pass the following standard QC checks: (1) p-value for the Hardy–Weinberg equilibrium (HWE) test ≥0.05 in both cases and controls; (2) call rate >95%; (3) concordance >98% between duplicate pairs (at least 5% per study site). One SNP (rs12078260) failed the HWE test in controls and was excluded.
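As an illustration of QC check (1), the sketch below computes an HWE p-value from genotype counts with a 1-df chi-square goodness-of-fit test. The text does not say which HWE test was used (an exact test is a common alternative), so the choice of test and the function name `hwe_chisq_pvalue` are assumptions made for this example.

```python
import math


def hwe_chisq_pvalue(n_aa: int, n_ab: int, n_bb: int) -> float:
    """Chi-square (1 df) test of Hardy-Weinberg equilibrium from genotype counts."""
    n = n_aa + n_ab + n_bb
    p = (2 * n_aa + n_ab) / (2 * n)   # frequency of allele A
    q = 1.0 - p
    if p == 0.0 or q == 0.0:
        return 1.0                    # monomorphic SNP: nothing to test
    expected = (n * p * p, 2 * n * p * q, n * q * q)
    observed = (n_aa, n_ab, n_bb)
    chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
    # Upper tail of a chi-square with 1 df: P(X > x) = erfc(sqrt(x / 2))
    return math.erfc(math.sqrt(chi2 / 2))
```

Under the threshold quoted above, a SNP would be flagged when this p-value falls below 0.05 in either cases or controls.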

Statistical Methods and Analytic Tools

In the pooled GWAS, the allele frequencies at each locus were estimated from each pool, and then the differences in allele frequencies between each pair of case and control pools were assessed in the association test. Details of the pooling data analysis are described elsewhere (Lu et al., 2010). The four sets of association results from each pool pair were then meta-analyzed, where the allele frequency difference between each set of case and control pools was weighted by its inverse variance (binomial variances in case and control pools plus pooling error variances; Macgregor et al., 2006, 2008). A pooling program that incorporates the steps of estimating pooled allele frequency, mean normalization, quality control, and finally the association test taking into account pool-specific errors has been developed for the pooled GWAS. This program is available on request.
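The inverse-variance weighting described above can be sketched as follows. This is a minimal illustration rather than the authors' pooling program: the function names are hypothetical, the pooling error variance is treated as a single additive term supplied by the caller (the text does not specify how it is estimated), and the p-value uses a normal approximation.

```python
import math


def pool_pair_variance(p_case, n_case, p_ctrl, n_ctrl, var_pool_error=0.0):
    """Variance of the case-control allele frequency difference for one pool pair.

    Binomial sampling variance over 2N chromosomes in each pool, plus an
    assumed additive pooling error variance.
    """
    v_case = p_case * (1 - p_case) / (2 * n_case)
    v_ctrl = p_ctrl * (1 - p_ctrl) / (2 * n_ctrl)
    return v_case + v_ctrl + var_pool_error


def meta_analyze(diffs, variances):
    """Inverse-variance weighted mean of allele frequency differences.

    Returns (combined difference, standard error, two-sided p-value).
    """
    weights = [1.0 / v for v in variances]
    total = sum(weights)
    combined = sum(w * d for w, d in zip(weights, diffs)) / total
    se = math.sqrt(1.0 / total)
    z = combined / se
    p_value = math.erfc(abs(z) / math.sqrt(2))  # two-sided normal tail
    return combined, se, p_value
```

With four pool pairs, `diffs` and `variances` would each hold four values, one per case/control comparison.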

For individually genotyped data, the SNP association was assessed in a logistic regression model implemented in PLINK (Purcell et al., 2007). Assuming a log-additive model of inheritance, the per-allele risk was estimated by fitting the number of rare alleles as a continuous variable. We did not adjust for age in the individual genotyping (IG) validation, to allow for a direct comparison of pooled and individually genotyped results on the same AOCS samples. However, the age-adjusted results were similar (results not presented). In the replication stage, both age and study site were adjusted for in the logistic regression model.
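The log-additive model can be illustrated with a small unadjusted logistic regression fitted by Newton-Raphson. This is a sketch of the model form only, not a substitute for PLINK: the function name `fit_per_allele_logistic` is hypothetical, no covariates such as age or study site are included, and the fixed iteration count is an assumption that suits well-behaved data.

```python
import math


def fit_per_allele_logistic(dosages, status, n_iter=25):
    """Fit logit P(case) = b0 + b1 * dosage by Newton-Raphson.

    dosages: number of rare alleles (0, 1 or 2) per subject
    status:  1 for case, 0 for control
    Returns (per-allele odds ratio, standard error of b1).
    """
    b0 = b1 = 0.0
    for _ in range(n_iter):
        g0 = g1 = h00 = h01 = h11 = 0.0
        for x, y in zip(dosages, status):
            mu = 1.0 / (1.0 + math.exp(-(b0 + b1 * x)))
            w = mu * (1.0 - mu)
            g0 += y - mu            # score for the intercept
            g1 += (y - mu) * x      # score for the per-allele effect
            h00 += w                # information matrix entries
            h01 += w * x
            h11 += w * x * x
        det = h00 * h11 - h01 * h01
        # Newton step using the closed-form inverse of the 2x2 information matrix
        b0 += (h11 * g0 - h01 * g1) / det
        b1 += (h00 * g1 - h01 * g0) / det
    # SE from the information matrix evaluated at the last iterate before the
    # final update, which is effectively the converged estimate
    se_b1 = math.sqrt(h00 / det)
    return math.exp(b1), se_b1
```

Because dosage enters as a single continuous term, each additional rare allele multiplies the odds by the same factor, which is the per-allele odds ratio returned here.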

Results

In order to reduce heterogeneity, the majority of invasive serous cases from the AOCS included in the pooled GWAS had tumors that originated in the ovary (except for one case, whose tumor appeared to arise in the fallopian tubes), and were of high stage (>92% of cases with FIGO (International Federation of Gynaecology and Obstetrics) stage III or IV) and grade (>99% of cases with grade 2 or 3). Since age at diagnosis for cases is a predictor of overall survival time, age differences were observed in the comparison of cases with poor survival and controls, which were younger than these cases. A nominally significant difference in mean age was also found in the comparison of cases extracted using Qiagen columns and controls (Table 1). After carrying out extensive quality control, we tested association for 914,948 SNPs in each comparison of case-control pools (for details of pooling designs, see Samples in the Materials and Methods section), and meta-analyzed the four sets of genome-wide association results using a standard weighting method in order to maximize statistical power.

We chose 20 SNPs from the pooled GWAS for individual genotyping in the same AOCS samples as a validation of pooled results. These SNPs were among the top-ranked SNPs from the pooled GWAS that had evidence of association with ovarian cancer susceptibility, but none of these reached genome-wide significance. Moreover, these were selected for being in the subset of SNPs not well tagged by the Illumina 610K array, as one of our aims was to test the hypothesis that this pooled GWAS using denser SNP arrays could uncover additional risk SNPs not identified by the previous GWAS. These 20 SNPs were successfully genotyped for nearly all the AOCS samples included in the pooled GWAS (971 out of 985 pooled samples), but one SNP failed quality control. Table 3 compares the odds ratios (OR) and p-values from the pooled GWAS and IG validation results. Despite slight differences in samples, good concordance was observed in OR estimates, with all risk directions in agreement in both sets of results. For 15/19 SNPs, the putative associations found in the pooled GWAS were clearly validated in IG results. Therefore, by comparing the results from pooled genotyping and individual genotyping on the same set of samples, we showed that GWAS using pooled DNA, quantified by spectrophotometry, has the potential to estimate allele frequencies accurately and provide an efficient test of association.

TABLE 3 Comparison of Pooled GWAS and Individual Genotyping (IG) Validation Results for 19 SNPs in AOCS Samples

^aThe first allele was the risk allele of SNP.

^bThe table was sorted by the strength of association found in pooled GWAS (P_pool).

In addition, we sought independent replication for these 19 SNPs by individually genotyping a total of 13,779 samples collected from 12 study sites in OCAC (Table 2). Among 4,651 eligible White invasive cases of non-Hispanic origin, the majority (>95%) were classified as having primary tumor in the ovary, as opposed to the fallopian tube or peritoneum. Unlike the AOCS cases included in the pooled GWAS, the OCAC cases on the whole were evenly distributed over all tumor stages and grades (~54% in high stage and ~58% in high grade). Two sets of analyses were performed according to histology: In the primary analysis we restricted to White non-Hispanic cases with the serous subtype, which allowed for direct replication of SNPs found in the stage 1 GWAS on serous ovarian cancer cases; whereas in the secondary analysis we included cases with all histological subtypes to determine whether these SNPs show association with ovarian cancer regardless of histological types. The association results adjusted for age and study site are presented in Table 4. The results showed no replication for any of the 19 SNPs in the analyses restricted to serous cases only (primary analysis), or in the analyses combining all histological subtypes (secondary analysis).

TABLE 4 Replication Results of 19 SNPs from Pooled GWAS by Individual Genotyping of 13,779 OCAC Samples

^aSame risk alleles as listed in Table 3.

^bNumber of samples with non-missing genotypes for each SNP.

^cPrimary analysis was restricted to serous cases; secondary analysis including all histological subtypes.

Discussion

To date, one ovarian cancer GWAS has revealed several SNPs associated with susceptibility. None of the identified loci showed large effects (OR: 0.76–1.30 depending on the histological subtype), but the study was well powered to find common alleles with moderate effects (Song et al., 2009). In contrast, our pooled GWAS on serous ovarian cancer susceptibility was under-powered to detect alleles with moderate effects because of its small sample size. In our pooled GWAS, the published risk SNPs, rs3814113 at 9p22.2, rs2072590 at 2q31, and rs2665390 at 3q25, showed ORs similar to, and in the same direction as, those reported previously (Goode et al., 2010; Song et al., 2009), but these reached nominal or borderline significance only (Table 5). The other three SNPs (rs8170 and rs2363956 at 19p13, and rs10088218 at 8q24) identified by the published GWAS (Bolton et al., 2010; Goode et al., 2010) were not significantly associated with risk in our results, and SNP rs9303542 at 17q21 (Goode et al., 2010) was not on the Illumina Human 1M-Duo array (Table 5). We found a similar OR for rs6504172, which is in high linkage disequilibrium with rs9303542 (r² = 0.841), but this SNP was not significantly associated with risk (p = 0.49). We therefore found no support for our hypothesis that additional common SNPs represented on the 1M-Duo arrays contribute to ovarian cancer risk, probably because of insufficient power in stage 1 of this pooled GWAS.

TABLE 5 Pooled GWAS Results on the Published Loci Known to be Associated with Serous Ovarian Cancer Susceptibility

^aFor a direct comparison of results, reported per-allele ORs are the results from the published GWAS restricted to serous cases.

^brs9303542 was not on the Illumina Human 1M-Duo array, but rs6504172 is in high linkage disequilibrium with rs9303542 (r² = 0.841). It had a per-allele OR in the pooled GWAS of 1.10 (0.84–1.56) (p = 0.49).

In the pooling design, we divided serous ovarian cancer cases into four case pools according to the overall survival time and/or the method in which the DNAs were isolated. In theory it would be possible to test for association of SNPs with survival time by comparing good, medium, and poor survival pools, but we would have even less power to detect reliable association with survival, so these results are not presented.

Although we were under-powered to locate any common SNPs with weak effects, our study had the potential to identify common SNPs with moderate to large effects on ovarian cancer risk, if any exist. A notable example in cancer genetics is the common variant in KITLG with a per-allele risk of 2.5 for testicular cancer, which was identified from an initial GWAS in ~300 cases and ~900 controls (Kanetsky et al., 2009). This empirical example suggests that although most loci exhibit smaller effect sizes, common SNPs with moderate to large effects do exist in cancer genetics, and therefore it is of interest to test a similar hypothesis in different cancer types. GWAS genotyping on pooled DNA does not suffer from substantial power loss compared to a conventional study using individual genotyping. For example, in our pooled GWAS, assuming an additive-effect risk allele with 20% frequency that confers a relative risk of 2, power was 80% even after scaling down the original sample size by 10% in order to account for additional variance because of pooling errors (Macgregor et al., 2008), in comparison with 88% power using individual genotyping. An empirical study with examples of successful identification of known variants, including the eye color locus at OCA2/HERC2 (15q11.2-q12), the age-related macular degeneration locus at CFH (1q32), and the locus for pseudoexfoliation syndrome at LOXL1 (15q22), clearly showed that common alleles with large effects are not likely to be missed in a pooled GWAS (Craig et al., 2009).
Therefore, our results suggest that there are probably not hitherto poorly tagged common SNPs with moderate to large effects still to be identified.
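The kind of power comparison described above can be illustrated with a minimal sketch. It is not the calculation used in this study: it assumes a two-sided allelic test under a normal approximation, treats the relative risk of 2 as a per-allele odds ratio, reads "scaling the original sample size by 10%" as a 10% reduction in effective sample size, and uses a hypothetical significance threshold (`alpha`). The function names are illustrative only, and the output is not expected to reproduce the 80% and 88% figures exactly.

```python
from statistics import NormalDist


def case_allele_freq(control_freq: float, odds_ratio: float) -> float:
    """Risk-allele frequency in cases implied by a per-allele odds ratio."""
    return odds_ratio * control_freq / (1.0 - control_freq + odds_ratio * control_freq)


def allelic_test_power(n_cases, n_controls, control_freq, odds_ratio,
                       alpha, sample_scale=1.0):
    """Approximate power of a two-sided allelic test (normal approximation).

    sample_scale < 1 shrinks the effective sample size, used here as a crude
    stand-in for the extra variance introduced by pooling (an assumption).
    """
    nd = NormalDist()
    p0 = control_freq
    p1 = case_allele_freq(p0, odds_ratio)
    # Each individual contributes two alleles.
    alleles_cases = 2.0 * n_cases * sample_scale
    alleles_controls = 2.0 * n_controls * sample_scale
    se = (p1 * (1.0 - p1) / alleles_cases + p0 * (1.0 - p0) / alleles_controls) ** 0.5
    ncp = abs(p1 - p0) / se  # expected value of |Z| under the alternative
    z_crit = nd.inv_cdf(1.0 - alpha / 2.0)
    return nd.cdf(ncp - z_crit) + nd.cdf(-ncp - z_crit)


# Hypothetical threshold; the threshold used for the reported figures is not given here.
ALPHA = 5e-7
individual = allelic_test_power(342, 643, 0.20, 2.0, ALPHA)
pooled = allelic_test_power(342, 643, 0.20, 2.0, ALPHA, sample_scale=0.9)
print(f"individual genotyping: {individual:.2f}, pooled (scaled): {pooled:.2f}")
```

Under these assumptions the pooled design loses some power relative to individual genotyping, but the loss is modest for an allele of this frequency and effect size.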

This study also demonstrates that it is not always necessary to measure DNA concentration by Picogreen fluorescence prior to making DNA pools. At least for the set of DNAs we used, which were largely isolated by salt extraction, this study demonstrated highly consistent results between pooled genotyping and individual genotyping. However, it is worth noting that we have previously found that the correlation between the concentrations measured by Nanodrop spectroscopy and by Picogreen is high (r² = 0.5107) for a related set of 200 DNAs.

It should be noted that the lack of replication in this study was not due to problems in the DNA pooling method. Because additional errors, such as pool construction errors and pool measurement errors, could be involved (Sham et al., 2002), we implemented careful experiments and rigorous analysis to address this concern. First, we performed careful experiments to ensure that individual samples contributed equal quantities of DNA during the formation of the pools; second, all pools were genotyped at least three times to yield better allele frequency estimates, and we applied stringent quality controls to limit the number of possible false positives; last, we accounted for additional variance because of pooling errors in the association tests. We also validated the pooling results using individual genotyping. Given that most of the top 20 SNPs from the pooled GWAS were validated in the same samples by individual genotyping, the lack of replication is most likely due to the relatively small sample size in our stage 1 GWAS rather than to problems with the pooling approach.
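To make the idea of accounting for pooling error in an association test concrete, the following is a minimal sketch of one common form of such a test, in which a pooling-error variance term is added to the binomial sampling variance for each of the two pools. It is not necessarily the exact statistic used in this study; the function name, the replicate values, and the value of `pooling_var` are hypothetical.

```python
from statistics import NormalDist, mean


def pooled_association_test(case_reps, control_reps, n_cases, n_controls, pooling_var):
    """Z-test comparing pooled allele frequencies in cases and controls.

    case_reps / control_reps: allele-frequency estimates from replicate arrays.
    pooling_var: assumed variance of the pooling error per pool (hypothetical input;
    in practice it would be estimated from the data).
    """
    p_case = mean(case_reps)
    p_control = mean(control_reps)
    # Binomial sampling variance: each individual contributes two alleles.
    sampling_var = (p_case * (1.0 - p_case) / (2.0 * n_cases)
                    + p_control * (1.0 - p_control) / (2.0 * n_controls))
    # Pooling error is counted once for the case pool and once for the control pool.
    total_var = sampling_var + 2.0 * pooling_var
    z = (p_case - p_control) / total_var ** 0.5
    p_value = 2.0 * NormalDist().cdf(-abs(z))
    return z, p_value


# Illustrative replicate estimates only; these are not data from the study.
cases = [0.31, 0.33, 0.32]
controls = [0.25, 0.26, 0.24]
print(pooled_association_test(cases, controls, 342, 643, pooling_var=0.0))
print(pooled_association_test(cases, controls, 342, 643, pooling_var=1e-4))
```

Ignoring the pooling term (`pooling_var=0.0`) inflates the test statistic, which is why a test that omits it would be expected to produce more false positives.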

In order to improve power for GWAS using pooled DNA, larger sample sizes and higher density micro-arrays are required. However, to properly accommodate a large sample size, a balance between statistical power and the accuracy of the allele frequency estimates is needed. A number of empirical studies have investigated the impact of pool size (up to 1,000 samples in the pool) on the accuracy of allele frequency estimates, and usually found no obvious relationship between pool size and the accuracy of allele frequency estimation (Jawaid & Sham, 2009; Le Hellard et al., 2002; Macgregor, 2007). As indicated in Macgregor (2007), most variation from pooled DNA genotyping is attributable to array error rather than pool construction error. Therefore, constructing large pools is not likely to yield a great loss of power. An optimal pool design for a limited research budget will be a few large pools, each of which is then genotyped multiple times. One major criticism of the pooled GWAS is that it provides no information on individual genotypes or linkage disequilibrium, so it is generally impossible to impute missing genotypes, evaluate haplotypes, or fine map the regions of interest. However, given the cost advantage, more expensive SNP arrays with better coverage can be used to partially compensate for the power loss because of imperfect linkage disequilibrium between genetic markers and causal variants. Furthermore, fine mapping of loci identified by GWAS is usually performed in a stage 2 or 3 of genotyping once the loci have been confirmed in additional samples. Here we have investigated the use of dense Illumina 1M-Duo arrays in locating variants that were poorly tagged by previous arrays. We found that moderate to large effects on ovarian cancer risk are unlikely to exist among the SNPs on this array, but we are not able to make a clear statement about the possible existence of additional SNPs with small effects because of limited study power.
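The reasoning behind genotyping a few large pools multiple times can be sketched with a simple variance decomposition. The sketch assumes that sampling, pool construction, and array measurement errors are independent, that construction error does not depend on pool size, and that array error averages down over replicate arrays. The function name and all variance values are hypothetical and chosen only for illustration.

```python
def pool_estimate_variance(freq, pool_size, n_arrays, construction_var, array_var):
    """Variance of a pooled allele-frequency estimate averaged over replicate arrays."""
    if pool_size <= 0 or n_arrays <= 0:
        raise ValueError("pool_size and n_arrays must be positive")
    # Binomial sampling of 2 * pool_size alleles.
    sampling_var = freq * (1.0 - freq) / (2.0 * pool_size)
    # Construction error is incurred once per pool; array error shrinks with replication.
    return sampling_var + construction_var + array_var / n_arrays


# Hypothetical error variances, with array error dominating construction error.
for n_arrays in (1, 3, 6):
    var = pool_estimate_variance(0.20, 342, n_arrays,
                                 construction_var=1e-5, array_var=2e-4)
    print(n_arrays, f"{var ** 0.5:.4f}")
```

When array error dominates, as in this illustration, adding replicate arrays reduces the standard error of the estimate far more than refining pool construction would.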

In summary, we have carried out a pooled GWAS on 342 invasive serous cases and 643 controls. The accuracy of the estimated odds ratios was then validated by individually genotyping the same subjects that were included in the pools. We showed that pooled genotyping using DNAs quantified by Nanodrop spectroscopy, together with analytical tools for the pooled data, works well in terms of achieving accurate OR estimates and providing reasonable association signals. We therefore propose using pooled GWAS for less common subtypes of cancer or orphan diseases where research funds are limited. In addition, we have developed an analytical tool for analyzing pooled GWAS data, which will be available on request.

Acknowledgments

The Australian Ovarian Cancer Study (AOCS) Management Group (D. Bowtell, G. Chenevix-Trench, A. deFazio, D. Gertig, A. Green, and P.M. Webb) gratefully acknowledges the contribution of all clinical and scientific collaborators (see http://www.aocstudy.org/). The Australian Cancer Study Management Group (A. Green, P. Parsons, N. Hayward, P.M. Webb, and D. Whiteman) thanks all of the project staff, collaborating institutions, and study participants. NJ Ovarian Cancer Study (NJO) investigators (EV Bandera, SH Olson, I Orlow, L Rodriguez-Rodriguez) would like to thank Melony Williams-King and the staff at the New Jersey State Cancer Registry, in particular, Lisa Paddock. BELOCS investigators would like to thank Gilian Peuteman, Thomas Van Brussel, and Dominiek Smeets for technical assistance. SEARCH investigators thank study participants, the Eastern Cancer Registration and Information Centre, the collaborating general practitioners, the SEARCH team for patient recruitment, and Caroline Baynes, Don Conroy, and Craig Luccarini for sample preparation. The German Ovarian Cancer Study (GER) thanks Ursula Eilber and Tanja Koehler for competent technical assistance. The Hannover–Jena Ovarian Cancer Study (HJO) gratefully acknowledges the contribution of our clinical collaborators Frauke Kramer, Wen Zheng, Peter Hillemanns, and Ingo Runnebaum. HAWAII investigators thank all study participants and members of research teams of all participating studies, including research nurses, research scientists, data entry personnel, and consultant gynecological oncologists. Grant Support: Y. Lu is partly supported by Australian National Health and Medical Research Council (NHMRC) grant 496675. S. Macgregor is supported by a Career Development Award from the NHMRC. NJO was funded by grants from the US National Cancer Institute (K07CA095666, R01CA83918, and K22CA138563) and the Cancer Institute of New Jersey. BELOCS was funded by the Nationaal Kankerplan initiatief 29. 
SEARCH is funded by programme grants from Cancer Research UK (A490/10119 and A490/10124). The German Ovarian Cancer Study (GER) was supported by the German Federal Ministry of Education and Research, Programme of Clinical Biomedical Research (01 GB 9401); genotyping in part by the state of Baden-Württemberg through the Medical Faculty, University of Ulm (P.685); and data management by the German Cancer Research Center. HAWAII is funded by the US National Institutes of Health (R01 CA58598, N01-CN-55424, N01-PC-67001, and N01-PC-95001–20). UKO is funded by Cancer Research UK, the Eve Appeal, the OAK Foundation, and the UK Department of Health's NIHR UCL/UCLH Biomedical Research Centre funding scheme.

References

Bolton, K. L., Tyrer, J., Song, H., Ramus, S. J., Notaridou, M., Jones, C., & Gayther, S. A. (2010). Common variants at 19p13 are associated with susceptibility to ovarian cancer. Nature Genetics, 42, 880–884.

Burkey, A. R., & Kanetsky, P. A. (2009). Development of a novel location-based assessment of sensory symptoms in cancer patients: Preliminary reliability and validity assessment. Journal of Pain and Symptom Management, 37, 848–862.

Chang, Y. M., Newton-Bishop, J. A., Bishop, D. T., Armstrong, B. K., Bataille, V., Bergman, W., Berwick, M., Bracci, P. M., Elwood, J. M., Ernstoff, M. S., Green, A. C., Gruis, N. A., Holly, E. A., Ingvar, C., Kanetsky, P. A., Karagas, M. R., Le Marchand, L., Mackie, R. M., Olsson, H., Østerlind, A., Rebbeck, T. R., Reich, K., Sasieni, P., Siskind, V., Swerdlow, A. J., Titus-Ernstoff, L., Zens, M. S., Ziegler, A., & Barrett, J. H. (2009). A pooled analysis of melanocytic nevus phenotype and the risk of cutaneous melanoma at different latitudes. International Journal of Cancer, 124, 420–428.

Craig, J. E., Hewitt, A. W., McMellon, A. E., Henders, A. K., Ma, L. J., Wallace, L., Sharma, S., Burdon, K. P., Visscher, P. M., Montgomery, G. W., & MacGregor, S. (2009). Rapid inexpensive genome-wide association using pooled whole blood. Genome Research, 19, 2075–2080.

Fletcher, O., & Houlston, R. S. (2010). Architecture of inherited susceptibility to common cancer. Nature Reviews Cancer, 10, 353–361.

Goode, E. L., Chenevix-Trench, G., Song, H., Ramus, S. J., Notaridou, M., Lawrenson, K., & Pharoah, P. D. (2010). A genome-wide association study identifies susceptibility loci for ovarian cancer at 2q31 and 8q24. Nature Genetics, 42, 874–879.

Jawaid, A., & Sham, P. (2009). Impact and quantification of the sources of error in DNA pooling designs. Annals of Human Genetics, 73, 118–124.

Johnatty, S. E., Beesley, J., Chen, X., Macgregor, S., Duffy, D. L., Spurdle, A. B., & Chenevix-Trench, G. (2010). Evaluation of candidate stromal epithelial cross-talk genes identifies association between risk of serous ovarian cancer and TERT, a cancer susceptibility “hot-spot.” PLoS Genetics, 6, e1001016.

Kanetsky, P. A., Mitra, N., Vardhanabhuti, S., Li, M., Vaughn, D. J., Letrero, R., Ciosek, S. L., Doody, D. R., Smith, L. M., Weaver, J., Albano, A., Chen, C., Starr, J. R., Rader, D. J., Godwin, A. K., Reilly, M. P., Hakonarson, H., Schwartz, S. M., & Nathanson, K. L. (2009). Common variation in KITLG and at 5q31.3 predisposes to testicular germ cell cancer. Nature Genetics, 41, 811–815.

Le Hellard, S., Ballereau, S. J., Visscher, P. M., Torrance, H. S., Pinson, J., Morris, S. W., Thomson, M. L., Semple, C. A., Muir, W. J., Blackwood, D. H., Porteous, D. J., & Evans, K. L. (2002). SNP genotyping on pooled DNAs: Comparison of genotyping technologies and a semi-automated method for data storage and analysis. Nucleic Acids Research, 30, e74.

Lichtenstein, P., Holm, N. V., Verkasalo, P. K., Iliadou, A., Kaprio, J., Koskenvuo, M., Pukkala, E., Skytthe, A., & Hemminki, K. (2000). Environmental and heritable factors in the causation of cancer: Analyses of cohorts of twins from Sweden, Denmark, and Finland. New England Journal of Medicine, 343, 78–85.

Lu, Y., Dimasi, D. P., Hysi, P. G., Hewitt, A. W., Burdon, K. P., Toh, T., Ruddle, J. B., Li, Y. J., Mitchell, P., Healey, P. R., Montgomery, G. W., Hansell, N., Spector, T. D., Martin, N. G., Young, T. L., Hammond, C. J., Macgregor, S., Craig, J. E., & Mackey, D. A. (2010). Common genetic variants near the Brittle Cornea Syndrome locus ZNF469 influence the blinding disease risk factor central corneal thickness. PLoS Genetics, 6, e1000947.

Macgregor, S. (2007). Most pooling variation in array-based DNA pooling is attributable to array error rather than pool construction error. European Journal of Human Genetics, 15, 501–504.

Macgregor, S., Visscher, P. M., & Montgomery, G. (2006). Analysis of pooled DNA samples on high density arrays without prior knowledge of differential hybridization rates. Nucleic Acids Research, 34, e55.

Macgregor, S., Zhao, Z. Z., Henders, A., Martin, N. G., Montgomery, G. W., & Visscher, P. M. (2008). Highly cost-efficient genome-wide association studies using DNA pools and dense SNP arrays. Nucleic Acids Research, 36, e35.

Norton, N., Williams, N. M., O'Donovan, M. C., & Owen, M. J. (2004). DNA pooling as a tool for large-scale association studies in complex traits. Annals of Medicine, 36, 146–152.

Purcell, S., Neale, B., Todd-Brown, K., Thomas, L., Ferreira, M. A., Bender, D., Maller, J., Sklar, P., de Bakker, P. I., Daly, M. J., & Sham, P. C. (2007). PLINK: A tool set for whole-genome association and population-based linkage analyses. American Journal of Human Genetics, 81, 559–575.

Sham, P., Bader, J. S., Craig, I., O'Donovan, M., & Owen, M. (2002). DNA pooling: A tool for large-scale association studies. Nature Reviews Genetics, 3, 862–871.

Song, H., Ramus, S. J., Tyrer, J., Bolton, K. L., Gentry-Maharaj, A., Wozniak, E., & Gayther, S. A. (2009). A genome-wide association study identifies a new ovarian cancer susceptibility locus on 9p22.2. Nature Genetics, 41, 996–1000.

Varghese, J. S., & Easton, D. F. (2010). Genome-wide association studies in common cancers – what have we learnt? Current Opinion in Genetics and Development, 20, 201–209.

Visscher, P. M., & Le Hellard, S. (2003). Simple method to analyze SNP-based association studies using DNA pools. Genetic Epidemiology, 24, 291–296.

TABLE 1 Design of Case-Control Pool Comparisons

TABLE 2 Summary of OCAC Samples Used for the Replication Study

TABLE 3 Comparison of Pooled GWAS and Individual Genotyping (IG) Validation Results for 19 SNPs in AOCS Samples

TABLE 4 Replication Results of 19 SNP from Pooled GWAS by Individual Genotyping of 13,779 OCAC Samples

TABLE 5 Pooled GWAS Results on the Published Loci Known to be Associated with Serous Ovarian Cancer Susceptibility
