Genetic and environmental factors contribute to the etiology of cancer. Joint estimates of the heritability of the most common cancers based on the large twin cohorts in Denmark, Finland and Sweden were first published in 2000 (Lichtenstein et al., 2000). We expanded this analysis with the addition of data from Norway, extended the follow-up time, improved statistical methodology and tripled the number of twins with cancer. Hence, a sufficient number of cases had occurred to provide more accurate estimates for common cancers and reliable estimates for less common cancers (Mucci et al., 2016). The purpose of this paper is twofold. First, it describes the twin registers in the four countries participating in the Nordic Twin Study of Cancer (NorTwinCan) as a general reference for specific studies based on these registers. Second, it assesses whether the cancer mortality and incidence rates among individuals in the Nordic twin registers are similar to those in the general population.
While more than 2000 articles have been published based on these four Nordic twin registers during the past 30 years (PubMed search November 2016), the pattern of cancer risk among twins compared to the population has not been systematically analyzed. For individual cancers, we have explored cancer risk among twins for prostate, breast, colorectal and lung cancer (Hjelmborg et al., 2014, 2017; Mucci et al., 2016). This article, however, includes systematic descriptions of the participating twin registers: background, organization, size, years of record collection and administrative aspects that determine internal and external validity. We assessed whether twins have the same mortality profile as the general population to reveal possible selection issues of those included in the twin registers. Infant mortality rates are higher among twins than singletons (Farooqui et al., 1973; Keith, 1994), but from childhood onwards, there is no reason to assume differential mortality between twins and nontwins (Christensen et al., 1995; Kaprio, 2013).
To assess our a priori hypothesis that the rates of cancer mortality and cancer incidence among twins do not differ from those in the general population, we conducted comprehensive analyses of standardized mortality ratios (SMR) for overall mortality across country, age, period, follow-up years and of overall cancer and site-specific standardized incidence ratio (SIR) analyses across country, period, age, sex and zygosity.
Methods
Participating Twin Registers
The NorTwinCan network consists of twin registers in Denmark, Finland, Norway and Sweden (Table 1). These registers are independent research entities but committed to collaborative studies by working to harmonize policies for quality assurance, logistics and study designs as well as for permission and terms of collaboration. Since the mid-20th century, every citizen in the Nordic countries has been assigned an individually unique personal identity code (PIC), which is used in population registration and throughout the national health-care systems. These codes enable complete follow-up through record linkage to ascertain vital status, addresses, health outcomes, migration and death. Research projects using the twin registers need appropriate permissions from the national data protection authorities and regional ethics committees, and from the boards of the twin registers.
Note: MZ = monozygotic twins, DZ = dizygotic twins and UZ = unknown zygosity.
In all four of the Nordic twin registries, zygosity classifications are based on questionnaire methodology relying on responses from the same-sex twins to items about their similarity (Cederlöf et al., 1961; Sarna et al., 1978). This method classifies zygosity correctly in more than 95% of the same-sex twin-pairs when compared to zygosity determination based on genetic markers (Christiansen et al., 2003; Harris et al., 2006; Sarna et al., 1978). Zygosity assessment by genetic markers has been increasingly used due to lower cost, and by 2016, zygosity had been defined by genetic markers for more than 10,000 twin-pairs.
Denmark
Established in 1954, the Danish Twin Registry is the oldest population-based twin register covering twins born since 1870. The twin cohorts were ascertained in four waves using different methods as previously described in detail (Skytthe et al., 2002, 2011). At present, approximately 175,000 twin individuals have been included in the entire twin registry, although the NorTwinCan study included a subset. Vital status and emigration status are obtained through yearly linkage to the national civil registration system. In the current study, twin-pairs with both twins alive on January 1, 1943, or born thereafter, are included.
Finland
The Finnish Twin Cohort Study was initiated in the early 1970s. Twins were ascertained in 1974 based on selection from the Central Population Register of all pairs of persons born on the same date, of the same sex, in the same parish and with the same surname at birth. The selection was restricted to persons born before 1958. A questionnaire was mailed to all potential twins aged 18 years or more in pairs with both twins alive in 1975, identifying more than 16,000 like-sexed twin-pairs, of which zygosity was known at baseline for 13,888, based on responses to questionnaires mailed in 1975. Not all pairs could be classified reliably as monozygotic (MZ) or dizygotic (DZ), so twins of unknown zygosity may be due to this or to lack of response (Sarna et al., 1978). Persons who were not biological twins but satisfied the selection criteria were excluded based on the questionnaire response or after enquiries to local parishes, as previously described in detail (Kaprio & Koskenvuo, 2002; Kaprio et al., 1978). The Finnish Twin Cohort has been repeatedly linked with the Central Population Register to obtain data on death and emigration. In this study, only same-sex twin-pairs born 1890–1957 are included. In the late 1990s, the twin cohort was expanded to include also opposite-sex twins born between 1940 and 1957; however, comprehensive inclusion of opposite-sex pairs was possible for pairs born 1950–1957 only.
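The register-based selection rule described above (same birth date, sex, parish and surname at birth) can be illustrated with a minimal sketch. The record layout, field names and example data are hypothetical and are not taken from the Central Population Register; the sketch only produces candidate sets, which in the actual study were then verified by questionnaire and parish enquiries.

```python
from collections import defaultdict

def candidate_twin_sets(persons):
    """Group records that share birth date, sex, parish and surname at birth.

    `persons` is a list of dicts with hypothetical field names; the real
    register layout is not described in the text.
    """
    groups = defaultdict(list)
    for p in persons:
        key = (p["birth_date"], p["sex"], p["parish"], p["surname_at_birth"])
        groups[key].append(p["id"])
    # Only groups with at least two members are candidate twin sets.
    return [sorted(ids) for ids in groups.values() if len(ids) >= 2]

# Invented example records (not real data).
records = [
    {"id": "a", "birth_date": "1950-03-01", "sex": "F", "parish": "X", "surname_at_birth": "Virtanen"},
    {"id": "b", "birth_date": "1950-03-01", "sex": "F", "parish": "X", "surname_at_birth": "Virtanen"},
    {"id": "c", "birth_date": "1950-03-01", "sex": "M", "parish": "X", "surname_at_birth": "Virtanen"},
    {"id": "d", "birth_date": "1951-07-15", "sex": "M", "parish": "Y", "surname_at_birth": "Korhonen"},
]

candidates = candidate_twin_sets(records)
```

Because unrelated persons can satisfy the same criteria by chance, a rule of this kind yields candidates only, which is why the text describes a subsequent exclusion step.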
Norway
The Norwegian Twin Registry was established in 2009 as a merger of three Twin Panels, covering, respectively, birth years 1895–1945, 1915–1960, and 1967–1979, as described in detail elsewhere (Bergem, 2002; Harris et al., 2002, 2006; Nilsen et al., 2012). Twins born before 1960 had to be alive in 1960 for assignment of national PIC introduced in 1964, based on the 1960 national census. Twins born after 1967 were registered in the medical birth registry, complete from 1967 onwards. The registry has information on 48,008 twins, of whom 31,362 have provided consent. As linkage to Norwegian registers, including the Cancer Registry, requires consent from twins, twin-pairs where one or both twins are nonconsenting twins have been excluded in this study. Data from the Norwegian Twin Registry have been matched with information from the National Cause of Death Registry (with complete data in electronic format from 1951 onwards).
Sweden
Compiled in several waves, the initial Swedish twin cohort comprised twins born 1886–1925 who were identified by investigators from local parish registers, beginning in the 1960s. In 1961, a questionnaire was sent to all same-sex twin-pairs with both twins alive and living in Sweden. If both twins in a pair responded, the twin-pair was included in the register. Information about opposite-sex twin-pairs from these cohorts was added subsequently in the late 1990s. Twins born 1926–1958 were identified in 1970 from national birth registers, and a questionnaire was sent to all pairs alive and living in Sweden in 1973. Younger twin cohorts have only recently been contacted as part of different studies, in which zygosity was assessed for same-sex twins (Magnusson et al., 2013). Data on death and emigration have been obtained on a regular basis through linkage to the population register and the Swedish Mortality Register.
Study Sample
For the comparisons of SIR and SMR, we restricted these analyses to same-sex twins for several reasons. First, only opposite-sex twins born 1950–1957 would be comprehensively available from Finland. Second, no opposite-sex twins from the birth cohorts 1911–1930 are registered in the Danish Twin Registry. Third, only deceased opposite-sex twins from Norway were present from birth cohorts before 1960, and among younger birth cohorts, opposite-sex twins were included during 1967–1976. Fourth, the Swedish opposite-sex twins born before 1926 were incompletely represented. Opposite-sex pairs studied by Ahrenfeldt et al. (2015) showed no evidence for differences in cancer risk between opposite-sex dizygotic (OSDZ) and same-sex dizygotic (SSDZ) pairs.
Cancer Incidence
The twin data were linked to the national cancer registries in each country to identify twins with one or more cancer diagnoses since enrolment in the twin register. The linkages were conducted in 2011–2012 when cancer registration was complete through 2010 for Finland, 2009 for Denmark and Sweden and 2008 for Norway. Updated linkages currently underway will substantially increase the number of cancer cases. The Danish Cancer Registry holds information on tumors diagnosed since 1943 (Gjerstorff, 2011), the Finnish and Norwegian Cancer Registries since 1953 (Pukkala et al., 2018; Teppo et al., 1994) and the Swedish Cancer Registry since 1958. Cancer registration is virtually complete, with more than 98% of all known tumors included in the register in Denmark (Storm et al., 1997), Finland (Teppo et al., 1994) and Norway (Larsen et al., 2009). In Sweden, cancer cases are not traced via information in death certificates, which has caused incompleteness in about 4% of all cancer sites and a much higher percentage of incompleteness of some cancer types with short survival (Mattsson & Wallgren, 1984).
Follow-Up
The twin registers are also followed for vital status and emigration through national registers in each country on causes of death and the central population registers for vital status. To calculate person-years at risk of death and incident cancer, follow-up started at various dates depending on the methods of ascertainment in each of the four cohorts (Supplemental Table S1). Follow-up ended at death, at date of emigration or at the common closing date of follow-up (31 December 2008 for Norway and Sweden, 31 December 2009 for Denmark and 31 December 2010 for Finland).
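The follow-up rule above (exit at death, emigration or the country-specific closing date, whichever comes first) can be sketched as follows. The closing dates are those stated in the text; the function name, its arguments and the use of 365.25 days per year are assumptions made for illustration, and the entry dates in Supplemental Table S1 are not reproduced here.

```python
from datetime import date

# Common closing dates of follow-up, as stated in the text.
CLOSING_DATE = {
    "Norway": date(2008, 12, 31),
    "Sweden": date(2008, 12, 31),
    "Denmark": date(2009, 12, 31),
    "Finland": date(2010, 12, 31),
}

def person_years(country, start, death=None, emigration=None):
    """Person-years from `start` to the earliest of death, emigration, closing date.

    Hypothetical helper: `start` stands in for the cohort-specific entry date.
    """
    exits = [CLOSING_DATE[country]]
    if death is not None:
        exits.append(death)
    if emigration is not None:
        exits.append(emigration)
    end = min(exits)
    days = max((end - start).days, 0)  # no negative time at risk
    return days / 365.25  # assumed average year length

# A Finnish twin entering in 1976 who died in 1986 contributes about 10 years.
py_example = person_years("Finland", date(1976, 1, 1), death=date(1986, 1, 1))
```

A twin who emigrated before the entry date contributes zero person-years under this sketch, which corresponds to being excluded from the time at risk.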
Statistical Methods
To assess whether mortality and cancer incidence in the population-based twin cohorts are representative of their respective background populations, the numbers of observed deaths, cancer cases and person-years at risk were counted for 5-year calendar periods, by sex, and 5-year age groupings.
The SMR for overall mortality was defined as the ratio of the observed to expected number of deaths. The expected numbers of deaths were calculated by multiplying the number of person-years in each stratum by the corresponding reference mortality rate downloaded from the Human Mortality Database (www.mortality.org).
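The indirect standardization described above can be sketched in a few lines. Strata are keyed by sex, 5-year age group and 5-year calendar period as in the text; the person-years, reference rates and observed count below are invented for illustration and are not taken from the study or from the Human Mortality Database.

```python
def expected_count(person_years_by_stratum, reference_rates):
    """Sum over strata of person-years times the reference rate for that stratum."""
    return sum(py * reference_rates[stratum]
               for stratum, py in person_years_by_stratum.items())

def standardized_ratio(observed, expected):
    """Observed / expected; gives the SMR for deaths or the SIR for cancer cases."""
    return observed / expected

# Hypothetical strata: (sex, age group, calendar period).
person_years_by_stratum = {
    ("M", "60-64", "1990-1994"): 1000.0,
    ("M", "65-69", "1990-1994"): 2000.0,
}
reference_rates = {  # deaths per person-year, invented values
    ("M", "60-64", "1990-1994"): 0.01,
    ("M", "65-69", "1990-1994"): 0.02,
}

expected_deaths = expected_count(person_years_by_stratum, reference_rates)
smr = standardized_ratio(45, expected_deaths)
```

The same two functions apply unchanged to the SIR when the reference rates are cancer incidence rates and the observed count is the number of incident cases.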
Due to different cancer coding schemes in the four national cancer registers, we used the NORDCAN grouping of cancer diagnoses into 40 cancer sites to compare incidences across the four countries over time (Engholm et al., 2010).
The SIR was defined as the ratio of the observed to expected number of cancer cases. The expected numbers of cases for total cancer and for the specific cancer types were calculated by multiplying the number of person-years in each stratum by the corresponding cancer incidence rate in the national population obtained from the NORDCAN database. The NORDCAN database incidences are not adjusted for competing risk of death from other causes and hence underestimate the true incidence in each stratum. Analyses adjusting for competing risks of death have been implemented in other papers from NorTwinCan. For the 95% confidence intervals (CI) of the SMR and SIR estimates, it was assumed that the number of observed cases followed a Poisson distribution.
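A confidence interval for an observed/expected ratio under the Poisson assumption can be sketched as below. The text does not state which interval method was used; this example uses Byar's approximation to the exact Poisson limits as one common choice, and the counts are hypothetical.

```python
from math import sqrt
from statistics import NormalDist

def poisson_ratio_ci(observed, expected, level=0.95):
    """Approximate CI for observed/expected, treating `observed` as Poisson.

    Uses Byar's approximation (an assumption; the paper's exact method is
    not specified). `expected` is treated as a fixed constant.
    """
    z = NormalDist().inv_cdf(0.5 + level / 2)
    if observed == 0:
        lower = 0.0
    else:
        lower = observed * (1 - 1 / (9 * observed) - z / (3 * sqrt(observed))) ** 3
    o1 = observed + 1
    upper = o1 * (1 - 1 / (9 * o1) + z / (3 * sqrt(o1))) ** 3
    return lower / expected, upper / expected

# Hypothetical example: 100 observed cases against 100 expected.
ci_low, ci_high = poisson_ratio_ci(100, 100.0)
```

With large counts, as for overall cancer in this study, the interval becomes narrow, which is why ratios only a few percent below 1 can still differ significantly from unity.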
Results
More than 260,000 persons from same-sex twin-pairs in the Nordic twin registers were included and the accumulated number of person-years was 6.65 million (Table 1). The mean length of follow-up was 25.5 years. Zygosity was known for 78% of twin-pairs; the main reason for missing data was the unavailability of the twins (deaths or missing address data) or lack of response to questionnaires. Of those with known zygosity, 39% were MZ twins. The birth year distribution of the twins in each national twin cohort is described in Supplemental Figure S1.
Overall Mortality
The SMR during a twin’s first year of life was 4.44 (95% CI [4.23, 4.67]), reflecting the high-risk nature of twin pregnancies. The SMR estimates, excluding the first year of life, were between 0.93 and 0.99 in all countries (Table 2). The SMR for all countries combined varied from 0.89 in MZ twins to 1.28 in the twins with unknown zygosity (Table 3), with similar results in each of the four countries. The higher SMRs in twins with unknown zygosity reflect the higher rates of substance use, smoking and psychiatric problems among nonresponders to health surveys.
Note: First year of life is excluded.
Note: First year of life is excluded.
MZ = monozygotic twins, DZ = dizygotic twins and UZ = unknown zygosity.
Supplemental Figure S2 shows the distributions for SMR by age, birth year, period and follow-up time. Danish twins’ mortality was similar to the general population during the whole period. The age pattern shows only minor deviations from the expected values of SMR, and this also applies to the birth cohort pattern.
Mortality in the Finnish twin cohort is also similar to the general population except for a slightly lower mortality at the beginning of the period. This pattern is found in all cohorts, especially among twins with known zygosity (Supplemental Figure S3). The longer the time since follow-up began, the closer the SMR comes to 1, indicating that left-truncation or the selection of both twins in a pair being alive at initiation of follow-up may entail oversampling of individuals with a lower mortality rate.
Mortality is lower among the Swedish twin cohorts by age, birth year and period for most twin categories, especially for the twins with known zygosity.
The mortality pattern deviates more from that in the general population in the Norwegian and Swedish than in the other Nordic twin cohorts. The SMR values among the older Norwegian twins are close to 1, but are increasingly lower for the younger cohorts. Low SMR estimates are also observed in the beginning of the period most likely due to the ‘healthy worker effect’ when both twins in a pair had to be alive for ascertainment at a specific date. Cohorts born before 1945 and with unknown zygosity are poorly represented in the data set; cohorts born between 1920 and 1945 with unknown zygosity do indeed only include twins who have died. Also, left-truncation or the selection of both twins in a pair being alive at initiation of follow-up may be influential.
Cancer Incidence
The number of cancer cases diagnosed in the study period exceeds 30,000. The observed numbers — excluding nonmelanoma skin cancer — among both men and women in all twin registers were slightly lower than those expected based on the national cancer incidence rates, yielding SIRs 0.97 (95% CI [0.96, 0.99]) in men and 0.96 (95% CI [0.94, 0.97]) in women (Table 4). The SIR estimates were below 1.0 in all countries for both sexes, except among Finnish and Norwegian men where the estimated SIR was 1.01 and 1.02, respectively, and not significantly different to unity. Overall cancer incidence was similar for men and women in three of the four countries; the exception was Norway, where a marked difference was observed for the women (SIR 0.92; 95% CI [0.87, 0.96]). The overall SIR for cancer incidence was similar for MZ and DZ twins in each country (Table 5).
The SIR estimates for the majority of site-specific cancers were not significantly different to 1.00 (Table 6). Only male lip, prostate and testis cancer showed significantly, albeit slightly, higher estimates of SIR than 1.00, while no sites among women showed significantly elevated estimates. Among men and women, 3 and 7 of 40 sites, respectively, showed SIR estimates significantly lower than 1. No adjustment for multiple testing is made here, but 13 of the 80 comparisons are significant at the p < .05 level, compared to 4 expected.
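The arithmetic behind the "4 expected" figure, and a rough sense of how unusual 13 significant results would be by chance, can be sketched as follows. The binomial calculation assumes the 80 comparisons are independent, which is an assumption of this sketch rather than a claim in the text; site-specific SIRs in the same cohort are unlikely to be fully independent.

```python
from math import comb

def binomial_upper_tail(n, p, k):
    """P(X >= k) for X ~ Binomial(n, p)."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k, n + 1))

n_tests = 80   # 40 sites in each of two sexes
alpha = 0.05

expected_significant = n_tests * alpha                  # 80 * 0.05 = 4
tail_prob = binomial_upper_tail(n_tests, alpha, 13)     # chance of 13 or more
```

Under the independence assumption the tail probability is small, which is consistent with the text's reading that the excess of significant results is not explained by multiple testing alone.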
Among the major cancer sites with at least 100 cases, the lowest SIR estimates in both sexes were observed for kidney cancer, with SIRs of 0.82 in men and 0.83 in women (Table 6). Significantly lower SIR estimates were also seen for colon cancer: SIR values were 0.90 for men and 0.87 for women, and for lung cancer 0.88 and 0.95, respectively. Differences emerged when comparing the twin cohorts from each country, especially among men (data not shown). SIR estimates were significantly greater than 1.0 among the Norwegian men for cancer of the pharynx, stomach, larynx, testis and bone, but none were below 1.0 in this group.
Note: Expected numbers based on national population; SIR (O/E) with 95% CI.
No difference in SIR estimates is found for cancer in general between twins with known zygosity and twins with unknown zygosity; both groups have SIR estimates just below 1. But differences do emerge for site-specific cancers (Table 7). The most striking differences are found for lung cancer, where the SIR estimate is 0.86 (95% CI [0.82, 0.89]) for twins with known zygosity and 1.13 (95% CI [1.02, 1.26]) for twins with unknown zygosity, probably due to nonresponders being more likely to be smokers, and for prostate cancer, where the SIR estimates were 1.06 (95% CI [1.02, 1.09]) and 0.85 (95% CI [0.76, 0.95]), respectively. The prostate cancer difference may reflect higher SES among participants in the twin surveys, and a higher likelihood of undergoing prostate cancer screening in Norway, Sweden and Finland.
For same-sex twins with known zygosity, the site-specific SIR estimates for the MZ and the DZ twins are generally close to each other (Table 7). For 20 of the 40 sites investigated, the SIR point estimate is greater among the MZ than the DZ twins, and this difference is most pronounced for prostate and testis cancer: for prostate, the values are 1.11 (95% CI [1.05, 1.17]) and 1.03 (95% CI [0.99, 1.08]) for MZ and DZ, respectively, and for testis, 1.36 (95% CI [1.12, 1.65]) and 1.08 (95% CI [0.91, 1.28]). Notable sites where MZ values were significantly lower than DZ values include colon, pancreas and lung. In contrast, there were no zygosity differences in the SIR estimates for kidney cancer, which were significantly lower among both MZ and DZ twins.
Note: Expected numbers based on national population; SIR (O/E) with 95% CI.
MZ = monozygotic twins, DZ = dizygotic twins and UZ = unknown zygosity.
Discussion
This study reports on mortality and cancer incidence in twin cohorts from four Nordic countries with a long tradition of population-based research based on national registers. The data comprise information on more than 260,000 twins and enable research into genetic influences on the liability to develop specific types of cancer beyond what has previously been possible. Further, the twin design allows studies of environmental causes of cancer while accounting for background genetic and familial influences.
The Nordic twin registers are regularly linked with national population registers to update information on death. Follow-up for emigration has not always been considered important in these studies because its magnitude has remained small, but when the follow-up times increase, the dates of emigration as an end-of-follow-up event should also be systematically linked to these twin registers. Even a small proportion of never-dying persons will bias the SMR and SIR estimates markedly downward (Pukkala, 2011). Therefore, it is important to link every research cohort with the population registry before follow-up studies to confirm that every person in the cohort really exists in the population, either alive or with date of emigration or death.
The overall mortality rates are 1–5% lower in all four countries among individuals in the Nordic twin cohorts than in the general population. These differences can be explained by specific periods and/or birth cohorts for which the ascertainment or the follow-up had shortcomings. For example, the Danish twins born from 1953 to 1982 are identified based on information from the population register regarding the relationship between parents and children. However, for the birth cohorts 1953–1960, this information is incomplete and 10–40% of the twin-pairs born in this period are not included in the register (Skytthe et al., 2002). Likewise, the establishment of the early part of the register in Finland and Sweden was based on the identification of complete twin-pairs with both twins alive to be included in surveys (Cederlöf et al., 1961; Kaprio et al., 1978). In Norway, twins must consent to be part of the Norwegian Twin Registry (NTR) research program as a prerequisite for conducting linkages to the National Cancer Registry.
In comparison to the general population, mortality is lower among twins with known zygosity and significantly higher among twins with unknown zygosity. This most likely reflects bias in survey response, which is how zygosity status was determined. Those who participate in health surveys tend to have better mental and physical health and a healthier lifestyle than nonparticipants (Ellenberg, 1994; Nohr & Liew, 2018; Silva et al., 2015). Notably, smokers are less likely to participate in surveys, and this is seen in the SIR differences for lung cancer between twin-pairs of known and unknown zygosity. In these cohorts, smoking was, as expected, a strong predictor of lung cancer incidence (Hjelmborg et al., 2017). This bias is reflected in a higher risk of premature death and increased risk of cancers that are strongly associated with lifestyle factors such as tobacco- and alcohol-related cancers.
The Nordic health data infrastructure and the unique PICs are utilized in all important registers to allow electronic linkages of numerous register-based health indicators. Data from many other registries may indeed be used as outcome variables or as co-factors for controlling potential confounding. Hospital discharge registries, perinatal outcome registries, cause of death registries, registries of infectious diseases and various disease-specific registries (e.g., diabetes, AIDS), registers of prescribed medications, disability pensions and other administrative registers with health-relevant data are commonly used (Christensen et al., 2011; Pukkala, 2011). The Nordic cancer registries are also able to produce data for nonstandard categories based on variables such as morphology and spreading, and for certain precancerous lesions (Pukkala et al., 2018).
The pattern of incidence across cancer types in the twins tends to reflect that of the general population. The overall cancer risk was marginally lower than average across most of the twin cohorts. Again, this is most likely due to the response bias whereby participation rates are better among healthier and more conscientious subjects. Assessment of zygosity also depends on survey participation and hence is a source of potential selection bias. The SIRs varied systematically between the groups of twins with known versus unknown zygosity, with higher rates among the group with known zygosity for cancers associated with high SES and better lifestyle and higher rates among the group with unknown zygosity for cancers associated with low SES and more smoking and alcohol use. However, the SIR based on data from all the twins combined (those with and without zygosity determination) indicates that twins are highly representative of the population overall. The few sites where the twin SIRs deviate unexpectedly from the population values require further study; for example, there is no obvious reason why kidney cancer should be much less common among twins than singletons.
Our study illustrates multiple advantages of combining the twin cohorts in the Nordic countries. The increased number of cases facilitates studies of rare cancers that could not be conducted based on the data from single countries. Analyses based on greater sample sizes yield more precise estimates of familial risk, heritability and the influence of environmental factors on cancer liability (Mucci et al., 2016). Increasing the number of MZ twin-pairs in which only one of the twins has a rare cancer permits studies of environmental factors independent of the genetic liability to disease. Such pairs are also extremely valuable for epigenetic studies. The long history of similar surveys among twins in the four countries facilitates the combination of data from different national surveys to study gene–environment interaction, as exemplified by a recent study on genetic predisposition to smoking and lung cancer (Hjelmborg et al., 2017). Finally, the social and demographic structure in the four countries is similar, and a reliable and complete registration of a number of health events and conditions is common in the Nordic countries.
Conclusions
With an annual increase of about 1600 cancer cases in the twin cohorts defined in this study, we estimate that after our next linkage to cancer registries, with information through 2017, more than 40,000 cancer cases will be available for further studies of cancer. The high internal validity of comparisons within a defined twin register cohort makes prospective twin register-based study designs preferable for etiological studies. Overall, the population representativeness is excellent, with only a slightly more favorable SMR and SIR profile than in the general population. This small deviation from the population values implies that generalization of results to entire national populations should be made with some caution. Because the Nordic twin registers are committed to work toward joint Quality Assurance standards, including defined accessibility to external research data requests, and as the twin registers together contain a huge number of prospectively occurring cases of cancer, the NorTwinCan twin register cohorts provide a solid basis for prospective studies on cancer causes and control, as well as opportunities to explore factors that influence cancer using multiple study designs made possible with twin data (Boomsma et al., 2002).
Supplementary material
To view supplementary material for this article, please visit https://doi.org/10.1017/thg.2019.10.
Acknowledgments
Initial funding to NorTwinCan was provided by the Ellison Foundation. Several network researchers — in addition to those listed as authors — provided valuable comments in topics related to this study discussed in numerous joint network meetings. This study was supported by the Academy of Finland (grants 265240, 263278 to J Kaprio) and Nordic Cancer Union grants awarded in 2011 to J Kaprio, and in 2016 and 2017 to J R Harris. The Danish Twin Cohort was supported by the Odense University Hospital AgeCare program of the Academy of Geriatric Cancer Research. Additional contributions: We are grateful to the participants of the twin registries in Denmark, Finland, Norway and Sweden.