Introduction
Major depressive disorder (MDD) is the leading cause of disability globally (Global Burden of Disease Collaborative Network, 2020). While it is known to be influenced by both environmental and genetic factors (Kendall et al., 2021), more work is needed to understand the relationship between genetic factors and MDD in a wider range of human populations. Polygenic scores (PGS), which aggregate the small effects of many common variants into a single index, have been widely used to capture variation in genetic liability (Wray et al., 2021). PGS derived from large-scale genome-wide association studies (GWAS) of depression have been found to be associated with depression outcomes in a variety of clinical and population-based cohorts (Fang, Scott, Song, Burmeister, & Sen, 2020; Halldorsdottir et al., 2019), but the vast majority of studies have been concentrated in samples of European ancestry (Peterson et al., 2019). A growing body of research has demonstrated that Eurocentric PGS show poorer predictive performance in other populations, and the expansion of PGS research to more diverse samples has been recognized as an urgent priority (Kachuri et al., 2023; Martin et al., 2019; Wang, Tsuo, Kanai, Neale, & Martin, 2022).
Likewise, investigation of genetic factors from different populations with diverse environmental circumstances is a high priority. It is well established that exposure to early adversity can increase the odds of lifetime depression (McKay et al., 2022), but it remains unclear how genetic factors combine with environmental factors to influence depression risk across diverse social and economic settings. We address these fundamental questions with newly available environmental and genetic data from the Chitwan Valley Family Study (CVFS). The CVFS is a large, population-based cohort with genomic and carefully ascertained phenotypic data from over 10 000 individuals in Nepal. Though some studies have examined depression PGS in East Asian samples (Amare et al., 2021; Avinun, Nevo, Radtke, Brigidi, & Hariri, 2020; Bigdeli et al., 2017; Pearson-Fuhrhop et al., 2014), the performance of depression PGS in South Asian settings has been largely unexplored apart from one study of South Asians living in the UK (Truong et al., 2024). The CVFS also includes a wide range of community and family environmental measures (Axinn & Pearce, 2006), and we have previously demonstrated that environmental exposures, including childhood trauma and social support, are associated with risk of MDD in the CVFS (Axinn et al., 2022; Benjet et al., 2020; Hermosilla et al., 2022).
Leveraging the CVFS, this study examines the heritability and cross-ancestry genetic correlations for lifetime MDD in Nepal. It also assesses the associations between polygenic risk for depression and lifetime MDD, on its own and alongside a potent environmental factor – namely childhood exposure to potentially traumatic events – which may also influence MDD risk.
Methods
Sample
The CVFS features a general population sample of 151 neighborhoods, fully representative of Western Chitwan in Nepal (Axinn, Ghimire, & Williams, 2012). Years of previous research in Nepal by this team, combined with years of living in Chitwan among the study population, drove the innovative combinations of ethnography and survey research that characterize the CVFS (Axinn et al., 2012; Axinn & Pearce, 2006). The panel study was launched in 1995, following whole families across time, with the sample being refreshed periodically to maintain full representation of the general population. Across more than 20 years, multiple rounds of interviews have generated very high-quality data with high response rates, low panel attrition, low item-missing rates, and high reliability in measures (Axinn et al., 2012; Thornton, Ghimire, & Mitchell, 2012). From 2016 to 2018, selected modules from the World Mental Health Composite International Diagnostic Interview (WMH-CIDI 3.0) were administered to the full CVFS sample aged 15–59. The current sample was selected to be representative of Western Chitwan in 2016. The response rate for this survey was 93%, generating 10 714 completed interviews (Scott et al., 2021), with 10 308 also providing saliva-based DNA samples. All procedures were approved by the University of Michigan Institutional Review Board (HUM00104171) and by the Nepal Health Research Council. Written or verbal informed consent was obtained from all participants.
Phenotypic measures
Outcome: major depressive disorder (MDD)
As explained in detail elsewhere (Kessler & Üstün, 2004), the WMH-CIDI is a fully structured diagnostic instrument administered by lay interviewers using computer-assisted methods. To create a Nepal-specific version of the WMH-CIDI measures, a multiethnic team of CVFS researchers applied state-of-the-art survey methodology grounded in a mixed-method informed understanding of the local setting (Scott et al., 2021). The initial process took more than three years to produce clinically valid measures (Ghimire, Chardoul, Kessler, Axinn, & Adhikari, 2013). Building on this accomplishment, the CVFS team then created a setting-specific life history calendar (LHC) to pair with the WMH-CIDI. This LHC approach adds measures of highly memorable personal and family events to the memory cues given to respondents, creating personalized memory ‘anchors’ to facilitate recall of the occurrence of psychiatric symptoms. This LHC-CIDI method significantly improved measurement of lifetime experience with mental disorder (Axinn et al., 2020). The LHC-CIDI was validated in the largest clinical validation ever conducted in Nepal, demonstrating concordance equaling or exceeding the performance of the WMH-CIDI in Europe and the United States (Axinn et al., 2020). Here we use the Nepal-specific LHC-CIDI measure of lifetime occurrence of major depressive disorder (MDD).
Exposure: Childhood exposure to potentially traumatic events (PTE)
Two binary exposure variables were collected by trained interviewers as part of the WMH-CIDI, each through a single question: being beaten badly by a parent or caregiver (e.g. ‘As a child, were you ever badly beaten up by your parents or the people who raised you?’) and witnessing serious physical fights at home as a child. These were combined into a single dichotomous variable of childhood exposure to PTE (coded 1 for endorsement of either experience and 0 for endorsement of neither).
Covariates: sociodemographic variables
Sociodemographic covariates included age (in years), sex, and ethnicity. While race/ethnicity designations are complex in Nepal, the CVFS study population comprises six categories that capture key aspects of variation: Brahmin/Chhetri; Dalits; Hill Janjati (multiple ethnic groups of Tibetan origin, primarily Buddhist); Terai Janjati (multiple plains ethnic groups, primarily of Burmese descent); Newar; and other (primarily recent and temporary migrants from India).
Genomic quality control and imputation
A total of 10 294 DNA samples from the CVFS cohort were genotyped using the Illumina Infinium Global Screening Array and aligned to the human genome reference build GRCh38. Genomic quality control (QC) was performed using a high-performance cloud-based pipeline called GWASpy (details available at https://github.com/atgu/GWASpy). A total of 76 samples were excluded due to low genotyping call rate, inbreeding coefficient greater than 0.2, mismatch between self-reported and genetic sex, or more than 10 000 Mendelian errors observed. Additionally, 32 samples that appeared to be duplicates or members of a monozygotic twin pair were excluded from further analysis. After genomic quality control, a total of 10 032 unique samples were available for subsequent analysis. Genetic principal components (PCs) were calculated in an unrelated subset of the data, and the remaining samples were projected onto this PC space. Visual inspection of PC plots revealed no outliers. Of note, phasing and imputation were performed using a new jointly called dataset of harmonized Human Genome Diversity Project (HGDP) and 1000 Genomes (1KG) Project samples (Koenig et al., 2023). This reference panel is much more diverse and deeply sequenced than existing imputation panels, making it more appropriate for use in this diverse Nepalese cohort.
Heritability estimation
Given the household-based sampling methodology, this study contains related individuals. Therefore, we adopted a genome-based restricted maximum likelihood framework for heritability estimation using the Genome-wide Complex Trait Analysis (GCTA) software version 1.94.1 (Lee, Wray, Goddard, & Visscher, 2011; Yang, Lee, Goddard, & Visscher, 2011). Following the method introduced by Zaitlen et al. (2013), GCTA was used to calculate a dense genetic relatedness matrix (GRMall) from all autosomal SNPs with minor allele frequency greater than 0.01. A sparse GRM (GRMclose) was also constructed, where all elements in the GRM less than 0.05 were set to zero, thereby only capturing relatedness among close family members. Subsequently, GCTA was used to model the binary lifetime depression outcome as a function of GRMall and GRMclose, fitted simultaneously as random effects, along with age, sex, ethnicity, and the first 20 genetic PCs fitted as fixed effects. This model thus produces estimates of SNP-based heritability ($h_{snp}^2$) and narrow-sense heritability ($h^2$), an estimator which is comparable to that from traditional pedigree-based study designs. $h_{snp}^2$ is estimated as the proportion of phenotypic variance explained by GRMall, while $h^2$ is the sum of $h_{snp}^2$ and $h_{kin}^2$, the variance explained by GRMclose. We subsequently fitted two additional models for males and females independently. Since this is a population-based sample and the prevalence of depression in Nepal is not well-characterized, the sample prevalence was used when transforming heritability estimates to the liability scale.
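The construction of the sparse GRM and the way the two variance components combine can be illustrated with a minimal sketch. This is not the GCTA implementation: the function names, the toy 3 × 3 matrix, and the example variance proportions are hypothetical, and only the 0.05 threshold and the sum $h^2 = h_{snp}^2 + h_{kin}^2$ come from the text above.

```python
def sparsify_grm(grm, threshold=0.05):
    """Return a copy of a square GRM with every entry below `threshold` set to zero.

    Entries at or above the threshold (close relatives and the diagonal) are kept.
    """
    return [[value if value >= threshold else 0.0 for value in row] for row in grm]


def narrow_sense_h2(h2_snp, h2_kin):
    """Narrow-sense heritability as the sum of the two variance proportions."""
    return h2_snp + h2_kin


# Toy dense GRM (hypothetical values): individuals 0 and 1 look like first-degree
# relatives, while individual 2 is only distantly related to both.
grm_all = [
    [1.00, 0.48, 0.01],
    [0.48, 1.02, -0.02],
    [0.01, -0.02, 0.99],
]
grm_close = sparsify_grm(grm_all)

for row in grm_close:
    print(row)

# Hypothetical variance proportions, for illustration only.
print(narrow_sense_h2(h2_snp=0.10, h2_kin=0.15))
```

In this toy example only the pair (0, 1) retains a non-zero off-diagonal entry in the sparse matrix, which is what restricts the second random effect to close family members.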
Cross-ancestry genetic correlation and GWAS
We used S-LDXR (Shi et al., 2021) to estimate the SNP-based cross-ancestry genetic correlation between summary statistics from a GWAS of depression performed in a sample of European ancestry and a GWAS of lifetime MDD conducted in the current Nepalese sample. A generalized linear mixed model GWAS was performed using GCTA (Jiang, Zheng, Fang, & Yang, 2021) in this sample with depression case-control status as the outcome, fitting the GRMclose as a random genetic effect along with sex, age, ethnicity, and the first 20 PCs as fixed effect covariates. Summary statistics from a GWAS of strictly defined lifetime MDD in a sample of 67 171 European ancestry participants from the UK Biobank were used (Cai et al., 2020), as this phenotype more closely resembles the definition used in the current Nepalese sample. S-LDXR was run using the default parameters, and European and South Asian sub-samples from the 1000 Genomes Project Phase 3 release were used as reference LD panels.
Polygenic scoring
PGS for depression were generated based on a meta-analysis of two sets of publicly available GWAS summary statistics for major depression, from Howard et al. (2019, 2018) in n = 500 199 individuals of European ancestry and Giannakopoulou et al. (2021) in n = 98 502 individuals of East Asian ancestry, which were combined using the inverse variance weighted approach implemented in METAL (Willer, Li, & Abecasis, 2010) prior to score generation. PRS-CS (Ge, Chen, Ni, Feng, & Smoller, 2019), a Bayesian polygenic scoring method, was used to generate weighted SNP effect estimates for major depression, which were then scored using PLINK v1.9. Although our primary analysis was based on the continuous PGS, we also divided the continuous PGS into categorical tertiles reflecting relatively low, intermediate, and high polygenic risk, for secondary analyses.
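For readers unfamiliar with inverse variance weighting, the per-SNP calculation can be sketched as follows. This is an illustration of the standard fixed-effect formula rather than METAL's own code; the function name and the example effect sizes are hypothetical, and the sketch assumes that both studies report effects for the same effect allele.

```python
import math


def ivw_meta(estimates):
    """Fixed-effect inverse-variance-weighted meta-analysis for one SNP.

    `estimates` is a list of (beta, standard_error) pairs, one per study,
    all aligned to the same effect allele. Returns (pooled_beta, pooled_se).
    """
    weights = [1.0 / se ** 2 for _, se in estimates]
    total_weight = sum(weights)
    pooled_beta = sum(w * beta for w, (beta, _) in zip(weights, estimates)) / total_weight
    pooled_se = math.sqrt(1.0 / total_weight)
    return pooled_beta, pooled_se


# Hypothetical per-study estimates for a single SNP: the first study is
# more precise (smaller standard error), so it dominates the pooled effect.
beta, se = ivw_meta([(0.02, 0.01), (0.04, 0.02)])
print(beta, se)
```

Because each weight is the reciprocal of the squared standard error, the larger study contributes more to the pooled effect at each SNP, and the pooled standard error is smaller than that of either input.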
Polygenic score analyses
Building on our heritability analyses, we fitted a series of sequential mixed models using GCTA with lifetime MDD as the outcome, where the baseline model (M0) included GRMall and GRMclose as random effects and the first 20 PCs as fixed effects. Models M1 to M3 sequentially updated the preceding model to include additional demographic variables: age, sex, and ethnicity. Model M4 built on model M3 by including the scaled continuous PGS as a fixed effect. Model M5 subsequently included childhood PTE, while model M6 also included a term for the multiplicative interaction between PGS and childhood PTE. As supplementary analyses, Models M4–M6 were refitted using categorical PGS tertiles. The unique phenotypic variance explained ($R^2$) on the observed scale, controlling for variables included in prior models, was estimated for each fixed effect as the relative change in phenotypic residual variance ($V_e$) after inclusion of a given fixed effect (e.g. the $R^2$ for age is calculated as $(V_{e,\mathrm{M0}} - V_{e,\mathrm{M1}})/V_{e,\mathrm{M0}}$). These estimates were transformed to the liability scale using the method proposed by Lee, Goddard, Wray, and Visscher (2012), assuming the sample prevalence (15%) as the population prevalence.
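The two steps described above, the relative change in residual variance and the conversion to the liability scale, can be sketched as follows. This is a simplified illustration, not the analysis code: the function names and the residual variances are hypothetical. It also assumes, as in this study, that the proportion of cases in the sample equals the assumed population prevalence $K$; under that assumption the transformation, as we understand it, reduces to multiplying the observed-scale value by $K(1-K)/z^2$, where $z$ is the standard normal density at the liability threshold.

```python
from statistics import NormalDist


def r2_from_residual_variance(ve_previous, ve_current):
    """Observed-scale R^2 for a newly added fixed effect.

    Computed as the relative drop in residual variance between the
    preceding model and the model that includes the new fixed effect.
    """
    return (ve_previous - ve_current) / ve_previous


def observed_to_liability(r2_observed, prevalence):
    """Convert observed-scale R^2 to the liability scale.

    Simplifying assumption: the sample case proportion equals the
    population prevalence, so no ascertainment correction is applied.
    """
    std_normal = NormalDist()
    threshold = std_normal.inv_cdf(1.0 - prevalence)  # liability threshold
    z = std_normal.pdf(threshold)                     # density at the threshold
    return r2_observed * prevalence * (1.0 - prevalence) / z ** 2


# Hypothetical residual variances from two consecutive models.
r2_obs = r2_from_residual_variance(ve_previous=0.120, ve_current=0.117)
r2_liab = observed_to_liability(r2_obs, prevalence=0.15)
print(r2_obs, r2_liab)
```

At a prevalence of 15% the scaling factor is a little above 2, so liability-scale values are roughly double their observed-scale counterparts. A negative value from the first function corresponds to the case noted in the text where adding a term slightly worsens the fit.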
Results
Sample characteristics
Following genomic QC filtering, a total analytic sample of 10 032 individuals was available for modeling with complete data on MDD status, demographic covariates, and childhood PTE. Sample characteristics are reported in Table 1. Of the 10 032 participants, 55% were female and 43% were of Brahmin ethnicity, with a mean age of 35.4 years (s.d. = 12.4). Fifteen percent of the sample met lifetime diagnostic criteria for MDD. The lifetime prevalence of MDD among women in this sample was 22% and among men was 7%. MDD cases were on average 5.9 years older than participants without MDD.
To further situate the genetic ancestry of this sample among global populations, we performed principal component analysis within the unrelated set of 1000 Genomes Phase 3 samples and projected samples from the current study onto this PC space (Fig. 1a). We also estimated genetic PCs only among East and South Asian ancestry samples from 1000 Genomes, and subsequently projected the CVFS samples onto this space (Fig. 1b). As shown in Fig. 1, the CVFS Nepalese samples lie on a continuum between East Asian samples and South Asian samples across the first two principal components, with some samples clustering more closely to samples of East Asian ancestry and some more closely resembling South Asian ancestry samples. This is consistent with findings from previous studies (Arciero et al., 2018; Xing et al., 2010).
Heritability
GCTA analyses indicated that lifetime MDD in this Nepal-based cohort was significantly heritable. Figure 2 plots heritability estimates from the current Nepalese sample alongside those from previous studies in majority European ancestry samples (Fernandez-Pujals et al., 2015; Polderman et al., 2015). On the liability scale, the narrow-sense $h^2$ of lifetime MDD was estimated at 0.26 (95% CI 0.18–0.34, $p = 8.5 \times 10^{-6}$). We also report $h^2$ estimates on the liability scale for several alternative possible assumed prevalences ranging from 5% to 25% (online Supplementary Table S1). Point estimates for narrow-sense $h^2$ on the liability scale were higher in both the female-only (0.35, 95% CI 0.21–0.49, $p = 2.0 \times 10^{-4}$) and male-only (0.45, 95% CI 0.20–0.70, $p = 1.2 \times 10^{-2}$) sub-samples. Although the method of Zaitlen et al. (2013) produces unbiased estimates of $h^2$, the estimates of individual variance components attributable to $h_{snp}^2$ and $h_{kin}^2$ have been found to be severely mis-estimated (Evans et al., 2018). Thus, we only report estimates of the overall $h^2$.
GWAS and cross-ancestry genetic correlation
S-LDXR estimated a genetic correlation of 0.26 (95% CI −0.29 to 0.81) between the Nepalese lifetime MDD GWAS and the European lifetime MDD GWAS. Our current sample was underpowered to identify a statistically significant cross-ancestry genetic correlation with lifetime MDD using GWAS summary statistics.
Associations between polygenic risk, childhood PTE, and lifetime MDD
Regression coefficients for fixed effects estimated from the final saturated model (M6) are shown in Table 2, while the sequential variance explained by each added predictor is provided in Table 3. Sex explained the largest proportion of variance in liability to depression (12.0%, 95% CI 10.8–13.2%), followed by age (7.0%, 95% CI 5.9–7.8%) and childhood PTE (0.6%, 95% CI 0.3–0.9%). The remaining fixed effect predictors, including ethnicity, the continuous PGS, and the interaction between PGS and childhood PTE, either resulted in a slightly worse model fit after inclusion or had a 95% confidence interval that overlapped with zero. Results were consistent when examining the PGS categorically by tertiles, with PGS tertiles accounting for only 0.04% (95% CI −0.04 to 0.12%) of variance in MDD liability, and its interaction with childhood PTE explaining 0.64% (95% CI 0.33–0.95%). As a sensitivity analysis, we also tested the performance of a model where the PGS was modeled with only genetic PCs as covariates. This yielded an effect size very similar to, and statistically indistinguishable from, the estimate when additionally controlling for age, sex, and ethnicity.
Note. This table shows regression coefficient estimates from a linear mixed model fitted using the GCTA software with binary lifetime MDD status as the outcome and all predictors as covariates. Two random genetic effect structures were fitted simultaneously: the full genetic relatedness matrix (GRM) and a sparse GRM where pairwise entries less than 0.05 were set to zero. This model also included the first 20 genetic principal components as fixed effect covariates. All continuous predictors were mean-centered and standardized to have unit variance prior to model fitting. cPTE, childhood exposure to potentially traumatic event; PGS, polygenic score.
Note. Sequential linear mixed models were fit using GCTA with lifetime major depression status as the outcome variable, building on a baseline model where random effects included the full genetic relatedness matrix (GRM) and a sparse GRM where pairwise entries less than 0.05 were set to zero, while fixed effects included the first 20 genetic principal components. Age, sex, ethnicity, depression polygenic score (PGS), childhood exposure to potentially traumatic event (cPTE), and their multiplicative interaction were included as fixed effects in a stepwise manner. The unique phenotypic variance explained ($R^2$) on the observed scale, controlling for variables included in prior models, was estimated for each fixed effect as the relative change in phenotypic residual variance after inclusion of a given fixed effect. These were transformed to variance explained on the liability scale ($R_{liability}^2$) assuming a population prevalence of 15%. Since the inclusion of ethnicity and PGS × cPTE terms resulted in a slightly worse fit, it is not meaningful to compute standard errors or confidence intervals for these terms.
Although the mixed model framework maximizes the amount of available data by explicitly modeling relatedness, there may be a concern that the random effects modeled by the GRM overlap with the effects captured by the PGS, resulting in larger standard errors and reduced power to detect an effect of the PGS. Thus, we performed a sensitivity analysis, using linear regression on binary case-control status to model the effects of PGS while controlling for age, sex, and PCs in an unrelated (less than 2nd degree) subset of 4606 subjects from the full dataset. Results from this analysis were highly consistent with the mixed model results, with the PGS effect estimated as 0.005 (95% CI −0.003 to 0.012), nominally higher than the mixed model effect estimate, but statistically indistinguishable.
Discussion
In this study, we examined genetic factors contributing to major depression risk in an ancestrally diverse, household-based sample of 10 032 individuals with high relatedness in Western Chitwan, Nepal. Fifteen percent of the sample met diagnostic criteria for lifetime MDD on a validated interview survey, a prevalence comparable to that seen in US samples (Kessler et al., 2003). Lifetime MDD showed statistically significant narrow-sense heritability; however, polygenic scores for depression trained on large European and East Asian ancestry GWAS did not significantly predict lifetime MDD status in this Nepalese sample. Demographic variables and environmental exposures explained a far greater proportion of variance in liability to lifetime MDD in the current sample.
Genetic contributions to depression are not well characterized outside of European ancestry populations. In this Nepalese sample, lifetime MDD was found to have a statistically significant narrow-sense heritability, explaining 26% of variance in risk for MDD on the liability scale. This estimate is slightly lower than, but comparable to, previously reported heritability estimates of lifetime MDD from European ancestry twin and family-based samples (Fig. 2), which range from 30% to 50% (Kendall et al., 2021), indicating a substantial genetic basis. In addition, it is similar to the liability-scale SNP heritability (26%) reported from CONVERGE, a cohort of East Asian ancestry (Giannakopoulou et al., 2021), though the latter was estimated based on severe recurrent depression at a lower prevalence.
In contrast, depression PGS trained on a meta-analysis of two large GWAS from European and East Asian ancestry samples performed poorly in predicting lifetime MDD in this sample. The few previous studies have yielded mixed evidence on the predictive power of PGS for depression trained on European GWAS for depression outcomes in non-European populations: PGS associations, albeit with lower variance explained, have been observed for depressive symptoms in a cohort of over 3000 pregnant individuals in Peru (Shen et al., 2020) and for combined anxiety/depression status among individuals of African and Hispanic ancestry in US health systems (Coombes et al., 2023), though a study of a large sample of US veterans with African ancestry reported no such association (Bigdeli et al., 2022). While it is known that PGS show reduced effect sizes when applied to samples that differ in ancestry from the training sample (Wang et al., 2022), our observed discrepancy may have a number of possible explanations. First, it could reflect differences in the underlying genetic architecture of MDD across ancestral populations. We attempted to quantify these differences using S-LDXR, a method for estimating cross-ancestry genetic correlation, which suggested only a modest correlation (rg = 0.26) in common genome-wide factors for lifetime depression across European and Nepalese samples, although we were underpowered to make precise estimates. This is consistent with a previous GWAS that revealed only a partial overlap (11%) between the genetic loci associated with major depression in a European sample and those in individuals of East Asian ancestry (Giannakopoulou et al., 2021).
Second, low cross-ancestry PGS performance could reflect differences in how lifetime MDD is measured and reported across settings, including the experience and expression of specific symptoms. However, the Nepal-CIDI instrument has been clinically validated (Axinn et al., 2020). Importantly, a large proportion of the samples used to train our PGS came from a GWAS that used a shallow definition of depression (Howard et al., 2018), and such definitions have been shown to have lower heritability and decreased specificity (Cai et al., 2020). Third, different factors may contribute to MDD risk in the Nepalese setting; for example, this population had higher levels of exposure to poverty and violent events relative to other settings assessed with the CIDI. At the same time, recent studies document a lower population-based prevalence of MDD and other disorders in the CVFS compared to more wealthy European diaspora settings (Scott et al., 2021). The sources of this ‘resilience’ remain to be documented, but rates of psychiatric disorder are steadily rising across birth cohorts in this study sample, possibly due to higher rates of schooling, work for pay, and geographic moves for work (Scott et al., 2021). As social life in Nepal becomes more similar to other settings represented in our GWAS training data, the PGS may also become a stronger predictor of MDD.
Exposure to childhood trauma, a robust predictor of depression (McKay et al., 2022), was statistically significantly associated with lifetime MDD in this sample, although it explained modest variance compared to the demographic factors. Factors that most strongly explained lifetime MDD status included sex and age, with females and older participants reporting higher rates of lifetime MDD. Stratified heritability analyses suggested that the heritability of lifetime MDD liability may be higher in males than females, although the difference between the two estimates was not statistically significant. Previous studies in European samples (Polderman et al., 2015) have found heritability to be higher among females than males. However, our study appears to be underpowered to detect sex differences in heritability, since the 95% confidence interval for the male-only heritability estimate (0.20–0.70) overlapped with those from both the combined sample estimate (0.18–0.34) and the female-only estimate (0.21–0.49). One explanation for higher point estimates in this sample when males and females are modeled separately is that the genetic factors influencing depression liability may be different for males and females, introducing additional heterogeneity in effects and deflating heritability estimates when modeled together. Indeed, there is some evidence from family studies in European samples that genetic risk factors differ between males and females (Kendler, Gardner, Neale, & Prescott, 2001; Kendler, Gatz, Gardner, & Pedersen, 2006).
Another explanation may be indirect parental genetic effects that differ between mothers and fathers and can bias GCTA estimates of heritability (Barry et al., 2023). One recent study has provided preliminary evidence for parent-specific genetic nurture effects in samples of European ancestry (Tubbs & Sham, 2023).
Strengths of this study include, first, its focus on a genetically characterized population-based sample in Western Chitwan in Nepal, addressing the gap of limited research in non-European ancestry populations (Martin et al., 2019; Wang et al., 2022). To date, there have only been a handful of psychiatric genetics studies in South Asian populations (Periyasamy et al., 2019) and none focused on MDD. Second, this study leverages a unique cohort where depression has been rigorously ascertained, with detailed and culturally validated diagnostic assessment of lifetime MDD using life history calendar methods to improve recall in the context of a long-term, household-based longitudinal study, which is usually challenging to obtain in genomic studies. Third, because it is from a whole-family longitudinal study, this population-representative sample captured large numbers of related individuals to maximize power for heritability estimation.
This study also had several limitations. Notably, while one of the largest of its kind, this sample was still underpowered for detecting smaller polygenic effects, for estimating statistically significant cross-ancestry genetic correlations, and for identifying potential sex differences in the heritability of MDD in this Nepalese population. This points to the need to collect more data to enable better-powered genetic analyses of psychiatric phenotypes in South Asia. We also relied on GWAS summary statistics derived from European and East Asian ancestry populations, which are the largest available but less likely to be directly transferable to this population. Additionally, we examined a relatively simple measure of childhood trauma as a candidate exposure, measured retrospectively, and lifetime MDD is likely to be associated with a wider range of environmental factors (Köhler et al., 2018) that may benefit from more comprehensive study in this population in conjunction with genetic factors.
Conclusion
In this unique cohort of 10 032 individuals in Nepal, we find evidence that lifetime MDD has a substantial genetic basis in this population, which partially overlaps with the genetic architecture of MDD in previous European ancestry studies. However, the low genetic correlation for MDD between this sample and European samples (rg = 0.26) and the limited predictive performance of PGS derived from European and East Asian studies highlight the need for expanded genetic studies in South Asian populations. Demographic variables and specific environmental exposures such as childhood trauma emerged as more influential risk factors for lifetime MDD. Future research should focus on characterizing the genetic architecture of MDD in settings like Nepal and identifying population-specific risk factors for depression and other common psychiatric disorders. Findings underscore the importance of considering genetic, environmental, and sociocultural factors to enhance our understanding of MDD in diverse populations.
Supplementary material
The supplementary material for this article can be found at https://doi.org/10.1017/S0033291724001284
Acknowledgements
The authors thank the respondents of the CVFS, whose generous contributions made this research possible; the survey staff of the Institute for Social and Environmental Research–Nepal for collecting the data reported here; the staff of the Survey Research Operations unit of the University of Michigan's Survey Research Center for development and support of the technical systems that made the fieldwork in Nepal possible; and the World Mental Health Consortium leadership and staff at Harvard University for their input into the design and all subsequent steps of collecting and analyzing the data reported here. The authors thank the genomics team at the Broad Institute for sample processing and genotyping. The authors also thank Paul Schulz and Heather Gatny for data support and Jennifer Mamer and Alison Shereda for research and editorial assistance. The authors alone remain responsible for any errors or omissions in this manuscript.
Funding statement
This project was supported by the National Institute of Mental Health (JWS, WA, grant number R01MH110872; KWC, grant number K08MH127413); a NARSAD Brain and Behavior Foundation Young Investigator Award (KWC, no grant number); the National Human Genome Research Institute (JDT, grant number T32HG010464); and a Eunice Kennedy Shriver National Institute of Child Health and Human Development Center Grant (DG, WA, P2CHD041028) to the Population Studies Center at the University of Michigan.
Competing interests
Ghimire is faculty at the University of Michigan and also the Director of the Institute for Social and Environmental Research in Nepal (ISER-N) that collected the data for the research reported here. Ghimire's conflict of interest management plan is approved and monitored by the Regents of the University of Michigan.
Ethical standards
The authors assert that all procedures contributing to this work comply with the ethical standards of the relevant national and institutional committees on human experimentation and with the Helsinki Declaration of 1975, as revised in 2008.