Introduction
Bipolar disorder (BD) is a severe, relapsing-remitting mental illness that takes its name from the marked fluctuations in mood, behavior, and energy that patients experience. It is a complex and heterogeneous condition comprising multiple phenotypes beyond the type I/type II subcategories. Consequently, efforts have been made to further dissect BD into clinically meaningful subgroups underpinned by distinct pathophysiological pathways, which would enable better stratification for diagnosis, prognosis and treatment (Coombes et al., 2020). Several clinically measurable neurodevelopmental liabilities likely contribute to the pathophysiology and etiology of a subset of patients suffering from early onset BD. However, in some instances the evidence is sparse and replication is needed (Kloiber et al., 2020). In particular, several perinatal factors, including obstetric complications, cesarean section, older paternal age, and exposure to substances and infections during pregnancy, were associated with a higher risk of developing BD, supposedly through a disruption in neurodevelopment (Chudal et al., 2014; Martelon, Wilens, Anderson, Morrison, & Wozniak, 2012; Parboosing, Bao, Shen, Schaefer, & Brown, 2013), even though evidence from previous studies is inconsistent. Comorbidity with autism spectrum disorder (ASD) and attention deficit hyperactivity disorder (ADHD) is a further indication of a neurodevelopmental component to BD (Schiweck et al., 2021).
Another neurodevelopmental marker is the familial co-aggregation of BD and schizophrenia (SCZ), as well as shared genetic and brain imaging markers (Valli, Fabbri, & Young, 2019). All this evidence suggests that a subset of patients with BD may have a neurodevelopmental impairment which would seem to clinically correlate with an early age at onset (AAO) and/or the presence of psychotic symptoms. The presence of psychotic features specifically would indicate a pathophysiological proximity to SCZ and suggest a continuum between SCZ and a subset of patients with BD (Arango, Fraguas, & Parellada, 2014). A case in point is the over-representation of premorbid cognitive, motor and language impairment in SCZ as well as in BD with psychotic symptoms (Murray et al., 2004). Furthermore, childhood traumatic events and cannabis misuse prior to BD onset, both well-characterized neurodevelopmental disruptors, were associated with an early onset in BD (Daruy-Filho, Brietzke, Lafer, & Grassi-Oliveira, 2011; Lagerberg et al., 2011).
While early AAO is often used as an easily measurable clinical criterion to identify a clinically homogeneous BD subgroup that stands out in terms of pathogenetic pathways and disease outcomes, it remains unclear which AAO cut-off best identifies patients with BD in terms of neurodevelopmental load. Indeed, one of the reasons behind the limited replication of some of the studies investigating the link between early AAO and neurodevelopmental factors is probably the lack of consensus around the definition of early AAO. The very first studies used cut-offs for early AAO that were arbitrarily chosen, usually in the 18–25 years old (y.o.) range [e.g. (Post et al., 2010; Strober, 1992)]. A more principled and data-driven approach adopted to overcome this limitation was to find clusters of patients based on AAO by fitting the age of onset distribution using a mixture of Gaussians. A systematic review (Bolton, Warner, Harriss, Geddes, & Saunders, 2021) of studies adopting this latter approach found that the best fit was provided by three sub-groups: early [mean 17.3 years (s.d. = 1.19)], intermediate [mean 26.0 years (s.d. = 1.72)], and late [mean 41.9 years (s.d. = 6.16)] AAO groups. The largest share of BD cases, around 45%, belongs to the early AAO group, with 35% and 20% of patients with BD displaying intermediate and late AAO respectively. Such a distribution indicates that the early AAO group as defined above is the largest group. Thus, some authors suggested that the term ‘early onset’, when used for designating an unusually low AAO (~3 s.d. from the mean of the corresponding Gaussian), should be limited to onsets under 14 years of age, considering that onset in the 14–21 age band is the norm in BD and therefore does not correspond to a minority subgroup of patients.
Overall, the proportion of patients developing BD by the age of 14 is 5.1% based on a meta-analytical literature review (Solmi et al., 2022). However, while statistically principled and allegedly more robust than arbitrarily setting a cut-off, Gaussian mixture model (GMM) clustering relies only on the AAO distribution and is therefore not guaranteed to converge to a solution where cluster membership is informative of any underlying pathophysiological process, such as the neurodevelopmental contribution to BD development. As a result, the clinical validity of an early AAO label assigned in this manner may be limited.
Existing studies investigating neurodevelopmental pathways in BD examined relatively small samples, ranging from a few dozen to a few hundred patients at best (Kloiber et al., 2020); furthermore, neurodevelopmental disruptors and their correlates were mostly considered individually. In this study we capitalized on a large France-based national cohort of patients suffering from BD. Using a supervised learning approach, we aimed to determine which early AAO definition would maximize patients' separability in terms of neurodevelopmental load. Secondarily, we adopted an unsupervised learning approach. Assuming, based on previous literature, that a subset of patients suffering from BD stands out from the rest in terms of neurodevelopmental load and therefore constitutes a separate cluster, we assessed how a data-driven two-cluster fit would compare with each of the different early v. non-early AAO definitions.
Methods
Study population
The present study is based on the cross-sectional data set from the FondaMental Advanced Centres of Expertise for Bipolar Disorders (FACE-BD) cohort (Henry et al., 2015). This data set was collected from 2009 to 2020 through a France-based network of 12 BD Expert Centers (Besancon, Bordeaux, Clermont-Ferrand, Creteil, Colombes, Grenoble, Marseille, Monaco, Montpellier, Nancy, Paris and Versailles). This network was instituted by the French nonprofit foundation ‘FondaMental Foundation’. Patients were referred by their psychiatrist or general practitioner for assistance in the diagnosis and management of BD. Outpatients aged 16 years or above and meeting a DSM-IV diagnosis of type I, type II, or not otherwise specified (NOS) BD met the FACE-BD cohort inclusion criteria. The same evaluation package was used by all centers and the entire assessment was performed by members of a specialized multidisciplinary team. The study was conducted in compliance with the ethical principles of medical research involving human subjects (WMA, Declaration of Helsinki). The assessment protocol was approved by the relevant ethical review board (CPP-Ile de France IX, 18th January 2010). All data were collected anonymously. A more detailed description of the FACE-BD cohort is provided elsewhere (Henry et al., 2015).
Assessment
The Structured Clinical Interview for the Diagnostic and Statistical Manual of Mental Disorders–Fourth Edition (DSM-IV) Axis I Disorders (SCID) (Kübler, 2013) was used to record AAO, i.e. age at first mood episode. Variables related to neurodevelopmental impairment recorded in the FACE-BD cohort and considered for this study included cannabis misuse (along with age of first cannabis misuse when applicable) and a family history of BD in first degree relatives, which were both identified as part of the SCID. Childhood trauma was recorded with the Childhood Trauma Questionnaire (CTQ) (Bernstein, Ahluvalia, Pogge, & Handelsman, 1997) and childhood symptomatology of ADHD with the Wender Utah Rating Scale (WURS) (Gift, Reimherr, Marchant, Steans, & Reimherr, 2021). Childhood history of dyslexia, dysorthographia, dyscalculia, dysphasia, dyspraxia, language delay, stuttering, gait delay, and febrile seizures was recorded by a psychiatrist and a neuropsychologist. Variables related to gestation and the perinatal period, namely newborn length, weight, and skull circumference, gestational age, cesarean section, maternal and paternal age at birth, Apgar score at minute 1 and at minute 5 after birth (Apgar, 1966), and neonatal hospitalization, were collected by psychiatrists based on medical interview and medical file. A complete account of all variables recorded as part of the FACE-BD cohort is presented elsewhere (Henry et al., 2015). All available variables from the FACE-BD cohort that have been previously associated with aberrant neurodevelopment were included in this study, including variables with a known neurodevelopmental role but for which an association with BD has not been previously investigated to the best of our knowledge [e.g. febrile seizures (Nilsson et al., 2019)].
Machine learning strategies and data analysis
Early age at onset definitions
We set out to develop predictive models for early AAO, where the cut-off for binarizing early v. non-early AAO was set either by fitting a GMM to the AAO distribution or by systematically moving it between 14 and 25 y.o. We took a complete case approach, including only observations with no missing value for AAO.
We used both the Bayesian information criterion (BIC) and the Akaike information criterion (AIC) to compare different GMMs and to identify the optimum number of clusters in the range 2 to 10 (Raftery & Dean, 2006). The best fitting model was then used to assign each observation to the Gaussian for which the posterior probability of belonging is highest. The level of uncertainty of cluster allocation for an observation can be derived by subtracting the maximal posterior probability under a Gaussian from one. This value can be interpreted as the posterior probability of not belonging to the Gaussian maximizing the posterior. To the best of our knowledge, only one previous study (Hamshere et al., 2009) filtered out individuals based on cluster allocation uncertainty, using a threshold of 0.4. As the optimal cluster allocation uncertainty threshold cannot be determined a priori, we considered the effect of either enforcing no uncertainty threshold on GMM clustering or adopting progressively more stringent thresholds, namely >0.40, >0.20, and >0.10. The latter approach results in removing patients not meeting the required cluster allocation uncertainty level from further analyses, and is thereby expected to create purer and more separate clusters at the expense of a smaller sample size.
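The allocation and filtering steps described above can be sketched in a few lines. This is a minimal illustration, not the study's code: the three-component parameters in `COMPONENTS` are hypothetical values chosen for readability, the helper names are ours, and the filter is read as "keep a case only if its uncertainty is below the threshold", which is an assumption about the direction of the thresholds. The count of free parameters for a one-dimensional mixture with K components is taken as 3K − 1 (K means, K variances, K − 1 free weights).

```python
import math

# Hypothetical 3-component fit as (weight, mean, s.d.); illustrative values only,
# NOT the parameters estimated in the study.
COMPONENTS = [(0.45, 17.0, 3.0), (0.35, 26.0, 5.0), (0.20, 42.0, 9.0)]

def normal_pdf(x, mean, sd):
    return math.exp(-0.5 * ((x - mean) / sd) ** 2) / (sd * math.sqrt(2 * math.pi))

def posteriors(aao, components=COMPONENTS):
    """Posterior probability of each Gaussian given one age at onset."""
    joint = [w * normal_pdf(aao, m, s) for w, m, s in components]
    total = sum(joint)
    return [j / total for j in joint]

def allocate(aao, components=COMPONENTS):
    """Return (cluster index, allocation uncertainty = 1 - max posterior)."""
    post = posteriors(aao, components)
    best = max(range(len(post)), key=post.__getitem__)
    return best, 1.0 - post[best]

def filter_by_uncertainty(ages, threshold, components=COMPONENTS):
    """Keep (age, cluster) pairs whose uncertainty is below the threshold (assumed direction)."""
    kept = []
    for aao in ages:
        cluster, uncertainty = allocate(aao, components)
        if uncertainty < threshold:
            kept.append((aao, cluster))
    return kept

def aic_bic(ages, components=COMPONENTS):
    """AIC = 2k - 2 lnL and BIC = k ln(n) - 2 lnL, with k = 3K - 1 free parameters."""
    k = 3 * len(components) - 1
    log_lik = sum(
        math.log(sum(w * normal_pdf(a, m, s) for w, m, s in components)) for a in ages
    )
    return 2 * k - 2 * log_lik, k * math.log(len(ages)) - 2 * log_lik

# Toy ages at onset; a threshold of 1.0 is equivalent to no filtering.
ages = [13, 15, 16, 18, 20, 21, 22, 24, 27, 30, 35, 41, 48, 55]
kept_counts = [len(filter_by_uncertainty(ages, t)) for t in (1.0, 0.40, 0.20, 0.10)]
aic, bic = aic_bic(ages)
print(kept_counts, round(aic, 2), round(bic, 2))
```

In practice the mixture parameters would come from a fitted model, and the AIC/BIC comparison would be repeated for each candidate number of components; the sketch only shows how the quantities are derived once a fit is available.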
As regards the interval 14 to 25 y.o., within which possible cut-offs were systematically explored, 14 was chosen as the lower bound since it has been previously suggested to single out ‘earlier than normally expected AAO’ (Bolton et al., 2021). Furthermore, only 5.1% of patients with BD have an earlier onset (Solmi et al., 2022). The upper bound was informed by evidence showing that neurodevelopment ends at 25 years (Arain et al., 2013).
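The cut-off sweep itself is simple to express. The sketch below uses toy ages rather than cohort data, the function names are ours, and it assumes that a patient counts as ‘early’ when AAO is less than or equal to the cut-off; the text does not state whether the boundary is inclusive, so that choice is an assumption. It also reports the minority-class share, which is the quantity one would inspect to rule out severe class imbalance.

```python
def dichotomize(ages, cutoff):
    """Label a patient 1 ('early') if AAO <= cutoff, else 0 ('non-early').

    The inclusive boundary is an assumption made for this sketch.
    """
    return [1 if age <= cutoff else 0 for age in ages]

def early_proportions(ages, cutoffs=range(14, 26)):
    """Share of 'early' cases for every integer cut-off from 14 to 25 y.o."""
    return {c: sum(dichotomize(ages, c)) / len(ages) for c in cutoffs}

def minority_share(labels):
    """Proportion of the less frequent class, useful for spotting severe imbalance."""
    early = sum(labels) / len(labels)
    return min(early, 1.0 - early)

# Toy ages at onset (not cohort data).
toy_ages = [12, 14, 15, 17, 19, 22, 25, 28, 33, 40]
proportions = early_proportions(toy_ages)
for cutoff, share in proportions.items():
    labels = dichotomize(toy_ages, cutoff)
    print(cutoff, share, minority_share(labels))
```

Each cut-off yields a different binary target, so every downstream classifier is trained and evaluated once per cut-off.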
Supervised learning experiments
For each different early AAO definition, we trained classifiers to predict whether patients had either an ‘early’ or a ‘non-early’ AAO based on their neurodevelopmental variables. Two different classification algorithms were trained and evaluated: Elastic Net (ENET) and Extreme Gradient Boosting (XGBoost). Stratified nested cross-validation was used as the re-sampling strategy. This was preferred over a single train/test data split to obtain more robust and less biased estimates of classification performance (Cawley & Talbot, 2010). Specifically, in the outer resampling loop, ten pairs of training/test sets were produced. In each of these outer training sets, the optimum configuration of hyperparameters of the classification algorithms was selected through a random search optimizing the area under the receiver operating characteristic curve (AUROC) in a 5-fold cross-validation. The so-tuned classification algorithms were then fitted on each outer training set and their performance was evaluated on the outer test sets. AUROC was selected as the performance metric due to class imbalance and to favor comparisons in future investigations, since it is the typical choice in studies applying machine learning to BD. Lastly, severe class imbalance (i.e. a 1:99 class ratio), under which AUROC could be overly optimistic relative to metrics prioritizing identification of the minority class, e.g. precision-recall curves (Cook & Ramadas, 2020), was ruled out during exploratory data analysis (EDA).
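The structure of the resampling scheme (ten stratified outer folds, a random search scored by 5-fold inner cross-validation, AUROC on the held-out outer fold) can be sketched with the standard library alone. The sketch is a stand-in under stated assumptions: a small L2-penalized logistic regression replaces ENET and XGBoost, the search space is a single penalty drawn log-uniformly, the data are synthetic, and all function names are ours.

```python
import math
import random

def auroc(y_true, scores):
    """Probability that a random positive outranks a random negative (ties count 0.5)."""
    pos = [s for y, s in zip(y_true, scores) if y == 1]
    neg = [s for y, s in zip(y_true, scores) if y == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

def stratified_folds(y, k, rng):
    """Deal the shuffled indices of each class round-robin into k folds."""
    folds = [[] for _ in range(k)]
    for label in (0, 1):
        idx = [i for i, v in enumerate(y) if v == label]
        rng.shuffle(idx)
        for j, i in enumerate(idx):
            folds[j % k].append(i)
    return folds

def fit(X, y, l2, epochs=50, lr=0.1):
    """L2-penalized logistic regression by gradient descent (stand-in learner)."""
    w, b = [0.0] * len(X[0]), 0.0
    for _ in range(epochs):
        gw, gb = [0.0] * len(w), 0.0
        for xi, yi in zip(X, y):
            z = b + sum(wj * xj for wj, xj in zip(w, xi))
            err = 1.0 / (1.0 + math.exp(-z)) - yi
            gw = [g + err * xj for g, xj in zip(gw, xi)]
            gb += err
        w = [wj - lr * (g / len(X) + l2 * wj) for wj, g in zip(w, gw)]
        b -= lr * gb / len(X)
    return w, b

def score(model, X):
    w, b = model
    return [b + sum(wj * xj for wj, xj in zip(w, xi)) for xi in X]

def take(seq, idx):
    return [seq[i] for i in idx]

def split(folds, k):
    """Return (train indices, test indices) with fold k held out."""
    return [i for j, f in enumerate(folds) if j != k for i in f], folds[k]

def cv_auroc(X, y, l2, folds):
    """Mean AUROC of one hyperparameter value across the given folds."""
    aucs = []
    for j in range(len(folds)):
        tr, te = split(folds, j)
        model = fit(take(X, tr), take(y, tr), l2)
        aucs.append(auroc(take(y, te), score(model, take(X, te))))
    return sum(aucs) / len(aucs)

def nested_cv(X, y, outer_k=10, inner_k=5, n_candidates=3, seed=0):
    rng = random.Random(seed)
    outer = stratified_folds(y, outer_k, rng)
    test_aucs = []
    for j in range(outer_k):
        tr, te = split(outer, j)
        X_tr, y_tr = take(X, tr), take(y, tr)
        inner = stratified_folds(y_tr, inner_k, rng)
        # Random search: sample penalties log-uniformly, keep the best inner-CV AUROC.
        candidates = [10 ** rng.uniform(-4, 0) for _ in range(n_candidates)]
        best = max(candidates, key=lambda l2: cv_auroc(X_tr, y_tr, l2, inner))
        # Refit on the whole outer training set, evaluate once on the outer test fold.
        test_aucs.append(auroc(take(y, te), score(fit(X_tr, y_tr, best), take(X, te))))
    return test_aucs

# Synthetic data: 30 'early' and 70 'non-early' cases, one informative feature.
data_rng = random.Random(1)
y = [1] * 30 + [0] * 70
X = [[data_rng.gauss(1.5 * label, 1.0), data_rng.gauss(0.0, 1.0)] for label in y]
test_aucs = nested_cv(X, y)
mean_auc = sum(test_aucs) / len(test_aucs)
print(round(mean_auc, 3))
```

Because hyperparameters are chosen using the outer training data only, the ten outer-fold AUROC values are not contaminated by the tuning step, which is the reason nested cross-validation is preferred over tuning and testing on the same split.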
An illustration of the supervised learning experiments' aim is reported in Fig. 1. A sketch of the analysis workflow and a fuller description of the supervised learning experiments pipeline are provided in online Supplementary Fig. S1 and Appendix.
Unsupervised learning
We tested the overlap between a data-driven grouping of the patients based on neurodevelopmental factors and the different early AAO definitions used in our supervised learning experiments. To this end, we adopted Deep Embedded Clustering (DEC) (Xie, Girshick, & Farhadi, 2015). DEC is a deep neural network (DNN) powered approach to cluster analysis. It capitalizes on the ability of DNNs, which stems from their inherent universal function approximation properties, to learn a more clustering-friendly representation of the original data matrix, and it iteratively optimizes a clustering objective in the learned lower-dimensional feature space until a stopping criterion is met. DEC was shown to outperform vanilla k-means as well as extensions of k-means designed to operate in kernel feature space or directly on the representations learned by a DNN (Aljalbout, Golkov, Siddiqui, Strobel, & Cremers, 2018). Based on previous literature, we postulated the existence of two BD clusters (k = 2) with respect to neurodevelopmental load. Once cluster membership was assigned with DEC, the data-driven grouping was matched against the different early AAO definitions using normalized mutual information (NMI) (McDaid, Greene, & Hurley, 2013). NMI can take values between 0 (no mutual information) and 1 (perfect correlation). In our case, when comparing the data-driven clustering obtained from DEC to clinical annotations, higher values (closer to 1) should be expected if the two groupings substantially overlap. Conversely, lower values (closer to 0) would indicate little overlap between the two groupings.
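For readers unfamiliar with NMI, the comparison between two groupings can be sketched as follows. Several normalizations of mutual information exist; this sketch divides by the arithmetic mean of the two label entropies, which is an assumption rather than the variant confirmed by the text, and the label vectors are toy values.

```python
import math
from collections import Counter

def entropy(labels):
    """Shannon entropy (natural log) of a label vector."""
    n = len(labels)
    return -sum((c / n) * math.log(c / n) for c in Counter(labels).values())

def mutual_information(a, b):
    """Mutual information between two label vectors of equal length."""
    n = len(a)
    count_a, count_b, joint = Counter(a), Counter(b), Counter(zip(a, b))
    return sum(
        (c / n) * math.log(c * n / (count_a[i] * count_b[j]))
        for (i, j), c in joint.items()
    )

def nmi(a, b):
    """MI divided by the mean of the two entropies (assumed normalization)."""
    h_a, h_b = entropy(a), entropy(b)
    if h_a == 0.0 and h_b == 0.0:
        return 1.0  # both groupings are a single cluster, hence identical
    return mutual_information(a, b) / ((h_a + h_b) / 2)

# Toy example: cluster labels from a clustering step v. an early/non-early label.
cluster_labels = [0, 0, 0, 1, 1, 1, 1, 1]
early_labels = [1, 1, 0, 0, 0, 0, 0, 0]
print(nmi(cluster_labels, early_labels))
```

NMI is invariant to how the clusters are numbered, so a clustering that reproduces the early v. non-early split with swapped labels still scores 1.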
Results
Study population
4421 patients have been recruited in the FACE-BD cohort since the inception of expert centers in France. 423 cases were missing values for AAO and were subsequently removed from analyses, reducing the sample size for this study to 3998. Patients had a mean age of 40.45 (s.d. = 13.06). Females constituted 60.05% (n = 2401) of the sample. Type I BD was diagnosed in 45.37% of patients (n = 1814), type II BD in 43.27% (n = 1730), while 11.35% (n = 454) had a diagnosis of BD NOS. Variables from the FACE-BD cohort considered for this study are shown in online Supplementary Table S1.
Supervised learning experiments
In order to define clinical subgroups based on AAO (a numerical variable), two different approaches were adopted: GMM clustering and dichotomization varying the cut-off in the range 14 to 25 y.o.
GMM based definition
Based on AIC and BIC model selection, we identified a three-cluster solution as the best fitting admixture model (Fig. 2). In line with previous reports (Joslyn, Hawes, Hunt, & Mitchell, 2016), patients assigned to cluster 1, that is the Gaussian with the lowest mean value of AAO, were considered as ‘early AAO’ while patients from either cluster 2 or 3 were labeled as ‘non-early AAO’. Results from GMM clustering and subsequent filtering of cases on cluster allocation uncertainty at different thresholds are reported in Table 1. Following this approach, GMM clustering reduces to dichotomizing patients into early and non-early AAO, where the cut-off between the two groups is set in a data-driven fashion based on the fit of an admixture model. As a more stringent threshold is used for filtering cases based on cluster allocation uncertainty, the early and non-early AAO groups are progressively pulled apart and patients close to the boundaries are progressively dropped.
Cases assigned to cluster 1 were considered as early age at onset (AAO), whereas cases assigned to either cluster 2 or 3 were considered non-early AAO. Filtering cases on cluster allocation uncertainty using progressively more stringent thresholds results in increasingly pulling apart early from non-early AAO patients at the expense of more cases being removed from further analyses. This can be seen from the maximum AAO of Cluster 1 decreasing and the minimum AAO of Cluster 2 increasing as the uncertainty threshold for filtering becomes more conservative.
Nested cross-validation performance across learners is reported in Fig. 3, illustrating to what degree classifiers could differentiate early v. non-early AAO based on the patients' neurodevelopmental variables. The highest average test AUROC resulted from adopting the most conservative uncertainty threshold (⩾0.1): XGBoost achieved an average 0.6970 (s.d. = 0.0195) test AUROC with ENET close behind at 0.6935 (s.d. = 0.0151). Enforcing this threshold was equivalent to setting the maximum AAO value in the early AAO group to 20 y.o. and the minimum AAO value in the non-early AAO group to 29 y.o. Using less conservative uncertainty thresholds up to the point of no filtering drew the two AAO groups closer together increasing the maximum value allowed for defining early AAO. In this manner, as the maximum AAO value in the early AAO group increased as a result of less stringent filtering, the average test AUROC across both XGBoost and ENET deteriorated.
Cut-offs in the range 14 to 25 years old
As an alternative to GMM clustering, we investigated how defining early v. non-early AAO patients by dichotomization with different cut-offs in the range 14 to 25 y.o. would affect separability in neurodevelopmental feature space. The proportions of early v. non-early AAO cases in our cohort ranged from 9.05% v. 90.95% for the 14 y.o. cut-off to 68.63% v. 31.37% for the 25 y.o. cut-off. Average AUROC across outer test sets and one standard deviation are illustrated in Fig. 4. The highest average test AUROC was attained by XGBoost for the 16 y.o. cut-off (0.7327, s.d. = 0.0169). Overall, both classification algorithms reached higher (>0.70) average test AUROC values when the cut-off was kept in the range 14 to 16 y.o., while for cut-offs above 16 y.o. the performance showed a downward trend.
Unsupervised learning experiments
The use of an unsupervised learning approach was motivated by the lack of a ground-truth in terms of which early AAO definition best picks out distinctive neurodevelopmental pathways. Assuming the existence of two clusters in the neurodevelopmental feature space, DEC separated our cohort into 1314 (29.72%) and 3107 (70.28%) patients. NMI between DEC cluster assignments and the early AAO definitions explored in the supervised learning experiments was highest for early AAO as defined with a 17 y.o. cut-off (NMI = 0.41). NMI values for the different early v. non-early AAO definitions are reported in Fig. 4. Overall, NMI values across different definitions mirrored results from supervised learning experiments, showing that the use of an AAO cut-off up to 17 y.o. led to a greater overlap with the data-driven grouping by DEC. Similarly, a more conservative uncertainty threshold in GMM clustering, hence a lower maximum AAO value in the early AAO group, was associated with greater overlap with DEC clustering.
Discussion
BD may be considered as a clinical construct comprising manifold phenotypes, probably reflecting heterogeneous pathophysiological pathways. In this respect, breaking it down into more homogeneous and biologically grounded entities would not only improve our understanding of the disorder but also help identify clinical subgroups that might benefit from bespoke treatment and management. Based on the existing literature, one such subgroup of patients with BD is characterized by a distinctive neurodevelopmental load relative to other patients, and would have an early AAO as its clinical hallmark (Kloiber et al., 2020). In the present study, we addressed the question of which AAO cut-off would be more appropriate for differentiating BD patients based on their neurodevelopmental load using both a supervised and an unsupervised learning approach.
In the supervised learning experiments, to test the usefulness of different cut-offs towards revealing distinguishable neurodevelopmental loads, we binarized patients into early v. non-early AAO groups by systematically moving the cut-off in the range 14 to 25 y.o. Our results demonstrated that setting the cut-off at 16 y.o. led to the highest average AUROC value (0.7327, s.d. = 0.0169), suggesting that this definition of early AAO maximizes patients' separability from neurodevelopmental features. Interestingly, on the one hand, 16 y.o. was lower than any of the cut-offs derived with GMM analysis, even after accounting for cluster allocation uncertainty. On the other hand, this cut-off is higher than the one (i.e. 14 y.o.) previously posited as ‘earlier than expected’ AAO by Bolton et al. (2021). Moreover, we demonstrated that the early AAO cut-off induced by GMM resulted in lower separability based on neurodevelopmental factors. The best GMM fit in our cohort was in line with those from previous reports (Joslyn et al., 2016), placing most patients under the Gaussian with the lowest mean and therefore treating early AAO as the most common condition. Admixture analysis on AAO ultimately amounts to setting a cut-off between early v. non-early AAO in a data-driven fashion (where the cut-offs depend solely on the AAO distribution). The limited ability of admixture analysis to find AAO cut-offs that are biologically meaningful in terms of neurodevelopmental pathways may be explained by two arguments. First, our results showed that a stronger signal, in terms of separability of early v. non-early AAO cases from neurodevelopmental history, emerged as a result of more stringent filtering on cluster allocation uncertainty. Indeed, a low cluster allocation uncertainty, i.e. >0.10, lowered the maximum AAO value in the early AAO group to 20 y.o., which corresponded to the highest average test AUROC. However, the price for such filtering was a sharp drop in sample size (by 37.64%), which limits the application of this approach in clinical research. Of note, to the best of our knowledge, cluster allocation uncertainty filtering was implemented in only one previous study, by Bolton et al. (2021). Second, as argued by Montlahuc et al. (Montlahuc, Curis, Jonas, Bellivier, & Chevret, 2017), since episodes of BD are difficult to recognize in early adolescence, especially during retrospective interviews of adult patients, the AAO distribution gets left-truncated, which may have a major impact on the number of detected clusters and explain the absence of a cluster in the left tail of the AAO distribution.
The need to set up multiple supervised learning experiments and explore different AAO cut-off definitions was motivated by the lack of knowledge around the best definition of ‘early’ AAO, conceptualized as a criterion that could point towards distinctive neurodevelopmental antecedents. As a complementary strategy, we adopted an unsupervised learning approach to examine how data-driven labels would compare against the various early AAO definitions used in the supervised learning experiments. The highest degree of overlap between data-driven clusters and the grouping induced by the different early v. non-early AAO definitions was recorded when lower cut-offs were used to define early AAO. The highest NMI was 0.41, achieved when early AAO was set to be ⩽17 y.o., consistent with the results from our supervised learning experiments. In other words, the unsupervised learning approach lent further support to the notion, emerging from our supervised learning experiments, that an AAO below 17 y.o. better corresponds to distinctive neurodevelopmental pathways than any of the groupings retrieved with GMM. Taken together, results from supervised and unsupervised learning experiments convergently show that the clinical label of early AAO best captures distinctive neurodevelopmental patterns when defined with a cut-off ⩽17 y.o. This result should be taken into consideration to improve patient stratification in future investigations of BD.
The results in this report should be balanced against several limitations. To the best of our knowledge, this is the first study using a machine learning framework to predict early AAO from neurodevelopmental antecedents. Therefore, we could not compare our results against other reports pursuing the same (or a comparable) aim. Given the post-hoc nature of this analysis, several neurodevelopmental features reported in previous works as related to BD [for example perinatal infections, minor physical anomalies (MPAs), dermatoglyphic anomalies, soft neurological signs, ASD comorbidity, and family history of SCZ (Kloiber et al., 2020)] were not available in this study. Furthermore, the very high rate (>80%) of missing values for some neurodevelopmental variables (newborn length, weight, and skull circumference, Apgar score at minute 1 and at minute 5 after birth) forced us to exclude these from the predictors. While our sample size was large relative to most previous studies, we lacked an independent cohort for estimating out-of-sample generalization. Another important limitation we should acknowledge is the lack of biological data, such as genomics and neuroimaging. Indeed, our results relied only on clinical variables collected during retrospective interviews of adult patients; these can be affected by recall bias. Furthermore, as neurodevelopmental abnormalities in BD were also described in terms of genomics [e.g. (Cao et al., 2014)] and neuroimaging [e.g. (Sarrazin et al., 2018)] findings, these modalities too should be studied for their predictive role. On the other hand, our work may help to stratify patients more accurately in future studies.
Supplementary material
The supplementary material for this article can be found at https://doi.org/10.1017/S003329172300020X
Acknowledgments
We are grateful to the participating subjects and to all the collaborators of the FondaMental Foundation.
Financial support
This study was supported by the Fondation FondaMental, Créteil, France, and the Investissements d'Avenir Programs managed by the ANR under references ANR-11-IDEX-0004-02 and ANR-10-COHO-10-01.
It was also supported by the United Kingdom Research and Innovation (grant EP/S02431X/1), UKRI Centre for Doctoral Training in Biomedical AI at the University of Edinburgh, School of Informatics.
Conflict of interest
Ludovic Samalin has received grants, honoraria, or consulting fees from Janssen, Lundbeck and Otsuka. Pierre-Michel Llorca has received grants, honoraria, or consulting fees from Otsuka, Lundbeck, Eisai, Sanofi, Testimony.
The other authors declare no conflict of interest, including the contributing members of the and FACE-BD Groups.
Ethical standards
The authors assert that all procedures contributing to this work comply with the ethical standards of the relevant national and institutional committees on human experimentation and with the Helsinki Declaration of 1975, as revised in 2008.