Introduction
Bipolar disorder (BD) frequently emerges during adolescence (Lewinsohn, Klein, & Seeley, Reference Lewinsohn, Klein and Seeley2000; Lewinsohn, Seeley, Buckley, & Klein, Reference Lewinsohn, Seeley, Buckley and Klein2002; Perlis et al., Reference Perlis, Miyahara, Marangell, Wisniewski, Ostacher, DelBello and Investigators2004). Identification of treatment-specific predictors of drug therapies is important because only about half of individuals with BD respond to any specific medication (Hirschfeld et al., Reference Hirschfeld, Keck, Kramer, Karcher, Canuso, Eerdekens and Grossman2004; Keck et al., Reference Keck, Marcus, Tourkodimitris, Ali, Liebeskind and Saha2003; Keck, Welge, Strakowski, Arnold, & McElroy, Reference Keck, Welge, Strakowski, Arnold and McElroy2000). Reliable predictive models could guide personalized therapeutics and implementation of supplemental drug and psychotherapeutic interventions to decrease the high levels of morbidity and mortality associated with the disorder.
Interest in identifying neuroimaging biomarkers of treatment outcome in BD has received increased attention over the past two decades (Ketter & Wang, Reference Ketter and Wang2002; Lim et al., Reference Lim, Baldessarini, Vieta, Yucel, Bora and Sim2013). For example, Passarotti, Sweeney, and Pavuluri (Reference Passarotti, Sweeney and Pavuluri2011) used task-based functional magnetic resonance imaging (MRI) to predict the clinical effects of antipsychotic treatment in pediatric BD. Moore et al. (Reference Moore, Cortese, Glitz, Zajac-Benitez, Quiroz, Uhde and Manji2009) found total brain gray matter volume was increased in treatment-responsive adults with BD after 4 weeks of lithium treatment. Fleck et al. (Reference Fleck, Ernest, Adler, Cohen, Eliassen, Norris and Strakowski2017) used magnetic resonance spectroscopy to predict medication response in patients with BD, and Zhang et al. (Reference Zhang, Xiao, Sun, Patino, Tallman, Weber and DelBello2018) showed that pre-treatment cortical thickness measures predicted response to an antipsychotic drug in youth with BD. Among these measures, brain morphometric measures have been most widely used to predict and track the effects of drug therapies for psychiatric disorders (Hazlett et al., Reference Hazlett, Gu, Munsell, Kim, Styner, Wolff and Statistical2017; Vieira et al., Reference Vieira, Gong, Pinaya, Scarpazza, Tognin, Crespo-Facorro and Mechelli2020).
Quetiapine and lithium are effective and widely used treatments for BD. Different neuroprotection effects between quetiapine and lithium have been reported in previous studies. Specifically, lithium was found to be more effective than quetiapine in slowing the progression of white matter abnormalities (Berk et al., Reference Berk, Dandash, Daglas, Cotton, Allott, Fornito and Yücel2017). Dandash et al. found that lithium exhibited higher efficiency in reversing the hyperconnectivity of striatal areas than quetiapine (Dandash et al., Reference Dandash, Yücel, Daglas, Pantelis, McGorry, Berk and Fornito2018). For brain gray matter morphometry, our previous work suggested that discrete patterns of baseline cortical thickness in temporal and parietal regions differentially predicted treatment response of quetiapine but not lithium (Zhang et al., Reference Zhang, Xiao, Sun, Patino, Tallman, Weber and DelBello2018). Given the different neural pathways impacted by quetiapine and lithium, comparing the discriminative regions in these two predictive models may provide insights into the neuropharmacological mechanisms of clinical efficacy of the two drugs. While there has been a study using clinical features as predictors to build prediction models for these two drugs separately (Kim et al., Reference Kim, Dufour, Xu, Cohen, Sylvia, Deckersbach and Nierenberg2019), models based on MRI features have not yet been established.
Most studies of neuroimaging predictors of treatment outcome have examined individuals with long-term BD who had received extensive treatment in the past. Therefore, some findings in prior work may have been impacted by the effects of chronicity and previous or current medications. Studying recent onset cases with limited treatment history may better illuminate medication effects on illness processes and predictive brain features, as well as provide more clinically useful information. Further, studying early course individuals with BD may provide more clinically useful information because an extensive clinical history is not available to guide treatment decisions (Brooks & Vizueta, Reference Brooks and Vizueta2014). Third, focusing on patients experiencing an acute mixed/manic episode can facilitate assessment of MRI features in relation to treatment outcome. Fourth, preclinical studies have revealed changes in brain systems induced by short-term administration of lithium and quetiapine. For example, 7 days of lithium therapy resulted in changes in forebrain membrane properties in a rat model of mania (Vošahlíková, Roubalová, Brejchová, Alda, & Svoboda, Reference Vošahlíková, Roubalová, Brejchová, Alda and Svoboda2021), and 1-week treatment of quetiapine was found to enhance the level of hippocampal neurogenesis (Luo, Xu, & Li, Reference Luo, Xu and Li2005). In addition, evidence from clinical trials based on neuroimaging techniques also support brain changes following short-term quetiapine or lithium therapy. Brain D2 dopamine receptor occupancy can be observed following acute quetiapine treatment (Nord et al., Reference Nord, Nyberg, Brogren, Jucaite, Halldin and Farde2011), and significant decreases in myoinositol levels were identified in the right frontal cortex after 5–7 days of lithium administration (Moore et al., Reference Moore, Bebchuk, Parrish, Faulk, Arfken, Strahl-Bevacqua and Manji1999). Moreover, our previous study using brain structural MRI found youth with BD exhibited changes in brain structural network following 1-week quetiapine or lithium treatment (Lei et al., Reference Lei, Li, Tallman, Patino, McNamara, Strawn and DelBello2021). Although these findings suggest that brain changes can occur after acute drug treatment, few studies investigated whether such acute brain changes can serve as predictors of treatment outcome.
Previous studies have suggested that deep learning, a type of machine learning capable of capturing high orders of complexity and abstraction (Kim, Calhoun, Shim, & Lee, Reference Kim, Calhoun, Shim and Lee2016; LeCun, Bengio, & Hinton, Reference LeCun, Bengio and Hinton2015; Lei et al., Reference Lei, Pinaya, van Amelsvoort, Marcelis, Donohoe, Mothersill and Mechelli2020), may yield higher classifier accuracy than the current widely adopted traditional machine learning models (Pinaya et al., Reference Pinaya, Gadelha, Doyle, Noto, Zugman, Cordeiro and Sato2016; Vieira, Pinaya, & Mechelli, Reference Vieira, Pinaya and Mechelli2017). It has been reported that brain structure is undergoing non-linear trajectory of brain structural maturational changes in typically developing youth (Giedd et al., Reference Giedd, Blumenthal, Jeffries, Castellanos, Liu, Zijdenbos and Rapoport1999; Tamnes et al., Reference Tamnes, Herting, Goddings, Meuwese, Blakemore, Dahl and Mills2017). Previous studies have also found non-linear characteristics in brain alterations associated with BD and increased brain complexity in mania (Bahrami, Seyedsadjadi, Babadi, & Noroozian, Reference Bahrami, Seyedsadjadi, Babadi and Noroozian2005; Fernández, Al-Timemy, Ferre, Rubio, & Escudero, Reference Fernández, Al-Timemy, Ferre, Rubio and Escudero2018). Considering additional complicated neuroprogression and medication effects, non-linear models may be better positioned to address the brain complexity involved in a randomized clinical trial of pediatric BD compared with linear methods (e.g. principal component analysis, sparse learning) (Hazlett et al., Reference Hazlett, Gu, Munsell, Kim, Styner, Wolff and Statistical2017). Deep neural network can precisely recognize the most differentiable features related to medication response from the complex longitudinal structural patterns in a non-linear way, and facilitate the extraction of optimal low-dimensional representations for clinicians unequipped with expert feature engineering knowledge. We therefore utilized a two-stage prediction pipeline that includes a deep neural network component for non-linear dimensionality followed by an additional support vector machine (SVM) classifier (Hazlett et al., Reference Hazlett, Gu, Munsell, Kim, Styner, Wolff and Statistical2017).
With these considerations in mind, we recruited a cohort of young participants with BD who were early in their illness course and randomized to receive quetiapine or lithium trial. Our aims were to determine the utility of pre-treatment morphometric data and change in brain structure after 1 week of treatment for the prediction of treatment response at the end of the 6 weeks trial. We hypothesized that (i) baseline image data could significantly predict medication response in bipolar youth at the individual level; and (ii) change in morphometric features from baseline to 1 week after treatment initiation would predict treatment response, and the longitudinal joint model combining baseline data and week 1 structural changes data would predict medication response with highest accuracy. Finally, considering that the drug mechanisms of lithium and quetiapine are different (Ketter, Miller, Dell'Osso, & Wang, Reference Ketter, Miller, Dell'Osso and Wang2016), we hypothesized that (iii) pre-treatment morphometric features and their change may be differentially related to outcome prediction in the two treatment groups.
Methods
Participants
This study was approved by the University of Cincinnati and the Cincinnati Children's Hospital Medical Center Institutional Review Boards. Youth with bipolar I disorder were recruited from the Cincinnati Children's Hospital Medical Center, the University of Cincinnati, and the local community. Diagnosis of bipolar I disorder was confirmed using the Washington University in St. Louis Kiddie Schedule of Affective Disorders and Schizophrenia administered by raters with demonstrated inter-rater reliability (κ > 0.9) (Geller et al., Reference Geller, Zimerman, Williams, Bolhofner, Craney, DelBello and Soutullo2001). Young Mania Rating Scale (YMRS) was used to assess mania symptoms (Young, Biggs, Ziegler, & Meyer, Reference Young, Biggs, Ziegler and Meyer1978). Written informed assent and consent about the study procedures and purpose were provided by all participants and their legal guardians (registered in https://clinicaltrials.gov/; registration number: NCT00893581).
To be included, participants were required to be: (1) within the age range of 10–18 years old; (2) experiencing a manic or mixed episode; (3) having a baseline YMRS score ⩾20; (4) within 2 years from onset of BD; (5) having no prior psychiatric hospitalizations for mania; (6) having no treatment history with therapeutic doses of antipsychotics or mood stabilizers for over 3 months, and no psychotropic medication during the week prior to baseline scans. Participants were excluded if they had (1) a contraindication to MRI scanning; (2) an IQ <70; (3) a positive pregnancy test; (4) a history of a major systemic or neurological illness, or an episode of loss of consciousness >10 min; (5) any lifetime DSM-IV-TR substance use disorder (nicotine dependence was permitted); or (6) a lifetime DSM-IV-TR diagnosis of a pervasive developmental disorder or post-traumatic stress disorder.
Treatment procedures
Following initial clinical evaluation and MRI scanning, participants were randomized by an investigational pharmacist (C.C.K.) to double-blind treatment with lithium or quetiapine and evaluated clinically weekly for 6 weeks. The randomization schedule was stratified by presence/absence of attention deficit and hyperactivity disorder, presence/absence of psychosis, and the mood state (mixed v. manic episode). Quetiapine was initiated at 100 mg qhs and lithium carbonate was initiated at 30 mg/kg (maximum starting dose of 600 mg twice daily). Patients were also given placebo capsules for the medication to which they were not assigned. Quetiapine was titrated to a target dose of 400–600 mg/day based on tolerability and response. Lithium was titrated to a serum level of 1.0–1.2 mEq/L. Treatment was administered in a double-dummy, double-blind manner, with an unblinded study psychiatrist monitoring trough lithium levels and making dose adjustments independent from treating psychiatrist and clinical raters. Blinded clinical tolerability dose adjustment recommendations took precedent over un-blinded double dummy dose adjustment recommendations. There were no significant changes of the treatment methods or outcomes after trial commencement. We used the YMRS changes to assess antimanic treatment response at week 6, and treatment responder was defined as exhibiting ⩾50% reduction in YMRS scores from baseline (Patino et al., Reference Patino, Klein, Strawn, Blom, Tallman, Adler and DelBello2021; Wegbreit et al., Reference Wegbreit, Ellis, Nandam, Fitzgerald, Passarotti, Pavuluri and Stevens2011).
MRI acquisition
MRI scanning was performed on a 4-T Varian Unity INOVA scanner using a 12-channel head coil at baseline. All participants were instructed to be scanned at baseline prior to treatment and 1 week after treatment initiation. Earplugs and headphones were provided to reduce background noise, and foam padding around the head minimized head motion. Following a three-plane gradient echo scan for alignment and localization, a shim procedure was performed to generate a homogeneous magnetic field. High-resolution T1-weighted three-dimensional images were acquired with a Modified Driven Equilibrium Fourier Transform (MDEFT) protocol, optimized for the 4T Varian scanner [Tau (magnetization preparation time) = 1.1 s, TR = 13 ms, TE = 5.3 ms, field of view = 192 × 256 × 256 mm, matrix = 192 × 256 × 256, flip angle = 20 degrees, slice thickness = 1 mm]. Acquired T1-weighted images were inspected by two experienced neuroradiologists who made decisions about excessive motion artifact for scan exclusion. No observable scanning artifacts or gross brain abnormalities were identified in any participant included in the following analyses.
Image processing and brain morphometric feature sets
All structural MRI scans were processed on the same workstation using the FreeSurfer image analysis suite v6.0.0 (http://surfer.nmr.mgh.harvard.edu/) to obtain unbiased estimates of morphometric measures, including surface area, cortical thickness, and subcortical volumes (for detailed calculation and extraction of these morphometric measures, see online Supplementary Materials). Each participant was thus represented by a 150-dimensional feature vector consisting of morphometric features across the whole brain. Specifically, we included the surface area and cortical thickness of 68 cortical regions labeled in the Desikan/Killiany Atlas (Desikan et al., Reference Desikan, Segonne, Fischl, Quinn, Dickerson, Blacker and Killiany2006), as well as the volume of bilateral hippocampus and 12 subcortical regions (i.e. bilateral thalamus, amygdala, caudate, putamen, pallidum, and accumbens). To compare the prediction value of baseline brain structure and acute structural alterations induced by medication, the model performance was investigated for baseline morphometric measures and morphometric changes from baseline to week 1 independently. To thoroughly take advantage of longitudinal brain morphometry related to antimanic treatment for better model performance, we further combined baseline and week 1 morphometric features into a concatenated 300-dimensional vector as our longitudinally joint model. Therefore, under the unified prediction framework, there are finally three classification models based on different feature sets (i.e. baseline model, 1-week change model, and longitudinally joint model) to be tested in two medication groups.
Two-stage prediction model based on structural MRI
We implemented a two-stage prediction pipeline to differentiate medication responders from non-responders as described in previous prospective studies (Hazlett et al., Reference Hazlett, Gu, Munsell, Kim, Styner, Wolff and Statistical2017; Zhu et al., Reference Zhu, Lei, Qin, Suo, Li, Li and Gong2021; Yang et al., Reference Yang, Lei, Qin, Pinaya, Suo, Li and Gong2021). A feedforward multi-layer neural network was adopted as the initial stage for dimensionality reduction (Hinton & Salakhutdinov, Reference Hinton and Salakhutdinov2006), and SVM was included as the second stage to individually discriminate responders from non-responders (Cortes & Vapnik, Reference Cortes and Vapnik1995). The training procedures of neural network for dimensionality reduction mainly contain binary processing of raw features, unsupervised pre-training of stacked autoencoders, and supervised training of fine-tuned neural network. Once the training was finished, the representations stored in the layer before output layer were extracted as the optimal features fed into SVM. Detailed information about the rationale, architecture, and training process of the two-stage prediction model is shown in online Supplementary Materials.
The whole two-stage prediction pipeline was trained and evaluated using 10-fold stratified cross-validation. For each iteration of the cross-validation, we used one part of the dataset to evaluate our model (i.e. testing set) and the remaining nine parts for training (i.e. training set). To avoid information leakage, we performed the non-linear dimensionality reduction (i.e. training of neural network) only based on the training set, and the testing set was only used to evaluate performance and never used for tuning or training our models. The model performance was determined by the balanced accuracy, sensitivity, specificity, and area under receiver operating characteristic curve (AUC). We independently reported the model performance in quetiapine and lithium medication group. All the machine learning analyses were programmed using Python language, where the neural network was implemented in the Pytorch library (Paszke et al., Reference Paszke, Gross, Massa, Lerer, Bradbury, Chanan and Chintala2019), and the SVM was implemented in the Scikit-Learn library (Pedregosa et al., Reference Pedregosa, Varoquaux, Gramfort, Michel, Thirion, Grisel and Louppe2011). All the codes are available at https://github.com/QKmeans0902/Two_stage_prediction_pipeline.
Model transferability between different medication groups
In this analysis, we examined the model transferability between two medication groups. Specifically, we used dataset from one whole treatment group (quetiapine or lithium) for drug-specific model training and optimization, and the other group to evaluate the predictive utility of the model for treatment outcome to the other drug therapy. To maintain consistency with preceding analyses, model transferability test was also implemented in the identical two-stage prediction pipeline.
Features with greatest contributions to predicting medication outcomes
Given morphometric features with the highest predictive power may differ in different medication groups which can provide useful neurobiological implications, we sought to identify these potentially discriminating patterns. Detailed calculation of feature contribution is presented in online Supplementary Materials. Top 10 features with greatest contributions are reported for different models and medication groups.
Statistical analysis
Group comparisons of demographic and clinical characteristics were performed using SPSS software [version 23 (IBM SPSS Statistics)]. The significance level of prediction model was evaluated using permutation test (Golland & Fischl, Reference Golland and Fischl2003). Specifically, we randomly permuted the labels of training set prior to training. The model training and evaluation under cross-validation were then performed to get the estimated accuracy based on the permuted dummy training set. The permutation was repeated 1000 times, and a distribution of the accuracy for randomly permuted data was obtained. The significance level was thus defined as the probability of observing an accuracy from the distribution based on permuted data no less than the real accuracy. Differences in age and IQ were assessed by two-sample t test, while differences in sex and number of responders were assessed using χ2 test. The p value of differences in parental socioeconomic status was calculated by Mann–Whitney U test given the non-normality, and the statistical test for differences in longitudinal clinical scale scores was two-way repeated measures ANOVA.
Results
Demographic and clinical characteristics
This prospective randomized clinical trial recruited 149 youth with BD. Data from nine participants were excluded due to failure to finish baseline structural MRI scan or excessive head motion. Eleven participants in the lithium treatment group (n = 61) and eight participants in the quetiapine treatment group (n = 79) were lost to follow-up. Ultimately, 71 participants treated with quetiapine and 50 participants treated with lithium who had completed all follow-up structural MRI examinations and clinical assessments were finally included in the analysis.
The demographic and clinical characteristics are listed in Table 1 and online Supplementary Table S1. No evidence of significant differences in age, sex, IQ, parental socioeconomic status, or number of responders was found between the quetiapine and lithium groups (all p > 0.05). No significant time × treatment group interaction effects in YMRS scores were observed. For the group effect, we found no significant between-group differences in YMRS scores at any follow-up timepoint (p > 0.05). For the time effect, the YMRS scores were significantly decreased from baseline to week 6 in both quetiapine and lithium groups (p < 0.001).
IQ, intelligence quotient; PSES, parental socioeconomic status; BMI, body mass index; YMRS, Young Mania Rating Scale; CDRS-R, Children's Depression Rating Scale - Revised; CGI-S, Clinical Global Impressions-Severity.
Note: Data are presented as mean ± standard unless otherwise indicated. The p values were calculated by two-sample t test or χ2 test unless otherwise indicated.
1 Age are presented as mean ± standard with age range in parenthesis.
2 Data were available in 111 of 121 participants.
3 Data were available in 116 of 121 participants.
4 The p values refer to significance level of time × treatment group interaction effects using two-way repeated measures ANOVA.
Prediction performance of medication response
The prediction pipeline in the current study was shown in Fig. 1. Baseline structural MRI predicted treatment response at week 6 with balanced accuracy of 79.6% (AUC: 0.89, p < 0.001) for the quetiapine treatment group. For the lithium treatment group, baseline morphometric measures achieved balanced accuracy of 75.8% (AUC: 0.85, p = 0.003). When using 1-week brain morphometric changes as predictors, balanced accuracies in the quetiapine and lithium treatment group were 76.5% (AUC: 0.81, p < 0.001) and 78.9% (AUC: 0.85, p < 0.001), respectively. Combining brain morphometric features at baseline and week 1 allowed for the highest prediction performance. For the quetiapine treatment group, the balanced accuracy was 83.2% (AUC: 0.93, p < 0.001). For the lithium treatment group, the balanced accuracy was 83.5% (AUC: 0.89, p < 0.001). All the evaluation metrics for model performance were listed in Table 2.
BAC, balanced accuracy; SEN, sensitivity; SPE, specificity; AUC, area under receiver operating characteristic curve.
Model transferability analyses between drug treatments
All models failed to reach comparable performance in the transferability test when prediction models developed for one drug were applied to the other drug treated group (all p > 0.05; Table 2). Baseline morphometric measures showed slightly above chance level performance (i.e. 50%) using the quetiapine model to predict lithium treatment outcome (balanced accuracy, 54.4%) and vice versa (balanced accuracy, 55.7%). Using week 1 structural change data, both models showed poor transferability below chance level (balanced accuracy of 43.8% for quetiapine model predicting lithium outcome and balanced accuracy of 47.7% for lithium predicting quetiapine outcome). When considering both baseline and week 1 morphometric data together, both models transferred around chance level (quetiapine model predicting lithium outcome: balanced accuracy, 51.5%; lithium model predicting quetiapine outcome: balanced accuracy, 48.9%).
Features with greatest contributions to treatment response prediction
Within the models considering the predictive utility of pre-treatment morphometric features, top predictive features in the lithium group mostly included cortical thickness measures, while in the quetiapine group, cortical thickness, cortical surface area, and subcortical volume measures were leading predictors. When using acute structural changes from baseline to week 1 for prediction, top features in the lithium group included cortical surface area and hippocampal volume, while in the quetiapine group, the leading predictors were cortical thickness measures (Table 3 and Fig. 2).
L, left; R, right; CT, cortical thickness; SA, surface area; Vol, volume.
Discussion
In the present study, we tested the ability of brain morphometric features to predict medication treatment response in youth with BD. We know of only two previous studies that applied pre-treatment MRI data to predict treatment outcome in BD: Wade et al. (Reference Wade, Joshi, Njau, Leaver, Vasavada, Woods and Narr2016) trained an SVM with subcortical volume and cortical thickness features to predict response to electroconvulsive therapy in depressed adults, eight of whom had BD. They reported an accuracy of 89%. Fleck et al. (Reference Fleck, Ernest, Adler, Cohen, Eliassen, Norris and Strakowski2017) trained a machine learning model with pre-treatment fMRI and magnetic resonance spectroscopy data to predict response to lithium at week 8 in 20 adults with BD, classifying responders v. non-responders with 80% accuracy. Our study is considerably larger, which focused on youth with limited clinical and treatment history and considered acute neuroanatomic changes after 1 week of treatment as well as baseline predictors.
Consistent with our first hypothesis, we found that baseline morphometric measures predicted medication response at the individual level with significant accuracy, 79.6% for the quetiapine group and 75.8% for the lithium group. Our findings indicate that pre-treatment morphometric features may provide predictors of treatment response approaching the level needed for clinical application, consistent with findings of the two prior smaller studies (Fleck et al., Reference Fleck, Ernest, Adler, Cohen, Eliassen, Norris and Strakowski2017; Wade et al., Reference Wade, Joshi, Njau, Leaver, Vasavada, Woods and Narr2016). We also assessed the ability of acute changes in morphometric features one week following treatment initiation to predict clinical outcome at week 6. The rationale for this aspect of our design is that early drug-induced changes in brain might provide an indication of clinically-relevant effects of drugs before clinical changes themselves can predict outcome of a drug trial. Changes in morphometric features from baseline achieved comparable prediction accuracy for the two study drugs (76.5% for quetiapine group and 78.9% for lithium group), though models were different for the two drugs. Previous studies using somewhat longer re-test periods (2–4 weeks) after drug treatment initiation have shown neuroanatomic changes with short-term acute treatment with lithium (Anand et al., Reference Anand, Nakamura, Spielberg, Cha, Karne and Hu2020) and antipsychotic drugs (Keshavan et al., Reference Keshavan, Bagwell, Haas, Sweeney, Schooler and Pettegrew1994). Our findings suggest that relevant changes can be detected even earlier after treatment initiation. A study of lithium treatment in BD reported effects in the hippocampus/amygdala complex after several years, regions known to be important in emotion processing (Germana et al., Reference Germana, Kempton, Sarnicola, Christodoulou, Haldane, Hadjulis and Frangou2010). Our study extends these earlier findings by showing that effects occurring as early as one week can predict later treatment response. Thus, the monitoring of early drug effects on brain structure may provide a way of predicting treatment response weeks before treatment outcome can be determined clinically, potentially providing important information to clinicians considering adjustments to treatment plans in individuals who are not early treatment responders.
Combining baseline brain features and acute treatment-induced effects (1-week) in models predicting 6-week treatment response modestly improved the accuracy (83.2% for the quetiapine group and 83.5% for the lithium group). Therefore, combining baseline and follow-up data within a single model may provide a promising direction for maximizing early prediction of treatment outcome in individuals with BD. While these findings require replication in an independent sample to confirm predictive utility of the models developed, they offer a promising preliminary step toward the development of clinically useful MRI-based biomarkers for guiding optimal and flexible treatment planning for individuals in the early course of BD.
Given these differences in prediction performance, we sought to identify which morphometric features provide the greatest contribution to treatment outcome. At baseline, cortical thickness features were better predictors of lithium response, while cortical surface area and subcortical volumes were more prominent as features predicting response to quetiapine. In contrast, when considering morphometric changes at week 1, the opposite pattern was observed with changes in cortical surface area and subcortical volume measures being more prominent in lithium response prediction, and changes in cortical thickness measures being more predictive of quetiapine response. According to the radial unit hypothesis and the supra-granular layer expansion hypothesis, cortical thickness is determined by the number and size of cells within a cortical column and surrounding neuropil, while surface area is less dynamic being primarily driven by the number of cortical columns established during brain maturation (Rakic, Reference Rakic2009). This suggests that dynamic illness-related pathophysiological changes may be more related to lithium outcomes, while relatively enduring features of brain development at the baseline may be better predictors of quetiapine response and dynamic changes in cells and neuropil at one week may be more related to quetiapine treatment. Changes in surface area at one week following lithium treatment are less readily interpretable, but may indicate a more global shift in the neocortical mantle following administration of lithium salts (Hozer et al., Reference Hozer, Sarrazin, Laidi, Favre, Pauling, Cannon and Houenou2020).
For the regionality of the most discriminative brain morphometric patterns, at baseline, the most discriminative regions for quetiapine were mainly located in temporal and subcortical regions, while frontal and parietal regions predicted treatment outcome better in lithium group. When considering morphometric changes at week 1, the regions contributing most to prediction mainly comprised frontal and temporal regions for both quetiapine and lithium. Previous research had reported disruption in fronto-temporal neural circuitry in remitted patients with BD (Robinson et al., Reference Robinson, Bearden, Monkul, Tordesillas-Gutierrez, Velligan, Frangou and Glahn2009). Our finding may indicate that the acute morphometric changes in fronto-temporal neural circuitry may in part underlie therapeutic treatment effects for both quetiapine and lithium. Nevertheless, we should note the difference in sample size of these two medication groups which may impact precision in the estimation of contribution weight of features. Although our study is designed as a randomized clinical trial to reduce group differences as much as possible, and no significant confounds (i.e. age, sex, IQ, illness duration, YMRS, CDRS-R, etc.) were identified in our analyses, some other potential differences in patient samples to a degree can exist despite randomization. Replication is still needed to determine the extent that differences in patient samples impacted the current findings.
In addition to these general findings, our transferability test for the two treatments was also informative. The results showed that all models failed to reach comparable performance in the transferability test, indicating that the contributing features of the predictive models differed between treatment groups, both for baseline data and for treatment-related change at week 1. While predictive models for the two treatments had similar success, the quite limited transferability of drug-specific models suggests modest level of shared predictors, so that drug class-specific predictive models appear to be needed for treatment outcome prediction. Different changes in morphometric features at week 1 are perhaps not surprising given the markedly different pharmacology of the two study drugs, though the nature of the differences offers insight into regional differences in early drug impact on brain that are relevant to their therapeutic efficacy.
Following recent recommendations on overcoming methodological issues that can lead to inflated results (Arbabshirani, Plis, Sui, & Calhoun, Reference Arbabshirani, Plis, Sui and Calhoun2017; Wolfers, Buitelaar, Beckmann, Franke, & Marquand, Reference Wolfers, Buitelaar, Beckmann, Franke and Marquand2015; Woo, Chang, Lindquist, & Wager, Reference Woo, Chang, Lindquist and Wager2017), we adopted two methodological precautions. First, to reduce the risk of overfitting and overly optimistic utility prediction estimates (Arbabshirani et al., Reference Arbabshirani, Plis, Sui and Calhoun2017; Wolfers et al., Reference Wolfers, Buitelaar, Beckmann, Franke and Marquand2015; Woo et al., Reference Woo, Chang, Lindquist and Wager2017) for the high-dimensional neuroimaging data, we first applied DL technology to identify compact hierarchical features and achieve dimension reduction. Second, we examined region-level features with less noise and lower risk of overfitting than voxel-level data (Vieira et al., Reference Vieira, Pinaya and Mechelli2017). Other approaches for dimension reduction and development of classification algorithms may confirm and potentially refine models developed in the current study (Claude, Houenou, Duchesnay, & Favre, Reference Claude, Houenou, Duchesnay and Favre2020; Collins & Moons, Reference Collins and Moons2019).
Of note, there are several limitations in our current study. First, the models described in this paper will require replication in an independent dataset before any application in clinical decision making. Second, each patient in our study received either quetiapine or lithium monotherapy. Thus, our study can only predict response of each drug and identify predictors of response in two independent treatment groups. Information about prediction of individual-level preferential response to quetiapine or lithium cannot be provided, which should be interpreted with caution. Third, our study was not powered to examine the relationship between MRI features and outcome separately in different age groups. Fourth, the utility of predictive models is generalizable within drug class and remains to be evaluated. Fifth, additional outstanding questions include the ability of other imaging modalities, such as DTI and resting-state fMRI, to inform and supplement prediction models. In addition, it would be helpful to further evaluate the utility of our models for predicting treatment outcomes in adults and in individuals with a long history of illness and drug therapy.
In conclusion, the present study demonstrates that brain morphometric features may predict treatment response for two widely used medications for the treatment of youth with BD. Brain morphometric features provided similar overall prediction performance for both lithium and quetiapine therapies, though the features used to achieve that prediction varied between the two medications. In addition, our findings indicate that brain morphometric changes occurring after a single week of medication exposure are predictive of treatment outcome. These findings provide insight into brain morphometric features associated with the treatment outcome prediction in bipolar youth treated with quetiapine and lithium, and support the potential use of neuroanatomical scans as biomarkers for the optimization of treatment or personalized medication approach which offers potential for reducing the risk of a failed medication trial detectable only following a full course of treatment.
Supplementary material
The supplementary material for this article can be found at https://doi.org/10.1017/S0033291722000757.
Financial support
This study was supported by the National Institute of Mental Health (NIMH) Grant (M.P.D., Grant No. 5R01MH080973) and the National Natural Science Foundation of China (Q.G., Grant No. 81621003).
Conflict of interest
Dr Strawn has received research support from the National Institutes of Health (NIMH/NIEHS/NICHD) as well as Allergan, Neuronetics, and Otsuka. He has received material support from and provided consultation to Myriad Genetics and receives royalties from the publication of two texts (Springer) and serves as an author for UpToDate and an Associate Editor for Current Psychiatry. He has spoken in CME presentations for Neuroscience Education Institute and CMEology. Finally, Dr Strawn also has provided consultation to the FDA and Intracellular Therapeutics. Dr Sweeney consults to VeriSci. Dr DelBello and Dr Adler are on the lecture bureau for Otsuka, and Dr Adler is on the lecture bureau for Janssen. Dr Patino and Dr DelBello have received research support from Acadia, Allergan, Janssen, Johnson and Johnson, Lundbeck, Otsuka, Pfizer, Sunovion, and Supernus, and Dr DelBello has provided consultation or advisory board services for Alkermes, Allergan, Assurex, CMEology, Janssen, Johnson and Johnson, Lundbeck, Neuronetics, Otsuka, Pfizer, Sunovion, and Supernus. Dr Adler has received research support from Merck, Forest, and Alkermes, and provided consultation for Janssen. All other authors declare that they have no competing interests.