Introduction
Anxiety and depression are the most common mental health problems, often occurring together and constituting a significant fraction of the global disease burden (Bandelow et al., Reference Bandelow, Reitt, Röver, Michaelis, Görlich and Wedekind2015; Evans-Lacko et al., Reference Evans-Lacko, Aguilar-Gaxiola, Al-Hamzawi, Alonso, Benjet, Bruffaerts and Thornicroft2018; Hirschfeld, Reference Hirschfeld2001). Selective serotonin reuptake inhibitors (SSRIs) are commonly prescribed as first-line pharmacological treatments for both depression (Bogowicz et al., Reference Bogowicz, Curtis, Walker, Cowen, Geddes and Goldacre2021; Kendrick, Stuart, Newell, Geraghty, & Moore, Reference Kendrick, Stuart, Newell, Geraghty and Moore2015) and anxiety disorders (Garakani et al., Reference Garakani, Murrough, Freire, Thom, Larkin, Buono and Iosifescu2020). However, how SSRIs work beyond their initial pharmacological action on the serotonin transporter remains unclear (Harmer, Duman, & Cowen, Reference Harmer, Duman and Cowen2017).
Neuropsychological models propose that antidepressants, including SSRIs, may alter cognitive processing, leading to improvements in depressive and anxiety symptoms (Harmer, Goodwin, & Cowen, Reference Harmer, Goodwin and Cowen2009a). Reinforcement learning provides a framework for investigating links between cognitive and biological processes and hence the effect of SSRIs on cognition (Huys, Browning, Paulus, & Frank, Reference Huys, Browning, Paulus and Frank2021; Lan & Browning, Reference Lan and Browning2022; Maia & Frank, Reference Maia and Frank2011). Preclinical and experimental research has established that several cognitive functions relevant to the etiology of anxiety and depression are sensitive to SSRIs (e.g. Geurts, Huys, den Ouden, & Cools, Reference Geurts, Huys, den Ouden and Cools2013a; Guitart-Masip, Duzel, Dolan, & Dayan, Reference Guitart-Masip, Duzel, Dolan and Dayan2014; Harmer, Reference Harmer2013; Michely, Eldar, Erdman, Martin, & Dolan, Reference Michely, Eldar, Erdman, Martin and Dolan2022; Michely, Eldar, Martin, & Dolan, Reference Michely, Eldar, Martin and Dolan2020; Roiser, Elliott, & Sahakian, Reference Roiser, Elliott and Sahakian2012a; Roiser et al., Reference Roiser, Levy, Fromm, Goldman, Hodgkinson, Hasler and Drevets2012b). However, there is little evidence tying these experimental effects of SSRIs on cognition to improvement in symptoms in clinical settings as only a few clinical randomized controlled trials (RCTs) have evaluated candidate mechanisms to explain treatment effects (Ahmed et al., Reference Ahmed, Bone, Lewis, Freemantle, Harmer, Duffy and Lewis2022; Cuthbert & Insel, Reference Cuthbert and Insel2013; Morris et al., Reference Morris, Sanislow, Pacheco, Vaidyanathan, Gordon and Cuthbert2022; Pizzagalli et al., Reference Pizzagalli, Smoski, Ang, Whitton, Sanacora, Mathew and Krystal2020). 
Evaluations in the context of RCTs comparing SSRIs and placebo provide a strong test of whether specific cognitive or learning processes are the mechanisms through which SSRIs alleviate symptoms of anxiety and depression.
The present study investigates whether SSRIs improve symptoms by modulating reinforcement learning processes, specifically aversive Pavlovian control. Aversive Pavlovian control refers to the automatic, stereotyped inhibition of actions in the face of negative expectations (Bolles, Reference Bolles1970; Dayan, Niv, Seymour, & Daw, Reference Dayan, Niv, Seymour and Daw2006), an effect that can be robustly observed in humans using neurocognitive probes (Boureau & Dayan, Reference Boureau and Dayan2011; Guitart-Masip et al., Reference Guitart-Masip, Huys, Fuentemilla, Dayan, Duzel and Dolan2012; Huys, Moutoussis, & Williams, Reference Huys, Moutoussis and Williams2011b). Aversive Pavlovian control is a promising candidate mechanism for the treatment effects of SSRIs. It is sensitive to serotonergic functioning in animal (Abela et al., Reference Abela, Browne, Sargin, Prevot, Ji, Li and Fletcher2020; Amo et al., Reference Amo, Fredes, Kinoshita, Aoki, Aizawa, Agetsuma and Okamoto2014; Doya, Miyazaki, & Miyazaki, Reference Doya, Miyazaki and Miyazaki2021; Ohmura, Tanaka, Tsunematsu, Yamanaka, & Yoshioka, Reference Ohmura, Tanaka, Tsunematsu, Yamanaka and Yoshioka2014) and preclinical studies (Crockett, Clark, Apergis-Schoute, Morein-Zamir, & Robbins, Reference Crockett, Clark, Apergis-Schoute, Morein-Zamir and Robbins2012; Crockett, Clark & Robbins, Reference Crockett, Clark and Robbins2009; Geurts, Huys, den Ouden, & Cools, Reference Geurts, Huys, den Ouden and Cools2013b; Hebart & Gläscher, Reference Hebart and Gläscher2015). Moreover, aversive Pavlovian control is associated with symptoms of depression and anxiety in clinical samples (Huys et al., Reference Huys, Gölzer, Friedel, Heinz, Cools, Dayan and Dolan2016; Nord, Lawson, Huys, Pilling, & Roiser, Reference Nord, Lawson, Huys, Pilling and Roiser2018) and in general population samples with anxiety traits (Mkrtchian, Aylward, Dayan, Roiser, & Robinson, Reference Mkrtchian, Aylward, Dayan, Roiser and Robinson2017).
Influential reviews have highlighted the prominence of inhibition in response to negative expectations in depression (Roiser et al., Reference Roiser, Elliott and Sahakian2012a) and of avoidance driven by negative expectations in anxiety (LeDoux, Moscarello, Sears, & Campese, Reference LeDoux, Moscarello, Sears and Campese2017). Modifying aversive Pavlovian control is hence clinically promising (Huys, Russek, Abitante, Kahnt, & Gollan, Reference Huys, Russek, Abitante, Kahnt and Gollan2022; Martell, Dimidjian, & Herman-Dunn, Reference Martell, Dimidjian and Herman-Dunn2010).
In terms of underlying mechanisms, computational models have proposed formal relationships between rumination, acute reductions in central serotonin levels, and the attenuation of aversive Pavlovian control (Dayan & Huys, Reference Dayan and Huys2008, Reference Dayan and Huys2009; Huys et al., Reference Huys, Eshel, O'Nions, Sheridan, Dayan and Roiser2012; Robinson et al., Reference Robinson, Overstreet, Allen, Letkiewicz, Vytal, Pine and Grillon2013). At the neural level, the subgenual anterior cingulate cortex has been implicated in aversive Pavlovian control in research involving primates (Amemori & Graybiel, Reference Amemori and Graybiel2012) and healthy volunteers (Lally et al., Reference Lally, Huys, Eshel, Faulkner, Dayan and Roiser2017). This brain region is also recognized for its involvement in anxiety, as demonstrated in studies of contextual fear in healthy volunteers (Alvarez, Chen, Bodurka, Kaplan, & Grillon, Reference Alvarez, Chen, Bodurka, Kaplan and Grillon2011; Hasler et al., Reference Hasler, Fromm, Alvarez, Luckenbaugh, Drevets and Grillon2007), and it has been linked to depression in both preclinical (Drevets, Savitz, & Trimble, Reference Drevets, Savitz and Trimble2008; Ramirez-Mahaluf, Perramon, Otal, Villoslada, & Compte, Reference Ramirez-Mahaluf, Perramon, Otal, Villoslada and Compte2018) and depression treatment studies (Mayberg et al., Reference Mayberg, Lozano, Voon, McNeely, Seminowicz, Hamani and Kennedy2005).
Appetitive Pavlovian control may also be affected. In both clinical and subclinical depression samples, there have been reports of blunted reward responses (Bylsma, Morris, & Rottenberg, Reference Bylsma, Morris and Rottenberg2008; Eshel & Roiser, Reference Eshel and Roiser2010; Halahakoon et al., Reference Halahakoon, Kieslich, O'Driscoll, Nair, Lewis and Roiser2020; Pizzagalli, Jahn, & O'Shea, Reference Pizzagalli, Jahn and O'Shea2005; Steele, Kumar, & Ebmeier, Reference Steele, Kumar and Ebmeier2007), possibly due to reduced specificity (Huys et al., Reference Huys, Gölzer, Friedel, Heinz, Cools, Dayan and Dolan2016; Nord et al., Reference Nord, Lawson, Huys, Pilling and Roiser2018). Serotonergic manipulations have also shown effects on appetitive Pavlovian processes in animals (Cohen, Amoroso, & Uchida, Reference Cohen, Amoroso and Uchida2015) and healthy volunteers (Michely et al., Reference Michely, Eldar, Martin and Dolan2020).
As such, Pavlovian control may be a candidate mediator of the effect of SSRIs on anxiety and depression. Here, we report a test of this hypothesis in the context of the PANDA RCT (Lewis et al., Reference Lewis, Duffy, Ades, Amos, Araya, Brabyn and Lewis2019). This trial compared sertraline to placebo for the treatment of depression in primary care in the UK (Duffy et al., Reference Duffy, Bacon, Clarke, Donkor, Freemantle, Gilbody and Lewis2019; Lewis et al., Reference Lewis, Duffy, Ades, Amos, Araya, Brabyn and Lewis2019; Salaminios et al., Reference Salaminios, Duffy, Ades, Araya, Button, Churchill and Lewis2017). PANDA found no evidence that sertraline reduced depressive symptoms to a clinically meaningful extent at 6 weeks, with only a weak effect at 12 weeks. However, they found evidence that sertraline reduced anxiety at 6 and 12 weeks. We measured Pavlovian inhibition and a number of other reinforcement learning processes during this trial using computational modeling of the affective Go/NoGo task (Guitart-Masip et al., Reference Guitart-Masip, Huys, Fuentemilla, Dayan, Duzel and Dolan2012). This is a well-established learning paradigm in which computational analyses allow appetitive and aversive Pavlovian processes to be measured (Guitart-Masip et al., Reference Guitart-Masip, Huys, Fuentemilla, Dayan, Duzel and Dolan2012).
We pre-registered an analysis plan investigating five main hypotheses (osf.io/7q8v2). The primary analyses aimed to test whether treatment with the SSRI sertraline alters aversive Pavlovian control and whether aversive Pavlovian control is related to anxiety, i.e. whether Pavlovian inhibition might mediate the effect of sertraline on anxiety. We also examined the relationship between appetitive Pavlovian biases and depressive symptomatology. Overall, task compliance was poor, and the primary hypotheses were not supported. However, exploratory analyses did reveal that larger early increases in aversive Pavlovian bias were linked to more severe depression after 12 weeks. Additionally, there was an effect of SSRI treatment on the aversive learning rate at week two and an association between learning from losses and anxiety.
Methods
Ethics
The National Research Ethics Service Committee, East of England – Cambridge South approved the study (ref: 13/EE/0418). The Medicines and Healthcare Products Regulatory Agency gave clinical trial authorization. Written informed consent was obtained from each participant before the study.
Participants
We present secondary analyses of data acquired in the context of the PANDA trial. PANDA was a randomized, double-blind, placebo-controlled pragmatic study investigating the clinical effectiveness of sertraline on depressive symptoms as the primary outcome.
Patients (aged 18–74 years) were recruited from 179 primary care surgeries in four UK sites (Bristol, Liverpool, London, York). The critical entry criterion was that general practitioners (GPs) and/or patients were uncertain about the potential benefits of an antidepressant. No lower or higher thresholds were set on depression severity or duration. The study aimed for a diverse participant pool by including doctors with varied decision-making approaches, promoting clinical equipoise, and capturing the spectrum of depressive symptom severity. The exclusion criteria were: unable to understand or complete study questionnaires in English; antidepressant treatment in the past eight weeks; comorbid psychosis, schizophrenia, mania, hypomania, bipolar disorder, dementia, eating disorder, or major alcohol or substance abuse; and medical contraindications for sertraline.
Patients were randomized to sertraline or placebo, stratified by severity, duration, and site, and followed up after 2, 6, and 12 weeks. For the first week, patients received one capsule (50 mg sertraline or placebo) a day. From week two onwards, they took two capsules per day, either containing 100 mg of sertraline or placebo, for up to 11 weeks. Medication could be increased to 150 mg in consultation with the local principal investigator in cases of non-response after six weeks. The study was double-blind: study patients, care providers, and all members of the research team were blinded to the study treatment allocation (Salaminios et al., Reference Salaminios, Duffy, Ades, Araya, Button, Churchill and Lewis2017).
Measurements
The Go/NoGo task (Fig. 1a) was designed to study Pavlovian appetitive and aversive influence on choice by crossing action (go v. nogo) and valence (rewards v. losses; Guitart-Masip et al., Reference Guitart-Masip, Huys, Fuentemilla, Dayan, Duzel and Dolan2012). Participants were verbally instructed that each fractal would lead to a more favorable outcome with either go or nogo, but that outcomes were probabilistic (cf. Fig. 1 for detailed task description). Each task administration employed a different fractal set. Fractal sets were randomized across participants and assessment timepoints. The Go/NoGo task was assessed at baseline, at two weeks (follow-up 1), and at six weeks (follow-up 2), but it was not part of the 12 weeks assessment (follow-up 3). The Generalized Anxiety Disorder Assessment (GAD-7; Spitzer, Kroenke, Williams, & Löwe, Reference Spitzer, Kroenke, Williams and Löwe2006), the Patient Health Questionnaire-9 (PHQ-9; Kroenke, Spitzer, & Williams, Reference Kroenke, Spitzer and Williams2001), and the Beck Depression Inventory (BDI; Beck, Steer, & Brown, Reference Beck, Steer and Brown1996) were completed at baseline and every follow-up. Several baseline variables were acquired (cf. Table 1).
Data are reported as N (%) or mean (s.d.). There was no evidence for differences in baseline characteristics between the treatment groups (all p values > 0.05). PHQ-9 = Patient Health Questionnaire, 9-item version total score (possible range 0–27). GAD-7 = Generalized Anxiety Disorder Assessment, 7-item version total score (possible range 0–21). BDI = Beck Depression Inventory, 21-item version total score (possible range 0–63). CIS-R = Clinical Interview Schedule-Revised measuring depression severity score (possible range 0–21).
Computational models
Previously published computational models for this task (Guitart-Masip et al., Reference Guitart-Masip, Huys, Fuentemilla, Dayan, Duzel and Dolan2012; Mkrtchian et al., Reference Mkrtchian, Aylward, Dayan, Roiser and Robinson2017; Moutoussis et al., Reference Moutoussis, Bullmore, Goodyer, Fonagy, Jones, Dolan and Dayan2018; Scholz et al., Reference Scholz, Hook, Kandroodi, Algermissen, Ioannidis, Christmas and den Ouden2022) provide formal, quantitative descriptions of the evolution of decisions over the course of learning during the task. The core parameters of interest in the models are the Pavlovian parameters. These capture appetitive Pavlovian influences through the extent to which participants automatically emit ‘go’ responses when faced with reward stimuli, and aversive Pavlovian inhibition through the extent to which they automatically emit ‘nogo’ responses when faced with loss stimuli. The Pavlovian processes are separate from instrumental learning processes, which emit ‘go’ and ‘nogo’ according to which of the two actions is more likely to lead to the better outcome. Other parameters include reward and loss sensitivity, learning rates, irreducible noise, and an overall ‘go’ bias.
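The interplay of these parameters can be illustrated with a minimal sketch. It follows the general form of the published Go/NoGo models cited above, in which the weight of 'go' combines an instrumental value, a go bias, and a Pavlovian term scaled by the stimulus value, and irreducible noise mixes the softmax choice with random responding. All function and argument names are hypothetical, and the exact parameterization of the model selected in this study may differ.

```python
import math

def update_value(q, outcome, learning_rate, sensitivity):
    """Move an expected value towards the sensitivity-scaled outcome."""
    return q + learning_rate * (sensitivity * outcome - q)

def go_probability(q_go, q_nogo, v_stimulus, go_bias,
                   pav_appetitive, pav_aversive, noise):
    """Probability of a 'go' response for one stimulus on one trial."""
    # Assumption: separate Pavlovian weights for reward and loss stimuli,
    # selected by the sign of the learned stimulus value.
    pav = pav_appetitive if v_stimulus > 0 else pav_aversive
    # Pavlovian term promotes 'go' for positive and inhibits it for negative values.
    w_go = q_go + go_bias + pav * v_stimulus
    w_nogo = q_nogo
    p_softmax = 1.0 / (1.0 + math.exp(-(w_go - w_nogo)))
    # Irreducible noise: a fraction of choices is made at random.
    return (1.0 - noise) * p_softmax + noise / 2.0
```

With equal instrumental values and no go bias, a negative stimulus value pushes the 'go' probability below 0.5 (aversive Pavlovian inhibition), and a positive stimulus value pushes it above 0.5.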
Data validation
To evaluate whether the existing data were in principle sufficient to assess the key hypotheses, and to provide an informative a-priori estimate of power, two authors (J. M. and Q. J. M. H.) were provided with blinded access to the behavioral task data only, without access to group allocation, demographics, or measures of symptoms. These authors fitted different reinforcement learning (RL) models (for a list of the models, see Supplementary Materials B.1 RL Models) as described previously in the literature (cf. Huys et al. (Reference Huys, Cools, Gölzer, Friedel, Heinz, Dolan and Dayan2011a) and Supplementary Materials B.2 Model Fitting Procedure & B.3 Model Comparison). All datasets of the study were combined, disregarding within-subject information (i.e. treating repeated sessions as independent task assessments). In the supplements, we report the recoverability and reliability of the parameters (cf. Supplementary Materials Fig. B.3 and B.4).
Models were fitted separately to the data and compared using the integrated Bayesian Information Criterion (iBIC; Figure 2a) at the group level, where the individual likelihoods were first integrated over the individual parameters using a sampling procedure and then summed over all individuals. The most parsimonious model included learning rates, outcome sensitivities, and Pavlovian biases, all separated into rewarding and punishing contexts. Figure 2c shows that simulated data captured the empirical data qualitatively. Hence, standard models of the task are able to parametrically capture the variability of behavioral performance in the task across individuals and sessions on a trial-by-trial level.
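The two-step computation described above (integrate each individual likelihood over sampled parameters, then sum over individuals) can be sketched as follows. This is an illustration under stated assumptions, not the fitting code used in the study: the per-sample log-likelihoods are taken as given, the complexity penalty is assumed to be the number of group-level hyperparameters times the log of the total number of choices, and all names are hypothetical.

```python
import math

def log_mean_exp(log_values):
    """Numerically stable log of the mean of exp(log_values)."""
    m = max(log_values)
    return m + math.log(sum(math.exp(v - m) for v in log_values) / len(log_values))

def integrated_bic(sample_logliks_per_run, n_hyperparameters, n_choices_total):
    """iBIC sketch: lower values indicate a more parsimonious model.

    sample_logliks_per_run: one list per task run, holding the log-likelihood
    of that run's choices under each parameter sample drawn from the
    group-level prior (assumed to be computed elsewhere).
    """
    # Step 1: integrate each run's likelihood over the sampled parameters.
    # Step 2: sum the resulting log integrated likelihoods over runs.
    total = sum(log_mean_exp(logliks) for logliks in sample_logliks_per_run)
    # Assumed BIC-style penalty on the group-level hyperparameters.
    return -2.0 * total + n_hyperparameters * math.log(n_choices_total)
```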
In the Go/NoGo task, non-informative responses (e.g. always emitting the same response) cannot provide information about Pavlovian or other cognitive processes and therefore do not inform parameter estimates. Whether the data of a particular task run are meaningful can be evaluated formally by examining whether a model encompassing the core processes provides a more parsimonious account of the behavioral data than a random baseline model. In other words, to examine whether the observed behavioral data meaningfully constrained the model parameter estimates, we compared the integrated likelihood of the most parsimonious model to the integrated likelihood of a random baseline model for each dataset from each individual at each session. The integrated likelihood, i.e. the likelihood integrated over an individual's parameters, refers to the likelihood of the data given the group-level hyperparameters. A task run was treated as missing if the integrated likelihood of the random baseline model was more than three times higher than that of the most parsimonious model at the group level (Fig. 2b). Note that the model selection process conducted only on the informative task runs yielded results consistent with those obtained on the complete dataset (cf. Supplementary Materials Fig. B.2).
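The exclusion rule can be written as a short check on log-likelihoods. The sketch below assumes that the random baseline model chooses 'go' and 'nogo' with equal probability on every trial, which the text does not spell out, and all names are hypothetical. A likelihood ratio above three corresponds to a log-likelihood difference above log(3).

```python
import math

def random_baseline_loglik(n_trials):
    """Log-likelihood of n_trials binary choices under equiprobable responding (assumed baseline)."""
    return n_trials * math.log(0.5)

def is_informative(model_integrated_loglik, n_trials, ratio_threshold=3.0):
    """False if the random baseline is more than ratio_threshold times as likely as the model."""
    log_ratio = random_baseline_loglik(n_trials) - model_integrated_loglik
    return log_ratio <= math.log(ratio_threshold)
```

For example, a run whose model log-likelihood lies 2.0 below the baseline log-likelihood would be flagged as non-informative, because exp(2.0) is larger than 3.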
The parameters for each informative task run were extracted from the most parsimonious model to test the hypotheses.
Preregistration
The key hypotheses and analyses were pre-registered on OSF (osf.io/7q8v2; cf. Supplementary Materials Table D.4).
Statistical analyses
Predictors of missing and non-informative data at baseline were identified using a univariate logistic regression. Significantly related baseline variables were used as covariates in all further analyses.
To investigate drug effects, we employed a mixed-effects linear regression (1) using group allocation as the independent variable and the parameter estimate (e.g. aversive Pavlovian bias) as the dependent variable, controlling for stratification variables (baseline CIS-R total score in three categories, duration of depressive episode in two categories, and site) and including random intercepts. We reported mean differences (MD), 95% confidence intervals (CI), and the corresponding p values (p).
Next, we examined whether parameter estimates relate to depressive or anxiety symptoms using a mixed-effects multiple linear regression (2) with the parameter estimate as independent variable and log-transformed symptom scores (e.g. GAD-7 total score) as dependent variable. Random slopes and intercepts per individual were included. We controlled for group allocation and stratification variables. We reported regression coefficients (β), 95% confidence intervals (CI), and the corresponding p values (p).
For both analyses, we performed separate mixed-effects models for baseline and week two, for baseline and week six, and over all three time-points. To investigate a potential drug-by-time interaction, we additionally performed a regression including a group-time interaction. The group variable in the mixed-effects models was coded [0, 1, 1] for patients allocated to sertraline and [0, 0, 0] for patients allocated to placebo. Both groups have a 0 at baseline because they were unmedicated at that time.
To investigate whether a baseline parameter estimate predicts treatment outcome, we performed a simple linear regression predicting the symptom score at week 12, controlling for symptoms at baseline, group allocation and stratification variables.
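The group coding can be made explicit with a small sketch. The arm labels and function name are hypothetical; the coding itself follows the description above, with time-points ordered as baseline, week two, week six.

```python
def medication_indicator(arm, timepoint_index):
    """Return 1 only for sertraline patients at follow-up sessions.

    timepoint_index: 0 = baseline, 1 = week two, 2 = week six.
    Both arms are coded 0 at baseline because nobody was medicated yet.
    """
    return 1 if arm == "sertraline" and timepoint_index > 0 else 0

# Group variable per arm across the three task sessions.
group_coding = {
    arm: [medication_indicator(arm, t) for t in range(3)]
    for arm in ("sertraline", "placebo")
}
```

Under this coding the group coefficient compares medicated sessions against all unmedicated sessions, including the sertraline arm's own baseline.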
As an exploratory analysis we examined whether early change in aversive Pavlovian bias (week 2 – baseline) relates to log-transformed BDI total score at week 12 using a simple linear regression including an interaction effect between group-allocation and Pavlovian bias.
Exploratory analyses repeated analysis type 1 above for each individual parameter and used Bonferroni correction to account for testing multiple parameters (significance threshold p ≤ 0.05/8 = 0.00625).
Additionally, we conducted simple linear regressions examining group differences in parameter slopes (early change = week two – baseline; late change = week six – week two). We also repeated analysis type 2 for each of the parameter estimates and the three psychological measures (GAD-7, PHQ-9, BDI) and used Bonferroni correction to account for multiple tests (significance threshold p ≤ 0.05/(8 × 3) ≈ 0.002).
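The two Bonferroni thresholds follow directly from the number of tests; the arithmetic is sketched below with hypothetical variable names.

```python
alpha = 0.05
n_parameters = 8   # model parameters tested
n_measures = 3     # GAD-7, PHQ-9, BDI

# Drug-effect analyses: one test per parameter.
threshold_drug = alpha / n_parameters
# Symptom analyses: one test per parameter and symptom measure.
threshold_symptoms = alpha / (n_parameters * n_measures)
```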
Finally, to assess test–retest reliability we calculated Pearson correlation of individuals' parameters between the different time points and employed intra-class correlation coefficients (ICCs; McGraw & Wong, Reference McGraw and Wong1996) using the informative data (cf. Supplementary Materials B.6 Test-Retest Reliability).
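For orientation, the consistency form of the intra-class correlation for single measurements, labeled ICC(3,1) in the results, can be computed from a two-way decomposition of variance. The sketch below is not the study's analysis code; it assumes complete data (every individual has a value at every session) and uses hypothetical names.

```python
def icc_3_1(data):
    """ICC(3,1): two-way mixed effects, consistency, single measurement.

    data: list of rows, one per individual, each with one value per session.
    """
    n = len(data)        # individuals
    k = len(data[0])     # sessions
    grand = sum(sum(row) for row in data) / (n * k)
    row_means = [sum(row) / k for row in data]
    col_means = [sum(row[j] for row in data) / n for j in range(k)]
    ss_rows = k * sum((m - grand) ** 2 for m in row_means)
    ss_cols = n * sum((m - grand) ** 2 for m in col_means)
    ss_total = sum((x - grand) ** 2 for row in data for x in row)
    ss_error = ss_total - ss_rows - ss_cols
    ms_rows = ss_rows / (n - 1)
    ms_error = ss_error / ((n - 1) * (k - 1))
    return (ms_rows - ms_error) / (ms_rows + (k - 1) * ms_error)
```

Because this form measures consistency rather than absolute agreement, a uniform shift of all individuals between sessions does not lower the coefficient.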
Results
A total of 655 patients were recruited and randomly assigned to sertraline (326, 50%) and placebo (329, 50%). Two patients in the sertraline group did not complete a substantial proportion of the baseline assessment and were excluded. Additionally, 25 patients (9 from the sertraline group and 16 from placebo) did not complete the Go/NoGo task at any time-point. This left 628 participants (315 sertraline and 313 placebo) for analyses (cf. Fig. A.1 in Supplementary Materials). Task data for seven patients at baseline, 99 patients at 2 weeks, and 145 patients at 6 weeks were missing. Missing follow-up data were more common in participants who had higher baseline depressive and anxiety symptoms, financial difficulties, were from ethnic minorities and were recruited from London (cf. Supplementary Materials Table E.5). Missing data did not differ statistically by treatment allocation.
Basic task characteristics
Examination of the average percent correct response per condition showed the typical interaction pattern characteristic of Pavlovian interference found in previous studies (Guitart-Masip et al., Reference Guitart-Masip, Huys, Fuentemilla, Dayan, Duzel and Dolan2012; Mkrtchian et al., Reference Mkrtchian, Aylward, Dayan, Roiser and Robinson2017; Moutoussis et al., Reference Moutoussis, Bullmore, Goodyer, Fonagy, Jones, Dolan and Dayan2018; Scholz et al., Reference Scholz, Hook, Kandroodi, Algermissen, Ioannidis, Christmas and den Ouden2022) at all measurement points (Fig. 1b). Performance was better in Pavlovian congruent (go to win and nogo to avoid) than incongruent (go to avoid and nogo to win) conditions (|t| ∈ [4.65, 16.70], p < 0.001). There were no differences in average performance between patients allocated to sertraline and patients allocated to placebo (|MD| ∈ [0.00, 0.03], p > 0.05).
Computational modeling results
Overall, 747 (46%) task runs did not contain interpretable and informative behavioral data. Variables associated with non-informative behavior were higher age, lower education, and past antidepressant use. At week 2, non-informative task runs (N = 230, 43%) were more likely in patients who were allocated to the sertraline group (57%, χ² = 7.06, p = 0.008). In addition, baseline anxiety score, depression severity, and employment status were predictive of non-informative behavior at week 6 (cf. Supplementary Materials Table E.6). For all further analyses we focused on the 886 informative task runs from 435 patients (66% of those originally randomized) and adjusted for significant predictors of non-informative data as covariates. Characteristics of the remaining sample according to the study arm are shown in Table 1. Baseline characteristics of the sample were not statistically distinguishable between treatment groups.
The effect of sertraline on anxiety remained significant in the smaller included sample (week 6: MD = −0.1, CI[ − 0.17 to − 0.03], p = 0.005; week 12: MD = −0.12, CI[ − 0.17 to − 0.06], p ≤ 0.001; over time: MD = −0.08, CI[ − 0.12 to − 0.04], p ≤ 0.001). This is important since our preregistered hypotheses were developed under consideration of this effect.
Preregistered hypotheses
The preregistered hypotheses were not supported (Table 2): there was no evidence that the aversive Pavlovian inhibition was affected by sertraline (Fig. 3a and b); that aversive Pavlovian inhibition was related to anxiety symptoms; that the baseline aversive Pavlovian bias was predictive of treatment response; that the appetitive Pavlovian bias was associated with depression or that the reward sensitivity was related to anhedonia.
We tested whether sertraline alters aversive Pavlovian control (Hypothesis 1; H1) and whether aversive Pavlovian control is related to anxiety (Hypothesis 2; H2). Hypothesis 3 regarding the aversive Pavlovian bias as a mediator for the effect of sertraline on anxiety was not investigated as there was no evidence for H1 and H2. Hypothesis 4 (H4) tested whether aversive Pavlovian bias at baseline before starting SSRI treatment predicted treatment outcome. Hypothesis 5 (H5) examined the relationship between the appetitive Pavlovian bias and depressive symptoms. Hypothesis 6 (H6) tested for a relationship between reward sensitivity and anhedonia. We controlled for stratification variables and variables associated with missing data in all analyses.
Exploratory analyses
Exploratory analyses of a subsample with better test–retest correlation, and of a subsample with low symptoms did not support the pre-registered hypotheses (cf. Supplementary Materials F Subsample Analyses).
Two sets of results in the exploratory analyses are noteworthy. The first relates to early change in the aversive Pavlovian bias. The slope of the aversive Pavlovian bias between baseline and week two was positively related to depressive symptoms at week 12 (log-transformed PHQ9 total score: β = 0.06, CI[0.0 − 0.11], p = 0.044; log-transformed BDI total score: β = 0.07, CI[0.01 − 0.13], p = 0.016). A larger increase in aversive Pavlovian bias was associated with more severe subsequent depressive symptoms. Furthermore, the BDI model revealed an interaction between group allocation and early change in the aversive Pavlovian bias (β = 0.14, CI [0.02–0.26], p = 0.024; Figure 3c). That is, early change in aversive Pavlovian bias was more strongly related to BDI scores at week 12 in the sertraline group (β = 0.14, CI[0.05 − 0.23]), than in the placebo group (β = 0.02, CI[ − 0.03 to 0.07]). However, note that sertraline had no effect on the early change in aversive Pavlovian bias (MD = −0.08, CI[ − 0.32 to 0.15], p = 0.49). The second set of findings relates to the speed at which participants adapted behavior following losses (the loss learning rate). There was an effect of sertraline on the loss learning rate at week 2 (MD = 0.6, CI[0.22 − 0.97], p = 0.002; Figure 3d). The sertraline group learned faster from losses at week 2 than the placebo group. Early change in loss learning rate (week two – baseline) was higher in the sertraline group (MD = 0.75, CI[0.18 − 1.3], p = 0.009; Figure 3e), whereas later change (week six minus week two) was lower in the sertraline group (MD = −0.72, CI[ − 1.27 to − 0.17], p = 0.011; Figure 3e). In the sertraline group, the early change was different from zero (t = 2.74, p = 0.007), whereas the later change was not (t = −0.32, p = 0.75). In contrast, in the placebo group, the early change did not differ from zero (t = −0.70, p = 0.483), but the late change did (t = 3.44, p < 0.001). 
Hence, the group difference in the late change was due to an increase in loss learning rate from week 2 to week 6 in the placebo group. The aversive learning rate is strongly driven by switching after losses in the early part of the learning curve. Indeed, there was an elevated switching probability after losses during the first eight trials in the sertraline group (MD = 0.21, CI [0.0–0.41], p = 0.048; averaged across the go-to-avoid and nogo-to-avoid conditions). Finally, the loss learning rate was also positively associated with the anxiety scores (at week 2: β = 0.01, CI[0.0 − 0.02], p = 0.047; at week 6: β = 0.02, CI[0.0 − 0.03], p = 0.016; across all sessions: β = 0.02, CI[0.01 − 0.03], p = 0.001). However, there was no evidence for an association between anxiety symptoms and either the loss learning rate at baseline (β = 0.01, CI [−0.0 to 0.02], p = 0.24) or the early change in loss learning rate (week 2 – baseline; β = −0.02, CI[ − 0.04 to 0.01], p = 0.164). Additionally, no such association was found for the early switch probability described above (β = 0.0, CI[ − 0.02 to 0.02], p = 0.73).
Repeating these analyses on the complete sample including all task runs resulted in a broadly consistent pattern of effects (cf. Supplementary Materials C Findings in the Whole Sample).
In post hoc analyses, we adjusted for the use of other antidepressants and/or psychotherapy, adherence score, and the number of tablets, yielding consistent results as detailed in the Supplementary Materials G Post-hoc Analyses.
Task reliability
Parameters showed poor to moderate reliability (ICC(3,1) ranging from 0 to 0.53; cf. Supplementary Materials B.6 Test-Retest Reliability). The aversive Pavlovian bias was the most reliable parameter (ICC(3, 1) = 0.53, CI[0.41 − 0.64], p < 0.001). The Pavlovian parameters and the go bias also significantly changed over time. The Pavlovian biases decreased (aversive: β = −0.1, CI[ − 0.16 to − 0.04], p = 0.001; appetitive: β = −0.08, CI[ − 0.13 to − 0.04], p < 0.001) and the go bias increased (β = 0.13, CI[0.05 − 0.21], p < 0.001) over sessions which likely led to an increase in task accuracy (β = 0.02, CI[0.01 − 0.03], p < 0.001). We note that age reduced accuracy (β = −0.03, CI[ − 0.04 to − 0.02], p < 0.001), most likely due to increasing Pavlovian biases (aversive: β = 0.19, CI[0.11 − 0.26], p < 0.001; appetitive: β = 0.16, CI[0.11 − 0.21], p < 0.001) and reducing go bias with age (β = −0.4, CI [−0.49 to−0.32], p < 0.001).
Discussion
We investigated the effects of the SSRI sertraline on reinforcement learning mechanisms in the PANDA trial, a pragmatic multicenter, double-blind, placebo-controlled, randomized clinical trial. SSRIs are first-line pharmacological treatments for depression and anxiety, but the mechanism of SSRI action is still unknown. A better understanding of how SSRIs work could lead to improved response predictions and new, refined treatments. Our goal was to identify clinically relevant mechanisms to link receptor action to cognition and affective processing. Reinforcement learning enables such links and hence is a promising framework for investigating the mechanisms of SSRI action. The PANDA trial was the largest individual placebo-controlled trial not funded by the pharmaceutical industry. The sample was recruited in primary care based on clinical equipoise, and depressive symptoms ranged from mild to severe. Findings might therefore be of relevance to the broader primary care population. As sertraline acts through similar mechanisms as other SSRIs (Cipriani et al., Reference Cipriani, Furukawa, Salanti, Chaimani, Atkinson, Ogawa and Geddes2018), the findings may also be relevant for other SSRIs.
Because of poor task performance, almost half of the performed task runs were excluded. Early on (at week two), non-informative data was more prevalent in the sertraline group, suggesting that patients in the active group may have responded more randomly. Such randomness can be a signature of low overall motivation to perform the task. One possibility is that such a broad motivational reduction could be a signature of SSRI-induced affective blunting (Barnhart, Makela, & Latocha, 2004; Marazziti et al., 2019; McCabe, Cowen, & Harmer, 2009; Price, Cole, & Goodwin, 2009). However, there were no discernible differences in symptoms between patients who provided informative and non-informative data at week two, and sertraline had a positive impact on learning at week two in the included sample. These findings speak against a broad blunting effect.
The primary goal of this study was to test whether aversive Pavlovian bias mediates the effect of sertraline on anxiety. We found no evidence supporting an influence of sertraline on aversive Pavlovian bias. This result contrasts with previous research suggesting that Pavlovian inhibition is sensitive to serotonin (Crockett et al., 2009, 2012; Geurts et al., 2013b; Hebart & Gläscher, 2015). There are several possible reasons for this discrepancy. First, it may be that serotonin manipulations have different effects on Pavlovian inhibition in samples with and without depression and/or anxiety. While the current study was performed in a clinical population, previous studies primarily examined healthy volunteers. Second, previous research focused on acute changes via tryptophan depletion (Crockett et al., 2009, 2012; Geurts et al., 2013a; Hebart & Gläscher, 2015) or a single administration of the SSRI citalopram (Guitart-Masip et al., 2014) rather than the chronic administration examined here. It has long been posited that acute and chronic SSRI administration have opposite effects (e.g. Harmer, Cowen, & Goodwin, 2011; Harmer et al., 2009b). Third, we cannot rule out that some of the Pavlovian inhibition signal is conflated with the loss learning signal, as there are non-negligible correlations between parameters (cf. Supplementary Materials Fig. B.6).
This is likely compounded by broader issues with data quality, which in turn reduce the ability of models to distinguish aversive Pavlovian inhibition from learning from losses. Subsample analyses attempting to identify either test–retest reliability or symptom load as reasons for the null results did not yield clear results.
Exploratory analyses identified relationships between sertraline, aversive processing, and symptoms. First, sertraline affected learning from losses but not from rewards. This finding is in keeping with well-supported empirical evidence demonstrating that serotonin modulation impacts learning (Bari et al., 2010; Brigman et al., 2010; Michely et al., 2020; Scholl et al., 2017), and specifically punishment learning (Chamberlain et al., 2006; Cools, Roberts, & Robbins, 2008; Tanaka et al., 2007, 2009). Prolonged serotonin alterations have downstream effects including augmented learning and plasticity (Dayer, 2014; Kraus, Castrén, Kasper, & Lanzenberger, 2017). In the current dataset, the learning rate from losses increased over the first two weeks of sertraline treatment relative to placebo. The placebo group then ‘caught up’, removing the group differences in loss learning rate at six weeks. Changes in the performance of learning tasks are frequently observed and thought to represent a type of meta-learning, i.e. learning more broadly about the strategy of performing a task rather than learning within the task itself (Botvinick et al., 2019; Doya, 2002; Langdon et al., 2022; Vanschoren, 2019).
As such, the late change in performance in the placebo group compared to the early change in the sertraline group suggests that sertraline may have accelerated this meta-learning, possibly by specifically altering behavioral adaptation after losses within the task. One complication is that, at two weeks, there was already some evidence for changes in anxiety symptoms, and an inverse causal path (with anxiety mediating the effect of sertraline) cannot be excluded.
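To make the notion of a loss learning rate concrete, the sketch below shows a generic Rescorla-Wagner value update with separate learning rates for reward and loss outcomes. It is a simplified illustration under stated assumptions: the names `update_value`, `alpha_reward`, and `alpha_loss` are hypothetical, outcomes are coded as +1 and −1, and the full model fitted in this study additionally includes the Pavlovian and go biases and may assign learning rates differently.

```python
def update_value(q, outcome, alpha_reward, alpha_loss):
    """One Rescorla-Wagner update with valence-specific learning rates.

    Assumption for illustration: the loss learning rate applies when the
    outcome is negative, the reward learning rate otherwise.
    """
    delta = outcome - q  # prediction error
    alpha = alpha_loss if outcome < 0 else alpha_reward
    return q + alpha * delta

def value_after_losses(alpha_loss, n_trials=5):
    """Expected value after n_trials consecutive losses, starting from 0."""
    q = 0.0
    for _ in range(n_trials):
        q = update_value(q, -1.0, alpha_reward=0.1, alpha_loss=alpha_loss)
    return q

slow = value_after_losses(alpha_loss=0.1)
fast = value_after_losses(alpha_loss=0.3)
```

With the higher loss learning rate, the expected value approaches the true outcome of −1 in fewer trials, so negative prediction errors shrink sooner. This is the sense in which a higher loss learning rate corresponds to faster behavioral adaptation after losses.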
The loss learning rate was correlated with anxiety symptoms at both follow-up time points and over all measurement points. This is, in principle, in line with previous research outlined in a recent meta-analysis reporting higher punishment learning rates and slightly lower reward learning rates in patients (Pike & Robinson, 2022). Yet, this is difficult to reconcile with, first, the SSRI-induced increase in learning from punishment, and second, the fact that both anxiety and depression are treated by SSRIs and are linked to heightened punishment learning themselves. Interestingly, a similar conundrum was present in the literature on learned helplessness, which was associated with increased levels of serotonin (Petty, Kramer, & Moeller, 1994), but could also be reversed as a response to SSRIs (Hajszan et al., 2010; Kirby, 2006; Malberg & Duman, 2003; Maudhuit et al., 1997). Hence, coupling increases in serotonin levels with a simple account of the effect of serotonin levels on behavior is unlikely to be able to explain SSRI effects. Indeed, the serotonin system is known to be exquisitely complex, with many different serotonin receptors distinctively distributed (Hansen et al., 2022). A possible explanation could be that SSRIs facilitate faster learning in a punishing environment, thus leading to less negative and more positive (or neutral) feedback. It is interesting to consider how this bias towards learning from losses might be linked to mood.
Self-reports of happiness are linked to positive prediction errors (Rutledge, Skandali, Dayan, & Dolan, 2014), suggesting that negative prediction errors might similarly influence negative affective states. In other words, SSRIs might gradually improve mood by enhancing negative expectations through faster loss learning, thereby giving rise to less disappointing and more rewarding experiences.
Finally, improvements in depressive symptoms in the sertraline group were preceded by an early decrease in the aversive Pavlovian bias. In other words, among patients on sertraline, a larger increase between baseline and the 2-week follow-up in the tendency to withhold an action when facing a loss was associated with higher depressive symptoms after 12 weeks.
Overall, the findings draw a complex picture involving aversive processing, sertraline, and symptoms, possibly reflecting the known complexity of the serotonin system. Despite the methodological limitations and the failure to support the preregistered hypotheses, the exploratory data suggest alterations in the processing of losses. A tentative possibility is that SSRIs alter the speed of learning from losses early on, inducing a shift from Pavlovian to instrumental learning when confronted with losses. The alteration in aversive Pavlovian bias was not directly linked to sertraline. However, sertraline appeared to modulate the association between Pavlovian inhibition and future treatment outcomes. Reducing aversive Pavlovian control might hence promote approach responses in a punishing environment, facilitating unexpected rewarding experiences and thus helping to alleviate depressive symptoms.
Limitations
Inclusion in the trial was based on clinical equipoise, i.e. on uncertainty about whether medication would be clinically helpful for a particular person. This may have decreased the power to detect differences from placebo. For mechanistic studies such as the current one, it could be better to study a cohort of typical responders, i.e. patients who are prescribed medication with clinical confidence.
Extensive validation analyses showed that task performance was frequently objectively poor, resulting in a large fraction of the task runs being non-informative. Non-informative task runs had to be excluded from analyses because formally they cannot provide information about cognitive mechanisms. We attempted to address this by correcting for baseline variables that were significantly associated with non-informative task runs. The sertraline and the placebo group in the final informative sample continued to be matched on baseline characteristics. Nevertheless, the exclusion of data has severely curtailed the power of the study. Furthermore, because non-informative data was more common in the drug than the non-drug arm, a causal interpretation is no longer warranted.
The poor task performance has important implications for future mechanistic research in this domain. Although the task has been extensively used in laboratory studies (Guitart-Masip et al., 2012), combined with neuroimaging (Guitart-Masip et al., 2011), pharmacological (Guitart-Masip et al., 2014), and other interventions, and adapted (Millner, Gershman, Nock, & den Ouden, 2018; Moutoussis et al., 2018; Swart et al., 2017), it did not prove effective in a longitudinal clinical trial. This reinforces the paramount importance of acceptability and effectiveness testing of cognitive measurements for translational research and calls for an involvement of stakeholders in the design of research tasks.
The relationships between cognitive mechanisms and symptoms were weak. This probably reflects more general findings in the field (Eisenberg et al., 2019), but also the specific limitations around data quality mentioned above which limit the strength of possible associations (Spearman, 1904). We also note that our computational modeling approach was very conservative in that all parameters were allowed to change freely between participants and sessions, with no constraints for within-participant data.
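The point that measurement quality caps the size of observable associations can be illustrated with Spearman's classical attenuation formula, in which the expected observed correlation equals the true correlation multiplied by the square root of the product of the two reliabilities. The sketch below is a worked illustration only: the function name `attenuated_r`, the true correlation of 0.3, and the symptom-scale reliability of 0.9 are hypothetical values chosen for the example, while 0.53 is the highest test–retest ICC reported for the task parameters.

```python
import math

def attenuated_r(true_r, reliability_x, reliability_y):
    """Expected observed correlation under Spearman's attenuation formula."""
    return true_r * math.sqrt(reliability_x * reliability_y)

# Hypothetical inputs: true correlation 0.3, symptom-scale reliability 0.9;
# 0.53 is the highest ICC(3,1) reported for a task parameter.
observed = attenuated_r(0.3, 0.53, 0.9)
```

Under these assumed values, a true correlation of 0.3 would be expected to appear as roughly 0.21, and parameters with lower reliability would attenuate it further.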
Exploratory results were presented based on passing a conservative significance threshold and their relevance to the preregistered hypotheses. Nevertheless, they should be treated with caution prior to replication.
Conclusion
This study represents a significant exploration of specific reinforcement learning processes in a pragmatic RCT for depression. Specific reinforcement learning mechanisms did show a relationship to aspects of depression and anxiety and their treatment with SSRIs, but this was weak and not as hypothesized a priori. Sertraline influenced aversive processing in the first two treatment weeks by altering how participants learn to execute a passive or active action to avoid loss. Moreover, symptoms were associated with aversive processing, but how this relationship relates to SSRI treatment appears complex. The fact that almost half of the data was non-informative emphasizes the importance of developing patient-acceptable task probes.
Supplementary material
The supplementary material for this article can be found at https://doi.org/10.1017/S0033291724000837.
Acknowledgements
We would like to thank all the members of the Trial Committee, the Data Management Committee and the Public and Patient Involvement representatives. We would also like to thank colleagues who have contributed to the study through recruitment, administrative help and other advice. Finally, we would like to thank Agnes Norbury, Anahit Mkrtchian, Tore Erdmann, Jiazhou Chen, Jenny Fielder, Anna Hall, Jakub Onysk, Jade Serfaty, and Lana Tymchyk for their feedback on the manuscript.
Author contributions
Gl. L. secured the funding for the clinical trial. Gl. L. and L. D. were responsible for writing the detailed protocol, trial management, and data collection. M. M. provided the software for the Go/NoGo experiment and helped with data management. M. M. and L. D. contributed to the training of the researchers. J. M. and Q. J. M. H. performed the computational modeling of the Go/NoGo task and planned the analyses with input from all authors. J. M. and Q. J. M. H. wrote the initial draft of the manuscript. All authors contributed to and approved the final manuscript.
Funding statement
This article summarizes independent research funded by the National Institute for Health Research (NIHR) under its Program Grants for Applied Research, Reference Number RP-PG-061010048. We acknowledge support from the UCLH NIHR BRC. J. M. was supported by an International Max Planck Research School on Computational Methods in Psychiatry and Ageing Research (IMPRS COMP2PSYCH) fellowship.
Competing interests
Q. J. M. H. has obtained a research grant from Koa Health, and consultancy fees from Aya Technologies Limited and Alto Neuroscience. All other authors report no competing interest.