Major depressive disorder (MDD) is one of the most prevalent psychiatric disorders (Kessler et al., 2005), with 12-month and lifetime prevalence rates estimated at 5.5% and 15%, respectively (Kessler & Bromet, 2013). It significantly impairs quality of life and daily functioning, affecting both physical and mental health (American Psychiatric Association, 2013). Anhedonia, one of the two main features of depression, is defined as decreased pleasure from, or reduced interest in, activities that were once experienced as enjoyable (American Psychiatric Association, 2013), and is considered one of the most prominent endophenotypes of the disorder (Beard et al., 2016; Malgaroli, Calderon, & Bonanno, 2021; Pizzagalli, 2014). Crucially, anhedonia is also associated with poorer prognosis and treatment efficacy (McMakin et al., 2012), outperforming all other symptoms in predicting treatment outcomes (e.g. Fried, Epskamp, Nesse, Tuerlinckx, & Borsboom, 2016).
Recently, a new perspective on anhedonia has emerged, conceptualizing it not merely as a simple lack of pleasure, but rather as a more general or complex dysfunction in reward processing (e.g. Pizzagalli, 2022). According to this view, as reward processing comprises several discrete sub-steps (e.g. learning stimulus–reward associations; ensuing desire and anticipation; motivation and effort in acquiring rewards; consummatory pleasure), anhedonia may arise when one or more of these sub-steps are impaired (Kring & Barch, 2014; Pizzagalli, 2022; Rizvi, Pizzagalli, Sproule, & Kennedy, 2016). In support of this conceptualization, behavioral research has found an association between anhedonia and dysregulations in the reward system and in reward processing (Rømer Thomsen, 2015; Vrieze et al., 2013), showing, for example, a lack of systematic behavioral preference (i.e. developing a reward-related response bias) for rewarded stimuli among depressed and anhedonic individuals (Pizzagalli, Iosifescu, Hallett, Ratner, & Fava, 2008; Pizzagalli, Jahn, & O'Shea, 2005).
An important field of research that has been largely overlooked in this renewed view of anhedonia is the study of reward learning from an attentional perspective – the effects of prior reward learning on subsequent attention allocation, also known as reward-based selection history or experience-based attention selection (Awh, Belopolsky, & Theeuwes, 2012). As attention precedes behavior and guides thought and higher-order cognitive processes, such as working memory and decision-making (Desimone & Duncan, 1995; Feldmann-Wüstefeld, Busch, & Schubö, 2019), reward-related attentional allocation seems vital for subsequent stages of reward processing. Put differently, exploring how one's learning of the (rewarding) value of specific stimuli affects the way attention is later allocated to those stimuli, when encountered, may shed much-needed light on ensuing anhedonia-related processes. Indeed, research among healthy individuals has repeatedly shown that stimuli imbued with a rewarding value can later guide visuospatial attention allocation, even without conscious intent (Anderson, 2016, 2017; Failing & Theeuwes, 2018; Gaspelin, Gaspar, & Luck, 2019; Schwark, Dolgov, Sandry, & Volkman, 2013).
Conversely, only a few studies to date have explored the effects of reward-based selection history in depression, showing that while depressed individuals exhibit an intact ability to learn stimulus–reward associations, these associations fail to produce the subsequent changes in attention processes that are characteristic of non-depressed individuals (Anderson, 2017; Anderson, Leal, Hall, Yassa, & Yantis, 2014; Brailean, Koster, Hoorelbeke, & De Raedt, 2014). While providing initial evidence for aberrant reward-based selection history in depression, these studies entail several limitations curbing our understanding of this important phenomenon. Three limitations are related to the quantification of attention allocation via reaction-time (RT)-based measures. First, as RT-based measures are derived from keypresses occurring at the very end of the information processing sequence, different attentional components taking place earlier in the process can only be indirectly inferred from facilitated/impaired performance, providing no information about the course and dynamics of attention deployment before or after the moment of measurement (Lazarov et al., 2019; Lee & Lee, 2014; Thomas, Goegan, Newman, Arndt, & Sears, 2013). Second, RT-based tasks exhibit poor psychometric properties, including low internal consistency and test–retest reliability (Brown et al., 2014; Draheim, Mashburn, Martin, & Engle, 2019; Rodebaugh et al., 2016; Schmukle, 2005; Staugaard, 2009; Waechter & Stolz, 2015), even though adequate reliability is vital for trusting emergent results. Third, keypresses give rise to potential confounding elements related to the execution of the required motor responses (Hadwin & Field, 2010; Kimble, Fleming, Bandy, Kim, & Zambetti, 2010; Krajbich, Bartling, Hare, & Fehr, 2015), which is particularly relevant in depression due to psychomotor retardation (Caligiuri & Ellwanger, 2000).
Two additional shortcomings are related to the nature of rewards used during training/learning, as prior research has exclusively used monetary rewards. First, monetary reward, considered a secondary rather than a primary reinforcer, is less of a motivational driving force for depressed individuals, who tend to exhibit a priori disinterest in maximizing monetary gain (e.g. Godara, Sanchez-Lopez, & De Raedt, 2019; Maddox, Gorlick, Worthy, & Beevers, 2012; Pizzagalli et al., 2008). Indeed, primary and secondary rewards are associated with different neurological pathways (e.g. Blood & Zatorre, 2001; Menon & Levitin, 2005; Sescousse, Caldú, Segura, & Dreher, 2013; Thut et al., 1997). Second, mirroring the RT-based nature of tasks used, rewards were delivered via reaction-based feedback for single trials, following a short time interval between the response and reward delivery, rather than in a continuous ‘online’ manner that better reflects the dynamic nature of attention allocation. Continuous delivery is imperative for examining the influence of ongoing reward conditioning on continuous attentional allocation (Brailean et al., 2014).
The first aim of the study was to examine reward learning from an attentional perspective while addressing extant limitations of selection history research in depression (Anderson, 2017; Anderson et al., 2014; Brailean et al., 2014). Hence, here, reward-based selection history was examined using an eye-tracking-based gaze-contingent music reward procedure (Lazarov, Pine, & Bar-Haim, 2017b; Shamai-Leshem, Lazarov, Pine, & Bar-Haim, 2021), in which ongoing musical reward feedback was provided for attention allocation to one type of stimuli over another (i.e. two types of shapes; rounded over angular), creating an association between the (rewarded) stimulus type and the (rewarding) music. Attention allocation was assessed pre- and post-training using a reliable free-viewing eye-tracking attention allocation task (Lazarov, Abend, & Bar-Haim, 2016; Lazarov, Ben-Zion, Shamai, Pine, & Bar-Haim, 2018; Lazarov et al., 2021a), presenting similar stimuli to those used in training, but without gaze-contingent music. Based on past research, we predicted a differential change pattern in attention allocation from pre- to post-training (i.e. near-transfer effects), such that this change would be greater among individuals with low levels of depression symptoms, compared with individuals with high levels of depression symptoms. Potential group differences in reward learning during training (i.e. online training) were also explored, although prior research shows no group differences on this measure.
While encouraging a specific behavior with rewards (i.e. positive reinforcement) is clearly relevant to anhedonia and depression (Carvalho & Hopko, 2011; Manos, Kanter, & Busch, 2010), the same behavior can also be strengthened by the removal of an aversive or negative stimulus for performing the desired behavior. This process is known as negative reinforcement (Abreu & Santos, 2008; Reinen et al., 2021) – the removal of an aversive stimulus to increase the probability of a (desired) behavior being repeated (Gordan & Amutan, 2014). Thus, negative and positive reinforcement are similar in that both can be used to attain the same result – an increase in a (desired) behavior – but via different reinforcing cues/stimuli. In depression and anhedonia, research on learning processes shows that negative reinforcement can facilitate learning processes better than positive reinforcement (Beevers et al., 2013; Chiu & Deldin, 2007; Eshel & Roiser, 2010; Hevey, Thomas, Laureano-Schelten, Looney, & Booth, 2017; Maddox et al., 2012; Reinen et al., 2021; Santesso et al., 2008).
Relatedly, attention research shows depression to be associated with an attentional preference for aversive/dysphoric stimuli, over neutral or positive ones (Gotlib, Krasnoperova, Yue, & Joormann, 2004; Hamilton & Gotlib, 2008; Johnston et al., 2015; Rudich-Strassler, Hertz-Palmor, & Lazarov, 2022; Suslow, Husslack, Kersting, & Bodenschatz, 2020). One intriguing question in the present context is whether implementing the same gaze-contingent procedure while substituting the appetitive music reward (i.e. positive reinforcement) with the removal of an aversive sound (i.e. negative reinforcement) for performing the desired behavior (i.e. gazing at rounded shapes) would yield similar learning patterns, within and following training. This constituted the study's second aim. Hence, we replicated the above-described procedure, using a new cohort of participants, with one pivotal change – during training, gazing at rounded shapes stopped an aversive white noise that would otherwise play.
Study 1: positive reinforcement
Method
Participants
Participants were individuals with high (HD) and low (LD) levels of depression symptoms – 28 HD participants [M age = 23.0 ± 1.4, 22 (78.6%) females] and 30 LD participants [M age = 23.4 ± 1.8, 21 (70.0%) females]. Demographic and clinical characteristics by group are described in Table 1 (left panel). See online Supplementary Material for a detailed description of recruitment processes; inclusion/exclusion criteria; and the power analysis used to determine sample size.
Note. HD, high depression (group); LD, low depression (group); BDI-II, Beck Depression Inventory-II; PHQ-9, Patient Health Questionnaire-9; STAI-T, State-Trait Anxiety Inventory – Trait; SHAPS, Snaith-Hamilton Pleasure Scale; BMRQ, Barcelona Music Reward Questionnaire; CD-RISC, Connor-Davidson Resilience Scale.
Measures
Depression was assessed using the Patient Health Questionnaire-9 (PHQ-9; Kroenke, Spitzer, & Williams, 2001) and the Beck Depression Inventory-II (BDI-II; Beck, Steer, & Brown, 1996), trait anxiety via the State-Trait Anxiety Inventory-Trait subscale (STAI-T; Spielberger, Gorsuch, & Lushene, 1970), anhedonia via the Snaith–Hamilton Pleasure Scale (SHAPS; Snaith et al., 1995), and musical anhedonia using the Barcelona Music Reward Questionnaire (BMRQ; Mas-Herrero, Marco-Pallares, Lorenzo-Seva, Zatorre, & Rodriguez-Fornells, 2012). See online Supplementary Material for a detailed description of each measure.
Experimental tasks
Attention allocation assessment task. Attention allocation was assessed using a well-established eye-tracking-based free-viewing task (Lazarov et al., 2016, 2018, 2021a, 2021b) adapted for the present study (see online Supplementary Material for a full description). Briefly, participants freely viewed 30 4-by-4 shape matrices (i.e. 16 shapes per matrix), presented for 6000 ms each, with half of the shapes being without sharp angles (i.e. rounded shapes) and half having sharp angles (i.e. angular shapes; see Fig. 1, right panel, for an example). Attention allocation was quantified as dwell time percent (DT%) on rounded shapes (see below).
Gaze-contingent training task. The training task was a modified version of the assessment task, designed to divert participants’ attention toward rounded over angular shapes via music reward. Specifically, before each training block, participants chose a 12-minute music track (from an extensive music menu) to which they wanted to listen during the task. During each block, 30 successive shape matrices were presented, each for 24 s, with no inter-trial intervals. Importantly, the music played only when participants fixated one of the rounded shapes; fixating one of the angular shapes stopped the music. Here, too, attention allocation was quantified as DT% on rounded shapes. See online Supplementary Material for a full task description.
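The gaze-contingent rule described above can be sketched in a few lines. This is only an illustrative Python sketch, not the task software used in the study: the function name, the per-sample AOI labels, and the `off_between_shapes` option are hypothetical, and the text does not specify what happens when gaze falls on no shape, so both readings are offered and labeled as assumptions.

```python
from typing import Iterable, List, Optional

ROUNDED = "rounded"  # rewarded (target) shapes
ANGULAR = "angular"  # non-rewarded shapes


def music_states(
    samples: Iterable[Optional[str]],
    off_between_shapes: bool = True,
) -> List[bool]:
    """Return, for each gaze sample, whether the music would be playing.

    Each sample is "rounded", "angular", or None (gaze on no shape).
    """
    states: List[bool] = []
    playing = False  # assumption: music is off before the first fixation
    for aoi in samples:
        if aoi == ROUNDED:
            playing = True   # fixating a rounded shape plays the music
        elif aoi == ANGULAR:
            playing = False  # fixating an angular shape stops the music
        elif off_between_shapes:
            playing = False  # assumption: no music while looking at no shape
        # otherwise (assumption): the previous state is carried forward
        states.append(playing)
    return states


print(music_states(["rounded", None, "angular", "rounded"]))
```

For Study 2, the same rule would apply with the sign flipped: the returned flag would mark the samples on which the white noise is silenced rather than the samples on which music plays.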
Attention allocation (DT%). For each matrix, in both tasks, two areas of interest (AOI) were defined – the target AOI, comprising the eight (rewarded) rounded shapes, and the non-target AOI, comprising the eight (non-rewarded) angular shapes (see Fig. 1, left panel). Total dwell time (in milliseconds) on each AOI in each matrix (i.e. aggregating dwell time across the eight single shapes comprising the AOI) was calculated, and the proportion of dwell time (DT%) on the target AOI, relative to the total dwell time on both AOIs, was computed, reflecting attention allocation to rounded shapes on the matrix. DT% was then averaged across the presented matrices in a block (30 matrices).
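As a minimal sketch of this computation (in Python for illustration; the study's own analyses were run in R), DT% for a matrix is the target dwell time divided by the summed dwell time on both AOIs, and the block index is the mean across matrices. The function names and the example dwell times are hypothetical, and raising an error for a matrix with no dwell time on either AOI is an assumption, since the text does not say how such matrices were handled. Values are returned as proportions (0 to 1), matching how DT% is reported in the Results.

```python
from statistics import mean
from typing import Sequence, Tuple


def dt_percent(target_ms: float, nontarget_ms: float) -> float:
    """Proportion of dwell time on the target (rounded) AOI for one matrix."""
    total = target_ms + nontarget_ms
    if total == 0:
        # Assumption: a matrix with no dwell time on either AOI is an error.
        raise ValueError("no dwell time on either AOI")
    return target_ms / total


def block_dt_percent(matrices: Sequence[Tuple[float, float]]) -> float:
    """Average DT% across the matrices of one block.

    Each entry is (target dwell time, non-target dwell time) in milliseconds,
    already summed over the eight shapes of the respective AOI.
    """
    return mean(dt_percent(t, n) for t, n in matrices)


# Hypothetical dwell times for three matrices (a real block has 30).
example = [(3000, 2000), (2500, 2500), (4000, 1000)]
print(block_dt_percent(example))
```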
General procedure
The general procedure is fully described in the online Supplementary Material (see also Fig. 2). Briefly, during day 1, participants completed the assessment task (i.e. pre-training assessment), followed by two training blocks (B1, B2), and then completed the self-report measures. During day 2, participants first completed two additional training blocks (B3, B4), followed by the post-training assessment task, and were then questioned for explicit rule learning.
Data analysis
Main analysis. Independent-samples t tests compared groups on descriptive characteristics (e.g. age, PHQ-9, BDI-II, STAI-T, SHAPS and BMRQ), with a χ² test comparing groups on gender ratio. An independent-samples t test was also used to compare groups on attention allocation (DT%) at pre-training assessment.
Attention allocation during training, termed online learning, was analyzed using a repeated-measures analysis of variance (ANOVA) for DT% on rounded shapes, with group (HD/LD) as a between-subject variable, and training block (B1-to-B4) as a within-subject variable. A χ² test was used to compare groups on explicit rule learning.
To examine learning generalization from pre- to post-training, termed near-transfer effects, a repeated-measures ANOVA for DT% on rounded shapes was used, with group (HD/LD) as a between-subject variable, and time (pre-training/post-training) as a within-subject variable. Follow-up analyses included separate paired-samples t tests to compare DT% on rounded shapes between pre- and post-training within groups, and an independent-samples t test was used to examine between-group differences at post-training assessment. To address within-trial changes in attention allocation during the assessment task, we also conducted a time-course analysis of attention allocation by entering Epoch as another within-subject variable to the above-described ANOVA. Following extant eye-tracking-based attentional research exploring within-trial changes in attention allocation (Armstrong & Olatunji, 2012; Felmingham, Rennie, Manor, & Bryant, 2011; Kimble et al., 2010), each 6 s trial was divided into three 2 s time epochs (i.e. Epochs 1–3).
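To make the epoch split concrete, the following Python sketch divides one 6 s trial into three 2 s epochs and computes DT% on rounded shapes within each. It is an illustration under stated assumptions rather than the authors' pipeline: the fixation tuple format and function name are hypothetical, a fixation that straddles an epoch boundary is assumed to contribute to each epoch in proportion to its overlap, and an epoch with no dwell time on either AOI is assumed to yield no value.

```python
from typing import List, Optional, Tuple

TRIAL_MS = 6000
EPOCH_MS = 2000
N_EPOCHS = TRIAL_MS // EPOCH_MS  # three epochs per trial

# (start_ms, end_ms, AOI label), with the label "rounded" or "angular"
Fixation = Tuple[int, int, str]


def epoch_dt_percent(fixations: List[Fixation]) -> List[Optional[float]]:
    """DT% on rounded shapes for each of the three epochs of one trial."""
    dwell = [{"rounded": 0, "angular": 0} for _ in range(N_EPOCHS)]
    for start, end, aoi in fixations:
        for e in range(N_EPOCHS):
            lo, hi = e * EPOCH_MS, (e + 1) * EPOCH_MS
            # Assumption: split a fixation across epochs by its overlap.
            overlap = min(end, hi) - max(start, lo)
            if overlap > 0:
                dwell[e][aoi] += overlap
    result: List[Optional[float]] = []
    for d in dwell:
        total = d["rounded"] + d["angular"]
        # Assumption: an epoch with no dwell time has no DT% value.
        result.append(d["rounded"] / total if total else None)
    return result


# Hypothetical fixation sequence for a single trial.
trial = [(0, 1500, "rounded"), (1500, 2500, "angular"), (2500, 6000, "rounded")]
print(epoch_dt_percent(trial))  # [0.75, 0.75, 1.0]
```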
While no study to date has specifically explored differences between anxious and non-anxious individuals on attention learning/training, prior research has implicated anhedonia and deficient reward learning in anxiety disorders (e.g. Pike & Robinson, 2022; Taylor, Hoffman, & Khan, 2022). Hence, to rule out anxiety levels as a possible alternative explanation for significant between-groups results, we also conducted a repeated-measures analysis of covariance, controlling for anxiety scores, for significant findings.
All analyses were two-sided, using α of 0.05. Effect sizes are reported in ηp² for ANOVAs and Cohen's d for t tests. Analyses were carried out with the ‘stats’ package in R, and visualized using the ‘ggplot2’ package (Wickham, 2011).
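For readers who wish to check the reported ANOVA effect sizes, partial eta squared can be recovered from an F statistic and its degrees of freedom through the standard identity: F multiplied by the effect degrees of freedom, divided by the sum of that product and the error degrees of freedom. The Python sketch below is only a convenience check with a hypothetical function name (the analyses themselves were run in R), applied to two F values reported in the Study 1 Results.

```python
def partial_eta_squared(f_value: float, df_effect: int, df_error: int) -> float:
    """Partial eta squared implied by an F statistic and its degrees of freedom."""
    return (f_value * df_effect) / (f_value * df_effect + df_error)


# Study 1 main effect of block: F(3,168) = 24.5
print(round(partial_eta_squared(24.5, 3, 168), 2))  # 0.3
# Study 1 group-by-time interaction: F(1,56) = 7.82
print(round(partial_eta_squared(7.82, 1, 56), 2))   # 0.12
```

Both values agree with the effect sizes reported alongside these F statistics.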
Sensitivity analysis. Each main analysis was followed by a sensitivity analysis to ensure that emergent null findings did not stem from lack of power (i.e. type II errors). As opposed to the main analysis in which DT% (on rounded shapes) was averaged across the 30 matrices per block, yielding a single index per block, here each single matrix (i.e. each single trial) was treated as a separate observation. Specifically, we conducted a mixed-effects linear regression with DT% on rounded shapes as the dependent variable, and introduced each matrix to the model as a separate observation, instead of collapsing the 30 matrices of each assessment/training block into a single observation. Thus, for training, instead of having four observations per participant (i.e. 4 training blocks), each participant now provided 120 observations (i.e. 4 training blocks × 30 matrices per block), resulting in a substantially more powerful model. Similarly, in the assessment-phase model, each participant provided 60 observations (2 assessment blocks × 30 matrices per block) instead of only two. Participants were modelled as random factors to account for within-subject variance.
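The difference between the two data layouts can be illustrated with a short Python sketch. It only restructures the data and does not fit the mixed-effects model (which was fitted in R, and whose formula is not given here); the function name, field names, and DT% values are hypothetical placeholders.

```python
from statistics import mean
from typing import Dict, List, Union

Row = Dict[str, Union[str, int, float]]


def to_long(participant: str, group: str, dt: List[List[float]]) -> List[Row]:
    """One row per matrix (trial): the layout used for the trial-level model.

    `dt` holds one list of per-matrix DT% values for each block.
    """
    rows: List[Row] = []
    for block_no, block in enumerate(dt, start=1):
        for matrix_no, value in enumerate(block, start=1):
            rows.append({
                "participant": participant,
                "group": group,
                "block": block_no,
                "matrix": matrix_no,
                "dt_percent": value,
            })
    return rows


# Hypothetical training data: 4 blocks of 30 matrices for one participant.
training = [[0.5] * 30, [0.55] * 30, [0.6] * 30, [0.65] * 30]

block_means = [mean(block) for block in training]  # main analysis: 4 values
long_rows = to_long("P01", "LD", training)         # sensitivity analysis

print(len(block_means), len(long_rows))  # 4 120
```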
As in the main analysis, to rule out anxiety levels as a possible alternative explanation for significant between-groups results, we introduced anxiety scores to the model as a covariate. While groups did not differ on musical anhedonia, we decided to introduce BMRQ scores to the model to ascertain that musical anhedonia was unrelated to performance on the tasks.
We controlled the false discovery rate (FDR) with Benjamini and Hochberg FDR correction for multiple comparisons (Benjamini & Hochberg, 1995). Effect sizes in the sensitivity analysis are reported with standardized β. Analyses were carried out with the ‘lmerTest’ package in R (Kuznetsova, Brockhoff, & Christensen, 2017).
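For illustration, the Benjamini and Hochberg step-up adjustment can be written in a few lines of Python. This is a generic sketch of the published procedure, not the code used in the study (which relied on R); the function name and the example p-values are hypothetical. Each p-value is multiplied by the number of tests divided by its rank, and monotonicity is then enforced from the largest p-value downward.

```python
from typing import List, Sequence


def benjamini_hochberg(p_values: Sequence[float]) -> List[float]:
    """Return Benjamini-Hochberg adjusted p-values in the original order."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])  # indices, ascending p
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value to the smallest so that an adjusted
    # value never exceeds the adjusted value of a larger p-value.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, p_values[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


# Hypothetical raw p-values from four model terms.
raw = [0.01, 0.04, 0.03, 0.20]
print(benjamini_hochberg(raw))
```

A term would then be declared significant when its adjusted p-value falls below the chosen α (0.05 in the present analyses).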
Secondary analysis. As HD participants were not formally assessed per the diagnostic criteria for depression, and to account for within-group heterogeneity in PHQ-9 scores, we repeated our analysis using PHQ-9 scores as a continuous predictor in a mixed-effects linear model. As noted for the sensitivity analysis, each single matrix was treated as a separate observation while modeling participants as random effects. To address the potential effects of anhedonia (SHAPS scores) on emergent findings (i.e. the two groups also differed on SHAPS scores, also showing within-group heterogeneity), while accounting for its multicollinearity with PHQ-9 scores, we replicated this analysis in separate models with SHAPS scores, rather than PHQ-9 scores, as the predictor.
Results
Data and code for all analyses are openly available on the Open Science Framework.
Sample characteristics
Demographic and clinical characteristics by group are described in Table 1 (left panel). The HD group scored significantly higher on depression (PHQ-9; BDI-II), trait anxiety (STAI-T), and anhedonia (SHAPS) (p < 0.001). No group differences emerged for age, gender ratio, or musical anhedonia (BMRQ).
Online learning (DT% during training)
DT% on rounded shapes by group and training block is shown in Fig. 3a [left panel; see Fig. 3b (left panel) for individual trajectories]. Only a main effect of block emerged, F (3,168) = 24.5, p < 0.001, ηp² = 0.30, reflecting an increase in attention allocation toward rounded shapes during training, indicative of online reward learning. The sensitivity analysis confirmed these results (online Supplementary Table S1). Results were replicated with PHQ-9/SHAPS scores as a continuous predictor in mixed-effects linear models (see online Supplementary Table S9/S10, respectively).
Explicit rule learning
No significant group difference was noted for explicit rule learning [HD: 11 (39.3%) learners; LD: 14 (46.7%) learners], χ²(1) = 0.09, p = 0.76.
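As a worked check of this test, the Python sketch below computes a χ² statistic for the 2 × 2 table of group by explicit rule learning (HD: 11 learners, 17 non-learners; LD: 14 learners, 16 non-learners). The use of Yates' continuity correction is an assumption: it is not stated in the text, but it is the default for 2 × 2 tables in R's chisq.test and it reproduces the reported values. The function name is hypothetical.

```python
import math
from typing import Tuple


def chi2_2x2_yates(a: int, b: int, c: int, d: int) -> Tuple[float, float]:
    """Chi-square test (df = 1) for the 2 x 2 table [[a, b], [c, d]].

    Applies Yates' continuity correction (an assumption, see lead-in).
    Returns the statistic and its p-value.
    """
    n = a + b + c + d
    row1, row2 = a + b, c + d
    col1, col2 = a + c, b + d
    corrected = max(0.0, abs(a * d - b * c) - n / 2)
    chi2 = n * corrected ** 2 / (row1 * row2 * col1 * col2)
    # Upper-tail probability of a chi-square variable with 1 degree of freedom.
    p = math.erfc(math.sqrt(chi2 / 2))
    return chi2, p


chi2, p = chi2_2x2_yates(11, 17, 14, 16)
print(round(chi2, 2), round(p, 2))  # 0.09 0.76
```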
Near-transfer effects (DT% from pre- to post-training)
No significant group difference emerged for DT% on rounded shapes at pre-training (HD = 0.54 ± 0.08, LD = 0.52 ± 0.05), t (56) = 1.27, p = 0.21.
DT% on rounded shapes by group and assessment time is shown in Fig. 3c [left panel; see Fig. 3d (left panel) for individual trajectories]. A significant group-by-time interaction effect emerged, F (1,56) = 7.82, p = 0.007, ηp² = 0.12. Post-hoc analysis of the LD group revealed a significantly higher DT% at post-training (mean = 0.61 ± 0.18) relative to pre-training (mean = 0.52 ± 0.05), t (29) = 2.79, p = 0.009, Cohen's d = 0.52. For the HD group, no significant difference was found between pre- (mean = 0.54 ± 0.08) and post-training (mean = 0.52 ± 0.12). Group difference for DT% on rounded shapes at post-training was significant, t (56) = −2.15, p = 0.036, Cohen's d = −0.57. The group-by-time interaction effect remained significant after controlling for anxiety levels, F (1,56) = 7.82, p = 0.007, ηp² = 0.12.
Additional analyses showed that Epoch had no effect on DT% on rounded shapes and did not significantly interact with either time or group (see online Supplementary Table S7); that the sensitivity analysis confirmed the main results (see online Supplementary Table S2); and that results were replicated when treating PHQ-9/SHAPS scores as a continuous predictor [see online Supplementary Table S11/S12 and Fig. S1/S2 (left panel), respectively].
Study 2: negative reinforcement
Method
Participants
Akin to study 1, participants were 28 HD [M age = 23.2 ± 2.9, 23 (82.1%) females] and 30 LD [M age = 24.3 ± 3.4, 21 (70.0%) females] participants. Demographic and clinical characteristics by group are described in Table 1 (right panel). See online Supplementary Material for a comprehensive description.
Measures
The same measures as in study 1 were administered in study 2. Yet, as white noise, not music, was used as the reinforcer, rather than assessing musical anhedonia, we assessed resilience to adverse events via the Connor–Davidson Resilience Scale (CD-RISC; Campbell-Sills & Stein, 2007) and noise annoyance using a single question developed and recommended by the International Commission on Biological Effects of Noise (ICBEN; Fields et al., 2001). See online Supplementary Material for a detailed description of these measures.
Procedure, tasks, and measures
The procedure was identical to that of study 1, but with one crucial change – rather than music playing when fixating one of the rounded shapes, here gazing at one of these shapes (the target AOI) stopped an aversive white noise that would otherwise play. See online Supplementary Material for additional information on the procedure, tasks, and measures.
Data analysis
The statistical approach was similar to that of study 1.
Results
Data and code for all analyses are openly available on the Open Science Framework.
Sample characteristics
Demographic and clinical characteristics by group are described in Table 1 (right panel). As in study 1, the HD group scored significantly higher on depression (PHQ-9; BDI-II), trait anxiety (STAI-T), and anhedonia (SHAPS) (p < 0.001). The HD group also showed significantly lower resilience (CD-RISC) (p < 0.001), and scored marginally higher on noise annoyance (p = 0.08). No group differences emerged for age or gender ratio.
Online learning
DT% on rounded shapes by group and block is shown in Fig. 3a [right panel; see Fig. 3b (right panel) for individual trajectories]. Akin to study 1, only a main effect of block emerged, F (3,168) = 17.5, p < 0.001, ηp² = 0.24, reflecting the intended increase in attention allocation (toward rounded stimuli) during training in both groups. The sensitivity analysis confirmed these results (online Supplementary Table S3). Like study 1, results were replicated with PHQ-9/SHAPS scores as a continuous predictor (see online Supplementary Table S13/S14, respectively).
Explicit rule learning
No significant group difference was noted for explicit rule learning [HD: 15 (53.6%) learners; LD: 13 (43.3%) learners], χ²(1) = 0.27, p = 0.61.
Near-transfer effects
No significant group difference emerged for DT% on rounded shapes at the pre-training assessment (HD = 0.52 ± 0.04, LD = 0.50 ± 0.04), t (56) = 1.19, p = 0.24.
DT% on rounded shapes by group and assessment time is shown in Fig. 3c [right panel; see Fig. 3d (right panel) for individual trajectories]. Only a significant main effect of time emerged, F (1,56) = 18.8, p < 0.001, ηp² = 0.25. Diverging from study 1, no group-by-time interaction effect emerged, reflecting no group differences in DT% change from pre- to post-training. Post-hoc analysis of the main effect of time revealed a significantly higher DT% at post-training relative to pre-training in both the LD, t (29) = 2.97, p = 0.006, d = 0.55, and the HD group, t (27) = 3.16, p = 0.004, d = 0.61, with more time spent dwelling on rounded shapes at post-training (LD: mean = 0.57 ± 0.13; HD: mean = 0.60 ± 0.15), compared to pre-training (LD: mean = 0.50 ± 0.04; HD: mean = 0.52 ± 0.04). No group difference in DT% on rounded shapes emerged at post-training.
Akin to study 1, additional analyses showed that Epoch had no effect on DT% on rounded shapes and did not significantly interact with either time or group (see online Supplementary Table S8); that the sensitivity analysis confirmed the main results (see online Supplementary Table S4); and that results were replicated when treating PHQ-9/SHAPS scores as a continuous predictor [see online Supplementary Table S15/S16 and Fig. S1/S2 (right panel), respectively].
Additional analyses
To further explore the emergent data across both studies, the below-described additional analyses were conducted. For a complete description of all data analyses and results, see online Supplementary Material. All analysis code is openly available on the Open Science Framework.
First, to better elucidate the discrepancy between the two studies in the group-by-time interaction effect of learning generalization (i.e. near-transfer effect), we conducted an integrated group-level analysis using a unified model consisting of all participants from both studies (N = 116). Results confirmed that the discrepancy between the two studies was statistically significant [i.e. a group (HD/LD)-by-time (pre-post)-by-reinforcer (music/white noise) interaction; see online Supplementary Table S5].
Second, to better understand the emergent learning processes, individual eye-tracking gaze data were analyzed at the individual level, taking a within-person approach. Individual-level analyses included exploration of: (1) learning magnitude (for both online learning and near-transfer effects); (2) predicting learning (exploring whether specific changes in DT% between subsequent training steps could predict online learning and near-transfer effects); (3) (online) learning speed; and (4) (online) learning patterns (i.e. cluster analysis).
For learning magnitude, results replicated those of the group-level analyses – no group differences in online learning with either reinforcer, with a significant group difference on near-transfer, but only when reinforced with music (see Fig. 4 for descriptive individual trajectories of online learning; 4A for music and 4B for white noise). The pattern of associations between the three learning indices (online learning, near-transfer, explicit rule learning) further supported this result (see Fig. 5).
For predicting learning, results showed that matrices 6–10 (i.e. change in DT% from matrices 1–5 to matrices 6–10) were the only assemblage that consistently predicted online learning in both groups under both reinforcers. For near-transfer effects this assemblage was predictive among LD participants under both types of reinforcers, while among HD participants it was predictive only under white noise (see online Supplementary Table S6).
Zooming in on the ‘hot-spot’ that emerged above (matrices 1–10), learning speed results showed no group differences in speed under the music reinforcer. Conversely, HD participants showed faster learning compared with LD participants when reinforced with white noise.
Finally, the cluster analysis yielded three learning patterns – quick learners, slow learners, and non-learners – which differed significantly in their respective learning trajectories (see Fig. 6). Cluster distribution did not differ between reinforcer types and was independent of group under both music and white noise. Conversely, cluster was associated with explicit rule learning under both reinforcers. Comparing clusters on near-transfer effects (i.e. learning magnitude) showed that both learner types (quick, slow) showed significantly higher learning than non-learners under both reinforcers, with the two learner types not differing under either music or white noise. See online Supplementary Tables S_CA1–CA15.
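As an illustration of how learning trajectories can be grouped into patterns such as these, the sketch below applies a plain k-means procedure to per-participant DT% trajectories. This is not necessarily the clustering method used in the study (the method is not specified in this section); k-means is swapped in here purely for illustration, the trajectories are made-up values, and all function names are hypothetical. Initial centers are chosen with a deterministic farthest-first rule so that the example is reproducible.

```python
def squared_distance(a, b):
    """Squared Euclidean distance between two equal-length trajectories."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def centroid(points):
    """Point-wise mean of a non-empty list of trajectories."""
    return [sum(dim) / len(points) for dim in zip(*points)]


def farthest_first_centers(trajectories, k):
    """Deterministic initialization: start with the first trajectory, then
    repeatedly add the trajectory farthest from all centers chosen so far."""
    centers = [list(trajectories[0])]
    while len(centers) < k:
        farthest = max(
            trajectories,
            key=lambda t: min(squared_distance(t, c) for c in centers),
        )
        centers.append(list(farthest))
    return centers


def kmeans(trajectories, k=3, max_iterations=50):
    """Plain k-means; returns one cluster label per trajectory and the centers."""
    centers = farthest_first_centers(trajectories, k)
    labels = None
    for _ in range(max_iterations):
        new_labels = [
            min(range(k), key=lambda c: squared_distance(t, centers[c]))
            for t in trajectories
        ]
        if new_labels == labels:
            break  # assignments stopped changing
        labels = new_labels
        for c in range(k):
            members = [t for t, lab in zip(trajectories, labels) if lab == c]
            if members:
                centers[c] = centroid(members)
    return labels, centers


# Made-up mean DT% per training step for six hypothetical participants.
trajectories = [
    [0.50, 0.80, 0.85, 0.90], [0.52, 0.78, 0.86, 0.91],  # early increase
    [0.50, 0.55, 0.65, 0.80], [0.49, 0.57, 0.66, 0.79],  # gradual increase
    [0.50, 0.50, 0.50, 0.50], [0.51, 0.49, 0.50, 0.52],  # no change
]
labels, centers = kmeans(trajectories, k=3)
print(labels)
```

The cluster labels themselves are arbitrary integers; interpreting a cluster as 'quick', 'slow', or 'non-learning' would require inspecting the shape of each cluster center.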
Discussion
The present study examined reward learning in depression from an attentional perspective. In study 1, individuals with high (HD) and low (LD) levels of depressive symptoms underwent a novel gaze-contingent music reward learning procedure while their attention allocation to rewarded and non-rewarded stimuli was examined, during and following training. While no group differences in learning emerged during training, groups differed significantly in their attention allocation at post-training – unlike LD participants, HD participants showed no learning-related changes post-training. In study 2, a similar procedure with negative (i.e. white noise), rather than positive (i.e. music), reinforcement yielded no group differences, with both groups showing the intended change in attention allocation post-training. Results of both studies were maintained when controlling for anxiety, and were replicated when using a more powerful sensitivity analysis and when treating depression scores as a continuous variable/predictor.
The impaired near-transfer effect following reward learning in HD participants concurs with past research showing blunted reward responsiveness and impaired reward learning in depression and anhedonia, from both a neuroscience (Borsini, Wallis, Zunszain, Pariante, & Kempton, 2020; Eshel & Roiser, 2010; Keren et al., 2018; Luking, Pagliaccio, Luby, & Barch, 2016; Pizzagalli, 2022; Whitton, Treadway, & Pizzagalli, 2015) and a behavioral perspective (Eshel & Roiser, 2010; Halahakoon et al., 2020), while extending extant knowledge to the realm of attention. This lack of near-transfer effects is also in line with the few early RT-based studies of selection history in depression, which also showed less attentional capture post-training by previously rewarded stimuli in individuals with high depressive symptoms (Anderson et al., 2014, 2017). Yet, elaborating on these earlier studies, here, eye-tracking methodology was used to assess attention allocation following training, rather than RT-based attentional measures derived from manual keypresses, which enabled the exploration of the time course and dynamics of attention deployment (Lazarov et al., 2019). Relatedly, eye tracking was also used in the reward training/learning procedure itself – reward (i.e. the music) was delivered in a continuous gaze-contingent ‘online’ manner, rather than following a short time interval after the manual response (i.e. the keypress), better corresponding with the dynamic nature of ongoing attention allocation (Brailean et al., 2014). Finally, music reward, considered a primary reinforcer, was used, rather than monetary reward, considered a secondary reinforcer which is less motivating for depressed individuals (e.g. Godara et al., 2019; Maddox et al., 2012; Pizzagalli et al., 2008). The fact that groups did not differ on musical anhedonia, also found to be unrelated to performance on the tasks, suggests that current findings cannot be attributed to group differences on the rewarding value of the music (i.e. the liking aspect of anhedonia).
Unlike group differences in near-transfer following reward learning (study 1), no corresponding group differences were noted when using a similar negative-reinforcement procedure (study 2), echoing previous (non-attentional) research showing enhanced sensitivity to negative outcomes among depressed individuals (Baek et al., 2017; Beevers et al., 2013; Chandrasekhar Pammi et al., 2015; Hevey et al., 2017; Johnston et al., 2015; Maddox et al., 2012; Reinen et al., 2021; Santesso et al., 2008; Smoski et al., 2008; Trew, 2011), while expanding extant knowledge to the realm of attention. Our integrated and exploratory analyses further support the difference between the effects of positive and negative reinforcement among HD and LD individuals. Specifically, results showed that while positive and negative reinforcements yielded similar online learning in both groups, near-transfer effects (the intended shift in attention post-training) were noted under both reinforcements only among LD participants, while HD participants presented near-transfer effects exclusively under negative reinforcement. This suggests that for HD individuals, aversive reinforcers may yield better learning-based shifts in attention, compared with positive reinforcers, reflecting a specific aberration in reward-related selection history in depression.
This is further supported by the emergent correlations between the three learning indices (i.e. online learning, explicit rule learning, and near-transfer effects), which were positively associated among LD participants regardless of reinforcer type. Conversely, associations with near-transfer effects among HD participants emerged only when using negative, but not positive, reinforcement. Predicting learning based on the first 10 training matrices echoed these findings, as these matrices predicted online learning in both groups under both reinforcer types, but were predictive of near-transfer effects under both reinforcer types only among LD participants. For HD participants, prediction emerged only under the white noise reinforcer. Taken together, these results echo the ‘inverse functionality’ effect – hypo- and hyper-striatal activity among depressed individuals in response to reward and punishment, respectively (Groenewold, Opmeer, de Jonge, Aleman, & Costafreda, 2013; Johnston et al., 2015; Scheuerecker et al., 2010; Ubl et al., 2015), highlighting the potency of negative reinforcers in the facilitation of learning and attention modulation among depressed individuals. While this phenomenon is well-established in neuroscience, it has been relatively neglected in behavioral research, including attention.
Unlike the divergent results of the two studies/reinforcers when exploring near-transfer effects, examining performance during training (i.e. online learning) yielded similar results in both studies – both HD and LD participants showed the intended increase in attention allocation toward rounded shapes (online learning), echoing previous research on learning in depression, using both positive (e.g. Anderson, 2017; Anderson et al., 2014) and negative reinforcement (e.g. Maddox et al., 2012; Reinen et al., 2021). Our cluster analysis of gaze data during training further supports the notion that HD and LD participants do not differ in online learning, regardless of reinforcer type. Specifically, while three significant clusters emerged (i.e. quick learners, slow learners, and non-learners), cluster distribution did not differ between the two reinforcer types (positive, negative) or between groups (HD, LD) under either the music or the white noise condition.
Surprisingly, exploring learning speed during the first 10 training matrices showed a higher learning speed in the HD group, compared to the LD group, when reinforced with white noise, but not when reinforced with music, which may represent a possible ‘compensation’ mechanism enabling generalization of learning under negative reinforcement. Put differently, heightened experienced averseness of negative outcomes among HD individuals may better motivate or enable generalization of learning, which is absent when encountering rewards. Thus, ‘escaping’ aversive stimuli might be better embedded and reflected in subsequent selection history. The fact that compared with LD participants, HD individuals scored significantly lower on resilience to adverse events, and scored higher on noise annoyance (albeit at trend level), supports this suggestion, with noise annoyance also predicting attention allocation during training with white noise (at trend level).
Several limitations should be acknowledged. First, participants were individuals with high and low levels of depression symptoms. While stringent inclusion criteria were used (i.e. two depression measures at screening; score stability on the PHQ-9), future research should replicate the present study using clinically diagnosed MDD patients. Still, using depression scores as a continuous, rather than a grouping, variable yielded similar results, strengthening our confidence in current findings. Second, as the present study aimed to explore selection history in depression, building on past research in the field (Anderson, 2017; Anderson et al., 2014; Brailean et al., 2014), participants were recruited based on depression scores. It is very likely, however, that anhedonia – a key feature of depression (American Psychiatric Association, 2013) – plays a primary role in reward-based selection history in depression, possibly contributing to emergent results. Indeed, our sensitivity analysis with anhedonia scores yielded similar results to those obtained with depression scores. To further elucidate the specific role of anhedonia in reward-based selection history, future research could replicate the present study while recruiting participants based on anhedonia symptoms (e.g. SHAPS scores), or specifically recruit those high on anhedonia but low on depression. Relatedly, as anhedonia is a clinical feature of additional psychopathologies (e.g. PTSD), future research could also replicate the present study in these other conditions. Third, the present study did not include a follow-up assessment of attention allocation to examine the stability of near-transfer effects over time, which is especially important for the effects noted for HD participants under negative reinforcement (study 2).
Future research in depression should explore this, as previously done for positive-reinforcement procedures (Schneier & Lazarov, 2022; Shamai-Leshem et al., 2021). Fourth, as non-emotional geometrical shapes were used, we could not explore whether negative reinforcement could counter/overcome attention biases to negative information characterizing depressed individuals (Suslow et al., 2020), especially as positive-reinforcement procedures have failed in doing so (Shamai-Leshem et al., 2021). Future studies could replicate the present study using emotional stimuli (e.g. sad/happy faces). Fifth, rounded shapes were randomly chosen by the research team to serve as the target shape type, assuming no a priori differences between groups on attentional preference for rounded vs. angular shapes. Indeed, the pre-training assessment task showed no group differences in DT% (which was around 0.5 in both groups across both studies). Yet, we encourage future research to counterbalance the angular vs. rounded shapes as target shapes. Finally, reward-based selection history was explored using an established gaze-contingent paradigm previously used in depression (Shamai-Leshem et al., 2021). While advantageous in some aspects, this paradigm is ‘deterministic’ in nature as reinforcement is delivered using a 100% ratio – each fixation on rounded shapes resulted in music playing (study 1)/removal of noise (study 2). Hence, it does not entail the trial-wise dynamics of probabilistic reinforcement learning tasks used in past research on selection history in depression (Anderson, 2017; Anderson et al., 2014).
As probabilistic reinforcement learning tasks have shown a bidirectional interaction between attention allocation and trial-and-error reinforcement learning processes (e.g. Leong, Radulescu, Daniel, DeWoskin, & Niv, 2017), future research could incorporate non-100% (positive or negative) reinforcement ratios within the present paradigm.
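The contrast between the deterministic (100%) contingency and a possible probabilistic variant can be sketched as a simple decision rule. The sketch below is illustrative only: the function name `sound_on`, the labels `"rounded"`, `"angular"`, `"music"`, and `"white_noise"`, and the `ratio` parameter are hypothetical, and the behavior during fixations on angular shapes (no music in study 1, noise continuing in study 2) is an assumption inferred from the description of the contingency rather than a documented detail of the procedure.

```python
import random


def sound_on(fixated_shape, reinforcer, ratio=1.0, rng=random):
    """Decide whether sound should be playing during the current fixation.

    fixated_shape: "rounded" (target) or "angular" (non-target); hypothetical labels.
    reinforcer:    "music" (positive reinforcement, study 1) or
                   "white_noise" (negative reinforcement, study 2).
    ratio:         probability that a target fixation is reinforced;
                   1.0 corresponds to the deterministic 100% ratio,
                   values below 1.0 to a hypothetical probabilistic variant.
    """
    # random() returns values in [0, 1), so ratio=1.0 always reinforces
    # target fixations and ratio=0.0 never does.
    reinforced = fixated_shape == "rounded" and rng.random() < ratio
    if reinforcer == "music":
        return reinforced        # music plays only while reinforced
    if reinforcer == "white_noise":
        return not reinforced    # noise is removed only while reinforced
    raise ValueError(f"unknown reinforcer: {reinforcer!r}")


# Deterministic contingency (100% ratio).
print(sound_on("rounded", "music"))        # music plays
print(sound_on("rounded", "white_noise"))  # noise removed

# Hypothetical probabilistic variant: 80% of target fixations reinforced.
rng = random.Random(0)
reinforced_count = sum(sound_on("rounded", "music", ratio=0.8, rng=rng) for _ in range(1000))
print(reinforced_count / 1000)
```

Passing a seeded `random.Random` instance keeps the probabilistic variant reproducible, which matters when reinforcement sequences need to be matched across participants.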
Current findings may have some clinical implications, especially for reinforcement-based interventions. In attention, research has utilized gaze-contingent attention modification procedures to modify patients’ (biased) attention to dysphoric over positive/neutral stimuli (for reviews see Gotlib & Joormann, 2010; LeMoult & Gotlib, 2019; Suslow et al., 2020), hoping to alleviate depression symptoms (Möbius, Ferrari, van den Bergh, Becker, & Rinck, 2018; Shamai-Leshem et al., 2021; Woolridge, Harrison, Best, & Bowie, 2021). Especially relevant is a recent randomized controlled trial that used a similar gaze-contingent music reward procedure to divert participants’ attention away from sad and toward happy faces (Shamai-Leshem et al., 2021). While online learning was observed during training, no significant differences in symptom reduction were noted between the active group and a placebo group that received non-contingent music throughout training (see also Möbius et al., 2018; Woolridge et al., 2021 for similar null findings). Importantly, the two groups also did not differ on pre-to-post changes in attention allocation (i.e. near-transfer effects). Considering current results, this lack of clinical efficacy may be attributed to aberrant reward-based selection history in depression, namely, failure to induce experience-based shifts in attention following training. Put differently, if near-transfer effects are not achieved post-training, why should far-transfer effects (i.e. symptom change) follow? (Lazarov, Abend, Seidner, Pine, & Bar-Haim, 2017a). Indeed, using the same procedure in social anxiety showed a significant reduction in attention allocation post-treatment, sustained 3 months following training (Zhu et al., 2022), which partially mediated a significant reduction in symptoms (Lazarov et al., 2017b). Thus, present findings strengthen the specificity of aberrant reward-based selection history to depression. Taking a broader perspective, present results may also be relevant for other interventions, such as behavioral activation, that aim to induce or restore positive affect in depressed patients by encouraging rewarding activities (Hopko, Lejuez, Ruggiero, & Eifert, 2003). Yet, due to reward bluntness, this most often remains an unmet therapeutic goal (Craske et al., 2019). Current results accentuate the intricacies, including attentional ones, of relying on hedonic capacity in MDD interventions.
To conclude, present results implicate aberrant selection history in individuals with high levels of depression symptoms, but only when based on positive reinforcement, that is, when using rewards. When negative reinforcement is used, no deficits emerge. Current findings may offer future avenues for both clinical and basic research on learning and attention in depression and anhedonia, highlighting the efficacy of negative, over positive, reinforcement in producing experience-based changes in attention allocation.
Supplementary material
The supplementary material for this article can be found at https://doi.org/10.1017/S0033291723002519.
Acknowledgements
We wish to thank Enbal Blaiberg, Tal Daniel and Hadar Hallel for their help in data collection.
Financial support
This work was supported by the Israel Science Foundation, grant number 374/20 (Amit Lazarov). Nimrod Hertz-Palmor is supported by the Gates Cambridge Trust (#OPP1144).