
Allowing for non-adherence to treatment in a randomized controlled trial of two antidepressants (citalopram versus reboxetine): an example from the GENPOD trial

Published online by Cambridge University Press:  03 March 2014

N. J. Wiles*
Affiliation:
School of Social and Community Medicine, University of Bristol, UK
K. Fischer
Affiliation:
Estonian Genome Centre, University of Tartu, Estonia
P. Cowen
Affiliation:
Department of Psychiatry, University of Oxford, UK
D. Nutt
Affiliation:
Department of Neuropsychopharmacology, Imperial College London, UK
T. J. Peters
Affiliation:
School of Clinical Sciences, University of Bristol, UK
G. Lewis
Affiliation:
Mental Health Sciences Unit, University College London, UK
I. R. White
Affiliation:
MRC Biostatistics Unit, Cambridge, UK
*Address for correspondence: N. J. Wiles, Ph.D., Centre for Academic Mental Health, School of Social and Community Medicine, University of Bristol, Oakfield House, Oakfield Grove, Clifton, Bristol BS8 2BN, UK. (Email: nicola.wiles@bristol.ac.uk)

Abstract

Background

Meta-analyses suggest that reboxetine may be less effective than other antidepressants. Such comparisons may be biased by lower adherence to reboxetine and subsequent handling of missing outcome data. This study illustrates how to adjust for differential non-adherence and hence derive an unbiased estimate of the efficacy of reboxetine compared with citalopram in primary care patients with depression.

Method

A structural mean modelling (SMM) approach was used to generate adherence-adjusted estimates of the efficacy of reboxetine compared with citalopram using GENetic and clinical Predictors Of treatment response in Depression (GENPOD) trial data. Intention-to-treat (ITT) analyses were performed to compare estimates of effectiveness with results from previous meta-analyses.

Results

At 6 weeks, 90% of those randomized to citalopram were still taking their medication, compared with 72% of those randomized to reboxetine. In ITT analysis, there was only weak evidence that those on reboxetine had a slightly worse outcome than those on citalopram [adjusted difference in mean Beck Depression Inventory (BDI) scores: 1.19, 95% confidence interval (CI) –0.52 to 2.90, p = 0.17]. There was no evidence of a difference in efficacy when differential non-adherence was accounted for using the SMM approach for mean BDI (–0.29, 95% CI –3.04 to 2.46, p = 0.84) or the other mental health outcomes.

Conclusions

There was no evidence of a difference in the efficacy of reboxetine and citalopram when these drugs are taken and tolerated by depressed patients. The SMM approach can be implemented in standard statistical software to adjust for differential non-adherence and generate unbiased estimates of treatment efficacy for comparisons of two (or more) active interventions.

Type
Original Articles
Creative Commons
The online version of this article is published within an Open Access environment subject to the conditions of the Creative Commons Attribution licence http://creativecommons.org/licenses/by/3.0/
Copyright
Copyright © Cambridge University Press 2014

Introduction

Antidepressants are often prescribed in primary care as the first-line treatment for depression. In England in 2011, 46 million prescriptions for antidepressants were issued at a cost of £270 million (HSCIC, 2012). Selective serotonin reuptake inhibitors (SSRIs) are the most commonly prescribed (54% of prescriptions in 2011), with tricyclic antidepressants (TCAs) accounting for a further 29% of prescriptions issued (HSCIC, 2012).

Data on the comparative effectiveness of antidepressants suggest that there is little difference between them (Freemantle et al. 2000; Cipriani et al. 2009). Two meta-analyses suggest that reboxetine may be less effective (Cipriani et al. 2009; Eyding et al. 2010) but others have reported no such differences (Papakostas et al. 2008).

Reboxetine is a selective noradrenaline reuptake inhibitor (NaRI), and is the only drug of this class of antidepressants currently licensed in the UK. It is prescribed infrequently (0.1% of total prescriptions for antidepressants in 2011) (HSCIC, 2012). Notably, meta-analyses have highlighted a lower adherence to treatment with reboxetine compared with other antidepressants (Cipriani et al. 2009; Eyding et al. 2010). This differential non-adherence poses problems when examining the results of randomized controlled trials (RCTs) comparing two active treatments because commonly used methods to handle missing data may lead to biased estimates of effectiveness. In the meta-analysis by Cipriani et al. (2009), it was assumed that those patients who were missing outcome data had not responded to treatment. However, as reboxetine was less well tolerated than SSRIs, this imputation has the potential to introduce bias such that the outcome for those on reboxetine may seem less favourable. Meta-analysis of trials that have used a last observation carried forward (LOCF) approach to handling missing outcome data (Eyding et al. 2010) may be biased in the same direction. However, neither study explored the potential for bias arising from its approach to dealing with missing data.

Importantly, these meta-analyses have focused on treatment effectiveness, that is the average outcome of the ‘offer’ of treatment obtained from intention-to-treat (ITT) analyses, irrespective of adherence to the allocated treatment. However, once it has been established that a medication can be tolerated by a patient, clinicians are often interested in knowing the benefit conferred by that drug when taken as prescribed. There is therefore clinical utility in estimating the efficacy of the drug under ‘ideal conditions’ (Last, 1995), which includes full adherence to treatment. Estimates of treatment efficacy from ‘per-protocol’ analyses may be biased (Fleming, 2008), and are further complicated in trials of two (or more) active interventions when there is differential adherence to the allocated treatments. A structural mean modelling (SMM) approach to deal with the issue of non-adherence in trials of two active treatments has been proposed by Fischer et al. (2011).

The current study had two aims. First, to test whether two commonly used approaches to dealing with missing data introduce bias in estimates of effectiveness derived in the presence of differential non-adherence between treatment arms. Second, to use data from the GENetic and clinical Predictors Of treatment response in Depression (GENPOD) trial (Lewis et al. 2011; Wiles et al. 2012) to illustrate how to adjust for differential non-adherence in an RCT of two active interventions and hence to derive an unbiased estimate of the efficacy of reboxetine compared with citalopram in the treatment of primary care patients with a new episode of depression.

Method

The GENPOD trial

The GENPOD trial (Thomas et al. 2008) was designed to test two primary hypotheses regarding (1) genetic and (2) clinical predictors of response to antidepressant medication. There was no evidence that the genetic serotonin polymorphism 5-HTTLPR (Lewis et al. 2011) or severity of depression (Wiles et al. 2012) was associated with response to antidepressant medication. Secondary analysis of these trial data can provide information on the comparative efficacy of an SSRI (citalopram) and an NaRI (reboxetine).

Participants

Following agreement that an antidepressant should be prescribed, general practitioners (GPs) referred patients to the research team. Those eligible were aged 18–74 years, had a Beck Depression Inventory (BDI; Beck et al. 1996) score of ⩾15 and met ICD-10 criteria for a depressive episode (F32) using the computerized Clinical Interview Schedule – Revised (CIS-R; Lewis et al. 1992; Lewis, 1994). Those who gave written informed consent were randomized to receive either the SSRI citalopram (20 mg daily) or the NaRI reboxetine (4 mg twice daily).

Patients with psychosis, bipolar disorder or major substance or alcohol abuse problems were excluded, as were those who had taken antidepressants in the 2 weeks prior to baseline or who could not complete self-administered questionnaires.

Baseline measures

In addition to age, gender, BDI score and CIS-R score, the following data were recorded at baseline: ethnicity, marital status, employment status, financial strain [based on questions from the Breadline Britain survey (Gordon et al. 2000) and a single question asking about how they were managing financially (five response options)], details of home ownership (home owner, tenant, other), whether they had any longstanding illness, disability or infirmity, total number of physical symptoms (based on a list of 28 symptoms), history of depression (self/family) and prior treatment for depression, personality – conscientiousness [Big Five Inventory (BFI); John et al. 1991], Hospital Anxiety and Depression Scale (HADS; Zigmond & Snaith, 1983) score, life events, social support, alcohol use (Alcohol Use Disorders Identification Test for Primary Care, AUDIT-PC; Piccinelli et al. 1997), and scores on the 12-item Short Form Health Survey (SF-12) mental and physical subscales (Jenkinson & Layte, 1997).

Randomization procedure

Randomization was conducted by means of a computer-generated code, administered centrally and communicated by telephone and hence concealed from the recruiting researcher. Allocation was stratified by severity of overall symptoms (CIS-R score < 28 or ⩾28) and centre. The researcher gave the allocated medication to the participant. Neither patients nor researchers were blind to treatment allocation.

Allocated treatments

Patients randomized to citalopram were prescribed 20 mg daily. Citalopram taken at this dose has been shown to occupy about 80% of serotonin transporter reuptake sites, which is reported to be the level of occupancy needed to produce reliable antidepressant effects (Meyer et al. 2001).

Those randomized to reboxetine were advised to start on 2 mg twice daily and increase to 4 mg twice daily after 4 days. This stepped approach to starting reboxetine treatment was used on the advice of psychopharmacologists to minimize problems with lack of tolerance of this drug. Acute doses of 4 mg of reboxetine increase cortisol levels indicative of increased noradrenergic function (Hill et al. 2003) and this dose of drug also produces peripheral autonomic effects consistent with noradrenaline reuptake blockade (Szabadi et al. 1998). GPs could increase the dose of either allocated treatment if deemed clinically appropriate.

Measures of treatment adherence

Participants were asked about their use of antidepressant medication in the follow-up questionnaires (six closed response options: I have not taken any of my tablets; I have taken hardly any of my tablets; I have taken less than half of my tablets; I have taken more than half of my tablets; I have taken nearly all my tablets; I have taken my tablets every day).

Outcome measures

Self-reported outcome data were collected 6 and 12 weeks after randomization. For the purpose of this study, which demonstrates the approach to adjusting for differential non-adherence between the two treatments, we used the 6-week outcome data. The (original) primary outcome was the total BDI score at 6 weeks. Secondary outcomes were the HADS total and subscale scores and the SF-12 mental and physical subscale scores.

Dataset

The 6-week follow-up was completed by 91% of participants (n = 546) [citalopram: 274/298 (92%) and reboxetine: 272/303 (90%)]. Younger individuals, those with more life events and less social support were more likely to have missing data (Lewis et al. 2011). Adjustment for these variables made no difference to the main trial findings (Lewis et al. 2011) and there was no evidence that these factors were associated with adherence to medication (data not shown). Therefore, for the present analyses, the dataset comprised the 546 participants with 6-week follow-up data (complete cases).

Statistical analysis

All analyses were conducted in Stata version 11.1 (Stata Corporation, USA). To compare the data from the GENPOD trial with the previous literature on the comparative effectiveness of antidepressants (Papakostas et al. 2008; Cipriani et al. 2009; Eyding et al. 2010), we first conducted analyses on the effectiveness of reboxetine versus citalopram according to the ITT principle. We then examined the effect of two approaches to handling missing data that have been used in the previous meta-analyses to illustrate the potential for bias in such estimates of effectiveness in the presence of differential non-adherence. Finally, we focused on the application of the novel SMM approach to estimating treatment efficacy in the presence of differential non-adherence.

Estimates of effectiveness

The primary comparative ITT analysis compared the BDI score at 6 weeks between the two groups as randomized, with adjustment for baseline BDI score and the stratification variables. To estimate treatment effectiveness, data from all participants followed up at 6 weeks were included in these analyses, irrespective of adherence to the allocated medication.

Effect of imputing missing outcomes as ‘non-recovery’ or using an LOCF approach to handling missing outcome data on estimates of effectiveness

Previous studies comparing outcomes for those taking citalopram and reboxetine (Cipriani et al. 2009; Eyding et al. 2010) analysed data on an ITT basis but either: (1) assumed that those who were missing outcome data (which frequently equates to all those who had stopped the trial medication in psychopharmacology trials) had not responded to treatment (Cipriani et al. 2009) or (2) summarized data from publications that used an LOCF approach to handle missing data (Eyding et al. 2010). The effect of these two different approaches to handling missing data was examined by artificially constraining the GENPOD dataset such that only those who had continued to take their medication at 6 weeks were regarded as having outcome data.
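The direction of the bias described here can be illustrated with a small sketch. All numbers below are invented for illustration and are not GENPOD data: both arms are given identical true outcomes, and only the proportion stopping medication differs. The sketch also assumes that LOCF carries the baseline score forward because no intermediate measurement exists.

```python
# Hypothetical two-arm trial with identical true outcomes in both arms.
# Every number here is invented for illustration; none comes from GENPOD.

N = 100                # participants per arm (assumed)
RECOVERY_RATE = 0.40   # true recovery rate, same in both arms (assumed)
BASELINE_BDI = 33.0    # mean baseline score (assumed)
WEEK6_BDI = 21.0       # true mean 6-week score, same in both arms (assumed)
STOPPED = {"citalopram": 10, "reboxetine": 30}  # differential non-adherence


def odds_ratio(events_a, n_a, events_b, n_b):
    """Odds of the event in arm A relative to arm B."""
    odds_a = events_a / (n_a - events_a)
    odds_b = events_b / (n_b - events_b)
    return odds_a / odds_b


def summarise(arm):
    """Return recovery counts and mean BDI under each missing-data rule."""
    n_stopped = STOPPED[arm]
    n_adherent = N - n_stopped
    # Observed data: everyone is followed up, regardless of adherence.
    recovered_observed = round(RECOVERY_RATE * N)
    # Rule 1: those who stopped are treated as missing and imputed as
    # 'not recovered', so only adherent participants can count as recovered.
    recovered_imputed = round(RECOVERY_RATE * n_adherent)
    # Rule 2: LOCF replaces the 6-week score of those who stopped with
    # their last observed score, assumed here to be the baseline score.
    mean_bdi_locf = (n_adherent * WEEK6_BDI + n_stopped * BASELINE_BDI) / N
    return recovered_observed, recovered_imputed, mean_bdi_locf


cit_obs, cit_imp, cit_locf = summarise("citalopram")
reb_obs, reb_imp, reb_locf = summarise("reboxetine")

# Reboxetine relative to citalopram, matching the sign convention of the tables.
or_observed = odds_ratio(reb_obs, N, cit_obs, N)
or_imputed = odds_ratio(reb_imp, N, cit_imp, N)
locf_difference = reb_locf - cit_locf

print(f"OR for recovery, observed data:     {or_observed:.2f}")
print(f"OR for recovery, imputed as failed: {or_imputed:.2f}")
print(f"LOCF difference in mean BDI:        {locf_difference:.1f}")
```

In this toy setting both rules make the arm with more discontinuation look worse even though the true outcomes are the same, which is the mechanism the artificially constrained GENPOD analysis is designed to expose.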

Adherence-adjusted efficacy estimates

The final set of analyses generated unbiased estimates of treatment efficacy in the presence of differential non-adherence between treatment arms. The SMM method assumes that the mean outcomes in the two arms would be equal in the absence of treatment, and that each treatment has a (separate) linear causal effect on outcome. To estimate the two causal effects of treatment, the approach developed by Fischer et al. (2011) relies on identifying baseline variables that predict adherence differently in the two arms (i.e. they interact with a randomized group in a model for adherence) but that do not predict the causal effect of treatment (i.e. they do not interact with treatment in a causal model for clinical outcome). Baseline variables that predict adherence and/or outcome (as main effects) are also useful in improving precision. The following procedure was used to identify these baseline variables.

(1) Identifying predictors of outcome

All baseline variables that were possible predictors of outcome [age, gender, ethnicity, marital status, employment status, housing status, financial strain, history of depression (self/family), prior treatment for depression, longstanding illness, disability/infirmity, social support, life events, alcohol score, BDI score, HADS total/anxiety/depression subscale scores, SF-12 mental and physical subscale scores, and number of physical symptoms] were examined in univariable linear regression models with the BDI score at 6 weeks as a continuous outcome. Those variables that were identified as predictors of outcome at p < 0.20 were entered into a multivariable model. The most parsimonious model was identified using backwards selection and the likelihood ratio test until all remaining variables were retained at p < 0.10. Any variables not selected in the initial phase (univariable model: p ⩾0.20) were included in the final multivariable model one by one and retained if p < 0.10. This modelling process was repeated for each of the additional outcomes (HADS total and subscale scores and SF-12 mental and physical subscale scores). All models were adjusted for stratification variables and treatment allocation to improve precision.

This liberal modelling approach ensured that all potentially influential variables were included. Omission of a potentially important predictor of outcome from the SMM model would result in a loss of precision.

(2) Identifying predictors of adherence

GENPOD relied upon self-reported use of antidepressant medication. A quantitative measure of adherence is required for the SMM approach. Therefore, a pragmatic decision was made to rescale the six response options using increments of 0.2 to generate an adherence score scaled from zero to one, where zero represented total non-adherence and one indicated ‘perfect’ adherence. This rescaling of the adherence measure assumed that a 0.2 point increase in adherence had the same meaning across the scale.
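As a minimal sketch of this rescaling, the six response options listed under 'Measures of treatment adherence' can be mapped to scores in steps of 0.2. The names `RESPONSES`, `ADHERENCE_SCORE` and `arm_specific_scores` are hypothetical. Setting the score to zero in the arm a participant was not allocated to is an assumption about how arm-specific adherence variables might be coded, not something stated in the text.

```python
# Response options in order from total non-adherence to perfect adherence.
RESPONSES = [
    "I have not taken any of my tablets",
    "I have taken hardly any of my tablets",
    "I have taken less than half of my tablets",
    "I have taken more than half of my tablets",
    "I have taken nearly all my tablets",
    "I have taken my tablets every day",
]

# Equal increments of 0.2 give scores 0.0, 0.2, 0.4, 0.6, 0.8 and 1.0.
# round() removes floating-point noise such as 0.6000000000000001.
ADHERENCE_SCORE = {resp: round(i * 0.2, 1) for i, resp in enumerate(RESPONSES)}


def arm_specific_scores(response, arm):
    """Split one adherence score into two arm-specific variables.

    Assumption: the score counts only towards the allocated drug and is
    zero for the other drug.
    """
    score = ADHERENCE_SCORE[response]
    c1 = score if arm == "citalopram" else 0.0
    c2 = score if arm == "reboxetine" else 0.0
    return c1, c2


print(arm_specific_scores("I have taken nearly all my tablets", "reboxetine"))
```

The equal spacing encodes the assumption stated above, that a 0.2 point increase has the same meaning anywhere on the scale.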

The following baseline variables were possible predictors of adherence: sociodemographic factors (age, gender, ethnicity, marital status, employment status, housing status, financial strain), social support, history of depression (self/family)/prior treatment for depression, longstanding illness/disability/infirmity, personality – conscientiousness, life events, alcohol use, SF-12 physical subscale score, and eight physical symptoms (rapid heartbeat, agitation, dry mouth, sweating, constipation, diarrhoea, daytime drowsiness, and hot flushes). The total number of physical symptoms at baseline was excluded from the list because it was thought that individual physical symptoms may be more relevant to the question of adherence. For example, if someone was already experiencing a dry mouth, taking a drug likely to affect this may differentially affect adherence. The possible predictors of adherence were initially examined in univariable linear regression models with adherence score as the outcome, with adjustment for treatment allocation and predictors of outcome (identified using the process described earlier). All variables that were identified as predictors of adherence (either as a main effect or an interaction with treatment allocation in the univariable models at p < 0.20) were entered into a multivariable model with the variable specified in the appropriate form (main effect or main effect and interaction). Interactions were evaluated one at a time using the likelihood ratio test. Those variables for which the main effect or interaction was significant at p < 0.10 were retained in the final multivariable model.

In GENPOD, the primary hypotheses were about differential response to antidepressant treatment dependent on severity of depression and genotype. To be consistent with these hypotheses, it was deemed inappropriate to examine severity as a predictor of adherence to medication because severity may have predicted the effect of treatment other than through adherence. Therefore, all measures of severity of depressive symptoms (CIS-R, BDI, HADS and SF-12 mental subscale score) were excluded from the list of potential predictors of adherence.

(3) Generating adherence-adjusted estimates

The SMM approach (Fischer et al. 2011) was implemented using an instrumental variable (IV) model approach in Stata [ivregress command: two-stage least-squares (2sls) approach] for each of the outcomes (BDI, HADS and SF-12 mental and physical subscale scores). Each model was specified in the following format:

$$\eqalign{& \hbox{ivregress 2sls}\ y\ x1\ x2\ x3 \cr & \quad (c1\ c2 = r\ x1\ r{\ast}x1\ x2\ r{\ast}x2),}$$

where y = outcome, x1 = list of predictors of outcome (identified in stage 1), x2 = list of predictors of adherence (identified in stage 2), x3 = stratification variables (centre and CIS-R severity stratum), c1 = adherence score for those randomized to treatment group 1 (citalopram), c2 = adherence score for those randomized to treatment group 2 (reboxetine), r = treatment allocation, and * denotes an interaction, e.g. r*x1= interaction between treatment allocation and predictors of outcome.

The SMM method requires identification of baseline variables that predict adherence differentially in the two arms (Fischer et al. 2011). These variables were included in x2 and not in x1, so the interaction r*x2 was an essential part of the model specification, whereas the interaction r*x1 was unlikely to be important and could be omitted. Variables that may modify the causal effect of treatment should not be included in x1 or x2.
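Read alongside the assumptions stated under 'Adherence-adjusted efficacy estimates', the model behind this specification can be sketched as follows. The notation is introduced here for illustration and is not taken from the paper: $Y^{0}$ is the outcome that would have been seen without treatment, $X$ collects the baseline covariates, $R$ is the randomized arm, $C_1$ and $C_2$ are the arm-specific adherence scores (c1, c2), and $\psi_1$, $\psi_2$ are the effects of full adherence to citalopram and reboxetine.

$$\eqalign{E[\,Y - \psi_1 C_1 - \psi_2 C_2 \mid R, X\,] &= E[\,Y^{0} \mid R, X\,] \cr &= E[\,Y^{0} \mid X\,]}$$

The first equality expresses the assumption of a separate linear causal effect for each treatment, and the second expresses randomization (the treatment-free mean outcome does not depend on arm). The quantity of interest is $\psi_2 - \psi_1$. Because $R$ alone supplies one contrast but two parameters must be estimated, covariates whose association with adherence differs by arm (the $r{\ast}x2$ terms) are needed to separate $\psi_1$ from $\psi_2$.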

Taking outcome as BDI score at 6 weeks as an example, the IV model estimated the causal effects of full adherence to the two treatments (citalopram and reboxetine); that is, the difference in mean BDI scores for full adherence with the treatment compared to no adherence with any treatment. The difference between the two treatments was then tested formally using the lincom command (lincom c2 – c1), which estimates an adherence-adjusted difference in mean BDI scores between the two treatment groups and its 95% confidence interval (CI).
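The two-stage logic can be sketched on a small synthetic dataset. This is an illustration under stated assumptions rather than the authors' implementation. All data are invented. The helper names (`solve`, `ols`, `two_stage_least_squares`) are hypothetical. The unmeasured factor `u` is balanced within every arm-by-covariate cell so that the estimator recovers the true effects exactly, and both drugs are given the same true effect of full adherence (-10 BDI points).

```python
def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]
    for col in range(n):
        piv = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[piv] = M[piv], M[col]
        for r in range(col + 1, n):
            f = M[r][col] / M[col][col]
            for k in range(col, n + 1):
                M[r][k] -= f * M[col][k]
    x = [0.0] * n
    for i in reversed(range(n)):
        s = sum(M[i][k] * x[k] for k in range(i + 1, n))
        x[i] = (M[i][n] - s) / M[i][i]
    return x


def ols(X, y):
    """Least-squares coefficients from the normal equations X'X b = X'y."""
    p = len(X[0])
    XtX = [[sum(row[i] * row[j] for row in X) for j in range(p)] for i in range(p)]
    Xty = [sum(row[i] * yi for row, yi in zip(X, y)) for i in range(p)]
    return solve(XtX, Xty)


def two_stage_least_squares(y, exog, endog, instruments):
    """Return coefficients ordered as [exogenous..., endogenous...]."""
    Z = [e + z for e, z in zip(exog, instruments)]  # full instrument set
    fitted = []
    for j in range(len(endog[0])):
        # Stage 1: predict each adherence variable from the instruments.
        g = ols(Z, [row[j] for row in endog])
        fitted.append([sum(gi * zi for gi, zi in zip(g, z)) for z in Z])
    # Stage 2: regress the outcome on covariates and predicted adherence.
    X_hat = [e + [col[i] for col in fitted] for i, e in enumerate(exog)]
    return ols(X_hat, y)


TRUE_PSI = -10.0  # assumed effect of full adherence, identical for both drugs
# Mean adherence by (arm r, baseline covariate x); x predicts adherence
# differently in the two arms, which is what identifies the two effects.
MEAN_ADHERENCE = {(0, 0): 0.9, (0, 1): 0.8, (1, 0): 0.8, (1, 1): 0.5}

y, exog, endog, instruments, arm = [], [], [], [], []
for r in (0, 1):            # 0 = citalopram, 1 = reboxetine
    for x in (0, 1):
        for u in (-1, 1):   # unmeasured factor: raises adherence, lowers BDI
            c = MEAN_ADHERENCE[(r, x)] + 0.1 * u
            c1, c2 = (c, 0.0) if r == 0 else (0.0, c)
            y.append(30.0 + TRUE_PSI * c1 + TRUE_PSI * c2 + 2.0 * x - 3.0 * u)
            exog.append([1.0, float(x)])
            endog.append([c1, c2])
            instruments.append([float(r), float(r * x)])
            arm.append(r)

const, beta_x, psi1, psi2 = two_stage_least_squares(y, exog, endog, instruments)
efficacy_difference = psi2 - psi1  # analogue of 'lincom c2 - c1'

mean_arm0 = sum(v for v, a in zip(y, arm) if a == 0) / arm.count(0)
mean_arm1 = sum(v for v, a in zip(y, arm) if a == 1) / arm.count(1)
itt_difference = mean_arm1 - mean_arm0

print(f"ITT difference (reboxetine - citalopram): {itt_difference:.2f}")
print(f"Adherence-adjusted difference:            {efficacy_difference:.2f}")
```

In this constructed example the ITT contrast shows the lower-adherence arm doing worse, while the adherence-adjusted contrast is zero because the drugs were built to be equally efficacious. Real data would add sampling noise, and standard errors would need the usual 2SLS formulas, which this sketch omits.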

Sensitivity analyses were conducted removing predictors of adherence from the list of x2 variables one by one to examine the robustness of the findings from the SMM IV approach for each of the outcomes.

Results

Trial participation and follow-up

The Consolidated Standards Of Reporting Trials (CONSORT) flowchart and baseline comparability of the randomized groups have been published previously (Lewis et al. 2011). In total, 601 participants were randomized to receive either citalopram (n = 298) or reboxetine (n = 303). The mean age of participants was 38.8 years (s.d. = 12.4) and 68% (n = 408) were female. More than 90% of participants had moderate (n = 305) or severe depression (n = 245) according to ICD-10 criteria. The 6-week follow-up was completed by 91% (n = 546) of participants (citalopram: n = 274 and reboxetine: n = 272).

Adherence to, and dose of, medication

Of those randomized to citalopram, 90% (n = 246) were still taking their medication at the time of the 6-week follow-up, compared with 72% (n = 195) of those randomized to reboxetine (difference: 18.4%, 95% CI 12.0–24.8, p < 0.001). At the 6-week follow-up, 149 (55%) of those randomized to receive citalopram reported having taken their tablets ‘every day’, 90 (33%) had taken ‘nearly all’ their tablets, and 34 (12%) had taken ‘less than half’, ‘hardly any’ or none of their tablets. The comparable figures for those randomized to receive reboxetine were 113 (42%), 89 (33%) and 70 (26%). As reported previously (Lewis et al. 2011), the dose of the allocated medication was increased by the GP only for a minority of participants [citalopram: n = 55 (20%); reboxetine: n = 13 (5%)] during the trial.
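The reported difference in proportions and its CI can be checked with a short calculation. This is a sketch under two assumptions that are not stated in the text. The first is that the citalopram denominator is 273 rather than 274, inferred from the response counts above, which sum to 149 + 90 + 34 = 273. The second is that a simple Wald interval was used.

```python
import math


def risk_difference_ci(x1, n1, x2, n2, z=1.959964):
    """Difference in two proportions with a Wald 95% confidence interval."""
    p1, p2 = x1 / n1, x2 / n2
    diff = p1 - p2
    se = math.sqrt(p1 * (1 - p1) / n1 + p2 * (1 - p2) / n2)
    return diff, diff - z * se, diff + z * se


# Still taking medication at 6 weeks: citalopram 246 of 273 (assumed
# denominator), reboxetine 195 of 272.
diff, lower, upper = risk_difference_ci(246, 273, 195, 272)
print(f"{100 * diff:.1f}% (95% CI {100 * lower:.1f} to {100 * upper:.1f})")
```

Under these assumptions the calculation agrees with the figures given in the text to one decimal place.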

Estimates of effectiveness

Among the 546 participants who completed the 6-week follow-up, the mean BDI score at baseline was 33.6 (s.d. = 9.7). The corresponding figures by trial arm are given in Table 1. In an ITT analysis (Table 2), there was only weak evidence to suggest that those randomized to reboxetine had a worse outcome. On average, those on reboxetine scored one point higher on the BDI, although the 95% CI included no difference between groups. The results of the effectiveness analyses for the other mental health outcome measures (HADS total and anxiety/depression subscales; SF-12 mental subscale) were consistent with this (Table 2): those randomized to receive reboxetine had, on average, a higher score on the HADS (total and subscales) and a lower score on the SF-12 mental subscale, indicative of a worse outcome. Indeed, for the SF-12 mental health subscale, those randomized to reboxetine had a mean score that was, on average, two points lower than those randomized to receive citalopram, and the CI surrounding this estimate excluded the possibility of no difference. There was little evidence for any difference in outcome in terms of physical health (SF-12 physical subscale score) between those randomized to receive reboxetine and those randomized to receive citalopram (Table 2).

Table 1. Baseline and 6-week follow-up scores on the outcome measures according to allocated treatment group, in those who completed the 6-week follow-up

BDI, Beck Depression Inventory; HADS, Hospital Anxiety and Depression Scale; SF-12, 12-item Short-Form Health Survey; s.d., standard deviation.

a n = 273 for SF-12 scores.

Table 2. Differences in outcomes at 6 weeks from analysis of treatment effectiveness and estimates of efficacy from SMM models that account for differential non-adherence to allocated treatment

ITT, Intention-to-treat; BDI, Beck Depression Inventory; HADS, Hospital Anxiety and Depression Scale; SF-12, 12-item Short-Form Health Survey; CI, confidence interval.

a Adjusted for centre, baseline severity strata (Clinical Interview Schedule – Revised, CIS-R) and baseline score for outcome measure.

Difference is reboxetine minus citalopram. A positive difference for BDI and HADS (and a negative difference for SF-12 outcomes) indicates that those on reboxetine have a worse outcome than those on citalopram.

Effect of imputing missing outcomes as ‘non-recovery’ or using an LOCF approach to handling missing outcome data on estimates of effectiveness

There was little evidence of a difference in the binary outcome of ‘recovery’ (BDI score < 10 at 6 weeks) using observed data collected (irrespective of adherence to allocated medication) for 91% of GENPOD participants at 6 weeks when data were analysed using an ITT approach (Table 3).

Table 3. Examining the effect of different approaches to handling missing outcome data on the difference between treatment groups (estimates of effectiveness) in the presence of differential adherence to treatment

BDI, Beck Depression Inventory; ITT, intention-to-treat; LOCF, last observation carried forward; OR, odds ratio; CI, confidence interval; s.d., standard deviation.

a Adjusted for centre, baseline severity strata (Clinical Interview Schedule – Revised, CIS-R) and baseline BDI score.

Difference is reboxetine minus citalopram. An OR < 1 for ‘recovery’ or a positive difference for differences in BDI scores indicates that those on reboxetine have a worse outcome compared to those on citalopram.

Applying the assumption that those who stopped their medication had a poor outcome to the GENPOD data demonstrated that differential adherence to medication between arms introduced bias such that the outcome for those randomized to reboxetine appeared worse [odds ratio (OR) for response 0.70, 95% CI 0.45–1.10]. Additional imputation of a poor outcome for those individuals not followed up at 6 weeks had little effect (Table 3).

Similarly, using an LOCF approach to impute missing outcome data for those who had stopped their medication at 6 weeks suggested that, on average, the outcome for those randomized to reboxetine was three points higher on the BDI (more depressed) compared with those randomized to citalopram. Analysis of the observed outcome data at 6 weeks provided only weak evidence for a difference in outcome between the groups (Table 3).

Adherence-adjusted efficacy estimates

The analyses identified several predictors of outcome and adherence within the GENPOD dataset (see the online Appendix). As expected, for all outcomes, the strongest predictor of outcome was the baseline measurement. In terms of predictors of adherence, those from a non-white ethnic background were less likely to adhere to medication, whereas those who reported a rapid heartbeat were more likely to adhere to medication. Interactions with treatment allocation were found for three variables: marital status, prior history of depression and the personality trait of conscientiousness. Those who were married, those with a previous history of depression and those who were more conscientious were less likely to adhere to reboxetine. The full specification of the IV models that generated the adherence-adjusted estimates can be found in the online Appendix.

The adherence-adjusted differences in mean outcomes between the treatment groups are presented in Table 2. There was weak evidence that reboxetine was less efficacious than citalopram in terms of outcome on the SF-12 mental subscale, although the CI included the possibility of no difference. However, there was no evidence of a difference in efficacy between the two treatments based on the other outcomes including the BDI.

Sensitivity analyses for the adherence-adjusted efficacy estimates

The results of the sensitivity analyses examining the effect of removing predictors of adherence from the final SMM IV models for all outcomes are summarized in Table 4. Although the adjusted difference in means between treatment groups varied according to the list of predictors of adherence included in the SMM model (for some outcomes more than others), the estimates were broadly consistent when the CIs were compared.
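A leave-one-out check of this kind can be sketched as follows. This Python example uses simulated data with three hypothetical predictors of adherence, and it interprets "removing a predictor" as dropping that predictor's interaction with allocation from the instrument set while keeping it as a baseline covariate. That interpretation, the variable names and all parameter values are assumptions for illustration, not the GENPOD specification.

```python
import numpy as np


def two_stage_least_squares(y, exog, endog, instruments):
    """Return 2SLS coefficients for the endogenous regressors."""
    z = np.column_stack([exog, instruments])
    w = np.column_stack([exog, endog])
    w_hat = z @ np.linalg.lstsq(z, w, rcond=None)[0]  # first-stage fitted values
    beta = np.linalg.lstsq(w_hat, y, rcond=None)[0]   # second stage
    return beta[exog.shape[1]:]


rng = np.random.default_rng(1)
n = 20_000
x = rng.integers(0, 2, size=(n, 3)).astype(float)  # three hypothetical predictors
r = rng.integers(0, 2, n).astype(float)            # randomized allocation
u = rng.normal(size=n)                             # unmeasured confounder
s = x.sum(axis=1)
d1 = r * np.clip(0.8 - 0.2 * s + 0.1 * u, 0.0, 1.0)        # adherence to drug 1
d0 = (1 - r) * np.clip(0.5 + 0.1 * s + 0.1 * u, 0.0, 1.0)  # adherence to drug 0
noise = rng.normal(scale=3.0, size=n)
# Both drugs have the same assumed effect (-6 points) under full adherence.
y = 25 + x @ np.array([2.0, 1.0, -1.0]) + 3 * u + noise - 6.0 * d1 - 6.0 * d0

exog = np.column_stack([np.ones(n), x])
endog = np.column_stack([d1, d0])
interactions = r[:, None] * x


def efficacy_difference(keep):
    """Estimate psi_1 - psi_0 using allocation and the kept interactions."""
    instruments = np.column_stack([r, interactions[:, keep]])
    psi_1, psi_0 = two_stage_least_squares(y, exog, endog, instruments)
    return psi_1 - psi_0


estimates = {"all predictors": efficacy_difference([0, 1, 2])}
for k in range(3):
    keep = [j for j in range(3) if j != k]
    estimates[f"without predictor {k + 1}"] = efficacy_difference(keep)

for label, value in estimates.items():
    print(f"{label}: {value:.2f}")
```

If the estimates move little as each interaction is dropped, as they should here because the true difference is zero by construction, the result does not hinge on any single predictor of adherence.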

Table 4. Sensitivity analyses around adherence-adjusted instrumental variable (IV) efficacy estimates of the mean difference in outcome between treatment groups

BDI, Beck Depression Inventory; HADS, Hospital Anxiety and Depression Scale; SF-12, 12-item Short-Form Health Survey; BFI, Big Five Inventory; CI, confidence interval.

a Adjusted for centre, baseline severity strata (Clinical Interview Schedule – Revised, CIS-R) and baseline score for outcome measure.

There was no evidence to support an interaction between severity of depression (or genotype) and response to antidepressant in the GENPOD trial (Lewis et al. 2011; Wiles et al. 2012). Therefore, excluding severity as a predictor of adherence may be questioned. However, there was no evidence for an interaction between severity of depression and adherence to medication (for test of equality of coefficients: interaction between severity and adherence to citalopram/reboxetine, p = 0.27).

Discussion

We have demonstrated how to implement the SMM approach described by Fischer et al. (2011) in a standard statistical software package to obtain an unbiased estimate of treatment efficacy for a trial comparing two active treatments. Analysis was straightforward once suitable covariates for the SMM approach were identified. Data from the GENPOD trial of the two antidepressants citalopram and reboxetine were used as an exemplar.

The results of an effectiveness analysis (conducted according to the ITT principle) found only weak evidence that those randomized to reboxetine had a slightly worse outcome than those randomized to citalopram in terms of depressive symptoms (on the BDI/HADS). This is in contrast to previous meta-analyses (Cipriani et al. 2009; Eyding et al. 2010) that suggested that reboxetine was less effective than other antidepressants.

It is common practice in psychopharmacology trials for participants who stop taking their allocated medication not to be followed up. Outcomes are then imputed by assuming that those who stopped their allocated medication had a poor outcome or by carrying forward an earlier observation (LOCF). When we applied these approaches to the GENPOD data, by artificially assuming that outcomes were observed only for those who continued on their medication, we found stronger evidence of a poor outcome for those randomized to reboxetine compared with the results of analyses using all observed data. This clearly demonstrates that these common approaches to handling missing data may generate biased estimates of effectiveness when there is differential non-adherence between treatment arms.

Using the SMM approach to account for differential non-adherence to treatment between trial arms, we found no evidence of a difference in efficacy in terms of depressive symptoms (BDI) between reboxetine and citalopram at 6 weeks. The adherence-adjusted estimate (based on the difference in causal effects for full adherence to the treatment) was close to the null. There was weak evidence for a difference in efficacy between treatment with reboxetine and citalopram for the SF-12 mental subscale. In discussing these differences, it is important to consider whether these are clinically relevant. Although there is no consensus regarding a ‘minimum clinically important difference’ on these outcome scales, a change of 0.33 s.d. is often used as the target difference in primary care depression trials (Baxter et al. 2010). Hence, we would regard a three-point change in BDI score, a two-point change in HADS score (one point on subscales) and a three- to four-point change in SF-12 scores to be clinically important. The differences and CIs observed in terms of estimates of efficacy from analyses using the SMM approach are smaller than these and, except for the results for the SF-12 mental subscale, we can therefore exclude the possibility of a clinically important difference between citalopram and reboxetine in those who can tolerate the medications.
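As a rough check on these thresholds, the 0.33 s.d. rule can be inverted to see what standard deviation each threshold implies. This is illustrative arithmetic only. It assumes the thresholds were obtained by applying the rule directly and then rounding, and the implied s.d. values are not taken from the trial data.

```latex
\[
\Delta_{\text{target}} = 0.33 \times \text{s.d.}
\quad\Longrightarrow\quad
\text{s.d.} \approx \frac{\Delta_{\text{target}}}{0.33}
\]
\[
\text{BDI: } \frac{3}{0.33} \approx 9.1, \qquad
\text{HADS: } \frac{2}{0.33} \approx 6.1, \qquad
\text{SF-12: } \frac{3}{0.33} \approx 9.1 \ \text{to} \ \frac{4}{0.33} \approx 12.1
\]
```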

Strengths and limitations

The SMM approach used depends on finding baseline covariates that predict adherence differently in the two randomized groups but that may be assumed not to modify the causal effect of treatment. Bias would occur if the latter assumption failed. In addition, it is assumed that the average outcome does not depend on treatment assignment (the ‘exclusion restriction’). In a non-blinded trial such as GENPOD, there is a theoretical possibility that this assumption could be violated given prior beliefs about the treatment. However, there is little evidence to suggest that patients had different expectations of outcome for the two antidepressants.
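These assumptions can be written compactly. The notation below is introduced here for illustration and is not taken from the paper: $R$ is the randomized arm, $X$ the baseline covariates, $D_R$ and $D_C$ the adherence to reboxetine and citalopram (each zero in the other arm), and $Y$ the outcome. A linear structural mean model of this general kind posits

```latex
\[
E\left[\, Y - \psi_R D_R - \psi_C D_C \mid R, X \,\right]
= E\left[\, Y - \psi_R D_R - \psi_C D_C \mid X \,\right]
\]
```

In this formulation $\psi_R$ and $\psi_C$ do not depend on $X$, which is the assumption that the covariates do not modify the causal effect of treatment. The equality across arms encodes the exclusion restriction: once the effect of the medication actually taken is removed, the mean outcome no longer depends on allocation. The efficacy contrast is $\psi_R - \psi_C$, and the two parameters can be separated only if $E[D_R \mid R = 1, X]$ and $E[D_C \mid R = 0, X]$ vary with $X$ in ways that are not proportional to each other.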

Predictors of adherence were removed from the final SMM IV models one at a time to examine the robustness of the findings. The results of these sensitivity analyses show that the estimates were broadly consistent with the final SMM model incorporating all predictors of adherence.

GENPOD relied upon a self-report measure of adherence to medication. Use of electronic monitoring bottles would provide a more accurate measure of adherence. Such data would also provide a continuous adherence score as required for application of the SMM methodology. We rescaled the self-report adherence data to generate a continuous measure of adherence to apply this methodology, albeit therefore introducing some modelling assumptions. At the same time, there was no reason for participants to be motivated to mislead the researchers about their use of medication and we therefore have no reason to suppose that this measure was biased.

In total, 601 participants were recruited into the GENPOD trial, making this one of the largest primary care depression trials conducted. Nonetheless, despite its large size, it is of note that estimates obtained from models based on instrumental variables methods remain imprecise.

Comparisons with existing literature for comparative effectiveness of antidepressants

Meta-analyses have suggested that reboxetine may be less effective than other antidepressants (Cipriani et al. 2009; Eyding et al. 2010). However, in effectiveness analyses of data from the GENPOD trial, we found only weak evidence of very small differences in mental health outcomes (that were unlikely to be clinically significant) at 6 weeks for those randomized to reboxetine compared with citalopram. Both meta-analyses (Cipriani et al. 2009; Eyding et al. 2010) reported that patients randomized to reboxetine were more likely to discontinue treatment compared with those randomized to SSRIs, which is consistent with the findings from GENPOD. However, as we have demonstrated, the assumption that individuals with missing outcome data have not responded to treatment may introduce bias in estimates of effectiveness, such that those on reboxetine seem to do worse. It is therefore important to continue to follow up trial participants to collect outcome data even if they stop taking the trial medication.

Extensions to the SMM methodology

We have described the SMM approach for estimating efficacy for a singly-measured quantitative outcome. For a repeatedly measured quantitative outcome, a structural nested mean model could be used (Robins, 1994). For a binary outcome, the SMM approach can be used to estimate risk differences, but if interest lies in risk ratios or ORs then a multiplicative SMM or a generalized SMM is needed (Vansteelandt & Goetghebeur, 2003). For time-to-event outcomes, rank-preserving structural nested failure time models could be used (Robins & Tsiatis, 1991).

The methods we have described are especially appropriate for equivalence and non-inferiority trials because ITT analysis is known to be anti-conservative in such trials (Jones et al. 1996) whereas per-protocol analyses are potentially biased (Fleming, 2008). An alternative approach to handling non-adherence is the complier average causal effect (CACE; Dunn et al. 2003) model, but this is not well defined in trials comparing two active treatments and also requires adherence to be binary. Dichotomizing a continuous adherence measure is usually undesirable (White et al. 2011).

Implications and further research

It is common practice in RCTs of pharmacological interventions for participants not to be followed up if they stop taking the trial medication. Such a policy is at odds with conducting primary trial analyses according to the principle of ITT, and assumptions that are then made regarding missing data frequently bias estimates of effectiveness.

Differential non-adherence between treatment arms presents a particular challenge for trialists. However, as illustrated, it is possible to implement the analytical methods described (Fischer et al. 2011) in a standard statistical software package to take account of non-adherence to treatment when comparing two (or more) active interventions. Such methods will generate an unbiased estimate of the difference in treatment efficacy that is of value to the clinician in terms of describing the likely outcomes when drugs are both taken, and tolerated, by patients.

Supplementary material

For supplementary material accompanying this paper visit http://dx.doi.org/10.1017/S0033291714000221.

Acknowledgements

The GENPOD study was supported by the Mental Health Research Network and the Medical Research Council (MRC grant ref. G0200243). We are grateful for the support of the patients who agreed to participate and their general practitioners. We thank the people who contributed towards the GENPOD fieldwork, including the following: the late H. Lester, L. Webber, M. Turnbull, L. Paterson, B. Newton, A. Smith, N. Morris, L. Franks, J. Farrimond, N. Filer, C. Jarrett, A. Hill, J. Mulligan, V. Mason, D. Tallon and L. Thomas. We are grateful to colleagues who have been involved with the GENPOD study as co-applicants but who have not participated in drafting this manuscript: D. Sharp and M. O'Donovan. K.F. was supported by the Estonian Science Foundation (grant no. ETF9353). I.R.W. was supported by the MRC (Unit Programme no. U105260558).

Declaration of Interest

P.C. has been a paid member of advisory boards of Eli Lilly, Servier, Wyeth and Xytis and has been a paid lecturer for Eli Lilly, Servier and Glaxo Smith Kline. He has provided expert advice for solicitors representing Glaxo Smith Kline. D.J.N. has acted as consultant and speaker for both Lundbeck and Pfizer.

References

Baxter, H, Winder, R, Chalder, M, Wright, C, Sherlock, S, Haase, AM, Wiles, NJ, Montgomery, AA, Taylor, AH, Fox, KR, Lawlor, DA, Peters, TJ, Sharp, D, Campbell, J, Lewis, G (2010). Physical activity as a treatment for depression: the TREAD randomized controlled trial protocol. Trials 11, 105.
Beck, A, Steer, RA, Brown, GK (1996). Manual for the Beck Depression Inventory – Second Edition. The Psychological Corporation: San Antonio, TX.
Cipriani, A, Furukawa, TA, Salanti, G, Geddes, JR, Higgins, JPT, Churchill, R, Watanabe, N, Nakagawa, A, Omori, IM, McGuire, H, Tansella, M, Barbui, C (2009). Comparative efficacy and acceptability of 12 new-generation antidepressants: a multiple-treatments meta-analysis. Lancet 373, 746–758.
Dunn, G, Maracy, M, Dowrick, C, Ayuso-Mateos, JL, Dalgard, OS, Page, H, Lehtinen, V, Casey, P, Wilkinson, C, Vázquez-Barquero, JL, Wilkinson, G; ODIN group (2003). Estimating psychological treatment effects from a randomised controlled trial with both non-compliance and loss to follow-up. British Journal of Psychiatry 183, 323–331.
Eyding, D, Leigemann, M, Grouven, U, Harter, M, Kromp, M, Kaiser, T, Kerekes, MF, Gerken, M, Wieseler, B (2010). Reboxetine for acute treatment of major depression: systematic review and meta-analysis of published and unpublished placebo and selective serotonin reuptake inhibitor controlled trials. British Medical Journal 341, c4737.
Fischer, K, Goetghebeur, E, Vrijens, B, White, IR (2011). A structural mean model to allow for noncompliance in a randomized trial comparing 2 active treatments. Biostatistics 12, 247–257.
Fleming, TR (2008). Current issues in non-inferiority trials. Statistics in Medicine 27, 317–332.
Freemantle, N, Anderson, IM, Young, P (2000). Predictive value of pharmacological activity for the relative efficacy of antidepressant drugs. British Journal of Psychiatry 177, 292–302.
Gordon, D, Levitas, R, Pantazis, C, Payne, S, Townsend, P, Adelman, L, Ashworth, K, Middleton, S, Bradshaw, J, Williams, J (2000). Poverty and Social Exclusion Survey of Britain Questionnaire. Joseph Rowntree Foundation: York.
Hill, SA, Taylor, MJ, Harmer, CJ, Cowen, PJ (2003). Acute reboxetine administration increases plasma and salivary cortisol. Journal of Psychopharmacology 17, 273–275.
HSCIC (2012). Prescription Cost Analysis – England, 2011. Health and Social Care Information Centre: London.
Jenkinson, C, Layte, R (1997). Development and testing of the UK SF-12 (short form health survey). Journal of Health Services Research and Policy 2, 14–18.
John, OP, Donahue, EM, Kentle, RL (1991). The ‘Big Five’ Inventory – versions 4a and 54. Institute of Personality and Social Research, University of California, Berkeley: Berkeley, CA.
Jones, B, Jarvis, P, Lewis, J, Ebbutt, A (1996). Trials to assess equivalence: the importance of rigorous methods. British Medical Journal 313, 36–39.
Last, JM (1995). A Dictionary of Epidemiology. Oxford University Press: Oxford.
Lewis, G (1994). Assessing psychiatric disorder with a human interviewer or a computer. Journal of Epidemiology and Community Health 48, 207–210.
Lewis, G, Mulligan, J, Wiles, N, Cowen, PJ, Craddock, N, Ikeda, M, Grozeva, D, Mason, V, Nutt, DJ, Sharp, D, Tallon, D, Thomas, L, O'Donovan, M, Peters, TJ (2011). The 5HT transporter polymorphism and response to antidepressants: a randomized controlled trial. British Journal of Psychiatry 198, 464–471.
Lewis, G, Pelosi, AJ, Araya, R, Dunn, G (1992). Measuring psychiatric disorder in the community: a standardized assessment for use by lay interviewers. Psychological Medicine 22, 465–486.
Meyer, JH, Wilson, AA, Ginovart, N, Goulding, V, Hussey, D, Hood, K, Houle, S (2001). Occupancy of serotonin transporters by paroxetine and citalopram during treatment of depression: a [11C]DASB PET imaging study. American Journal of Psychiatry 158, 1843–1849.
Papakostas, GI, Nelson, JC, Kasper, S, Moller, H-J (2008). A meta-analysis of clinical trials comparing reboxetine, a norepinephrine reuptake inhibitor, with selective serotonin reuptake inhibitors for the treatment of major depressive disorder. European Neuropsychopharmacology 18, 122–127.
Piccinelli, M, Tessari, E, Bortolomasi, M, Piasere, O, Semenzin, M, Garzotto, N, Tansella, M (1997). Efficacy of the alcohol use disorders identification test as a screening tool for hazardous alcohol intake and related disorders in primary care: a validity study. British Medical Journal 314, 420–424.
Robins, JM (1994). Correcting for non-compliance in randomized trials using structural nested mean models. Communications in Statistics – Theory and Methods 23, 2379–2412.
Robins, JM, Tsiatis, AA (1991). Correcting for non-compliance in randomized trials using rank preserving structural failure time models. Communications in Statistics – Theory and Methods 20, 2609–2631.
Szabadi, E, Bradshaw, CM, Boston, PF, Langley, RW (1998). The human pharmacology of reboxetine. Human Psychopharmacology 13, S3–S12.
Thomas, L, Mulligan, J, Mason, V, Tallon, D, Wiles, N, Cowen, PJ, Nutt, DJ, O'Donovan, M, Sharp, D, Peters, T, Lewis, G (2008). Genetic and clinical predictors of treatment response in depression: the GENPOD randomized trial protocol. Trials 9, 29.
Vansteelandt, S, Goetghebeur, E (2003). Causal inference with generalized structural mean models. Journal of the Royal Statistical Society: Series B (Statistical Methodology) 65, 817–835.
White, IR, Kalaitzaki, R, Thompson, SG (2011). Allowing for missing outcome data and incomplete uptake of randomized interventions, with application to an internet-based alcohol trial. Statistics in Medicine 30, 3192–3207.
Wiles, NJ, Mulligan, J, Peters, TJ, Cowen, PJ, Mason, V, Nutt, DJ, Sharp, D, Tallon, D, Thomas, L, O'Donovan, M, Lewis, G (2012). Severity of depression and response to antidepressants: the GENPOD randomised controlled trial. British Journal of Psychiatry 200, 130–136.
Zigmond, AS, Snaith, RP (1983). The hospital anxiety and depression scale. Acta Psychiatrica Scandinavica 67, 361–370.

Table 1. Baseline and 6-week follow-up scores on the outcome measures according to allocated treatment group, in those who completed the 6-week follow-up

Table 2. Differences in outcomes at 6 weeks from analysis of treatment effectiveness and estimates of efficacy from SMM models that account for differential non-adherence to allocated treatment

Table 3. Examining the effect of different approaches to handling missing outcome data on the difference between treatment groups (estimates of effectiveness) in the presence of differential adherence to treatment
