In a meta-analysis on the selective serotonin reuptake inhibitors (SSRIs) published in BMC Psychiatry (Reference Jakobsen, Katakam, Schou, Hellmuth, Stallknecht, Leth-Møller, Iversen, Banke, Petersen, Klingenberg, Krogh, Ebert, Timm, Lindschou and Gluud1), Jakobsen and co-workers argued that the harm caused by these drugs outweighs any possible beneficial effects. We wrote a commentary in Acta Neuropsychiatrica (Reference Hieronymus, Lisinski, Naslund and Eriksson2), in which we questioned the scientific basis for this claim, to which some of the authors have now responded (Reference Katakam, Sethi, Jakobsen and Gluud3). This is a response to their response.
Bias
Some may regard the many factual inaccuracies as the major problem with the recent Copenhagen Trial Unit (CTU) contributions (Reference Jakobsen, Katakam, Schou, Hellmuth, Stallknecht, Leth-Møller, Iversen, Banke, Petersen, Klingenberg, Krogh, Ebert, Timm, Lindschou and Gluud1,Reference Katakam, Sethi, Jakobsen and Gluud3) to the SSRI debate. We will come back to this. Others may find it even more troubling that these alleged experts in how to interpret clinical trials have gone astray with respect to basic statistics, both in their first paper (Reference Jakobsen, Katakam, Schou, Hellmuth, Stallknecht, Leth-Møller, Iversen, Banke, Petersen, Klingenberg, Krogh, Ebert, Timm, Lindschou and Gluud1) and in their response to our comment (Reference Katakam, Sethi, Jakobsen and Gluud3). We will come back to this as well. And yet others might find it particularly difficult to understand how such a large number of relevant trials could escape the authors when scanning the databases, and that many relevant trials have still escaped them. This, too, will be commented on below.
In our view, the most problematic aspect of this debacle, however, is how the apparent eagerness of the authors to portray the SSRIs as ineffective and harmful makes them distort and misquote their own data, both in the BMC Psychiatry paper and when they present their results in lay media. Revealing what the CTU group probably had hoped for when starting this project, first author Janus Jakobsen has made numerous appearances in Scandinavian media claiming that he and his co-authors have shown SSRIs to enhance the risk for suicide (4,Reference Jakobsen, Naqash and Gluud5), notwithstanding that they, as Jakobsen must realise, have not shown SSRIs to enhance the risk for suicide. They now acknowledge, in their reply to our comment, that their results in fact do not show the risk of completed suicide to be increased in those given an SSRI, but suggest this outcome to be due to low statistical power, platitudinously pointing out that ‘absence of evidence is not evidence of absence’. But the absence of a significant association between SSRI treatment and suicide in the huge data set of Jakobsen and co-workers is not likely to be due to low power: completed suicide was in fact numerically less common in SSRI-treated patients. When arguing that ‘signals of SAEs do not require statistical levels below a certain level to be taken seriously’, they forget that their data, however disappointing this may seem to them, did not contain even the faintest signal of an enhanced risk for completed suicide in patients treated with SSRIs.
To justify their claim that SSRIs may enhance the risk for completed suicide, though their own data suggest otherwise, the authors seem to reason as follows: (i) completed suicide is a serious adverse event (SAE), (ii) SAEs are (at least according to their own calculations) more common in patients on SSRIs, (iii) ergo: completed suicide is more common in patients on SSRIs. But the alleged association between SSRI treatment and SAEs, needless to say, does not justify the conclusion that SSRI intake is associated with all possible SAEs. If one defines brain tumour and headache as ‘head-related AEs’, and demonstrates that drug X enhances the risk for headache, one may not conclude that X also causes brain tumour, especially not when a separate analysis of the occurrence of brain tumours lends no support whatsoever for such an assumption. The argumentation from the CTU group on this aspect is puzzling.
The same is true for the issue of ‘death’. While Jakobsen by means of lay media has informed SSRI-medicating depressed patients that their treatment may well kill them (4,Reference Jakobsen, Naqash and Gluud5), the numbers presented in his paper in fact (as far as we can tell) suggest deaths to be numerically less common in patients treated with an SSRI than in those given placebo.
There are numerous additional examples of the reluctance of the CTU researchers to interpret and report their own results in an impartial manner. While they cannot avoid mentioning that the p-value for the superiority of SSRIs over placebo with respect to reduction of the conventional effect parameter, the sum rating on the HDRS17 scale, was <0.00001, they spare no effort to convince the reader that ‘[t]he “true” effect of SSRIs might not even be statistically significant’ (Reference Jakobsen, Katakam, Schou, Hellmuth, Stallknecht, Leth-Møller, Iversen, Banke, Petersen, Klingenberg, Krogh, Ebert, Timm, Lindschou and Gluud1). As the main reason for Jakobsen and co-workers to make the eyebrow-raising claim that a p-value of <0.00001 may well be an artifact is that all the trials included in their analysis displayed a high risk of bias, one may ask why they cared to present these data at all, and why they now apparently plan to spend time on updating their analysis. Perhaps a brief note stating that, in their view, the risk for bias in all available trials precludes any conclusion regarding the possible efficacy of SSRIs would have sufficed? Readers concerned by this sombre verdict on the literature in the field may, however, find comfort in a recent comprehensive antidepressant meta-analysis published in the Lancet (Reference Cipriani, Furukawa, Salanti, Chaimani, Atkinson, Ogawa, Leucht, Ruhe, Turner, Higgins, Egger, Takeshima, Hayasaka, Imai, Shinohara, Tajika, Ioannidis and Geddes6) where only 9% of the trials were categorised as being at high risk of bias, and where the moderate risk of bias marring the majority of the trials did not preclude the authors from drawing conclusions regarding efficacy.
Leaving the issue of statistical significance aside, Jakobsen and co-workers also claim that the beneficial effect of SSRIs, be it significant or not, is too minuscule to be of any clinical significance. While they are of course free to advocate this opinion, we believe that less biased authors would have felt obliged to mention, in order to provide a balanced view, that there is indeed an abundant literature (of which the authors are clearly aware: see below) suggesting that using HDRS17 as a measure of effect markedly underrates SSRI-induced improvement, that is, that the clinical significance of the effect of SSRIs is likely to be considerably higher than the effect captured using this measure.
When they, on the other hand, manage to produce shaky p-values indicating that SSRIs may cause SAEs (without conducting any sensitivity analyses), it is correspondingly unfortunate that they refrain from discussing any of the many reasons why this observation should be interpreted with caution, for example, (i) that they have limited insight into the actual clinical impact of the SAEs tentatively associated with SSRI treatment, (ii) that their decisions on whether a certain adverse event should be categorised as serious or not were somewhat arbitrary (see Supplementary Material Table 1 for examples), (iii) that the same can be said for their decisions regarding which treatment groups to include (see below) (Reference Adamson, Sellman, Foulds, Frampton, Deering, Dunn, Berks, Nixon and Cape7,Reference Pettinati, Oslin, Kampman, Dundon, Xie, Gallis, Dackis and O’Brien8) and to what extent follow-up phases were to be included in the analysis (see Supplementary Material File 1) (Reference Detke, Wiltse, Mallinckrodt, McNamara, Demitrack and Bitter9–11), (iv) that trial reports often detail potential SAEs in the active treatment group (this being the issue of interest) while not providing corresponding information regarding similar events in patients on placebo (see Supplementary Material File 1) (Reference Feighner and Overo12–Reference Wernicke, Dunlop, Dornseif, Bosomworth and Humbert14), and (v) that SAE data from older trials must generally be interpreted with caution. Well, in their reply, the CTU team does in fact acknowledge that ‘the reporting of SAEs in most of the publications was very poor and incomplete’ (Reference Katakam, Sethi, Jakobsen and Gluud3), but without mentioning that this unfortunate state of affairs renders their own analyses and conclusions, not least with respect to the reporting of SAEs in the placebo groups, correspondingly poor and incomplete.
As a final example of the unfortunate bias characterising the CTU report, it may be mentioned that the authors, according to the exchange with the reviewers made public by BMC Psychiatry, in an early version of their manuscript concluded that there was a ‘high risk of publication bias’ – that is, that their material was skewed in favor of SSRI-positive studies – and presented the significant outcome of an Egger test to back this up. One of the reviewers, however, pointed out that the authors had misread their own analysis: what were actually ‘missing’ from the funnel plot were not studies disfavoring SSRIs, as assumed by Jakobsen and co-workers, but, on the contrary, trials supporting their usefulness (Reference Leucht15). Instead of reporting this unexpected indication of inverse publication bias, Jakobsen and co-workers chose to delete the statement that they had identified a ‘high risk of publication bias’ from the discussion, and refrained from reporting the significant Egger test, instead just stating that ‘visual inspection of the funnel plot did not show clear signs of asymmetry’ (Reference Jakobsen16). This is not how a researcher should deal with unwelcome results.
CTU comments on errors
When assessing the clinical importance of tentative drug-induced adverse events, there are of course two aspects to consider: (i) if they are relatively more prevalent in patients on active treatment than on placebo (i.e. the relative risk expressed, e.g., as an odds ratio) and (ii) how common they are (i.e. the absolute risk expressed, e.g., as a percentage point difference). In our commentary on the CTU paper, we pointed out (three times, actually) that many of the errors and mistakes we had identified in the CTU analysis should not be expected to exert any major influence on the relative risk estimates, but could be assumed to influence the absolute risk estimates. This is, for example, the case when Jakobsen and co-workers (i) do not include all relevant subjects when calculating the risk for SAEs, (ii) use total number of SAEs rather than total number of patients with an SAE as input data, (iii) exclude zero-event trials, and (iv) include outliers such as a study (Reference Pettinati, Oslin, Kampman, Dundon, Xie, Gallis, Dackis and O’Brien8) where the majority of SAEs were related to alcohol abuse. That the CTU group triumphantly elaborates on how these mistakes do not exert any major influence on relative risk estimates is hence beside the point.
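The distinction between relative and absolute risk can be illustrated with a minimal sketch. The counts below are hypothetical and are not taken from any of the trials discussed; the function names are likewise only illustrative. The sketch shows that doubling the number of recorded events in both arms (as might happen if events, rather than patients with an event, are counted) leaves the odds ratio almost unchanged while doubling the percentage point difference.

```python
def odds_ratio(events_t, n_t, events_c, n_c):
    """Odds ratio for treatment versus control from a 2x2 table."""
    return (events_t * (n_c - events_c)) / ((n_t - events_t) * events_c)


def risk_difference_pp(events_t, n_t, events_c, n_c):
    """Absolute risk difference, expressed in percentage points."""
    return 100.0 * (events_t / n_t - events_c / n_c)


# Hypothetical counts: (events on drug, n on drug, events on placebo, n on placebo)
patients_with_event = (30, 1000, 20, 1000)
events_double_counted = (60, 1000, 40, 1000)

for label, table in [("patients", patients_with_event),
                     ("events", events_double_counted)]:
    print(label,
          round(odds_ratio(*table), 2),
          round(risk_difference_pp(*table), 2))
```

With these made-up numbers the odds ratio stays close to 1.5 in both cases, whereas the absolute difference goes from one to two percentage points.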
Not only in this regard, but throughout their rebuttal, the comments from the CTU group on the errata disclosed in their paper are disappointing. For example, when we point out that they had included a trial with no placebo group (Reference Ball, Snavely, Hargreaves, Szegedi, Lines and Reines17), they reassure the reader that no mistake has been made: ‘as the trial included three groups: (i) aprepitant + paroxetine, (ii) aprepitant and (iii) paroxetine, we correctly considered groups i and ii for our review’. The major problem with this assertion, however, is that it is incorrect; it is hence evident from figures 2, 4, 5 and 6 that the groups considered were not (i) and (ii) but rather aprepitant alone (ii) and paroxetine alone (iii). Moreover, if the authors do indeed regard it appropriate to include treatment groups where an SSRI has been co-administered with another drug, it remains to be explained why they excluded the comparison of naltrexone versus naltrexone plus sertraline from another trial (Reference Pettinati, Oslin, Kampman, Dundon, Xie, Gallis, Dackis and O’Brien8), the inclusion of which would have changed the combined OR for sertraline versus comparator with respect to SAEs from 1.53 (0.59–3.94) to 0.86 (0.43–1.71). Adding to the confusion, they did, on the other hand, include the corresponding comparison of naltrexone versus naltrexone plus citalopram from another study (Reference Adamson, Sellman, Foulds, Frampton, Deering, Dunn, Berks, Nixon and Cape7) where, in contrast, the inclusion of this kind of comparison enhanced the apparent association between SSRI administration and SAEs.
With respect to their omission of an escitalopram arm in trial SCT-MD-01, we are told by the CTU group in their response that ‘there were no SAEs in the escitalopram 10 mg and placebo group’. This is a surprising statement as there were indeed two SAEs in the placebo group (18). But what the authors also fail to acknowledge is that they, by excluding SSRI-treated patients without SAEs, while retaining all placebo-treated patients, inflated the apparent rate of SAEs on SSRI treatment. Thus, while excluding the SAE-free escitalopram 10 mg arm yields a combined OR for study SCT-MD-01 of 0.97 (0.17–10.95), retaining all arms instead yields an OR of 0.67 (0.12–3.70). We are, however, pleased to note that this mistake has been discreetly corrected, but without further comment, in their reply (see table 2).
To justify the exclusion of female-specific SAEs in study GSK/810, which were more common in patients on placebo (19), the CTU group claims that ‘it was not clear whether the same participants had any other SAEs that were reported in the main table in that study report’. However, when extracting data from similar GSK study reports presenting separate sets of fatal and non-fatal SAEs, respectively (20,21), the CTU group adopted the opposite policy, that is, it included both types of SAEs, though they may well have occurred in the same individuals, which of course makes their post-hoc explanation for excluding the female-specific SAEs in GSK/810 less credible than one had hoped for. When female-specific SAEs are excluded in GSK/810, the combined OR is 1.13 (0.29–4.45); when they are included it drops to 0.77 (0.25–2.40).
Additional errors
When we pointed out that the CTU report was marred by a large number of factual errors and inconsistencies, we hoped that the authors would realise that they had underestimated the difficulties of this endeavor, and that their results were far too shaky to permit the claims they have been trumpeting. But instead they seem to find comfort in the fact that they have now managed to obtain new significant p-values, after correcting a selection of the inaccuracies that we mentioned. This line of reasoning, however, misses the point of our previous criticism: the many errors we listed in our comment were just examples, to be seen as illustrations of a flawed process, which is not remedied merely by the correction of some of these examples.
For brevity, and to avoid boring the reader, we will not elaborate on the many additional mistakes, apart from those previously mentioned, that can be found in the original CTU paper. We, however, do provide some more examples in Supplementary Material File 1. It should, however, be underlined that these are also just illustrative samples that were identified upon our relatively cursory review; anyone caring to take a closer look would probably find more to add to the errata list.
It may seem petty to dissect the many errors unfortunately marring the paper from the CTU group. However, if one, like Jakobsen and co-workers, invokes unimpressive p-values to question the opinion of a vast majority of researchers in the field, as well as of medical authorities throughout the world, with respect to a treatment that by many is regarded as life-saving, it is preferable to get one’s numbers right.
Missing trials
When denouncing previous meta-analyses in this field, Jakobsen and co-workers in their BMC Psychiatry paper name ‘not searching all relevant databases’ as one reason to discard these reports. In our comment, we hence found it relevant to mention some of the many trials they had missed themselves. In their reply, the CTU group expresses surprise over the fact that we did not include these trials when we, after correcting for other mistakes, tried to replicate their statistical analyses.
This criticism is unjustified. We repeated the CTU analysis using the same trials that were included in their analysis merely to demonstrate the lack of robustness in their result: when rectifying the methodological mistakes we had identified upon our first review, the p-value that was the basis for their claim of fame turned from significant to non-significant. It has, however, never been our ambition to provide them with an exhaustive list of missed publications, or to correct for all possible errors, or to conduct a comprehensive meta-analysis of our own.
Just as it would not have made any sense for us to add a number of additional, unsystematically assembled data to our re-analysis, it is not very helpful that the CTU group has now conducted a new analysis to which they have added the trials mentioned by us, as well as some they have now identified by themselves, as a large number of trials are still missing. For example, while the CTU group, during their second scanning of the relevant registries, managed to locate two Eli Lilly-sponsored studies, HMAQa (22) and HMATb (23), they seem to have missed two other studies, HMAQb (24) and HMATa (25), in the same repository; it so happens that the two trials they found were both compatible with the assumption that SAEs are more common in SSRI-treated subjects while those they missed, on the contrary, suggest SAEs to be less common in subjects on SSRIs. Likewise, when fine-combing the GSK registry, they again seem to have missed some apparently relevant studies (26,27). Moreover, while including one trial of a substance P antagonist that did not include a placebo arm (Reference Ball, Snavely, Hargreaves, Szegedi, Lines and Reines17), they missed four additional studies regarding the same drug that were actually both placebo- and paroxetine-controlled (Reference Keller, Montgomery, Ball, Morrison, Snavely, Liu, Hargreaves, Hietala, Lines, Beebe and Reines28,Reference Liu, Snavely, Ball, Lines, Reines and Potter29). And while the CTU team did find SAE data for a placebo-controlled reboxetine trial using an SSRI as comparator, they failed to identify efficacy data for the same study, and they also failed to identify three unpublished reboxetine studies also including SSRI and placebo arms (30–Reference Massana34). And these are just a selection from the smorgasbord of missed studies (where one may also, to mention just some additional examples, find refs (35–40)).
Statistics
In their response the authors confirm that they deviated from the protocol with respect to how their data were analysed, but seem to take this incident unexpectedly light-heartedly, given that deviating from the protocol is usually regarded as a felony of the gravest kind by Cochranists. Had we been Jakobsen and co-workers, we would have found this lapse somewhat embarrassing, particularly as they explicitly stated in their original report that ‘[t]he methodology was not changed after the analysis of the review results began’. But changed it was, and had it not been changed, the results would not have been those that Jakobsen and co-workers must have hoped for. We also note with interest the reason now presented for why they changed statistical techniques as compared with what was stated in the protocol: it was due to the fact that SAEs in SSRI trials were rarer than Jakobsen and co-workers had anticipated. That SAEs were unexpectedly rare in SSRI trials has certainly not been the impression conveyed when they subsequently commented on their results in lay media.
In their rebuttal, the CTU group denounces our attempt to re-analyse their SAE data after correcting for some of their many errors, as we used the software they reported to have used themselves (RevMan 5.3) rather than the one they did in fact use, that is, STATA. It is correct that the continuity correction method used by RevMan may make the relative prevalence of SAEs in placebo groups appear higher than it actually is, as these groups are usually smaller, but Katakam and co-authors are mistaken when assuming this to be the major source of the discrepancy between the results obtained with RevMan and STATA, respectively. Instead, the major reason for this incongruity is the fact that the Mantel-Haenszel procedure in STATA is a fixed-effect one: when the authors claimed that they had used the random-effect implementation in RevMan for this analysis, they hence misinformed the reader twice: it was not RevMan and it was not a random-effect procedure.
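For readers unfamiliar with the fixed-effect Mantel-Haenszel estimator, its point estimate can be sketched as follows. This is the standard textbook formula for the pooled odds ratio, not a reproduction of the STATA or RevMan implementations; the function name and the two studies are hypothetical and serve only as illustration. A random-effect procedure would, in addition, incorporate an estimate of the between-study variance into the study weights, which is why the two approaches can give different results for the same data.

```python
def mantel_haenszel_or(studies):
    """Fixed-effect Mantel-Haenszel pooled odds ratio.

    Each study is a tuple (events_t, n_t, events_c, n_c):
    events and arm size for treatment, then for control.
    """
    numerator = 0.0
    denominator = 0.0
    for events_t, n_t, events_c, n_c in studies:
        total = n_t + n_c
        # a*d/N: treated with event times controls without event
        numerator += events_t * (n_c - events_c) / total
        # b*c/N: treated without event times controls with event
        denominator += (n_t - events_t) * events_c / total
    return numerator / denominator


# Hypothetical studies, not taken from the CTU material
hypothetical_studies = [(10, 100, 5, 100), (6, 50, 3, 50)]
print(round(mantel_haenszel_or(hypothetical_studies), 2))
```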
With regard to the use of reciprocal zero-cell correction, the CTU group has again failed to implement the procedure according to the recommendations in the cited paper (Reference Sweeting, Sutton and Lambert41). The simple principle behind the method of Sweeting is the following: if one arm is, for example, three times the size of the other, the continuity correction factor (CCF) added to that arm should be three times as big; this is done so that relatively more events are not added to the smaller arm. However, according to the simulation studies detailed in the same paper, for this method to perform optimally, the CCFs for the two arms should be constrained to sum to 1, which seems to have escaped the CTU group. While it is impossible to assess the extent or direction of bias introduced by choosing a particular statistical method in a material such as this, which comprises different age groups, treatments, doses, outcomes, comorbidities and treatment imbalances, the haphazard implementation and presentation of statistics marring the CTU paper is unfortunate as such choices do impact the results of the analyses.
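The principle described above can be sketched in a few lines. This is a minimal illustration of a CCF that is proportional to arm size and constrained to sum to 1 over the two arms, as we read the recommendation; the function names and the example counts are hypothetical, and the sketch is not a reproduction of any particular software implementation.

```python
def reciprocal_ccf(n_treat, n_ctrl):
    """Continuity correction factors (treatment, control).

    Each factor is proportional to the size of its own arm
    (equivalently, to the reciprocal of the opposite arm's size),
    and the two factors are constrained to sum to 1.
    """
    total = n_treat + n_ctrl
    return n_treat / total, n_ctrl / total


def apply_zero_cell_correction(events_t, n_t, events_c, n_c):
    """Return the 2x2 cells (a, b, c, d), corrected only if a cell is zero."""
    a, b = events_t, n_t - events_t
    c, d = events_c, n_c - events_c
    if 0 not in (a, b, c, d):
        return float(a), float(b), float(c), float(d)
    k_t, k_c = reciprocal_ccf(n_t, n_c)
    return a + k_t, b + k_t, c + k_c, d + k_c


# Hypothetical trial: 2 events among 300 on drug, 0 among 100 on placebo
print(reciprocal_ccf(300, 100))
print(apply_zero_cell_correction(2, 300, 0, 100))
```

In this made-up example the treatment arm is three times the size of the placebo arm, so it receives a CCF of 0.75 against 0.25 for placebo, rather than the 0.5 per arm used by a constant correction.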
In their reply (Reference Katakam, Sethi, Jakobsen and Gluud3), the CTU group states that ‘once it was evident that SAEs were rare events, we followed the Cochrane methodology (Reference Jakobsen, Nielsen, Feinberg, Katakam, Fobian, Hauser, Poropat, Djurisic, Weiss, Bjelakovic, Bjelakovic, Klingenberg, Liu, Nikolova, Koretz and Gluud55) and the method recommended by Sweeting et al.’ Is that really so? If they had indeed come to understand that STATA was the strategy of choice when the studied event is rare, it is surprising that they did not re-write the Methods section accordingly, but even more surprising that they seem to have refrained from using Sweeting’s method for events that were even rarer than SAEs in general, such as individual adverse events including suicides, suicide attempts and suicidal ideation. One rather gets the impression that Jakobsen and co-workers, after having used RevMan 5.3 for the analyses, also tested STATA for the one comparison they found particularly important, that is, the analysis of SAEs, and found this method to be more generous (turning a non-significance of 0.069, our calculation, into significance), but without realising that it was the shift to a fixed-effect model that did the trick.
Do SSRIs cause SAEs?
The aim of our re-analysis of the CTU data in our previous comment on this issue, in which we found no main effect of SSRIs on SAE frequency, but a heightened risk in the elderly subgroup, was not to provide a final answer on the issue of a possible association between SSRIs and SAEs, but to cast light on the fragility of the alleged effect trumpeted in Scandinavian media by Dr. Jakobsen. Instead of being discouraged by this setback, the CTU group now presents a number of new analyses where they have corrected some (but not all) of their mistakes: in one of these they have also included data from the missed trials identified by us plus data from trials that they had previously identified but where relevant results apparently had escaped both independent reviewers (Reference Claghorn, Earl, Walczak, Stoner, Wong, Kanter and Houser13,Reference Kranzler, Mueller, Cornelius, Pettinati, Moak, Martin, Anthenelli, Brower, O’Malley, Mason, Hasin and Keller42–Reference Sramek, Kashkin, Jasinsky, Kardatzke, Kennedy and Cutler47) as well as data from additional studies that they had previously missed. Obtaining a p-value for the difference between SSRIs and placebo with respect to SAEs in the non-elderly of 0.045, the authors conclude that their original finding is more robust than ever. Considering that many trials and errors have still escaped them, we are, however, less impressed.
To illustrate the lack of robustness of the significance making the CTU more convinced than ever before of the harm of the SSRIs, we now present the results from sensitivity analyses of four overlapping populations: (i) all relevant studies from the original publication, (ii) all pertinent studies present in the reply by the CTU group, (iii) all studies in (ii) as well as the above-mentioned missed studies from Eli Lilly (24,25), GSK (26,27), Merck (Reference Keller, Montgomery, Ball, Morrison, Snavely, Liu, Hargreaves, Hietala, Lines, Beebe and Reines28,Reference Liu, Snavely, Ball, Lines, Reines and Potter29) and Pharmacia&Upjohn (30–Reference Eyding, Lelgemann, Grouven, Härter, Kromp, Kaiser, Kerekes, Gerken and Wieseler33), that is, trials belonging to development programmes from which the CTU group had included some but not all relevant trials, and from which we are reasonably certain that we have now obtained the full set of pertinent studies, and (iv) all trials in (iii) as well as the additional examples of missed trials (35–40) that we provide in this reply. Six trials deemed eligible by the CTU were excluded from all analyses: three trials for not presenting SAEs and/or selectively presenting potential SAEs (Reference Adamson, Sellman, Foulds, Frampton, Deering, Dunn, Berks, Nixon and Cape7,Reference Claghorn, Earl, Walczak, Stoner, Wong, Kanter and Houser13,Reference Ravindran, Teehan, Bakish, Yatham, O’Reilly, Fernando, Manchanda, Charbonneau and Buttars48), one trial for not being placebo-controlled (Reference Ball, Snavely, Hargreaves, Szegedi, Lines and Reines17), one trial for being partially uncontrolled (49), and one study (Reference Mancino, McGaugh, Chopra, Guise, Cargile, Williams, Thostenson, Kosten, Sanders and Oliveto50) in which there did not occur any SAEs in the relevant arms according to the study report on ClinicalTrials.gov (51).
All analyses were carried out using reciprocal zero-cell correction, with the CCF constrained to sum to 1 and using a fixed-effect Mantel-Haenszel implementation in STATA. All study-level changes to the material used by the CTU are detailed in Supplementary Material File 2.
As shown in Table 1, when analysing all studies without accounting for age, the association between SSRIs and SAEs is significant only for the second population (OR 1.24, 1.01 to 1.52; p=0.039). When stratifying by age, the association is non-significant for all four sensitivity populations in the non-elderly subgroup (p range: 0.40–0.82) but significant for all four populations in the elderly subgroup (p range: 0.007–0.011). As discussed in our previous commentary (Reference Hieronymus, Lisinski, Naslund and Eriksson2), it should, however, be noted that the possible clinical significance of SSRI-related SAEs in the elderly remains unknown.
Table 1 Sensitivity analyses
We again emphasise that it is not our ambition to provide definitive p-values – there may still be many trials that have been overlooked, and there are several crucial caveats related to the SAE reporting that are difficult to address. After having conducted these analyses, we, however, remain convinced that the data presented by the CTU group do not justify a reconsideration of the conventional view regarding the tolerability of the SSRIs.
Effect
With respect to the issue of efficacy, the CTU group in their rebuttal defends the use of HDRS17 as a measure of response, and rejects the use of an alternative measure, HDRS6, that according to numerous trials is less psychometrically and conceptually flawed (Reference Timmerby, Andersen, Sondergaard, Ostergaard and Bech52). Their main argument for this stance seems to be that the HDRS6 has not been validated against ‘patient-centered clinically relevant outcomes (e.g. suicidality; suicide; deaths)’. This is, however, a perplexing argument, as two of the authors, Jakobsen and Gluud, recently failed to validate the measure they seem to favor, that is, HDRS17, against suicide and suicide attempts (Reference Jakobsen, Simonsen, Rasmussen and Gluud53). Their conclusion in that paper was the following: ‘Other publications […] have concluded that the HDRS scale is heterogeneous and that the scale is psychometrically and conceptually flawed […] There seems to be a need for other more clinically relevant assessment methods’. When writing their BMC Psychiatry paper, the authors were hence well aware of the shortcomings marring the HDRS17, and that these shortcomings have been suggested to make the difference between active drug and placebo appear smaller than it actually is, but refrained from mentioning this important caveat.
We again conclude that the efficacy data presented by the CTU group do not add much to what has previously been reported by others, and that the authors are mistaken when arguing that these results suggest the effect of SSRIs to be clinically insignificant. As elaborated elsewhere (Reference Hieronymus, Emilsson, Nilsson and Eriksson54), not just the use of the HDRS17 as measure of effect, but also many other methodological problems marring antidepressant trials, can be expected to make SSRIs appear less effective than they actually are.
Concluding remarks
We regret to conclude that the response from the CTU group is on par with their original contribution in terms of inaccuracies, misleading statements and bias. On the CTU web page it is stated that the systematic review represents the highest form of publication in terms of quality of evidence. What this episode illustrates is that systematic reviews and meta-analyses, when conducted without the required rigor and impartiality, on the contrary may be grossly misleading. Moreover, like a recent, similarly flawed and harshly criticised CTU analysis regarding treatment of hepatitis C (Reference Jakobsen, Nielsen, Feinberg, Katakam, Fobian, Hauser, Poropat, Djurisic, Weiss, Bjelakovic, Bjelakovic, Klingenberg, Liu, Nikolova, Koretz and Gluud55), it also shows that, when analysing treatment trials, interest in Cochrane checklists and handbooks can never substitute for actual insight into the subject of study, in this case psychiatry and psychopharmacology.
Funded by the Danish state, the CTU is a body with alleged expertise in evidence-based medicine with the task to provide impartial guidance to society in health-care issues. It is hence problematic when CTU researchers produce and disseminate questionable data to discourage the public from the use of effective medication for disorders as severe as depression (Reference Jakobsen, Katakam, Schou, Hellmuth, Stallknecht, Leth-Møller, Iversen, Banke, Petersen, Klingenberg, Krogh, Ebert, Timm, Lindschou and Gluud1) and hepatitis C (Reference Jakobsen, Nielsen, Feinberg, Katakam, Fobian, Hauser, Poropat, Djurisic, Weiss, Bjelakovic, Bjelakovic, Klingenberg, Liu, Nikolova, Koretz and Gluud55). To summarise, the CTU team has not shown completed suicide or death in general to be more common in patients on SSRIs, their data do not justify the claim that SAEs are more common in subjects treated with an SSRI regardless of age, and they have not provided any new information that justifies a re-evaluation of the efficacy of the SSRIs. To regain credibility, we recommend that Jakobsen and co-workers retract their BMC Psychiatry paper, because of its many errors, and clarify that their public questioning of antidepressants was unfounded.
In a meta-analysis on the selective serotonin reuptake inhibitors (SSRIs) published in BMC Psychiatry (Reference Jakobsen, Katakam, Schou, Hellmuth, Stallknecht, Leth-Møller, Iversen, Banke, Petersen, Klingenberg, Krogh, Ebert, Timm, Lindschou and Gluud1), Jakobsen and co-workers argued that the harm caused by these drugs outweighs any possible beneficial effects. We wrote a commentary in Acta Neuropsychiatrica (Reference Hieronymus, Lisinski, Naslund and Eriksson2), in which we questioned the scientific basis for this claim, to which some of the authors have now responded (Reference Katakam, Sethi, Jakobsen and Gluud3). This is a response to their response.
Bias
Some may regard the many factual inaccuracies as the major problem with the recent Copenhagen Trial Unit (CTU) contributions (Reference Jakobsen, Katakam, Schou, Hellmuth, Stallknecht, Leth-Møller, Iversen, Banke, Petersen, Klingenberg, Krogh, Ebert, Timm, Lindschou and Gluud1,Reference Katakam, Sethi, Jakobsen and Gluud3) to the SSRI debate. We will come back to this. Others may find it even more troubling that these alleged experts in how to interpret clinical trials have gone astray with respect to basic statistics, both in their first paper (Reference Jakobsen, Katakam, Schou, Hellmuth, Stallknecht, Leth-Møller, Iversen, Banke, Petersen, Klingenberg, Krogh, Ebert, Timm, Lindschou and Gluud1) and in their response to our comment (Reference Katakam, Sethi, Jakobsen and Gluud3). We will come back to this as well. And yet others might find it particularly difficult to understand how such a large number of relevant trials could escape the authors when scanning the databases, and that many relevant trials have still escaped them. This, too, will be commented on below.
In our view, the most problematic aspect of this debacle, however, is how the apparent eagerness of the authors to portray the SSRIs as ineffective and harmful makes them distort and misquote their own data, both in the BMC Psychiatry paper and when they present their results in lay media. Revealing what the CTU group probably had hoped for when starting this project, first author Janus Jakobsen has hence made numerous appearances in Scandinavian media claiming that he and his co-authors have shown SSRIs to enhance the risk for suicide (4,Reference Jakobsen, Naqash and Gluud5), notwithstanding that they, as Jakobsen must realise, have not shown SSRIs to enhance the risk for suicide. They now acknowledge, in their reply to our comment, that their results in fact do not show the risk of completed suicide to be increased in those given an SSRI, but suggest this outcome to be due to low statistical power, platitudinously pointing out that ‘absence of evidence is not evidence of absence’. But the absence of a significant association between SSRI treatment and suicide in the huge data set of Jakobsen and co-workers is not likely to be due to low power: completed suicide was in fact numerically less common in SSRI-treated patients. When arguing that ‘signals of SAEs do not require statistical levels below a certain level to be taken seriously’, they hence forget that their data, however disappointing this may seem to them, did not contain even the faintest signal of an enhanced risk for completed suicide in patients treated with SSRIs.
To justify their claim that SSRIs may enhance the risk for completed suicide, though their own data suggest otherwise, the authors seem to reason as follows: (i) completed suicide is a serious adverse event (SAE), (ii) SAEs are (at least according to their own calculations) more common in patients on SSRIs, (iii) ergo: completed suicide is more common in patients on SSRIs. But the alleged association between SSRI treatment and SAEs, needless to say, does not justify the conclusion that SSRI intake is associated with all possible SAEs. If one defines brain tumour and headache as ‘head-related AEs’, and demonstrates that drug X enhances the risk for headache, one may not conclude that X also causes brain tumour, especially not when a separate analysis of the occurrence of brain tumours lends no support whatsoever for such an assumption. The argumentation from the CTU group on this aspect is puzzling.
The same is true for the issue of ‘death’. While Jakobsen by means of lay media has informed SSRI-medicating depressed patients that their treatment may well kill them (4,Reference Jakobsen, Naqash and Gluud5), the numbers presented in his paper in fact (as far as we can tell) suggest deaths to be numerically less common in patients treated with an SSRI than in those given placebo.
There are numerous additional examples of the reluctance of the CTU researchers to interpret and report their own results in an impartial manner. While they cannot avoid mentioning that the p-value for the superiority of SSRIs over placebo with respect to reduction of the conventional effect parameter, the sum rating on the HDRS17 scale, was <0.00001, they spare no effort to convince the reader that ‘[t]he “true” effect of SSRIs might not even be statistically significant’ (Reference Jakobsen, Katakam, Schou, Hellmuth, Stallknecht, Leth-Møller, Iversen, Banke, Petersen, Klingenberg, Krogh, Ebert, Timm, Lindschou and Gluud1). As the main reason for Jakobsen and co-workers to make the eyebrow-raising claim that a p-value of <0.00001 may well be an artifact is that all the trials included in their analysis displayed a high risk of bias, one may ask why they at all cared to present these data, and now apparently plan to spend time on updating their analysis. Perhaps a brief note stating that, in their view, the risk for bias in all available trials precludes any conclusion regarding the possible efficacy of SSRIs would have sufficed? Readers being concerned over this sombre verdict over the literature in the field may, however, find comfort in a recent comprehensive antidepressant meta-analysis published in the Lancet (Reference Cipriani, Furukawa, Salanti, Chaimani, Atkinson, Ogawa, Leucht, Ruhe, Turner, Higgins, Egger, Takeshima, Hayasaka, Imai, Shinohara, Tajika, Ioannidis and Geddes6) where only 9% of the trials were categorised as being at high risk of bias, and where the moderate risk of bias marring the majority of the trials did not preclude the authors from drawing conclusions regarding efficacy.
Leaving the issue of statistical significance aside, Jakobsen and co-workers also claim that the beneficial effect of SSRIs, be it significant or not, is too minuscule to be of any clinical significance. While they are of course free to advocate this opinion, we believe that less biased authors would have felt obliged to mention, in order to provide a balanced view, that there is indeed an abundant literature (of which the authors are clearly aware: see below) suggesting that using HDRS17 as a measure of effect markedly underrates SSRI-induced improvement, that is, that the clinical significance of the effect of SSRIs is likely to be considerably higher than the effect captured using this measure.
When they, on the other hand, manage to produce shaky p-values indicating that SSRIs may cause SAEs (without conducting any sensitivity analyses), it is correspondingly unfortunate that they refrain from discussing any of the many reasons why this observation should be interpreted with caution, for example, (i) that they have limited insight into the actual clinical impact of the SAEs tentatively associated with SSRI treatment, (ii) that their decisions on whether a certain adverse event should be categorised as serious or not were somewhat arbitrary (see Supplementary Material Table 1 for examples), (iii) that the same can be said for their decisions regarding which treatment groups to include (see below) (Reference Adamson, Sellman, Foulds, Frampton, Deering, Dunn, Berks, Nixon and Cape7,Reference Pettinati, Oslin, Kampman, Dundon, Xie, Gallis, Dackis and O’Brien8) and to what extent follow-up phases were to be included in the analysis (see Supplementary Material File 1) (Reference Detke, Wiltse, Mallinckrodt, McNamara, Demitrack and Bitter9–11), (iv) that trial reports often detail potential SAEs in the active treatment group (this being the issue of interest) while not providing corresponding information regarding similar events in patients on placebo (see Supplementary Material File 1) (Reference Feighner and Overo12–Reference Wernicke, Dunlop, Dornseif, Bosomworth and Humbert14), and (v) that SAE data from older trials must generally be interpreted with caution. Well, in their reply, the CTU team does in fact acknowledge that ‘the reporting of SAEs in most of the publications was very poor and incomplete’ (Reference Katakam, Sethi, Jakobsen and Gluud3), but without mentioning that this unfortunate state of affairs renders their own analyses and conclusions, not least with respect to the reporting of SAEs in the placebo groups, correspondingly poor and incomplete.
As a final example of the unfortunate bias characterising the CTU report, it may be mentioned that the authors, according to the exchange with the reviewers made public by BMC Psychiatry, in an early version of their manuscript concluded that there was a ‘high risk of publication bias’ – that is, that their material was skewed in favor of SSRI-positive studies – and presented the significant outcome of an Egger test to back this up. One of the reviewers, however, pointed out that the authors had misread their own analysis: what were actually ‘missing’ from the funnel plot were not studies disfavoring SSRIs, as assumed by Jakobsen and co-workers, but, on the contrary, trials supporting their usefulness (Reference Leucht15). Instead of reporting this unexpected indication of inverse publication bias, Jakobsen and co-workers chose to delete the statement that they had identified a ‘high risk of publication bias’ from the discussion, and refrained from reporting the significant Egger test, instead just stating that ‘visual inspection of the funnel plot did not show clear signs of asymmetry’ (Reference Jakobsen16). This is not how a researcher should deal with unwelcome results.
CTU comments on errors
When assessing the clinical importance of tentative drug-induced adverse events, there are of course two aspects to consider: (i) if they are relatively more prevalent in patients on active treatment than on placebo (i.e. the relative risk expressed, e.g., as an odds ratio) and (ii) how common they are (i.e. the absolute risk expressed, e.g., as a percentage point difference). In our commentary on the CTU paper, we pointed out (three times, actually) that many of the errors and mistakes we had identified in the CTU analysis should not be expected to exert any major influence on the relative risk estimates, but could be assumed to influence the absolute risk estimates. This is, for example, the case when Jakobsen and co-workers (i) do not include all relevant subjects when calculating the risk for SAEs, (ii) use total number of SAEs rather than total number of patients with an SAE as input data, (iii) exclude zero-event trials, and (iv) include outliers such as a study (Reference Pettinati, Oslin, Kampman, Dundon, Xie, Gallis, Dackis and O’Brien8) where the majority of SAEs were related to alcohol abuse. The CTU group triumphantly elaborating that these mistakes do not exert any major influence on relative risk estimates is hence entirely pointless.
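The distinction between the two measures can be illustrated with a minimal sketch. The counts below are hypothetical and chosen for illustration only; they are not taken from any of the trials discussed, and the function names are our own. The sketch compares a trial tabulated by patients with an SAE against the same trial tabulated by number of SAEs, under the assumption that each affected patient contributes two events.

```python
def odds_ratio(events_t, n_t, events_c, n_c):
    """Odds ratio for a two-arm trial (treatment vs. control)."""
    return (events_t * (n_c - events_c)) / (events_c * (n_t - events_t))


def risk_difference(events_t, n_t, events_c, n_c):
    """Absolute risk difference (treatment minus control), as a proportion."""
    return events_t / n_t - events_c / n_c


# Hypothetical counts: number of patients with at least one SAE.
patients = dict(events_t=10, n_t=1000, events_c=5, n_c=1000)
# Same hypothetical trial, but counting SAEs instead of patients,
# assuming (for illustration) two events per affected patient.
events = dict(events_t=20, n_t=1000, events_c=10, n_c=1000)

or_patients = odds_ratio(**patients)
or_events = odds_ratio(**events)
rd_patients = risk_difference(**patients)
rd_events = risk_difference(**events)

print(f"OR by patients: {or_patients:.3f}, OR by events: {or_events:.3f}")
print(f"RD by patients: {rd_patients:.4f}, RD by events: {rd_events:.4f}")
```

Under these assumed counts the odds ratio is almost unchanged while the absolute risk difference doubles, which is the pattern described above for errors that leave relative estimates largely intact but distort absolute ones.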
Not only in this regard, but throughout their rebuttal, the comments from the CTU group on the errata disclosed in their paper are disappointing. For example, when we point out that they had included a trial with no placebo group (Reference Ball, Snavely, Hargreaves, Szegedi, Lines and Reines17), they reassure the reader that no mistake has been made: ‘as the trial included three groups: (i) aprepitant + paroxetine, (ii) aprepitant and (iii) paroxetine, we correctly considered groups i and ii for our review’. The major problem with this assertion, however, is that it is incorrect; it is hence evident from figures 2, 4, 5 and 6 that the groups considered were not (i) and (ii) but rather aprepitant alone (ii) and paroxetine alone (iii). Moreover, if the authors do indeed regard it appropriate to include treatment groups where an SSRI has been co-administered with another drug, it remains to be explained why they excluded the comparison of naltrexone versus naltrexone plus sertraline from another trial (Reference Pettinati, Oslin, Kampman, Dundon, Xie, Gallis, Dackis and O’Brien8), the inclusion of which would have changed the combined OR for sertraline versus comparator with respect to SAEs from 1.53 (0.59–3.94) to 0.86 (0.43–1.71). Adding to the confusion, they did, on the other hand, include the corresponding comparison of naltrexone versus naltrexone plus citalopram from another study (Reference Adamson, Sellman, Foulds, Frampton, Deering, Dunn, Berks, Nixon and Cape7) where, in contrast, the inclusion of this kind of comparison enhanced the apparent association between SSRI administration and SAEs.
With respect to their omission of an escitalopram arm in trial SCT-MD-01, we are told by the CTU group in their response that ‘there were no SAEs in the escitalopram 10 mg and placebo group’. This is a surprising statement as there were indeed two SAEs in the placebo group (18). But what the authors also fail to acknowledge is that they, by excluding SSRI-treated patients without SAEs, while retaining all placebo-treated patients, inflated the apparent rate of SAEs on SSRI treatment. Thus, while excluding the SAE-free escitalopram 10 mg arm yields a combined OR for study SCT-MD-01 of 0.97 (0.17–10.95), retaining all arms instead yields an OR of 0.67 (0.12–3.70). We are, however, pleased to note that this mistake has been discreetly corrected, but without further comment, in their reply (see table 2).
To justify the exclusion of female-specific SAEs in study GSK/810, which were more common in patients on placebo (19), the CTU group claims that ‘it was not clear whether the same participants had any other SAEs that were reported in the main table in that study report’. However, when extracting data from similar GSK study reports presenting separate sets of fatal and non-fatal SAEs, respectively (20,21), the CTU group adopted the opposite policy, that is, it included both types of SAEs, though they may well have occurred in the same individuals, which of course makes their post-hoc explanation for excluding the female-specific SAEs in GSK/810 less credible than one had hoped for. When female-specific SAEs are excluded in GSK/810, the combined OR is 1.13 (0.29–4.45); when they are included it drops to 0.77 (0.25–2.40).
Additional errors
When we pointed out that the CTU report was marred by a large number of factual errors and inconsistencies, we hoped that the authors would realise that they had underestimated the difficulties of this endeavor, and that their results were far too shaky to permit the claims they have been trumpeting. But instead they seem to find comfort in the fact that they have now managed to obtain new significant p-values, after correcting a selection of the inaccuracies that we mentioned. This line of reasoning, however, misses the point of our previous criticism: the many errata we listed in our comment were just examples, to be seen as illustrations of a flawed process, which does not become hunky dory just by the correction of some of these examples.
For brevity, and to avoid boring the reader, we will not elaborate on the many additional mistakes, apart from those previously mentioned, that can be found in the original CTU paper. We, however, do provide some more examples in Supplementary Material File 1. It should, however, be underlined that these are also just illustrative samples that were identified upon our relatively cursory review; anyone caring to take a closer look would probably find more to add to the errata list.
It may seem petty to dissect the many errors unfortunately marring the paper from the CTU group. However, if one, like Jakobsen and co-workers, invokes unimpressive p-values to question the opinion of a vast majority of researchers in the field, as well as of medical authorities throughout the world, with respect to a treatment that by many is regarded as life-saving, it is preferable to get one’s numbers right.
Missing trials
When denouncing previous meta-analyses in this field, Jakobsen and co-workers in their BMC Psychiatry paper name ‘not searching all relevant databases’ as one reason to discard these reports. In our comment, we hence found it relevant to mention some of the many trials they had missed themselves. In their reply, the CTU group expresses surprise over the fact that we did not include these trials when we, after correcting for other mistakes, tried to replicate their statistical analyses.
This criticism is unjustified. We repeated the CTU analysis using the same trials that were included in their analysis merely to demonstrate the lack of robustness in their result: when rectifying the methodological mistakes we had identified upon our first review, the p-value that was the basis for their claim to fame turned from significant to non-significant. It has, however, never been our ambition to provide them with an exhaustive list of missed publications, or to correct for all possible errors, or to conduct a comprehensive meta-analysis of our own.
Just as it would not have made any sense for us to add a number of additional, unsystematically assembled data to our re-analysis, it is not very helpful that the CTU group has now conducted a new analysis to which they have added the trials mentioned by us, as well as some they have now identified by themselves, as a large number of trials are still missing. For example, while the CTU group, during their second scanning of the relevant registries, managed to locate two Eli Lilly-sponsored studies, HMAQa (22) and HMATb (23), they seem to have missed two other studies, HMAQb (24) and HMATa (25), in the same repository; it so happens that the two trials they found were both compatible with the assumption that SAEs are more common in SSRI-treated subjects while those they missed, on the contrary, suggest SAEs to be less common in subjects on SSRIs. Likewise, when fine-combing the GSK registry, they again seem to have missed some apparently relevant studies (26,27). Moreover, while including one trial of a substance P antagonist that did not include a placebo arm (Reference Ball, Snavely, Hargreaves, Szegedi, Lines and Reines17), they missed four additional studies regarding the same drug that were actually both placebo- and paroxetine-controlled (Reference Keller, Montgomery, Ball, Morrison, Snavely, Liu, Hargreaves, Hietala, Lines, Beebe and Reines28,Reference Liu, Snavely, Ball, Lines, Reines and Potter29). And while the CTU team did find SAE data for a placebo-controlled reboxetine trial using an SSRI as comparator, they failed to identify efficacy data for the same study, and they also failed to identify three unpublished reboxetine studies also including SSRI and placebo arms (30–Reference Massana34). And these are just a selection from the smorgasbord of missed studies (where one may also, to mention just some additional examples, find refs (35–40)).
Statistics
In their response the authors confirm that they deviated from the protocol with respect to how their data were analysed, but seem to take this incident unexpectedly light-heartedly, given that deviating from the protocol is usually regarded as a felony of the gravest kind by Cochranists. Had we been Jakobsen and co-workers, we would have found this lapse somewhat embarrassing, particularly as they explicitly stated in their original report that ‘[t]he methodology was not changed after the analysis of the review results began’. But changed it was, and had it not been changed, the results would not have been those that Jakobsen and co-workers must have hoped for. We also note with interest the reason now presented for why they changed statistical techniques as compared with what was stated in the protocol: it was due to the fact that SAEs in SSRI trials were rarer than Jakobsen and co-workers had anticipated. That SAEs were unexpectedly rare in SSRI trials has certainly not been the impression conveyed when they subsequently commented on their results in lay media.
In their rebuttal, the CTU group denounces our attempt to re-analyse their SAE data after correcting for some of their many errors as we used the software they reported to have used themselves (RevMan 5.3) rather than the one they did in fact use, that is STATA. It is correct that the continuity correction method used by RevMan may make the relative prevalence of SAEs in placebo groups appear higher than it actually is as these groups are usually smaller, but Katakam and co-authors are mistaken when assuming this to be the major source of the discrepancy between the results obtained with RevMan and STATA, respectively. Instead, the major reason for this incongruity is the fact that the Mantel-Haenszel procedure in STATA is a fixed-effect one: when the authors claimed that they had used the random-effect implementation in RevMan for this analysis they hence misinformed the reader twice: it was not RevMan and it was not a random-effect procedure.
With regard to the use of reciprocal zero-cell correction, the CTU group has again failed to implement the procedure according to the recommendations in the cited paper (Reference Sweeting, Sutton and Lambert41). The simple principle behind the method of Sweeting is the following: if one arm is, for example, three times the size of the other, the continuity correction factor (CCF) added to that arm should be three times as big; this is done so that relatively more events are not added to the smaller arm. However, according to the simulation studies detailed in the same paper, for this method to perform optimally, the size of the CCF should be constrained to sum to 1, which seems to have escaped the CTU group. While it is impossible to assess the extent or direction of bias introduced by choosing a particular statistical method in a material such as this, which comprises different age groups, treatments, doses, outcomes, comorbidities and treatment imbalances, the haphazard implementation and presentation of statistics marring the CTU paper is unfortunate as such choices do impact the results of the analyses.
In their reply (Reference Katakam, Sethi, Jakobsen and Gluud3), the CTU states that ‘once it was evident that SAEs were rare events, we followed the Cochrane methodology (Reference Jakobsen, Nielsen, Feinberg, Katakam, Fobian, Hauser, Poropat, Djurisic, Weiss, Bjelakovic, Bjelakovic, Klingenberg, Liu, Nikolova, Koretz and Gluud55) and the method recommended by Sweeting et al.’ Is that really so? If they had indeed come to understand that STATA was the strategy of choice when the studied event is rare, it is surprising that they did not re-write the Methods section accordingly, but even more surprising that they seem to have refrained from using Sweeting’s method for events that were even rarer than SAEs in general, such as individual adverse events including suicides, suicide attempts and suicidal ideation. One rather gets the impression that Jakobsen and co-workers, after having used RevMan 5.3 for the analyses, also tested STATA for the one comparison they found particularly important, that is the analysis of SAEs, and found this method to be more generous (turning a non-significance of 0.069, our calculation, into significance), but without realising that it was the shift to a fixed-effects model that did the trick.
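The principle described above can be sketched in a few lines. This is our own minimal reading of the rule as stated in the text (correction proportional to arm size, with the two corrections summing to 1); it is not the code used by the CTU group, nor the STATA or RevMan implementation. The function names and the choice to correct every table containing a zero cell are assumptions made for illustration.

```python
def reciprocal_ccf(n_treat, n_ctrl, total=1.0):
    """Arm-specific continuity corrections, proportional to arm size.

    The corrections are constrained to sum to `total` (1 by default),
    so an arm three times as large receives a correction three times as big.
    """
    k_treat = total * n_treat / (n_treat + n_ctrl)
    k_ctrl = total * n_ctrl / (n_treat + n_ctrl)
    return k_treat, k_ctrl


def corrected_table(events_t, n_t, events_c, n_c):
    """Return (a, b, c, d) cells, corrected only if some cell is zero.

    a, b: events and non-events in the treatment arm
    c, d: events and non-events in the control arm
    """
    a, b = events_t, n_t - events_t
    c, d = events_c, n_c - events_c
    if 0 in (a, b, c, d):
        k_t, k_c = reciprocal_ccf(n_t, n_c)
        a, b, c, d = a + k_t, b + k_t, c + k_c, d + k_c
    return a, b, c, d


# Hypothetical trial: 300 patients on active drug, 100 on placebo,
# two SAEs on active drug and none on placebo.
k_t, k_c = reciprocal_ccf(300, 100)
cells = corrected_table(2, 300, 0, 100)
print(k_t, k_c)
print(cells)
```

With arms of 300 and 100 patients the corrections are 0.75 and 0.25, in contrast to a constant correction of 0.5 per cell, which would add relatively more events to the smaller arm.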
Do SSRIs cause SAEs?
The aim of our re-analysis of the CTU data in our previous comment on this issue, in which we found no main effect of SSRIs on SAE frequency, but a heightened risk in the elderly subgroup, was not to provide a final answer on the issue of a possible association between SSRIs and SAEs, but to cast light on the fragility of the alleged effect trumpeted in Scandinavian media by Dr. Jakobsen. Instead of being discouraged by this setback, the CTU group now presents a number of new analyses where they have corrected some (but not all) of their mistakes: in one of these they have also included data from the missed trials identified by us plus data from trials that they had previously identified but where relevant results apparently had escaped both independent reviewers (Reference Claghorn, Earl, Walczak, Stoner, Wong, Kanter and Houser13,Reference Kranzler, Mueller, Cornelius, Pettinati, Moak, Martin, Anthenelli, Brower, O’Malley, Mason, Hasin and Keller42–Reference Sramek, Kashkin, Jasinsky, Kardatzke, Kennedy and Cutler47) as well as data from additional studies that they had previously missed. Obtaining a p-value for the difference between SSRIs and placebo with respect to SAEs in the non-elderly of 0.045, the authors conclude that their original finding is more robust than ever. Considering that many trials and errors have still escaped them, we are, however, less impressed.
To illustrate the lack of robustness of the significance making the CTU more convinced than ever before on the harm of the SSRIs, we now present the results from sensitivity analyses of four overlapping populations: (i) all relevant studies from the original publication, (ii) all pertinent studies present in the reply by the CTU group, (iii) all studies in (ii) as well as the above-mentioned missed studies from Eli Lilly (24,25), GSK (26,27), Merck (Reference Keller, Montgomery, Ball, Morrison, Snavely, Liu, Hargreaves, Hietala, Lines, Beebe and Reines28,Reference Liu, Snavely, Ball, Lines, Reines and Potter29) and Pharmacia&Upjohn (30–Reference Eyding, Lelgemann, Grouven, Härter, Kromp, Kaiser, Kerekes, Gerken and Wieseler33), that is, trials belonging to development programmes from which the CTU group had included some but not all relevant trials, and from which we are reasonably certain that we have now obtained the full set of pertinent studies, and (iv) all trials in (iii) as well as the additional examples of missed trials (35–40) that we provide in this reply. Six trials deemed eligible by the CTU were excluded from all analyses: three trials for not presenting SAEs and/or selectively presenting potential SAEs (Reference Adamson, Sellman, Foulds, Frampton, Deering, Dunn, Berks, Nixon and Cape7,Reference Claghorn, Earl, Walczak, Stoner, Wong, Kanter and Houser13,Reference Ravindran, Teehan, Bakish, Yatham, O’Reilly, Fernando, Manchanda, Charbonneau and Buttars48), one trial for not being placebo-controlled (Reference Ball, Snavely, Hargreaves, Szegedi, Lines and Reines17), one trial for being partially uncontrolled (49), and one study (Reference Mancino, McGaugh, Chopra, Guise, Cargile, Williams, Thostenson, Kosten, Sanders and Oliveto50) in which there did not occur any SAEs in the relevant arms according to the study report on ClinicalTrials.gov (51). 
All analyses were carried out using reciprocal zero-cell correction, with the CCF constrained to sum to 1 and using a fixed-effect Mantel-Haenszel implementation in STATA. All study-level changes to the material used by the CTU are detailed in Supplementary Material File 2.
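For readers unfamiliar with the estimator, the fixed-effect Mantel-Haenszel pooled odds ratio is the ratio of two weighted sums over the included trials. The sketch below shows the point estimate only, with hypothetical 2×2 tables that are not drawn from any trial in the present material; it is not the STATA routine used for the analyses reported here, and it omits the confidence interval and the zero-cell correction step.

```python
def mantel_haenszel_or(tables):
    """Fixed-effect Mantel-Haenszel pooled odds ratio.

    Each table is (a, b, c, d): events and non-events in the treatment
    arm, followed by events and non-events in the control arm.
    """
    numerator = 0.0
    denominator = 0.0
    for a, b, c, d in tables:
        n = a + b + c + d
        numerator += a * d / n
        denominator += b * c / n
    return numerator / denominator


# Hypothetical tables, for illustration only.
hypothetical_tables = [(10, 90, 5, 95), (3, 197, 2, 98)]
pooled = mantel_haenszel_or(hypothetical_tables)
print(f"Pooled OR: {pooled:.3f}")
```

Because each trial contributes a fixed weight rather than a weight that also reflects between-trial variance, the resulting interval is typically narrower than that of a random-effects model whenever heterogeneity is present.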
As shown in Table 1, when analysing all studies without accounting for age, the association between SSRIs and SAEs is significant only for the second population (OR 1.24, 1.01 to 1.52; p=0.039). When stratifying by age, the association is non-significant for all four sensitivity populations in the non-elderly subgroup (p range: 0.40–0.82) but significant for all four populations in the elderly subgroup (p range: 0.007–0.011). As discussed in our previous commentary (Reference Hieronymus, Lisinski, Naslund and Eriksson2), it should, however, be noted that the possible clinical significance of SSRI-related SAEs in the elderly remains unknown.
Table 1 Sensitivity analyses
CTU, Copenhagen Trial Unit.
* Includes the missed trials from the Eli Lilly, GlaxoSmithKline, Pharmacia&Upjohn and Merck development programs.
† These subgroups are identical.
‡ Includes all trials in the partially completed data set plus six additional trials (see text).
We again emphasise that it is not our ambition to provide definitive p-values – there may still be many trials that have been overlooked, and there are several crucial caveats related to the SAE reporting that are difficult to address. After having conducted these analyses, we, however, remain convinced that the data presented by the CTU group do not justify a reconsideration of the conventional view regarding the tolerability of the SSRIs.
Effect
With respect to the issue of efficacy, the CTU group in their rebuttal defends the use of HDRS17 as a measure of response, and rejects the use of an alternative measure, HDRS6, that according to numerous trials is less psychometrically and conceptually flawed (Reference Timmerby, Andersen, Sondergaard, Ostergaard and Bech52). Their main argument for this stance seems to be that the HDRS6 has not been validated against ‘patient-centered clinically relevant outcomes (e.g. suicidality; suicide; deaths)’. This is, however, a consternating argument, as two of the authors, Jakobsen and Gluud, recently failed to validate the measure they seem to favor, that is, HDRS17, against suicide and suicide attempts (Reference Jakobsen, Simonsen, Rasmussen and Gluud53). Their conclusion in that paper was the following: ‘Other publications […] have concluded that the HDRS scale is heterogeneous and that the scale is psychometrically and conceptually flawed […] There seems to be a need for other more clinically relevant assessment methods’. When writing their BMC Psychiatry paper, the authors were hence well aware of the shortcomings marring the HDRS17, and that these shortcomings have been suggested to make the difference between active drug and placebo appear smaller than it actually is, but refrained from mentioning this important caveat.
We again conclude that the efficacy data presented by the CTU group do not add much to what has previously been reported by others, and that the authors are mistaken when arguing that these results suggest the effect of SSRIs to be clinically insignificant. As elaborated elsewhere (Reference Hieronymus, Emilsson, Nilsson and Eriksson54), not just the use of the HDRS17 as a measure of effect, but also many other methodological problems marring antidepressant trials, can be expected to make SSRIs appear less effective than they actually are.
Concluding remarks
We regret to conclude that the response from the CTU group is on par with their original contribution in terms of inaccuracies, misleading statements and bias. On the CTU web page it is stated that the systematic review represents the highest form of publication in terms of quality of evidence. What this episode illustrates is that systematic reviews and meta-analyses, when conducted without the required rigor and impartiality, on the contrary may be grossly misleading. Moreover, like a recent, similarly flawed and harshly criticised CTU analysis regarding treatment of hepatitis C (Reference Jakobsen, Nielsen, Feinberg, Katakam, Fobian, Hauser, Poropat, Djurisic, Weiss, Bjelakovic, Bjelakovic, Klingenberg, Liu, Nikolova, Koretz and Gluud55), it also shows that, when analysing treatment trials, interest in Cochrane checklists and handbooks can never substitute for actual insight into the subject of study, in this case psychiatry and psychopharmacology.
Funded by the Danish state, the CTU is a body with alleged expertise in evidence-based medicine with the task to provide impartial guidance to society in health-care issues. It is hence problematic when CTU researchers produce and disseminate questionable data to discourage the public from the use of effective medication for disorders as severe as depression (Reference Jakobsen, Katakam, Schou, Hellmuth, Stallknecht, Leth-Møller, Iversen, Banke, Petersen, Klingenberg, Krogh, Ebert, Timm, Lindschou and Gluud1) and hepatitis C (Reference Jakobsen, Nielsen, Feinberg, Katakam, Fobian, Hauser, Poropat, Djurisic, Weiss, Bjelakovic, Bjelakovic, Klingenberg, Liu, Nikolova, Koretz and Gluud55). To summarise, the CTU team has not shown completed suicide or death in general to be more common in patients on SSRIs, their data do not justify the claim that SAEs are more common in subjects treated with an SSRI regardless of age, and they have not provided any new information that justifies a re-evaluation of the efficacy of the SSRIs. To regain credibility, we recommend that Jakobsen and co-workers retract their BMC Psychiatry paper, because of its many errata, and clarify that their public questioning of antidepressants was unfounded.
Acknowledgements
None.
Conflicts of Interest
F.H. has received speaker’s fees from Servier. E.E. has been on advisory boards and/or received speaker’s honoraria and/or research grants from Eli Lilly, GlaxoSmithKline, Servier and Lundbeck. The other authors report no conflicts of interest.
Supplementary material
To view supplementary material for this article, please visit https://doi.org/10.1017/neu.2018.15