Introduction
In recent decades, the proportion of persons exposed to psychotropic drugs has markedly grown (Alonso et al. 2004; Verdoux et al. 2010; Olfson et al. 2014). Easier access to effective psychotropic drugs may be viewed as progress in public health terms since some health care needs are still unmet in most countries owing to underdetection and undertreatment of psychiatric disorders. However, the impact of these prescribing trends should be carefully assessed at the population level. Indeed, a growing number of persons are exposed not only to the benefits but also to the risks associated with psychotropic drug use. Accurate and updated knowledge documenting the benefit/risk ratio of these drugs is therefore essential at all decision-making levels: users, prescribers, scientific societies and drug regulatory agencies (Eichler et al. 2009; Moncrieff, 2018).
Randomised controlled trials (RCTs) are widely considered to be the gold standard method to assess the benefits of drugs. In all evidence-based classifications, findings obtained by RCTs are considered to be of greater quality than those obtained by observational studies. However, despite the preponderance of RCTs in evidence-based pharmacological research, relying only on the findings of RCTs to estimate the benefit/risk ratio of drugs may lead to inappropriate decisions. Regarding the identification of drug-related harm, the strengths of observational studies are increasingly recognised, as they can be carried out in large samples of unselected participants who are treated in real-life conditions and who may be followed up over long periods (Vandenbroucke & Psaty, 2008).
The aim of this editorial is to illustrate the limitations and strengths of observational studies with selected examples of pharmaco-epidemiological studies exploring the adverse effects induced by exposure to psychotropic drugs.
Limitations of observational studies: lack of randomisation
A wide range of research methods are included in the category ‘observational studies’: case series, cross-sectional studies or surveys, case-control studies, retrospective and prospective cohort studies (Vandenbroucke, 2003). The sources of data may also vary widely: registries, administrative health databases, patient and population surveys and patient chart reviews.
Since detailing the designs and limitations of each of these methods is beyond the scope of this paper, this section focuses on their common main source of bias, which is the lack of random allocation of the drug. In real-life conditions, the decision to prescribe a drug may not be independent of the outcome of interest. Confounding by indication or by disease severity are hence the major sources of bias in observational studies exploring drug-related harms (Vandenbroucke, 2004; Vandenbroucke & Psaty, 2008; Norgaard et al. 2017). These biases are not inevitably present in all observational studies: the adverse effect and its risk factors cannot impact the decision to prescribe if they are not known at the time of prescribing (Loke et al. 2011). To illustrate this issue, Vandenbroucke (2004) cited the example of smoking and lung cancer: the findings of observational studies cannot be dismissed because of lack of randomisation, since adolescents begin to smoke for reasons unrelated to the vulnerability to lung cancer.
Confounding by indication or by disease severity may occur if the clinical factors guiding the treatment choice are also risk factors for the adverse effect (Vandenbroucke, 2006). In such cases, the adverse outcome may be wrongly imputed to drug exposure when it is in fact a symptom or outcome of the disease or health condition motivating the decision to prescribe. In the field of psychotropic drugs, an example of possible confounding by indication may be found in observational studies investigating the link between exposure to antidepressants and suicidality: the same clinical factors, such as the presence and severity of depressive symptoms, may be associated with the decision to prescribe an antidepressant as well as with an increased risk of suicide (Bridge & Axelson, 2008; Barbui et al. 2009).
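The mechanism of confounding by disease severity can be made concrete with a small simulation. The sketch below is purely illustrative: the function names and all probabilities (30% severe cases, prescription rates of 80% v. 20%, outcome risks of 10% v. 2%) are hypothetical assumptions and are not drawn from any of the studies cited here. The simulated drug has no effect on the outcome, yet a crude comparison of treated v. untreated persons suggests harm, while comparisons within each severity stratum do not.

```python
import random

def simulate(n=200_000, seed=0):
    """Simulate a cohort in which severity drives both prescribing and outcome.

    All probabilities are hypothetical. The drug itself has NO effect on risk.
    Returns counts[(severe, treated)] = [participants, events].
    """
    rng = random.Random(seed)
    counts = {(s, t): [0, 0] for s in (False, True) for t in (False, True)}
    for _ in range(n):
        severe = rng.random() < 0.3                        # 30% severe cases
        treated = rng.random() < (0.8 if severe else 0.2)  # severity guides prescribing
        event = rng.random() < (0.10 if severe else 0.02)  # severity, not drug, drives risk
        cell = counts[(severe, treated)]
        cell[0] += 1
        cell[1] += event
    return counts

def risk(cells):
    """Pooled risk (events / participants) over a list of [n, events] cells."""
    return sum(c[1] for c in cells) / sum(c[0] for c in cells)

def risk_ratios(counts):
    """Return the crude risk ratio and the risk ratio within each severity stratum."""
    strata = (False, True)
    crude = (risk([counts[(s, True)] for s in strata])
             / risk([counts[(s, False)] for s in strata]))
    by_stratum = {s: risk([counts[(s, True)]]) / risk([counts[(s, False)]])
                  for s in strata}
    return crude, by_stratum

if __name__ == "__main__":
    crude, by_stratum = risk_ratios(simulate())
    print(f"crude risk ratio: {crude:.2f}")
    for severe, rr in by_stratum.items():
        print(f"risk ratio, severe={severe}: {rr:.2f}")
```

Under these assumed parameters the crude risk ratio is expected to be around 2.5, whereas the stratum-specific ratios stay close to 1. Stratification works here only because severity is fully measured; in real data, unmeasured severity would leave residual confounding.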
Several pharmaco-epidemiological and biostatistical methods have been developed to reduce the impact of biases in observational studies (Norgaard et al. 2017). For instance, self-controlled designs such as case-crossover are based upon comparisons of time spent under exposed v. unexposed conditions within the same persons, rather than on a comparison of exposed v. unexposed persons. With regard to biostatistical methods, the high-dimensional propensity score method uses a wide range of variables collected in healthcare databases to reduce residual confounding by potentially important unobserved confounders (Schneeweiss et al. 2009; Verdoux et al. 2017). Pseudorandomised designs such as instrumental variable analysis or Mendelian randomisation may also be used (Norgaard et al. 2017).
Strengths of observational studies: large sample size
The sample size is a key issue in identifying drug-related harms. For rare adverse effects, there is a major risk of wrongly concluding that a drug is safe due to type II error (Loke et al. 2011). As a consequence, only frequent adverse effects can be identified by RCTs (Alves et al. 2012). There are several examples in the field of psychotropic drugs of serious adverse effects which were undetected during the pre-marketing clinical trials and observed only in post-marketing use. For instance, clozapine-induced agranulocytosis was first identified by case reports in Finland (Amsler et al. 1977). Case reports and retrospective observational studies played a key role in the warning issued by the Food and Drug Administration in 2003 about diabetes induced by second-generation antipsychotics (SGAs), which was fatal in a number of patients (Lindenmayer & Patel, 1999; Sernyak et al. 2002). The increased risk of metabolic adverse effects was subsequently confirmed by population-based cohort studies (Koro et al. 2002). The controversy between the drug companies and regulatory agencies over whether these adverse effects could be attributed to SGAs (Rosack, 2003) illustrates the need for post-marketing observational studies independent from the drug companies to limit conflicts of interest and suspicion about hidden findings during industry-sponsored pre-marketing trials (Moncrieff, 2007).
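The link between sample size and the detection of rare adverse effects can be illustrated with simple arithmetic. The sketch below assumes that adverse events occur independently with the same probability in every participant; the incidence of 1 per 1000 and the trial size of 500 are hypothetical values chosen for illustration, not figures from the studies cited above, and the function names are likewise illustrative.

```python
import math

def prob_at_least_one_event(incidence: float, n: int) -> float:
    """Probability of observing at least one event among n independent participants."""
    return 1.0 - (1.0 - incidence) ** n

def sample_size_for_detection(incidence: float, power: float = 0.95) -> int:
    """Smallest n giving at least `power` probability of observing one or more events."""
    return math.ceil(math.log(1.0 - power) / math.log(1.0 - incidence))

if __name__ == "__main__":
    incidence = 0.001  # hypothetical adverse effect: 1 case per 1000 exposed persons
    print(prob_at_least_one_event(incidence, 500))  # chance a 500-person trial sees any case
    print(sample_size_for_detection(incidence))     # n needed for a 95% chance
```

With these assumed values, a trial of 500 participants has just under a 40% chance of observing even a single case, and close to 3000 participants are needed to reach a 95% chance, in line with the familiar ‘rule of three’ approximation (n ≈ 3 divided by the incidence). Observing one case is of course far short of demonstrating an increased risk, so the samples needed in practice are larger still.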
Strengths of observational studies: unselected populations
Another major strength of observational studies is the inclusion of unselected persons, or at least persons less selected than participants in RCTs (Nordon et al. 2018). Indeed, the impact of drugs may be examined in youths or elderly persons, pregnant women and persons with comorbid conditions, who are frequently excluded from RCTs. Participants are hence more likely to be representative of the population actually reached and treated by the drug in real-life conditions.
The most illustrative example of a population excluded from RCTs is that of pregnant women. For obvious ethical reasons, pregnancy is a priori an exclusion criterion in all RCTs, except the few directly exploring drugs for pregnancy complications. To investigate teratogenic effects of drugs in humans, the only methodological option is thus to rely on observational studies. The risk of teratogenesis may be considered as the most feared drug-induced harm at all decisional levels (user, prescriber, regulatory agencies, etc.). From a historical perspective, the dramatic consequences of prenatal exposure to drugs have strongly influenced the assessment of drug-induced harms. The establishment of drug regulatory agencies and pharmacovigilance centres was triggered by the thalidomide disaster in the 1960s (Botting, 2002) followed by the diethylstilboestrol scandal (Verdoux et al. 2007). Despite increasingly stringent regulatory safeguards, health scandals due to delayed warnings about teratogenic effects of drugs are still occurring.
For instance, signals about the neural tube defects and the cognitive deficits induced by prenatal exposure to anticonvulsants such as valproate were available as early as the 1990s (Tanoshima et al. 2015). These risks were later confirmed by observational studies based upon stringent methodology such as that used in the European Surveillance of Congenital Anomalies (EUROCAT) antiepileptic-study database collecting information over 20 years in 3.8 million births in 14 European countries (Jentink et al. 2010). The French National Agency for the Safety of Medicines and Health Products (ANSM) issued the first safety warning about the risks of valproate during pregnancy in 2014 (ANSM, 2014) before banning valproate use in 2017 in the event of a pregnancy (Casassus, 2017). What is notable in this recent health scandal is that the decision of the health authorities was triggered by the alert launched by an association of valproate victims (APESAC) rather than by the large body of evidence drawn from observational studies. This episode indirectly demonstrates that overemphasising the weaknesses and biases of observational studies, leading to a lack of confidence and suspicion about the validity of their findings, may have deleterious health consequences.
Another question raised by the association of valproate victims was related to the increased risk of autism in children prenatally exposed to this drug. Exploring whether in utero exposure to drugs increases the risk of behavioural disturbances in children is probably one of the most challenging questions in pharmaco-epidemiology, as control of biases and confounding factors is especially complex. These issues are very controversial and usually generate passionate debates. Until the last decade, very few studies had adequately addressed this question (Verdoux et al. 2009). Recently, observational cohort studies have reported that the risk of autism may be increased by prenatal exposure to valproate (Christensen et al. 2013) or antidepressants (Rai et al. 2013). For instance, the study by Rai et al. (2013) was carried out in all youths aged 0–17 living in Stockholm County in 2001–2011, for whom information on maternal use of antidepressants in pregnancy and psychiatric diagnoses was collected in registers. The four different methods used to control for confounding factors provided concordant results, i.e. that prenatal exposure to antidepressants was associated with a higher risk of autism. As the current debates regarding whether the reported associations are causal or not are often opinion- rather than evidence-based, it is crucial to consider that the designs of these studies provide solid guarantees regarding the validity of the findings. In other words, the observational nature of the collected data should not be used as an argument to dismiss these results.
Although the selection is less stringent in observational studies than in RCTs, potential selection biases nevertheless have to be kept in mind when interpreting the findings. To continue on the theme of teratogenic risks, the example of prenatal exposure to lithium is a well-known illustration of how the design of observational studies may impact risk assessment. The first alarming reports of an increased risk of congenital heart diseases in babies prenatally exposed to lithium were issued by the International Registries of Lithium Babies implemented in several countries, based on the model of the first registry initiated by Schou in 1969 in Denmark. Using these data, Nora and colleagues reported in 1974 that the risk of Ebstein's anomaly was 400 times greater in babies prenatally exposed to lithium than in the general population (Nora et al. 1974). This finding had a strong impact on clinical practice over the next decades. Its validity was reconsidered only when pharmaco-epidemiological studies carried out in representative samples reported a much lower teratogenic risk (Cohen et al. 1994). Indeed, data collection in lithium registries was based upon voluntary physician reports, favouring reports of births with neonatal anomalies, and this selection bias induced an overestimation of the teratogenic risk of lithium. A recent systematic review and meta-analysis identified 62 observational studies exploring the teratogenic risk of lithium (McKnight et al. 2012). No association was found between congenital abnormality (including Ebstein's anomaly) and exposure to lithium. Since the numbers of events and of infants exposed to lithium were small, it is not possible to exclude definitively the risk of teratogenesis associated with lithium exposure. What is certain, given the current state of knowledge, is that the risk is much lower than that estimated by the first studies, and also much lower than that induced by prenatal exposure to anticonvulsants.
Strengths of observational studies: duration of exposure and follow-up
As RCTs are carried out over a short period, they can only identify adverse effects induced by short- or medium-term drug exposure. Data collected in observational studies are hence the main source of information to investigate the long-term effects of drug exposure, including delayed adverse effects that may occur after drug withdrawal. From a public health perspective, it is crucial to carefully explore signals about long-term or delayed drug adverse effects. This is especially true for drugs with high prescribing rates in the general population, as even a small increase in risk may be associated with a significant number of attributable cases (Rose, 1992). When a signal is observed, demonstrating the causal nature of the association between drug exposure and the outcome of interest is always a complex task. This uncertainty often generates passionate debates and scientific controversies, such as those concerning the health consequences of exposure to oral contraceptives or to long-term hormone therapy for menopausal symptoms (Bhupathiraju et al. 2016).
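The point about attributable cases can be shown with a short calculation. The sketch below assumes a causal association and uses the standard relation: attributable cases among the exposed = number exposed × baseline risk × (relative risk − 1). The exposure counts, baseline risk and relative risks are hypothetical values chosen only to contrast a widely prescribed drug with a rarely prescribed one; the function name is illustrative.

```python
def attributable_cases(n_exposed: int, baseline_risk: float, relative_risk: float) -> float:
    """Excess cases among exposed persons, assuming the association is causal."""
    return n_exposed * baseline_risk * (relative_risk - 1.0)

if __name__ == "__main__":
    # Hypothetical widely prescribed drug: small relative risk, many exposed persons.
    common = attributable_cases(1_000_000, baseline_risk=0.01, relative_risk=1.2)
    # Hypothetical rarely prescribed drug: large relative risk, few exposed persons.
    rare = attributable_cases(10_000, baseline_risk=0.01, relative_risk=3.0)
    print(f"widely prescribed drug: {common:.0f} attributable cases")
    print(f"rarely prescribed drug: {rare:.0f} attributable cases")
```

Under these assumptions, a 20% increase in risk among one million exposed persons yields about 2000 attributable cases, ten times more than a threefold increase in risk among 10 000 exposed persons (about 200 cases).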
In the field of psychotropic drugs, this issue may be illustrated by the debate regarding the link between exposure to benzodiazepines and dementia. The deleterious acute effects of benzodiazepines on cognitive performance, which may persist after withdrawal, are well-documented (Barker et al. 2004). Several pharmaco-epidemiological studies performed in cohorts of elderly persons recruited in the general population or in health insurance databases have shown that elderly benzodiazepine users are at increased risk of dementia (Verdoux et al. 2005). Considering the large number of persons exposed to benzodiazepines in the general population, such a signal should be explored cautiously before concluding that a causal link may exist between drug exposure and poor cognitive outcome. Indeed, there may be a reverse causality bias as benzodiazepines may have been prescribed to treat prodromal symptoms of dementia such as insomnia, anxiety or depressive symptoms. This bias was addressed in a study carried out in the PAQUID (Personnes Agées quid?) cohort including elderly persons living in the community (Billioti de Gage et al. 2012). These persons were randomly selected from the electoral rolls with seven follow-up assessments over 20 years. To control for confounding by indication, the study exploring the link between benzodiazepines and dementia was restricted to persons who were free of dementia at the 5-year follow-up and who reported the use of benzodiazepines for the first time at this assessment. An association was found between new use of benzodiazepines and an increased risk of dementia over the 15-year follow-up in multivariate analyses adjusted for a large range of potential confounders. Despite this design, no definite conclusion could be drawn regarding the causal nature of this association. Indeed, as no adjustment was made for anxiety and sleep disorders at the time of benzodiazepine initiation, reverse causation could not be excluded. A selection bias was also possible: people with missing information for benzodiazepine use were excluded, and these persons with unperformed follow-up may be at increased risk of dementia. To address these biases, another study was performed using data collected in the Quebec health insurance program database (Billioti de Gage et al. 2014). Exposure to benzodiazepines was documented in a time window ranging from 5 to 10 years before the first diagnosis of dementia. An increased risk of dementia in benzodiazepine users was found in analyses adjusted for depression, anxiety and insomnia, with a dose–response relationship between duration of exposure and risk of dementia. This study also showed that the risk of dementia was higher with use of long-acting compared with short-acting benzodiazepines. In spite of these findings supporting the existence of a causal association between drug exposure and poor cognitive outcome, this causal link cannot yet be definitely established. Indeed, it is not possible to exclude that prescription of benzodiazepines is just a marker of unmeasured psychiatric risk factors for dementia, i.e. that benzodiazepine use is not on the causal pathway between these risk factors and the incident cognitive deficits. This issue will be clarified only by prospective studies exploring the long-term cognitive outcome of young adults chronically exposed to benzodiazepines.
Studies exploring the long-term impact of drug exposure may sometimes provide reassuring findings indicating that some risks may have been overestimated. For instance, the risk of end-stage chronic kidney disease (CKD) is a feared consequence of long-term exposure to lithium. Owing to this potential complication, prescribers may be reluctant to prescribe lithium to patients and may restrict the access to this very effective drug (Verdoux et al. 2016). A systematic review and meta-analysis assessing risks associated with lithium exposure identified 30 observational studies exploring renal function (McKnight et al. 2012). Very little evidence was found for a clinically significant impact of lithium on renal function, the most consistent finding being a small reduction in urinary concentrating ability in lithium users. The risk of end-stage CKD was explored by a study using Danish registers (Kessing et al. 2015). This population-based design minimised the detection bias induced by the intensive screening of renal function in lithium users. The study showed that lithium exposure was associated with an increased rate of CKD, but not with an increased rate of irreversible end-stage CKD with either dialysis or transplantation. On the basis of these findings, the authors concluded that lithium prescribed at recommended serum levels is a safe drug for the long-term treatment of mood disorders.
Conclusion
The strengths of observational studies regarding the identification of drug-related harms mirror the limitations of RCTs and vice versa. Neither RCTs nor observational studies should be viewed as the exclusive methods able to provide valid answers to all the questions in the field of clinical pharmacology. The use of stringent methods is the best criterion to assess whether findings obtained in pharmacological studies investigating drug safety should be translated into public health and clinical decisions. Although RCTs (and meta-analyses) are required to address questions on efficacy and short-term tolerability of drugs, observational studies (and meta-analyses) are essential to address questions on rare or long-term adverse effects of drug exposure. The best information on drug safety from both RCTs and observational studies should hence be integrated in decisions issued by health regulatory agencies and in recommendations from scientific medical societies (Vandenbroucke & Psaty, 2008) to guide practitioners’ prescribing practices and class actions filed by health system users’ associations.
Acknowledgements
The authors thank Ray Cooke for copyediting the manuscript.
Financial support
This research received no specific grant from any funding agency, commercial or not-for-profit sectors.
Conflict of Interest
None.