Introduction
After experiencing a traumatic event, many adult trauma survivors show negative health-related symptoms. These typically include re-experiencing the trauma, hyperarousal and avoidance of trauma-associated stimuli (the three core symptom clusters of the posttraumatic stress disorder diagnosis), but alterations in mood and cognition occur as well (American Psychiatric Association, 2013). About 10% to 20% of trauma survivors show all symptoms of a full-blown posttraumatic stress disorder (PTSD; Norris & Slone, 2007), and around 8% of adults meet PTSD criteria at least once in their life (de Vries & Olff, 2009; Kessler, Petukhova, Sampson, Zaslavsky, & Wittchen, 2012). However, the diagnosis of PTSD is not very distinct, with many possible manifestations and combinations of symptoms (Galatzer-Levy & Bryant, 2013). In addition, partial PTSD is also associated with considerable impairments (Marshall et al., 2001), and with health-seeking behaviour similar to that observed among individuals who fulfil diagnostic criteria for PTSD (Stein, Walker, Hazen, & Forde, 1997).
PTSD symptoms carry a high risk of chronicity, comorbid medical and psychiatric symptoms, and suicide (Frayne et al., 2004; Kessler et al., 2005; Kessler, Sonnega, Bromet, Hughes, & Nelson, 1995; Krysinska & Lester, 2010; Pietrzak, Goldstein, Southwick, & Grant, 2011, 2012; Wittchen et al., 2011). Further, PTSD symptoms often lead to social and occupational impairment, and are associated with substantial economic and societal costs (Kessler, 2000). National treatment guidelines suggest several efficacious treatments for PTSD (Forbes et al., 2010), including a variety of trauma-focused psychotherapeutic treatment approaches (American Psychological Association, 2017; Foa, Keane, Friedman, & Cohen, 2009; Forbes et al., 2007; Institute of Medicine, 2008; National Institute for Health and Care Excellence, 2005; World Health Organization, 2013), but also pharmacological treatments (American Psychological Association, 2017; Foa et al., 2009). However, many patients with PTSD do not receive adequate treatment for their symptoms (Lewis et al., 2019; Liebschutz et al., 2007; Rodriguez et al., 2003).
In 1986, writing about one's own trauma experience was proposed as a potentially beneficial treatment for trauma survivors by Pennebaker and Beall (1986). Expressive writing originally consisted of four writing sessions of 15 minutes' duration and did not involve additional contact with a mental health professional. Initially, promising results were demonstrated for expressive writing treatment in reducing symptom severity and increasing well-being (Smyth, 1998). However, benefits in subsequent meta-analyses were mostly small to moderate, reflecting considerable variation in treatment effects across meta-analyses (Frattaroli, 2006; Frisina, Borod, & Lepore, 2004; Mogk, Otte, Reinhold-Hurley, & Kröner-Herwig, 2006; Smyth & Pennebaker, 2008). These findings motivated adaptations of the original paradigm in order to increase the initially observed beneficial treatment effects of writing treatments (Smyth & Pennebaker, 2008). Such adaptations included, for instance, the addition of interactions with a therapist or the provision of more detailed and guided writing instructions. Importantly, the main component of the treatment remained the writing itself, and a number of mechanisms have been described to explain the observed treatment benefits (including improved self-regulation, cognitive processing of the trauma memory, and restoring perceptions of control; Andersson & Conley, 2008; Frattaroli, 2006; Smyth & Pennebaker, 2008). Besides the assumed beneficial health effects, the parsimony of writing treatments, as well as the huge potential to close gaps in the provision of PTSD treatment through remote (e.g. online) delivery, may have contributed to the treatment's continuing popularity over the last three decades.
Several meta-analyses have been conducted over the last 20 years that showed small to moderately sized beneficial effects of the original expressive writing assignments in improving PTSD symptoms (Frattaroli, 2006; Frisina et al., 2004; Mogk et al., 2006; Smyth, 1998; Smyth & Pennebaker, 2008). More recent meta-analyses focused on novel developments in writing treatments and did not include studies using the original writing paradigm (Kuester, Niemeyer, & Knaevelsrud, 2016; van Emmerik, Reijntjes, & Kamphuis, 2013).
While early randomized clinical trials (RCTs) evaluating the effects of writing treatments primarily used neutral writing assignments as control groups (e.g. writing about daily activities), more recent RCTs also incorporated passive comparators (i.e. waiting list control), and psychotherapeutic PTSD treatments. Today's plethora of available RCTs creates the opportunity to make multiple comparisons between the original and adapted writing treatments, psychotherapeutic PTSD treatments, as well as active and passive control groups in RCTs of writing treatments. The complex pattern of evidence from these differently controlled RCTs complicates the integration of available research findings using conventional pairwise meta-analytic approaches and calls for a network meta-analytic summary of available RCTs.
We conducted a systematic review and network meta-analysis including studies with full and partial PTSD as well as studies which included participants who had been exposed to trauma and suffered from PTSD symptoms. We included all available direct comparisons between an expressive writing treatment as stand-alone treatment (i.e. not as part of a complex treatment package) that was compared with a psychotherapeutic PTSD treatment, with an active writing control, or with a passive waiting-list control. We distinguished between original and enhanced writing treatments and summarized the available evidence in the short- and long-term.
Methods
This study was conducted in accordance with the PRISMA-NMA statement (Hutton et al., 2015; Moher, Liberati, Tetzlaff, Altmann, & Group, 2009), and was registered on PROSPERO (number: CRD 42018094075; Gerger, Gaab, & Werner, 2018).
Identification of studies
We searched EMBASE, Medline, PsycINFO, and Cochrane Controlled Trials Register using key words and text words related to writing treatments, trauma experience and RCTs (see eAppendix 1). In addition, one researcher (CW) searched through the reference lists of relevant systematic reviews and meta-analyses (Frattaroli, 2006; Frisina et al., 2004; Kuester et al., 2016; Mogk et al., 2006; Smyth, 1998; van Emmerik et al., 2013) for potentially relevant trials. The initial literature search was conducted between 8 June 2016 and 15 November 2016. The last update of the database search was conducted on 6 September 2020. Study inclusion was finished on 5 October 2020. Two reviewers (CW and HG) independently screened the full texts of potentially relevant publications using a structured manual. Disagreements were resolved by consensus.
Selection criteria
We included RCTs that applied at least one trauma-focused writing treatment, which aimed at reducing PTSD symptoms, and which was not part of a complex treatment package. We allowed any delivery method (e.g. paper and pencil, electronic, or internet-based), as long as it was a purely written intervention and not mixed with any other intervention such as verbal cognitive behavioural therapy. RCTs were included even when the trauma-focused writing treatment was not the main focus of the experimental investigation but served as a control condition for psychotherapy. We included comparisons of trauma-focused expressive writing treatments with PTSD psychotherapies, neutral writing and waiting-list control groups.
We defined trauma-focused writing as a writing treatment that targeted the traumatic event the participant had experienced. We classified expressive writing treatments into (1) those that referred to the original paradigm by Pennebaker and Beall (1986), and (2) enhanced writing interventions, which included additional elements assumed to increase their efficacy (i.e. therapist contact exceeding the initial writing instruction, or more elaborated and directive instructions for each individual writing session). Writing treatments were classified as expressive writing (EW) if authors either explicitly referred to the original paradigm by Pennebaker and Beall (1986), or if the writing treatments were structured similarly to the original writing paradigm (e.g. three to four sessions of 15–30 min duration). Importantly, to be considered EW no therapist involvement was allowed. Also, no individualized instructions for each writing session were allowed. Writing treatments were classified as enhanced writing (EW+) if the treatment description (1) did not explicitly refer to the original Pennebaker writing paradigm and (2) included additional elements assumed to increase efficacy: either the presence of a therapist during writing sessions, or any therapist feedback. In many cases experimental manipulation of the writing content was used (e.g. more directive writing instructions which changed for each writing session). Enhanced writing treatments typically also used more or longer writing sessions compared with the original paradigm. However, the use of longer sessions alone was not sufficient for a writing treatment to be classified as enhanced writing. Studies that used only experimental manipulations of formal aspects of the writing task (e.g. writing in the first-person v. writing in the third-person; Andersson & Conley, 2013; Kenardy & Tan, 2006) but which had no additional comparator were not included in the analyses.
Neutral control writing was defined as a writing task that did not focus on a traumatic event (e.g. writing about daily tasks). We included RCTs with adults (i.e. mean age of the study sample was 18 or above). Participants needed to have experienced at least one traumatic event according to the Diagnostic and Statistical Manual of Mental Disorders fifth edition PTSD criterion A (DSM-5; American Psychiatric Association, 2013), and they needed to report the occurrence of either full or partial PTSD, or the presence of PTSD symptoms in the aftermath of trauma experience (see eAppendix 2 for a more detailed description). We excluded studies on expressive writing with samples that did not report the presence of PTSD symptoms (e.g. Burton & King, 2004; Pennebaker & Beall, 1986; Ramirez & Beilock, 2011; Tondorf et al., 2017). We had no language restrictions and we did not require studies to be double-blind for inclusion, as blinding of therapists and participants is not possible in psychotherapy research.
Outcomes
Our primary outcome was the longest available follow-up assessment of PTSD symptom severity measured on a continuous validated scale, or using structured interviews assessing PTSD symptoms according to diagnostic criteria. In addition to the longest available follow-up, we assessed treatment effects immediately after treatment termination (⩽1 month after treatment termination) and long-term effects (>1 month after termination). If more than one PTSD scale was used in the trial, we used a predefined hierarchy, which gave the most frequently used scales precedence (see eAppendix 2 for the predefined hierarchy). Results from intention-to-treat (ITT) analyses were preferred over results from per-protocol or completer analyses, and observer-rated outcomes were used in our analyses only if self-rated outcomes were not reported. As a secondary outcome we included the acceptability of PTSD treatments as indicated by patients dropping out of treatment before treatment termination. If no reasons for early termination were provided, we used the total drop-out rates per group.
Data collection
For the effect size calculation, we extracted sample sizes (N), means (M) and standard deviations (s.d.) for each treatment group. Where these values were missing, we extracted other statistical data that can be converted into means and standard deviations. Conversions were calculated according to previously suggested formulas (Cohen, 1988; Higgins & Green, updated March 2011; Lakens, 2013; Lipsey & Wilson, 2001). If the N was missing in the table of analysis, we used the N of the descriptive statistics, and if group Ns were missing, we assumed the same sample size per group. We contacted one study author because insufficient information was available, but the author did not reply. Studies were excluded if the outcome data could not be calculated, imputed, or obtained from the authors. For the calculation of risk ratios (RRs) as indicators of treatment acceptability we extracted the number of drop-outs between beginning and end of treatment.
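To illustrate the kind of conversion described above, the following minimal sketch shows three textbook formulas for recovering a standard deviation or a standardized effect from other reported statistics. It is an illustration under stated assumptions, not the conversion code used in this review: the function names are hypothetical, and the confidence-interval conversion assumes a large-sample normal approximation.

```python
import math


def sd_from_se(se: float, n: int) -> float:
    """Standard deviation recovered from the standard error of a group mean."""
    return se * math.sqrt(n)


def sd_from_ci(lower: float, upper: float, n: int, z: float = 1.96) -> float:
    """Standard deviation recovered from a confidence interval of a group mean.

    Assumes a normal approximation; z = 1.96 corresponds to a 95% interval.
    """
    return math.sqrt(n) * (upper - lower) / (2 * z)


def d_from_t(t: float, n1: int, n2: int) -> float:
    """Cohen's d recovered from an independent-samples t statistic."""
    return t * math.sqrt(1 / n1 + 1 / n2)


# Hypothetical inputs, for illustration only.
example_sd = sd_from_se(se=2.0, n=25)
example_d = d_from_t(t=2.0, n1=50, n2=50)
```

For small groups, a t-distribution quantile would be more appropriate than the fixed z value in `sd_from_ci`.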
In addition to the data for effect size calculation, we coded characteristics of the included population (e.g. type of trauma, age of the study sample, PTSD diagnosis), the intervention (e.g. number of treatment sessions, reference to the original Pennebaker writing paradigm, presence of a therapist during writing sessions, location of writing), and the study (e.g. year of publication). We rated risk of bias for the results presented in each individual included study using the dimensions defined in the Cochrane Risk of Bias (RoB) Assessment Tool (Higgins & Green, updated March 2011). Across studies we rated the indirectness of the available evidence (i.e. whether a single study differed from the target studies we were interested in with respect to population, intervention, outcome assessment, or the type of comparison; Guyatt et al., 2011). In order to rate the confidence in the entire network meta-analytic results on a meta-level across all included studies we used the CINeMA framework (Salanti, Del Giovane, Chaimani, Caldwell, & Higgins, 2014) (see eAppendix 2 for a detailed description of ratings for RoB, indirectness, and network confidence). Two independent raters (HG and CW) extracted all data from all included studies on a standardized form (Microsoft Office Excel 2011 and 2018) after intensive training in using the manual with operational descriptions of each item. Disagreements were resolved by consensus between these two raters.
Data analysis
Standardized mean differences (SMDs) were calculated first with the data collected at the end of treatment (34 studies), and second with the data from long-term follow-up (26 studies). In our analyses using the longest available follow-up data we included all 44 identified studies, with a preference for long-term data if both end-of-treatment and long-term data were available. In our protocol, we defined the analyses using short-term data as primary outcomes. This choice was made because we expected that all studies would report results at the end of treatment and we wanted the main analyses to include all available studies. Contrary to our expectations, several studies reported long-term follow-up data only. Therefore, we decided to use the most complete results, based on the longest available follow-up data, as primary outcome (i.e. we used these data for subsequent explorations of heterogeneity and robustness of findings in our sensitivity analyses). However, in accordance with the protocol, we report all results: using short-term data only (34 studies), long-term data only (26 studies), and all available data (i.e. the longest available follow-up from 44 studies). The magnitude of SMD was interpreted as small (0.20 s.d. units), moderate (0.50 s.d. units), or large (0.80 s.d. units; Cohen, 1988). RRs were calculated for the drop-out rates between start and end of treatment; losses to follow-up were not considered. We used a 2-sided p < 0.05 to indicate statistical significance.
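As a worked illustration of the two effect measures named above, the sketch below computes an SMD with its standard error and a risk ratio with a confidence interval from group-level summary data. Several choices here are assumptions made for illustration and are not taken from this review: the small-sample (Hedges) correction, the log-scale Wald interval for the RR, the function names, and all example numbers.

```python
import math


def smd_hedges(m1, sd1, n1, m2, sd2, n2):
    """Standardized mean difference (group 1 minus group 2) with standard error.

    Uses the pooled standard deviation and the Hedges small-sample correction
    (an assumption of this sketch).
    """
    sd_pooled = math.sqrt(((n1 - 1) * sd1**2 + (n2 - 1) * sd2**2) / (n1 + n2 - 2))
    d = (m1 - m2) / sd_pooled
    j = 1 - 3 / (4 * (n1 + n2) - 9)  # small-sample correction factor
    var_d = (n1 + n2) / (n1 * n2) + d**2 / (2 * (n1 + n2))
    return j * d, math.sqrt(j**2 * var_d)


def risk_ratio(events1, n1, events2, n2, z=1.96):
    """Risk ratio (group 1 v. group 2) with a Wald interval on the log scale.

    Requires at least one event in each group.
    """
    rr = (events1 / n1) / (events2 / n2)
    se_log = math.sqrt(1 / events1 - 1 / n1 + 1 / events2 - 1 / n2)
    lower = math.exp(math.log(rr) - z * se_log)
    upper = math.exp(math.log(rr) + z * se_log)
    return rr, lower, upper


# Hypothetical data: PTSD scores after treatment v. control, and drop-outs per group.
g, se_g = smd_hedges(m1=20.0, sd1=10.0, n1=30, m2=25.0, sd2=10.0, n2=30)
rr, rr_low, rr_high = risk_ratio(events1=10, n1=50, events2=5, n2=50)
```

With symptom scores where lower values are better, a negative SMD in this sketch favours group 1, and an RR above 1 indicates more drop-outs in group 1.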
A network was created including five jointly randomizable treatments: (1) expressive writing (original; EW), (2) enhanced expressive writing (EW+), (3) PTSD psychotherapies (PT), (4) neutral writing controls (NW), and (5) waiting list controls (WL). Network geometry was summarized in a graph which presents the five treatments as nodes (larger nodes indicate a larger number of studies per treatment), and the available comparisons between treatments as edges between the nodes (the thickness of the edges represents the number of available comparisons). We assumed that any patient who meets all inclusion criteria is likely, in principle, to be randomized to any of the interventions in the synthesis comparator set. We addressed the assumption of transitivity in the network meta-analysis (Salanti, 2012) by (1) assessing whether the included interventions are similar across studies using a different design, and (2) checking whether the distribution of potential moderators is balanced across comparisons (Jansen & Naci, 2013).
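The node and edge weights of such a network graph can be tallied directly from the list of study arms. The following minimal sketch assumes each study is represented as a list of treatment labels; the function name and the three example studies are hypothetical and do not correspond to the included trials.

```python
from collections import Counter
from itertools import combinations


def network_geometry(studies):
    """Count studies per treatment (nodes) and per pairwise comparison (edges)."""
    nodes = Counter()
    edges = Counter()
    for arms in studies:
        treatments = sorted(set(arms))  # each treatment counted once per study
        nodes.update(treatments)
        # A multi-arm study contributes one comparison per pair of its arms.
        edges.update(combinations(treatments, 2))
    return nodes, edges


# Hypothetical studies: two two-arm trials and one three-arm trial.
example_studies = [["EW", "NW"], ["EW+", "WL"], ["EW+", "PT", "WL"]]
nodes, edges = network_geometry(example_studies)
```

Node size in the graph would then scale with `nodes[treatment]` and edge thickness with `edges[(a, b)]`.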
We considered random-effects models rather than a fixed-effect model because the included studies were different with respect to clinical and other factors (see eTable 1). SMDs were calculated for all relevant comparisons within each study. In addition, indirect evidence was estimated using the entire network of evidence. To conduct network meta-analyses within a frequentist framework we used the package netmeta version 0.9-7 (Rücker, Schwarzer, Krahn, & König, 2018) for the open-source software environment R (version 3.5.1; R Core Team, 2018). The R function pairwise transformed the dataset to the contrast-based format, which is needed for conducting the network meta-analysis.
To express heterogeneity between studies the Q statistic was used (Cochran, 1950). Further, τ² was calculated to get an estimate of the variance between studies (Higgins, 2008). For the primary outcome a value of τ² = 0.04 was considered as low heterogeneity, 0.09 as moderate and 0.16 as high heterogeneity (Borenstein, Hedges, Higgins, & Rothstein, 2011). In addition, we used I² as an indicator of the amount of observed variance that can be attributed to between-study heterogeneity (Higgins, Thompson, Deeks, & Altman, 2003), which can roughly be interpreted as follows: 0%–40%: might not be important; 30%–60%: may represent moderate heterogeneity; 50%–90%: may represent substantial heterogeneity; 75%–100%: considerable heterogeneity (Borenstein et al., 2011). In the network meta-analyses, we assumed a common estimate for the between-study heterogeneity variance across all included comparisons.
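The three heterogeneity statistics can be illustrated for a single pairwise comparison. The sketch below uses the DerSimonian-Laird moment estimator for τ², which is an assumption of this example; the network model in this review estimates one common τ² across all comparisons, which this simplified pairwise version does not reproduce. The function name and input values are hypothetical.

```python
def heterogeneity(effects, variances):
    """Cochran's Q, DerSimonian-Laird tau^2 and I^2 (in %) for one comparison."""
    weights = [1 / v for v in variances]  # inverse-variance weights
    sum_w = sum(weights)
    pooled = sum(w * y for w, y in zip(weights, effects)) / sum_w
    q = sum(w * (y - pooled) ** 2 for w, y in zip(weights, effects))
    df = len(effects) - 1
    c = sum_w - sum(w**2 for w in weights) / sum_w
    tau2 = max(0.0, (q - df) / c) if c > 0 else 0.0
    i2 = max(0.0, (q - df) / q) * 100 if q > 0 else 0.0
    return q, tau2, i2


# Hypothetical SMDs and their variances from three studies.
q, tau2, i2 = heterogeneity(effects=[-1.0, 0.0, 1.0], variances=[0.1, 0.1, 0.1])
```

Both τ² and I² are truncated at zero, so a Q value below its degrees of freedom yields no estimated between-study variance.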
We used local as well as global methods to detect inconsistency in the network (Efthimiou et al., 2016): (1) locally using the netsplit command (i.e. splitting direct and indirect evidence), and (2) globally using the decomp.design command (i.e. using the design-by-treatment interaction model). We compared the magnitude of heterogeneity between consistency and inconsistency models to determine how much of the total heterogeneity was explained by inconsistency.
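The idea behind a local inconsistency check can be shown with the simplest case, a single loop of three treatments. The sketch below is not the netsplit procedure; it substitutes the simpler adjusted indirect comparison (often attributed to Bucher), in which the indirect estimate of A v. B is obtained through a common comparator C and then contrasted with the direct estimate by a z-test. Function names and all numbers are hypothetical.

```python
import math


def indirect_estimate(d_ac, se_ac, d_bc, se_bc):
    """Indirect estimate of A v. B via common comparator C: d_AB = d_AC - d_BC."""
    return d_ac - d_bc, math.sqrt(se_ac**2 + se_bc**2)


def inconsistency_test(direct, se_direct, indirect, se_indirect):
    """Difference between direct and indirect estimates with a two-sided z-test."""
    diff = direct - indirect
    se_diff = math.sqrt(se_direct**2 + se_indirect**2)
    z = diff / se_diff
    p = math.erfc(abs(z) / math.sqrt(2))  # two-sided p under a standard normal
    return diff, z, p


# Hypothetical SMDs: A v. C and B v. C from separate trials, plus a direct A v. B estimate.
ind, se_ind = indirect_estimate(d_ac=-0.8, se_ac=0.3, d_bc=-0.3, se_bc=0.4)
diff, z, p = inconsistency_test(direct=-0.2, se_direct=0.25, indirect=ind, se_indirect=se_ind)
```

A small p value here would suggest that direct and indirect evidence disagree for this comparison; in a full network the indirect estimate draws on all connecting paths rather than a single loop.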
We conducted sensitivity analyses excluding studies with imputed standard deviations, studies with high indirectness ratings, studies that reported only observer-rated outcomes, studies that did not use established but rather experimental PTSD psychotherapies, and studies that included only patients who reported PTSD symptoms but not full or partial PTSD, in order to test the robustness of results.
Results
The systematic database search identified 5439 records. Following the title and abstract screening, 119 full-text articles were considered potentially relevant. Of these, 44 RCTs with a total of 7724 participants were included in our analyses (see Fig. 1). Nine included studies were publicly available dissertation theses. All included studies were published between 1996 and 2018, and were available in English. The time to last available follow-up ranged between 7 and 420 days with a median of 42 days (see eTable 1), and longer intervals for the last follow-up assessment were observed in studies with enhanced writing and psychotherapy (see eTable 2). Forty-one studies reported self-rated outcomes and three studies reported only observer-rated outcomes. Two of these three reported adequate blinding of outcome assessors; for the third we found no information regarding blinding of outcome assessors. Six studies used psychotherapeutic PTSD treatments as comparator, including cognitive behavioural treatment (CBT) in one study, cognitive processing therapy (CPT) in two studies (Resick et al., 2008; Sloan, Marx, Lee, & Resick, 2018; van Emmerik, Kamphuis, & Emmelkamp, 2008), and eye movement desensitization and reprocessing (EMDR) in one study (Largo-Marsh, 1996; Largo-Marsh & Spates, 2002). One study applied a psychotherapeutic approach described as active facilitator disclosure (Slavin-Spenny, Cohen, Oberleitner, & Lumley, 2011), which includes talking about the trauma experience and the emotions relating to that experience, as well as the identification of missing content in the participant's story, and one study applied a highly directive protocol aimed at promoting evidence-based processes to improve PTSD symptoms (Alessandri, 2017).
In four studies experimental manipulations of the writing paradigm (e.g. instruction to focus on emotion v. on insights) were applied in addition to NW as control. In these cases, we combined the groups that used experimental manipulations (see eTable 1). In studies with psychotherapeutic PTSD treatments or waiting list control as comparator the proportion of participants with full or partial PTSD was larger (83.3% and 66.7%, respectively) than in the studies that used writing assignments as treatment (33.3% for EW and 32% for EW+) and neutral writing as comparator (30.4%; see eTable 2).
We identified a network of treatments in which comparisons were available for all possible treatment combinations. This allowed for estimating inconsistency between direct and indirect evidence for each comparison. See Fig. 2 for the identified network of comparisons and eTable 1 for additional characteristics of the included studies.
RoB was considered moderate in 14 studies and high in 30 studies (eTable 3). Indirectness was considered low in eight studies, moderate in 27 studies and high in nine studies (eTable 4). The network meta-analyses relied mostly on evidence with moderate to high RoB and with moderate indirectness (see eFigs 1 and 2). Confidence in the network meta-analyses was considered moderate for one comparison and low for three comparisons (i.e. EW v. NW, EW+ v. WL, and EW+ v. PT; eTable 5).
We checked for baseline differences in PTSD scores and found PTSD scores to be significantly lower in the EW+ groups compared with the WL groups, with an SMD of −0.12 (95% CI −0.23 to −0.02; see Table 1 and eAppendix 3).
Table 1 notes. EW, expressive writing; EW+, enhanced writing; FU, follow-up; NW, neutral writing; PT, psychotherapy; WL, waiting list. SMDs below 0 indicate superiority of a comparator over WL. SMDs above 0 indicate superiority of WL over a comparator. Statistically significant results are printed bold.
Comparative efficacy
At the end of treatment EW+ and PT were significantly more efficacious than EW, NW, and WL in reducing PTSD symptoms (Fig. 3a), and no significant differences between EW, NW, and WL were observed (Fig. 3a and Table 1). We found evidence for very large between-study heterogeneity (τ² = 0.17) and significant inconsistency (Q = 14.78; df = 4; p = 0.005).
At the longest available follow-up superiority of EW+ and PT over EW, NW, and WL decreased slightly but was still statistically significant (Fig. 3b; Table 1). Also, EW and NW showed moderately sized significant superiority over WL in the long-term. We found moderate heterogeneity (τ² = 0.08) and significant inconsistency (Q = 49.23; df = 10; p < 0.0001; eAppendix 4) in this analysis. Sensitivity analyses indicated some variation in the observed SMDs (Table 1). The general pattern of results, however, shows significant superiority of all active treatment groups over WL, and small to moderate differences between the active treatment groups (eAppendix 5). Pairwise meta-analyses confirmed this overall pattern (Figs 3a and b; eAppendix 6).
Exploratory findings excluding two-arm comparisons between EW+ and WL
In a post-hoc analysis we excluded two-arm studies that compared EW+ with WL because of the finding of significant baseline differences in this comparison (Table 1) and the observation that this comparison contributed considerably to the inconsistency observed in the network meta-analysis (see eAppendix 4). This analysis showed small to moderate significant superiority of all active treatments over WL (Fig. 3c; Table 1). PT was significantly superior to NW, and no significant differences were found between EW+, EW, and NW using longest available follow-up data (Fig. 3c). Heterogeneity was low to moderate in this analysis (τ² = 0.05) and inconsistency was reduced but still significant (Q = 18.28; df = 9; p = 0.03).
Comparative acceptability
With respect to the acceptability of treatments we observed significantly more drop-outs in PT as compared with WL (RR = 2.05, 1.04 to 4.04; Fig. 4). Between EW+, EW, NW, and WL no significant differences were observed (Fig. 4; eAppendix 7). We found low to moderate heterogeneity (τ² = 0.05) and statistically non-significant inconsistency (Q = 3.28; df = 6; p = 0.77). Pairwise meta-analyses confirmed the statistically non-significant differences in drop-outs between the different treatment approaches (Fig. 4; eAppendix 8).
Discussion
Our network meta-analysis addresses the comparative efficacy of expressive writing treatments relative to psychotherapeutic PTSD treatments, neutral writing treatment and waiting list controls. In order to consider recent developments in writing treatments we classified them into those that referred to the original paradigm developed by Pennebaker & Beall (EW) and those that included additional elements assumed to increase their efficacy (i.e. therapist contact and more elaborated and structured instructions for the individual writing sessions; EW+). To the best of our knowledge this is the most comprehensive summary of RCTs on the efficacy of writing treatments on PTSD symptoms so far. Using network meta-analysis we were able to include all available comparisons between writing treatments and active as well as passive comparators in one statistical model. From a clinical perspective it is important to consider that most of the studies which used EW and EW+ as treatment included trauma survivors who reported some PTSD symptoms, but who would not qualify for a partial or full PTSD diagnosis.
Our results show that in the short-term EW+ and PT significantly outperformed EW, NW and WL, with EW and NW showing only small and non-significant superiority over WL. In the long-term, however, all active treatments outperformed WL significantly, with EW+ and PT again significantly outperforming EW and NW. It is important to note that the average duration of treatment and the number of treatment sessions were considerably higher in EW+ and PT as compared with EW and NW (see eTable 2). Thus, the amount of time spent in treatment is confounded with the type of treatment. Our analyses do not allow conclusions as to whether the actual content of EW+ and PT or the time spent in treatment contributed most to the treatments' effects. The observed superiority of EW+ and PT was small to moderate and probably not of clinical significance (Stefanovics, Rosenheck, Jones, Huang, & Krystal, 2018). We found evidence for significantly more drop-out in PT as compared with WL. Although we aimed to extract data on treatment drop-outs only (as opposed to more general losses to follow-up), a huge variability in the definitions and reporting of drop-outs, as well as different reasons for dropping out (e.g. occurrence of adverse effects v. symptom improvement), complicates data extraction and, in turn, interpretation of these data with respect to treatment acceptability. Our analyses, including several sensitivity analyses, showed considerable variability between results from individual studies, as indicated by between-study heterogeneity, but there were also differences between direct and indirect estimates of comparative efficacy, as indicated by significant inconsistency.
Based on previous reports (Mylle & Maes, 2004; Pavlacic, Buchanan, Maxwell, Hopke, & Schulenberg, 2019; Pietrzak et al., 2011, 2012) we conducted a sensitivity analysis in which we excluded studies which had reported only increased levels of PTSD symptoms (as opposed to full or partial PTSD diagnoses). We found somewhat larger effect sizes of PT, EW+ and EW in this analysis as compared to the main analysis, but also a considerable increase in heterogeneity, which hampers clear conclusions based on this analysis. Because the studies comparing EW+ with WL showed significant differences at baseline already, and because this particular comparison contributed considerably to network inconsistency, we conducted an exploratory analysis in which we excluded this comparison from the network. In this analysis, the superiority of EW+ and PT compared to EW, NW and WL was considerably reduced, and the superiority of EW+ and PT over EW and NW was no longer statistically significant. In this analysis heterogeneity was reduced to a small to moderate level.
Thus, when discussing our study findings, the studies comparing EW+ with WL need some additional attention. In general, the problems associated with the use of WL as control in psychotherapy RCTs have been described previously (Cuijpers & Cristea, 2016; Eysenck, 1993; Furukawa et al., 2014; Staines & Cleland, 2007). Unfortunately, despite the availability of a credible active control treatment in RCTs on writing treatments (i.e. the neutral writing control), which was typically used in the earlier trials, more recent RCTs have increasingly implemented WL as comparator. Accordingly, excluding comparisons between EW+ and WL meant that nine of the 15 RCTs using EW+ had to be excluded from the analyses. In addition to the problems associated with the use of WL controls, the nine two-arm RCTs using EW+ as treatment and WL as control are also prone to the so-called investigator or researcher allegiance bias. In all nine studies the authors were involved in the development of the EW+ treatment protocol, one of the strongest indicators of researcher allegiance (Munder, Gerger, Trelle, & Barth, 2011). The presence of strong researcher preferences in favour of the investigated treatment has been shown to be associated with larger benefits of the preferred treatment in psychotherapy RCTs (Gerger & Gaab, 2016; Munder, Brütsch, Leonhart, Gerger, & Barth, 2013), and this association has been shown to be mediated by low methodological quality of the RCTs (Munder et al., 2011). The choice of WL as comparator, instead of a more credible active comparator, may contribute to such bias.
Strengths and limitations
In network meta-analyses, multiple comparisons between more than two treatment approaches are integrated in one analysis. This provides a more comprehensive overview of the comparative efficacy and acceptability of writing treatments relative to other treatment options, as well as to passive and active comparators. This analytic approach allowed us to detect potential differences in the efficacy of the original and adapted writing treatments, and to check whether results are consistent across different research designs. We reduced the risk of publication bias by including not only published research articles but also publicly available dissertations. To ensure transitivity in the network, we included only studies in which participants were randomly assigned to a writing intervention in at least one treatment group and to an additional comparator. We did not, however, include studies which directly compared only a psychotherapeutic PTSD treatment with a control treatment (e.g. waiting list), as these studies might differ from the writing intervention studies in their clinical or methodological characteristics. Regarding the combination of different psychotherapeutic PTSD treatments in one node of the network, one could question whether the PTSD psychotherapies were similar enough in their effects to be combined. A previous network meta-analysis demonstrated that there were no significant differences between the treatment effects of EMDR, CBT, and CPT (Gerger et al., 2014). A sensitivity analysis in which we excluded two studies which used newly developed psychotherapeutic treatments (i.e. directive protocol and active facilitator writing) replicated the findings obtained with all five studies which used psychotherapeutic PTSD treatments as comparators.
The most relevant limitation of our study is the observed heterogeneity and inconsistency. However, this observation reflects the diversity of findings reported in previous meta-analyses (Frattaroli, 2006; Frisina et al., 2004; Mogk et al., 2006; Smyth, 1998; Smyth & Pennebaker, 2008; van Emmerik et al., 2013). Unfortunately, even the currently most elaborate statistical approach to summarizing available research evidence (i.e. network meta-analysis) did not provide results that allow definite conclusions. However, using network meta-analysis we were able to show that the superiority of PT and EW+ might be overestimated (1) when focusing on short-term results only and (2) when including mainly comparisons between EW+ and WL. A further limitation of our study is that we focused only on PTSD symptoms and treatment acceptance as outcomes, but ignored additional potentially relevant outcomes, for instance well-being, as well as additional indicators of potential harm, for instance adverse events.
It is important to note that many of the included studies have to be considered underpowered, as a minimum of 64 participants per group would be needed in an RCT comparing a treatment with an active comparator and expecting a medium SMD of 0.50 with a desired power of 0.80 and a two-tailed significance level of 0.05 (Schnurr, 2007). The inclusion of underpowered trials in a meta-analysis increases the risk of biased results, partly because underpowered studies with negative or non-significant findings have a smaller chance of being published (contributing to the so-called publication bias). We tried to minimize the impact of publication bias by including unpublished studies in addition to studies which were published in scientific journals.
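The figure of 64 participants per group can be approximately reproduced with a standard sample size formula for comparing two independent means. The following is a minimal sketch and not the calculation used in the cited source: the function name `n_per_group` is hypothetical, the first value uses the normal approximation n = 2(z_{1-alpha/2} + z_{power})^2 / SMD^2, and the second adds a commonly used correction term (z_{1-alpha/2}^2 / 4) that approximates the t-test-based result.

```python
import math
from statistics import NormalDist


def n_per_group(smd, power=0.80, alpha=0.05):
    """Approximate per-group sample size for a two-arm comparison of means.

    Hypothetical helper for illustration only. Returns a tuple
    (normal approximation, approximation with small-sample correction),
    both rounded up to whole participants.
    """
    std_normal = NormalDist()
    z_alpha = std_normal.inv_cdf(1 - alpha / 2)  # two-tailed critical value
    z_power = std_normal.inv_cdf(power)          # quantile for desired power

    # Normal approximation for two equally sized independent groups.
    n_normal = 2 * (z_alpha + z_power) ** 2 / smd ** 2

    # Correction term approximating the use of the t distribution
    # (an assumption of this sketch, not taken from the article).
    n_corrected = n_normal + z_alpha ** 2 / 4

    return math.ceil(n_normal), math.ceil(n_corrected)


if __name__ == "__main__":
    # Medium effect (SMD = 0.50), power 0.80, two-tailed alpha 0.05.
    print(n_per_group(0.50))
```

Under these assumptions the corrected value for an SMD of 0.50 is 64 per group, whereas the uncorrected normal approximation gives 63. Smaller expected effects, as would be plausible for comparisons between two active treatments, require considerably larger groups.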
Conclusions
In our network meta-analysis using data from the longest available follow-up assessments, all active treatments (including NW) outperformed WL, with small to moderate superiority of trauma-focused treatments (i.e. PT, EW+, EW) over NW. We found only small to moderate superiority of PT and EW+ over EW, which was statistically significant in some analyses, but probably not of clinical significance. We conclude that, as it stands, methodological issues might explain the observed superiority of EW+ over EW to a considerable extent. Definite conclusions are hampered to date by the predominant use of WL controls in EW+ RCTs, the lack of direct comparisons between the original EW and the recently developed EW+, and the lack of RCTs of EW+ efficacy conducted by independent researchers. Thus, the superiority of EW+ over the original EW paradigm in particular, but also over NW controls, awaits confirmation from adequately sized comparative RCTs, preferably including all four active treatment approaches (i.e. EW, EW+, PT, and NW), reporting long-term data and including researchers with balanced preferences.
From a clinical perspective, the potential of writing interventions to fill treatment gaps in mental health care by offering the possibility to treat patients with only minimal therapist contact is highly relevant, and our analyses confirm significant benefits of writing treatments in improving PTSD symptoms. However, to date no definite conclusions are possible regarding the exact magnitude of these benefits, the increase in benefits gained by enhancing expressive writing with additional treatment components, and the effectiveness of writing treatments in comparison with PTSD psychotherapies.
Supplementary material
The supplementary material for this article can be found at https://doi.org/10.1017/S0033291721000143.
Acknowledgements
We would like to thank Franziska Z'graggen who conducted a pilot study for this network meta-analysis.
Author contributions
Dr Gerger had full access to all the data in the study and took responsibility for the integrity of the data and the accuracy of the data analysis. Gerger and Werner conceptualized and designed the study. Cuijpers, Gaab, Gerger, and Werner acquired, analysed or interpreted data. Gerger and Werner drafted the manuscript. Cuijpers, Gaab, Gerger, and Werner critically revised the manuscript for important intellectual content. Gerger performed statistical analyses. Gerger supervised the study.
Conflict of interest
The authors declare no conflicts of interest.
Ethical standards
As the study did not involve human subjects, no ethical approval was necessary.