Introduction
One in five women may experience perinatal mental illness (Gavin et al., 2005). Opportunities to diagnose these conditions may be missed by primary care physicians (Prady et al., 2016; Ford et al., 2017) and, in particular, half of all cases of perinatal depression (prevalence of one in eight women) go undetected (Gavin et al., 2015; Bauer et al., 2016b). Consequences of untreated postnatal depression (PND) can be profound and long-lasting for women and families, with risks of longer-term adverse effects on child development and associated costs (Bauer et al., 2016b), yet there are psychosocial treatments that a growing evidence base suggests are effective (Milgrom and Gemmill, 2014; Morrell et al., 2016).
Health visitors (HVs) are public health nurses, based in community settings such as health centres and family centres across the UK, playing important roles in supporting women during and after pregnancy (Cowley et al., 2007, 2013; Health Education England, 2016). The impact of PoNDER HV training in assessment and delivery of cognitive-behavioural and person-centred approaches (CBA and PCA) in terms of effectiveness and cost-effectiveness for women and children has been reported (Morrell et al., 2009a). PoNDER HV training can reduce the proportion of women at risk of developing PND as indicated by a reduction in score on the Edinburgh Postnatal Depression Scale (EPDS) (Cox et al., 1987), a self-report measure widely used in clinical practice (Hewitt and Gilbody, 2009). Scores on the 10-item EPDS range from 0 to 30, higher scores indicating more depressive symptoms; at-risk women were identified as those scoring 12 or more.
Further analyses of mental health outcomes in the subgroup of women in the PoNDER trial who were at lower risk (EPDS score < 12) at 6 weeks after childbirth suggested that participants in the intervention group had a reduced risk of developing PND, as indicated by reduced EPDS scores 6 months after childbirth (Brugha et al., 2011). While the training intervention appeared to be effective, that analysis did not address the separate question of costs and cost-effectiveness. There is a need to know whether the additional costs of providing this care are worthwhile in relation to the health benefit it produces. Here we examine the cost-effectiveness of this universal preventive approach for lower-risk women.
Method
The main PoNDER cluster-randomised controlled trial is detailed elsewhere (Morrell et al., 2009a, 2009b). The trial randomised 101 general practitioner (GP) practices to either (a) ‘usual health visitor care’ (n = 38 clusters; 1 cluster lost to follow-up); (b) care by HVs trained in assessing women for symptoms of PND and a CBA to address postnatal psychological problems (n = 30 clusters); or (c) care by HVs trained in assessing women for symptoms of PND and a PCA to address postnatal psychological problems (n = 33 clusters). HVs received brief training derived from CBT principles (Appleby et al., 1997) for the CBA arm or from person-centred counselling principles (Holden et al., 1989) for the PCA arm (Morrell et al., 2011). The training was carried out by Masters-level trainers and was equivalent for both the CBA and PCA arms (1 day on clinical assessment skills; 5 days on psychotherapeutic approach; 4 half days of reflective practice/clinical supervision). The training developed intervention HVs’ skills in genuineness and listening so that HVs could talk to the woman about PND, gain her trust, develop an ongoing relationship with her, and be open to being re-contacted if the woman felt that her symptoms were not improving spontaneously. We recognised that the research was undertaken in a population that could be considered a vulnerable group, and so the trial excluded women under 18 and women with pre-existing severe and enduring mental health problems. The trial began in April 2003 and continued over 3 years. Baseline measurements, including EPDS, were taken at 6 weeks postnatally. Women completed study questionnaires at 6, 12, and 18 months postnatally.
In all, 4084 women consented to participate in the study; 3449 returned baseline 6-week postal questionnaires, of whom 595 (17.3%) scored 12 or more on EPDS (‘at-risk’ women). At-risk women in intervention groups (b) and (c) were offered up to 8 weekly psychologically informed sessions with the HV. Brugha et al. (2011) subsequently reported a sub-study of the main trial, focused on lower-risk participants who scored below 12 on the 6-week postnatal EPDS (‘EPDS-negative’) (further information, including trial CONSORT diagram and cluster-randomisation methods, can be found in that publication). The study examined the effectiveness of care from HVs trained in assessment and psychological support in preventing PND in EPDS-negative women 6–18 months later.
Economic evaluation
The economic evaluation examined the cost-effectiveness of PoNDER HV training for PND in the population of lower-risk (EPDS-negative) women examined in the Brugha et al. (2011) sub-study. We also explored whether PoNDER training affected the number of HV visits and whether the impact was similar across women with different levels of depression risk. The economic evaluation design followed technology appraisal guidelines by the National Institute for Clinical Excellence (2004) (NICE) (now National Institute for Health and Care Excellence) and consequently took an NHS and social care perspective.
Costs
The following costs were included: HV training and ongoing clinical supervision to deliver the intervention; HV contacts (number, duration and purpose of visit – whether for PND, mother (excluding PND), baby or any combination of the three); infant immunisations; GP contacts; prescriptions for all conditions; social worker contacts; admissions to Mother and Baby psychiatric units; other mental health contacts, including counsellor, community psychiatric nurse (CPN), community mental health team (CMHT), mental health nurse (MHN), crisis services, psychologist, psychotherapist, psychiatric outpatient and mother and baby psychiatric outpatient.
Cost components (Table 1) other than costs of delivering the intervention were derived from resource use data and nationally representative unit costs applicable at the time the trial started [principally Netten and Curtis (2004)]. Resource use data from 6 weeks to 6 months were collected on a resource use log completed by HVs based on their own and GP records (Morrell et al., 2009b).
a Includes surgery, home and telephone contacts. Unit cost based on the most common type of contact; surgery contact.
b Assuming a 2-hour visit. No information was available on length of visit.
c Prices adjusted using inflation indices given in Netten and Curtis (2004).
d Includes crisis service, psychologist, psychotherapist, psychiatric outpatient and mother and baby psychiatric outpatient contacts. Unit cost based on the most common type of contact; psychiatric outpatient contact.
e Includes counsellor, community psychiatric nurse (CPN), community mental health team (CMHT) and mental health nurse (MHN) contacts. Unit cost based on the most common type of contact; CPN home visit.
f Based on most common drug and dosage for antidepressant prescriptions.
g Calculated as an average of the prescriptions for the eight most common indications.
Note: Means (cluster-adjusted standard errors and confidence intervals).
a Number of baby, mother and PND visits sum to greater than the total number of visits due to some visits being for more than one purpose.
b 5 or fewer contacts in the control and combined intervention groups. Means for both groups round to zero.
c 5 or fewer contacts in the control and combined intervention groups; mean rounds to zero.
d 5 or fewer contacts in the control group and combined intervention groups; mean rounds to zero.
*p < 0.05, **p < 0.01, ***p < 0.001.
The costs of delivering PoNDER training were calculated from trainer fees, travel, backfill HV time and ongoing clinical supervision costs (Morrell et al., 2009b). Mean cost of training per HV was £1398 (annual equivalent £988). This translated to an increase in cost per HV hour of client time from £77 (Netten and Curtis, 2004) to £79: these figures represent unit costs used for HVs in control and intervention groups, respectively.
Outcomes
Data on health-related quality of life were collected using SF-36 at 6 weeks, 6 and 12 months, and generated the preference-based health measure, the SF-6D index (Brazier et al., 2002). Mothers’ quality-adjusted life-year (QALY) gains between 6 weeks and 6 months were calculated from SF-6D utility index scores at those time points by the area-under-the-curve using the trapezoidal method (Manca et al., 2005a). We examined risk-of-depression outcomes at follow-up as a secondary outcome, dichotomising 6-month EPDS scores into 1 = lower-risk (score < 12) and 0 = at-risk (score ⩾ 12).
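The two outcome definitions above can be illustrated with a minimal sketch. The function names, the example utility values and the conversion of 6 weeks and 6 months into fractions of a year are assumptions made for illustration; they are not taken from the trial's analysis code.

```python
def qaly_auc(times_years, utilities):
    """Area under the utility curve by the trapezoidal method, in QALYs."""
    if len(times_years) != len(utilities) or len(times_years) < 2:
        raise ValueError("need at least two matching time/utility points")
    total = 0.0
    for i in range(1, len(times_years)):
        width = times_years[i] - times_years[i - 1]
        if width <= 0:
            raise ValueError("time points must be strictly increasing")
        # Each trapezoid: interval width times the mean of the two utilities.
        total += width * (utilities[i] + utilities[i - 1]) / 2.0
    return total


def lower_risk_indicator(epds_score, cutoff=12):
    """Return 1 for lower-risk (score below cutoff), 0 for at-risk."""
    return 1 if epds_score < cutoff else 0


# Hypothetical participant: utilities at 6 weeks and 6 months
# (time expressed in years; the 52-week year is an assumption).
times = [6 / 52, 6 / 12]
print(qaly_auc(times, [0.80, 0.84]))
print(lower_risk_indicator(9), lower_risk_indicator(12))
```

Because the interval between the two measurements is well under a year, QALY values over this period are necessarily small, which is why between-group differences appear in the third decimal place.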
Analysis
Cluster-adjusted t tests were applied to comparisons of continuous data; cluster-adjusted chi-squared tests were applied to tabulations of categorical data (Davis, 2001; Herrin, 2012). Intra-cluster correlation coefficients (ICCs) were derived from one-way analysis of variance (Ukoumunne, 2002). The conventional 5% level of significance was used throughout.
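As a sketch of how an ICC can be obtained from a one-way analysis of variance, the following uses the common ANOVA estimator with an adjusted mean cluster size for unequal clusters. This is one standard form of the estimator and is assumed here; the exact variant used in the study is not stated in the text. The function name and toy data are hypothetical.

```python
def anova_icc(clusters):
    """One-way ANOVA estimator of the intra-cluster correlation.

    clusters: list of lists, one inner list of observations per cluster.
    The estimate can be negative when within-cluster variation exceeds
    between-cluster variation.
    """
    k = len(clusters)
    sizes = [len(c) for c in clusters]
    n_total = sum(sizes)
    if k < 2 or n_total <= k:
        raise ValueError("need at least two clusters and spare observations")
    grand_mean = sum(sum(c) for c in clusters) / n_total
    means = [sum(c) / len(c) for c in clusters]
    # Between- and within-cluster sums of squares.
    ssb = sum(n * (m - grand_mean) ** 2 for n, m in zip(sizes, means))
    ssw = sum((x - m) ** 2 for c, m in zip(clusters, means) for x in c)
    msb = ssb / (k - 1)
    msw = ssw / (n_total - k)
    # Adjusted mean cluster size for unbalanced designs.
    n0 = (n_total - sum(n * n for n in sizes) / n_total) / (k - 1)
    return (msb - msw) / (msb + (n0 - 1) * msw)


# Toy data: three clusters of unequal size.
print(anova_icc([[0.10, 0.12, 0.11], [0.20, 0.22], [0.15, 0.14, 0.16, 0.15]]))
```

The estimator is not bounded below by zero, which is consistent with negative ICCs being reportable for some outcomes.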
In addition to cost-effectiveness analyses, we explored whether the PoNDER HV training affected the number of HV visits and whether the impact varied by level of depression risk. We addressed the latter question by examining whether numbers of HV visits over the 6-month follow-up differed according to the proximity of participants’ scores to the threshold for ‘lower-risk’ v. ‘at-risk women’. The sample was differentiated into risk sub-groups of ‘very low risk’, ‘subthreshold risk’ and ‘at-risk’ women (EPDS score 0–5; 6–11; and 12 or more, respectively) adopting cut-points used previously (Brugha et al., 2011) for the effectiveness analysis. A two-level negative binomial model of HV visits was fitted to examine impacts of PoNDER training on visits across all lower-risk participants, controlling for experimental group, reason for visit (for mother, baby and/or PND), number of children, history of serious life-events and (for reasons discussed below) cluster size and treatment × cluster size interaction. The model was then extended to include risk sub-group (very low risk v. subthreshold risk) and its interaction with the experimental group.
Analysis of cluster-randomised data must consider correlations between observations within clusters to avoid biased estimates of sampling uncertainty and imprecision in estimating coefficients (Manca et al., 2005b). HVs and GPs working within the general practice and the women registered there comprised a cluster. Clustering effects can arise because a practice's patients might have characteristics in common (e.g. similar reasons for living within the practice area, socio-economic circumstances).
Clustering may affect costs and outcomes differently, and ICCs of costs may be larger than those of outcomes (Gomes et al., 2012a). There may be differential recruitment to and attrition from clusters between trial arms (Adams et al., 2004). Imbalances in cluster size between experimental groups could be related to the outcome being measured; for example, the volume of work undertaken by practitioners may be related to patient outcomes (Panageas et al., 2007; Gomes et al., 2012a).
Because of its cluster-randomised design, the PoNDER trial dataset (all levels of risk of PND) has previously served as the basis for exploring approaches to economic modelling of clustered data (Gomes et al., 2012b). We applied recommended methods (Gomes et al., 2012b) to all lower-risk participants (EPDS score < 12), considering effects of clustering on estimated coefficients and standard errors of costs and outcomes while addressing potential correlations between them (Gomes et al., 2012a). In our base-case analysis, we used a system of equations (seemingly-unrelated regressions, SUR) in which the error terms of the cost and outcome equations are permitted to be correlated. The system yields a coefficient on the treatment allocation term in both equations to enable estimation of cost/outcome differences between groups and the covariance between those coefficients.
To adjust for imbalances in cluster size, we incorporated a treatment × cluster size interaction term into each equation (Gomes et al., 2012b). Cost and outcome (QALY; dichotomous 6-month depression-risk) equations adjusted for potential confounders: mother's age, history of PND, living alone, any history of major life-events, baseline 6-week EPDS score, number of other children in the family and whether mother was economically active. In an analysis of the primary outcome, costs and QALY equations additionally controlled for 6-week (baseline) utility (Manca et al., 2005a). The impact of clustering on estimation precision was accounted for by calculating cluster-robust errors.
Incremental cost-effectiveness ratios (ICERs) and cost-effectiveness acceptability curves (CEACs) were generated from cost and outcome regressions. CEACs show the probability that an intervention is cost-effective at various hypothetical ‘threshold values’ of an outcome. The range of willingness-to-pay values covered by the CEAC included £20 000 per QALY, the lower end of the range of thresholds typically used by NICE to identify which interventions to recommend for implementation (National Institute for Clinical Excellence, 2004; National Institute for Health and Clinical Excellence, 2008; National Institute for Health and Care Excellence, 2013). While women with complete clinical and economic data (complete cases) were considered in the base-case analysis, imputation of 6-month missing data was carried out in sensitivity analyses.
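The structure of the two-equation system described above can be written out as a sketch. The notation is assumed for illustration and does not appear in the source: $C_{ij}$ and $E_{ij}$ are the cost and outcome of woman $i$ in cluster $j$, $T_j$ is the treatment indicator, $n_j$ is cluster size (assumed here to enter as a main effect and to be centred at its mean), and $\mathbf{x}_{ij}$ collects the listed confounders.

```latex
\begin{aligned}
C_{ij} &= \alpha^{C} + \beta^{C} T_j + \gamma^{C} n_j + \delta^{C}\,(T_j \times n_j)
          + \mathbf{x}_{ij}^{\top}\boldsymbol{\theta}^{C} + \varepsilon^{C}_{ij}, \\
E_{ij} &= \alpha^{E} + \beta^{E} T_j + \gamma^{E} n_j + \delta^{E}\,(T_j \times n_j)
          + \mathbf{x}_{ij}^{\top}\boldsymbol{\theta}^{E} + \varepsilon^{E}_{ij}, \\
\operatorname{corr}\!\left(\varepsilon^{C}_{ij}, \varepsilon^{E}_{ij}\right) &= \rho .
\end{aligned}
```

Under the centring assumption, $\beta^{C}$ and $\beta^{E}$ are the incremental cost and incremental effect at the average cluster size, and their estimated covariance carries the cost/outcome correlation into the cost-effectiveness calculations.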
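One way to obtain a CEAC from regression output is through the incremental net monetary benefit, NMB = λ × ΔE − ΔC, where λ is the willingness-to-pay threshold. The sketch below assumes that the estimated cost and effect differences are jointly normal, which is a simplification and not necessarily the approach taken in the study. All input values are hypothetical and chosen only to show the calculation.

```python
import math


def prob_cost_effective(delta_cost, delta_effect, se_cost, se_effect, corr, wtp):
    """P(incremental net monetary benefit > 0) under joint normality."""
    nmb_mean = wtp * delta_effect - delta_cost
    nmb_var = (wtp ** 2) * se_effect ** 2 + se_cost ** 2 \
        - 2.0 * wtp * corr * se_effect * se_cost
    z = nmb_mean / math.sqrt(nmb_var)
    # Standard normal cumulative distribution function via erf.
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def ceac(delta_cost, delta_effect, se_cost, se_effect, corr, thresholds):
    """Probability of cost-effectiveness at each willingness-to-pay value."""
    return [
        (wtp, prob_cost_effective(delta_cost, delta_effect,
                                  se_cost, se_effect, corr, wtp))
        for wtp in thresholds
    ]


# Hypothetical estimates: a cost saving with a small QALY gain.
for wtp, p in ceac(-80.0, 0.002, 30.0, 0.002, 0.0, [0, 10000, 20000, 30000]):
    print(wtp, round(p, 3))
```

When the intervention saves money, the curve starts high at a threshold of zero, because the saving alone makes a positive net benefit likely before any value is placed on health gain.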
Sensitivity analyses
To explore whether primary outcome (QALY) findings were robust to assumptions in important parameters, we varied the definition of threshold delineating groups of lower-risk and at-risk women at 6 weeks, considering a cut-off for being at-risk of 10 or more on EPDS and a cut-off of 13 or more (Songoygard et al., 2012; Morrell et al., 2016). We also performed SUR on a two-stage bootstrapped sample of cost and outcomes data (1000 replications) to address potential issues of non-normality in distributions of costs and outcomes (Gomes et al., 2012a). Bootstrap sampling was stratified by randomised group. The methods are presented as sensitivity analyses because one caveat to using two-stage bootstrapping is lower-than-nominal coverage probability (Gomes et al., 2012a). To examine the impact of missing cases on 6-month results, we created ten complete datasets generated by multilevel multiple-imputation models, and ran them separately for control and intervention groups as previously recommended (Gomes et al., 2013). Models included regressors used in the cost-effectiveness analyses and other baseline factors that predicted missing data on costs and outcomes (feeding method, receipt of benefits, the age of leaving full-time education). Continuous variables were imputed by predictive mean-matching and dichotomous variables were imputed by logistic regression.
Multilevel imputation was implemented by chained equations using the mice (Buuren and Groothuis-Oudshoorn, 2011) and miceadds (Robitzsch et al., 2016) packages in R statistical software (R Core Team, 2016). Results of the SUR from each of the ten complete datasets were combined in Stata using Rubin's rules (Rubin, 1987; StataCorp, 2015).
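Rubin's rules for combining one estimate across imputed datasets can be sketched as follows. The study combined results in Stata; this Python version is an illustration only, and the function name and example inputs are hypothetical.

```python
def pool_rubin(estimates, variances):
    """Pool per-imputation estimates and variances using Rubin's rules.

    Returns the pooled point estimate and its total variance.
    """
    m = len(estimates)
    if m < 2 or m != len(variances):
        raise ValueError("need at least two estimates with matching variances")
    pooled = sum(estimates) / m
    # Within-imputation variance: mean of the per-dataset variances.
    within = sum(variances) / m
    # Between-imputation variance: sample variance of the estimates.
    between = sum((q - pooled) ** 2 for q in estimates) / (m - 1)
    total = within + (1.0 + 1.0 / m) * between
    return pooled, total


# Hypothetical cost differences and squared standard errors
# from three imputed datasets.
print(pool_rubin([-45.0, -50.0, -46.0], [361.0, 380.0, 350.0]))
```

The between-imputation term inflates the variance to reflect uncertainty about the missing values, so pooled confidence intervals are wider than those from any single completed dataset treated as if it were fully observed.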
Results
A total of 2241 lower-risk women (767 control in 37 clusters; 1474 intervention in 63 clusters) completed the EPDS at 6-month follow-up. Data sufficient to compute SF-6D scores at both 6-week and 6-month time-points were available for 2158 participants (736 control in 37 clusters; 1422 intervention in 63 clusters). There were 1459 women with complete economic and SF-6D data at both time-points (417 control in 23 clusters; 1042 intervention in 47 clusters). Participants’ baseline characteristics are summarised in Supplementary Table S1.1 for the full lower-risk sample (N = 2241), the sample with economic data (N = 1459) and the sample without economic data (N = 782). Baseline characteristics of the samples with and without economic data within their experimental groups differed in only one respect: at baseline, 3% (10/350) of control group women without economic data available reported poor baby health over the previous 4 weeks compared with 1% (3/417) of control group women with economic data.
Resource use and costs
Over the 6-month period (Table 2), there were no A&E attendances or admissions to mother and baby psychiatric units, while clinical and community mental health contacts and social services visits were extremely rare (five contacts or fewer in either group). The intervention group had statistically significantly fewer HV visits focused on the mother than the control group (control 3.6 v. intervention 2.0, p ⩽ 0.001), with similar results for PND visits (although contacts for this reason were comparatively low: 0.3 v. 0.2). HV contacts that were focused on the mother differed between control and combined (CBA/PCA) intervention sub-groups (3.6 v. 1.4, p = 0.003). Overall, total HV time spent with the mother/baby was 56 minutes lower in the intervention than control group (p = 0.027). Average time spent by HVs in the CBA subgroup was 62 minutes lower (p = 0.049) than in the control group. Average time spent by HVs in the PCA subgroup was not significantly different from that in the control group (p = 0.156).
Initial bivariate analyses examined the number of visits to women in the ‘very low risk’ and ‘subthreshold risk’ groups. Mean number of visits related to mother's health, and for PND specifically, appeared to rise as EPDS score rose (online Supplementary Table S2.1). This pattern was not seen in visits related to baby health. The number of visits related to the mother's health was significantly lower in the CBA group compared with the control group, and also in the CBA and the PCA groups combined compared with control, for both ‘very low risk’ and ‘subthreshold risk’ women. Visits were somewhat but not significantly lower in the PCA group compared with controls within sub-groups.
Further multivariate analyses examined the impact of the PoNDER HV training on the total number of HV visits (for all purposes). In the lower-risk sample, intervention participants received non-significantly more visits than controls (online Supplementary Table S3.1); results were quite similar for CBA and PCA approaches. Analyses also examined whether the intervention had a differential impact on the number of HV visits according to how near mothers’ EPDS scores were to 12. Intervention participants in both subthreshold risk and very low-risk groups received (non-significantly) more HV visits than controls, similar to the results over the whole lower-risk sample. The interaction term for combined intervention and risk sub-group was not significantly different from zero (p = 0.905); results were similar for CBA and PCA (p = 0.837 and p = 0.647, respectively). The impact of the intervention appears to have been relatively uniform over the whole of the lower-risk sample.
The overall cost of care for women in the intervention group was significantly lower (difference of £72; 95% CI −137 to −8; cluster-adjusted t = 2.246, p = 0.028) than for controls (Table 3).
Note: Means (cluster-adjusted standard errors and confidence intervals).
*p < 0.05.
Outcomes
Control and intervention groups were similar in terms of baseline (6-week) utilities (Table 4); utilities were somewhat higher in the intervention group at 6 months. Mean QALYs were significantly higher in the combined intervention group than in controls (mean difference: 0.004, 95% CI 0.000–0.008, p = 0.0466). There was a 2.8 percentage point difference (95% CI −0.5% to 6.1%) in the proportions remaining at low risk at 6 months between the combined CBA and PCA intervention and control groups.
Note: Means (cluster-adjusted standard errors and confidence intervals).
a Binary variable representing whether the participant was at risk of depression, where EPDS⩾12 is coded as 0 and EPDS<12 is coded as 1. The variable is here treated as continuous, and results are expressed in percentage terms.
*p < 0.05.
Clustering and correlation of costs and outcomes data
ICCs of QALYs (online Supplementary Table S4.1) were negative and larger in the control group than in the combined intervention group; ICCs for costs were higher in the combined intervention group than in the control group. Mean cluster sizes differed between control (18.1) and intervention groups (22.2). Taking Cohen's criterion (Cohen, 1988) as a gauge of effect size, costs were moderately positively correlated with cluster size in the control group (r = 0.43, p ⩽ 0.001) and weakly negatively correlated with cluster size in the CBA group (r = −0.20, p ⩽ 0.001); QALYs were weakly positively correlated with cluster size in the CBA group (r = 0.09, p = 0.030) and weakly negatively correlated with cluster size in the PCA group (r = −0.03, p = 0.005), but were not correlated with cluster size when these intervention groups were combined. These results indicate the need to adjust appropriately for both clustering and correlation within the cost-effectiveness analyses.
Results of the cost-effectiveness analyses
The inclusion of covariates with small amounts of missing data slightly decreased the sample available for analysis to 1446 (with the loss of 7/1035 (0.8%) intervention cases and 6/411 (1.4%) controls).
Outcomes
Mean QALY difference between intervention and control groups, adjusted for covariates, was not statistically significant (Table 5). Results were similar for the intervention subgroups, although gains were slightly greater in the CBA than in the PCA group. For the dichotomised risk of depression outcome, the adjusted difference in the proportions of mothers at low risk at 6 months was very slightly lower than the unadjusted figures for the combined intervention groups and for CBA and PCA separately.
Note: unb'd = unbounded.
a Estimated marginal means, cluster-robust standard errors.
b Estimates from SUR equation for QALY adjusted for mother's age, history of PND, living arrangement (alone or with others), any history of major life events, baseline EPDS score, number of other children in the family, whether the mother was economically active, baseline utility.
c Estimates from SUR equation for costs adjusted for mother's age, history of PND, living arrangement (alone or with others), any history of major life events, baseline EPDS score, number of other children in the family, whether the mother was economically active, baseline utility.
d Rounded to nearest 100.
e Binary variable representing whether the participant was at risk of depression, where EPDS⩾12 is coded as 0 and EPDS<12 is coded as 1. The variable is here treated as continuous, and results are expressed in percentage terms. Estimates from SUR equation for low-risk at 6 month adjusted for mother's age, history of PND, living arrangement (alone or with others), any history of major life events, baseline EPDS score, number of other children in the family, whether the mother was economically active.
*p < 0.05, **p < 0.01, ***p < 0.001.
Costs and cost-effectiveness
Adjusted 6-month costs in the intervention group were £82 lower than in the controls. In the intervention sub-groups, costs in the CBA group were £93 lower and costs in the PCA group £73 lower compared with controls.
The point estimate for the cost of a QALY created by the intervention was negative. A negative ICER may occur when costs are lower and outcomes better in one group (referred to as ‘dominance’). In this case, because outcomes were approximately equivalent between groups while costs were significantly lower in the intervention group, the resulting confidence interval of the ICER was wide and crossed zero, and its upper bound was less than its lower bound (Glick et al., 2007). We can take from the results that the PoNDER intervention is the preferred strategy over usual care, as long as the NHS is willing to pay anywhere from 0 to approximately £66 500 per QALY. The probability of the intervention being cost-effective at £20 000 exceeds 99% (Fig. 1).
When looking at the CEACs for CBA v. control and PCA v. control, there was little difference between them in the probability of being cost-effective (over 99%) over the range of values placed on a QALY between £0 and £20 000 (Fig. 2). CBA had a marginally higher probability of cost-effectiveness, which reflects the slightly lower mean costs of CBA, with similar QALYs gained across all three strategies.
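The reason a negative ICER needs careful interpretation can be shown with a short sketch: the same negative ratio arises in two opposite situations, so the sign of each difference has to be inspected separately. The function name and all numbers are hypothetical.

```python
def classify_increment(delta_cost, delta_effect):
    """Label the cost-effectiveness quadrant of an intervention v. control."""
    if delta_cost <= 0 and delta_effect >= 0:
        # Includes the boundary cases of equal cost or equal effect.
        return "dominant (no more costly, no less effective)"
    if delta_cost >= 0 and delta_effect <= 0:
        return "dominated (no less costly, no more effective)"
    if delta_cost > 0:
        return "more costly and more effective (compare ICER with threshold)"
    return "less costly and less effective (compare ICER with threshold)"


# Two opposite scenarios that share one negative ICER.
saving = (-100.0, 0.002)   # cheaper and better
losing = (100.0, -0.002)   # dearer and worse
for delta_cost, delta_effect in (saving, losing):
    ratio = delta_cost / delta_effect
    print(ratio, classify_increment(delta_cost, delta_effect))
```

This ambiguity is one reason why decision uncertainty is usually summarised with net benefit and acceptability curves in addition to the ratio itself.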
The cost of being in the lower-risk group at 6 months rather than in the at-risk group as a result of the intervention (the ICER) was very low, at about £3500.
Sensitivity analyses
Results of all sensitivity analyses are given in Supplementary Table S5.1.
To examine whether results were robust to violations of the assumption of normally distributed dependent variables, the regressions were applied to data generated by two-stage bootstrapping. There was a 99% probability that the intervention was cost-effective at a willingness-to-pay of £20 000 (online Supplementary file S1 Figure S1.1). Results for CBA and PCA were similar, CBA having a 1 percentage point higher probability of cost-effectiveness at the £20 000 threshold (99% v. 98%).
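The resampling scheme behind a two-stage bootstrap, stratified by randomised group, can be sketched as below: clusters are drawn with replacement within each arm, and individuals are then drawn with replacement within each selected cluster. This is a simplified illustration that omits any shrinkage correction that published versions of the method may apply; the data layout and names are hypothetical.

```python
import random


def two_stage_bootstrap(arms, rng):
    """Draw one two-stage bootstrap replicate, stratified by trial arm.

    arms: dict mapping arm name to a list of clusters, where each
    cluster is a list of individual records.
    """
    replicate = {}
    for arm, clusters in arms.items():
        # Stage 1: resample whole clusters within the arm.
        drawn = [rng.choice(clusters) for _ in clusters]
        # Stage 2: resample individuals within each drawn cluster.
        replicate[arm] = [[rng.choice(c) for _ in c] for c in drawn]
    return replicate


# Toy records are (cost, QALY) pairs; values are made up.
data = {
    "control": [[(300, 0.31), (280, 0.30)], [(350, 0.29), (310, 0.32), (290, 0.30)]],
    "intervention": [[(220, 0.31), (240, 0.32)], [(260, 0.30)]],
}
rng = random.Random(2024)
replicates = [two_stage_bootstrap(data, rng) for _ in range(1000)]
print(len(replicates), len(replicates[0]["control"]))
```

Each replicate keeps the number of clusters per arm fixed, so the regression system can be re-estimated on every replicate and the resulting distribution of cost and outcome differences used to draw the CEAC.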
We varied the cut-off score for the lower-risk sample (considering both a lower cut-off for being at-risk of 10 or more on the EPDS and a higher cut-off of 13 or more on the EPDS) to examine whether the results were robust to changes in the size and composition of the lower-risk group. The estimates of cost and QALY differences were fairly similar to those in the main analyses. With the lower cut-off for higher-risk status, adjusted QALY difference estimates were 0.001 (95% CI −0.002 to 0.004) lower than in the main analyses (0.001 v. 0.002). With the higher cut-off, estimates were the same as in the main analyses. Cost differences were again similar to the main analyses for the lower cut-off and £10 lower than in the main analyses with the higher cut-off. CEACs were similar to those in the main analyses (online Supplementary file S1 Figure S1.2).
Running the analyses on imputed data also produced similar results to the main analyses in terms of estimated mean costs and QALYs at 6 months. The adjusted QALY difference was very slightly higher than in the main analysis and the confidence interval did not cross zero; the adjusted cost difference of −£47 (95% CI −85 to −10) was £35 smaller in magnitude than in the main analysis, with a narrower confidence interval. The ICER was considerably reduced in magnitude (from −50 800 to −16 700), with a confidence interval that was negative, suggesting that, taking sampling uncertainty into account as well as the point estimate, the combined intervention was dominant. CEACs were very similar to those in the main analyses (online Supplementary file S1 Figure S1.3).
Discussion
Our analyses examined the impacts of PoNDER HV training package on 6-month costs and QALYs for mothers at lower risk of PND. The intervention for this population of mothers was not only cost-effective at the NICE threshold of £20 000 per QALY gained, but also cost-reducing. This finding is of particular interest given that the eight psychologically-informed HV sessions were primarily targeted at women with an EPDS score of 12 or more (Morrell et al., Reference Morrell, Warner, Slade, Dixon, Walters, Paley and Brugha2009b; Brugha et al., Reference Brugha, Morrell, Slade and Walters2011) and not at those ‘very low risk’ and ‘subthreshold risk’ women included in the analyses reported here. The choice of approach (CBA or PCA) made relatively little difference to cost or to the probability of cost-effectiveness at the £20 000 threshold, suggesting that training in CBA and PCA approaches had more or less equivalent economic consequences. The impact of PoNDER HV training did not appear to be confined to those women closer to the EPDS threshold score of 12, as evidenced by analyses of a number of HV visits across risk sub-groups and by sensitivity analyses varying the threshold for lower-risk.
In relation to the secondary outcome, the 2.3% mean difference between intervention and control groups in proportions of lower-risk women at 6 months was not statistically significant. In analyses elsewhere of a larger sample of lower-risk women (N = 2241) than the 1446 observations available for the economic analysis, the percentage of women with an EPDS score of 12 or more 6-months postnatally was 10.8% of control women and 7.7% of intervention women, a difference of 3.1% (95% CI 0.4–5.9%) (Morrell et al., Reference Morrell, Slade, Warner, Paley, Dixon, Walters, Brugha, Barkham, Parry and Nicholl2009a; Brugha et al., Reference Brugha, Morrell, Slade and Walters2011). Nonetheless, in the economic evaluation sample, the probability of cost-effectiveness was very high over a range of willingness-to-pay thresholds below £14 000 for being lower-risk rather than at-risk at 6 months.
These findings provide strong evidence that the training programme was cost-effective in preventing depression 6 months after childbirth in mothers at lower risk of depression even though psychological intervention sessions were not targeted on them (Brugha et al., Reference Brugha, Morrell, Slade and Walters2011). This intervention reduced the risk of depression and paid for itself over 6 months.
We also found that the number of visits related to the mother's health was significantly lower in the combined intervention group compared with controls. After the training, the intervention group HVs were more confident in assessing risk and reassessing women than the control group HVs (Morrell et al., Reference Morrell, Warner, Slade, Dixon, Walters, Paley and Brugha2009b). The HVs offered face-to-face psychological support sessions to women who were indicated as depressed according to their clinical assessment and face-to-face EPDS score. The HVs could distinguish true depression from extreme tiredness and labile mood. Therefore, the intervention group HVs appropriately responded to all levels of risk according to the combination of the EPDS score and enhanced clinical assessment skills gained during the training. This may have made them more efficient than control group HVs in directing their visits to women at greater risk rather than to those at less risk. In contrast, control group HVs had not been trained to identify which women were truly depressed and therefore may have visited women who had extreme tiredness and some symptoms of depression but were not depressed.
Strengths and limitations
Strengths of the study include a large number of observations available for analysis, analytic methods appropriate to clustered data and sensitivity analyses of key assumptions in those analyses. The purpose of cluster allocation in training interventions is to protect against contamination of the untrained control group; disadvantages of this approach include potential selection biases during recruitment, increased complexity of design and increased sample sizes required compared with individual randomisation (Klar, Reference Klar2015). Cluster allocation also ensured that health outcomes related to all HVs within each cluster, thereby strengthening the generalisability of results. Our analyses addressed imbalances between trial arms in terms of numbers and size of clusters, an issue not considered in the original analyses (Morrell et al., Reference Morrell, Warner, Slade, Dixon, Walters, Paley and Brugha2009b). Other potential challenges to the robustness of results (impacts of missing cost data and skewness of dependent variables) were explored in sensitivity analyses; they did not make any difference to the conclusions drawn from the main analyses.
There were several limitations to this study. Data available for the cost-effectiveness analysis were less complete than those available on health outcomes; however, sensitivity analyses drawing on multiply-imputed data confirmed the findings, indicating if anything even stronger evidence of the dominance of the intervention over usual care.
Cost measures were confined to health and social care services, a limitation when looking at women at lower risk of depression who have less need for support. We analysed costs only over the 6 months after birth, so we did not factor in the risk of longer-term adverse effects on child development and associated costs (Bauer et al., Reference Bauer, Knapp and Parsonage2016b) or employment-related productivity losses associated with depression (Thomas and Morris, Reference Thomas and Morris2003).
Present-day unit costs may differ from those used here (2003/04 prices), which reflect the organisation of care and mix of services at the time of the trial. Organisation of care, outcomes and unit costs were undoubtedly interrelated and cannot now be easily disentangled. The organisation of health visiting has changed over time: HVs are registered nurses or midwives who can now gain additional qualifications to become specialist community public health nurses (Cowley et al., Reference Cowley, Whittaker, Grigulis, Malone, Donetto, Wood, Morrow and Maben2013; Health Education England, 2016). There has been a national programme to improve access to child health services, increasing numbers of HVs, and to transfer the commissioning of health visiting to local government (Department of Health, 2015). Such changes could affect unit costs. The cost of an hour of health visiting (client-related work) in 2014/15 was £76 (Curtis and Burns, Reference Curtis and Burns2015), whereas the hourly cost used here would be £101 if uprated to 2014/15 prices (Curtis and Burns, Reference Curtis and Burns2015). Data sources and methods used to estimate HV unit costs have changed since 2004 (Netten and Curtis, Reference Netten and Curtis2004), as no data on HV time-use were available for later calculations (e.g. for face-to-face/indirect contacts and travel) (Curtis and Burns, Reference Curtis and Burns2015).
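The size of the gap between the two hourly figures can be made explicit with simple arithmetic. In the sketch below, the £101 and £76 values are taken from the text above; the `uprate` helper and the index values passed to it are hypothetical, since the original 2003/04 hourly cost and the inflation index used are not stated here.

```python
def uprate(cost, index_base, index_target):
    """Uprate a cost from a base price year to a target year using a price index."""
    return cost * index_target / index_base


# Figures quoted in the text (pounds per hour of client-related work, 2014/15 prices).
uprated_trial_cost = 101.0      # trial unit cost uprated to 2014/15
published_2014_15_cost = 76.0   # unit cost published for 2014/15

difference = uprated_trial_cost - published_2014_15_cost
ratio = published_2014_15_cost / uprated_trial_cost

# Hypothetical illustration of uprating: the cost and index values are invented.
illustrative = uprate(50.0, index_base=100.0, index_target=130.0)
```

On these figures the published 2014/15 cost is £25 per hour lower than the uprated trial cost, or roughly three-quarters of it, so the difference reflects changes in costing data and methods as well as general price inflation.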
Our cost-effectiveness analysis adopted methods consistent with good practice guidelines (Ramsey et al., Reference Ramsey, Willke, Glick, Reed, Augustovski, Jonsson, Briggs and Sullivan2015) and employed methods relevant to clustered data, but the choice of analytical model can influence results (Mantopoulos et al., Reference Mantopoulos, Mitchell, Welton, McManus and Andronis2016) and so is a source of methodological uncertainty. However, given the strength of our conclusions, such uncertainties are unlikely to be a concern.
In the base-case analysis, the mean utility scores of intervention group mothers were not significantly greater than for control group mothers. We might ask whether measurement of changes in utility in mothers at lower risk for depression can be as accurate as in an at-risk population. However, the SF-6D is sensitive to differences in EPDS scores, discriminating well between different dichotomised levels of risk for PND (Petrou et al., Reference Petrou, Morrell and Spiby2009).
Implications for policy and practice
PoNDER HV training offers the benefit of a service delivered in routine postnatal care with an assessment by HVs with whom women are already in contact: it is a low-cost universal preventive intervention. It reduces the risk of developing PND symptoms, reduces health and social care service use over 6 months and is cost-effective. Recent reviews on the cost-effectiveness of preventive and early interventions (Bauer et al., Reference Bauer, Knapp and Adelaja2016a; Morrell et al., Reference Morrell, Sutcliffe, Booth, Stevens, Scope, Stevenson, Harvey, Bessey, Cantrell, Dennis, Ren, Ragonesi, Barkham, Churchill, Henshaw, Newstead, Slade, Spiby and Stewart-Brown2016) suggest that to date there are no other reports of an economic evaluation alongside a clinical trial to prevent perinatal mental health problems. Decision models drawing on the economic evidence have found that some interventions that address mild or subthreshold symptoms (including PCA and CBT-based universal approaches) are likely to be cost-effective and in some cases also lead to cost savings (Morrell et al., Reference Morrell, Sutcliffe, Booth, Stevens, Scope, Stevenson, Harvey, Bessey, Cantrell, Dennis, Ren, Ragonesi, Barkham, Churchill, Henshaw, Newstead, Slade, Spiby and Stewart-Brown2016). There is also evidence that assessment by trained professionals such as HVs can lead to better outcomes for postnatal women including reduced risk of depression (Bauer et al., Reference Bauer, Knapp and Parsonage2016b; O'Connor et al., Reference O'Connor, Rossom, Henninger, Groom and Burda2016).
Two major global challenges in relation to mental illness are the ‘treatment gap’ (Kohn et al., Reference Kohn, Saxena, Levav and Saraceno2004) and the ‘prevention gap’ (Jorm et al., Reference Jorm, Patten, Brugha and Mojtabai2017). Rates of undiagnosed and untreated PND are particularly high (Bijl et al., Reference Bijl, De Graaf, Hiripi, Kessler, Kohn, Offord, Ustun, Vicente, Vollebergh, Walters and Wittchen2003), yet many women with perinatal depression do not take up screening (Reay et al., Reference Reay, Matthey, Ellwood and Scott2011). One implication of our study is that there need not be a ‘prevention gap’: women at lower risk of depression would benefit from support by HVs additionally trained in assessment and psychological support. A universal prevention programme of this kind would come at no extra cost to the healthcare system; indeed it would be cost-reducing. There are potential implications for women's perception of available support should they need it (Henderson et al., Reference Henderson, Byrne and Duncan-Jones1981; Brugha et al., Reference Brugha, Sharp, Cooper, Weisender, Britto, Shinkwin, Sherrif and Kirwan1998): women with perinatal depression can be fearful of accessing mental health services (Slade et al., Reference Slade, Morrell, Rigby, Slade, Morrell, Rigby and Ricci2010), for instance worrying that their children will be taken into care (Dolman et al., Reference Dolman, Jones and Howard2013; Megnin-Viggars et al., Reference Megnin-Viggars, Symington, Howard and Pilling2015).
Conclusion
Our analyses confirm that PoNDER HV training in assessment for symptoms of PND plus the skills to provide a psychologically informed intervention (CBA or PCA) is cost-effective, even when additional psychological care is not indicated. This provides support for further investigation of the merits of a universal service that includes extra HV training in clinical assessment and the ability to offer psychological support if indicated.
Supplementary material
The supplementary material for this article can be found at https://doi.org/10.1017/S0033291718001940
Clinical trial registration number
ISRCTN92195776 (www.controlled-trials.com/ISRCTN92195776)
Financial support
The PoNDER trial was funded by NHS Health Technology Assessment, England; CH and MK inputs to the design and economic analysis reported in this paper were funded by an NIHR Senior Investigator award to MK.
Conflict of interest
MK and CH reported grants from NIHR during the conduct of the study. The authors declare no conflicts of interest.
Ethics committee approval
Trent multicentre research ethics committee.