Relative effectiveness can be defined as “the extent to which an intervention does more good than harm compared with one or more alternative interventions under the usual circumstances of healthcare practice” (1). This contrasts with relative efficacy, a comparison “under ideal circumstances” that is usually associated with controlled clinical trials (Reference Eichler, Abadie and Breckenridge2). “Comparative effectiveness” is closely related to relative effectiveness (Reference Towse, Jonsson and McGrath3).
Towse et al. (Reference Towse, Jonsson and McGrath3) propose an analytical framework, which draws upon production function theory, that describes how certain sets of inputs and processes yield specified outcomes. The aim is to systematically identify and quantify the potential determinants of relative effectiveness. This study reports a first assessment of the framework to help understand the contextual differences between countries that could be associated with differences in effectiveness and relative effectiveness. In recognition of ongoing efforts to develop European Union (EU)-level approaches to assessment, our case study focuses on breast cancer in three countries in the EU.
OBJECTIVE
To highlight potential cross-country differences in the relative effectiveness of a new drug, we reviewed studies investigating reasons for differences in health outcomes in breast cancer. We also reviewed relevant national clinical guidelines and health technology assessment (HTA) reports to understand similarities and differences in the management of breast cancer. We show how our analytical framework can help to understand the factors that might drive differences in relative effectiveness across different settings.
METHODS
In a separate study in this issue (Reference Towse, Jonsson and McGrath3), we set out an analytical framework that uses a health production function approach, with health as the output of interest (Reference Jönsson4). Inputs (“factors” or “determinants”) are classified according to the level at which they operate: patient level (i.e., individuals’ clinical or socio-demographic characteristics); provider level; and the level of the healthcare environment or system. The relative effectiveness of a drug is the additional net output (health) achieved by adding a new drug to usual care or substituting it for another treatment. In this study, we use breast cancer as a case study to identify evidence on the factors associated with health outcomes, drawing on findings from England, Spain, and Sweden.
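As a rough illustration of the production function idea, the sketch below treats health output as a function of inputs at the patient, provider, and system levels, and computes relative effectiveness as the difference in output between a new drug and its comparator within the same setting. The multiplicative functional form, the input names, and all numbers are hypothetical assumptions chosen for illustration; they are not taken from the framework paper or from any of the reviewed studies.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Context:
    """Hypothetical inputs at the three framework levels (each scaled 0 to 1)."""
    early_stage_share: float  # patient level: share diagnosed at an early stage
    provider_quality: float   # provider level: e.g. guideline adherence
    system_access: float      # system level: e.g. access to diagnosis and treatment


def health_output(drug_efficacy: float, ctx: Context) -> float:
    """Illustrative production function: health gained per patient.

    Assumption: contextual inputs scale, multiplicatively, how much of the
    trial efficacy is realised in routine practice.
    """
    realised = ctx.early_stage_share * ctx.provider_quality * ctx.system_access
    return drug_efficacy * realised


def relative_effectiveness(new_efficacy: float, old_efficacy: float, ctx: Context) -> float:
    """Additional health output from substituting the new drug for the comparator."""
    return health_output(new_efficacy, ctx) - health_output(old_efficacy, ctx)


# Two made-up settings: same drugs (same efficacy), different contexts.
setting_a = Context(early_stage_share=0.8, provider_quality=0.9, system_access=0.9)
setting_b = Context(early_stage_share=0.5, provider_quality=0.8, system_access=0.75)

re_a = relative_effectiveness(1.0, 0.5, setting_a)
re_b = relative_effectiveness(1.0, 0.5, setting_b)
print(f"setting A: {re_a:.3f}, setting B: {re_b:.3f}")
```

Under these assumed inputs, the same efficacy advantage yields a smaller gain in setting B than in setting A, which is the kind of contextual difference the framework is intended to surface.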
In selecting a disease area for our case study, we considered several potential tracer conditions including cardiovascular disease, Alzheimer's disease, schizophrenia, cancer, osteoporosis, and rheumatoid arthritis. We selected breast cancer because it is a common condition, is a high clinical priority in all three countries, and new drugs enter the market regularly. Outcomes are driven by both drug and nondrug interventions, as well as by the coordination of care across different settings, and the care pathway covers prevention, early detection, diagnosis, surgery, and adjuvant therapy.
The selection of countries was mainly driven by the likelihood that data would be available for most of the factors we wanted to investigate. We, therefore, decided to limit our choice to countries with similar gross domestic product (high income countries), that had good data on usage and cost, and that varied in technology diffusion and health outcomes. Pragmatically, national clinical guidelines would be accessible only if published in English, Spanish, or Swedish, and this factor helped us to finalize our selection. Our three study countries, England, Spain, and Sweden, have published clinical guidelines on breast cancer, which provide an indication of national priorities and inputs that may influence outcomes. Two of the three countries (England and Sweden) have also assessed the cost-effectiveness of (some) breast cancer drugs.
To identify the data that would be needed to populate a health production model for breast cancer, we undertook a review of the literature. We also reviewed national clinical guidelines and HTA reports.
Literature Review
A recent review of studies explored the extent of any variation in relative efficacy and relative effectiveness of medicines used in one or more EU Member States (Reference Mestre-Ferrandiz, Puig-Peiró and Towse5). The review found little empirical evidence on cross-country differences, and no cross-border observational studies to compare effectiveness in routine practice. For the purpose of this article, we, therefore, simplified our approach: focusing on breast cancer, we searched for studies that investigated determinants of health outcomes such as mortality or quality of life in one or more of our countries of interest (Sweden, Spain, the United Kingdom). We included regression analyses and registry studies published between January 2000 and August 2011. The search strategy was designed for Medline based on key search terms agreed by three of the authors (Table 1) and then adapted to run on EMBASE. The Medline strategy is available in online Supplementary Table 1.
Note. Both interventional (experimental) and observational studies were eligible for inclusion. The search strategy used for Medline is available online (Supplementary file 1).
Titles and abstracts from the searches were screened for eligibility by two of the review team (R.P.P., A.M.). To be eligible for inclusion, studies needed to explicitly investigate determinants driving differences in outcomes, either across countries (international comparative studies) or within countries (individual country studies). Potentially eligible studies were identified by two authors (R.P.P., A.M.) and assessed for inclusion by one author (R.P.P.). Figure 1 shows the study selection process. One member of the review team (R.P.P.) extracted the data from each study into a template, providing details of the study design, countries covered by the study, data sources, health outcomes and findings (see online Supplementary Tables 1 and 2). As shown in Table 2, the factors identified were then grouped into the framework categories reflecting the level of influence (individual, provider, and national level) using the template from Table 1 in Towse et al. 2015 (Reference Towse, Jonsson and McGrath3). These data were checked by a second reviewer (A.M.).
Note. Two international studies (Reference Botha, Bray, Sankila and Parkin8; Reference Sant, Allemani and Santaquilani14) and one based in Spain (Reference Vilaprinyo, Rue, Marcos-Gragera and Martinez-Alonso31) discussed the quality and efficacy of care in relation to their findings but none formally tested for it.
Clinical Guidance Review
To identify similarities and differences in recommended care pathways across our study countries, clinical guidelines for the treatment of breast cancer and relevant HTA reports were reviewed. We searched the Web sites of national HTA agencies (England and Wales, Sweden), Ministries of Health (Spain) and Royal Colleges (Spain), and consulted experts (Sweden). Comparative data on screening programs, and treatment recommendations by stage of disease were extracted and tabulated.
RESULTS
Forty-three studies were included in the literature review. Thirteen of these forty-three were international comparative studies that covered at least two of the three countries in our case study (Reference Autier, Boniol and La Vecchia6–Reference Woods, Coleman and Lawrence18). The remaining thirty studies were national, investigating individual countries. Nine studies covered England (Reference Davies, Linklater and Coupland19–Reference Sloggett, Young and Grundy27), four were set in Spain (Reference Cabanes, Vidal and Perez-Gomez28–Reference Vilaprinyo, Rue, Marcos-Gragera and Martinez-Alonso31), and seventeen were set in Sweden (Reference Duffy, Tabar and Chen32–Reference Warwick, Tabar, Vitak and Duffy48).
The review of national guidance (either clinical guidelines or HTA reports) identified five documents on breast cancer care for England and Wales (49–53), three from Spain (54–56), and six from Sweden (Reference Engholm, Ferlay and Christensen57–62). The Cancer Strategy document published by the Spanish Ministry of Health (54) makes no treatment-specific recommendations, so we also reviewed the two Spanish Society of Medical Oncology (SEOM) guidelines (55;56) although these are not “official” guidance. In all countries, guidelines covered the whole disease pathway incorporating early, advanced, and metastatic disease.
Table 2 provides an overview of factors affecting breast cancer outcomes identified from the literature review. It groups them according to the multilevel approach: “individual level,” “provider level,” and “environment and healthcare system level” set out in Towse et al. (Reference Towse, Jonsson and McGrath3) (Table 1). The table lists the studies that either tested for determinants, or commented on them. We discuss the key factors below.
Individual Level Factors
At the individual level, several demographic factors were consistently associated with poorer outcomes in breast cancer patients, including older age, lower socio-economic status, and lifestyle factors (smoking status). Older women (aged 75 and over) had lower survival rates than younger women. Although this is partly explained by stage at diagnosis (Reference Sant, Allemani and Capocaccia13), because older women are more likely to present with late stage disease, a Swedish study found that survival differences persisted and were more pronounced in older women with late stage disease than in clinically comparable (but younger) women. Older women underwent less intensive diagnostic activity, and less aggressive treatment, even after adjusting for comorbidity (Reference Eaker, Dickman, Bergkvist and Holmberg33). Evidence from England and Sweden suggested that women with lower socio-economic status have worse survival, after adjusting for tumor size and age (Reference Davies, Linklater and Coupland19;Reference Bellocco, Karlsson, Tejler and Lambe42) and that better educated women are likely to have a better prognosis (Reference Hussain, Altieri, Sundquist and Hemminki39;Reference Hussain, Lenner, Sundquist and Hemminki40). A Swedish study found that smoking status independently increased the risk of death (after adjusting for age and stage of disease) (Reference Manjer, Andersson and Berglund44). We found no direct evidence on treatment concordance (adherence).
In terms of individuals’ clinical characteristics, there was strong evidence that disease stage at diagnosis is an important, and perhaps the most important, predictor of cross-country differences in 5-year survival. However, stage at diagnosis is not, in itself, an “explanation”; rather, it raises the question of why disease stage differs across countries. Possible reasons include screening intensity, access to diagnosis and treatment, and public awareness (which we consider below). Tumor pathology, in particular, the proportion of women with node negative disease, accounts for some differences in survival (Reference Sant, Allemani and Capocaccia13), and Swedish studies found that genetic (familial) determinants also affect prognosis and survival (Reference Hartman, Lindstrom and Dickman37;Reference Hemminki, Ji, Forsti, Sundquist and Lenner38;Reference Lindstrom, Hall and Hartman43). Women with specific comorbidities may have fewer treatment options, for instance if they are unsuitable for radiotherapy or chemotherapy (Reference Eaker, Dickman, Bergkvist and Holmberg33). However, we found no study that explicitly tested the impact of comorbidity on survival.
Provider Level Factors
There was less evidence on which features of the healthcare system influence survival, and our searches found no cross-country analyses. Studies from England have investigated the role played by access (travel time) and by multidisciplinary teams (MDTs). Travel time to the GP (general practitioner) was correlated with stage at diagnosis, but there was no consistent relationship between travel time to hospital and survival or stage at diagnosis (Reference Jones, Haynes and Sauerzapf20). MDTs improved the process of care but did not significantly improve survival at 1, 3, or 5 years (Reference Morris, Haward, Gilthorpe, Craigs and Forman23;Reference Rachet, Maringe and Nur25). However, if average survival for a breast cancer patient is around 7 to 8 years after diagnosis (Reference Sant, Capocaccia and Coleman15), longer follow-up periods may be needed to detect an effect.
Other studies have considered access to diagnostic facilities and to treatments, and waiting times between symptom onset and treatment. The importance of access to diagnostic facilities is well-recognized, and we discuss this in relation to screening programs (see below). An English study analyzed data from the Northern and Yorkshire Cancer Registry and Information Service (NYCRIS) to compare 3-year survival rates for those diagnosed between 1982 and 1990 with cases diagnosed between 1991 and 1999 (Reference Pisani and Forman24). In all age groups, 3-year survival improved significantly between the two periods. Stage at diagnosis explained all the improvement in those aged over 65, and explained most of the improvement in women aged below 65. Although the uptake of systemic treatment (chemotherapy and hormone treatment) increased substantially over time, systemic treatment had no statistically significant effect in explaining improvements in prognosis in any age group or overall. However, there are several reasons why this “negative” finding for treatment effect needs to be interpreted carefully. First, 3-year survival may be too short a time to robustly assess the impact of systemic therapy on mortality. In addition, data on stage at diagnosis were missing for a large proportion of cases, particularly in the earlier period. This “stage migration” could have led to greater misclassification bias in the first period, which could, therefore, overstate the role of stage in explaining survival improvement. Lastly, the study did not test for an interaction between stage at diagnosis and treatment uptake, so did not isolate the effect of earlier treatment per se. Further details of this study (Reference Pisani and Forman24) are available in online Supplementary Table 1.
Finally, the quality and consistency of data recording is known to vary across countries, and there are differences between countries in the methods and specificity of certifying cause of death (Reference Autier, Boniol and La Vecchia6;Reference Botha, Bray, Sankila and Parkin8). However, a recent analysis found that even “implausibly extreme” assumptions about data errors could not account for all the observed cross-country differences in survival (Reference Woods, Coleman and Lawrence18).
National / Environmental Factors
There are national screening programs in operation in England and Wales (Reference Richards63) and in Sweden (Reference Wilking and Kasteng64). In Spain, screening programs are managed and run on a regional basis. Table 3 summarizes the characteristics of the screening programs in terms of the target population and screening interval, based on the review of clinical guidelines.
Notes: NHSBSP: National Health Service Breast Screening Programme (http://www.cancerscreening.nhs.uk/breastscreen/)
Sources: Botha 2003 (Reference Botha, Bray, Sankila and Parkin8); Ministerio de Sanidad y Politica Social 2010 (54); Wilking 2009 (Reference Wilking and Kasteng64); Autier 2011 (Reference Autier, Boniol, Gavin and Vatten65)
The intensity of screening activity was strongly associated with improved survival, although evidence for an impact on mortality rates was mixed (Reference Autier, Boniol and La Vecchia6;Reference Duffy, Tabar and Chen32). Both national screening programs and opportunistic screening increased the incidence of early stage breast cancer. This improves overall survival rates, reflecting both the effect of earlier treatment and lead time bias. However, countries that have not introduced screening have also seen improvements in survival (Reference Autier, Boniol and La Vecchia6;Reference Botha, Bray, Sankila and Parkin8), suggesting that other factors play a role.
Evidence on the role of national guidelines was sparse, in terms of both the extent of implementation and the effect on outcomes. Our review of national guidance found few differences in recommendations for treatment of breast cancer, but variation in the date of issue and scope of guidance, as well as in its implementation, may be important. A Swedish study investigated regional differences in survival, and found that suboptimal diagnostic activity in one county explained the variation. Services were reorganized in this county: multidisciplinary working was better staffed and co-ordinated, screening and diagnostic activity were quality assured, and treatment recommendations were implemented. When guideline adherence improved in these ways, survival also improved (Reference Eaker, Dickman and Hellstrom34). An evaluation of the effects of the 1995 Calman-Hine report, which introduced national cancer guidelines, found that adherence varied across English regions (Reference Morris, Haward, Gilthorpe, Craigs and Forman23). A study found evidence that care processes had improved as a result of both the Calman-Hine report and the subsequent English Cancer Strategy (2000), but improvements in survival were not statistically significant (Reference Rachet, Maringe and Nur25).
Several international studies found that countries with higher national income, and that spent a greater proportion on healthcare, also had better survival rates (Reference Berrino, De Angelis and Sant7;Reference Sant, Aareleid and Berrino12;Reference Sant, Allemani and Capocaccia13;Reference Sant, Berrino, Capocaccia and Estève16;Reference Woods, Coleman and Lawrence18). This may be due to improved access to care. For example, countries with higher national income may be able to afford better equipped hospitals; the number of in-patient beds and computerized tomography (CT) scanners per million population were found to be positively associated with survival (Reference Sant, Berrino, Capocaccia and Estève16). However, some of this improvement in survival may be an artefact of improved detection methods (e.g., screening programs), which increase the incidence of “overdiagnosed” cancers (see Table 1).
DISCUSSION
Our case study is not a definitive assessment of the validity of our framework, but rather a first attempt to explore how a health production approach can help identify the factors that should be considered in an assessment of the relative effectiveness of a new drug. These factors could potentially be used to optimize effectiveness in routine practice. Engagement from a broad group of stakeholders (including providers) would be crucial to the success of this process, and we set out below the types of challenge they would need to resolve.
Choice of Outcome Measure
Cross-country differences in breast cancer outcomes are well documented (Reference Autier, Boniol and La Vecchia6;Reference Berrino, De Angelis and Sant7;Reference Sant, Allemani and Capocaccia13;Reference Sant, Allemani and Santaquilani14;Reference Wilking and Kasteng64). However, the outcome measure used to assess relative performance across countries can give very different results in terms of ranking. When our three countries are assessed by 5-year survival rates, Sweden is ranked first and the United Kingdom is ranked last (Reference Sant, Allemani and Santaquilani14); but an analysis of mortality trends from 1989 to 2006 ranked Spain first and Sweden last (Reference Autier, Boniol and La Vecchia6). To understand this apparent discrepancy, we need to recognize that survival is a “complex indicator of a country's performance” (Reference Berrino, De Angelis and Sant7). Longer survival may reflect later death and/or earlier diagnosis—and earlier diagnosis may reflect screening intensity. But earlier diagnosis that does not lead to later death is of questionable benefit to patients. Comparisons based on survival may, therefore, be misleading, if differences in survival do not reflect reductions in mortality. A recent international comparison suggested screening did not play a direct part in reductions in mortality (Reference Autier, Boniol, Gavin and Vatten65). Both survival and mortality may need to be considered alongside incidence if valid assessments of prognosis are to be made (Reference Sant, Capocaccia and Coleman15;Reference Autier and Boniol66).
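The distinction between survival and mortality can be made concrete with a small worked example of lead time bias. The sketch below uses an entirely hypothetical cohort and an assumed 2-year lead time: screening moves the date of diagnosis earlier but leaves the age at death unchanged, so 5-year survival rises even though no death is postponed. The figures are invented for illustration and do not come from any of the reviewed studies.

```python
def five_year_survival(patients):
    """Share of patients alive at least 5 years after diagnosis."""
    alive = sum(1 for dx_age, death_age in patients if death_age - dx_age >= 5)
    return alive / len(patients)


# Hypothetical cohort: (age at symptomatic diagnosis, age at death).
cohort = [(60, 62), (62, 66), (55, 62), (70, 74), (65, 73)]

# Assumption: screening detects each cancer 2 years earlier; death is unchanged.
LEAD_TIME_YEARS = 2
screened = [(dx_age - LEAD_TIME_YEARS, death_age) for dx_age, death_age in cohort]

without_screening = five_year_survival(cohort)
with_screening = five_year_survival(screened)

# Ages at death are identical in both scenarios, so mortality is unchanged.
deaths_unchanged = [d for _, d in cohort] == [d for _, d in screened]

print(without_screening, with_screening, deaths_unchanged)
```

In this toy cohort, 5-year survival doubles from 0.4 to 0.8 while every patient dies at the same age, which is why survival comparisons alone can mislead when screening intensity differs between countries.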
Data Limitations
A limitation is that we have only identified factors reported in the literature, and there may be other important drivers that have not been assessed. For example, we found no study that isolated the impact of hormone replacement therapy (HRT) on outcomes. HRT is associated with an increase in the risk of breast cancer (Reference Bergkvist, Bixo and Björkelund67;Reference Prentice68), but only an estimated 3 of 100 breast cancers are related to use of HRT (69). As use of HRT varies and breast cancers induced by HRT may be less aggressive, variations in HRT prescribing across countries are likely to influence international differences in survival rates in a complicated way.
Most of the evidence related to the individual level, which probably reflects data availability—cancer registries include an array of patient characteristics, but comparable information on countries’ healthcare provider systems must be added from external sources. Where access to treatment was assessed, this typically did not take account of dose or duration of treatment. Conversely, we found more evidence on national factors, such as screening programs. Subsequent studies need to further elucidate the factors that may influence breast cancer outcomes, ideally in consultation with clinical experts and possibly drawing on additional (unpublished) data sources such as those documenting differences in resource availability, or spend on breast cancer. They would need to take account of evidence of the impact of genetic variations on both prognosis and choice of therapy.
Causality or Association?
A further shortcoming of our review is that it reports associations between health outcomes and various factors, but it is less clear whether the relationships are causal. This is because most of our studies are retrospective analyses of observational data. The quality of this type of study is heavily dependent upon the number of observations, the underlying data quality (which is rarely reported in journal articles), the functional form of the model, and whether there are confounding factors that are not, and perhaps cannot be, taken into account. To explore causality would require different study designs, such as randomized trials. However, these are not feasible when investigating the impact of national factors. Even if associations are robust, they shed little light on drivers relating to the inputs and activities included in the care given, which will impact on how a treatment is used and what, if anything, it displaces. There may also be interactions and correlations between the factors we identified, both within and between different levels. For instance, national income is likely to be correlated with individuals’ educational level, and individuals’ stage at diagnosis will be linked to system level screening policy. This problem is perhaps more complex for breast cancer than for some other diseases, such as acute conditions, although most chronic diseases are managed through a combination of screening, diagnosis, lifestyle alterations or interventions, and drug treatment.
CONCLUSIONS AND POLICY IMPLICATIONS
Based on our review of studies comparing breast cancer outcomes and of guidelines/HTA reports in three European countries, we believe that the way efficacy translates into relative effectiveness across health systems is likely to be influenced by a range of complex and interrelated factors. These comprise not only the genetic and other biological and behavioral patient factors mentioned by Eichler et al. (Reference Eichler, Abadie and Breckenridge2) (which we term “individual” patient level factors in our model) but also the characteristics of the providers and healthcare environment and system-level factors. For example, the importance of stage at diagnosis raises the question of why stage of disease differs across countries. Arguably, this finding reflects the conclusion of Eichler et al. (Reference Eichler, Abadie and Breckenridge2) that “where there is an apparent large gap between efficacy and effectiveness, one is not looking at a drug problem but at a healthcare delivery problem, and the focus of remedial action should be shifted to improving real life performance.”
Relative effectiveness is a current policy issue in Europe, and this is why our case study is focused here. In principle, the same issues arise in any context where drugs are approved centrally but where there may be significant regional variations in how the drugs are used in practice and, therefore, differences in relative effectiveness. By recognizing that impediments to improving health can arise at several levels, policy makers in any jurisdiction can begin to explore ways to optimize relative effectiveness. Studies that show differences in relative effectiveness between countries, or that identify factors suggesting these exist, provide one way to identify how health system performance can be improved.
Careful consideration of the determinants within our framework may also aid discussions on the extent to which evidence for HTA based decision making can be shared across health systems, and identify the data required for robust comparisons. In some cases, it will be reasonable to expect evidence on relative effectiveness to be transferable; in other cases, it may be possible to anticipate and adjust for expected differences in relative effectiveness between countries, and so use evidence from one country in another. In other cases, however, an understanding of relative effectiveness in a country may generate questions that cannot be answered by existing evidence and that require a bespoke study.
SUPPLEMENTARY MATERIAL
Supplementary Tables 1 and 2 http://dx.doi.org/10.1017/S0266462315000720
CONFLICTS OF INTEREST
Puig-Peiro, M.Sc., reports grants from Pfizer during the conduct of the study and grants from The Association of the British Pharmaceutical Industry outside the submitted work. At the time of writing the report, Puig-Peiro was working at the Office of Health Economics. Her new affiliation is the Catalan Health Service and she has no conflicts of interest. Dr. Mason reports grants from Pfizer (contract with OHE Consulting) during the conduct of the study and grants from Novartis (contract with OHE Consulting) outside the submitted work. Dr. Mestre-Ferrandiz reports grants from Pfizer during the conduct of the study and from The Association of the British Pharmaceutical Industry outside the submitted work. Professor Towse reports grants from Pfizer during the conduct of the study and from The Association of the British Pharmaceutical Industry outside the submitted work. Dr. McGrath reports grants from Pfizer during the conduct of the study and from Pfizer and AstraZeneca outside the submitted work. Professor Jönsson reports personal fees from Pfizer during the conduct of the study.