In 1993, Kozma and colleagues advanced a new framework for outcomes research known as the “ECHO” model (Reference Kozma, Reeder and Schulz1). Using this new framework, the authors cautioned the medical community against relying predominantly on efficacy data generated from clinical trial programs. Rather, the authors suggested that medical decision making should simultaneously consider the economic, clinical, and humanistic impacts of alternative treatment options.
During the same time period, significant advancements were being made in humanistic outcomes research. Due to the contributions of interdisciplinary teams around the world, a proliferation of studies designed to develop and validate patient reported outcome (PRO) measures occurred. These scientifically validated measures provided the conduit for gathering general or disease related patient reports on various aspects of either their disease or treatment. With the availability of patient reported outcomes data, thresholds quantifying a clinically meaningful improvement could be developed. These thresholds are now known as minimal clinically important difference (MCID) thresholds.
Alternative Methods to Derive MCID Thresholds
In recent years, the analytic rigor required to support MCID threshold estimation has increased. The work of numerous interdisciplinary teams, including biostatisticians, epidemiologists, physicians, and others, has culminated in the identification of three distinct strategies for MCID development (Reference Kazis, Anderson and Meenan2–Reference McGlothlin and Lewis13). Table 1 summarizes the techniques used to derive MCID threshold estimates and provides examples of each.
MCID, minimal clinically important difference; PRO, patient reported outcome.
Two quantitative strategies, the distribution-based and anchor-based MCID threshold estimation methods, have been advanced more recently in the scientific literature. Distribution-based MCID estimates report the change in a PRO measure based on the normal variability observed in the data and, in general, assume that a change greater than X times the baseline standard deviation is clinically meaningful (Reference Barnes, Vaidyanathan, Williamson and Lipworth14). The utility of distribution-based methods may be limited because they are based on statistical reasoning only. Also, variations in standard deviations may be observed when PRO data are obtained from heterogeneous patient populations. Because of these limitations, anchor-based threshold estimates are generally perceived as the more rigorous of the two techniques (Reference Barnes, Vaidyanathan, Williamson and Lipworth14). The third strategy involves the development of MCID threshold estimates by technical expert panelists. Generally this method uses clinical experience to derive the estimates instead of patient reported outcome data. When all three types of threshold estimates are available, thresholds developed based upon clinical expert opinions have been recommended as a supplementary strategy to existing anchor- and distribution-based estimates of MCID (Reference Barnes, Vaidyanathan, Williamson and Lipworth14).
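As a rough illustration of the distribution-based approach described above, the following sketch computes a threshold as a multiple of the baseline standard deviation. The multiplier of 0.5, the function name, and the baseline scores are assumptions chosen for illustration only; they are not taken from Barnes et al. or from any of the cited studies.

```python
from statistics import stdev


def distribution_based_mcid(baseline_scores, multiplier=0.5):
    """Return multiplier * sample standard deviation of the baseline scores.

    The multiplier plays the role of "X" in the text; 0.5 is an assumed
    default used here only for illustration.
    """
    if len(baseline_scores) < 2:
        raise ValueError("at least two baseline scores are required")
    return multiplier * stdev(baseline_scores)


# Hypothetical baseline PRO scores (illustrative values, not study data).
baseline = [6.0, 8.0, 10.0, 8.0, 8.0]
threshold = distribution_based_mcid(baseline)
print(f"Distribution-based threshold: {threshold:.2f}")
```

Because the result scales directly with the standard deviation, the same multiplier can yield different thresholds in more heterogeneous patient populations, which is the limitation noted above.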
MCID Estimates and Technology Assessment of SAR Treatment Options: A Case Study
The Total Nasal Symptom Score (TNSS) is a PRO measure required by marketing authorization authorities (e.g., the United States Food and Drug Administration) to assess patient perceptions of the benefits of alternative treatments for seasonal allergic rhinitis (SAR) (15). The reflective Total Nasal Symptom Score (rTNSS) measures the overall effectiveness of a treatment in controlling symptoms over a pre-specified (e.g., 12 hr) period of time, and AM and PM reports are typically provided by patients. Patients report their symptom severity on a scale of zero (no symptoms) to three (severe symptoms) for up to four different symptom types. The four symptom types assessed by the rTNSS are rhinorrhea, nasal congestion, nasal itching, and sneezing.
Treatment effectiveness in clinical trials of SAR treatments that use the TNSS as the primary endpoint can be computed through various scoring mechanisms. In general, however, treatment effectiveness is defined as the change in TNSS score from baseline. With respect to calculating the score at any time point in the study, one technique uses the average of the scores recorded by the patient in the morning and the evening. Another scoring method sums the morning and evening patient reported data.
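The two scoring conventions described above can be sketched as follows. This is a minimal illustration under stated assumptions: the function names, dictionary keys, and patient ratings are hypothetical, and the sketch assumes all four symptoms are rated at each report.

```python
SYMPTOMS = ("rhinorrhea", "nasal_congestion", "nasal_itching", "sneezing")


def tnss(ratings):
    """Sum the four symptom ratings (each 0-3) into a 0-12 score."""
    total = 0
    for symptom in SYMPTOMS:
        rating = ratings[symptom]
        if not 0 <= rating <= 3:
            raise ValueError(f"{symptom} rating must be between 0 and 3")
        total += rating
    return total


def daily_score(am_ratings, pm_ratings, method="average"):
    """Combine AM and PM reports by averaging (0-12) or summing (0-24)."""
    am, pm = tnss(am_ratings), tnss(pm_ratings)
    if method == "average":
        return (am + pm) / 2
    if method == "sum":
        return am + pm
    raise ValueError("method must be 'average' or 'sum'")


def change_from_baseline(baseline_score, follow_up_score):
    """Negative values indicate symptom improvement."""
    return follow_up_score - baseline_score


# Hypothetical patient reports (illustrative values, not study data).
am = {"rhinorrhea": 2, "nasal_congestion": 3, "nasal_itching": 1, "sneezing": 2}
pm = {"rhinorrhea": 1, "nasal_congestion": 2, "nasal_itching": 1, "sneezing": 2}

averaged = daily_score(am, pm, method="average")  # 12-point scale
summed = daily_score(am, pm, method="sum")        # 24-point scale
```

The choice of convention determines the scale of the result (12 points versus 24 points), so scores from studies using different conventions need to be placed on a common scale before they are compared.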
Using patient reported data derived from the TNSS, Barnes and colleagues derived MCID estimates using several analytical techniques. The data sources and statistical methods used by Barnes et al. to generate MCID estimates from patients with SAR are described in more depth elsewhere (Reference Barnes, Vaidyanathan, Williamson and Lipworth14). However, one set of MCID thresholds was derived using the preferred direct anchor-based approach, with estimated MCIDs of 0.28 units (95% confidence interval [CI]: -0.18 to 0.73) and 0.23 units (95% CI: -0.16 to 0.62). This anchor-based estimate is used here to determine whether the efficacy data for four different intranasal steroids, as reported in the approved prescribing information, are of a magnitude that patients would likely perceive as clinically meaningful.
Table 2 provides a summary of the data used to complete the comparison. The data presented focus solely on current representatives of intranasal steroid and antihistamine treatments. A wider complementary analysis representing further treatment options for SAR is described elsewhere (Reference Meltzer, Wallace, Dykewicz and Shneyer16). When the absolute value of the treatment effect is larger than the absolute value of the estimated MCID threshold, the treatment is assumed to provide not only a statistically significant improvement in patients' symptoms but also a clinically meaningful one. Data provided in Table 2 illustrate that all treatments provide both a statistically significant and a clinically meaningful improvement in symptom relief for SAR patients.
a Assessed using a 24-point scale and normalized to a 12-point scale for purposes of comparison.
b Clinically important differences occur when the absolute value of the mean change in rTNSS exceeds the anchor-based MCID threshold estimate.
c When multiple values of treatment effect/benefit were reported in the approved prescribing information, the most conservative (lowest) estimate was included and all others were excluded from the evaluation.
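The decision rule and the table notes above can be sketched as follows. The anchor-based threshold of 0.28 is the larger of the two estimates quoted in the text; the treatment labels and mean changes are hypothetical placeholders, not the values reported in Table 2, and the helper names are assumptions introduced for illustration.

```python
# Anchor-based MCID estimate quoted in the text (larger of the two reported).
ANCHOR_MCID = 0.28


def normalize_to_12_point(change_on_24_point_scale):
    """Rescale a change measured on a 24-point scale to the 12-point scale."""
    return change_on_24_point_scale / 2


def most_conservative(changes):
    """Pick the reported change with the smallest magnitude."""
    return min(changes, key=abs)


def is_clinically_meaningful(mean_change, mcid=ANCHOR_MCID):
    """True when |mean change in rTNSS| exceeds |MCID threshold|."""
    return abs(mean_change) > abs(mcid)


# Hypothetical mean changes from baseline (illustrative only).
hypothetical_changes = {
    "treatment_A": most_conservative([-1.5, -1.2]),  # several reported values
    "treatment_B": normalize_to_12_point(-2.0),      # reported on 24 points
    "treatment_C": -0.2,
}

results = {
    name: is_clinically_meaningful(change)
    for name, change in hypothetical_changes.items()
}
print(results)
```

In this sketch, a treatment is flagged as clinically meaningful only after its effect has been placed on the 12-point scale and the most conservative reported value has been selected, mirroring notes a through c.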
In July 2013, the Agency for Healthcare Research and Quality (AHRQ) released a comparative effectiveness review of SAR treatments (17). Despite the availability of anchor-based estimates in 2010 (Reference Barnes, Vaidyanathan, Williamson and Lipworth14), the AHRQ reviewers used consensus-based technical panel MCID threshold estimates (17). Two panel members considered a 4-point change meaningful and one advisor considered a 2-point change meaningful for the TNSS. The panel ultimately concluded that 30% of the maximum score (i.e., an MCID of 3.6) was an appropriate threshold for SAR technology assessment. The estimates of this panel may have been informed by federal documents guiding the development of drugs for allergic rhinitis and prior research by Bousquet and colleagues (15;Reference Bousquet, Combescure, Klossek, Daurès and Bousquet18). A closer look at prior research reveals that the primary focus was on the assessment of responsiveness.
While responsiveness and MCID threshold estimation may be related, the two concepts should not be treated as interchangeable, because one focuses on individual patient effects and the other compares population means. Irrespective of the underlying rationale for the selection, the 30% (3.6 units) MCID threshold derived from the AHRQ panel substantially exceeds the anchor-based MCID threshold derived through rigorous statistical testing of patient reported rTNSS data. Application of the 3.6 threshold in a manner consistent with the previous evaluation would suggest that none of the evaluated intranasal products provide clinically meaningful benefits in the patient populations for which they have been approved and in which they are routinely used in clinical practice.
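The arithmetic behind the panel threshold, and its effect on the conclusion, can be sketched as follows. The 12-point maximum follows from four symptoms each rated 0 to 3; the mean change of -1.0 is a hypothetical value used only to show how the verdict depends on the threshold, and the variable names are assumptions.

```python
MAX_RTNSS = 12               # four symptoms, each rated 0-3
PANEL_PERCENT = 30           # panel threshold as a percentage of the maximum
ANCHOR_MCIDS = (0.23, 0.28)  # anchor-based estimates quoted in the text

# 30% of the 12-point maximum score.
panel_mcid = MAX_RTNSS * PANEL_PERCENT / 100


def exceeds(mean_change, threshold):
    """True when |mean change| is larger than the threshold."""
    return abs(mean_change) > threshold


# Hypothetical mean change in rTNSS from baseline (illustrative only).
hypothetical_change = -1.0

verdicts = {
    "anchor": exceeds(hypothetical_change, max(ANCHOR_MCIDS)),
    "panel": exceeds(hypothetical_change, panel_mcid),
}

# How many times larger the panel threshold is than the larger anchor estimate.
fold_difference = panel_mcid / max(ANCHOR_MCIDS)
print(panel_mcid, verdicts, round(fold_difference, 1))
```

Under these assumptions, the same hypothetical effect is judged clinically meaningful against the anchor-based threshold but not against the panel threshold, which is more than twelve times larger.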
DISCUSSION
Healthcare decision makers are faced with a difficult task on a daily basis: assessing the quality of evidence for multiple therapies and interpreting the plethora of data to guide reimbursement and coverage determinations that affect patients’ lives, physically and monetarily. Within the United States (US), payers may rely upon data provided within Academy of Managed Care Pharmacy Format for Formulary Submissions (19) and supplemental systematic reviews and additional reports. External to the United States, HTA organizations such as the National Institute for Health and Care Excellence in the United Kingdom and the Institute for Quality and Efficiency in Health Care in Germany provide guidance about whether the value of an individual product may warrant reimbursement (20;21).
Over the past few decades, researchers have sought to obtain high quality patient reported data and quantify MCID thresholds that can be used to provide clinicians and reimbursement authorities with a secondary reference threshold to evaluate the clinical effectiveness of alternative therapies (Reference Hedayat, Want and Xu12). The MCID, first defined by Jaeschke et al., is “the smallest difference in score in the domain of interest which patients would perceive as beneficial and which would mandate, in the absence of troublesome side effects and excessive cost, a change in the patient's management” (Reference Jaeschke, Singer and Guyatt3). Thus, MCIDs may be an important component of medical decision making if populations of patients are to maintain sufficient access to treatment options most likely to provide them with meaningful improvement in symptoms.
Furthermore, the case study above demonstrated that reimbursement decisions may vary based upon the MCID threshold estimation technique used. To mitigate the risk of denying patient access to treatment options that may result in clinically important improvements in their treatment, all available methods should be considered and presented in health technology assessment reports. A critical assessment should discuss differences in obtained results and check whether the applied MCID is sensitive for detecting relevant effects.
Before clinicians can routinely use MCID to guide decisions specific to treatment plans of individual patients, much work is needed by the research community in collaboration with payers and reimbursement authorities. There may be merit in bringing experts from around the globe together to discuss many of the issues identified in this study. For example, this study identified that the TNSS has favorable endorsement from marketing authorization and regulatory bodies for use in pivotal clinical trials. Once a PRO gains this favorable endorsement, should there be an immediate focus on developing MCID threshold estimates in a uniform manner to guide subsequent reimbursement/payer decisions? Also, what entity, if any, should assume responsibility for monitoring ranges of MCID threshold estimates and commissioning action when large variations exist? Finally, what group should be designated to monitor the medical literature and the terms associated with MCID threshold estimation? Encouraging standardization around such terminology may make such data easier to access when they are needed by clinicians or reimbursement authorities.
As with any evaluation, this review and associated case study have certain limitations. First, the statistical analysis supporting the derivation of MCIDs using alternative methods is simply reviewed. Additional detail is provided for the review in the study by Barnes et al. (Reference Barnes, Vaidyanathan, Williamson and Lipworth14). In addition, the therapeutic alternatives in the evaluation are not complete and only a few intranasal treatment options for SAR are considered. The authors make no apology for the simplicity. Quality, transparency and simplicity in decision making should always be a desired goal in medicine. The narrow focus on a select few current treatment options for SAR was deemed adequate to illustrate the methodological points and was motivated by the authors’ awareness that additional research was ongoing to support further therapeutic options for the treatment of SAR (Reference Meltzer, Wallace, Dykewicz and Shneyer16).
CONCLUSION
Medical decision making today is greatly enhanced because of high quality outcomes research. Clinical trials continue around the globe and provide healthcare decision makers with insights on the potential advantages or disadvantages of alternative treatment options for a given disease state. However, sole reliance on this information for the purposes of medical decision making continues to be an area of public debate.
With the proliferation of high quality humanistic outcomes research data, medical decision makers may now have at their disposal high quality evidence to better understand the influence of alternative technologies on patient reported health status. The translation of PRO data into methodologically sound MCID estimates and the consistent application of rigorously developed MCID threshold estimates are important steps in attaining the appropriate balance between clinical and humanistic outcomes. Individuals and entities conducting health technology assessment have a professional and ethical responsibility to use MCID estimates derived from rigorous statistical analysis of population based patient reported outcomes when available.
CONFLICTS OF INTEREST
This study was funded by Meda Pharmaceutical. Dr. Munzel was employed by Meda Pharma GmbH when the study was conducted and has no conflict of interest with regard to this study. Drs. Carroll and Morland are employees of Xcenda, a contractor to Meda at the time of study completion. Dr. Brixner received personal fees from Xcenda during the conduct of this study. She also reports receipt of personal funds and/or grants from Millcreek Outcomes Group, Abbott, Novo Nordisk, and Certara outside of the submitted work. Dr. Meltzer received personal fees from Meda during the conduct of this study. In addition, Dr. Meltzer reports personal fees and grants from Allergan, AstraZeneca, Boehringer Ingelheim, Circassia, Glaxo-SmithKline, Greer, Johnson & Johnson, Merck, Mylan, Regeneron/Sanofi, Sunovion, 3E Therapeutics, Takeda, Teva, and Valeant outside the conduct of the submitted work. Dr. Lipworth reports grants and personal fees from Meda during the conduct of this study, as well as the receipt of personal fees, grants, and/or non-financial support from Dr. Reddy, Cipla, Novartis, Boehringer Ingelheim, Teva, AstraZeneca, Janssen, Roche, and Chiesi outside the conduct of the submitted work.