1 Introduction
Imagine a patient who suffers from lung disease. She suffers shortness of breath only during heavy physical activity, such as jogging for three blocks. On a scale of 0 to 100, what is her quality of life like? And how does her quality of life compare to that of a more severely ill patient, someone who suffers shortness of breath even in a resting state?
A respondent in a health survey may find it extremely difficult to come up with a rating for a health description like this. Surely the first person is much healthier than the second, but how much healthier? And how different would their quality of life be? 20 points? 50? 80? The specific numbers may seem quite arbitrary.
In order to rate health conditions, survey respondents must not only evaluate how good or bad a condition is, they must then decide how to translate that evaluation into a specific value on an unfamiliar rating scale. Because such tasks are subject to individual interpretation, the specific values assigned to a given health state may depend on who is doing the rating and the circumstances of the rating task, leaving much confusion for researchers and policy makers trying to make sense of the results.
1.1 Personal perspective in health ratings
The uncertainty of health ratings is evident in the differences often observed between patients’ and non-patients’ ratings of health conditions (Boyd et al., 1990; Brickman et al., 1978; Hurst et al., 1994; Riis et al., 2005; Sackett & Torrance, 1978; Schulz & Decker, 1985; Sieff et al., 1999; Smith et al., in press). Patients typically rate their condition higher than non-patients do, so explanations for the discrepancy often focus either on patients overvaluing their health condition or on non-patients undervaluing it.
However, the discrepancy between patients’ and non-patients’ ratings may actually reflect more complex perspective differences than a straightforward under- or over-valuing of health conditions by either group. Kahneman and Tversky’s (1979) prospect theory suggests that an individual’s reference point is critical in determining how he or she evaluates a given state. As gains or losses become more distant from the status quo, they have a diminishing effect on utility. In the case of health, small changes in health should produce a relatively steep change in quality of life (QoL), with proportionally smaller impact from larger changes in health.
Because patients and non-patients have a different status-quo reference point, they should have different perceptions of the same health condition. For a patient suffering from a moderately severe case of lung disease, a milder case of the same disease would represent a gain in health generating a steep improvement in QoL, whereas a severe case of lung disease would represent a loss in health with a steep cost in QoL. By contrast, for a person in full health with no lung disease, both mild and severe cases of lung disease would represent a loss in health. Because increasing losses have a diminishing impact, the mild case would have a proportionally larger cost in QoL than the more severe case.
As Figure 1 illustrates, the gain and loss framing and the diminishing-return characteristic of prospect theory predict that patients may actually give worse ratings to severe conditions than non-patients do, and that patients should perceive a greater QoL difference between mild and severe health conditions. If so, it may be too simplistic to say that patients overvalue, or non-patients undervalue, the health condition.
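The reference-point argument can be made concrete with a small numerical sketch. The code below is only an illustration under stated assumptions: the symmetric power value function, the curvature parameter `alpha = 0.5`, and the health levels in `HEALTH` are all hypothetical choices, not quantities estimated in this study. Whether the predicted ordering holds depends on how strongly sensitivity diminishes.

```python
import math

def value(change, alpha=0.5):
    """Subjective value of a change from the reference point.

    A symmetric power function with diminishing sensitivity (an assumed
    form): steep near zero, flatter for changes farther from zero.
    """
    return math.copysign(abs(change) ** alpha, change)

def perceived_gap(reference, mild, severe, alpha=0.5):
    """Perceived QoL distance between a mild and a severe state."""
    return value(mild - reference, alpha) - value(severe - reference, alpha)

# Hypothetical health levels on an arbitrary 0-100 scale (not study data).
HEALTH = {"full": 100, "mild": 80, "moderate": 50, "severe": 20}

# Healthy rater: both states are losses relative to full health.
healthy_gap = perceived_gap(HEALTH["full"], HEALTH["mild"], HEALTH["severe"])
# Patient rater: mild is a gain and severe is a loss relative to moderate disease.
patient_gap = perceived_gap(HEALTH["moderate"], HEALTH["mild"], HEALTH["severe"])

print(f"healthy rater gap: {healthy_gap:.2f}")
print(f"patient rater gap: {patient_gap:.2f}")
```

Under these assumed values, the rater whose reference point lies between the two states perceives a wider gap than the rater for whom both states are losses, which is the qualitative pattern described above.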
1.2 Context in health ratings
Another issue that may complicate interpretation of health state evaluations is that ratings may depend on the task context. When rating a single item in isolation, with no context about how it compares to alternatives, respondents tend to give noncommittal ratings somewhere in the middle of the scale, arguably to leave room on either side for unknown future items (Haubensak, 1992). However, when multiple items are rated, respondents tend to spread the items somewhat evenly across the rating scale (Parducci, 1963), essentially using the items themselves to impose meaning onto the rating scale. These strategies suggest that people may be attending more to the relative position of the items than to the specific values associated with the scale. The evaluability hypothesis (Hsee et al., 1999) suggests that respondents may draw heavily on such inter-item comparisons, particularly when the relevant attributes for judgment are unfamiliar or difficult to evaluate.
A rating task that presents multiple items simultaneously allows respondents to take relative positioning into account when assigning values to each item. Rather than dropping items somewhere in the middle for lack of more information, respondents can use the relative comparison between items to decide how to place the items on the scale.
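The two rating strategies described above can be caricatured in a few lines of code. This is a deliberately idealized toy, not the model of Haubensak (1992) or Parducci (1963): the function names, the exact midpoint placement, and the perfectly even spacing are assumptions made for illustration only.

```python
def isolated_rating(scale_min=0, scale_max=100):
    """Toy rule: an item rated alone is placed at the scale midpoint."""
    return (scale_min + scale_max) / 2

def spread_ratings(items_best_to_worst, scale_min=0, scale_max=100):
    """Toy rule: ranked items are spread evenly over the scale.

    Each item is placed at the center of its own equal-width band,
    with the best item in the highest band.
    """
    n = len(items_best_to_worst)
    step = (scale_max - scale_min) / n
    return {
        item: scale_max - step * (i + 0.5)
        for i, item in enumerate(items_best_to_worst)
    }

scenarios = ["A", "B", "C", "D", "E"]  # mildest to most severe
context = spread_ratings(scenarios)

context_gap = context["A"] - context["E"]
no_context_gap = isolated_rating() - isolated_rating()

print(context)
print(f"gap with context: {context_gap}, gap in isolation: {no_context_gap}")
```

Real respondents will not follow either rule exactly, but the toy captures the expected direction: joint presentation should pull the mildest and most severe items apart, while isolated ratings should cluster near the center.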
1.3 Testing for perspective and context effects
This study looks at how patients and non-patients rate descriptions of health conditions that differ in severity. We asked lung disease patients and healthy non-patients to evaluate the quality of life (QoL) for several scenarios describing different severity levels in lung disease, ranging from mild to severe. Based on prospect theory, we predicted that patients’ QoL ratings should not be uniformly higher than non-patients’ ratings for all of the lung disease scenarios. Rather, we predicted that, because most patients’ status quo position lies between the mildest and most severe scenarios, they should perceive a wide distinction between these two scenarios. Because non-patients view both scenarios as a loss, they should perceive a much smaller gap between them. The difference in ratings between the mild and severe scenarios should therefore be larger for the patients than for the non-patients.
In addition, this study looks at the effect of multiple-item context on both patients’ and non-patients’ ratings. Some of our participants rated only a single lung disease scenario in isolation, a condition we called the “No Context” condition because no information was provided about the relative severity of the scenario compared to other possible cases. Other participants rated multiple scenarios presented together, each describing a different level of severity. We term this the “Context” condition because the task places each scenario within a broader context that conveys the severity of the scenario relative to other cases.
We predicted that items rated in the No Context condition should be grouped closer to the center of the rating scale, with relatively small differences between the mild and severe scenarios. By contrast, items rated in the Context condition should receive more distinct ratings, with a greater difference between mild and severe conditions. We also predicted a greater effect of the rating context for non-patients than for patients. By virtue of their own experience, patients should bring some implicit context to the task that is largely unavailable to non-patients. Patients are more likely than non-patients to know something about the possible range of severity. Even when severity context is not provided explicitly by the task, we anticipated that patients would be able to draw on that information and make those comparisons on their own, attenuating the effect of the explicit information provided in the Context condition.
2 Method
2.1 Participants
Lung disease patients. Patient participants were recruited from a list of 310 potential participants who met eligibility criteria based on administrative records of the University of Pennsylvania Health System. Eligible participants had received a diagnosis of chronic bronchitis or emphysema (as designated by the ICD-9 codes of 491*, 492*, or 496*) and had been seen more than once in a pulmonary clinic between January 1, 2001 and January 1, 2002. Potential participants received the survey in the mail with a cover letter describing the purpose of the study. No financial incentive was offered. If patients did not return the survey within 3 weeks, they were sent another copy of the survey. Of the 310 lung disease patients identified as potential participants, 10 were deceased, 11 could not be reached due to incorrect addresses, and 2 stated that they did not have lung disease. Excluding these patients, the response rate was 55% (N = 159).
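The reported response rate can be checked directly from the counts given above. The variable names in this short sketch are illustrative; the numbers are those stated in the text.

```python
identified = 310        # potential participants on the administrative list
deceased = 10
bad_address = 11
no_lung_disease = 2
respondents = 159

# Patients who could actually have responded.
eligible = identified - (deceased + bad_address + no_lung_disease)
response_rate = respondents / eligible

print(f"eligible = {eligible}, response rate = {response_rate:.1%}")
```

With 287 reachable, eligible patients, 159 returned surveys correspond to the 55% figure reported.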
Participants ranged in age from 23 to 90 years (M = 67.5, SD = 11.3). Most participants were Caucasian (74%) or African American (23%), with slightly more females (54%) than males. Years of education ranged from 8 to 21 (M = 13.6, SD = 3.1). Sixty-five percent of participants indicated that they had emphysema, 17% had chronic bronchitis, and 29% had asthma. Patients reported their own QoL as 56.9, on average (SD = 22.8). In comparing their own health to our five lung scenarios (Appendix 1), 49.6% rated their own health as better than the middle scenario, Scenario C, and 50.4% rated their own health as being as bad as or worse than this scenario. Only 11% described their health as “excellent” or “very good,” while 38% described it as “good,” 34% described it as “fair,” and 16% described it as “poor.” None of these self-rated health measures was significantly related to the outcome variables of interest, the QoL ratings for the mild or severe lung disease scenario.
Healthy participants. Healthy participants were recruited from a pool of prospective jurors at the Philadelphia County Courthouse. In Philadelphia County, prospective jurors are selected from voter registration and driver’s license records. Surveys were distributed to interested jurors after announcing to all prospective jurors that those who filled out a survey would receive a candy bar.
Among the prospective jurors, 240 volunteers completed the survey. Participants were asked in the survey whether they had any personal experience with lung disease, and only those who indicated no such experience (N = 196) were included for analysis in this study. Among these, participants ranged in age from 18 to 83 years (M = 39.9, SD = 13.1), and were predominantly Caucasian (50%) or African American (43%), with more females (69%) than males. Years of education ranged from 9 to 21 (M = 14.4, SD = 2.5).
The patient and non-patient samples were significantly different on several demographic dimensions. The non-patient group was significantly younger and more educated than the patient group and included significantly more women, more African Americans, and fewer Caucasians than the patient group. However, of these variables only one, age, was significantly related to one of the outcome variables of interest, QoL for the mild scenario. The pattern of results was unchanged when these demographic variables were included as covariates in analyses comparing patients and non-patients.
2.2 Survey materials and procedures
Survey materials included scenarios describing lung conditions with different levels of severity (see Appendix 1 for all scenarios). Each lung condition scenario described the level of activity that would cause a person with that condition to become short of breath. For example, the scenario for the most severe lung condition stated, “This person has a lung condition that causes him to become short of breath even when in a resting state. In other words, he is short of breath just sitting in a chair. Occasionally, his shortness of breath interferes with his sleep.” Participants were asked to provide QoL estimates on a scale from 0 (as bad as death) to 100 (perfect health) for one or more lung disease scenarios.
Participants were randomly assigned to either the Context condition or the No Context condition. In the Context condition, participants read and rated five different scenarios, presented in order from least severe to most severe. In the No Context condition, participants read and rated only one scenario, either the least severe (shortness of breath only after extreme exertion) or the most severe (shortness of breath in a resting state), and were provided with no information about other possible scenarios or the relative severity of the condition. Participants in the No Context condition were randomly assigned to either the mild or the severe survey version. In both conditions, participants first received instructions for the task along with the one or five scenarios to read over, and then received the scenario(s) a second time to rate.
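The two-stage assignment described above could be sketched as follows. This is a hypothetical reconstruction for illustration: the function name, the seed, and the equal assignment probabilities are assumptions, since the allocation ratios used in the study are not reported here.

```python
import random

ALL_SCENARIOS = ["A", "B", "C", "D", "E"]  # least to most severe

def assign_condition(rng):
    """Return (condition, scenarios_to_rate) for one participant.

    Assumes a 50/50 split between conditions and, within No Context,
    a 50/50 split between the mild (A) and severe (E) versions.
    """
    if rng.random() < 0.5:
        # Context: all five scenarios, in order of increasing severity.
        return "Context", list(ALL_SCENARIOS)
    # No Context: a single scenario, either the mildest or the most severe.
    return "No Context", [rng.choice(["A", "E"])]

rng = random.Random(2024)  # fixed seed only so the sketch is repeatable
assignments = [assign_condition(rng) for _ in range(10)]

for condition, scenarios in assignments:
    print(condition, scenarios)
```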
Patient participants also received several items addressing their own health, including: 1) Current QoL: patients rated their own QoL using the same 0 to 100 scale used for scenario ratings; 2) Current lung disease description: patients saw the same five lung disease scenarios used as contextual severity information in the Context condition (see Appendix 1) and were asked to identify which of the five was most similar to their own lung condition. Patients selected one of 7 response options (better than Scenario A; about the same as Scenario A, B, C, D, or E; or worse than Scenario E); 3) SF-1 general health evaluation (Ware & Sherbourne, 1992): patients categorized their own general health as excellent, very good, good, fair, or poor.
Finally, all participants were asked for demographic information, including age, gender, race, and educational background. Healthy participants were also asked whether they had personal experience with lung disease.
3 Results
3.1 Comparing patients’ and non-patients’ ratings
We hypothesized that non-patients would distinguish less between mild and severe scenarios than patients. Consistent with this hypothesis, the difference in ratings for the mild and severe scenarios in the No Context condition was only 16 points for healthy non-patients, versus 29 points for patients. Mean QoL ratings for the mild and severe scenarios were significantly different both for healthy participants (M = 54.9, SD = 17.4 for mild, M = 39.1, SD = 20.1 for severe), t(127) = 4.66, p < .001, and for patients (M = 70.3, SD = 20.9 for mild, M = 41.6, SD = 25.3 for severe), t(97) = 6.13, p < .001, but the effect was significantly larger for patients, F(1, 224) = 5.23, p = .02, η² = .02.
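The point differences and the effect size quoted above can be reproduced from the reported summary statistics. The sketch below uses only numbers from the text. Treating the reported effect size as a partial eta squared, recovered from F and its degrees of freedom, is an assumption, since the text does not say which variant was computed.

```python
# No Context cell means reported in the text.
means = {
    ("non-patient", "mild"): 54.9,
    ("non-patient", "severe"): 39.1,
    ("patient", "mild"): 70.3,
    ("patient", "severe"): 41.6,
}

# Mild-minus-severe gap within each group.
nonpatient_gap = means[("non-patient", "mild")] - means[("non-patient", "severe")]
patient_gap = means[("patient", "mild")] - means[("patient", "severe")]

# Interaction contrast: how much larger the gap is for patients.
interaction_contrast = patient_gap - nonpatient_gap

# Partial eta squared from the F test (assumed variant):
# eta_p^2 = (F * df_effect) / (F * df_effect + df_error)
F, df_effect, df_error = 5.23, 1, 224
partial_eta_sq = (F * df_effect) / (F * df_effect + df_error)

print(f"non-patient gap: {nonpatient_gap:.1f}")
print(f"patient gap: {patient_gap:.1f}")
print(f"interaction contrast: {interaction_contrast:.1f}")
print(f"partial eta squared: {partial_eta_sq:.3f}")
```

The gaps round to the 16 and 29 points reported, and the recovered effect size rounds to .02, so the reported statistics are internally consistent under this assumption.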
3.2 Comparing Context and No Context rating tasks
We hypothesized that participants would distinguish more between mild and severe lung disease scenarios in the Context condition, where multiple scenarios were presented together to provide contextual information about relative severity. As predicted, the contextual information increased the difference in ratings between mild and severe scenarios. Collapsing across the two participant groups, the difference in ratings increased from 21 points in the No Context condition to 54 points in the Context condition, t(332) = 7.49, p < .001. Mean QoL ratings for the mild scenario were significantly higher in the Context condition (M = 69.89, SD = 23.61) than in the No Context condition (M = 61.48, SD = 20.39), t(220) = 2.84, p = .005. Conversely, mean QoL ratings for the severe scenario were significantly lower in the Context condition (M = 21.3, SD = 22.87) than in the No Context condition (M = 53.90, SD = 23.34), t(227) = 7.92, p < .001.
3.3 Comparing context effects for patients and non-patients
We hypothesized that the introduction of context information would affect non-patients’ ratings more than patients’ ratings, because patients should have some implicit context information about their own disease, even in the No Context condition. Contrary to this hypothesis, there was a non-significant trend toward a larger context effect for patients than for non-patients. Figure 2 shows that, for non-patients, the difference between mild and severe ratings grew from 16 points in the No Context condition to 45 points in the Context condition, whereas for patients, the difference grew from 29 points to 67 points, t(332) = 1.01, p = .31. The effect of context was significant both for patients, t(332) = 5.59, p < .001, and for non-patients, t(332) = 4.98, p < .001.
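The comparison of context effects is a difference of differences, which can be spelled out from the rounded gaps reported for each group (16 to 45 points for non-patients, 29 to 67 points for patients). The dictionary layout and variable names below are illustrative only.

```python
# Mild-minus-severe rating gaps (in points), by group and condition.
gaps = {
    "non-patients": {"No Context": 16, "Context": 45},
    "patients": {"No Context": 29, "Context": 67},
}

# Context effect: how much the gap widens when scenarios are rated together.
context_effect = {
    group: cells["Context"] - cells["No Context"]
    for group, cells in gaps.items()
}

# Group-by-context interaction contrast (patients minus non-patients).
difference_in_effects = context_effect["patients"] - context_effect["non-patients"]

print(context_effect)
print(f"difference in context effects: {difference_in_effects} points")
```

The gap widens by 29 points for non-patients and by 38 points for patients. The 9-point difference runs opposite to the predicted direction, but it corresponds to the non-significant interaction test reported above.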
To summarize, we found that patients give more distinct ratings to mild and severe health state scenarios than do non-patients, consistent with prospect theory. We also found that both patients and non-patients give more distinct ratings to mild and severe scenarios when multiple scenarios are presented together, providing contextual information about the health state and the range of severity associated with the condition. Finally, we expected non-patients’ ratings to be affected more by context than patients’ ratings, but this prediction was not borne out. The effect of context was not significantly different for the two groups, and in fact, there was a non-significant trend toward a larger effect of context for the patient group.
4 Discussion
Because there is no way to objectively measure quality of life, researchers working to understand how health influences quality of life are forced to rely on subjective judgments. By their nature, these judgments are based on personal interpretation, making it difficult to compare judgments across individuals, across groups, or across different tasks.
Previous studies have demonstrated that personal health history influences judgments of health conditions, with patients typically giving more positive ratings to health conditions than non-patients. This study provides evidence that this patient vs. non-patient discrepancy is not unidirectional; lung disease patients in this study rated severe conditions more negatively than did non-patients.
The way patients and non-patients in this study distinguished between mild and severe conditions was consistent with prospect theory (Kahneman & Tversky, 1979). Almost all of the respondents in our patient group rated their health somewhere between the most severe and the least severe lung disease. From this perspective, the mild scenario looks like a dramatic improvement and the severe scenario looks like a dramatic drop, spreading the two scenarios relatively far apart on a QoL scale. For non-patients, both conditions are a loss in health, with the most dramatic cost in QoL associated with the initial drop to the mild condition, placing the two scenarios relatively close together on the QoL scale. In the No Context condition, our healthy participants estimated only a 16-point difference (out of 100) in quality of life between a patient who experiences shortness of breath while resting in a chair (an extremely severe degree of lung disease suffered by only about 5% of our patient population) and a patient who suffers shortness of breath only after jogging three blocks, a level of fitness that likely exceeds that of most Americans. Our patient participants estimated a larger 29-point difference between these same two scenarios.
This study also explored how ratings are affected by the rating task itself, specifically, the context in which a scenario is presented. Hsee and colleagues (1999) found that ratings made in isolation differ from ratings made alongside other items, particularly when the items are difficult to evaluate. Multiple items presented together provide information about possible alternatives, helping raters understand whether a given item is good or bad, big or small, a lot or a little.
In the case of our lung disease scenarios, the evaluability of lung disease severity should not have been especially poor. Rather than using unfamiliar measurement units to describe severity, such as providing some metric of lung capacity, the scenarios specified familiar types of physical activity that would cause shortness of breath. Nevertheless, despite these intuitive descriptions of lung disease severity, contextual information influenced ratings, arguably by providing useful information about the range of severity that can be expected for the disease. Across the two respondent groups, ratings for the mild and severe conditions were more distinct when made in the context of multiple scenarios, with a 21-point difference in the No Context condition and a 54-point difference in the Context condition.
The results of this study did pose one surprise. We anticipated that non-patients would be more affected by context than patients. If context helps participants evaluate the conditions by providing information about the range of alternatives, then patients should be less affected by this additional information, as their own experience should provide some information about the range even in the No Context condition. We found no evidence of an attenuated context effect for patients. If anything, patients showed a slightly larger effect of context, though the interaction of group and context was non-significant.
Why were lung disease patients influenced by contextual information as much as or more than non-patients? One possibility is that, while patients have a good deal of information about lung disease and the range of severity associated with it, they may not always access this information when evaluating the lung disease scenarios. Judgments are highly influenced by whatever information is most active and accessible in memory (Tversky & Kahneman, 1973). When patients in the Context condition were cued to think about the range of severity for lung disease, this range should have become a highly active feature of the disease that strongly influenced judgments, whereas severity range might have been only one of many features that came to mind for patients in the No Context condition who were not explicitly cued to think about it.
4.1 Implications
Over the last several years, the positive psychology movement has inspired more researchers to investigate the factors that influence well-being and the mechanisms behind people’s remarkable capacity to adapt to adverse circumstances. A continuing concern about this research emerges from the subjective nature of the available measures of happiness, quality of life, and related constructs. Conclusions about what does or does not influence well-being rely on subjective self-reports, reports that are often malleable.
In the health domain, another concern arises from the application of health-related quality of life data to cost-effectiveness analyses. The discrepancy between patients’ and non-patients’ QoL ratings has led to some discussion in the literature as to whether health care analyses ought to incorporate evaluations made by patients or by the general public (Boyd et al., 1990; Dolan, 1996; Gold et al., 1996; Ubel, Loewenstein, & Jepson, 2003). This question is further complicated by the evidence presented here. The discrepancy in ratings cannot be easily characterized as an overestimation by patients or an underestimation by non-patients. Rather, ratings seem to depend on the relative position of the rater and the health condition in question. Because both patients’ and non-patients’ ratings are remarkably malleable, dramatically influenced by the context in which scenarios were rated, this study cannot resolve the question of whose ratings are more accurate or more reliable in evaluating health states. Rather, this study highlights the difficulty of comparing the two groups or of drawing conclusions about whose evaluations are more meaningful.
These results suggest that researchers should take great care and consider the details of the rating task when soliciting QoL estimates. Whether the research goal is a theoretical understanding of well-being or an applied effort to improve quality of life, we must exercise caution in drawing conclusions based on subjective reports.
Appendix
Scenario A
This person has a lung condition that causes him to become short of breath only after extreme exertion, like jogging 3 blocks, carrying a heavy basket of laundry up two flights of stairs, or shoveling snow for 20 minutes.
Scenario B
This person has a lung condition that causes him to become short of breath after walking briskly for 2 blocks or walking up one flight of stairs.
Scenario C
This person has a lung condition that causes him to become short of breath after walking slowly for 1 block. He must rest while walking up a flight of stairs.
Scenario D
This person has a lung condition that causes him to become short of breath after walking across a room. He is unable to walk up stairs.
Scenario E
This person has a lung condition that causes him to become short of breath even when in a resting state. In other words, he is short of breath just sitting in a chair. Occasionally, his shortness of breath interferes with his sleep.