1 Introduction
In the field of risky decision making, a framing effect has been reported in which different descriptions of otherwise equivalent choice options elicit preference reversals (Tversky & Kahneman, 1981). This effect is said to question human rationality because it violates descriptive invariance, a basic principle of rational choice. In the paradigmatic risky-choice framing task, the Asian disease problem, participants are asked to imagine that the United States is preparing for a disease outbreak that is expected to kill 600 people. Two programs/options for combating the disease, one certain and one risky, are presented in positively or negatively framed versions.
Positive frame
Sure program: Two hundred people will be saved.
Risky program: There is a 1/3 probability that 600 people will be saved (non-zero complement) and a 2/3 probability that no people will be saved (zero complement).
Negative frame
Sure program: Four hundred people will die.
Risky program: There is a 1/3 probability that nobody will die (zero complement) and a 2/3 probability that 600 people will die (non-zero complement).
According to the framing effect, respondents were reported to favor the sure program in the positive frame but the risky program in the negative frame.
Kühberger and Tanner (2010) examined the impact of adding or removing implied complements of the risky option on this framing effect. They showed that removing the zero complement of the risky option can eliminate the effect, whereas removing the non-zero complement can enlarge it. In this paper, we examine the reliability and robustness of the evidence presented by Kühberger and Tanner (2010) by conducting a direct replication with a Chinese sample. We also extend their findings by varying the probabilities and expected value (EV) ratios to examine the effect under various conditions.
1.1 Prospect theory and fuzzy-trace theory
The dominant explanation for the framing effect is prospect theory (Kahneman & Tversky, 1979; Tversky & Kahneman, 1992). This theory proposes that the value of a simple prospect that pays x with probability p is given by w(p)v(x), and that people behave so as to maximize the overall prospect value (Fox, Erner & Walters, 2015). Here, w(.) is an inverse S-shaped weighting function that overweights low probabilities and underweights moderate to large probabilities; v(.), the value function, is S-shaped: concave for gains but convex for losses. v(.) is also steeper for losses than for equivalent gains, which leads to loss aversion. Prospect theory explains framing effects through the valuation v (see Table 1). Zero complements add nothing to the overall utility because v(0) = 0. Thus, v(+200) is better than 1/3×v(+600) because the value function is concave for gains, and 2/3×v(−600) is better than v(−400) because the value function is convex for losses.
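As a rough numerical illustration of this argument, the following Python sketch evaluates the four Asian disease options under prospect theory. The functional forms and parameter values (alpha = 0.88, lambda = 2.25, gamma = 0.61 for gains, and 0.69 for losses) are assumptions taken from the median estimates in Tversky and Kahneman (1992); they are not estimated in the present paper, and the function names are hypothetical.

```python
# Illustrative prospect-theory valuation of the Asian disease options.
# Parameter values are ASSUMED (Tversky & Kahneman, 1992, median estimates),
# not estimates from the present studies.

def value(x, alpha=0.88, lam=2.25):
    """S-shaped value function: concave for gains, convex and steeper for losses."""
    return x ** alpha if x >= 0 else -lam * (-x) ** alpha

def weight(p, gamma):
    """Inverse S-shaped probability weighting function w(p)."""
    return p ** gamma / (p ** gamma + (1 - p) ** gamma) ** (1 / gamma)

def prospect_value(x, p, gamma):
    """Value of a simple prospect paying x with probability p: w(p) * v(x)."""
    return weight(p, gamma) * value(x)

# Positive frame: 200 saved for sure vs. 1/3 chance that 600 are saved.
sure_gain = value(200)
risky_gain = prospect_value(600, 1 / 3, gamma=0.61)

# Negative frame: 400 die for sure vs. 2/3 chance that 600 die.
sure_loss = value(-400)
risky_loss = prospect_value(-600, 2 / 3, gamma=0.69)

print("Positive frame prefers:", "sure" if sure_gain > risky_gain else "risky")
print("Negative frame prefers:", "risky" if risky_loss > sure_loss else "sure")
```

Because v(0) = 0, the zero complement never enters `prospect_value`, which is why prospect theory treats a risky option with and without its zero complement as equivalent.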
Note. This table was revised from Tables 1 and 2 in the original article. Types 4 and 5 in the original study were replaced with Types 2 and 3. v(x) indicates a subjective valuation function (concave for gains and convex for losses); FTT indicates fuzzy-trace theory; PT indicates prospect theory; +FE = strong framing effect; IND = indifference predicted.
In contrast to prospect theory, fuzzy-trace theory holds that decision making operates on gist rather than on verbatim details and is characterized as parallel, fuzzy, or qualitative rather than linear, as in logic, or precise, as in computation (Reyna & Brainerd, 1991). The theory posits that people extract a categorical gist or an ordinal gist when presented with numerical information (Blalock & Reyna, 2016; Reyna & Brust-Renck, 2020). Categorical gist represents outcomes according to qualitative distinctions, for instance, “some quantity” versus “none” or “good” versus “bad.” Ordinal gist involves approximate comparisons of outcome magnitudes, such as “more” versus “less” (Wolfe, Reyna & Smith, 2018). The theory proposes that adults rely on the simplest gist when processing information, beginning with the lowest level of gist (categorical), and recruit higher (more precise) levels for decision making only if the lower levels do not provide sufficient discrimination between the options (Reyna, Wilhelms, McCormick & Weldon, 2016). In the original paper, fuzzy-trace theory explained the framing effect in terms of the categorical gist of the options (Table 1). In the positive frame, the options are translated into “some people will be saved” (sure option) and “some people will be saved or no one will be saved” (risky option). The choice centers on those aspects that make a difference, i.e., some will be saved versus no one will be saved. Thus, the sure option is clearly preferred. In the negative frame, the options are translated into “some people will die” (sure option) and “nobody will die or some people will die” (risky option). Because nobody dying is clearly better than some people dying, the risky option is preferred.
1.2 The original study for replication and extension
Kühberger and Tanner (2010) clarified the limitations of the framing effect by examining tasks that vary in how the options are specified. They showed that removing the zero complements of the risky option can eliminate the framing effect, whereas removing the non-zero complements enlarges it. Their study presented a critical test of prospect theory and fuzzy-trace theory: the results supported fuzzy-trace theory and revealed the limitations of formal models, such as prospect theory, in capturing how people deal with disclosed information and inferences.
Specifically, three versions of the task were implemented in the original experiment of Kühberger and Tanner (2010) (Table 1): Type 1, the original version of the Asian disease problem; Type 2, in which the zero complements of the risky options were removed; and Type 3, in which the non-zero complements were removed. For Type 2, prospect theory predicts an identical framing effect because zero complements add nothing to the overall utility. In contrast, fuzzy-trace theory predicts indifference between the options because both the sure and the risky option are simplified into “some people will be saved” in the positive frame and “some people will die” in the negative frame. For Type 3, both prospect theory and fuzzy-trace theory predict large framing effects.
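The fuzzy-trace predictions for the three task types can be sketched in a few lines of Python. This is only an illustrative encoding of the categorical-gist argument described above, not a formal model from the original article; the gist labels, the numeric valences, and the function name are hypothetical.

```python
# Illustrative encoding of categorical gist (ASSUMED valences, for exposition only).
VALENCE = {
    "some saved": 1, "none saved": 0,   # positive frame
    "some die": -1, "none die": 0,      # negative frame
}

def gist_prediction(sure, risky):
    """Predict the preferred option from the gists that distinguish the options."""
    distinguishing = set(risky) - set(sure)
    if not distinguishing:
        # Both options reduce to the same gist, so no preference is predicted.
        return "indifferent"
    sure_value = max(VALENCE[g] for g in sure)
    risky_value = max(VALENCE[g] for g in distinguishing)
    return "risky" if risky_value > sure_value else "sure"

# (sure gists, risky gists) for each task type and frame.
TASKS = {
    ("Type 1", "positive"): (["some saved"], ["some saved", "none saved"]),
    ("Type 1", "negative"): (["some die"], ["none die", "some die"]),
    ("Type 2", "positive"): (["some saved"], ["some saved"]),  # zero complement removed
    ("Type 2", "negative"): (["some die"], ["some die"]),
    ("Type 3", "positive"): (["some saved"], ["none saved"]),  # non-zero complement removed
    ("Type 3", "negative"): (["some die"], ["none die"]),
}

predictions = {key: gist_prediction(*options) for key, options in TASKS.items()}
for key, choice in predictions.items():
    print(key, "->", choice)
```

Under this coding, Types 1 and 3 yield a framing effect (sure in the positive frame, risky in the negative frame), whereas Type 2 yields indifference in both frames.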
The original study used a 2 (frame: positive vs. negative) × 3 (task type: 1, 2, and 3) × 4 (scenario: water contamination, genetically engineered crops, fish kidney disease, endangered forest) mixed design, with frame and task type as between-subject factors and scenario as a within-subject factor. The sample size was 558, and participants were asked to choose between a sure and a risky option in four different environmental scenarios. The results revealed a significant interaction between task type and frame (F(2, 552) = 43.32, p < .001; η2p = .107). A significant framing effect was detected in Type 1, a large framing effect in Type 3, and no framing effect in Type 2, which supported fuzzy-trace theory rather than prospect theory.
Because concerns over the replicability of scientific findings have arisen in psychology (Open Science Collaboration, 2015), revisiting and verifying existing findings is important. Replicability is a fundamental feature of science, and replication studies should play a role in efforts to improve scientific practices (Dunlap, 1926). The general function of replication is to verify a fact or piece of knowledge, which implies the following more specific functions: to address sampling error (i.e., false-positive detection); to control for artifacts (lack of internal validity); to address researcher fraud; to test generalizations to different populations; and to test the same hypothesis from a previous study using a different procedure (Schmidt, 2009). In the field of judgment and decision making (JDM), there has also been increasing interest recently in replicating JDM effects with Chinese participants (e.g., Jiang & Ma, 2019; Shen, Huang, Jiang & Li, 2019).
The study of Kühberger and Tanner (2010) had novel and potentially important theoretical implications for prospect theory and fuzzy-trace theory in explaining the framing effect, as well as practical implications for risky decision making. This work deserves to be replicated and extended for several reasons. First, the original study was conducted with samples from the United States and Europe (Kühberger & Tanner, 2010), so replication elsewhere is critical in light of concerns over the translatability of findings across locations (Klein et al., 2014). Risk preferences differ by culture. For example, Weber, Hsee, and their colleagues used different methods to assess risk-taking and repeatedly found that respondents from China were significantly less risk-averse than their counterparts in the United States (Hsee & Weber, 1997, 1999; Weber & Hsee, 1998; Weber, Hsee & Sokolowska, 1998). However, more recent evidence suggests that the Chinese have a moderate risk preference; that is, people from China are less risk-averse in the gain domain, and less risk-seeking in the loss domain, than people from Germany and the United States (Rieger, Wang, & Hens, 2015). Therefore, extending the original study to Chinese samples is of interest. In Study 1, we aimed for a very close replication according to the replication taxonomy proposed by LeBel et al. (2018).
Second, we noticed that the original findings were obtained with a fixed probability of gain/loss in the risky option (1/3 gain, 2/3 loss). According to previous research, the probability of the risky option is one of the main factors causing variation in susceptibility to the framing effect (Hogarth & Einhorn, 1990; Tversky & Kahneman, 1992). For example, a stronger framing effect was detected for large probabilities than for small probabilities of gain (Gosling & Moutier, 2017; Kühberger, Schulte-Mecklenbeck & Perner, 1999). Thus, varying the probability might also change the original findings of Kühberger and Tanner (2010), but empirical evidence is lacking.
Moreover, the original study focused only on alternatives with a similar EV, which is limiting. Some have suggested that researchers might exaggerate effects such as loss aversion by using options with similar EVs (Mrkva, Johnson, Gächter & Herrmann, 2020). Likewise, relevant research has manipulated the attractiveness of sure and unsure options by setting different EVs to examine the accuracy of optimal decisions (Hollard, Maafi & Vergnaud, 2016). For the current research, if the original findings still hold after changing the EV ratio between the risky and the sure option (i.e., EV of the risky option/EV of the sure option), this would provide further support for fuzzy-trace theory.
Finally, the original study deserves to be replicated because of its practical impact on decision making related to environmental and health risks: it provides a potential means of influencing individuals’ preferences in risky settings. That is, removing specific complements of the risky option could be used as an inexpensive yet effective behavioral nudge to eliminate or enlarge the framing effect. Also, few if any studies have examined the framing effect in environmental and health scenarios, especially using a Chinese sample.
1.3 Overview of replications
The present study aimed to examine the reliability and robustness of the evidence presented by Kühberger and Tanner (2010) by conducting Study 1, a direct and precise replication using a Chinese sample (see Footnote 1). This replication study also aimed to extend the original findings by varying the probability of the risky option (Study 2) and the EV ratio (Study 3). In Study 2, we replaced the single probability with large, medium, and small probabilities to examine the effect under various probability conditions. In Study 3, we added two conditions in which either the risky or the certain option had a larger EV (i.e., EV of the risky option/EV of the sure option larger or smaller than 1) to examine whether the original findings of Kühberger and Tanner (2010) persist for different EV ratios. To ensure interpretable results, we obtained the original materials from the authors (Kühberger & Tanner, 2010).
2 Method
2.1 Pre-registration and Open Science
We pre-registered our three studies on the Open Science Framework before we collected the first wave of data. Power analyses and all materials used in the studies, as well as the data, code, and outputs, are available in the Open Science Framework (https://osf.io/a65gz/; Study 1 pre-registration: https://osf.io/7wcy5; Study 2 pre-registration: https://osf.io/c7ary; Study 3 pre-registration: https://osf.io/vzuma).
2.2 Power Analyses and Data Collection
In Kühberger and Tanner’s (2010) study, the sample size was 558 (US: 34%; Germany: 35%; other parts of Europe: 31%), and the results revealed an interaction between task type and frame (F(2, 552) = 43.32, p < .001; η2p = .107). The 90% confidence interval of the reported effect size was [0.093, 0.178], computed according to Smithson (2001). In the present research, we aimed to detect the same interaction effect in all three studies. A power analysis conducted prior to data collection, using the original effect size of η2p = .107, showed that for an interaction in an ANOVA with six groups, a sample of at least 845 participants was needed to achieve 80% power (Faul, Erdfelder, Lang & Buchner, 2007). Considering that a direct replication requires at least two times the sample size of the original study (Morey & Lakens, 2016), we recruited 1144 Chinese participants in Study 1. For Studies 2 and 3, which extend the original study, we planned to recruit 900 participants, exceeding the sample size calculated by the power analysis, to obtain more convincing power and to allow for data exclusions.
The present research included two data collection waves (see Footnote 2). The first wave was launched soon after the pre-registration, and the second wave was collected during the revision. All of the conditions and measurements are reported. The samples were recruited online via Sojump (http://www.Wjx.cn), a well-known online platform for launching nationwide electronic surveys in China that is similar to Mechanical Turk or Qualtrics and is widely used in behavioral and psychological research. All participants were paid ¥2 for their participation. Following the structure of recently published replication studies (Chen et al., in press; Kutscher & Feldman, 2019), we nested the description of the three studies in each section.
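Power-analysis tools such as G*Power typically take Cohen's f rather than η2p as the effect-size input for ANOVA designs. The short Python sketch below shows the standard conversion for the original effect size and checks the "two times the original sample" rule against the Study 1 sample; it does not reproduce the power calculation itself, and the function name is hypothetical.

```python
import math

def cohens_f(partial_eta_sq):
    """Convert partial eta squared to Cohen's f: f = sqrt(eta2p / (1 - eta2p))."""
    return math.sqrt(partial_eta_sq / (1 - partial_eta_sq))

# Effect size of the type-by-frame interaction reported for the original study.
f_original = cohens_f(0.107)

# Minimum sample for a direct replication under the "2 x original N" rule.
original_n = 558
minimum_n = 2 * original_n
study1_recruited = 1144

print(f"Cohen's f = {f_original:.3f}")
print(f"2 x original N = {minimum_n}; recruited = {study1_recruited}")
```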
2.3 Participants
Study 1 aimed to directly replicate the original research of Kühberger and Tanner (2010) using a Chinese sample. A total of 1144 participants were recruited, of whom 25 were excluded because they did not complete the entire experiment (they quit in the middle of the study or had missing data). The final valid dataset consisted of 1119 participants (69.2% female, 30.8% male; mean age = 24.07, SD = 5.24; five participants did not disclose their age), with approximately 186 participants per between-subject cell.
Study 2 extended the original study by setting different probabilities for the risky options to examine the original effect under various probability conditions. A total of 900 participants were recruited online via Sojump (http://www.wjx.cn). Eight participants were excluded because they did not complete the entire experiment. The final valid dataset consisted of 892 participants (58.9% female, 41.1% male; mean age = 25.61, SD = 6.58; three participants did not disclose their age), with approximately 149 participants per between-subject cell.
In Study 3, we included two additional conditions in which either the certain or the risky option had a larger EV (EV ratio of approximately 0.8 or 1.2) to extend the original findings of Kühberger and Tanner (2010). A total of 920 participants were recruited online via Sojump. Fourteen participants were excluded because they did not complete the entire experiment. The final valid dataset consisted of 906 Chinese respondents (58.5% female, 41.5% male; mean age = 25.70, SD = 6.29; five participants did not disclose their age), with approximately 151 participants per between-subject cell.
2.4 Design and Procedure
For Study 1, as in the original study, the design of this replication study was a 3 (task type: 1, 2, and 3) × 2 (frame: positive vs. negative) × 4 (scenario: water contamination, genetically engineered crops, fish kidney disease, endangered forest) mixed design, with frame and task type as between-subject factors and scenario as a within-subject factor. The order of the four scenarios was counterbalanced.
In Studies 2 and 3, we used only the water contamination scenario from Study 1, on the principle of parsimony. Study 2 employed a 3 (task type: 1, 2, and 3) × 2 (frame: positive vs. negative) × 3 (probability: small (25%), medium (47%), and large (72%); see Footnote 3) mixed design (see the OSF site for more details; see also Footnote 4). Frame and task type were between-subject factors, and probability was the within-subject factor. The order of the probability conditions was counterbalanced.
Study 3 employed a 3 (task type: 1, 2, and 3) × 2 (frame: positive vs. negative) × 3 (EV ratio: 1, 0.8, and 1.2) mixed design, with frame and task type as the between-subject factors and EV ratio (expected value of the risky option/expected value of the sure option) as the within-subject factor. The order of the EV ratio conditions was counterbalanced (see the OSF site for more details).
The procedure of the present studies was modeled after the original study. Participants were required to complete a consent form prior to participation, after which they were randomly assigned to different experimental conditions. In each situational experiment, the participants were presented with a description of a scenario and then asked to choose between a sure and a risky option under the assumed situation. Then, demographic information on age, sex, education, and residence was collected.
2.5 Materials
We used the customary translation-back-translation modification procedure by the BOOM research group to translate the materials into Chinese (Berry, 1989). The procedure is widely used in cross-cultural research, such as in Russia, China, Germany, and Pakistan (Brailovskaia & Bierhoff, 2016; Brailovskaia, Scho, Zhang & Bieda, 2018; Bibi, Lin & Margraf, 2020). The process included the steps of forward translation, reconciliation, back translation, and review (see the OSF site for a more detailed description of the procedure).
The experimental materials in Study 1 were identical to those in the original study. Following Kühberger and Tanner (2010), the options were framed either positively (human lives, animal lives, or forest saved) or negatively (lives lost or forest damaged), and three task types were employed (for details, see the online supplementary materials).
2.5.1 Study 1
An example of a fish kidney disease scenario is presented below:
A committee found a fish disease in a nearby lake. About 12 fish species (among them the most popular dining fish) have the Proliferative Kidney Disease (PKD). This is a chronically developing infectious disease that can have deadly consequences for the fish. Young fish are especially susceptible, while others seem to be immune to infection. Experts suggest that PKD is one cause of declining fish catches. The researchers assume that human activities and water pollution foster the spread of the disease. They are considering releasing more fish into the lake to control the epidemic.
Imagine that you are a government official of the adjacent village. Which of the following options would you favor? Assume that the estimates are as follows:
Type 1 (Classic):
Option A: If the release of fish is implemented, 4 fish species will survive.
Option B: If the release of fish is implemented, there is a 1/3 probability that all of the 12 fish species will survive, and a 2/3 probability that none of them will survive.
Option C: If the release of fish is implemented, 8 fish species will die.
Option D: If the release of fish is implemented, there is a 2/3 probability that all of the 12 fish species will die, and there is a 1/3 probability that none of the 12 fish species will die.
Type 2:
Option A: If the release of fish is implemented, 4 fish species will survive.
Option B: If the release of fish is implemented, there is a 1/3 probability that all of the 12 fish species will survive.
Option C: If the release of fish is implemented, 8 fish species will die.
Option D: If the release of fish is implemented, there is a 2/3 probability that all of the 12 fish species will die.
Type 3:
Option A: If the release of fish is implemented, 4 fish species will survive.
Option B: If the release of fish is implemented, there is 2/3 probability that none of the 12 fish species will survive.
Option C: If the release of fish is implemented, 8 fish species will die.
Option D: If the release of fish is implemented, there is 1/3 probability that none of the 12 fish species will die.
2.5.2 Study 2 and Study 3
The material used in Studies 2 and 3 was adapted from the water contamination scenario in Study 1, and all the certain options led to 10 children being saved:
A refinery leak polluted the surrounding soil and water, which resulted in children from nearby villages suffering from a rare disease. Drugs for this disease have been developed and tested, and different medical options are available.
Imagine that you are an expert who is very influential among local hospitals. Which of the following options would you favor? Assume that the estimates are as follows.
In Study 2, we set three different probability conditions (small: 25%; medium: 47%; large: 72%). In Study 3, we used a probability of only 25% in the risky option and varied the EV ratio (expected value of the risky option/expected value of the sure option).
Study 2
An example of small probability is presented as follows.
There are 40 children suffering from the disease.
Type 1 (Classic):
Option A: If the drug is used, the health of 10 children will be saved for sure.
Option B: If the drug is used, there is a 25% probability that the health of all of the 40 children will be saved and a 75% probability that the health of none of the 40 children will be saved.
Option C: If the drug is used, the health of 30 children will be damaged for sure.
Option D: If the drug is used, there is a 75% probability that the health of all of the 40 children will be damaged and a 25% probability that the health of none of the 40 children will be damaged.
Type 2:
Option A: If the drug is used, the health of 10 children will be saved for sure.
Option B: If the drug is used, there is a 25% probability that the health of all of the 40 children will be saved.
Option C: If the drug is used, the health of 30 children will be damaged for sure.
Option D: If the drug is used, there is a 75% probability that the health of all of the 40 children will be damaged.
Type 3:
Option A: If the drug is used, the health of 10 children will be saved for sure.
Option B: If the drug is used, there is a 75% probability that the health of none of the 40 children will be saved.
Option C: If the drug is used, the health of 30 children will be damaged for sure.
Option D: If the drug is used, there is a 25% probability that the health of none of the 40 children will be damaged.
Study 3
Examples of the classic task type in the different EV ratio versions are presented below (the number of sick children was manipulated):
EV ratio approximately 0.8
There are 32 children suffering from the disease.
Option A: If the drug is used, the health of 10 children will be saved for sure.
Option B: If the drug is used, there is a 25% probability that the health of all of the 32 children will be saved, and a 75% probability that the health of none of the 32 children will be saved.
Option C: If the drug is used, the health of 22 children will be damaged for sure.
Option D: If the drug is used, there is a 75% probability that the health of all of the 32 children will be damaged, and a 25% probability that the health of none of the 32 children will be damaged.
EV ratio approximately 1
There are 40 children suffering from the disease.
Option A: If the drug is used, the health of 10 children will be saved for sure.
Option B: If the drug is used, there is a 25% probability that the health of all of the 40 children will be saved, and a 75% probability that the health of none of the 40 children will be saved.
Option C: If the drug is used, the health of 30 children will be damaged for sure.
Option D: If the drug is used, there is a 75% probability that the health of all of the 40 children will be damaged, and a 25% probability that the health of none of the 40 children will be damaged.
EV ratio approximately 1.2
There are 48 children suffering from the disease.
Option A: If the drug is used, the health of 10 children will be saved for sure.
Option B: If the drug is used, there is a 25% probability that the health of all of the 48 children will be saved, and a 75% probability that the health of none of the 48 children will be saved.
Option C: If the drug is used, the health of 38 children will be damaged for sure.
Option D: If the drug is used, there is a 75% probability that the health of all of the 48 children will be damaged, and a 25% probability that the health of none of the 48 children will be damaged.
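The EV ratios of the three Study 3 versions follow directly from the numbers in the materials above. The Python sketch below computes them in terms of children saved; the function and variable names are hypothetical and chosen only for illustration.

```python
def ev_ratio(n_children, p_all_saved=0.25, sure_saved=10):
    """EV ratio = expected number saved under the risky option / number saved for sure."""
    expected_saved_risky = p_all_saved * n_children
    return expected_saved_risky / sure_saved

# Number of sick children in the three Study 3 versions.
ratios = {n: ev_ratio(n) for n in (32, 40, 48)}

for n, ratio in ratios.items():
    print(f"{n} children: EV ratio = {ratio:.1f}")
```

With the sure option fixed at 10 children saved and the probability fixed at 25%, manipulating only the number of sick children is enough to move the ratio below, at, or above 1.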
3 Results
Corresponding to the original analysis, we coded the choice of a risky option as 1 and the choice of a sure option as 0. The average score represented the proportion of risky options that each participant chose in a certain condition. Therefore, those with means higher than 0.5 exhibited a risk-seeking propensity, whereas those with means lower than 0.5 exhibited a risk-aversion propensity.
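A minimal Python sketch of this coding scheme is given below. The example choices are hypothetical data made up for illustration, not responses from the studies, and the function names are likewise hypothetical.

```python
from statistics import mean

def risky_proportion(choices):
    """Code risky = 1 and sure = 0, then average to get the proportion of risky choices."""
    return mean(1 if choice == "risky" else 0 for choice in choices)

def propensity(proportion):
    """Classify a mean score relative to the 0.5 midpoint."""
    if proportion > 0.5:
        return "risk-seeking"
    if proportion < 0.5:
        return "risk-averse"
    return "neutral"

# HYPOTHETICAL participant: choices across the four Study 1 scenarios.
example_choices = ["risky", "sure", "risky", "risky"]
score = risky_proportion(example_choices)

print(score, propensity(score))
```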
3.1 Study 1
The results are broken down by scenario, task type, and framing and are presented in Table 4. Compared with the original findings (Table 3 in Kühberger & Tanner, 2010), participants’ risk-seeking propensity in the current study was nearly the same as that in the original study. In this study, we intended to replicate the difference in the framing effect across task types, as in the original study. We first analyzed the influence of sociodemographic factors, and no relevant differences were found. Then, we performed a repeated-measures ANOVA with framing and task type as between-subject factors and scenario as a within-subject factor, as in the original study. The main effect of scenario was significant (F(3, 3339) = 4.45, p = .004; η2p = .004; 90% CI [0.001, 0.08]), which indicated that the preference for the risky choice varied across scenarios. The main effect of framing was significant (F(1, 1113) = 116.34, p < .001; η2p = .095; 90% CI [0.069, 0.122]), reflecting a greater preference for risky options under the negative frame than under the positive frame. The interaction between scenario and framing was also significant (F(3, 3339) = 3.40, p = .017; η2p = .003; 90% CI [0.0003, 0.006]), which indicated that the size of the framing effect varied across scenarios.
More importantly, the ANOVA revealed a significant interaction between framing and task type (F(2, 1113) = 53.95, p < .001; η2p = .088; 90% CI [0.063, 0.115]). The classic framing task (Type 1) replicated the usual findings: 44.05% of respondents preferred the risky option under the positive frame, whereas 64.71% of respondents preferred the risky option under the negative frame (F(1, 1113)=37.55, p < .001; η2p = .033; 90% CI [0.018, 0.052]), with reliable framing effects for each scenario (Fs > 9.15, ps < .003). In task Type 2, the framing effect was not significant (F(1, 1113) = 1.14, p = .286; ns), with 57.22% of respondents under the positive frame and 53.63% of respondents under the negative frame choosing the risky option. No framing effect was detected across the four scenarios (Fs < 2.09, ps > .149). For task Type 3, we found the largest framing effect, with 76.46% of participants showing a risk preference under the negative frame and only 30.65% showing a risk preference under the positive frame (F(1, 1113) = 185.80, p < .001; η2p = .143; 90% CI [0.113, 0.174]); similar differences between framing conditions were found in each scenario of Type 3 (Fs > 51.42, ps < .001).
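The partial eta squared values reported above can be recovered from the F statistics and their degrees of freedom using the standard identity η2p = (F × df1) / (F × df1 + df2). The Python sketch below applies this identity to the Study 1 tests of the framing-by-type interaction and the simple framing effects in Types 1 and 3; the function name is hypothetical.

```python
def partial_eta_squared(f_value, df_effect, df_error):
    """Recover partial eta squared from an F statistic and its degrees of freedom."""
    return (f_value * df_effect) / (f_value * df_effect + df_error)

# (label, F, df_effect, df_error) as reported for Study 1.
study1_tests = [
    ("framing x type interaction", 53.95, 2, 1113),
    ("framing effect, Type 1", 37.55, 1, 1113),
    ("framing effect, Type 3", 185.80, 1, 1113),
]

eta_values = {
    label: round(partial_eta_squared(f, df1, df2), 3)
    for label, f, df1, df2 in study1_tests
}

for label, eta in eta_values.items():
    print(f"{label}: partial eta squared = {eta}")
```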
To reduce the occurrence of a false positive, the Bonferroni adjustment was used for post hoc pairwise comparisons for Studies 1–3. Post hoc analysis revealed that in the positive framing condition, the proportion of risky choices in Type 3 was significantly lower than that in Type 1 (p < .001, d = 0.28, 95% CI [0.180, 0.384]) and Type 2 (p < .001, d = 0.56, 95% CI [0.453, 0.660]); and the risky choice proportion in Type 1 was significantly lower than that in Type 2 (p < .001, d = 0.26, 95% CI [0.159, 0.363]). However, in the negative frame, the proportion of Type 3 risky choices was significantly higher than that in Type 1 (p = .001, d = 0.26, 95% CI [0.159, 0.363]) and Type 2 (p < .001, d = 0.49, 95% CI [0.391, 0.597]); and the risky choice proportion in Type 1 was significantly higher than that in Type 2 (p = .003, d = 0.23, 95% CI [0.125, 0.329]), as shown in Figure 1.
In summary, the key findings of the original study were replicated. A significant difference in the framing effect across task types was found, as in the original study. The findings for the Type 1 and Type 3 tasks were consistent with both prospect theory and fuzzy-trace theory, whereas the findings for the Type 2 task were inconsistent with prospect theory but consistent with fuzzy-trace theory. The effect size of the variation in the framing effect across task types was comparable to that reported in the original study (η2p = .088 compared with η2p = .107 in the original study). A statistically significant interaction between frame and task type in the same direction as in the original study was detected, which indicated that the replication was successful (Camerer et al., 2016; Open Science Collaboration, 2015). Thus, we successfully replicated the study of Kühberger and Tanner (2010) using a Chinese sample.
3.2 Study 2
The results are broken down by probability, task type, and framing and are presented in Table 5.
We first analyzed the influence of sociodemographic factors, and no relevant differences were found. Then, we performed a repeated-measures ANOVA with framing and task type as between-subject factors and probability as the within-subject factor. The three-way interaction was not significant (F(4, 1772) = 0.20, p = .938, ns). The main effect of probability (F(2, 1772) = 2.34, p = .097, ns) was also not significant, whereas the main effect of type (F(2, 886) = 18.85, p < .001; η2p = .041; 90% CI [0.021, 0.063]) was significant, indicating that the preference for risky choice varied across task type. The main effect of framing was also significant (F(1, 886) = 130.79, p < .001; η2p = .129; 90% CI [0.096, 0.163]), indicating a stronger preference for the risky option under negative frames than under positive frames. The interaction between probability and task type (F(4, 1772) = 0.25, p = .908, ns) and that between probability and framing (F(2, 1772) = 1.74, p = .177, ns) were not significant.
More importantly, the interaction between task type and framing was significant (F(2, 886) = 51.34, p < .001; η2p = .104; 90% CI [0.073, 0.135]) and showed the same pattern as in the original study. Simple-effects tests revealed that the original effect was present in all three probability conditions.
In the small-probability condition, the classic framing task (Type 1) yielded a significant framing effect: 46.21% of respondents preferred the risky option under positive frames, and 76.55% of respondents preferred the risky option under negative frames (F(1, 886) = 31.35, p < .001; η2p = .034; 90% CI [0.017, 0.056]). In task Type 2, no framing effect was detected (F(1, 886) = 0.021, p = .885, ns), with 56.38% of respondents preferring risky choices under positive frames and 57.14% of respondents preferring risky choices under negative frames. For task Type 3, the largest framing effect was detected: 73.97% of participants showed risk preference under the negative frame, whereas only 19.61% showed risk preference under the positive frame (F(1, 886) = 103.70, p < .001; η2p = .105; 90% CI [0.075, 0.137]). Furthermore, post hoc tests showed that, in the positive frame, the proportion of the risky option in Type 3 was lower than that in Type 1 and Type 2 (ps < .001, d 1 = 0.58, 95% CI [0.350, 0.814]; d 2 = 0.81, 95% CI [0.574, 1.043]). The proportion of the risky option in Type 2 in the negative frame was significantly lower than that in Type 1 (p = .001, d = 0.42, 95% CI [0.188, 0.646]) and Type 3 (p = .005, d = 0.36, 95% CI [0.128, 0.584]).
The same pattern was revealed in the medium-probability condition. A framing effect was found in the classic task type (Type 1): 48.28% of respondents preferred the risky option under positive frames, and 75.17% of respondents preferred the risky option under negative frames (F(1, 886) = 23.94, p < .001; η2p = .026; 90% CI [0.012, 0.046]). In task Type 2, the framing effect was not significant (F(1, 886) = 0.20, p = .654, ns), with 54.36% of respondents preferring risky choices under positive frames and 51.95% of respondents preferring risky choices under negative frames. For task Type 3, 69.86% of participants showed a risk preference under the negative frame, whereas only 20.26% showed a risk preference under the positive frame (F(1, 886) = 83.89, p < .001; η2p = .086; 90% CI [0.059, 0.117]), revealing the largest framing effect (Figure 2). Additionally, post hoc tests revealed that, as in the 25% condition, the proportion of the risky option in Type 3 was lower than in Type 1 and Type 2 (ps < .001, d 1 = 0.62, 95% CI [0.388, 0.853]; d 2 = 0.75, 95% CI [0.519, 0.986]) in the positive frame. The proportion of the risky option in Type 2 in the negative frame was significantly lower than in Type 1 (p < .001, d = 0.49, 95% CI [0.262, 0.722]) and Type 3 (p = .003, d = 0.37, 95% CI [0.146, 0.603]).
In the large probability condition, the classic framing effect was detected in Type 1, with 56.55% of respondents preferring the risky option under positive frames and 75.86% of respondents preferring the risky option under negative frames (F(1, 886) = 12.47, p < .001; η2p = .014; 90% CI [0.004, 0.029]). In Type 2, no framing effect was revealed, with a similar risky preference in the positive (60.40%) and negative (55.19%) frames (F(1, 886) = .948, p = .331, ns). Type 3 revealed the largest framing effect, with 23.53% of respondents preferring the risky option under positive frames and 73.29% of respondents preferring the risky option under negative frames (F(1, 886) = 85.35, p < .001; η2p = .088; 90% CI [0.060, 0.118]). Post hoc tests revealed that the proportions of the risky option in Type 3 were lower than those in Type 1 and Type 2 (ps < .001, d 1 = 0.71, 95% CI [0.475, 0.943]; d 2 = 0.77, 95% CI [0.539, 1.007]) in the positive frame. In the negative frame, the proportion of the risky option in Type 2 was significantly lower than those in Type 1 (p < .001, d = 0.45, 95% CI [0.220, 0.679]) and Type 3 (p = .002, d = 0.38, 95% CI [0.153, 0.610]).
In summary, varying the probability did not change the original findings: the interaction between task type and frame was significant in all probability conditions. Additionally, the absence of a framing effect in Type 2 under both the large- and small-probability conditions indicates that the framing effects are more consistent with fuzzy-trace theory, which further supports the robustness of the original findings. Thus, the key findings of the original study were replicated under different probability conditions.
3.3 Study 3
The results, broken down by EV ratio, task type, and framing, are presented in Table 6.
We first analyzed the influence of sociodemographic factors, and no relevant differences were found. Then, we performed a repeated-measures ANOVA with framing and task type as the between-subject factors and EV ratio as the within-subject factor. The three-way interaction was not significant (F(4, 1800) = 1.15, p = .330, ns). The main effect of the EV ratio was not significant (F(2, 1800) = 1.67, p = .189, ns), which provided no support for the hypothesis that choices were based on expected utility. The main effects of task type (F(2, 900) = 10.25, p < .001; η2p = .022; 90% CI [0.008, 0.039]) and framing (F(1, 900) = 121.72, p < .001; η2p = .119; 90% CI [0.088, 0.152]) were significant, which indicated a stronger preference for the risky option in Type 2 than in Type 1 and Type 3 and a stronger preference for the risky option under negative than under positive frames. The interaction between EV ratio and task type (F(4, 1800) = 0.18, p = .950, ns) and that between EV ratio and framing (F(2, 1824) = 0.29, p = .745, ns) were not significant.
More importantly, the interaction between task type and framing was significant (F(2, 900) = 53.08, p < .001; η2p = .106; 90% CI [0.075, 0.137]). To investigate how variation in the EV ratio influences the original effect, we separated the results by EV ratio and tested whether the original finding (that the framing effect varied across task type) would hold under each EV ratio condition.
When the EV ratio was 1, a framing effect was detected in the classic framing task (Type 1): 48.61% of respondents preferred the risky option under the positive frame, whereas 63.33% preferred the risky option under the negative frame (F(1, 900) = 7.38, p = .007; η2p = .008; 90% CI [0.001, 0.021]). For Type 2, no framing effect was detected (F(1, 900) = 0.04, p = .846, ns), with 59.09% of respondents preferring risky choices under positive frames and 58.06% of respondents preferring risky choices under negative frames. For task Type 3, we found a large framing effect (F(1, 900) = 124.33, p < .001; η2p = .121; 90% CI [0.090, 0.155]), with only 16.67% of respondents preferring the risky option under the positive frame and 76.19% of respondents preferring the risky option under the negative frame (Figure 3). Furthermore, post hoc tests revealed that in the positive frame, risky preference in Type 3 was significantly lower than in Type 1 (p < .001, d = 0.73, 95% CI [0.498, 0.966]) and Type 2 (p < .001, d = 0.97, 95% CI [0.733, 1.204]), whereas under the negative frame, no difference in risky preference was detected among types.
When the sure option was preferred (EV ratio approximately 0.8), the original findings were replicated. In the classic framing task, 43.75% of respondents preferred the risky option under the positive frame, whereas 70.00% preferred the risky option under the negative frame (F(1, 900) = 23.05, p < .001; η2p = .025; 90% CI [0.011, 0.044]). For Type 2, no framing effect was found (F(1, 900) = 0.003, p = .958, ns): 56.49% of respondents preferred the risky option under the positive frame, and 56.77% preferred the risky option under the negative frame. For task Type 3, we found large framing effects (F(1, 900) = 91.60, p < .001; η2p = .092; 90% CI [0.064, 0.123]), with only 19.87% of respondents preferring the risky option under the positive frame and 71.43% of respondents preferring the risky option under the negative frame, as shown in Figure 3. Meanwhile, post hoc tests revealed that the proportion of the risky option in Type 3 was lower than in Type 1 and Type 2 (ps < .001, d 1 = 0.53, 95% CI [0.302, 0.763], d 2 = 0.80, 95% CI [0.563, 1.034]) in the positive frame. In the negative frame, the proportion of the risky option in Type 2 was significantly lower than in Type 1 (p = .042, d = 0.27, 95% CI [0.045, 0.496]) and Type 3 (p = .020, d = 0.29, 95% CI [0.067, 0.521]).
The same effect was detected when the risky option was preferred (EV ratio approximately 1.2). For the classic task, 50.00% of respondents preferred the risky option under the positive frame, and 70.00% of respondents preferred the risky option under the negative frame (F(1, 900) = 13.39, p < .001; η2p = .015; 90% CI [0.004, 0.030]). For Type 2, no framing effect was detected (F(1, 900) = 0.19, p = .663, ns), with 60.39% of respondents preferring the risky option under the positive frame and 58.06% of respondents preferring the risky option under the negative frame. The largest framing effect was found for task Type 3 (F(1, 900) = 92.52, p < .001; η2p = .093; 90% CI [0.065, 0.124]), with 23.72% of respondents preferring the risky option under the positive frame and 75.51% of respondents preferring the risky option under the negative frame, as shown in Figure 3. Furthermore, post hoc tests showed that the proportion of the risky option in Type 3 was lower than in Type 1 and Type 2 (ps < .001, d 1 = 0.56, 95% CI [0.328, 0.790], d 2 = 0.78, 95% CI [0.542, 1.003]) in the positive frame. In the negative frame, the proportion of the risky option in Type 2 was significantly lower than in Type 3 (p = .004, d = 0.39, 95% CI [0.158, 0.613]).
Overall, the original finding that framing effects vary across task type was replicated in all three EV ratio conditions. Framing effects were found for Type 1, and a large framing effect was found for Type 3. No framing effect was detected for Type 2 despite the variation in the EV ratio. These findings indicate that decision making operates on simplified rather than exact numerical information, which is more consistent with fuzzy-trace theory.
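To make the EV ratio manipulation concrete, the sketch below computes the expected value of a risky option and divides it by the sure amount. Defining the EV ratio as EV(risky) / EV(sure) is an assumption inferred from the text, where a ratio of about 0.8 favors the sure option and a ratio of about 1.2 favors the risky option. The first example uses the Asian disease numbers quoted in the introduction. The second uses a hypothetical sure amount chosen only to produce a ratio of 0.8 and is not one of the study's stimuli.

```python
from fractions import Fraction


def expected_value(outcomes):
    """Expected value of a gamble given (probability, amount) pairs."""
    return sum(Fraction(p) * amount for p, amount in outcomes)


def ev_ratio(risky_outcomes, sure_amount):
    """Assumed definition: EV of the risky option divided by the sure amount."""
    return expected_value(risky_outcomes) / sure_amount


# Asian disease problem, positive frame: 1/3 chance that 600 are saved.
risky = [(Fraction(1, 3), 600), (Fraction(2, 3), 0)]

balanced = ev_ratio(risky, 200)      # sure option saves 200 people
sure_favored = ev_ratio(risky, 250)  # hypothetical sure amount, for illustration

print(balanced, float(sure_favored))
```

Exact fractions are used so that a probability of 1/3 does not introduce rounding error into the ratio.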
4 General Summary: Mini Meta-Analysis
To provide an aggregate effect across the three studies, we conducted a mini internal meta-analysis using the package “metafor” (Viechtbauer, 2010; Liu, Liu, Wang & Zhang, in press) in R. The meta-analysis was intended to provide a more complete test of our hypothesis that the framing effect varies across task types. Nine framing effect sizes from the present paper were used, one for each of the three task types in each of the three studies. The results revealed a significant overall framing effect (total N = 2917), with the risky probability in the sure options significantly lower than in the risky options (Hedges’ g = −0.53, SE = .18, p = .003, 95% CI [−0.939, −0.117]). Moreover, the effect sizes were significantly heterogeneous (Q(8) = 168.30, p < .001), which indicated that further moderator analyses were necessary.
A subgroup meta-analysis was conducted to examine the moderating effect of task type. The results indicated that the framing effect differed across task types (F(2, 6) = 82.44, p < .020). More specifically, a classical framing effect was detected in task Type 1: the risky probability in sure options was significantly lower than in risky options, with a medium effect size (Hedges’ g = −0.46, SE = .07, p = .020, 95% CI [−.741, −.176]). In Type 2, no framing effect was found (Hedges’ g = .05, SE = .06, p = .531, 95% CI [−.227, .332]). In Type 3, a framing effect with a large effect size was detected (Hedges’ g = −1.16, SE = .08, p = .005, 95% CI [−1.504, −.821]). The meta-analysis results were consistent with our hypothesis that the framing effect varies across task types, which is more consistent with fuzzy-trace theory than with prospect theory.
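The logic of pooling effect sizes and testing their heterogeneity can be sketched with a random-effects model. The version below uses the DerSimonian-Laird estimator of between-study variance, swapped in here because it has a closed form; metafor's default estimator is REML, so this is not a reproduction of the reported analysis. The inputs are the three subgroup estimates and standard errors reported above, reused purely as illustrative inputs, because the nine study-level effect sizes that were actually pooled are not listed in the text.

```python
import math


def random_effects_dl(effects, variances):
    """Random-effects pooling with the DerSimonian-Laird tau^2 estimator."""
    # Fixed-effect (inverse-variance) weights and pooled estimate.
    w = [1.0 / v for v in variances]
    sum_w = sum(w)
    fixed = sum(wi * yi for wi, yi in zip(w, effects)) / sum_w

    # Cochran's Q: weighted squared deviations from the fixed-effect estimate.
    q = sum(wi * (yi - fixed) ** 2 for wi, yi in zip(w, effects))
    df = len(effects) - 1

    # Between-study variance, truncated at zero.
    c = sum_w - sum(wi ** 2 for wi in w) / sum_w
    tau2 = max(0.0, (q - df) / c)

    # Random-effects weights add tau^2 to each sampling variance.
    w_star = [1.0 / (v + tau2) for v in variances]
    pooled = sum(wi * yi for wi, yi in zip(w_star, effects)) / sum(w_star)
    se = math.sqrt(1.0 / sum(w_star))
    return {"pooled": pooled, "se": se, "q": q, "df": df, "tau2": tau2}


# Subgroup estimates (Hedges' g) and standard errors for Types 1, 2, and 3.
g = [-0.46, 0.05, -1.16]
se = [0.07, 0.06, 0.08]
result = random_effects_dl(g, [s ** 2 for s in se])

print({key: round(value, 3) for key, value in result.items()})
```

A Q value far above its degrees of freedom, as these inputs produce, is the pattern that motivates a moderator analysis by task type.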
4.1 Replications evaluation
Following the approach proposed by LeBel et al. (2019), we evaluated whether the findings of the replications are consistent with the original study. Three distinct statistical aspects of a replication result are considered: (1) whether a signal was detected, (2) consistency of the replication effect size (ES) relative to the original study ES, and (3) the relative precision of the replication ES estimate relative to the original study. The results, summarized in Figure 4, showed that for all three studies, the ES 90% confidence interval excludes 0 and includes the original ES point estimate. This finding indicates that a signal was detected, and the ES was consistent with the original study.
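The first two criteria can be expressed as simple interval checks. The sketch below assumes that the effect sizes being compared are the frame by task type interaction effects (η2p) with the 90% confidence intervals reported for Studies 1 to 3, and that the original estimate is η2p = .107; the exact inputs behind Figure 4 are not stated here. The third criterion, relative precision, is not covered because the original study's interval is not given in the text.

```python
from dataclasses import dataclass


@dataclass
class ReplicationResult:
    label: str
    es: float
    ci_low: float
    ci_high: float


def evaluate(rep: ReplicationResult, original_es: float):
    """Return (signal detected, consistent with the original point estimate)."""
    signal = rep.ci_low > 0 or rep.ci_high < 0               # CI excludes zero
    consistent = rep.ci_low <= original_es <= rep.ci_high    # CI covers original ES
    return signal, consistent


ORIGINAL_ES = 0.107  # interaction effect reported for the original study

# Interaction effects and 90% CIs as reported for the three studies.
studies = [
    ReplicationResult("Study 1", 0.088, 0.063, 0.115),
    ReplicationResult("Study 2", 0.104, 0.073, 0.135),
    ReplicationResult("Study 3", 0.106, 0.075, 0.137),
]

for study in studies:
    signal, consistent = evaluate(study, ORIGINAL_ES)
    print(f"{study.label}: signal={signal}, consistent={consistent}")
```

Under these assumptions, all three studies satisfy both checks, in line with the summary above.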
5 General Discussion
The goal of this study was to assess the replicability of the findings presented by Kühberger and Tanner (2010) that removing the zero complement of a risky option can eliminate the framing effect, whereas removing the non-zero complement enlarges it. These findings suggest that people fail to infer the logical complement when provided with an incomplete description and that the framing effect depends on an incomplete description of options. The findings first indicate that complementarity does not hold between the various problem descriptions, and an incompletely described problem remains ambiguous. The second implication of their findings pertains to the claim that the classic risky choice framing effect proposed by Tversky and Kahneman (1981) is evidence for irrational preferences. Furthermore, among the accounts of the framing effect, the original finding favored fuzzy-trace theory, in accordance with the hypothesis that thinking may be intuitive and operate on simplified information, over prospect theory and other models that assume a more computational decision-making process. Consistent with the original findings, the very close direct replication in Study 1 of the current research confirmed that the framing effect was changed by removing complements from risky options, using the same items adopted by Kühberger and Tanner (2010). Evaluation of the replications indicated that a signal was detected and that the ES was consistent with the original study. Together with the results found using samples from the United States and Germany, the results of the present study using Chinese samples add to the accumulating evidence that the elimination of the framing effect under incomplete choice descriptions is a general human tendency rather than a product of culture.
We broadened the original findings by manipulating the probabilities of risky options and the EV ratios of risky and certain options. In Study 2, the results indicated that the original findings could be reproduced under large-, medium-, and small-probability conditions. These findings imply that variations in probability are irrelevant to the behavioral pattern across the different types of choice descriptions, which favors fuzzy-trace theory rather than prospect theory. According to the assumptions of fuzzy-trace theory, we can speculate that individuals were not sensitive to probabilities when completing the choice task in this study and were likely to reduce the quantitative information (e.g., 25% probability that 40 children will survive) to its qualitative gist (e.g., someone will survive). Similarly, the variation in EV ratios in Study 3 had no impact on the original findings; participants in all three EV ratio conditions (0.8, 1, and 1.2) showed similar behavioral patterns. This finding also challenges prospect theory, which holds that decision making is based on a tendency to favor the option with larger prospects (Kahneman & Tversky, 1979; Tversky & Kahneman, 1992). On that account, the proportion of respondents choosing a risky option should be higher when the EV ratio is larger. However, the risk preference in Study 3 remained unchanged under all conditions. This result also seems to support the claim of fuzzy-trace theory that decision making operates on simplified rather than exact numerical information. Together, the results of Studies 2 and 3 further confirm the reliability and robustness of the study of Kühberger and Tanner (2010) and the account of the framing effect given by fuzzy-trace theory.
Moreover, based on the gist representations proposed by fuzzy-trace theory, we summarized how categorical and ordinal gist translation characterizes the experimental stimuli of Studies 1–3 (see Tables A4 and A5 in the supplement). According to the results of the current studies, categorical gist translation seemed more pivotal than ordinal gist translation. Actually, following the ordinal gist translation, the choice task is indeterminate. For example, for the positive framed program in Study 2, “a lower amount of people will certainly be saved (sure program) is pitted against a lower probability that more people will be saved (risky program)” within each option pair. Moreover, based on the results of Study 3, for the same type and frame programs, the choice proportions between different probability and EV ratio conditions were not significantly different (range of difference in each block: 0–0.12). These results imply that participants’ preferences did not change despite the variations in the probabilities and the EV ratio. Together, these findings did not support ordinal gist translation. Therefore, according to the results, participants in the current research were more likely to adopt categorical gist translation in risky decision making.
It would be unwise to reject prospect theory on the basis of the current studies. Our findings show that the explanation of the framing effect by prospect theory is silent about how people deal with information that has not been disclosed and that people may not tend to calculate in accordance with this continuous selection process, but rather prefer to operate on simplified information, as hypothesized by fuzzy-trace theory. However, in these studies, multiple trials were presented rather than a one-shot choice, which would increase the difficulty in estimating EV. In addition, individual differences, such as thinking style, interact with environmental factors, which could lead to adaptive decision strategies. Therefore, one avenue for future research is to subdivide decision tasks and individual differences to explore the applicability of different decision models.
In addition, for the classical type of framing effect (i.e., Type 1), the results of the current study showed that different probability levels did not influence the framing effect. In contrast to our findings, a meta-analysis of Asian-disease-type experiments revealed that participants demonstrated a stronger framing effect for large rather than for small probabilities of gain (Kühberger, Schulte-Mecklenbeck & Perner, 1999). Other research shows that trials with a large probability (70%) of gain led to the framing effect, whereas those with a small probability (30%) did not (Gosling & Moutier, 2017). The differences may lie in the thinking styles of individuals and in the experimental settings. The three studies of the current research consistently supported fuzzy-trace theory; that is, participants in our research were inclined to simplify information processing. Simon, Ravez and Maglione (2004) found that the framing effect was moderated by the combination of the need for cognition and the depth of processing. Similarly, Mahoney, Buboltz, Levin, Doverspike and Svyantek (2011) found that individuals scoring high on heuristic processing were more likely to show a framing effect. Correspondingly, in Study 2, the framing effect was also detected at the small probability level.
Following the proposal by Simons, Shoda and Lindsay (2017), we specified constraints on the generality of our findings. First, given that this effect has been observed for a diverse range of participants (US and European populations in the original study; Chinese in the current study), we expect that these findings can be generalized to samples from other cultures. Currently, crowdsourced research projects in which multiple laboratories independently conduct the same study are prevalent. For instance, following the example of the Many Labs initiatives (Klein et al., 2014), the Many Labs 2 project conducted pre-registered replications of 28 classic and contemporary published findings to examine the variation in effect magnitudes across samples and settings (Klein et al., 2018). Similarly, Moshontz et al. (2018) created a distributed network of laboratories called the Psychological Science Accelerator (PSA), which aims to make crowdsourced research more commonplace in psychology, promote diversity in crowdsourcing, and increase the efficiency of large-scale collaborations. Accordingly, future replications and extensions could be conducted across multiple laboratories from different cultures to test the generality of the findings across locations.
Second, as a direct replication and extension, the current research used four different environment scenarios that follow the original study. We expect the results to be generalized to environmental and health risk decision-making situations. However, whether the findings can be generalized to other domains, such as financial and economic scenarios, remains unclear. One avenue for future research could use monetary scenarios or other health-medical scenarios to test whether the conclusions reached by the current and original studies applied to other domains. Finally, the original study’s sample and that of our current replication were both recruited online. Participants make decisions between binary choices in specific scenarios. We expect the results to be generalized to online situations. However, we do not have evidence that the findings will occur in offline situations with multiple choices.
In conclusion, the current research provides converging evidence to support and broaden the original findings of Kühberger and Tanner (2010). That is, removing the zero or non-zero complement of the risky option can largely eliminate or enlarge the framing effect, respectively. Moreover, these findings were also obtained when the probability of the risky option or the EV difference between the risky and certain options was varied. Therefore, we consider this study to be compelling evidence for considering fuzzy-trace theory in explaining risky decision making with information that has not been disclosed.