1. Introduction
Nudges, often described as freedom-preserving interventions that steer people in particular directions, attempt to change behavior without imposing mandates or significantly altering material incentives (see Osman et al. (Reference Osman, Fenton, Pilditch, Lagnado and Neil2018) or Thaler and Sunstein (Reference Thaler and Sunstein2021) for an overview). These behavioral interventions have been applied across a variety of domains, including conservation, recycling, weight loss, medicine adherence, and the general promotion of health and well-being (see Thaler and Sunstein (Reference Thaler and Sunstein2021) and Mertens et al. (Reference Mertens, Herberz, Hahnel and Brosch2022) for overviews), and have also been applied to the financial domain. Examples of behavioral interventions in finance include changing the default on pensions, so that a portion of an employee’s salary is put into retirement saving unless they opt out (Thaler and Benartzi, Reference Thaler and Benartzi2004), or removing the minimum repayment amount for credit cards, to stop people from anchoring on the minimum repayment (Sakaguchi et al., Reference Sakaguchi, Stewart, Gathergood, Adams, Guttman-Kenney, Hayes and Hunt2022), thereby increasing their credit card debt repayments.
1.1. Prior research
Despite the widespread use of behavioral interventions, both in and out of the financial domain, there have been debates about their acceptability, desirability and effectiveness, some of which we address here. Prior work suggests that most people find behavioral interventions acceptable, at least if they are the kind of interventions that have been adopted or are under serious consideration by contemporary governments (Hagman et al., Reference Hagman, Andersson, Västfjäll and Tinghög2015; Jung and Mellers, Reference Jung and Mellers2016; Osman et al., Reference Osman, Fenton, Pilditch, Lagnado and Neil2018; Petrescu et al., Reference Petrescu, Hollands, Couturier, Ng and Marteau2016; Reisch and Sunstein, Reference Reisch and Sunstein2016; Reisch et al., Reference Reisch, Sunstein and Gwozdz2016; Venema et al., Reference Venema, Kroese and De Ridder2018). At the same time, there are many qualifications to this general proposition. For example, people will not approve of behavioral interventions that they believe to be inconsistent with their values and interests (Reisch et al., Reference Reisch, Sunstein and Gwozdz2016). Let us offer a few details.
1.1.1. Nudging oneself or nudging others
Some studies find that people’s support for behavioral interventions is higher when they are given a justification of the policy in terms of its effects on people in general than when they are given a justification in terms of the effects on themselves (Cornwell and Krantz, Reference Cornwell and Krantz2014). There is also evidence that people believe behavioral interventions to be more effective for others than for themselves, and that their judgments of the acceptability of behavioral interventions are predicted by how effective they anticipate the interventions will be on others’ behavior (van Gestel et al., Reference van Gestel, Adriaanse and de Ridder2021). A systematic review of the acceptability of government intervention to change health-related behaviors found that support for the interventions was highest among those not engaging in the targeted behavior (Diepeveen et al., Reference Diepeveen, Ling, Suhrcke, Roland and Marteau2013). For example, those who do not smoke are much more accepting of interventions trying to reduce smoking, even though everyone, including those who smoke, knows that smoking has adverse health consequences.
1.1.2. Intervention characteristics
Prior work has explored the potential effects of transparency, messenger, and perceived manipulation and ethics.
Transparency
Some prior work has found that people evaluate behavioral interventions more favorably when they are transparent and when people are aware of the process that leads to behavioral change (Diepeveen et al., Reference Diepeveen, Ling, Suhrcke, Roland and Marteau2013; Felsen et al., Reference Felsen, Castelo and Reiner2013; Jung and Mellers, Reference Jung and Mellers2016; Osman et al., Reference Osman, Fenton, Pilditch, Lagnado and Neil2018; Petrescu et al., Reference Petrescu, Hollands, Couturier, Ng and Marteau2016; Reisch and Sunstein, Reference Reisch and Sunstein2016; Reisch et al., Reference Reisch, Sunstein and Gwozdz2016, Reference Reisch, Sunstein and Gwozdz2017; Sunstein, Reference Sunstein2016c). One explanation of the preference for transparency is that it enables people to maintain a sense of agency over the behavior being targeted by the intervention (Osman, Reference Osman2014). Free choice is underpinned by a sense of agency and so, relative to opaque interventions, if people know how behavior change is achieved, they think they can more easily choose to do otherwise, thus preserving their autonomy (Lin et al., Reference Lin, Osman and Ashcroft2017; Osman et al., Reference Osman, Lin and Ashcroft2017).
One concern about transparency, however, is that it may render the intervention less effective, if not completely ineffective (see Grüne-Yanoff (Reference Grüne-Yanoff2012) for this line of argument). This concern appears to lack empirical support. Many interventions are transparent by their nature; consider a warning, a reminder, or a disclosure of information. Research by Loewenstein et al. (Reference Loewenstein, Bryce, Hagmann and Rajpal2015) does not find that transparency renders interventions, particularly a weak default intervention, less effective or ineffective. Testing end-of-life care preferences in a laboratory experiment, they find no evidence that informing participants that they were presented with a weak default, how this default works, and the other conditions present in the study influences the default’s effectiveness, as measured by participants changing their decision at the end of the experiment. Similarly, Kroese et al. (Reference Kroese, Marchiori and De Ridder2016), in a field experiment on healthy food choices, find no evidence that making subjects aware of the purpose behind a default intervention has any effect.
Research looking at consumer protection measures in several hypothetical and marginally incentivized consumer-related experiments similarly finds no evidence that stressing the potential behavioral influence of a pro-self as well as a pro-social default reduces their effectiveness (Steffel et al., Reference Steffel, Williams and Pogacar2016). Research by Bruns et al. (Reference Bruns, Kantorowicz-Reznichenko, Klement, Jonsson and Rahali2018) tests for two different types of transparency and their combined effect (knowledge of the potential influence of the default and its purpose) and how they influence the effect of the default. They conduct a laboratory experiment where participants are nudged toward making contributions to carbon emission reduction by introducing a default value. Similar to the aforementioned studies, their findings demonstrate that information on the potential influence combined with the purpose of the default, or just its purpose, does not significantly affect contributions, which were increased by the default value. Again, transparency with regard to the behavioral intervention implemented did not render it less effective. Similarly, research in diverse laboratory settings finds that a lack of transparency has little or no effect on people’s experience of or evaluation of interventions (Michaelsen and Sunstein, Reference Michaelsen and Sunstein2023).
None of these studies rules out the possibility of scenarios under which transparency can backfire. Transparency might a) reduce the effectiveness of interventions (because it induces reflection or because people do not endorse the underlying goals), b) make interventions counterproductive (because people show reactance when they do not like being nudged), c) make interventions even more effective (because people would understand and support the underlying goals), or d) have no real impact on effectiveness at all (Sunstein, Reference Sunstein2016b).
Hypothesis 1. People have more positive attitudes toward transparent financial interventions.
Perceived Manipulation and Concerns around Ethics
Are nudges manipulative? To answer that question, we need a definition of manipulation (Sunstein, Reference Sunstein2016c). Some behavioral interventions are not plausibly characterized as manipulative; consider information that is relevant to consumer choices or a truthful warning about the risks associated with certain products. On the other hand, some people have expressed concern that certain interventions, including perhaps default rules, might turn out to be manipulative (Bovens, Reference Bovens2009; Hausman and Welch, Reference Hausman and Welch2010; Rebonato, Reference Rebonato2014; Waldron, Reference Waldron2014), as they might ‘operate in the dark’, prevent real structural change and infantilize adults, reducing their autonomy (Bubb and Pildes, Reference Bubb and Pildes2013; Burgess, Reference Burgess2012; Conly, Reference Conly2014; Halpern, Reference Halpern2016; Schmidt and Engelen, Reference Schmidt and Engelen2020).
Human reasoning has sometimes been said to consist of two families of cognitive operations: System 1 (automatic, fast, potentially biased) and System 2 (reflective, slow, deliberative) (Kahneman, Reference Kahneman2011). System 1 interventions are those that target automatic processes, whereas System 2 interventions tend to be more informational and appeal to deliberative processes. Research by Hagman et al. (Reference Hagman, Andersson, Västfjäll and Tinghög2015) found that interventions making active use of inertia and inattention (System 1) received significantly less support. Work by Felsen et al. (Reference Felsen, Castelo and Reiner2013) found similar results, showing that participants were more likely to accept job offers from a company applying a System 2 intervention compared to a System 1 intervention, although the latter was still viewed favorably. There is some evidence that whether people prefer System 1 or System 2 interventions depends on the frame (Davidai and Shafir, Reference Davidai and Shafir2020) and on the potential effectiveness (Davidai and Shafir, Reference Davidai and Shafir2020; Sunstein, Reference Sunstein2016c), with System 1 nudges having been found to be, on average, more effective (Hummel and Maedche, Reference Hummel and Maedche2019). Effectiveness here can be framed in terms of objective measures (Cadario and Chandon, Reference Cadario and Chandon2019; Davidai and Shafir, Reference Davidai and Shafir2020; Hummel and Maedche, Reference Hummel and Maedche2019), but potentially also in terms of individuals judging themselves to have made better decisions (Clavien, Reference Clavien2018; Jung and Mellers, Reference Jung and Mellers2016; Michaelsen et al., Reference Michaelsen, Johansson and Hedesström2024). Interestingly, the interventions that turn out to be highly effective, or even the most effective, have been found to receive lower acceptance; approval levels increased only with the perceived effectiveness of the intervention (Cadario and Chandon, Reference Cadario and Chandon2019). This finding highlights a need to correct misconceptions about which interventions work best, in addition to highlighting effectiveness in general, to increase overall approval. This finding is also in line with guidelines proposed in the FORGOOD framework (Lades and Delaney, Reference Lades and Delaney2022), which proposes that the ethical acceptability of behavioral interventions is partially determined by the public judging the intervention (‘the means’) based on its goal and its effectiveness (‘the end’).
Although the end may sometimes be judged to justify the means, research by Turetski et al. (Reference Turetski, Rondina, Hutchings, Feng and Soman2023) shows that large variation remains. The authors study how the perceived ethics of an intervention varies across specific intervention types (i.e., defaults, incentives), the domains in which they are delivered (i.e., organ donation, retirement savings), and how the rationale for their use is presented (i.e., loss framing, resistibility). They observe significant effects of domain and the type of intervention, as well as a significant interaction between domain and the type of intervention, suggesting that certain types of interventions may differ in perceived ethics depending on their domain, or that nudges in certain domains may be deemed more ethical depending on what type of intervention is used. These effects also persist for perceptions of threat to autonomy and expected success. In terms of intervention types, Turetski et al. (Reference Turetski, Rondina, Hutchings, Feng and Soman2023) find that defaults were, on average, rated as significantly less acceptable and more autonomy-threatening than all the other interventions, regardless of domain. These findings are similar to research by Jung and Mellers (Reference Jung and Mellers2016), which also showed opposition to opt-out defaults for organ donations, as well as less favorable views of System 1 nudges (e.g., defaults and sequential orderings) as compared to System 2 nudges (e.g., educational opportunities or reminders). System 1 nudges were perceived as more autonomy threatening but, more interestingly, System 2 nudges were perceived as more effective for better decision making and more necessary for changing behavior, though there is evidence to the contrary (Davidai and Shafir, Reference Davidai and Shafir2020; Hummel and Maedche, Reference Hummel and Maedche2019). In addition to lower acceptance of defaults, Turetski et al. (Reference Turetski, Rondina, Hutchings, Feng and Soman2023) also found lower acceptability for social proof-based interventions. Notably, the authors do find that participants in their study tended to find most of the interventions acceptable and non-threatening to autonomy.
Research by Bruns and Perino (Reference Bruns and Perino2023) shows that the maintenance of autonomy is a core determinant of intervention acceptability; when comparing recommendations, defaults, and mandates as three possible interventions to improve climate protection, the three were rated as increasingly ‘freedom threatening’ (curtailing autonomy), and these ratings correlated positively with levels of reactance, with mandates provoking the strongest reactance. However, there is also work indicating that behavioral interventions are not perceived as curtailing autonomy at all (Wachner et al., Reference Wachner, Adriaanse and De Ridder2021), and that the implementation of behavioral interventions, whether transparent or opaque, did not change choice satisfaction. Similarly, Michaelsen et al. (Reference Michaelsen, Johansson and Hedesström2024) find that although participants who received a prosocial opt-out default nudge made more prosocial choices, they did not report lower autonomy or choice satisfaction than participants in opt-in default or active-choice conditions. This finding persisted even when the presence of the behavioral intervention was disclosed (transparency), and when monetary choice stakes were introduced. Adding more nuance to the ethical concerns around autonomy, specifically with regard to behavioral interventions using defaults, Arvanitis et al. (Reference Arvanitis, Kalliris and Kaminiotis2022) find that experiences of autonomy and choice satisfaction, in addition to depending on the intervention type (defaults), also depend on the overall choice architecture in which the intervention operates. In their study, participants faced a hypothetical choice of health insurance plans. The results showed that when there were three health plans to choose from, participants nudged by having one plan pre-selected (vs. no default plan) gave significantly lower ratings on one of three autonomy sub-scales. However, this negative effect vanished when participants faced nine options to choose from.
The difficulty of the debate around the manipulativeness and ethics of behavioral change interventions, as raised by Turetski et al. (Reference Turetski, Rondina, Hutchings, Feng and Soman2023), is partly a product of the possible incomparability of domains: ‘We should avoid making generalized statements about the ethics of choice architecture interventions, and instead focus on exploring specific implementations of choice architecture interventions to better understand their acceptability in the public eye’ (p. 7). However, the trend in findings seems to indicate a preference for System 2 nudges (Davidai and Shafir, Reference Davidai and Shafir2020; Felsen et al., Reference Felsen, Castelo and Reiner2013; Hagman et al., Reference Hagman, Andersson, Västfjäll and Tinghög2015; Jung and Mellers, Reference Jung and Mellers2016), regardless of domain, on which we base our hypothesis.
Hypothesis 2. People have more positive attitudes toward System 2 financial interventions.
Messenger
Those who impose the intervention—the messenger—have been found to affect attitudes toward interventions as well. Some research finds that people trust interventions that are developed and proposed by researchers more than those that are developed and proposed by government (Osman et al., Reference Osman, Fenton, Pilditch, Lagnado and Neil2018). Research has also found that trust in government affects the acceptability of government interventions (Branson et al., Reference Branson, Duffy, Perry and Wellings2012), and it has been suggested that when negative attitudes to interventions are found, they may stem from mistrust of government (Jung and Mellers, Reference Jung and Mellers2016). In support of this, Bang et al. (Reference Bang, Shu and Weber2020) found that the acceptability of interventions depends on who designs and implements them and that these differences in acceptability were explained by perceived differences in the intention of the designer. Consistent with this, Tannenbaum et al. (Reference Tannenbaum, Fox and Rogers2017) found that people’s support for an intervention depended on whether they were told that it had been chosen by a policy-maker they supported or one they opposed (Bush vs Obama administration).
If we apply these messenger effects to the financial sector, we can identify several key parties who can propose and implement financial interventions: regulators and policy makers, independent researchers, and commercial organisations such as credit unions and banks. It is possible that the former two would find widespread support for their interventions, if they are perceived as acting in the interest of the relevant people. By contrast, credit unions, banks and other for-profits might be assumed to have incentives that directly oppose those of the people. Prior research has shown that since the financial crisis, attitudes toward the financial services industry have become more negative (Bennett and Kottasz, Reference Bennett and Kottasz2012), and some authors have concluded that there is now a crisis of trust in that sector (Bachmann et al., Reference Bachmann, Gillespie and Kramer2011; Sapienza and Zingales, Reference Sapienza and Zingales2012).
This lack of trust may spill over into a mistrust of the interventions proposed by for-profit entities in financial services. A review of ‘creepy’ interventions by de Jonge et al. (Reference de Jonge, Verlegh and Zeelenberg2022) showcases the example of ING, a Dutch bank, which decided to leverage large amounts of personal and transaction data to determine which customers had recently become parents. Having determined this, the bank chose to tailor its product advertisements and ‘nudge’ these young parents to invest in financial products for their newborns. This led to widespread public and media outrage. The messenger, the practice, and the goal were widely condemned. The argument can be made that this intervention could only be classified as ‘sludge’ (Thaler, Reference Thaler2018), and that it fails to comply with any of the tenets of the FORGOOD framework (Lades and Delaney, Reference Lades and Delaney2022). Although financial regulations have tightened—the Dutch regulator (Autoriteit Financiële Markten) warned other financial institutions about this practice—it remains to be seen whether there is a messenger effect in the presence of financial interventions.
Hypothesis 3. People have more positive attitudes toward financial interventions proposed ‘in general’ than those specifically proposed by a bank.
With the foregoing research in mind, and aligned with our three hypotheses, we hypothesize that people most approve of transparent, System 2 financial interventions. To account for differences in intervention characteristics, we also split interventions into two frames: spending and saving.
We strongly suspected that, in addition to a preference for transparent and System 2 interventions, people also prefer interventions that are savings-framed rather than spending-framed. Although those frames are the inverse of each other (e.g., to spend less is to save more), there has been work suggesting that positive framing (focusing on increasing the desirable behavior), as compared to negative framing (focusing on decreasing the undesirable behavior), can increase the acceptability of an intervention (Nelson et al., Reference Nelson, Bauer and Partelow2021; Ouvrard et al., Reference Ouvrard, Abildtrup and Stenger2020; Rafaï et al., Reference Rafaï, Ribaillier and Jullien2022). However, we do have to emphasize that this evidence is mixed, and derived exclusively from interventions in the sustainability domain.
Hypothesis 4. People approve more of financial interventions that are savings rather than spending framed.
1.1.3. Demographic characteristics
Prior research has found a possible effect of demographic characteristics on attitudes toward, and usage of, interventions. Work by Beshears et al. (Reference Beshears, Choi, Laibson, Madrian and Skimmyhorn2016) showed that low-income and younger people were most likely to stick with automatic enrollment into 401(k) plans, as well as with the default contribution rate. Work by Shah et al. (Reference Shah, Osborne, Lefkowitz Kalter, Fertig, Fishbane and Soman2023) also looked at retirement savings, but did so in a Mexican context. They tested an intervention in which they sent out text messages with different framings. The most effective framing was found to be one focusing on ‘family security’, where the intervention emphasized that retirement savings can help secure a financially stable future for one’s family. The researchers found a strong effect of age. Those over the age of 28 saw an increase in contribution rates of 89%. For those under the age of 28, however, this specific framing of the intervention backfired, decreasing contributions by 53%. This finding was explained by the average age of starting a family in Mexico being 28 years, which meant that the intervention was relevant to those around and over 28 years of age, but not to those who are younger. This is a feature of both the demographic and the intervention itself, raising questions about how to frame interventions so that they are specific, yet not exclusionary. A further review of individualised or ‘smart’ nudging can be found in Hallsworth (Reference Hallsworth2023).
This work is a collaborative effort with a large Australian financial institution, hence its exclusive focus on financial interventions. We also think that limiting ourselves to a single domain of interventions is a strength, given the differences in intervention acceptability across domains (Turetski et al., Reference Turetski, Rondina, Hutchings, Feng and Soman2023), allowing us to compare intervention types (e.g., defaults, social norms) more directly.
2. Method
We designed an online survey to test people’s attitudes toward financial interventions. Resources for this survey, including the pre-registration, survey design, exact scenarios used and analytics files, can be found on the Open Science Framework, https://osf.io/6sfpm/.
2.1. Design
We used a mixed factorial design, with 2 levels for framing (spending, savings), 2 levels for system (System 1, System 2) and 2 levels for transparency (transparent, opaque), to give a 2 x 2 x 2 design. We prepared 18 unique financial interventions (see Appendix O). Of these 18 interventions, 12 are System 1 interventions and 6 are System 2 interventions (System), derived from interventions that have been conducted, or are being considered, with the Australian financial institution. Of those 18, evenly spread across the two Systems, 9 financial interventions are framed in terms of spending, and 9 are framed in terms of savings (Frame). For all 18 scenarios we created transparent versions, explaining exactly how the intervention works in terms of the behavioral aspects it targets; this is the Transparency condition. This leads to a total of 36 scenarios. Figure 1 below shows all 36 conditions for clarification. The numbers in Figure 1 refer to the unique numbers for each intervention (scenario), for which details can be found in Appendix O.
To take account of the risk of information overload and fatigue, we did not present participants with all 36 scenarios. We divided the 36 scenarios into 4 blocks: Transparent x Spending (with 6 being System 1 and 3 being System 2), Opaque x Spending, Transparent x Saving and Opaque x Saving. Each participant saw only 2 scenarios from each of these 4 blocks, drawn at random. The order of the four blocks was also randomized to ensure there were no ordering effects. Figure 1 shows the blocks and how they can be drawn from for clarification.
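A minimal sketch of this allocation logic is shown below. The scenario numbers used here are placeholders (the true scenario-to-block mapping is given in Figure 1 and Appendix O), and the function and variable names are ours, not part of the survey implementation.

```python
import random

# Hypothetical sketch of the per-participant draw described above: 2 scenarios
# from each of the 4 blocks (8 in total), with the block order randomized.
# Scenario IDs below are placeholders; see Figure 1 / Appendix O for the real mapping.
BLOCKS = {
    ("transparent", "spending"): [19, 21, 23, 25, 27, 29, 31, 33, 35],
    ("opaque", "spending"):      [1, 3, 5, 7, 9, 11, 13, 15, 17],
    ("transparent", "saving"):   [20, 22, 24, 26, 28, 30, 32, 34, 36],
    ("opaque", "saving"):        [2, 4, 6, 8, 10, 12, 14, 16, 18],
}

def draw_scenarios(rng: random.Random) -> list[int]:
    """Return 8 scenario IDs: 2 drawn at random from each block, block order shuffled."""
    block_keys = list(BLOCKS)
    rng.shuffle(block_keys)                       # randomize the order of the four blocks
    drawn = []
    for key in block_keys:
        drawn.extend(rng.sample(BLOCKS[key], 2))  # 2 scenarios per block, without replacement
    return drawn

if __name__ == "__main__":
    print(draw_scenarios(random.Random(42)))      # e.g., one participant's allocation
```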
We tested people’s attitudes toward these financial interventions across six dependent variables: approval, benefit, ethics, manipulation, the likelihood of use, and the likelihood of use when proposed by a bank. To measure these six variables, participants were asked to state their agreement with a bipolar Likert Scale (1 = Strongly Disagree, 3 = Neither Disagree nor Agree, 5 = Strongly Agree), with each of the interventions, to the following statements:
• ‘I approve of this intervention’
• ‘I see clear benefits to this intervention’
• ‘I find this intervention ethical’
• ‘I find this intervention manipulative’
• ‘I would make use of this intervention’
• ‘I would make use of this intervention if my bank were to apply this’
We have highlighted the phrase ‘this intervention’ in the above statements because it was tailored to the intervention being proposed. For example, in scenarios 17, 18, 35, and 36, which all feature the Goal Feedback intervention, the statements to be rated would range from ‘I approve of this feedback’ to ‘I would make use of this feedback if my bank were to apply it’.
2.2. Participants
We acquired a sample representative of the Australian population. In total, 2,100 participants were tested. The initial 100 were used to test for technical issues (there were none), followed by the remaining 2,000.
Our pre-registration outlines our five exclusion criteria. First, we excluded all participants who did not consent to the study. Second, we excluded all participants who had demographic characteristics for which we had already reached our quota. Third, we excluded all participants who did not complete the survey. These three exclusions left us with 1,891 participants. Our fourth and fifth exclusion criteria focused on the quality of the responses given. Criterion four removed all participants who completed the survey in under three minutes, and criterion five removed all participants for whom we could not spot variation in at least half of their answers, as we suspected such response patterns to be due to the participant rushing through the survey. We did not find either of these quality issues in our data, so we did not have to discard responses based on these criteria, leaving us with a final sample of 1,891 participants.
The initial number of participants (n = 2,100) was based on internal resource constraints. A power analysis reveals that for both the t-tests and the linear models conducted we reach a power of over .99, mostly due to the partially within-subjects design of our study.
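For illustration, the kind of post hoc power calculation described here can be sketched with statsmodels; the effect size used below (Cohen’s d = 0.1, a small effect) is our assumption for the example and is not taken from the pre-registration.

```python
# Illustrative power calculation, not the authors' exact analysis. A small effect
# (d = 0.1) with n = 1,891 already yields power above .99 for a two-sided t-test.
from statsmodels.stats.power import TTestPower

power = TTestPower().power(effect_size=0.1, nobs=1891, alpha=0.05, alternative="two-sided")
print(f"Power for d = 0.1, n = 1891: {power:.3f}")
```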
2.3. Procedure
Upon entering the survey, participants were first presented with a quick introduction and asked whether they consented to participate. If they consented, participants were asked for their demographic data to determine quota fit. If the quota had not yet been met, participants proceeded to the first scenario describing a financial intervention. After reading the scenario, participants were asked to indicate their opinion of the financial intervention on each of the six dependent variables by assigning a value on a 1–5 bipolar Likert scale (1 = ‘strongly disagree’, 5 = ‘strongly agree’). This continued for eight scenarios, drawn at random from the four pre-determined blocks of interventions. After having rated all eight scenarios, participants were also asked to complete the Cognitive Reflection Task (Frederick, Reference Frederick2005), Lusardi’s Big Three as a measure of financial literacy (Lusardi, Reference Lusardi2015) and to self-assess their understanding of financial management (1 = very low, 5 = very high) across eight different domains (e.g., mortgages, investing).
The experiment was presented via Qualtrics and launched with Dynata, a crowdsourcing platform for participant recruitment, to obtain a representative sample of Australian citizens. All participants were financially compensated for their time ($5.60 AUD), calculated according to Dynata’s rates.
2.4. Analysis
We also pre-registered our analysis. Initially, we aimed to run 3 models per dependent variable (approval, benefit, ethics, manipulation, likelihood of use and likelihood of use when proposed by a bank): the first model regressing the dependent variable only against our 3 main variables of interest (Transparency, System and Frame); the second model also including demographic characteristics (age, gender, income, state, living conditions [metro, rural]); and the third model also incorporating the financial literacy score, cognitive reflection task scores and an accumulated rating of self-perceived efficacy across 8 domains in personal finance management. All of these models were pre-registered as simple linear models (OLS).
3. Results
We decided to deviate from our pre-registration in a number of meaningful ways. First, we only present the models including all measured variables, as we no longer see the first and second models as adding any additional value. Second, we apply a Linear Mixed Model rather than a simple OLS, to allow us to account for the (fixed) effects of both the individual interventions and the participants. We have analyzed our results in simple OLS form as well, and these results can be found in Table B1 in Appendix B. The appendix also houses the model-free means and standard deviations of all 36 scenarios across the six different dependent variables (Table A1, Appendix A), as the individual scenario is an important fixed effect.
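One plausible way to express this specification in Python is sketched below, assuming a long-format data frame with one row per rating; the column names (approval, transparency, participant_id, scenario_id, and so on) are our assumptions rather than the authors’ variable names, and the per-participant random intercept is one common way to implement the participant control described above.

```python
# Sketch of the model described above, under assumed column names. Scenario
# (intervention) enters as fixed-effect dummies; participants are handled with a
# random intercept, one common implementation of the participant control.
import pandas as pd
import statsmodels.formula.api as smf

df = pd.read_csv("ratings_long.csv")  # hypothetical long-format file, one row per rating

formula = (
    "approval ~ C(transparency) + C(system) + C(frame)"
    " + age_band + C(gender) + income_band + C(area) + C(state)"
    " + crt_score + self_efficacy + financial_literacy"
    " + C(scenario_id)"  # intervention (scenario) fixed effects
)

model = smf.mixedlm(formula, data=df, groups=df["participant_id"])
result = model.fit()
print(result.summary())
```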
Table 1 below shows the results of mean-difference testing between our six dependent variables and the point of neutrality. Because we used a bipolar 5-point Likert scale, our neutrality point is 3. We find that ratings for all variables, except for perceived manipulation, are statistically significantly different from neutrality. We discuss the results per hypothesis in turn.
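The neutrality test can be illustrated with a one-sample t-test against the scale midpoint of 3, as sketched below; the column names are our assumptions.

```python
# Sketch of the neutrality comparison: mean ratings on the 1-5 Likert scale vs. 3.
# Column names are assumptions; requires scipy >= 1.10 for confidence_interval().
import pandas as pd
from scipy import stats

df = pd.read_csv("ratings_long.csv")  # hypothetical long-format data

for dv in ["approval", "benefit", "ethics", "manipulation", "use", "use_bank"]:
    ratings = df[dv].dropna()
    res = stats.ttest_1samp(ratings, popmean=3)
    ci = res.confidence_interval(confidence_level=0.95)
    print(f"{dv}: mean = {ratings.mean():.3f}, t = {res.statistic:.2f}, "
          f"p = {res.pvalue:.3g}, 95% CI for the mean = [{ci.low:.3f}, {ci.high:.3f}]")
```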
3.1. Transparency
First, we hypothesized that people would have more positive attitudes toward transparent financial interventions, as compared to opaque financial interventions, in line with prior research on the positive effect of transparency on attitudes toward interventions. Figures 2 and 3 show that our study fails to replicate this effect. Across all six dependent variables, we fail to discern any effect of transparency, across any of the 36 scenarios. This lack of effect is further shown in Table 2, where Transparency fails to reach a level of significance in influencing any of the six attitudes toward the financial interventions, even when controlling for both individual participants and interventions as fixed effects.
Note: The reference levels for the independent variables are: Transparency (Opaque), System (1), Frame (Savings). Reference levels for the covariates are: Gender (Female), Area (Metropolitan). Age and Income are banded numerical variables, with Age ranging from 18–24 to 85 and older, and Income ranging from under $10,000 to over $150,000. The CRT (Cognitive Reflection Test), Self-efficacy and Financial Literacy variables are all numerical variables with ranges of 0–3, 8–40, and 0–3, respectively. The models include dummies for all Australian states. $^{*}p<0.1$; $^{**}p<0.05$; $^{***}p<0.01$.
3.2. System
Second, we hypothesized that people would have more positive attitudes toward financial interventions that target System 2, as compared to financial interventions that target System 1, in line with prior research on the more negative attitudes held toward System 1 interventions (perhaps because they are deemed to be more manipulative). Figure 4 shows that System 2 interventions were rated significantly more positively than their System 1 counterparts, except for manipulation, on which they were rated significantly lower. Table 2 shows this further: the coefficients for System 2, as compared to System 1, financial interventions are significantly higher (lower for manipulation).
3.3. Messenger
Third, we hypothesized that people would have more positive attitudes toward financial interventions in general than toward financial interventions proposed by a bank, with the latter having a very strong messenger effect. A repeated measures t-test reveals the presence of a messenger effect; people rate their likelihood of making use of the intervention when proposed by a bank as 0.065 (95% CI [0.052, 0.078]) lower compared to when the messenger of the intervention is not disclosed. We would like to emphasize, however, that this messenger effect does not render the attitudes toward the intervention negative. Comparing the overall mean rating of financial interventions proposed by a bank to the neutrality point, we continue to find that the likelihood of using an intervention, even when proposed by a bank, with a mean of 3.197, is significantly higher than neutrality (difference 95% CI [0.177, 0.217]; Table 1).
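The repeated measures comparison behind this messenger effect can be sketched as a paired t-test over the two likelihood-of-use ratings given for each scenario; the column names are again our assumptions.

```python
# Sketch of the paired comparison: likelihood of use when proposed by a bank vs.
# when the messenger is not disclosed, paired within each rating. Column names assumed.
import pandas as pd
from scipy import stats

df = pd.read_csv("ratings_long.csv")  # hypothetical long-format data
paired = df[["use", "use_bank"]].dropna()

res = stats.ttest_rel(paired["use_bank"], paired["use"])
diff = (paired["use_bank"] - paired["use"]).mean()
print(f"Mean difference (bank - undisclosed): {diff:.3f}, "
      f"t = {res.statistic:.2f}, p = {res.pvalue:.3g}")
```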
3.4. Frame
Last, we hypothesized that people would have more positive attitudes toward financial interventions that were savings rather than spending framed, as the savings frame is a more positive way of framing both the intervention as well as the desired behavior. Figure 4 shows support for this hypothesis, with savings framed interventions being rated significantly higher (lower for manipulation) across all dependent variables, without confidence intervals ever overlapping. Table 2 highlights this effect further with the coefficient for spending being significantly lower than that for savings.
3.5. Demographics
In addition to the effects of transparency, system, frame, and messenger, we would like to highlight several demographic and individual effects. Having controlled for individual participants and interventions, we continue to find effects for all demographic variables measured. We consistently find an effect of age, with younger participants holding significantly more positive attitudes toward financial interventions. We also find a significant effect of gender, with women holding significantly more positive attitudes toward interventions than men. We also find an effect of income, with higher income earners holding significantly more positive views toward financial interventions, as well as area effects, with those living in metropolitan areas holding significantly more positive views toward financial interventions than those living in rural areas. Last, looking at our three measures of literacy and efficacy, we find that good performance on the Cognitive Reflection Task is significantly negatively associated with positive attitudes toward financial interventions; that those who rate themselves as having higher financial self-efficacy hold significantly more positive views toward financial interventions; and that higher financial literacy significantly reduces the willingness to make use of an intervention, regardless of whether it is proposed by a bank or not, despite more literate respondents perceiving interventions as less manipulative.
To understand these effects further, we conducted an exploratory analysis of the effect of the demographic variables on the effects of Transparency, System, and Frame of the interventions. We used model-based recursive partitioning to create decision trees, capped at a depth of 4 for legibility, for each of the six dependent variables; the decision trees and coefficient results can be found in Appendices C–N. We find that age, income, and gender are key decision nodes in our decision trees for all six dependent variables, with the strongest partition centering around age. The first partition for all six dependent measures is based on age, with those below 45 being further partitioned into high and low income earners, and then further partitioned by gender for low income earners, and again on age for higher income earners. This partitioning shows that young, low-income women approve of financial interventions significantly more than their male counterparts (Appendix I), specifically having a significantly higher intercept (3.669, compared to 3.569) as well as a significantly higher coefficient for System 2 interventions (0.152, compared to −0.054). For the higher income earners under 45, this effect is further driven by age, with those under 35 having higher approval of financial interventions compared to those aged 35–44. Looking at those over 45 years of age, we again see a partitioning based on income, although this split is less clear as it partitions those earning between $80,000 and $150,000 from all other income groups. This group is then again partitioned by age, where we again find that participants between 45 and 75 years of age have higher approval of financial interventions than those 75 and older. This holds true for the other income groups as well. This pattern largely repeats for the other five variables, with the first partitioning being age-based, splitting those under 45 from those over 45 years of age. We explore this effect further in Figure 5 below, showing that those over 45 years of age have significantly lower ratings for financial interventions across all six dependent variables. The second partitioning is income-based, with lower earners being partitioned away from higher earners, although this partitioning is less clean. The third partitioning is often gender-based, as seen with approval, benefit, manipulation, and likelihood of use when proposed by a bank, but further income and age partitioning occurs here as well, making the results more difficult to interpret. Appendices I–N show the coefficients associated with the impact of the partitions on the six dependent variables, through the changing coefficients for the three key predictor variables: Transparency, System, and Frame.
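There is no standard Python implementation of model-based recursive partitioning (in R this analysis is typically done with partykit’s lmtree). The sketch below therefore does not reproduce the partitioning algorithm itself; it only illustrates how the node-level linear models can be inspected once a split (here, the age-45 split reported above) is known. Column names are our assumptions.

```python
# Not MOB itself: a sketch of fitting the Transparency/System/Frame model separately
# within the subgroups identified by the tree, to inspect how coefficients differ.
import pandas as pd
import statsmodels.formula.api as smf

df = pd.read_csv("ratings_long.csv")  # hypothetical long-format data
node_formula = "approval ~ C(transparency) + C(system) + C(frame)"

for label, subgroup in [("under 45", df[df["age_years"] < 45]),
                        ("45 and over", df[df["age_years"] >= 45])]:
    fit = smf.ols(node_formula, data=subgroup).fit()
    print(f"--- {label} (n ratings = {len(subgroup)}) ---")
    print(fit.params.round(3))
```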
4. Discussion
4.1. Attitudes toward financial interventions
We find that, on average, people are highly supportive of financial interventions: ratings are above neutrality for our five positive dependent variables, and not significantly different from neutrality for the negative dependent variable (rating the intervention in terms of it being perceived as manipulative), indicating that, on average, the financial interventions were not experienced as manipulative in nature. This finding of support is in line with most prior work reviewed, especially that of Sunstein (Reference Sunstein2015, Reference Sunstein2016a).
In terms of our hypotheses, we find no support for our first hypothesis: transparency has no significant impact on attitudes toward the financial interventions. This is in line with previous experimental and field work showing that transparency need not change the effect of an intervention (Bruns et al., Reference Bruns, Kantorowicz-Reznichenko, Klement, Jonsson and Rahali2018; Kroese et al., Reference Kroese, Marchiori and De Ridder2016; Loewenstein et al., Reference Loewenstein, Bryce, Hagmann and Rajpal2015; Michaelsen et al., Reference Michaelsen, Johansson and Hedesström2024; Steffel et al., Reference Steffel, Williams and Pogacar2016). It is possible, however, that the lack of an effect of transparency is mostly due to the interventions being quite clear in their purpose, with further explanation seeming unwarranted even when participants were assigned to see opaque interventions.
With respect to our second hypothesis, we do find support for the proposition that people prefer System 2 interventions. System 2 interventions were rated higher across all five positive dependent variables, as well as being rated as less manipulative. This is again in line with prior work (Arad and Rubinstein, Reference Arad and Rubinstein2018; Felsen et al., Reference Felsen, Castelo and Reiner2013; Gold et al., Reference Gold, Lin, Ashcroft and Osman2020; Hagman et al., Reference Hagman, Andersson, Västfjäll and Tinghög2015; Pechey et al., Reference Pechey, Burge, Mentzakis, Suhrcke and Marteau2014; Petrescu et al., Reference Petrescu, Hollands, Couturier, Ng and Marteau2016; Sunstein, Reference Sunstein2015). However, we do need to note that this comparison is not as clean as the comparison between transparent and opaque interventions. As both Figure 1 and Table O1 (Appendix O) show, there is an imbalance between the System 1 and System 2 interventions proposed. First, we test six different System 1 interventions as compared to three different System 2 interventions. Although we do cross-validate them in terms of Frame as well as Transparency, it is difficult to compare a System 1 to a System 2 intervention directly. For example, taking an opaque and spending-framed intervention, one of the System 1 interventions focuses on social norms, making participants aware of how much other people are spending and how this can influence their own spending habits, whereas one of the System 2 interventions is grounded in taking a longer-term view, proposing a spending-tracker intervention to raise awareness of people’s own spending habits. Although we report a significant difference between System 1 and System 2 across all six dependent variables, Figures 2 and 3 reveal that this may be largely driven by strong attitudes toward specific interventions (e.g., scenarios 5, 7, 17, and 18). This is a limitation of this specific comparison that warrants highlighting. With regard to specific types of interventions, we did test two different default interventions for both spending (scenarios 1, 19) and savings (scenarios 2, 20), both when opaque (1, 2) and transparent (19, 20). Figures 2 and 3 reveal there to be no effect of Transparency on this intervention type, but t-tests reveal a significant difference between defaults and the other interventions. Looking at defaults for spending, we find that defaults were rated 0.097 (95% CI [−0.180, −0.014]) lower in terms of approval, 0.154 (95% CI [−0.232, −0.077]) lower in terms of benefit and 0.151 (95% CI [−0.233, −0.070]) lower in terms of ethics than the other interventions, but we find no significant differences in manipulation or the likelihood of use, both in general and when proposed by a bank. For defaults using the savings frame, we find a 0.095 (95% CI [0.022, 0.168]) higher rating for benefit, a 0.228 (95% CI [0.144, 0.312]) higher rating for likelihood of use, and a 0.179 (95% CI [0.092, 0.265]) higher rating for likelihood of use when proposed by a bank. As such, our findings are in line with prior work arguing that approval of interventions is not only type dependent (default), but also domain (spending, savings) dependent (Turetski et al., Reference Turetski, Rondina, Hutchings, Feng and Soman2023), with no effect of transparency (Michaelsen et al., Reference Michaelsen, Johansson and Hedesström2024; Wachner et al., Reference Wachner, Adriaanse and De Ridder2021).
We find support for our third hypothesis as well, finding that people rate their likelihood of making use of a financial intervention slightly higher when the messenger for the intervention is not explicitly disclosed. As suggested by the literature, this could be due to negative perceptions of the financial sector (Bachmann et al., Reference Bachmann, Gillespie and Kramer2011; Bennett and Kottasz, Reference Bennett and Kottasz2012; Sapienza and Zingales, Reference Sapienza and Zingales2012), or a perceived misalignment between the goals of the individual and the messenger (de Jonge et al., Reference de Jonge, Verlegh and Zeelenberg2022). This effect, however, is slight, and we continue to find positive attitudes toward the financial interventions for both ‘messenger types’, with people indicating a positive likelihood of making use of the intervention.
We also included a framing variable, displaying interventions that were framed in terms of either spending or saving. There is some prior literature to suggest that people prefer a positive frame over a negative frame, but that evidence is mixed and largely derived from the sustainability domain (Nelson et al., Reference Nelson, Bauer and Partelow2021; Ouvrard et al., Reference Ouvrard, Abildtrup and Stenger2020; Rafaï et al., Reference Rafaï, Ribaillier and Jullien2022). We do find a strong preference, which remains consistent across all six measures: people prefer interventions promoting their savings behaviors over interventions curbing their spending behaviors. Further exploration of this result is needed to understand why exactly this is the case. We hypothesized that the savings frame can be perceived as the positive inverse of the spending frame, which is inherently negative (i.e., ‘save more’ vs. ‘spend less’). However, this is merely speculation and warrants further exploration.
We also need to highlight that this result suffers from a similar limitation to our direct comparison of System 1 and System 2 interventions. Table O1 (Appendix O) gives details of the exact interventions tested, and although we attempted to make the spending and savings frames the inverse of each other, this was not always possible. For example, scenarios 3 and 4 come closest to being exact inverses of each other, with the spending frame focusing on social norms in spending and the savings frame focusing on social norms in savings behavior. But even here there are slight differences, with the spending version focusing on grocery spending and the savings version focusing on money saved when switching utility providers. Although we report a significant difference between spending- and savings-framed interventions across all six dependent variables, these comparisons are again not as clean as comparing the transparent and the opaque interventions, and this is again a limitation that warrants highlighting.
4.2. Demographic effects
Beyond the characteristics of the financial interventions themselves, we find consistent demographic effects. On average, younger people, women, those earning higher incomes and those living in metropolitan areas are more favorable toward financial interventions. Using model-based recursive partitioning, we find that this effect is strongest for age, with those under 45 having significantly more positive attitudes toward behavioral interventions.
Prior work has confirmed that demographic characteristics can affect the effectiveness of interventions, but we are not aware of work having looked at demographic characteristics affecting attitudes toward financial interventions, or interventions in general.
We can speculate about the reasons for the effects that we observe. People may reject the need for interventions in their behavior if they are (over)confident that they can manage themselves. Research by Menkhoff et al. (Reference Menkhoff, Schmeling and Schmidt2013) finds that age is negatively correlated with overconfidence, whereas work by Prims and Moore (Reference Prims and Moore2017) finds that it is confidence and not overconfidence that increases with age (the difference being whether the confidence is proportionate to one’s abilities). Work by Hansson et al. (Reference Hansson, Rönnlund, Juslin and Nilsson2008) finds the opposite: age is positively correlated with overconfidence. From these conflicting findings, it is difficult to draw a strong conclusion. Another possible explanation is that with increases in age comes a decrease in support for guidance and automation. Interventions are a form of guidance, and several interventions do rely on a form of automation, or at least delegation. Work by Lee et al. (Reference Lee, Gershon, Reimer, Mehler and Coughlin2021) has shown that age is negatively correlated with acceptance of automation. Further testing reveals that there are significant differences between our age groups for financial literacy and financial self-efficacy. We find that financial literacy increases with age, reaching its peak in age group 65–74 (2.27), followed by ages 55–64 (2.10) and 75–84 (2.08), which are all statistically different from each other. The highest financial literacy score (ages 65–74) is 1.076 (95% CI [1.021, 1.132]) higher than the lowest score (ages 18–24). We find a similar trend for financial self-efficacy, with the highest scores being associated with the higher age groups: those aged 85 and older score themselves at 28.22, followed by those aged 65–74 (28.03) and 75–84 (27.70). However, these three scores are not significantly different from each other. Statistically significantly lower self-efficacy scores appear from age group 55–64, with scores that are 1.97 (95% CI [0.778, 3.165]) lower than the highest score. Further research could explore domains in which younger people are assumed, or rate themselves, to be more knowledgeable and confident than older people, and see whether this high approval of interventions persists in those domains as well.
Similarly, looking at gender effects, research has found that women are less confident in their financial knowledge and decision-making than men are (Barber and Odean, Reference Barber and Odean2001; Beckmann and Menkhoff, Reference Beckmann and Menkhoff2008). This (lack of) confidence does not map onto performance; overconfidence has often been found to be detrimental in financial decision-making, specifically investing. However, it is possible to speculate that those who deem themselves less able (in this case, women) would be more willing to follow advice, guidelines and, here, an intervention. Looking at our own data, we do find that women rate themselves 2.633 (95% CI [2.425, 2.841]) lower in terms of financial self-efficacy, and also hold significantly lower financial literacy scores, which are lower by 0.555 (95% CI [0.523, 0.587]) compared to those of the men. Further research could explore domains in which women are assumed, or rate themselves, to be more knowledgeable and confident than men, and see whether this high approval of interventions persists.
We are unaware of prior work showing that attitudes toward interventions depend on income levels. However, the argument can be made that for those with lower incomes, more is at stake; if the intervention backfires, they stand to forego a relatively larger portion of their income (or overall wealth). This may make lower income individuals more hesitant to follow an intervention, out of a fear of losing control over their money. But this again remains speculative. Within our own data we do again find effects of income on financial literacy and self-efficacy. There was a significant difference in financial literacy scores across the twelve income groups (F(11, 15118) = 25.04, p < 0.001), as well as a significant difference in financial self-efficacy scores (F(11, 15118) = 94.82, p < 0.001). Contrary to the age group differences, these results are not linear in nature. Although the lowest income group has a significantly lower literacy score than the highest income group (0.518, 95% CI [0.420, 0.615]), there are income groups that do not have statistically different scores (e.g., groups 2 and 9, or 3 and 11). The same holds true for financial self-efficacy. We find this effect difficult to interpret and urge further study.
4.3. Implications and applications
A key finding of this research is that attitudes toward interventions vary with individual characteristics. We find that intervention attitudes are significantly more positive for women, higher income earners and those who are younger (under 45). These findings may have implications for market segmentation and personalization for both policy and feature/product design in the financial domain, particularly given the increased use of artificial intelligence to deliver a broader range of customer experiences.
This finding is relevant to the increasing interest in individualised or personalised interventions (Dimitrova et al., Reference Dimitrova, Mitrovic, Piotrkowicz, Lau and Weerasinghe2017; Mills, Reference Mills2022; Peer et al., Reference Peer, Egelman, Harbach, Malkin, Mathur and Frik2020; Schöning et al., Reference Schöning, Matt and Hess2019), or smart nudging (Mele et al., Reference Mele, Spena, Kaartemo and Marzullo2021). For a recommendation on studying and applying personalised interventions, see Hallsworth (Reference Hallsworth2023). With the availability of data in financial services being extremely high, there is also the opportunity to apply data-driven interventions to customers, to help them reach their (financial) goals. The data-driven intervention differs from the personalized intervention in that it is based not on demographic characteristics (e.g., age, gender, income), but on observed behavior (e.g., amount spent per spending category, savings held, the level of debt incurred and for what, types of investments held and their spread). For example, a customer who has an outstanding credit card debt amounting to several months’ wages, with a strong tendency to shop online at night, could benefit from a data-driven intervention that emphasises how much the customer has already spent on online shopping, and how that money could be used to pay down their credit card debt, saving them money on the interest incurred by holding the debt. However, we recognize that there are potential ethical concerns with respect to who applies the intervention, how it is applied, and what it is aiming to achieve. These concerns deserve further attention and research.
4.4. Future research
Our simplest and most important finding is that people have positive attitudes toward interventions in the financial domain. At the same time, these attitudes do depend on how the intervention is framed and which System it targets. Strong preferences were found for System 2 interventions that were framed in terms of savings. Additionally, we find demographic effects, showing that intervention attitudes are significantly more positive for those under 45, women, higher income earners and those living in metro areas. These findings may have implications for market segmentation and personalization for both policy and feature/product design in the financial domain. As we have mentioned, it is not yet known whether, and to what extent, people are comfortable with high levels of personalization when it comes to interventions, and what they deem acceptable data to use to personalise interventions. Research has found that the data deemed acceptable to use for personalization is limited in scope, pertaining mostly to age and gender (Piller and Müller, Reference Piller and Müller2004), but acceptability was also found to depend on the goal of the personalization (de Jonge et al., Reference de Jonge, Verlegh and Zeelenberg2022). Whether similar findings hold for interventions in the financial domain or beyond remains to be seen.
Data availability statement
Replication data and code can be found on the Open Science Framework at https://osf.io/6sfpm/.
Acknowledgments
We would like to express our sincere gratitude to the Behavioural Science Centre of Excellence at the Commonwealth Bank of Australia, for their helpful feedback and support throughout both the design and dissemination process. Their expertise and insights were invaluable in shaping our research and helping us to overcome challenges.
We also want to thank the organisers and attendees of the BEST 2023 Conference at the Queensland University of Technology for their contributions to our research.
We are also grateful to the Customer Data & Analytics Office at the Commonwealth Bank of Australia, for providing financial support for this research. Their generous funding allowed us to conduct our study and complete this work.
We are grateful as well to the Program on Behavioral Economics and Public Policy at Harvard Law School.
Author contributions
Conceptualization: van den Akker; Sunstein. Methodology: van den Akker; Sunstein. Funding acquisition: van den Akker. Data curation: van den Akker. Formal analysis: van den Akker. Data visualisation: van den Akker. Writing (original draft): van den Akker. Writing (review and editing): Sunstein. All authors approved the final submitted draft.
Funding statement
This research was funded by the Customer Data & Analytics Office at the Commonwealth Bank of Australia.
Competing interests
Dr. van den Akker is employed by the Commonwealth Bank of Australia, which funded this research.
Ethical standards
The research meets all ethical guidelines, including adherence to the legal requirements of the study country.
Appendix A
Appendix B
Note: The reference levels for the instrumental variables are: Transparency (Opaque), System (1), Frame (Savings). Reference levels for the covariates are: Age (18–24), Gender (Female), Income (<$10,000), Area (Metropolitan). The CRT (Cognitive Reflection Test), Self-efficacy and Financial Literacy variables are all numerical variables with ranges of 0–3, 8–40, and 0–3, respectively. The models include dummies for all Australian States. $^{*}p<$ 0.1; $^{**}p<$ 0.05; $^{***}p<$ 0.01
Appendix C
Appendix D
Appendix E
Appendix F
Appendix G
Appendix H
Appendix I
Note: Income levels have been condensed for legibility, with level 1 being <$10,000, level 2 being $10,000–$19,999 up to levels 10, 11, and 12 which are based on $90,000–$100,000, $100,000–$150,000, and over $150,000. Depth is capped at 4 for legibility. The count (n) is measured in ratings, not participants, with each participant rating 8 interventions each.
Appendix J
Note: Income levels have been condensed for legibility, with level 1 being <$10,000, level 2 being $10,000–$19,999 up to levels 10, 11, and 12 which are based on $90,000–$100,000, $100,000–$150,000, and over $150,000. Depth is capped at 4 for legibility. The count (n) is measured in ratings, not participants, with each participant rating 8 interventions each.
Appendix K
Note: Income levels have been condensed for legibility, with level 1 being <$10,000, level 2 being $10,000–$19,999 up to levels 10, 11, and 12 which are based on $90,000–$100,000, $100,000–$150,000, and over $150,000. Depth is capped at 4 for legibility. The count (n) is measured in ratings, not participants, with each participant rating 8 interventions each.
Appendix L
Note: Income levels have been condensed for legibility, with level 1 being <$10,000, level 2 being $10,000–$19,999 up to levels 10, 11, and 12 which are based on $90,000–$100,000, $100,000–$150,000, and over $150,000. Depth is capped at 4 for legibility. The count (n) is measured in ratings, not participants, with each participant rating 8 interventions each.
Appendix M
Note: Income levels have been condensed for legibility, with level 1 being <$10,000, level 2 being $10,000–$19,999 up to levels 10, 11, and 12 which are based on $90,000–$100,000, $100,000–$150,000, and over $150,000. Depth is capped at 4 for legibility. The count (n) is measured in ratings, not participants, with each participant rating 8 interventions each.
Appendix N
Note: Income levels have been condensed for legibility, with level 1 being <$10,000, level 2 being $10,000–$19,999 up to levels 10, 11, and 12 which are based on $90,000–$100,000, $100,000–$150,000, and over $150,000. Depth is capped at 4 for legibility. The count (n) is measured in ratings, not participants, with each participant rating 8 interventions each.
Appendix O