1 Introduction
Many states of the world are not directly observable and can only be inferred from cues in the environment that probabilistically relate to the object of inference. The psychological mechanisms of inferences based on probabilistic cues have been investigated in various domains, including distance perception (e.g., Brunswik, 1944), personality assessment in social interactions (e.g., Funder, 1996; Gosling, Ko, Mannarelli, & Morris, 2002), market prediction in an economic setting (e.g., Borges, Goldstein, Ortmann, & Gigerenzer, 1999), legal judgments (e.g., Holyoak & Simon, 1999; Glöckner & Engel, 2013), diagnostic decision making (e.g., Swets, Dawes, & Monahan, 2000), and many more. Probabilistic inference can be a difficult task, potentially involving multiple complicating factors. Besides the empirically well-investigated challenge that individuals face in selecting and/or (appropriately) weighting and integrating information from various cues, one less intensely investigated complicating factor is that cue information can be incomplete in that potentially important cue values are missing or unknown.
Suppose, for example, that the police must determine which of two suspects, Hans or Bob, is more likely to have stolen money from a company’s safe, in order to decide which suspect should be the focus of investigations. Pieces of evidence (cues) for making this judgment are incomplete. For some cues, information will be available for one of the suspects but missing for the other. A pen might have been lost by the culprit at the crime scene, and a roommate of Hans is certain that Hans does not own such a pen. For Bob it is unknown whether he owned such a pen or not. How would the police treat this missing cue information? One possibility would be to ignore the pen altogether. Another possibility would be to assume that Bob also does not own the pen, on the principle that missing cue values are always replaced by negative instances in case of doubt. The police could, however, also rely on more complex reasoning processes to infer the missing cue value. The officer might take into account whether it is a rare pen or not, that is, the base-rate of people owning this pen. If it is a very rare pen she might infer that Bob also does not own such a pen. However, if the officer knows that the pen very likely comes from one of the two suspects, she will most likely infer that the pen belonged to Bob. In that case, the missing information would be inferred from the knowledge that cue values are dependent in that a negative cue value for one suspect implies a positive cue value for the other.
On the one hand, assuming the decision to be important, one might expect that the officer does not just ignore the missing information but aims to infer it from the properties of the decision environment in a kind of second-order probabilistic inference. On the other hand, it could be argued that adding such a second-order inference about the incomplete information would dramatically increase the cognitive effort needed to solve the task. Considering that deliberate reasoning is limited by cognitive constraints (Simon, 1956) and that in complex tasks the available information might already occupy working memory capacity completely, simply ignoring this information seems psychologically plausible as well. In this paper, we investigate people’s adaptive usage of second-order probabilistic inference about incomplete information in complex probabilistic inferences.Footnote 1
Empirical results indicate that people use a variety of mechanisms for inferring missing cue values (see Garcia-Retamero & Rieskamp, 2008, 2009, for overviews). Several studies show that people tend to infer missing cue values as negative information when the context suggests that information is withheld on purpose, such as in a job interview or in the marketing of a product (Huber & McCann, 1982; Jaccard & Wood, 1988; Johnson & Levin, 1985; Lim & Kim, 1992; Stone & Stone, 1987; Simmons & Lynch, 1991; Yates, Jagacinski, & Faber, 1978). Other studies (Burke, 1995; Ford & Smith, 1987; Körner, Gertze, Bettinger, & Albert, 2007), however, show that people infer missing cue values from other available information and, for example, make inferences from a high price to a high quality of a product (Levin, Johnson, & Faraone, 1984). Still other studies show that the framing of the decision situation (e.g., as involving gains or losses) results in more inferences of positive or negative information (Highhouse & Hause, 1995; Levin, Johnson, Russo, & Deldin, 1985). Yet other studies show that people replace cue information by average past values (Meyer, 1981; Slovic & MacPhillamy, 1974) or infer incomplete information by similarity of the current decision with decisions in the past (Sanbonmatsu, Kardes, Posavac, & Houghton, 1997).
Taking into account these findings, Garcia-Retamero and Rieskamp (2008, 2009) proposed that people make inferences about incomplete information that are adapted to the respective structure of the environment: when it is more likely for negative information to be missing than for positive information, it is rational to infer a negative cue value (Garcia-Retamero & Rieskamp, 2008) and people do increasingly use this inference mechanism (Garcia-Retamero & Rieskamp, 2009). When such a contingency is absent, people use other inference mechanisms instead, such as replacing missing cue values by the average valence of information in the environment (i.e., the base-rate of positive or negative information). Thus, people do not use a single inference mechanism but choose the inference mechanism that fits the decision environment best.
We extend this work by investigating whether people also use the discrimination rate of a cue as an inference mechanism for missing cue values. We derive conditions under which inferences based on the discrimination rate are particularly successful. We thereby provide a map with the characteristics of the decision environment for identifying regions of rationality for inferences from incomplete information (Hogarth & Karelaia, 2006). We then report three studies that ask whether people use the discrimination rate when it is a valid inference mechanism and adaptively switch to other inference mechanisms if necessary.
1.1 The discrimination rate of a cue as an inference mechanism for incomplete cue information
The discrimination rate DR of a cue can be formally defined as the relative number of decisions in which a cue makes a distinct prediction for or against an option (Gigerenzer & Goldstein, 1996). For example, a binary cue that speaks for one of two options and against the other in 90 decision trials (i.e., + − and/or − +) and speaks for or against both options in the remaining 10 trials (i.e., + +, − −) has a discrimination rate of DR = 90/100 = .90. The importance of the discrimination rate of a cue for implementing a frugal information search has been intensely discussed in the literature (Bröder & Newell, 2008; Gigerenzer & Goldstein, 1999; Newell, Rakow, Weston, & Shanks, 2004; Rakow, Newell, Fayers, & Hersby, 2005). Independent of the cue validity (that is, how often a cue points towards the better option in the trials where it discriminates), the informative value of a cue decreases if it only rarely discriminates between options.Footnote 2 When the search for cues is costly, people should, and actually do, take the discrimination rate of cues into account and search for cues with low discrimination rates less often (Newell, Rakow, Weston, & Shanks, 2004).
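The definition can be illustrated with a minimal Python sketch. The function name `discrimination_rate` and the coding of cue values as +1 and −1 are assumptions made here for illustration; the split of the 90 discriminating and 10 non-discriminating trials into sub-types is arbitrary and does not affect the result.

```python
from typing import Sequence, Tuple


def discrimination_rate(pairs: Sequence[Tuple[int, int]]) -> float:
    """Share of trials in which a binary cue gives different values to the two options."""
    if not pairs:
        raise ValueError("need at least one trial")
    discriminating = sum(1 for a, b in pairs if a != b)
    return discriminating / len(pairs)


# Example from the text: 90 discriminating and 10 non-discriminating trials.
# The split within each group (50/40 and 6/4) is arbitrary.
trials = [(+1, -1)] * 50 + [(-1, +1)] * 40 + [(+1, +1)] * 6 + [(-1, -1)] * 4
print(discrimination_rate(trials))  # 0.9
```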
In the current study, we examine the use of the discrimination rate in second-order inferences for incomplete cue values. In situations in which cue information is only partially available, in that a positive or negative cue prediction is available for one option but no information is available for the other option, the discrimination rate of a cue can potentially be used to infer an incomplete cue value. More formally, when a cue has a consistent discrimination rate DR > .5 and the available cue value for one option is positive, it is more likely that the missing cue value is negative than positive (and vice versa for a negative available cue value). In contrast, when the discrimination rate DR of a cue is below .5, it is more likely that the missing cue value has the same valence as the available cue value as compared to having a different value. That is, as long as the discrimination rate is different from .5 (i.e., discrimination in half of the decision trials), the discrimination rate can be used to infer incomplete information.
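The inference rule just described can be written down in a few lines. This is a sketch only: the function name `infer_missing_from_dr` and the +1/−1 coding are assumptions for illustration, and returning `None` at DR = .5 simply marks the case in which the discrimination rate carries no information.

```python
from typing import Optional


def infer_missing_from_dr(available: int, dr: float) -> Optional[int]:
    """Return the more likely missing cue value (+1 or -1) given the available
    value for the other option, or None when DR == .5 (no information)."""
    if available not in (+1, -1):
        raise ValueError("available cue value must be +1 or -1")
    if not 0.0 <= dr <= 1.0:
        raise ValueError("discrimination rate must lie in [0, 1]")
    if dr > 0.5:
        return -available  # cue mostly discriminates: infer the opposite value
    if dr < 0.5:
        return available   # cue rarely discriminates: infer the same value
    return None


print(infer_missing_from_dr(+1, 0.9))  # -1
print(infer_missing_from_dr(+1, 0.3))  # 1
```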
1.2 Success of using the discrimination rate of cues for inferring incomplete cue information
As shown in the previous section, the discrimination rate of a cue can be used to infer incomplete information in environments with a discrimination rate different from .5, that is, when the cue values from the same cue are not independent. Expected success rates of using discrimination rates to infer incomplete information depend on the structure of the environment, as derived in Appendix A. The analysis reveals that the probability of a correct inference of a missing cue value increases with the absolute difference of the discrimination rate of a cue from .5. Hence, the success of inferences based on the discrimination rate increases when discrimination rates approach 0 or 1, meaning that the dependency between cue values increases. To assess the potential value of using discrimination rates as compared to other inference mechanisms, such as replacing incomplete information in proportion to the base-rates of cue values (BR) or always replacing missing cue values with a negative or positive cue value (+ vs. − in Figure 1), we analyzed the success rates of those inference mechanisms in various environments. Since the accuracy of these alternative inference mechanisms depends on the relative frequency of positive and negative cue values in trials for which the cue does not discriminate in the environment, we compared the inference mechanisms (Figure 1) using four exemplary levels of this factor (i.e., extremes with pos = .5 and pos = 1 and two medium values with pos = .7 and pos = .8 to illustrate the relation between variables). Whenever positive and negative cue values are equally frequent in an environment, all inference mechanisms discussed in the past literature yield a correct inference with a probability of .5, that is, success at chance level (Figure 1, upper left panel). Stated differently, in such environments all previously discussed inference mechanisms provide no information gain beyond guessing and are thus useless.
In contrast, inferences based on discrimination rates can still be very successful and their success is better than the success of all other strategies for DR ≠ .5 (Figure 1, upper left panel).
When the frequency of positive cue values for non-discriminating cases increases (and thus the frequency of negative cue values decreases), the probability of a correct inference when inferring a positive cue value and when inferring a cue value in proportion to the base-rate of positive information increases (upper right panel and lower panels, Figure 1).Footnote 3 Note, however, that using discrimination rates leads to the most successful inferences of incomplete information compared to the three other mechanisms for a large range of discrimination rates. The shaded yellow regions in Figure 1 indicate the area in which an inference based on the discrimination rate is outperformed by at least one of the alternative mechanisms. Note also that whenever the discrimination rate of a cue in the environment is above .67, the inferences based on discrimination rates strictly outperform all other mechanisms (white regions in Figure 1). This is due to the fact that discrimination rates and base-rates are not independent and that the former restrict the latter (Appendix A). That is, the discrimination rate restricts the range of the base-rate (i.e., the higher the discrimination rate, the less extreme the base-rate). Thus, the analysis shows that the use of the discrimination rate is a powerful and potentially successful inference mechanism for missing cue values in many decision environments.
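The comparison can be made concrete with a small Python sketch. It is not the derivation of Appendix A but a simplified reconstruction under assumptions stated here: in discriminating trials the patterns + − and − + are equally frequent, the position of the missing value is independent of the cue values, and `pos` denotes the share of + + among non-discriminating trials. Under these assumptions the base-rate of positive values is BR = DR/2 + (1 − DR) · pos. The function names are hypothetical.

```python
from typing import Dict


def base_rate(dr: float, pos: float) -> float:
    """Base-rate of positive cue values implied by DR and pos
    (assumes '+ -' and '- +' are equally frequent in discriminating trials)."""
    return dr / 2 + (1 - dr) * pos


def success_rates(dr: float, pos: float) -> Dict[str, float]:
    """Probability of a correct inference of the missing value per mechanism
    (simplified sketch; see the assumptions in the lead-in)."""
    br = base_rate(dr, pos)
    return {
        "DR": max(dr, 1 - dr),          # infer opposite value if DR > .5, same value otherwise
        "+": br,                        # always infer a positive value
        "-": 1 - br,                    # always infer a negative value
        "BR": br ** 2 + (1 - br) ** 2,  # infer '+' with probability BR
    }


for dr in (0.55, 0.67, 0.90):
    print(dr, success_rates(dr, pos=1.0))
```

In this sketch the largest possible base-rate is 1 − DR/2 (at pos = 1), so always inferring a positive value can match or beat the discrimination rate only if DR ≤ 2/3, which is consistent with the threshold of about .67 reported above. With pos = .5 the three alternative mechanisms all return .5.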
Given that using the discrimination rate is such a powerful inference mechanism for missing cue values in many environments, we hypothesize that people are sensitive to discrimination rates when making inferences about missing cue values in the context of probabilistic inferences (H1). Additionally, the analysis shows that, compared to other inference mechanisms, inferences based on discrimination rates are successful and therefore adaptive for environments with a high or low discrimination rate, and that the mechanism is less successful when the discrimination rate is close to .5 (i.e., all yellow shaded areas in Figure 1). In accordance with prior studies showing that people adapt their inferences to the structure of the environment (Garcia-Retamero & Rieskamp, 2009), we therefore test the hypothesis that people adaptively use the discrimination rate only when the discrimination rate is high and that people base their inference concerning incomplete information on other properties of the environment, such as base-rates, if this is not the case (H2).
2 Experiment 1: Probabilistic inferences with incomplete information in 4-cue environments
In the first study, we tested both hypotheses by investigating whether participants rely on discrimination rates as inference mechanisms for incomplete information at all (H1) and whether they adapt their inference mechanism for missing cue values to the properties of the environment (H2). This adaptation should enable individuals to maintain a high performance with respect to the rational solution even in varying environments, so that people who use the inference mechanism that is adaptive to the environment should also show higher performance (H3). The manipulation of discrimination rates and base-rates was implemented by direct instruction. Specifically, participants were explicitly informed about the base-rates and discrimination rates of the cues in the environment.
2.1 Method
2.1.1 Participants
Sixty participants (31 female; mean age = 23.6 years, sd = 3.4) were recruited from the MPI Decision Lab Subject Pool using the online recruiting tool ORSEE (Greiner, 2004). Participants received a show-up fee of 8 € (∼$10.0) and performance-contingent payment for each correct choice of up to 5.25 € (∼$6.0).
2.1.2 Materials and design
The basic paradigm was a hypothetical stock market game adapted from previous research (Bröder, 2003). Participants selected the more profitable of two stocks in a series of independent choice trials (Figure 2). Participants were provided with information from four experts, constituting binary cues, who made recommendations concerning whether the profitability of the respective stock was good or bad. The recommendations of each expert could differ between the two options or could also be positive or negative for both of them. To induce incomplete information, in each trial a recommendation of one expert for one of the options was missing and replaced by a question mark. The four experts differed in cue validity, defined as the proportion of correct decisions in 100 past decision trials in which the cue discriminated between options (Gigerenzer & Goldstein, 1996), with val = {.90, .80, .70, .65}. Cues were presented in order of their validity starting with the most valid cue. Discrimination rate (DR) and the base-rate of positive cue information (BR) were manipulated within participants. In the first condition, the base-rate for experts was high (BR = .73) and the discrimination rate was close to .5 (DR = .55). In the second condition, in contrast, the discrimination rate was high (DR = .90) and the base-rate was close to .5 (BR = .55). In the control condition, both BR and DR were equally high with BR = DR = .67.Footnote 4 Participants were explicitly provided with all information concerning cue validities, base-rates, and discrimination rates (see Figure 2).Footnote 5 Base-rates and discrimination rates were the same for all experts and remained constant over all trials within a condition. Conditions were presented in blocks of counterbalanced order with each block containing 35 trials (described in Table B.1 of Appendix B, and Section 2.1.5) presented in a random order chosen for each participant.
2.1.3 Procedure
Participants received instructions about the stock market game: the information display and the concepts of cue validities, base-rates and discrimination rates were explained in detail. Participants were instructed that validities, base-rates, and discrimination rates of cues are based on 100 prior decision trials of the stock-market game (see also online supplement for instructions used). To assure understanding, participants solved a set of control questions. If participants answered questions incorrectly, they were instructed to re-read the instruction and try again. Additionally, each new block was briefly introduced by an instruction, and further test questions were administered to make sure that participants were aware of the change in base-rates of positive information and discrimination rates.
After completing three exercise trials, participants made 35 choices in each of the three rounds in the stock market game by mouse-click and subsequently indicated their subjective confidence of having chosen the more profitable stock on a continuous slider with the endpoints "completely uncertain" and "completely certain", which were recorded as values of −100 and +100, respectively. The order of the conditions was counterbalanced between participants and the order of decision trials within conditions was randomized for each participant. Participants did not receive feedback concerning the accuracy of their choices during the experiment but they were informed that they would receive summary feedback and a performance-contingent payment of 5 cents for each correct choice.Footnote 6 Finally, demographic data were recorded and participants were debriefed and paid.
2.1.4 Models
A model of choices in probabilistic inference tasks with incomplete information needs to account for a) how people make second-order inferences for missing cue values and b) how they integrate the available and inferred cue values to select one of the options. To address the first question, we tested five inference mechanisms for incomplete information, including the four mechanisms discussed in the previous literature (Garcia-Retamero & Rieskamp, 2008, 2009: base-rate of positive information, positive, negative, and ignoring) and the additional new mechanism based on the discrimination rate of a cue. Concerning the second question, we included four standard models of information integration for probabilistic inference tasks: the non-compensatory Take-the-best heuristic (Gigerenzer, Todd, & The ABC Research Group, 1999; Todd, Gigerenzer, & The ABC Research Group, 2012), a compensatory weighted additive model (Bröder, 2000; Glöckner & Betsch, 2008a; Reimer & Hoffrage, 2006, 2012), a parallel constraint satisfaction network model (Glöckner & Betsch, 2008b), and a random choice model. The resulting 5 × 4 = 20 inference-integration combinations (referred to as models in the following) are listed and described in detail in Table 1.
Note: In the column schematic representation of inferences, a cue with a present cue value and a missing cue value and the inference (→) for the missing cue value is displayed for each inference mechanism; BR = the base-rate of positive information in the environment (e.g., .73); +1 = positive cue value, –1 = negative cue value, and ? = missing cue value. *Note that for TTB and inferences in accord with the base-rate, a missing cue value and a positive cue value (i.e., ? +1) do not lead to a decision and the next most valid cue is inspected, whereas a missing cue value and a negative cue value (i.e., ? –1) lead to a decision for the option with the missing cue value when the base-rate of positive information is above .5.
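The model space can be enumerated as the Cartesian product of the two factors. The labels below are shorthand chosen for this sketch and are not necessarily the names used in Table 1.

```python
from itertools import product

# Shorthand labels (assumed here for illustration only).
INFERENCE_MECHANISMS = ["discrimination rate", "base-rate", "positive", "negative", "ignore"]
INTEGRATION_MODELS = ["TTB", "WADD", "PCS", "RANDOM"]

# Each model is one (inference mechanism, integration model) combination.
models = list(product(INFERENCE_MECHANISMS, INTEGRATION_MODELS))
print(len(models))  # 20
```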
2.1.5 Selection of differentiating decision trials and dependent measures
Cue-patterns (e.g., Figure 2) were selected to allow for differentiating among the 20 models according to the Euclidean diagnostic task selection method (Jekel, Fiedler, & Glöckner, 2012). That is, potential cue-patterns were assessed, sorted, and selected by their ability to differentiate between each pair of models, taking into account the most diagnostic patterns first. In total, only seven cue-patterns were necessary to disentangle all models.Footnote 7 Hence, within each condition we selected seven cue-patterns so that for each pairwise comparison between models there was at least one type of cue-pattern for which models differed concerning choice prediction. Cue-patterns were repeated five times in each condition to allow for a reliable strategy classification.
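The coverage criterion stated above (every pair of models differs on at least one selected cue-pattern) can be checked mechanically. The sketch below covers only this criterion and not the Euclidean diagnostic task selection itself; the function name, the model names, and the toy predictions are hypothetical.

```python
from itertools import combinations
from typing import Dict, List, Sequence, Tuple


def undistinguished_pairs(predictions: Dict[str, Sequence[str]]) -> List[Tuple[str, str]]:
    """Return all model pairs that predict the same choice on every cue-pattern."""
    return [
        (m1, m2)
        for m1, m2 in combinations(sorted(predictions), 2)
        if tuple(predictions[m1]) == tuple(predictions[m2])
    ]


# Toy example: predicted choices (option "A" or "B") for three cue-patterns.
toy_predictions = {
    "model_a": ["A", "A", "B"],
    "model_b": ["A", "B", "B"],
    "model_c": ["A", "A", "B"],  # identical to model_a, so this pair is not separated
}
print(undistinguished_pairs(toy_predictions))  # [('model_a', 'model_c')]
```

A set of cue-patterns satisfies the criterion when the returned list is empty.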
Participants were classified as being best described by one of the 20 models according to the Multiple-Measure Maximum Likelihood strategy classification (Glöckner, 2009, 2010; Jekel, Nicklisch, & Glöckner, 2010). The method relies on a simultaneous maximum-likelihood analysis of individual choices, decision times, and confidence judgments described in Appendix C. We selected decision trials so that in each condition one indicator cue-pattern was included for which all models that include an inference mechanism based on base-rates predict choices for one option while all models that include discrimination rates predict the alternative option.
Our manipulation of base-rates and discrimination rates should be reflected in both measures (which are of course not completely independent, since choice data for the indicator cue-pattern is used in the strategy classification as well). Participants in the condition with a high discrimination rate and lower base-rate of positive cue values are expected to decide more in line with the discrimination rate in the indicator cue-pattern and are also expected to be more often classified as users of models that involve the usage of discrimination rates as compared to other inference mechanisms (and vice versa for the condition with a high base-rate and lower discrimination rate). The condition with an equally high base-rate for positive information and discrimination rate is run to assess if there is a preference for one of the inference mechanisms.
2.2 Results
In agreement with the hypotheses, participants adapt their inference mechanism to the structure of the environment (i.e., base-rates and discrimination rates of cues as given in the instructions): participants show more inferences in accordance with the base-rate for the indicator cue-pattern in the condition with a high base-rate (54%) than in the condition with a high discrimination rate (31%) and they show an intermediate choice proportion (48%) in the condition in which the base-rate and discrimination rate were equal (Figure 3, left panel). The difference in choices between an environment with a high base-rate versus a high discrimination rate is significant, t(59) = 4.23, p < .001, one-tailed, according to a mixed effects model with random intercepts per participant (Gelman & Hill, 2007). Hence, results from the indicator cue-pattern support our second hypothesis that individuals use inference mechanisms adaptively to the structure of the environment.
In line with the results from the analysis of the indicator cue-pattern, the Multiple-Measure Maximum Likelihood strategy classification method based on all decision trials reveals that participants use inference mechanisms adaptively to the structure of the environment. Collapsed over all four integration algorithms, in the high base-rate condition most participants are classified either as users of inference mechanisms relying on the base-rate (30%) or a negative cue value (26%) whereas in the environment with a high discrimination rate most participants are classified either as users of the discrimination rate (46%) or a negative cue value (30%) (see Figure 4, upper panel and Table D.1 in Appendix D). Usage of inference mechanisms is significantly influenced by condition according to a Stuart-Maxwell test, χ²(5) = 18.50, p < .01, providing further support for H2. In the environment with equal discrimination rates and base-rates of positive information, participants show a slight preference for inferences in line with the base-rate of positive information (25% base-rate versus 15% discrimination rate). To double-check the result of the strategy classification, we conducted a global fit test against the saturated model according to Moshagen and Hilbig (2011).Footnote 8 A considerable proportion of strategy classifications (16%) show a significant misfit of the restricted as compared to the unrestricted model when using p < .05 as misfit criterion. 43% of all misfit identifications concern participants classified as users of negative inferences. Additionally, the relative frequency of participants using the base-rate in an environment with a high base-rate decreases from 30% to 22% when excluding misfits. Thus, except for a less pronounced effect in an environment with a high base-rate, conclusions remain the same when misfits are excluded from the analysis.
Performance of participants was measured as the proportion of choices in accordance with the naïve Bayesian solution. Figure 5 shows the average performance of different subgroups of participants for the different conditions. In both the high base-rate and the high discrimination rate condition participants who use the matching inference mechanism show the best performance. In the environment with high base-rates, participants who rely on base-rates show a higher overlap with the naïve Bayesian solution, t(58) = 4.17, p < .001 (one-tailed). In the environment with a high discrimination rate, participants who rely on discrimination rates show a higher overlap, t(58) = 7.05, p < .001 (one-tailed). Hence, there is support for the third hypothesis, that usage of an inference mechanism in line with the environment also pays off in higher performance.
2.3 Discussion
Experiment 1 investigated choice strategies in probabilistic inferences with incomplete cue information. The analysis revealed that discrimination rates of cues provide a highly efficient inference mechanism for incomplete information. In line with our first hypothesis, a substantial proportion of participants used this inference mechanism, which has not been discussed in the literature before. In line with our second hypothesis, choices in the indicator cue-pattern, as well as results from the Multiple-Measure Maximum Likelihood strategy classification, converge in showing that inference mechanisms are used adaptively. Specifically, inference mechanisms based on discrimination rates are more frequently used in the condition with a high discrimination rate and an uninformative base-rate whereas inference mechanisms based on the base-rate are more heavily relied on in the condition with a high base-rate and an uninformative discrimination rate (i.e., close to .5). Additionally, in line with the third hypothesis, performance with respect to the naïve Bayes solution increases if there is a match between the inference mechanism used and the environmental structure.
Although the results so far provide support for our hypotheses, it might be argued that the results could have been partially due to the fact that information on discrimination rates and base-rates was explicitly provided. It is not clear whether results also generalize to more natural situations in which people are not provided with explicit information but must learn properties of the environment by experience. It might also be questioned whether the results concerning application of effortful second order inferences to incomplete information also hold in more complex environments and hence under conditions of increased working memory load. Each of these concerns is plausible. To test them, we conducted two further experiments in which no explicit information concerning base-rates and discrimination rates was provided: participants had to acquire this information from feedback if they wanted to use it. Additionally, the degree of complexity was increased from four to six cues for the third experiment. Since both studies use the same method and reveal similar findings they are presented jointly.
3 Experiments 2 and 3: Learning the base-rate and discrimination rate of cues in 4- and 6-cue environments
3.1 Method
3.1.1 Participants and Design
Sixty-four (41 female; mean age = 24.4 years, sd = 4.6) and fifty (33 female; mean age = 22.7 years, sd = 3.7) participants took part in Experiments 2 and 3, respectively. They were recruited using the same protocol as before. In both studies participants were compensated by show-up fees (Experiment 2: 17.0 €, ∼$21.0; Experiment 3: 7.0 €, ∼$9.0) and additional performance-contingent payment of up to 6.0 € (∼$7.0). The environment structure was manipulated between subjects. In the first condition, the base-rate for positive information was high and the discrimination rate was close to .5 and therefore relatively uninformative (BR = .73, DR = .55). In the second condition, the discrimination rate was high and the base-rate was relatively close to .5 (BR = .55, DR = .9).
We again used the stock market game in both studies but increased the number of cues from four to six in Experiment 3. In contrast to the previous experiment, base-rates of positive information and discrimination rates were not given to participants but could be learned in two initial learning phases each consisting of 100 trials of the stock market game.Footnote 9 The validities of the experts were again explicitly provided (Experiment 2: val = {.90, .80, .70, .65}, Experiment 3: val = {.90, .80, .75, .70, .65, .60}).
3.1.2 Procedure
In contrast to Experiment 1, both studies included an initial learning phase consisting of 200 trials. In the first part of this learning phase, participants made choices in 100 trials and received feedback concerning the missing cue value as well as the better option after each trial. The feedback thus exactly reflected the validities, base-rates and discrimination rates of the respective condition. In the second part of the learning phase, participants again made choices in 100 trials but were additionally asked to infer the missing cue value before they received feedback. Participants were informed that they would receive an additional bonus of 1 € (∼$1.0) if they inferred at least 75 out of 100 pieces of incomplete information correctly. The test phase involved 100 trials consisting of 7 types of cue-patterns repeated 10 times and 30 filler cue-patterns. Participants again received 5 cents for each correct choice in line with the rational model. Within the two learning phases and the final test phase, decision trials were randomized for each participant. After completing the choice trials, participants’ subjective assessments of base-rates and discrimination rates were measured for both training phases and for the test phase and demographic data were recorded.Footnote 10 Finally, participants were debriefed and paid.
3.1.3 Models and dependent variables
We tested the same set of 20 models (i.e., 5 inference mechanisms × 4 information integration mechanisms) as before and again used indicator cue-patterns and strategy classification for identifying the inference mechanism used. Hence, both experiments again included indicator cue-patterns that perfectly discriminated between inferences based on the base-rates and inferences based on discrimination rates. Experiment 2 involved one indicator cue-pattern whereas Experiment 3 involved two indicator cue-patterns (which were also repeated 10 times each). Going beyond Experiment 1, we also recorded the relative number of inferences of positive and negative cue values in the second learning phase to test to what extent participants’ inferences reflect discrimination rates or base-rates.
3.1.4 Selection of cue patterns in the learning and test phase
For the main test phase of Experiment 2 we used the same diagnostic cue patterns as before (Table B.1), and for Experiment 3 we created and selected diagnostic cue patterns with six cues according to the same method as before. The cue patterns for the two preceding learning phases were randomly created but under the restriction that they met the properties of the environment (i.e., BR = .73, DR = .55 versus BR = .55, DR = .9 and the respective validities for each cue). In each trial, one of the cue values was randomly selected to be the missing cue value. Two different sets of 100 trials were created and the order of presentation of the sets was counterbalanced between participants.
3.2 Results
In both experiments the results from Experiment 1 concerning all three hypotheses were replicated. In the indicator cue-patterns, participants showed a higher proportion of choices that indicate the usage of the inference mechanism base-rate (as compared to discrimination rate) in the condition with a high base-rate and a low discrimination rate in Experiment 2 (Figure 3, middle) and Experiment 3 (Figure 3, right). This pattern reverses for the condition with a high discrimination rate and a low base-rate. The change in choice proportions between conditions was significant in Experiment 2 (t(62) = 6.90, p < .001, one-tailed) and Experiment 3 (t(48) = 2.86, p < .01, one-tailed, in a random slope model with both types of tasks nested within participants; Gelman & Hill, 2007), supporting our second hypothesis, which states that individuals adapt their inference mechanism to properties of the environment.
Figure 4 (lower panel) and Table D.1 (Appendix D) report strategy classifications: In the high base-rate condition, the largest group of participants (with identified inference mechanisms) is classified as relying on models involving inferences using base-rates in both experiments. In the high discrimination rate condition, the largest group of participants relies on inferences based on the discrimination rate. The shift in decision strategies between conditions is significant in Experiment 2 (χ²(5, N = 64) = 29.54, p < .001) and Experiment 3 (χ²(5, N = 64) = 17.69, p < .01). In line with our first hypothesis, we again observed a substantial reliance on discrimination rate as inference mechanism when it was appropriate to use it. These findings are noteworthy since information on the discrimination rate was not explicitly provided and people spontaneously relied on discrimination rates based on their experience. Interestingly, the usage of discrimination rate is similar between experiments, indicating that increased complexity does not reduce people’s tendency to adaptively use the inference mechanism. One interesting further observation is that the previously observed high proportion of participants who simply replaced missing cue values by negative values decreased almost to zero in Experiments 2 and 3. Hence, in contrast to decisions from explicitly provided information, experience might be better able to drive out mechanisms that are used by default but are maladaptive in the current environment. In both experience-based experiments, almost no participants fail the test between the classified and the saturated model (Moshagen & Hilbig, 2011; 3 participants in Exp. 2 and 1 participant in Exp. 3).
Similar to Experiment 1, our third hypothesis, which states that a match between the inference mechanism and the environment leads to higher performance than the usage of a different mechanism, is (partially) supported. Participants relying on the base-rate as inference mechanism show higher performance (Figure 6) than participants classified as relying on other mechanisms in the high base-rate condition in Experiment 3 (t(22) = 2.20, p < .05, one-tailed) but not in Experiment 2 (t(32) = 0.02, p = .48, one-tailed). For the high discrimination rate condition, participants who are classified as users of inferences in accordance with the discrimination rate show the highest performance in Experiment 2 (t(28) = 3.24, p < .01, one-tailed) and in Experiment 3 (t(24) = 4.24, p < .001, one-tailed).
To allow further testing of our hypothesis of adaptive inferences to incomplete information, in the second part of the learning phase participants did not only indicate choices but they were also asked to infer the missing cue values directly. Results show that participants are sensitive to the structure of the environment in inferring missing cue values not only when this missing information is embedded in a probabilistic inference task but also when the goal is to predict the missing information itself. Participants make more inferences in line with the discrimination rate when the discrimination rate is high as compared to low (Exp. 2: t(62) = 12.6, p < .001, Exp. 3: t(48) = 10.2, p < .001; Figure 7; both one-tailed). Similarly, participants make more inferences in line with the base-rate in the high base-rate condition as compared to the low base-rate condition (Exp. 2: t(62) = 2.9, p < .01, Exp. 3: t(48) = 1.7, p < .05; both one-tailed). The ability to infer incomplete information indicates that participants acquire knowledge concerning the structure of the environment with respect to inferences for incomplete information.
3.3 Discussion
Experiments 2 and 3 closely replicate the results from the first experiment and show that the previous findings are not limited to situations in which information concerning discrimination rates and base-rates is explicitly provided. Results also generalize to more complex decision tasks as shown in Experiment 3. Participants again adapt their inference mechanism to the structure of the environment. Finally, participants’ choices approximate the rational naïve Bayesian solution quite closely and the performance is higher for participants using an inference mechanism that matches the structure of the environment.
4 General discussion
Many situations in which people have to make probabilistic inferences involve incomplete information in that cue values are missing. Previous research indicated that people are rather flexible in how they treat incomplete information and that they adapt their inference mechanisms to the structure of the environment (Garcia-Retamero & Rieskamp, 2009). The main focus of these investigations has been on inferences of missing cue values from base-rates, neglect of missing cue values, or replacement of missing cue values by positive or negative cue values. Based on an analytic approach, we showed that it is more successful in many environments to rely on the discrimination rates of cues to infer missing values instead. Specifically, this mechanism proposes that people infer single pieces of missing cue information based on the available cue value and the degree to which a cue generally tends to make discriminating or equal predictions for the two options compared. We expected that this new inference mechanism based on discrimination rates is adaptively used by participants in environments in which the discrimination rate is a powerful predictor for missing cue values. The results from all three experiments support this proposition. In decisions with explicitly provided information concerning base-rates and discrimination rates, but also in environments in which this information has to be spontaneously learned from experience, a considerable proportion of participants relies on discrimination rates to infer missing cue values. The inference mechanisms are used in a way that is adapted to the structure of the environment: discrimination rates are increasingly used in environments in which discrimination rates are particularly informative and base-rates are not, and vice versa for environments in which base-rates are informative and discrimination rates are less so.
Finally, we find that participants who rely on the inference mechanism that is adapted to the respective environment structure tend to show higher performance than participants using other mechanisms.
It is noteworthy that increasing the complexity of the task from four to six cues (i.e., from 8 to 12 pieces of information) did not lead to a shift towards relying on cognitively less demanding inference mechanisms such as simply ignoring missing cue values. The combination of, on the one hand, relatively complex second-order inferences to incomplete information and, on the other hand, the application of complex information integration mechanisms for the first-order inferences does not seem to tax working memory capacity too much. This independence of complexity therefore speaks for the importance of automatic processes, which have been shown to be only partially sensitive to task complexity (e.g., Glöckner & Betsch, 2012). Finally, in a similar vein, it should be noted that in line with previous findings (e.g., Bröder, 2000; Glöckner & Betsch, 2008b) most participants integrate available and inferred information in a compensatory fashion (i.e., using PCS or WADD, Table D.1). Hence, inferences to missing information and information integration are not cognitively demanding to the extent that people switch to simpler information integration mechanisms.
Our theoretical analysis of the performance of all inference mechanisms in various environments shows that inferences in line with the discrimination rate maximize the probability of a correct inference in many environments. This is more so when the discrimination rate is high (i.e., greater than .67). It is an empirical question whether real-world environments more likely consist of discrimination rates that are high or low. Even if cues with uninformative or low discrimination rates (i.e., close to or below .5) are frequent in a decision environment, active search and use of cues with a high discrimination rate (Bröder & Newell, 2008; Newell, Rakow, Weston, & Shanks, 2004) lead to self-tailored decision environments with cues high in discrimination rate. Thus, search for valid cues with high discrimination rates does not only lead to cues that are successful in making correct predictions but also to cues that allow for the application of a powerful inference mechanism when information is partially missing.
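The role of the .67 threshold can be illustrated with a small calculation. The following Python sketch is not the analysis reported in Appendix A. It assumes a single cue, two options per trial, exactly one missing cue value per trial that is equally likely to be either of the two values, and it ignores sample-size effects. The function names (`pair_frequencies`, `p_correct`) and the example values are hypothetical. Under these assumptions, always inferring a positive value is correct with probability BR, always inferring a negative value with probability 1 − BR, and inferring the opposite of the available value with probability DR. Because BR can lie only between DR/2 and 1 − DR/2, neither constant replacement can beat the discrimination-rate mechanism once DR exceeds 2/3.

```python
def pair_frequencies(br, dr):
    """Shares of ++ trials, discriminating trials (+- or -+), and -- trials.

    Assumption: br is the share of positive cue values over both options,
    dr is the share of trials in which the cue discriminates.
    """
    pos = br - dr / 2      # share of ++ trials
    neg = 1 - br - dr / 2  # share of -- trials
    if pos < -1e-9 or neg < -1e-9:
        raise ValueError("base-rate and discrimination rate are incompatible")
    return max(pos, 0.0), dr, max(neg, 0.0)


def p_correct(br, dr):
    """Probability of a correct inference for three simple mechanisms."""
    pos, disc, neg = pair_frequencies(br, dr)
    return {
        # the missing value is positive in all ++ trials and half of the
        # discriminating trials
        "always_positive": pos + disc / 2,
        "always_negative": neg + disc / 2,
        # assumption: infer the opposite of the available value if dr > .5,
        # otherwise infer the same value
        "discrimination_rate": max(disc, 1 - disc),
    }
```

With illustrative values near the two experimental conditions, `p_correct(0.70, 0.55)` favors constant positive replacement, whereas `p_correct(0.55, 0.90)` favors the discrimination-rate mechanism.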
Results from the current studies could also have methodological implications for process-tracing studies using partially concealed information such as the information board and Mouselab. In the information board and its computerized version Mouselab (Payne, Bettman, & Johnson, 1988; Schulte-Mecklenbeck, Kühberger, & Ranyard, 2010), participants are asked to point at information cards to acquire information, and information search is used to test process predictions of decision strategies (e.g., Bröder, 2003; Bröder, Glöckner, Betsch, Link, & Ettlin, 2013; Garcia-Retamero & Rieskamp, 2009; Jekel, 2012; Rieskamp & Otto, 2006). It is thus common to make the assumption that unrevealed information does not influence choices. This implicitly assumes that participants ignore those cues. Given our data, this assumption might be violated, at least in environments with two options only and cues having high (or low) discrimination rates, provided that participants have prior knowledge or acquire knowledge about the properties of the environment in the course of a study. This can lead to wrong conclusions concerning strategy use, particularly in environments in which information search is costly. For example, looking up information only partially might not unambiguously indicate the usage of noncompensatory decision strategies; it could also result from people using a compensatory weighted additive strategy based on partially concealed but inferred cue values. Following the above argument that these inferences to incomplete information can be done with little cognitive effort, this problem might be quite substantial in some research paradigms and should be kept in mind when interpreting process-tracing measures.
A potential limitation of our research is that we used a selected set of cue-patterns that allowed us to discriminate among models. We cannot completely rule out that this way of cue-pattern selection might influence participants’ inferences for missing cue-values and integration with other information (Rieskamp & Hoffrage, 1999). Furthermore, we used an incentive scheme that rewarded correct inferences for missing information in the learning phase in Experiments 2 and 3, which might have increased participants’ general attention to missing information in their subsequent decisions.
Future studies might address how the process that leads to incomplete information influences the usage of the discrimination rate as an inference mechanism. Information might be missing due to random noise. Alternatively, information might be incomplete on purpose, with good or bad intentions. Omission of information can lead to a frugal transmission of information (Gigerenzer, Todd, & The ABC Research Group, 1999) when the omitted information is redundant and/or can be easily inferred from the context (i.e., the other cue value): praising the durability of a product in a consumer context likely implies, without explicitly mentioning it, that the alternative product is worse on this dimension. The downside of this implicit rule in the communication of information (Grice, 1975) is that it can also be used to mislead people by omitting information with the purpose of letting people make invalid inferences about the missing cue value. Such a strategy is most likely more difficult to detect than providing plainly wrong information, and it also disguises the responsibility for misinformation (i.e., the error is made by the receiver’s invalid inference). To what extent people are sensitive to the good or bad intentions behind strategic use of the discrimination rate in the transmission of information constitutes an interesting question for future research.
Appendix A: Probability of a correct inference dependent on the structure of the environment
The probability of a correct inference of a missing cue value depends on the structure of the environment (i.e., a set of decision trials as displayed in Figure 2) as defined by the discrimination rate DR (i.e., relative number of cases in which a cue discriminates between options) and the relative frequencies of positive information pos and negative information neg in the decision trials where the cue does not discriminate between options (i.e., + + and − −). Note that pos and neg are functions of the base-rate of positive cue-values (i.e., $BR_{+}$) and sample size with:
and
Under the assumption that there is only one cue value missing for a cue in all trials and that the probability of a missing cue value is equally likely for both types of cue values, the probability for a correct inference when inferring a positive cue value $p_{c+}$ is:
The probability for a correct inference when inferring a negative cue value $p_{c-}$ is:
The probability for a correct inference when using the discrimination rate $p_{cDR}$ is:
Finally, the probability for a correct inference when using the base-rate $p_{cBR}$ is:
with:
and:
Appendix B: Model predictions
Choice predictions are derived for each model as described in Table 1. Predictions of decision times are based on the number of computational steps necessary to apply TTB (see Footnote 11) or WADDcorr, or on the number of iterations necessary for PCS (Glöckner & Betsch, 2008a) to reach a coherent solution, combined with each type of inference mechanism for missing cue values. Predictions of confidence judgments are based on the validity of the first discriminating cue for TTB, the difference in the weighted sums for each option for WADDcorr, and the difference in activations for option nodes for PCS, combined with each inference mechanism for missing cue values. Prediction vectors of decision times and confidence judgments for each model are normalized to prediction contrasts by dividing the centered vector by the range of the centered vector. Predictions for all measures for the seven cue-patterns used are displayed in Table B.1 for all integration mechanisms, using the inference mechanisms base-rate of positive information and discrimination rate as examples. Note for the first type of tasks that an inference mechanism for missing cue values based on base-rates of positive information leads to a choice for option A, and an inference mechanism for missing cue values based on discrimination rates to a choice for option B, independent of the integration mechanism (except for Rand, which predicts guessing independent of the inference mechanism). The first type of tasks was used as the indicator cue-pattern in the environment with a high discrimination rate in Experiments 1 and 2.
Note: Positive cue values are indicated by +, negative cue values by −, missing cue values by ?. A:B represents guessing between options. BR = base-rate positive information, DR = discrimination rate.
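The normalization of prediction vectors into prediction contrasts described in Appendix B can be sketched as follows. This is a minimal illustration in Python, assuming that "centered" means subtracting the mean of the vector. The function name `prediction_contrast`, the handling of constant vectors, and the example numbers are assumptions for illustration only.

```python
def prediction_contrast(predictions):
    """Center a prediction vector and divide it by the range of the centered vector."""
    mean = sum(predictions) / len(predictions)
    centered = [p - mean for p in predictions]
    value_range = max(centered) - min(centered)
    if value_range == 0:
        # assumption: a model that predicts no differences gets an all-zero contrast
        return [0.0 for _ in centered]
    return [c / value_range for c in centered]


# Hypothetical numbers of computational steps for three cue-patterns
contrast = prediction_contrast([2, 4, 6])  # -> [-0.5, 0.0, 0.5]
```

A contrast built this way sums to zero and has a range of one, so predictions from models with very different raw scales (computational steps, iterations, validities) become comparable.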
Appendix C: Primer of the Multiple-Measure Maximum Likelihood strategy classification
To determine which combination of inference mechanism for missing cue values and information integration mechanism best describes the observed choices and decision times for each participant, we applied the Multiple-Measure Maximum Likelihood (MM-ML) strategy classification method (Bröder & Schiffer, 2003; Glöckner, 2009, 2010; Jekel, Nicklisch, & Glöckner, 2010). To apply MM-ML, model predictions are derived for each dependent measure (i.e., choices, decision times, and confidence judgments for all three studies; see Appendix B) and the distribution of the data generating process is defined for each measure. That is, choices between two options are assumed to be independent of each other and to stem from a binomial distribution whereas errors for decision times and confidence judgments are assumed to stem from a normal distribution. In MM-ML, model comparisons are based on the comparison between the predictions for each model on choices, decision times, and confidence judgments and the observed behavior.
In more detail, the number of choices $n_{jk}$ of type $j$ between two options congruent with model $k$ is assumed to stem from a binomial distribution with a probability of $1 - \epsilon_k$ for congruent choices. Observed decision times are winsorized over all participants (2.5 × standard deviation; i.e., approx. 1.2% of the data) to avoid problems with extreme outliers in maximum-likelihood estimation. Additionally, winsorized decision times were regressed on the order of trials presented in a hierarchical mixed-effects linear model with a random intercept and a random slope for order of trials presented for each participant to account for an individual decrease in decision times due to training (Glöckner, 2009, 2010). The resulting residual decision time $x_{T_i}$ for trial $i$ is assumed to stem from a normal distribution with a standard deviation $\sigma_T$ and a mean $\mu_T$ that is shifted by the prediction scalar $t_{T_i}$ and a scaling factor $R_T$ in the following way: $x_{T_i} \sim N(\mu_T + t_{T_i} \times R_T, \sigma_T)$. Finally, an observed confidence judgment $x_{C_i}$ is also assumed to stem from a normal distribution with a standard deviation $\sigma_C$ and a mean $\mu_C$ that is shifted by the prediction scalar $t_{C_i}$ and a scaling factor $R_C$ in the following way: $x_{C_i} \sim N(\mu_C + t_{C_i} \times R_C, \sigma_C)$. The maximum likelihood $L_{total}$ for each participant and each model can then be estimated by maximizing:
Model comparisons are based on the Bayesian information criterion (BIC, Schwarz, 1978) for $N_{obs}$ trials and $N_p$ fitted model parameters that can be calculated by:

$$\mathrm{BIC} = -2 \ln(L_{total}) + \ln(N_{obs}) \, N_p$$
Participants who show more than 30% strategy-inconsistent choices for the most likely model (i.e., $\epsilon_k > .30$; except for random models with $\epsilon = .50$) are not classified (see Glöckner, 2009).
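The distributional assumptions of Appendix C can be turned into a joint log-likelihood and a BIC value. The following Python sketch only evaluates the log-likelihood for given parameter values; an actual classification would additionally maximize it over the free parameters for each model, which is not shown. All function and argument names are hypothetical and not taken from the MM-ML implementations cited above. The sketch assumes independent observations, a binomial distribution for congruent choices with success probability 1 − ε, and normal distributions for decision-time residuals and confidence judgments whose means are shifted by the scaled prediction contrasts.

```python
import math


def binomial_logpmf(k, n, p):
    """Log-probability of k successes in n independent trials with probability p."""
    return math.log(math.comb(n, k)) + k * math.log(p) + (n - k) * math.log(1 - p)


def normal_logpdf(x, mean, sd):
    """Log-density of a normal distribution with the given mean and standard deviation."""
    return -0.5 * math.log(2 * math.pi * sd ** 2) - (x - mean) ** 2 / (2 * sd ** 2)


def mmml_log_likelihood(choices, epsilon, times, time_params, confidences, conf_params):
    """Joint log-likelihood of one participant's data under one model.

    choices:      list of (n_congruent, n_total), one pair per cue-pattern type
    epsilon:      strategy-application error, strictly between 0 and 1
    times:        list of (residual decision time, prediction contrast) per trial
    time_params:  (mu_T, R_T, sigma_T)
    confidences:  list of (confidence judgment, prediction contrast) per trial
    conf_params:  (mu_C, R_C, sigma_C)
    """
    log_l = sum(binomial_logpmf(k, n, 1 - epsilon) for k, n in choices)
    mu_t, r_t, sigma_t = time_params
    log_l += sum(normal_logpdf(x, mu_t + t * r_t, sigma_t) for x, t in times)
    mu_c, r_c, sigma_c = conf_params
    log_l += sum(normal_logpdf(x, mu_c + t * r_c, sigma_c) for x, t in confidences)
    return log_l


def bic(log_likelihood, n_obs, n_params):
    """Bayesian information criterion computed from a maximized log-likelihood."""
    return -2 * log_likelihood + math.log(n_obs) * n_params
```

Working on the log scale avoids numerical underflow when many trials are multiplied together, and the model with the lowest BIC would be the one selected for a participant.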
Appendix D: Classifications of participants’ inference mechanisms and information integration mechanisms
Note: Grey areas indicate models that lead to the same prediction (i.e., participants’ inference mechanism cannot be identified for these models). Unclassified participants have a strategy-application error of $\epsilon > .30$.