2017
DOI: 10.1371/journal.pcbi.1005684

Confirmation bias in human reinforcement learning: Evidence from counterfactual feedback processing

Abstract: Previous studies suggest that factual learning, that is, learning from obtained outcomes, is biased, such that participants preferentially take into account positive, as compared to negative, prediction errors. However, whether or not the prediction error valence also affects counterfactual learning, that is, learning from forgone outcomes, is unknown. To address this question, we analysed the performance of two groups of participants on reinforcement learning tasks using a computational model that was adapted…
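
The distinction the abstract draws between factual and counterfactual learning, and between positive and negative prediction errors, can be illustrated with a small delta-rule update. This is a minimal sketch under stated assumptions, not the authors' model (the abstract above is truncated before the model is described): it assumes one value estimate per option and one learning rate per combination of outcome type and prediction-error sign. The names `LearningRates` and `update_values` and the numeric rates are hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass


@dataclass
class LearningRates:
    # Hypothetical parameter names: one rate per outcome type
    # (factual / counterfactual) and prediction-error sign.
    factual_pos: float
    factual_neg: float
    counterfactual_pos: float
    counterfactual_neg: float


def update_values(q_chosen, q_unchosen, r_chosen, r_unchosen, rates):
    """Return updated (chosen, unchosen) value estimates after one trial."""
    # Factual prediction error: obtained outcome minus current estimate.
    pe_factual = r_chosen - q_chosen
    # Counterfactual prediction error: forgone outcome minus current estimate.
    pe_counterfactual = r_unchosen - q_unchosen

    # Pick the learning rate according to the sign (valence) of each error.
    a_factual = rates.factual_pos if pe_factual > 0 else rates.factual_neg
    a_counterfactual = (
        rates.counterfactual_pos
        if pe_counterfactual > 0
        else rates.counterfactual_neg
    )

    return (
        q_chosen + a_factual * pe_factual,
        q_unchosen + a_counterfactual * pe_counterfactual,
    )


# Illustrative rates only; they are not estimates from the paper.
example_rates = LearningRates(
    factual_pos=0.4,
    factual_neg=0.1,
    counterfactual_pos=0.1,
    counterfactual_neg=0.4,
)
new_chosen, new_unchosen = update_values(0.0, 0.0, 1.0, -1.0, example_rates)
```

With these illustrative rates, a good obtained outcome and a bad forgone outcome both move the estimates strongly, while the reverse pattern moves them only weakly. Setting all four rates equal recovers an unbiased update.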

Cited by 165 publications (278 citation statements)
References 51 publications (53 reference statements)
“…Updating the counterfactual according to a context-dependent reference is consistent with a broader literature on reference dependence in behavioural economics in which the utilities of outcomes are assessed relative to a context-specific reference point (e.g., Kahneman and Tversky, 1979; Köszegi and Rabin, 2006; Denrell, 2015). Converging evidence from reinforcement comparison methods and behavioural economics equally suggests that people make decisions according to a context-dependent reference (Palminteri et al, 2015; Palminteri et al, 2017; Klein, Ullsperger, & Jocham, 2017; Burke et al, 2016). Importantly, providing counterfactual information to the subject reinforces the dependence on context for evaluating rewards and punishments (Palminteri et al, 2015).…”
Section: The Counterfactual World Negatively Covaries With the Real W…
Citation type: supporting
confidence: 71%
“…This was true for all conditions (exceedance probability > 98%) (Table 1 and Figure 6B). In addition to comparing model parameters across conditions and subjects, we also evaluated the generative performance of each concurrent model, i.e., its ability to replicate the participant's proportion of choices as well as the participant's trial-by-trial choice sequence after reversal (Palminteri et al, 2017). To do so, the 4 models were simulated with the best-fitting parameters for the whole experiment.…”
Section: Results
Citation type: mentioning
confidence: 99%
“…Specifically, neurons seem to rescale their firing to adapt to the decision context (relative coding) so that their response to a specific value depends on the choice context (e.g., reward vs. punishment), rather than being invariant (absolute coding). Additionally, studies on value-based decision showed that feedback information affects value learning so that providing complete feedback (both obtained and foregone, i.e. counterfactual, choice outcomes) instead of partial feedback (only the obtained outcome) improves learning (Palminteri et al, 2015, 2017; Bavard et al, 2018). However, it is unclear (1) whether relative coding occurs in all PFC regions and (2) whether and how it is affected by feedback information.…”
Section: Introduction
Citation type: mentioning
confidence: 99%
“…On the other hand, individuals with negative choice-trace weight tend to avoid an option recently chosen, which can be interpreted as uncertainty-driven exploration. Notably, it is observed that the choice-trace effects are often confounded with the effects of differential (asymmetric) learning rates for positive and negative reward prediction errors [45,46]. We, therefore, constructed another class of models (RL3a and 3b) which introduce asymmetric learning rates instead of the choice-trace effect.…”
Section: Psychiatric Symptoms and Decision-making Processes: A Model-…
Citation type: mentioning
confidence: 99%
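
The asymmetric learning rates mentioned in the last statement can be made concrete with a short worked equation. This is a sketch under stated assumptions, not a result from the cited or citing paper: it assumes a single option whose outcome is r = 1 with probability p and r = 0 otherwise, a value estimate Q in (0, 1), and hypothetical symbols α⁺ and α⁻ for the learning rates applied to positive and negative prediction errors.

```latex
% Delta rule with valence-dependent learning rates (assumed form)
Q \leftarrow Q +
\begin{cases}
  \alpha^{+}\,(r - Q) & \text{if } r - Q > 0,\\
  \alpha^{-}\,(r - Q) & \text{otherwise.}
\end{cases}

% Expected change in Q for r \in \{0, 1\} and Q \in (0, 1)
\mathbb{E}[\Delta Q] = p\,\alpha^{+}\,(1 - Q) - (1 - p)\,\alpha^{-}\,Q

% Setting the expected change to zero gives the fixed point
Q^{*} = \frac{p\,\alpha^{+}}{p\,\alpha^{+} + (1 - p)\,\alpha^{-}}
```

With equal rates the fixed point is Q* = p, so the estimate is unbiased. With α⁺ > α⁻ it lies above p: for example, p = 0.5, α⁺ = 0.3 and α⁻ = 0.1 give Q* = 0.15 / 0.20 = 0.75.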