2020
DOI: 10.1038/s41467-019-13953-1

Dopamine transients do not act as model-free prediction errors during associative learning

Abstract: Dopamine neurons are proposed to signal the reward prediction error in model-free reinforcement learning algorithms. This term represents the unpredicted or 'excess' value of the rewarding event, value that is then added to the intrinsic value of any antecedent cues, contexts or events. To support this proposal, proponents cite evidence that artificially induced dopamine transients cause lasting changes in behavior. Yet these studies do not generally assess learning under conditions where an endogenous predicti…
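The model-free account the abstract describes can be sketched as a temporal-difference update: the reward prediction error is the 'excess' value of the outcome, and a fraction of it is cached onto the antecedent cue. This is a minimal illustrative sketch of that standard algorithm, not code from the paper; all names and parameter values are assumptions.

```python
def td_update(values, cue, reward, v_next=0.0, alpha=0.1, gamma=0.95):
    """One temporal-difference step.

    The reward prediction error (RPE) is the unpredicted or 'excess'
    value of the outcome; a fraction (alpha) of it is added to the
    cached value of the antecedent cue.
    """
    rpe = reward + gamma * v_next - values[cue]  # unpredicted value
    values[cue] += alpha * rpe                   # cache it on the cue
    return rpe

values = {"cue": 0.0}
errors = [td_update(values, "cue", reward=1.0) for _ in range(100)]
# As the cue comes to predict the reward, the RPE shrinks toward zero
# and the cue's cached value approaches the reward value.
```

Under this account, a fully predicted reward evokes no RPE, which is why artificially induced dopamine transients are taken as a test of whether dopamine plays this computational role.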


Cited by 62 publications (76 citation statements)
References 36 publications
“…Previous research suggests that transfer from S2 to S1 in sensory preconditioning depends on the amount of learning during the association phase, initially increasing with the number of pairings and declining thereafter [27, 28]. We thus assessed whether the amount of training affected memory-guided decision making and generalization.…”
Section: Results
confidence: 98%
“…Recently, Maes et al., using inhibitory optogenetics to prevent cue-evoked DA signals, found that the DA neuron activity observed in response to a reward-predictive cue is a prediction error, not a signal about the value of the cue [93]. Subsequently published experiments using excitatory optogenetics to stimulate DA neurons [94] confirmed that DA stimulation supports associative learning about antecedent cues without evidence of a cached value being ascribed to the cue. Additionally, Morrens et al. [95] showed that novel, but not familiar, cues evoke DA release, and that if DA release is inhibited during a novel cue, learning about that cue is impaired.…”
Section: Synaptic Plasticity: Homosynaptic and Heterosynaptic Theories
confidence: 99%
“…Most RL research since has focused on simple forms of learning from outcomes that act as primary or secondary rewards, such as food, money, or numeric points in a game. However, the path to an RPE is not always so straightforward: for instance, recent work departs from the role of dopaminergic signaling in standard RPEs based on scalar rewards, extending the domain of RL to learning from indirect experiences (e.g., secondary conditioning) and more abstract learning of associations based on sensory features [57, 58]. These findings suggest that RL value computations integrate information beyond primary and secondary rewards.…”
Section: Rewards and Expectations
confidence: 99%