Reversal learning has been studied as the process of learning to inhibit previously rewarded actions. Deficits in reversal learning have been observed after manipulations of dopamine and lesions of the orbitofrontal cortex. However, reversal learning is often studied in animals that have limited experience with reversals; such animals are still learning, during data collection, that reversals occur at all. We examined a task regime in which monkeys had extensive experience with reversals and stable behavioral performance on a probabilistic two-arm bandit reversal learning task. We developed a Bayesian analysis approach to examine the effects of dopaminergic manipulations on reversal performance in this regime, and we find that the analysis clarifies the animals' strategy. Specifically, at reversal the monkeys switch quickly from choosing one stimulus to choosing the other, rather than transitioning gradually, as would be expected if they were using a naive reinforcement learning (RL) update of value. Furthermore, administration of haloperidol affected the way the animals integrated prior knowledge into their choice behavior: animals had a stronger prior on where reversals would occur on haloperidol than on levodopa (L-DOPA) or placebo. This strong prior was appropriate, because the animals had extensive experience with reversals occurring in the middle of the block. Overall, Bayesian dissection of the behavior clarifies the animals' strategy and reveals an effect of haloperidol on the integration of prior information with evidence in favor of a choice reversal.
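The contrast drawn above can be illustrated with a minimal sketch of a naive delta-rule RL value update. This is not the paper's analysis; the learning rate, block length, and deterministic reward schedule are illustrative assumptions.

```python
# Illustrative sketch (not the paper's fitted model): a naive delta-rule
# RL update predicts a gradual value transition at a reversal, in contrast
# to the abrupt behavioral switch described above. The learning rate,
# block length, and deterministic reward schedule are assumptions.

def value_trajectory(n_trials=80, reversal_at=40, alpha=0.1, v0=0.5):
    """Track the estimated value of the initially rewarded stimulus."""
    v = v0
    traj = []
    for t in range(n_trials):
        r = 1.0 if t < reversal_at else 0.0  # stimulus stops paying off at reversal
        v += alpha * (r - v)                 # delta rule: v <- v + alpha * (r - v)
        traj.append(v)
    return traj

traj = value_trajectory()
# After the reversal the value only decays geometrically (v_t = 0.9 * v_{t-1}),
# so a purely value-driven chooser would take on the order of 1/alpha trials
# to switch, rather than switching at once.
```

Under these assumptions the estimated value crosses 0.5 only about seven trials after the reversal, which is why a quick behavioral switch at reversal argues against a naive value update.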
Sabatinelli D, Bradley MM, Lang PJ, Costa VD, Versace F. Pleasure rather than salience activates human nucleus accumbens and medial prefrontal cortex.
Explore-exploit decisions require us to trade off the benefits of exploring unknown options, to learn more about them, against exploiting known options for immediate reward. Such decisions are ubiquitous in nature, but from a computational perspective they are notoriously hard. There is therefore much interest in how humans and animals make these decisions, and recently there has been an explosion of research in this area. Here we provide a biased and incomplete snapshot of this field, focusing on the major finding that many organisms use two distinct strategies to solve the explore-exploit dilemma: a bias for information ('directed exploration') and the randomization of choice ('random exploration'). We review evidence for the existence of these strategies, their computational properties, their neural implementations, and how directed and random exploration vary over the lifespan. We conclude by highlighting open questions in this field that are ripe to both explore and exploit.
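The two strategies named above can be sketched, under illustrative assumptions, as an information bonus added to each option's value (directed exploration) plus softmax decision noise (random exploration). The bonus form, parameter names, and values here are hypothetical, not taken from any particular study.

```python
# Hypothetical sketch of the two exploration strategies: a count-based
# information bonus (directed exploration) and a softmax over values
# (random exploration). Parameter values are illustrative assumptions.
import math
import random

def choose(values, counts, info_bonus=0.0, temperature=0.1, rng=random):
    """Pick an option index given value estimates and sample counts."""
    # Directed exploration: boost options that have been sampled less often.
    boosted = [v + info_bonus / math.sqrt(n + 1) for v, n in zip(values, counts)]
    # Random exploration: sample from a softmax over the boosted values;
    # higher temperature means noisier, more random choices.
    weights = [math.exp(b / temperature) for b in boosted]
    r = rng.random() * sum(weights)
    for i, w in enumerate(weights):
        r -= w
        if r <= 0:
            return i
    return len(weights) - 1
```

With a large information bonus and low temperature, the less-sampled option is chosen almost deterministically; with no bonus and high temperature, choices become nearly uniform, separating the two strategies' signatures.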
Novelty seeking refers to the tendency of humans and animals to explore novel and unfamiliar stimuli and environments. The idea that dopamine modulates novelty seeking is supported by evidence that novel stimuli excite dopamine neurons and activate brain regions receiving dopaminergic input. In addition, dopamine has been shown to drive exploratory behavior in novel environments. It is not clear, however, whether dopamine promotes novelty seeking when it is framed as the decision to explore novel options vs. the exploitation of familiar options. To test this, we administered systemic injections of saline or GBR-12909, a selective dopamine transporter (DAT) inhibitor, to monkeys and assessed their novelty seeking behavior during a probabilistic decision making task. The task involved pseudorandom introductions of novel choice options, which allowed monkeys the opportunity to explore novel options or to exploit familiar options that they had already sampled. We found that DAT blockade increased the monkeys' preference for novel options. A reinforcement learning (RL) model fit to the monkeys' choice data showed that increased novelty seeking following DAT blockade was driven by an increase in the initial value the monkeys assigned to novel options. However, blocking DAT did not modulate the rate at which the monkeys learned which cues were most predictive of reward, or their tendency to exploit that knowledge. These data demonstrate that dopamine enhances novelty-driven value and imply that excessive novelty seeking—characteristic of impulsivity and behavioral addictions—might be caused by increases in dopamine, stemming from less reuptake.
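The model-based finding above can be sketched in a few lines: in a standard delta-rule RL model, novelty seeking is captured by the initial value assigned to a newly introduced option, while the learning rate is left untouched. The baseline value of 0.5, the bonus sizes, and the function names are illustrative assumptions, not the paper's fitted parameters.

```python
# Hypothetical sketch: a novelty bonus on the initial value of a newly
# introduced option, with an unchanged delta-rule learning rule. The
# baseline of 0.5 and the bonus values are illustrative assumptions.

def introduce_option(values, novelty_bonus):
    """Add a novel option whose starting value carries a novelty bonus."""
    values.append(0.5 + novelty_bonus)  # baseline 0.5 is an assumption
    return values

def update(values, choice, reward, alpha=0.1):
    """Standard delta-rule update; the bonus does not alter learning."""
    values[choice] += alpha * (reward - values[choice])
    return values

# Under DAT blockade the model assigns a larger initial value to novel
# options, so they win the value comparison against familiar options.
saline = introduce_option([0.6, 0.4], novelty_bonus=0.0)  # -> [0.6, 0.4, 0.5]
gbr    = introduce_option([0.6, 0.4], novelty_bonus=0.2)  # -> [0.6, 0.4, 0.7]
assert max(range(3), key=lambda i: saline[i]) == 0  # familiar option preferred
assert max(range(3), key=lambda i: gbr[i]) == 2     # novel option preferred
```

Because only the initial value changes, subsequent learning from reward proceeds identically in both conditions, matching the finding that DAT blockade left learning rates and exploitation unchanged.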
Models of visual emotional perception suggest a reentrant organization of the ventral visual system with the amygdala. Using focused functional magnetic resonance imaging in humans with a sampling rate of 100 ms, here we determine the relative timing of emotional discrimination in the amygdala and ventral visual cortical structures during emotional perception. Results show that the amygdala and inferotemporal visual cortex differentiate emotional from nonemotional scenes approximately 1 s before extrastriate occipital cortex, whereas primary occipital cortex shows consistent activity across all scenes. This pattern of discrimination is consistent with a reentrant organization of emotional perception in visual processing, in which interaction between rostral ventral visual cortex and amygdala originates the identification of emotional relevance.
Reinforcement learning (RL) theories posit that dopaminergic signals are integrated within the striatum to associate choices with outcomes. Often overlooked is that the amygdala also receives dopaminergic input and is involved in Pavlovian processes that influence choice behavior. To determine the relative contributions of the ventral striatum (VS) and amygdala to appetitive RL, we tested rhesus macaques with VS or amygdala lesions on deterministic and stochastic versions of a two-arm bandit reversal learning task. When learning was characterized with an RL model, amygdala lesions, relative to controls, caused general decreases in learning from positive feedback and in choice consistency. By comparison, VS lesions only affected learning in the stochastic task. Moreover, VS lesions hastened the monkeys' choice reaction times, emphasizing a speed-accuracy tradeoff that accounted for errors in deterministic learning. These results update standard accounts of RL by emphasizing distinct contributions of the amygdala and VS to RL.
Research on emotional perception and learning indicates that appetitive cues engage the nucleus accumbens (NAc) and medial prefrontal cortex (mPFC), whereas amygdala activity is modulated by the emotional intensity of appetitive and aversive cues. This study sought to determine patterns of functional activation and connectivity among these regions during narrative emotional imagery. Using event-related fMRI, we investigated activation of these structures while participants vividly imagined pleasant, neutral, and unpleasant scenes. Results indicate that pleasant imagery selectively activates NAc and mPFC, whereas amygdala activation was enhanced during both pleasant and unpleasant imagery. NAc and mPFC activity were each correlated with the rated pleasure of the imagined scenes, while amygdala activity was correlated with rated emotional arousal. Functional connectivity of NAc and mPFC was evident throughout imagery, regardless of hedonic content, while correlated activation of the amygdala with NAc and mPFC was specific to imagining pleasant scenes. These findings provide strong evidence that pleasurable text-driven imagery engages a core appetitive circuit, including NAc, mPFC, and the amygdala.
Reinforcement learning (RL) is the behavioral process of learning the values of actions and objects. Most models of RL assume that the dopaminergic prediction error signal drives plasticity in frontal-striatal circuits. The striatum then encodes value representations that drive decision processes. However, the amygdala has also been shown to play an important role in forming Pavlovian stimulus-outcome associations. These Pavlovian associations can drive motivated behavior via the amygdala projections to the ventral striatum or the ventral tegmental area. The amygdala may, therefore, play a central role in RL. Here we compare the contributions of the amygdala and the striatum to RL and show that both the amygdala and striatum learn and represent expected values in RL tasks. Furthermore, value representations in the striatum may be inherited, to some extent, from the amygdala. The striatum may, therefore, play less of a primary role in learning stimulus-outcome associations in RL than previously suggested.