Learning to predict rewards based on environmental cues is essential for survival. It is believed that animals learn to predict rewards by updating predictions whenever the outcome deviates from expectations, and that such reward prediction errors (RPEs) are signaled by the mesolimbic dopamine system—a key controller of learning. However, instead of learning prospective predictions from RPEs, animals can infer predictions by learning the retrospective cause of rewards. Hence, whether mesolimbic dopamine instead conveys a causal associative signal that sometimes resembles RPE remains unknown. We developed an algorithm for retrospective causal learning and found that mesolimbic dopamine release conveys causal associations but not RPE, thereby challenging the dominant theory of reward learning. Our results reshape the conceptual and biological framework for associative learning.
Behavioral approaches utilizing rodents to study mood disorders have focused primarily on negative valence behaviors associated with potential threat (anxiety-related behaviors). However, for disorders such as depression, positive valence behaviors that assess reward processing may be more translationally valid and predictive of antidepressant treatment outcome. Chronic corticosterone (CORT) administration is a well-validated pharmacological stressor that increases avoidance in negative valence behaviors associated with anxiety [1–4]. However, whether chronic stress paradigms such as CORT administration also lead to deficits in positive valence behaviors remains unclear. We treated male C57BL/6J mice with chronic CORT and assessed both negative and positive valence behaviors. We found that CORT induced avoidance in the open field and novelty suppressed feeding (NSF) tests. Interestingly, CORT also impaired instrumental acquisition, reduced sensitivity to a devalued outcome, reduced breakpoint in progressive ratio, and impaired performance in probabilistic reversal learning. Taken together, these results demonstrate that chronic CORT administration at the same dosage both induces avoidance in negative valence behaviors associated with anxiety and impairs positive valence behaviors associated with reward processing. These data suggest that CORT administration is a useful experimental system for preclinical approaches to studying stress-induced mood disorders.
The basolateral amygdala (BLA) is critical for reward behaviors via a projection to the nucleus accumbens (NAc). Specifically, BLA-NAc projections are involved in reinforcement learning, reward-seeking, sustained instrumental responding, and risk behaviors. However, it remains unclear whether chronic stress interacts with BLA-NAc projection neurons to result in maladaptive behaviors. Here we take a chemogenetic, projection-specific approach to clarify how NAc-projecting BLA neurons affect avoidance, reward, and feeding behaviors in male mice. Then, we examine whether chemogenetic activation of NAc-projecting BLA neurons attenuates the maladaptive effects of chronic corticosterone (CORT) administration on these behaviors. CORT mimics the behavioral and neural effects of chronic stress exposure. We found a nuanced role of BLA-NAc neurons in mediating reward behaviors. Surprisingly, activation of BLA-NAc projections rescues CORT-induced deficits in novelty suppressed feeding, a behavior typically associated with avoidance. Activation of BLA-NAc neurons also increases instrumental reward-seeking without affecting free-feeding in chronic CORT mice. Taken together, these data suggest that NAc-projecting BLA neurons are involved in chronic CORT-induced maladaptive reward and motivation behaviors.
How do we learn associations in the world (e.g., between cues and rewards)? Cue-reward associative learning is controlled in the brain by mesolimbic dopamine. It is widely believed that dopamine drives such learning by conveying a reward prediction error (RPE) in accordance with temporal difference reinforcement learning (TDRL) algorithms. TDRL implementations are trial-based: learning progresses sequentially across individual cue-outcome experiences. Accordingly, a foundational assumption, often considered a mere truism, is that the more cue-reward pairings one experiences, the more one learns this association. Here, we disprove this assumption, thereby falsifying a foundational principle of trial-based learning algorithms. Specifically, when a group of head-fixed mice received ten times fewer experiences over the same total time as another, a single experience produced as much learning as ten experiences in the other group. This quantitative scaling also holds for mesolimbic dopaminergic learning, with the increase in learning rate being so high that the group with fewer experiences exhibits dopaminergic learning in as few as four cue-reward experiences and behavioral learning in nine. An algorithm implementing reward-triggered retrospective learning explains these findings. The temporal scaling and few-shot learning observed here fundamentally change our understanding of the neural algorithms of associative learning.
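To make the trial-based assumption concrete, the sketch below shows a generic trial-level prediction-error update (a Rescorla-Wagner-style simplification, not full TDRL and not the authors' retrospective algorithm). The function name `trial_based_learning`, the learning rate `alpha`, and the reward magnitude are illustrative assumptions. The point is only that in such a model the learned value depends on the number of cue-reward pairings, and the time between pairings never enters the update.

```python
def trial_based_learning(n_trials, alpha=0.1, reward=1.0):
    """Hypothetical trial-level update: value += alpha * (reward - value).

    alpha and reward are illustrative assumptions, not values from the paper.
    Returns the final cue value and the per-trial reward prediction errors.
    """
    value = 0.0
    rpes = []
    for _ in range(n_trials):
        rpe = reward - value   # reward prediction error on this pairing
        value += alpha * rpe   # one update per cue-reward experience
        rpes.append(rpe)
    return value, rpes


# One pairing versus ten pairings: the model predicts more learning with more
# pairings, regardless of how far apart in time the pairings occur.
value_after_1, _ = trial_based_learning(1)
value_after_10, rpes_10 = trial_based_learning(10)
print(value_after_1, value_after_10)
```

Under these assumptions the ten-pairing group always ends with a higher learned value than the one-pairing group, which is the prediction the abstract reports to be violated when total time is held constant.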