In many cases, learning is thought to be driven by differences between the value of rewards we expect and rewards we actually receive. Yet learning can also occur when the identity of the reward we receive is not as expected, even if its value remains unchanged. Learning from changes in reward identity implies access to an internal model of the environment, from which information about the identity of the expected reward can be derived. As a result, such learning is not easily accounted for by model-free reinforcement learning theories such as temporal difference reinforcement learning (TDRL), which predicate learning on changes in reward value, but not identity. Here, we used unblocking procedures to assess learning driven by value- versus identity-based prediction errors. Rats were trained to associate distinct visual cues with different food quantities and identities. These cues were subsequently presented in compound with novel auditory cues and the reward quantity or identity was selectively changed. Unblocking was assessed by presenting the auditory cues alone in a probe test. Consistent with neural implementations of TDRL models, we found that the ventral striatum was necessary for learning in response to changes in reward value. However, this area, along with orbitofrontal cortex, was also required for learning driven by changes in reward identity. This observation requires that existing models of TDRL in the ventral striatum be modified to include information about the specific features of expected outcomes derived from model-based representations, and that the role of orbitofrontal cortex in these models be clearly delineated.
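The contrast between value-based and identity-based prediction errors can be illustrated with a minimal sketch. It is not the authors' model: the function name, the discount factor of 0.9, the reward labels ("banana", "grape") and the numeric values are all assumptions chosen for illustration. It shows only that a standard temporal difference error, computed from scalar value alone, is zero when a reward changes identity but keeps its value.

```python
# Illustrative sketch only: names, labels and numbers are hypothetical,
# not taken from the study.

def td_error(reward_value, next_state_value, current_state_value, gamma=0.9):
    """Standard TD prediction error: delta = r + gamma * V(s') - V(s)."""
    return reward_value + gamma * next_state_value - current_state_value

# What the visual cue predicts, and what is delivered in the compound phase.
expected = {"identity": "banana", "value": 1.0}
identity_shift = {"identity": "grape", "value": 1.0}   # same value, new identity
value_shift = {"identity": "banana", "value": 2.0}     # same identity, more value

# The trial ends after reward, so the next-state value is taken to be 0.
delta_identity = td_error(identity_shift["value"], 0.0, expected["value"])
delta_value = td_error(value_shift["value"], 0.0, expected["value"])

# A model-based agent could additionally compare expected and received identity.
identity_mismatch = identity_shift["identity"] != expected["identity"]

print("TD error, identity shift:", delta_identity)
print("TD error, value shift:", delta_value)
print("Identity mismatch detected:", identity_mismatch)
```

In this sketch the value shift yields a positive error that could drive learning about the added cue, whereas the identity shift yields no scalar error at all; only the separate identity comparison registers the change, which is the kind of signal a purely value-based account lacks.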
Cocaine addiction is characterized by poor judgment and maladaptive decision-making. Here we review evidence implicating the orbitofrontal cortex in such behavior. This evidence suggests that cocaine-induced changes in orbitofrontal cortex disrupt the representation of states and transition functions that form the basis of flexible and adaptive ‘model-based’ behavioral control. By impairing this function, cocaine exposure leads to an overemphasis on less flexible, maladaptive ‘model-free’ control systems. We propose that such an effect accounts for the complex pattern of maladaptive behaviors associated with cocaine addiction.
Background: Cue-induced methamphetamine craving increases after prolonged forced (experimenter-imposed) abstinence from the drug (incubation of methamphetamine craving). Here, we determined whether this incubation phenomenon would occur under conditions that promote voluntary (self-imposed) abstinence. We also determined the effect of the novel mGluR2 positive allosteric modulator, AZD8529, on incubation of methamphetamine craving after forced or voluntary abstinence. Methods: We trained rats to self-administer palatable food (6 sessions) and then to self-administer methamphetamine under two conditions: 12 sessions (9-hr/day) or 50 sessions (3-hr/day). We then assessed cue-induced methamphetamine seeking in extinction tests after 1 or 21 abstinence days. Between tests, the rats underwent either forced abstinence (no access to the food- or drug-paired levers) or voluntary abstinence for 19 days (achieved via a discrete choice procedure between methamphetamine and palatable food; 20 trials per day). We also determined the effect of subcutaneous injections of AZD8529 (20 and 40 mg/kg) on cue-induced methamphetamine seeking 1 or 21 days after forced or voluntary abstinence. Results: Under both training and abstinence conditions, cue-induced methamphetamine seeking in the extinction tests was higher after 21 abstinence days than after 1 day (incubation of methamphetamine craving). AZD8529 decreased cue-induced methamphetamine seeking on day 21 but not day 1 of forced or voluntary abstinence. Conclusions: We introduce a novel animal model to study incubation of drug craving and cue-induced drug seeking after prolonged voluntary abstinence, mimicking the human condition of relapse after successful contingency management treatment. Our data suggest that positive allosteric modulators (PAMs) of mGluR2 should be considered for relapse prevention.
SUMMARY Imagination, defined as the ability to interpret reality in ways that diverge from past experience, is fundamental to adaptive behavior. This can be seen at a simple level in our capacity to predict novel outcomes in new situations. The ability to anticipate outcomes never before received can also influence learning if those imagined outcomes are not received. The orbitofrontal cortex is a key candidate for where the process of imagining likely outcomes occurs; however, its precise role in generating these estimates and applying them to learning remains an open question. Here we address this question by showing that single-unit activity in orbitofrontal cortex reflects novel outcome estimates. The strength of these neural correlates predicted both behavior and learning, and this learning was abolished by temporally specific inhibition of orbitofrontal neurons. These results are consistent with the proposal that the orbitofrontal cortex is critical for integrating information to imagine future outcomes.