In many cases, learning is thought to be driven by differences between the value of rewards we expect and rewards we actually receive. Yet learning can also occur when the identity of the reward we receive is not as expected, even if its value remains unchanged. Learning from changes in reward identity implies access to an internal model of the environment, from which information about the identity of the expected reward can be derived. As a result, such learning is not easily accounted for by model-free reinforcement learning theories such as temporal difference reinforcement learning (TDRL), which predicate learning on changes in reward value, but not identity. Here, we used unblocking procedures to assess learning driven by value- versus identity-based prediction errors. Rats were trained to associate distinct visual cues with different food quantities and identities. These cues were subsequently presented in compound with novel auditory cues and the reward quantity or identity was selectively changed. Unblocking was assessed by presenting the auditory cues alone in a probe test. Consistent with neural implementations of TDRL models, we found that the ventral striatum was necessary for learning in response to changes in reward value. However, this area, along with orbitofrontal cortex, was also required for learning driven by changes in reward identity. This observation requires that existing models of TDRL in the ventral striatum be modified to include information about the specific features of expected outcomes derived from model-based representations, and that the role of orbitofrontal cortex in these models be clearly delineated.
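The contrast between value-based and identity-based prediction errors can be illustrated with a minimal sketch. It is not the authors' model: the reward names, values, and feature encoding below are hypothetical. A scalar error of the kind TDRL relies on compares only expected and received value, so it is zero when one reward is swapped for an equally valued one, whereas an error computed over a vector that also encodes identity is not.

```python
# Illustrative sketch only: reward names, values, and the feature encoding
# are assumptions made for this example, not details taken from the study.

# Hypothetical rewards: (scalar value, per-flavor quantity features)
REWARDS = {
    "flavor_a_x1": (1.0, (1.0, 0.0)),
    "flavor_b_x1": (1.0, (0.0, 1.0)),
    "flavor_a_x3": (3.0, (3.0, 0.0)),
}


def value_prediction_error(expected, received):
    """Scalar error: received value minus expected value."""
    return REWARDS[received][0] - REWARDS[expected][0]


def identity_prediction_error(expected, received):
    """Per-feature error over an identity-specific outcome vector."""
    expected_features = REWARDS[expected][1]
    received_features = REWARDS[received][1]
    return tuple(r - e for r, e in zip(received_features, expected_features))


if __name__ == "__main__":
    # Value shift: same flavor, larger quantity.
    print(value_prediction_error("flavor_a_x1", "flavor_a_x3"))
    # Identity shift: different flavor, same quantity.
    print(value_prediction_error("flavor_a_x1", "flavor_b_x1"))
    print(identity_prediction_error("flavor_a_x1", "flavor_b_x1"))
```

In this toy encoding, only the vector-valued error distinguishes the identity shift from no change at all, which is the sense in which purely value-based accounts struggle with identity unblocking.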
Cocaine addiction is characterized by poor judgment and maladaptive decision-making. Here we review evidence implicating the orbitofrontal cortex in such behavior. This evidence suggests that cocaine-induced changes in orbitofrontal cortex disrupt the representation of states and transition functions that form the basis of flexible and adaptive ‘model-based’ behavioral control. By impairing this function, cocaine exposure leads to an overemphasis on less flexible, maladaptive ‘model-free’ control systems. We propose that such an effect accounts for the complex pattern of maladaptive behaviors associated with cocaine addiction.
Background
Cue-induced methamphetamine craving increases after prolonged forced (experimenter-imposed) abstinence from the drug (incubation of methamphetamine craving). Here, we determined whether this incubation phenomenon would occur under conditions that promote voluntary (self-imposed) abstinence. We also determined the effect of the novel mGluR2 positive allosteric modulator, AZD8529, on incubation of methamphetamine craving after forced or voluntary abstinence.
Methods
We trained rats to self-administer palatable food (6 sessions) and then to self-administer methamphetamine under two conditions: 12 sessions (9-hr/day) or 50 sessions (3-hr/day). We then assessed cue-induced methamphetamine seeking in extinction tests after 1 or 21 abstinence days. Between tests, the rats underwent either forced abstinence (no access to the food- or drug-paired levers) or voluntary abstinence for 19 days (achieved via a discrete choice procedure between methamphetamine and palatable food; 20 trials per day). We also determined the effect of subcutaneous injections of AZD8529 (20 and 40 mg/kg) on cue-induced methamphetamine seeking 1 or 21 days after forced or voluntary abstinence.
Results
Under both training and abstinence conditions, cue-induced methamphetamine seeking in the extinction tests was higher after 21 abstinence days than after 1 day (incubation of methamphetamine craving). AZD8529 decreased cue-induced methamphetamine seeking on day 21 but not day 1 of forced or voluntary abstinence.
Conclusions
We introduce a novel animal model to study incubation of drug craving and cue-induced drug seeking after prolonged voluntary abstinence, mimicking the human condition of relapse after successful contingency management treatment. Our data suggest that positive allosteric modulators (PAMs) of mGluR2 should be considered for relapse prevention.
SUMMARY
Imagination, defined as the ability to interpret reality in ways that diverge from past experience, is fundamental to adaptive behavior. This can be seen at a simple level in our capacity to predict novel outcomes in new situations. The ability to anticipate outcomes never before received can also influence learning if those imagined outcomes are not received. The orbitofrontal cortex is a key candidate for where the process of imagining likely outcomes occurs; however, its precise roles in generating these estimates and in applying them to learning remain open questions. Here we address these questions by showing that single-unit activity in orbitofrontal cortex reflects novel outcome estimates. The strength of these neural correlates predicted both behavior and learning, and this learning was abolished by temporally specific inhibition of orbitofrontal neurons. These results are consistent with the proposal that the orbitofrontal cortex is critical for integrating information to imagine future outcomes.
Addiction is characterized by a lack of insight into the likely outcomes of one’s behavior. Insight or the ability to imagine outcomes is evident when outcomes have not been directly experienced. Using this concept, work in both rats and humans has recently identified neural correlates of insight in the medial and orbital prefrontal cortices. Here we show that these correlates are selectively abolished in rats by cocaine self-administration. Their abolition was associated with behavioral deficits and reduced synaptic efficacy in orbitofrontal cortex, reversal of which by optogenetic activation restored normal behavior. These results provide a link between cocaine use and problems with insight. Deficits in these functions are likely to be particularly important for problems such as drug relapse, in which behavior fails to account for likely adverse outcomes. As such, these data provide a neural target for therapeutic approaches to address these defining long-term effects of drug use.
Cocaine addiction is a complex and multidimensional process involving a number of behavioral and neural forms of plasticity. The behavioral transition from voluntary drug use to compulsive drug taking may be explained at the neural level by drug-induced changes in function or interaction between a flexible planning system, associated with prefrontal cortical regions, and a rigid habit system, associated with the striatum. The dichotomy between these two systems is operationalized in computational theory by positing model-based and model-free learning mechanisms, the former relying on an “internal model” of the environment and the latter on pre-computed or cached values to control behavior. In this review, we will suggest that model-free and model-based learning mechanisms appear to be differentially affected, at least in the case of psychostimulants such as cocaine, with the former being enhanced while the latter are disrupted. As a result, the behavior of long-term drug users becomes less flexible and less responsive to the desirability of expected outcomes, and more habitual, based on the long history of reinforcement. To support our specific proposal, we will review recent neural and behavioral evidence on the effect of psychostimulant exposure on the structure and function of the orbitofrontal cortex and dorsolateral striatum.
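The distinction between an internal model and cached values can be sketched with a toy outcome-devaluation example. This is an illustration under stated assumptions, not an implementation from the review: the actions, outcomes, learning rate, and trial count are all hypothetical. The model-based controller recomputes action values from a transition model and current outcome values, while the model-free controller keeps values cached from past reinforcement.

```python
# Illustrative sketch only: the task, names, and numbers are assumptions
# chosen for this example, not details taken from the review.

# Internal model: which outcome each action leads to (a transition function).
TRANSITIONS = {"press_lever": "food", "pull_chain": "sucrose"}


def model_based_values(outcome_values):
    """Compute action values on demand from the model and current outcome values."""
    return {action: outcome_values[outcome] for action, outcome in TRANSITIONS.items()}


def train_model_free(outcome_values, alpha=0.5, trials=20):
    """Cache action values by incremental updates from experienced reward."""
    cached = {action: 0.0 for action in TRANSITIONS}
    for _ in range(trials):
        for action, outcome in TRANSITIONS.items():
            reward = outcome_values[outcome]
            cached[action] += alpha * (reward - cached[action])
    return cached


def devaluation_demo():
    """Train with both outcomes valued, then devalue food without new experience."""
    outcome_values = {"food": 1.0, "sucrose": 1.0}
    cached = train_model_free(outcome_values)
    # Devalue food (for example by satiety); the lever is not pressed again.
    outcome_values["food"] = 0.0
    return model_based_values(outcome_values), cached


if __name__ == "__main__":
    model_based, model_free = devaluation_demo()
    print("model-based:", model_based)
    print("model-free (cached):", model_free)
```

In this sketch the model-based value of the lever drops immediately after devaluation, whereas the cached value stays high until it is relearned through further experience, which is one way to picture behavior that is habitual and insensitive to the current desirability of outcomes.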
These findings suggest an unforeseen dissociation between opioid and psychostimulant reward and demonstrate that, even in the laboratory rat, some contexts are associated with the propensity to self-administer more opioid than psychostimulant drugs and vice versa. This indicates that drug taking is not only influenced by economic or cultural factors but can also be modulated at a much more basic level by the setting in which drugs are experienced.