Theories of instrumental learning are centred on understanding how success and failure are used to improve future decisions1. These theories highlight a central role for reward prediction errors in updating the values associated with available actions2. In animals, substantial evidence indicates that the neurotransmitter dopamine might have a key function in this type of learning, through its ability to modulate cortico-striatal synaptic efficacy3. However, no direct evidence links dopamine, striatal activity and behavioural choice in humans. Here we show that, during instrumental learning, the magnitude of reward prediction error expressed in the striatum is modulated by the administration of drugs enhancing (3,4-dihydroxy-L-phenylalanine; L-DOPA) or reducing (haloperidol) dopaminergic function. Accordingly, subjects treated with L-DOPA have a greater propensity to choose the most rewarding action relative to subjects treated with haloperidol. Furthermore, incorporating the magnitude of the prediction errors into a standard action-value learning algorithm accurately reproduced subjects' behavioural choices under the different drug conditions. We conclude that dopamine-dependent modulation of striatal activity can account for how the human brain uses reward prediction errors to improve future decisions.

Dopamine is closely associated with reward-seeking behaviours, such as approach, consummation and addiction3-5. However, exactly how dopamine influences behavioural choice towards available rewards remains poorly understood. Substantial evidence from experiments on primates has led to the hypothesis that midbrain dopamine cells encode errors in reward prediction, the 'teaching signal' embodied in modern computational reinforcement learning theory6. Accumulating data indicate that different aspects of the dopamine signal incorporate information about the time, context, probability and magnitude of an expected reward7-9.
Furthermore, dopamine terminal projections are able to modulate the efficacy of cortico-striatal synapses10,11, providing a mechanism for the adaptation of striatal activity during learning. Thus, dopamine-dependent plasticity could explain how striatal neurons learn to represent both upcoming reward and optimal behaviour12-16. However, no direct evidence is available that links dopamine, striatal plasticity and reward-seeking behaviour in humans. More specifically, although striatal activity has been closely associated with instrumental learning in humans17,18, there is no evidence that this activity is modulated by dopamine. Here we establish this link by using combined behavioural, pharmacological, computational and functional magnetic resonance imaging techniques.

We assessed the effects of haloperidol (an antagonist of dopamine receptors) and L-DOPA (a metabolic precursor of dopamine) on both brain activity and behavioural choice in groups of healthy subjects. Subjects performed an instrumental learning task involving monetary gains and losses, which required choosing between two novel visual st...
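The learning scheme described in this abstract can be sketched as a standard action-value update driven by a reward prediction error, with a single extra gain parameter standing in for pharmacological modulation of that error. This is a minimal illustration, not the paper's fitted model: the reward probabilities, learning rate, softmax temperature, and the `pe_gain` multiplier are all hypothetical values chosen for clarity.

```python
import math
import random


def update_value(q, reward, alpha, pe_gain=1.0):
    # Reward prediction error: obtained minus expected reward.
    delta = reward - q
    # pe_gain is a hypothetical multiplier standing in for dopaminergic
    # enhancement (L-DOPA, > 1) or blunting (haloperidol, < 1) of the
    # prediction-error signal; pe_gain = 1 recovers the standard update.
    return q + alpha * pe_gain * delta


def simulate_learning(n_trials=100, alpha=0.3, beta=3.0, pe_gain=1.0, seed=0):
    """Two-option instrumental learning; returns the fraction of trials on
    which the richer option was chosen. Reward probabilities (0.8 vs 0.2)
    and all parameter values are illustrative, not fitted to any data."""
    rng = random.Random(seed)
    q = [0.0, 0.0]                # action values for the two options
    p_reward = [0.8, 0.2]         # hypothetical reward probabilities
    picks_best = 0
    for _ in range(n_trials):
        # Softmax (logistic) choice between the two current action values.
        p_choose_0 = 1.0 / (1.0 + math.exp(-beta * (q[0] - q[1])))
        choice = 0 if rng.random() < p_choose_0 else 1
        reward = 1.0 if rng.random() < p_reward[choice] else 0.0
        q[choice] = update_value(q[choice], reward, alpha, pe_gain)
        picks_best += (choice == 0)
    return picks_best / n_trials
```

Under this toy model, a larger `pe_gain` speeds acquisition of the richer option, which is one way to capture the reported L-DOPA versus haloperidol contrast in the propensity to choose the most rewarding action.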
Unconscious motivation in humans is often inferred but rarely demonstrated empirically. We imaged motivational processes, implemented in a paradigm that varied the amount and reportability of monetary rewards for which subjects exerted physical effort. We show that, even when subjects cannot report how much money is at stake, they nevertheless deploy more force for higher amounts. Such a motivational effect is underpinned by engagement of a specific basal forebrain region. Our findings thus reveal this region as a key node in brain circuitry that enables expected rewards to energize behavior, without the need for the subjects' awareness.

Humans tend to adapt the degree of effort they expend according to the magnitude of reward they expect. Such a process has been proposed as an operant concept of motivation (1-3). Motivational processes may be obvious, as when a prospector spends days in extreme conditions seeking gold. The popular view is that motivation can also be unconscious, such that a person may be unable to report the goals or rewards that drive a particular behavior. However, empirical evidence on this issue is lacking, and the potential brain mechanisms involved in converting expected rewards into behavioral activation are poorly understood.

We developed an experimental paradigm to visualize unconscious motivational processes, using functional magnetic resonance imaging. A classical approach to trigger unconscious processing is subliminal stimulation, which can be implemented by means of masking procedures. The terminology we use in this report is based on a recent taxonomy (4), in which a process is considered subliminal if it is attended but not reportable. Successful brain imaging studies of subliminal processes have focused so far on processing words (5, 6) as well as emotional stimuli (7, 8). In our study, the object of masking was an incentive stimulus for a future action, represented by the amount of reward at stake.
The question we asked is whether, and how, the human brain energizes behavior in proportion to subliminal incentives.

We developed an incentive force task, using money as a reward: a manipulation that is consistently shown to activate reward circuits in the human brain (9-11). The exact level of motivation was manipulated by randomly assigning the amount at stake as one pound or one penny. Pictures of the corresponding coins were displayed on a computer screen at the beginning of each trial, between two screenshots of "mask" images (Fig. 1). The reportability of the monetary stakes depended on their display duration, which could be 17, 50, or 100 ms. The perception of the first two durations was determined as subliminal in a preliminary behavioral test, where subjects reported not seeing anything other than the mask. The third duration was consistently associated with conscious perception of the stimuli a...
While dopamine systems have been implicated in the pathophysiology of schizophrenia and psychosis for many years, how dopamine dysfunction generates psychotic symptoms remains unknown. Recent theoretical interest has been directed at relating the known role of midbrain dopamine neurons in reinforcement learning, motivational salience and prediction error to explain the abnormal mental experience of psychosis. However, this theoretical model has yet to be explored empirically. To examine a link between psychotic experience, reward learning and dysfunction of the dopaminergic midbrain and associated target regions, we asked a group of first episode psychosis patients suffering from active positive symptoms and a group of healthy control participants to perform an instrumental reward conditioning experiment. We characterized neural responses using functional magnetic resonance imaging. We observed that patients with psychosis exhibit abnormal physiological responses associated with reward prediction error in the dopaminergic midbrain, striatum and limbic system, and we demonstrated subtle abnormalities in the ability of psychosis patients to discriminate between motivationally salient and neutral stimuli. This study provides the first evidence linking abnormal mesolimbic activity, reward learning and psychosis.
Decision making consists of choosing among available options on the basis of a valuation of their potential costs and benefits. Most theoretical models of decision making in behavioral economics, psychology, and computer science propose that the desirability of outcomes expected from alternative options can be quantified by utility functions. These utility functions allow a decision maker to assign subjective values to each option under consideration by weighting the likely benefits and costs resulting from an action and to select the one with the highest subjective value. Here, we used model-based neuroimaging to test whether the human brain uses separate valuation systems for rewards (erotic stimuli) associated with different types of costs, namely, delay and effort. We show that humans devalue rewards associated with physical effort in a strikingly similar fashion to the way they devalue rewards associated with delays, and that a single computational model derived from economic theory can account for the behavior observed in both delay discounting and effort discounting. However, our neuroimaging data reveal that the human brain uses distinct valuation subsystems for different types of costs, reflecting in opposite fashion delayed reward and future energetic expenses. The ventral striatum and the ventromedial prefrontal cortex represent the increasing subjective value of delayed rewards, whereas a distinct network, composed of the anterior cingulate cortex and the anterior insula, represents the decreasing value of the effortful option, coding the expected expense of energy. Together, these data demonstrate that the valuation processes underlying different types of costs can be fractionated at the cerebral level.
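A single discounting equation handling either cost type, as this abstract describes, can be illustrated with a hyperbolic-style functional form that is common in the discounting literature. This is a sketch under that assumption only: the equation, the parameter `k`, and all numeric values below are illustrative, not the model actually fitted in the study.

```python
def subjective_value(reward, cost, k):
    # Hyperbolic-style discounting: SV = R / (1 + k * C), where C is either
    # a delay or an effort level and k is an individual discounting rate.
    # One common functional form, used here for illustration only.
    return reward / (1.0 + k * cost)


def choose(costly_reward, cost, immediate_reward, k):
    # Pick whichever option carries the higher subjective value
    # (ties go to the immediate, cost-free option).
    sv_costly = subjective_value(costly_reward, cost, k)
    return "costly" if sv_costly > immediate_reward else "immediate"
```

For example, with `k = 0.5`, a reward of 10 at cost 2 is worth 5, so it loses to an immediate 6 but beats an immediate 4; the same arithmetic applies whether `cost` denotes minutes of delay or units of effort.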
A key process in decision-making is estimating the value of possible outcomes. Growing evidence suggests that different types of values are automatically encoded in the ventromedial prefrontal cortex (VMPFC). Here we extend this idea by suggesting that any overt judgment is accompanied by a second-order valuation (a confidence estimate), which is also automatically incorporated in VMPFC activity. In accordance with the predictions of our normative model of rating tasks, two behavioral experiments showed that confidence levels were quadratically related to first-order judgments (age, value or probability ratings). The analysis of three functional magnetic resonance imaging data sets using similar rating tasks confirmed that the quadratic extension of first-order ratings (our proxy for confidence) was encoded in VMPFC activity, even if no confidence judgment was required of the participants. Such an automatic aggregation of value and confidence in the same brain region might provide insight into many distortions of judgment and choice.
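The "quadratic extension" confidence proxy described above can be made concrete with a minimal sketch: confidence is assumed lowest for mid-scale ratings and highest at the scale endpoints. The 0-100 rating scale and the normalization below are assumptions for illustration, not the study's exact regressor.

```python
def confidence_proxy(rating, scale_min=0.0, scale_max=100.0):
    # Quadratic extension of a first-order rating: 0 at the scale midpoint
    # (where the rater is presumed least certain) and 1 at either endpoint
    # (where the rater is presumed most certain). Scale bounds and
    # normalization are illustrative choices.
    mid = 0.5 * (scale_min + scale_max)
    half_range = 0.5 * (scale_max - scale_min)
    return ((rating - mid) / half_range) ** 2
```

On a 0-100 scale, ratings of 0 and 100 both map to maximal proxy confidence while a rating of 50 maps to zero, capturing the U-shaped (quadratic) relation between first-order judgments and confidence reported in the behavioral experiments.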
According to economic theories, preference for one item over others reveals its rank value on a common scale. Previous studies identified brain regions encoding such values. Here we verify that these regions can valuate various categories of objects and further test whether they still express preferences when attention is diverted to another task. During functional neuroimaging, participants rated either the pleasantness (explicit task) or the age (distractive task) of pictures from different categories (face, house, and painting). After scanning, the same pictures were presented in pairs, and subjects had to choose the one they preferred. We isolated brain regions that reflect both values (pleasantness ratings) and preferences (binary choices). Preferences were encoded whatever the stimulus (face, house, or painting) and task (explicit or distractive). These regions may therefore constitute a brain system that automatically engages in valuating the various components of our environment so as to influence our future choices.
How the brain uses success and failure to optimize future decisions is a long-standing question in neuroscience. One computational solution involves updating the values of context-action associations in proportion to a reward prediction error. Previous evidence suggests that such computations are expressed in the striatum and, as they are cognitively impenetrable, represent an unconscious learning mechanism. Here, we formally test this by studying instrumental conditioning in a situation where we masked contextual cues, such that they were not consciously perceived. Behavioral data showed that subjects nonetheless developed a significant propensity to choose cues associated with monetary rewards relative to punishments. Functional neuroimaging revealed that during conditioning, cue values and prediction errors generated from a computational model both correlated with activity in ventral striatum. We conclude that, even without conscious processing of contextual cues, our brain can learn their reward value and use them to bias decision making.
Mental and physical efforts, such as paying attention and lifting weights, have been shown to involve different brain systems. These cognitive and motor systems, respectively, include cortical networks (prefronto-parietal and precentral regions) as well as subregions of the dorsal basal ganglia (caudate and putamen). Both systems appear sensitive to incentive motivation: their activity increases when we work for higher rewards. Another brain system, including the ventral prefrontal cortex and the ventral basal ganglia, has been implicated in encoding expected rewards. How this motivational system drives the cognitive and motor systems remains poorly understood. More specifically, it is unclear whether cognitive and motor systems can be driven by a common motivational center or if they are driven by distinct, dedicated motivational modules. To address this issue, we used functional MRI to scan healthy participants while they performed a task in which incentive motivation, cognitive, and motor demands were varied independently. We reasoned that a common motivational node should (1) represent the reward expected from effort exertion, (2) correlate with the performance attained, and (3) switch effective connectivity between cognitive and motor regions depending on task demand. The ventral striatum fulfilled all three criteria and therefore qualified as a common motivational node capable of driving both cognitive and motor regions of the dorsal striatum. Thus, we suggest that the interaction between a common motivational system and the different task-specific systems underpinning behavioral performance might occur within the basal ganglia.