The mechanisms of reward maximization have been extensively studied at both the computational and neural levels. By contrast, little is known about how the brain learns to choose the options that minimize action cost. In principle, the brain could have evolved a general mechanism that applies the same learning rule to the different dimensions of choice options. To test this hypothesis, we scanned healthy human volunteers while they performed a probabilistic instrumental learning task that varied in both the physical effort and the monetary outcome associated with choice options. Behavioral data showed that the same computational rule, using prediction errors to update expectations, could account for both reward maximization and effort minimization. However, these learning-related variables were encoded in partially dissociable brain areas. In line with previous findings, the ventromedial prefrontal cortex was found to positively represent expected and actual rewards, regardless of effort. A separate network, encompassing the anterior insula, the dorsal anterior cingulate, and the posterior parietal cortex, correlated positively with expected and actual efforts. These findings suggest that the same computational rule is applied by distinct brain systems, depending on the choice dimension, cost or benefit, that has to be learned.
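The prediction-error rule described above can be sketched in a few lines. This is a minimal illustration, not the study's actual model: the function name, learning rate, and outcome values are assumptions chosen for clarity. The point is that one and the same update rule serves both dimensions, with only the tracked quantity (reward vs. effort) differing.

```python
def delta_update(expectation: float, outcome: float, alpha: float = 0.3) -> float:
    """Move an expectation toward an observed outcome by a fraction
    alpha of the prediction error (the classic delta rule)."""
    prediction_error = outcome - expectation
    return expectation + alpha * prediction_error

# The same rule applied to two choice dimensions (illustrative values):
expected_reward, expected_effort = 0.0, 0.0
for observed_reward, observed_effort in [(1.0, 0.2), (0.0, 0.8), (1.0, 0.2)]:
    expected_reward = delta_update(expected_reward, observed_reward)
    expected_effort = delta_update(expected_effort, observed_effort)
```

A reward maximizer prefers the option with the higher expected reward; an effort minimizer prefers the one with the lower expected effort, so only the sign of the comparison differs between the two dimensions.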
When learning the value of actions in volatile environments, humans often make seemingly irrational decisions which fail to maximize expected value. We reasoned that these 'non-greedy' decisions, instead of reflecting information seeking during choice, may be caused by computational noise in the learning of action values. Here, using reinforcement learning (RL) models of behavior and multimodal neurophysiological data, we show that the majority of non-greedy decisions stems from this learning noise. The trial-to-trial variability of sequential learning steps and their impact on behavior could be predicted both by BOLD responses to obtained rewards in the dorsal anterior cingulate cortex (dACC) and by phasic pupillary dilation -suggestive of neuromodulatory fluctuations driven by the locus coeruleus-norepinephrine (LC-NE) system. Together, these findings indicate that most of behavioral variability, rather than reflecting human exploration, is due to the limited computational precision of reward-guided learning.
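The "learning noise" idea above can be made concrete with a toy model. This is a hedged sketch, not the authors' fitted model: here the noise is simply zero-mean Gaussian with a fixed standard deviation added to each update step, and all names and parameter values are illustrative. The key contrast is that choice itself stays strictly greedy; apparent exploration arises solely from imprecision in the learned values.

```python
import random

def noisy_update(q: float, reward: float, alpha: float = 0.3,
                 noise_sd: float = 0.1, rng=random) -> float:
    """Delta-rule update corrupted by zero-mean computational noise
    on the learning step itself (illustrative noise model)."""
    delta = reward - q
    return q + alpha * delta + rng.gauss(0.0, noise_sd)

# Strictly greedy choice over noisily learned values can still look
# "non-greedy" relative to the exact (noise-free) values:
rng = random.Random(0)
q_noisy = [0.0, 0.0]   # values learned with noise
q_exact = [0.0, 0.0]   # what exact learning would have produced
rewards = {0: 0.8, 1: 0.2}  # fixed mean payoffs for the two actions
for _ in range(50):
    choice = max(range(2), key=lambda a: q_noisy[a])  # always greedy
    r = rewards[choice]
    q_noisy[choice] = noisy_update(q_noisy[choice], r, rng=rng)
    q_exact[choice] = noisy_update(q_exact[choice], r, noise_sd=0.0)
```

Whenever the noisy values rank the actions differently from the exact ones, the greedy agent emits a decision that an outside observer would misclassify as exploratory.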
Instrumental learning is a fundamental process through which agents optimize their choices, taking into account various dimensions of available options such as the possible reward or punishment outcomes and the costs associated with potential actions. Although the implication of dopamine in learning from choice outcomes is well established, less is known about its role in learning the action costs such as effort. Here, we tested the ability of patients with Parkinson's disease (PD) to maximize monetary rewards and minimize physical efforts in a probabilistic instrumental learning task. The implication of dopamine was assessed by comparing performance ON and OFF prodopaminergic medication. In a first sample of PD patients (n = 15), we observed that reward learning, but not effort learning, was selectively impaired in the absence of treatment, with a significant interaction between learning condition (reward vs effort) and medication status (OFF vs ON). These results were replicated in a second, independent sample of PD patients (n = 20) using a simplified version of the task. According to Bayesian model selection, the best account for medication effects in both studies was a specific amplification of reward magnitude in a Q-learning algorithm. These results suggest that learning to avoid physical effort is independent from dopaminergic circuits and strengthen the general idea that dopaminergic signaling amplifies the effects of reward expectation or obtainment on instrumental behavior.
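The winning model, "amplification of reward magnitude in a Q-learning algorithm," can be sketched as a standard Q-value update with a multiplicative gain on the outcome. This is an illustrative reconstruction under assumptions: the parameter name `reward_gain`, its values, and the exact placement of the gain are not taken from the study, which only the full paper specifies.

```python
def q_update(q: float, outcome: float, alpha: float = 0.3,
             reward_gain: float = 1.0) -> float:
    """Q-learning update in which the obtained outcome is first scaled
    by a gain parameter; reward_gain > 1 amplifies reward magnitude
    (an illustrative stand-in for the ON-medication effect)."""
    amplified = reward_gain * outcome
    return q + alpha * (amplified - q)

# Same reward history, different gains (values are hypothetical):
q_on, q_off = 0.0, 0.0
for r in [1.0, 1.0, 0.0, 1.0]:
    q_on = q_update(q_on, r, reward_gain=1.5)   # ON medication
    q_off = q_update(q_off, r, reward_gain=1.0)  # OFF medication
```

Because the gain multiplies only the outcome term, it leaves effort learning untouched in such a model, which matches the reported dissociation between reward and effort conditions.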
Informational cues such as the price of a wine can trigger expectations about its taste quality and thereby modulate the sensory experience on a reported and neural level. Yet it is unclear how the brain translates such expectations into sensory pleasantness. We used a whole-brain multilevel mediation approach with healthy participants who tasted identical wines cued with different prices while their brains were scanned using fMRI. We found that the brain's valuation system (BVS), in concert with the anterior prefrontal cortex, played a key role in implementing the effect of price cues on taste pleasantness ratings. The sensitivity of the BVS to monetary rewards outside the taste domain moderated the strength of these effects. These findings provide novel evidence for the fundamental role that neural pathways linked to motivation and affective regulation play in the effect of informational cues on sensory experiences.
Increased mental-health symptoms as a reaction to stressful life events, such as the Covid-19 pandemic, are common. Critically, successful adaptation helps to reduce such symptoms to baseline, preventing long-term psychiatric disorders. It is thus important to understand whether and which psychiatric symptoms show transient elevations, and which persist long-term and become chronically heightened. At particular risk for the latter trajectory are symptom dimensions directly affected by the pandemic, such as obsessive–compulsive (OC) symptoms. In this longitudinal large-scale study (N = 406), we assessed how OC, anxiety and depression symptoms changed throughout the first pandemic wave in a sample of the general UK public. We further examined how these symptoms affected pandemic-related information seeking and adherence to governmental guidelines. We show that scores in all psychiatric domains were initially elevated, but showed distinct longitudinal change patterns. Depression scores decreased and anxiety scores plateaued during the first pandemic wave, while OC symptoms further increased, even after the easing of Covid-19 restrictions. These OC symptoms were directly linked to Covid-related information seeking, which gave rise to higher adherence to government guidelines. This increase in OC symptoms in this non-clinical sample shows that the domain is disproportionately affected by the pandemic. We discuss the long-term impact of the Covid-19 pandemic on public mental health, which calls for continued close observation of symptom development.