The estimation of the reward an action will yield is critical in decision-making. To elucidate the role of the basal ganglia in this process, we recorded striatal neurons of monkeys who chose between left and right handle turns, based on the estimated reward probabilities of the actions. During a delay period before the choices, the activity of more than one-third of striatal projection neurons was selective to the values of one of the two actions. Fewer neurons were tuned to relative values or action choice. These results suggest representation of action values in the striatum, which can guide action selection in the basal ganglia circuit.
We propose a modular reinforcement learning architecture for nonlinear, nonstationary control tasks, which we call multiple model-based reinforcement learning (MMRL). The basic idea is to decompose a complex task into multiple domains in space and time based on the predictability of the environmental dynamics. The system is composed of multiple modules, each of which consists of a state prediction model and a reinforcement learning controller. The "responsibility signal," which is given by the softmax function of the prediction errors, is used to weight the outputs of multiple modules, as well as to gate the learning of the prediction models and the reinforcement learning controllers. We formulate MMRL for both discrete-time, finite-state case and continuous-time, continuous-state case. The performance of MMRL was demonstrated for discrete case in a nonstationary hunting task in a grid world and for continuous case in a nonlinear, nonstationary control task of swinging up a pendulum with variable physical parameters.
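The weighting step described above can be illustrated with a minimal sketch. The abstract says only that the responsibility signal is a softmax of the prediction errors. The Gaussian-style scoring with a width parameter `sigma`, the function names `responsibility` and `blend`, and the example numbers are all assumptions made for illustration, not the authors' implementation.

```python
import math

def responsibility(pred_errors, sigma=1.0):
    """Softmax over negative scaled squared prediction errors.

    Assumption: each module is scored by -(error^2) / (2 * sigma^2), so a
    module that predicts the next state well gets a larger weight.
    """
    scores = [-(e ** 2) / (2.0 * sigma ** 2) for e in pred_errors]
    m = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [x / total for x in exps]

def blend(outputs, weights):
    """Responsibility-weighted sum of the modules' scalar outputs."""
    return sum(w * o for w, o in zip(weights, outputs))

# Hypothetical prediction errors and controller outputs for three modules.
errors = [0.1, 1.0, 2.0]
module_outputs = [0.5, -1.0, 2.0]

lam = responsibility(errors)
action = blend(module_outputs, lam)
```

In this sketch the same weights `lam` could also scale each module's learning rate, which is one way to read the gating of learning mentioned in the abstract.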
According to many modern economic theories, actions simply reflect an individual's preferences, whereas a psychological phenomenon called "cognitive dissonance" claims that actions can also create preference. Cognitive dissonance theory states that after making a difficult choice between two equally preferred items, the act of rejecting a favorite item induces an uncomfortable feeling (cognitive dissonance), which in turn motivates individuals to change their preferences to match their prior decision (i.e., reducing preference for rejected items). Recently, however, Chen and Risen [Chen K, Risen J (2010) J Pers Soc Psychol 99:573-594] pointed out a serious methodological problem, which casts a doubt on the very existence of this choice-induced preference change as studied over the past 50 y. Here, using a proper control condition and two measures of preferences (self-report and brain activity), we found that the mere act of making a choice can change self-report preference as well as its neural representation (i.e., striatum activity), thus providing strong evidence for choice-induced preference change. Furthermore, our data indicate that the anterior cingulate cortex and dorsolateral prefrontal cortex tracked the degree of cognitive dissonance on a trial-by-trial basis. Our findings provide important insights into the neural basis of how actions can alter an individual's preferences.
In Aesop's Fable "The Fox and the Grapes," a fox tries to get some grapes that are hanging on a high, unreachable vine. After failing to reach them, the fox decides that the grapes were probably sour anyway. An interesting aspect of this story is the idea that actions (e.g., giving up on the grapes) can change preferences.
Because the dissonance-induced preference change indicates that behaviors can create, not just reflect, people's preferences, it challenges a vital assumption in neoclassical economics that preference or "hedonic utility" determines people's behavior (1). Since Brehm's original study in 1956 (2), this sort of preference change (i.e., the increase in ratings for chosen goods and/or the decrease in ratings for rejected goods) has been repeatedly observed under the "free-choice paradigm" (3-6). In a typical free-choice study design, participants are asked to: (i) rate their preference for a set of goods (e.g., art prints, CDs, and so forth), (ii) choose between two of the goods, and (iii) rate them again. After making a difficult choice between two equally preferred items at stage ii, individuals tend to like the selected item more and the rejected item less than they originally did (2). This tendency happens because when making a choice between two equally highly preferred items, individuals have to give up either of the two liked items. According to cognitive dissonance theory (7), simultaneously holding two or more contradictory cognitions (e.g., "I like the item" and "I rejected it") causes a psychological discomfort called "cognitive dissonance," and individuals are motivated to reduce this discomfort by changing...
Humans can acquire appropriate behaviors that maximize rewards on a trial-and-error basis. Recent electrophysiological and imaging studies have demonstrated that neural activity in the midbrain and ventral striatum encodes the error of reward prediction. However, it is yet to be examined whether the striatum is the main locus of reward-based behavioral learning. To address this, we conducted functional magnetic resonance imaging (fMRI) of a stochastic decision task involving monetary rewards, in which subjects had to learn behaviors involving different task difficulties that were controlled by probability. We performed a correlation analysis of fMRI data by using the explanatory variables derived from subject behaviors. We found that activity in the caudate nucleus was correlated with short-term reward and, furthermore, paralleled the magnitude of a subject's behavioral change during learning. In addition, we confirmed that this parallelism between learning and activity in the caudate nucleus is robustly maintained even when we vary task difficulty by controlling the probability. These findings suggest that the caudate nucleus is one of the main loci for reward-based behavioral learning.