Humans demonstrate a remarkable ability to generate accurate and appropriate motor behavior under many different and often uncertain environmental conditions. We previously proposed a new modular architecture, the modular selection and identification for control (MOSAIC) model, for motor learning and control based on multiple pairs of forward (predictor) and inverse (controller) models. The architecture simultaneously learns the multiple inverse models necessary for control as well as how to select the set of inverse models appropriate for a given environment. It combines both feedforward and feedback sensorimotor information so that the controllers can be selected both prior to movement and subsequently during movement. This article extends and evaluates the MOSAIC architecture in the following respects. First, the learning in the architecture was implemented by both the original gradient-descent method and the expectation-maximization (EM) algorithm. Unlike gradient descent, the newly derived EM algorithm is robust to the initial starting conditions and learning parameters. Second, simulations of an object manipulation task show that the architecture can learn to manipulate multiple objects and switch between them appropriately. Moreover, after learning, the model shows generalization to novel objects whose dynamics lie within the polyhedra of already learned dynamics. Finally, when each of the dynamics is associated with a particular object shape, the model is able to select the appropriate controller before movement execution. When presented with a novel shape-dynamic pairing, inappropriate activation of modules is observed, followed by on-line correction.
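The abstract above does not give the selection rule itself. As a rough illustration of how paired forward and inverse models could be combined, the sketch below assumes a soft-max over Gaussian likelihoods of each forward model's prediction error, with a scalar state for simplicity. The function names, the `sigma` scale, and the `priors` argument are hypothetical choices for this example, not taken from the source.

```python
import math

def responsibilities(predictions, observed, sigma=1.0, priors=None):
    """Soft-max weights over modules from forward-model prediction errors.

    Assumption: each module's likelihood is Gaussian in its prediction
    error, and priors (e.g. from contextual cues) are strictly positive.
    """
    n = len(predictions)
    if priors is None:
        priors = [1.0 / n] * n  # uniform prior when no contextual cue is given
    # Work in log space and subtract the maximum for numerical stability.
    log_w = [math.log(p) - (observed - pred) ** 2 / (2.0 * sigma ** 2)
             for p, pred in zip(priors, predictions)]
    m = max(log_w)
    w = [math.exp(lw - m) for lw in log_w]
    total = sum(w)
    return [x / total for x in w]

def blended_command(commands, resp):
    """Responsibility-weighted sum of the inverse models' motor commands."""
    return sum(r * u for r, u in zip(resp, commands))
```

In this sketch, a module whose forward model predicts the observed state well receives a larger weight, so its paired controller contributes more to the final command. Supplying non-uniform `priors` stands in for selection before movement, for example from an object's shape.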
Haruno, Masahiko and Mitsuo Kawato. Different neural correlates of reward expectation and reward expectation error in the putamen and caudate nucleus during stimulus-action-reward association learning. J Neurophysiol 95: 948-959, 2006. First published October 5, 2005; doi:10.1152/jn.00382.2005. To select appropriate behaviors leading to rewards, the brain needs to learn associations among sensory stimuli, selected behaviors, and rewards. Recent imaging and neural-recording studies have revealed that the dorsal striatum plays an important role in learning such stimulus-action-reward associations. However, the putamen and caudate nucleus are embedded in distinct cortico-striatal loop circuits, predominantly connected to motor-related cerebral cortical areas and frontal association areas, respectively. This difference in their cortical connections suggests that the putamen and caudate nucleus are engaged in different functional aspects of stimulus-action-reward association learning. To determine whether this is the case, we conducted an event-related and computational model-based functional MRI (fMRI) study with a stochastic decision-making task in which a stimulus-action-reward association must be learned. A simple reinforcement learning model not only reproduced the subject's action selections reasonably well but also allowed us to quantitatively estimate each subject's temporal profiles of stimulus-action-reward association and reward-prediction error during learning trials. These two internal representations were used in the fMRI correlation analysis. The results revealed that neural correlates of the stimulus-action-reward association reside in the putamen, whereas a correlation with reward-prediction error was found largely in the caudate nucleus and ventral striatum.
These nonuniform spatiotemporal distributions of neural correlates within the dorsal striatum were maintained consistently at various levels of task difficulty, suggesting a functional difference in the dorsal striatum between the putamen and caudate nucleus during stimulus-action-reward association learning.
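The abstract above does not state the equations of its "simple reinforcement learning model". A minimal sketch of how a trial-by-trial stimulus-action value and reward-prediction error can be estimated is given below, assuming a standard delta-rule update with soft-max action selection. The learning rate `alpha`, the inverse temperature `beta`, and the function names are hypothetical and would in practice be fitted to each subject's choices.

```python
import math

def update(q, stimulus, action, reward, alpha=0.1):
    """Delta-rule update of the stimulus-action value.

    Returns the reward-prediction error for this trial:
    received reward minus the currently expected reward.
    """
    delta = reward - q[(stimulus, action)]
    q[(stimulus, action)] += alpha * delta
    return delta

def choice_probabilities(q, stimulus, actions, beta=1.0):
    """Soft-max probability of each action given the current values."""
    prefs = [beta * q[(stimulus, a)] for a in actions]
    m = max(prefs)  # subtract the maximum for numerical stability
    exps = [math.exp(p - m) for p in prefs]
    total = sum(exps)
    return {a: e / total for a, e in zip(actions, exps)}
```

Running `update` over a subject's trial sequence yields two time series, the value of the chosen stimulus-action pair and the prediction error, which is the kind of model-derived regressor the abstract describes using in the fMRI correlation analysis.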
A fundamental challenge in social cognition is how humans learn another person's values to predict their decision-making behavior. This form of learning is often assumed to require simulation of the other by direct recruitment of one's own valuation process to model the other's process. However, the cognitive and neural mechanism of simulation learning is not known. Using behavior, modeling, and fMRI, we show that simulation involves two learning signals in a hierarchical arrangement. A simulated-other's reward prediction error processed in ventromedial prefrontal cortex mediated simulation by direct recruitment, being identical for valuation of the self and simulated-other. However, direct recruitment was insufficient for learning, and also required observation of the other's choices to generate a simulated-other's action prediction error encoded in dorsomedial/dorsolateral prefrontal cortex. These findings show that simulation uses a core prefrontal circuit for modeling the other's valuation to generate prediction and an adjunct circuit for tracking behavioral variation to refine prediction.
Humans can acquire appropriate behaviors that maximize rewards on a trial-and-error basis. Recent electrophysiological and imaging studies have demonstrated that neural activity in the midbrain and ventral striatum encodes the error of reward prediction. However, it is yet to be examined whether the striatum is the main locus of reward-based behavioral learning. To address this, we conducted functional magnetic resonance imaging (fMRI) of a stochastic decision task involving monetary rewards, in which subjects had to learn behaviors involving different task difficulties that were controlled by probability. We performed a correlation analysis of fMRI data by using the explanatory variables derived from subject behaviors. We found that activity in the caudate nucleus was correlated with short-term reward and, furthermore, paralleled the magnitude of a subject's behavioral change during learning. In addition, we confirmed that this parallelism between learning and activity in the caudate nucleus is robustly maintained even when we vary task difficulty by controlling the probability. These findings suggest that the caudate nucleus is one of the main loci for reward-based behavioral learning.
“Social value orientation” characterizes individual differences in anchoring attitudes towards the division of resources. Here, by contrasting people with prosocial and individualistic orientations using functional magnetic resonance imaging, we demonstrate that the degree of inequity aversion in prosocials is predictable from amygdala activity and unaffected by cognitive load. This result suggests that automatic emotional processing in the amygdala lies at the core of prosocial value orientation.
The intention behind another's action and the impact of the outcome are major determinants of human economic behavior. It is poorly understood, however, whether the two systems share a core neural computation. Here, we investigated whether the two systems are causally dissociable in the brain by integrating computational modeling, functional magnetic resonance imaging, and transcranial direct current stimulation experiments in a newly developed trust game task. We show not only that right dorsolateral prefrontal cortex (DLPFC) activity is correlated with intention-based economic decisions and that ventral striatum and amygdala activity are correlated with outcome-based decisions, but also that stimulation to the DLPFC selectively enhances intention-based decisions. These findings suggest that the right DLPFC is involved in the implementation of intention-based decisions in the processing of cooperative decisions. This causal dissociation of cortical and subcortical backgrounds may indicate evolutionary and developmental differences in the two decision systems.