Instrumental responses are hypothesized to be of two kinds: habitual and goal-directed, mediated by the sensorimotor and the associative cortico-basal ganglia circuits, respectively. The existence of the two heterogeneous associative learning mechanisms can be hypothesized to arise from the comparative advantages that they have at different stages of learning. In this paper, we assume that the goal-directed system is behaviourally flexible, but slow in choice selection. The habitual system, in contrast, is fast in responding, but inflexible in adapting its behavioural strategy to new conditions. Based on these assumptions and using the computational theory of reinforcement learning, we propose a normative model for arbitration between the two processes that makes an approximately optimal balance between search-time and accuracy in decision making. Behaviourally, the model can explain experimental evidence on behavioural sensitivity to outcome at the early stages of learning, but insensitivity at the later stages. It also explains that when two choices with equal incentive values are available concurrently, the behaviour remains outcome-sensitive, even after extensive training. Moreover, the model can explain choice reaction time variations during the course of learning, as well as the experimental observation that as the number of choices increases, the reaction time also increases. Neurobiologically, by assuming that phasic and tonic activities of midbrain dopamine neurons carry the reward prediction error and the average reward signals used by the model, respectively, the model predicts that whereas phasic dopamine indirectly affects behaviour through reinforcing stimulus-response associations, tonic dopamine can directly affect behaviour through manipulating the competition between the habitual and the goal-directed systems and thus, affect reaction time.
Computational modeling plays an important role in modern neuroscience research. Much previous research has relied on statistical methods, separately, to address two problems that are actually interdependent. First, given a particular computational model, Bayesian hierarchical techniques have been used to estimate individual variation in parameters over a population of subjects, leveraging their population-level distributions. Second, candidate models are themselves compared, and individual variation in the expressed model estimated, according to the fits of the models to each subject. The interdependence between these two problems arises because the relevant population for estimating parameters of a model depends on which other subjects express the model. Here, we propose a hierarchical Bayesian inference (HBI) framework for concurrent model comparison, parameter estimation and inference at the population level, combining previous approaches. We show that this framework has important advantages for both parameter estimation and model comparison theoretically and experimentally. The parameters estimated by the HBI show smaller errors compared to other methods. Model comparison by HBI is robust against outliers and is not biased towards overly simplistic models. Furthermore, the fully Bayesian approach of our theory enables researchers to make inference on group-level parameters by performing HBI t-test.
Sound principles of statistical inference dictate that uncertainty shapes learning. In this work, we revisit the question of learning in volatile environments, in which both the first and second-order statistics of observations dynamically evolve over time. We propose a new model, the volatile Kalman filter (VKF), which is based on a tractable state-space model of uncertainty and extends the Kalman filter algorithm to volatile environments. The proposed model is algorithmically simple and encompasses the Kalman filter as a special case. Specifically, in addition to the error-correcting rule of Kalman filter for learning observations, the VKF learns volatility according to a second error-correcting rule. These dual updates echo and contextualize classical psychological models of learning, in particular hybrid accounts of Pearce-Hall and Rescorla-Wagner. At the computational level, compared with existing models, the VKF gives up some flexibility in the generative model to enable a more faithful approximation to exact inference. When fit to empirical data, the VKF is better behaved than alternatives and better captures human choice data in two independent datasets of probabilistic learning tasks. The proposed model provides a coherent account of learning in stable or volatile environments and has implications for decision neuroscience research.
Learning and decision-making are modulated by socio-emotional processing and such modulation is implicated in clinically relevant personality traits of social anxiety. The present study elucidates the computational and neural mechanisms by which emotionally aversive cues disrupt learning in socially anxious human individuals. Healthy volunteers with low or high trait social anxiety performed a reversal learning task requiring learning actions in response to angry or happy face cues. Choice data were best captured by a computational model in which learning rate was adjusted according to the history of surprises. High trait socially anxious individuals used a less-dynamic strategy for adjusting their learning rate in trials started with angry face cues and unlike the low social anxiety group, their dorsal anterior cingulate cortex (dACC) activity did not covary with the learning rate. Our results demonstrate that trait social anxiety is accompanied by disruption of optimal learning and dACC activity in threatening situations.
Previous research has stressed the importance of uncertainty for controlling the speed of learning, and how such control depends on the learner inferring the noise properties of the environment, especially volatility: the speed of change. However, learning rates are jointly determined by the comparison between volatility and a second factor, moment-to-moment stochasticity. Yet much previous research has focused on simplified cases corresponding to estimation of either factor alone. Here, we introduce a learning model, in which both factors are learned simultaneously from experience, and use the model to simulate human and animal data across many seemingly disparate neuroscientific and behavioral phenomena. By considering the full problem of joint estimation, we highlight a set of previously unappreciated issues, arising from the mutual interdependence of inference about volatility and stochasticity. This interdependence complicates and enriches the interpretation of previous results, such as pathological learning in individuals with anxiety and following amygdala damage.
scite is a Brooklyn-based organization that helps researchers better discover and understand research articles through Smart Citations–citations that display the context of the citation and describe whether the article provides supporting or contrasting evidence. scite is used by students and researchers from around the world and is funded in part by the National Science Foundation and the National Institute on Drug Abuse of the National Institutes of Health.
hi@scite.ai
10624 S. Eastern Ave., Ste. A-614
Henderson, NV 89052, USA
Copyright © 2024 scite LLC. All rights reserved.
Made with 💙 for researchers
Part of the Research Solutions Family.