Reinforcement learning (RL) models have advanced our understanding of how animals learn and make decisions, and how the brain supports some aspects of learning. However, the neural computations that are explained by RL algorithms fall short of explaining many sophisticated aspects of human decision making, including the generalization of learned information, one-shot learning, and the synthesis of task information in complex environments. Instead, these aspects of instrumental behavior are assumed to be supported by the brain’s executive functions (EF). We review recent findings that highlight the importance of EF in learning. Specifically, we advance the theory that EF sets the stage for canonical RL computations in the brain, providing inputs that broaden their flexibility and applicability. Our theory has important implications for how to interpret RL computations in the brain and behavior.
People's thoughts and feelings ebb and flow in predictable ways: surprise arises quickly, anticipation ramps up slowly, regret follows anger, love begets happiness, and so forth. Predicting these transitions between mental states can help people successfully navigate the social world. We hypothesize that the goal of predicting state dynamics shapes people's mental state concepts. Across seven studies, when people observed more frequent transitions between a pair of novel mental states, they judged those states to be more conceptually similar to each other. In an eighth study, an artificial neural network trained to predict real human mental state dynamics spontaneously learned the same conceptual dimensions that people use to understand these states: the 3d Mind Model. Together, these results suggest that mental state dynamics explain the origins of mental state concepts.
To behave adaptively, people must choose actions that maximize their expected future rewards. Humans have a remarkable ability to construct complex, goal-directed plans: we can plan the steps needed to complete a multi-part task, plan the words we will use to communicate a new idea, plan a route through an unfamiliar city, or plan an event several months or even years away. Achieving these goals relies on two component processes. First, we need to infer the structure of a given environment, including how to get between different states in that environment (e.g., different locations in space or different steps in a task sequence; structure inference). Second, we need to generate and implement a plan: a sequence of actions that leverages this internal model of the environment in the service of a particular goal (model-based planning). These two processes are jointly necessary for successful goal-directed behavior, but they have not yet been separately measured. As a result, it is not yet known whether the ability to construct such internal models based on one's experience with an environment entails the ability to use those models to achieve the best outcome. Here, we introduce and validate a task that separately measures structure inference ability, and we test whether individual differences in this ability predict the use of model-based planning. One body of work has examined how people develop internal models of their environment based on their experience with individual states in that environment and the transitions between them (Fermin et al., 2010; Behrens et al., 2018).
Foundational research in this area demonstrated that animals construct cognitive maps as they navigate their spatial environment (Tolman, 1948; O'Keefe and Nadel, 1978), and that neural representations of these maps (decoded from regions of the hippocampus) reflect not only the animal's current location in that space but also (1) its recent locations and (2) the future locations it intends to visit (Johnson & Redish, 2007). Recent work has shown that cognitive maps can also be built from abstract learned associations. For instance, Schapiro and colleagues (2013) constructed a virtual graph-like structure, with each node represented by an individual abstract stimulus. In their experiment, participants traversed this graph sequentially, one node at a time. Despite never seeing the underlying graph, participants were able to recover its structure from their experience of the likelihood of moving from one node to another. In addition, much like the representation of spatial maps, the graph representation itself could be decoded from their brain activity. Similar forms of con...
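The graph-recovery logic behind such experiments can be sketched in a few lines: simulate a random walk over a hidden graph, then reconstruct its edges from transition frequencies alone. The ring graph, walk length, and random seed below are illustrative assumptions, not the community-structured graph used in the original study.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical ring graph over 6 abstract stimuli: each node is linked
# to its two neighbors, standing in for a hidden relational structure.
n = 6
adjacency = np.zeros((n, n), dtype=int)
for i in range(n):
    adjacency[i, (i + 1) % n] = adjacency[i, (i - 1) % n] = 1

# Simulate a long random walk, one node at a time, as participants
# experienced the stimuli sequentially.
walk = [0]
for _ in range(5000):
    neighbors = np.flatnonzero(adjacency[walk[-1]])
    walk.append(int(rng.choice(neighbors)))

# Recover the graph purely from observed transition frequencies:
# any transition seen in either direction is treated as an edge.
trans = np.zeros((n, n))
for a, b in zip(walk, walk[1:]):
    trans[a, b] += 1
recovered = (trans + trans.T) > 0
```

With a sufficiently long walk, `recovered` matches the hidden adjacency structure exactly, mirroring how participants could recover a graph they never saw from transition statistics alone.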
In reinforcement learning (RL) experiments, participants learn to make rewarding choices in response to different stimuli; RL models use outcomes to estimate stimulus-response values that change incrementally. RL models consider any response type indiscriminately, ranging from less abstract choices (e.g., pressing a key with the index finger) to more abstract choices that can be executed in a number of ways (e.g., getting dinner at a restaurant). But does the learning process vary as a function of how abstract the choices are? In Experiment 1, we show that choice abstraction impacts learning: participants were slower and less accurate in learning to select a more abstract choice. Using computational modeling, we show that two mechanisms contribute to this. First, the values of motor actions interfered with the values of more abstract responses, resulting in more incorrect choices; second, information integration for relevant abstract choices was slower. In Experiment 2, we replicate the findings from Experiment 1 and extend them by investigating whether the slowed learning is attributable to working memory (WM) or RL contributions. We find that the impairment for more abstract/flexible choices is driven primarily by a weaker contribution of WM. We conclude that defining a more abstract choice space used by multiple learning systems recruits limited executive resources, constraining how much such processes then contribute to fast learning.
In reinforcement learning (RL) experiments, participants learn to make rewarding choices in response to different stimuli; RL models use outcomes to estimate stimulus–response values that change incrementally. RL models consider any response type indiscriminately, ranging from more concretely defined motor choices (pressing a key with the index finger), to more general choices that can be executed in a number of ways (selecting dinner at the restaurant). However, does the learning process vary as a function of the choice type? In Experiment 1, we show that it does: Participants were slower and less accurate in learning correct choices of a general format compared with learning more concrete motor actions. Using computational modeling, we show that two mechanisms contribute to this. First, there was evidence of irrelevant credit assignment: The values of motor actions interfered with the values of other choice dimensions, resulting in more incorrect choices when the correct response was not defined by a single motor action; second, information integration for relevant general choices was slower. In Experiment 2, we replicated and further extended the findings from Experiment 1 by showing that slowed learning was attributable to weaker working memory use, rather than slowed RL. In both experiments, we ruled out the explanation that the difference in performance between two condition types was driven by difficulty/different levels of complexity. We conclude that defining a more abstract choice space used by multiple learning systems for credit assignment recruits executive resources, limiting how much such processes then contribute to fast learning.
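The incremental value learning both abstracts describe is the standard delta rule, Q ← Q + α(r − Q). A minimal sketch of the interference account follows; the parameters (α, κ, β), the three-option setup, and the function names are illustrative assumptions, not the fitted model from the experiments.

```python
import numpy as np

alpha = 0.3   # learning rate (illustrative)
kappa = 0.5   # assumed weight of irrelevant motor-value interference
beta = 5.0    # softmax inverse temperature (illustrative)

q_abstract = np.zeros(3)  # values of the abstract choice options
q_motor = np.zeros(3)     # values of the motor actions used to report them

def choice_probs(mapping):
    """Softmax over abstract values contaminated by motor-action values.

    `mapping[i]` is the motor action currently assigned to abstract
    option i (remapped across trials in the general-choice condition)."""
    net = q_abstract + kappa * q_motor[mapping]
    e = np.exp(beta * (net - net.max()))
    return e / e.sum()

def update(option, motor, reward):
    # Delta-rule credit assignment to the relevant (abstract) dimension
    # and, per the interference account, to the irrelevant motor action.
    q_abstract[option] += alpha * (reward - q_abstract[option])
    q_motor[motor] += alpha * (reward - q_motor[motor])
```

Because reward updates both value tables, previously rewarded motor actions bias choice even after the motor mapping changes, which is one way the "irrelevant credit assignment" mechanism could produce extra errors in the general-choice condition.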
The ability to use past experience to effectively guide decision making declines in older adulthood. Such declines have been theorized to emerge from either impairments of striatal reinforcement learning (RL) systems or impairments of recurrent networks in prefrontal and parietal cortex that support working memory (WM). Distinguishing between these hypotheses has been challenging because either RL or WM could be used to facilitate successful decision making in typical laboratory tasks. Here we investigated the neurocomputational correlates of age-related decision making deficits using an RL-WM task to disentangle these mechanisms, a computational model to quantify them, and magnetic resonance spectroscopy to link them to their molecular bases. Our results reveal that task performance is worse in older age, in a manner best explained by working memory deficits, as might be expected if cortical recurrent networks were unable to sustain persistent activity across multiple trials. Consistent with this, we show that older adults had lower levels of prefrontal glutamate, the excitatory neurotransmitter thought to support persistent activity, compared to younger adults. Individuals with the lowest prefrontal glutamate levels displayed the greatest impairments in working memory after controlling for other anatomical and metabolic factors. Together, our results suggest that lower levels of prefrontal glutamate may contribute to failures of working memory systems and impaired decision making in older adulthood.
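A minimal sketch of an RL-WM mixture in this spirit: a fast, one-shot but capacity-limited working-memory store and a slow incremental RL learner, mixed with a weight that shrinks as set size exceeds WM capacity. The class name and all parameter values are illustrative assumptions, not the fitted model from the study.

```python
import numpy as np

class RLWM:
    """Toy mixture of an incremental RL learner and a one-shot WM store."""

    def __init__(self, n_stim, n_act, alpha=0.1, K=3, w=0.8, beta=8.0):
        self.alpha, self.beta = alpha, beta
        self.w = w * min(1.0, K / n_stim)  # WM reliance shrinks with set size
        self.q = np.full((n_stim, n_act), 1.0 / n_act)   # RL values
        self.wm = np.full((n_stim, n_act), 1.0 / n_act)  # WM store

    def policy(self, s):
        def softmax(v):
            e = np.exp(self.beta * (v - v.max()))
            return e / e.sum()
        # Weighted mixture of the WM-based and RL-based policies.
        return self.w * softmax(self.wm[s]) + (1 - self.w) * softmax(self.q[s])

    def update(self, s, a, r):
        self.q[s, a] += self.alpha * (r - self.q[s, a])  # slow delta rule
        self.wm[s, a] = r                                # one-shot WM update
```

In a simulation under these assumptions, lowering the mixture weight `w` (while leaving the RL module intact) would mimic the WM-driven performance deficit pattern that the abstract attributes to older adults.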