2016
DOI: 10.1038/srep31378

Model-based action planning involves cortico-cerebellar and basal ganglia networks

Abstract: Humans can select actions by learning, planning, or retrieving motor memories. Reinforcement Learning (RL) associates these processes with three major classes of strategies for action selection: exploratory RL learns state-action values by exploration, model-based RL uses internal models to simulate future states reached by hypothetical actions, and motor-memory RL selects past successful state-action mapping. In order to investigate the neural substrates that implement these strategies, we conducted a functio…
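To make the three strategy classes named in the abstract concrete, here is a minimal, hypothetical Python sketch (not the authors' task or implementation) of how each class could select an action: model-free values learned by exploration, one-step simulation with an internal model, and retrieval of a previously successful state-action mapping. All names and structures below are illustrative assumptions.

```python
# Hypothetical sketch of the three action-selection strategies described in the abstract.
import random

ACTIONS = ["left", "right"]

def exploratory_rl_choice(q_values, state, epsilon=0.1):
    """Model-free choice: act greedily on learned state-action values,
    exploring with probability epsilon."""
    if random.random() < epsilon:
        return random.choice(ACTIONS)
    return max(ACTIONS, key=lambda a: q_values.get((state, a), 0.0))

def model_based_choice(transition_model, reward_model, state):
    """Model-based choice: simulate the next state reached by each
    hypothetical action and pick the one with the best predicted reward."""
    def predicted_value(action):
        next_state = transition_model[(state, action)]
        return reward_model.get(next_state, 0.0)
    return max(ACTIONS, key=predicted_value)

def motor_memory_choice(memory, state):
    """Motor-memory choice: reuse the action that previously succeeded
    in this state, if any; otherwise fall back to a random action."""
    return memory.get(state, random.choice(ACTIONS))
```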

Cited by 53 publications (69 citation statements)
References 57 publications
“…Note that the function mapping motor actions onto cardinal movements can depend on environmental conditions, and thus, context (for example, wind condition can change the relationship between primitive actions and movements in space for an aerial drone). A similar mapping between arbitrary button presses and movements in the "finger sailing" task has been used to provide evidence for model-based action planning in human subjects [14,15]. Similarly, we can express the reward function in terms of cardinal movements based on a location in space, R_c(x, A) = Pr(r | x, A, c).…”
Section: Models
mentioning confidence: 93%
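The context-dependent mapping and the reward function R_c(x, A) = Pr(r | x, A, c) quoted above can be illustrated with a toy sketch; the contexts, actions, and values below are hypothetical and are not taken from the cited work.

```python
# Illustrative context-dependent mapping from primitive motor actions to
# cardinal movements, and a toy reward function R_c(x, A) = Pr(r | x, A, c).
MOVEMENT_MAP = {
    "calm":  {"press_1": "north", "press_2": "east"},
    "windy": {"press_1": "east",  "press_2": "north"},  # context remaps actions
}

def cardinal_movement(motor_action: str, context: str) -> str:
    """Movement in space produced by a motor action under a given context."""
    return MOVEMENT_MAP[context][motor_action]

def reward_probability(x: tuple, A: str, context: str) -> float:
    """R_c(x, A) = Pr(r | x, A, c): probability of reward when making
    cardinal movement A from location x under context c (toy values)."""
    goal_direction = {"calm": "north", "windy": "east"}[context]
    return 1.0 if A == goal_direction else 0.0
```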
“…As a consequence, reusing policy components will cause the agent to explore regions of the state space that have had high reward value in other contexts, which, as we have shown, may or may not be an adaptive strategy. For example, successful generalization in the "diabolical rooms" problem presented here, and the "finger sailing" task presented by Fermin and colleagues [14,15], requires a separation of reward from movement statistics. Indeed, the generalization of policy-dependent successor state representations works well only under small deviations of the reward or transition function [10,38,39].…”
mentioning confidence: 98%
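The last point in the statement above (successor-representation generalization tolerating only small deviations of the reward or transition function) can be seen in a small numerical sketch, assuming a fixed policy: the successor matrix M = (I - gamma*T)^-1 can be reused under a new reward vector, but a changed transition matrix invalidates M itself. The matrices and rewards below are illustrative only.

```python
# Toy successor-representation (SR) example: reward changes reuse M, transition changes do not.
import numpy as np

gamma = 0.9
# Policy-conditioned transition matrix over 3 states (hypothetical values).
T = np.array([[0.0, 1.0, 0.0],
              [0.0, 0.0, 1.0],
              [0.0, 0.0, 1.0]])
M = np.linalg.inv(np.eye(3) - gamma * T)   # successor representation under this policy

R_old = np.array([0.0, 0.0, 1.0])
R_new = np.array([1.0, 0.0, 0.0])          # reward relocated: M still valid
print(M @ R_old)   # state values under the old reward
print(M @ R_new)   # state values under the new reward, no relearning of M needed
# If T itself changes (a transition revaluation), M must be relearned before V = M @ R is valid.
```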
“…Some evidence suggests that distractor tasks at decision time have no effect on reward revaluation [27], consistent with SR-Dyna. Other recent work has demonstrated that humans benefit from additional pre-decision time in revaluation tasks that closely resemble "policy revaluation" [96] and that this benefit recruits a network including the prefrontal cortex and basal ganglia. Such work is consistent with the predictions of both Dyna-Q as well as SR-Dyna accounts of value updating presented here.…”
Section: Future Experimental Work
mentioning confidence: 99%
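For context, the Dyna-Q account mentioned in this statement interleaves direct value updates from real experience with replayed updates from a learned model; a minimal textbook-style sketch (not the cited authors' implementation) follows, with hypothetical parameter values.

```python
# Minimal Dyna-Q step: one real-experience update plus n_planning replayed model updates.
import random

def dyna_q_step(Q, model, s, a, r, s_next, actions, alpha=0.1, gamma=0.95, n_planning=10):
    # Direct (model-free) Q-learning update from the real transition.
    best_next = max(Q.get((s_next, b), 0.0) for b in actions)
    Q[(s, a)] = Q.get((s, a), 0.0) + alpha * (r + gamma * best_next - Q.get((s, a), 0.0))
    # Store the experienced transition in the learned model.
    model[(s, a)] = (r, s_next)
    # Planning: replay stored transitions to propagate value without new experience.
    for _ in range(n_planning):
        (ps, pa), (pr, ps_next) = random.choice(list(model.items()))
        best = max(Q.get((ps_next, b), 0.0) for b in actions)
        Q[(ps, pa)] = Q.get((ps, pa), 0.0) + alpha * (pr + gamma * best - Q.get((ps, pa), 0.0))
```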