2012
DOI: 10.3389/fncom.2012.00087

Learning to use working memory: a reinforcement learning gating model of rule acquisition in rats

Abstract: Learning to form appropriate, task-relevant working memory representations is a complex process central to cognition. Gating models frame working memory as a collection of past observations and use reinforcement learning (RL) to solve the problem of when to update these observations. Investigation of how gating models relate to brain and behavior remains, however, at an early stage. The current study sought to explore the ability of simple RL gating models to replicate rule learning behavior in rats. Rats were…
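The gating idea the abstract describes can be sketched in a few lines: the agent's state pairs the current observation with a working-memory slot, and each action is a (motor, gate) pair, so a single learner decides both what to do and when to overwrite memory. This is a minimal illustrative sketch, not the paper's implementation; the task callbacks, action labels, and parameter values are assumptions.

```python
import random
from collections import defaultdict

# Minimal sketch of a working-memory gating agent (illustrative assumptions,
# not the paper's code). State = (observation, memory slot); action =
# (motor action, gate action); tabular SARSA trains both with one TD error.

MOTOR_ACTIONS = ("left", "right")
GATE_ACTIONS = ("hold", "store")          # keep memory as is, or overwrite it
ACTIONS = [(m, g) for m in MOTOR_ACTIONS for g in GATE_ACTIONS]
ALPHA, GAMMA, EPSILON = 0.1, 0.9, 0.1

Q = defaultdict(float)                    # Q[(state, action)]

def choose(state):
    """Epsilon-greedy over the joint motor x gate action space."""
    if random.random() < EPSILON:
        return random.choice(ACTIONS)
    return max(ACTIONS, key=lambda a: Q[(state, a)])

def run_episode(env_reset, env_step, max_steps=100):
    """SARSA over the gated state. env_reset and env_step are hypothetical
    task callbacks: env_step(motor) -> (reward, next_observation, done)."""
    obs, memory = env_reset(), None
    state = (obs, memory)
    action = choose(state)
    for _ in range(max_steps):
        motor, gate = action
        reward, next_obs, done = env_step(motor)
        next_memory = obs if gate == "store" else memory   # gate decides update
        next_state = (next_obs, next_memory)
        next_action = choose(next_state)
        target = reward + (0.0 if done else GAMMA * Q[(next_state, next_action)])
        Q[(state, action)] += ALPHA * (target - Q[(state, action)])
        if done:
            break
        obs, memory, state, action = next_obs, next_memory, next_state, next_action
```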

Cited by 20 publications (26 citation statements); references 29 publications. Citation statement types: 0 supporting, 26 mentioning, 0 contrasting. Citing publications span 2015-2024.
“…We then show how to implement KTD-SARSA and XKTD-SARSA alongside a working memory module. This parallels the approach of [8], [9], [14], in which SARSA or Actor-Critic methods have been enhanced with working memory to solve non-Markovian tasks. The advantages of introducing the Bayesian approach are a native mechanism for sharing information across state-action pairs, replacing eligibility traces, and a native mechanism for balancing exploration and exploitation, removing the need for an appropriately parameterised action-selection method.…”
Section: Introduction (mentioning)
confidence: 75%
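For the linear-function-approximation case, the Kalman-TD idea in this quote can be sketched as a Kalman filter over the Q-weights: each SARSA transition supplies one scalar observation of the sampled Bellman equation, and the posterior covariance provides value uncertainty that can drive exploration directly. This is a generic sketch of the idea under those assumptions, not the cited authors' KTD-SARSA implementation; all names and parameter values are placeholders.

```python
import numpy as np

# Generic linear Kalman-TD sketch (illustration of the Bayesian idea, not the
# cited implementation). The Q-weights theta are a hidden state with a
# random-walk prior; each transition gives the scalar observation
#   r = (phi(s,a) - gamma * phi(s',a'))^T theta + noise,
# and the posterior covariance P yields per-action value uncertainty.

class LinearKTDSarsa:
    def __init__(self, n_features, gamma=0.95,
                 process_noise=1e-3, obs_noise=1.0, prior_var=10.0):
        self.gamma = gamma
        self.theta = np.zeros(n_features)          # posterior mean of weights
        self.P = prior_var * np.eye(n_features)    # posterior covariance
        self.Q_proc = process_noise * np.eye(n_features)
        self.R_obs = obs_noise

    def q_value(self, phi):
        """Posterior mean and variance of Q for feature vector phi."""
        phi = np.asarray(phi, dtype=float)
        return self.theta @ phi, phi @ self.P @ phi

    def update(self, phi_sa, reward, phi_next_sa, done=False):
        """One Kalman update from a single SARSA transition."""
        phi_sa = np.asarray(phi_sa, dtype=float)
        phi_next_sa = np.asarray(phi_next_sa, dtype=float)
        h = phi_sa - (0.0 if done else self.gamma) * phi_next_sa
        self.P = self.P + self.Q_proc                  # prediction step
        s = h @ self.P @ h + self.R_obs                # innovation variance
        k = self.P @ h / s                             # Kalman gain
        self.theta = self.theta + k * (reward - h @ self.theta)
        self.P = self.P - np.outer(k, h @ self.P)

    def select_action(self, candidate_phis):
        """Thompson-style choice: sample a Q value from the posterior for each
        candidate feature vector and pick the best; one way the posterior
        uncertainty can replace a separately tuned exploration schedule."""
        samples = []
        for phi in candidate_phis:
            mean, var = self.q_value(phi)
            samples.append(np.random.normal(mean, np.sqrt(max(var, 0.0))))
        return int(np.argmax(samples))
```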
“…We simulated the 12-XY task using KTD-WM, XKTD-WM, and SARSA(λ) with WM as presented in [14]. We simulated each model 50 times, each run consisting of N = 10000 trials, and tracked the learning curves shown in Figure 2.…”
Section: B. Example: 12-XY Task (mentioning)
confidence: 99%
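The quoted protocol (many independent runs per model, a fixed number of trials per run, averaged learning curves) is generic enough to sketch directly; the model factory and trial function below are hypothetical placeholders, not the 12-XY task or the cited agents.

```python
import numpy as np

# Sketch of the quoted simulation protocol: run each model several times for a
# fixed number of trials and average a per-trial success signal into a learning
# curve. `make_model` and `run_trial` are hypothetical callables standing in
# for an agent (e.g. KTD-WM or SARSA(lambda)+WM) and the task.

def learning_curves(make_model, run_trial, n_runs=50, n_trials=10_000):
    curves = np.zeros((n_runs, n_trials))
    for run in range(n_runs):
        model = make_model()                      # fresh agent for each run
        for t in range(n_trials):
            curves[run, t] = run_trial(model)     # e.g. 1.0 if correct, else 0.0
    mean = curves.mean(axis=0)
    sem = curves.std(axis=0) / np.sqrt(n_runs)    # run-to-run variability
    return mean, sem
```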
“…In this situation, frequent exposure to stimuli would shorten the effective time window during which previously experienced stimuli affected learning. To reflect this, an intermediate λ of 0.5 was used during the simulation [43]-[45].…”
Section: Results (mentioning)
confidence: 99%
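The λ parameter sets how quickly credit assigned to earlier state-action pairs decays: with accumulating traces, a pair visited k steps ago receives credit scaled by (γλ)^k, so a moderate λ keeps the effective window over past stimuli short. A minimal tabular SARSA(λ) trace update, with illustrative parameter values rather than the cited study's:

```python
from collections import defaultdict

# Minimal tabular SARSA(lambda) update with accumulating eligibility traces
# (illustrative values, not the cited study's code). With LAM = 0.5, credit
# reaching a state-action pair k steps in the past is scaled by
# (GAMMA * LAM) ** k, which is what shortens the effective time window.

ALPHA, GAMMA, LAM = 0.1, 0.9, 0.5
Q = defaultdict(float)   # action values
E = defaultdict(float)   # eligibility traces

def sarsa_lambda_update(state, action, reward, next_state, next_action, done=False):
    target = reward + (0.0 if done else GAMMA * Q[(next_state, next_action)])
    delta = target - Q[(state, action)]
    E[(state, action)] += 1.0                 # accumulating trace for this pair
    for key in list(E):
        Q[key] += ALPHA * delta * E[key]      # credit all recently visited pairs
        E[key] *= GAMMA * LAM                 # decay every trace by gamma*lambda
        if E[key] < 1e-6:
            del E[key]                        # drop negligible traces
```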
“…This enables M2 to make decisions based upon current and past information. Such a strategy has been used to learn common rat behavioral tasks (Zilli and Hasselmo, 2008) and exhibits features of rat behavior (Lloyd et al., 2012). In M2, the state, s_t = {a_t, a_{t-1}}, includes both the current and the most recent past arm (Fig 3A).…”
Section: Working Memory RL Model Learns Task Slowly (mentioning)
confidence: 99%
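The model described here can be read as ordinary tabular RL over an augmented state s_t = (a_t, a_{t-1}), i.e. the current arm plus one step of history. A sketch of that state construction follows; the arm labels, reward callback, and use of Q-learning are illustrative assumptions, not the citing paper's M2.

```python
import random
from collections import defaultdict

# Sketch of the augmented-state idea in the quote: the agent's "state" is the
# pair (current arm, previous arm), so a tabular learner can condition its
# choice on one step of history. Arm labels and the reward callback are
# hypothetical.

ARMS = ("A", "B", "C")
ALPHA, GAMMA, EPSILON = 0.1, 0.9, 0.1
Q = defaultdict(float)

def choose_arm(state):
    """Epsilon-greedy arm choice given the (current, previous) arm state."""
    if random.random() < EPSILON:
        return random.choice(ARMS)
    return max(ARMS, key=lambda arm: Q[(state, arm)])

def q_learning_step(current_arm, prev_arm, reward_fn):
    """One update over the augmented state; reward_fn is a task-specific
    payoff function supplied by the caller."""
    state = (current_arm, prev_arm)
    arm = choose_arm(state)
    reward = reward_fn(arm, current_arm, prev_arm)
    next_state = (arm, current_arm)            # memory shifts back by one step
    best_next = max(Q[(next_state, a)] for a in ARMS)
    Q[(state, arm)] += ALPHA * (reward + GAMMA * best_next - Q[(state, arm)])
    return arm
```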
“…we found that maximizing the likelihood that the model would choose the same arms as the animal did not provide parameters that could then generate statistics capturing the way in which the rats behaved (Fig 1C-D). Therefore, consistent with other studies using RL agents to fit rodent behavior (Lloyd et al., 2012; Luksys et al., 2009), we utilized ABC methods to minimize the difference between characteristic statistics of the rats and the model.…”
Section: Modeling Considerations: ML vs. ABC and When to Stop Modeling (mentioning)
confidence: 99%
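ABC here refers to approximate Bayesian computation: rather than maximizing trial-by-trial choice likelihood, parameters are retained when the simulated agent reproduces summary statistics of the rats' behavior. A generic rejection-ABC sketch of that strategy; the prior, simulator, summary statistics, and tolerance are placeholders, not the cited fitting pipeline.

```python
import numpy as np

# Generic rejection-ABC sketch: draw parameters from a prior, simulate the RL
# agent, and keep parameter sets whose summary statistics fall within a
# tolerance of the statistics computed from the rats. All callables and values
# are hypothetical stand-ins.

def rejection_abc(sample_prior, simulate, summarize, observed_stats,
                  n_draws=10_000, tolerance=0.1):
    observed = np.asarray(observed_stats, dtype=float)
    accepted = []
    for _ in range(n_draws):
        theta = sample_prior()                        # e.g. (alpha, lambda, epsilon)
        stats = np.asarray(summarize(simulate(theta)), dtype=float)
        distance = np.linalg.norm(stats - observed)   # mismatch in summary space
        if distance < tolerance:
            accepted.append(theta)                    # approximate posterior draw
    return accepted
```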