2022
DOI: 10.1101/2022.05.18.492543
Preprint

Neural learning rules for generating flexible predictions and computing the successor representation

Abstract: The predictive nature of the hippocampus is thought to be useful for memory-guided cognitive behaviors. Inspired by the reinforcement learning literature, this notion has been formalized as a predictive map called the successor representation (SR). The SR captures a number of observations about hippocampal activity. However, the algorithm does not provide a neural mechanism for how such representations arise. Here, we show that the dynamics of a recurrent neural network naturally calculate the SR when the synaptic …
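For readers unfamiliar with the SR, the relationship the abstract alludes to can be sketched numerically: the SR has a closed form, M = (I − γT)⁻¹, and the steady state of a linear recurrent network whose recurrent gain is γ recovers rows of M. The sketch below is illustrative only; the transition matrix, discount factor, and variable names are assumptions, not taken from the paper.

```python
import numpy as np

# Minimal sketch (not the paper's implementation): the successor representation
# M = sum_k gamma^k T^k = (I - gamma*T)^{-1} for a row-stochastic transition
# matrix T and discount gamma, compared against the steady state of a linear
# recurrent network driven by a one-hot input for the current state.

def successor_representation(T, gamma):
    """Closed-form SR for transition matrix T and discount gamma."""
    n = T.shape[0]
    return np.linalg.inv(np.eye(n) - gamma * T)

def recurrent_steady_state(T, gamma, state, n_steps=200):
    """Iterate x <- onehot(state) + gamma * T.T @ x; converges to row `state` of M."""
    n = T.shape[0]
    x = np.zeros(n)
    onehot = np.eye(n)[state]
    for _ in range(n_steps):
        x = onehot + gamma * (T.T @ x)   # linear recurrent dynamics
    return x

# Toy 3-state ring: each state deterministically moves to the next state.
T = np.array([[0., 1., 0.],
              [0., 0., 1.],
              [1., 0., 0.]])
gamma = 0.9
M = successor_representation(T, gamma)
x = recurrent_steady_state(T, gamma, state=0)
assert np.allclose(x, M[0])   # network steady state matches the SR row for state 0
```

Whether the network reads out rows or columns of M depends on the wiring convention; the row convention is used here purely for illustration.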


Cited by 11 publications (29 citation statements) · References: 105 publications
“…Similarly, the possible accelerating effect of theta phase precession on sequence learning has also been described in a number of previous works [22, 56, 23, 24]. Until recently [40, 41], SR models have largely not connected with this literature: they either remain agnostic to the learning rule or assume temporal difference learning (which has been well-mapped onto striatal mechanisms [37, 57], but it is unclear how this is implemented in hippocampus) [55, 31, 36, 58, 59]. Thus, one contribution of this paper is to quantitatively and qualitatively compare theta-augmented STDP to temporal difference learning, and demonstrate where these functionally overlap.…”
Section: Discussion (mentioning)
confidence: 81%
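For context on the baseline this statement compares against, the standard tabular temporal-difference update for the SR (Dayan-style TD(0), not the paper's theta-augmented STDP rule) can be written compactly. The snippet below is a generic sketch; learning rate, discount, and variable names are illustrative assumptions.

```python
import numpy as np

# Generic tabular TD(0) update for the successor representation (the
# temporal-difference baseline mentioned above), sketched for reference.

def sr_td_update(M, s, s_next, gamma=0.9, alpha=0.1):
    """One TD update of the SR matrix M after observing the transition s -> s_next."""
    n = M.shape[0]
    onehot = np.eye(n)[s]
    td_error = onehot + gamma * M[s_next] - M[s]   # prediction error on row s
    M[s] = M[s] + alpha * td_error
    return M

# Applying this update along sampled trajectories drives M toward
# (I - gamma * T_pi)^{-1} under the behavior policy's transition statistics.
```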
“…Other contemporary theoretical works have made progress on biological mechanisms for implementing the successor representation algorithm using somewhat different but complementary approaches. Of particular note are the works by Fang et al [40], who show a recurrent network with weights trained via a Hebbian-like learning rule converges to the successor representation in steady state, and Bono et al [41] who derive a learning rule for a spiking feed-forward network which learns the SR of one-hot features by bootstrapping associations across time (see also [43]). Combined, the above models, as well as our own, suggest there may be multiple means of calculating successor features in biological circuits without requiring a direct implementation of temporal difference learning.…”
Section: Discussion (mentioning)
confidence: 99%
“…Similarly, the output of the endotaxis map network is related to future states of the agent (Eqns 5, 18). However, there is an important difference: The successor representation (at least as currently discussed) is designed to improve learning under a particular policy [13, 16, 22]. By contrast the endotaxis map network is independent of policy; it just reflects the objective connectivity of the environment.…”
Section: Discussion (mentioning)
confidence: 99%
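The policy dependence the statement points to can be made explicit with a worked equation; the notation below is illustrative and not the cited papers' own symbols.

```latex
\[
M^{\pi} \;=\; \sum_{k \ge 0} \gamma^{k} \bigl(T^{\pi}\bigr)^{k}
        \;=\; \bigl(I - \gamma\, T^{\pi}\bigr)^{-1},
\qquad
T^{\pi}(s, s') \;=\; \sum_{a} \pi(a \mid s)\, P(s' \mid s, a).
\]
```

Because the transition matrix T^π averages the environment's dynamics over the agent's policy π, changing the policy changes the SR; a map built only from the environment's adjacency structure (which states are connected at all) contains no π, which is the policy independence the statement contrasts with the SR.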
“…To this end, we experimented with a nonlinear input-output function for the map cells, for example introducing a tanh() nonlinearity in Eqn 4. This boosts small output values and saturates at large values [16], but did not improve the overall performance of endotaxis. Instead, learning of the map was perturbed, because the learning algorithm (Alg 2) requires a substantial difference between direct activation of a map cell from a point cell and indirect activation of the neighboring map cell.…”
Section: Navigation Using the Learned Goal Signal (mentioning)
confidence: 99%
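A small numerical illustration of the effect described above may help; the endotaxis paper's Eqn 4 is not reproduced here, so the numbers below are made up purely to show how a saturating nonlinearity compresses the margin between a strong "direct" activation and a weaker "indirect" one.

```python
import numpy as np

# Hypothetical map-cell outputs (illustrative values, not from the paper):
direct, indirect = 2.0, 0.5

print(direct / indirect)                     # 4.0  -> clear separation before tanh
print(np.tanh(direct) / np.tanh(indirect))   # ~2.1 -> separation shrinks after tanh
```

Because tanh is roughly linear near zero but saturates for large inputs, it relatively boosts small outputs and compresses large ones, shrinking exactly the direct-versus-indirect gap that the learning algorithm is said to rely on.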