2016
DOI: 10.1101/lm.041780.116

Active Inference, epistemic value, and vicarious trial and error

Abstract: Balancing habitual and deliberate forms of choice entails a comparison of their respective merits: the former being faster but inflexible, the latter slower but more versatile. Here, we show that arbitration between these two forms of control can be derived from first principles within an Active Inference scheme. We illustrate our arguments with simulations that reproduce rodent spatial decisions in T-mazes. In this context, deliberation has been associated with vicarious trial and error (VTE) behavior (i.e., …)

Cited by 47 publications (50 citation statements)
References 83 publications
“…The resulting mixture of epistemic and pragmatic value turns out to be the free energy expected under any sequence of actions or policy. In short, the active inference we have demonstrated in this work has a construct validity in terms of recent work on more abstract formulations of exploration and exploitation (Friston et al., 2015, Friston et al., 2016a, Friston et al., 2016b, Pezzulo and Rigoli, 2011, Pezzulo et al., 2015, Pezzulo et al., 2016). …”
Section: Discussion (mentioning)
confidence: 99%
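The statement above describes the core quantity of these schemes: the expected free energy of a policy, which decomposes into an epistemic term (expected information gain about hidden states) and a pragmatic term (expected log preference over outcomes). The sketch below illustrates that decomposition for a single predicted time step in a discrete-state model; the variable names (`qs`, `A`, `logC`) and the tiny numerical example are illustrative assumptions, not taken from the paper.

```python
import numpy as np

def expected_free_energy(qs, A, logC):
    """Epistemic plus pragmatic value for one predicted time step.

    qs   : predictive hidden-state distribution Q(s | policy), shape (S,)
    A    : likelihood matrix P(o | s), shape (O, S)
    logC : log prior preferences over outcomes, shape (O,)

    Returns epistemic + pragmatic value; the expected free energy G is
    the negative of this quantity, so minimizing G maximizes the sum.
    """
    qo = A @ qs  # predictive outcome distribution Q(o | policy)
    # Epistemic value: mutual information between outcomes and states
    # under the predictive distribution, I(s; o) = H[Q(o)] - E_Q(s) H[P(o|s)].
    H_qo = -np.sum(qo * np.log(qo + 1e-16))
    H_A = -np.sum(A * np.log(A + 1e-16), axis=0)  # per-state entropies H[P(o|s)]
    epistemic = H_qo - H_A @ qs
    # Pragmatic value: expected log preference over predicted outcomes.
    pragmatic = qo @ logC
    return epistemic + pragmatic
```

With a fully informative likelihood (`A = np.eye(2)`) and flat preferences, the epistemic term equals the entropy of `qs` and exactly offsets the pragmatic term; with an uninformative likelihood (all entries 0.5), the epistemic term vanishes, illustrating how ambiguity-resolving observations are favored.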
“…The formalism used in this paper builds upon our previous treatments of Markov decision processes (Schwartenbeck et al, 2013, Friston et al, 2014, Friston et al, 2015, Pezzulo et al, 2015, Pezzulo et al, 2016). Specifically, we extend sequential policy optimisation to include action-state policies of the sort optimised by dynamic programming and backwards induction (Bellman, 1952, Howard, 1960).…”
Section: Active Inference and Learning (mentioning)
confidence: 99%
“…In this case, the predictions propagated by higher hierarchical levels act as goal states, and engaging arc reflexes minimizes the ensuing prediction errors. This scheme can be extended to the planning of sequences of actions (or policies) if the agent can predictively compare the (integral of) free energy or surprise conditioned on a series of successive actions (e.g., whether turning twice right or twice left will bring the agent to the expected goal state). This example requires engaging the generative model to internally generate and evaluate sequences of predictions.…”
Section: A Computational Perspective on IGSs (mentioning)
confidence: 99%
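The planning scheme quoted above — rolling the generative model forward over candidate action sequences and comparing their accumulated value — can be sketched in a toy T-maze like the one the paper simulates. Everything below (state indexing, the transition matrices `B`, the preference vector `logC`) is a hypothetical illustration, not the authors' simulation, and for brevity only the pragmatic (preference-seeking) part of the comparison is shown.

```python
import numpy as np

# Toy T-maze: states 0 = start, 1 = junction, 2 = left arm, 3 = right arm.
# B[a][s_next, s] gives deterministic transitions P(s' | s, action a).
n_states = 4
B = {"left": np.zeros((n_states, n_states)),
     "right": np.zeros((n_states, n_states))}
B["left"][1, 0] = B["right"][1, 0] = 1.0   # start -> junction under either action
B["left"][2, 1] = 1.0                      # junction -> left arm
B["right"][3, 1] = 1.0                     # junction -> right arm
B["left"][2, 2] = B["right"][2, 2] = 1.0   # arms are absorbing
B["left"][3, 3] = B["right"][3, 3] = 1.0

# Log preferences over states: the agent believes reward lies in the right arm.
logC = np.log(np.array([0.05, 0.05, 0.05, 0.85]))

def policy_value(policy, qs0):
    """Accumulated expected log preference along a sequence of actions."""
    qs, value = qs0, 0.0
    for a in policy:
        qs = B[a] @ qs          # roll the predictive distribution one step forward
        value += qs @ logC      # pragmatic value at this time step
    return value

qs0 = np.zeros(n_states)
qs0[0] = 1.0                    # start with certainty in the start state
policies = [("left", "left"), ("right", "right")]
best = max(policies, key=lambda p: policy_value(p, qs0))
```

Evaluating both candidate sequences before committing — rather than emitting the first habitual action — is the deliberative mode the citation statement associates with internally generating and comparing sequences of predictions.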