2012
DOI: 10.1007/s10994-012-5278-7

Optimal control as a graphical model inference problem

Abstract: We reformulate a class of non-linear stochastic optimal control problems introduced by Todorov (in Advances in Neural Information Processing Systems, vol. 19, pp. 1369-1376, 2007) as a Kullback-Leibler (KL) minimization problem. As a result, the optimal control computation reduces to an inference computation and approximate inference methods can be applied to efficiently compute approximate optimal controls. We show how this KL control theory contains the path i…
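
The reduction from control to inference described in the abstract can be illustrated with a minimal sketch, assuming a single decision step over a finite set of outcomes (a simplification chosen here, not the paper's full multi-step setting). Minimising KL(p || q) + E_p[cost] over distributions p has the closed-form solution p* ∝ q · exp(−cost), and the optimal value is −log Z, where Z is the normalising constant. The names `kl_control`, `objective`, `q`, and `cost` are hypothetical and do not come from the paper.

```python
import math

def kl_control(q, cost):
    """Minimise KL(p || q) + E_p[cost] over distributions p on a finite set.

    q    : probabilities of the uncontrolled (passive) dynamics, summing to 1
    cost : cost attached to each outcome
    Returns the minimiser p* and the optimal objective value -log Z.
    """
    # Unnormalised posterior: prior q reweighted by exp(-cost).
    weights = [qi * math.exp(-ci) for qi, ci in zip(q, cost)]
    z = sum(weights)  # normalising constant (partition function)
    p_star = [w / z for w in weights]
    return p_star, -math.log(z)

def objective(p, q, cost):
    """Evaluate KL(p || q) + E_p[cost], skipping zero-probability outcomes."""
    return sum(
        pi * (math.log(pi / qi) + ci)
        for pi, qi, ci in zip(p, q, cost)
        if pi > 0
    )

# Illustrative numbers only (assumed, not taken from the paper).
q = [0.5, 0.3, 0.2]
cost = [2.0, 0.5, 1.0]
p_star, value = kl_control(q, cost)
```

Evaluating `objective(p_star, q, cost)` reproduces `value`, and any other distribution, including the uncontrolled `q` itself, gives an objective at least as large. Computing the control therefore amounts to computing a normalised posterior, which is where approximate inference methods can be substituted when Z is intractable.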

Cited by 267 publications (284 citation statements)
References 27 publications (44 reference statements)
“…The equivalence in computational complexity is reflected in the fact that many procedures are found in both approximate solutions to optimal control and Bayesian inference. Examples here include minimisation of Kullback-Leibler divergences (Todorov 2008; Kappen et al 2009), and expectation maximisation (Toussaint and Storkey 2006), both of which can be formulated as minimising variational free energy (Neal and Hinton 1998). The main contribution of this paper concerns the interpretation of optimality, as opposed to an algorithmic contribution.…”
Section: Discussion (mentioning)
confidence: 99%
“…In this study, we use an information-theoretic model of bounded rational decision-making Braun, 2012, 2013; Braun and Ortega, 2014; that has precursors in the economic literature (McKelvey and Palfrey, 1995; Mattsson and Weibull, 2002; Sims, 2003, 2005, 2006, 2010; Wolpert, 2006) and that is closely related to recent advances in the information theory of perception-action systems (Todorov, 2007, 2009; Still, 2009; Friston, 2010; Peters et al, 2010; Tishby and Polani, 2011; Daniel et al, 2012, 2013; Kappen et al, 2012; Rawlik et al, 2012; Rubin et al, 2012; Neymotin et al, 2013; Tkačik and Bialek, 2014; Palmer et al, 2015). The basis of this approach is formalized by a free energy principle that trades off expected utility, and the cost of computation that is required to adapt the system accordingly in order to achieve high utility.…”
Section: Introduction (mentioning)
confidence: 99%
“…As a result, the optimal control can be estimated using Monte Carlo sampling. See [21,22,45,47] for earlier reviews and references.…”
Section: Introduction (mentioning)
confidence: 99%