Algorithms for Reinforcement Learning
2010
DOI: 10.2200/s00268ed1v01y201005aim009

Cited by 606 publications (421 citation statements)
References 130 publications
“…Despite the valuable insights that have been generated through their design and analysis, these algorithms are of limited practical import because state spaces in most contexts of practical interest are enormous. There is a need for algorithms that generalize from past experience in order to learn how to make effective decisions in reasonable time. There has been much work on reinforcement learning algorithms that generalize (see, e.g., [5,31,32,24] and references therein). Most of these algorithms do not come with statistical or computational efficiency guarantees, though there are a few noteworthy exceptions, which we now discuss.…”
“…A Reinforcement Learning (RL) agent learns its behaviour from interaction with an environment and the physical or virtual agents within it, where situations are mapped to actions by maximising a long-term reward signal [4]. An RL agent is typically characterised by: (i) a finite or infinite set of states S = {s_i}; (ii) a finite or infinite set of actions A = {a_j}; (iii) a state transition function T(s, a, s') that specifies the next state s' given the current state s and action a; (iv) a reward function R(s, a, s') that specifies the reward given to the agent for choosing action a in state s and transitioning to state s'; and (v) a policy π : S → A that defines a mapping from states to actions.…”
Section: Deep Reinforcement Learning For Dialogue Control
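The (S, A, T, R, π) formalization in the excerpt above maps directly onto code. As a minimal sketch, not taken from the cited paper: the two-state MDP, the deterministic dynamics, and all names below are hypothetical.

```python
# (i)-(v) of the excerpt above, instantiated for a hypothetical
# two-state MDP; all names and dynamics here are illustrative.

S = ["s0", "s1"]                 # (i) finite set of states
A = ["stay", "move"]             # (ii) finite set of actions

def T(s, a):
    # (iii) transition function: next state s' from state s and action a
    # (deterministic here for brevity).
    if a == "move":
        return "s1" if s == "s0" else "s0"
    return s

def R(s, a, s_next):
    # (iv) reward for choosing action a in state s and reaching s'.
    return 1.0 if s_next == "s1" else 0.0

pi = {"s0": "move", "s1": "stay"}  # (v) policy: mapping from states to actions

# Agent-environment interaction loop accumulating the long-term reward signal.
s, total = "s0", 0.0
for _ in range(10):
    a = pi[s]
    s_next = T(s, a)
    total += R(s, a, s_next)
    s = s_next
print(total)  # 10.0: the agent moves to s1 once, then stays there
```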
“…For in-depth discussions of the use of value function approximations, we refer the reader to Chapter 6 in Bertsekas (2011a), Bertsekas (2011b), Bertsekas and Tsitsiklis (1996), Sutton and Barto (1998), Szepesvari (2010) and Chapters 8-10 of Powell (2011), and the many references cited there.…”
Section: Policies Based On Value Function Approximations
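The references above treat value function approximation formally; as a minimal sketch of the idea the section name refers to (not drawn from any of the cited texts; the feature map, weights, and toy model are all assumed), a greedy policy derived from a linear approximation might look like:

```python
# Illustrative-only sketch of a policy based on a linear value function
# approximation v_hat(s) = w . phi(s); the one-hot features, the weights,
# and the tiny deterministic MDP below are all hypothetical.
import numpy as np

def phi(s):
    # Hypothetical feature map: one-hot encoding of the state index.
    f = np.zeros(3)
    f[s] = 1.0
    return f

w = np.array([0.0, 0.5, 1.0])      # weights, assumed already learned

def v_hat(s):
    # Approximate value of state s under the linear architecture.
    return w @ phi(s)

# next_state[s][a] and reward[s][a] define a 3-state, 2-action model.
next_state = {0: [0, 1], 1: [0, 2], 2: [2, 2]}
reward     = {0: [0.0, 0.0], 1: [0.0, 1.0], 2: [0.0, 0.0]}
gamma = 0.9

def greedy_policy(s):
    # Policy based on the value function approximation: pick the action
    # maximizing the one-step lookahead r + gamma * v_hat(s').
    return max(range(2), key=lambda a: reward[s][a] + gamma * v_hat(next_state[s][a]))

print([greedy_policy(s) for s in range(3)])  # [1, 1, 0] for these weights
```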