2016
DOI: 10.1371/journal.pone.0157088

Benchmarking for Bayesian Reinforcement Learning

Abstract: In the Bayesian Reinforcement Learning (BRL) setting, agents try to maximise the rewards collected while interacting with their environment, making use of some prior knowledge available beforehand. Many BRL algorithms have already been proposed, but the benchmarks used to compare them are only relevant for specific cases. This paper addresses the problem and provides a new BRL comparison methodology, along with the corresponding open-source library. In this methodology, a comparison criterion that measures…
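The comparison criterion is truncated above. Purely as an illustration of the kind of evaluation loop such a methodology implies, the sketch below scores a BRL agent by its mean discounted return over test MDPs drawn from a prior; the function and parameter names are hypothetical and do not correspond to the paper's library.

def score_agent(agent_factory, sample_mdp_from_prior, n_trials=500,
                horizon=250, gamma=0.95):
    # Hypothetical scoring loop: mean discounted return of a BRL agent
    # over MDPs drawn from a test prior (all names are illustrative).
    returns = []
    for _ in range(n_trials):
        mdp = sample_mdp_from_prior()   # unknown environment drawn from the prior
        agent = agent_factory()         # fresh agent, initialised with the same prior
        state = mdp.reset()
        total, discount = 0.0, 1.0
        for _ in range(horizon):
            action = agent.act(state)
            next_state, reward = mdp.step(action)
            agent.observe(state, action, reward, next_state)
            total += discount * reward
            discount *= gamma
            state = next_state
        returns.append(total)
    return sum(returns) / len(returns)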

Cited by 8 publications (15 citation statements)
References 6 publications
“…In particular, let us mention Bayesian RL approaches (see Ghavamzadeh et al (2015) for an extensive literature review), which offer two interesting features: by assuming a prior distribution on potential (unknown) environments, Bayesian RL (i) allows to formalize Bayesian-optimal exploration / exploitation strategies, and (ii) offers the opportunity to incorporate prior knowledge into the prior distribution. However, most Bayesian RL algorithms suffer computational complexity (Castronovo et al (2016)). …”
Section: Results (mentioning)
Confidence: 99%
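As a concrete illustration of the prior distribution over unknown environments mentioned in this statement, the sketch below uses the standard Dirichlet-multinomial belief over the transition function of a discrete MDP. This is a common modelling choice in BRL, not necessarily the one used in the cited works, and the class name is illustrative.

import numpy as np

class DirichletTransitionModel:
    # Dirichlet-multinomial belief over the transition function of a
    # discrete MDP; prior_counts encodes prior knowledge (larger values
    # mean a stronger prior).
    def __init__(self, n_states, n_actions, prior_counts=1.0):
        self.alpha = np.full((n_states, n_actions, n_states), prior_counts)

    def update(self, s, a, s_next):
        # Bayesian update: observing (s, a, s') increments the matching count.
        self.alpha[s, a, s_next] += 1.0

    def expected_transition(self, s, a):
        # Posterior mean of P(s' | s, a).
        return self.alpha[s, a] / self.alpha[s, a].sum()

    def sample_transition_matrix(self):
        # Draw one plausible MDP from the current posterior, as done by
        # posterior-sampling methods such as the one quoted further below.
        n_states, n_actions, _ = self.alpha.shape
        return np.array([[np.random.dirichlet(self.alpha[s, a])
                          for a in range(n_actions)]
                         for s in range(n_states)])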
“…The accuracy is depending on the number of nodes those algorithms are able to visit, which is limited by an on-line computation time budget. Despite theoretical guarantees to reach Bayesian optimality offered by BL approaches 1 , they may not be applicable when the time budget that can be allocated for on-line decision making is short (Castronovo et al, 2015). Another method, Smarter Best of Sampled Set (SBOSS) (Castro and Precup, 2010), samples several MDPs from the posterior distribution, builds a merged MDP, and computes its Q-function.…”
Section: State of the art (mentioning)
Confidence: 99%
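The statement above summarises how SBOSS proceeds: sample several MDPs from the posterior, merge them, and compute the Q-function of the merged MDP. The sketch below follows the BOSS-style merge in which every (action, sample) pair acts as a distinct action of the merged MDP; it assumes a known (or posterior-mean) reward function and omits the resampling schedule, so it is an illustration rather than the authors' implementation.

import numpy as np

def merged_q_function(sampled_P, R, gamma=0.95, n_iter=200):
    # sampled_P: list of K transition arrays of shape (S, A, S) drawn from
    # the posterior; R: reward array of shape (S, A).
    K = len(sampled_P)
    n_states, n_actions, _ = sampled_P[0].shape
    Q = np.zeros((n_states, n_actions, K))   # merged action space: (a, k)
    for _ in range(n_iter):
        V = Q.reshape(n_states, -1).max(axis=1)   # value of best merged action
        for k in range(K):
            # Bellman backup using the dynamics of the k-th sampled MDP.
            Q[:, :, k] = R + gamma * sampled_P[k] @ V
    return Q

def greedy_action(Q, s):
    # Act greedily over the merged (action, sample) space; optimism over the
    # sampled models is what drives exploration in this family of methods.
    a, _k = np.unravel_index(np.argmax(Q[s]), Q[s].shape)
    return int(a)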
“…In practice, this happens for example when training a drone to fly in a safe environment before sending it on the operation field (Zhang et al, 2015). This is called offline training and can be beneficial to the online performance in the real environment, even if prior knowledge is inaccurate (Castronovo et al, 2014).…”
Section: Introduction (mentioning)
Confidence: 99%