2016
DOI: 10.1371/journal.pone.0157088

Benchmarking for Bayesian Reinforcement Learning

Abstract: In the Bayesian Reinforcement Learning (BRL) setting, agents try to maximise the rewards collected while interacting with their environment, making use of some prior knowledge available beforehand. Many BRL algorithms have already been proposed, but the benchmarks used to compare them are only relevant for specific cases. This paper addresses the problem and provides a new BRL comparison methodology, along with the corresponding open-source library. In this methodology, a comparison criterion that measures…
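The comparison criterion is truncated above. Purely as an illustration of the kind of evaluation loop such a methodology implies, the sketch below scores a BRL agent by its mean discounted return over test MDPs drawn from a prior; the function and parameter names are hypothetical and do not correspond to the paper's library.

def score_agent(agent_factory, sample_mdp_from_prior, n_trials=500,
                horizon=250, gamma=0.95):
    # Hypothetical scoring loop: mean discounted return of a BRL agent
    # over MDPs drawn from a test prior (all names are illustrative).
    returns = []
    for _ in range(n_trials):
        mdp = sample_mdp_from_prior()   # unknown environment drawn from the prior
        agent = agent_factory()         # fresh agent, initialised with the same prior
        state = mdp.reset()
        total, discount = 0.0, 1.0
        for _ in range(horizon):
            action = agent.act(state)
            next_state, reward = mdp.step(action)
            agent.observe(state, action, reward, next_state)
            total += discount * reward
            discount *= gamma
            state = next_state
        returns.append(total)
    return sum(returns) / len(returns)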

Cited by 8 publications (15 citation statements)
References 6 publications
“…In particular, let us mention Bayesian RL approaches (see Ghavamzadeh et al (2015) for an extensive literature review), which offer two interesting features: by assuming a prior distribution on potential (unknown) environments, Bayesian RL (i) allows to formalize Bayesian-optimal exploration / exploitation strategies, and (ii) offers the opportunity to incorporate prior knowledge into the prior distribution. However, most Bayesian RL algorithms suffer computational complexity (Castronovo et al (2016)). …”
Section: Results (mentioning)
Confidence: 99%
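As a concrete illustration of the prior distribution over unknown environments mentioned in this statement, the sketch below uses the standard Dirichlet-multinomial belief over the transition function of a discrete MDP. This is a common modelling choice in BRL, not necessarily the one used in the cited works, and the class name is illustrative.

import numpy as np

class DirichletTransitionModel:
    # Dirichlet-multinomial belief over the transition function of a
    # discrete MDP; prior_counts encodes prior knowledge (larger values
    # mean a stronger prior).
    def __init__(self, n_states, n_actions, prior_counts=1.0):
        self.alpha = np.full((n_states, n_actions, n_states), prior_counts)

    def update(self, s, a, s_next):
        # Bayesian update: observing (s, a, s') increments the matching count.
        self.alpha[s, a, s_next] += 1.0

    def expected_transition(self, s, a):
        # Posterior mean of P(s' | s, a).
        return self.alpha[s, a] / self.alpha[s, a].sum()

    def sample_transition_matrix(self):
        # Draw one plausible MDP from the current posterior, as done by
        # posterior-sampling methods such as the one quoted further below.
        n_states, n_actions, _ = self.alpha.shape
        return np.array([[np.random.dirichlet(self.alpha[s, a])
                          for a in range(n_actions)]
                         for s in range(n_states)])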
“…The accuracy is depending on the number of nodes those algorithms are able to visit, which is limited by an on-line computation time budget. Despite theoretical guarantees to reach Bayesian optimality offered by BL approaches 1 , they may not be applicable when the time budget that can be allocated for on-line decision making is short (Castronovo et al, 2015). Another method, Smarter Best of Sampled Set (SBOSS) (Castro and Precup, 2010), samples several MDPs from the posterior distribution, builds a merged MDP, and computes its Q-function.…”
Section: State of the art (mentioning)
Confidence: 99%
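The statement above summarises how SBOSS proceeds: sample several MDPs from the posterior, merge them, and compute the Q-function of the merged MDP. The sketch below follows the BOSS-style merge in which every (action, sample) pair acts as a distinct action of the merged MDP; it assumes a known (or posterior-mean) reward function and omits the resampling schedule, so it is an illustration rather than the authors' implementation.

import numpy as np

def merged_q_function(sampled_P, R, gamma=0.95, n_iter=200):
    # sampled_P: list of K transition arrays of shape (S, A, S) drawn from
    # the posterior; R: reward array of shape (S, A).
    K = len(sampled_P)
    n_states, n_actions, _ = sampled_P[0].shape
    Q = np.zeros((n_states, n_actions, K))   # merged action space: (a, k)
    for _ in range(n_iter):
        V = Q.reshape(n_states, -1).max(axis=1)   # value of best merged action
        for k in range(K):
            # Bellman backup using the dynamics of the k-th sampled MDP.
            Q[:, :, k] = R + gamma * sampled_P[k] @ V
    return Q

def greedy_action(Q, s):
    # Act greedily over the merged (action, sample) space; optimism over the
    # sampled models is what drives exploration in this family of methods.
    a, _k = np.unravel_index(np.argmax(Q[s]), Q[s].shape)
    return int(a)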
“…In practice, this happens for example when training a drone to fly in a safe environment before sending it on the operation field (Zhang et al, 2015). This is called offline training and can be beneficial to the online performance in the real environment, even if prior knowledge is inaccurate (Castronovo et al, 2014).…”
Section: Introduction (mentioning)
Confidence: 99%