2012
DOI: 10.1007/978-3-642-31087-4_79

Discretized Bayesian Pursuit – A New Scheme for Reinforcement Learning

Abstract: The success of Learning Automata (LA)-based estimator algorithms over the classical Linear Reward-Inaction (L_RI)-like schemes can be explained by their ability to pursue the actions with the highest reward probability estimates. Without access to reward probability estimates, it makes sense for schemes like the L_RI to first make large exploring steps, and then to gradually turn exploration into exploitation by making progressively smaller learning steps. However, this behavior becomes counter-in…
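The shrinking step size that the abstract attributes to L_RI-like schemes can be illustrated with a minimal sketch of the classical Linear Reward-Inaction update. This is not the paper's DBPA; the function names, the two-action setup, and the learning rate of 0.1 are assumptions chosen for illustration.

```python
def lri_update(probs, chosen, rewarded, rate=0.1):
    """One Linear Reward-Inaction step (illustrative sketch, hypothetical helper)."""
    if not rewarded:
        # "Inaction": a penalty leaves the probability vector unchanged.
        return list(probs)
    # On reward, shrink every probability by (1 - rate) ...
    updated = [(1.0 - rate) * p for p in probs]
    # ... and give the freed mass to the chosen action.
    updated[chosen] = probs[chosen] + rate * (1.0 - probs[chosen])
    return updated


def reward_steps(n_steps, rate=0.1):
    """Reward action 0 repeatedly and record how much its probability grows."""
    probs = [0.5, 0.5]  # assumed starting point: two equally likely actions
    steps = []
    for _ in range(n_steps):
        new_probs = lri_update(probs, chosen=0, rewarded=True, rate=rate)
        steps.append(new_probs[0] - probs[0])
        probs = new_probs
    return probs, steps
```

Because each increment equals `rate * (1 - p)`, the steps returned by `reward_steps` get smaller as the chosen action's probability approaches 1, which is the exploration-to-exploitation behavior the abstract describes.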

Citations: cited by 20 publications (16 citation statements)
References: 17 publications (21 reference statements)
“…The most difficult part in the design and analysis of LA consists of the formal proofs of their convergence accuracies. The mathematical techniques used for the various families (FSSA, VSSA, Discretized, etc.) are quite distinct.…”
Section: Outline Of the Classification Of Learning Automata (mentioning)
confidence: 99%
“…In this paper, we extend the BPA into the domain of discretization, and propose a new Bayesian estimator algorithm, namely, the Discretized Bayesian Pursuit Algorithm (DBPA) [1]. Firstly, the DBPA maintains an action probability vector for selecting actions.…”
Section: Contributions and Paper Organization (mentioning)
confidence: 99%
“…Although LA have been studied extensively [1, 13-15, 17] and been applied in many fields [4, 9], designing LA when the number of actions involved, R, is large is extremely complex. The solution that we propose in this paper attempts to resolve this problem.…”
Section: Introduction (mentioning)
confidence: 99%
“…If the values allowed are equally spaced in this interval, the discretization is said to be linear, otherwise, the discretization is called non-linear. Following the discretization concept, many of the continuous VSSA have been discretized; indeed, discretized versions of almost all continuous automata have been reported [10,13,14].…”
Section: Introduction (mentioning)
confidence: 99%
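The linear versus non-linear distinction quoted above can be made concrete with a small sketch. It assumes the interval in question is [0, 1] and uses a hypothetical `resolution` parameter for the number of equal steps; neither detail is confirmed by the quoted statement.

```python
from fractions import Fraction


def linear_levels(resolution):
    """Equally spaced values in [0, 1]: 0, 1/N, 2/N, ..., 1 (assumed interval)."""
    return [Fraction(k, resolution) for k in range(resolution + 1)]


def is_linear(levels):
    """A discretization is linear when all gaps between adjacent values are equal."""
    gaps = {b - a for a, b in zip(levels, levels[1:])}
    return len(gaps) == 1
```

Exact fractions are used instead of floats so that the equal-gap check is not disturbed by rounding error.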
“…In order to highlight the distinct characteristics of the DPA within the family of PAs, the continuous version is referred to as the CPA. We briefly mention that discretized versions of all the reported EA schemes have been devised [9, 13, 14].…”
Section: Introduction (mentioning)
confidence: 99%