2003 46th Midwest Symposium on Circuits and Systems
DOI: 10.1109/mwscas.2003.1562233

A Reinforcement Learning Method Based on Adaptive Simulated Annealing

Abstract: Reinforcement learning is a hard problem, and the majority of existing algorithms suffer from poor convergence properties on difficult problems. In this paper we propose a new reinforcement learning method that utilizes the power of global optimization methods such as simulated annealing. Specifically, we use a particularly powerful version of simulated annealing called Adaptive Simulated Annealing (ASA) [3]. Towards this end we consider a batch formulation for the reinforcement learning problem, unlike t…
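
To make the idea in the abstract concrete, here is a minimal sketch of policy search by simulated annealing, where each candidate policy is scored on a whole rollout before it is accepted or rejected. It uses plain simulated annealing with a geometric cooling schedule, not ASA, and it is not the paper's method. The chain environment, the reward, and every name in it (`episode_return`, `anneal`, `N_STATES`, `HORIZON`) are hypothetical and chosen only for illustration.

```python
import math
import random

# Hypothetical toy environment (not from the paper): a 5-state chain.
# Action 1 moves right, action 0 moves left; walls clamp the state.
N_STATES = 5
HORIZON = 10


def episode_return(policy):
    """Roll out a deterministic tabular policy for one episode.

    The return is the furthest state reached, scaled to [0, 1].
    """
    state = 0
    furthest = 0
    for _ in range(HORIZON):
        if policy[state] == 1:
            state = min(state + 1, N_STATES - 1)
        else:
            state = max(state - 1, 0)
        furthest = max(furthest, state)
    return furthest / (N_STATES - 1)


def anneal(n_iters=2000, t0=1.0, cooling=0.995, seed=0):
    """Plain simulated annealing over tabular policies (assumed settings)."""
    rng = random.Random(seed)
    policy = [rng.randint(0, 1) for _ in range(N_STATES)]
    score = episode_return(policy)
    best_policy, best_score = list(policy), score
    temp = t0
    for _ in range(n_iters):
        # Propose a neighbour: flip the action in one randomly chosen state.
        candidate = list(policy)
        i = rng.randrange(N_STATES)
        candidate[i] = 1 - candidate[i]
        cand_score = episode_return(candidate)
        delta = cand_score - score
        # Always accept improvements; accept worse moves with
        # probability exp(delta / temp), which shrinks as temp cools.
        if delta >= 0 or rng.random() < math.exp(delta / temp):
            policy, score = candidate, cand_score
            if score > best_score:
                best_policy, best_score = list(policy), score
        temp *= cooling  # geometric cooling schedule
    return best_policy, best_score


best_policy, best_score = anneal()
```

Early in the run the high temperature lets the search accept worse policies, which is what allows it to leave local optima; as the temperature falls the search behaves more and more like greedy hill climbing.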

Citations: cited by 31 publications (22 citation statements)
References: 14 publications
“…However the greedy hill climbing approach can result in search getting trapped in local maxima thus requiring backtracking and restarts. An adaptive simulated annealing approach may also be applicable here [119]. The above description summarizes the relevance of Bayesian learning issues to our work.…”
Section: Related Work In Classifier Learning
Citation type: mentioning
confidence: 99%
“…Besides, researchers have proposed sequential learning algorithms for resource allocation networks to enhance the convergence of the training error and computational efficiency [25]. A reinforcement learning method based on adaptive simulated annealing has been adopted to improve a decision making test problem [26]. In the literature, the learning algorithms for reduction of the training data sequence with significant information generates less computation time for a minimal network and achieves better performance.…”
Section: Introduction
Citation type: mentioning
confidence: 99%
“…For policy modification in the epoch mode one also uses simulated annealing (Atiya et al, 2003), the tree based method (Ernst et al, 2005), the temporal differences approach (Lagoudakis and Parr, 2003; Markowska-Kaczmar and Kwaśnicka, 2005), neural networks (Riedmiller, 2005) or genetic algorithms (Moriarty et al, 1999) and evolutionary computation approaches (Whiteson, 2012). The suspension of the policy update, either for some number of iterations or until the end of the episode, causes the environment exploration to be performed on the basis of the policy which cannot follow the environment changes.…”
Section: Introduction
Citation type: mentioning
confidence: 99%