A Reinforcement Learning Method Based on Adaptive Simulated Annealing

Atiya, Amir F.; Parlos, Alexander G.; Ingber, Lester

doi:10.1109/mwscas.2003.1562233

Cited by 31 publications

(22 citation statements)

References 14 publications

“…However the greedy hill climbing approach can result in search getting trapped in local maxima thus requiring backtracking and restarts. An adaptive simulated annealing approach may also be applicable here [119]. The above description summarizes the relevance of Bayesian learning issues to our work.…”

Section: Related Work In Classifier Learning (mentioning)

confidence: 99%

Learning topic description from clustering of trusted user roles and event models characterizing distributed provenance networks: a reinforcement learning approach

Mukherjee

Bandyopadhyay

2017

J Big Data

This paper proposes a reinforcement learning based message transfer model for transferring news report messages through a selected path in a trusted provenance network with the objective of maximizing the reward values based on trust or importance based and network congestion or utility based cost measures. The reward values are calculated along a dynamically defined policy path connecting start topic or event node to a goal topic or event or issue nodes for incrementally defined time windows for a given network congestion situation. A hierarchy of agents of trusted roles is used to accomplish the sub-goals associated with sub-story or subtopic in the provenance structure where an agent role may assume the semantic role of the associated subtopic. The twitted news story thread or plan of events is defined in this work from the starting topic or event node to the goal topic or event node for incrementally defined intervals of time. The graphs are clustered into subtopic and these sub-goals or sub topic nodes of a topic node at every level of granularity are associated with cluster of news reports which describe activities associated with sub-goal or sub-topic events. Such cluster of nodes may also represent drilled down sequence of sub-events describing a sub-topic or sub-goal node. The policy path in a topic or story graph model is defined by applying reinforcement learning principles on dynamically defined event models associated with evolution of topic definition observed from incrementally acquired samples of input training data spanning multiple time windows. We provide a methodology for unifying similar provenance graph models for adapting and averaging the policy path classifiers associated with individual models to produce a reduced set of unified models derived during training. A minimum set cover of classifiers is identified for the models and a clustering procedure of the models is suggested based on these classifiers. 
Other database clustering methods have also been suggested as alternatives for clustering these models. A collection of unified models are identified from the models identified within a cluster and the policy path classifiers associated with these models provide the story or topic descriptions destined to goal topic or event nodes characterizing these models within a cluster.

“…Besides, researchers have proposed sequential learning algorithms for resource allocation networks to enhance the convergence of the training error and computational efficiency [25]. A reinforcement learning method based on adaptive simulated annealing has been adopted to improve a decision making test problem [26]. In the literature, the learning algorithms for reduction of the training data sequence with significant information generates less computation time for a minimal network and achieves better performance.…”

Section: Introduction (mentioning)

confidence: 99%

Short-Term Load Forecasting Using Adaptive Annealing Learning Algorithm Based Reinforcement Neural Network

Lee

2016

Energies

Abstract: A reinforcement learning algorithm is proposed to improve the accuracy of short-term load forecasting (STLF) in this article. The proposed model integrates radial basis function neural network (RBFNN), support vector regression (SVR), and adaptive annealing learning algorithm (AALA). In the proposed methodology, firstly, the initial structure of RBFNN is determined by using an SVR. Then, an AALA with time-varying learning rates is used to optimize the initial parameters of SVR-RBFNN (AALA-SVR-RBFNN). In order to overcome the stagnation for searching optimal RBFNN, a particle swarm optimization (PSO) is applied to simultaneously find promising learning rates in AALA. Finally, the short-term load demands are predicted by using the optimal RBFNN. The performance of the proposed methodology is verified on the actual load dataset from the Taiwan Power Company (TPC). Simulation results reveal that the proposed AALA-SVR-RBFNN can achieve a better load forecasting precision compared to various RBFNNs.

“…For policy modification in the epoch mode one also uses simulated annealing (Atiya et al, 2003), the tree based method (Ernst et al, 2005), the temporal differences approach (Lagoudakis and Parr, 2003; Markowska-Kaczmar and Kwaśnicka, 2005), neural networks (Riedmiller, 2005) or genetic algorithms (Moriarty et al, 1999) and evolutionary computation approaches (Whiteson, 2012). The suspension of the policy update, either for some number of iterations or until the end of the episode, causes the environment exploration to be performed on the basis of the policy which cannot follow the environment changes.…”

Section: Introduction (mentioning)

confidence: 99%

Epoch-incremental reinforcement learning algorithms

Zajdel

2013

International Journal of Applied Mathematics and Computer Science

In this article, a new class of the epoch-incremental reinforcement learning algorithm is proposed. In the incremental mode, the fundamental TD(0) or TD(λ) algorithm is performed and an environment model is created. In the epoch mode, on the basis of the environment model, the distances of past-active states to the terminal state are computed. These distances and the reinforcement terminal state signal are used to improve the agent policy.
