2015
DOI: 10.1109/tnnls.2014.2327636
Self-Organizing Neural Networks Integrating Domain Knowledge and Reinforcement Learning

Abstract: The use of domain knowledge in learning systems is expected to improve learning efficiency and reduce model complexity. However, due to the incompatibility with the knowledge structures of learning systems and the real-time exploratory nature of reinforcement learning (RL), domain knowledge cannot be inserted directly. In this paper, we show how self-organizing neural networks designed for online and incremental adaptation can integrate domain knowledge and RL. Specifically, symbol-based domain knowledge is transla…


Cited by 42 publications (12 citation statements)
References 23 publications
“…Carroll and Seppi [18] defined similarity in terms of tasks, and proposed several possible task similarity measures based on the transfer time, policy overlap, Q-values, and reward structure, respectively, to measure the similarity between tasks. Teng [19] proposed to use self-organizing neural networks to make effective use of domain knowledge to reduce model complexity in RL. Through minimizing the reconstruction error of a restricted Boltzmann machine simulating the behavioral dynamics of two compared Markov decision processes, Ammar and Eaton [20] gave a data-driven automated Markov decision process similarity metric.…”
Section: Knowledge Transfer in RL
confidence: 99%
“…Later, in order to solve the problems in the earlier works mentioned above, some automatic similarity estimation works [15][16][17][18][19][20][21] were presented. These works made use of data-driven similarity metrics, including the Markov decision process (MDP) similarity metric [20], the Hausdorff metric [21], and the Kantorovich metric [21].…”
Section: Introduction
confidence: 99%
“…TD-FALCON has three modes of operation. In the PERFORM mode, Algorithm 1 is used to select cognitive node J for deriving action choice a for state s. In the LEARN mode, TD-FALCON learns the effect of action choice a on state s. In the INSERT mode, domain knowledge can be assimilated into FALCON [16].…”
Section: A. Structure and Operating Modes
confidence: 99%
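The excerpt above describes TD-FALCON's three operating modes. A minimal sketch of how such a mode split might look in code is given below; the class, method, and field names are hypothetical illustrations of the PERFORM/LEARN/INSERT distinction, not the authors' implementation, and the node-matching and update rules are simplified stand-ins for FALCON's actual ART-based dynamics.

```python
# Hypothetical sketch of TD-FALCON's three operating modes (PERFORM, LEARN,
# INSERT), as described in the citation excerpt. Names and update rules are
# illustrative assumptions, not the published algorithm.

class TDFalconSketch:
    def __init__(self):
        # Each cognitive node pairs a state pattern with an action and a Q-value.
        self.cognitive_nodes = []

    def perform(self, state):
        """PERFORM mode: select the best-matching cognitive node J for
        state s and derive an action choice a from it."""
        if not self.cognitive_nodes:
            return None
        # Toy match score: fuzzy-AND overlap between stored and input states.
        best = max(self.cognitive_nodes,
                   key=lambda n: sum(min(a, b) for a, b in zip(n["state"], state)))
        return best["action"]

    def learn(self, state, action, td_error, lr=0.5):
        """LEARN mode: update the Q-value of the (state, action) pair
        using a temporal-difference error signal."""
        for n in self.cognitive_nodes:
            if n["state"] == state and n["action"] == action:
                n["q"] += lr * td_error
                return
        # No matching node: commit a new one for this experience.
        self.cognitive_nodes.append({"state": state, "action": action,
                                     "q": lr * td_error})

    def insert(self, rule_state, rule_action, q=1.0):
        """INSERT mode: assimilate a symbolic domain-knowledge rule
        directly as a cognitive node, before or during learning."""
        self.cognitive_nodes.append({"state": rule_state,
                                     "action": rule_action, "q": q})
```

Under this sketch, a domain rule inserted up front immediately shapes action selection, which is the point of the INSERT mode: knowledge is encoded in the same node structure that RL later refines.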
“…This means a suitable ρc3 has to be used. Given that it is difficult to know a priori a suitable ρc3, ρc3 was adapted iteratively [16] using…”
Section: B. Bi-directional Adaptation
confidence: 99%