2018 IEEE International Conference on Agents (ICA)
DOI: 10.1109/agents.2018.8460067

Probabilistic Guided Exploration for Reinforcement Learning in Self-Organizing Neural Networks

Abstract: Exploration is essential in reinforcement learning: it expands the search space of potential solutions to a given problem for performance evaluation. In particular, a carefully designed exploration strategy may help the agent learn faster by taking advantage of what it has learned previously. However, many reinforcement learning mechanisms still adopt simple exploration strategies that select actions purely at random among all feasible actions. In this paper, we propose novel mechanisms to im…

Cited by 5 publications (2 citation statements)
References 5 publications (7 reference statements)

“…Another approach of this type could be reducing the states considered for exploration based on some predefined metric. An approach using adaptive resonance theory (ART) [150] was presented in [151] and was later extended in [152]. In ART, knowledge about actions can be split into: (i) positive chunks, which lead to positive rewards, (ii) negative chunks, which lead to negative results, and (iii) empty chunks, for actions not yet taken.…”
Section: Exploration Parameters Methods
Citation type: mentioning (confidence: 99%)
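
To make the chunk bookkeeping described above concrete, here is a minimal Python sketch. It assumes the agent tracks per-action value estimates and visit counts; the names (classify_actions, q_values, visit_counts) and the sign-based thresholding rule are hypothetical illustrations, not the ART-based procedure of [151].

```python
# Hypothetical chunk labels, for illustration only.
POSITIVE, NEGATIVE, EMPTY = "positive", "negative", "empty"

def classify_actions(q_values, visit_counts):
    """Split the feasible actions into chunks by observed outcome:
    positive -> previously led to positive reward,
    negative -> previously led to negative reward,
    empty    -> never taken yet."""
    chunks = {}
    for action, q in q_values.items():
        if visit_counts.get(action, 0) == 0:
            chunks[action] = EMPTY
        elif q > 0:
            chunks[action] = POSITIVE
        else:
            chunks[action] = NEGATIVE
    return chunks

# Example: action 'c' has never been tried, so it lands in the empty chunk.
q = {"a": 0.8, "b": -0.3, "c": 0.0}
visits = {"a": 5, "b": 2, "c": 0}
print(classify_actions(q, visits))  # {'a': 'positive', 'b': 'negative', 'c': 'empty'}
```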
“…In this approach, the action is randomly chosen from the positive and empty chunks; thus, the agent explores either new actions or actions with positive rewards. Wang et al. [152] extended this so that the remaining actions are also selected with a probability based on how well they are known.…”
Section: Exploration Parameters Methods
Citation type: mentioning (confidence: 99%)
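
A minimal, self-contained sketch of this guided selection rule, under the assumption that "how well known" can be approximated by a visit count. The function guided_explore and its residual parameter are hypothetical, and the 1/(1 + visits) decay is a stand-in for whatever knowledge measure [152] actually uses.

```python
import random

POSITIVE, NEGATIVE, EMPTY = "positive", "negative", "empty"

def chunk_of(q, visits):
    """Chunk label for a single action (same rule as the sketch above)."""
    if visits == 0:
        return EMPTY
    return POSITIVE if q > 0 else NEGATIVE

def guided_explore(q_values, visit_counts, residual=0.1):
    """Draw one exploratory action. Positive- and empty-chunk actions are
    sampled uniformly; negative-chunk actions keep a small residual weight
    that shrinks as they become better known. The decay schedule here is
    an assumption, not the exact rule of [152]."""
    actions = list(q_values)
    weights = []
    for a in actions:
        if chunk_of(q_values[a], visit_counts[a]) in (POSITIVE, EMPTY):
            weights.append(1.0)
        else:
            weights.append(residual / (1 + visit_counts[a]))
    return random.choices(actions, weights=weights, k=1)[0]

# Example: 'b' (negative chunk) is rarely chosen; 'a' and 'c' dominate.
q = {"a": 0.8, "b": -0.3, "c": 0.0}
visits = {"a": 5, "b": 2, "c": 0}
print(guided_explore(q, visits))
```

Weighting rather than filtering keeps every feasible action reachable, which matches the extension's intent of not fully discarding the negative chunk.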