2021
DOI: 10.48550/arxiv.2107.08888
Preprint

Multimodal Reward Shaping for Efficient Exploration in Reinforcement Learning

Abstract: Maintaining long-term exploration ability remains one of the challenges of deep reinforcement learning (DRL). In practice, the reward shaping-based approaches are leveraged to provide intrinsic rewards for the agent to incentivize motivation. However, most existing IRS modules rely on attendant models or additional memory to record and analyze learning procedures, which leads to high computational complexity and low robustness. Moreover, they overemphasize the influence of a single state on exploration, which …

citations
Cited by 2 publications
(2 citation statements)
references
References 22 publications
(15 reference statements)
“…Meanwhile, there are also some autonomous systems, such as autonomous driving and drone navigation, that require intelligent agents with computing and learning capabilities to perform real-time decision-making in dynamic environments based on deep reinforcement learning (DRL). However, DRL still faces some challenges in practical applications, such as slow convergence [97], overfitting problems [98], and poor exploration in complex environments [99]. Collaborative DRL (CDRL) is treated as a promising solution to the above issues, wherein the agents can share their experiences and collaboratively learn the optimal policy for their task [69].…”
Section: B Machine Learning
Citation type: mentioning
confidence: 99%
“…Deep reinforcement learning (DRL) is viewed as a promising approach for training intelligent IoT agents to tackle complex tasks such as autonomous navigation. However, DRL methods can lead to slow convergence [2], overfitting problems [3], or sub-optimal performance due to poor exploration in complex environments [4]. These challenges limit the applications of DRL for real-time autonomous IoT services where convergence time, generalizability of the learning, and performance are all important factors.…”
Section: Introduction
Citation type: mentioning
confidence: 99%