2023
DOI: 10.1108/ria-10-2022-0248
Efficient experience replay architecture for offline reinforcement learning

Abstract: Purpose: Offline reinforcement learning (RL) acquires effective policies from large-scale, previously collected data. In some scenarios, however, collecting such data is hard because it is time-consuming, expensive or dangerous (e.g. health care, autonomous driving), which motivates a more efficient offline RL method. The purpose of the study is to introduce an algorithm that samples high-value transitions from a prioritized buffer and samples uniformly from a normal experience buffer, improving sample ef…
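The abstract sketches a dual-buffer replay architecture: part of each training batch comes from a prioritized buffer of high-value transitions, and the rest is drawn uniformly from an ordinary buffer. Below is a minimal Python sketch of one plausible reading of that scheme; the class name DualReplayBuffer, the TD-error routing rule and the value_threshold/priority_fraction parameters are illustrative assumptions, not the paper's implementation.

```python
import random
import numpy as np

class DualReplayBuffer:
    """Minimal sketch of a dual-buffer replay scheme (illustrative only):
    transitions with large TD error go to a prioritized buffer, the rest
    to a normal buffer; batches mix priority-proportional and uniform draws."""

    def __init__(self, capacity, value_threshold=1.0, priority_fraction=0.5):
        self.capacity = capacity                    # max size of each buffer
        self.value_threshold = value_threshold      # assumed cutoff for "high-value"
        self.priority_fraction = priority_fraction  # share of a batch from the prioritized buffer
        self.prioritized = []                       # list of (priority, transition)
        self.normal = []                            # list of transitions

    def add(self, transition, td_error):
        # Route by |TD error|: high-value transitions are kept with their
        # priority; both buffers evict oldest entries first (simple FIFO).
        if abs(td_error) >= self.value_threshold:
            self.prioritized.append((abs(td_error), transition))
            self.prioritized = self.prioritized[-self.capacity:]
        else:
            self.normal.append(transition)
            self.normal = self.normal[-self.capacity:]

    def sample(self, batch_size):
        # Mixed batch: priority-proportional draws from the prioritized
        # buffer, uniform draws from the normal buffer. The batch may be
        # smaller than batch_size while the buffers are still filling.
        n_prio = min(int(batch_size * self.priority_fraction), len(self.prioritized))
        n_unif = min(batch_size - n_prio, len(self.normal))
        batch = []
        if n_prio > 0:
            prios = np.array([p for p, _ in self.prioritized], dtype=np.float64)
            probs = prios / prios.sum()
            idx = np.random.choice(len(self.prioritized), size=n_prio,
                                   replace=False, p=probs)
            batch.extend(self.prioritized[i][1] for i in idx)
        if n_unif > 0:
            batch.extend(random.sample(self.normal, n_unif))
        return batch
```

Keeping the two buffers separate means the uniform draws preserve coverage of the dataset while the prioritized draws concentrate updates on informative transitions.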

Cited by 8 publications (2 citation statements)
References 19 publications
“…Motion planning for robotic arms usually uses either sparse rewards or dense rewards [16]. When using sparse rewards, in order to utilize the data efficiently, the recent Prioritized Experience Replay (PER) and Hindsight Experience Replay (HER) methods can achieve good results [17,18].…”
Section: Related Work
Mentioning confidence: 99%
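For context on the PER variant this quote refers to [17], the following is a minimal sketch of proportional prioritization as introduced by Schaul et al.: transitions are sampled with probability P(i) ∝ (|δ_i| + ε)^α, and the resulting bias is corrected with importance-sampling weights w_i = (N · P(i))^(−β). The per_sample helper is a hypothetical name, and the sum-tree normally used for O(log N) sampling is omitted.

```python
import numpy as np

def per_sample(td_errors, batch_size, alpha=0.6, beta=0.4, eps=1e-6):
    """Proportional Prioritized Experience Replay (sketch).
    td_errors: TD errors for every stored transition.
    Returns sampled indices and normalized importance-sampling weights."""
    priorities = (np.abs(td_errors) + eps) ** alpha   # p_i = (|delta_i| + eps)^alpha
    probs = priorities / priorities.sum()             # P(i) = p_i / sum_k p_k
    idx = np.random.choice(len(td_errors), size=batch_size, p=probs)
    weights = (len(td_errors) * probs[idx]) ** (-beta)  # w_i = (N * P(i))^(-beta)
    weights /= weights.max()                            # normalize for stability
    return idx, weights
```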
“…The dataset’s size is a critical factor, as larger datasets offer more diverse experiences, enabling better generalization. However, the size should be balanced with computational considerations [16]. A dataset’s relevance to the target task is paramount, ensuring that task-specific features contribute to the model’s adaptability.…”
Section: Introduction
Mentioning confidence: 99%