2021 6th International Conference on Computational Intelligence and Applications (ICCIA)
DOI: 10.1109/iccia52886.2021.00011
Prioritized Experience Replay for Continual Learning

Cited by 33 publications (61 citation statements). References 15 publications.
“…It is worth mentioning that many other algorithms are also introducing a variety of experience replay schemes. Some of them [43,55] depend on new components, and others [47] have different algorithm architectures. Since the backbone MARL algorithm of our choice in this experiment is QMIX, we do not expect a significant change over the algorithm architecture (e.g., actor-network) or major components (e.g., loss structure) as presented in other approaches to realize a relatively fair comparison.…”
Section: Comparison With Existing Experience Replay Methods
confidence: 99%
“…[33] uses the regret minimization method to design the prioritized experience replay scheme for the only agent in the environment. MaPER [43] employs model learning to improve experience replay by using a model-augmented critic network and modifying the rule of priority. Also, new loss function designs can help develop prioritization schemes [55].…”
Section: Single-Agent Experience Replay
confidence: 99%
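The prioritization idea running through these citing works can be sketched in a few lines. The following is a minimal proportional replay buffer keyed on TD error, a simplified stand-in rather than the specific regret-based or MaPER schemes cited above; the class name and the `alpha`/`eps` parameters are illustrative assumptions.

```python
import numpy as np

class PrioritizedReplayBuffer:
    """Minimal proportional prioritized replay sketch (hypothetical names)."""

    def __init__(self, capacity, alpha=0.6, eps=1e-6):
        self.capacity = capacity   # maximum number of stored transitions
        self.alpha = alpha         # how strongly priorities shape sampling
        self.eps = eps             # keeps every priority strictly positive
        self.data, self.priorities = [], []
        self.pos = 0

    def add(self, transition, td_error):
        priority = (abs(td_error) + self.eps) ** self.alpha
        if len(self.data) < self.capacity:
            self.data.append(transition)
            self.priorities.append(priority)
        else:                      # overwrite the oldest slot
            self.data[self.pos] = transition
            self.priorities[self.pos] = priority
        self.pos = (self.pos + 1) % self.capacity

    def sample(self, batch_size):
        probs = np.asarray(self.priorities)
        probs = probs / probs.sum()
        idx = np.random.choice(len(self.data), batch_size, p=probs)
        return idx, [self.data[i] for i in idx]

    def update_priorities(self, idx, td_errors):
        for i, err in zip(idx, td_errors):
            self.priorities[i] = (abs(err) + self.eps) ** self.alpha
```

Transitions whose TD error is large are sampled more often, which is the common thread the cited variants modify through different priority rules.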
“…However, as the number of tasks increases, the fraction of memory allocated to each task shrinks, resulting in fewer samples per task for rehearsal. Other more sophisticated strategies focus on prioritising replay [34], storing and replaying exemplars from each task to best approximate task means [35], [36] or applying reservoir sampling to fix a budget for each seen task [37].…”
Section: A. Rehearsal
confidence: 99%
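To make the reservoir-sampling strategy mentioned above concrete, the sketch below maintains a fixed replay budget whose contents remain a uniform sample of everything observed so far; the function and argument names are hypothetical and not taken from the cited methods.

```python
import random

def reservoir_update(memory, budget, example, seen_count):
    """Keep `memory` a uniform sample of the `seen_count` examples seen so far.

    memory     -- list holding at most `budget` stored examples
    budget     -- fixed replay-memory size shared across tasks
    example    -- the newly observed (input, label) pair
    seen_count -- number of examples observed so far, including this one
    """
    if len(memory) < budget:
        memory.append(example)
    else:
        # Replace a random slot with probability budget / seen_count
        j = random.randint(0, seen_count - 1)
        if j < budget:
            memory[j] = example
    return memory
```

Because the replacement probability shrinks as more data arrives, early tasks are not crowded out, which is the appeal of reservoir sampling under a fixed budget.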
“…Their simplicity and ease of implementation distinguish these methods, so they are, in general, the most popular approach. RLED value-based methods extend the most popular methods, such as Q-Learning [97], SARSA [72], Deep Q-Networks (DQN) [61], Double DQN (DDQN) [90], Prioritized Dueling Double Deep Q-Networks (PDD DQN) [75], and Dueling Network Architectures for Deep Reinforcement Learning [94]. Policy-based methods directly estimate the control policy.…”
Section: Reinforcement Learning From Expert Demonstrations
confidence: 99%
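For reference, the tabular Q-Learning update that these value-based extensions build on can be written directly from the Bellman target; the table layout and hyperparameter values below are only illustrative.

```python
import numpy as np

def q_learning_update(Q, s, a, r, s_next, done, alpha=0.1, gamma=0.99):
    """One tabular Q-Learning step: Q(s, a) <- Q(s, a) + alpha * TD-error."""
    target = r if done else r + gamma * np.max(Q[s_next])
    td_error = target - Q[s, a]
    Q[s, a] += alpha * td_error
    return td_error  # the same quantity prioritized replay uses as a priority signal
```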