2020 International Joint Conference on Neural Networks (IJCNN)
DOI: 10.1109/ijcnn48605.2020.9206982
Novelty-Guided Reinforcement Learning via Encoded Behaviors

Cited by 15 publications (20 citation statements)
References 12 publications
“…Although RLHF has shown promising results by incorporating fluency, progress in this field is impeded by a lack of publicly available benchmarks and implementation resources, leading to a perception that RL is a challenging approach for NLP. To address this issue, an open-source library named RL4LMs [49] has recently been introduced, consisting of building blocks for fine-tuning and evaluating RL algorithms on LM-based generation.…”
Section: Reinforcement Learning From Human Feedback (mentioning)
confidence: 99%
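
The excerpt above describes RL4LMs only at a high level. For orientation, here is a minimal, self-contained sketch of the policy-gradient loop that RL fine-tuning of text generators builds on. This is not RL4LMs code: the TinyPolicy model, the diversity-based reward_fn, and all hyperparameters are hypothetical stand-ins chosen only to keep the example runnable.

    # Conceptual sketch of policy-gradient fine-tuning for text generation.
    # Illustrative only -- NOT the RL4LMs API; all names here are hypothetical.
    import torch
    import torch.nn as nn

    VOCAB, SEQ_LEN = 16, 8

    class TinyPolicy(nn.Module):
        """Toy autoregressive policy: embed the previous token, predict the next."""
        def __init__(self):
            super().__init__()
            self.embed = nn.Embedding(VOCAB, 32)
            self.rnn = nn.GRU(32, 32, batch_first=True)
            self.head = nn.Linear(32, VOCAB)

        def forward(self, tokens, hidden=None):
            out, hidden = self.rnn(self.embed(tokens), hidden)
            return self.head(out), hidden

    def reward_fn(token_ids):
        # Hypothetical sequence-level reward standing in for human feedback:
        # here it simply favors token diversity.
        return len(set(token_ids)) / SEQ_LEN

    policy = TinyPolicy()
    opt = torch.optim.Adam(policy.parameters(), lr=1e-3)

    for step in range(200):
        tok = torch.zeros(1, 1, dtype=torch.long)   # start-of-sequence token id 0
        hidden, log_probs, sampled = None, [], []
        for _ in range(SEQ_LEN):
            logits, hidden = policy(tok, hidden)
            dist = torch.distributions.Categorical(logits=logits[:, -1])
            action = dist.sample()                  # shape (1,)
            log_probs.append(dist.log_prob(action))
            sampled.append(action.item())
            tok = action.unsqueeze(0)               # feed sample back as next input
        # REINFORCE update: scale summed log-probabilities by the episode reward.
        loss = -reward_fn(sampled) * torch.stack(log_probs).sum()
        opt.zero_grad()
        loss.backward()
        opt.step()

A real pipeline would swap TinyPolicy for a pretrained language model and reward_fn for a learned preference model, which matches the excerpt's description of RL4LMs as building blocks for fine-tuning and evaluating RL algorithms on LM-based generation.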
“…2.4). For this reason, disco has a wider scope than other related toolkits such as RL4LM (Ramamurthy et al, 2022), which centers on RL methods only. Nevertheless, there is a large space for cross-pollination between RL-based frameworks and disco because of similarities in the algorithms (Korbak et al, 2022b).…”
Section: Related Work and Conclusion (mentioning)
confidence: 99%
“…On a standard laptop, OpenRL can complete the training of the CartPole task in just a few seconds. Compared to the RL4LMs framework (Ramamurthy et al, 2022), our training speed for dialogue tasks has improved by 17%, with improvements in various performance indicators as well (see Appendix C for specific experimental results).…”
Section: High Performance (mentioning)
confidence: 99%
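
The excerpt quotes OpenRL's speed on the CartPole benchmark but shows no code, and the OpenRL call signatures are not given here. As a reference point, a comparable quick CartPole run with the widely used stable-baselines3 PPO implementation looks like the sketch below; stable-baselines3 is an assumption swapped in for illustration, and the step count is illustrative.

    # Baseline illustration of the CartPole benchmark mentioned above, using
    # stable-baselines3 rather than OpenRL (whose API the excerpt does not show).
    import gymnasium as gym
    from stable_baselines3 import PPO

    # Train PPO on CartPole; on a typical laptop this finishes quickly.
    model = PPO("MlpPolicy", "CartPole-v1", verbose=0)
    model.learn(total_timesteps=20_000)

    # One greedy evaluation episode.
    env = gym.make("CartPole-v1")
    obs, _ = env.reset()
    episode_return, done = 0.0, False
    while not done:
        action, _ = model.predict(obs, deterministic=True)
        obs, reward, terminated, truncated, _ = env.step(int(action))
        episode_return += reward
        done = terminated or truncated
    print(f"episode return: {episode_return}")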
“…The table below presents the results of training on the dialogue task (Li et al, 2017) using OpenRL and comparing them with RL4LMs (Ramamurthy et al, 2022).…”
Section: Appendix A: OpenRL's General Code Interface (mentioning)
confidence: 99%