2021
DOI: 10.48550/arxiv.2107.01969
Preprint

The MineRL BASALT Competition on Learning from Human Feedback

Abstract: The last decade has seen a significant increase of interest in deep learning research, with many public successes that have demonstrated its potential. As such, these systems are now being incorporated into commercial products. With this comes an additional challenge: how can we build AI systems that solve tasks where there is not a crisp, well-defined specification? While multiple solutions have been proposed, in this competition we focus on one in particular: learning from human feedback. Rather than trainin…

citations
Cited by 7 publications
(6 citation statements)
references
References 12 publications
“…More complex tasks involve partially observable worlds (Kim et al 2020) and object state changes (Misra et al 2018; Puig et al 2018; Shridhar et al 2020). Some works use a planner to generate ideal demonstrations that are then labeled, while others first gather instructions and gather human demonstrations (Misra et al 2018; Shah et al 2021; Abramson et al 2020). In TEACh, human instructions and demonstrations are gathered simultaneously.…”
Section: Related Work (mentioning)
confidence: 99%
“…Human feedback can enable agents to solve sequential decision-making tasks for which the rewards are not easily defined [4]. Prior work used NE input to author [5], [6] or interactively shape robot behaviors [7], accelerate the learning, guide exploration, and prevent undesired actions [8]-[11].…”
Section: A Learning From NE Feedback (mentioning)
confidence: 99%
“…The year 2021 saw a number of Minecraft competitions that were organized alongside this MineRL Diamond competition. A sister competition, MineRL BASALT (Shah et al, 2021), made use of the same MineRL library to express desired agent goals with human demonstrations instead of the rewards, and then used human evaluators to rate the agents on how well they completed the tasks. The IGLU competition (Kiseleva et al, 2021) used the building mechanics of Minecraft for two challenges for the participants: build an agent that provides descriptions on what to build, and then build an agent which builds a structure based on the instructions.…”
Section: Related Work (mentioning)
confidence: 99%