2020
DOI: 10.48550/arxiv.2010.07877
Preprint

Avoiding Side Effects By Considering Future Tasks

Victoria Krakovna, Laurent Orseau, Richard Ngo, et al.

Abstract: Designing reward functions is difficult: the designer has to specify what to do (what it means to complete the task) as well as what not to do (side effects that should be avoided while completing the task). To alleviate the burden on the reward designer, we propose an algorithm to automatically generate an auxiliary reward function that penalizes side effects. This auxiliary objective rewards the ability to complete possible future tasks, which decreases if the agent causes side effects during the current task…
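The abstract only outlines the approach, so the following is a minimal toy sketch of the general idea of rewarding the ability to complete possible future tasks; it is not the authors' algorithm. The gridworld, the hypothetical FUTURE_TASKS distribution, the vase-breaking side effect, and the uniform averaging over tasks are all assumptions made purely for illustration.

import itertools

SIZE = 3
CELLS = [(r, c) for r in range(SIZE) for c in range(SIZE)]
ACTIONS = [(-1, 0), (1, 0), (0, -1), (0, 1), (0, 0)]
VASE_CELL = (1, 1)  # stepping onto this cell breaks a vase, irreversibly

def step(cell, vase_intact, action):
    # Deterministic moves clipped to the grid; a broken vase stays broken.
    r, c = cell
    dr, dc = action
    nxt = (max(0, min(SIZE - 1, r + dr)), max(0, min(SIZE - 1, c + dc)))
    return nxt, vase_intact and nxt != VASE_CELL

def future_task_value(goal_cell, needs_vase, gamma=0.9, iters=60):
    # Optimal value of the future task "reach goal_cell (with the vase
    # intact, if the task needs it)" from every state, via value iteration.
    states = list(itertools.product(CELLS, [True, False]))
    V = {s: 0.0 for s in states}
    for _ in range(iters):
        for cell, vase in states:
            if cell == goal_cell and (vase or not needs_vase):
                V[(cell, vase)] = 1.0
            else:
                V[(cell, vase)] = gamma * max(V[step(cell, vase, a)] for a in ACTIONS)
    return V

# Hypothetical distribution over possible future tasks: (goal cell, needs vase).
FUTURE_TASKS = [((0, 2), False), ((2, 2), False), ((2, 0), True)]

def auxiliary_reward(cell, vase_intact):
    # Mean optimal future-task value from the current state: the quantity
    # this sketch's auxiliary objective rewards.
    vals = [future_task_value(goal, needs_vase)[(cell, vase_intact)]
            for goal, needs_vase in FUTURE_TASKS]
    return sum(vals) / len(vals)

if __name__ == "__main__":
    # Breaking the vase lowers the auxiliary reward, because one of the
    # possible future tasks can no longer be completed.
    print("vase intact:", round(auxiliary_reward((0, 0), True), 3))
    print("vase broken:", round(auxiliary_reward((0, 0), False), 3))

In the paper the auxiliary term is combined with the current task's reward, and the choice of distribution over future tasks matters; the uniform average over three hand-picked goals above is only meant to show the qualitative effect that an irreversible side effect reduces the auxiliary reward.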

Cited by 2 publications (2 citation statements) · References 10 publications (17 reference statements)
“…This is particularly concerning given that Silver et al are highly influential researchers and employed at DeepMind, one of the organisations best equipped to expand the frontiers of AGI. While Silver et al "hope that other researchers will join us on our quest", we instead hope that the creation of AGI based on reward maximisation is tempered by other researchers with an understanding of the issues of AI safety [45,47] and an appreciation of the benefits of multi-objective agents [1,2].…”
Section: Discussion (mentioning)
confidence: 99%
“…One of these policies is then selected and executed, and a subsequent review of the outcomes may lead to an adjustment in overseer selection (without a need to remodel or retrain), or other changes such as the introduction of new objectives […] AGI. While Silver et al "hope that other researchers will join us on our quest", we instead hope that the creation of AGI based on reward maximisation is tempered by other researchers with an understanding of the issues of AI safety [48,50] and an appreciation of the benefits of multi-objective agents [1,2].…”
Section: Discussion (mentioning)
confidence: 99%