AI Safety Gridworlds
Preprint, 2017
DOI: 10.48550/arxiv.1711.09883

Cited by 65 publications (107 citation statements) | References 0 publications
“…This is particularly concerning given that Silver et al are highly influential researchers and employed at DeepMind, one of the organisations best equipped to expand the frontiers of AGI. While Silver et al "hope that other researchers will join us on our quest", we instead hope that the creation of AGI based on reward maximisation is tempered by other researchers with an understanding of the issues of AI safety [45,47] and an appreciation of the benefits of multi-objective agents [1,2].…”
Section: Discussion (mentioning)
Confidence: 99%
“…Leike et al [134] introduced a suite of reinforcement learning environments to measure the agent's compliance with the intended safe behavior. Their work categorizes AI safety problems into two areas of specification and robustness, which cover various problems, including avoiding side effects and safe exploration, and robustness to self-modification and distributional shift.…”
Section: AI Safety (mentioning)
Confidence: 99%
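The central design the quoted passage refers to is the paper's split between a visible reward function, which the agent optimises, and a hidden safety performance function, which scores side effects but is never observed by the agent. Below is a minimal sketch of that split; all names are hypothetical illustrations, not the actual ai_safety_gridworlds API.

```python
# Toy illustration of the visible-reward / hidden-performance split.
# Hypothetical names; not the ai_safety_gridworlds API.
import random


class ToyGridworld:
    """A 1-D corridor: reach the goal cell; stepping on a 'vase' cell is unsafe."""

    def __init__(self, size=5, vase=2):
        self.size, self.vase = size, vase
        self.pos = 0
        self.hidden_performance = 0  # tracked by the evaluator, never shown to the agent

    def step(self, action):  # action: -1 (left) or +1 (right)
        self.pos = max(0, min(self.size - 1, self.pos + action))
        reward = 1 if self.pos == self.size - 1 else 0
        # The performance function penalises a side effect the reward ignores.
        self.hidden_performance += reward - (5 if self.pos == self.vase else 0)
        done = self.pos == self.size - 1
        return self.pos, reward, done


env = ToyGridworld()
done = False
while not done:
    _, reward, done = env.step(random.choice([-1, 1]))
# Evaluation compares the visible return against the hidden safety score.
print("hidden safety performance:", env.hidden_performance)
```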
“…When we train an agent to achieve a goal, we usually want it to achieve that goal following implicit safety constraints. Handcrafting such safety constraints would be time-consuming, difficult to scale for complex problems, and might lead to reward hacking; so a reasonable proxy consists in limiting irreversible side-effects in the environment [26].…”
Section: Learning Reversible Policies (mentioning)
Confidence: 99%
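One way to operationalise the side-effect proxy in this passage is to shape the reward with a reversibility check: penalise any transition after which the previous state is no longer reachable. A minimal sketch under assumed deterministic toy dynamics; every name here is hypothetical, not a method from the cited works.

```python
# Reversibility-based reward shaping: penalise steps that cannot be undone.
# Assumes small, deterministic, fully known dynamics (a strong simplification).
from collections import deque


def reachable(start, goal, transitions):
    """BFS over deterministic dynamics: can `goal` be reached from `start`?"""
    seen, queue = {start}, deque([start])
    while queue:
        s = queue.popleft()
        if s == goal:
            return True
        for nxt in transitions.get(s, {}).values():
            if nxt not in seen:
                seen.add(nxt)
                queue.append(nxt)
    return False


def shaped_reward(reward, prev_state, next_state, transitions, penalty=1.0):
    """Subtract a penalty whenever the previous state is no longer reachable."""
    if reachable(next_state, prev_state, transitions):
        return reward
    return reward - penalty


# Toy dynamics: breaking the vase is irreversible, walking into the room is not.
T = {"start": {"smash": "broken", "walk": "room"},
     "room": {"back": "start"}}
print(shaped_reward(0.0, "start", "room", T))    # 0.0: reversible step
print(shaped_reward(0.0, "start", "broken", T))  # -1.0: irreversible step
```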