2020
DOI: 10.48550/arxiv.2010.07877
Preprint

Avoiding Side Effects By Considering Future Tasks

Victoria Krakovna, Laurent Orseau, Richard Ngo, et al.

Abstract: Designing reward functions is difficult: the designer has to specify what to do (what it means to complete the task) as well as what not to do (side effects that should be avoided while completing the task). To alleviate the burden on the reward designer, we propose an algorithm to automatically generate an auxiliary reward function that penalizes side effects. This auxiliary objective rewards the ability to complete possible future tasks, which decreases if the agent causes side effects during the current task…
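The abstract only outlines the approach, so the following is a minimal toy sketch of the general idea of rewarding the ability to complete possible future tasks; it is not the authors' algorithm. The gridworld, the hypothetical FUTURE_TASKS distribution, the vase-breaking side effect, and the uniform averaging over tasks are all assumptions made purely for illustration.

import itertools

SIZE = 3
CELLS = [(r, c) for r in range(SIZE) for c in range(SIZE)]
ACTIONS = [(-1, 0), (1, 0), (0, -1), (0, 1), (0, 0)]
VASE_CELL = (1, 1)  # stepping onto this cell breaks a vase, irreversibly

def step(cell, vase_intact, action):
    # Deterministic moves clipped to the grid; a broken vase stays broken.
    r, c = cell
    dr, dc = action
    nxt = (max(0, min(SIZE - 1, r + dr)), max(0, min(SIZE - 1, c + dc)))
    return nxt, vase_intact and nxt != VASE_CELL

def future_task_value(goal_cell, needs_vase, gamma=0.9, iters=60):
    # Optimal value of the future task "reach goal_cell (with the vase
    # intact, if the task needs it)" from every state, via value iteration.
    states = list(itertools.product(CELLS, [True, False]))
    V = {s: 0.0 for s in states}
    for _ in range(iters):
        for cell, vase in states:
            if cell == goal_cell and (vase or not needs_vase):
                V[(cell, vase)] = 1.0
            else:
                V[(cell, vase)] = gamma * max(V[step(cell, vase, a)] for a in ACTIONS)
    return V

# Hypothetical distribution over possible future tasks: (goal cell, needs vase).
FUTURE_TASKS = [((0, 2), False), ((2, 2), False), ((2, 0), True)]

def auxiliary_reward(cell, vase_intact):
    # Mean optimal future-task value from the current state: the quantity
    # this sketch's auxiliary objective rewards.
    vals = [future_task_value(goal, needs_vase)[(cell, vase_intact)]
            for goal, needs_vase in FUTURE_TASKS]
    return sum(vals) / len(vals)

if __name__ == "__main__":
    # Breaking the vase lowers the auxiliary reward, because one of the
    # possible future tasks can no longer be completed.
    print("vase intact:", round(auxiliary_reward((0, 0), True), 3))
    print("vase broken:", round(auxiliary_reward((0, 0), False), 3))

In the paper the auxiliary term is combined with the current task's reward, and the choice of distribution over future tasks matters; the uniform average over three hand-picked goals above is only meant to show the qualitative effect that an irreversible side effect reduces the auxiliary reward.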

Cited by 2 publications (2 citation statements) · References 10 publications (17 reference statements)
“…This is particularly concerning given that Silver et al are highly influential researchers and employed at DeepMind, one of the organisations best equipped to expand the frontiers of AGI. While Silver et al "hope that other researchers will join us on our quest", we instead hope that the creation of AGI based on reward maximisation is tempered by other researchers with an understanding of the issues of AI safety [45,47] and an appreciation of the benefits of multi-objective agents [1,2].…”
Section: Discussion (mentioning)
confidence: 99%
“…One of these policies is then selected and executed, and a subsequent review of the outcomes may lead to an adjustment in overseer selection (without a need to remodel or retrain), or other changes such as the introduction of new objectives […] AGI. While Silver et al "hope that other researchers will join us on our quest", we instead hope that the creation of AGI based on reward maximisation is tempered by other researchers with an understanding of the issues of AI safety [48,50] and an appreciation of the benefits of multi-objective agents [1,2].…”
Section: Discussion (mentioning)
confidence: 99%