Artificial Intelligence Safety and Security 2018
DOI: 10.1201/9781351251389-7

The Value Learning Problem

Cited by 25 publications (18 citation statements)
References 10 publications
“…Another widely-used definition of AGI safety is value-alignment between humans and AGI, and herein, between AGI_n and AGI_{n+1}. Value-sets, from which goals are generated, can be hard-coded, re-coded in AGI versions, or be more dynamic by programming the AGI to learn the desired values via techniques such as inverse reinforcement learning [3,17,21]. In such a scenario, which saints will select the saintly humans to emulate?…”
Section: Lack of Proof of Safe AGI or Methods to Prove Safe AGI (mentioning)
Confidence: 99%
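The inverse reinforcement learning technique mentioned in the statement above can be illustrated with a short sketch: given demonstrations from a noisily rational demonstrator, infer the reward weights that best explain its observed choices. This is a minimal toy, not the method of any cited paper; the feature matrix, the Boltzmann choice model, and the learning rate are all illustrative assumptions.

```python
# Toy inverse reinforcement learning (IRL) sketch: recover the hidden
# reward weights of a noisily rational demonstrator by maximum likelihood.
# All data and parameters here are illustrative assumptions.
import numpy as np

rng = np.random.default_rng(0)

# Each candidate action is described by a feature vector; the demonstrator's
# hidden preferences are a linear reward over those features.
features = rng.normal(size=(5, 3))    # 5 actions, 3 features each
true_w = np.array([1.0, -0.5, 0.25])  # hidden "values" to be recovered

def softmax(x):
    z = np.exp(x - x.max())
    return z / z.sum()

# Simulate demonstrations: the demonstrator picks each action with
# probability proportional to exp(reward), i.e. noisily rationally.
demos = rng.choice(5, size=1000, p=softmax(features @ true_w))

# Maximum-likelihood IRL: gradient ascent on log P(demos | w).
w = np.zeros(3)
for _ in range(500):
    p = softmax(features @ w)
    # Gradient = empirical feature expectation - model feature expectation.
    grad = features[demos].mean(axis=0) - p @ features
    w += 0.1 * grad

print("recovered weights:", np.round(w, 2))
```

With enough demonstrations, the recovered weights approach the hidden ones, since the log-likelihood of this choice model is concave in w; the hard open problems the cited papers discuss (whose values, which demonstrators) begin where this toy ends.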
“…The ability of AGI to self-correct or to assist its designers in correction of value alignment and behavior is called ‘corrigibility’ by Soares [21]. Miller et al. review and examine how corrigibility can result in mis-alignment of values [14].…”
Section: Probabilistically Checkable Proofs (PCP Theorem) (mentioning)
Confidence: 99%
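The corrigibility notion quoted above can be made concrete with a toy decision problem, assuming a "utility indifference" style compensation term (one proposal analyzed in that literature): a naive expected-utility maximizer prefers to block its designers' correction, while a compensated agent has no incentive to resist. The payoff numbers below are illustrative assumptions, not figures from the cited papers.

```python
# Toy corrigibility sketch: outcomes are scored by the agent's CURRENT
# utility function, so the corrected goal looks bad to the naive agent.
u_if_uncorrected = 10.0  # keep pursuing the current goal
u_if_corrected = 2.0     # corrected goal scores poorly under current utility

def naive_choice():
    # A naive maximizer compares outcomes under its current utility only,
    # so it prefers to block the correction.
    return "block" if u_if_uncorrected > u_if_corrected else "allow"

def indifferent_choice(compensation):
    # Utility indifference: grant a compensation term when corrected so
    # that allowing and blocking are worth the same under the current
    # utility, leaving no incentive to resist correction.
    return "block" if u_if_uncorrected > u_if_corrected + compensation else "allow"

print(naive_choice())                                          # -> block
print(indifferent_choice(u_if_uncorrected - u_if_corrected))   # -> allow
```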
“…Another widely used definition of AGI safety is value-alignment between humans and AGI, and herein, between AGI_n and AGI_{n+1}. Value-sets, from which goals are generated, can be hard-coded, re-coded in AGI versions, or be more dynamic by programming the AGI to learn the desired values via techniques such as inverse reinforcement learning [3,16,20]. In such a scenario, which saints will select the saintly humans to emulate?…”
Section: Lack of Proof of Safe AGI or Methods to Prove Safe AGI (mentioning)
Confidence: 99%
“…The ability of AGI to self-correct or to assist its designers in correction of value alignment and behavior is called ‘corrigibility’ by Soares [20]. Miller et al. review and examine how corrigibility can result in mis-alignment of values [50].…”
Section: Probabilistically Checkable Proofs (PCP Theorem) (mentioning)
Confidence: 99%
“…However, the use of personal data without consent is one of the main preoccupations found in the literature involving AI Ethics (Soares, 2016; Russell, 2019), and even how to integrate human society in a post-Singularity era (Chalmers, 2010).…”
Section: Privacy (mentioning)
Confidence: 99%