2018
DOI: 10.48550/arxiv.1803.03453
Preprint

The Surprising Creativity of Digital Evolution: A Collection of Anecdotes from the Evolutionary Computation and Artificial Life Research Communities

Cited by 30 publications (41 citation statements). References 0 publications.

“…The success of reward modeling relies heavily on the quality of the reward model. If the reward model only captures most aspects of the objective but not all of it, this can lead the agent to find undesirable degenerate solutions (Lehman et al., 2018; Ibarz et al., 2018). In other words, the agent's behavior depends on the reward model in a way that is potentially very fragile.…”
Section: Challenges (mentioning; confidence: 99%)

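As a toy illustration of this fragility (a hypothetical sketch, not an example from the cited papers; the delivery scenario, reward values, and policy names are all invented here), consider a reward model that scores delivery and speed but says nothing about damage; a greedy agent then prefers the degenerate shortcut:

```python
# Hypothetical toy example of an imperfect reward model being exploited.
# The "true" objective cares about delivery, speed, and not damaging the
# package; the learned reward model misses the damage aspect entirely.

candidate_policies = {
    "careful_delivery": {"delivered": True,  "damaged": False, "steps": 10},
    "throw_package":    {"delivered": True,  "damaged": True,  "steps": 2},
    "do_nothing":       {"delivered": False, "damaged": False, "steps": 0},
}

def reward_model(outcome):
    # Imperfect proxy: rewards delivery and speed, ignores damage.
    return (10.0 if outcome["delivered"] else 0.0) - 0.5 * outcome["steps"]

def true_return(outcome):
    # The actual objective additionally penalizes damaging the package.
    return reward_model(outcome) - (20.0 if outcome["damaged"] else 0.0)

# The agent optimizes the reward model, not the true objective.
best = max(candidate_policies, key=lambda name: reward_model(candidate_policies[name]))
print("chosen under the reward model:", best)                                   # throw_package
print("true return of that choice:", true_return(candidate_policies[best]))     # -11.0
print("true return of careful policy:", true_return(candidate_policies["careful_delivery"]))  # 5.0
```

The point is only that whichever aspect the reward model fails to capture becomes the axis along which the agent's optimization pressure finds a degenerate solution.
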
“…On the one hand, we want ML to generate creative and brilliant solutions like AlphaGo's Move 37 (Metz, 2016), a move that no human would have recommended, yet it completely turned the game in AlphaGo's favor. On the other hand, we want to avoid degenerate solutions that lead to undesired behavior, like exploiting a bug in the environment simulator (Clark & Amodei, 2016; Lehman et al., 2018). In order to differentiate between these two outcomes, our agent needs to understand its user's intentions, and robustly achieve these intentions with its behavior.…”
Section: Introduction (mentioning; confidence: 99%)

“…Motivated by the difficulty of reward specification [21], inverse reinforcement learning (IRL) methods estimate a reward function from human demonstrations [24, 1, 20, 17, 23]. The central assumption behind these methods is that human behavior is rational, i.e., optimal in expectation with respect to the human's cumulative reward.…”
Section: Introduction (mentioning; confidence: 99%)

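One common way to write that rationality assumption (the notation here is generic, not taken from the citing paper) is that the demonstrator's policy $\pi_H$ approximately maximizes expected discounted return under the unknown reward $r$:

$$
\pi_H \;\approx\; \arg\max_{\pi} \; \mathbb{E}_{\pi}\!\left[ \sum_{t=0}^{\infty} \gamma^{t}\, r(s_t, a_t) \right],
$$

and IRL inverts this relationship: given trajectories sampled from $\pi_H$, it infers a reward $r$ under which those trajectories are (near-)optimal.
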
“…EC methods were chosen because EC is arguably the most versatile of the metalearning approaches. It is a population-based search method, allowing for extensive exploration, which often results in creative, novel solutions that are not obvious at first [16]. EC has been successful in hyperparameter optimization and architecture design in particular [18, 26, 22, 17].…”
Section: Introduction (mentioning; confidence: 99%)

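For concreteness, a minimal sketch of such a population-based search loop is shown below (a generic mutation-plus-truncation-selection strategy in Python; the quadratic fitness function is a stand-in, not the metalearning objective of the citing paper):

```python
# Minimal sketch of a population-based evolutionary search loop
# (generic (mu + lambda)-style strategy; not the citing paper's method).
import random

def fitness(x):
    # Stand-in objective; in metalearning this would score hyperparameters
    # or an architecture by training and validating the corresponding model.
    return -sum(v * v for v in x)

def mutate(x, sigma=0.1):
    # Gaussian perturbation of every coordinate.
    return [v + random.gauss(0.0, sigma) for v in x]

def evolve(dim=5, pop_size=20, n_offspring=40, generations=50):
    population = [[random.uniform(-1, 1) for _ in range(dim)] for _ in range(pop_size)]
    for _ in range(generations):
        offspring = [mutate(random.choice(population)) for _ in range(n_offspring)]
        # Elitist truncation selection: keep the best of parents + offspring.
        population = sorted(population + offspring, key=fitness, reverse=True)[:pop_size]
    return population[0]

if __name__ == "__main__":
    best = evolve()
    print("best individual:", best, "fitness:", fitness(best))
```

In the metalearning setting described above, each individual would encode hyperparameters or an architecture, and the fitness evaluation would be the expensive step of training and validating the corresponding model.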