Coevolution Versus Self-Play Temporal Difference Learning for Acquiring Position Evaluation in Small-Board Go

Rúnarsson, Thomas Philip; Lucas, Simon M.

doi:10.1109/tevc.2005.856212

Cited by 59 publications

(67 citation statements)

References 28 publications

“…board size, i.e. networks trained successfully on small boards (where training is efficient) do not play well when the board is enlarged [5,3]. The present paper builds on the promising preliminary results [6,7] of a scalable approach based on Multi-dimensional Recurrent Neural Networks (MDRNNs; [8,9]) and enhances the ability of that architecture to capture long-distance dependencies.…”

Section: Introduction (mentioning)

confidence: 94%

“…In addition, despite being described by a small set of formal rules, they often involve highly complex strategies. One of the most interesting board games is the ancient game of Go (among other reasons, because computer programs are still much weaker than human players), which can be solved for small boards [1] but is very challenging for larger ones [2,3]. Its extremely large search space defies traditional search-based methods.…”

Section: Introduction (mentioning)

confidence: 99%

Scalable Neural Networks for Board Games

Schaul, Schmidhuber (2009). Artificial Neural Networks – ICANN 2009

Abstract. Learning to solve small instances of a problem should help in solving large instances. Unfortunately, most neural network architectures do not exhibit this form of scalability. Our Multi-Dimensional Recurrent LSTM Networks, however, show a high degree of scalability, as we empirically show in the domain of flexible-size board games. This allows them to be trained from scratch up to the level of human beginners, without using domain knowledge.

“…Similarly, Runarsson and Lucas [56] compare evolution and TD in small-board Go and find that TD learns much faster and in most cases achieves higher performance also. However, they find at least one setup, using coevolution, wherein evolution outperforms TD.…”

Section: Related Work (mentioning)

confidence: 92%

“…Those that do (e.g., [21,45,49,56,80]) rarely isolate the factors critical to the performance of each method. As a result, there are currently few general guidelines describing the methods' relative strengths and weaknesses.…”

Section: Introduction (mentioning)

confidence: 99%

Critical factors in the empirical performance of temporal difference and evolutionary methods for reinforcement learning

Whiteson, Taylor, Stone (2009). Auton Agent Multi-Agent Syst

Temporal difference and evolutionary methods are two of the most common approaches to solving reinforcement learning problems. However, there is little consensus on their relative merits and there have been few empirical studies that directly compare their performance. This article aims to address this shortcoming by presenting results of empirical comparisons between Sarsa and NEAT, two representative methods, in mountain car and keepaway, two benchmark reinforcement learning tasks. In each task, the methods are evaluated in combination with both linear and nonlinear representations to determine their best configurations. In addition, this article tests two specific hypotheses about the critical factors contributing to these methods' relative performance: (1) that sensor noise reduces the final performance of Sarsa more than that of NEAT, because Sarsa's learning updates are not reliable in the absence of the Markov property and (2) that stochasticity, by introducing noise in fitness estimates, reduces the learning speed of NEAT more than that of Sarsa. Experiments in variations of mountain car and keepaway designed to isolate these factors confirm both these hypotheses.

“…Fogel [1] evolved neural networks for board evaluation in chess, and Schraudolph [2] similarly optimised board evaluation functions, but for the game Go and using TD-learning; Lucas and Runarsson [3] compared both methods. Moving on to games that actually require a computer to play (computer games proper, rather than just computerised games) optimisation algorithms have been applied to many simple arcade-style games such as Pacman [4], X-pilot [5] and Cellz [6].…”

Section: Optimisation (mentioning)

confidence: 99%

Computational Intelligence in Racing Games

Togelius, Lucas, Nardi (2007). Advanced Intelligent Paradigms in Computer Games

Abstract. This chapter surveys the research of us and others into applying evolutionary algorithms and other forms of computational intelligence to various aspects of racing games. We first discuss the various roles of computational intelligence in games, and then go on to describe the evolution of different types of car controllers, modelling of players' driving styles, evolution of racing tracks, comparisons of evolution with other forms of reinforcement learning, and modelling and controlling physical cars. It is suggested that computational intelligence can be used in different but complementary ways in racing games, and that there is unrealised potential for cross-fertilisation between research in evolutionary robotics and CI for games.
