2007
DOI: 10.3233/icg-2007-30403

Computing “Elo Ratings” of Move Patterns in the Game of Go

Abstract: Move patterns are an essential method to incorporate domain knowledge into Go-playing programs. This paper presents a new Bayesian technique for supervised learning of such patterns from game records, based on a generalization of Elo ratings. Each sample move in the training data is considered as a victory of a team of pattern features. Elo ratings of individual pattern features are computed from these victories, and can be used in previously unseen positions to compute a probability distribution over legal mo…
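As a rough illustration of the model described in the abstract, the sketch below (Python, with hypothetical feature names) computes a probability distribution over candidate moves under a generalized Bradley-Terry model: each candidate move is a "team" of pattern features, the team's strength is the product of its features' gammas (exponentiated Elo ratings), and a move's probability is its strength divided by the sum over all legal moves. This is a minimal sketch of the idea, not the paper's training procedure.

```python
def move_probabilities(candidate_feature_sets, gamma):
    """Generalized Bradley-Terry over teams of features.

    Each candidate move is described by a set of pattern features; its
    team strength is the product of the features' gammas, where
    gamma = 10**(Elo / 400). Returns one probability per candidate move.
    """
    strengths = []
    for features in candidate_feature_sets:
        s = 1.0
        for f in features:
            s *= gamma.get(f, 1.0)   # unseen features default to gamma = 1
        strengths.append(s)
    total = sum(strengths)
    return [s / total for s in strengths]

# Hypothetical example: three legal moves, each described by feature ids.
gamma = {"hane": 2.5, "atari": 3.1, "dist2_edge": 0.7, "capture": 6.0}
moves = [["hane", "dist2_edge"], ["capture"], ["atari", "dist2_edge"]]
print(move_probabilities(moves, gamma))
```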

Cited by 210 publications (213 citation statements)
References 11 publications (13 reference statements)
“…This algorithm has the advantage that it is anytime: we do not have to know in advance at which value of t the algorithm will be stopped. [3] applied it successfully in the very efficient CrazyStone implementation of Monte-Carlo Tree Search [4]. Upper Confidence Tree (or Monte-Carlo Tree Search) is not as simple a setting as the one above: when applying an option, we reach a new state; one can think of Monte-Carlo Tree Search (or UCT) as having one bandit in each possible state s of the reinforcement learning problem, for choosing between (infinitely many) options o_1(s), o_2(s), …”
Section: Progressive Widening
confidence: 99%
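To make the per-state bandit view in the excerpt above concrete, here is a minimal UCB1 bandit sketch in Python; the class name and reward convention are assumptions, not taken from the cited papers. In the UCT view, one such bandit is attached to each visited state s and asked to choose among the options legal in s.

```python
import math

class StateBandit:
    """One UCB1 bandit attached to a single state s, choosing among the
    options (moves) legal in s, as in the per-state view of UCT."""

    def __init__(self, options, c=1.4):
        self.options = list(options)
        self.c = c
        self.counts = {o: 0 for o in self.options}
        self.sums = {o: 0.0 for o in self.options}
        self.total = 0

    def select(self):
        # Try every option once before applying the UCB formula.
        for o in self.options:
            if self.counts[o] == 0:
                return o
        return max(
            self.options,
            key=lambda o: self.sums[o] / self.counts[o]
            + self.c * math.sqrt(math.log(self.total) / self.counts[o]),
        )

    def update(self, option, reward):
        # reward: e.g. 1.0 for a simulated win, 0.0 for a loss.
        self.counts[option] += 1
        self.sums[option] += reward
        self.total += 1
```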
“…Progressive strategies have been proposed in [4,2] for tackling problems with large action spaces; they have been theoretically analyzed in [13], and used for continuous spaces in [11,12]. Here we will (i) define a variant of progressive widening (section 2.1), (ii) show why it cannot be directly applied in some cases (section 2.2), and (iii) define our version (section 2.3).…”
Section: Introduction
confidence: 99%
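A minimal sketch of the progressive-widening rule discussed in the excerpt above, assuming the common ceil(c·t^α) form with illustrative constants c and α; the exact variant used in the cited papers may differ.

```python
import math

def widened_option_count(t, c=1.0, alpha=0.5):
    """Progressive widening: with t visits to a node, only the
    ceil(c * t**alpha) highest-ranked options are considered."""
    return max(1, math.ceil(c * t ** alpha))

def widened_options(ranked_options, visits, c=1.0, alpha=0.5):
    # ranked_options: moves sorted by a prior (e.g. pattern Elo ratings).
    k = widened_option_count(visits, c, alpha)
    return ranked_options[:k]
```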
“…Many people have tried to improve the MC engine by increasing its level (the strength of the Monte-Carlo simulator as a standalone player), but it is shown clearly in [13,10] that this is not the right criterion: an MC engine MC_1 which plays significantly better than another MC_2 can lead to very poor results as a module in MCTS, even when the computational cost is the same. Some MC engines have been learnt from datasets [8], but the results are strongly improved by tuning the constants manually. In that sense, designing and calibrating an MC engine remains an open challenge: one has to experiment intensively with a modification in order to validate it.…”
Section: Improving Monte-Carlo (MC) Simulations
confidence: 99%
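For illustration, a bare-bones pattern-weighted playout in Python. The position interface (legal_moves, features_of, play, is_over, result) is hypothetical, and the weighting by learned gammas only sketches the general idea of a calibrated MC engine, not any specific program's move generator.

```python
import random

def playout(position, gamma, max_moves=400):
    """Monte-Carlo playout whose move choice is biased by pattern
    weights (e.g. gammas learned from game records) instead of being
    uniformly random. Returns the terminal result of the simulation."""
    for _ in range(max_moves):
        if position.is_over():
            break
        moves = position.legal_moves()
        weights = []
        for m in moves:
            w = 1.0
            for f in position.features_of(m):
                w *= gamma.get(f, 1.0)
            weights.append(w)
        move = random.choices(moves, weights=weights, k=1)[0]
        position.play(move)
    return position.result()   # e.g. 1.0 for a win, 0.0 for a loss
```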
“…It has been greatly improved by including Progressive Widening and Double Progressive Widening [6,2], RAVE values [7], Blind Values [4], and handcrafted Monte-Carlo moves [17,10]. A crucial component is the Monte-Carlo move generator, also known as the playout generator.…”
Section: Introduction
confidence: 99%