2020
DOI: 10.1109/access.2020.3023739

AlphaGo Policy Network: A DCNN Accelerator on FPGA

Abstract: The game of Go has long been regarded as the most challenging game for artificial intelligence because of its enormous search space and the difficulty of evaluating its board positions. In early 2016, the defeat of Lee Sedol by AlphaGo became the milestone of artificial intelligence. AlphaGo's success lies in that it efficiently combines policy and value networks with Monte Carlo tree search (MCTS). And these deep convolutional neural networks (DCNNs) are trained by the combination of supervised learning and r…

Cited by 14 publications
(7 citation statements)
References 28 publications
“…CNNs can also be used to design and implement other application systems by combining with other technologies. Examples of these include speech recognition software, machine translation software, game agent Q network [4], go program Alphago [5], and machine translation software. These advancements have ushered in a new era of artificial intelligence that is marked by unparalleled prosperity and wide-ranging influence.…”
Section: Introduction (mentioning)
confidence: 99%
“…The representatives of special models are NIN [21]. The representatives of reinforcement models are DQN [4] and AlphaGo [5]. In addition, 3D CNN [22], which changes the input form of the CNN, is also included to process color images, even video images.…”
Section: Introduction (mentioning)
confidence: 99%
“…In [6], a framework for integrating hardware and software optimizations for sparse CNNs is shown. It is worth adding, that [7] presented an AlphaGo Policy Network using DCNN accelerator on FPGA.…”
Section: Introduction (mentioning)
confidence: 99%
“…The basic principle of AlphaGo Zero algorithm is to combine the neural network of DL with the Monte Carlo method of RL. The specific mechanism is to train the deep convolutional neural networks (DCNN) with RL [28,29]. Different from AlphaGo Fan and AlphaGo Lee, AlphaGo Zero algorithm can enhance its performance in Go through complete independent exploration and learning without any supervision or artificial data [27].…”
Section: Introduction (mentioning)
confidence: 99%