2020
DOI: 10.3390/app10134529
A Survey of Planning and Learning in Games

Abstract: In general, games pose interesting and complex problems for the implementation of intelligent agents and are a popular domain in the study of artificial intelligence. In fact, games have been at the center of some of the most well-known achievements in artificial intelligence. From classical board games such as chess, checkers, backgammon and Go, to video games such as Dota 2 and StarCraft II, artificial intelligence research has devised computer programs that can play at the level of a human master an…

Cited by 19 publications (13 citation statements)
References 228 publications
“…The Markov property states that given the current state and action, the next state is independent of all previous states and actions. MDPs can be described formally with the following components: S denotes the state space of the process; A is the set of actions; P is the Markovian transition model, where P(S_{t+1} | S_t, A_t) is the probability of making a transition to state S_{t+1} when taking action A_t in state S_t; R represents the reward function or feedback, R_t, from the environment, by which the success or failure of an agent’s actions is measured (Duarte et al, 2020). Figure 3 depicts the interaction between the agent and the environment in an MDP.…”
Section: Methods
confidence: 99%
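The components listed in the excerpt above (states S, actions A, transition model P, reward R) can be sketched as a minimal, hypothetical two-state MDP — the particular states, probabilities, and rewards below are illustrative, not taken from the survey:

```python
import random

# Hypothetical 2-state MDP: states S = {0, 1}, actions A = {0, 1}.
# P[s][a] maps each (state, action) pair to a distribution over next states,
# i.e. P(S_{t+1} | S_t, A_t); R[s][a] is the reward for taking a in s.
P = {
    0: {0: {0: 0.9, 1: 0.1}, 1: {0: 0.2, 1: 0.8}},
    1: {0: {0: 0.5, 1: 0.5}, 1: {0: 0.0, 1: 1.0}},
}
R = {0: {0: 0.0, 1: 1.0}, 1: {0: 0.0, 1: 2.0}}

def step(state, action, rng=random.random):
    """Sample S_{t+1} from P(. | S_t, A_t) and return (next_state, reward)."""
    u, acc = rng(), 0.0
    for next_state, prob in P[state][action].items():
        acc += prob
        if u <= acc:
            return next_state, R[state][action]
    return next_state, R[state][action]  # guard against floating-point round-off
```

Because the Markov property holds, `step` needs only the current `(state, action)` pair — no history — which is exactly the agent–environment interaction loop the excerpt's Figure 3 depicts.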
“…It indicates the action A_t to be taken while in state S_t. In the simplest case, the objective of RL is to find a policy that maximizes the discounted return G_t for each state, which is the total discounted reward from time-step t [43]:

G_t = R_{t+1} + γ R_{t+2} + γ² R_{t+3} + … = Σ_{k=0}^{∞} γ^k R_{t+k+1},

where 0 ≤ γ < 1 is the discount rate that balances immediate and future rewards. Given that the discounted return is stochastic, the expected discounted return, starting from state S, taking action A, and following policy π, is given as [44]:

Q_π(S, A) = E_π[G_t | S_t = S, A_t = A],

where Q_π(S, A) is called the “action-value function” and E_π denotes the expectation operator.…”
Section: Methods
confidence: 99%
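As a concrete illustration of the discounted return defined in the excerpt above, a short sketch computes G_t over a finite episode (the reward sequence and γ below are arbitrary examples, not from the paper):

```python
def discounted_return(rewards, gamma=0.9):
    """G_t = R_{t+1} + gamma*R_{t+2} + ... for a finite episode,
    accumulated right-to-left so each step costs one multiply-add."""
    g = 0.0
    for r in reversed(rewards):
        g = r + gamma * g
    return g

# With gamma = 0.5 and rewards [1, 1, 1]: 1 + 0.5 + 0.25 = 1.75
print(discounted_return([1.0, 1.0, 1.0], gamma=0.5))  # 1.75
```

Setting γ closer to 0 makes the agent myopic (only R_{t+1} matters), while γ closer to 1 weights future rewards nearly as much as immediate ones — the balance the excerpt describes.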
“…This section presents the AlphaZero-based algorithm allowing the robot to play Russian checkers. This algorithm needs no input data for training except the game rules, which is a special feature of AlphaZero-based programs; no database of games, or of the tricks and tactics existing for the game, is required [25]. Therefore, we decided to base our algorithm on the ideas of AlphaZero.…”
Section: Algorithm For Playing Russian Checkers
confidence: 99%
“…However, in many computer games, from classical board games such as chess, checkers, backgammon and Go to video games such as Dota 2 and StarCraft II, machine learning has made great achievements [8]. Computer programs with machine learning technology can play at the level of a human master and even at a human world champion level.…”
Section: Introduction
confidence: 99%