2020
DOI: 10.1155/2020/4708075

Hybrid Online and Offline Reinforcement Learning for Tibetan Jiu Chess

Abstract: In this study, hybrid state-action-reward-state-action (SARSA(λ)) and Q-learning algorithms are applied to different stages of an upper confidence bound applied to trees (UCT) search for Tibetan Jiu chess. Q-learning is also used to update all the nodes on the search path when each game ends. A learning strategy that combines the SARSA(λ) and Q-learning algorithms with domain knowledge to form feedback functions for the layout and battle stages is proposed. An improved deep neural network based on ResNet18 is used for self-play tr…
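The abstract names two tabular updates: an on-policy SARSA(λ) update used during the search stages, and an off-policy Q-learning backup applied to every node on the search path once a game ends. The sketch below illustrates only those two update rules; it is not the authors' implementation. The state/action encoding, the `successors` helper, and the hyperparameter values are illustrative assumptions.

```python
# Minimal sketch of the two tabular updates named in the abstract.
# Not the paper's code: the state/action encoding, successors(), and the
# hyperparameters below are assumptions made for illustration.
from collections import defaultdict

ALPHA, GAMMA, LAMBDA = 0.1, 0.99, 0.8  # assumed learning rate, discount, trace decay

Q = defaultdict(float)  # Q[(state, action)] -> value estimate
E = defaultdict(float)  # eligibility traces for SARSA(lambda)

def sarsa_lambda_step(s, a, r, s_next, a_next):
    """On-policy SARSA(lambda) update after one transition (s, a, r, s', a')."""
    delta = r + GAMMA * Q[(s_next, a_next)] - Q[(s, a)]
    E[(s, a)] += 1.0  # accumulating trace for the visited pair
    for key in list(E):
        Q[key] += ALPHA * delta * E[key]
        E[key] *= GAMMA * LAMBDA  # decay every trace toward zero

def q_learning_backup(path, terminal_reward, successors):
    """Off-policy Q-learning backup over every (state, action) node on the
    search path, applied leaf-to-root once the game ends."""
    target = terminal_reward
    for s, a in reversed(path):
        Q[(s, a)] += ALPHA * (target - Q[(s, a)])
        # bootstrap the parent's target from the best action at this state
        target = GAMMA * max((Q[(s, b)] for b in successors(s)), default=0.0)

# Toy usage with string states and integer actions (purely illustrative):
sarsa_lambda_step("s0", 1, 0.0, "s1", 2)
q_learning_backup([("s0", 1), ("s1", 2)], terminal_reward=1.0,
                  successors=lambda s: [1, 2])
```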

Cited by 5 publications (1 citation statement)
References 27 publications (35 reference statements)
“…Since there is no interaction between the agent and the physical model of the ADN during training, this method can achieve physical-model-free control. However, it requires a large amount of training data, and distribution mismatch may degrade the performance of the algorithm even when sufficiently large and diverse data are given [33]. Ref.…”
Section: Introduction
Citation type: mentioning (confidence: 99%)