2023
DOI: 10.1007/s10846-023-01917-z

Smooth Q-Learning: An Algorithm for Independent Learners in Stochastic Cooperative Markov Games

Citations: cited by 2 publications (1 citation statement)
References: 21 publications
“…According to the Bellman Equation [33,34], given a search strategy π, we define Q as the search state s_t, and search action a_t and the expectation of the reward discount sum of the subsequent time steps in strategy π. The implementation of the Q-learning method is as follows: At each time step t, we observe the current search state s_t, and select and execute the search action a_t.…”
Section: Algorithm Designing | Citation type: mentioning
confidence: 99%
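
The citation statement above is cut off before the update step, so the following is only a minimal sketch of textbook tabular Q-learning. It is not the Smooth Q-Learning algorithm of the cited paper, and it is not the citing paper's implementation. The function names, the toy chain environment, and the values of alpha, gamma, and epsilon are all hypothetical choices made for illustration.

```python
import random


def q_learning(step, n_states, n_actions, episodes=500, max_steps=100,
               alpha=0.1, gamma=0.9, epsilon=0.1, seed=0):
    """Textbook tabular Q-learning (illustrative; hyperparameters are assumptions)."""
    rng = random.Random(seed)
    q = [[0.0] * n_actions for _ in range(n_states)]
    for _ in range(episodes):
        s = 0  # assumed start state
        for _ in range(max_steps):
            # Observe s_t and select a_t: epsilon-greedy with random tie-breaking.
            if rng.random() < epsilon:
                a = rng.randrange(n_actions)
            else:
                best = max(q[s])
                a = rng.choice([i for i, v in enumerate(q[s]) if v == best])
            # Execute a_t and observe the reward and the next state.
            s_next, r, done = step(s, a)
            # Move Q(s_t, a_t) toward r + gamma * max_a' Q(s_{t+1}, a').
            target = r if done else r + gamma * max(q[s_next])
            q[s][a] += alpha * (target - q[s][a])
            s = s_next
            if done:
                break
    return q


def chain_step(s, a):
    """Hypothetical 4-state chain: action 1 moves right, action 0 moves left.

    Reaching state 3 yields reward 1 and ends the episode.
    """
    s_next = min(s + 1, 3) if a == 1 else max(s - 1, 0)
    done = s_next == 3
    return s_next, (1.0 if done else 0.0), done


q_table = q_learning(chain_step, n_states=4, n_actions=2)
```

In this toy setting the learned table should prefer moving right in every non-terminal state, since that is the only way to collect the discounted reward.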