2021
DOI: 10.1109/tvt.2021.3120292

Stochastic Game Based Cooperative Alternating Q-Learning Caching in Dynamic D2D Networks

Abstract: Edge caching has become an effective solution to cope with the challenges brought by the massive content delivery in cellular networks. In device-to-device (D2D) enabled caching cellular networks with time-varying content popularity distribution and user terminal (UT) location, we model these dynamic networks as a stochastic game to design a cooperative cache placement policy. We consider the long-term cache placement reward of all UTs in this stochastic game, where each UT becomes an agent and the cache place…

Cited by 14 publications (7 citation statements)
References 44 publications (72 reference statements)
“…Literature [20] adopts the Stackelberg game to optimize UA, power allocation of non-orthogonal multiple access (NOMA), unmanned aerial vehicle (UAV) deployment and caching placement to minimize the content delivery delay. In [21], the authors improve the content caching and sharing of D2D networks by a CAQL-based caching placement algorithm. Taking into account Coordinated MultiPoint (CoMP) joint transmission technique, a reinforcement learning (RL)-based algorithm is presented in [22] to maximize the delay reduction.…”
Section: A Related Work (mentioning)
confidence: 99%
“…Nadia Abdolkhani et al. [25] propose a close to optimal low complexity heuristic cache placement policy to solve the users' equipment (UE) cache memory sizes inconsistency problem. Zhang et al. [26] model the dynamic network in a D2D caching cellular network with content popularity distribution and user terminal location time‐varying characteristics as a stochastic game to design a cooperative cache placement strategy. To solve the problem of randomness of benefits and ensure that benefits are equal for each user terminal (UT), Zhang et al. [27] propose a multiwinner once auction‐based caching (MOAC) placement algorithm to maximize the content sharing revenue of all the UTs.…”
Section: Related Work (mentioning)
confidence: 99%
“…FDC agents are used in [42], assuming an offline training phase, shared state, and common reward. Edge caching in [43,44] is also treated with an FDC algorithm, sharing a global state between agents, with the difference being that it is compressed through learning in order to minimize communication costs.…”
Section: Edge Caching (mentioning)
confidence: 99%