2018
DOI: 10.1109/mnet.2018.1800109

Deep Reinforcement Learning for Mobile Edge Caching: Review, New Features, and Open Issues

Cited by 148 publications (92 citation statements) · References 12 publications
“…Since the dimensionality of θ can be much smaller than |S||A|, the DQN is efficiently trained with few experiences and generalizes to unseen state vectors. [Footnote 1: Softmax is a function that takes as input a vector z ∈ R^F and normalizes it into a probability distribution via σ(z).] [Algorithm excerpt: take action a_n using the local policy, a_n(t, τ) = π_n(s(t−1, τ)) if t ≠ 1 and a_n(t, τ) = π_n(s(T, τ−1)) if t = 1; requests r_n(t, τ) are revealed; set s_n(t, τ) = r_n(t, τ); find ∇L_Tar(θ) for the sampled experiences using (20); update θ_{τ+1} = θ_τ − β_τ ∇L_Tar(θ); if mod(τ, C) = 0, then update θ_Tar = θ_τ; end.] Unfortunately, DQN model inaccuracy can propagate into the cost prediction error in (16), which can cause instability in (18) and lead to performance degradation, or even divergence [43], [44].…”
Section: Adaptive DQN-based Caching
confidence: 99%
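The excerpt above describes the standard DQN update with a separate target network: a gradient step θ_{τ+1} = θ_τ − β_τ ∇L_Tar(θ) on the online parameters, with the target parameters θ_Tar refreshed every C iterations. Below is a minimal PyTorch sketch of that pattern; the state/action dimensions, the replay buffer, and the cost-minimizing Bellman backup are illustrative assumptions, not the paper's exact algorithm or caching environment.

```python
# Minimal DQN update with a periodically refreshed target network,
# mirroring the gradient and target-refresh steps of the excerpted
# algorithm. All sizes and the cost convention are assumptions.
import random
from collections import deque

import torch
import torch.nn as nn

STATE_DIM, N_ACTIONS = 8, 4      # assumed sizes of s and the action set
GAMMA, BETA, C = 0.95, 1e-3, 50  # discount, step size beta_tau, target period C

q_net = nn.Sequential(nn.Linear(STATE_DIM, 64), nn.ReLU(), nn.Linear(64, N_ACTIONS))
target_net = nn.Sequential(nn.Linear(STATE_DIM, 64), nn.ReLU(), nn.Linear(64, N_ACTIONS))
target_net.load_state_dict(q_net.state_dict())           # theta_Tar <- theta
optimizer = torch.optim.SGD(q_net.parameters(), lr=BETA)
replay = deque(maxlen=10_000)  # stores (s, a, cost, s_next) tensor tuples

def train_step(tau: int, batch_size: int = 32) -> None:
    """One step: theta_{tau+1} = theta_tau - beta_tau * grad L_Tar(theta)."""
    if len(replay) < batch_size:
        return
    s, a, cost, s_next = map(torch.stack, zip(*random.sample(replay, batch_size)))
    with torch.no_grad():
        # Bootstrapped target from the frozen network theta_Tar; with a cost
        # (rather than a reward) the Bellman backup takes a minimum.
        y = cost + GAMMA * target_net(s_next).min(dim=1).values
    q = q_net(s).gather(1, a.unsqueeze(1)).squeeze(1)    # Q(s, a; theta)
    loss = nn.functional.mse_loss(q, y)                  # stands in for L_Tar(theta)
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    if tau % C == 0:                                     # if mod(tau, C) = 0 ...
        target_net.load_state_dict(q_net.state_dict())   # ... theta_Tar <- theta_tau
```

The periodic copy into target_net is what keeps the bootstrap target fixed between refreshes; the instability and divergence the excerpt cites [43], [44] are associated with letting that target move on every update.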
“…Existing works in [2], [4], [6] directly replace the BS cache with the newly fetched contents. In the content update phase, we propose to update the BS cache by taking into account both the newly fetched contents and its cache in the current time slot.…”
Section: A Flowchart
confidence: 99%
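To make the contrast concrete, here is a hedged Python sketch of the two update rules: the baseline that simply overwrites the cache with the fetched contents, and a joint update that re-selects the cache from the union of the old cache and the fetched items. The score dictionary and capacity are hypothetical stand-ins for whatever utility the citing work actually optimizes.

```python
# Two cache-update rules: direct replacement (baseline in [2], [4], [6])
# versus a joint update over (current cache U fetched contents).
# Scores and capacity are illustrative assumptions.
from typing import Dict, List

def direct_replacement(cache: List[str], fetched: List[str], capacity: int) -> List[str]:
    """Baseline: newly fetched contents evict cached items outright."""
    return (fetched + [c for c in cache if c not in fetched])[:capacity]

def joint_update(cache: List[str], fetched: List[str],
                 score: Dict[str, float], capacity: int) -> List[str]:
    """Keep the highest-scoring items among the old cache and fetched contents."""
    candidates = set(cache) | set(fetched)
    return sorted(candidates, key=lambda f: score.get(f, 0.0), reverse=True)[:capacity]
```

Under the joint rule, a still-popular cached item is not forced out by a one-off fetch, which is the intuition behind considering both sets in the update phase.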
“…Due to the complexity of the real environment, these conventional replacement policies cannot accurately capture the dynamic characteristics of content popularity [4]. Inspired by the success of reinforcement learning (RL) in solving complicated control problems [5], the works in [6], [7] relied on the strong feature representation ability of deep neural networks (DNNs) [8] and adopted model-free deep RL (DRL) to maximize the long-term system reward in mobile edge caching. In [6]–[8], the edge node fetches the missed content from the source server and replaces its local cache with the newly fetched content.…”
Section: Introduction
confidence: 99%
“…Content caching has been envisioned to improve the efficiency of wireless content delivery by placing popular files close to the users, reducing data traffic on the back-haul links. Content caching can be divided into two main classes: 1) reactive online cache refreshment [2], where the base station refreshes the cache contents on the fly, using items fetched from the cloud during its interactions with the users, and 2) proactive offline cache refreshment [3]–[7], where the cache is updated only during dedicated time intervals (e.g., off-peak periods). This paper deals with online caching.…”
Section: Introduction
confidence: 99%
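The distinction is purely about when the cache is refreshed, which a short sketch can pin down; fetch_from_cloud and the eviction rule below are hypothetical placeholders, not any cited paper's scheme.

```python
# Reactive online vs. proactive offline cache refreshment.
def fetch_from_cloud(name: str) -> str:
    """Stand-in for a back-haul fetch from the source server."""
    return f"content:{name}"

def serve_reactive(cache: set, request: str, capacity: int) -> str:
    """Reactive online: refresh the cache on the fly, upon a miss."""
    if request not in cache:
        content = fetch_from_cloud(request)  # miss -> back-haul traffic
        if len(cache) >= capacity:
            cache.pop()                      # placeholder eviction policy
        cache.add(request)
        return content
    return f"content:{request}"              # hit: served from the edge

def refresh_proactive(cache: set, predicted_popular: list, capacity: int) -> None:
    """Proactive offline: rebuild only in a dedicated (off-peak) interval."""
    cache.clear()
    cache.update(predicted_popular[:capacity])
```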
“…In [8], the authors reviewed the major families of machine learning algorithms with potential applications in edge caching. Recently, reinforcement learning has also been applied to caching problems [2], [5]–[7]. The works [5], [6], and [7] consider proactive caching, refreshing the cache during dedicated time intervals (e.g., off-peak periods).…”
Section: Introduction
confidence: 99%