“…Since the dimensionality of θ can be much smaller than |S||A|, the DQN is trained efficiently with few experiences and generalizes to unseen state vectors. Unfortunately, DQN model inaccuracy can propagate into the cost prediction error in (16), which can cause instability in (18) and lead to performance degradation, and even divergence [43], [44].

¹ Softmax is a function that takes as input a vector z ∈ R^F and normalizes it into a probability distribution via σ(z)_i = e^{z_i} / Σ_{j=1}^{F} e^{z_j}.

[Algorithm listing fragment, steps 7–9 and 22–25:]
7: Take action a_n using the local policy:
   a_n(t, τ) = π_n(s(t−1, τ)) if t ≠ 1, and a_n(t, τ) = π_n(s(T, τ−1)) if t = 1
8: Requests r_n(t, τ) are revealed
9: Set s_n(t, τ) = r_n(t, τ)
22: Find ∇L^Tar(θ) for these samples, using (20)
23: Update θ_{τ+1} = θ_τ − β_τ ∇L^Tar(θ_τ)
24: If mod(τ, C) = 0, then update θ^Tar = θ_τ
25: end
…”
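The target-network update in steps 22–25 can be sketched as follows. This is a minimal illustration, not the paper's implementation: it substitutes a linear Q-function Q(s, a) = θ_a · s for the DQN, and the replay samples, step size β, and sync period C are synthetic placeholder values chosen here for demonstration.

```python
import numpy as np

rng = np.random.default_rng(0)
S_DIM, N_ACTIONS = 4, 3
theta = rng.normal(size=(N_ACTIONS, S_DIM)) * 0.1  # online parameters theta_tau
theta_tar = theta.copy()                           # frozen target parameters theta^Tar
gamma, beta, C = 0.9, 0.05, 10                     # discount, step size beta_tau, sync period

def q_values(params, s):
    """Linear Q-values for all actions at state s (stand-in for the DQN)."""
    return params @ s

for tau in range(1, 101):
    # One sampled transition (s, a, cost, s_next) -- synthetic for this sketch.
    s = rng.normal(size=S_DIM)
    a = rng.integers(N_ACTIONS)
    cost = rng.normal()
    s_next = rng.normal(size=S_DIM)

    # TD target uses the *frozen* target parameters (cost-minimization form),
    # which is what keeps the bootstrapped target from chasing theta itself.
    target = cost + gamma * q_values(theta_tar, s_next).min()
    td_error = q_values(theta, s)[a] - target

    # Gradient of the squared TD loss w.r.t. theta[a] is td_error * s (step 23).
    theta[a] -= beta * td_error * s

    # Step 24: sync the target parameters every C iterations.
    if tau % C == 0:
        theta_tar = theta.copy()

print(np.allclose(theta_tar, theta))  # True: a sync just occurred at tau = 100
```

Freezing θ^Tar between syncs is precisely the stabilization mechanism the paragraph alludes to: without it, the bootstrapped target in step 22 moves with every gradient step, which is one source of the instability and divergence discussed in [43], [44].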