2023 · DOI: 10.1109/tmm.2021.3132156

Caching in Dynamic Environments: A Near-Optimal Online Learning Approach

Cited by 10 publications (6 citation statements) · References 45 publications
“…Furthermore, all above works assumed full knowledge of request processes and hence did not incorporate a learning component. A recent line of works considered content caching from an online learning perspective, e.g., [18], [19], and used the performance metric of learning regret or competitive ratio. Works such as [20], [21] used deep RL methods.…”
Section: Related Work
Confidence: 99%
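As background on the two metrics named in this excerpt, the following is a standard textbook formalization in generic notation; the symbols $f_t$, $x_t$, $\mathcal{X}$, and $\sigma$ are assumptions of this sketch, not notation taken from [18] or [19]:

```latex
% Static regret of an online policy \pi choosing caches x_t over horizon T,
% against the best fixed cache configuration in hindsight:
R_T(\pi) \;=\; \sum_{t=1}^{T} f_t(x_t) \;-\; \min_{x \in \mathcal{X}} \sum_{t=1}^{T} f_t(x)

% Competitive ratio: worst case over request sequences \sigma, relative to
% the offline optimal policy OPT that sees the whole sequence in advance:
\mathrm{CR}(\pi) \;=\; \sup_{\sigma} \frac{\mathrm{cost}_{\pi}(\sigma)}{\mathrm{cost}_{\mathrm{OPT}}(\sigma)}
```

Regret measures additive loss against a static comparator, while the competitive ratio measures multiplicative loss against the clairvoyant optimum; the two lines of work cited above differ mainly in which of these benchmarks they target.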
“…where $\beta^* \in \mathbb{R}$ is the minimal long-term average cost of this MDP with parameter $W \in \mathbb{R}$, and $V(s)$ is the optimal state value up to an additive constant, which depends on the parameter $W$. The Q-function can then be defined as [4]

$$Q(s,a) + \beta^* = s - (1-a)W(s) + \sum_{s'} p(s' \mid s, a)\,V(s'), \tag{19}$$

such that $V(s) = \min_{a \in \{0,1\}} Q(s,a)$.…”
Section: A. Preliminaries
Confidence: 99%
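The quoted identity is the Bellman optimality equation for an average-cost MDP. As a concrete illustration, here is a minimal relative-value-iteration sketch that computes $\beta^*$, $V$, and $Q$ for a small synthetic model. Everything specific in it is an assumption: the transition kernel, the binary action space, treating $W(s)$ as a constant $W$, the per-step cost $s - (1-a)W$, and the function name `relative_value_iteration` are illustrative, not the model from the quoted paper.

```python
import numpy as np

def relative_value_iteration(P, cost, n_iters=2000, ref_state=0):
    """Relative value iteration for a finite average-cost MDP.

    P[a][s, s2] = p(s2 | s, a); cost[s, a] = one-step cost.
    Returns (beta, V, Q) approximately satisfying, as in the quote,
        Q(s, a) + beta = cost(s, a) + sum_s2 p(s2 | s, a) V(s2),
        V(s) = min_a Q(s, a),
    with V pinned down by V(ref_state) = 0 (assumes a unichain MDP).
    """
    n_states, n_actions = cost.shape
    V = np.zeros(n_states)
    for _ in range(n_iters):
        # Bellman backup for every (state, action) pair.
        Q = cost + np.stack([P[a] @ V for a in range(n_actions)], axis=1)
        V_new = Q.min(axis=1)
        # Subtracting the reference state's value removes the additive
        # constant; the subtracted offset converges to the average cost.
        beta = V_new[ref_state]
        V = V_new - beta
    return beta, V, Q

# Tiny synthetic example: 3 states, binary actions, W = 0.5 (all assumed).
rng = np.random.default_rng(0)
W = 0.5
states = np.arange(3, dtype=float)
cost = np.stack([states - (1 - a) * W for a in (0, 1)], axis=1)
P = [rng.dirichlet(np.ones(3), size=3) for _ in (0, 1)]
beta, V, Q = relative_value_iteration(P, cost)
print(round(float(beta), 3), V.round(3))
```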
“…Our machine learning models are designed to mine the sequential patterns of how each individual user consumes content and to predict, for each user, which content she will consume next. Nearly Optimal Cache (NOC) in [51] aims to minimize the dynamic regret, i.e., the performance gap between an online learning algorithm and the best dynamic policy in hindsight. NOC has provably good worst-case performance in dynamic environments with no prior distributional assumptions, but it can degrade performance when working with friendly request patterns.…”
Section: Related Work
Confidence: 99%
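For contrast with the static regret sketched earlier, the dynamic regret this excerpt attributes to NOC is usually formalized as follows, again in generic notation assumed here rather than taken from [51]:

```latex
% Dynamic regret: the comparator may re-choose the cache every round,
% i.e., it is the best dynamic policy in hindsight.
R_T^{\mathrm{dyn}}(\pi) \;=\; \sum_{t=1}^{T} f_t(x_t) \;-\; \sum_{t=1}^{T} \min_{x \in \mathcal{X}} f_t(x)
```

Because the comparator adapts each round, dynamic regret upper-bounds static regret; guarantees against it are what give NOC its worst-case robustness in non-stationary environments.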
“…(1) LRU-2: evicts content based on the time elapsed since the previous two requests; in PEC, LRU-2 is also used to manage the reactive portion. (2) LRU: evicts content based on the time elapsed since the last request. (3) LFU: evicts content based on the request frequency over the whole history. (4) LRB: Learning Relaxed Belady, an online learning approach using the concept of the Belady boundary [32]. (5) NOC: an online-learning-based caching algorithm with a worst-case performance guarantee [51]. (6) CEC: dynamically selects reactive caching policies using reinforcement learning [54].…”
Section: Comparison With Reactive Caching
Confidence: 99%
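To make the classical baselines in this comparison concrete, here is a minimal sketch of LRU and LFU eviction. These are illustrative toy implementations under assumed class names (`LRUCache`, `LFUCache`), not the code benchmarked in the cited papers; LRU-2, LRB, NOC, and CEC are omitted for brevity.

```python
from collections import OrderedDict, defaultdict

class LRUCache:
    """Evicts the item whose most recent request is oldest."""
    def __init__(self, capacity):
        self.capacity = capacity
        self.items = OrderedDict()  # key -> value, ordered by recency

    def request(self, key, value=None):
        if key in self.items:
            self.items.move_to_end(key)     # mark as most recently used
            return self.items[key]          # cache hit
        if len(self.items) >= self.capacity:
            self.items.popitem(last=False)  # evict least recently used
        self.items[key] = value             # cache miss: insert
        return None

class LFUCache:
    """Evicts the item with the fewest requests over the whole history."""
    def __init__(self, capacity):
        self.capacity = capacity
        self.items = {}
        self.freq = defaultdict(int)        # full-history request counts

    def request(self, key, value=None):
        self.freq[key] += 1
        if key in self.items:
            return self.items[key]          # cache hit
        if len(self.items) >= self.capacity:
            # Evict the cached key with the lowest lifetime frequency.
            victim = min(self.items, key=self.freq.__getitem__)
            del self.items[victim]
        self.items[key] = value
        return None
```

Note the design difference the excerpt highlights: LRU keeps only recency state and adapts quickly to shifts in popularity, while LFU's full-history counts make it slow to evict formerly popular content.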