This letter studies a basic wireless caching network where a source server is connected to a cache-enabled base station (BS) that serves multiple requesting users. A critical problem is how to improve cache hit rate under dynamic content popularity. To solve this problem, the primary contribution of this work is to develop a novel dynamic content update strategy with the aid of deep reinforcement learning. Considering that the BS is unaware of content popularities, the proposed strategy dynamically updates the BS cache according to the time-varying requests and the BS cached contents. Towards this end, we model the problem of cache update as a Markov decision process and put forth an efficient algorithm that builds upon the long shortterm memory network and external memory to enhance the decision making ability of the BS. Simulation results show that the proposed algorithm can achieve not only a higher average reward than deep Q-network, but also a higher cache hit rate than the existing replacement policies such as the least recently used, first-in first-out, and deep Q-network based algorithms.Index Terms-Content update, Markov decision process, deep reinforcement learning, cache hit rate, long-term reward.
scite is a Brooklyn-based organization that helps researchers better discover and understand research articles through Smart Citations–citations that display the context of the citation and describe whether the article provides supporting or contrasting evidence. scite is used by students and researchers from around the world and is funded in part by the National Science Foundation and the National Institute on Drug Abuse of the National Institutes of Health.