Caching is a technique to reduce peak traffic rates by prefetching popular content into memories at the end users. This paper proposes a novel caching approach that can achieve a significantly larger reduction in peak rate compared to previously known caching schemes. In particular, the improvement can be on the order of the number of end users in the network. Conventionally, cache memories are exploited by delivering requested contents in part locally rather than through the network. The gain offered by this approach, which we term local caching gain, depends on the local cache size (i.e., the cache available at each individual user). In this paper, we introduce and exploit a second, global, caching gain, which is not utilized by conventional caching schemes. This gain depends on the aggregate global cache size (i.e., the cumulative cache available at all users), even though there is no cooperation among the caches. To evaluate and isolate these two gains, we introduce a new, information-theoretic formulation of the caching problem focusing on its basic structure. For this setting, the proposed scheme exploits both local and global caching gains, leading to a multiplicative improvement in the peak rate compared to previously known schemes. Moreover, we argue that the performance of the proposed scheme is within a constant factor of the information-theoretic optimum for all values of the problem parameters.
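To make the two gains concrete: for N files and K users, each with a cache of size M files and with KM/N an integer, the peak rate achieved by a scheme of this type factors into the two gains (this is our transcription of the well-known expression; the general statement is in the paper):

$$R(M) = \underbrace{K\left(1 - \frac{M}{N}\right)}_{\text{local caching gain only}} \cdot \underbrace{\frac{1}{1 + KM/N}}_{\text{global caching gain}}.$$

The first factor is the rate of a conventional scheme that serves each user from its own cache; the second factor improves with the aggregate cache size KM, even without cooperation among the caches.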
Replicating or caching popular content in memories distributed across the network is a technique to reduce peak network loads. Conventionally, the main performance gain of this caching was thought to result from making part of the requested data available closer to end users. Instead, we recently showed that a much more significant gain can be achieved by using caches to create coded-multicasting opportunities, even for users with different demands, through coding across data streams. These coded-multicasting opportunities are enabled by careful content overlap at the various caches in the network, created by a central coordinating server.

In many scenarios, such a central coordinating server may not be available, raising the question of whether this multicasting gain can still be achieved in a more decentralized setting. In this paper, we propose an efficient caching scheme in which the content placement is performed in a decentralized manner; in other words, no coordination is required for the content placement. Despite this lack of coordination, the proposed scheme is nevertheless able to create coded-multicasting opportunities and achieves a rate close to that of the optimal centralized scheme.

I. INTRODUCTION

Traffic in content delivery networks exhibits strong temporal variability, resulting in congestion during peak hours and resource underutilization during off-peak hours. It is therefore desirable to "shift" some of the traffic from peak to off-peak hours. One approach to achieve this is to exploit idle network resources to duplicate some of the content in memories distributed across the network. This duplication of content is called content placement or caching. The duplicated content can then be used during peak hours to reduce network congestion.

From the above description, it is apparent that the network operates in two different phases: a content placement phase and a content delivery phase. In the placement phase, the network is not congested, and the system is constrained mainly by the size of the cache memories. In the delivery phase, the network is congested, and the system is constrained mainly by the rate required to serve the content requested by the users. The goal is thus to design the placement phase such that the rate in the delivery phase is minimized.

There are two fundamentally different approaches, based on two distinct understandings of the role of caching, for how the placement and delivery phases are performed.
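A minimal sketch of what decentralized placement can look like, assuming the simplest variant in which each user independently caches a uniformly random fraction M/N of every file's bits; the function and variable names here are ours, for illustration only:

```python
import random

def decentralized_placement(num_files: int, file_bits: int,
                            cache_fraction: float, seed: int) -> dict:
    """Return {file_id: set of cached bit indices} for one user."""
    rng = random.Random(seed)               # each user draws independently
    keep = int(cache_fraction * file_bits)  # cache M/N of every file
    return {f: set(rng.sample(range(file_bits), keep))
            for f in range(num_files)}

# Two users fill their caches with no coordination whatsoever.
u1 = decentralized_placement(num_files=4, file_bits=1000,
                             cache_fraction=0.25, seed=1)
u2 = decentralized_placement(num_files=4, file_bits=1000,
                             cache_fraction=0.25, seed=2)

# The overlap still concentrates: roughly cache_fraction**2 of each
# file is cached at both users, which is what enables coded delivery.
both = len(u1[0] & u2[0]) / 1000
print(f"fraction of file 0 cached at both users: {both:.3f} (expect ~0.0625)")
```

The key point is that no central server dictates the cache contents, yet the law of large numbers guarantees predictable content overlap across caches, which the delivery phase can then exploit.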
We consider a network consisting of a file server connected through a shared link to a number of users, each equipped with a cache. Knowing the popularity distribution of the files, the goal is to populate the caches so as to minimize the expected load of the shared link. For a single cache, it is well known that storing the most popular files is optimal in this setting. However, we show here that this is no longer the case for multiple caches. Indeed, caching only the most popular files can be highly suboptimal. Instead, a fundamentally different approach is needed, in which the cache contents are used as side information for coded communication over the shared link (a toy instance is sketched below). We propose such a coded caching scheme and prove that it is close to optimal.

Urs Niesen was with Bell Labs; he is now with the Qualcomm NJ Research Center. Mohammad Ali Maddah-Ali is with Bell Labs, Nokia.
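As a minimal toy instance of cache contents acting as side information (our illustration, not code from the paper): two users, two files A and B, and a cache of half a file at each user.

```python
def xor(x: bytes, y: bytes) -> bytes:
    return bytes(a ^ b for a, b in zip(x, y))

# Files A and B, each split into halves (A1, A2) and (B1, B2).
A = bytes([0xAA] * 4) + bytes([0xBB] * 4)
B = bytes([0xCC] * 4) + bytes([0xDD] * 4)
A1, A2 = A[:4], A[4:]
B1, B2 = B[:4], B[4:]

# Placement: user 1 caches (A1, B1); user 2 caches (A2, B2).
# Delivery: user 1 requests A, user 2 requests B. One coded
# transmission of half a file serves both users at once.
coded = xor(A2, B1)

assert xor(coded, B1) == A2   # user 1 cancels B1 and recovers A2
assert xor(coded, A2) == B1   # user 2 cancels A2 and recovers B1
```

A single half-file transmission satisfies both demands, whereas caching only the more popular file would still leave a full file to be sent uncoded for the other demand.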
Caching of popular content during off-peak hours is a strategy to reduce network loads during peak hours. Recent work has shown significant benefits of designing such caching strategies not only to deliver part of the content locally, but also to provide coded multicasting opportunities even among users with different demands. Exploiting both of these gains was shown to be approximately optimal for caching systems with a single layer of caches.

Motivated by practical scenarios, we consider in this work a hierarchical content delivery network with two layers of caches. We propose a new caching scheme that combines two basic approaches. The first approach provides coded multicasting opportunities within each layer; the second approach provides coded multicasting opportunities across multiple layers. By striking the right balance between these two approaches, we show that the proposed scheme achieves the optimal communication rates to within a constant multiplicative and additive gap. We further show that there is no tension between the rates in each of the two layers up to the aforementioned gap. Thus, both layers can simultaneously operate at approximately the minimum rate.

I. INTRODUCTION

The demand for high-definition video streaming services such as YouTube and Netflix is driving the rapid growth of Internet traffic. In order to mitigate the effect of this increased load on the underlying communication infrastructure, content delivery networks deploy storage memories or caches throughout the network. These caches can be populated with some of the content during off-peak traffic hours. This cached content can then be used to reduce the network load during peak traffic hours, when users make the most requests.

Content caching has a rich history; see for example [1] and references therein. More recently, it has been studied in the context of video-on-demand systems, for which efficient content placement schemes have been proposed in [2], [3], among others. The impact of different content popularities on the performance of caching schemes has been investigated, for example, in [4]–[6]. A common feature among the caching schemes studied in the literature is that those parts of a requested file that are available at nearby caches are served locally, whereas the remaining file parts are served via orthogonal transmissions from an origin server hosting all the files.

Recently, [7], [8] proposed a new caching approach, called coded caching, that exploits cache memories not only to deliver part of the content locally, but also to create coded multicasting opportunities among users with different demands. It was shown there that the reduction in rate due to these coded multicasting opportunities is significant and can be on the order of the number of users in the network. The setting considered in [7], [8] consists of a single layer of caches between the origin server and the end users. The server communicates directly with all the caches via a shared link, and the objective is to minimize the required transmission rate ...
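Returning to the balancing idea from the abstract above, here is a schematic of the splitting step (our sketch only): each file is divided into a fraction alpha served by the within-layer approach and 1 - alpha served by the cross-layer approach, and alpha is chosen to minimize the resulting load. The callables rate_within and rate_across are placeholders standing in for the component schemes' rate expressions, which depend on the cache sizes in each layer; they are not the paper's formulas.

```python
def best_split(rate_within, rate_across, steps: int = 1000):
    """Grid-search the file split alpha in [0, 1] minimizing total rate."""
    best_rate, best_alpha = float("inf"), 0.0
    for i in range(steps + 1):
        alpha = i / steps
        # Fraction alpha of every file goes through the within-layer
        # scheme, the remaining 1 - alpha through the cross-layer scheme.
        rate = rate_within(alpha) + rate_across(1.0 - alpha)
        if rate < best_rate:
            best_rate, best_alpha = rate, alpha
    return best_alpha, best_rate

# Toy usage with made-up monotone placeholder rate functions.
alpha, rate = best_split(lambda a: 2.0 * a, lambda b: 3.0 * b)
print(f"alpha = {alpha:.2f}, combined rate = {rate:.2f}")
```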
We consider a basic content distribution scenario consisting of a single origin server connected through a shared bottleneck link to a number of users each equipped with a cache of finite memory. The users issue a sequence of content requests from a set of popular files, and the goal is to operate the caches as well as the server such that these requests are satisfied with the minimum number of bits sent over the shared link. Assuming a basic Markov model for renewing the set of popular files, we characterize approximately the optimal long-term average rate of the shared link. We further prove that the optimal online scheme has approximately the same performance as the optimal offline scheme, in which the cache contents can be updated based on the entire set of popular files before each new request. To support these theoretical results, we propose an online coded caching scheme termed coded least-recently sent (LRS) and simulate it for a demand time series derived from the dataset made available by Netflix for the Netflix Prize. For this time series, we show that the proposed coded LRS algorithm significantly outperforms the popular least-recently used (LRU) caching algorithm.
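A minimal sketch of the least-recently-sent idea, assuming a whole-file simplification of coded LRS (the actual scheme operates on file parts and combines this with coded delivery); class and method names are ours:

```python
from collections import OrderedDict

class LeastRecentlySent:
    """Evict the file least recently SENT over the shared link,
    rather than least recently USED by the local user (LRU)."""

    def __init__(self, capacity: int):
        self.capacity = capacity
        self.cache = OrderedDict()       # file_id -> cached data

    def on_sent(self, file_id, data):
        """Called at every cache whenever the server transmits file_id."""
        self.cache[file_id] = data
        self.cache.move_to_end(file_id)  # mark as most recently sent
        while len(self.cache) > self.capacity:
            self.cache.popitem(last=False)  # drop least recently sent

    def has(self, file_id) -> bool:
        return file_id in self.cache     # note: a hit does NOT update recency
```

The design choice that distinguishes this from LRU is that recency is driven by shared-link transmissions, which every cache observes, so all caches keep aligned, overlapping contents and coded-multicasting opportunities are preserved as the popular set evolves.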
We consider a basic cache network in which a single server is connected to multiple users via a shared bottleneck link. The server has a database of files (content). Each user has an isolated memory that can be used to cache content in a prefetching phase. In a subsequent delivery phase, each user requests a file from the database, and the server needs to satisfy the users' demands as efficiently as possible by taking their cache contents into account. We focus on an important and commonly used class of prefetching schemes, in which the caches are filled with uncoded data. We provide an exact characterization of the rate-memory tradeoff for this problem, deriving both the minimum average rate (for a uniform file popularity) and the minimum peak rate required on the bottleneck link for a given cache size available at each user. In particular, we propose a novel caching scheme that strictly improves the state of the art by exploiting commonality among user demands. We then demonstrate the exact optimality of the proposed scheme through a matching converse, by dividing the set of all demands into types and showing that the placement phase of the proposed caching scheme is universally optimal for all types. Using these techniques, we also fully characterize the rate-memory tradeoff for a decentralized setting, in which users fill their caches without any coordination.
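For reference, a hedged transcription of the peak-rate expression at the integer points t = KM/N as we recall it from this line of work (non-integer points follow by memory sharing; consult the paper for the authoritative statement):

```python
from math import comb

def peak_rate(K: int, N: int, t: int) -> float:
    """Peak delivery rate with uncoded prefetching: K users, N files,
    t = K*M/N an integer in {0, ..., K}."""
    distinct = min(N, K)   # worst-case number of distinct demands
    # The subtracted term is the improvement from demand commonality;
    # comb() returns 0 when K - distinct < t + 1, recovering the
    # earlier coded-caching rate when all demands are distinct.
    return (comb(K, t + 1) - comb(K - distinct, t + 1)) / comb(K, t)

# Example: K = 4 users, N = 2 files, cache size M = N*t/K per user.
for t in range(5):
    print(t, peak_rate(4, 2, t))
```

At t = 0 this gives min(N, K) (each distinct file sent once), already below the rate K of a scheme that ignores overlapping demands, which illustrates the "commonality among user demands" gain mentioned above.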