Key-value stores are a vital component in many scale-out enterprises, including social networks, online retail, and risk analysis. Accordingly, they are receiving increased attention from the research community in an effort to improve their performance, scalability, reliability, cost, and power consumption. To be effective, such efforts require a detailed understanding of realistic key-value workloads. And yet little is known about these workloads outside of the companies that operate them. This paper aims to address this gap.To this end, we have collected detailed traces from Facebook's Memcached deployment, arguably the world's largest. The traces capture over 284 billion requests from five different Memcached use cases over several days. We analyze the workloads from multiple angles, including: request composition, size, and rate; cache efficacy; temporal patterns; and application use cases. We also propose a simple model of the most representative trace to enable the generation of more realistic synthetic workloads by the community.Our analysis details many characteristics of the caching workload. It also reveals a number of surprises: a GET/SET ratio of 30:1 that is higher than assumed in the literature; some applications of Memcached behave more like persistent storage than a cache; strong locality metrics, such as keys accessed many millions of times a day, do not always suffice for a high hit rate; and there is still room for efficiency and hit rate improvements in Memcached's implementation. Toward the last point, we make several suggestions that address the exposed deficiencies.
Prefetching is an important technique for improving effective hard disk performance. A prefetcher seeks to accurately predict which data will be requested and load it ahead of the arrival of the corresponding requests. Current disk prefetch policies in major operating systems track access patterns at the level of file abstraction. While this is useful for exploiting application-level access patterns, for two reasons filelevel prefetching cannot realize the full performance improvements achievable by prefetching. First, certain prefetch opportunities can only be detected by knowing the data layout on disk, such as the contiguous layout of file metadata or data from multiple files. Second, nonsequential access of disk data (requiring disk head movement) is much slower than sequential access, and the performance penalty for mis-prefetching a randomly located block, relative to that of a sequential block, is correspondingly greater.To overcome the inherent limitations of prefetching at logical file level, we propose to perform prefetching directly at the level of disk layout, and in a portable way. Our technique, called DiskSeen, is intended to be supplementary to, and to work synergistically with, any present file-level prefetch policies. DiskSeen tracks the locations and access times of disk blocks and, based on analysis of their temporal and spatial relationships, seeks to improve the sequentiality of disk accesses and overall prefetching performance. It also implements a mechanism to minimize mis-prefetching, on a per-application basis, to mitigate the corresponding performance penalty.Our implementation of the DiskSeen scheme in the Linux 2.6 kernel shows that it can significantly improve the effectiveness of prefetching, reducing execution times by 20%-60% for microbenchmarks and real applications such as grep, CVS, and TPC-H. Even for workloads specifically designed to expose its weaknesses, DiskSeen incurs only minor performance loss.
Key-value caches, represented by Memcached, play a critical role in data centers. Its efficacy can significantly impact users' perceived service time and back-end systems' workloads. A central issue in the in-memory cache's management is memory allocation, or how the limited space is distributed for storing keyvalue items of various sizes. When a cache is full, the allocation issue is how to conduct replacement operations on items of different sizes. To effectively address the issue, a practitioner must simultaneously consider three factors, which are access locality, item size, and miss penalty. Existing designs consider only one or two of the first two factors, and pay little attention on miss penalty. This inadequacy can substantially compromise utilization of cache space and request service time.In this paper we propose a Penalty Aware Memory Allocation scheme (PAMA) that takes all three factors into account. While the three different factors cannot be directly compared to each other in a quantitative manner, PAMA uses their impacts on service time to determine where a unit of memory space should be (de)allocated. The impacts are quantified as the decrease (or increase) of service time if a unit of space is allocated (or deallocated). PAMA efficiently tracks access pattern and use of memory, and speculatively evaluates the impacts to enable penalty-aware memory allocation for KV caches. Our evaluation with real-world Memcached workload traces demonstrates that PAMA can significantly reduce request service time compared to other representative KV cache management schemes.
scite is a Brooklyn-based organization that helps researchers better discover and understand research articles through Smart Citations–citations that display the context of the citation and describe whether the article provides supporting or contrasting evidence. scite is used by students and researchers from around the world and is funded in part by the National Science Foundation and the National Institute on Drug Abuse of the National Institutes of Health.
hi@scite.ai
10624 S. Eastern Ave., Ste. A-614
Henderson, NV 89052, USA
Copyright © 2024 scite LLC. All rights reserved.
Made with 💙 for researchers
Part of the Research Solutions Family.