Log-Structured Merge Key-Value stores (LSM KVs) are designed to offer good write performance by capturing client writes in memory and only later flushing them to storage. Writes are later compacted into a tree-like data structure on disk to improve read performance and to reduce storage space usage. It has been widely documented that compactions severely hamper throughput, and various optimizations have successfully dealt with this problem. These techniques include, among others, rate-limiting flushes and compactions, selecting among compactions for maximum effect, and limiting compactions to the highest level by so-called fragmented LSMs. In this article, we focus on latencies rather than throughput. We first document the fact that LSM KVs exhibit high tail latencies. The techniques that have been proposed for optimizing throughput do not address this issue and, in some cases, even exacerbate it. The root cause of these high tail latencies is interference between client writes, flushes, and compactions. Another major cause is the heterogeneous nature of workloads in terms of operation mix and item sizes, whereby a few computationally heavy requests slow down the vast majority of smaller ones. We introduce the notion of an Input/Output (I/O) bandwidth scheduler for an LSM-based KV store to reduce the tail latency caused by the interference of flushes and compactions and by workload heterogeneity. We explore three techniques as part of this I/O scheduler: (1) opportunistically allocating more bandwidth to internal operations during periods of low load, (2) prioritizing flushes and compactions at the lower levels of the tree, and (3) separating client requests by size and by data access path. SILK+ is a new open-source LSM KV that incorporates this notion of an I/O scheduler.
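A minimal sketch of techniques (1) and (2), assuming a fixed total disk bandwidth budget and a periodic load monitor; all names here (TOTAL_BANDWIDTH_MBS, measure_client_load_mbs, set_rate_limit) are illustrative stand-ins, not SILK+'s actual API.

```python
TOTAL_BANDWIDTH_MBS = 200   # assumed total disk bandwidth budget (MB/s)
MIN_INTERNAL_MBS = 20       # floor so internal operations are never starved

def reallocate_bandwidth(measure_client_load_mbs, set_rate_limit):
    """Called periodically: shift spare bandwidth to internal operations."""
    client_mbs = measure_client_load_mbs()             # recent client I/O rate
    spare = max(TOTAL_BANDWIDTH_MBS - client_mbs, MIN_INTERNAL_MBS)
    # Technique (2): flushes and low-level compactions are prioritized,
    # because they block client writes when the memtable or level 0 fills up.
    set_rate_limit("flush", 0.6 * spare)
    set_rate_limit("compaction_l0_l1", 0.3 * spare)
    set_rate_limit("compaction_higher_levels", 0.1 * spare)  # preemptible
```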
In this paper we investigate the issue of automatically identifying the "natural" degree of parallelism of an application using software transactional memory (STM), i.e., the workload-specific multiprogramming level that maximizes the application's performance. We discuss the importance of adapting the concurrency level to the workload in two different scenarios: a shared-memory and a distributed STM infrastructure. We propose and evaluate two alternative self-tuning methodologies, explicitly tailored to the considered scenarios. For shared-memory STM, we show that lightweight, black-box approaches relying solely on on-line exploration can be extremely effective. For distributed STMs, we introduce a novel hybrid approach that combines model-driven performance forecasting techniques and on-line exploration in order to take the best of the two techniques, namely robustness despite model inaccuracies and fast convergence towards optimal solutions.
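As an illustration of the kind of lightweight, black-box on-line exploration considered for the shared-memory case, the following sketch hill-climbs over the multiprogramming level; measure_throughput is a hypothetical probe that runs the workload at a given thread count and reports its throughput, not part of any STM API.

```python
def tune_concurrency(measure_throughput, start=4, max_level=64, rounds=20):
    """Hill-climb towards the workload's 'natural' degree of parallelism."""
    level = start
    best = measure_throughput(level)
    step = 1
    for _ in range(rounds):
        candidate = min(max(level + step, 1), max_level)
        perf = measure_throughput(candidate)
        if perf > best:
            level, best = candidate, perf   # keep climbing in this direction
        else:
            step = -step                    # no gain: reverse direction
    return level
```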
Causal consistency is an attractive consistency model for replicated data stores. It is provably the strongest model that tolerates partitions, it avoids the long latencies associated with strong consistency, and, especially when using read-only transactions, it prevents many of the anomalies of weaker consistency models. Recent work has shown that causal consistency allows "latency-optimal" read-only transactions that are nonblocking, single-version, and single-round in terms of communication. On the surface, this latency optimality is very appealing, as the vast majority of applications are assumed to have read-dominated workloads. In this paper, we show that such "latency-optimal" read-only transactions induce an extra overhead on writes; the extra overhead is so high that performance is actually jeopardized, even in read-dominated workloads. We show this result from both a practical and a theoretical angle. First, we present a protocol that implements "almost latency-optimal" ROTs but does not impose on writes any of the overhead of latency-optimal protocols. In this protocol, ROTs are nonblocking and single-version, and can be configured to use either two or one and a half rounds of client-server communication. We experimentally show that this protocol not only provides better throughput, as expected, but also, surprisingly, better latencies for all but the lowest loads and most read-heavy workloads. Then, we prove that the extra overhead imposed on writes by latency-optimal read-only transactions is inherent, i.e., it is not an artifact of the design we consider and cannot be avoided by any implementation of latency-optimal read-only transactions. We show in particular that this overhead grows linearly with the number of clients.
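To make the two-round structure concrete, here is a minimal sketch of a generic two-round read-only transaction under causal consistency, in the spirit of (but not identical to) the protocol family discussed above; stable_ts and read_at are hypothetical per-partition RPCs, and partition_for assumes hash partitioning.

```python
def read_only_transaction(partitions, keys):
    # Round 1: choose a snapshot no fresher than any partition's stable
    # point, so that every read in round 2 can be served without blocking.
    snapshot = min(p.stable_ts() for p in partitions)
    # Round 2: nonblocking, single-version reads at the chosen snapshot.
    return {k: partition_for(k, partitions).read_at(k, snapshot)
            for k in keys}

def partition_for(key, partitions):
    # Hypothetical hash partitioning of the key space across servers.
    return partitions[hash(key) % len(partitions)]
```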
Classical approaches to performance prediction rely on two, typically antithetic, techniques: Machine Learning (ML) and Analytical Modeling (AM). ML takes a black-box approach whose accuracy strongly depends on the representativeness of the dataset used during the initial training phase. Specifically, it can achieve very good accuracy in areas of the feature space that have been sufficiently explored during the training process. Conversely, AM techniques require no or minimal training, hence exhibiting the potential for supporting prompt instantiation of the performance model of the target system. However, in order to ensure their tractability, they typically rely on a set of simplifying assumptions, and consequently AM's accuracy can be seriously challenged in scenarios (e.g., workload conditions) in which such assumptions do not hold. In this paper we explore several hybrid/gray-box techniques that exploit AM and ML in synergy in order to get the best of both worlds. We evaluate the proposed techniques in case studies targeting two complex and widely adopted middleware systems: a NoSQL distributed key-value store and a Total Order Broadcast (TOB) service.
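One well-known instance of such a hybrid/gray-box approach, shown as a hedged sketch below, trains an ML regressor on the residual error of the analytical model: AM supplies the global trend, while ML corrects it in the regions where AM's simplifying assumptions break down. analytical_predict is a stand-in for any white-box model of the target system.

```python
from sklearn.ensemble import RandomForestRegressor

def fit_hybrid(X_train, y_train, analytical_predict):
    """Train an ML corrector on the analytical model's residual errors."""
    residuals = y_train - analytical_predict(X_train)
    corrector = RandomForestRegressor(n_estimators=100)
    corrector.fit(X_train, residuals)
    # The hybrid prediction is AM's estimate plus the learned correction.
    return lambda X: analytical_predict(X) + corrector.predict(X)
```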
Transactional Causal Consistency (TCC) extends causal consistency, the strongest consistency model compatible with availability, with interactive read-write transactions, and is therefore particularly appealing for geo-replicated platforms. This paper presents Wren, the first TCC system that at the same time i) implements nonblocking read operations, thereby achieving low latency, and ii) allows an application to efficiently scale out within a replication site by sharding. Wren introduces new protocols for transaction execution, dependency tracking and stabilization. The transaction protocol supports nonblocking reads by providing a transaction with a snapshot that is the union of a fresh causal snapshot S installed by every partition in the local data center and a client-side cache for writes that are not yet included in S. The dependency tracking and stabilization protocols require only two scalar timestamps, resulting in efficient resource utilization and providing scalability in terms of replication sites. In return for these benefits, Wren slightly increases the visibility latency of updates. We evaluate Wren on an AWS deployment using up to 5 replication sites and 16 partitions per site. We show that Wren delivers up to 1.4x higher throughput and up to 3.6x lower latency when compared to the state-of-the-art design. The choice of an older snapshot increases local update visibility latency by a few milliseconds. The use of only two timestamps to track causality increases remote update visibility latency by less than 15%.
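A minimal sketch of the read path implied by the description above: a read first consults the client-side cache of the client's own writes not yet included in the installed snapshot S, and only then reads from the partition's snapshot. The names (client_cache, snapshot_read) are illustrative, not Wren's actual interfaces.

```python
def tcc_read(key, client_cache, snapshot_ts, snapshot_read):
    # The client's own writes not yet included in S take precedence,
    # preserving read-your-writes without blocking on stabilization.
    if key in client_cache:
        return client_cache[key]
    # Otherwise, read nonblockingly from the causal snapshot S, which every
    # partition in the local data center has already installed.
    return snapshot_read(key, snapshot_ts)
```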
This paper presents PROMPT, a PeRfOrmance Model for Partially replicated in-memory Transactional cloud stores. PROMPT combines white-box Analytical Modelling and Machine Learning techniques, with the goal of achieving the best of the two methodologies: low training times, high extrapolation power, and portability across heterogeneous cloud infrastructures. We validate PROMPT via an extensive experimental study based on a popular open-source transactional in-memory data store (Red Hat's Infinispan), industry-standard benchmarks, and deployments on both public and private cloud infrastructures.
Cloud computing represents a cost-effective paradigm to deploy a wide class of large-scale distributed applications, for which the pay-per-use model combined with automatic resource provisioning promises to reduce the cost of dependability and scalability. However, a key challenge to be addressed in order to materialize the advantages promised by Cloud computing is the design of effective auto-scaling and self-tuning mechanisms capable of ensuring predetermined QoS levels at minimum cost in the face of changing workload conditions. This is one of the key goals being pursued by the Cloud-TM project, a recent EU project that is developing a novel, self-optimizing transactional data platform for the cloud. In this paper we present the key design choices underlying the development of Cloud-TM's Workload Analyzer (WA), a crucial component of the Cloud-TM platform that is in charge of three key functionalities: aggregating, filtering and correlating the streams of statistical data gathered from the various nodes of the Cloud-TM platform; building detailed workload profiles of applications deployed on the Cloud-TM platform, characterizing their present and future demands in terms of both logical (i.e., data) and physical (e.g., hardware-related) resources; and triggering alerts in the presence of violations (or risks of future violations) of predetermined SLAs.

I. INTRODUCTION

The Cloud Computing paradigm is profoundly changing both the architectures of IT systems and the organization of enterprise IT infrastructure management. Architectures of distributed computing platforms are moving from a traditional static model, where the amount of resources allocated to applications and services is estimated a priori, towards an elastic model, where resources can be provisioned on demand. The flexibility of Cloud computing's pay-per-use model is driving many enterprises to move their IT infrastructure and services to the "cloud". However, this new paradigm opens up new challenges. One of these is the design of effective auto-scaling and self-tuning mechanisms capable of ensuring predetermined QoS levels at minimum cost in the face of changing workload conditions. In fact, on one hand, an elastic platform provides a very cost-effective model to improve system performance and dependability by means of data and service replication. On the other hand, in order to take advantage of the elasticity of the underlying infrastructure, both data and service replication management need to be automated by means of mechanisms able to guarantee the desired Quality of Service (QoS) levels while minimizing the operational cost of the infrastructure. In such a context,
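As a rough illustration of the WA's aggregation and alerting functionalities described above, the following hedged sketch aggregates per-node latency samples into a platform-wide moving average and raises an alert when a predetermined SLA threshold is violated; all names (collect_node_metrics, raise_alert, SLA_LATENCY_MS) are hypothetical, not the WA's actual interfaces.

```python
import time
from collections import deque
from statistics import mean

SLA_LATENCY_MS = 50      # assumed SLA threshold
WINDOW = 30              # samples kept in the moving average

def monitor(collect_node_metrics, raise_alert):
    window = deque(maxlen=WINDOW)
    while True:
        samples = collect_node_metrics()      # one latency sample per node
        window.append(mean(samples))          # aggregate across the platform
        if len(window) == WINDOW and mean(window) > SLA_LATENCY_MS:
            raise_alert(f"SLA violation: avg latency {mean(window):.1f} ms")
        time.sleep(1)                         # assumed sampling period
```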