Rollerchain: a DHT for Efficient Replication
João Paiva, João Leitão and Luís Rodrigues
joao.paiva@ist.utl.pt, jc.leitao@fct.unl.pt, ler@ist.utl.pt
INESC-ID, Instituto Superior Técnico, Universidade Técnica de Lisboa
Abstract. In this paper we present Rollerchain, a novel Distributed Hash Table that offers efficient data storage through the combination of gossip-based and structured overlay networks. The unstructured component maintains clusters of fully connected nodes, where each cluster acts as a virtual node in the structured component. This architecture simplifies the management of data replication and balances the load among nodes in the system. We have implemented a prototype of Rollerchain that we have used to experimentally validate its performance against other state-of-the-art solutions.
This article addresses the problem of self-tuning the data placement in replicated key-value stores. The goal is to automatically optimize replica placement in a way that leverages locality patterns in data accesses, such that internode communication is minimized. To do this efficiently is extremely challenging, as one needs not only to find lightweight and scalable ways to identify the right assignment of data replicas to nodes but also to preserve fast data lookup. The article introduces new techniques that address these challenges. The first challenge is addressed by optimizing, in a decentralized way, the placement of the objects generating the largest number of remote operations for each node. The second challenge is addressed by combining the usage of consistent hashing with a novel data structure, which provides efficient probabilistic data placement. These techniques have been integrated in a popular open-source key-value store. The performance results show that the throughput of the optimized system can be six times better than a baseline system employing the widely used static placement based on consistent hashing.
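For readers unfamiliar with the baseline mentioned above, static placement based on consistent hashing can be sketched roughly as follows. This is only an illustration of the general technique, not the article's optimized placement scheme or its novel data structure. The class name `ConsistentHashRing`, the `vnodes` and `replicas` parameters, and the use of MD5 as the ring hash are all assumptions made for the example.

```python
import bisect
import hashlib


def _hash(key: str) -> int:
    """Map a string to a point on the ring (128-bit integer derived from MD5)."""
    return int(hashlib.md5(key.encode("utf-8")).hexdigest(), 16)


class ConsistentHashRing:
    """Hypothetical static placement: each key is stored on the first
    `replicas` distinct nodes found clockwise from the key's ring position."""

    def __init__(self, nodes, vnodes=64, replicas=2):
        self.replicas = replicas
        # Each physical node is placed at `vnodes` points to smooth the load.
        self._ring = sorted(
            (_hash(f"{node}#{i}"), node) for node in nodes for i in range(vnodes)
        )
        self._points = [point for point, _ in self._ring]
        self._num_nodes = len(set(nodes))

    def lookup(self, key):
        """Return the distinct nodes holding replicas of `key`, in ring order."""
        want = min(self.replicas, self._num_nodes)
        if want == 0:
            return []
        # First ring point strictly after the key's position, wrapping around.
        start = bisect.bisect_right(self._points, _hash(key))
        owners = []
        for step in range(len(self._ring)):
            node = self._ring[(start + step) % len(self._ring)][1]
            if node not in owners:
                owners.append(node)
                if len(owners) == want:
                    break
        return owners


if __name__ == "__main__":
    ring = ConsistentHashRing(["n1", "n2", "n3"])
    print(ring.lookup("user:42"))
```

Because the owner of a key depends only on the key's hash and the set of nodes, any node can locate a replica locally without a directory. The price is that placement ignores access locality, which is the limitation the article sets out to address.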
Hyperspace hashing is a recent multi-dimensional indexing technique for distributed key-value stores that aims at supporting efficient queries using multiple objects' attributes. However, the advantage of supporting complex queries comes at the cost of a complex configuration. In this paper we address the problem of automating the configuration of this innovative distributed indexing mechanism. We first show that a misconfiguration may significantly affect the performance of the system. We then derive a performance model that provides key insights on the behaviour of hyperspace hashing. Based on this model, we derive a technique to automatically and dynamically select the best configuration.
Data placement refers to the problem of deciding how to assign data items to nodes in a distributed system so as to optimize one or more performance criteria, such as reducing network congestion or improving load balancing. This document reports on our experience addressing this problem in distributed systems of different scales, namely medium-sized datacenter-scale systems and internet-scale systems.