This paper considers the communication and storage costs of emulating atomic (linearizable) multi-writer multi-reader shared memory in distributed message-passing systems. The paper contains three main contributions: (1) We present an atomic shared-memory emulation algorithm that we call Coded Atomic Storage (CAS). This algorithm uses erasure coding methods. In a storage system with N servers that is resilient to f server failures, we show that the communication cost of CAS is N/(N − 2f). The storage cost of CAS is unbounded. (2) We present a modification of the CAS algorithm known as CAS with garbage collection (CASGC). The CASGC algorithm is parameterized by an integer δ and has a bounded storage cost. We show that the CASGC algorithm satisfies atomicity. In every execution of CASGC where the number of server failures is no bigger than f, we show that every write operation invoked at a non-failing client terminates. We also show that in an execution of CASGC with parameter δ where the number of server failures is no bigger than f, a read operation terminates provided that the number of write operations that are concurrent with the read is no bigger than δ. We explicitly characterize the storage cost of CASGC, and show that it has the same communication cost as CAS. (3) We describe an algorithm known as the Communication Cost Optimal Atomic Storage (CCOAS) algorithm that achieves a smaller communication cost than CAS and CASGC. In particular, CCOAS incurs read and write communication costs of N/(N − f), measured in terms of number of object values. We also discuss drawbacks of CCOAS as compared with CAS and CASGC.
Keywords: Shared memory emulation · Erasure coding · Multi-writer multi-reader atomic register · Concurrent read and write operations · Storage efficiency
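As a rough illustration of the two cost expressions quoted in the abstract, the following sketch evaluates N/(N − 2f) for CAS and CASGC and N/(N − f) for CCOAS, both measured in units of one object value. The function names are hypothetical and chosen only for this example; the formulas themselves are taken from the abstract.

```python
from fractions import Fraction


def cas_comm_cost(n: int, f: int) -> Fraction:
    """Communication cost N/(N - 2f) stated for CAS (and CASGC).

    Function name is hypothetical; only the formula comes from the abstract.
    """
    if f < 0 or n <= 2 * f:
        raise ValueError("need 0 <= f and N > 2f")
    return Fraction(n, n - 2 * f)


def ccoas_comm_cost(n: int, f: int) -> Fraction:
    """Communication cost N/(N - f) stated for CCOAS."""
    if f < 0 or n <= f:
        raise ValueError("need 0 <= f and N > f")
    return Fraction(n, n - f)


# Example: N = 5 servers, f = 1 tolerated failure.
# CAS/CASGC cost is 5/3 object values; CCOAS cost is 5/4 object values.
example_cas = cas_comm_cost(5, 1)
example_ccoas = ccoas_comm_cost(5, 1)
```

For any N > 2f with f ≥ 1, the CCOAS expression is strictly smaller than the CAS expression, which matches the abstract's claim that CCOAS achieves a smaller communication cost.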
Shareable data services providing consistency guarantees, such as atomicity (linearizability), make building distributed systems easier. However, combining linearizability with efficiency in practical algorithms is difficult. A reconfigurable linearizable data service, called RAMBO, was developed by Lynch and Shvartsman. This service guarantees consistency under dynamic conditions involving asynchrony, message loss, node crashes, and new node arrivals. The specification of the original algorithm is given at an abstract level aimed at concise presentation and formal reasoning about correctness. The algorithm propagates information by means of gossip messages. If the service is in use for a long time, the size and the number of gossip messages may grow without bound. This paper presents a consistent data service for long-lived objects that improves on RAMBO in two ways: it includes an incremental communication protocol and a leave service. The new protocol takes advantage of the local knowledge, and carefully manages the size of messages by removing redundant information, while the leave service allows the nodes to leave the system gracefully. The new algorithm is formally proved correct by forward simulation using levels of abstraction. An experimental implementation of the system was developed for networks-of-workstations. The paper also includes analytical and preliminary empirical results that illustrate the advantages of the new algorithm.
This paper considers the communication and storage costs of emulating atomic (linearizable) multi-writer multi-reader shared memory in distributed message-passing systems. The paper contains two main contributions: (1) We present an atomic shared-memory emulation algorithm that we call Coded Atomic Storage (CAS). This algorithm uses erasure coding methods. In a storage system with N servers that is resilient to f server failures, we show that the communication cost of CAS is N/(N − 2f). The storage cost of CAS is unbounded. (2) We present a variant of CAS known as CAS with Garbage Collection (CASGC). The CASGC algorithm is parametrized by an integer δ and has a bounded storage cost. We show that in every execution where the number of write operations that are concurrent with a read operation is no bigger than δ, the CASGC algorithm with parameter δ satisfies atomicity and liveness. We explicitly characterize the storage cost of CASGC, and show that it has the same communication cost as CAS.
... 2 server nodes fail. Since the read and write protocols require multiple communication phases where entire replicas are sent, the ABD algorithm has a high communication cost.
The main goal of our paper is to develop shared memory emulation algorithms, based on the idea of erasure coding, that are efficient in terms of communication and storage costs. Erasure coding is a generalization of replication that is well known in the context of classical storage systems [17], [20], ...
... Dolev [5] allows only a single node to act as a writer. Also, it does not distinguish between client and server nodes as we do in our paper.
... 2 and regardless of the number of client failures. We also show in Lemma 6 that CAS ensures atomicity regardless of the number of (client or server) failures. In Theorem 2 in Section IV, we also analyze the communication cost of CAS.
Specifically, in a storage system with N servers that is resilient to f server node failures, we show that the communication costs of CAS are equal to N/(N − 2f). We note that these communication costs of CAS are smaller than those of replication-based schemes (see the extended version of this paper [7]). The storage cost of CAS, however, is unbounded because each server stores the value associated with every version of the data object it receives. In comparison, in ABD, which is based on replication, the storage cost is bounded because each node stores only the latest version of the data object (see [7]).
The CASGC algorithm: In Section V, we present a variant of CAS called the CAS with Garbage Collection (CASGC) algorithm, which achieves a bounded storage cost by garbage collection, i.e., discarding values associated with sufficiently old versions. CASGC is parametrized by an integer δ which, informally speaking, controls the number of tuples that each server stores. We show that CASGC satisfies atomicity in ...
ii. We only provide brief sketches of the proofs of our results here. Full proofs of our theorems can be found in the extended version of this paper [7].
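The garbage-collection idea described above, discarding values associated with sufficiently old versions so that per-server storage stays bounded, can be sketched as a toy server-side store. This is not the CASGC protocol: the class name, the use of comparable tuples as version tags, and the rule "keep values for the δ + 1 highest tags" are all assumptions made for illustration, and the sketch omits the protocol's phases, quorums, and erasure coding.

```python
class BoundedVersionStore:
    """Toy per-server store (hypothetical, NOT the CASGC protocol).

    Assumption: a version is 'sufficiently old' once it falls outside
    the delta + 1 highest tags seen so far. Its value is then dropped,
    while the tag itself is remembered as metadata.
    """

    def __init__(self, delta: int):
        if delta < 0:
            raise ValueError("delta must be non-negative")
        self.delta = delta
        self.values = {}  # tag -> stored value, or None once collected

    def put(self, tag, value):
        """Record a value for a tag, then garbage-collect old values."""
        self.values[tag] = value
        self._collect()

    def _collect(self):
        # Tags are assumed comparable, e.g. (counter, writer_id) tuples.
        newest = sorted(self.values, reverse=True)[: self.delta + 1]
        keep = set(newest)
        for tag in self.values:
            if tag not in keep:
                self.values[tag] = None  # discard value, keep the tag

    def get(self, tag):
        """Return the stored value, or None if unknown or collected."""
        return self.values.get(tag)

    def stored_count(self) -> int:
        """Number of values currently held (never more than delta + 1)."""
        return sum(v is not None for v in self.values.values())


store = BoundedVersionStore(delta=1)
store.put((1, "w1"), b"v1")
store.put((2, "w1"), b"v2")
store.put((3, "w2"), b"v3")
```

With δ = 1 the store above holds values for only the two highest tags after the third write, which mirrors the informal statement that δ controls how many tuples each server stores. The contrast with CAS is that a store without `_collect` would retain every version it ever received.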
Transforming abstract algorithm specifications into executable code is an error-prone process in the absence of sophisticated compilers that can automatically translate such specifications into the target distributed system. This paper presents a framework that was developed for translating algorithms specified as Input/Output Automata (IOA) to distributed programs. The framework consists of a methodology that guides the software development process and a core set of functions needed in target implementations that reduce unnecessary software development. As a proof of concept, this work also presents a distributed implementation of a reconfigurable atomic memory service for dynamic networks. The service emulates atomic read/write shared objects in the dynamic setting where processors can arbitrarily crash, or join and leave the computation. The algorithm implementing the service is given in terms of IOA. The system is implemented in Java and runs on a network of workstations. Empirical data illustrates the behavior of the system.
This paper considers quorum-replicated, multi-writer, multi-reader (MWMR) implementations of survivable atomic registers in a distributed message-passing system with processors prone to failures. Previous implementations in such settings invariably required two rounds of communication between readers/writers and replica owners. Hence the question arises whether it is possible to have single-round read and/or write operations in this setting. As a first step, we present an algorithm, called CWFR, that allows the classic two-round write operations, while supporting single-round read operations. Since multiple write operations may be concurrent with a read operation, this algorithm involves an iterative (local) discovery of the latest completed write operation. This algorithm precipitates the question of whether fast (single-round) writes may co-exist with fast reads. We thus devise a second algorithm, called SFW, that exploits a new technique called server side ordering (SSO), which, unlike previous approaches, places partial responsibility for the ordering of write operations on the replica owners (the servers). With SSO, fast write operations are introduced for the very first time in the MWMR setting. While this is possible, we show that under certain conditions the MWMR model imposes inherent limitations on any quorum-based fast write implementation of a safe read/write register and potentially even restricts the number of writer participants in the system. In this case our second algorithm achieves near optimal efficiency. Both algorithms are proved to preserve atomicity in all permissible executions.