We determine the weakest failure detectors to solve several fundamental problems in distributed message-passing systems, for all environments, i.e., regardless of the number and timing of crashes. The problems that we consider are: implementing an atomic register, solving consensus, solving quittable consensus (a variant of consensus in which processes have the option to decide 'quit' if a failure occurs), and solving non-blocking atomic commit.
Abstract. In the set-agreement problem, n processes seek to agree on at most n−1 different values. This paper determines the weakest failure detector to solve this problem in a message-passing system where processes may fail by crashing. This failure detector, called the Loneliness detector and denoted L, outputs one of two values, "true" or "false", such that: (1) there is at least one process where L always outputs "false", and (2) if only one process is correct, L eventually outputs "true" at this process.
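The two properties of L can be made concrete with a small checker. The following is only a sketch under stated assumptions: the function name `satisfies_loneliness`, the trace representation (a dictionary from process name to the list of values L returned there), and the process names are all hypothetical and not taken from the paper. Because a finite trace cannot capture "eventually", property (2) is approximated by requiring that "true" appears at least once in the recorded prefix.

```python
from typing import Dict, List, Set


def satisfies_loneliness(outputs: Dict[str, List[bool]], correct: Set[str]) -> bool:
    """Check properties (1) and (2) of L on a finite prefix of a run.

    outputs: hypothetical trace, mapping each process to the values L
             returned at that process, in order.
    correct: the set of processes that never crash in this run.
    """
    # (1) At least one process where L never outputs True.
    some_always_false = any(not any(seq) for seq in outputs.values())

    # (2) If exactly one process is correct, L outputs True there at some
    #     point (finite-prefix approximation of "eventually").
    lonely_ok = True
    if len(correct) == 1:
        (p,) = correct
        lonely_ok = any(outputs[p])

    return some_always_false and lonely_ok


# p1 crashes, p2 is the only correct process and is eventually told so.
trace = {"p1": [False, False, False], "p2": [False, True, True]}
print(satisfies_loneliness(trace, correct={"p2"}))
```

Note that property (1) does not require the always-"false" process to be correct, which is why the crashed process p1 is enough to satisfy it in the usage example.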
In the population protocol model introduced by Angluin et al. [2], a collection of agents, which are modelled by finite state machines, move around unpredictably and have pairwise interactions. The ability of such systems to compute functions on a multiset of inputs that are initially distributed across all of the agents has been studied in the absence of failures. Here, we show that essentially the same set of functions can be computed in the presence of halting and transient failures, provided preconditions on the inputs are added so that the failures cannot immediately obscure enough of the inputs to change the outcome. We do this by giving a general-purpose transformation that makes any algorithm for the fault-free setting tolerant to failures.
We study the feasibility and cost of implementing Ω, a fundamental failure detector at the core of many algorithms, in systems with weak reliability and synchrony assumptions. Intuitively, Ω allows processes to eventually elect a common leader. We first give an algorithm that implements Ω in a weak system S where (a) except for some unknown timely process s, all processes may be arbitrarily slow or may crash, and (b) only the output links of s are eventually timely (all other links can be arbitrarily slow and lossy). Previously known algorithms for Ω worked only in systems that are strictly stronger than S in terms of reliability or synchrony assumptions. We next show that algorithms that implement Ω in system S are necessarily expensive in terms of communication complexity: all correct processes (except possibly one) must send messages forever; moreover, a quadratic number of links must carry messages forever. This result holds even for algorithms that tolerate at most one crash. Finally, we show that with a small additional assumption to system S (the existence of some unknown correct process whose links can be arbitrarily slow and lossy but fair), there is a communication-efficient algorithm for Ω such that eventually only one process (the elected leader) sends messages. Some recent experimental results indicate that two of the algorithms for Ω described in this paper can be used in dynamically-changing systems and work well in practice [Schiper, Toueg in
This paper was originally invited to the special issue of Distributed Computing based on selected papers presented at the 22nd ACM Symposium on Principles of Distributed Computing (PODC 2003). It appears separately due to publication delays.
Abstract. At the heart of distributed computing lies the fundamental result that the level of agreement that can be obtained in an asynchronous shared memory model where t processes can crash is exactly t + 1. In other words, an adversary that can crash any subset of size at most t can prevent the processes from agreeing on t values. But what about the remaining ($2^{2^n} - n$) adversaries that might crash certain combinations of processes and not others? This paper presents a precise way to characterize such adversaries by introducing the notion of disagreement power: the biggest integer k for which the adversary can prevent processes from agreeing on k values. We show how to compute the disagreement power of an adversary and how this notion enables us to derive n equivalence classes of adversaries.
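The count of "remaining" adversaries can be reproduced with a short calculation. This is a sketch under assumptions that the abstract does not spell out: an adversary is modelled here as a set of subsets of the n processes (the sets it may crash), and the n adversaries set aside are taken to be the uniform ones that can crash any subset of size at most t, for t = 0, ..., n−1. The function names are hypothetical.

```python
from itertools import combinations
from typing import FrozenSet


def t_resilient_adversary(n: int, t: int) -> FrozenSet[FrozenSet[int]]:
    """All subsets of {0, ..., n-1} of size at most t (assumed model)."""
    procs = range(n)
    return frozenset(
        frozenset(c) for k in range(t + 1) for c in combinations(procs, k)
    )


def count_remaining_adversaries(n: int) -> int:
    """Total adversaries minus the uniform ones, under the assumed model."""
    total = 2 ** (2 ** n)  # every set of subsets of n processes
    uniform = len({t_resilient_adversary(n, t) for t in range(n)})
    return total - uniform


print(count_remaining_adversaries(3))
```

For n = 3 there are 2^8 = 256 sets of subsets, three of which are uniform (t = 0, 1, 2), leaving 253 non-uniform adversaries under this model.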