Abstract: Total order broadcast and multicast (also called atomic broadcast/multicast) present an important problem in distributed systems, especially with respect to fault-tolerance. In short, the primitive ensures that messages sent to a set of processes are, in turn, delivered by all those processes in the same total order. The problem has inspired an abundance of literature, with a plethora of proposed algorithms. This article proposes a classification of total order broadcast and multicast algorithms based on their…
“…This method is inspired by the sequencer based atomic broadcast as explained in Xavier et al [13]. In this method, a micro-kernel is elected to be the single sequencer of the system.…”
Section: Deadlock Avoidance By Sequencer Based Atomic Broadcast
As today's manycore processors already feature over 64 cores and as tomorrow's are slated to contain 1000s, it is important to design operating system techniques that can efficiently cope with this scale of resource coordination. The current state-of-the-art in manycore processor architectures has evolved from traditional bus-based architectures over rings to mesh-based Network-on-Chip (NoC) interconnects. This implies an increasing potential for scalable message passing. However, contemporary operating systems heavily rely on single system images with shared memory constructs that may not scale well to large core counts. To address these challenges, we devise a distributed, message-passing-only system comprised of so-called "pico-kernels" per core. They are controlled by dedicated "micro-kernels" topologically centered within a set of cores that cooperatively comprise the overall operating system in a peer-to-peer fashion. Such a system promotes rethinking and redesigning of various operating system services focusing on scalability as the primary design constraint. We consider the challenges of distributed allocation of jobs, each comprised of a set of tasks to be mapped to disjoint cores. A naive solution performing fragmented allocations may quickly escalate to deadlocks, where jobs hold and wait for cores in circular dependencies. To tackle these challenges, we propose a deadlock-free distributed job allocation protocol. We have devised two policies for avoiding deadlocks, namely active cancellation and sequencer-based atomic broadcast. The protocol and the two policies have been implemented and evaluated on a Tilera TilePro64 processor with 64 cores on a single socket. Results show that, for sparse job allocations, active cancellation incurs less job allocation overhead, while for denser job allocations the sequencer-based atomic broadcast incurs less overhead.
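The sequencer idea mentioned above can be illustrated with a minimal sketch. It is not the protocol from the cited paper: the class names `Sequencer` and `Receiver`, the in-memory message passing, and the absence of failures are all assumptions made for illustration. A single sequencer stamps every message with a global sequence number, and each receiver delivers strictly in stamp order, so all receivers end up with the same total order even when the network reorders messages.

```python
from typing import Dict, List, Tuple

Stamped = Tuple[int, str]  # (global sequence number, payload)


class Sequencer:
    """Hypothetical single sequencer: assigns the next global number to each message."""

    def __init__(self) -> None:
        self.next_seq = 0

    def order(self, payload: str) -> Stamped:
        stamped = (self.next_seq, payload)
        self.next_seq += 1
        return stamped


class Receiver:
    """Hypothetical receiver: buffers out-of-order messages, delivers in stamp order."""

    def __init__(self) -> None:
        self.expected = 0                   # next sequence number to deliver
        self.pending: Dict[int, str] = {}   # messages that arrived early
        self.delivered: List[str] = []

    def receive(self, stamped: Stamped) -> None:
        seq, payload = stamped
        self.pending[seq] = payload
        # Deliver every consecutive message that is now available.
        while self.expected in self.pending:
            self.delivered.append(self.pending.pop(self.expected))
            self.expected += 1


def demo() -> List[List[str]]:
    sequencer = Sequencer()
    stamped = [sequencer.order(p) for p in ("job A", "job B", "job C")]
    r1, r2 = Receiver(), Receiver()
    for m in stamped:             # r1 receives the messages in sending order
        r1.receive(m)
    for m in reversed(stamped):   # r2 receives them in reverse order
        r2.receive(m)
    return [r1.delivered, r2.delivered]
```

In this sketch both receivers deliver the same sequence regardless of arrival order. The sequencer is a single point of contention and failure, which a real design would have to address.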
“…For instance, Schiper and Pedone [63] propose a protocol for open groups. In open groups [22], not only the members of the system group can multicast messages to its members, but any other process can. Most of the surveyed multicast protocols were intended for closed groups where only group members are able to multicast messages, compelling external clients to forward their messages to any internal process that later multicasts each message.…”
Many distributed services need to be scalable: internet search, electronic commerce, e-government... In order to achieve scalability those applications rely on replicated components. Because of the dynamics of growth and volatility of customer markets, applications need to be hosted by adaptive systems. In particular, the scalability of the reliable multicast mechanisms used for supporting the consistency of replicas is of crucial importance. Reliable multicast may propagate updates in a pre-defined order (e.g., FIFO, total or causal). Since total order needs more communication rounds than causal order, the latter appears to be the preferable candidate for achieving multicast scalability, although the consistency guarantees based on causal order are weaker than those of total order. This paper provides a historical survey of different scalability approaches for reliable causal multicast protocols.
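As a rough illustration of what causal ordering requires, the sketch below uses the textbook vector-clock delivery rule. It is not taken from any of the surveyed protocols: the `CausalReceiver` class, the message layout, and the three-process setup are assumptions made for the example. A message from sender `j` is delivered only when it is the next message expected from `j` and every message it causally depends on has already been delivered.

```python
from typing import List, Tuple

# (sender index, sender's vector clock at send time, payload)
Message = Tuple[int, List[int], str]


class CausalReceiver:
    """Hypothetical receiver applying the vector-clock causal delivery rule."""

    def __init__(self, n: int) -> None:
        self.clock = [0] * n              # messages delivered so far, per sender
        self.buffer: List[Message] = []   # messages waiting for their dependencies
        self.delivered: List[str] = []

    def _deliverable(self, msg: Message) -> bool:
        sender, vc, _ = msg
        # Must be the next message from this sender...
        if vc[sender] != self.clock[sender] + 1:
            return False
        # ...and must not depend on anything not yet delivered here.
        return all(vc[k] <= self.clock[k] for k in range(len(vc)) if k != sender)

    def receive(self, msg: Message) -> None:
        self.buffer.append(msg)
        progress = True
        while progress:                   # repeat until no buffered message can be delivered
            progress = False
            for m in list(self.buffer):
                if self._deliverable(m):
                    self.buffer.remove(m)
                    self.clock[m[0]] += 1
                    self.delivered.append(m[2])
                    progress = True


def causal_demo() -> List[str]:
    update: Message = (0, [1, 0, 0], "update")  # sent by process 0
    reply: Message = (1, [1, 1, 0], "reply")    # sent by process 1 after seeing "update"
    receiver = CausalReceiver(3)                # process 2
    receiver.receive(reply)                     # arrives early, gets buffered
    receiver.receive(update)                    # unblocks both messages
    return receiver.delivered
```

Here the reply is held back until the update it depends on has been delivered. Concurrent messages may still be delivered in different orders at different receivers, which is the weaker guarantee relative to total order noted above.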
In this paper, we study the atomic multicast problem, a fundamental abstraction for building fault-tolerant systems. In the atomic multicast problem, the system is divided into non-empty and disjoint groups of processes. Multicast messages may be addressed to any subset of groups, each message possibly being multicast to a different subset. Several papers previously studied this problem either in local area networks [3,9,20] or wide area networks [13,21]. However, none of them considered atomic multicast when groups may crash. We present two atomic multicast algorithms that tolerate the crash of groups. The first algorithm tolerates an arbitrary number of failures, is genuine (i.e., to deliver a message m, only addressees of m are involved in the protocol), and uses the perfect failure detector P. We show that among realistic failure detectors, i.e., those that do not predict the future, P is necessary to solve genuine atomic multicast if we do not bound the number of processes that may fail. Thus, P is the weakest realistic failure detector for solving genuine atomic multicast when an arbitrary number of processes may crash. Our second algorithm is non-genuine and less resilient to process failures than the first algorithm but has several advantages: (i) it requires perfect failure detection within groups only, and not across the system; (ii) as we show in the paper, it can be modified to rely on unreliable failure detection at the cost of a weaker liveness guarantee; and (iii) it is fast: messages addressed to multiple groups may be delivered within two inter-group message delays only.