Today's hardware technology presents a new challenge in designing robust systems. Deep submicron VLSI technology has introduced transient and permanent faults that were never considered in low-level system designs in the past. Still, robustness of that part of the system is crucial and needs to be guaranteed for any successful product. Distributed systems, on the other hand, have been dealing with similar issues for decades. However, neither the basic abstractions nor the complexity of contemporary fault-tolerant distributed algorithms match the peculiarities of hardware implementations. This paper is part of an effort to close this gap between theory and practice for the clock synchronization problem. Solving this task sufficiently well will make it possible to build a very robust high-precision clocking system for hardware designs such as systems-on-chip in critical applications. As our first building block, we describe and prove correct a novel Byzantine fault-tolerant self-stabilizing pulse synchronization protocol, which can be implemented using standard asynchronous digital logic. Despite the strict limitations introduced by hardware designs, it offers optimal resilience and smaller complexity than all existing protocols.
In this paper, we investigate the approximate consensus problem in highly dynamic networks in which topology may change continually and unpredictably. We prove that in both synchronous and partially synchronous systems, approximate consensus is solvable if and only if the communication graph in each round has a rooted spanning tree, i.e., there is a coordinator at each time. The striking point in this result is that the coordinator is not required to be unique and can change arbitrarily from round to round. Interestingly, the class of averaging algorithms, which are memoryless and require no process identifiers, entirely captures the solvability issue of approximate consensus in that the problem is solvable if and only if it can be solved using any averaging algorithm. Concerning the time complexity of averaging algorithms, we show that approximate consensus can be achieved with precision of $\varepsilon$ in a coordinated network model in $O(n^{n+1} \log \frac{1}{\varepsilon})$ synchronous rounds, and in $O(\Delta n^{n\Delta+1} \log \frac{1}{\varepsilon})$ rounds when the maximum round delay for a message to be delivered is $\Delta$. While in general, an upper bound on the time complexity of averaging algorithms has to be exponential, we investigate various network models in which this exponential bound in the number of nodes reduces to a polynomial bound. We apply our results to networked systems with a fixed topology and classical benign fault models, and deduce both known and new results for approximate consensus in these systems. In particular, we show that for solving approximate consensus, a complete network can tolerate up to $2n - 3$ arbitrarily located link faults at every round, in contrast with the impossibility result established by Santoro and Widmayer (STACS '89) showing that exact consensus is not solvable with $n - 1$ link faults per round originating from the same node.
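To make the notion of an averaging algorithm concrete, here is a minimal Python sketch. It is an illustration under stated assumptions, not the construction from the paper: each process replaces its value with the equal-weight mean of the values it heard in the current round (its own included), and the communication graph is a star whose root changes every round, which is one simple family of rooted graphs. The names `averaging_round`, `star_graph`, and `run_rotating_star` are hypothetical.

```python
from typing import Dict, List, Set


def averaging_round(values: List[float],
                    in_neighbors: Dict[int, Set[int]]) -> List[float]:
    """One synchronous round: every process averages what it heard.

    in_neighbors[i] is the set of processes whose message reaches i in
    this round. A process always "hears" itself. No identifiers or
    memory beyond the current value are used.
    """
    new_values = []
    for i in range(len(values)):
        heard = set(in_neighbors.get(i, set())) | {i}
        new_values.append(sum(values[j] for j in heard) / len(heard))
    return new_values


def star_graph(n: int, root: int) -> Dict[int, Set[int]]:
    """Rooted graph assumed for illustration: only the root is heard."""
    return {i: {root} for i in range(n) if i != root}


def run_rotating_star(values: List[float], rounds: int) -> List[float]:
    """Run the averaging rule while the coordinator changes each round."""
    n = len(values)
    for r in range(rounds):
        values = averaging_round(values, star_graph(n, root=r % n))
    return values


def spread(values: List[float]) -> float:
    """Distance between the largest and smallest current value."""
    return max(values) - min(values)


if __name__ == "__main__":
    initial = [0.0, 1.0, 4.0, 10.0]
    final = run_rotating_star(initial, rounds=20)
    print(final, spread(final))
```

In this particular setup the spread halves in every round, because each non-root process moves halfway toward the current root; general rooted graphs contract far more slowly, which is consistent with the exponential worst-case bounds stated above.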
In digital circuits, metastability can cause deteriorated signals that are neither logical 0 nor logical 1, breaking the abstraction of Boolean logic. Unfortunately, any way of reading a signal from an unsynchronized clock domain or performing an analog-to-digital conversion incurs the risk of a metastable upset; no digital circuit can deterministically avoid, resolve, or detect metastability (Marino, 1981). Synchronizers, the only traditional countermeasure, exponentially decrease the odds of maintained metastability over time. They trade synchronization delay for an increased probability of resolving metastability to logical 0 or 1, but they do not guarantee success. We propose a fundamentally different approach: It is possible to contain metastability by fine-grained logical masking so that it cannot infect the entire circuit. This technique guarantees a limited degree of metastability in, and uncertainty about, the output. At the heart of our approach lies a time- and value-discrete model for metastability in synchronous clocked digital circuits. Metastability is propagated in a worst-case fashion, which allows us to derive deterministic guarantees without, and unlike, synchronizers. The proposed model permits positive results and passes the test of reproducing Marino's impossibility results. We fully classify which functions can be computed by circuits with standard registers. Regarding masking registers, we show that they become computationally strictly more powerful with each clock cycle, resulting in a non-trivial hierarchy of computable functions. Demonstrating the applicability of our approach, we present the first fault-tolerant distributed clock synchronization algorithm that deterministically guarantees correct behavior in the presence of metastability. As a consequence, clock domains can be synchronized without using synchronizers, enabling metastability-free communication between them.
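The idea of containing metastability by logical masking can be illustrated with a small Python sketch. It assumes a Kleene-style three-valued logic (0, 1, and a symbol `M` for "possibly metastable") as the worst-case propagation rule; that choice, the multiplexer example, and all function names are assumptions made for illustration and are not taken from the abstract. A gate outputs a stable value only if its stable inputs already determine the result.

```python
# Assumed value domain: 0, 1, and M ("possibly metastable").
M = "M"


def and3(a, b):
    """Worst-case AND: a stable 0 on either input masks the other input."""
    if a == 0 or b == 0:
        return 0
    if a == 1 and b == 1:
        return 1
    return M


def or3(a, b):
    """Worst-case OR: a stable 1 on either input masks the other input."""
    if a == 1 or b == 1:
        return 1
    if a == 0 and b == 0:
        return 0
    return M


def not3(a):
    """Worst-case NOT: metastability passes through an inverter."""
    return M if a == M else 1 - a


def mux_naive(sel, a, b):
    """Standard multiplexer: outputs a if sel is 1, b if sel is 0."""
    return or3(and3(sel, a), and3(not3(sel), b))


def mux_masking(sel, a, b):
    """Same function on stable inputs, plus the redundant term (a AND b).

    The extra term fixes the output whenever a and b agree on 1, so a
    metastable select can no longer spoil an output that is already
    determined by the data inputs.
    """
    return or3(mux_naive(sel, a, b), and3(a, b))


if __name__ == "__main__":
    for a in (0, 1):
        for b in (0, 1):
            print(a, b, mux_naive(M, a, b), mux_masking(M, a, b))
```

With a metastable select and both data inputs at 1, the naive circuit propagates `M` while the masking variant outputs a stable 1. When the data inputs differ, both variants output `M`, which matches the claim that the technique limits metastability in the output rather than eliminating it.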
Motivation
Shrinking feature sizes and increasing clock speeds are the most visible signs of the tremendous advances in VLSI design, which will accommodate billions of transistors on a single chip in the near future [12]. This comes at the price of increased system-level complexity, however: With today's deep submicron technology and GHz clock speeds, wiring delays dominate transistor switching delays, and signals can no longer traverse the whole die within a single clock cycle. Moreover, the reduced voltage swing needed for high clock speeds and low power consumption dramatically increases the adverse effects of single event upsets such as α-particle or neutron hits. The resulting increase of the transient failure rate (soft-error rate) [17] and crosstalk sensitivity [23] has raised concerns about the dependability of future-generation VLSI chips [5]. In fact, a modern VLSI chip can no longer be viewed as a monolithic block of synchronous hardware, where all state transitions occur simultaneously. Rather, VLSI chips are nowadays considered as systems of interacting subsystems, hence the advent of Systems-on-Chip (SoC). Due to the problems listed above, however, SoCs have much in common with the loosely coupled distributed systems that have been studied by the fault-tolerant distributed algorithms community for decades. This paper explores whether it is possible to utilize some of this research for SoCs and similar VLSI devices.
More specifically, in the context of our DARTS project (ti.tuwien.ac.at/darts), which is a joint project between Vienna University of Technology and Austrian Aerospace, we will explore an alternative approach (patented in [26]) to synchronous clocking in VLSI chips and PCB-level system designs. As shown in Fig. 1, the idea is to replace the external quartz oscillator and the clock tree, which supplies the clock signal to the different functional units (Fu_i) on a traditional chip, using a GALS-like approach [4]: Every functional unit has a dedicated fault-tolerant tick generation block (TS-Alg) attached, which generates the Fu's local clock signal. In contrast to GALS, however, our approach ensures that the local clock signals of different Fu's are closely synchronized to each other. To accomplish this, all TS-Alg blocks communicate with each other over a simple "network" of clock signals (TS-Net). This alternative clocking approach has a number of advantages, which make it particularly promising for certain application domains: First of all, it does not need a quartz oscillator, which is an expensive and sensitive device (shock, vibration, temperature etc.). The generated clock always runs at the maxi-