Abstract. We consider real-time control systems that consist of a controller that computes and sends setpoints to be implemented in physical processes through process agents. We focus on systems that use commercial off-the-shelf hardware and software components. Setpoints of these systems have strict real-time constraints: implementing a setpoint after its deadline, or not receiving setpoints within a deadline, can cause failure. In this paper, we address delay faults: faults that cause setpoints to violate their real-time constraints. We present Axo, a fault-tolerance protocol that guarantees safety and improves availability for a class of such systems that exhibit two main properties: the setpoints must have a known validity horizon, and process agents must be capable of handling duplicate setpoints. To reason about delay faults, and consequently design Axo, we present an abstraction of a controller; the abstraction applies to a wide range of real-time control systems. We prove guarantees of safety and availability. Finally, we present an implementation of Axo and the results of the tests performed with Commelec, a real-time control system for electric grids.
Abstract. We propose two expressive and complementary techniques for the verification of safety properties of infinite-state BIP models. Both our techniques deal with the full BIP specification, while the existing approaches impose considerable restrictions: they either verify finite-state systems or they do not handle the transfer of data on the interactions and priorities. Firstly, we propose an instantiation of the ESST (Explicit Scheduler Symbolic Thread) framework to verify BIP models. The key insight is to apply symbolic reasoning to analyze the behavior of the system described by the BIP components, and an explicit-state search to analyze the behavior of the system induced by the BIP interactions and priorities. The combination of symbolic and explicit exploration techniques allows us to benefit from abstraction, useful when reasoning about data, and from partial order reduction, useful to mitigate the state space explosion due to concurrency. Secondly, we propose an encoding from a BIP model into a symbolic, infinite-state transition system. This technique allows us to leverage state-of-the-art verification algorithms for the analysis of infinite-state systems. We implemented both techniques and evaluated their performance against the existing approaches. The results show the effectiveness of our approaches with respect to the state of the art, and their complementarity for the analysis of safe and unsafe BIP models.
Abstract. Real-time control systems (RTCSs) tolerate delay and crash faults by replicating the controller. Each replica computes and issues setpoints to actuators over a network that might drop or delay messages. Hence, the actuators might receive an inconsistent set of setpoints. Such inconsistency is avoided either by having a single primary replica compute and issue setpoints (in passive replication) or by having a consensus algorithm select one sending replica (in active replication). However, due to the impossibility of a perfect failure detector, passive-replication schemes can have multiple primaries, causing inconsistency, especially in the presence of intermittent delay faults. Furthermore, the impossibility of bounded-latency consensus causes both schemes to have poor real-time performance. We identified three properties of RTCSs that enable active-replication schemes to agree on the measurements before computing, instead of using traditional consensus. As all computing replicas compute with the same state, the resulting setpoints are guaranteed to be consistent. We present the design of Quarts, an agreement solution for active replication that guarantees consistency and bounded latency overhead. We prove the guarantees and compare the performance of Quarts with existing solutions through simulation. We show that Quarts provides an availability higher than existing solutions, and that the availability improvement is up to 10x with two replicas.
Abstract. Aggregation of electric resources is a fundamental function for the operation of power grids at different time scales. In the context of a recently proposed framework for the real-time control of microgrids with explicit power setpoints, we define and formally specify an aggregation method that explicitly accounts for delays and message asynchronism. The method allows the details of resources to be abstracted using high-level concepts that are device- and grid-independent. We demonstrate the application of the method to a CIGRÉ benchmark with heterogeneous and low-inertia resources.
We consider cyber-physical systems (CPSs) comprising a central controller that might be replicated for high reliability, and one or more process agents. The controller receives measurements from process agents, causing it to compute and issue setpoints that are sent back to process agents. The implementation of these setpoints causes a change in the state of the controlled physical process, and the new state is communicated to the controllers through resulting measurements. To ensure correct operation, the process agents must implement only those setpoints that were caused by their most recent measurements. However, in the presence of controller replication and of network or computation delays, setpoints and measurements do not necessarily succeed in causing the intended behavior. To capture the dependencies among events associated with measurements and setpoints, we introduce the intentionality relation among such events in a CPS and illustrate its differences with respect to the happened-before relation. We propose a mechanism, intentionality clocks, and the design of controllers and process agents that can be used to guarantee the strong clock-consistency condition under the intentionality relation. Moreover, we prove that our design ensures correct operation despite crash, delay, and network faults. We also demonstrate the practical application of our abstraction through an illustration with a real-world CPS for electric vehicles.
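The correctness condition stated above (a process agent implements only setpoints caused by its most recent measurement) can be illustrated with a small Python sketch. This is not the intentionality-clock mechanism of the paper, whose details the abstract does not give; it is a simplified stand-in that assumes each measurement carries a per-agent counter and each setpoint echoes the counter of the measurement that caused it. The names `TaggedAgent`, `controller`, `tag`, and `caused_by` are hypothetical.

```python
class TaggedAgent:
    """Hypothetical process agent that tags each measurement with a counter."""

    def __init__(self):
        self.latest_tag = 0  # tag of the most recent measurement sent

    def measure(self, reading):
        # Each new measurement gets a fresh, strictly increasing tag.
        self.latest_tag += 1
        return {"tag": self.latest_tag, "reading": reading}

    def accept(self, setpoint):
        # Implement a setpoint only if it was caused by the latest measurement.
        return setpoint["caused_by"] == self.latest_tag


def controller(measurement):
    # Hypothetical controller: echoes the tag of the measurement it used.
    # The control law (negating the reading) is a placeholder.
    return {"caused_by": measurement["tag"], "value": -measurement["reading"]}


agent = TaggedAgent()
m1 = agent.measure(1.0)
late_setpoint = controller(m1)    # computed from m1, but delayed in transit
m2 = agent.measure(2.0)           # the agent has since sent a newer measurement
fresh_setpoint = controller(m2)

print(agent.accept(late_setpoint))   # the delayed setpoint is rejected
print(agent.accept(fresh_setpoint))  # the setpoint caused by m2 is accepted
```

A single counter suffices here only because the sketch has one agent and an unreplicated controller; the abstract indicates that the replicated, multi-agent case is what motivates the intentionality relation.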
Abstract. Multiple software agents can be used to perform the real-time control of electrical grids. The control performance of such solutions is influenced by software non-idealities such as crashes and delays of the software agents, and message losses and delays due to the underlying communication network. To study the effect of these non-idealities on control systems, we present an open-source software testbed, named T-RECS. It uses software containers to test existing software without modification. The communication network among the software containers is emulated using the Mininet framework, which allows real packets to be exchanged. The electric resources in the grid are simulated using state-of-the-art models, whereas the grid itself is modeled in the phasor domain. As control agents are run as is and message exchanges are emulated, T-RECS accurately captures the real-world properties of the control framework. We demonstrate the working of T-RECS with the Commelec control framework and show the effect of network non-idealities on the control performance. We make a beta version available.
Real-time control systems use controllers that compute and issue setpoints within stringent delay constraints. Failure to do so, due to a crash or delay as a result of software and/or hardware faults, can cause failure of the controlled resources. Recently, Axo, a protocol for masking crash and delay faults by replicating the controller, was proposed. Axo provides safety by discarding delayed setpoints, and it relies on the presence of valid setpoints for providing availability. To ensure that enough valid setpoints are issued, faulty controller replicas need to be detected and recovered. We present mechanisms for the detection and recovery of delay- and crash-faulty replicas under the Axo framework. These mechanisms were designed to be soft state (i.e., their state can be reconstructed from received messages) to enable seamless additions of new replicas. Besides presenting the design, we analytically characterize the time to detect and recover a faulty replica, and we validate this characterization experimentally. We demonstrate the performance of Axo by using two case studies: the first provides a stability analysis of an inverted pendulum system with Axo, and the second shows the fault-tolerance performance of Axo through a deployment on a real-time control system that controls a CIGRÉ low-voltage benchmark microgrid.
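The safety idea described above (discard delayed setpoints, tolerate duplicates from replicated controllers) can be sketched in Python as follows. This is a minimal illustration, not the Axo protocol itself: it assumes the agent and the controller replicas share a common clock, that every setpoint carries its issue time and validity horizon, and that a sequence number identifies duplicates. The names `Setpoint`, `ProcessAgent`, `seq`, `issued_at`, and `validity` are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Setpoint:
    seq: int          # hypothetical sequence number, same across replicas
    value: float      # value to implement
    issued_at: float  # time the setpoint was computed, in seconds
    validity: float   # validity horizon, in seconds


class ProcessAgent:
    """Hypothetical agent that filters delayed and duplicate setpoints."""

    def __init__(self):
        self.last_seq = -1     # highest sequence number implemented so far
        self.implemented = []  # values actually applied to the process

    def receive(self, sp, now):
        # Safety: a setpoint past its validity horizon is never implemented.
        if now > sp.issued_at + sp.validity:
            return False
        # Duplicates from other replicas, or older setpoints, are ignored.
        if sp.seq <= self.last_seq:
            return False
        self.last_seq = sp.seq
        self.implemented.append(sp.value)
        return True


agent = ProcessAgent()
print(agent.receive(Setpoint(1, 0.5, 0.0, 0.1), now=0.05))  # on time
print(agent.receive(Setpoint(1, 0.5, 0.0, 0.1), now=0.06))  # duplicate
print(agent.receive(Setpoint(2, 0.7, 0.1, 0.1), now=0.35))  # delayed
```

In this sketch, availability depends on at least one replica delivering a valid setpoint in each round, which is why faulty replicas have to be detected and recovered.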
Deploying a power grid controller in the field makes it susceptible to message losses caused by the inherent uncertainties and non-idealities of communication networks, especially when the control action is taken at a sub-second timescale. We consider a centralized power grid controller that monitors and controls resources in real time. The resources send advertisements that contain information about their state, and an estimation of their behavior over the time horizon in which the control action is expected to be implemented. The controller uses this information to compute and issue setpoints that are thus only valid for this time horizon. An occasional loss of one or more advertisements might render the controller incapable of issuing valid setpoints. We introduce advertisements with a longer-term prediction interval, which are constantly sent along with the short-term ones, and can be used by the controller when it is missing information from some or all resources. We show the advantages of using such an approach on a controller that, by exploiting the flexibility of local resources, performs frequency support on the CIGRÉ benchmark low-voltage microgrid.
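The fallback described above can be sketched in Python: the controller prefers a resource's short-term advertisement and uses the long-term one only when the short-term one is missing or does not cover the time of the control action. This is an illustrative sketch under assumptions, not the paper's design; the `Advertisement` fields and the `select_advertisement` helper are hypothetical.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Advertisement:
    resource: str       # hypothetical resource identifier
    valid_from: float   # start of the prediction interval, in seconds
    valid_until: float  # end of the prediction interval, in seconds
    payload: dict       # state and predicted behavior (contents assumed)


def select_advertisement(
    short_term: Optional[Advertisement],
    long_term: Optional[Advertisement],
    t: float,
) -> Optional[Advertisement]:
    """Return the advertisement to use for a control action at time t."""
    # Prefer the short-term advertisement; fall back to the long-term one.
    for adv in (short_term, long_term):
        if adv is not None and adv.valid_from <= t <= adv.valid_until:
            return adv
    return None  # no usable information: no valid setpoint can be computed


long_adv = Advertisement("battery", 0.0, 10.0, {"p_max_kw": 20.0})
short_adv = Advertisement("battery", 0.0, 0.1, {"p_max_kw": 25.0})

# The short-term advertisement for the next interval was lost, so at t = 0.15
# only the long-term advertisement still covers the control action.
chosen = select_advertisement(None, long_adv, t=0.15)
print(chosen.payload)
```

A natural design choice, assumed here, is that the long-term prediction is more conservative than the short-term one, so falling back trades some control performance for the ability to keep issuing valid setpoints.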