Edward Curley scite author profile

et al. 2006

This thesis considers the problem of recovering from failures of distributable threads with assured timeliness. When a node hosting a portion of a distributable thread fails, the failure creates orphans, i.e., thread segments that are disconnected from the thread's root. A termination model is considered for recovering from such failures. In this model, the orphans must be detected and cleaned up, and a failure-exception notification must be delivered to the farthest, contiguous surviving thread segment so that thread execution can resume. Two real-time scheduling algorithms (AUA and HUA) and three distributable thread integrity protocols (TPR, D-TPR and W-TPR) are presented. We show that AUA combined with any of the protocols presented bounds the orphan cleanup and recovery time, thereby bounding thread starvation durations and maximizing the total thread accrued timeliness utility. The algorithms and the protocols are implemented in a real-time middleware that supports distributable threads. The experimental studies with the implementation validate the algorithms' and protocols' time-bounded recovery property and confirm their effectiveness.

On Best-Effort Real-Time Assurances for Recovering from Distributable Thread Failures in Distributed Real-Time Systems

et al. 2007

Recovering from distributable thread failures in distributed real-time Java

ACM Trans. Embed. Comput. Syst.

et al. 2010

We consider the problem of recovering from the failures of distributable threads (“threads”) in distributed real-time systems that operate under runtime uncertainties including those on thread execution times, thread arrivals, and node failure occurrences. When a thread experiences a node failure, the result is a broken thread having an orphan. Under a termination model, the orphans must be detected and aborted, and exceptions must be delivered to the farthest, contiguous surviving thread segment for resuming thread execution. Our application/scheduling model includes the proposed distributable thread programming model for the emerging Distributed Real-Time Specification for Java (DRTSJ), together with an exception-handler model. Threads are subject to time/utility function (TUF) time constraints and a utility accrual (UA) optimality criterion. A key underpinning of the TUF/UA scheduling paradigm is the notion of “best-effort”, where higher-importance threads are always favored over lower-importance ones, irrespective of thread urgency as specified by their time constraints. We present a thread scheduling algorithm called HUA and a thread integrity protocol called TPR. We show that HUA and TPR bound the orphan cleanup and recovery time with bounded loss of the best-effort property. Our implementation experience for HUA/TPR in the Reference Implementation of the proposed programming model for the DRTSJ demonstrates the algorithm/protocol's effectiveness.

Assured-Timeliness Integrity Protocols for Distributable Real-Time Threads with in Dynamic Distributed Systems

et al.

Networked embedded systems present challenges for designers composing distributed applications with dynamic, real-time, and resilience requirements. We consider the problem of recovering from failures of distributable threads with assured timeliness in dynamic systems with overloads, and node and (permanent/transient) network failures. When a failure prevents timely execution, the thread must be terminated, requiring detecting and aborting thread orphans and delivering exceptions to the farthest, contiguous surviving thread segment for possible resumption, while optimizing system-wide timeliness. A scheduling algorithm (HUA) and two thread integrity protocols (D-TPR and W-TPR) are presented and shown to bound orphan cleanup and recovery times with bounded loss of best-effort behavior. Implementation experience using the emerging Distributed Real-Time Specification for Java (DRTSJ) demonstrates the algorithm/protocols' effectiveness.

On Scheduling Exception Handlers in Dynamic, Embedded Real-Time Systems

Jensen