2000
DOI: 10.1109/69.842258

The PSTR/SNS scheme for real-time fault tolerance via active object replication and network surveillance

Abstract: The time-triggered message-triggered object (TMO) scheme was formulated a few years ago as a major extension of the conventional object structuring schemes with the idealistic goal of facilitating general-form design and timeliness-guaranteed design of complex real-time application systems. Recently, as a new scheme for realizing TMO-structured distributed and parallel computer systems capable of both hardware and software fault tolerance, we have formulated and demonstrated the primary-shadow TMO replication…

Cited by 16 publications
(6 citation statements)
References 14 publications
“…Real-time fault-tolerant systems: IFLOW [18] and MEAD [19] use fault-prediction techniques to reduce fault detection and client failover time to change the frequency of backup replica state synchronization to minimize state synchronization during failure recovery, and by determining the possibility of a primary replica failure and redirecting clients to alternate servers before failures occur, respectively. The Time-triggered Message-triggered Objects (TMO) project [20] considers replication schemes such as the primary-shadow TMO replication (PSTR) scheme, for which recovery time bounds can be quantitatively established, and real-time fault tolerance guarantees can be provided to applications. FC-ORB [21] is a real-time Object Request Broker (ORB) middleware that employs end-to-end utilization control to handle fluctuations in application workload and system resources by enforcing desired CPU utilization bounds on multiple processors by adapting the rates of end-to-end tasks within user-specified ranges.…”
Section: B Evaluating Safemat-induced Failover Overhead Times
Citation type: mentioning
confidence: 99%
“…MEAD [17] and its proactive recovery strategy for distributed CORBA applications can minimize the recovery time for DRE systems. The Time-triggered Message-triggered Objects (TMO) project [9] considers replication schemes such as the primary-shadow TMO replication (PSTR) scheme, for which recovery time bounds can be quantitatively established, and real-time fault tolerance guarantees can be provided to applications. DARX [11] provides adaptive fault-tolerance for multi-agent software platforms by dynamically changing replication styles in response to changing resource availabilities and application performance.…”
Section: Related Work
Citation type: mentioning
confidence: 99%
“…Time testing, referred to as heartbeating [3,9,10], can be used to check if a component or system is anomalous, but it may fail to locate where an anomaly is in a component or system. The detection mechanism depending on exceptions [7] may not handle unanticipated, state-dependent anomalies.…”
Section: Introduction
Citation type: mentioning
confidence: 99%
“…Several approaches to anomaly detection for dependable systems have been suggested in [3,8,9,10,7,4], which may provide partial solutions from the perspective of quality factors of the mechanisms for anomaly detection: speed and accuracy. Time testing, referred to as heartbeating [3,9,10], can be used to check if a component or system is anomalous, but it may fail to locate where an anomaly is in a component or system.…”
Section: Introduction
Citation type: mentioning
confidence: 99%