Proceedings International Conference on Dependable Systems and Networks
DOI: 10.1109/dsn.2002.1028920

Implementation and performance evaluation of an adaptable failure detector

Abstract: Chandra and Toueg introduced the concept of unreliable failure detectors. They showed how, by adding these detectors to an asynchronous system, it is possible to solve the Consensus problem. In this paper, we propose a new implementation of a failure detector. This implementation is a variant of the heartbeat failure detector which is adaptable and can support scalable applications. In this implementation we dissociate two aspects: a basic estimation of the expected arrival date to provide a short detection ti…
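As a rough illustration of the idea the abstract describes (a heartbeat detector that estimates when the next heartbeat should arrive), here is a minimal sketch. The abstract is truncated, so the estimation rule used here (the mean of recent inter-arrival gaps) and the fixed safety margin are assumptions made for illustration, not the authors' method. The names `HeartbeatFailureDetector`, `interval`, `margin`, and `window` are hypothetical.

```python
from collections import deque


class HeartbeatFailureDetector:
    """Suspects a monitored process once its next heartbeat is overdue."""

    def __init__(self, interval, margin, window=10):
        self.interval = interval  # nominal sending period (assumed known)
        self.margin = margin      # fixed safety margin (an assumption)
        self.arrivals = deque(maxlen=window)  # most recent arrival times
        self.deadline = None      # no deadline until a heartbeat is seen

    def _expected_next_arrival(self):
        times = list(self.arrivals)
        if len(times) < 2:
            # Too little history: fall back to the nominal period.
            return times[-1] + self.interval
        # Assumed estimator: mean gap between recent heartbeats.
        gaps = [later - earlier for earlier, later in zip(times, times[1:])]
        return times[-1] + sum(gaps) / len(gaps)

    def on_heartbeat(self, now):
        """Record an arrival and recompute the suspicion deadline."""
        self.arrivals.append(now)
        self.deadline = self._expected_next_arrival() + self.margin

    def suspects(self, now):
        """True if the monitored process is currently suspected."""
        return self.deadline is not None and now > self.deadline


if __name__ == "__main__":
    fd = HeartbeatFailureDetector(interval=1.0, margin=0.5)
    for t in (0.0, 1.0, 2.0):
        fd.on_heartbeat(t)
    print(fd.suspects(3.0))  # before the deadline
    print(fd.suspects(4.0))  # after the deadline
    fd.on_heartbeat(5.0)     # a late heartbeat revises the suspicion
    print(fd.suspects(6.0))
```

Because the detector is unreliable in the Chandra and Toueg sense, a suspicion can be wrong: in the usage above, the late heartbeat at time 5.0 moves the deadline forward and clears the earlier suspicion.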

Cited by 153 publications (142 citation statements)
References 12 publications (11 reference statements)
“…An adaptive failure detector should be designed to improve the quality of failure detection service to fit the application needs and network environmental changes. Bertier [7] proposed the implementation of failure detectors based on failure detection as a novel shared service between several applications. Failure detection based on the sharing of other nodes' failure status can facilitate detection time at the cost of increased overhead control.…”
Section: Background and Related Work (mentioning)
confidence: 99%
“…Often these pseudo-codes use syntactical constructs such as repeat periodically (Chandra & Toueg, 1996) (Aguilera, Chen, & Toueg, 1999) (Bertier, Marin, & Sens, 2002), at time t send heartbeat (Chen, Toueg, & Aguilera, 2002; Bertier et al., 2002), at time t check whether message has arrived, or upon receive, together with several variants (see Table 1). Such syntactical constructs are not often found in COTS programming languages such as C or C++, which leads us to the problem of translating the protocol specifications into running software prototypes using one such standard language.…”
Section: Failure Detection Protocols In the Application Layer (mentioning)
confidence: 99%
“…Failure detection in MPI relies usually on heart beat technique [2] or on sender-based logging [16] that consist in detecting remote activity through the network. Such techniques detect node or link failures, not data corruption.…”
Section: Related Work (mentioning)
confidence: 99%