The φ accrual failure detector

Hayashibara, Naohiro; Défago, Xavier; Yared, Rami; Katayama, Takuya

doi:10.1109/reldis.2004.1353004

Cited by 141 publications

(131 citation statements)

References 25 publications


“…A fault tolerance service to check the cloud providers and other services status will be developed and evaluated. We also plan to use an adaptive fault monitoring algorithm, as proposed by [18,30] and [70], which are more adaptable to be used in a large-scale distributed environment. It is also important to include a security service and an SLA service in the federated platform.…”

Section: Discussion (mentioning)

confidence: 99%


Towards a Hybrid Federated Cloud Platform to Efficiently Execute Bioinformatics Workflows

Saldanha, Ribeiro, Araújo et al. 2012

Bioinformatics


“…There are extensive studies in the literature on failure detection systems [16,31,45,70]. On the other hand, few systems are designed to scale with a large number of nodes as those found on clouds.…”

Section: Fault Tolerance Service and High Availability (mentioning)

confidence: 99%

Towards a Hybrid Federated Cloud Platform to Efficiently Execute Bioinformatics Workflows

Saldanha, Ribeiro, Araújo et al. 2012

Bioinformatics


“…Each node periodically disseminates its status information to a number of randomly selected nodes and relays status information received from other nodes. This method is also used to detect and advertise node failures across the cluster [35].…”

Section: Background: the Systems We Target (mentioning)

confidence: 99%

Keeping up with storage: Decentralized, write-enabled dynamic geo-replication

Matri, Pérez, Costan et al. 2018

Future Generation Computer Systems


Large-scale applications are ever-increasingly geo-distributed. Maintaining the highest possible data locality is crucial to ensure high performance of such applications. Dynamic replication addresses this problem by dynamically creating replicas of frequently accessed data close to the clients. This data is often stored in decentralized storage systems such as Dynamo or Voldemort, which offer support for mutable data. However, existing approaches to dynamic replication for such mutable data remain centralized, thus incompatible with these systems. In this paper we introduce a write-enabled dynamic replication scheme that leverages the decentralized architecture of such storage systems. We propose an algorithm enabling clients to locate tentatively the closest data replica without prior request to any metadata node. Large-scale experiments on various workloads show a read latency decrease of up to 42% compared to other state-of-the-art, caching-based solutions.
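The gossip-style dissemination described in the citation statement above (each node periodically sends its status to a few randomly selected nodes and relays what it has heard) can be illustrated with a minimal sketch. This is not the protocol of the cited systems: the names `gossip_round`, `suspected`, `fanout`, and `max_lag` are hypothetical, rounds are synchronous, and suspicion uses a fixed lag threshold for simplicity.

```python
import random


def gossip_round(views, alive, fanout, rng):
    """Run one synchronous gossip round (illustrative only).

    views[n] maps node id -> highest heartbeat counter node n has seen.
    alive is the set of nodes that have not crashed.
    """
    # Each live node records that it is still running.
    for n in alive:
        views[n][n] = views[n].get(n, 0) + 1
    # Snapshot so every message reflects the state at the start of the round.
    snapshot = {n: dict(views[n]) for n in alive}
    for sender in sorted(alive):
        peers = [p for p in views if p != sender]
        for target in rng.sample(peers, min(fanout, len(peers))):
            if target not in alive:
                continue  # a message sent to a crashed node is lost
            # Relay: the receiver keeps the newest heartbeat per node.
            for node, beat in snapshot[sender].items():
                if beat > views[target].get(node, 0):
                    views[target][node] = beat
    return views


def suspected(view, current_round, max_lag):
    """Nodes whose last known heartbeat lags by more than max_lag rounds."""
    return sorted(n for n, beat in view.items() if current_round - beat > max_lag)
```

An accrual detector such as φ would replace the fixed `max_lag` cutoff with a suspicion level on a continuous scale, leaving the threshold choice to the application.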


“…One of the key messages of this book is that it is important to distinguish between porting a code Table 1: Syntactical constructs used in several failure detector protocols. ϕ is the accrual failure detector discussed in (Hayashibara, 2004; Hayashibara et al., 2004). D is the eventually perfect failure detector of (Chandra & Toueg, 1996).…”

Section: Failure Detection Protocols In the Application Layer (mentioning)

confidence: 99%

Application-Layer Fault-Tolerance Protocols

Florio 2009

