Proceedings of 5th Asia-Pacific Workshop on Systems 2014
DOI: 10.1145/2637166.2637235

Machine fault tolerance for reliable datacenter systems

Abstract: Although rare in absolute terms, undetected CPU, memory, and disk errors occur often enough at datacenter scale to significantly affect overall system reliability and availability. In this paper, we propose a new failure model, called Machine Fault Tolerance, and a new abstraction, a replicated write-once trusted table, to provide improved resilience to these types of failures. Since most machine failures manifest in application server and operating system code, we assume a Byzantine model for those parts of th…
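
As a rough illustration only: the abstract is truncated here and does not describe the table's interface, so the sketch below is not the paper's design. It shows the general idea behind a write-once table, assuming a single in-memory table with no replication and no trusted component. The class name `WriteOnceTable` and its methods `put` and `get` are hypothetical names chosen for this example.

```python
class WriteOnceTable:
    """Hypothetical in-memory write-once table (illustration only).

    Each slot accepts exactly one value. Later writes to the same slot
    do not overwrite it; they return the value that is already stored.
    """

    def __init__(self):
        self._slots = {}

    def put(self, key, value):
        """Store value under key if the slot is empty; return the stored value."""
        # dict.setdefault inserts only when the key is absent.
        return self._slots.setdefault(key, value)

    def get(self, key):
        """Return the stored value, or None if the slot was never written."""
        return self._slots.get(key)


table = WriteOnceTable()
first = table.put(1, "request-A")
second = table.put(1, "request-B")  # conflicting write does not take effect
print(first, second)
```

Because a second, conflicting write to the same slot has no effect, every reader of slot 1 sees the same value. That property is what makes such a table usable as a mechanism against equivocation.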

Cited by 4 publications (3 citation statements)
References 15 publications (26 reference statements)

“…With a black-box mechanism to prevent equivocation, one can reduce the number of replicas in BFT to 2f+1 [15,16,24]. Several BFT systems achieve that using trusted hardware as the black box: attested append-only memory (A2M) [23] uses a trusted log, TrInc [57] and MinBFT [86] use a trusted counter, Hybster [14] uses Intel's SGX, CheapBFT [49] uses FPGAs, and H-MFT [92] uses trusted hypervisors to implement write-once tables. By separating execution from agreement [89], one can reduce the number of execution replicas to 2f+1, but 3f+1 replicas are still required for agreement.…”
Section: Related Work (mentioning)
confidence: 99%
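
The replica counts quoted in the statement above can be tabulated with a few lines of Python. This is only a sketch of the arithmetic (3f+1 replicas for agreement without a non-equivocation mechanism, 2f+1 with one); the function name `replicas_needed` and its `non_equivocation` parameter are hypothetical and do not come from any of the cited systems.

```python
def replicas_needed(f, non_equivocation=False):
    """Return the replica count needed to tolerate f Byzantine faults.

    Without a mechanism that prevents equivocation, agreement needs 3f+1
    replicas; with such a mechanism (for example a trusted log, a trusted
    counter, or a write-once table), 2f+1 replicas suffice.
    """
    if f < 0:
        raise ValueError("f must be non-negative")
    return 2 * f + 1 if non_equivocation else 3 * f + 1


for f in (1, 2, 3):
    # Columns: f, replicas without the mechanism, replicas with it.
    print(f, replicas_needed(f), replicas_needed(f, non_equivocation=True))
```

For f = 1 this gives 4 replicas without the mechanism and 3 with it, so the saving grows by one replica for each additional tolerated fault.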
“…In the slow path, when there are failures or slowness in the network, uBFT uses a novel protocol that combines digital signatures with judicious use of a trusted computing base. The trusted computing base in uBFT is non-tailored and small: rather than trusted enclaves with arbitrary logic such as Intel's SGX [26] or trusted hypervisors [92] (which have large attack surfaces due to their complexity [27,34]), uBFT relies solely on disaggregated memory, a technology increasingly present in data centers due to the availability of RDMA [83] today and CXL [28] in a few years. The key mechanism from disaggregated memory we leverage in uBFT are single-writer regions (regions of memory that can be written by one designated host and can be read by others), implemented in hardware through access permissions.…”
Section: Introduction (mentioning)
confidence: 99%