N. Knezevic scite author profile

Modern Byzantine fault-tolerant state machine replication (BFT) protocols involve about 20,000 lines of challenging C++ code encompassing synchronization, networking and cryptography. They are notoriously difficult to develop, test and prove. We present a new abstraction to simplify these tasks. We treat a BFT protocol as a composition of instances of our abstraction. Each instance is developed and analyzed independently.To illustrate our approach, we first show how our abstraction can be used to obtain the benefits of a state-ofthe-art BFT protocol with much less pain. Namely, we develop AZyzzyva, a new protocol that mimics the behavior of Zyzzyva in best-case situations (for which Zyzzyva was optimized) using less than 24% of the actual code of Zyzzyva. To cover worst-case situations, our abstraction enables to use in AZyzzyva any existing BFT protocol, typically, a classical one like PBFT which has been tested and proved correct.We then present Aliph, a new BFT protocol that outperforms previous BFT protocols both in terms of latency (by up to 30%) and throughput (by up to 360%). The development of Aliph required two new instances of our abstraction. Each instance contains less than 25% of the code needed to develop state-of-the-art BFT protocols.

show abstract

The Next 700 BFT Protocols

Aublin

Guerraoui

Knezevic

et al. 2015

ACM Trans. Comput. Syst.

110

108

View full text Add to dashboard Cite

We present Abstract (ABortable STate mAChine replicaTion), a new abstraction for designing and reconfiguring generalized replicated state machines that are, unlike traditional state machines, allowed to abort executing a client's request if "something goes wrong."Abstract can be used to considerably simplify the incremental development of efficient Byzantine faulttolerant state machine replication (BFT) protocols that are notorious for being difficult to develop. In short, we treat a BFT protocol as a composition of Abstract instances. Each instance is developed and analyzed independently and optimized for specific system conditions. We illustrate the power of Abstract through several interesting examples.We first show how Abstract can yield benefits of a state-of-the-art BFT protocol in a less painful and errorprone manner. Namely, we develop AZyzzyva, a new protocol that mimics the celebrated best-case behavior of Zyzzyva using less than 35% of the Zyzzyva code. To cover worst-case situations, our abstraction enables one to use in AZyzzyva any existing BFT protocol.We then present Aliph, a new BFT protocol that outperforms previous BFT protocols in terms of both latency (by up to 360%) and throughput (by up to 30%). Finally, we present R-Aliph, an implementation of Aliph that is robust, that is, whose performance degrades gracefully in the presence of Byzantine replicas and Byzantine clients.

show abstract

Predicting and preventing inconsistencies in deployed distributed systems

Yabandeh

Knezevic

Kostić

et al. 2010

ACM Trans. Comput. Syst.

View full text Add to dashboard Cite

We propose a new approach for developing and deploying distributed systems, in which nodes predict distributed consequences of their actions and use this information to detect and avoid errors. Each node continuously runs a state exploration algorithm on a recent consistent snapshot of its neighborhood and predicts possible future violations of specified safety properties. We describe a new state exploration algorithm, consequence prediction, which explores causally related chains of events that lead to property violation.This article describes the design and implementation of this approach, termed CrystalBall. We evaluate CrystalBall on RandTree, BulletPrime, Paxos, and Chord distributed system implementations. We identified new bugs in mature Mace implementations of three systems. Furthermore, we show that if the bug is not corrected during system development, CrystalBall is effective in steering the execution away from inconsistent states at runtime.

show abstract

Enabling DVD-like features in P2P video-on-demand systems

Vratonjic

Gupta

Knezevic

et al. 2007

View full text Add to dashboard Cite

Peer-to-peer (p2p) video-on-demand (VoD) is increasingly popular with Internet users. Currently deployed pure p2p VoD systems provide poor general performance and they lack advanced features such as fast forward and seeking to arbitrary points. Peer-assisted VoD systems can provide such services, but they require very well provisioned source servers (or server farms).We propose BulletMedia, a system that uses proactive caching to attempt to provide advanced features without requiring a well provisioned server. In BulletMedia, blocks are altruistically replicated by peers not to aid immediate playback but to simply increase the number of replicas of each block. This helps ensure that blocks are available in-overlay and reduces dependence on the source. BulletMedia combines a traditional overlay mesh approach with a structured overlay. The overlay mesh is used to fetch blocks at a high rate, while the structured overlay is used to enable efficient block discovery and to control block replication. Initial experimental results from a prototype BulletMedia implementation demonstrate that it can both effectively control in-overlay block replication and can efficiently use these replicas to perform forward seeks.

show abstract

Staged deployment in mirage, an integrated software upgrade testing and distribution system

Crameri

Knezevic

Kostić

et al. 2007

View full text Add to dashboard Cite

Despite major advances in the engineering of maintainable and robust software over the years, upgrading software remains a primitive and error-prone activity. In this paper, we argue that several problems with upgrading software are caused by a poor integration between upgrade deployment, user-machine testing, and problem reporting. To support this argument, we present a characterization of software upgrades resulting from a survey we conducted of 50 system administrators. Motivated by the survey results, we present Mirage, a distributed framework for integrating upgrade deployment, user-machine testing, and problem reporting into the overall upgrade development process. Our evaluation focuses on the most novel aspect of Mirage, namely its staged upgrade deployment based on the clustering of user machines according to their environments and configurations. Our results suggest that Mirage's staged deployment is effective for real upgrade problems.

show abstract

scite is a Brooklyn-based organization that helps researchers better discover and understand research articles through Smart Citations–citations that display the context of the citation and describe whether the article provides supporting or contrasting evidence. scite is used by students and researchers from around the world and is funded in part by the National Science Foundation and the National Institute on Drug Abuse of the National Institutes of Health.

Contact Info

hi@scite.ai

10624 S. Eastern Ave., Ste. A-614

Henderson, NV 89052, USA

Blog Terms and Conditions API Terms Privacy Policy Contact Cookie Preferences Do Not Sell or Share My Personal Information

Made with 💙 for researchers

Part of the Research Solutions Family.

N. Knezevic

The next 700 BFT protocols

The Next 700 BFT Protocols

Predicting and preventing inconsistencies in deployed distributed systems

Enabling DVD-like features in P2P video-on-demand systems

Staged deployment in mirage, an integrated software upgrade testing and distribution system

Contact Info

Product

Resources

About