The eventual consistency model has been widely adopted in NoSQL systems. By tolerating weak consistency, these systems attain high throughput and availability at the cost of user experience and developer friendliness. Trading consistency for latency has become common practice. An important but widely ignored problem is how to control the consistency of an existing system without modifying its implementation. In this paper, we present a systematic study of client-centric consistency in a NoSQL system, Cassandra, and show how consistency can be substantially enhanced by tuning the system configuration when users adopt partial quorum settings. We use session guarantees as the consistency model and analyze the root cause of consistency violations, showing that the length of the write queue is a reasonable indicator for quantifying consistency. For inconsistency mitigation, we show through extensive experiments how consistency is affected by the read and write processes of the system and how it can be improved by tuning system configurations. In particular, we recommend configurations to developers, namely adjusting the number of write threads and the fine-grained quorum setting, for enhanced consistency control. Because consistency anomalies do not occur uniformly, we discuss how to stabilize consistency by analyzing system logs.
INTRODUCTION
A storage system that provides a weak replica consistency model can more easily achieve high availability, high throughput, and low latency. Therefore, many NoSQL systems, especially quorum systems such as DynamoDB, Voldemort, Riak, and Cassandra, opt for eventual consistency, a typical weak replica consistency model. 1 These systems have become popular choices among users who accept that they may read stale data with some probability. Since eventually consistent systems make no rigorous guarantees on the staleness of the data items returned, it is very important for users and developers to quantify how eventual the consistency is, to know how to program against eventually consistent systems, and to know how to provide stronger consistency while keeping the benefits of eventual consistency. 2 Ongoing research efforts aim to quantify the consistency of NoSQL systems. 3,4 Most works focus on the consistency-latency trade-off described by the PACELC formulation. 1 For example, Wada et al reported that, with eventual consistency, the probability of reading the latest data between 0 and 450 ms is 33% under a low workload using Amazon SimpleDB. 5 Bailis et al showed that the average latency is 25 ms in LinkedIn's single-node Voldemort system and the maximal latency is 5 s in Yammer's Riak cluster. 6 Some works consider how to program in these NoSQL systems. For example, the Consistency As Logical Monotonicity (CALM) theorem was proposed to guide developers in designing programs for eventually consistent systems. 7 Other works focus on designing new read and write protocols to support stronger consistency. 8-11 For example, Bailis et al proposed a "bo...