Copy-On-Write (COW) is a powerful technique for data protection in file systems. Unfortunately, it introduces a recursive update problem, which causes write amplification as a side effect. Studying the behavior of write amplification is important for designing, choosing, and optimizing next-generation file systems. However, the complexity of file systems makes such an evaluation difficult. To solve this problem, we proposed a typical COW file system model based on BTRFS and verified its correctness through carefully designed experiments. By analyzing this model, we found that write amplification is greatly affected by the distribution of the files being accessed, varying from 1.1x to 4.2x. We further found that write amplification is also affected by the number of files being accessed, the number of files contained in the file system, and the space utilization of the file system trees.
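As a rough illustration of the recursive update problem, the Python sketch below counts the tree blocks rewritten when a batch of leaves is updated in a COW tree. It is a toy model under stated assumptions, not the BTRFS-based model described above: a complete tree with a hypothetical fixed `fanout` and `height`, one block per node, and each dirty node rewritten once per batch. The function names are illustrative only.

```python
def cow_blocks_written(leaf_indices, fanout, height):
    """Count distinct nodes rewritten when the given leaves change.

    Assumption: a complete tree with `height` levels (level 0 = leaves,
    level height-1 = root). Under copy-on-write, changing a leaf forces
    a new copy of every ancestor up to the root.
    """
    dirty = set()
    for leaf in set(leaf_indices):
        idx = leaf
        for level in range(height):
            dirty.add((level, idx))  # this node must be rewritten
            idx //= fanout           # move to the parent's index
    return len(dirty)


def write_amplification(leaf_indices, fanout, height):
    """Blocks written divided by the number of leaves actually updated."""
    requested = len(set(leaf_indices))
    return cow_blocks_written(leaf_indices, fanout, height) / requested


# Toy tree: fanout 4, 3 levels, so 16 leaves.
clustered = write_amplification([0, 1, 2, 3], fanout=4, height=3)
scattered = write_amplification([0, 4, 8, 12], fanout=4, height=3)
single = write_amplification([5], fanout=4, height=3)
```

In this toy model, clustered updates share ancestors and so rewrite fewer blocks per leaf than scattered ones, which matches the qualitative claim that the access distribution matters. The values it produces are properties of the toy tree and are unrelated to the 1.1x to 4.2x range reported above.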
The explosive growth of modern web-scale applications has made cost-effectiveness a primary design goal for their underlying databases. As a backbone of modern databases, LSM-tree based key-value stores (LSM stores) face limited storage options. They are designed either for local storage, which is relatively small, expensive, and fast, or for cloud storage, which offers larger capacity at lower cost but is slower. Designing an LSM store that integrates local storage with cloud storage services is a promising way to balance cost and performance. However, such a design faces challenges such as data reorganization, metadata overhead, and reliability issues.
In this paper, we propose RocksMash, a fast and efficient LSM store that uses local storage to store frequently accessed data and metadata while using cloud to hold the rest of the data to achieve cost-effectiveness. To improve metadata space-efficiency and read performance, RocksMash uses an LSM-aware persistent cache that stores metadata in a space-efficient way and stores popular data blocks by using compaction-aware layouts. Moreover, RocksMash uses an extended write-ahead log for fast parallel data recovery. We implemented RocksMash by embedding these designs into RocksDB. The evaluation results show that RocksMash improves the performance by up to 1.7× compared to the state-of-the-art schemes and delivers high reliability, cost-effectiveness, and fast recovery.
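The general idea of serving popular blocks from local storage and falling back to the cloud for the rest can be sketched in a few lines of Python. This is a toy illustration only, not RocksMash's LSM-aware persistent cache: it assumes a plain LRU policy, uses a dictionary as a stand-in for a cloud object store, and the class and attribute names (`TieredBlockStore`, `cloud_reads`) are hypothetical.

```python
from collections import OrderedDict


class TieredBlockStore:
    """Toy two-tier read path: small local LRU cache in front of a slow store."""

    def __init__(self, cloud, local_capacity):
        self.cloud = cloud                # dict standing in for cloud storage
        self.local = OrderedDict()        # local tier, ordered by recency
        self.local_capacity = local_capacity
        self.cloud_reads = 0              # counts the slow, remote reads

    def get(self, block_id):
        if block_id in self.local:
            # Local hit: mark the block as most recently used.
            self.local.move_to_end(block_id)
            return self.local[block_id]
        # Local miss: fetch from the cloud stand-in and cache the block.
        self.cloud_reads += 1
        data = self.cloud[block_id]
        self.local[block_id] = data
        if len(self.local) > self.local_capacity:
            self.local.popitem(last=False)  # evict the least recently used
        return data


store = TieredBlockStore({i: f"block-{i}" for i in range(10)}, local_capacity=2)
for block_id in [1, 1, 2, 3, 1]:
    store.get(block_id)
```

In this run, the second read of block 1 is served locally, while the final read of block 1 goes back to the cloud stand-in because the block was evicted when block 3 arrived. A layout that is aware of compaction, as described above, would make different placement decisions than this simple recency policy.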