With recent performance improvements in commodity hardware, low-cost storage built on commodity servers has become a practical alternative to dedicated storage appliances. Because commodity servers fail frequently, a server-based storage system must keep redundant data across multiple servers. However, the extra storage capacity needed for this redundancy significantly increases system cost. Although erasure coding (EC) is a promising way to reduce the amount of redundant data, it requires distributing and encoding data among servers. These processes generate substantial network traffic and processing overhead, and their performance impact still needs to be reduced. The impact is especially significant for applications dominated by random access. In this article, we propose a new lightweight redundancy control for server-based storage. Our method takes a local-filesystem-based approach that avoids distributing data by adding redundancy to locally stored user data. It switches the redundancy method for user data between replication and EC according to the workload, improving capacity efficiency while achieving higher performance. Our experiments show up to 230% better online-transaction-processing performance for our method compared with CephFS, a widely used alternative system. We also confirmed that our method prevents unexpected performance degradation while achieving better capacity efficiency.
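The capacity trade-off between replication and EC described above can be illustrated with simple arithmetic. The following is a minimal sketch, not the authors' implementation: the function names, the 3-way replication factor, and the 4+2 EC layout are assumptions chosen for illustration and do not come from the paper.

```python
def replication_overhead(copies: int) -> float:
    """Extra capacity per unit of user data when keeping `copies` full copies."""
    if copies < 1:
        raise ValueError("copies must be at least 1")
    return float(copies - 1)


def ec_overhead(data_chunks: int, parity_chunks: int) -> float:
    """Extra capacity per unit of user data for a k+m erasure code."""
    if data_chunks < 1 or parity_chunks < 0:
        raise ValueError("need data_chunks >= 1 and parity_chunks >= 0")
    return parity_chunks / data_chunks


def blended_overhead(ec_fraction: float, copies: int,
                     data_chunks: int, parity_chunks: int) -> float:
    """Average overhead when a fraction of the data is held in EC form
    and the remainder is replicated (hypothetical mixed layout)."""
    if not 0.0 <= ec_fraction <= 1.0:
        raise ValueError("ec_fraction must be between 0 and 1")
    return (ec_fraction * ec_overhead(data_chunks, parity_chunks)
            + (1.0 - ec_fraction) * replication_overhead(copies))


if __name__ == "__main__":
    # Assumed parameters: 3-way replication versus a 4+2 erasure code.
    print(replication_overhead(3))           # 2.0 (200% extra capacity)
    print(ec_overhead(4, 2))                 # 0.5 (50% extra capacity)
    print(blended_overhead(0.75, 3, 4, 2))   # 0.875
```

Under these assumed parameters, moving three quarters of the data from replication to EC lowers the average redundancy overhead from 200% to 87.5%. This is the kind of capacity gain that switching redundancy methods by workload is meant to capture.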
This paper presents an analysis of a performance bottleneck in enterprise file servers using Linux and proposes a modification to this operating system that avoids the bottleneck. The analysis shows that metadata cache deallocation in current Linux causes large latency in file-request processing when the operational throughput of a file server becomes large. To eliminate this latency, a new method called "split reclaim," which separates metadata cache deallocation from conventional cache deallocation, is proposed. Experiments show that the split-reclaim method reduces the worst response time by more than 95% and achieves three times higher throughput under a metadata-intensive workload. The split-reclaim method also reduces latency caused by cache deallocation under a general file-server workload by more than 99%. These results indicate that the split-reclaim method can eliminate metadata cache deallocation latency and make it possible to use commodity servers as enterprise file servers.
Index Terms: Cache memory, file servers, memory management, scalability.