Abstract:The current trend for high-performance distributed file systems is object-based architecture that uses local object storage to store the file data. The IO performance of such systems depends on the local object storage that manages the underlying low-level storage, such as Fusion IO ioDrive, a flash device connected through PCI express. It provides OpenNVM flash primitives, such as atomic batch write and sparse addressing. We designed an object storage using OpenNVM whose goal is to maximize IOPS/bandwidth performance. Using the sparse address space, it is possible to design object storage as an array of fixed-size regions. Using atomic batch write, the object storage supports the ACID properties in each write. Our prototype implementation achieves 740,000 IOPS for object creations using 16 threads, which is 12 times better than DirectFS. The write performance achieves 97.7% of the physical peak performance on average.
Abstract:The fusion of the research field of high-performance computing (HPC) with that of big data, which has become known as the field of extreme big data, is problematic in that file creation in storage systems such as distributed file systems is not optimized. That is, the large workload leads to simultaneous creations of many files by many processes when creating checkpoints. The need to improve the file creation processes prompted us to design a scale-out distributed file system for post-petascale systems named PPFS. PPFS consists of PPMDS, which is a scale-out distributed metadata server, and PPOSS, which is a scalable distributed storage server for flash storage. The high file creation performance of PPMDS was achieved by using a key-value store for metadata storage and non-blocking distributed transactions to update multiple entries simultaneously. PPOSS depends on PPOST, which is an object storage system that manages the underlying low-level storage, such as Fusion IO ioDrive, a flash device connected through PCI express supporting OpenNVM. The high file creation performance was attained by implementing the PPFS prototype using file creation optimization, termed bulk creation, to reduce the amount of communication between PPMDS and PPOSS. And, to enhance the I/O performance of PPOSS when the client process and PPOSS run on the same node, PPOSS accesses a local storage device directly. The prototype implementation of PPFS with a further file creation optimization called object prefetching achieves 138,000 Operations Per Second for file creation when using five metadata servers and 128 client processes, thereby exceeding the performance of IndexFS by 2.52 times. With local access optimization, PPOSS reached its limit at a block size of 16 KiB, which is an improvement of 1.5 times compared to before optimization. Furthermore, this evaluation indicates that PPFS has a good scalability on file creation and IO performance, that is required for post-petascale systems.
scite is a Brooklyn-based organization that helps researchers better discover and understand research articles through Smart Citations–citations that display the context of the citation and describe whether the article provides supporting or contrasting evidence. scite is used by students and researchers from around the world and is funded in part by the National Science Foundation and the National Institute on Drug Abuse of the National Institutes of Health.