Current storage systems hold a huge amount of duplicated or redundant data. Data deduplication, which uses lossless data compression schemes to minimize duplicated data at the inter-file level, has therefore received broad attention in recent years. Research challenges remain in current approaches and storage systems, such as how to chunk files more efficiently and better leverage potential similarity and identity among dedicated applications, and how to store the chunks effectively and reliably on secondary storage devices. In this paper, we propose ADMAD: an Application-Driven Metadata Aware De-duplication Archival Storage System, which uses metadata from different levels of the I/O path to direct the partitioning of files into more Meaningful data Chunks (MC) and so maximally reduce inter-file duplication. Because these chunks vary in size, storing them directly on storage devices may cause heavy fragmentation and a high percentage of random disk accesses, which is very inefficient. Therefore, in ADMAD, chunks are further packaged into fixed-size Objects that serve as the storage units, to speed up I/O performance and to ease data management. Preliminary experiments have demonstrated that the proposed system further reduces the required storage space compared with current methods (by 20% to nearly 50% across several datasets) and substantially improves write performance (by about 50%-70% on average).
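The storage side of the abstract above (deduplicate variable-size chunks, then pack them into fixed-size Objects) can be illustrated with a minimal sketch. This is not ADMAD's implementation: the class `ObjectStore`, the helper `store_file`, the use of SHA-256 fingerprints, and the tiny object size are all assumptions made for illustration, and the metadata-aware chunking step is taken as given (the chunks arrive already split).

```python
import hashlib


class ObjectStore:
    """Hypothetical store: dedups chunks and packs them into fixed-size objects."""

    def __init__(self, object_size=64):
        self.object_size = object_size  # capacity of each object, in bytes (assumed)
        self.objects = [bytearray()]    # each entry holds at most object_size bytes
        self.index = {}                 # fingerprint -> (object_id, offset, length)

    def put(self, chunk: bytes):
        """Store a chunk; return (fingerprint, was_new)."""
        if len(chunk) > self.object_size:
            raise ValueError("chunk does not fit in one object")
        fp = hashlib.sha256(chunk).hexdigest()
        if fp in self.index:
            return fp, False            # duplicate: nothing is written
        # Start a new object when the chunk would overflow the current one.
        if len(self.objects[-1]) + len(chunk) > self.object_size:
            self.objects.append(bytearray())
        obj_id = len(self.objects) - 1
        offset = len(self.objects[obj_id])
        self.objects[obj_id] += chunk
        self.index[fp] = (obj_id, offset, len(chunk))
        return fp, True

    def get(self, fp: str) -> bytes:
        """Read a chunk back by fingerprint."""
        obj_id, offset, length = self.index[fp]
        return bytes(self.objects[obj_id][offset:offset + length])


def store_file(store, chunks):
    """Store a file given as a list of chunks; return its recipe of fingerprints."""
    return [store.put(chunk)[0] for chunk in chunks]


if __name__ == "__main__":
    store = ObjectStore(object_size=16)
    recipe_a = store_file(store, [b"header", b"body-1234", b"footer"])
    recipe_b = store_file(store, [b"header", b"body-5678", b"footer"])
    # The shared header and footer are stored once; only the bodies differ.
    print(len(store.index), "unique chunks in", len(store.objects), "objects")
    print(b"".join(store.get(fp) for fp in recipe_b))
```

A file is reconstructed from its recipe by looking up each fingerprint in the index, so a chunk shared by many files occupies space in exactly one object.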
A cryptographic network file system has to guarantee the confidentiality and integrity of its files, and it must also support random access. For this purpose, existing designs mainly rely on an (often ad hoc) combination of a Merkle hash tree with a block cipher mode of encryption. In this paper, we propose a new design based on a MAC tree construction that uses a universal-hash-based stateful MAC. The new design admits a security proof in the standard model and also performs better than a Merkle hash tree. We formally define the security notions for file encryption and prove that our scheme provides both confidentiality and integrity. We implement our scheme in coreFS, a user-level network file system, and evaluate its performance against the standard design. Experimental results confirm that our construction provides integrity protection at a smaller cost.
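To make the idea of a MAC tree with random access concrete, here is a minimal sketch of integrity checking for individual file blocks. It is not the paper's construction: HMAC-SHA256 stands in for the universal-hash-based stateful MAC, encryption is omitted entirely, and the names `MacTree`, `mac`, `write`, and `verify` are hypothetical. The sketch only shows why one block can be updated or verified by touching a single root-to-leaf path.

```python
import hashlib
import hmac


def mac(key: bytes, *parts: bytes) -> bytes:
    """HMAC-SHA256 over length-prefixed parts (a stand-in for the paper's MAC)."""
    h = hmac.new(key, digestmod=hashlib.sha256)
    for part in parts:
        h.update(len(part).to_bytes(4, "big"))
        h.update(part)
    return h.digest()


class MacTree:
    """Hypothetical binary tree of MAC tags over file blocks (heap layout)."""

    def __init__(self, key: bytes, blocks):
        n = 1
        while n < len(blocks):          # pad the leaf count to a power of two
            n *= 2
        self.key = key
        self.n = n
        self.blocks = list(blocks) + [b""] * (n - len(blocks))
        self.nodes = [b""] * (2 * n)    # nodes[1] is the root, leaves start at n
        for i in range(n):
            self.nodes[n + i] = self._leaf(i, self.blocks[i])
        for j in range(n - 1, 0, -1):
            self.nodes[j] = self._inner(j, self.nodes[2 * j], self.nodes[2 * j + 1])

    def _leaf(self, i: int, block: bytes) -> bytes:
        return mac(self.key, b"leaf", i.to_bytes(8, "big"), block)

    def _inner(self, j: int, left: bytes, right: bytes) -> bytes:
        return mac(self.key, b"node", j.to_bytes(8, "big"), left, right)

    @property
    def root(self) -> bytes:
        return self.nodes[1]

    def write(self, i: int, block: bytes) -> None:
        """Random-access update: recompute only the tags on the path to the root."""
        self.blocks[i] = block
        j = self.n + i
        self.nodes[j] = self._leaf(i, block)
        j //= 2
        while j >= 1:
            self.nodes[j] = self._inner(j, self.nodes[2 * j], self.nodes[2 * j + 1])
            j //= 2

    def verify(self, i: int, trusted_root: bytes) -> bool:
        """Check block i against a root tag that the client keeps in trusted storage."""
        j = self.n + i
        tag = self._leaf(i, self.blocks[i])
        while j > 1:
            sibling = self.nodes[j ^ 1]
            left, right = (tag, sibling) if j % 2 == 0 else (sibling, tag)
            j //= 2
            tag = self._inner(j, left, right)
        return hmac.compare_digest(tag, trusted_root)


if __name__ == "__main__":
    tree = MacTree(b"k" * 32, [b"block-0", b"block-1", b"block-2"])
    root = tree.root
    print(tree.verify(1, root))         # block matches the trusted root
    tree.blocks[1] = b"tampered"        # modify the block behind the tree's back
    print(tree.verify(1, root))         # verification now fails
```

Both `write` and `verify` compute a number of MAC tags proportional to the tree height rather than to the file size, which is the property that makes tree-based integrity compatible with random access.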