2012 IEEE 28th Symposium on Mass Storage Systems and Technologies (MSST), 2012
DOI: 10.1109/msst.2012.6232380

Design of an exact data deduplication cluster

Cited by 26 publications
(15 citation statements)
References 18 publications
“…Other distributed systems assume nodes with individual CPU and RAM that have access to a shared storage device abstraction where nodes perform deduplication in parallel. This allows the sharing of metadata information between nodes by keeping it on the shared storage device, which otherwise would have to be sent over the network [Clements et al 2009;Kaiser et al 2012]. Finally, distinct nodes may handle distinct tasks.…”
Section: Scope (mentioning)
confidence: 99%
“…Although specific details are not presented, several gateways can be combined to perform deduplication over a common data repository, thus allowing global distributed deduplication. As a distinct approach, the dedupv1 centralized design can be extended over a shared storage device (SAN) where several nodes have exclusive access to their own data partitions [Kaiser et al 2012]. Nodes are seen as independent dedupv1 nodes that export their own iSCSI interface, partition data, compute hashes, and map chunk requests to the correct nodes.…”
Section: Backup and Archival Storage (mentioning)
confidence: 99%
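The statement above says that nodes compute hashes and map chunk requests to the correct nodes, but it does not say how that mapping is done. The following is a minimal sketch of one possible scheme, assuming (hypothetically) that the owner of a chunk is chosen by taking a prefix of its SHA-1 fingerprint modulo the number of nodes. The names `fingerprint`, `owner_node`, `Node`, and `write_chunks` are illustrative and are not taken from dedupv1.

```python
import hashlib


def fingerprint(chunk: bytes) -> bytes:
    """Return a SHA-1 fingerprint of the chunk (hash choice is an assumption)."""
    return hashlib.sha1(chunk).digest()


def owner_node(fp: bytes, num_nodes: int) -> int:
    """Pick the node responsible for a fingerprint.

    Assumption: the first 8 bytes of the fingerprint, read as an integer,
    are reduced modulo the node count. Real systems may partition differently.
    """
    return int.from_bytes(fp[:8], "big") % num_nodes


class Node:
    """A node that holds the fingerprint index for its own partition only."""

    def __init__(self, node_id: int) -> None:
        self.node_id = node_id
        self.index = {}  # fingerprint -> position in this node's partition

    def lookup_or_insert(self, fp: bytes) -> bool:
        """Return True if the fingerprint was already known (duplicate)."""
        if fp in self.index:
            return True
        self.index[fp] = len(self.index)
        return False


def write_chunks(nodes, chunks) -> int:
    """Route each chunk to its owner node and count detected duplicates."""
    duplicates = 0
    for chunk in chunks:
        fp = fingerprint(chunk)
        node = nodes[owner_node(fp, len(nodes))]
        if node.lookup_or_insert(fp):
            duplicates += 1
    return duplicates
```

Because the same fingerprint always routes to the same node, each lookup touches exactly one partition, and identical chunks are detected no matter which node received the write.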
“…On the other hand, a decentralized approach to distributing deduplication metadata management across multiple servers [10,7,8,5,21,9,15,12] require additional hardware and software resource cost for multiple deduplication servers. In order to reduce such additional cost, simple DB-sharding approach that embeds the DB-shard of the whole dedup metadata database on each storage server has been proposed [13]. However, this DB-sharding approach to SN-SS suffers from inherited problems, i.e., to identify a duplicate chunk, the fingerprint lookup must be broadcasted to all DB-shards in the cluster.…”
Section: Introduction (mentioning)
confidence: 99%
“…A major class of data deduplication systems is called fingerprinting-based data deduplication [25,37,17,3,21,5,6,12,34]. The generic design for backup-oriented deduplication systems splits the data stream into chunks.…”
Section: Introduction (mentioning)
confidence: 99%
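The generic design mentioned in the statement above, splitting a data stream into chunks and identifying each chunk by a fingerprint, can be sketched as follows. This is a simplified illustration and not the method of any cited system: it assumes fixed-size 4096-byte chunks and SHA-256 fingerprints, whereas backup systems often use content-defined chunking. The names `split_fixed`, `deduplicate`, and `restore` are hypothetical.

```python
import hashlib


def split_fixed(data: bytes, chunk_size: int = 4096):
    """Yield consecutive fixed-size chunks (assumption: fixed-size chunking)."""
    for offset in range(0, len(data), chunk_size):
        yield data[offset:offset + chunk_size]


def deduplicate(data: bytes, chunk_size: int = 4096):
    """Store each distinct chunk once and record the order of fingerprints.

    Returns (store, recipe): store maps fingerprint -> chunk bytes, and
    recipe lists the fingerprints needed to rebuild the original stream.
    """
    store = {}
    recipe = []
    for chunk in split_fixed(data, chunk_size):
        fp = hashlib.sha256(chunk).hexdigest()
        if fp not in store:  # only previously unseen chunks consume space
            store[fp] = chunk
        recipe.append(fp)
    return store, recipe


def restore(store, recipe) -> bytes:
    """Rebuild the original stream from the chunk store and the recipe."""
    return b"".join(store[fp] for fp in recipe)
```

For a stream in which the same 4096-byte block occurs several times, `store` holds that block once while `recipe` still references it at every position, so `restore` returns the original bytes.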