2012
DOI: 10.1145/2385603.2385606

WAN-optimized replication of backup datasets using stream-informed delta compression

Abstract: Replicating data off-site is critical for disaster recovery reasons, but the current approach of transferring tapes is cumbersome and error-prone. Replicating across a wide area network (WAN) is a promising alternative, but fast network connections are expensive or impractical in many remote locations, so improved compression is needed to make WAN replication truly practical. We present a new technique for replicating backup datasets across a WAN that not only eliminates duplicate regions of files (deduplication)…

Cited by 95 publications (69 citation statements)
References 19 publications
“…of these approaches for resemblance detection incur high overheads of computation and categorization. Shilane et al. proposed a stream-informed delta compression (SIDC) approach, used in a WAN environment to reduce the transmission of similar data and thus speed up data replication [9]. The approach is super-feature based and augments block-level deduplication by detecting resemblance only among the non-duplicate blocks held in a cache that preserves the backup stream locality.…”
Section: Resemblance Detection Based Data Reduction
confidence: 99%
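The super-feature idea in the statement above can be illustrated with a short sketch. The code below is a hypothetical, minimal illustration rather than the authors' implementation: it derives several features from a chunk by taking the maximum of randomized linear hashes over rolling windows, groups the features into super-features, and treats any non-duplicate chunk that shares a super-feature with a cached chunk as a delta-compression candidate. The function names, window size, and feature/super-feature counts are all assumptions made for the example.

```python
import hashlib
import random

random.seed(42)                 # fixed seed so the sketch is deterministic

FEATURES_PER_SF = 4             # features grouped into one super-feature (assumed)
NUM_SUPER_FEATURES = 3          # super-features computed per chunk (assumed)
WINDOW = 32                     # rolling-window size in bytes (assumed)
MOD = 1 << 32

# One random (multiplier, addend) pair per feature, so every feature is an
# independent "maximum over all windows" value computed from the same chunk.
_PARAMS = [(random.randrange(1, MOD) | 1, random.randrange(MOD))
           for _ in range(FEATURES_PER_SF * NUM_SUPER_FEATURES)]

def features(chunk: bytes) -> list[int]:
    """Compute one feature per (m, a) pair: max of (m*h + a) over window hashes."""
    window_hashes = [hash(chunk[i:i + WINDOW]) & (MOD - 1)
                     for i in range(max(1, len(chunk) - WINDOW + 1))]
    return [max((m * h + a) % MOD for h in window_hashes) for m, a in _PARAMS]

def super_features(chunk: bytes) -> list[int]:
    """Hash each group of FEATURES_PER_SF features down to one super-feature."""
    feats = features(chunk)
    sfs = []
    for i in range(NUM_SUPER_FEATURES):
        group = feats[i * FEATURES_PER_SF:(i + 1) * FEATURES_PER_SF]
        digest = hashlib.sha1(b"".join(f.to_bytes(4, "big") for f in group)).digest()
        sfs.append(int.from_bytes(digest[:4], "big"))
    return sfs

# Cache of super-features from recently seen, non-duplicate chunks.
sf_cache: dict[int, bytes] = {}

def find_delta_base(chunk: bytes) -> bytes | None:
    """Return a similar cached chunk to use as a delta base, if any SF matches."""
    base = None
    for sf in super_features(chunk):
        if base is None and sf in sf_cache:
            base = sf_cache[sf]
        sf_cache[sf] = chunk            # remember this chunk in stream order
    return base
```

A real system would bound the cache and load it in units aligned with the backup stream, which is the "stream-informed" aspect the quoted statement refers to.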
“…Data deduplication is an efficient data reduction approach that not only reduces storage space [4], [5], [6], [7], [8], [9], [10] by eliminating duplicate data but also minimizes the transmission of redundant data in low-bandwidth network environments [11], [12], [13], [14]. In general, a chunk-level data deduplication scheme splits the data blocks of a data stream (e.g., backup files, databases, and virtual machine images) into multiple data chunks that are each uniquely identified and duplicate-detected by a secure SHA-1 or MD5 hash signature (also called a fingerprint) [5], [11].…”
Section: Introduction
confidence: 99%
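As a concrete illustration of the chunk-level scheme this statement describes, here is a minimal sketch under simplifying assumptions: content-defined chunking is approximated with a crude rolling byte condition, and duplicates are detected through an in-memory SHA-1 fingerprint index. The chunking parameters and the in-memory index are illustrative choices, not the design of any cited system.

```python
import hashlib

AVG_MASK = (1 << 12) - 1          # ~4 KiB average chunk size (assumed)
MIN_CHUNK, MAX_CHUNK = 1024, 16384

def chunk_stream(data: bytes):
    """Very simplified content-defined chunking: cut when a rolling byte value
    satisfies a boundary condition, constrained to [MIN_CHUNK, MAX_CHUNK]."""
    start, rolling = 0, 0
    for i, b in enumerate(data):
        rolling = ((rolling << 1) + b) & 0xFFFFFFFF
        size = i - start + 1
        if (size >= MIN_CHUNK and (rolling & AVG_MASK) == 0) or size >= MAX_CHUNK:
            yield data[start:i + 1]
            start, rolling = i + 1, 0
    if start < len(data):
        yield data[start:]

fingerprint_index: set[bytes] = set()   # fingerprints of chunks already stored

def deduplicate(data: bytes):
    """Return (unique_chunks, duplicate_count) for one backup stream."""
    unique, dupes = [], 0
    for chunk in chunk_stream(data):
        fp = hashlib.sha1(chunk).digest()   # the chunk's fingerprint
        if fp in fingerprint_index:
            dupes += 1                      # duplicate: store only a reference
        else:
            fingerprint_index.add(fp)
            unique.append(chunk)
    return unique, dupes
```

Feeding two backups of the same data through deduplicate() would store the chunks once and count the second pass almost entirely as duplicates.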
“…While data deduplication has been widely deployed in storage systems for space savings, fingerprint-based deduplication approaches have an inherent drawback: they often fail to detect similar chunks that are largely identical except for a few modified bytes, because their secure hash digests will be totally different even if only one byte of a data chunk is changed [4], [5], [12], [15], [16]. This becomes a big challenge when applying data deduplication to storage datasets and workloads with frequently modified data, which demands an effective and efficient way to eliminate redundancy among frequently modified, and thus similar, data.…”
Section: Introduction
confidence: 99%
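The drawback described here is easy to demonstrate: changing a single byte of a chunk yields a completely unrelated SHA-1 fingerprint, so an exact-match fingerprint index cannot relate two nearly identical chunks. A small illustrative sketch (not taken from the cited papers):

```python
import hashlib

chunk_a = b"A" * 4096                        # original 4 KiB chunk
chunk_b = b"A" * 2048 + b"B" + b"A" * 2047   # same chunk with one byte changed

fp_a = hashlib.sha1(chunk_a).hexdigest()
fp_b = hashlib.sha1(chunk_b).hexdigest()

# The chunks differ in 1 of 4096 bytes, yet their fingerprints share nothing,
# so exact-match deduplication treats chunk_b as entirely new data.
print(fp_a)
print(fp_b)
print("identical fingerprints?", fp_a == fp_b)   # False
```

This is exactly the gap that resemblance detection and delta compression are meant to close.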