2011
DOI: 10.1109/TC.2010.263

Efficient Deduplication Techniques for Modern Backup Operation

Cited by 77 publications (39 citation statements)
References 22 publications
“…It is also frequently used to calculate key values in recent big data and cloud computing environments [3,4,5]. It is critical to develop high-speed SHA-1 hardware, since the performance of the entire system depends on the speed of key creation, especially in a key-value store environment [6,7,8].…”
Section: Introduction (mentioning)
confidence: 99%
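The role SHA-1 plays here is easy to picture: in a deduplicating key-value store, each chunk's key is its content hash, so duplicate detection reduces to a key lookup. A minimal Python sketch; the store and function names are illustrative, not taken from the cited papers:

```python
import hashlib

store = {}  # hypothetical in-memory key-value store

def put_chunk(chunk: bytes) -> str:
    """Use the chunk's SHA-1 digest as its key-value store key.

    Identical chunks hash to the same key, so a duplicate is
    detected by a lookup and never written a second time; key
    creation speed therefore bounds overall system throughput,
    which is the motivation for fast SHA-1 hardware.
    """
    key = hashlib.sha1(chunk).hexdigest()
    if key not in store:
        store[key] = chunk
    return key
```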
“…Nowadays, researchers suggest setting the expected chunk size by rule of thumb. For example, 4 KB [17] or 8 KB [3,7,18] are considered reasonable by some researchers; Symantec Storage Foundation 7.0 (Mountain View, CA, USA) recommends a chunk size of 16 KB or higher [19]; IBM (Armonk, NY, USA) mentioned the average chunk size for most deduplicated files is about 100 KB [20]. However, these expected chunk sizes lack either theoretical proof or experimental evaluation.…”
Section: Related Work (mentioning)
confidence: 99%
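For context on where these figures come from: in mask-based CDC, the expected chunk size is determined by the width of the boundary mask, so each rule of thumb above corresponds to a choice of mask bits. A small sketch, assuming a uniformly distributed rolling hash:

```python
# A cut-point is declared when rolling_hash & mask == 0 with
# mask = 2**k - 1; under a uniform hash this fires with
# probability 2**-k per position, so the expected chunk size
# is roughly 2**k bytes (before min/max size clamping).
def mask_bits(expected_size: int) -> int:
    """Mask width k for a power-of-two expected chunk size."""
    k = expected_size.bit_length() - 1
    assert 1 << k == expected_size, "use a power-of-two size"
    return k

print(mask_bits(4 * 1024))   # 12 -> 4 KB chunks [17]
print(mask_bits(8 * 1024))   # 13 -> 8 KB chunks [3,7,18]
print(mask_bits(16 * 1024))  # 14 -> 16 KB chunks [19]
```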
“…At present, deduplication is widely used in secondary storage systems such as backup or archival systems [1][2][3][4], and is also gradually being used in primary storage systems such as file systems [5,6]. Content-defined chunking (CDC) [7] can achieve high duplicate elimination ratios (DERs) and is therefore the most widely used data chunking algorithm.…”
Section: Introduction (mentioning)
confidence: 99%
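To make the CDC idea concrete, here is a minimal Gear-style chunker in Python. The random table, 13-bit mask (about 8 KB expected chunks), and size bounds are illustrative choices, not the parameters of [7], which is based on Rabin fingerprints:

```python
import hashlib

# Deterministic per-byte random table (stands in for the
# tables used by Gear/Rabin rolling hashes).
GEAR = [int.from_bytes(hashlib.sha1(bytes([b])).digest()[:4], "big")
        for b in range(256)]

def cdc_chunks(data: bytes, mask=(1 << 13) - 1,
               min_size=2048, max_size=65536):
    """Yield content-defined chunks of data.

    The rolling hash is updated per byte, and a boundary is
    declared when its low bits are all zero, so cut-points
    depend only on local content: an insertion or deletion
    shifts at most the chunks around it, leaving the rest
    byte-identical and therefore deduplicable.
    """
    start, h = 0, 0
    for i, b in enumerate(data):
        h = ((h << 1) + GEAR[b]) & 0xFFFFFFFF
        size = i - start + 1
        if (size >= min_size and (h & mask) == 0) or size >= max_size:
            yield data[start:i + 1]
            start, h = i + 1, 0
    if start < len(data):
        yield data[start:]
```

This locality of boundaries is what gives CDC its high DER compared with fixed-size chunking, where a single inserted byte shifts every later chunk boundary.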
“…Data de-duplication [4]-[6] can operate at the whole-file, block (chunk), or bit level. Whole-file de-duplication, or Single Instance Storage (SIS) [3], computes the hash value of the entire file, which serves as the file index.…”
Section: F. De-duplication Techniques (mentioning)
confidence: 99%
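The whole-file variant the excerpt describes is the simplest of the three levels: hash the entire file once and use the digest as the file index. A minimal sketch, with a hypothetical index layout:

```python
import hashlib

file_index = {}  # hypothetical index: whole-file SHA-1 -> stored data

def store_file(data: bytes) -> str:
    """SIS-style whole-file de-duplication.

    The hash of the entire file is the file index, so two
    byte-identical files are stored once. A single changed
    byte, however, forces a full second copy, which is why
    the finer block- and bit-level schemes exist.
    """
    digest = hashlib.sha1(data).hexdigest()
    if digest not in file_index:
        file_index[digest] = data  # first copy stores the payload
    return digest                  # later copies hit the index only
```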
“…This method of detecting duplicates is file-level de-duplication. Extreme Binning [4] uses this approach by dividing the chunk index into two tiers, namely the primary index and the bin [4]. The primary index contains the representative chunk ID, the whole-file hash, and a pointer to the bin.…”
Section: E. File Level De-duplication - Extreme Binning (mentioning)
confidence: 99%
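The two-tier layout described above can be sketched as follows. The choice of the minimum chunk hash as the representative chunk ID follows the usual description of Extreme Binning [4]; the surrounding names and in-memory structures are illustrative:

```python
import hashlib

primary_index = {}  # representative chunk ID -> (whole-file hash, bin)

def backup_file(chunks: list) -> None:
    """Two-tier lookup in the style of Extreme Binning [4].

    Only the small primary index (representative chunk ID,
    whole-file hash, bin pointer) must stay in RAM; the full
    per-chunk index lives in the bin and is consulted only
    when the representative ID of an incoming file matches.
    """
    hashes = [hashlib.sha1(c).hexdigest() for c in chunks]
    rep_id = min(hashes)  # representative chunk ID
    file_hash = hashlib.sha1(b"".join(chunks)).hexdigest()

    entry = primary_index.get(rep_id)
    if entry and entry[0] == file_hash:
        return  # identical file already stored: index hit only
    bin_ = entry[1] if entry else {}  # bin: chunk hash -> chunk
    for h, c in zip(hashes, chunks):
        bin_.setdefault(h, c)         # deduplicate within the bin
    primary_index[rep_id] = (file_hash, bin_)
```

Keeping only one representative entry per file in RAM is the design point: similar files share a representative chunk and land in the same bin, so most duplicate chunks are found with a single bin fetch rather than a disk lookup per chunk.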