2010 IEEE International Symposium on Modeling, Analysis and Simulation of Computer and Telecommunication Systems (MASCOTS 2010)
DOI: 10.1109/mascots.2010.37

Frequency Based Chunking for Data De-Duplication

Cited by 56 publications (18 citation statements)
References 7 publications
“…Samuel et al. [11] presented the design of a system for composing and enforcing context-aware disclosure rules for preserving privacy and security of multimedia big data systems. Lu et al. [12] proposed a frequency-based chunking algorithm, which explicitly considers the frequency information of data segments during the chunking process. Yu et al. [13] presented the leap-based CDC algorithm and added a secondary condition to it in order to reduce the computing overhead and maintain the same deduplication ratio.…”
Section: Related Work
confidence: 99%
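The content-defined chunking (CDC) idea with a secondary cut condition, as mentioned in the statement above, can be illustrated with a small Python sketch. The gear-style rolling hash, the random table, the masks, and the size thresholds below are illustrative assumptions, not the exact algorithms of Lu et al. [12] or Yu et al. [13]: a strict condition sets the target average chunk size, and a weaker secondary condition takes over once a chunk grows past a "normal" size so oversized chunks are cut sooner.

import random

random.seed(42)
GEAR = [random.getrandbits(64) for _ in range(256)]  # per-byte random table for the rolling hash

MIN_CHUNK = 2 * 1024       # never cut before this many bytes
NORMAL = 8 * 1024          # past this size, switch to the relaxed condition
MAX_CHUNK = 64 * 1024      # hard upper bound on chunk size
STRICT_MASK = 0x1FFF       # primary cut condition
RELAXED_MASK = 0x03FF      # secondary, easier-to-hit cut condition

def cdc_chunks(data: bytes):
    """Yield content-defined chunks of `data` using a gear rolling hash."""
    start, h = 0, 0
    for i, b in enumerate(data):
        h = ((h << 1) + GEAR[b]) & 0xFFFFFFFFFFFFFFFF
        length = i - start + 1
        if length < MIN_CHUNK:
            continue
        mask = STRICT_MASK if length < NORMAL else RELAXED_MASK
        if (h & mask) == 0 or length >= MAX_CHUNK:
            yield data[start:i + 1]
            start, h = i + 1, 0      # reset the hash at each chunk boundary
    if start < len(data):
        yield data[start:]           # trailing bytes form the last chunk

For example, list(cdc_chunks(open("input.bin", "rb").read())) returns the chunk list for a file; the boundaries depend only on local content, so shared regions of two files tend to produce identical chunks.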
“…On the other hand, the building-up algorithm divides the stream into small chunks that are then composed when the deduplication gain is not affected. Moreover, a variant of the breaking-apart algorithm can be combined with a statistical chunk frequency estimation algorithm, further dividing large chunks that contain smaller chunks appearing frequently in the data stream and consequently allowing higher space savings [Lu et al. 2010].…”
Section: Granularity
confidence: 99%
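As a rough illustration of how a breaking-apart pass might combine with a statistical chunk frequency estimate, the Python sketch below first counts fixed-size sub-chunks and then re-cuts large chunks around sub-chunks that appear often. The fixed sub-chunk size, the frequency threshold, and the two-pass structure are assumptions for illustration, not the algorithm of Lu et al. [2010].

from collections import Counter

SUB = 1024          # fixed-size sub-chunk used for frequency estimation
FREQ_THRESHOLD = 4  # a sub-chunk seen at least this often counts as "frequent"
LARGE = 16 * 1024   # only chunks at least this big are reconsidered

def estimate_frequencies(chunks):
    """First pass: count how often each fixed-size sub-chunk appears."""
    counts = Counter()
    for c in chunks:
        for off in range(0, len(c) - SUB + 1, SUB):
            counts[c[off:off + SUB]] += 1
    return counts

def break_apart(chunks, counts):
    """Second pass: cut large chunks at the boundaries of frequent sub-chunks."""
    for c in chunks:
        if len(c) < LARGE:
            yield c
            continue
        cuts, pos = [0], 0
        while pos + SUB <= len(c):
            if counts[c[pos:pos + SUB]] >= FREQ_THRESHOLD:
                if pos > cuts[-1]:
                    cuts.append(pos)          # end the preceding piece
                cuts.append(pos + SUB)        # isolate the frequent sub-chunk
            pos += SUB
        if cuts[-1] < len(c):
            cuts.append(len(c))
        for a, b in zip(cuts, cuts[1:]):
            yield c[a:b]

A typical pipeline under these assumptions would be: chunks = list(cdc_chunks(data)); counts = estimate_frequencies(chunks); final = list(break_apart(chunks, counts)), so that frequent pieces inside large chunks can be deduplicated on their own.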
“…Data compression reduces the file size by eliminating redundant data contained in a document, while data deduplication identifies duplicate data elements, such as an entire file [13,14] and data block [15][16][17][18][19][20][21][22][23], and eliminates both intra-file and inter-file data redundancy, hence reducing the data to be transferred or stored. When multiple instances of the same data element are detected, only one single copy of the data element is transferred or stored.…”
Section: B. Data Deduplication
confidence: 99%
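A minimal sketch of the whole-file flavour of this idea, assuming an in-memory dictionary stands in for the storage back end: every file is fingerprinted, identical contents are stored once, and each path keeps only a reference to that single copy.

import hashlib

store = {}     # fingerprint -> file content (single stored copy)
catalog = {}   # file path -> fingerprint (reference to the copy)

def add_file(path: str, content: bytes):
    fp = hashlib.sha256(content).hexdigest()
    if fp not in store:      # first instance: store the data once
        store[fp] = content
    catalog[path] = fp       # later instances: keep only a reference

def read_file(path: str) -> bytes:
    return store[catalog[path]]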
“…The redundant data element is replaced with a reference or pointer to the unique data copy. Based on the algorithm granularity, data deduplication algorithms can be classified into three categories: whole file hashing [13,14], sub-file hashing [15][16][17][18][19][20][21][22][23], and delta encoding [24]. Traditional data de-duplication operates at the application layer, such as object caching, to eliminate redundant data transfers.…”
Section: B. Data Deduplication
confidence: 99%
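The sub-file (block-level) case can be sketched the same way, here with fixed 4 KiB blocks purely for brevity (a content-defined chunker such as the one sketched earlier could supply variable-size chunks instead): each chunk is fingerprinted, stored at most once, and the file itself becomes a recipe of references to the unique copies. The in-memory index is again an assumption for illustration.

import hashlib

BLOCK = 4096
chunk_store = {}   # fingerprint -> chunk bytes (unique copies only)

def dedup_blocks(data: bytes):
    """Return the file as a recipe: a list of chunk fingerprints."""
    recipe = []
    for off in range(0, len(data), BLOCK):
        chunk = data[off:off + BLOCK]
        fp = hashlib.sha256(chunk).hexdigest()
        chunk_store.setdefault(fp, chunk)   # duplicate chunks are not re-stored
        recipe.append(fp)                   # reference (pointer) to the unique copy
    return recipe

def rebuild(recipe):
    return b"".join(chunk_store[fp] for fp in recipe)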