2020
DOI: 10.1109/access.2019.2963625

Function of Content Defined Chunking Algorithms in Incremental Synchronization

Abstract: Data chunking algorithms divide data into several small chunks, so that an operation on the whole data becomes a set of operations on small chunks. Data chunking algorithms have been widely used in duplicate data detection, parallel computing, and other fields, but they are seldom used in incremental data synchronization. Aiming at the characteristics of incremental data synchronization, this paper proposes a novel data chunking algorithm. By dividing two pieces of data that need synchronization into s…

Cited by 10 publications
(7 citation statements)
references
References 31 publications
“…If the cutoff condition is not satisfied, slide the window to the next byte and re-perform the comparison of the sum of 1s with the threshold. This process is repeated until all the cut points and chunks of a data stream are found [14]. The algorithm's operation is shown in Fig.…”
Section: Rapid Asymmetric Maximum (RAM) Algorithm (mentioning)
confidence: 99%
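The sliding-window loop described in the statement above can be sketched roughly as follows. This is a minimal illustration, not the cited paper's implementation: the function name `popcount_chunks`, the parameters `window` and `threshold`, the `>=` cutoff direction, and placing the cut point at the end of the window are all assumptions made for the example.

```python
def popcount_chunks(data: bytes, window: int = 5, threshold: int = 26) -> list[bytes]:
    """Split data wherever the number of 1-bits in a sliding window reaches threshold.

    Hypothetical sketch: window size, threshold, and the cut position
    (end of the matching window) are assumed, not taken from the paper.
    """
    chunks = []
    start = 0  # first byte of the chunk currently being built
    i = 0      # first byte of the current window
    while i + window <= len(data):
        # Sum of 1-bits over the bytes in the window.
        ones = sum(bin(b).count("1") for b in data[i:i + window])
        if ones >= threshold:
            # Cutoff condition satisfied: close the chunk at the window's end
            # and restart the window right after the cut point.
            cut = i + window
            chunks.append(data[start:cut])
            start = cut
            i = cut
        else:
            # Not satisfied: slide the window to the next byte and compare again.
            i += 1
    if start < len(data):
        chunks.append(data[start:])  # trailing bytes form the last chunk
    return chunks


if __name__ == "__main__":
    original = bytes([0x00] * 10 + [0xFF] * 2 + [0x00] * 10)
    shifted = b"\x00" + original  # one byte inserted at the front
    print([len(c) for c in popcount_chunks(original, window=2, threshold=16)])
    print([len(c) for c in popcount_chunks(shifted, window=2, threshold=16)])
```

Because the cut point depends only on the bytes inside the window, inserting a byte before it moves the boundary along with the content, so in this example the chunk after the boundary is the same in both runs.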
“…In the worst case that a byte is added, deleted, or changed near the chunk boundary, that chunk and the subsequent one are affected. One of the PCI drawbacks is the varying size of chunks due to the geometric distribution [14]. Besides, the ability to eliminate low entropy strings is another aspect to consider.…”
Section: Rapid Asymmetric Maximum (RAM) Algorithm (mentioning)
confidence: 99%
“…Therefore, when the CDC algorithms are applied to delta synchronization, we need to focus on the byte-shifting ability, which affects the stability of the cut-point and then affects the accuracy of the differential data. To better apply the CDC algorithm to the differential data acquisition, we proposed the PCI algorithm 17 in the previous study, which enhanced the ability to resist byte-shifting through small window pattern matching. The accuracy of differential data acquisition is improved.…”
Section: Figure 1 Communication Flow of Rsync Algorithm (mentioning)
confidence: 99%