Assuring Demanded Read Performance of Data Deduplication Storage with Backup Datasets

Nam, Young Jin; Park, Dongchul; Du, David H. C.

doi:10.1109/mascots.2012.32

Cited by 48 publications

(19 citation statements)

References 7 publications


“…They enumerate the spatial area by using a selective duplication threshold value. Their experiments with the actual backup datasets determine that the proposed scheme achieves requested read performance in most cases at the realistic cost of write performance [8].…”

Section: International Journal Of Computer Applications (0975-8887) (mentioning)

confidence: 99%

Survey on Data Deduplication for Cloud Storage to Reduce Fragmentation

Fegade, Bharati

2016

IJCA


Data deduplication is an important technique that stores more information in less space. Major enterprises can reduce the cost and maintenance burden of their backup storage systems by keeping backups in cloud storage. Deduplication minimizes data redundancy across different kinds of stored data. Treating each application separately and storing its associated data distinctly can improve overall disk usage considerably. Cloud backup systems use data deduplication to eliminate duplicate chunks that appear in multiple files. Duplicate chunks are replaced with references to chunks already present, rather than being stored again in cloud storage. In practice, successive chunks are stored scattered across numerous segments (the storage unit of the cloud) in the backup system.
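The replacement of duplicate chunks by references, as described in the abstract above, can be illustrated with a minimal sketch. It assumes chunks arrive already split, uses SHA-256 as the fingerprint, and keeps the chunk store in a plain dictionary; the names `deduplicate` and `restore` are hypothetical and are not taken from any of the cited papers.

```python
import hashlib


def deduplicate(chunks, store=None):
    """Store each unique chunk once and return a recipe of fingerprints.

    `store` maps a fingerprint to chunk bytes (an assumed in-memory stand-in
    for the backup storage); `recipe` records the logical chunk order.
    """
    store = {} if store is None else store
    recipe = []
    for chunk in chunks:
        fingerprint = hashlib.sha256(chunk).hexdigest()
        if fingerprint not in store:
            # First occurrence: the chunk data is actually written.
            store[fingerprint] = chunk
        # Every occurrence, duplicate or not, is recorded as a reference.
        recipe.append(fingerprint)
    return recipe, store


def restore(recipe, store):
    """Rebuild the original byte stream by following the references."""
    return b"".join(store[fingerprint] for fingerprint in recipe)


recipe, store = deduplicate([b"aaaa", b"bbbb", b"aaaa"])
unique_chunks = len(store)
restored = restore(recipe, store)
```

In a real backup system the references in a recipe may point at chunks written during much older backups, which is why a restore can require reads from many scattered locations.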


“…J. Nam, D. Park, and D. H. Du [2], author proposed a novel indicator for dedupe scheme. They proposed scheme provides two-fold approach, first, a novel indicator for dedupe scheme called cache-aware Chunk Fragmentation Level (CFL) monitor and second selective duplication for improvement read performance.…”

Section: Related Work (mentioning)

confidence: 99%

“…Proposed scheme assures demanded read performance of each data stream while completing its write performance at a practical level and also certain a target system recovery time. Major drawback of selective duplication is that it requires extra memory space called in-memory temp container [2].…”

Section: Comparison (mentioning)

confidence: 99%

Survey on Fragmentation for Deduplication in Backup Storage

2015

IJSR


Abstract: Deduplication yields major advantages in backup environments. Deduplication is the automatic elimination of duplicate data in a storage system and is the most effective technique for reducing storage costs. Deduplication predictably results in data fragmentation, because logically contiguous data is spread across many disk locations. Fragmentation is mainly caused by duplicates from previous backups of the same backup set; such duplicates are frequent because repeated full backups contain a lot of unchanged data. Systems with in-line deduplication detect duplicates during writing and avoid storing them, and the resulting fragmentation scatters data from the latest backup across older backups. This survey focuses on various techniques for inline deduplication. According to the literature, there is a need for a deduplication approach that reduces time and storage space. A novel method is proposed to avoid degrading restore performance without reducing write performance and without affecting deduplication effectiveness.


“…It also provides space-efficient VM image storage since VM images have high content similarities [5]. However, deduplication has a drawback of introducing fragmentation [6,8,10,14,15], since some blocks of a file may now refer to other identical blocks of a different file. To illustrate, Figure 1(a) shows three snapshots of a VM, denoted by VM 1 , VM 2 , and VM 3 , which are to be written to disk that initially has no data.…”

Section: Introduction (mentioning)

confidence: 99%

“…On the other hand, we believe that achieving high read throughput is necessary in any backup system. For instance, a fast restore operation can minimize the system downtime during disaster recovery [10,16].…”

Section: Introduction (mentioning)

confidence: 99%

RevDedup

Lee

2013

Proceedings of the 4th Asia-Pacific Workshop on Systems


Deduplication is known to effectively eliminate duplicates, yet it introduces fragmentation that degrades read performance. We propose RevDedup, a deduplication system that optimizes reads to the latest backups of virtual machine (VM) images using reverse deduplication. In contrast with conventional deduplication that removes duplicates from new data, RevDedup removes duplicates from old data, thereby shifting fragmentation to old data while keeping the layout of new data as sequential as possible. We evaluate our RevDedup prototype using a 12-week span of real-world VM image snapshots of 160 users. We show that RevDedup achieves high deduplication efficiency, high backup throughput, and high read throughput.

