2015
DOI: 10.1587/transinf.2014edp7242
A Study of Effective Replica Reconstruction Schemes for the Hadoop Distributed File System

Abstract: Distributed file systems, which manage large amounts of data over multiple commercially available machines, have attracted attention as management and processing systems for Big Data applications. A distributed file system consists of multiple data nodes and provides reliability and availability by holding multiple replicas of data. Due to system failure or maintenance, a data node may be removed from the system, and the data blocks held by the removed data node are lost. If data blocks are missing, the…
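The abstract describes the basic reconstruction problem: when a data node leaves, any block whose replica count falls below the target must be copied onto surviving nodes. A minimal sketch of this idea (not the paper's proposed scheme; node names and the greedy placement are illustrative, with the HDFS default replication factor of 3 assumed):

```python
# Minimal sketch (not the paper's scheme): when a data node is removed,
# blocks whose replica count drops below the target are re-replicated
# onto the remaining nodes.
REPLICATION = 3  # assumed target replica count, as in HDFS defaults

def reconstruct(placement, failed_node, nodes):
    """placement: dict mapping block id -> set of nodes holding a replica."""
    live = [n for n in nodes if n != failed_node]
    for block, holders in placement.items():
        holders.discard(failed_node)
        # greedily pick new hosts for under-replicated blocks
        for n in live:
            if len(holders) >= REPLICATION:
                break
            holders.add(n)
    return placement

nodes = ["dn1", "dn2", "dn3", "dn4"]
placement = {"blk_1": {"dn1", "dn2", "dn3"}, "blk_2": {"dn1", "dn3", "dn4"}}
placement = reconstruct(placement, "dn1", nodes)
```

A real scheduler would also rate-limit re-replication traffic and respect rack-awareness, which is where scheme design choices like those studied in the paper come in.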

Cited by 5 publications (3 citation statements). References 9 publications.
“…HDFS is the foundation of data storage management in distributed computing, which has the advantages of high reliability, strong expansibility, and throughput. The premise and goal of the system design are as follows [24][25][26][27]:…”
Section: Foundation (confidence: 99%)
See 1 more Smart Citation
“…HDFS is the foundation of data storage management in distributed computing, which has the advantages of high reliability, strong expansibility, and throughput. The premise and goal of the system design are as follows [24][25][26][27]:…”
Section: Foundationmentioning
confidence: 99%
“…With the aid of Formula (26) and the MapReduce model, the solution of the covariance matrix of the whole training set is solved. The specific algorithm is described as follows:…”
Section: Structure and Parallelization of the Pegasos Algorithm (confidence: 99%)
“…In order to ensure the reliability and availability of data storage, replication strategy and erasure codes have been more widely adopted in many current DSSs [3][4][5][6]. For example, Google File System (GFS) and Hadoop Distributed File System (HDFS) adopt multi-replication [7,8]. However, since multi-replication needs to store a large number of data to ensure high reliability, its storage cost is high.…”
Section: Introduction (confidence: 99%)
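The last citation statement contrasts the storage cost of multi-replication with that of erasure codes. The trade-off is simple arithmetic: n-way replication stores n times the raw data, while a Reed-Solomon code RS(k, m) stores (k + m)/k times the raw data. A small illustration (the RS(6, 3) parameters are an assumption, chosen because they are a common HDFS erasure-coding configuration, not figures from the cited works):

```python
# Storage overhead: total bytes stored per byte of user data.
def replication_overhead(copies):
    # n-way replication keeps n full copies of every block
    return float(copies)

def erasure_overhead(k, m):
    # RS(k, m): k data cells plus m parity cells per stripe
    return (k + m) / k

r3 = replication_overhead(3)   # 3.0x for HDFS-style triple replication
rs = erasure_overhead(6, 3)    # 1.5x for RS(6, 3)
```

This is why the statement calls multi-replication's storage cost high: triple replication doubles the overhead of an RS(6, 3) code while tolerating the same number (three) of lost replicas or cells per block.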