Proceedings of the 31st Annual ACM Symposium on Applied Computing 2016
DOI: 10.1145/2851613.2851945

An efficient Hadoop data replication method design for heterogeneous clusters

Abstract: The performance of each datanode in a heterogeneous Hadoop cluster differs, as does the number of slots available to execute tasks simultaneously. As a result, Hadoop is susceptible to replica placement, data replication, and allocation problems, which can degrade its performance. In this paper, we summarize existing research on improving data locality and design a data replication method to solve replicat…

Citations: cited by 8 publications (3 citation statements)
References: 3 publications
“…Another data placement approach considers the characteristics of the data such as popularity 15–18. For example, Xiong et al 15,16 utilized both the computing performance of each node and the popularity of each block evaluated using the time series and frequency of requests.…”
Section: Related Work (mentioning)
confidence: 99%
“…Tandon et al 17 proposed a scheme that replicates popular blocks to reduce the network overhead caused by remote access. Park et al 18 proposed a scheme that replicates the least frequently accessed and oldest blocks. Wang et al 19 and Wu et al 20 proposed data placement strategies focusing on correlated data based on the fact that data locality is increased if correlated data are accessed simultaneously.…”
Section: Related Work (mentioning)
confidence: 99%
“…Each slave node manages an individual map or reduce task as a TaskTracker [11] in Hadoop 1.0; in Hadoop 2.0 and later versions it is replaced by YARN services. The DataNode [12] provides a service that stores large data and runs MapReduce operations. The DataNodes communicate with the NameNode at regular intervals to update metadata, and the TaskTracker likewise communicates with the JobTracker regularly.…”
Section: B Slave Node [10] (mentioning)
confidence: 99%