In a heterogeneous Hadoop cluster, datanodes differ in performance and in the number of slots available to execute tasks simultaneously. For this reason, Hadoop is susceptible to replica placement and data replication problems, which lead to replication and allocation problems that can degrade its performance. In this paper, we summarize existing research on improving data locality and design a data replication method to solve the replication and allocation problems.