International Conference on Smart Structures and Systems - ICSSS'13, 2013
DOI: 10.1109/icsss.2013.6623017

Aggrandizing Hadoop in terms of node Heterogeneity & Data Locality

Cited by 5 publications (6 citation statements)
References 4 publications
“…The input to the task must be present on a node where task is supposed to be executed otherwise needs transferring input data which ultimately increased execution time. Sujitha et al [218] proposed a methodology to address the issues of heterogeneity and data locality in Hadoop.…”
Section: Locality Aware Data Placement In Heterogeneous Environment (mentioning)
confidence: 99%
“…To handle big data, a single machine is usually inadequate, so a cluster, which refers to a group of coordinated machines, needs to be set up to distribute the workload to different machines [73].…”
Section: Figure 1 Framework Architecture (mentioning)
confidence: 99%
“…The technologies such as Amazon EMR, which is a big data platform to quickly and effectively process huge amounts of data, can be employed to set up a cluster, and Apache Spark can be used to process data in the cluster environment. A cluster (see Figure 3 [75]) typically includes one master node and several worker nodes - a node is an individual machine in the cluster [73]. It can be managed by using tools such as Apache YARN, which is formed by two types of daemons: a ResourceManager running on the master node and NodeManager(s) running on the worker node(s) [76].…”
Section: Figure 2 Data Analysis Process [74] (mentioning)
confidence: 99%
“…MapReduce is processing large-scale data via the distributed, parallel programming approach [2, 3]. However, the map and reduce processes are not optimized for heterogeneous environment [4]. Various approaches have been proposed to improve MapReduce performance in heterogeneous environment [1,4,5,6].…”
Section: Introduction (mentioning)
confidence: 99%
“…However, the map and reduce processes are not optimized for heterogeneous environment [4]. Various approaches have been proposed to improve MapReduce performance in heterogeneous environment [1,4,5,6]. [1] proposes a data placement algorithm, namely Dynamic Data Placement (DDP), to resolve the unbalanced node workload problem in heterogeneous environment.…”
Section: Introduction (mentioning)
confidence: 99%