2015 Ninth International Conference on Complex, Intelligent, and Software Intensive Systems 2015
DOI: 10.1109/cisis.2015.37

Novel Data-Distribution Technique for Hadoop in Heterogeneous Cloud Environments

Abstract: The Hadoop framework has been developed to effectively process data-intensive MapReduce applications. Hadoop users specify the application's computation logic as a map function and a reduce function; such applications are often termed MapReduce applications. The Hadoop distributed file system stores the MapReduce application data on the Hadoop cluster nodes, called Datanodes, whereas the Namenode is a control point for all Datanodes. While its resilience is increased, its current data-distribution methodologies are not…
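The map/reduce split mentioned in the abstract can be illustrated with a minimal, single-process sketch. This is plain Python rather than the Hadoop API, and the names `map_fn`, `reduce_fn`, and `run_mapreduce` are hypothetical, chosen only for illustration; a word count stands in for an arbitrary MapReduce application.

```python
from collections import defaultdict


def map_fn(line):
    """Map step: emit a (word, 1) pair for every word in one input line."""
    return [(word.lower(), 1) for word in line.split()]


def reduce_fn(word, counts):
    """Reduce step: combine all values emitted for one key."""
    return word, sum(counts)


def run_mapreduce(lines):
    """Hypothetical driver: map every line, group by key, then reduce.

    A real Hadoop job runs these phases in parallel across cluster nodes;
    here they run sequentially in one process.
    """
    groups = defaultdict(list)
    for line in lines:
        for key, value in map_fn(line):
            groups[key].append(value)  # shuffle: group values by key
    return dict(reduce_fn(key, values) for key, values in groups.items())


if __name__ == "__main__":
    print(run_mapreduce(["big data big cluster", "data node"]))
```

The user supplies only the two functions; grouping and scheduling are the framework's job, which is why where the input data sits on the cluster matters for performance.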

Cited by 24 publications (11 citation statements)
References 15 publications
“…In contrast, big data is processed by clustering and scans multiple nodes of clusters in the network [23]. This processing is based on the concept of parallelism to handle large medical data sets [24]. Freely available frameworks, such as Hadoop, MapReduce, Pig, Sqoop, Hive, and HBase Avro, all have ability to process the health related data sets for healthcare systems.…”
Section: Big Data Analytics Architecture For Health Informatics (citation type: mentioning)
confidence: 99%
“…In the first component is the requirement for big data sources for processing. In the second component clusters with a centralized big-data processing infrastructure are at the peak of high performance [24]. It has been observed that the tools mainly available for big-data analytics processing provide data security, scalability, and manageability with the help of the MapReduce paradigm.…”
Section: Big Data Analytics Architecture For Health Informatics (citation type: mentioning)
confidence: 99%
“…Xie et al and Anjos et al explore the possibility of placing data blocks to minimize job latency. Data blocks are placed based on the computing ratio in other works, to minimize makespan, whereas Chen et al place data blocks to minimize network transfer time. Anjos et al considers the capacity of nodes to minimize the latency of a job.…”
Section: Literature Survey (citation type: mentioning)
confidence: 99%
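
The idea of placing data blocks according to each node's computing ratio, mentioned in the statement above, can be sketched as follows. This is only an illustrative assumption of what such a scheme might look like, not the algorithm of the cited paper or of any of the works named; the function name `place_blocks` and the node names are hypothetical. Each node receives a share of the blocks proportional to its ratio, with leftover blocks assigned by largest remainder.

```python
def place_blocks(num_blocks, computing_ratio):
    """Split num_blocks across nodes in proportion to their computing ratio.

    computing_ratio maps a node name to a positive relative speed
    (hypothetical input; how the ratio is measured is not specified here).
    Returns a dict mapping node name to a block count summing to num_blocks.
    """
    if num_blocks < 0:
        raise ValueError("num_blocks must be non-negative")
    if not computing_ratio or any(r <= 0 for r in computing_ratio.values()):
        raise ValueError("every node needs a positive computing ratio")

    total = sum(computing_ratio.values())
    # Ideal (fractional) share of blocks for each node.
    exact = {node: num_blocks * r / total for node, r in computing_ratio.items()}
    # Start from the rounded-down share.
    alloc = {node: int(share) for node, share in exact.items()}
    # Hand remaining blocks to the nodes with the largest fractional part.
    leftover = num_blocks - sum(alloc.values())
    by_remainder = sorted(exact, key=lambda n: exact[n] - alloc[n], reverse=True)
    for node in by_remainder[:leftover]:
        alloc[node] += 1
    return alloc


if __name__ == "__main__":
    print(place_blocks(10, {"fast": 3.0, "medium": 2.0, "slow": 1.0}))
```

Under this assumption a node three times as fast as another holds roughly three times as many blocks, so all nodes would finish their local map tasks at about the same time, which is the intuition behind minimizing makespan.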
“…Become a Big Data era these days, many tools that analyzing the massive data efficiently such as R or Hadoop are released [4][5][6]. Especially Hadoop has strong point that it's possible to distributed processing the massive data in low cost, there's a drift towards research about Hadoop or System using Hadoop [7][8][9]. Hadoop consists of two parts, HDFS (Hadoop Distributed File System) and MapReduce framework.…”
Section: Introduction (citation type: mentioning)
confidence: 99%