2012 IEEE Fifth International Conference on Cloud Computing 2012
DOI: 10.1109/cloud.2012.118

Evaluating Hadoop for Data-Intensive Scientific Operations

Cited by 30 publications
(32 citation statements)
References 20 publications
“…Additionally, the authors report significant performance penalties due to the virtualization layer. Other work related to our study was reported in [35], where the authors analyzed the streaming features of Hadoop [36] and reported the existence of an overhead due to streaming. We have made a similar observation regarding the scalability of stream processing.…”
Section: Related Work (mentioning)
confidence: 99%
“…To evaluate the performance and energy efficiency of Hadoop applications in different Hadoop deployment scenarios we use three micro-benchmarks: TeraGen, TeraSort, and Wikipedia data processing [15]. The former two benchmarks are among the most widely used standard Hadoop benchmarks.…”
Section: A Workloads (mentioning)
confidence: 99%
“…al [21] shows that a proper MapReduce implementation can achieve a performance close to parallel databases through experiments performed on Amazon EC2. Previous work [15] evaluated Hadoop for scientific applications and the tradeoffs of various hardware and file system configurations. Our work complements the aforementioned performance efforts by investigating the Hadoop performance with separated data and compute layers and specific data operations.…”
Section: Related Work (mentioning)
confidence: 99%
“…al [23] shows that a proper MapReduce implementation can achieve a performance close to parallel databases through experiments performed on Amazon EC2. Previous work [16] evaluated Hadoop for scientific applications and the trade-offs of various hardware and file system configurations.…”
Section: Related Work (mentioning)
confidence: 99%
“…To evaluate the performance and energy efficiency of Hadoop applications in different Hadoop deployment scenarios we use three micro-benchmarks: TeraGen, TeraSort, and Wikipedia data processing [16]. The former two benchmarks are among the most widely used standard Hadoop benchmarks.…”
Section: Workloads (mentioning)
confidence: 99%