2011 IEEE 19th Annual International Symposium on Modelling, Analysis, and Simulation of Computer and Telecommunication Systems
DOI: 10.1109/mascots.2011.12

The Case for Evaluating MapReduce Performance Using Workload Suites

Abstract: MapReduce systems face enormous challenges due to increasing growth, diversity, and consolidation of the data and computation involved. Provisioning, configuring, and managing large-scale MapReduce clusters require realistic, workload-specific performance insights that existing MapReduce benchmarks are ill-equipped to supply. In this paper, we build the case for going beyond benchmarks for MapReduce performance evaluations. We analyze and compare two production MapReduce traces to develop a vocabulary f…

Citations: cited by 326 publications (259 citation statements)
References: 10 publications
“…Chen et al describe MapReduce workloads from 6 months on a 600-machine cluster and 1.5 months on a 3000-machine cluster at Facebook, from 3 weeks on a cluster at Yahoo!, and from several other installations [125,124]. Some of their data is shown in Figures 9.34 and 9.35.…”
Section: End Box (mentioning)
confidence: 99%
“…For example, consider the data about MapReduce workloads at Facebook available from the SWIM project [125]. MapReduce applications have two stages, a map stage and a reduce stage (this is explained in the box on page 498).…”
Section: Erroneous Data (mentioning)
confidence: 99%
“…To compare the resulting service they need a benchmark build around a set of representative analytical tasks. Most research in the area in done on actual MapReduce benchmarks like MRBench [28] or designing appropriate MapReduce workloads [29]. Pavlo et al [30] show how to have analytical workload run by both MapReduce and Distributed Databases and compare the results.…”
Section: High Workload Analytical Platform (mentioning)
confidence: 99%