Proceedings of the International Symposium on Memory Management 2011
DOI: 10.1145/1993478.1993495

Garbage collection auto-tuning for Java MapReduce on multi-cores

Abstract: MapReduce has been widely accepted as a simple programming pattern that can form the basis for efficient, large-scale, distributed data processing. The success of the MapReduce pattern has led to a variety of implementations for different computational scenarios. In this paper we present MRJ, a MapReduce Java framework for multi-core architectures. We evaluate its scalability on a four-core, hyperthreaded Intel Core i7 processor, using a set of standard MapReduce benchmarks. We investigate the significant impac…

Cited by 19 publications (5 citation statements)
References 31 publications
“…Most of the workloads have been used in popular data analysis workload suites such as BigDataBench [7], DCBench [6], HiBench [14] and Cloudsuite [5]. Phoenix++ [15], Phoenix rebirth [16] and Java MapReduce [17] tests the performance of devised shared-memory frameworks based on Word Count, Grep and K-Means. We use Spark version of the selected benchmarks from BigDataBench and employ Big Data Generator Suite (BDGS), an open source tool, to generate synthetic datasets for every benchmark based on raw data sets [18].…”
Section: A Benchmarks (mentioning)
confidence: 99%
“…In [15] the GC is auto-tuned in order to improve the performance of a MapReduce [16] Java implementation for multi-core hardware. For each relevant benchmark, machine learning techniques are used to find the best execution time for each combination of input size, heap size and number of threads in relation to a given GC algorithm (i.e.…”
Section: Related Work (mentioning)
confidence: 99%
“…The proposed structure is evaluated by using a trace-based simulator with SPEC 2006 [Henning 2006; SPEC CPU 2006; Phansalkar et al 2007], SPLASH-2 [Woo et al 1995], grep [Singer et al 2011], and PostMark [Katcher 1997] traces. According to our simulation results, execution time can be reduced by about 89% compared to a conventional hierarchy of DRAM main memory and HDD secondary storage and 77% over a set of DRAM buffers/PRAM main memory/HDD disk pair proposed as state-of-the-art proposals [Qureshi et al 2009].…”
Section: Introduction (mentioning)
confidence: 99%