2011
DOI: 10.1145/2076022.1993495
Garbage collection auto-tuning for Java MapReduce on multi-cores

Abstract: MapReduce has been widely accepted as a simple programming pattern that can form the basis for efficient, large-scale, distributed data processing. The success of the MapReduce pattern has led to a variety of implementations for different computational scenarios. In this paper we present MRJ, a MapReduce Java framework for multi-core architectures. We evaluate its scalability on a four-core, hyperthreaded Intel Core i7 processor, using a set of standard MapReduce benchmarks. We investigate the significant impact…
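The programming pattern the abstract refers to can be illustrated with a minimal word-count sketch. This is not the MRJ API from the paper (which is not shown here), only a hypothetical example of the map/group/reduce phases running across cores on the JVM, using the standard parallel-streams library:

```java
import java.util.Arrays;
import java.util.Map;
import java.util.function.Function;
import java.util.stream.Collectors;

// Hypothetical sketch of the MapReduce pattern on a multi-core JVM.
// Map phase: split the input into words (done in parallel across cores).
// Shuffle phase: group identical words together.
// Reduce phase: count the occurrences in each group.
public class WordCount {
    static Map<String, Long> countWords(String text) {
        return Arrays.stream(text.toLowerCase().split("\\W+"))
                .parallel()                        // map tasks spread across cores
                .filter(w -> !w.isEmpty())
                .collect(Collectors.groupingBy(    // shuffle: group by key
                        Function.identity(),
                        Collectors.counting()));   // reduce: sum per key
    }

    public static void main(String[] args) {
        Map<String, Long> counts =
                countWords("the quick fox jumps over the lazy dog");
        System.out.println(counts.get("the")); // prints 2
    }
}
```

A multi-core framework like the one the paper describes would manage the parallel map and reduce tasks itself; the allocation behaviour of the intermediate (word, count) pairs is exactly the kind of heap pressure that makes GC tuning relevant here.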

Cited by 10 publications (6 citation statements) · References 30 publications
“…In addition to search-based algorithms, there have been several efforts to use machine learning for the purpose of automatic parameter tuning [26,47]. However, approaches using machine learning generally require a large amount of training data to be able to build a classifier with good accuracy because they can only learn about scenarios and configurations that have been seen in the past.…”
Section: Related Work
confidence: 99%
“…Most of the workloads have been used in popular data analysis workload suites such as BigDataBench [7], DCBench [6], HiBench [14] and Cloudsuite [5]. Phoenix++ [15], Phoenix rebirth [16] and Java MapReduce [17] tests the performance of devised sharedmemory frameworks based on Word Count, Grep and K-Means. We use Spark version of the selected benchmarks from BigDataBench and employ Big Data Generator Suite (BDGS), an open source tool, to generate synthetic datasets for every benchmark based on raw data sets [18].…”
Section: B. Top-down Methods for Hardware Performance Counters
confidence: 99%
“…Machine Learning Models: Machine learning approaches [13,14,19,29,49] employ various algorithms such as KCCA [6], artificial neural networks, decision trees, reinforcement learning, and Bayesian networks [39] to determine a correlation between configuration parameters and performance. The principal obstacle to the widespread adoption of machine learning techniques is the difficulty of the model building process.…”
Section: Related Work
confidence: 99%
“…Typical state of the practice in industry is to employ rules-of-thumb, and to rely on past experience to guess at relevant configuration parameters. More recent academic work has examined techniques based on domain-specific analytical cost models [5,24,25,26,61,62], hill climbing algorithms on customized frameworks [31], machine learning techniques [13,14,19,29,49], and genetic algorithms executed on the real application [32]. While these techniques are promising, they have not been widely deployed due to their inherent limitations.…”
Section: Introduction
confidence: 99%