Garbage collection auto-tuning for Java mapreduce on multi-cores

Singer, Jeremy; Kovoor, George; Brown, Gavin; Luján, Mikel

doi:10.1145/1993478.1993495

Cited by 19 publications

(5 citation statements)

References 31 publications


“…Most of the workloads have been used in popular data analysis workload suites such as BigDataBench [7], DCBench [6], HiBench [14] and Cloudsuite [5]. Phoenix++ [15], Phoenix rebirth [16] and Java MapReduce [17] test the performance of devised shared-memory frameworks based on Word Count, Grep and K-Means. We use Spark version of the selected benchmarks from BigDataBench and employ Big Data Generator Suite (BDGS), an open source tool, to generate synthetic datasets for every benchmark based on raw data sets [18].…”

Section: A Benchmarks (mentioning)

confidence: 99%

Performance Characterization of In-Memory Data Analytics on a Modern Cloud Server

Awan, Brorsson, Vlassov, et al. 2015

2015 IEEE Fifth International Conference on Big Data and Cloud Computing


In the last decade, data analytics has rapidly progressed from traditional disk-based processing to modern in-memory processing. However, little effort has been devoted to enhancing performance at the micro-architecture level. This paper characterizes the performance of in-memory data analytics using the Apache Spark framework. We use a single-node NUMA machine and identify the bottlenecks hampering the scalability of workloads. We also quantify the inefficiencies at the micro-architecture level for various data analysis workloads. Through empirical evaluation, we show that Spark workloads do not scale linearly beyond twelve threads, due to work time inflation and thread-level load imbalance. Further, at the micro-architecture level, we observe memory-bound latency to be the major cause of work time inflation.



“…In [15] the GC is auto-tuned in order to improve the performance of a MapReduce [16] Java implementation for multi-core hardware. For each relevant benchmark, machine learning techniques are used to find the best execution time for each combination of input size, heap size and number of threads in relation to a given GC algorithm (i.e.…”

Section: Related Work (mentioning)

confidence: 99%

VM Economics for Java Cloud Computing: An Adaptive and Resource-Aware Java Runtime with Quality-of-Execution

Simão, Veiga 2012

2012 12th IEEE/ACM International Symposium on Cluster, Cloud and Grid Computing (Ccgrid 2012)


Resource management in Cloud Computing has been dominated by system-level virtual machines to enable the management of resources using a coarse-grained approach, largely in a manner independent from the applications running on these infrastructures. However, in such environments, although different types of applications can be running, the resources are delivered equally to each one, missing the opportunity to manage the available resources in a more efficient and application-driven way. So, as more applications target managed runtimes, high-level virtualization is a relevant abstraction layer that has not been properly explored to enhance resource usage, control, and effectiveness. We propose a VM economics model to manage cloud infrastructures, governed by a quality-of-execution (QoE) metric and implemented by an extended virtual machine. The Adaptive and Resource-Aware Java Virtual Machine (ARA-JVM) is a cluster-enabled virtual execution environment with the ability to monitor base mechanisms (e.g. thread scheduling, garbage collection, memory or network consumption) to assess an application's performance and reconfigure these mechanisms at runtime according to previously defined resource allocation policies. Reconfiguration is driven by incremental gains in quality-of-execution (QoE), used by the VM economics model to balance relative resource savings and perceived performance degradation. Our work in progress aims to allow cloud providers to exchange resource slices among virtual machines, continually addressing where those resources are required, while being able to determine where the reduction will be more economically effective, i.e., will contribute to a lesser extent to performance degradation.


“…The proposed structure is evaluated by using a trace-based simulator with SPEC 2006 [Henning 2006; SPEC CPU 2006; Phansalkar et al. 2007], SPLASH-2 [Woo et al. 1995], grep [Singer et al. 2011], and PostMark [Katcher 1997] traces. According to our simulation results, execution time can be reduced by about 89% compared to a conventional hierarchy of DRAM main memory and HDD secondary storage and 77% over a set of DRAM buffers/PRAM main memory/HDD disk pair proposed as state-of-the-art proposals [Qureshi et al. 2009].…”

Section: Introduction (mentioning)

confidence: 99%

A New Memory-Disk Integrated System with HW Optimizer

Lee, Yoon, Kim, et al. 2015

ACM Trans. Archit. Code Optim.


Current high-performance computer systems utilize a memory hierarchy of on-chip cache, main memory, and secondary storage due to differences in device characteristics. Limiting the amount of main memory causes page swap operations and duplicates data between the main memory and the storage device. The characteristics of next-generation memory, such as nonvolatility, byte addressability, and scaling to greater capacity, can be used to solve these problems. Simple replacement of secondary storage with new forms of nonvolatile memory in a traditional memory hierarchy still causes typical problems, such as memory bottleneck, page swaps, and write overhead. Thus, we suggest a single architecture that merges the main memory and secondary storage into a system called a Memory-Disk Integrated System (MDIS). The MDIS architecture is composed of a virtually decoupled NVRAM and a nonvolatile memory performance optimizer combining hardware and software to support this system. The virtually decoupled NVRAM module can support conventional main memory and disk storage operations logically without data duplication and can reduce write operations to the NVRAM. To increase the lifetime and optimize the performance of this NVRAM, another hardware module called a Nonvolatile Performance Optimizer (NVPO) is used that is composed of four small buffers. The NVPO exploits spatial and temporal characteristics of static/dynamic data based on program execution characteristics. Enhanced virtual memory management and address translation modules in the operating system can support these hardware components to achieve a seamless memory-storage environment. Our experimental results show that the proposed architecture can improve execution time by about 89% over a conventional DRAM main memory/HDD storage system, and 77% over a state-of-the-art PRAM main memory/HDD disk system with DRAM buffer. 
Also, the lifetime of the virtually decoupled NVRAM is estimated to be 40% longer than that of a traditional hierarchy based on the same device technology.

ACM Reference Format: Do-Heon Lee, Su-Kyung Yoon, Jung-Geun Kim, Charles C. Weems, and Shin-Dug Kim. 2015. A new memory-disk integrated system with HW optimizer. ACM Trans.

