MEMTUNE: Dynamic Memory Management for In-Memory Data Analytic Platforms

Xu, Lingfei; Li, Min; Zhang, Li; Butt, Ali R.; Wang, Yandong; Hu, Zane Zhenhua

doi:10.1109/ipdps.2016.105

Cited by 58 publications

(38 citation statements)

References 14 publications


“…However, it does not considers tuning any parameters and their effect on duration. MEMTune [31] is able to determine the memory of Spark's executors by changing dynamically the size of both the JVM and the RDD cache. It also prefetchs data that is going to be used in future stages and evicts data blocks that are not going to be used.…”

Section: Related Work (mentioning)

confidence: 99%

Using machine learning to optimize parallelism in big data applications

Hernández, Pérez, Gupta, et al. 2018

Future Generation Computer Systems


In-memory cluster computing platforms have gained momentum in recent years, due to their ability to analyse large amounts of data in parallel. These platforms are complex and difficult-to-manage environments. In addition, there is a lack of tools to better understand and optimize such platforms, which form the backbone of big data infrastructure and technologies. This directly leads to underutilization of available resources and application failures in such environments. One of the key aspects that can address this problem is optimization of the task parallelism of applications in such environments. In this paper, we propose a machine learning based method that recommends optimal parameters for task parallelization in big data workloads. By monitoring and gathering metrics at system and application level, we are able to find statistical correlations that allow us to characterize and predict the effect of different parallelism settings on performance. These predictions are used to recommend an optimal configuration to users before launching their workloads in the cluster, avoiding possible failures, performance degradation and wastage of resources. We evaluate our method with a benchmark of 15 Spark applications on the Grid5000 testbed. We observe up to a 51% gain in performance when using the recommended parallelism settings. The model is also interpretable and can give insights to the user into how different metrics and parameters affect the performance.


“…Megastore [44] offers a distributed storage system with strong consistency guarantees and high availability for interactive online applications. EAD [45] and MemTune [46] are dynamic memory managers based on workload memory demand and in-memory data cache needs.…”

Section: Spark Implementation (mentioning)

confidence: 99%

Intermediate Data Caching Optimization for Multi-Stage and Parallel Big Data Frameworks

Yang, Jia, Ioannidis, et al. 2018

2018 IEEE 11th International Conference on Cloud Computing (CLOUD)


In the era of big data and cloud computing, large amounts of data are generated from user applications and need to be processed in the datacenter. Data-parallel computing frameworks, such as Apache Spark, are widely used to perform such data processing at scale. Specifically, Spark leverages distributed memory to cache the intermediate results, represented as Resilient Distributed Datasets (RDDs). This gives Spark an advantage over other parallel frameworks for implementations of iterative machine learning and data mining algorithms, by avoiding repeated computation or hard disk accesses to retrieve RDDs. By default, caching decisions are left at the programmer's discretion, and the LRU policy is used for evicting RDDs when the cache is full. However, when the objective is to minimize total work, LRU is woefully inadequate, leading to arbitrarily suboptimal caching decisions. In this paper, we design an algorithm for multi-stage big data processing platforms to adaptively determine and cache the most valuable intermediate datasets that can be reused in the future. Our solution automates the decision of which RDDs to cache: this amounts to identifying nodes in a directed acyclic graph (DAG) representing computations whose outputs should persist in memory. Our experimental results show that our proposed cache optimization solution can improve the performance of machine learning applications on Spark, decreasing the total work to recompute RDDs by 12%.


“…Despite the significant performance impact of memory caches, cache management remains a relatively unchartered territory in data parallel systems. Prevalent parallel frameworks (e.g., Spark [2], Tez [7], and Tachyon [4]) simply employ LRU to manage cached data on cluster machines, which results in a significant performance loss [3], [24].…”

Section: Related Work (mentioning)

confidence: 99%

“…To our knowledge, the recently proposed MemTune [24] is the only caching system that leverages the application semantics. MemTune dynamically adjusts the memory share for task computation and data caching in Spark and evicts/prefetches data as needed.…”

Section: Related Work (mentioning)

confidence: 99%

LRC: Dependency-aware cache management for data analytics clusters

Wang, Letaief 2017

IEEE INFOCOM 2017 - IEEE Conference on Computer Communications


Memory caches are being aggressively used in today's data-parallel systems such as Spark, Tez, and Piccolo. However, prevalent systems employ rather simple cache management policies, notably the Least Recently Used (LRU) policy, that are oblivious to the application semantics of data dependency, expressed as a directed acyclic graph (DAG). Without this knowledge, memory caching can at best be performed by "guessing" the future data access patterns based on historical information (e.g., the access recency and/or frequency), which frequently results in inefficient, erroneous caching with low hit ratio and a long response time. In this paper, we propose a novel cache replacement policy, Least Reference Count (LRC), which exploits the application-specific DAG information to optimize the cache management. LRC evicts the cached data blocks whose reference count is the smallest. The reference count is defined, for each data block, as the number of dependent child blocks that have not been computed yet. We demonstrate the efficacy of LRC through both empirical analysis and cluster deployments against popular benchmarking workloads. Our Spark implementation shows that, compared with LRU, LRC speeds up typical applications by 60%.
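The reference-count rule in this abstract can be illustrated with a small sketch. This is not the authors' Spark implementation; it only follows the definition quoted above. The function names `reference_counts` and `pick_victim`, the toy DAG, and the tie-breaking rule are all assumptions made for illustration.

```python
from typing import Dict, Iterable, List, Set


def reference_counts(children: Dict[str, List[str]], computed: Set[str]) -> Dict[str, int]:
    """For each block, count its dependent child blocks that are not yet computed."""
    return {
        block: sum(1 for child in kids if child not in computed)
        for block, kids in children.items()
    }


def pick_victim(cached: Iterable[str], counts: Dict[str, int]) -> str:
    """Return the cached block with the smallest reference count.

    Ties are broken by block name, an arbitrary choice made for this sketch.
    """
    return min(cached, key=lambda block: (counts.get(block, 0), block))


# Hypothetical DAG: each block maps to the blocks that depend on it.
children = {
    "A": ["C", "D", "E"],
    "B": ["D"],
    "C": [],
    "D": [],
    "E": [],
}
computed = {"A", "B", "C"}  # D and E are still pending

counts = reference_counts(children, computed)
victim = pick_victim(["A", "B", "C"], counts)
print(counts)
print(victim)
```

In this toy DAG, block C has no pending dependents, so it is the first eviction candidate, whereas a recency-based policy could just as easily evict A, which two pending blocks still need.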
