In data-intensive cluster computing platforms such as Hadoop YARN, performance and fairness are two important concerns for users. Existing studies show that, because of resource contention among users and jobs, there is a trade-off between performance and fairness. In our work, we observe that this trade-off depends on the resource demand of the workload and changes as the multi-resource demands of submitted jobs vary during the computation. We also find that making the scheduling algorithm aware of this demand variation is important for the bi-criteria optimization between performance and fairness. However, most previous studies overlook this and design their heuristic algorithms under the assumption of a fixed trade-off. In this paper, we propose an adaptive scheduler called Gemini for Hadoop YARN. Gemini first uses a regression approach to build a model that estimates the performance improvement and the fairness loss of sharing computation relative to the exclusive, non-sharing scenario. It then leverages this model to guide the resource allocation of pending tasks so as to optimize cluster performance under a user-defined fairness level. Instead of applying a static scheduling policy, Gemini adaptively selects the proper policy according to the currently running workload. We implement Gemini in Hadoop YARN. Experimental results show that Gemini outperforms the state-of-the-art in two aspects: 1) for the same fairness loss, Gemini increases the performance improvement by up to 225% and 200% in real deployment and large-scale simulation, respectively; 2) for the same performance improvement, Gemini reduces the fairness loss by up to 70% and 62.5% in real deployment and large-scale simulation, respectively.
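The abstract's core idea, a regression model that predicts performance gain and fairness loss and then guides policy selection under a fairness budget, can be illustrated with a minimal sketch. All names, the single workload feature, and the toy training data below are illustrative assumptions, not the actual Gemini implementation:

```python
# Hypothetical sketch of Gemini's decision loop (illustrative only): fit
# per-policy linear models mapping a workload feature to predicted
# performance gain and fairness loss, then pick the policy with the best
# predicted gain whose predicted fairness loss stays within the budget.

def fit_linear(xs, ys):
    """Least-squares fit y = a*x + b for a 1-D feature (pure Python)."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    var = sum((x - mx) ** 2 for x in xs)
    a = cov / var
    return a, my - a * mx

# Toy training data (assumed): fraction of memory-heavy tasks in the
# workload -> observed performance gain / fairness loss per policy.
demand = [0.1, 0.3, 0.5, 0.7, 0.9]
perf_gain = {"fifo": [0.05, 0.10, 0.15, 0.20, 0.25],
             "fair": [0.02, 0.04, 0.06, 0.08, 0.10]}
fair_loss = {"fifo": [0.10, 0.20, 0.30, 0.40, 0.50],
             "fair": [0.01, 0.02, 0.03, 0.04, 0.05]}

models = {p: (fit_linear(demand, perf_gain[p]),
              fit_linear(demand, fair_loss[p]))
          for p in perf_gain}

def choose_policy(current_demand, fairness_budget):
    """Return the policy with the highest predicted gain within budget."""
    best, best_gain = None, float("-inf")
    for policy, ((ga, gb), (la, lb)) in models.items():
        gain = ga * current_demand + gb
        loss = la * current_demand + lb
        if loss <= fairness_budget and gain > best_gain:
            best, best_gain = policy, gain
    return best
```

With these toy numbers, a tight fairness budget steers the choice toward the fairness-preserving policy, while a looser budget lets the scheduler trade fairness for throughput, which is the adaptive behavior the abstract describes.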
Interest has been growing in managing cluster energy effectively in order to reduce both energy consumption and electricity cost. Renewable energy and dynamic pricing schemes in smart grids are two major emerging trends in energy markets. However, current data processing frameworks are not aware of the efficiency of each joule consumed by data center workloads in the context of these two trends. In fact, not all joules are equal, in the sense that the amount of work a joule can accomplish varies significantly in data centers. Ignoring this fact leads to significant energy waste (25% of the total energy consumption in Hadoop YARN on a Facebook production trace, according to our study). In this paper, we propose JouleMR, a cost-effective and green-aware data processing framework. Specifically, we investigate how to exploit joule efficiency to maximize the benefits of renewable energy and dynamic pricing schemes for the MapReduce framework. We develop job/task scheduling algorithms with a particular focus on the factors affecting joule efficiency in the data center, including the energy efficiency of MapReduce workloads, the renewable energy supply, dynamic pricing, and battery usage. We further develop a simple yet effective performance and energy-consumption model to guide our scheduling decisions. We have implemented JouleMR on top of Hadoop YARN. Experiments demonstrate the accuracy of our models and show that our cost-effective, green-aware optimizations outperform state-of-the-art implementations on Hadoop YARN.
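The green-aware scheduling idea in this abstract, placing deferrable work where renewable supply covers it or where grid prices are low, can be sketched in a few lines. The function names, the greedy placement strategy, and the forecast inputs below are illustrative assumptions, not the actual JouleMR algorithm:

```python
# Illustrative sketch (not the JouleMR scheduler itself): given per-slot
# forecasts of renewable supply (joules) and grid price ($/joule), place
# each deferrable job in the time slot that minimizes the cost of the
# grid energy it would have to draw beyond the renewable supply.

def grid_cost(energy_joules, renewable, price):
    """Cost of the joules not covered by the renewable supply."""
    return max(0.0, energy_joules - renewable) * price

def schedule(jobs, renewable_forecast, price_forecast):
    """Greedy placement: each job takes its cheapest remaining slot."""
    supply = list(renewable_forecast)  # remaining renewable per slot
    placement = {}
    for job, energy in jobs:
        costs = [(grid_cost(energy, supply[t], price_forecast[t]), t)
                 for t in range(len(supply))]
        _, slot = min(costs)           # cheapest slot wins ties by index
        placement[job] = slot
        supply[slot] = max(0.0, supply[slot] - energy)
    return placement
```

For example, with a renewable forecast of `[100.0, 0.0]` joules and prices `[0.2, 0.1]`, two 50-joule jobs both land in slot 0, where the renewable supply makes them free, rather than in the cheaper-priced but grid-powered slot 1. A fuller model, as the abstract notes, would also account for battery charge/discharge and per-workload joule efficiency.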