“…This flexibility introduces challenges that must be addressed to improve makespan and resource utilization when a batch of heterogeneous jobs is periodically submitted. Moreover, heterogeneous VMs in a MapReduce virtual cluster accommodate different numbers of containers. For instance, as shown in Figure A, consider two VMs (VM1 with <4,6> and VM2 with <2,4>) and MapReduce jobs (J1, J2, …, J6) in a batch.…”
Section: Introduction
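The quoted passage notes that VMs of different capacities host different numbers of containers. A minimal sketch of how a per-VM container count could be computed, assuming the tuples `<4,6>` and `<2,4>` denote `<vCPUs, memory in GB>` and assuming a fixed container size of 1 vCPU and 1 GB (both readings are illustrative, not stated in the source):

```python
# Sketch: how many fixed-size containers a VM can host.
# The <4,6> / <2,4> tuples are read here as <vCPUs, memory in GB>;
# the container size (1 vCPU, 1 GB) is an illustrative assumption.

def container_capacity(vcpus, mem_gb, c_vcpu=1, c_mem_gb=1):
    """Container count is limited by the scarcer resource."""
    return min(vcpus // c_vcpu, mem_gb // c_mem_gb)

vms = {"VM1": (4, 6), "VM2": (2, 4)}
for name, (vcpus, mem_gb) in vms.items():
    print(name, container_capacity(vcpus, mem_gb))
```

Under these assumptions VM1 hosts twice as many containers as VM2, which is what makes capacity-oblivious scheduling wasteful.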
Summary
Big data is driving business entities and research sectors to become more data-driven. Hadoop MapReduce is one of the most cost-effective ways to process large-scale datasets and is offered as a service over the Internet. Even though cloud service providers promise a virtually infinite amount of on-demand resources, some of the virtual resources hired for MapReduce inevitably remain unutilized, and makespan suffers due to the various heterogeneities that arise when MapReduce is offered as a service. Because MapReduce v2 allows users to define the container size for map and reduce tasks, jobs in a batch become heterogeneous and behave differently. In addition, the differing capacities of virtual machines in a MapReduce virtual cluster allow them to accommodate varying numbers of map/reduce tasks. These factors strongly affect resource utilization in the virtual cluster and the makespan of a batch of MapReduce jobs. Default MapReduce job schedulers do not consider these heterogeneities of the cloud environment. Moreover, virtual machines in a MapReduce virtual cluster process an equal number of blocks regardless of their capacity, which lengthens the makespan. We therefore devised a heuristic-based MapReduce job scheduler that exploits virtual-machine- and workload-level heterogeneities to improve resource utilization and makespan. We proposed two methods to achieve this: (i) roulette-wheel-scheme-based data block placement on heterogeneous virtual machines, and (ii) constrained two-dimensional bin packing to place heterogeneous map/reduce tasks. We compared the heuristic-based MapReduce job scheduler against the classical fair scheduler in MapReduce v2. Experimental results showed that our proposed scheduler improved makespan and resource utilization by 45.6% and 47.9%, respectively, over the classical fair scheduler.
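The first of the two proposed methods places data blocks on VMs via a roulette-wheel scheme, i.e., a VM receives a block with probability proportional to its capacity. A minimal sketch of that selection step, assuming capacity weights of 6 and 4 for two VMs (illustrative values, not taken from the paper's experiments):

```python
import random

def roulette_wheel_place(blocks, capacities, rng=random.random):
    """Assign each block to a VM with probability proportional to its capacity."""
    total = sum(capacities.values())
    # Build cumulative probability boundaries, one per VM (the "wheel").
    wheel, acc = [], 0.0
    for vm, cap in capacities.items():
        acc += cap / total
        wheel.append((acc, vm))
    placement = {vm: [] for vm in capacities}
    for block in blocks:
        spin = rng()  # uniform in [0, 1)
        for boundary, vm in wheel:
            if spin <= boundary:
                placement[vm].append(block)
                break
    return placement

capacities = {"VM1": 6, "VM2": 4}  # hypothetical capacity weights
placement = roulette_wheel_place(range(100), capacities)
```

Over many blocks, VM1 ends up holding roughly 60% of them, so higher-capacity VMs receive proportionally more data than the equal-split default the abstract criticizes.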