MapReduce is a popular computing paradigm in Hadoop, used for large-scale data processing. However, slot-based MapReduce systems (e.g., Hadoop MRv1) can suffer from poor performance due to unoptimized resource allocation. To address this problem, this paper identifies and optimizes resource allocation from three key aspects. First, because map slots and reduce slots are pre-configured as distinct and non-fungible, slots can be severely under-utilized: map slots may be fully used while reduce slots sit empty, and vice versa. The paper proposes a technique called Dynamic Hadoop Slot Allocation that keeps the slot-based model but relaxes the slot allocation constraint, allowing slots to be reallocated to either map or reduce tasks depending on their needs. Second, speculative execution can tackle the straggler problem and has been shown to improve the performance of a single job, but at the expense of cluster efficiency. In view of this, the paper proposes Speculative Execution Performance Balancing to balance performance between a single job and a batch of jobs. Third, delay scheduling has been shown to improve data locality, but at the cost of fairness. The paper additionally proposes a method called Slot PreScheduling that improves data locality with no impact on fairness. Finally, by combining these techniques, the paper forms a step-by-step slot allocation system called DynamicMR that can significantly improve the performance of MapReduce workloads.
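The core idea of relaxing the map/reduce slot split can be sketched as follows. This is a minimal illustration, not the paper's actual scheduler: the function name, the static `map_share` split, and the greedy lending policy are all assumptions made for the example.

```python
def allocate_slots(total_slots, pending_map, pending_reduce, map_share=0.5):
    """Sketch of dynamic slot allocation: instead of a fixed map/reduce
    split, idle slots of one type are lent to tasks of the other type
    when demand is skewed.

    total_slots: number of slots on the cluster
    pending_map / pending_reduce: counts of waiting map / reduce tasks
    map_share: the static split a slot-based scheduler would use
    Returns (map_slots, reduce_slots) actually granted.
    """
    static_map = int(total_slots * map_share)
    static_reduce = total_slots - static_map
    # Grant each task type up to its static share first.
    map_slots = min(pending_map, static_map)
    reduce_slots = min(pending_reduce, static_reduce)
    # Lend the remaining slots across types, relaxing the constraint
    # that a map slot may only ever run a map task (and vice versa).
    spare = total_slots - map_slots - reduce_slots
    extra_map = min(spare, pending_map - map_slots)
    map_slots += extra_map
    spare -= extra_map
    reduce_slots += min(spare, pending_reduce - reduce_slots)
    return map_slots, reduce_slots
```

With a static 5/5 split on a 10-slot cluster, 8 pending map tasks and 0 reduce tasks would leave 3 map tasks waiting; the dynamic version grants all 8 map tasks a slot.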
Abstract: The increasing use of the Internet requires Internet service providers to handle large volumes of data. MapReduce is one of the leading solutions for implementing large-scale distributed data applications. A MapReduce workload generally contains a set of jobs, each of which consists of multiple map tasks followed by multiple reduce tasks. Because 1) map tasks can only run in map slots and reduce tasks can only run in reduce slots, and 2) the general execution constraint requires map tasks to be executed before reduce tasks, different job execution orders and map/reduce slot configurations for a MapReduce workload yield significantly different performance and system utilization. Makespan and total completion time are two key performance metrics, and this paper proposes two classes of algorithms targeting them. The first class focuses on job-ordering optimization for a MapReduce workload under a given map/reduce slot configuration. The second class considers the scenario in which the map/reduce slot configuration itself can also be optimized for a MapReduce workload.
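Since each job's map phase must finish before its reduce phase starts, a MapReduce workload resembles a two-stage flow shop, for which Johnson's rule is the classic makespan-minimizing job ordering. The sketch below shows that rule as an illustration of job-ordering optimization; the paper's own algorithms and their exact ordering criteria are not reproduced here, and the per-job phase times are hypothetical inputs.

```python
def johnson_order(jobs):
    """Order jobs to reduce makespan in a two-stage flow shop.

    jobs: list of (name, map_time, reduce_time) tuples, where a job's
    map phase must complete before its reduce phase can start.
    Johnson's rule: jobs whose map time is no longer than their reduce
    time go first, sorted by ascending map time; the remaining jobs go
    last, sorted by descending reduce time.
    """
    front = sorted((j for j in jobs if j[1] <= j[2]), key=lambda j: j[1])
    back = sorted((j for j in jobs if j[1] > j[2]), key=lambda j: -j[2])
    return front + back
```

For example, jobs A (map 3, reduce 6), B (map 5, reduce 2), and C (map 1, reduce 2) are ordered C, A, B: short map phases run first so the reduce stage is kept busy, and map-heavy jobs finish the schedule.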