2012
DOI: 10.5120/9617-4256

Survey on Task Assignment Techniques in Hadoop

Abstract: MapReduce is an implementation for processing large-scale data in parallel. Its benefits are realized when the framework is deployed on a large-scale, shared-nothing cluster. The MapReduce framework abstracts the complexity of running distributed data processing across multiple nodes in a cluster. Hadoop is an open source implementation of the MapReduce framework that processes vast amounts of data in parallel on large clusters. In Hadoop, a pluggable scheduler was implemented; because of this, several algorithm…

Cited by 5 publications
(3 citation statements)
References 5 publications
“…In addition, numerous of scheduling algorithms are coming into being. For example, Delay Scheduler [5], which is based on enhancing data locality; Dynamic Proportional Scheduler [6], which is based on user preferences and changing the way of task allocation proportion and dynamic priority; Constraint-Based Scheduler [7], which is based on deadline of Real-time; [8] proposed a scheduling algorithm which considers when the task scheduler can't choose the Data-local task whether it allows to assign the Non-local task or not; [9] proposed a scheduling algorithm which is based on the number of Map task node and data sheet replication mode; LATE [10] (Longest Approximate Time to End) and SAMR [11] (Self-adaptive MapReduce Scheduling Algorithm), which are based on under heterogeneous environment how to improve the scheduling efficiency; [12] and [13] proposed a scheduling algorithm which are based on job types classifying and the load dynamic of heterogeneous; [14] proposed a scheduling algorithm which is based on the adaptive node capacity; [15] proposed a scheduling algorithm which is based on matching rules; beyond above, the scheduler based on the intelligent algorithm and simulated annealing algorithm [16] and artificial fish algorithm [17] and genetic algorithm [18]; etc. however all of these scheduling algorithm don't consider from the resources allocation model, their task scheduling strategy is to distinguish the slot type.…”
Section: Related Work
mentioning
confidence: 99%
“…To the best of our knowledge, there is no published literature that clearly articulates the problem of scheduling in big data frameworks and provides a research taxonomy for succinct classification of the existing scheduling techniques in Hadoop, Spark, Storm, and Mesos frameworks. Previous efforts [6], [7] [8] that attempted to provide a comprehensive review of scheduling issues in big data platforms were limited to Hadoop only. Moreover, they did not include all papers that were published during the periods covered by their studies (i.e., 2012 and 2015).…”
Section: Introduction
mentioning
confidence: 99%
“…Furthermore, no studies have succinctly discussed the Hadoop scheduling problem and provide a research taxonomy for classifying existing scheduling techniques. Early efforts [5][6][7] to conduct a detailed study of Hadoop platform scheduling problems were limited in scope.…”
mentioning
confidence: 99%