2018
DOI: 10.1109/jsyst.2017.2764481

An Enhanced Data-Locality-Aware Task Scheduling Algorithm for Hadoop Applications

Cited by 16 publications (10 citation statements)
References 12 publications
“…The advantage of the improved fair scheduling scheme is its efficiency in producing throughput for datasets of variable size; however, the disadvantages are that long jobs can slow the algorithm and cause overloading issues at a node. The authors in [29] proposed a data-locality-aware enhanced task scheduling algorithm to improve the job completion time when an input split consists of multiple data blocks that are distributed and stored in different nodes; in that case, the data location method fails to cope with the degradation in processing performance due to the increased frequency of data block copying. To solve this issue, the authors proposed a task scheduling algorithm that defines a method to classify data locality, taking into account the location of all data blocks that comprise an input split, categorizes tasks based on the defined method, and sequentially assigns tasks according to a given priority.…”
Section: Problem Statement
Citation type: mentioning
confidence: 99%
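
The three steps named in the statement above (classify locality over all blocks of an input split, categorize tasks, assign by priority) can be illustrated with a minimal Python sketch. It is not the cited paper's algorithm: the category names (`ALL_LOCAL`, `PARTLY_LOCAL`, `NON_LOCAL`), their priority order, and the helpers `classify` and `pick_task` are all hypothetical, chosen only to show the general shape of such a scheduler.

```python
from __future__ import annotations

from dataclasses import dataclass
from enum import IntEnum
from typing import Optional


class Locality(IntEnum):
    # Hypothetical categories (assumption); a lower value means higher priority.
    ALL_LOCAL = 0      # every block of the split has a replica on the node
    PARTLY_LOCAL = 1   # some, but not all, blocks have a replica on the node
    NON_LOCAL = 2      # no block of the split has a replica on the node


@dataclass(frozen=True)
class Task:
    name: str
    # One entry per data block in the input split:
    # the set of node names that hold a replica of that block.
    block_locations: tuple[frozenset[str], ...]


def classify(task: Task, node: str) -> Locality:
    """Classify a task's locality for `node`, looking at all blocks of its split."""
    local = sum(1 for replicas in task.block_locations if node in replicas)
    if local == len(task.block_locations):
        return Locality.ALL_LOCAL
    if local > 0:
        return Locality.PARTLY_LOCAL
    return Locality.NON_LOCAL


def pick_task(pending: list[Task], node: str) -> Optional[Task]:
    """Return the pending task with the best locality category for `node`."""
    if not pending:
        return None
    # min() keeps the first task among ties, so queue order breaks ties.
    return min(pending, key=lambda t: classify(t, node))


pending = [
    Task("t1", (frozenset({"n2"}), frozenset({"n3"}))),
    Task("t2", (frozenset({"n1"}), frozenset({"n2"}))),
    Task("t3", (frozenset({"n1"}), frozenset({"n1", "n3"}))),
]
chosen = pick_task(pending, "n1")
print(chosen.name)
```

Here node `n1` holds replicas of both blocks of `t3`, one block of `t2`, and none of `t1`, so the sketch selects `t3`. A real scheduler would also weigh factors the statement does not detail, such as rack-level locality and copy cost.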
“…5, Improved Hadoop displays better performance than Hadoop because the improved ACO is used to schedule client jobs. Next, the proposed approach is compared with state-of-the-art approaches, including DGNS [9] and iShuffle [29].…”
Section: Effectiveness of the Proposed Approach Versus Those in Previous Work
Citation type: mentioning
confidence: 99%
“…For instance, an offline scheduling algorithm based on graph models was proposed by Selvitopi et al. [25], which correctly encodes the interactions between map and reduce tasks. Choi et al. [26] addressed a problem in which a map split consisted of multiple data blocks distributed and stored in different nodes. Two data-locality-aware task scheduling algorithms were proposed by Beaumont et al. [27], which optimized makespan.…”
Section: Related Work
Citation type: mentioning
confidence: 99%
“…Selvitopi et al. [44] proposed an offline scheduling algorithm based on graph and hypergraph models, which correctly encoded the interactions between map and reduce tasks. Choi et al. [45] addressed a problem where an input split consisted of multiple data blocks that were distributed and stored in different nodes. Beaumont et al. [46] proposed two data-locality-aware task scheduling algorithms that optimized makespan and communication, respectively, and theoretically studied their performance.…”
Section: Related Work
Citation type: mentioning
confidence: 99%