2013 IEEE 21st International Symposium on Modelling, Analysis and Simulation of Computer and Telecommunication Systems, 2013
DOI: 10.1109/mascots.2013.44

Towards Improving MapReduce Task Scheduling Using Online Simulation Based Predictions

Abstract: MapReduce is the model of choice for processing emerging big-data applications, and is facing an ever-increasing demand for higher efficiency. In this context, we propose a novel task scheduling scheme that uses current task and system state information to drive online simulations concurrently within Hadoop, and predict with high accuracy future events, e.g., when a job would complete, or when task-specific data-local nodes would be available. These predictions can then be used to make more efficient r…

Citations: cited by 2 publications (1 citation statement)
References: 11 publications
“…A technique, which decides the execution order of the tasks in these slots, is the Job Scheduling Algorithms. Different Job Scheduling Algorithms [3][4][5][6][7][8][9][10][11][12], [15], [19][20][21][22][23], [25] are used for this parallel processing framework which takes care of the Data Locality [7][8], [24], Resource Utilization [5], MapReduce Interdependence [5] and meeting Job Deadline [9][10]. Data Locality needs to be considered when jobs are relatively smaller and data travelling and network cost cannot be ignored during the calculation of a Map Task Completion Time (MTCT).…”
Section: Introduction (mentioning)
confidence: 99%