2013 IEEE International Conference on Big Data
DOI: 10.1109/bigdata.2013.6691554

HFSP: Size-based scheduling for Hadoop

Abstract: Size-based scheduling with aging has long been recognized as an effective approach to guarantee fairness and near-optimal system response times. We present HFSP, a scheduler introducing this technique to a real, multi-server, complex and widely used system such as Hadoop. Size-based scheduling requires a priori job size information, which is not available in Hadoop: HFSP builds such knowledge by estimating it on-line during job execution. Our experiments, which are based on realistic workloads gen…
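As a rough illustration of the idea in the abstract (an on-line size estimate feeding a size-based policy), the following Python sketch estimates a job's remaining size from the tasks that have already completed and then serves the job with the smallest estimate. The Job class, its fields, and the initial guess are hypothetical and are not taken from the HFSP implementation.

    # Illustrative sketch only: not the HFSP implementation.
    class Job:
        def __init__(self, job_id, num_tasks):
            self.job_id = job_id
            self.num_tasks = num_tasks
            self.completed_durations = []  # durations of tasks finished so far

        def record_completed_task(self, duration):
            # Each finished task refines the on-line size estimate.
            self.completed_durations.append(duration)

        def estimated_remaining_size(self):
            # Remaining size ~= (average observed task duration) x (tasks left).
            remaining = self.num_tasks - len(self.completed_durations)
            if not self.completed_durations:
                return float(remaining)  # arbitrary guess before any sample exists
            avg = sum(self.completed_durations) / len(self.completed_durations)
            return remaining * avg

    def pick_next_job(pending_jobs):
        # Size-based policy: serve the job with the smallest estimated remaining work.
        return min(pending_jobs, key=lambda j: j.estimated_remaining_size())

In this reading, the estimate improves as more tasks of a job complete, which is what lets a size-based policy operate without a priori job sizes.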

Cited by 29 publications (18 citation statements)
References 27 publications
“…Although these policies have been analyzed for distributed-server systems [14], [15], supercomputing workloads [25], and cloud compute-intensive workloads [10], a realistic investigation of such policies in datacenters for MapReduce workloads is currently missing. In particular, size-based scheduling has been employed in Hadoop [23] with adaptations of two policies: Shortest-Remaining-Processing-Time (SRPT) and Fair Sojourn Protocol (FSP). However, these approaches have rather limited applicability in large-scale datacenters as they require either accurate estimations of job sizes [22] or periodic simulations of queued jobs in a virtually fair system [11], [23].…”
Section: E. Improvements From Tyrex (mentioning)
confidence: 99%
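The SRPT policy named in this excerpt can be summarized in a few lines. The sketch below assumes that each job's exact remaining work is known, which, as the excerpt points out, is precisely the information Hadoop does not provide; the dictionary fields are invented for illustration.

    # SRPT sketch: assumes exact remaining sizes are known (not the case in Hadoop).
    def srpt_select(jobs, now):
        # Serve the runnable job with the least remaining work; a newly arrived
        # smaller job therefore preempts the one currently being served.
        runnable = [j for j in jobs if j["arrival"] <= now and j["remaining"] > 0]
        if not runnable:
            return None
        return min(runnable, key=lambda j: j["remaining"])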
“…In particular, size-based scheduling has been employed in Hadoop [23] with adaptations of two policies: Shortest-Remaining-Processing-Time (SRPT) and Fair Sojourn Protocol (FSP). However, these approaches have rather limited applicability in large-scale datacenters as they require either accurate estimations of job sizes [22] or periodic simulations of queued jobs in a virtually fair system [11], [23]. The main idea behind FSP is to extend the SRPT policy with a job aging function which virtually decreases the sizes of the waiting jobs, thus avoiding starvation of the large jobs.…”
Section: E. Improvements From Tyrex (mentioning)
confidence: 99%
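One way to read the aging idea described in this excerpt is as a virtual processor-sharing simulation: every waiting job's virtual remaining size shrinks by an equal share of the capacity, so a large job that waits long enough eventually becomes the smallest virtual job and cannot starve. The sketch below is an interpretation under that assumption, not the FSP/HFSP algorithm as specified in [11], [23]; the tick-based loop and field names are invented for illustration.

    # Hypothetical aging sketch; the fields and the tick loop are illustrative only.
    def age_waiting_jobs(waiting_jobs, capacity, dt):
        # Virtual processor sharing: each waiting job's virtual remaining size
        # shrinks by an equal share of the cluster capacity per tick, so large
        # jobs eventually reach the front of the queue instead of starving.
        if not waiting_jobs:
            return
        share = capacity * dt / len(waiting_jobs)
        for job in waiting_jobs:
            job["virtual_remaining"] = max(0.0, job["virtual_remaining"] - share)

    def fsp_like_select(waiting_jobs):
        # Serve jobs in order of virtual completion: smallest virtual size first.
        return min(waiting_jobs, key=lambda j: j["virtual_remaining"]) if waiting_jobs else None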
“…Robust approaches to deal with uncertainty are widely used in MapReduce systems [13], [20], in Hadoop [25], [22], in databases [15] and in web servers [3]. The HFSP and FLEX schedulers provide robustness in scheduling against uncertain job sizes [25], [17]. Canon and Jeannot [2] analyzed the correlation between various metrics used to measure robustness and provided scheduling heuristics that optimize both makespan and robustness for scheduling task graphs on heterogeneous systems.…”
Section: Related Work (mentioning)
confidence: 99%
“…Does not consider user-specified goal. [20] HFSP: avoids job starvation, guarantees short response time.…”
Section: Heterogeneous MapReduce Scheduling Techniques (mentioning)
confidence: 99%