2020
DOI: 10.1109/tnet.2020.2973224

Optimal Server Selection for Straggler Mitigation

Abstract: The performance of large-scale distributed compute systems is adversely impacted by stragglers when the execution time of a job is uncertain. To manage stragglers, we consider a multi-fork approach for job scheduling, where additional parallel servers are added at forking instants. In terms of the forking instants and the number of additional servers, we compute the job completion time and the cost of server utilization when the task processing times are assumed to have a shifted exponential distribution. We u…
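The trade-off described in the abstract can be illustrated with a small Monte Carlo sketch. This is not the paper's analysis. It assumes a simplified model that the abstract does not spell out: a single task is replicated on every server, the job finishes when the first replica finishes, each replica's processing time is a constant shift plus an exponential random variable, and server cost is the total time servers spend busy until the job completes. The function name `simulate_multi_fork` and all parameter names are hypothetical.

```python
import random


def simulate_multi_fork(fork_times, servers_added, shift, rate, rng):
    """Simulate one job under an assumed replication model.

    fork_times[i] is the instant at which servers_added[i] new servers start.
    Each replica takes shift + Exp(rate) time units (shifted exponential).
    Returns (completion_time, server_cost).
    """
    finish_times = []
    for start, count in zip(fork_times, servers_added):
        for _ in range(count):
            finish_times.append(start + shift + rng.expovariate(rate))

    # Assumption: the job completes as soon as any replica finishes.
    completion = min(finish_times)

    # Assumption: every server started before completion is billed from its
    # fork instant until completion; later forks never start and cost nothing.
    cost = sum(
        count * max(0.0, completion - start)
        for start, count in zip(fork_times, servers_added)
    )
    return completion, cost


if __name__ == "__main__":
    rng = random.Random(0)
    runs = [
        simulate_multi_fork([0.0, 1.0], [1, 2], shift=0.5, rate=1.0, rng=rng)
        for _ in range(10_000)
    ]
    mean_time = sum(t for t, _ in runs) / len(runs)
    mean_cost = sum(c for _, c in runs) / len(runs)
    print(f"mean completion time: {mean_time:.3f}, mean cost: {mean_cost:.3f}")
```

Varying `fork_times` and `servers_added` in this sketch shows the tension between finishing sooner and paying for more server time, under the stated assumptions only.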

Cited by 18 publications (20 citation statements)
References 31 publications
“…Much recent work in big data systems focuses on improving workflows [16], [17], [18], programming framework [19], [20], [21], task scheduling [22], [23], [24].…”
Section: Related Work (mentioning)
confidence: 99%
“…However, in large-scale setups, monitoring data across all host machines is inefficient and can create network bandwidth contention, negatively impacting job response times. The work in [30] proposes a task replication approach for job scheduling to minimize the effect of the Long-Tail problem. The authors analyze the impact of this approach in a heterogeneous platform.…”
Section: Related Work (mentioning)
confidence: 99%
“…This allows their model to run multiple instances in datacenters with powerful computational resources. However, the approach can handle only a single job system with the same workload characteristics and fails in the presence of diverse workloads as pointed by [30].…”
Section: Related Work (mentioning)
confidence: 99%
“…Multi-objective Optimization: Many realistic applications have multiple objectives, e.g., capacity and power usage in the communication system [26, 27], latency and energy consumption [28], efficiency and safety in robotic systems [29]. Further, the overall aim is to optimize a non-linear function of the different objectives.…”
mentioning
confidence: 99%