2016
DOI: 10.1145/2882783

Improving Resource Efficiency at Scale with Heracles

Abstract: User-facing, latency-sensitive services, such as websearch, underutilize their computing resources during daily periods of low traffic. Reusing those resources for other tasks is rarely done in production services since the contention for shared resources can cause latency spikes that violate the service-level objectives of latency-sensitive tasks. The resulting under-utilization hurts both the affordability and energy efficiency of large-scale datacenters. With the slowdown in technology scaling caused by the…

Cited by 148 publications (242 citation statements)
References 78 publications
“…Servers running latency-critical applications operate at low utilization to guard against queuing delays, long requests, and other sources of performance variability. Further, their spare capacity cannot be used by batch applications, as uncontrolled sharing of cores, caches, and power causes high and unpredictable tail latency degradation [30,33,36]. As a result, datacenters servers typically have utilizations of 5-30% [8,9,37].…”
Section: A Anatomy Of Latency-critical Applications | mentioning
confidence: 99%
“…These techniques include new cluster managers that schedule and migrate applications across systems to reduce interference [18,32,36,54], fast dynamic voltage-frequency scaling (DVFS) techniques to improve power efficiency [25,29,32,48], hardware and software schemes to use low power idle states [37,39,53], and hardware resource partitioning schemes that allow batch workloads to run alongside latency-critical ones, improving utilization [29,30,33,57].…”
Section: A Anatomy Of Latency-critical Applications | mentioning
confidence: 99%
“…This work is orthogonal to ours and could be a useful additional signal for our control plane. Heracles manages multiple hardware and software isolation mechanisms, including packet scheduling and cache partitioning, to co-locate latency-sensitive applications with batch tasks while maintaining millisecond SLOs [30]. We limit our focus to DVFS and core assignment but target more aggressive SLOs.…”
Section: Related Work | mentioning
confidence: 99%