Abstract: Existing cloud computing control planes do not scale to more than a few hundred cores, while frameworks without control planes scale but take seconds to reschedule a job. We propose an asynchronous control plane for cloud computing systems, in which a central controller can dynamically reschedule jobs but worker nodes never block on communication with the controller. By decoupling control plane traffic from program control flow in this way, an asynchronous control plane can scale to run millions of computation…
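The decoupling the abstract describes can be sketched in a few lines. This is a minimal illustration, assuming a hypothetical `Worker` class and `control_ch` channel (names are illustrative, not from the paper): the worker runs tasks from a local queue and reports status with a fire-and-forget send, so it never waits on the controller.

```python
import queue

# Hypothetical sketch of an asynchronous control plane: the worker
# executes tasks from its local queue and reports status over a
# non-blocking channel, never waiting for a controller reply.
# Worker and control_ch are illustrative names, not from the paper.

class Worker:
    def __init__(self, control_ch):
        self.tasks = queue.Queue()    # local run queue
        self.control_ch = control_ch  # asynchronous updates to the controller

    def run_once(self):
        task = self.tasks.get()
        result = task()               # compute without touching the control plane
        # fire-and-forget status report: put_nowait never blocks the worker
        self.control_ch.put_nowait(("done", result))
        return result

control_ch = queue.Queue()
w = Worker(control_ch)
w.tasks.put(lambda: 21 * 2)
result = w.run_once()
```

Because the status report is non-blocking, control-plane latency stays off the critical path of task execution, which is what lets the controller reschedule asynchronously without stalling workers.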
“…Thus, unawareness of application-level QoS at runtime can lead to uneven host resource usage or over-saturation (intrusive applications consume too many resources), making neighboring protected workloads experience performance outliers. [14,16] As shown in Figure 1A,B, we observed the JCT of the Spark Kmeans job co-located with stream [17] and the JCT of the MapReduce Terasort job co-located with fio. [18] As the concurrency of the co-located intrusive workloads increases, the JCT of the Spark and MapReduce jobs continues to grow, and the growth goes from initially flat to relatively sharp as the intrusive workloads steal more and more resources.…”
Section: Performance Interference
confidence: 72%
“…Existing efforts on performance interference migration of scale-out workloads focus on application-level scheduling or rescheduling. [6,10,14,16,43] Reference 14 employs a white-box method that collects and analyzes the footprints of the scale-out application at runtime to guide the placement of intrusive tasks and avoid interference, which means that a certain amount of resource utilization is sacrificed in exchange for QoS assurance. In response to an evicted scale-out task, Reference 10 launches multiple replicas and places them on different hosts according to their load level.…”
Section: Migrating Interference Using Application-Level Scheduling
Striking a balance between improved cluster utilization and guaranteed application QoS is a long-standing research problem in multi-tenant shared clusters. The typical solution is to detect performance degradation and investigate its root cause in order to enforce performance isolation. Existing efforts rely on extensive prior knowledge of applications and on the assumption that interference-free workload placement is possible. Performance interference is usually mitigated through application-level approaches such as centralized rescheduling, which acts only in hindsight and wastes resources. In this article, we present ScaleReactor, a graceful per-node runtime agent that decouples performance isolation from centralized resource management and mitigates the performance interference of scale-out workloads in container-based clusters using a lightweight black-box approach. ScaleReactor analyzes the degree of contention for multi-dimensional resources among co-located workloads to detect performance degradation without additional prior information, uses correlation analysis to locate the cause of contention, and isolates resources gracefully to reduce system overhead and the performance degradation of intrusive workloads. Experiments demonstrate that ScaleReactor effectively reduces the job completion time of scale-out applications in shared clusters, by up to 36% with low system overhead compared with the existing isolation mechanism.
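The correlation-analysis step the abstract mentions can be illustrated simply. This is a minimal sketch, not ScaleReactor's actual method: given latency samples of a protected job and per-resource usage samples of co-located workloads, rank resources by Pearson correlation to guess the contention source. All names and data below are illustrative.

```python
# Illustrative sketch of locating a contention cause via correlation:
# the resource whose usage tracks the protected job's latency most
# closely is flagged as the likely source. Data here is synthetic.

def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length samples."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = sum((x - mx) ** 2 for x in xs) ** 0.5
    sy = sum((y - my) ** 2 for y in ys) ** 0.5
    return cov / (sx * sy) if sx and sy else 0.0

latency = [10, 12, 18, 25, 31]                # JCT-like samples over time
usage = {
    "disk_io": [0.2, 0.3, 0.6, 0.8, 0.9],     # rises with latency
    "network": [0.5, 0.4, 0.5, 0.4, 0.5],     # roughly flat
}
# Flag the resource most correlated with the observed slowdown
suspect = max(usage, key=lambda r: pearson(latency, usage[r]))
```

A real agent would of course use richer metrics (cache misses, memory bandwidth, and so on) and guard against spurious correlation, but the ranking idea is the same.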
“…We evaluate four things: (1) how much micro‐partitioning and randomized partition assignment improve the end‐to‐end performance of simulations, (2) how the number of partitions affects performance, (3) how Birdshot performs when using different numbers of nodes and (4) how well Birdshot performs compared with other load balancing algorithms. Birdshot scheduling uses a task‐based runtime implemented in C++ [QMSL18]. MPI implementations (Open MPI 1.6.5 [url17]) are used as a reference point without micro‐partitioning or randomized assignment.…”
Graphical fluid simulations are CPU-bound. Parallelizing simulations on hundreds of cores in the computing cloud would make them faster, but requires evenly balancing load across nodes. Good load balancing depends on manual decisions from experts, which are time-consuming and error-prone, or on dynamic approaches that estimate and react to future load, which are non-deterministic and hard to debug.
This paper proposes Birdshot scheduling, an automatic and purely static load balancing algorithm whose performance is close to expert decisions and reactive algorithms without their difficulty or complexity. Birdshot scheduling's key insight is to leverage the high‐latency, high‐throughput, full bisection bandwidth of cloud computing nodes. Birdshot scheduling splits the simulation domain into many micro‐partitions and statically assigns them to nodes randomly. Analytical results show that randomly assigned micro‐partitions balance load with high probability. The high‐throughput network easily handles the increased data transfers from micro‐partitions, and full bisection bandwidth allows random placement with no performance penalty. Overlapping the communications and computations of different micro‐partitions masks latency.
Experiments with particle-level set, SPH, FLIP and explicit Eulerian methods show that Birdshot scheduling speeds up simulations by a factor of 2-3, and can outperform reactive scheduling algorithms. Birdshot scheduling performs within 21% of state-of-the-art dynamic methods that require running a second, parallel simulation. Unlike speculative algorithms, Birdshot scheduling is purely static: it requires no controller, runtime data collection, partition migration, or support for these operations from the programmer.
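The statistical argument behind micro-partitioning can be demonstrated with a small simulation. This sketch is illustrative only (the partition costs are synthetic, not from the paper): partitions with skewed costs are assigned to nodes uniformly at random, and per-node load concentrates near the mean as the number of micro-partitions per node grows.

```python
import random

# Illustrative sketch of Birdshot's key insight: with many randomly
# placed micro-partitions per node, each node's total load is a sum of
# many independent draws, so imbalance (max load / mean load) shrinks.
# Exponential costs are a synthetic stand-in for skewed partition work.

def imbalance(num_parts, num_nodes, rng):
    costs = [rng.expovariate(1.0) for _ in range(num_parts)]  # skewed work
    load = [0.0] * num_nodes
    for c in costs:
        load[rng.randrange(num_nodes)] += c                   # random placement
    mean = sum(load) / num_nodes
    return max(load) / mean                                    # 1.0 = perfect

rng = random.Random(0)
coarse = imbalance(num_parts=16, num_nodes=8, rng=rng)    # ~2 partitions/node
fine = imbalance(num_parts=4096, num_nodes=8, rng=rng)    # ~512 partitions/node
```

With ~512 micro-partitions per node, the busiest node ends up only a few percent above the mean, whereas the coarse assignment is badly skewed; this is the concentration effect the paper's analytical results formalize.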
“…General cloud schedulers: A variety of task and cluster management systems include scheduling subsystems. Many architectures have been designed, from distributed [68,75,78,100] to centralized [48,97,98] techniques. Kubernetes can assign QoS to pods [19], but cannot provide function-level QoS.…”
Serverless computing is a rapidly growing paradigm that easily harnesses the power of the cloud. With serverless computing, developers simply provide an event-driven function to cloud providers, and the provider seamlessly scales function invocations to meet demands as event triggers occur. As current and future serverless offerings support a wide variety of serverless applications, effective techniques to manage serverless workloads become an important issue. This work examines current management and scheduling practices in cloud providers, uncovering many issues including inflated application run times, function drops, inefficient allocations, and other undocumented and unexpected behavior. To fix these issues, a new quality-of-service function scheduling and allocation framework, called Sequoia, is designed. Sequoia allows developers or administrators to easily define how serverless functions and applications should be deployed, capped, prioritized, or altered based on easily configured, flexible policies. Results with controlled and realistic workloads show Sequoia seamlessly adapts to policies, eliminates mid-chain drops, reduces queuing times by up to 6.4×, enforces tight chain-level fairness, and improves run-time performance up to 25×.
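Policy-driven function scheduling of the kind described above can be sketched as a priority queue with per-function concurrency caps. This is a hypothetical illustration, not Sequoia's implementation; the policy table and function names are invented.

```python
import heapq

# Hypothetical sketch of QoS function scheduling: invocations are
# dispatched in priority order, and invocations of a function at its
# concurrency cap are deferred (queued) rather than dropped, which is
# what prevents mid-chain drops. Policy values are illustrative.

policies = {"checkout": {"priority": 0, "cap": 2},    # lower = more urgent
            "thumbnail": {"priority": 5, "cap": 1}}

def dispatch(invocations, running):
    """Pick runnable invocations in priority order; defer capped ones."""
    heap = [(policies[f]["priority"], i, f) for i, f in enumerate(invocations)]
    heapq.heapify(heap)
    started, deferred = [], []
    while heap:
        _, _, f = heapq.heappop(heap)
        if running.get(f, 0) < policies[f]["cap"]:
            running[f] = running.get(f, 0) + 1
            started.append(f)
        else:
            deferred.append(f)          # queued for later, never dropped
    return started, deferred

running = {}
started, deferred = dispatch(
    ["thumbnail", "checkout", "checkout", "checkout"], running)
```

Here the two highest-priority `checkout` invocations start immediately, the third waits for a slot, and `thumbnail` runs within its own cap; a real framework would re-run dispatch as slots free up.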