Abstract: Existing cloud computing control planes do not scale to more than a few hundred cores, while frameworks without control planes scale but take seconds to reschedule a job. We propose an asynchronous control plane for cloud computing systems, in which a central controller can dynamically reschedule jobs but worker nodes never block on communication with the controller. By decoupling control plane traffic from program control flow in this way, an asynchronous control plane can scale to run millions of computation…
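The decoupling the abstract describes can be sketched in a few lines. This is a minimal illustration, assuming a hypothetical `Worker` class and `control_ch` channel (names are illustrative, not from the paper): the worker runs tasks from a local queue and reports status with a fire-and-forget send, so it never waits on the controller.

```python
import queue

# Hypothetical sketch of an asynchronous control plane: the worker
# executes tasks from its local queue and reports status over a
# non-blocking channel, never waiting for a controller reply.
# Worker and control_ch are illustrative names, not from the paper.

class Worker:
    def __init__(self, control_ch):
        self.tasks = queue.Queue()    # local run queue
        self.control_ch = control_ch  # asynchronous updates to the controller

    def run_once(self):
        task = self.tasks.get()
        result = task()               # compute without touching the control plane
        # fire-and-forget status report: put_nowait never blocks the worker
        self.control_ch.put_nowait(("done", result))
        return result

control_ch = queue.Queue()
w = Worker(control_ch)
w.tasks.put(lambda: 21 * 2)
result = w.run_once()
```

Because the status report is non-blocking, control-plane latency stays off the critical path of task execution, which is what lets the controller reschedule asynchronously without stalling workers.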
“…Thus, unawareness of application-level QoS at runtime can lead to uneven host resource usage or over-saturation (intrusive applications consume too many resources), making neighboring protected workloads experience performance outliers. [14,16] As shown in Figure 1A,B, we observed the JCT of the Spark Kmeans job co-located with stream [17] and the JCT of the MapReduce Terasort job co-located with fio. [18] As the concurrency of the co-located intrusive workloads increases, the JCT of the Spark and MapReduce jobs continues to grow, and the growth goes from initially flat to relatively sharp as the intrusive workloads steal more and more resources.…”
Section: Performance Interference
confidence: 72%
“…Existing efforts on performance interference migration of scale-out workloads focus on application-level scheduling or rescheduling. [6,10,14,16,43] Reference 14 employs a white-box method that collects and analyzes the footprints of the scale-out application at runtime to guide the placement of intrusive tasks and avoid interference, which means that a certain amount of resource utilization is sacrificed in exchange for QoS assurance. In response to an evicted scale-out task, Reference 10 launches multiple replicas and places them on different hosts according to their load level.…”
Section: Migrating Interference Using Application-Level Scheduling
Striking a balance between improved cluster utilization and guaranteed application QoS is a long-standing research problem in multi-tenant shared clusters. The typical solution is to detect performance degradation and investigate its root cause in order to enforce performance isolation. Existing efforts rely on extensive prior knowledge of applications and on the assumption that interference-free workload placement is possible. Performance interference is usually mitigated through application-level approaches such as centralized rescheduling, which acts only in hindsight and wastes resources. In this article, we present ScaleReactor, a graceful per-node runtime agent that decouples performance isolation from centralized resource management and mitigates the performance interference of scale-out workloads in container-based clusters using a lightweight black-box approach. ScaleReactor analyzes the degree of contention for multi-dimensional resources among co-located workloads to detect performance degradation without additional prior information, uses correlation analysis to locate the cause of contention, and isolates resources gracefully to reduce system overhead and the performance degradation of intrusive workloads. Experiments demonstrate that ScaleReactor effectively reduces the job completion time of scale-out applications in shared clusters, by up to 36% with low system overhead compared with the existing isolation mechanism.
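The correlation-analysis step the abstract mentions can be illustrated simply. This is a minimal sketch, not ScaleReactor's actual method: given latency samples of a protected job and per-resource usage samples of co-located workloads, rank resources by Pearson correlation to guess the contention source. All names and data below are illustrative.

```python
# Illustrative sketch of locating a contention cause via correlation:
# the resource whose usage tracks the protected job's latency most
# closely is flagged as the likely source. Data here is synthetic.

def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length samples."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = sum((x - mx) ** 2 for x in xs) ** 0.5
    sy = sum((y - my) ** 2 for y in ys) ** 0.5
    return cov / (sx * sy) if sx and sy else 0.0

latency = [10, 12, 18, 25, 31]                # JCT-like samples over time
usage = {
    "disk_io": [0.2, 0.3, 0.6, 0.8, 0.9],     # rises with latency
    "network": [0.5, 0.4, 0.5, 0.4, 0.5],     # roughly flat
}
# Flag the resource most correlated with the observed slowdown
suspect = max(usage, key=lambda r: pearson(latency, usage[r]))
```

A real agent would of course use richer metrics (cache misses, memory bandwidth, and so on) and guard against spurious correlation, but the ranking idea is the same.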
“…We evaluate four things: (1) how much micro‐partitioning and randomized partition assignment improve the end‐to‐end performance of simulations, (2) how the number of partitions affects performance, (3) how Birdshot performs when using different numbers of nodes and (4) how well Birdshot performs compared with other load balancing algorithms. Birdshot scheduling uses a task‐based runtime implemented in C++ [QMSL18]. MPI implementations (Open MPI 1.6.5 [url17]) are used as a reference point without micro‐partitioning or randomized assignment.…”
Graphical fluid simulations are CPU-bound. Parallelizing simulations on hundreds of cores in the computing cloud would make them faster, but requires evenly balancing load across nodes. Good load balancing depends on manual decisions from experts, which are time-consuming and error-prone, or on dynamic approaches that estimate and react to future load, which are non-deterministic and hard to debug.
This paper proposes Birdshot scheduling, an automatic and purely static load balancing algorithm whose performance is close to expert decisions and reactive algorithms without their difficulty or complexity. Birdshot scheduling's key insight is to leverage the high‐latency, high‐throughput, full bisection bandwidth of cloud computing nodes. Birdshot scheduling splits the simulation domain into many micro‐partitions and statically assigns them to nodes randomly. Analytical results show that randomly assigned micro‐partitions balance load with high probability. The high‐throughput network easily handles the increased data transfers from micro‐partitions, and full bisection bandwidth allows random placement with no performance penalty. Overlapping the communications and computations of different micro‐partitions masks latency.
Experiments with particle-level set, SPH, FLIP and explicit Eulerian methods show that Birdshot scheduling speeds up simulations by a factor of 2-3, and can outperform reactive scheduling algorithms. Birdshot scheduling performs within 21% of state-of-the-art dynamic methods that require running a second, parallel simulation. Unlike speculative algorithms, Birdshot scheduling is purely static: it requires no controller, runtime data collection, partition migration, or support for these operations from the programmer.
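The statistical argument behind micro-partitioning can be demonstrated with a small simulation. This sketch is illustrative only (the partition costs are synthetic, not from the paper): partitions with skewed costs are assigned to nodes uniformly at random, and per-node load concentrates near the mean as the number of micro-partitions per node grows.

```python
import random

# Illustrative sketch of Birdshot's key insight: with many randomly
# placed micro-partitions per node, each node's total load is a sum of
# many independent draws, so imbalance (max load / mean load) shrinks.
# Exponential costs are a synthetic stand-in for skewed partition work.

def imbalance(num_parts, num_nodes, rng):
    costs = [rng.expovariate(1.0) for _ in range(num_parts)]  # skewed work
    load = [0.0] * num_nodes
    for c in costs:
        load[rng.randrange(num_nodes)] += c                   # random placement
    mean = sum(load) / num_nodes
    return max(load) / mean                                    # 1.0 = perfect

rng = random.Random(0)
coarse = imbalance(num_parts=16, num_nodes=8, rng=rng)    # ~2 partitions/node
fine = imbalance(num_parts=4096, num_nodes=8, rng=rng)    # ~512 partitions/node
```

With ~512 micro-partitions per node, the busiest node ends up only a few percent above the mean, whereas the coarse assignment is badly skewed; this is the concentration effect the paper's analytical results formalize.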
“…General cloud schedulers: A variety of task and cluster management systems include scheduling subsystems. Many architectures have been designed, from distributed [68,75,78,100] to centralized [48,97,98] techniques. Kubernetes can assign QoS to pods [19], but cannot provide function-level QoS.…”
Serverless computing is a rapidly growing paradigm that easily harnesses the power of the cloud. With serverless computing, developers simply provide an event-driven function to cloud providers, and the provider seamlessly scales function invocations to meet demands as event triggers occur. As current and future serverless offerings support a wide variety of serverless applications, effective techniques to manage serverless workloads become an important issue. This work examines current management and scheduling practices in cloud providers, uncovering many issues including inflated application run times, function drops, inefficient allocations, and other undocumented and unexpected behavior. To fix these issues, a new quality-of-service function scheduling and allocation framework, called Sequoia, is designed. Sequoia allows developers or administrators to easily define how serverless functions and applications should be deployed, capped, prioritized, or altered based on easily configured, flexible policies. Results with controlled and realistic workloads show Sequoia seamlessly adapts to policies, eliminates mid-chain drops, reduces queuing times by up to 6.4×, enforces tight chain-level fairness, and improves run-time performance up to 25×.
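Policy-driven function scheduling of the kind described above can be sketched as a priority queue with per-function concurrency caps. This is a hypothetical illustration, not Sequoia's implementation; the policy table and function names are invented.

```python
import heapq

# Hypothetical sketch of QoS function scheduling: invocations are
# dispatched in priority order, and invocations of a function at its
# concurrency cap are deferred (queued) rather than dropped, which is
# what prevents mid-chain drops. Policy values are illustrative.

policies = {"checkout": {"priority": 0, "cap": 2},    # lower = more urgent
            "thumbnail": {"priority": 5, "cap": 1}}

def dispatch(invocations, running):
    """Pick runnable invocations in priority order; defer capped ones."""
    heap = [(policies[f]["priority"], i, f) for i, f in enumerate(invocations)]
    heapq.heapify(heap)
    started, deferred = [], []
    while heap:
        _, _, f = heapq.heappop(heap)
        if running.get(f, 0) < policies[f]["cap"]:
            running[f] = running.get(f, 0) + 1
            started.append(f)
        else:
            deferred.append(f)          # queued for later, never dropped
    return started, deferred

running = {}
started, deferred = dispatch(
    ["thumbnail", "checkout", "checkout", "checkout"], running)
```

Here the two highest-priority `checkout` invocations start immediately, the third waits for a slot, and `thumbnail` runs within its own cap; a real framework would re-run dispatch as slots free up.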