Characterizing Machines and Workloads on a Google Cluster

Liu, Zitao; Cho, Sangyeun

doi:10.1109/icppw.2012.57

Cited by 129 publications

(73 citation statements)

References 9 publications


“…There has been some limited uptake in analyzing these trace logs, with work focusing on different research objectives including job behavior [6], statistical properties of workload [7][8], as well as machine events and job behavior [9]. However, as stated in [9], due to the massive dataset sizes as well as the required computation and storage power necessary to perform comprehensive analysis, until now it has only been possible to perform analysis at a coarse-grain or perform an in-depth analysis on a small time frame that represents a fraction of the entire trace log.…”

Section: Introduction (mentioning)

confidence: 99%

An Analysis of the Server Characteristics and Resource Utilization in Google Cloud

Garraghan

Townend

2013

2013 IEEE International Conference on Cloud Engineering (IC2E)


Abstract. Understanding the resource utilization and server characteristics of large-scale systems is crucial if service providers are to optimize their operations whilst maintaining Quality of Service. For large-scale datacenters, identifying the characteristics of resource demand and the current availability of such resources allows system managers to design and deploy mechanisms to improve datacenter utilization and meet Service Level Agreements with their customers, as well as facilitating business expansion. In this paper, we present a large-scale analysis of server resource utilization and a characterization of a production Cloud datacenter using the most recent datacenter trace logs made available by Google. We present their statistical properties, and a comprehensive coarse-grain analysis of the data, including submission rates, server classification, and server resource utilization. Additionally, we perform a fine-grained analysis to quantify the resource utilization of servers wasted due to the early termination of tasks. Our results show that datacenter resource utilization remains relatively stable at between 40-60%, that the degree of correlation between server utilization and Cloud workload environment varies by server architecture, and that the amount of resource utilization wasted varies between 4.53-14.22% for different server architectures. This provides invaluable real-world empirical data for Cloud researchers in many subject areas.


“…Liu and Cho reported in their paper Characterizing machines and workloads on a Google cluster [41] that the majority (93%) of the machines monitored in the Google cluster dataset have a capacity set to 0.5, which supports AGILE's argument for setting overload at a capacity of 0.7 and higher.…”

Section: Setup (mentioning)

confidence: 86%

“…Data was recorded every 5 minutes (300 seconds) over a period of 29 days. According to Liu and Cho [41] the dataset has been sanitised to obfuscate confidential information, but still gives useful and accurate information on cluster usage and load. This is important for the evaluations performed in this thesis.…”

Section: Datasets (mentioning)

confidence: 99%

Forecasting methods for cloud hosted resources, a comparison

Engelbrecht

Greunen

2015

2015 11th International Conference on Network and Service Management (CNSM)


“…Several recent comprehensive analyses (e.g., [28,23]) of the workload characteristics derived from Google cloud tracelogs, featuring over 900 users submitting approximately 25 million tasks over a month, yielded significant data on the characteristics of submitted workloads and the management of cluster machines. These studies enable further work on important issues in the domain of resource optimization and energy efficiency improvement.…”

Section: Workloads Based On Google Cloud Tracelogs (mentioning)

confidence: 99%

PIASA: A power and interference aware resource management strategy for heterogeneous workloads in cloud data centers

Sampaio

Barbosa

Prodan

2015

Simulation Modelling Practice and Theory


Abstract. Cloud data centers have been progressively adopted in different scenarios, as reflected in the execution of heterogeneous applications with diverse workloads and diverse quality of service (QoS) requirements. Virtual machine (VM) technology eases resource management in physical servers and helps cloud providers achieve goals such as optimization of energy consumption. However, the performance of an application running inside a VM is not guaranteed due to the interference among co-hosted workloads sharing the same physical resources. Moreover, the different types of co-hosted applications with diverse QoS requirements as well as the dynamic behavior of the cloud make efficient provisioning of resources even more difficult and a challenging problem in cloud data centers. In this paper, we address the problem of resource allocation within a data center that runs different types of application workloads, particularly CPU- and network-intensive applications. To address these challenges, we propose an interference- and power-aware management mechanism that combines a performance deviation estimator and a scheduling algorithm to guide the resource allocation in virtualized environments. We conduct simulations by injecting synthetic workloads whose characteristics follow the last version of the Google Cloud tracelogs. The results indicate that our performance-enforcing strategy is able to fulfill contracted SLAs of real-world environments while reducing energy costs by as much as 21%.

