Kaiyuan Hou scite author profile

Virtualized data centers enable sharing of resources among hosted applications. However, it is difficult to satisfy service-level objectives (SLOs) of applications on shared infrastructure, as application workloads and resource consumption patterns change over time. In this paper, we present AutoControl, a resource control system that automatically adapts to dynamic workload changes to achieve application SLOs. AutoControl is a combination of an online model estimator and a novel multi-input, multi-output (MIMO) resource controller. The model estimator captures the complex relationship between application performance and resource allocations, while the MIMO controller allocates the right amount of multiple virtualized resources to achieve application SLOs. Our experimental evaluation with RUBiS and TPC-W benchmarks along with production-trace-driven workloads indicates that AutoControl can detect and mitigate CPU and disk I/O bottlenecks that occur over time and across multiple nodes by allocating each resource accordingly. We also show that AutoControl can be used to provide service differentiation according to the application priorities during resource contention.
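
The abstract does not say which estimator or control law AutoControl uses, so the following is only a minimal sketch of the general idea of pairing an online model estimator with a resource controller. It assumes a linear model from two resource allocations (CPU and disk shares) to one performance metric, swaps in a normalized least-mean-squares update as the estimator, and uses a minimum-change projection as the controller. It is a multi-input, single-output simplification; all function names are hypothetical.

```python
def predict(w, u):
    """Predicted performance for allocation vector u under linear weights w."""
    return sum(wi * ui for wi, ui in zip(w, u))


def nlms_update(w, u, y, mu=1.0):
    """One normalized-LMS step (an assumed estimator, not the paper's).

    Moves the weights toward explaining the observed pair (u, y).
    """
    norm = sum(ui * ui for ui in u)
    if norm == 0:
        return list(w)
    err = y - predict(w, u)
    return [wi + mu * err * ui / norm for wi, ui in zip(w, u)]


def next_allocation(w, u, target):
    """Smallest change to u that makes the model predict `target`.

    The result is not clamped to valid shares; a real controller would
    also have to respect capacity limits.
    """
    norm = sum(wi * wi for wi in w)
    if norm == 0:
        return list(u)
    gap = target - predict(w, u)
    return [ui + gap * wi / norm for ui, wi in zip(u, w)]


# Usage: observe throughput 100 at (CPU=0.5, disk=0.25), then aim for 120.
weights = nlms_update([0.0, 0.0], [0.5, 0.25], 100.0)
new_shares = next_allocation(weights, [0.5, 0.25], 120.0)
```

Running the estimator and the controller in a loop, one step per measurement interval, is what lets this kind of design follow workload changes over time.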

Application-assisted live migration of virtual machines with Java applications

Hou, Shin, Sung (2015)

Live migration of virtual machines (VMs) can consume excessive time and resources, and may affect application performance significantly if VM memory pages get dirtied faster than their content can be transferred to the destination. Existing approaches to this problem transfer memory content faster with high-speed networks, slow down the dirtying of memory pages by throttling the execution of applications, or reduce the amount of memory content to be transferred, for example, using compression. However, these approaches incur high resource costs or application performance penalties. In this paper, we propose to skip the transfer of VM memory pages that need not be migrated for the execution of running applications at the destination, by exploiting applications' assistance. We have designed a generic framework for application-assisted live migration and then used it to build and evaluate JAVMM, which migrates VMs running various types of Java applications skipping the transfer of garbage in Java memory. Our experimental results show that JAVMM can reduce the completion time, the network traffic of transferring memory pages, and the application downtime of Java VM migration, all by up to over 90%, compared to the vanilla Xen VM migration, without incurring noticeable performance penalty to applications.
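
To illustrate why skipping unneeded pages helps, here is a minimal sketch of an iterative pre-copy migration that counts how many pages are sent. It is a toy simulation, not JAVMM or Xen code: the random page-dirtying model, the function name, and the parameters are all assumptions made for illustration.

```python
import random


def precopy_pages_sent(num_pages, garbage, dirty_rate, rounds, seed=0):
    """Count pages transferred by a simulated pre-copy migration.

    garbage: set of page numbers the application reports as not needed
             at the destination; these pages are never transferred.
    dirty_rate: assumed probability that a page is dirtied during a round.
    """
    rng = random.Random(seed)
    # The first round sends every page that is actually needed.
    to_send = {p for p in range(num_pages) if p not in garbage}
    sent = 0
    for _ in range(rounds):
        sent += len(to_send)
        # Pages dirtied while this round was in flight must be resent,
        # unless the application has marked them as garbage.
        dirtied = {p for p in range(num_pages) if rng.random() < dirty_rate}
        to_send = dirtied - garbage
    # Final stop-and-copy round: the VM is paused and the rest is sent.
    sent += len(to_send)
    return sent


baseline = precopy_pages_sent(1000, set(), 0.1, rounds=3)
assisted = precopy_pages_sent(1000, set(range(500)), 0.1, rounds=3)
```

With the same seed both runs see the same dirtied pages, so `assisted` can only be smaller than `baseline`; how much smaller depends on the share of memory the application can mark as garbage.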

Integration of Burst Buffer in High-level Parallel I/O Library for Exa-scale Computing Era

Hou, Al-Bahrani, Rangel, et al. (2018)

Tradeoffs in compressing virtual machine checkpoints

Hou, Shin, Turner, et al. (2013)

Checkpoint replication is a prevalent way of maintaining virtual machine availability in the presence of host failures. Since checkpoint replication can impose heavy load on network resources, checkpoint compression has been suggested to reduce network usage. This paper presents the first detailed evaluation and characterization of the effectiveness and overheads of checkpoint compression methods for various workloads frequently seen in high-availability systems. We propose a lightweight compression method that exploits similarities in checkpoints to eliminate redundant network traffic, and compare it with two well-known methods, gzip and delta compression. Our results show that gzip and delta compression reduce network traffic significantly for various workloads, but incur high CPU and memory overheads, respectively. The proposed similarity compression is most effective for VM clusters running homogeneous workloads, while using both CPU and memory efficiently. Based on our extensive evaluation, we suggest guidelines for selecting and using these compression methods.
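
The three families of methods compared above can be sketched roughly as follows. This is an illustration under stated assumptions, not the paper's implementation: DEFLATE via Python's `zlib` stands in for gzip, the delta is a plain XOR against the previous checkpoint, and the similarity method is approximated by page-level content hashing, which is a guess at what "exploits similarities in checkpoints" could look like. The page size and function names are hypothetical.

```python
import hashlib
import zlib

PAGE = 4096  # assumed page size in bytes


def gzip_size(checkpoint: bytes) -> int:
    """Bytes on the wire if the whole checkpoint is DEFLATE-compressed."""
    return len(zlib.compress(checkpoint))


def delta_size(prev: bytes, curr: bytes) -> int:
    """Bytes on the wire if only the XOR delta against `prev` is compressed.

    The sender must keep `prev` in memory to compute the delta.
    """
    if len(prev) != len(curr):
        raise ValueError("checkpoints must have the same length")
    delta = bytes(a ^ b for a, b in zip(prev, curr))
    return len(zlib.compress(delta))


def similarity_size(curr: bytes, seen: set) -> int:
    """Bytes on the wire if already-seen pages are replaced by their hash.

    `seen` holds digests of pages the receiver already has; it is updated
    in place and could be shared across VMs running similar workloads.
    """
    total = 0
    for off in range(0, len(curr), PAGE):
        page = curr[off:off + PAGE]
        digest = hashlib.sha256(page).digest()
        if digest in seen:
            total += len(digest)  # send only a reference to the page
        else:
            seen.add(digest)
            total += len(page)    # send the full page once
    return total
```

In this sketch the delta method pays for its savings by holding the previous checkpoint, and the hashing method needs only a set of digests, which is in line with the memory and CPU trade-offs the abstract reports.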

Optimizing Performance of Parallel I/O Accesses to Non-contiguous Blocks in Multiple Array Variables

Kang, Breitenfeld, Hou, et al. (2021)

Improving MPI Collective I/O for High Volume Non-Contiguous Requests With Intra-Node Aggregation

Kang, Lee, Hou, et al. (2020)

IEEE Trans. Parallel Distrib. Syst.

Two-phase I/O is a well-known strategy for implementing collective MPI-IO functions. It redistributes I/O requests among the calling processes into a form that minimizes the file access costs. As modern parallel computers continue to grow into the exascale era, the communication cost of such request redistribution can quickly overwhelm collective I/O performance. This effect has been observed from parallel jobs that run on multiple compute nodes with a high count of MPI processes on each node. To reduce the communication cost, we present a new design for collective I/O by adding an extra communication layer that performs request aggregation among processes within the same compute nodes. This approach can significantly reduce inter-node communication contention when redistributing the I/O requests. We evaluate the performance and compare it with the original two-phase I/O on Cray XC40 parallel computers (Theta and Cori) with Intel KNL and Haswell processors. Using I/O patterns from two large-scale production applications and an I/O benchmark, we show our proposed method effectively reduces the communication cost and hence maintains the scalability for a large number of processes.
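
The effect of an intra-node aggregation layer can be illustrated with a small counting model. This is a sketch under simplifying assumptions, not the paper's design or any MPI library's code: every sender is assumed to hold data for every I/O aggregator, I/O aggregators are placed round-robin across nodes, and the function names are hypothetical.

```python
def coalesce(requests):
    """Merge (offset, length) requests that touch or overlap.

    This is the kind of work a per-node aggregator could do before any
    data leaves the node.
    """
    merged = []
    for off, length in sorted(requests):
        if merged and off <= merged[-1][0] + merged[-1][1]:
            last_off, last_len = merged[-1]
            merged[-1] = (last_off, max(last_len, off + length - last_off))
        else:
            merged.append((off, length))
    return merged


def inter_node_messages(nodes, procs_per_node, io_aggregators,
                        local_aggregators=None):
    """Count inter-node messages in one request-redistribution round.

    local_aggregators=None models the original two-phase I/O, where every
    process sends directly; otherwise only that many processes per node
    send on behalf of their neighbours.
    """
    senders = procs_per_node if local_aggregators is None else local_aggregators
    count = 0
    for node in range(nodes):
        for agg in range(io_aggregators):
            if agg % nodes != node:  # aggregator lives on another node
                count += senders
    return count


direct = inter_node_messages(nodes=4, procs_per_node=64, io_aggregators=4)
layered = inter_node_messages(nodes=4, procs_per_node=64, io_aggregators=4,
                              local_aggregators=1)
```

Under these assumptions `direct` is 768 and `layered` is 12: the inter-node message count no longer grows with the number of processes per node, at the price of an extra intra-node communication step.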

Supporting Data Compression in PnetCDF

Hou, Qiao, Lee, et al. (2021)

LA-TinyOS

Huang, Hou, et al. (2007)

Automated control of multiple virtualized resources
