Diego Ongaro scite author profile

Scheduling I/O in virtual machine monitors

This paper explores the relationship between domain scheduling in a virtual machine monitor (VMM) and I/O performance. Traditionally, VMM schedulers have focused on fairly sharing the processor resources among domains while leaving the scheduling of I/O resources as a secondary concern. However, this can result in poor and/or unpredictable application performance, making virtualization less desirable for applications that require efficient and consistent I/O behavior. This paper is the first to study the impact of the VMM scheduler on performance using multiple guest domains concurrently running different types of applications. In particular, different combinations of processor-intensive, bandwidth-intensive, and latency-sensitive applications are run concurrently to quantify the impacts of different scheduler configurations on processor and I/O performance. These applications are evaluated on 11 different scheduler configurations within the Xen VMM. These configurations include a variety of scheduler extensions aimed at improving I/O performance. This cross product of scheduler configurations and application types offers insight into the key problems in VMM scheduling for I/O and motivates future innovation in this area.

Fast crash recovery in RAMCloud

Ongaro

et al. 2011

RAMCloud is a DRAM-based storage system that provides inexpensive durability and availability by recovering quickly after crashes, rather than storing replicas in DRAM. RAMCloud scatters backup data across hundreds or thousands of disks, and it harnesses hundreds of servers in parallel to reconstruct lost data. The system uses a log-structured approach for all its data, in DRAM as well as on disk; this provides high performance both during normal operation and during recovery. RAMCloud employs randomized techniques to manage the system in a scalable and decentralized fashion. In a 60-node cluster, RAMCloud recovers 35 GB of data from a failed server in 1.6 seconds. Our measurements suggest that the approach will scale to recover larger memory sizes (64 GB or more) in less time with larger clusters.
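
The recovery figures quoted above imply a rough aggregate throughput, which the following minimal sketch works out. It uses only the numbers stated in the abstract (35 GB, 1.6 seconds, 60 nodes). The even split across all 60 nodes is an assumption made purely for illustration; the abstract does not say how many nodes take part in a recovery or how the load divides among them. The function names are hypothetical.

```python
def aggregate_throughput_gb_per_s(data_gb: float, seconds: float) -> float:
    """Data recovered per second across the whole cluster."""
    return data_gb / seconds


def per_node_throughput_gb_per_s(data_gb: float, seconds: float, nodes: int) -> float:
    """Per-node share, ASSUMING the work is spread evenly over all nodes."""
    return aggregate_throughput_gb_per_s(data_gb, seconds) / nodes


# Figures taken from the abstract above.
DATA_GB = 35.0
RECOVERY_SECONDS = 1.6
CLUSTER_NODES = 60

aggregate = aggregate_throughput_gb_per_s(DATA_GB, RECOVERY_SECONDS)
per_node = per_node_throughput_gb_per_s(DATA_GB, RECOVERY_SECONDS, CLUSTER_NODES)

print(f"aggregate: {aggregate:.3f} GB/s")
print(f"per node (even-split assumption): {per_node:.3f} GB/s")
```

Under these assumptions the aggregate rate is about 21.9 GB/s, far more than a single disk or network link could plausibly deliver, which is consistent with the abstract's point that recovery depends on many servers and disks working in parallel.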

The RAMCloud Storage System

Ousterhout

Gopalan

Gupta

et al. 2015

ACM Trans. Comput. Syst.

198

136

RAMCloud is a storage system that provides low-latency access to large-scale datasets. To achieve low latency, RAMCloud stores all data in DRAM at all times. To support large capacities (1PB or more), it aggregates the memories of thousands of servers into a single coherent key-value store. RAMCloud ensures the durability of DRAM-based data by keeping backup copies on secondary storage. It uses a uniform log-structured mechanism to manage both DRAM and secondary storage, which results in high performance and efficient memory usage. RAMCloud uses a polling-based approach to communication, bypassing the kernel to communicate directly with NICs; with this approach, client applications can read small objects from any RAMCloud storage server in less than 5μs, and durable writes of small objects take about 13.5μs. RAMCloud does not keep multiple copies of data online; instead, it provides high availability by recovering from crashes very quickly (1 to 2 seconds). RAMCloud's crash recovery mechanism harnesses the resources of the entire cluster working concurrently so that recovery performance scales with cluster size.

[Ritchie and Thompson 1974]. Over the past 15 years, the use of DRAM in storage systems has accelerated, driven by the needs of large-scale Web applications. These applications manipulate very large datasets with an intensity that cannot be satisfied by disk and flash alone. As a result, applications are keeping more and more of their long-term data in DRAM. By 2005, all of the major Web search engines kept their search indexes entirely in DRAM, and large-scale caching systems such as memcached [Memcached 2011] have become widely used for applications such as Facebook, Twitter, Wikipedia, and YouTube.

Although DRAM's role is increasing, it is still difficult for application developers to capture the full performance potential of DRAM-based storage. In many cases, DRAM is used as a cache for some other storage system, such as a database; this approach forces developers to manage consistency between the cache and the backing store, and its performance is limited by cache misses and backing store overheads. In other cases, DRAM is managed in an application-specific fashion, which provides high performance but at a high complexity cost for developers. A few recent systems such as Redis [2014] and Cassandra [2014] have begun to provide general-purpose facilities for accessing data in DRAM, but their performance does not approach the full potential of DRAM-based storage.

This article describes RAMCloud, a general-purpose distributed storage system that keeps all data in DRAM at all times. RAMCloud combines three overall attributes: low latency, large scale, and durability. When used with state-of-the-art networking, RAMCloud offers exceptionally low latency for remote access. In our 80-node development cluster with QDR Infiniband, a client can read any 100-byte object in less than 5μs, and durable writes take about 13.5μs. In a large datacenter with 100,000 nodes, we expect small reads to compl...

The case for RAMCloud

et al. 2011
