New congestion control algorithms are rapidly improving datacenters by reducing latency, overcoming incast, increasing throughput and improving fairness. Ideally, the operating system in every server and virtual machine is updated to support new congestion control algorithms. However, legacy applications often cannot be upgraded to a new operating system version, which means the advances are offlimits to them. Worse, as we show, legacy applications can be squeezed out, which in the worst case prevents the entire network from adopting new algorithms. Our goal is to make it easy to deploy new and improved congestion control algorithms into multitenant datacenters, without having to worry about TCP-friendliness with nonparticipating virtual machines. This paper presents a solution we call virtualized congestion control. The datacenter owner may introduce a new congestion control algorithm in the hypervisors. Internally, the hypervisors translate between the new congestion control algorithm and the old legacy congestion control, allowing legacy applications to enjoy the benefits of the new algorithm. We have implemented proof-of-concept systems for virtualized congestion control in the Linux kernel and in VMware's ESXi hypervisor, achieving improved fairness, performance, and control over guest bandwidth allocations.
Yahoo mail servers have been receiving an enormous number of messages each day for the past 17 years. The vast majority of todays messages are machine-generated (about 90% of the messages), based on a boilerplate with a small number of specific per-recipient changes. We show that the popular Zlib compression to gzip format fails to fully utilize the high similarity between these machine-generated messages. In this paper we analyze the data redundancy in Yahoo mail, and present methods to reduce its space requirements while using the standard Zlib library. Our results show we can further reduce the compressed data size by a factor of almost 2.5, compared to traditional gzip compression.
Data Compression Conference
scite is a Brooklyn-based organization that helps researchers better discover and understand research articles through Smart Citations–citations that display the context of the citation and describe whether the article provides supporting or contrasting evidence. scite is used by students and researchers from around the world and is funded in part by the National Science Foundation and the National Institute on Drug Abuse of the National Institutes of Health.