Radhika Niranjan Mysore scite author profile

This paper considers the requirements for a scalable, easily manageable, fault-tolerant, and efficient data center network fabric. Trends in multi-core processors, end-host virtualization, and commodities of scale are pointing to future single-site data centers with millions of virtual end points. Existing layer 2 and layer 3 network protocols face some combination of limitations in such a setting: lack of scalability, difficult management, inflexible communication, or limited support for virtual machine migration. To some extent, these limitations may be inherent for Ethernet/IP style protocols when trying to support arbitrary topologies. We observe that data center networks are often managed as a single logical network fabric with a known baseline topology and growth model. We leverage this observation in the design and implementation of PortLand, a scalable, fault tolerant layer 2 routing and forwarding protocol for data center environments. Through our implementation and evaluation, we show that PortLand holds promise for supporting a ``plug-and-play" large-scale, data center network.

show abstract

TritonSort

Rasmussen

Porter

Conley

et al. 2013

ACM Trans. Comput. Syst.

View full text Add to dashboard Cite

We present TritonSort, a highly efficient, scalable sorting system. It is designed to process large datasets, and has been evaluated against as much as 100TB of input data spread across 832 disks in 52 nodes at a rate of 0.938TB/min. When evaluated against the annual Indy GraySort sorting benchmark, TritonSort is 66% better in absolute performance and has over six times the per-node throughput of the previous record holder. When evaluated against the 100TB Indy JouleSort benchmark, TritonSort sorted 9703 records/Joule. In this article, we describe the hardware and software architecture necessary to operate TritonSort at this level of efficiency. Through careful management of system resources to ensure cross-resource balance, we are able to sort data at approximately 80% of the disks’ aggregate sequential write speed. We believe the work holds a number of lessons for balanced system design and for scale-out architectures in general. While many interesting systems are able to scale linearly with additional servers, per-server performance can lag behind per-server capacity by more than an order of magnitude. Bridging the gap between high scalability and high performance would enable either significantly less expensive systems that are able to do the same work or provide the ability to address significantly larger problem sets with the same infrastructure.

show abstract

Brief Announcement: A Randomized Algorithm for Label Assignment in Dynamic Networks

Walraed-Sullivan

Mysore

Marzullo

et al. 2011

View full text Add to dashboard Cite

Condor

Schlinker

Mysore

Smith

et al. 2015

View full text Add to dashboard Cite

The design space for large, multipath datacenter networks is large and complex, and no one design fits all purposes. Network architects must trade off many criteria to design costeffective, reliable, and maintainable networks, and typically cannot explore much of the design space. We present Condor, our approach to enabling a rapid, efficient design cycle. Condor allows architects to express their requirements as constraints via a Topology Description Language (TDL), rather than having to directly specify network structures. Condor then uses constraint-based synthesis to rapidly generate candidate topologies, which can be analyzed against multiple criteria. We show that TDL supports concise descriptions of topologies such as fat-trees, BCube, and DCell; that we can generate known and novel variants of fat-trees with simple changes to a TDL file; and that we can synthesize large topologies in tens of seconds. We also show that Condor supports the daunting task of designing multi-phase network expansions that can be carried out on live networks.

show abstract

FasTrak

Mysore

Porter

Vahdat

2013

View full text Add to dashboard Cite

The shared nature of multi-tenant cloud networks requires providing tenant isolation and quality of service, which in turn requires enforcing thousands of network-level rules, policies, and traffic rate limits. Enforcing these rules in virtual machine hypervisors imposes significant computational overhead, as well as increased latency. In FasTrak, we seek to exploit temporal locality in flows and flow sizes to offload a subset of network virtualization functionality from the hypervisor into switch hardware freeing up the hypervisor. FasTrak manages the required hardware and hypervisor rules as a unified set, moving rules back and forth to minimize the overhead of network virtualization, and focusing on flows (or flow aggregates) that are either most latency sensitive or exhibit the highest packets-per-second rates.

show abstract

Alias

Walraed-Sullivan

Mysore

Tewari

et al. 2011

View full text Add to dashboard Cite

Modern data centers can consist of hundreds of thousands of servers and millions of virtualized end hosts. Managing address assignment while simultaneously enabling scalable communication is a challenge in such an environment. We present ALIAS, an addressing and communication protocol that automates topology discovery and address assignment for the hierarchical topologies that underlie many data center network fabrics. Addresses assigned by ALIAS interoperate with a variety of scalable communication techniques. ALIAS is fully decentralized, scales to large network sizes, and dynamically recovers from arbitrary failures, without requiring modifications to hosts or to commodity switch hardware. We demonstrate through simulation that ALIAS quickly and correctly configures networks that support up to hundreds of thousands of hosts, even in the face of failures and erroneous cabling, and we show that ALIAS is a practical solution for auto-configuration with our NetFPGA testbed implementation.

show abstract

scite is a Brooklyn-based organization that helps researchers better discover and understand research articles through Smart Citations–citations that display the context of the citation and describe whether the article provides supporting or contrasting evidence. scite is used by students and researchers from around the world and is funded in part by the National Science Foundation and the National Institute on Drug Abuse of the National Institutes of Health.

Contact Info

hi@scite.ai

10624 S. Eastern Ave., Ste. A-614

Henderson, NV 89052, USA

Blog Terms and Conditions API Terms Privacy Policy Contact Cookie Preferences Do Not Sell or Share My Personal Information

Made with 💙 for researchers

Part of the Research Solutions Family.