Abstract-Reliable and highly available computer networks must implement resilient fast rerouting mechanisms: upon a link or node failure, an alternative route is determined quickly, without involving the network control plane. Designing such fast failover mechanisms capable of dealing with multiple concurrent failures however is challenging, as failover rules need to be installed proactively, i.e., ahead of time, without knowledge of the actual failures happening at runtime. Indeed, only little is known today about the design of resilient routing algorithms. This paper presents a deterministic local failover mechanism which we prove to result in a minimum network load for a wide range of communication patterns, solving an open problem. Our mechanism relies on the key insight that resilient routing essentially constitutes a distributed algorithm without coordination. Accordingly, we build upon the theory of combinatorial designs and develop a novel deterministic failover mechanism based on symmetric block design theory which tolerates a maximal number of Ω(n) link failures in an n-node network and in the worstcase, while always ensuring routing connectivity. In particular, we show that at least Ω(φ 2 ) link failures are needed to generate a maximum link load of at least φ, which matches an existing bound on the number of link failures needed for an optimal failover scheme. We complement our formal analysis with simulations, showing that our approach outperforms prior schemes not only in the worst-case.
Modern Internet applications, from HD video-conferencing to health monitoring and remote control of power-plants, pose stringent demands on network latency, bandwidth and availability. Centralized inter-domain routing brokers is an approach to support such applications and provide inter-domain guarantees, enabling new avenues for innovation. These entities centralize routing control for missioncritical traffic across domains, working in parallel to BGP. In this work, we propose using IXPs as natural points for stitching interdomain paths under the control of inter-domain routing brokers.To evaluate the potential of this approach, we first map the global substrate of inter-IXP pathlets that IXP members could offer, based on measurements for 229 IXPs worldwide. We show that using IXPs as stitching points has two useful properties. Up to 91% of the total IPv4 address space can be served by such inter-domain routing brokers when working in concert with just a handful of large IXPs and their associated ISP members. Second, path diversity on the inter-IXP graph increases by up to 29 times, as compared to current BGP valley-free routing. To exploit the rich path diversity, we introduce algorithms that inter-domain routing brokers can use to embed paths, subject to bandwidth and latency constraints. We show that our algorithms scale to the sizes of the measured graphs and can serve diverse simulated path request mixes. Our work highlights a novel direction for SDN innovation across domains, based on logically centralized control and programmable IXP fabrics.
Network failures are frequent and disruptive, and can significantly reduce the throughput even in highly connected and regular networks such as datacenters. While many modern networks support some kind of local fast failover to quickly reroute flows encountering link failures to new paths, employing such mechanisms is known to be non-trivial, as conditional failover rules can only depend on local failure information. While over the last years, important insights have been gained on how to design failover schemes providing high resiliency, existing approaches have the shortcoming that the resulting failover routes may be unnecessarily long, i.e., they have a large stretch compared to the original route length. This is a serious drawback, as long routes entail higher latencies and introduce loads, which may cause the rerouted flows to interfere with existing flows and harm throughput. This paper presents the first deterministic local fast failover algorithms providing provable resiliency and failover route lengths, even in the presence of many concurrent failures. We present stretch-optimal failover algorithms for different network topologies, including multi-dimensional grids, hypercubes and Clos networks, as they are frequently deployed in the context of HPC clusters and datacenters. We show that the computed failover routes are optimal in the sense that no failover algorithm can provide shorter paths for a given number of link failures.
We consider the fundamental problem of updating arbitrary routes in a software-defined network in a (transiently) loop-free manner. Our objective is to compute fast network update schedules which minimize the number of interactions (i.e., rounds) between the controller and the network nodes. We first prove that this problem is difficult in general: The problem of deciding whether a k-round update schedule exists is NPcomplete already for k = 3, and there are problem instances requiring Ω(n) rounds, where n is the network size. Given these negative results, we introduce an attractive, relaxed notion of loop-freedom. We show that relaxed loop-freedom admits for much shorter update schedules (up to a factor Ω(n) in the best case), and present a scheduling algorithm which requires at most Θ(log n) rounds.
Control planes of forthcoming Software-Defined Networks (SDNs) will be distributed : to ensure availability and fault-tolerance, to improve load-balancing, and to reduce overheads, modules of the control plane should be physically distributed. However, in order to guarantee consistency of network operation, actions performed on the data plane by different controllers may need to be synchronized , which is a nontrivial task. In this paper, we propose a synchronization framework for control planes based on atomic transactions, implemented in-band , on the data-plane switches. We argue that this in-band approach is attractive as it keeps the failure scope local and does not require additional out-of-band coordination mechanisms. It allows us to realize fundamental consensus primitives in the presence of controller failures, and we discuss their applications for consistent policy composition and fault-tolerant control-planes. Interestingly, by using part of the data plane configuration space as a shared memory and leveraging the match-action paradigm, we can implement our synchronization framework in today's standard OpenFlow protocol, and we report on our proof-of-concept implementation.
Virtual switches are a crucial component of SDN-based cloud systems, enabling the interconnection of virtual machines in a flexible and "software-defined" manner. This paper raises the alarm on the security implications of virtual switches. In particular, we show that virtual switches not only increase the attack surface of the cloud, but virtual switch vulnerabilities can also lead to attacks of much higher impact compared to traditional switches. We present a systematic security analysis and identify four design decisions which introduce vulnerabilities. Our findings motivate us to revisit existing threat models for SDNbased cloud setups, and introduce a new attacker model for SDN-based cloud systems using virtual switches.
We consider the task of computing (combined) function mapping and routing for requests in Software-Defined Networks (SDNs). Function mapping refers to the assignment of nodes in the substrate network to various processing stages that requests must undergo. Routing refers to the assignment of a path in the substrate network that begins in a source node of the request, traverses the nodes that are assigned functions for this request, and ends in a destination of the request.The algorithm either rejects a request or completely serves a request, and its goal is to maximize the sum of the benefits of the served requests. The solution must abide edge and vertex capacities.We follow the framework suggested by Even et al. [1] for the specification of the processing requirements and routing of requests via processing-and-routing graphs (PR-graphs). In this framework, each request has a demand, a benefit, and PR-graph.Our main result is a randomized approximation algorithm for path computation and function placement with the following guarantee. Let m denote the number of links in the substrate network, ε denote a parameter such that 0 < ε < 1, and opt f denote the maximum benefit that can be attained by a fractional solution (one in which requests may be partly served and flow may be split along multiple paths). Let c min denote the minimum edge capacity, and let d max denote the maximum demand. Let ∆ max denote an upper bound on the number of processing stages a request undergoes. If c min /(∆ max · d max ) = Ω((log m)/ε 2 ), then with probability at least 1 − 1 m − exp(−Ω(ε 2 · opt f /(b max · d max ))), the algorithm computes a (1 − ε)-approximate solution.
scite is a Brooklyn-based organization that helps researchers better discover and understand research articles through Smart Citations–citations that display the context of the citation and describe whether the article provides supporting or contrasting evidence. scite is used by students and researchers from around the world and is funded in part by the National Science Foundation and the National Institute on Drug Abuse of the National Institutes of Health.
hi@scite.ai
334 Leonard St
Brooklyn, NY 11211
Copyright © 2024 scite LLC. All rights reserved.
Made with 💙 for researchers
Part of the Research Solutions Family.