Storage clouds, such as Amazon S3, are being widely used for web services and Internet applications. It has been observed that the delay for retrieving data from and placing data into the clouds is quite random, and exhibits weak correlations between different read/write requests. This inspires us to investigate a key problem: can we reduce the delay by transmitting data replications in parallel or using powerful erasure codes?In this paper, we study the problem of reducing the delay of downloading data from cloud storage systems by leveraging multiple parallel threads, assuming that the data has been encoded and stored in the clouds using fixed rate forward error correction (FEC) codes with parameters (n, k). That is., each file is divided into k equal-sized chunks, which are then expanded into n chunks such that any k chunks out of the n are sufficient to successfully restore the original file. The model can be depicted as a multipleserver queue with arrivals of data retrieving requests and a server corresponding to a thread. However, this is not a typical queueing model because a server can terminate its operation, depending on when other servers complete their service (due to the redundancy that is spread across the threads). Hence, to the best of our knowledge, the analysis of this queueing model remains quite uncharted.Real traces from Amazon S3 show that the time to retrieve a fixed size chunk is random and can be accurately approximated as an i.i.d. exponentially distributed random variable. We show that any work-conserving scheme is delay-optimal when k = 1. When k > 1, we find that a simple greedy scheme, which allocates all available threads to the head of line request, is delay optimal, which appears surprising.
Abstract-Software Defined Networks (SDN) decouple the forwarding and control planes from each other. The control plane is assumed to have a global knowledge of the underlying physical and/or logical network topology so that it can monitor, abstract and control the forwarding plane. In our paper, we present solutions that install an optimal or near-optimal (i.e., within 14% of the optimal) number of static forwarding rules on switches/routers so that any controller can verify the topology connectivity and detect/locate link failures at data plane speeds without relying on state updates from other controllers. Our upper bounds on performance indicate that sub-second link failure localization is possible even at data-center scale networks. For networks with hundreds or few thousand links, tens of milliseconds of latency is achievable.
scite is a Brooklyn-based organization that helps researchers better discover and understand research articles through Smart Citations–citations that display the context of the citation and describe whether the article provides supporting or contrasting evidence. scite is used by students and researchers from around the world and is funded in part by the National Science Foundation and the National Institute on Drug Abuse of the National Institutes of Health.
hi@scite.ai
10624 S. Eastern Ave., Ste. A-614
Henderson, NV 89052, USA
Copyright © 2024 scite LLC. All rights reserved.
Made with 💙 for researchers
Part of the Research Solutions Family.