Consider a linear [n, k, d]_q code C. We say that the i-th coordinate of C has locality r if the value at this coordinate can be recovered by accessing some other r coordinates of C. Data storage applications require codes with small redundancy, low locality for information coordinates, large distance, and low locality for parity coordinates. In this paper we carry out an in-depth study of the relations between these parameters. We establish a tight bound for the redundancy n − k in terms of the message length, the distance, and the locality of information coordinates. We refer to codes attaining the bound as optimal. We prove some structure theorems about optimal codes, which are particularly strong for small distances. This gives a fairly complete picture of the tradeoffs between codeword length, worst-case distance, and locality of information symbols. We then consider the locality of parity check symbols and erasure correction beyond the worst-case distance for optimal codes. Using our structure theorem, we obtain a tight bound for the locality of parity symbols possible in such codes for a broad class of parameter settings. We prove that there is a tradeoff between having good locality for parity checks and the ability to correct erasures beyond the minimum distance.
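The abstract above does not spell out its redundancy bound. The result is commonly stated as n − k ≥ ⌈k/r⌉ + d − 2, and the sketch below assumes that form. The function name `min_redundancy` is hypothetical and only illustrates the arithmetic.

```python
def min_redundancy(k: int, r: int, d: int) -> int:
    """Lower bound on n - k, assuming the form ceil(k / r) + d - 2.

    k: message length, r: locality of information coordinates, d: distance.
    """
    if k < 1 or d < 1 or not 1 <= r <= k:
        raise ValueError("need k >= 1, d >= 1 and 1 <= r <= k")
    ceil_k_over_r = -(-k // r)  # integer ceiling of k / r
    return ceil_k_over_r + d - 2


# 12 data symbols, locality 6, distance 4: at least 2 + 4 - 2 = 4 parity symbols.
print(min_redundancy(12, 6, 4))  # 4
```

With r = k the expression reduces to d − 1, which is the classical Singleton bound, so the locality requirement costs ⌈k/r⌉ − 1 extra parity symbols under this form.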
P2P file downloading and streaming have already become very popular Internet applications. These systems dramatically reduce server load and provide a platform for scalable content distribution, as long as there is interest in the content. P2P-based video-on-demand (P2P-VoD) is a new challenge for P2P technology. Unlike live streaming, P2P-VoD has less synchrony among the users sharing video content, so it is much more difficult to alleviate server load while maintaining streaming performance. To compensate, every peer contributes a small amount of storage, and new mechanisms for coordinating content replication, content discovery, and peer scheduling are carefully designed. In this paper, we describe and discuss the challenges and the architectural design issues of a large-scale P2P-VoD system based on the experiences of a real system deployed by PPLive. The system is also designed and instrumented with monitoring capability to measure both system and component-specific performance metrics (for design improvements) as well as user satisfaction. After analyzing a large amount of collected data, we present a number of results on user behavior and various system performance metrics, including user satisfaction, and discuss what we observe based on the system design. The study of a real-life system provides valuable insights for the future development of P2P-VoD technology.
Video-on-demand in the Internet has become an immensely popular service in recent years. But due to its high bandwidth requirements and popularity, it is also a costly service to provide. We consider the design and potential benefits of peer-assisted video-on-demand, in which participating peers assist the server in delivering VoD content. The assistance is done in such a way that it provides the same user quality experience as pure client-server distribution. We focus on the single-video approach, whereby a peer only redistributes a video that it is currently watching. Using a nine-month trace from a client-server VoD deployment for MSN Video, we assess what the 95th-percentile server bandwidth costs would have been if a peer-assisted deployment had been used instead. We show that peer assistance can dramatically reduce server bandwidth costs, particularly if peers prefetch content when there is spare upload capacity in the system. We consider the impact of peer-assisted VoD on the cross-traffic among ISPs. Although this traffic is significant, if care is taken to localize the P2P traffic within the ISPs, we can eliminate the ISP cross-traffic while still achieving important reductions in server bandwidth. We also develop a simple analytical model which captures many of the critical features of peer-assisted VoD, including its operational modes.
Consider a systematic linear code where some (local) parity symbols depend on few prescribed symbols, while other (heavy) parity symbols may depend on all data symbols. Local parities allow quick recovery of any single symbol when it is erased, while heavy parities provide tolerance to a large number of simultaneous erasures. A code as above is maximally-recoverable if it corrects all erasure patterns which are information theoretically recoverable given the code topology. In this paper we present explicit families of maximally-recoverable codes with locality. We also initiate the study of the trade-off between maximal recoverability and alphabet size.
Definition 3. Let C be a data-local (k, r, h)-code. We say that C is maximally-recoverable if for any set E ⊆ [n], where E is obtained by picking one coordinate from each of the k/r local groups, puncturing C in the coordinates in E yields a maximum distance separable [k + h, k] code.
A [k + h, k] MDS code obviously corrects all patterns of h erasures. Therefore a maximally-recoverable data-local (k, r, h)-code corrects all erasure patterns E ⊆ [n] that involve erasing one coordinate per local group, and h additional coordinates. We now argue that any erasure pattern that is not dominated by a pattern above has to be uncorrectable.
Lemma 4. Let C be an arbitrary data-local (k, r, h)-code. Let E ⊆ [n] be an erasure pattern. Suppose E affects t local groups and |E| > t + h; then E is not correctable.
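The counting condition in Lemma 4 can be checked mechanically. The sketch below assumes a particular coordinate layout that the excerpt does not specify: the k/r local groups occupy consecutive blocks of r + 1 coordinates (r data symbols plus one local parity), followed by the h heavy parities. The function names are hypothetical.

```python
def affected_groups(erased: set[int], k: int, r: int) -> set[int]:
    """Indices of local groups containing at least one erased coordinate.

    Assumed layout: group g covers coordinates g*(r+1) .. g*(r+1)+r;
    heavy parities come last and belong to no local group.
    """
    group_size = r + 1
    local_coords = (k // r) * group_size
    return {i // group_size for i in erased if i < local_coords}


def uncorrectable_by_lemma4(erased: set[int], k: int, r: int, h: int) -> bool:
    """True when |E| > t + h, i.e. Lemma 4 rules the pattern out."""
    if k % r != 0:
        raise ValueError("r must divide k")
    n = k + k // r + h
    if any(i < 0 or i >= n for i in erased):
        raise ValueError("erased coordinate out of range")
    t = len(affected_groups(erased, k, r))
    return len(erased) > t + h


# k = 4, r = 2, h = 1: groups {0,1,2} and {3,4,5}, heavy parity at 6.
print(uncorrectable_by_lemma4({0, 1, 2}, 4, 2, 1))  # True: 3 > 1 + 1
print(uncorrectable_by_lemma4({0, 3, 6}, 4, 2, 1))  # False: 3 <= 2 + 1
```

A `False` result only means Lemma 4 does not exclude the pattern. Whether the pattern is actually corrected depends on the code, and per the paragraph before the lemma a maximally-recoverable code corrects it.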
Network codes designed specifically for distributed storage systems have the potential to provide dramatically higher storage efficiency for the same availability. One main challenge in the design of such codes is the exact repair problem: if a node storing encoded information fails, in order to maintain the same level of reliability we need to create encoded information at a new node. One of the main open problems in this emerging area has been the design of simple coding schemes that allow exact and low-cost repair of failed nodes and have high data rates. In particular, all prior known explicit constructions have data rates bounded by 1/2. In this paper we introduce the first family of distributed storage codes that have simple look-up repair and can achieve arbitrarily high rates. Our constructions are very simple to implement and perform exact repair by simple XORing of packets. We experimentally evaluate the proposed codes in a realistic cloud storage simulator and show significant benefits in both performance and reliability compared to replication and standard Reed-Solomon codes.
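As a rough illustration of what exact repair by XORing packets means, the sketch below uses a plain single-parity scheme. This is not the construction from the abstract above, whose details are not given here. It only shows that a lost packet can be rebuilt bit for bit from the survivors and one XOR parity. All names are hypothetical.

```python
from functools import reduce


def xor_packets(packets: list[bytes]) -> bytes:
    """Bytewise XOR of equal-length packets."""
    if not packets:
        raise ValueError("need at least one packet")
    if len({len(p) for p in packets}) != 1:
        raise ValueError("packets must have equal length")
    return reduce(lambda a, b: bytes(x ^ y for x, y in zip(a, b)), packets)


# Three data packets stored on three nodes, plus one XOR parity packet.
data = [b"abcd", b"wxyz", b"1234"]
parity = xor_packets(data)

# The node holding data[1] fails; XOR the survivors with the parity.
repaired = xor_packets([data[0], data[2], parity])
print(repaired == data[1])  # True
```

The repaired packet is identical to the lost one, which is what distinguishes exact repair from repair that merely restores an equivalent encoding.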