Birenjith Sasidharan scite author profile

In a distributed storage system, code symbols are dispersed across space in nodes or storage units as opposed to time. In settings such as that of a large data center, an important consideration is the efficient repair of a failed node. Efficient repair calls for erasure codes that in the face of node failure, are efficient in terms of minimizing the amount of repair data transferred over the network, the amount of data accessed at a helper node as well as the number of helper nodes contacted. Coding theory has evolved to handle these challenges by introducing two new classes of erasure codes, namely regenerating codes and locally recoverable codes as well as by coming up with novel ways to repair the ubiquitous Reed-Solomon code. This survey provides an overview of the efforts in this direction that have taken place over the past decade. I. INTRODUCTIONThis survey article deals with the use of erasure coding for the reliable and efficient storage of large amounts of data in settings such as that of a data center. The amount of data stored in a single data center can run into tens or hundreds of petabytes. Reliability of data storage is ensured in part by introducing redundancy in some form, ranging from simple replication to the use of more sophisticated erasure-coding schemes such as Reed-Solomon codes. Minimizing the storage overhead that comes with ensuring reliability is a key consideration in the choice of erasure-coding scheme. More recently a second problem has surfaced, namely, that of node repair.In [1], [2] the authors study the Facebook warehouse cluster and analyze the frequency of node failures as well as the resultant network traffic relating to node repair. It was observed in [1] that a median of 50 nodes are unavailable per day and that a median of 180TB of cross-rack traffic is generated as a result of node unavailability. It was also reported that 98.08% of the cases have exactly one block missing in a stripe. The erasure code that was deployed in this instance was an [n = 14, k = 10] Reed Solomon (RS) code. Here n denotes the block length of the code and k the dimension. The conventional repair of an [n, k] RS code is inefficient in that the repair of a single node, calls for contacting k other (helper) nodes and downloading k times the amount of data stored in the failed node, which is clearly inefficient. Thus there is significant practical interest in the design of erasure-coding techniques that offer both low overhead and which can also be repaired efficiently.Coding theorists have responded to this need by coming up with two new classes of codes, namely ReGenerating (RG) and Locally Recoverable (LR) codes. The focus in a RG code is on minimizing the amount of data download needed to repair a failed node, termed the repair bandwidth while LR codes seek to minimize the number of helper nodes contacted for node repair, termed the repair degree. In a different direction, coding theorists have also re-examined the problem of node repair in RS codes and have come up with new and more efficient ...

show abstract

Layered Exact-Repair Regenerating Codes via Embedded Error Correction and Block Designs

Tian

Sasidharan

Aggarwal

et al. 2015

IEEE Trans. Inform. Theory

View full text Add to dashboard Cite

A new class of exact-repair regenerating codes is constructed by stitching together shorter erasure correction codes, where the stitching pattern can be viewed as block designs. The proposed codes have the help-by-transfer property where the helper nodes simply transfer part of the stored data directly, without performing any computation. This embedded error correction structure makes the decoding process straightforward, and in some cases the complexity is very low. We show that this construction is able to achieve performance better than spacesharing between the minimum storage regenerating codes and the minimum repair-bandwidth regenerating codes, and it is the first class of codes to achieve this performance. In fact, it is shown that the proposed construction can achieve a nontrivial point on the optimal functional-repair tradeoff, and it is asymptotically optimal at high rate, i.e., it asymptotically approaches the minimum storage and the minimum repair-bandwidth simultaneously.

show abstract

An explicit, coupled-layer construction of a high-rate MSR code with low sub-packetization level, small field size and d < (n − 1)

Sasidharan

Vajha

Kumar

2017

View full text Add to dashboard Cite

This paper presents an explicit construction for an ((n = 2qt, k = 2q(t−1), d = n−(q+1)), (α = q(2q) t−1 , β = α q )) regenerating code over a field F Q operating at the Minimum Storage Regeneration (MSR) point. The MSR code can be constructed to have rate k/n as close to 1 as desired, sub-packetization level α ≤ r n r for r = (n − k), field size Q no larger than n and where all code symbols can be repaired with the same minimum data download. This is the first-known construction of such an MSR code for d < (n − 1).

show abstract

A high-rate MSR code with polynomial sub-packetization level

Sasidharan

Agarwal

Kumar

2015

View full text Add to dashboard Cite

We present a high-rate (n, k, d = n − 1)-MSR code with a sub-packetization level that is polynomial in the dimension k of the code. While polynomial sub-packetization level was achieved earlier for vector MDS codes that repair systematic nodes optimally, no such MSR code construction is known. In the low-rate regime (i. e., rates less than one-half), MSR code constructions with a linear sub-packetization level are available. But in the high-rate regime (i. e., rates greater than one-half), the known MSR code constructions required a sub-packetization level that is exponential in k. In the present paper, we construct an MSR code for d = n − 1 with a fixed rate R = t−1 t , t ≥ 2, achieveing a sub-packetization level α = O(k t ). The code allows help-by-transfer repair, i. e., no computations are needed at the helper nodes during repair of a failed node.

show abstract

Codes with hierarchical locality

Sasidharan

Agarwal

Kumar

2015

View full text Add to dashboard Cite

In this paper, we study the notion of codes with hierarchical locality that is identified as another approach to local recovery from multiple erasures. The well-known class of codes with locality is said to possess hierarchical locality with a single level. In a code with two-level hierarchical locality, every symbol is protected by an inner-most local code, and another middle-level code of larger dimension containing the local code. We first consider codes with two levels of hierarchical locality, derive an upper bound on the minimum distance, and provide optimal code constructions of low field-size under certain parameter sets. Subsequently, we generalize both the bound and the constructions to hierarchical locality of arbitrary levels.

show abstract

scite is a Brooklyn-based organization that helps researchers better discover and understand research articles through Smart Citations–citations that display the context of the citation and describe whether the article provides supporting or contrasting evidence. scite is used by students and researchers from around the world and is funded in part by the National Science Foundation and the National Institute on Drug Abuse of the National Institutes of Health.

Contact Info

hi@scite.ai

10624 S. Eastern Ave., Ste. A-614

Henderson, NV 89052, USA

Blog Terms and Conditions API Terms Privacy Policy Contact Cookie Preferences Do Not Sell or Share My Personal Information

Made with 💙 for researchers

Part of the Research Solutions Family.