We study a generalization of the setting of regenerating codes, motivated by applications to storage systems consisting of clusters of storage nodes. There are n clusters in total, with m nodes per cluster. A data file is coded and stored across the mn nodes, with each node storing α symbols. For availability of data, we require that the file be retrievable by downloading the entire content from any subset of k clusters. Nodes represent entities that can fail. We distinguish between intra-cluster and inter-cluster bandwidth (BW) costs during node repair. Node repair in a cluster is accomplished by downloading β symbols each from any set of d other clusters, dubbed remote helper clusters, and also up to α symbols each from any set of ℓ surviving nodes, dubbed local helper nodes, in the host cluster. We first identify the optimal trade-off between storage overhead and inter-cluster repair bandwidth under functional repair, and also present optimal exact-repair code constructions for a class of parameters. The new trade-off is strictly better than what is achievable via space-sharing existing coding solutions, whenever ℓ > 0. We then obtain sharp lower bounds on the intra-cluster repair BW necessary to achieve the optimal trade-off. Under functional repair, random linear network codes (RLNCs) simultaneously optimize usage of both inter- and intra-cluster repair BW; simulation results based on RLNCs suggest optimality of the bounds on intra-cluster repair bandwidth. Our bounds reveal the interesting fact that, while it is beneficial to increase the number of local helper nodes ℓ in order to improve the storage-vs-inter-cluster-repair-BW trade-off, increasing ℓ not only increases intra-cluster BW in the host cluster, but also increases the intra-cluster BW in the remote helper clusters. We also analyze resilience of the clustered storage system against passive eavesdropping by providing file-size bounds and optimal code constructions.
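The system model in this abstract can be summarized as a small bookkeeping sketch. It only restates the quantities named above (total storage, data-collection download, and per-repair inter- and intra-cluster downloads); it does not compute the optimal trade-off itself. The class name, the method names, and the parameter name `ell` (for the number of local helper nodes) are assumptions made for illustration, as are the range checks, which are inferred from the wording "other clusters" and "surviving nodes".

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ClusteredStorage:
    """Hypothetical container for the parameters named in the abstract."""
    n: int      # number of clusters
    m: int      # nodes per cluster
    k: int      # clusters contacted for data collection
    d: int      # remote helper clusters contacted per node repair
    ell: int    # local helper nodes per repair (assumed name)
    alpha: int  # symbols stored per node
    beta: int   # symbols downloaded from each remote helper cluster

    def __post_init__(self):
        # Inferred constraints: helpers are *other* clusters and
        # *surviving* nodes of the host cluster.
        if not 1 <= self.k <= self.n:
            raise ValueError("need 1 <= k <= n")
        if not 0 <= self.d <= self.n - 1:
            raise ValueError("need 0 <= d <= n - 1")
        if not 0 <= self.ell <= self.m - 1:
            raise ValueError("need 0 <= ell <= m - 1")

    def total_storage(self) -> int:
        # mn nodes, each storing alpha symbols
        return self.n * self.m * self.alpha

    def data_collection_download(self) -> int:
        # entire content of any k clusters
        return self.k * self.m * self.alpha

    def inter_cluster_repair_bw(self) -> int:
        # beta symbols from each of d remote helper clusters
        return self.d * self.beta

    def max_intra_cluster_repair_bw(self) -> int:
        # up to alpha symbols from each of ell local helper nodes
        return self.ell * self.alpha


# Example parameters chosen arbitrarily for illustration.
system = ClusteredStorage(n=5, m=3, k=3, d=4, ell=2, alpha=4, beta=1)
print(system.total_storage(), system.inter_cluster_repair_bw(),
      system.max_intra_cluster_repair_bw())
```

Keeping the two repair costs as separate methods mirrors the abstract's distinction between inter-cluster and intra-cluster bandwidth.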
We study the trade-off between storage overhead and inter-cluster repair bandwidth in clustered storage systems, while recovering from multiple node failures within a cluster. A cluster is a collection of m nodes, and there are n clusters. For data collection, we download the entire content from any k clusters. For repair of t ≥ 2 nodes within a cluster, we take help from ℓ local nodes, as well as d helper clusters. We characterize the optimal trade-off under functional repair, and also under exact repair for the minimum storage and minimum inter-cluster bandwidth (MBR) operating points. Our bounds show the following interesting facts: 1) When t | (m − ℓ), the trade-off is the same as that under t = 1, and thus there is no advantage in jointly repairing multiple nodes. 2) When t ∤ (m − ℓ), the optimal file size at the MBR point under exact repair can be strictly less than that under functional repair. 3) Unlike the case of t = 1, increasing the number of local helper nodes does not necessarily increase the system capacity under functional repair.
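The first two facts above hinge on whether t divides the number of nodes in the cluster that are not local helpers. A minimal sketch of that check follows; the function name, the parameter name `ell` (number of local helper nodes), and the range check `ell <= m - t` (local helpers must be surviving nodes) are assumptions for illustration. The returned strings paraphrase the abstract and carry no further claims.

```python
def joint_repair_regime(t: int, m: int, ell: int) -> str:
    """Classify parameters by whether t divides (m - ell).

    t:   number of failed nodes repaired jointly (t >= 2)
    m:   nodes per cluster
    ell: local helper nodes (assumed name; assumed 0 <= ell <= m - t)
    """
    if t < 2:
        raise ValueError("the abstract considers t >= 2")
    if not 0 <= ell <= m - t:
        raise ValueError("need 0 <= ell <= m - t")
    if (m - ell) % t == 0:
        # Fact 1: same trade-off as repairing a single node
        return "same trade-off as t = 1"
    # Fact 2: a gap between exact and functional repair is possible
    return "exact-repair MBR file size can be below functional repair"


print(joint_repair_regime(t=2, m=6, ell=2))  # (6 - 2) % 2 == 0
print(joint_repair_regime(t=3, m=6, ell=2))  # (6 - 2) % 3 != 0
```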
In distributed cloud storage, fault tolerance is achieved by regenerating the lost data from the surviving clouds. Recent studies suggest using maximum distance separable (MDS) network codes in cloud storage systems to allow efficient and reliable recovery after node faults. MDS codes are designed to use a substantial number of repair nodes and rely on centralized management and a static, fully connected network between the nodes. However, in highly dynamic environments, such as edge caching in communication networks or peer-to-peer networks, the availability of nodes and communication links is very volatile. In these scenarios, the functionality of MDS codes is limited. In this paper we study a non-MDS network-coded approach, which operates in a decentralized manner and requires a small number of repair nodes for node recovery. We investigate the long-term behavior of the modeled system and demonstrate, analytically and numerically, the durability gains over uncoded storage.
In this paper we consider the additive white Gaussian noise channel with an average input power constraint in the power-limited regime. A well-known result in information theory states that the capacity of this channel can be achieved by random Gaussian coding with analog quadrature amplitude modulation (QAM). In practical applications, however, discrete binary channel codes with digital modulation are most often employed. We analyze the matched filter decoding error probability in random binary and Gaussian coding setups in the wide bandwidth regime, and show that the performance in the two cases is surprisingly similar without explicit adaptation of the codeword construction to the modulation. The result also holds for the multiple access and the broadcast Gaussian channels when the signal-to-noise ratio is low. Moreover, the two modulations can even be mixed together in a single codeword, resulting in a hybrid modulation with asymptotically close decoding behavior. In this sense, the matched filter decoder demonstrates performance that is largely insensitive to the choice of binary versus Gaussian modulation.