Consider a linear [n, k, d]_q code C. We say that the i-th coordinate of C has locality r if the value at this coordinate can be recovered by accessing some other r coordinates of C. Data storage applications require codes with small redundancy, low locality for information coordinates, large distance, and low locality for parity coordinates. In this paper we carry out an in-depth study of the relations between these parameters. We establish a tight bound for the redundancy n − k in terms of the message length, the distance, and the locality of information coordinates. We refer to codes attaining the bound as optimal. We prove some structure theorems about optimal codes, which are particularly strong for small distances. This gives a fairly complete picture of the tradeoffs between codeword length, worst-case distance, and locality of information symbols. We then consider the locality of parity check symbols and erasure correction beyond worst-case distance for optimal codes. Using our structure theorem, we obtain a tight bound for the locality of parity symbols possible in such codes for a broad class of parameter settings. We prove that there is a tradeoff between having good locality for parity checks and the ability to correct erasures beyond the minimum distance.
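The definition of locality above can be illustrated with a toy binary code. The sketch below is not a construction from the paper: it is a hypothetical example that splits k message bits into groups of r and appends one XOR parity bit per group, so each information bit can be recovered from r other coordinates. The abstract does not state the redundancy bound explicitly; the form used in `min_redundancy`, n − k ≥ ⌈k/r⌉ + d − 2, is the one commonly cited for this result and is included here as an assumption. All function names are illustrative.

```python
from math import ceil


def min_redundancy(k, r, d):
    """Assumed form of the bound: n - k >= ceil(k / r) + d - 2."""
    return ceil(k / r) + d - 2


def encode(message, r):
    """Toy binary code: append one XOR parity bit per group of r message bits."""
    parities = []
    for start in range(0, len(message), r):
        parity = 0
        for bit in message[start:start + r]:
            parity ^= bit
        parities.append(parity)
    return list(message) + parities


def recover(codeword, k, r, i):
    """Recover information bit i without reading it, using only the other
    bits of its group and that group's parity bit (at most r coordinates)."""
    group = i // r
    start = group * r
    end = min(start + r, k)
    value = codeword[k + group]  # the group's parity bit
    for j in range(start, end):
        if j != i:
            value ^= codeword[j]
    return value


message = [1, 0, 1, 1, 1, 0]
k, r = len(message), 3
codeword = encode(message, r)
recovered = [recover(codeword, k, r, i) for i in range(k)]
```

With k = 6 and r = 3, this toy code has n = 8 and minimum distance d = 2, so its redundancy n − k = 2 matches the value returned by `min_redundancy(6, 3, 2)`. It says nothing about larger distances, where the structure theorems mentioned above apply.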
With increasing development of applications for heterogeneous, distributed computing grids, the focus of performance analysis has shifted from a posteriori optimization on homogeneous parallel systems to application tuning for heterogeneous resources with time-varying availability. This shift has profound implications for performance instrumentation and analysis techniques. Autopilot is a new infrastructure for dynamic performance tuning of heterogeneous computational grids based on closed-loop control. This paper describes the Autopilot model of distributed sensors, actuators, and decision procedures, reports preliminary performance benchmarks, and presents a case study in which the Autopilot library is utilized in the development of an adaptive parallel input/output system.
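The closed-loop idea of sensors, decision procedures, and actuators can be sketched generically. The code below does not use or reproduce the Autopilot library's API: every name, the latency model, and the threshold rule are hypothetical, chosen only to show a sensor reading feeding a decision procedure whose output an actuator applies.

```python
def decision_procedure(latency_ms, target_ms, buffer_kb,
                       step_kb=64, min_kb=64, max_kb=1024):
    """Hypothetical rule: grow the I/O buffer when latency exceeds the target,
    shrink it when latency is well below the target, otherwise keep it."""
    if latency_ms > target_ms:
        return min(buffer_kb + step_kb, max_kb)
    if latency_ms < 0.5 * target_ms:
        return max(buffer_kb - step_kb, min_kb)
    return buffer_kb


def control_loop(sensor, actuator, target_ms, buffer_kb, steps):
    """Closed loop: read the sensor, decide, apply the decision, repeat."""
    history = []
    for _ in range(steps):
        latency_ms = sensor(buffer_kb)
        buffer_kb = decision_procedure(latency_ms, target_ms, buffer_kb)
        actuator(buffer_kb)
        history.append(buffer_kb)
    return history


# Toy stand-ins for a real instrumented system (assumptions, not measurements).
applied = []


def toy_sensor(buffer_kb):
    # Pretend latency falls as the buffer grows.
    return 6400 / buffer_kb


def toy_actuator(buffer_kb):
    # Record the setting that would be pushed to the I/O system.
    applied.append(buffer_kb)


history = control_loop(toy_sensor, toy_actuator,
                       target_ms=20, buffer_kb=64, steps=6)
```

In this toy model the loop raises the buffer size until the simulated latency reaches the target and then holds it steady, which is the behavior a closed-loop tuner is meant to show under time-varying conditions.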
Although there are several extant studies of parallel scientific application request patterns, there is little experimental data on the correlation of physical I/O patterns with application I/O stimuli. To understand these correlations, the authors have instrumented the SCSI device drivers of the Intel Paragon OSF/1 operating system to record key physical I/O activities, and have correlated this data with the I/O patterns of scientific applications captured via the Pablo analysis toolkit. This analysis shows that disk hardware features profoundly affect the distribution of request delays and that current parallel file systems respond to parallel application I/O patterns in nonscalable ways.