Despite the importance of storage in enterprise computer systems, there are few adequate tools to design and configure a storage system to meet application data requirements efficiently. Storage system design involves choosing the disk arrays to use, setting the configuration options on those arrays, and determining an efficient mapping of application data onto the configured system. This is a complex process because of the multitude of disk array configuration options, and the need to take into account both capacity and potentially contending I/O performance demands when placing the data. Thus, both existing tools and administrators using rules of thumb often generate designs that are of poor quality. This article presents the Disk Array Designer (DAD), which is a tool that can be used both to guide administrators in their design decisions and to automate the design process. DAD uses a generalized best-fit bin packing heuristic with randomization and backtracking to search efficiently through the huge number of possible design choices. It makes decisions using device models that estimate storage system performance. We evaluate DAD's designs based on traces from a variety of database, filesystem, and e-mail workloads. We show that DAD can handle the difficult task of configuring midrange and high-end disk arrays, even with complex real-world workloads. We also show that DAD quickly generates near-optimal storage system designs, improving in both speed and quality over previous tools.
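The best-fit idea mentioned in this abstract can be illustrated with a small sketch. This is not DAD's algorithm: it assumes identical arrays, replaces the device models with a crude additive IOPS budget, and uses randomized restarts in place of backtracking. The names `Array`, `best_fit_design`, `capacity_gb`, and `max_iops` are hypothetical.

```python
import random
from dataclasses import dataclass, field


@dataclass
class Array:
    """Hypothetical disk array with a capacity limit and an IOPS budget."""
    capacity_gb: float
    max_iops: float
    stores: list = field(default_factory=list)
    used_gb: float = 0.0
    used_iops: float = 0.0

    def fits(self, size_gb, iops):
        # A store fits only if BOTH capacity and I/O demand are satisfied.
        return (self.used_gb + size_gb <= self.capacity_gb
                and self.used_iops + iops <= self.max_iops)

    def slack_after(self, size_gb, iops):
        # Normalized leftover in both dimensions; smaller means a tighter fit.
        cap_left = (self.capacity_gb - self.used_gb - size_gb) / self.capacity_gb
        io_left = (self.max_iops - self.used_iops - iops) / self.max_iops
        return cap_left + io_left


def best_fit_design(stores, capacity_gb, max_iops, trials=20, seed=0):
    """Place (name, size_gb, iops) stores on as few arrays as possible.

    Each trial shuffles the placement order and applies best fit; the
    trial that uses the fewest arrays is returned.
    """
    rng = random.Random(seed)
    best = None
    for _ in range(trials):
        order = list(stores)
        rng.shuffle(order)
        arrays = []
        for name, size_gb, iops in order:
            candidates = [a for a in arrays if a.fits(size_gb, iops)]
            if candidates:
                # Best fit: the array left with the least slack afterwards.
                target = min(candidates,
                             key=lambda a: a.slack_after(size_gb, iops))
            else:
                target = Array(capacity_gb, max_iops)
                if not target.fits(size_gb, iops):
                    raise ValueError(f"store {name!r} exceeds a single array")
                arrays.append(target)
            target.stores.append(name)
            target.used_gb += size_gb
            target.used_iops += iops
        if best is None or len(arrays) < len(best):
            best = arrays
    return best
```

Checking both dimensions in `fits` reflects the abstract's point that capacity alone is not enough: a store may fit by size yet still overload an array's I/O budget.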
We provide a competitive analysis framework for online prefetching and buffer management algorithms in parallel I/O systems, using a read-once model of block references. This has widespread applicability to key I/O-bound applications such as external merging and concurrent playback of multiple video streams. Two realistic lookahead models, global lookahead and local lookahead, are defined. Algorithms NOM and GREED, based on these two forms of lookahead, are analyzed for shared buffer and distributed buffer configurations, both of which occur frequently in existing systems. An important aspect of our work is that we show how to implement both of the models of lookahead in practice using the simple techniques of forecasting and flushing. Given a D-disk parallel I/O system and a globally shared I/O buffer that can hold up to M disk blocks, we derive a lower bound of Ω(√D) on the competitive ratio of any deterministic online prefetching algorithm with O(M) lookahead. NOM is shown to match the lower bound using global M-block lookahead. In contrast, using only local lookahead results in an Ω(D) competitive ratio. When the buffer is distributed into D portions of M/D blocks each, the algorithm GREED based on local lookahead is shown to be optimal, and NOM is within a constant factor of optimal. Thus we provide a theoretical basis for the intuition that global lookahead is more valuable for prefetching in the case of a shared buffer configuration, whereas it is enough to provide local lookahead in the case of a distributed configuration. Finally, we analyze the performance of these algorithms for reference strings generated by a uniformly-random stochastic process and we show that they achieve the minimal expected number of I/Os. These results also give bounds on the worst-case expected performance of algorithms which employ randomization in the data layout.
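To make the read-once parallel disk setting concrete, the sketch below counts parallel I/O steps for a simple greedy prefetcher with a shared buffer. It is not the paper's NOM or GREED algorithm: it assumes the whole reference string is known in advance, that one step can fetch at most one block per disk, and that a block leaves the buffer as soon as it is consumed. The function name `greedy_parallel_ios` and its parameters are hypothetical.

```python
from collections import deque


def greedy_parallel_ios(disk_of_block, num_disks, buffer_size):
    """Count parallel I/O steps for a read-once reference string.

    disk_of_block[i] is the disk that holds the i-th referenced block.
    """
    if buffer_size < 1:
        raise ValueError("buffer must hold at least one block")
    n = len(disk_of_block)

    # Per-disk queues of blocks not yet fetched, in reference order.
    pending = [deque() for _ in range(num_disks)]
    for idx, disk in enumerate(disk_of_block):
        pending[disk].append(idx)

    buffered = set()   # blocks fetched but not yet consumed
    next_ref = 0       # index of the next block the application needs
    steps = 0
    while next_ref < n:
        # Consume every block that is already in the buffer (read-once:
        # a consumed block is never needed again, so its slot is freed).
        while next_ref in buffered:
            buffered.remove(next_ref)
            next_ref += 1
        if next_ref == n:
            break
        # One parallel I/O: fetch the earliest pending block of each disk,
        # preferring blocks needed soonest, while buffer space remains.
        steps += 1
        free = buffer_size - len(buffered)
        heads = sorted(q[0] for q in pending if q)
        for blk in heads[:free]:
            pending[disk_of_block[blk]].popleft()
            buffered.add(blk)
    return steps
```

With six blocks striped round-robin over three disks and a three-block buffer, `greedy_parallel_ios([0, 1, 2, 0, 1, 2], 3, 3)` needs two steps, whereas fetching purely on demand would need six; shrinking the buffer to one block removes the parallelism again.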
Maintaining the highest levels of availability for content providers is challenging in the face of scale, network evolution, and complexity. Little, however, is known about the network failures large content providers are susceptible to, and what mechanisms they employ to ensure high availability. From a detailed analysis of over 100 high-impact failure events within Google's network, encompassing many data centers and two WANs, we quantify several dimensions of availability failures. We find that failures are evenly distributed across different network types and across data, control, and management planes, but that a large number of failures happen when a network management operation is in progress within the network. We discuss some of these failures in detail, and also describe our design principles for high availability motivated by these failures. These include using defense in depth, maintaining consistency across planes, failing open on large failures, carefully preventing and avoiding failures, and assessing root cause quickly. Our findings suggest that, as networks become more complicated, failures lurk everywhere, and, counter-intuitively, continuous incremental evolution of the network can, when applied together with our design principles, result in a more robust network.
We present an optimal algorithm, L-OPT, for prefetching and I/O scheduling in parallel I/O systems using a read-once model of block reference. The algorithm uses knowledge of the next L block references, L-block lookahead, to schedule I/Os in an on-line manner. It uses a dynamic priority assignment scheme to decide when blocks should be prefetched, so as to minimize the total number of I/Os. The parallel disk model of an I/O system is used to study the performance of L-OPT. We show that L-OPT is comparable to the best on-line algorithm with the same amount of lookahead; the ratio of the length of its schedule to the length of the optimal schedule is within a constant factor of the best possible. We show that the competitive ratio of L-OPT is O(…), which matches the lower bound on the competitive ratio of any prefetching algorithm with L-block lookahead. In addition we show that when the lookahead consists of the entire reference string, L-OPT performs the minimum possible number of I/Os; hence L-OPT is the optimal off-line algorithm. Finally, using synthetic traces we empirically study the performance characteristics of L-OPT.
We address the problem of prefetching and caching in a parallel I/O system and present a new algorithm, PC-OPT, for parallel disk scheduling. Traditional buffer management algorithms that minimize the number of block misses are substantially suboptimal in a parallel I/O system where multiple I/Os can proceed simultaneously. We show that in the offline case, where a priori knowledge of all the requests is available, PC-OPT performs the minimum number of I/Os to service the given I/O requests. This is the first parallel I/O scheduling algorithm that is provably offline optimal in the parallel disk model. In the online case, we study the context of global L-block lookahead, which gives the buffer management algorithm a lookahead consisting of L distinct requests. We show that the competitive ratio of PC-OPT, with global L-block lookahead, is Θ(M − L + D) when L ≤ M, and Θ(MD/L) when L > M, where the number of disks is D and the buffer size is M.
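For reference, the competitive ratio quoted in this abstract can be written out explicitly. The notation below is an assumption, not taken from the abstract: T_A(σ) denotes the number of parallel I/Os an online algorithm A performs on a request sequence σ, and T_OPT(σ) the offline optimum. The definition is the standard one, stated without the additive constant that some texts allow.

```latex
\[
  \rho(A) \;=\; \sup_{\sigma} \frac{T_A(\sigma)}{T_{\mathrm{OPT}}(\sigma)},
  \qquad
  \rho(\text{PC-OPT}) \;=\;
  \begin{cases}
    \Theta(M - L + D), & L \le M,\\[2pt]
    \Theta(MD/L),      & L > M.
  \end{cases}
\]
```

The two cases agree at the boundary: for L = M both expressions reduce to Θ(D).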