During concurrent I/O workloads, sequential access to one I/O stream can be interrupted by accesses to other streams in the system. Frequent switching between multiple sequential I/O streams may severely affect I/O efficiency due to long disk seek and rotational delays of disk-based storage devices. Aggressive prefetching can improve the granularity of sequential data access in such cases, but it comes with a higher risk of retrieving unneeded data. This paper proposes a competitive prefetching strategy that controls the prefetching depth so that the overhead of disk I/O switch and unnecessary prefetching are balanced. The proposed strategy does not require a-priori information on the data access pattern, and achieves at least half the performance (in terms of I/O throughput) of the optimal offline policy. We also provide analysis on the optimality of our competitiveness result and extend the competitiveness result to capture prefetching in the case of random-access workloads.We have implemented the proposed competitive prefetching policy in Linux 2.6.10 and evaluated its performance on both standalone disks and a disk array using a variety of workloads (including two common file utilities, Linux kernel compilation, the TPC-H benchmark, the Apache web server, and index searching). Compared to the original Linux kernel, our competitive prefetching system improves performance by up to 53%. At the same time, it trails the performance of an oracle prefetching strategy by no more than 42%.
During concurrent I/O workloads, sequential access to one I/O stream can be interrupted by accesses to other streams in the system. Frequent switching between multiple sequential I/O streams may severely affect I/O efficiency due to long disk seek and rotational delays of disk-based storage devices. Aggressive prefetching can improve the granularity of sequential data access in such cases, but it comes with a higher risk of retrieving unneeded data. This paper proposes a competitive prefetching strategy that controls the prefetching depth so that the overhead of disk I/O switch and unnecessary prefetching are balanced. The proposed strategy does not require a-priori information on the data access pattern, and achieves at least half the performance (in terms of I/O throughput) of the optimal offline policy. We also provide analysis on the optimality of our competitiveness result and extend the competitiveness result to capture prefetching in the case of random-access workloads.We have implemented the proposed competitive prefetching policy in Linux 2.6.10 and evaluated its performance on both standalone disks and a disk array using a variety of workloads (including two common file utilities, Linux kernel compilation, the TPC-H benchmark, the Apache web server, and index searching). Compared to the original Linux kernel, our competitive prefetching system improves performance by up to 53%. At the same time, it trails the performance of an oracle prefetching strategy by no more than 42%.
scite is a Brooklyn-based organization that helps researchers better discover and understand research articles through Smart Citations–citations that display the context of the citation and describe whether the article provides supporting or contrasting evidence. scite is used by students and researchers from around the world and is funded in part by the National Science Foundation and the National Institute on Drug Abuse of the National Institutes of Health.