Packet scheduling, together with classification, is one of the most expensive processing steps in systems providing tight bandwidth and delay guarantees at high packet rates. Schedulers with near-optimal service guarantees and $O(1)$ time complexity have been proposed in the past, using techniques such as timestamp rounding and flow grouping to keep their execution time small. However, even the two best proposals in this family have a per-packet cost component that is linear either in the number of groups or in the length of the packet being transmitted. Furthermore, no studies are available on the actual execution time of these algorithms. In this paper we make two contributions. First, we present Quick Fair Queueing (QFQ), a new $O(1)$ scheduler that provides near-optimal guarantees and is the first to achieve that goal with a truly constant cost, also with respect to the number of groups and the packet length. The QFQ algorithm has no loops and uses very simple instructions and data structures that contribute to its speed of operation. Second, we have developed production-quality implementations of QFQ and of its closest competitors, which we use to present a detailed comparative performance analysis of the various algorithms. Experiments show that QFQ fulfills our expectations, outperforming the other algorithms in the same class. In absolute terms, even on a low-end workstation, QFQ takes about 110 ns for an enqueue()/dequeue() pair (only twice the time of DRR, but with much better service guarantees).
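The two techniques named in this abstract, flow grouping and timestamp rounding, can be illustrated with a minimal Python sketch. This is not the QFQ algorithm: the grouping rule, the function names (`group_index`, `round_down`, `lowest_nonempty_group`) and the bitmap lookup are assumptions chosen for illustration, and the abstract does not state how QFQ itself selects a group.

```python
import math


def group_index(max_packet_len: int, weight: float) -> int:
    """Hypothetical grouping rule: flows whose ratio max_packet_len / weight
    falls in the same power-of-two range share one group."""
    ratio = max_packet_len / weight
    return max(0, math.ceil(math.log2(ratio)))


def round_down(timestamp: int, group: int) -> int:
    """Round a timestamp down to a multiple of the group's slot size 2**group."""
    return (timestamp >> group) << group


def lowest_nonempty_group(bitmap: int) -> int:
    """Index of the lowest set bit in a bitmap of non-empty groups, or -1.

    Isolating the lowest set bit avoids an explicit loop over the groups,
    which is one way a per-packet cost can be kept independent of their number.
    """
    if bitmap == 0:
        return -1
    return (bitmap & -bitmap).bit_length() - 1
```

With these assumed rules, a flow with 1500-byte packets and weight 1.0 lands in group 11, and rounding a timestamp of 1000 within group 4 gives 992.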
Sparse Matrix-Vector multiplication (SpMV) is a fundamental kernel, used by a large class of numerical algorithms. Emerging big-data and machine learning applications are propelling a renewed interest in SpMV algorithms that can tackle massive amounts of unstructured data (rapidly approaching the terabyte range) with predictable, high performance. In this paper we describe a new methodology to design SpMV algorithms for shared memory multiprocessors (SMPs) that organizes the original SpMV algorithm into two distinct phases. In the first phase we build a scaled matrix, which is reduced in the second phase, providing numerous opportunities to exploit memory locality. Using this methodology, we have designed two algorithms. Our experiments on irregular big-data matrices (an order of magnitude larger than the current state of the art) show a quasi-optimal scaling on a large-scale POWER8 SMP system, with an average performance speedup of 3.8× when compared to an equally optimized version of the CSR algorithm. In terms of absolute performance, with our implementation, the POWER8 SMP system is comparable to a 256-node cluster. In terms of size, it can process matrices with up to 68 billion edges, an order of magnitude larger than state-of-the-art clusters.
CCS Concepts: • Computing methodologies → Linear algebra algorithms; Shared memory algorithms; Vector / streaming algorithms; • Mathematics of computing → Graph algorithms; • Theory of computation → Graph algorithms analysis; Data structures design and anal-
* Fabrizio Petrini has since changed his affiliation. His current contact is fabrizio.petrini@intel.com
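The two-phase organization described in the SpMV abstract can be sketched in Python next to a plain CSR baseline. This is a minimal illustration under one assumption: that "scaled matrix" means each nonzero multiplied by the matching vector entry. The function names are hypothetical, and the sketch does not reproduce the locality optimizations or either of the two algorithms from the paper.

```python
def spmv_csr(row_ptr, col_idx, values, x):
    """Baseline CSR SpMV: y[r] = sum of values[k] * x[col_idx[k]] over row r."""
    y = [0.0] * (len(row_ptr) - 1)
    for row in range(len(y)):
        for k in range(row_ptr[row], row_ptr[row + 1]):
            y[row] += values[k] * x[col_idx[k]]
    return y


def spmv_two_phase(row_ptr, col_idx, values, x):
    """Same product, split into a scaling phase and a reduction phase."""
    # Phase 1 (assumed meaning of "scaled matrix"): scale every nonzero
    # by the vector entry of its column.
    scaled = [v * x[c] for v, c in zip(values, col_idx)]
    # Phase 2: reduce the scaled nonzeros row by row.
    return [
        sum(scaled[row_ptr[r]:row_ptr[r + 1]], 0.0)
        for r in range(len(row_ptr) - 1)
    ]


# 3x3 example matrix [[1, 0, 2], [0, 3, 0], [4, 0, 5]] in CSR form.
ROW_PTR = [0, 2, 3, 5]
COL_IDX = [0, 2, 1, 0, 2]
VALUES = [1.0, 2.0, 3.0, 4.0, 5.0]
X = [1.0, 2.0, 3.0]
```

Both functions return `[7.0, 6.0, 19.0]` for the example data. Separating the phases makes the memory access pattern of each one explicit, which is where the abstract says the locality opportunities arise.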
Abstract. Mainstream applications, such as file copy/transfer, Web, DBMS, or video streaming, typically issue synchronous disk requests. As shown in this paper, this fact may cause work-conserving schedulers to fail both to enforce guarantees and to provide a high disk throughput. A high throughput can, however, be recovered by just idling the disk for a short time interval after the completion of each request. In contrast, guarantees may still be violated by existing timestamp-based schedulers, because of the rules they use to tag requests. Budget Fair Queueing (BFQ), the new disk scheduler presented in this paper, is an example of how disk idling, combined with proper back-shifting of request timestamps, may allow a timestamp-based disk scheduler to preserve both guarantees and a high throughput. Under BFQ each application is always guaranteed (over any time interval and independently of whether it issues synchronous requests) a bounded lag with respect to its reserved fraction of the total number of bytes transferred by the disk device. We show the single-disk performance of our implementation of BFQ in the Linux kernel through experiments with real and emulated mainstream applications. Index Terms: Scheduling, secondary storage, quality of service.
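The lag that this abstract bounds can be made concrete with a small Python sketch. The function name `lag_bytes` and the sign convention (positive when the application is behind its reservation) are assumptions for illustration; the abstract gives neither a formula nor the value of the bound.

```python
def lag_bytes(reserved_fraction: float, app_bytes: int, total_bytes: int) -> float:
    """Bytes the application was entitled to over an interval, minus the
    bytes it actually received. Positive means it is behind its share."""
    return reserved_fraction * total_bytes - app_bytes
```

For example, an application that reserved one quarter of the disk and received 200 of the 1000 bytes transferred in an interval has a lag of 50 bytes. The guarantee described in the abstract is that this quantity stays bounded over any interval.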
Abstract. This paper addresses the issue of how to meet the strict timing constraints of (soft) real-time virtualized applications while the Virtual Machine (VM) hosting them is undergoing a live migration. To this purpose, it is essential that the resource requirements of a migration are identified in advance, that appropriate resources are reserved for the process, and that multiple VMs sharing the same resources are temporally isolated from each other. The first issue is dealt with by introducing a stochastic model for the migration process. The other two are dealt with by introducing a methodology that uses proper scheduling algorithms (for both CPU and network) to reserve resource shares for individual VMs. In addition, an extensive set of simulations has been carried out using traces of a VLC video server virtualized with KVM on Linux. The traces have been obtained by patching KVM at the kernel level, and the same patch constitutes an important step towards the complete implementation of the proposed technique. The obtained results highlight the benefits of the proposed approach.