Abstract: High-performance clusters have been widely deployed to solve challenging and rigorous scientific and engineering tasks. On one hand, high performance is certainly an important consideration in designing clusters to run parallel applications. On the other hand, the ever-increasing cost of energy requires us to conserve energy effectively in clusters. To optimize both performance and energy efficiency in clusters, in this paper we propose two energy-efficient duplication-based scheduling algorithms: Energy-Aware Duplication (EAD) scheduling and Performance-Energy Balanced Duplication (PEBD) scheduling. Existing duplication-based scheduling algorithms replicate all possible tasks to shorten schedule length, without reducing the energy consumption caused by duplication. Our algorithms, in contrast, strive to balance schedule lengths and energy savings by judiciously replicating predecessors of a task only if the duplication can aid performance without degrading energy efficiency. To illustrate the effectiveness of EAD and PEBD, we compare them with a nonduplication algorithm, a traditional duplication-based algorithm, and the dynamic voltage scaling (DVS) algorithm. Extensive experimental results using both synthetic benchmarks and real-world applications demonstrate that our algorithms can effectively save energy with marginal performance degradation.
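The duplication decision summarized in this abstract can be illustrated with a minimal sketch. The abstract does not state the exact criteria, so the function names, parameters, and threshold rules below are assumptions made for illustration only: an energy-aware rule accepts a duplication when the extra energy stays under a threshold, and a balanced rule accepts it when the energy paid per unit of time saved stays under a threshold.

```python
def energy_aware_should_duplicate(time_saved, extra_energy, energy_threshold):
    """Hypothetical energy-aware rule (assumed, not taken from the paper).

    Duplicate a predecessor only if it shortens the schedule and the
    added energy does not exceed the given threshold.
    """
    return time_saved > 0 and extra_energy <= energy_threshold


def balanced_should_duplicate(time_saved, extra_energy, ratio_threshold):
    """Hypothetical performance-energy balanced rule (assumed).

    Duplicate only if the energy spent per unit of time saved is
    within the given ratio threshold.
    """
    if time_saved <= 0:
        # No performance benefit, so duplication is never worthwhile.
        return False
    return extra_energy / time_saved <= ratio_threshold
```

Under these assumed rules, a traditional duplication algorithm corresponds to accepting every duplication with `time_saved > 0`, regardless of its energy cost.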
No truly parallel file systems exist yet. Those that make the claim fall short when it comes to providing adequate concurrent write performance at large scale. This limitation causes large usability headaches in HPC. Users need two major capabilities that are missing from current parallel file systems. One, they need low-latency interactivity. Two, they need high bandwidth for large parallel IO; this capability must be resistant to IO patterns and should not require tuning. No existing parallel file system provides these features. Frighteningly, exascale renders these features even less attainable from currently available parallel file systems. Fortunately, there is a path forward.
Optimizing energy consumption has become a major concern in designing economical clusters.
Abstract: MapReduce has become an important distributed processing model for large-scale data-intensive applications like data mining and web indexing. Hadoop, an open-source implementation of MapReduce, is widely used for short jobs requiring low response time. The current Hadoop implementation assumes that computing nodes in a cluster are homogeneous in nature.
Abstract: Many energy conservation techniques have been proposed to achieve high energy efficiency in disk systems. Unfortunately, growing evidence shows that energy-saving schemes in disk drives usually have negative impacts on the reliability of storage systems. Existing reliability models are inadequate to estimate the reliability of parallel disk systems equipped with energy conservation techniques. To solve this problem, we propose a mathematical model, called MINT, to evaluate the reliability of a parallel disk system where energy-saving mechanisms are implemented. In this paper, we focus on modeling the reliability impacts of two well-known energy-saving techniques: the Popular Disk Concentration technique (PDC) and the Massive Array of Idle Disks (MAID). We started this research by investigating how PDC and MAID affect the utilization and power-state transition frequency of each disk in a parallel disk system. We then model the annual failure rate of each disk as a function of the disk's utilization, power-state transition frequency, and operating temperature, because these parameters are key reliability-affecting factors in addition to disk age. Next, the reliability of a parallel disk system can be derived from the annual failure rate of each disk in the system. Finally, we used MINT to study the reliability of a parallel disk system equipped with the PDC and MAID techniques. Experimental results show that PDC is more reliable than MAID when disk workload is low. In contrast, the reliability of MAID is higher than that of PDC under relatively high I/O load.
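The step from per-disk annual failure rates to a system-level figure can be sketched as follows. The abstract does not give MINT's formulas, so this is only an illustration under two loudly stated assumptions: disks fail independently, and the system counts as reliable for the year only if no disk fails. The function name is hypothetical.

```python
import math


def system_annual_reliability(disk_afrs):
    """Probability that no disk fails within one year.

    Assumes independent disk failures and no redundancy; this is an
    illustrative simplification, not the MINT model itself.
    """
    for afr in disk_afrs:
        if not 0.0 <= afr <= 1.0:
            raise ValueError("each annual failure rate must lie in [0, 1]")
    # Each disk survives the year with probability (1 - AFR).
    return math.prod(1.0 - afr for afr in disk_afrs)
```

Under this simplification, raising the failure rate of any single disk, for example through more frequent power-state transitions, lowers the reliability of the whole system.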
During the past decade, cluster computing and mobile communication technologies have been extensively deployed and widely applied because of their great commercial value. Rapid technological advancement makes it feasible to integrate these two technologies, and a revolutionary application called mobile cluster computing is on the horizon. Mobile cluster computing technology can further enhance the power of our laptops and mobile devices by running parallel applications. However, scheduling parallel applications on mobile clusters is technically challenging due to the significant communication latency and limited battery life of mobile devices. Therefore, shortening schedule length and conserving energy have become two major concerns in designing efficient and energy-aware scheduling algorithms for mobile clusters. In this paper, we propose two novel scheduling strategies aimed at balancing performance and power consumption for parallel applications running on mobile clusters. Our research focuses on scheduling precedence-constrained parallel tasks, and thus duplication heuristics are applied to schedule parallel tasks so as to minimize communication overheads. However, existing duplication algorithms are developed with consideration of schedule lengths only, completely ignoring the energy consumption of clusters. In this regard, we design two energy-aware duplication scheduling algorithms, called EADUS and TEBUS, to schedule precedence-constrained parallel tasks with a complexity of O(n^2), where n is the number of tasks in a parallel task set. Unlike the existing duplication-based scheduling algorithms that replicate all the possible predecessors of each task, the proposed algorithms judiciously replicate predecessors of a task only if the duplication can help in conserving energy. Our energy-aware scheduling strategies are conducive to balancing schedule lengths and energy savings for a set of precedence-constrained parallel tasks.
We conducted extensive experiments using both synthetic benchmarks and real-world applications to compare our algorithms with two existing approaches. Experimental results based on simulated mobile clusters demonstrate the effectiveness and practicality of the proposed duplication-based scheduling strategies. For example, EADUS and TEBUS can reduce energy consumption for the Gaussian Elimination application by an average of 16.08% and 8.1%, with increases in schedule length of merely 5.7% and 2.2%, respectively.
In the past decade, parallel disk systems have been highly scalable and able to alleviate the disk I/O bottleneck, and have therefore been widely used to support a wide range of data-intensive applications. Optimizing energy consumption in parallel disk systems has a strong impact on the cost of backup power-generation and cooling equipment, because a significant fraction of the operating cost of data centres is incurred by energy consumption and cooling. Although flash memory is very energy-efficient compared to disk drives, it is too expensive to use as a major component in large-scale storage systems. In other words, building energy-efficient storage systems from large amounts of flash memory is not cost-effective. To address this problem, in this paper we propose a hybrid disk architecture, HYBUD, that integrates non-volatile flash memory with buffer disks to build cost-effective and energy-efficient parallel disk systems. While the most popular data sets are cached in flash memory, the second most popular data sets can be stored on and retrieved from buffer disks. HYBUD is energy-efficient because flash memory coupled with buffer disks can serve a majority of incoming disk requests, thereby keeping a large number of other data disks in the low-power state for longer periods of time. Furthermore, HYBUD is cost-effective by virtue of inexpensive buffer disks assisting flash memory in caching a huge amount of popular data. Experimental results show that, compared with two existing non-hybrid architectures, HYBUD provides significant energy savings for parallel disk systems in a very cost-effective way.
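The tiered lookup described in this abstract can be sketched as a simple routing function. The abstract does not specify HYBUD's data structures or placement policy, so the names `route_request` and `tier_hits`, and the use of plain sets to represent the flash and buffer-disk contents, are assumptions for illustration only.

```python
from collections import Counter


def route_request(block_id, flash_blocks, buffer_blocks):
    """Return the tier that would serve a request (hypothetical sketch).

    Flash is checked first, then buffer disks; only a miss in both
    tiers reaches a data disk, which may have to leave its low-power state.
    """
    if block_id in flash_blocks:
        return "flash"
    if block_id in buffer_blocks:
        return "buffer_disk"
    return "data_disk"


def tier_hits(requests, flash_blocks, buffer_blocks):
    """Count how many requests each tier serves."""
    return Counter(route_request(b, flash_blocks, buffer_blocks) for b in requests)
```

In this sketch, the fewer requests that fall through to `"data_disk"`, the longer the data disks can remain in the low-power state.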