Abstract. Thread-level speculation (TLS) allows potentially dependent threads to speculatively execute in parallel, thus making it easier for the compiler to extract parallel threads. However, the high cost associated with unbalanced load, failed speculation, and inter-thread value communication makes it difficult to obtain the desired performance unless the speculative threads are carefully chosen.In this paper, we focus on extracting parallel threads from loops in generalpurpose applications because loops, with their regular structures and significant coverage on execution time, are ideal candidates for extracting parallel threads. General-purpose applications, however, usually contain a large number of nested loops with unpredictable parallel performance and dynamic behavior, thus making it difficult to decide which set of loops should be parallelized to improve overall program performance. Our proposed loop selection algorithm addresses all these difficulties. We have found that (i) with the aid of profiling information, compiler analyses can achieve a reasonably accurate estimation of the performance of parallel execution, and that (ii) different invocations of a loop may behave differently, and exploiting this dynamic behavior can further improve performance. With a judicious choice of loops, we can improve the overall program performance of SPEC2000 integer benchmarks by as much as 20%.
Real-time systems are increasingly appearing in complex and dynamic environments such as cruise controllers, life support systems and nuclear reactors. These systems contain components that sense, control and stabilize the environment towards achieving the mission or target. These consociate components synchronize, compute and control themselves locally or have a centralized component to achieve their mission. Distributed computing techniques improve the overall performance and reliability of large real-time systems with spread components.Partially Clairvoyant scheduling was introduced in Saksena, M., PhD thesis (1994) to determine the schedulability of hard real-time jobs with variable execution times. The problem of deciding the Partially Clairvoyant schedulability of a constrained set of jobs was well studied in Gerber, R., et al., IEEE Trans.
With the onset of distributed computing in hard real-time applications, the problem of assigning to, scheduling in, and executing jobs on processors, has received a lot of attention. Usually, real-time systems are embedded in closed loop reactive environments with uncertain behaviors and such systems take varying times to respond to such stimuli. One of the fundamental features of such systems is the presence of complex timing constraints between pairs of jobs. A secondary feature is the non-constant nature of the execution times of jobs. Real-time operating systems such as MARUTI can measure the interval within which the execution time varies (Mosse et al. ) to schedule jobs with varying execution times and non-trivial timing constraints. The schedulability of the job set is determined offline and a set of dispatch functions are produced from the given set of constraints if the job set is schedulable. The dispatch functions bind the start time of a job J to an interval that depends on the start and execution times of jobs sequenced before J . The online dispatcher of the system reads these dispatch functions and computes the interval within which a job can start without violating the constraints imposed on the system. In certain situations, the dispatcher fails to dispatch a job as the time to compute the dispatch functions associated with a job is greater than the interval within which the job needs to be dispatched. This , we propose and implement a partially clairvoyant dispatching algorithm on a shared memory cluster with Concurrent Read Exclusive Write (CREW) architecture and contrast it with the sequential approach. For a preset number of processors, our approach has O(1) dispatch complexity while using a total of O(n 2 ) space, while the sequential approach requires (n) time. The detailed implementation profile obtained clearly demonstrates the superiority of the multiprocessor approach to dispatching. We also address the issue of scalability of the dispatcher for increasing number of processors and show that job sets of different sizes require different number of processors. Finally, we demonstrate the effect of execution time on the dispatchability of schedules.
scite is a Brooklyn-based organization that helps researchers better discover and understand research articles through Smart Citations–citations that display the context of the citation and describe whether the article provides supporting or contrasting evidence. scite is used by students and researchers from around the world and is funded in part by the National Science Foundation and the National Institute on Drug Abuse of the National Institutes of Health.
hi@scite.ai
10624 S. Eastern Ave., Ste. A-614
Henderson, NV 89052, USA
Copyright © 2024 scite LLC. All rights reserved.
Made with 💙 for researchers
Part of the Research Solutions Family.