Abstract-We present a study on estimating the dynamic power consumption of a processor based on performance counters. Today's processors feature a large number of such counters to monitor various CPU and memory parameters such as utilization, occupancy, bandwidth and, page, cache and branch buffer hit rates. The use of various sets of performance counters to estimate the power consumed by the processor has been demonstrated in the past. Our goal is to find out whether there exists a subset of counters that can be used to estimate, with sufficient accuracy, the dynamic power consumption of processors with varying microarchitecture. To this end we consider two recent processor configurations representing two extremes of the performance spectrum, one targeting low power, while the other high performance. Our results indicate that only three counters measuring (i) the number of fetched instructions, (ii) level 1 cache hits and (iii) dispatch stalls, are sufficient to achieve adequate precision. These counters are shown to be effective in predicting the dynamic power consumption across processors of varying resource sizes achieving a prediction accuracy of 95%.
Abstract-The trend toward multicore processors is moving the emphasis in computation from sequential to parallel processing. However, not all applications can be parallelized and benefit from multiple cores. Such applications lead to under-utilization of parallel resources, hence sub-optimal performance/watt. They may however, benefit from powerful uniprocessors. On the other hand, not all applications can take advantage of more powerful uniprocessors. To address competing requirements of diverse applications, we propose a heterogeneous multicore architecture with a Dynamic Core Morphing (DCM) capability. Depending on the computational demands of the currently executing applications, the resources of a few tightly coupled cores are morphed at runtime. We present a simple hardware-based algorithm to monitor the time-varying computational needs of the application and when deemed beneficial, trigger reconfiguration of the cores at fine-grain time scales to maximize the performance/watt of the application. The proposed dynamic scheme is then compared against a baseline static heterogeneous multicore configuration and an equivalent homogeneous configuration. Our results show that dynamic morphing of cores can provide performance/watt gains of 43% and 16% on an average, when compared to the homogeneous and baseline heterogeneous configurations, respectively.
Asymmetric multi-core processors (AMPs) have been shown to outperform symmetric ones in terms of performance and performance/watt. Improved performance and power efficiency are achieved when the program threads are matched to their most suitable cores. Since the computational needs of a program may change during its execution, the best thread to core assignment will likely change with time. We have, therefore, developed an online program phase classification scheme that allows the swapping of threads when the current needs of the threads justify a change in the assignment.
The architectural differences among the cores in an AMP can never match the diversity that exists among different programs and even between different phases of the same program. Consider, for example, a program (or a program phase) that has a high instruction-level parallelism (ILP) and will exhibit high power efficiency if executed on a powerful core. We can not, however, include such powerful cores in the designed AMP, since they will remain underutilized most of the time, and they are not power efficient when the programs do not exhibit a high degree of ILP. Thus, we must expect to see program phases where the designed cores will be unable to support the ILP that the program can exhibit. We, therefore, propose in this article a dynamic morphing scheme. This scheme will allow a core to gain control of a functional unit that is ordinarily under the control of a neighboring core during periods of intense computation with high ILP. This way, we dynamically adjust the hardware resources to the current needs of the application.
Our results show that combining online phase classification and dynamic core morphing can significantly improve the performance/watt of most multithreaded workloads.
scite is a Brooklyn-based organization that helps researchers better discover and understand research articles through Smart Citations–citations that display the context of the citation and describe whether the article provides supporting or contrasting evidence. scite is used by students and researchers from around the world and is funded in part by the National Science Foundation and the National Institute on Drug Abuse of the National Institutes of Health.