Abstract-Performance, power, and temperature are now all first-order design constraints. Balancing power efficiency, thermal constraints, and performance requires some means to convey data about real-time power consumption and tem perature to intelligent resource managers. Resource managers can use this information to meet performance goals, maintain power budgets, and obey thermal constraints. Unfortunately, obtaining the required machine introspection is challenging.Most current chips provide no support for per-core power monitoring, and when support exists, it is not exposed to software. We present a methodology for deriving per-core power models using sampled performance counter values and temperature sensor readings. We develop application independent models for four different (four-to eight-core) platforms, validate their accuracy, and show how they can be used to guide scheduling decisions in power-aware resource managers. Model overhead is negligible, and estimations exhibit 1.1 %-5.2% per-suite median error on the NAS, SPEC OMP, and SPEC 2006 benchmarks (and 1.2%-4.4% overall).
I. IN TRODUCTIONPower and temperature have joined performance as first order system design constraints. All three influence each other, and together they affect architectural and packaging choices. Power consumption characteristics further influence operating cost, reliability, battery lifetime, and device life time. Balancing power efficiency and thermal constraints with performance requires intelligent resource management, and achieving that balance requires real-time power con sumption and temperature information broken down accord ing to resource, together with software and hardware that can leverage such information to enforce management policies.One logical place to institute intelligent resource manage ment with respect to power, performance, and temperature for chip mUltiprocessor (CMP) systems is at the level of in dividual cores. Measuring run-time power of a single core is problematic, though. Current chips do not support it. Power meters only report total consumption for everything behind a single power cable, and even if such aggregate data were sufficient, the use of meters becomes completely infeasible as machines scale up: coordinating output and feedback from thousands of meters would require a separate (super) computing system. Cycle-level system simulators provide in depth information, but are extremely time consuming and 978-1-4244-7614-511 0/$26.00 ©20 10 IEEE prone to error. Power models implemented on top of the architectural abstractions in such simulators are inherently inaccurate [19], and are impossible to verify when attempt ing to assess new architectural designs. Hardware could be enhanced to measure the current and power draw of a CPU socket, but per-core measurement is difficult when cores share a power plane. Embedding measurement devices on chip is usually infeasible. Even when measurement facilities exist -e.g., the Intel Core i7 [16] features per-core power monitoring at the chip-level -they ar...