ABSTRACT
Energy consumption is one of the most important design parameters for future large-scale computing systems. While the end of Dennard scaling demands increasingly energy-proportional components, interconnection networks have received little attention in this regard. However, these networks are expected to contribute about 20% of the overall power consumption of such systems in the near future. This fraction grows further when other energy-proportional components, such as CPUs, accelerators, and memory, are not fully utilized. To prevent the network from becoming the main contributor to power consumption, and to reduce overall power consumption, it is mandatory to improve the energy proportionality of interconnection networks. In this work, we analyze different aspects of energy proportionality in interconnection networks, both for systems designed within current technical constraints and for future systems that might be designed with different parameters. First, we discuss the impact of multiple design parameters, such as transition time and power-state granularity, and identify the most feasible approaches for reducing energy consumption. Based on this study, we introduce three power-saving policies that address different requirements. While an on/off policy allows for large energy savings, it can also cause significant performance losses under adverse conditions. To meet the demand for sustained performance, we present two new policies that trade power-saving potential for performance. For three workload classes, we use a power-aware network simulation to report the impact on execution time and energy consumption compared to the current situation and to an idealized network. While we show that a highly regular communication pattern enables power savings close to the theoretical minimum, even slight deviations from such highly iterative and temporally regular behavior demand further improvements in all policies.
KEYWORDS
energy-proportionality, interconnection networks, network simulation, power saving
INTRODUCTION
Today's CMOS-based compute technology is mainly constrained by power consumption, as the scaling rules introduced by Dennard are no longer valid. Thus, in post-Dennard performance scaling, traditional techniques like frequency scaling and increasing the number of instructions per cycle (IPC) are no longer applicable. Instead, scaling the number of operations per watt is key to performance. Due to data dependencies, data movement is an inherent part of scientific applications, and besides optimizing the number of floating-point or integer operations per watt, the costs associated with moving input and output operands have to be considered. Furthermore, the energy consumption of data movement strongly depends on distance: for short on-die links, power consumption scales roughly linearly with transmission distance, while for longer connections it quickly grows super-linearly due to effects including dielectric loss and the skin effect. As a result, for a clustered system, inter-node communication significantly contr...
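The distance dependence described above can be illustrated with a toy per-bit energy model: linear below an on-die length threshold, with an additional super-linear term beyond it. All function names, coefficients, and the threshold are illustrative assumptions for this sketch, not measured values from this work.

```python
def energy_per_bit_pj(distance_mm,
                      linear_pj_per_mm=0.05,
                      superlinear_pj_per_mm2=0.02,
                      on_die_threshold_mm=10.0):
    """Toy estimate of the energy (pJ) to move one bit over distance_mm.

    Assumed model: linear cost for short on-die links; beyond the
    threshold, dielectric loss and skin effect are approximated by a
    super-linear (here: quadratic) term on top of the linear baseline.
    """
    linear = linear_pj_per_mm * distance_mm
    if distance_mm <= on_die_threshold_mm:
        return linear
    excess = distance_mm - on_die_threshold_mm
    return linear + superlinear_pj_per_mm2 * excess ** 2

if __name__ == "__main__":
    # Per-bit energy grows disproportionately with link length.
    for d in (1.0, 10.0, 100.0, 1000.0):
        print(f"{d:7.1f} mm -> {energy_per_bit_pj(d):10.2f} pJ/bit")
```

Under this model, a link ten times longer than the on-die threshold costs far more than ten times the energy per bit, which is why inter-node links dominate the communication energy budget of clustered systems.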