ABSTRACT
Energy consumption is one of the most important design parameters for future large-scale computing systems. While the end of Dennard scaling demands increasingly energy-proportional components, interconnection networks have received little attention in this regard. However, these networks are expected to contribute about 20% of the overall power consumption of such systems in the near future. This fraction grows further when other energy-proportional components, such as CPUs, accelerators, and memory, are not fully utilized. To prevent the network from becoming the main contributor to power consumption, and to reduce overall power consumption, it is mandatory to improve the energy proportionality of interconnection networks. In this work, we analyze different aspects of energy proportionality in interconnection networks, both for systems designed within current technical constraints and for future systems that might be designed with different parameters. First, we discuss the impact of multiple design parameters, such as transition time and power-state granularity, and identify the most feasible approaches for reducing energy consumption. Based on this study, we introduce three power-saving policies that address different requirements. While an on/off policy allows for large energy savings, it can also cause significant performance losses under adverse conditions. To meet the demand for sustained performance, we present two new policies that trade power-saving potential for performance. For three workload classes, we use a power-aware network simulation to report the impact on execution time and energy consumption compared to the current situation and to an idealized network. While we show that a highly regular communication pattern enables power savings close to the theoretical minimum, even slight deviations from such highly iterative and temporally regular behavior demand further improvements in all policies.
KEYWORDS
energy-proportionality, interconnection networks, network simulation, power saving
INTRODUCTION
Today's CMOS-based compute technology is mainly constrained by power consumption, as the scaling rules introduced by Dennard are no longer valid. Thus, in post-Dennard performance scaling, traditional techniques like frequency scaling and increasing the number of instructions per cycle (IPC) are no longer applicable. Instead, scaling the number of operations per watt is key to performance. Due to data dependencies, data movement is an inherent part of scientific applications, and besides optimizing the number of floating-point or integer operations per watt, the costs associated with moving input and output operands have to be considered. Furthermore, the energy consumption of data movement strongly depends on distance: for short on-die links, power consumption scales roughly linearly with transmission distance, while for longer connections it quickly grows super-linearly due to effects including dielectric loss and the skin effect. As a result, for a clustered system, inter-node communication significantly contr...
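The distance dependence described above can be illustrated with a toy per-bit energy model: linear below an on-die length threshold, with an additional super-linear term beyond it. All function names, coefficients, and the threshold are illustrative assumptions for this sketch, not measured values from this work.

```python
def energy_per_bit_pj(distance_mm,
                      linear_pj_per_mm=0.05,
                      superlinear_pj_per_mm2=0.02,
                      on_die_threshold_mm=10.0):
    """Toy estimate of the energy (pJ) to move one bit over distance_mm.

    Assumed model: linear cost for short on-die links; beyond the
    threshold, dielectric loss and skin effect are approximated by a
    super-linear (here: quadratic) term on top of the linear baseline.
    """
    linear = linear_pj_per_mm * distance_mm
    if distance_mm <= on_die_threshold_mm:
        return linear
    excess = distance_mm - on_die_threshold_mm
    return linear + superlinear_pj_per_mm2 * excess ** 2

if __name__ == "__main__":
    # Per-bit energy grows disproportionately with link length.
    for d in (1.0, 10.0, 100.0, 1000.0):
        print(f"{d:7.1f} mm -> {energy_per_bit_pj(d):10.2f} pJ/bit")
```

Under this model, a link ten times longer than the on-die threshold costs far more than ten times the energy per bit, which is why inter-node links dominate the communication energy budget of clustered systems.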