Abstract: Reading and writing data efficiently from storage systems is critical for high-performance data-centric applications. These I/O systems are increasingly characterized by complex topologies and deeper memory hierarchies. Effective parallel I/O solutions are needed to scale applications on current and future supercomputers. Data aggregation is an efficient approach in which a few processes are elected to aggregate data from a set of neighbors and to write the aggregated data to storage. Thus,…
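To make the aggregation idea concrete, here is a minimal sketch in C of how an application can steer this mechanism through MPI-IO. With a ROMIO-based MPI implementation, the standard hints romio_cb_write and cb_nodes enable collective buffering and fix the number of aggregator processes; the file name, data layout, and the choice of 4 aggregators are illustrative assumptions, not taken from the cited work.

/* Minimal sketch: request 4 I/O aggregators via standard ROMIO hints,
 * then perform a collective write. Illustrative only. */
#include <mpi.h>

int main(int argc, char **argv)
{
    MPI_Init(&argc, &argv);

    int rank;
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);

    /* Ask the MPI-IO layer to funnel data through 4 aggregator
     * processes ("cb_nodes") before it reaches the file system. */
    MPI_Info info;
    MPI_Info_create(&info);
    MPI_Info_set(info, "romio_cb_write", "enable");
    MPI_Info_set(info, "cb_nodes", "4");

    MPI_File fh;
    MPI_File_open(MPI_COMM_WORLD, "output.dat",
                  MPI_MODE_CREATE | MPI_MODE_WRONLY, info, &fh);

    /* Each rank contributes one integer; the elected aggregators
     * collect neighboring contributions and issue the actual writes. */
    int value = rank;
    MPI_Offset offset = (MPI_Offset)rank * sizeof(int);
    MPI_File_write_at_all(fh, offset, &value, 1, MPI_INT,
                          MPI_STATUS_IGNORE);

    MPI_File_close(&fh);
    MPI_Info_free(&info);
    MPI_Finalize();
    return 0;
}

Compiled with mpicc and launched with mpiexec, every rank contributes data, but only the designated aggregators touch the file system.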
“…It is crucial to account for these different resources at the same time to perform global locality optimizations. For instance, optimizing storage access and memory access simultaneously results in a good performance gain, as shown in early results [64].…”
Abstract: The cost of data movement has always been an important concern in high-performance computing (HPC) systems. It has now become the dominant factor in terms of both energy consumption and performance. Support for expressing data locality has been explored in the past, but those efforts have had only modest success in being adopted in HPC applications, for various reasons. However, with the increasing complexity of the memory hierarchy and higher parallelism in emerging HPC systems, locality management has acquired a new urgency. Developers can no longer limit themselves to low-level solutions and ignore the potential for productivity and performance portability offered by locality abstractions. Fortunately, the trend emerging in the recent literature on the topic alleviates many of the concerns that stood in the way of adoption by application developers. Data locality abstractions are available in the form of libraries, data structures, languages, and runtime systems; a common theme is increasing productivity without sacrificing performance. This paper examines these trends and identifies commonalities that make it possible to combine various locality concepts into a comprehensive approach to expressing and managing data locality on future large-scale high-performance computing systems.
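As one concrete flavor of the effect such locality abstractions typically encapsulate, the following C sketch shows a tiled matrix transpose that keeps each block cache-resident while it is used. The loop structure is written out inline for illustration; no cited library or API is implied.

/* Illustrative sketch only: the kind of cache blocking that
 * locality abstractions encapsulate behind a higher-level interface. */
#include <stddef.h>
#include <stdio.h>

#define N 1024
#define TILE 64  /* N is assumed divisible by TILE */

static double a[N][N], b[N][N];

/* Transpose b into a, visiting the matrices tile by tile so each
 * TILE x TILE block stays resident in cache while it is used. */
static void transpose_tiled(void)
{
    for (size_t ii = 0; ii < N; ii += TILE)
        for (size_t jj = 0; jj < N; jj += TILE)
            for (size_t i = ii; i < ii + TILE; i++)
                for (size_t j = jj; j < jj + TILE; j++)
                    a[i][j] = b[j][i];
}

int main(void)
{
    transpose_tiled();
    printf("a[0][0] = %f\n", a[0][0]);
    return 0;
}

A library- or language-level abstraction would expose only the traversal intent and choose the tile size itself; the point of the sketch is the data-movement pattern being abstracted, not this particular code.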
“…HPC applications usually rely on highly tuned libraries such as MPI-IO, parallel netCDF or HDF5 to perform their I/O. Tessier et al. propose to integrate topology awareness into these libraries [28]. They show that performing data aggregation while taking the topology into account reduces the bandwidth required to perform I/O.…”
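As an illustration of the library layer this snippet refers to, the following C sketch performs a collective write through parallel HDF5. The H5FD_MPIO_COLLECTIVE transfer mode hands the data to the MPI-IO layer, which is where aggregation schemes such as the topology-aware one of Tessier et al. would plug in; the file and dataset names are illustrative assumptions.

/* Minimal sketch: one integer per rank, written collectively
 * through parallel HDF5. Illustrative only. */
#include <hdf5.h>
#include <mpi.h>

int main(int argc, char **argv)
{
    MPI_Init(&argc, &argv);
    int rank, size;
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    MPI_Comm_size(MPI_COMM_WORLD, &size);

    /* Open the file through the MPI-IO driver so HDF5 can use the
     * aggregation machinery of the underlying MPI library. */
    hid_t fapl = H5Pcreate(H5P_FILE_ACCESS);
    H5Pset_fapl_mpio(fapl, MPI_COMM_WORLD, MPI_INFO_NULL);
    hid_t file = H5Fcreate("output.h5", H5F_ACC_TRUNC, H5P_DEFAULT, fapl);

    /* One element per rank in the dataset. */
    hsize_t dims[1] = { (hsize_t)size };
    hid_t filespace = H5Screate_simple(1, dims, NULL);
    hid_t dset = H5Dcreate2(file, "values", H5T_NATIVE_INT, filespace,
                            H5P_DEFAULT, H5P_DEFAULT, H5P_DEFAULT);

    /* Each rank selects its own element of the dataset. */
    hsize_t start[1] = { (hsize_t)rank }, count[1] = { 1 };
    H5Sselect_hyperslab(filespace, H5S_SELECT_SET, start, NULL, count, NULL);
    hid_t memspace = H5Screate_simple(1, count, NULL);

    /* Request collective transfer: contributions are aggregated
     * before reaching the file system. */
    hid_t dxpl = H5Pcreate(H5P_DATASET_XFER);
    H5Pset_dxpl_mpio(dxpl, H5FD_MPIO_COLLECTIVE);

    int value = rank;
    H5Dwrite(dset, H5T_NATIVE_INT, memspace, filespace, dxpl, &value);

    H5Pclose(dxpl); H5Sclose(memspace); H5Sclose(filespace);
    H5Dclose(dset); H5Pclose(fapl); H5Fclose(file);
    MPI_Finalize();
    return 0;
}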
The enhanced capabilities of large-scale parallel and distributed platforms produce a continuously increasing amount of data that has to be stored, exchanged, and used by various tasks allocated on different nodes of the system. Managing such a huge communication demand is crucial for reaching the best possible performance of the system. Meanwhile, we have to deal with more interference, as the trend is to use a single all-purpose interconnection network, whatever the interconnect (tree-based hierarchies or topology-based heterarchies). There are two different types of communications, namely the flows induced by data exchanges during the computations, and the flows related to Input/Output operations. We propose in this paper a general model for interference-aware scheduling, where explicit communications are replaced by external topological constraints. Specifically, the interference between both communication types is reduced by adding geometric constraints on the allocation of tasks onto machines. The proposed constraints implicitly reduce data movements by restricting the set of possible allocations for each task. This methodology was proven efficient in a recent study for a restricted interconnection network (a line/ring of processors, which is intermediate between a tree and higher-dimensional grids/tori). The results obtained there illustrate the difficulty of the problem even on simple topologies, but also provide a pragmatic greedy solution, assessed to be efficient through simulations. We are currently extending this solution to more complex topologies. This work is a position paper that describes the methodology; it does not focus on the solving part.
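The abstract does not spell out the greedy solution, but the geometric-constraint idea on a line of processors can be sketched as a first-fit search for a free contiguous interval per job, so that a job's traffic stays within its own interval. The first-fit policy and the job sizes below are illustrative assumptions.

/* Hedged sketch: first-fit contiguous allocation on a line of nodes,
 * in the spirit of the paper's geometric constraints. Illustrative. */
#include <stdio.h>
#include <string.h>

#define NODES 16

static int owner[NODES]; /* 0 = free, otherwise job id */

/* Try to place `need` contiguous nodes for job `job`;
 * return the start index, or -1 if no interval is free. */
static int alloc_contiguous(int job, int need)
{
    int run = 0;
    for (int i = 0; i < NODES; i++) {
        run = owner[i] ? 0 : run + 1;
        if (run == need) {
            int start = i - need + 1;
            for (int k = start; k <= i; k++) owner[k] = job;
            return start;
        }
    }
    return -1;
}

int main(void)
{
    memset(owner, 0, sizeof owner);
    int sizes[] = { 4, 6, 3, 5 }; /* illustrative job sizes */
    for (int j = 0; j < 4; j++) {
        int s = alloc_contiguous(j + 1, sizes[j]);
        if (s >= 0)
            printf("job %d -> nodes [%d, %d]\n", j + 1, s, s + sizes[j] - 1);
        else
            printf("job %d deferred (no contiguous interval)\n", j + 1);
    }
    return 0;
}

On this 16-node line the fourth job is deferred: only 3 contiguous nodes remain, which is exactly the kind of trade-off between locality constraints and utilization that the paper studies.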
“…2) Application-side I/O management strategies (such as [30,22,29]) would then be responsible for ensuring the correct transfer of I/O at the right time by limiting the bandwidth used by nodes that transfer I/O. The start and end times for each I/O phase, as well as the bandwidth to use, are described in input files.…”
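A minimal C sketch of such an application-side strategy, assuming a hypothetical io_schedule.txt whose lines carry a start time, an end time, and a bandwidth in MiB/s: the application paces its writes in fixed-size chunks so that the average rate stays within the prescribed bandwidth. (Waiting until the window's start time is omitted for brevity.)

/* Hedged sketch: pace writes according to a schedule input file.
 * The file format ("start end bandwidth_MiBps" per line) is an
 * assumption, not taken from the cited strategies. */
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <time.h>
#include <unistd.h>

/* Write `bytes` of zeros to `fp`, pacing 1 MiB chunks so the
 * average rate stays near `mibps` mebibytes per second. */
static void throttled_write(FILE *fp, size_t bytes, double mibps)
{
    static char chunk[1 << 20];       /* 1 MiB chunk buffer */
    double per_chunk = 1.0 / mibps;   /* seconds allowed per MiB */
    while (bytes > 0) {
        size_t n = bytes < sizeof chunk ? bytes : sizeof chunk;
        struct timespec t0, t1;
        clock_gettime(CLOCK_MONOTONIC, &t0);
        fwrite(chunk, 1, n, fp);
        fflush(fp);
        clock_gettime(CLOCK_MONOTONIC, &t1);
        double spent = (t1.tv_sec - t0.tv_sec)
                     + (t1.tv_nsec - t0.tv_nsec) / 1e9;
        if (spent < per_chunk)
            usleep((useconds_t)((per_chunk - spent) * 1e6));
        bytes -= n;
    }
}

int main(void)
{
    FILE *sched = fopen("io_schedule.txt", "r"); /* hypothetical input */
    if (!sched) return 1;
    FILE *out = fopen("data.bin", "wb");
    double start, end, mibps;
    while (fscanf(sched, "%lf %lf %lf", &start, &end, &mibps) == 3) {
        /* Window length times rate gives the volume allowed here. */
        size_t bytes = (size_t)((end - start) * mibps * (1 << 20));
        throttled_write(out, bytes, mibps);
    }
    fclose(out);
    fclose(sched);
    return 0;
}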
Section: High-level Implementation Proof of Concept
With the ever-growing need for data in HPC applications, congestion at the I/O level is becoming critical in supercomputers. Architectural enhancements such as burst buffers and prefetching are added to machines, but they are not sufficient to prevent congestion. Recent online I/O scheduling strategies have been put in place, but they add an extra congestion point and overhead to the computation of applications. In this work, we show how to take advantage of the periodic nature of HPC applications in order to develop efficient periodic scheduling strategies for their I/O transfers. Our strategy computes, once during the job-scheduling phase, a pattern that defines the I/O behavior of each application; the applications then run independently, transferring their I/O at the specified times. Our strategy limits I/O congestion at the I/O-node level and can easily be integrated into current job schedulers. We validate this model through extensive simulations and experiments, comparing it to state-of-the-art online solutions and showing that our scheduler not only has the advantage of being decentralized, thus avoiding the overhead of online schedulers, but also performs better than these solutions, improving application dilation by up to 13% and maximum system efficiency by up to 18%.
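A toy C sketch of the feasibility check behind such a periodic pattern, assuming each application's per-period I/O volume and a shared aggregate bandwidth are known: it verifies that all transfers fit within the period and lays out disjoint I/O windows back to back. This proportional layout is an illustrative assumption, not the paper's exact pattern-construction algorithm.

/* Hedged sketch: assign disjoint per-period I/O windows so that
 * applications never contend at the I/O nodes. Illustrative only. */
#include <stdio.h>

#define APPS 3

int main(void)
{
    /* Per-period I/O volume of each application (GB) and the
     * aggregate bandwidth of the I/O nodes (GB/s); made-up numbers. */
    double volume[APPS] = { 12.0, 6.0, 18.0 };
    double bandwidth = 4.0;
    double period = 60.0; /* seconds, fixed at job-scheduling time */

    /* The total transfer time must fit within one period. */
    double total = 0.0;
    for (int a = 0; a < APPS; a++)
        total += volume[a] / bandwidth;
    if (total > period) {
        printf("infeasible: %.1fs of I/O in a %.1fs period\n",
               total, period);
        return 1;
    }

    /* Lay the windows out back to back; each application then
     * performs its I/O at the same offset in every period,
     * with no central scheduler involved at run time. */
    double t = 0.0;
    for (int a = 0; a < APPS; a++) {
        double len = volume[a] / bandwidth;
        printf("app %d: I/O window [%.1fs, %.1fs) each period\n",
               a, t, t + len);
        t += len;
    }
    return 0;
}

Once the windows are fixed, each application only needs its own offset and length, which is what makes the approach decentralized at run time.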