Davide Rossetti scite author profile

FXR inhibition may protect from SARS-CoV-2 infection by reducing ACE2

Nature


UCX: An Open Source Framework for HPC Network APIs and Beyond

Shamis

Venkata

Lopez

et al. 2015

125


This paper presents Unified Communication X (UCX), a set of network APIs and their implementations for high throughput computing. UCX comes from the combined effort of national laboratories, industry, and academia to design and implement a high-performing and highly-scalable network stack for next generation applications and systems. UCX design provides the ability to tailor its APIs and network functionality to suit a wide variety of application domains and hardware. We envision these APIs to satisfy the networking needs of many programming models such as Message Passing Interface (MPI), OpenSHMEM, Partitioned Global Address Space (PGAS) languages, task-based paradigms and I/O bound applications. To evaluate the design we implement the APIs and protocols, and measure the performance of overhead-critical network primitives fundamental for implementing many parallel programming models and system libraries. Our results show that the latency, bandwidth, and message rate achieved by the portable UCX prototype is very close to that of the underlying driver. With UCX, we achieved a message exchange latency of 0.89 us, a bandwidth of 6138.5 MB/s, and a message rate of 14 million messages per second. As far as we know, this is the highest bandwidth and message rate achieved by any network stack (publicly known) on this hardware.


APEnet+: a 3D Torus network optimized for GPU-based HPC Systems

Ammendola

Biagioni

Frezza

et al. 2012

J. Phys.: Conf. Ser.


Numerical simulations of the dynamical behavior of the SK model

Marinari

Parisi

Rossetti

1998

Eur. Phys. J. B


We study the dynamical behavior of the Sherrington Kirkpatrick model. Thanks to the APE supercomputer we are able to analyze large lattice volumes, and to investigate the low T region. We present a new and precise determination of the remnant magnetization and of its time decay exponent, of the energy time decay exponent, and we discuss aging phenomena in the model. We exclude validity of naive aging, and propose different options that fit the numerical data.


Benchmarking of communication techniques for GPUs

Bernaschi

Bisson

Rossetti

2013

Journal of Parallel and Distributed Computing


Computing for LQCD: apeNEXT

et al. 2006


GPU-Centric Communication on NVIDIA GPU Clusters with InfiniBand: A Case Study with OpenSHMEM

Potluri

Goswami

Rossetti

et al. 2017


GPU Peer-to-Peer Techniques Applied to a Cluster Interconnect

Ammendola

Bernaschi

Biagioni

et al. 2013


Modern GPUs support special protocols to exchange data directly across the PCI Express bus. While these protocols could be used to reduce GPU data transmission times, basically by avoiding staging to host memory, they require specific hardware features which are not available on current generation network adapters. In this paper we describe the architectural modifications required to implement peer-to-peer access to NVIDIA Fermi- and Kepler-class GPUs on an FPGA-based cluster interconnect. Besides, the current software implementation, which integrates this feature by minimally extending the RDMA programming model, is discussed, as well as some issues raised while employing it in a higher level API like MPI. Finally, the current limits of the technique are studied by analyzing the performance improvements on low-level benchmarks and on two GPU-accelerated applications, showing when and how they seem to benefit from the GPU peer-to-peer method.


