Elena Pastorelli scite author profile

Over the past decade there has been a growing interest in the development of parallel hardware systems for simulating large-scale networks of spiking neurons. Compared to other highly-parallel systems, GPU-accelerated solutions have the advantage of a relatively low cost and a great versatility, thanks also to the possibility of using the CUDA-C/C++ programming languages. NeuronGPU is a GPU library for large-scale simulations of spiking neural network models, written in the C++ and CUDA-C++ programming languages, based on a novel spike-delivery algorithm. This library includes simple LIF (leaky-integrate-and-fire) neuron models as well as several multisynapse AdEx (adaptive-exponential-integrate-and-fire) neuron models with current or conductance based synapses, different types of spike generators, tools for recording spikes, state variables and parameters, and it supports user-definable models. The numerical solution of the differential equations of the dynamics of the AdEx models is performed through a parallel implementation, written in CUDA-C++, of the fifth-order Runge-Kutta method with adaptive step-size control. In this work we evaluate the performance of this library on the simulation of a cortical microcircuit model, based on LIF neurons and current-based synapses, and on balanced networks of excitatory and inhibitory neurons, using AdEx or Izhikevich neuron models and conductance-based or current-based synapses. On these models, we will show that the proposed library achieves state-of-the-art performance in terms of simulation time per second of biological activity. In particular, using a single NVIDIA GeForce RTX 2080 Ti GPU board, the full-scale cortical-microcircuit model, which includes about 77,000 neurons and 3 · 108 connections, can be simulated at a speed very close to real time, while the simulation time of a balanced network of 1,000,000 AdEx neurons with 1,000 connections per neuron was about 70 s per second of biological activity.

show abstract

Sleep-like slow oscillations improve visual classification through synaptic homeostasis and memory association in a thalamo-cortical model

Capone

Pastorelli

Golosio

et al. 2019

Sci Rep

View full text Add to dashboard Cite

The occurrence of sleep passed through the evolutionary sieve and is widespread in animal species. Sleep is known to be beneficial to cognitive and mnemonic tasks, while chronic sleep deprivation is detrimental. Despite the importance of the phenomenon, a complete understanding of its functions and underlying mechanisms is still lacking. In this paper, we show interesting effects of deep-sleep-like slow oscillation activity on a simplified thalamo-cortical model which is trained to encode, retrieve and classify images of handwritten digits. During slow oscillations, spike-timing-dependent-plasticity (STDP) produces a differential homeostatic process. It is characterized by both a specific unsupervised enhancement of connections among groups of neurons associated to instances of the same class (digit) and a simultaneous down-regulation of stronger synapses created by the training. This hierarchical organization of post-sleep internal representations favours higher performances in retrieval and classification tasks. The mechanism is based on the interaction between top-down cortico-thalamic predictions and bottom-up thalamo-cortical projections during deep-sleep-like slow oscillations. Indeed, when learned patterns are replayed during sleep, cortico-thalamo-cortical connections favour the activation of other neurons coding for similar thalamic inputs, promoting their association. Such mechanism hints at possible applications to artificial learning systems.

show abstract

The ExaNeSt Project: Interconnects, Storage, and Packaging for Exascale Systems

Katevenis

Chrysos

Marazakis

et al. 2016

View full text Add to dashboard Cite

ExaNest is one of three European projects that support a ground-breaking computing architecture for exascale-class systems built upon power-efficient 64-bit ARM processors. This group of projects share an 'everything-close' and 'share-anything' paradigm, which trims down the power consumption - by shortening the distance of signals for most data transfers - as well as the cost and footprint area of the installation - by reducing the number of devices needed to meet performance targets. In ExaNeSt, we will design and implement: (i) a physical rack prototype and its liquid-cooling subsystem providing ultra-dense compute packaging, (ii) a storage architecture with distributed (in-node) non-volatile memory (NVM) devices, (iii) a unified, low-latency interconnect, designed to efficiently uphold desired Quality-of-Service guarantees for a mix of storage with inter-processor flows, and (iv) efficient rack-level memory sharing, where each page is cacheable at only a single node . Our target is to test alternative storage and interconnect options on actual hardware, using real-world HPC applications. The ExaNeSt consortium brings together technology, skills, and knowledge across the entire value chain, from computing IP, packaging, and system deployment, all the way up to operating systems, storage, HPC, big data frameworks, and cutting-edge applications

show abstract

Scaling of a Large-Scale Simulation of Synchronous Slow-Wave and Asynchronous Awake-Like Activity of a Cortical Model With Long-Range Interconnections

Pastorelli

Capone

Simula

et al. 2019

Front. Syst. Neurosci.

View full text Add to dashboard Cite

Cortical synapse organization supports a range of dynamic states on multiple spatial and temporal scales, from synchronous slow wave activity (SWA), characteristic of deep sleep or anesthesia, to fluctuating, asynchronous activity during wakefulness (AW). Such dynamic diversity poses a challenge for producing efficient large-scale simulations that embody realistic metaphors of short- and long-range synaptic connectivity. In fact, during SWA and AW different spatial extents of the cortical tissue are active in a given timespan and at different firing rates, which implies a wide variety of loads of local computation and communication. A balanced evaluation of simulation performance and robustness should therefore include tests of a variety of cortical dynamic states. Here, we demonstrate performance scaling of our proprietary Distributed and Plastic Spiking Neural Networks (DPSNN) simulation engine in both SWA and AW for bidimensional grids of neural populations, which reflects the modular organization of the cortex. We explored networks up to 192 × 192 modules, each composed of 1,250 integrate-and-fire neurons with spike-frequency adaptation, and exponentially decaying inter-modular synaptic connectivity with varying spatial decay constant. For the largest networks the total number of synapses was over 70 billion. The execution platform included up to 64 dual-socket nodes, each socket mounting 8 Intel Xeon Haswell processor cores @ 2.40 GHz clock rate. Network initialization time, memory usage, and execution time showed good scaling performances from 1 to 1,024 processes, implemented using the standard Message Passing Interface (MPI) protocol. We achieved simulation speeds of between 2.3 × 10 9 and 4.1 × 10 9 synaptic events per second for both cortical states in the explored range of inter-modular interconnections.

show abstract

The Next Generation of Exascale-Class Systems: The ExaNeSt Project

Ammendola

Biagioni

Cretaro

et al. 2017

View full text Add to dashboard Cite

The ExaNeSt project started on December 2015 and is funded by EU H2020 research framework (call H2020-FETHPC-2014, n. 671553) to study the adoption of low-cost, Linux-based power-efficient 64-bit ARM processors clusters for Exascale-class systems. The ExaNeSt consortium pools partners with industrial and academic research expertise in storage, interconnects and applications that share a vision of an European Exascale-class supercomputer. Their goal is designing and implementing a physical rack prototype together with its cooling system, the storage non-volatile memory (NVM) architecture and a low-latency interconnect able to test different options for interconnection and storage. Furthermore, the consortium is to provide real HPC applications to validate the system. Herein we provide a status report of the project initial developments.

show abstract

Fast Simulation of a Multi-Area Spiking Network Model of Macaque Cortex on an MPI-GPU Cluster

Tiddia

Golosio

Albers

et al. 2022

Front. Neuroinform.

View full text Add to dashboard Cite

Spiking neural network models are increasingly establishing themselves as an effective tool for simulating the dynamics of neuronal populations and for understanding the relationship between these dynamics and brain function. Furthermore, the continuous development of parallel computing technologies and the growing availability of computational resources are leading to an era of large-scale simulations capable of describing regions of the brain of ever larger dimensions at increasing detail. Recently, the possibility to use MPI-based parallel codes on GPU-equipped clusters to run such complex simulations has emerged, opening up novel paths to further speed-ups. NEST GPU is a GPU library written in CUDA-C/C++ for large-scale simulations of spiking neural networks, which was recently extended with a novel algorithm for remote spike communication through MPI on a GPU cluster. In this work we evaluate its performance on the simulation of a multi-area model of macaque vision-related cortex, made up of about 4 million neurons and 24 billion synapses and representing 32 mm2 surface area of the macaque cortex. The outcome of the simulations is compared against that obtained using the well-known CPU-based spiking neural network simulator NEST on a high-performance computing cluster. The results show not only an optimal match with the NEST statistical measures of the neural activity in terms of three informative distributions, but also remarkable achievements in terms of simulation time per second of biological activity. Indeed, NEST GPU was able to simulate a second of biological time of the full-scale macaque cortex model in its metastable state 3.1× faster than NEST using 32 compute nodes equipped with an NVIDIA V100 GPU each. Using the same configuration, the ground state of the full-scale macaque cortex model was simulated 2.4× faster than NEST.

show abstract

NaNet: a configurable NIC bridging the gap between HPC and real-time HEP GPU computing

et al. 2015

View full text Add to dashboard Cite

NaNet is a FPGA-based PCIe Network Interface Card (NIC) design with GPUDirect and Remote Direct Memory Access (RDMA) capabilities featuring a configurable and extensible set of network channels. The design currently supports both standard -Gbe (1000BASE-T) and 10GbE (10Base-R) -and custom -34 Gbps APElink and 2.5 Gbps deterministic latency KM3link -channels, but its modularity allows for straightforward inclusion of other link technologies. The GPUDirect feature combined with a transport layer offload module and a data stream processing stage makes NaNet a low-latency NIC suitable for real-time GPU processing. In this

show abstract

Next generation of Exascale-class systems: ExaNeSt project and the status of its interconnect and storage development

Katevenis

Ammendola

Biagioni

et al. 2018

Microprocessors and Microsystems

View full text Add to dashboard Cite

scite is a Brooklyn-based organization that helps researchers better discover and understand research articles through Smart Citations–citations that display the context of the citation and describe whether the article provides supporting or contrasting evidence. scite is used by students and researchers from around the world and is funded in part by the National Science Foundation and the National Institute on Drug Abuse of the National Institutes of Health.

Contact Info

hi@scite.ai

10624 S. Eastern Ave., Ste. A-614

Henderson, NV 89052, USA

Blog Terms and Conditions API Terms Privacy Policy Contact Cookie Preferences Do Not Sell or Share My Personal Information

Made with 💙 for researchers

Part of the Research Solutions Family.