On Multiple AER Handshaking Channels Over High-Speed Bit-Serial Bidirectional LVDS Links With Flow-Control and Clock-Correction on Commercial FPGAs for Scalable Neuromorphic Systems
Abstract:Address event representation (AER) is a widely employed asynchronous technique for interchanging "neural spikes" between different hardware elements in neuromorphic systems. Each neuron or cell in a chip or a system is assigned an address (or ID), which is typically communicated through a high-speed digital bus, thus time-multiplexing a high number of neural connections. Conventional AER links use parallel physical wires together with a pair of handshaking signals (request and acknowledge). In this paper, we p… Show more
“…An FPGA-based board was used to implement a digital system design, which controls the neuromorphic robot platform. This approach was considered in other similar works to implement a hardware version of CPGs, such as [8] and [19], and also in the field of neurorobotics and neuromorphic engineering [22]. This reconfigurable hardware offers flexibility against analog designs and adaptability in real time, in case of system failures.…”
Initially, robots were developed with the aim of making our life easier, carrying out repetitive or dangerous tasks for humans. Although they were able to perform these tasks, the latest generation of robots are being designed to take a step further, by performing more complex tasks that have been carried out by smart animals or humans up to date. To this end, inspiration needs to be taken from biological examples. For instance, insects are able to optimally solve complex environment navigation problems, and many researchers have started to mimic how these insects behave. Recent interest in neuromorphic engineering has motivated us to present a real-time, neuromorphic, spike-based Central Pattern Generator of application in neurorobotics, using an arthropod-like robot. A Spiking Neural Network was designed and implemented on SpiNNaker. The network models a complex, online-change capable Central Pattern Generator which generates three gaits for a hexapod robot locomotion. Reconfigurable hardware was used to manage both the motors of the robot and the real-time communication interface with the Spiking Neural Networks. Real-time measurements confirm the simulation results, and locomotion tests show that NeuroPod can perform the gaits without any balance loss or added delay.
“…An FPGA-based board was used to implement a digital system design, which controls the neuromorphic robot platform. This approach was considered in other similar works to implement a hardware version of CPGs, such as [8] and [19], and also in the field of neurorobotics and neuromorphic engineering [22]. This reconfigurable hardware offers flexibility against analog designs and adaptability in real time, in case of system failures.…”
Initially, robots were developed with the aim of making our life easier, carrying out repetitive or dangerous tasks for humans. Although they were able to perform these tasks, the latest generation of robots are being designed to take a step further, by performing more complex tasks that have been carried out by smart animals or humans up to date. To this end, inspiration needs to be taken from biological examples. For instance, insects are able to optimally solve complex environment navigation problems, and many researchers have started to mimic how these insects behave. Recent interest in neuromorphic engineering has motivated us to present a real-time, neuromorphic, spike-based Central Pattern Generator of application in neurorobotics, using an arthropod-like robot. A Spiking Neural Network was designed and implemented on SpiNNaker. The network models a complex, online-change capable Central Pattern Generator which generates three gaits for a hexapod robot locomotion. Reconfigurable hardware was used to manage both the motors of the robot and the real-time communication interface with the Spiking Neural Networks. Real-time measurements confirm the simulation results, and locomotion tests show that NeuroPod can perform the gaits without any balance loss or added delay.
“…It can scale from the 10k neurons in a worm brain up to 86B neurons in a human brain while using the same basic building blocks and architecture. This level of scalability is achieved via asynchronous distributed processing and event-based communication [10]. Inspired from biology, event-based simulation techniques were used for simulations of spiking neuron networks for a long time [11] and hardware implementation of asynchronous communication and processing is implemented in many of the available neuromorphic processors [12] [13] [14] [15] [16].…”
Section: * Amirreza Yousefzadeh and Mina A Khoei Contributed Equally mentioning
confidence: 99%
“…PilotNet is a neural network introduced by NVIDIA [42] along with the dataset 9 . The dataset contains video recording (10 frames per second) by a camera placed in front of the car as well as the corresponding steering angels for each frame of the video 10 . To find out the amount of redundancy (especially temporal sparsity) in this dataset, we compressed the raw frames using "Motion JPEG 2000 lossless compression" algorithm which resulted in 18 times reduction in size.…”
Section: B Autonomous Steering (Pilotnet)mentioning
Inference of Deep Neural Networks for stream signal (Video/Audio) processing in edge devices is still challenging. Unlike the most state of the art inference engines which are efficient for static signals, our brain is optimized for real-time dynamic signal processing. We believe one important feature of the brain (asynchronous state-full processing) is the key to its excellence in this domain. In this work, we show how asynchronous processing with state-full neurons allows exploitation of the existing sparsity in natural signals. This paper explains three different types of sparsity and proposes an inference algorithm which exploits all types of sparsities in the execution of already trained networks. Our experiments in three different applications (Handwritten digit recognition, Autonomous Steering and Hand-Gesture recognition) show that this model of inference reduces the number of required operations for sparse input data by a factor of one to two orders of magnitudes. Additionally, due to fully asynchronous processing this type of inference can be run on fully distributed and scalable neuromorphic hardware platforms.
“…A baseboard, called DockSoC, designed for this MMP platform (manufactured by COBER) is able to manage all MMP needed power supplies (from 1V to 12V), the JTAG port over UART and several parallel interfaces to Neuromorphic chips over the CAVIAR and ROME parallel AER connectors are included [13]. The DockSoC can act as a daughter board for the AERNode [14] platform to expand connectivity to other PSoC platforms andor to support the connectivity to other Neuromorphic systems. Figure 2 shows a picture of the used setup with the PSoC platform, the DockSoC baseboard and a USB neuromorphic retina, called DAVIS.…”
Many FPGAs vendors have recently included embedded processors in their devices, like Xilinx with ARM-Cortex A cores, together with programmable logic cells. These devices are known as Programmable System on Chip (PSoC). Their ARM cores (embedded in the processing system or PS) communicates with the programmable logic cells (PL) using ARM-standard AXI buses. In this paper we analyses the performance of exhaustive data transfers between PS and PL for a Xilinx Zynq FPGA in a co-design real scenario for Convolutional Neural Networks (CNN) accelerator, which processes, in dedicated hardware, a stream of visual information from a neuromorphic visual sensor for classification. In the PS side, a Linux operating system is running, which recollects visual events from the neuromorphic sensor into a normalized frame, and then it transfers these frames to the accelerator of multi-layered CNNs, and read results, using an AXI-DMA bus in a per-layer way. As these kind of accelerators try to process information as quick as possible, data bandwidth becomes critical and maintaining a good balanced data throughput rate requires some considerations. We present and evaluate several data partitioning techniques to improve the balance between RX and TX transfer and two different ways of transfers management: through a polling routine at the userlevel of the OS, and through a dedicated interrupt-based kernellevel driver. We demonstrate that for longer enough packets, the kernel-level driver solution gets better timing in computing a CNN classification example. Main advantage of using kernel-level driver is to have safer solutions and to have tasks scheduling in the OS to manage other important processes for our application, like frames collection from sensors and their normalization.
scite is a Brooklyn-based organization that helps researchers better discover and understand research articles through Smart Citations–citations that display the context of the citation and describe whether the article provides supporting or contrasting evidence. scite is used by students and researchers from around the world and is funded in part by the National Science Foundation and the National Institute on Drug Abuse of the National Institutes of Health.