Asynchronous Spatio-Temporal Memory Network for Continuous Event-Based Object Detection

Li, Jianing; Li, Jia; Zhu, Lin; Xiang, Xijie; Huang, Tiejun

doi:10.1109/tip.2022.3162962

Cited by 44 publications

(40 citation statements)

References 52 publications


“…Gehrig et al [135] implement an event-based sampling module in CARLA [327], which renders high-frame-rate images and converts them into dynamic events using ESIM [138]. Li et al [136] utilize V2E [139] to convert videos into dynamic events for object detection, and they directly use existing large-scale annotated labels from KITTI [325] dataset. Lin et al [137] propose an omnidirectional discrete gradient algorithm to convert frames into event streams.…”

Section: B Categorization (mentioning)

confidence: 99%

Brain Inspired Computing: A Systematic Survey and Future Trends

Li, Deng, Tang et al. 2023

Preprint


Brain Inspired Computing (BIC) is an emerging research field that aims to build fundamental theories, models, hardware architectures, and application systems toward more general Artificial Intelligence (AI) by learning from the information processing mechanisms or structures/functions of biological nervous systems. It is regarded as one of the most promising research directions for future intelligent computing in the post-Moore era. In the past few years, various new schemes in this field have sprung up to explore more general AI. These works are quite divergent in the aspects of modeling/algorithm, software tool, hardware platform, and benchmark data, since BIC is an interdisciplinary field that consists of many different domains, including computational neuroscience, artificial intelligence, computer science, statistical physics, material science, microelectronics and so forth. This situation greatly impedes researchers from obtaining a clear picture and getting started in the right way. Hence, there is an urgent requirement to do a comprehensive survey in this field to help correctly recognize and analyze such bewildering methodologies. What are the key issues to enhance the development of BIC? What roles do the current mainstream technologies play in the general framework of BIC? Which techniques are truly useful in real-world applications? These questions largely remain open. To address the above issues, in this survey we first clarify the biggest challenge of BIC: how can AI models benefit from the recent advancements in computational neuroscience? With this challenge in mind, we will focus on discussing the concept of BIC and summarize four components of BIC infrastructure development: 1) modeling/algorithm; 2) hardware platform; 3) software tool; and 4) benchmark data. For each component, we will summarize its recent progress, main challenges to resolve, and future trends. On the basis of these studies, we present a general framework for the real-world applications of BIC systems, which is promising to benefit both AI and brain science. Finally, we claim that it is extremely important to build a research ecology to promote prosperity continuously in this field.


“…For instance, single-modal event-based object detection is currently in the experimental phase, where learned methods make up the majority of the state-of-the-art. Learned implementations typically combine and embed events in an image-like representation, which are used with modified frame-based DNN architectures to make them compatible with event data, in either a temporal [29–31] or a nontemporal [32] manner. Recurrent and temporal approaches are typically more suitable for event data, given that events only provide brightness changes and not absolute brightness, at a given point in time, in contrast to frames.…”

Section: Related Work (mentioning)

confidence: 99%

“…Thus the temporal approaches would incorporate meaningful input history, instead of treating each split, of the input stream, independently, yet at the expense of higher computation. Although the single-modal event-based approaches are promising, their performance typically lags both the frame-based solutions, under normal conditions, as well as the combined solutions, which incorporate both modalities [8, 29, 33–35]. The main reason behind this is due to the nature of the event data, especially in scenes where there is limited motion.…”

Section: Related Work (mentioning)

confidence: 99%

“…Although the single-modal event-based approaches are promising, their performance typically lags both the frame-based solutions, under normal conditions, as well as the combined solutions, which incorporate both modalities [8, 29, 33–35]. The main reason behind this is due to the nature of the event data, especially in scenes where there is limited motion.…”

Section: Event-based Approaches (mentioning)

confidence: 99%


High-temporal-resolution event-based vehicle detection and tracking

Shair, Rawashdeh 2022
Opt. Eng.


Event-based vision has been rapidly growing in recent years justified by the unique characteristics, such as its high temporal resolutions (∼1 μs), high dynamic range (>120 dB), and output latency of only a few microseconds. Our work further explores a hybrid, multimodal approach for object detection and tracking that leverages state-of-the-art frame-based detectors complemented by hand-crafted event-based methods to improve the overall tracking performance with minimal computational overhead. The methods presented include event-based bounding box (BB) refinement that improves the precision of the resulting BBs, as well as a continuous event-based object detection method, to recover missed detections and generate interframe detections that enable a high-temporal-resolution tracking output. The advantages of these methods are quantitatively verified by an ablation study using the higher order tracking accuracy (HOTA) metric. Results show significant performance gains resembled by an improvement in the HOTA from 56.6%, using only frames, to 64.1% and 64.9%, for the event and edge-based mask configurations combined with the two methods proposed, at the baseline frame rate of 24 Hz. Likewise, incorporating these methods with the same configurations has improved HOTA from 52.5% to 63.1% and from 51.3% to 60.2% at the high-temporal-resolution tracking rate of 384 Hz. Finally, a validation experiment is conducted to analyze the real-world single-object tracking performance using high-speed LiDAR. Empirical evidence shows that our approaches provide significant advantages compared to using frame-based object detectors at the baseline frame rate of 24 Hz and higher tracking rates of up to 500 Hz.


“…The highly sparse and fluctuating nature of events poses challenges for conventional object detection techniques based on Artificial Neural Networks (ANNs). While recurrent architectural and algorithmic approaches [44,32] have been proposed for preprocessing events, they often come with high computation costs and increased latency. In contrast, the Spiking Neural Networks (SNNs) are a new type of network inspired by the brain that propagate information through discrete spikes generated from their inherent temporal dynamics.…”

Section: Introduction (mentioning)

confidence: 99%

Huawei Technologies

Zhang, Rohlfer

The Source of Innovation in China


Event-based sensors, with their high temporal resolution (1 µs) and dynamic range (120 dB), have the potential to be deployed in high-speed platforms such as vehicles and drones. However, the highly sparse and fluctuating nature of events poses challenges for conventional object detection techniques based on Artificial Neural Networks (ANNs). In contrast, Spiking Neural Networks (SNNs) are well-suited for representing event-based data due to their inherent temporal dynamics. In particular, we demonstrate that the membrane potential dynamics can modulate network activity upon fluctuating events and strengthen features of sparse input. In addition, the spike-triggered adaptive threshold can stabilize training which further improves network performance. Based on this, we develop an efficient spiking feature pyramid network for event-based object detection. Our proposed SNN outperforms previous SNNs and sophisticated ANNs with attention mechanisms, achieving a mean average precision (mAP50) of 47.7% on the Gen1 benchmark dataset. This result significantly surpasses the previous best SNN by 9.7% and demonstrates the potential of SNNs for event-based vision. Our model has a concise architecture while maintaining high accuracy and much lower computation cost as a result of sparse computation. Our code will be publicly available.

