This paper describes novel event-based spatio-temporal features called time-surfaces and how they can be used to create a hierarchical event-based pattern recognition architecture. Unlike existing hierarchical architectures for pattern recognition, the presented model relies on a time oriented approach to extract spatio-temporal features from the asynchronously acquired dynamics of a visual scene. These dynamics are acquired using biologically inspired frameless asynchronous event-driven vision sensors. Similarly to cortical structures, subsequent layers in our hierarchy extract increasingly abstract features using increasingly large spatio-temporal windows. The central concept is to use the rich temporal information provided by events to create contexts in the form of time-surfaces which represent the recent temporal activity within a local spatial neighborhood. We demonstrate that this concept can robustly be used at all stages of an event-based hierarchical model. First layer feature units operate on groups of pixels, while subsequent layer feature units operate on the output of lower level feature units. We report results on a previously published 36 class character recognition task and a four class canonical dynamic card pip task, achieving near 100 percent accuracy on each. We introduce a new seven class moving face recognition task, achieving 79 percent accuracy.This paper describes novel event-based spatio-temporal features called time-surfaces and how they can be used to create a hierarchical event-based pattern recognition architecture. Unlike existing hierarchical architectures for pattern recognition, the presented model relies on a time oriented approach to extract spatio-temporal features from the asynchronously acquired dynamics of a visual scene. These dynamics are acquired using biologically inspired frameless asynchronous event-driven vision sensors. Similarly to cortical structures, subsequent layers in our hierarchy extract increasingly abstract features using increasingly large spatio-temporal windows. The central concept is to use the rich temporal information provided by events to create contexts in the form of time-surfaces which represent the recent temporal activity within a local spatial neighborhood. We demonstrate that this concept can robustly be used at all stages of an event-based hierarchical model. First layer feature units operate on groups of pixels, while subsequent layer feature units operate on the output of lower level feature units. We report results on a previously published 36 class character recognition task and a four class canonical dynamic card pip task, achieving near 100 percent accuracy on each. We introduce a new seven class moving face recognition task, achieving 79 percent accuracy.
Event-based cameras have recently drawn the attention of the Computer Vision community thanks to their advantages in terms of high temporal resolution, low power consumption and high dynamic range, compared to traditional frame-based cameras. These properties make event-based cameras an ideal choice for autonomous vehicles, robot navigation or UAV vision, among others. However, the accuracy of event-based object classification algorithms, which is of crucial importance for any reliable system working in real-world conditions, is still far behind their framebased counterparts. Two main reasons for this performance gap are: 1. The lack of effective low-level representations and architectures for event-based object classification and 2. The absence of large real-world event-based datasets. In this paper we address both problems. First, we introduce a novel event-based feature representation together with a new machine learning architecture. Compared to previous approaches, we use local memory units to efficiently leverage past temporal information and build a robust eventbased representation. Second, we release the first large real-world event-based dataset for object classification. We compare our method to the state-of-the-art with extensive experiments, showing better classification performance and real-time computation.
This paper introduces a new methodology to compute dense visual flow using the precise timings of spikes from an asynchronous event-based retina. Biological retinas, and their artificial counterparts, are totally asynchronous and data-driven and rely on a paradigm of light acquisition radically different from most of the currently used frame-grabber technologies. This paper introduces a framework to estimate visual flow from the local properties of events' spatiotemporal space. We will show that precise visual flow orientation and amplitude can be estimated using a local differential approach on the surface defined by coactive events. Experimental results are presented; they show the method adequacy with high data sparseness and temporal resolution of event-based acquisition that allows the computation of motion flow with microsecond accuracy and at very low computational cost.
We present a method to label and trace the lineage of multiple neural progenitors simultaneously in vertebrate animals via multiaddressable genome-integrative color (MAGIC) markers. We achieve permanent expression of combinatorial labels from new Brainbow transgenes introduced in embryonic neural progenitors with electroporation of transposon vectors. In the mouse forebrain and chicken spinal cord, this approach allows us to track neural progenitor's descent during pre- and postnatal neurogenesis or perinatal gliogenesis in long-term experiments. Color labels delineate cytoarchitecture, resolve spatially intermixed clones, and specify the lineage of astroglial subtypes and adult neural stem cells. Combining colors and subcellular locations provides an expanded marker palette to individualize clones. We show that this approach is also applicable to modulate specific signaling pathways in a mosaic manner while color-coding the status of individual cells regarding induced molecular perturbations. This method opens new avenues for clonal and functional analysis in varied experimental models and contexts.
Abstract-This paper introduces a spiking hierarchical model for object recognition which utilizes the precise timing information inherently present in the output of biologically inspired asynchronous Address Event Representation (AER) vision sensors. The asynchronous nature of these systems frees computation and communication from the rigid predetermined timing enforced by system clocks in conventional systems. Freedom from rigid timing constraints opens the possibility of using true timing to our advantage in computation. We show not only how timing can be used in object recognition, but also how it can in fact simplify computation. Specifically, we rely on a simple temporal-winnertake-all rather than more computationally intensive synchronous operations typically used in biologically inspired neural networks for object recognition. This approach to visual computation represents a major paradigm shift from conventional clocked systems and can find application in other sensory modalities and computational tasks. We showcase effectiveness of the approach by achieving the highest reported accuracy to date (97.5%±3.5%) for a previously published four class card pip recognition task and an accuracy of 84.9%±1.9% for a new more difficult 36 class character recognition task.
We present a novel event-based stereo matching algorithm that exploits the asynchronous visual events from a pair of silicon retinas. Unlike conventional frame-based cameras, recent artificial retinas transmit their outputs as a continuous stream of asynchronous temporal events, in a manner similar to the output cells of the biological retina. Our algorithm uses the timing information carried by this representation in addressing the stereo-matching problem on moving objects. Using the high temporal resolution of the acquired data stream for the dynamic vision sensor, we show that matching on the timing of the visual events provides a new solution to the real-time computation of 3-D objects when combined with geometric constraints using the distance to the epipolar lines. The proposed algorithm is able to filter out incorrect matches and to accurately reconstruct the depth of moving objects despite the low spatial resolution of the sensor. This brief sets up the principles for further event-based vision processing and demonstrates the importance of dynamic information and spike timing in processing asynchronous streams of visual events.
scite is a Brooklyn-based organization that helps researchers better discover and understand research articles through Smart Citations–citations that display the context of the citation and describe whether the article provides supporting or contrasting evidence. scite is used by students and researchers from around the world and is funded in part by the National Science Foundation and the National Institute on Drug Abuse of the National Institutes of Health.
hi@scite.ai
10624 S. Eastern Ave., Ste. A-614
Henderson, NV 89052, USA
Copyright © 2024 scite LLC. All rights reserved.
Made with 💙 for researchers
Part of the Research Solutions Family.