We propose to embed features extracted from event-driven dynamic vision sensors to binary sparse representations in hyperdimensional (HD) space for regression. This embedding compresses events generated across 346×260 differential pixels to a sparse 8160bit vector by applying random activation functions. The sparse representation not only simplifies inference, but also enables online learning with the same memory footprint. Specifically, it allows efficient updates by retaining binary vector components over the course of online learning that cannot be otherwise achieved with dense representations demanding multibit vector components. We demonstrate online learning capability: using estimates and confidences of an initial model trained with only 25% of data, our method continuously updates the model for the remaining 75% of data, resulting in a close match with accuracy obtained with an oracle model on ground truth labels. When mapped on an 8-core accelerator, our method also achieves lower error, latency, and energy compared to other sparse/dense alternatives. Furthermore, it is 9.84× more energy-efficient and 6.25× faster than an optimized 9-layer perceptron with comparable accuracy.
CCS Concepts• Computer systems organization → Embedded software; • Computing methodologies → Machine learning.
Forecasting the future behavior of all traffic agents in the vicinity is a key task to achieve safe and reliable autonomous driving systems. It is a challenging problem as agents adjust their behavior depending on their intentions, the others' actions, and the road layout. In this paper, we propose Decoder Fusion RNN (DF-RNN), a recurrent, attention-based approach for motion forecasting. Our network is composed of a recurrent behavior encoder, an inter-agent multi-headed attention module, and a context-aware decoder. We design a map encoder that embeds polyline segments, combines them to create a graph structure, and merges their relevant parts with the agents' embeddings. We fuse the encoded map information with further inter-agent interactions only inside the decoder and propose to use explicit training as a method to effectively utilize the information available. We demonstrate the efficacy of our method by testing it on the Argoverse motion forecasting dataset and show its state-of-the-art performance on the public benchmark.
scite is a Brooklyn-based organization that helps researchers better discover and understand research articles through Smart Citations–citations that display the context of the citation and describe whether the article provides supporting or contrasting evidence. scite is used by students and researchers from around the world and is funded in part by the National Science Foundation and the National Institute on Drug Abuse of the National Institutes of Health.