We introduce a network that directly predicts the 3D layout of lanes in a road scene from a single image. This work marks a first attempt to address this task with onboard sensing without assuming a known constant lane width or relying on pre-mapped environments. Our network architecture, 3D-LaneNet, applies two new concepts: intra-network inverse-perspective mapping (IPM) and an anchor-based lane representation. The intra-network IPM projection facilitates a dual-representation information flow in both the regular image view and the top view. An anchor-per-column output representation enables our end-to-end approach, which replaces common heuristics such as clustering and outlier rejection, casting lane estimation as an object detection problem. In addition, our approach explicitly handles complex situations such as lane merges and splits. Results are shown on two new 3D lane datasets, a synthetic one and a real one. For comparison with existing methods, we test our approach on the image-only tuSimple lane detection benchmark, achieving performance competitive with the state of the art.
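At its core, an IPM projection is a flat-ground homography between road-plane coordinates and image pixels. The sketch below, assuming a simple pinhole camera with pitch-only rotation, shows how such a homography can be constructed; all parameter values (intrinsics, pitch, camera height) are illustrative assumptions, not values from the paper, and the paper's intra-network variant applies the projection to feature maps rather than raw pixels.

```python
import numpy as np

def ipm_homography(K, pitch_rad, cam_height):
    """Flat-ground homography mapping road points (x, z) to image pixels.

    World frame: origin on the road beneath the camera, x right, y down,
    z forward; the road is the plane y = 0. Pitch-only rotation assumed.
    """
    c, s = np.cos(pitch_rad), np.sin(pitch_rad)
    R = np.array([[1.0, 0.0, 0.0],      # world-to-camera rotation:
                  [0.0,   c,  -s],      # pitch (down) about the x-axis
                  [0.0,   s,   c]])
    C = np.array([0.0, -cam_height, 0.0])   # camera center above the road
    t = (-R @ C).reshape(3, 1)
    # Ground points (x, z, 1) map through the first and third columns of R
    # plus the translation column.
    return K @ np.hstack([R[:, [0]], R[:, [2]], t])

K = np.array([[720.0,   0.0, 640.0],    # illustrative pinhole intrinsics
              [  0.0, 720.0, 360.0],
              [  0.0,   0.0,   1.0]])
H = ipm_homography(K, pitch_rad=np.deg2rad(5.0), cam_height=1.5)
# Project a road point 20 m ahead and 2 m to the right into the image.
p = H @ np.array([2.0, 20.0, 1.0])
u, v = p[0] / p[2], p[1] / p[2]
```

Inverting `H` maps image pixels back onto the road plane, which is the direction used to synthesize a top view.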
Obstacle detection is a fundamental technological enabler for autonomous driving and vehicle active safety applications. While dense laser scanners are best suited to the task (e.g. Google's self-driving car), camera-based systems, which are significantly less expensive, continue to improve. Stereo-based commercial solutions such as Daimler's "intelligent drive" are good at general obstacle detection, while monocular systems such as Mobileye's are usually designed to detect specific categories of objects (cars, pedestrians, etc.). General obstacle detection remains a difficult task for monocular camera-based systems, although such systems have clear advantages over stereo-based ones in terms of cost and packaging size. Another related task commonly performed by camera-based systems is scene labeling, in which a label (e.g. road, car, sidewalk) is assigned to each pixel in the image. This yields a full detection and segmentation of all obstacles and of the road, but scene labeling is generally a difficult task. Instead, we propose in this paper to solve a more constrained task: detecting, in each image column, the image contact point (pixel) between the closest obstacle and the ground, as depicted in Figure 1 (left). The idea is borrowed from the "Stixel-World" obstacle representation [1], in which the obstacle in each column is represented by a so-called "Stixel"; our goal is to find the bottom pixel of each such Stixel. Note that since we do not consider every non-road region (e.g. sidewalk, grass) an obstacle, road segmentation is a task distinct from obstacle detection. Note also that the term "free-space detection" is used ambiguously in the literature to describe both the obstacle detection task above [1] and the road segmentation task [4]. Current Stixel-based methods [1] for general obstacle detection use stereo vision, whereas our method is monocular.
A different approach to monocular obstacle detection relies on the host vehicle's motion and applies Structure-from-Motion (SfM) to sequences of video frames [3]. In contrast, our method uses a single image as input and therefore also operates when the host vehicle is stationary. In addition, the SfM approach is orthogonal to ours and could later be combined with it to improve performance. For the task of road segmentation, the common approach is to perform pixel- or patch-level classification [4]. In contrast, we propose to solve the problem using the same column-based regression approach as for obstacle detection. Our approach is novel in providing a unified framework for both the obstacle detection and road segmentation tasks, and in using the former to facilitate the latter in the training phase. We propose solving the obstacle detection task using a two-stage approach. In the first stage we divide the image into columns and solve detection as a regression problem using a convolutional neural network, which we call "StixelNet". Figure 1 (right) shows an example network input and output. In the second stage we improve the ...
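The column-based formulation above can be sketched as decoding a per-column distribution over candidate contact rows. This is a minimal illustration only: the output shapes, the random "network output", and the soft-argmax decoding are hypothetical assumptions, not StixelNet's actual architecture or loss.

```python
import numpy as np

# Hypothetical per-column network output: for each of W image columns,
# unnormalized scores over H_bins candidate rows for the obstacle's
# ground-contact pixel. Random values stand in for a real network here.
rng = np.random.default_rng(0)
H_bins, W = 74, 200
logits = rng.normal(size=(H_bins, W))

# Softmax per column, then decode the contact row as the probability-
# weighted mean row (a soft-argmax), giving one estimate per column.
probs = np.exp(logits - logits.max(axis=0, keepdims=True))
probs /= probs.sum(axis=0, keepdims=True)
contact_row = probs.T @ np.arange(H_bins)    # shape (W,)
```

Treating each column independently is what makes the task a regression (or classification over rows) per column rather than a full per-pixel labeling problem.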
The problem of estimating the number of sources and their angles of arrival from a single antenna array observation has been an active area of research in the signal processing community for the last few decades. When the number of sources is large, the maximum likelihood estimator is intractable due to its very high complexity, so alternative signal processing methods have been developed at some cost in performance. In this paper, we apply a deep neural network (DNN) approach to the problem and analyze its advantages over signal processing algorithms. We show that an appropriately designed network can attain maximum likelihood performance with feasible complexity and outperform other feasible signal processing estimation methods over various signal-to-noise ratios and array response inaccuracies.
Index Terms—Angle of arrival, deep neural networks, model order determination, single snapshot
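As context for the classical alternatives mentioned above, the sketch below builds the standard single-snapshot uniform-linear-array signal model and scans a conventional (Bartlett) beamformer over an angle grid. All values (array size, source angles, noise level, grid) are illustrative assumptions, and the beamformer is only one example of the signal processing baselines a DNN estimator would be compared against.

```python
import numpy as np

rng = np.random.default_rng(1)
M = 16                                   # number of antennas (illustrative)
angles_deg = np.array([-20.0, 15.0])     # hypothetical true source angles

def steering(theta_deg, m=M):
    """ULA steering vector for half-wavelength element spacing."""
    phase = np.pi * np.sin(np.deg2rad(theta_deg))
    return np.exp(1j * phase * np.arange(m))

# A single snapshot: sum of unit-amplitude sources plus complex noise.
x = sum(steering(a) for a in angles_deg)
x = x + 0.1 * (rng.normal(size=M) + 1j * rng.normal(size=M)) / np.sqrt(2)

# Conventional (Bartlett) beamformer spectrum over a 1-degree grid.
grid = np.arange(-90.0, 90.0, 1.0)
spectrum = np.array([np.abs(steering(g).conj() @ x) ** 2 / M for g in grid])
peak = grid[np.argmax(spectrum)]         # strongest-source angle estimate
```

With a single snapshot there is no sample covariance to average, which is precisely what makes subspace methods degrade and motivates both maximum-likelihood and learned estimators.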