Whether it be to feed data to an object detection-and-tracking system or to generate proper occupancy grids, ground extraction and classification in 3D point clouds are critical processing tasks, on whose efficiency the whole perception chain can depend. A flat-ground assumption or shape recognition in point clouds can lead either to systematic errors or to massive computations. This paper describes an adaptive method for ground labeling in 3D point clouds, based on local ground elevation estimation. The system models the ground as a Spatio-Temporal Conditional Random Field (STCRF). Spatial and temporal dependencies within the segmentation process are unified by a dynamic probabilistic framework based on the conditional random field (CRF). Ground elevation parameters are estimated in parallel in each node, using an interconnected Expectation-Maximization (EM) algorithm variant. The approach, designed to meet high-speed-vehicle constraints, performs efficiently with both highly dense (Velodyne-64) and sparser (Ibeo Lux) 3D point clouds; it has been implemented and deployed on experimental vehicles and platforms, and is currently being tested on embedded systems (Nvidia Jetson TX1, TK1). Experiments on real road data in various situations (city, countryside, mountain roads, ...) show promising results.
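To make the core idea concrete, the following is a minimal sketch of per-cell EM for local ground elevation with a neighbor-coupling term standing in for the CRF's spatial links. It is not the paper's STCRF (the temporal dimension, exact mixture model, and all names such as `em_ground_elevation` are illustrative assumptions): each grid cell models its point heights as a narrow "ground" Gaussian plus a broad "obstacle" component, and the M-step blends the local estimate with neighboring cells.

```python
import numpy as np

def em_ground_elevation(cell_heights, neighbors, n_iter=10,
                        sigma_g=0.05, sigma_o=0.5, smooth=0.5):
    """Hypothetical sketch of EM-based local ground elevation.

    cell_heights: list of 1-D arrays, the z-values of the points in each cell.
    neighbors:    list of index lists, the spatial neighbors of each cell.
    """
    # Initialize each cell's elevation with its lowest point.
    elev = np.array([z.min() if len(z) else 0.0 for z in cell_heights])

    for _ in range(n_iter):
        new_elev = elev.copy()
        for i, z in enumerate(cell_heights):
            nb = neighbors[i]
            if len(z) == 0:
                # Empty cell: fall back to the neighbor average.
                if nb:
                    new_elev[i] = elev[nb].mean()
                continue
            # E-step: responsibility of the ground component for each point
            # (obstacle component placed above the current ground estimate).
            g = np.exp(-0.5 * ((z - elev[i]) / sigma_g) ** 2) / sigma_g
            o = np.exp(-0.5 * ((z - elev[i] - 1.0) / sigma_o) ** 2) / sigma_o
            r = g / (g + o + 1e-12)
            # M-step: weighted local elevation, coupled to the neighbors
            # (the spatial term standing in for the CRF pairwise links).
            local = (r * z).sum() / (r.sum() + 1e-12)
            prior = elev[nb].mean() if nb else local
            new_elev[i] = (1 - smooth) * local + smooth * prior
        elev = new_elev

    # Label points within 3 sigma of the local elevation as ground.
    labels = [np.abs(z - elev[i]) < 3 * sigma_g
              for i, z in enumerate(cell_heights)]
    return elev, labels
```

In this reading, the per-cell E/M updates run independently (hence the parallelism claimed in the abstract), and only the smoothing term exchanges information between nodes.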
Ground plane estimation and ground point segmentation are crucial precursors for many applications in robotics and intelligent vehicles, such as navigable-space detection, occupancy grid generation, 3D object detection, point cloud matching for localization, and registration for mapping. In this paper, we present GndNet, a novel end-to-end approach that estimates ground plane elevation in a grid-based representation and segments the ground points simultaneously in real time. GndNet uses PointNet and a Pillar Feature Encoding network to extract features and regresses ground height for each cell of the grid. We augment the SemanticKITTI dataset to train our network. We provide qualitative and quantitative evaluations of our results for ground elevation estimation and semantic segmentation of point clouds. GndNet establishes a new state of the art, achieving a run-time of 55 Hz for ground plane estimation and ground point segmentation.
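The grid-based pipeline can be sketched as follows. This is an illustrative stand-in, not the published GndNet architecture or weights: a per-point MLP is max-pooled into grid pillars (PointNet-style), a small convolutional head regresses one ground height per cell, and points are labeled ground by comparing their z to the predicted height of their cell. The class name, layer sizes, and the 0.2 m threshold are assumptions; `cell_idx` is assumed to come from quantizing each point's x, y into the grid.

```python
import torch
import torch.nn as nn

class GridGroundNet(nn.Module):
    """Illustrative sketch of grid-based ground height regression."""

    def __init__(self, grid=(100, 100), feat=32):
        super().__init__()
        self.grid = grid
        self.point_mlp = nn.Sequential(
            nn.Linear(3, feat), nn.ReLU(), nn.Linear(feat, feat))
        self.head = nn.Sequential(
            nn.Conv2d(feat, feat, 3, padding=1), nn.ReLU(),
            nn.Conv2d(feat, 1, 1))  # one elevation value per cell

    def forward(self, points, cell_idx):
        # points: (N, 3) xyz; cell_idx: (N,) flat grid index per point.
        H, W = self.grid
        f = self.point_mlp(points)                      # (N, feat)
        pillars = f.new_zeros(H * W, f.shape[1])
        # Max-pool point features into their pillar (PointNet-style).
        pillars = pillars.index_reduce(0, cell_idx, f, 'amax',
                                       include_self=False)
        grid = pillars.t().reshape(1, -1, H, W)
        height = self.head(grid).reshape(H * W)         # per-cell ground z
        # Segment: a point is ground if it lies near its cell's height.
        ground = (points[:, 2] - height[cell_idx]).abs() < 0.2
        return height.reshape(H, W), ground
```

Regressing a height per cell rather than fitting one global plane is what lets this style of method follow sloped and uneven terrain.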
Accurate detection of objects in 3D point clouds is a central problem for autonomous navigation. Most existing methods rely on hand-crafted feature representations or on multi-sensor approaches prone to sensor failure. Approaches like PointNet, which operate directly on sparse point data, have shown good accuracy in the classification of single 3D objects. However, LiDAR sensors on autonomous vehicles generate large-scale point clouds, and real-time object detection in such cluttered environments remains a challenge. In this study, we propose Attentional PointNet, a novel end-to-end trainable deep architecture for object detection in point clouds. We extend the theory of visual attention mechanisms to 3D point clouds and introduce a new recurrent 3D Localization Network module. Rather than processing the whole point cloud, the network learns where to look (finding regions of interest), which significantly reduces the number of points to be processed and the inference time. Evaluation on the KITTI car detection benchmark shows that our Attentional PointNet achieves results comparable to state-of-the-art LiDAR-based 3D detection methods in both accuracy and speed.
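As a loose sketch of the "where to look" loop (the module name, sizes, and the fixed spherical glimpse are illustrative assumptions, not the paper's exact recurrent 3D Localization Network): a GRU, seeded by a global point feature, emits one region-of-interest center per step, and only the points inside that region would be passed to a downstream detector.

```python
import torch
import torch.nn as nn

class RecurrentLocalizer(nn.Module):
    """Illustrative recurrent glimpse selection over a point cloud."""

    def __init__(self, feat=64, steps=3, radius=2.0):
        super().__init__()
        self.steps, self.radius = steps, radius
        self.encode = nn.Sequential(nn.Linear(3, feat), nn.ReLU(),
                                    nn.Linear(feat, feat))
        self.gru = nn.GRUCell(feat, feat)
        self.to_center = nn.Linear(feat, 3)

    def forward(self, points):
        # points: (N, 3). A global max-pooled feature seeds the recurrence.
        g = self.encode(points).max(dim=0).values      # (feat,)
        h = torch.zeros_like(g)
        crops = []
        for _ in range(self.steps):
            h = self.gru(g.unsqueeze(0), h.unsqueeze(0)).squeeze(0)
            center = self.to_center(h)                 # next glimpse center
            mask = (points - center).norm(dim=1) < self.radius
            crops.append(points[mask])                 # region of interest
        return crops
```

The speedup claimed in the abstract comes from this structure: the expensive per-point processing runs only on the small crops, not on the full scan.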
Automotive systems must undergo a strict validation process before their release on commercial vehicles. With the increased use of probabilistic approaches in autonomous systems, standard validation methods are not applicable to this end. Furthermore, real-life validation, when it is even possible, implies costs that can be prohibitive. New methods for validation and testing are thus necessary. In this paper, we propose a generic method to evaluate complex probabilistic frameworks for autonomous driving. The method is based on Statistical Model Checking (SMC), using specifically defined Key Performance Indicators (KPIs), expressed as temporal properties depending on a set of identified metrics. By studying the behavior of these metrics during a large number of simulations via our statistical model checker, we evaluate the probability that the system meets the KPIs. We show how this method can be applied to two different subsystems of an autonomous vehicle: a perception system and a decision-making approach. An overview of these two systems is given to explain the related validation challenges. Extensive validation results are then provided for the decision-making case.
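The SMC loop itself is simple to state. Below is a generic sketch (the simulator, the KPI, and the time-to-collision example are toy placeholders, not the paper's checker or properties): sample simulation traces, check the temporal property on each metric trace, and bound the estimate with a Hoeffding-style confidence interval.

```python
import math
import random

def smc_estimate(run_simulation, kpi_holds, n_runs=1000, delta=0.05):
    """Estimate P(KPI) by Monte Carlo over simulation traces."""
    successes = sum(kpi_holds(run_simulation()) for _ in range(n_runs))
    p_hat = successes / n_runs
    # Hoeffding bound: P(|p_hat - p| >= eps) <= 2 exp(-2 n eps^2) = delta.
    eps = math.sqrt(math.log(2 / delta) / (2 * n_runs))
    return p_hat, eps

# Toy example: KPI "time-to-collision stays above 1.5 s for the whole run".
def run_simulation():
    return [random.uniform(0.5, 5.0) for _ in range(100)]   # ttc trace

def kpi_holds(trace):
    return min(trace) > 1.5

p, eps = smc_estimate(run_simulation, kpi_holds, n_runs=2000)
print(f"P(KPI) ~= {p:.3f} +/- {eps:.3f} (confidence 1 - delta = 0.95)")
```

The appeal for probabilistic systems is visible here: no model of the system's internals is needed, only the ability to run it many times and evaluate each KPI on the resulting metric traces.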
Semantic grids are a succinct and convenient way to represent the environment for mobile robotics and autonomous driving applications. While the use of Lidar sensors is now widespread in robotics, most semantic grid prediction approaches in the literature focus only on RGB data. In this paper, we present an approach for semantic grid prediction that uses a transformer architecture to fuse Lidar sensor data with RGB images from multiple cameras. Our proposed method, TransFuseGrid, first transforms both input streams into top-view embeddings, and then fuses these embeddings at multiple scales with Transformers. Finally, a decoder transforms the fused, top-view feature map into a semantic grid of the vehicle's environment. We evaluate the performance of our approach on the nuScenes dataset for the vehicle, drivable area, lane divider and walkway segmentation tasks. The results show that TransFuseGrid achieves superior performance to competing RGB-only and Lidar-only methods. Additionally, the Transformer feature fusion leads to a significant improvement over naive RGB-Lidar concatenation. In particular, for the segmentation of vehicles, our model outperforms state-of-the-art RGB-only and Lidar-only methods by 24% and 53%, respectively.
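A single-scale sketch of the fusion stage is shown below. It is a schematic stand-in for the paper's multi-scale Transformer fusion (the class name, channel sizes, and the choice of Lidar-as-query are simplifying assumptions): both modalities are assumed already projected to top-view embeddings on the same grid, the Lidar map attends to the camera map with cross-attention, and a residual connection preserves the Lidar features.

```python
import torch
import torch.nn as nn

class TopViewFusion(nn.Module):
    """Illustrative cross-attention fusion of two top-view feature maps."""

    def __init__(self, ch=64, heads=4):
        super().__init__()
        self.attn = nn.MultiheadAttention(ch, heads, batch_first=True)
        self.norm = nn.LayerNorm(ch)

    def forward(self, lidar_bev, camera_bev):
        # Both inputs: (B, C, H, W) top-view embeddings on the same grid.
        B, C, H, W = lidar_bev.shape
        q = lidar_bev.flatten(2).transpose(1, 2)    # (B, H*W, C) queries
        kv = camera_bev.flatten(2).transpose(1, 2)  # keys/values
        fused, _ = self.attn(q, kv, kv)
        fused = self.norm(fused + q)                # residual connection
        return fused.transpose(1, 2).reshape(B, C, H, W)
```

Unlike naive channel concatenation, attention lets each output cell weight the two modalities differently, and pull in camera evidence from other cells, which is one plausible reading of why the Transformer fusion outperforms concatenation in the reported results.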