Vision-based vehicle detection has been extensively applied in autonomous driving systems and advanced driver assistance systems. However, it faces great challenges because partial observation regularly occurs owing to occlusion by infrastructure or dynamic objects, or to a limited field of view. This paper presents a two-stage detector based on Faster R-CNN for highly occluded vehicle detection, in which we integrate a part-aware region proposal network to capture global and local visual knowledge across different vehicle attributes. The model thus generates part-level and instance-level proposals simultaneously at the first stage. Then, the different parts belonging to the same vehicle are encoded and recombined into a single compositional proposal through part affinity fields, allowing the model to generate integral candidates and mitigate the impact of occlusion. Extensive experiments on the KITTI benchmark show that our method outperforms most machine-learning-based vehicle detection methods and achieves high recall in severely occluded scenarios.
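As a rough illustration of recombining part-level proposals into one instance-level proposal, the sketch below merges part boxes into the smallest enclosing box per vehicle. It is not the paper's part affinity field formulation: the `vehicle_id` attached to each part is a hypothetical stand-in for whatever association the affinity fields produce, and the function name and box layout `(x1, y1, x2, y2)` are assumptions made for this example.

```python
from collections import defaultdict


def merge_part_proposals(parts):
    """Merge part boxes into one enclosing box per vehicle.

    parts: iterable of (vehicle_id, (x1, y1, x2, y2)) pairs, where
    vehicle_id is an assumed, already-computed part-to-vehicle association.
    Returns a dict mapping vehicle_id to its enclosing box.
    """
    grouped = defaultdict(list)
    for vehicle_id, box in parts:
        grouped[vehicle_id].append(box)

    merged = {}
    for vehicle_id, boxes in grouped.items():
        # The enclosing box spans the extreme coordinates of all parts.
        merged[vehicle_id] = (
            min(b[0] for b in boxes),
            min(b[1] for b in boxes),
            max(b[2] for b in boxes),
            max(b[3] for b in boxes),
        )
    return merged


# Two visible parts of vehicle 0 and a single part of vehicle 1.
example_parts = [
    (0, (10, 20, 30, 40)),
    (0, (25, 15, 60, 45)),
    (1, (100, 100, 120, 130)),
]
merged_boxes = merge_part_proposals(example_parts)
```

An enclosing box is the simplest possible recombination; it only covers the visible parts, so a real detector would still need a regression step to estimate the extent of the occluded portion.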
Lane detection serves as one of the pivotal techniques for local navigation and HD map building in autonomous driving. However, lane detection remains an unresolved problem because of the difficulty of achieving accurate detection across diverse driving scenarios and the computational limits of on-board devices, let alone the detection of other road guidance markings. In this paper, we go beyond the aforementioned limitations and propose a segmentation-by-detection method for road marking extraction. The architecture consists of three modules: pre-processing, road marking detection, and segmentation. In the pre-processing stage, image enhancement is used to increase the contrast between road markings and the road background. To reduce computational complexity, the road region is also cropped in this module using a vanishing point detection algorithm. Then, a lightweight network is specifically designed for road marking detection. To enhance the network's sensitivity to road markings and improve detection accuracy, we further incorporate a Siamese attention module that integrates channel and spatial maps into the network. In the segmentation module, unlike semantic segmentation by a neural network, our method is mainly based on conventional morphological image processing algorithms, which are less computationally expensive and can still achieve pixel-level accuracy. Additionally, a sliding search box and the maximally stable extremal regions (MSER) algorithm are used to compensate for missed detections and position errors of the bounding boxes. In experiments, the proposed method delivers outstanding performance across datasets and runs in real time on embedded devices.
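The segmentation-by-detection idea can be sketched as follows: pixels are classified only inside a detected bounding box, and a morphological opening then removes isolated noise. This is a minimal pure-Python sketch under stated assumptions, not the authors' pipeline: the fixed intensity threshold, the 3x3 structuring element, and the function names are all choices made for this example, and the abstract does not say which morphological operations are used.

```python
def erode(mask):
    """3x3 binary erosion; pixels outside the image count as background."""
    h, w = len(mask), len(mask[0])
    out = [[0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            out[y][x] = int(all(
                0 <= y + dy < h and 0 <= x + dx < w and mask[y + dy][x + dx]
                for dy in (-1, 0, 1) for dx in (-1, 0, 1)
            ))
    return out


def dilate(mask):
    """3x3 binary dilation."""
    h, w = len(mask), len(mask[0])
    out = [[0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            out[y][x] = int(any(
                0 <= y + dy < h and 0 <= x + dx < w and mask[y + dy][x + dx]
                for dy in (-1, 0, 1) for dx in (-1, 0, 1)
            ))
    return out


def segment_in_box(image, box, threshold):
    """Threshold the pixels inside box = (x1, y1, x2, y2), with x2 and y2
    exclusive, then apply an opening (erosion followed by dilation)."""
    x1, y1, x2, y2 = box
    mask = [
        [1 if image[y][x] >= threshold else 0 for x in range(x1, x2)]
        for y in range(y1, y2)
    ]
    return dilate(erode(mask))


# Toy grayscale image: a bright 3x3 marking plus one isolated bright pixel.
image = [[0] * 6 for _ in range(6)]
for row in range(1, 4):
    for col in range(1, 4):
        image[row][col] = 255
image[5][5] = 255

marking_mask = segment_in_box(image, (0, 0, 6, 6), threshold=128)
```

Restricting the work to detected boxes is what keeps the cost low: the morphological pass touches only a small fraction of the frame, while the opening still yields a pixel-level mask.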
Vision-based technologies have been extensively applied to on-street parking space sensing, aiming to provide timely and accurate information to drivers and improve daily travel convenience. However, they face great challenges because partial visibility regularly occurs owing to occlusion by static or dynamic objects or to the limited perspective of the camera. This paper presents an image-based framework that infers parking space status by generating a 3D bounding box for each vehicle. A specially designed convolutional neural network based on ResNet and a feature pyramid network is proposed to overcome the challenges of partial visibility and occlusion. It predicts 3D box candidates on multi-scale feature maps with five different 3D anchors, which are generated by clustering the diverse scales of ground-truth boxes according to the different vehicle templates in the source data set. Subsequently, a vehicle distribution map is constructed jointly from the coordinates of the vehicle boxes and manually segmented parking spaces, where the normative degree of a parked vehicle is calculated as the intersection over union between the vehicle's box and the parking space edge. In space status inference, to further eliminate interference between neighboring vehicles, three adjacent spaces are combined into one unit and a multinomial logistic regression model is trained to refine the status of the unit. Experiments on the KITTI benchmark and Shanghai road data show that the proposed method outperforms most monocular approaches in 3D box regression and achieves satisfactory accuracy in space status inference.
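The intersection-over-union measure mentioned above can be illustrated with a small sketch. It assumes, purely for illustration, that both the vehicle footprint and the parking space are axis-aligned rectangles `(x1, y1, x2, y2)` on the ground plane; the abstract does not specify the box parameterization, and real vehicle boxes are generally rotated. The function name `footprint_iou` is hypothetical.

```python
def footprint_iou(vehicle, space):
    """Intersection over union of two axis-aligned rectangles.

    Each rectangle is (x1, y1, x2, y2) with x1 < x2 and y1 < y2.
    Returns a value in [0, 1]; 1 means the vehicle exactly fills the space.
    """
    inter_w = max(0.0, min(vehicle[2], space[2]) - max(vehicle[0], space[0]))
    inter_h = max(0.0, min(vehicle[3], space[3]) - max(vehicle[1], space[1]))
    intersection = inter_w * inter_h

    area_vehicle = (vehicle[2] - vehicle[0]) * (vehicle[3] - vehicle[1])
    area_space = (space[2] - space[0]) * (space[3] - space[1])
    union = area_vehicle + area_space - intersection
    return intersection / union if union > 0 else 0.0


# A 2 m x 5 m space and a vehicle of the same size shifted 1 m sideways.
space = (0.0, 0.0, 2.0, 5.0)
shifted_vehicle = (1.0, 0.0, 3.0, 5.0)
overlap = footprint_iou(shifted_vehicle, space)
```

Here the overlap is 5 square metres out of a 15 square metre union, so the score is one third; a vehicle straddling two spaces in this way would score low against both, which is the kind of interference the three-space unit is meant to resolve.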