Traffic sign detection in real environments holds great implications for large-scale traffic scene understanding. The main challenge is the large intra-category variance in features caused by instances of different spatial scales and appearances. In this paper, a traffic sign detection framework using a scale-aware and domain adaptive network (SADANet) is proposed, which seamlessly combines a multiscale prediction network (MSPN) with a domain adaptive network (DAN) in a tightly coupled manner to tackle this challenge. Specifically, MSPN is dedicated to extracting multiscale features. It fully utilizes low-level location information and high-level semantic information, and incorporates both contextual information and instance-specific content awareness into the scale transformation. DAN is dedicated to making features domain invariant without requiring labeled test data. It effectively aligns the domain distributions across different scales by leveraging the mapping relationship between the image representation and the multiscale features. Experimental results show that SADANet is effective for the traffic sign detection task and is competitive with state-of-the-art methods. INDEX TERMS Adaptive feature weighting, domain adaptation, multiscale feature, traffic sign detection.
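The abstract above describes fusing low-level location detail with high-level semantics across scales. As a minimal illustration of that idea (a hypothetical sketch, not the authors' MSPN code: the function names, 2x upsampling, and the fixed blending weight `alpha` are all assumptions for clarity), a coarse, semantically rich feature map can be upsampled and blended with a fine, location-rich one:

```python
# Hypothetical sketch of multiscale feature fusion: upsample a coarse,
# high-level map and blend it with a fine, low-level map of twice its size.

def upsample2x(fmap):
    """Nearest-neighbour 2x upsampling of a 2-D feature map (list of lists)."""
    out = []
    for row in fmap:
        stretched = [v for v in row for _ in range(2)]  # duplicate each column
        out.append(stretched)
        out.append(list(stretched))                     # duplicate each row
    return out

def fuse(low_level, high_level, alpha=0.5):
    """Weighted sum of a fine map and the upsampled coarse map."""
    up = upsample2x(high_level)
    return [[(1 - alpha) * l + alpha * h for l, h in zip(lrow, hrow)]
            for lrow, hrow in zip(low_level, up)]

low = [[1.0, 2.0], [3.0, 4.0]]   # fine 2x2 map (location detail)
high = [[10.0]]                  # coarse 1x1 map (semantic context)
print(fuse(low, high))           # each fine value blended with the coarse one
```

In a real detector the blending weight would be learned and content-aware rather than a fixed scalar, which is precisely the "instance-specific content awareness" the abstract refers to.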
Road detection is one of the crucial tasks for scene understanding in autonomous driving. Recently, methods based on deep learning have rapidly advanced and addressed this task well, because they can extract richer features. In this study, the authors treat visual road detection as a per-pixel classification problem, labelling each pixel of a given image as road or non-road. Complex illumination encountered in traffic applications degrades the adaptability of detection models. They address this problem by proposing a deep network architecture that combines a network called U-Net-prior with a domain adaptation model (DAM). U-Net-prior is a modified segmentation network that integrates a location prior and a shape prior into U-Net. DAM reduces the gap between training images and test images; it is optimised with adversarial learning so that the features extracted from different datasets become close to each other. They validate the effectiveness of each component of the algorithm and compare the overall architecture with other state-of-the-art methods. The results show that the architecture achieves top accuracy with the shortest run time among monocular-vision-based methods, and it also achieves competitive results compared with methods based on other sensors.
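The adversarial learning the DAM abstract describes rests on a simple objective: a domain discriminator is trained to tell source (training-set) features from target (test-set) features, while the feature extractor receives the reversed gradient of that loss, pushing the two feature distributions together. The sketch below (an illustrative assumption, not the authors' DAM code: the function name and scalar prediction are invented for clarity) shows the discriminator's binary cross-entropy and why a confused discriminator is the extractor's optimum:

```python
import math

# Hypothetical illustration of the adversarial objective behind domain
# adaptation: the discriminator minimises this loss; via gradient reversal
# the feature extractor effectively maximises it, so the extractor's
# optimum is a discriminator that cannot tell the domains apart.

def domain_bce(pred, is_source):
    """Binary cross-entropy of the discriminator's source-vs-target prediction.

    pred is the predicted probability that the feature came from the source
    domain; is_source is the true domain label.
    """
    y = 1.0 if is_source else 0.0
    return -(y * math.log(pred) + (1 - y) * math.log(1 - pred))

confused = domain_bce(0.5, True)    # discriminator at chance level
confident = domain_bce(0.9, True)   # discriminator nearly certain
print(confused > confident)         # True: confusion means higher disc. loss
```

At chance level (pred = 0.5) the loss equals log 2 regardless of the true label, which is exactly the state adversarial training drives the features toward.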
Traffic sign detection is a research hotspot in advanced driver assistance systems. Traffic sign targets appear against complex backgrounds with varying illumination and scale, and existing detection methods suffer from slow inference and low accuracy. To address these problems, this paper proposes a traffic sign detection method based on a lightweight multiscale feature fusion network. Because a lightweight network model is simple and has fewer parameters, it can greatly improve detection speed. To learn more target features and improve the generalization ability of the model, multiscale feature fusion is used during training to improve recognition accuracy. Firstly, MobileNetV3 was selected as the backbone network, a new spatial attention mechanism was introduced, and a spatial attention branch and a channel attention branch were constructed to obtain a mixed attention weight map. Secondly, a feature-interleaving module was constructed to convert the single-scale feature map of a specified layer into a multiscale feature fusion map, realizing the combined encoding of high-level and low-level semantic information. Then, a feature-extraction backbone combining lightweight multiscale feature fusion with the attention mechanism was built from the above components. Finally, a key-point detection network was constructed to output the location, offset, and category probability of the center points of traffic signs, achieving detection and recognition. The model was trained, validated, and tested on the TT100K dataset; the detection accuracy for 36 common categories of traffic signs exceeded 85%, and for five of those categories it exceeded 95%.
The results showed that, compared with Faster R-CNN, CornerNet, and CenterNet, traffic sign detection based on the lightweight multiscale feature fusion network had clear advantages in recognition speed and accuracy, significantly improved detection performance for small targets, and achieved better real-time performance.
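The mixed attention weight map mentioned in the abstract above combines a channel branch (one weight per channel, typically from global average pooling) with a spatial branch (one weight per position). The sketch below is a hypothetical, dependency-free illustration of that combination, not the paper's implementation: the function names, the sigmoid gating, and the plain broadcasted product are assumptions made for readability.

```python
import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def mixed_attention(features):
    """Apply a mixed (channel x spatial) attention weight map.

    features is a tensor stored as features[channel][row][col].
    """
    C = len(features)
    H, W = len(features[0]), len(features[0][0])
    # Channel branch: one weight per channel from global average pooling.
    ch_w = [sigmoid(sum(sum(row) for row in fm) / (H * W)) for fm in features]
    # Spatial branch: one weight per position from the channel-wise mean.
    sp_w = [[sigmoid(sum(features[c][i][j] for c in range(C)) / C)
             for j in range(W)] for i in range(H)]
    # Mixed weight map: broadcasted product applied to the features.
    return [[[features[c][i][j] * ch_w[c] * sp_w[i][j]
              for j in range(W)] for i in range(H)] for c in range(C)]

out = mixed_attention([[[2.0]]])  # 1 channel, 1x1 map
print(out)
```

In published attention modules (e.g. CBAM-style designs) each branch also contains small learned layers; the fixed sigmoid-of-mean used here only demonstrates how the two branches are fused into one weight map.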