Identifying and locating track areas in images with machine vision is the primary task of autonomous UAV inspection. Railway track images are strongly affected by lighting and viewing angle, the complex background environment is easily misidentified as track, and existing methods struggle to infer occluded track areas correctly. To address these problems, this paper proposes RT-GAN, a generative adversarial network (GAN)-based framework for precise railway track segmentation. RT-GAN consists of an encoder–decoder generator (named RT-seg) and a patch-based track discriminator. In the generator, a linear span unit (LSU) and a linear extension pyramid (LSP) are used to concatenate network features of different resolutions. In addition, a loss function containing gradient information is designed, and the gradient image of the segmentation result is added to the input of the track discriminator, guiding the generator, RT-seg, to focus on the linear features of railway tracks faster and more accurately. Experiments on the railway track dataset proposed in this paper show that, with the improved loss function and adversarial training, RT-GAN segments rail tracks more accurately than state-of-the-art techniques and has stronger occlusion inference capabilities, achieving 88.07% and 81.34% IoU on the unaugmented and augmented datasets, respectively.
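To make the idea of a "gradient image" and a gradient-aware loss term concrete, the following is a minimal sketch in plain Python. The abstract does not say which gradient operator or loss form RT-GAN uses, so the Sobel operator, the edge-replicating border handling, the L1 comparison, and the function names `sobel_gradient` and `gradient_l1` are all assumptions made for illustration only.

```python
def sobel_gradient(mask):
    """Sobel gradient magnitude of a 2D list of floats.

    Borders are handled by replicating the nearest edge pixel, so a mask
    that touches the image border does not produce a spurious edge there.
    """
    h, w = len(mask), len(mask[0])
    kx = [[-1, 0, 1], [-2, 0, 2], [-1, 0, 1]]   # horizontal derivative
    ky = [[-1, -2, -1], [0, 0, 0], [1, 2, 1]]   # vertical derivative

    def px(r, c):
        # Clamp coordinates to the valid range (edge replication).
        return mask[min(max(r, 0), h - 1)][min(max(c, 0), w - 1)]

    out = [[0.0] * w for _ in range(h)]
    for r in range(h):
        for c in range(w):
            gx = sum(kx[i][j] * px(r + i - 1, c + j - 1)
                     for i in range(3) for j in range(3))
            gy = sum(ky[i][j] * px(r + i - 1, c + j - 1)
                     for i in range(3) for j in range(3))
            out[r][c] = (gx * gx + gy * gy) ** 0.5
    return out


def gradient_l1(pred, target):
    """Mean absolute difference between the gradient images of two masks.

    Hypothetical loss term: an assumption, not the loss defined in the paper.
    """
    gp, gt = sobel_gradient(pred), sobel_gradient(target)
    n = len(pred) * len(pred[0])
    return sum(abs(a - b) for rp, rt in zip(gp, gt)
               for a, b in zip(rp, rt)) / n
```

A term of this kind penalizes predictions whose edges do not line up with the edges of the ground-truth track mask, which is one plausible way to push a segmenter toward thin linear structures.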
Traditional heatmap-based approaches to human pose estimation usually suffer from drawbacks such as high network complexity or suboptimal accuracy. Focusing on multi-person pose estimation without heatmaps, this paper proposes an end-to-end, lightweight human pose estimation network that adds a multi-scale coordinate attention mechanism to the YOLO-Pose network, improving overall performance while keeping the network lightweight. Specifically, the lightweight network GhostNet is first integrated into the backbone to alleviate model redundancy while still producing a large number of effective feature maps. Then, the coordinate attention mechanism is incorporated to enhance the network's sensitivity to direction and location. Finally, the BiFPN module is fused in to balance feature information across scales and further improve the expressive ability of the convolutional features. Experiments on the COCO 2017 validation set show that, compared with the baseline YOLO-Pose, the proposed network improves average accuracy by 4.8% while keeping the number of network parameters and the amount of computation low. These results demonstrate that the proposed method improves the accuracy of human pose estimation while remaining lightweight.
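The way a BiFPN-style module "balances" features of different scales can be illustrated with the fast normalized fusion rule from the original BiFPN design: each input map gets a non-negative weight, and the weights are normalized by their sum. The sketch below uses plain Python lists and assumes the inputs have already been resized to a common shape. The abstract does not say whether the paper keeps this exact fusion rule, and the function name and fixed weights here are illustrative assumptions (in a real network the weights are learned).

```python
def fast_normalized_fusion(features, weights, eps=1e-4):
    """Fuse same-shaped 2D feature maps with normalized non-negative weights.

    features: list of 2D lists (already resized to the same height/width).
    weights:  one scalar per feature map; negatives are clipped to zero,
              mirroring the ReLU applied to the weights in BiFPN.
    """
    w = [max(0.0, x) for x in weights]          # keep weights non-negative
    total = sum(w) + eps                        # eps avoids division by zero
    rows, cols = len(features[0]), len(features[0][0])
    return [[sum(wi * f[r][c] for wi, f in zip(w, features)) / total
             for c in range(cols)]
            for r in range(rows)]


# Usage: blend a fine-scale and a coarse-scale map with equal weights.
fine = [[1.0, 1.0], [1.0, 1.0]]
coarse = [[3.0, 3.0], [3.0, 3.0]]
fused = fast_normalized_fusion([fine, coarse], [1.0, 1.0])
```

Because the weights are normalized, no single scale can dominate purely through the magnitude of its weight, and a scale whose weight is driven to zero simply drops out of the fusion.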
The primary premise of autonomous railway inspection using unmanned aerial vehicles (UAVs) is achieving autonomous flight along the railway. In our previous work, fitted centerline-based UAV navigation was shown to be an effective method for guiding autonomous UAV flight. However, the empirical parameters used in the fitting procedure lacked a theoretical basis, and the fitted curves were neither coherent nor smooth. To address these problems, this paper proposes a skeleton detection method, called the dynamic-weight parallel instance and skeleton network, that directly extracts the centerlines, which can be viewed as skeletons. This multi-task network, with branches for skeleton detection and instance segmentation, can be trained end to end. Our method formulates a fused loss function with dynamic weights to control the dominant branch. During training, the sum of the weights remains constant, and the branch with the higher weight gradually changes from instance segmentation to skeleton detection. Experiments on our own railway skeleton and instance dataset, which comprises 3235 labeled overhead-view images taken in various environments, show that our model yields 93.98% mean average precision (mAP) for instance segmentation, a 51.9% F-measure (F-score) for skeleton detection, and a 60.32% weighted mean metric for the entire network. Our method yields more accurate railway skeletons and is useful for guiding the autonomous flight of a UAV in railway inspection.
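The dynamic-weight fused loss described above can be sketched as follows. The abstract states only that the two branch weights sum to a constant and that dominance shifts gradually from the instance branch to the skeleton branch. The linear schedule, the start and end values (0.8 and 0.2), the constant sum of 1, and the function names are assumptions chosen for illustration.

```python
def dynamic_weights(epoch, total_epochs, w_start=0.8, w_end=0.2):
    """Return (w_instance, w_skeleton) for an epoch; the pair sums to 1.

    The instance weight moves linearly from w_start to w_end, so the
    skeleton branch gradually becomes the dominant one. The linear shape
    is an assumption, not the schedule from the paper.
    """
    if total_epochs <= 1:
        progress = 1.0
    else:
        progress = min(max(epoch / (total_epochs - 1), 0.0), 1.0)
    w_instance = w_start + (w_end - w_start) * progress
    return w_instance, 1.0 - w_instance


def fused_loss(loss_instance, loss_skeleton, epoch, total_epochs):
    """Weighted sum of the two branch losses using the schedule above."""
    w_i, w_s = dynamic_weights(epoch, total_epochs)
    return w_i * loss_instance + w_s * loss_skeleton


# Usage: print the weights over a hypothetical 5-epoch run.
for e in range(5):
    print(e, dynamic_weights(e, 5))
```

Keeping the sum fixed means the overall loss scale stays comparable across epochs while the emphasis moves from one task to the other.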