Video satellites have recently become an attractive method of Earth observation, providing consecutive images of the Earth’s surface for continuous monitoring of specific events. The development of on-board optical and communication systems has enabled various applications of satellite image sequences. However, satellite video-based target tracking remains a challenging research topic in remote sensing because of the relatively low spatial and temporal resolution of the imagery. This survey therefore systematically investigates current satellite video-based tracking approaches and benchmark datasets, focusing on five typical tracking applications: traffic target tracking, ship tracking, typhoon tracking, fire tracking, and ice motion tracking. The essential aspects of each tracking target are summarized, such as the tracking architecture, fundamental characteristics, primary motivations, and contributions. Furthermore, popular visual tracking benchmarks and their respective properties are discussed. Finally, a revised multi-level dataset based on WPAFB videos is generated and quantitatively evaluated to support future development in satellite video-based tracking. In this dataset, the 54.3% of tracklets with lower ds are selected and renamed as the Easy group, while 27.2% and 18.5% of the tracklets are grouped into the Medium-ds group and the Hard-ds group, respectively.
In this paper, a visual navigation method based on binocular vision and deep learning is proposed to solve the navigation problem in the docking process of autonomous aerial refueling for unmanned aerial vehicles. First, to meet the requirements of high accuracy and high frame rate in aerial refueling tasks, this paper proposes a single-stage lightweight drogue detection model. The model greatly increases the inference speed on binocular images by introducing image alignment and depthwise separable convolution, and it improves feature extraction capability and scale adaptation by using an efficient channel attention (ECA) mechanism and an adaptive spatial feature fusion (ASFF) method. Second, this paper proposes a novel method for estimating the pose of the drogue by spatial geometric modeling using optical markers, and further improves the accuracy and robustness of the algorithm by using visual reprojection. Moreover, this paper builds a visual simulation and semi-physical simulation experiments of visual navigation for the autonomous aerial refueling task. The experimental results show the following: (1) the proposed drogue detection model has high accuracy and real-time performance, with a mean average precision (mAP) of 98.23% and a detection speed of 41.11 FPS on the embedded module; (2) the position estimation error of the proposed visual navigation algorithm is less than ±0.1 m, and the attitude estimation error of the pitch and yaw angles is less than ±0.5°; and (3) in comparison experiments, the positioning accuracy of the proposed method is 1.18% higher than that of existing advanced methods.
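The speed-up that the abstract above attributes to depth-separable (depthwise separable) convolution comes from replacing one dense k × k convolution with a per-channel k × k convolution followed by a 1 × 1 pointwise convolution. The following is a minimal sketch of the resulting weight count, not the paper's detection model; the function names and the layer sizes (64 input channels, 128 output channels, 3 × 3 kernel) are illustrative assumptions.

```python
# Illustrative weight counts only; not taken from the paper's network.
# Bias terms are omitted in both cases.

def standard_conv_params(c_in: int, c_out: int, k: int) -> int:
    """Weights in a standard k x k convolution."""
    return k * k * c_in * c_out


def depthwise_separable_params(c_in: int, c_out: int, k: int) -> int:
    """Weights in a depthwise k x k conv plus a 1 x 1 pointwise conv."""
    depthwise = k * k * c_in   # one k x k filter per input channel
    pointwise = c_in * c_out   # 1 x 1 conv that mixes channels
    return depthwise + pointwise


def reduction_ratio(c_in: int, c_out: int, k: int) -> float:
    """Separable weights divided by standard weights (1/c_out + 1/k^2)."""
    return depthwise_separable_params(c_in, c_out, k) / standard_conv_params(c_in, c_out, k)


if __name__ == "__main__":
    # Hypothetical layer sizes chosen only for illustration.
    c_in, c_out, k = 64, 128, 3
    print(standard_conv_params(c_in, c_out, k))        # 73728
    print(depthwise_separable_params(c_in, c_out, k))  # 8768
    print(round(reduction_ratio(c_in, c_out, k), 4))   # 0.1189
```

For this hypothetical layer the separable form needs roughly 12% of the weights of the standard convolution; the actual saving in the paper's model depends on its real layer configuration, which the abstract does not give.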
Deep learning-based algorithms for single object tracking (SOT) have shown impressive performance but remain susceptible to adversarial patch attacks. Existing adversarial patch generation methods primarily focus on generating patches within the search region and neglect template information, which limits the effectiveness of their attacks. There is also a lack of evaluation metrics to assess a patch’s adversarial ability. In this study, we propose a bilateral adversarial patch-generating network to address these limitations and advance the field of adversarial patch generation for SOT networks. Our network leverages a Focus structure that effectively integrates both template and search region information, generating separate adversarial patches for each branch. We also introduce the DeFocus structure to resolve the size discrepancy between the template and search region of the tracking network. To effectively mislead the tracking network, we design adversarial object loss and adversarial regression loss functions tailored to the network’s output. Moreover, we propose a comprehensive evaluation metric that measures the patch’s adversarial ability by establishing a relationship between the relative patch size and attack performance. Because UAV-view data often contain small objects that require smaller patches, we evaluate our approach on the UAV123 and UAVDT datasets. Our evaluation covers not only the overall attack performance but also the effectiveness of our strategy and the transferability of the attacks. Experimental results demonstrate that our algorithm generates patches with higher attack efficiency than existing methods.