Cloud detection plays a vital role in remote sensing data preprocessing. Traditional cloud detection algorithms struggle with feature extraction and therefore perform poorly on remote sensing images with uneven cloud distribution and complex surface backgrounds. To achieve better detection results, a cloud detection method based on a multi-scale feature extraction and content-aware reassembly network (MCNet) is proposed. Using pyramid convolution and channel attention mechanisms to enhance feature extraction, MCNet fully exploits the spatial and channel information of clouds in an image. Content-aware reassembly is applied during upsampling so that the network recovers sufficient deep semantic information, improving cloud detection performance. Experimental results show that the proposed MCNet model achieves good results on cloud detection tasks.
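The channel attention mechanism mentioned above can be illustrated with a minimal numpy sketch in the squeeze-and-excitation style: globally pool each channel, pass the result through a small bottleneck, and gate the channels with the resulting weights. The weight matrices `w1` and `w2` here are placeholder assumptions standing in for the learned layers, not MCNet's actual parameters.

```python
import numpy as np

def channel_attention(feature_map, reduction=2, w1=None, w2=None):
    """Squeeze-and-excitation style channel attention on a (C, H, W) map.

    w1 (C/r x C) and w2 (C x C/r) are hypothetical stand-ins for the
    learned fully connected layers of the attention branch.
    """
    c, h, w = feature_map.shape
    # Squeeze: global average pooling over the spatial dimensions -> (C,)
    squeezed = feature_map.mean(axis=(1, 2))
    if w1 is None:
        w1 = np.eye(c // reduction, c)  # placeholder weights
    if w2 is None:
        w2 = np.eye(c, c // reduction)
    # Excitation: bottleneck MLP with ReLU, then sigmoid gating
    hidden = np.maximum(w1 @ squeezed, 0.0)
    gates = 1.0 / (1.0 + np.exp(-(w2 @ hidden)))  # per-channel weights in (0, 1)
    # Reweight each channel by its gate
    return feature_map * gates[:, None, None]
```

With learned weights, channels carrying cloud-relevant responses receive gates near 1 while background channels are suppressed.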
Aerial image-based target detection suffers from problems such as low accuracy on multiscale targets, slow detection speed, missed targets and falsely detected targets. To address these problems, this paper proposes a detection algorithm based on an improved You Only Look Once (YOLO)v3 network architecture from the perspective of model efficiency and applies it to multiscale image-based target detection. First, the K-means clustering algorithm is used to cluster an aerial dataset and optimize the anchor box parameters of the network, improving the effectiveness of target detection. Second, the feature extraction method of the algorithm is improved, and a feature fusion method is used to establish multiscale (large-, medium-, and small-scale) prediction layers, which mitigates the loss of small-target information in deep networks and improves detection accuracy. Finally, label regularization is applied to the predicted values, the generalized intersection over union (GIoU) is used as the bounding box regression loss, and the focal loss is integrated into the bounding box confidence loss, which not only improves detection accuracy but also effectively reduces the false detection and missed target rates. An experimental comparison on the RSOD and NWPU VHR-10 aerial datasets shows that the detection performance of high-efficiency YOLO (HE-YOLO) is significantly improved over YOLOv3, with average detection accuracy increased by 8.92% and 7.79% on the two datasets, respectively. The algorithm not only shows better detection performance for multiscale targets but also reduces the missed target and false detection rates, with good robustness and generalizability.
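The GIoU term used above for bounding box regression is standard and easy to state concretely: it penalizes the IoU by the fraction of the smallest enclosing box not covered by the union, so disjoint boxes still receive a useful signal. A minimal sketch for axis-aligned boxes:

```python
def giou(box_a, box_b):
    """Generalized IoU for boxes given as (x1, y1, x2, y2).

    GIoU = IoU - |C \ (A U B)| / |C|, where C is the smallest box
    enclosing both A and B. Ranges in (-1, 1]; unlike plain IoU it
    still varies (gives a gradient) when the boxes do not overlap.
    """
    ax1, ay1, ax2, ay2 = box_a
    bx1, by1, bx2, by2 = box_b
    # Intersection area
    iw = max(0.0, min(ax2, bx2) - max(ax1, bx1))
    ih = max(0.0, min(ay2, by2) - max(ay1, by1))
    inter = iw * ih
    area_a = (ax2 - ax1) * (ay2 - ay1)
    area_b = (bx2 - bx1) * (by2 - by1)
    union = area_a + area_b - inter
    iou = inter / union
    # Smallest enclosing box C
    area_c = (max(ax2, bx2) - min(ax1, bx1)) * (max(ay2, by2) - min(ay1, by1))
    return iou - (area_c - union) / area_c
```

The regression loss is then `1 - giou(pred, gt)`, which is minimized when the boxes coincide.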
The TransR model addresses the shortcoming that the TransE and TransH models embed entities and relations in a common space, and is considered a promising knowledge representation model. However, TransR still adopts the translation principle of the TransE model, whose constraints are too strict, which limits the model's ability to distinguish very similar entities. Therefore, we propose TransR*, a representation learning model based on flexible translation and relational matrix projection. First, we embed entities and relations in separate vector spaces; second, we apply a flexible translation strategy that relaxes the translation constraint. During training, the quality of generated negative triples is improved by replacing semantically similar entities, and the prior probability of relations is used to distinguish similarly encoded relations. Finally, we conducted link prediction experiments on the public datasets FB15K and WN18, and triple classification experiments on the WN11, FB13, and FB15K datasets, to verify the effectiveness of the proposed model. The evaluation results show that our method improves on TransR in the Mean Rank, Hits@10 and ACC metrics.
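The relational matrix projection that TransR (and TransR*) builds on can be sketched in a few lines: entities live in one space, each relation carries a projection matrix into its own space, and a triple is scored by the translation residual there. This is the standard TransR score, not the paper's flexible-translation variant; the dimensions and names below are illustrative.

```python
import numpy as np

def transr_score(h, r, t, M_r):
    """TransR-style score for a triple (h, r, t).

    h, t: entity embeddings (dim d_e); r: relation embedding (dim d_r);
    M_r: relation-specific (d_r, d_e) projection matrix.
    Projects both entities into the relation space and measures
    ||M_r h + r - M_r t||; lower scores indicate more plausible triples.
    """
    h_r = M_r @ h  # head projected into the relation space
    t_r = M_r @ t  # tail projected into the relation space
    return np.linalg.norm(h_r + r - t_r)
```

TransR*'s flexible translation relaxes the strict requirement that `M_r h + r` land exactly on `M_r t`, which is what helps it separate very similar entities.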
Target tracking technology based on aerial videos is widely used in many fields; however, it faces challenges such as image jitter, target blur, high data dimensionality, and large changes in target scale. This paper summarizes the research status of aerial video tracking and the characteristics, background complexity and tracking diversity of aerial video targets. Based on these findings, the key tracking technologies are elaborated according to target type, number of targets and applicable scene system. Tracking algorithms are classified by target type, and deep learning-based target tracking algorithms are classified by network structure. Commonly used aerial photography datasets are described, and the accuracies of commonly used target tracking methods are evaluated on an aerial photography dataset, UAV123, and a long-video dataset, UAV20L. Open problems are discussed, and possible future research directions and development trends in this field are analyzed and summarized.