Synthetic aperture radar (SAR) imaging and visible-light imaging are the two most commonly used imaging methods for remote sensing satellites. Because the information captured by the two modalities is highly complementary, many data fusion scenarios require both kinds of heterogeneous data. Before fusion, however, the two modalities must be aligned, and the performance of the heterogeneous matching algorithm directly determines the quality of the ground control points obtained during alignment. Many one-stage and two-stage methods currently exist for heterogeneous remote sensing image matching. Existing one-stage methods suffer from large average prediction offsets, low matching accuracy, and an imbalance among feature levels when features are combined, while existing two-stage methods cannot meet practical requirements for speed and accuracy. To address these issues, this paper proposes HB3CF, an end-to-end framework for heterogeneous remote sensing image matching. The framework uses a classic image feature extraction network to build a pseudo-Siamese (pseudo-twin) pair of networks that extract features from the two heterogeneous images separately. The features at each level are then resampled to a uniform channel count with convolution layers, which reduces the weight of high-dimensional features in the joint representation and effectively improves the expressive power of every feature level. Finally, the matching result is obtained by applying convolution, cross-correlation, and up-sampling operations to the high-dimensional features of the SAR and optical images. Experiments show that the model reduces the average offset error by about 25% compared with state-of-the-art methods. At average offset error thresholds of 0, 1, 2, and 3 pixels, accuracy improves by 8.53%, 9.54%, 4.16%, and 1.12%, reaching 25.90%, 65.03%, 86.65%, and 92.15%, respectively. This substantially improves the accuracy of heterogeneous image matching and explores the application of deep learning to large-scale heterogeneous remote sensing image matching tasks.
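The abstract's pipeline (a pseudo-Siamese pair of backbones, per-level 1x1 convolutions that project features to a uniform channel count, and a cross-correlation head followed by up-sampling) can be illustrated with a minimal PyTorch sketch. The toy backbone, channel sizes, input shapes, and module names below are illustrative assumptions, not the authors' released HB3CF implementation.

```python
# Minimal sketch of a pseudo-Siamese cross-correlation matching pipeline.
# Assumed details: a toy 3-stage CNN backbone, 64 uniform channels,
# a 64x64 SAR template and a 256x256 optical search image.
import torch
import torch.nn as nn
import torch.nn.functional as F

def conv_block(c_in, c_out):
    return nn.Sequential(
        nn.Conv2d(c_in, c_out, 3, stride=2, padding=1),
        nn.BatchNorm2d(c_out),
        nn.ReLU(inplace=True),
    )

class Backbone(nn.Module):
    """Toy multi-level extractor standing in for a classic CNN backbone."""
    def __init__(self):
        super().__init__()
        self.stages = nn.ModuleList([conv_block(1, 32),
                                     conv_block(32, 64),
                                     conv_block(64, 128)])
    def forward(self, x):
        feats = []
        for stage in self.stages:
            x = stage(x)
            feats.append(x)
        return feats  # one feature map per level

class UniformFusion(nn.Module):
    """1x1 convs resample every level to the same channel count, so
    high-dimensional levels do not dominate the joint feature."""
    def __init__(self, level_channels=(32, 64, 128), uniform_c=64):
        super().__init__()
        self.proj = nn.ModuleList([nn.Conv2d(c, uniform_c, 1)
                                   for c in level_channels])
    def forward(self, feats):
        target = feats[-1].shape[-2:]
        feats = [F.interpolate(p(f), size=target, mode='bilinear',
                               align_corners=False)
                 for p, f in zip(self.proj, feats)]
        return torch.cat(feats, dim=1)

def cross_correlate(search, template):
    """Slide the template feature map over the search feature map,
    batch by batch, producing one similarity (score) map per pair."""
    b, c, _, _ = search.shape
    out = F.conv2d(search.reshape(1, b * c, *search.shape[-2:]),
                   template, groups=b)
    return out.reshape(b, 1, *out.shape[-2:])

sar_net, opt_net = Backbone(), Backbone()  # unshared weights: pseudo-Siamese
fuse = UniformFusion()
sar = torch.randn(2, 1, 64, 64)     # SAR template patch
opt = torch.randn(2, 1, 256, 256)   # optical search image
score = cross_correlate(fuse(opt_net(opt)), fuse(sar_net(sar)))
score = F.interpolate(score, scale_factor=8, mode='bilinear',
                      align_corners=False)  # upsample toward image resolution
print(score.shape)  # argmax of each score map gives the predicted offset
```

The two backbones deliberately do not share weights: SAR and optical imagery have very different statistics, which is the usual motivation for a pseudo-Siamese rather than a true Siamese design.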
Fine-grained recognition, which aims to identify targets at the subcategory level, has applications in many fields. It is a highly challenging task because the differences between subcategories are subtle. Fine-grained recognition based on multimodal data is prone to both missing modalities and adversarial example attacks, and either situation can easily cause the model to fail. This study proposes an Enhanced Framework for the Complementarity of Multimodal Features (EFCMF) to address this problem. By randomly deactivating modal features in the constructed multimodal fine-grained recognition model, the model's learning of the complementarity between modalities is enhanced. The results show that the model gains the ability to handle missing modalities without additional training and achieves 91.14% and 99.31% accuracy on the Birds and Flowers datasets, respectively. Against four adversarial example attacks (FGSM, BIM, PGD, and C&W), the average accuracy of EFCMF on the two datasets is 52.85%, which is 27.13% higher than that of Bi-modal PMA. With missing modalities, the average accuracy of EFCMF on the two datasets is 76.33%, which is 32.63% higher than that of Bi-modal PMA. Compared with existing methods, EFCMF is robust to both missing modalities and adversarial example attacks in multimodal fine-grained recognition tasks. The source code is available at https://github.com/RPZ97/EFCMF (accessed on 8 January 2023).
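The core mechanism named in the abstract, randomly deactivating modal features during training, can be sketched as a modality-level dropout in a two-branch fusion model. The branch layout, feature dimensions, class count, and drop probability below are illustrative assumptions rather than the released EFCMF code.

```python
# Minimal sketch of random modal-feature deactivation: during training,
# each modality's feature vector is zeroed with some probability, pushing
# the fusion classifier to exploit whichever modality remains.
import torch
import torch.nn as nn

class ModalityDropoutFusion(nn.Module):
    def __init__(self, dim_a=512, dim_b=512, n_classes=200, p_drop=0.3):
        super().__init__()
        self.p_drop = p_drop
        self.classifier = nn.Linear(dim_a + dim_b, n_classes)

    def forward(self, feat_a, feat_b):
        if self.training:
            # Independently deactivate each modality per sample, but never
            # both at once, so every sample keeps some usable signal.
            b = feat_a.size(0)
            keep_a = torch.rand(b, 1, device=feat_a.device) > self.p_drop
            keep_b = torch.rand(b, 1, device=feat_b.device) > self.p_drop
            both_dropped = ~(keep_a | keep_b)
            keep_a = keep_a | both_dropped  # restore modality A if both fell
            feat_a = feat_a * keep_a.float()
            feat_b = feat_b * keep_b.float()
        return self.classifier(torch.cat([feat_a, feat_b], dim=1))

# Usage: at inference a missing modality is represented by zeros, the same
# condition the model already saw during training.
fusion = ModalityDropoutFusion().eval()
img_feat, txt_feat = torch.randn(4, 512), torch.randn(4, 512)
logits = fusion(img_feat, txt_feat)                     # both modalities
logits_missing = fusion(img_feat, torch.zeros(4, 512))  # text modality missing
```

Because a zeroed branch is exactly what the classifier encounters when a modality is dropped in training, handling a missing modality at test time requires no extra fine-tuning, which matches the behavior the abstract reports.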