As one kind of remote sensing image (RSI), Directional Polarimetric Camera (DPC) data are of great significance for studying atmospheric radiative transfer and climate feedback. The usability of DPC images is often limited by cloud cover, so effective cloud detection is a prerequisite for many applications. Conventional threshold-based cloud detection methods are limited in both performance and generalization capability. In this paper, we propose a learning-based 3D multimodal fusion cloud detection network (MFCD-Net). The network is a three-input-stream architecture with a 3D-UNet-like encoder-decoder structure that fuses three modalities of DPC imagery (the reflectance image, polarization image Q, and polarization image U) while taking the angle and spectral information into account. Furthermore, cross attention is used to fuse the polarization features into the spatial-angle-spectral features of the reflectance image, enhancing the expressiveness of the fused features. The dataset used in this paper is obtained from the DPC cloud product and the cloud mask product. In the experiments, the proposed MFCD-Net achieved a recognition accuracy of 95.74%.
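To make the cross-attention fusion step more concrete, the following is a minimal sketch of generic scaled dot-product cross attention, in which reflectance features act as queries and polarization features supply keys and values. It is not the MFCD-Net implementation: the abstract does not specify the attention design, so the function name `cross_attention`, the residual connection, the flattened token layout, the random projection matrices, and all dimensions are assumptions chosen for illustration.

```python
import numpy as np


def softmax(x, axis=-1):
    """Numerically stable softmax along the given axis."""
    shifted = x - x.max(axis=axis, keepdims=True)
    e = np.exp(shifted)
    return e / e.sum(axis=axis, keepdims=True)


def cross_attention(reflectance_feat, polarization_feat, w_q, w_k, w_v):
    """Hypothetical fusion step: reflectance tokens attend to polarization tokens.

    reflectance_feat:  (N, C) flattened spatial-angle-spectral features
    polarization_feat: (M, C) flattened polarization (Q or U) features
    w_q, w_k:          (C, D) query/key projections
    w_v:               (C, C) value projection
    """
    q = reflectance_feat @ w_q               # (N, D) queries from reflectance
    k = polarization_feat @ w_k              # (M, D) keys from polarization
    v = polarization_feat @ w_v              # (M, C) values from polarization
    scores = q @ k.T / np.sqrt(q.shape[-1])  # (N, M) scaled similarities
    weights = softmax(scores, axis=-1)       # each row sums to 1
    attended = weights @ v                   # (N, C) polarization info per token
    # Assumed residual connection: add the attended features to the originals.
    return reflectance_feat + attended, weights


# Toy example with random features (illustrative sizes only).
rng = np.random.default_rng(0)
n_tokens, channels, d_model = 6, 8, 4
refl = rng.normal(size=(n_tokens, channels))
pol = rng.normal(size=(n_tokens, channels))
w_q = rng.normal(size=(channels, d_model))
w_k = rng.normal(size=(channels, d_model))
w_v = rng.normal(size=(channels, channels))

fused, weights = cross_attention(refl, pol, w_q, w_k, w_v)
print(fused.shape, weights.shape)
```

In a trained network the projection matrices would be learned, and the Q and U streams could each be fused this way or concatenated first; the abstract leaves that choice open.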