After publication of the research paper [1], it was pointed out that two figures were identical: Figure 18 and Figure 19. In fact, the incorrect figure was Figure 19[...]
Landslide detection based on remote sensing images is an effective method for rapidly and accurately detecting landslide regions, which can aid in disaster prevention and mitigation. Landslide detection methods based on semantic segmentation can delineate the extent of landslides while detecting their location. Most existing models use multi-temporal or geological data to improve accuracy; however, the large amount of data introduces additional parameters and consumes significant computing resources. Therefore, this study proposes Reg-SA-UNet++, a landslide detection model that uses a single-temporal image captured post-landslide. Reg-SA-UNet++ is based on UNet++ with the following modifications: deep-supervision pruning is removed, reducing the number of model parameters and increasing detection accuracy; RegNet replaces the convolutional blocks in the encoder, further reducing parameters and improving feature extraction; and attention modules are added at the skip connections between the convolutional blocks of each layer to strengthen the model's attention to landslide features. The overall accuracy and F1 score of the Reg-SA-UNet++ model on the constructed landslide dataset (93.37% and 92.41%, respectively) and in landslide mapping (97.09% and 96.10%, respectively) verify the effectiveness of the proposed model in detecting landslides from remote sensing images.
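The attention modules described above re-weight skip-connection features so the decoder focuses on landslide-relevant channels. A minimal stand-in for such a module is squeeze-and-excitation-style channel attention; the sketch below is illustrative only (the abstract does not specify the attention variant), and it replaces the learned excitation layers with a plain sigmoid gate:

```python
import math

def channel_attention(feature_map):
    """SE-style channel attention sketch: squeeze each channel with a
    global average pool, gate it with a sigmoid, and rescale the channel.
    A real module would insert two learned fully connected layers before
    the sigmoid; this simplification only demonstrates the data flow.

    feature_map: list of C channels, each an H x W list of lists of floats.
    """
    # Squeeze: global average pool per channel.
    means = [sum(sum(row) for row in ch) / (len(ch) * len(ch[0]))
             for ch in feature_map]
    # Excite: sigmoid gate (stand-in for the learned bottleneck).
    weights = [1.0 / (1.0 + math.exp(-m)) for m in means]
    # Rescale: multiply every value in a channel by its attention weight.
    return [[[v * w for v in row] for row in ch]
            for ch, w in zip(feature_map, weights)]
```

In the architecture described above, the gated feature map would then be concatenated with the upsampled decoder features at the same layer, as in a standard UNet++ skip connection.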
Object detection in unmanned aerial vehicle (UAV) images has widespread applications in numerous fields; however, the complex backgrounds, diverse scales, and uneven distribution of objects in UAV images make object detection a challenging task. This study proposes a convolutional neural network (CNN)-transformer hybrid model for efficient object detection in UAV images, which has three advantages that contribute to improved detection performance. First, the efficient and effective cross-shaped window (CSWin) transformer serves as a backbone to obtain image features at different levels, and the obtained features are input into a feature pyramid network (FPN) to achieve multi-scale representation, which contributes to multi-scale object detection. Second, a hybrid patch embedding module is constructed to extract and utilize low-level information such as the edges and corners of the image. Finally, a slicing-based inference method is constructed to fuse the inference results of the original image and sliced images, which improves small-object detection accuracy without modifying the original network. Experimental results on public datasets illustrate that the proposed method improves detection performance more effectively than several popular and state-of-the-art object detection methods.
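The slicing-based inference idea above runs the detector on overlapping crops as well as the full image, then maps crop detections back to full-image coordinates and fuses everything (typically with NMS). The tiling step can be sketched as follows; the function name and default tile/overlap sizes are illustrative assumptions, not values from the paper:

```python
def make_slices(width, height, tile=512, overlap=64):
    """Compute overlapping tile boxes (x0, y0, x1, y1) that cover an image.
    Detections from each tile are later shifted by (x0, y0) back into
    full-image coordinates and fused with the full-image predictions.
    """
    step = tile - overlap
    xs = list(range(0, max(width - tile, 0) + 1, step))
    ys = list(range(0, max(height - tile, 0) + 1, step))
    if xs[-1] + tile < width:        # ensure the right edge is covered
        xs.append(width - tile)
    if ys[-1] + tile < height:       # ensure the bottom edge is covered
        ys.append(height - tile)
    # Clamp in case the image is smaller than one tile.
    return [(x, y, min(x + tile, width), min(y + tile, height))
            for y in ys for x in xs]
```

Because small objects occupy proportionally more pixels inside a 512-pixel tile than in the full frame, the detector sees them at a more favorable scale, which is why this scheme helps small-object accuracy without changing the network itself.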
Multispectral image reconstruction, which aims to recover a three-dimensional (3D) spatial-spectral signal from a two-dimensional measurement in a spectral camera based on ghost imaging via sparsity constraint (GISC), has been attracting much attention recently. However, when faced with abundant 3D spectral data, existing reconstruction methods cannot meet visual quality requirements. Leveraging the robust data-processing capability of deep learning, a novel network called SSTU-Net3+ is constructed by improving U-Net3+ with a spatial-spectral transformer (SST). To enhance the feature representation of images during reconstruction, mixed pooling modules and new convolution processes are proposed to improve the performance of the encoder and decoder, with U-Net3+ as the backbone. To boost the quality of reconstructed images, using split and concatenate (Concat) operations, we construct SST modules that exploit both the spatial and spectral correlations of multispectral images to refine the spatial and spectral features. Furthermore, we employ the SST in the decoder to reconstruct the desired 3D cube. Given similar network parameters, experiments on GISC spectral imaging data show that, compared to convolutional neural network-based methods, the average peak signal-to-noise ratio of images reconstructed using SSTU-Net3+ is improved by 3%, the structural similarity is enhanced by 3%, and the spectral angle mapping is reduced by 12%. In particular, compared to differential ghost imaging and compressed sensing, the reconstruction quality of SSTU-Net3+ is significantly improved. SSTU-Net3+ can process a large amount of 3D multispectral image data more efficiently and reconstruct the target image more accurately than the abovementioned methods.