With steady advances in deep learning for image processing, deep convolutional neural networks (CNNs) have been widely explored for single‐image super‐resolution (SISR) and have attained significant success. However, these CNN‐based methods cannot fully use the internal and external information of the image. The authors therefore add a lightweight Transformer structure to capture this information. Specifically, the authors combine a dense block structure with residual connections to build a residual dense convolution block (RDCB) that reduces the parameter count somewhat and extracts shallow features. The lightweight transformer block (LTB) then extracts further features and learns the texture details between patches through the self‐attention mechanism. The LTB comprises an efficient multi‐head transformer (EMT) with a small graphics processing unit (GPU) memory footprint, which benefits from feature preprocessing by multi‐head attention (MA), reduction, and expansion. In addition, a detail‐purifying attention block (DAB) is proposed to explore the context information in the high‐resolution (HR) space and recover more details. Extensive evaluations on four benchmark datasets demonstrate the effectiveness of the proposed model in terms of quantitative metrics and visual quality. The proposed EMT uses only about 40% as much GPU memory as other methods while achieving better performance.
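The self‐attention step that the abstract attributes to the LTB can be illustrated with a minimal sketch. This is plain single‐head scaled dot‐product attention in NumPy, not the authors' EMT; the function name, the random projection matrices, and the patch and embedding sizes are all assumptions chosen for illustration.

```python
import numpy as np

def self_attention(patches, w_q, w_k, w_v):
    """Single-head scaled dot-product self-attention over patch embeddings.

    patches: array of shape (n_patches, dim), one row per image patch.
    w_q, w_k, w_v: hypothetical projection matrices of shape (dim, dim).
    """
    q = patches @ w_q
    k = patches @ w_k
    v = patches @ w_v
    # Pairwise similarity between every pair of patches: (n_patches, n_patches).
    scores = q @ k.T / np.sqrt(k.shape[-1])
    # Numerically stable row-wise softmax.
    scores = scores - scores.max(axis=-1, keepdims=True)
    weights = np.exp(scores)
    weights = weights / weights.sum(axis=-1, keepdims=True)
    # Each output row is a weighted mix of all patch values.
    return weights @ v, weights

# Illustrative sizes only (assumed, not taken from the paper).
rng = np.random.default_rng(0)
n_patches, dim = 16, 8
patches = rng.standard_normal((n_patches, dim))
w_q, w_k, w_v = (rng.standard_normal((dim, dim)) for _ in range(3))
out, weights = self_attention(patches, w_q, w_k, w_v)
```

The attention weight matrix has one entry per pair of patches, so its size grows quadratically with the number of patches. That is the general reason attention memory is a concern for image models; how the EMT reduces it is not detailed in the abstract.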
ABSTRACT: Spatial correlation between pixels is important information for remotely sensed imagery classification. The data field method and spatial autocorrelation statistics have been used to describe and model the spatial information of local pixels. The original data field method can represent the spatial interactions of neighbourhood pixels effectively. However, because it focuses on measuring the grey level change between the central pixel and its neighbourhood pixels, it exaggerates the contribution of the central pixel to the whole local window. Geary's C has also been shown to characterise and qualify the spatial correlation between each pixel and its neighbourhood pixels well, but the extracted objects are badly delineated because of the distracting salt-and-pepper effect of isolated misclassified pixels. To correct this defect, we introduce the data field method for filtering and noise limitation. Moreover, the original data field method is enhanced by treating each pixel in the window in turn as the central pixel and computing statistical characteristics between it and its neighbourhood pixels. The last step employs a support vector machine (SVM) to classify the combined features (e.g. the spectral feature and the spatial correlation feature). To validate the effectiveness of the developed method, experiments are conducted on different remotely sensed images containing multiple complex object classes. The results show that the developed method outperforms the traditional method in terms of classification accuracy.
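As a rough illustration of the spatial autocorrelation statistic mentioned above, the sketch below computes the standard Geary's C for a single local window. It assumes binary rook (4‐neighbour) adjacency weights, which the abstract does not specify, and the function name `gearys_c` is hypothetical; it is not the authors' feature extraction pipeline.

```python
import numpy as np

def gearys_c(window):
    """Geary's C for a 2-D window, assuming binary rook (4-neighbour) weights.

    C = (N - 1) * sum_ij w_ij (x_i - x_j)^2 / (2 * W * sum_i (x_i - mean)^2)
    """
    x = np.asarray(window, dtype=float)
    n = x.size
    variance_sum = ((x - x.mean()) ** 2).sum()
    if variance_sum == 0:
        return float("nan")  # undefined for a constant window
    # Squared differences over horizontally and vertically adjacent pairs.
    sq_diff = ((x[:, 1:] - x[:, :-1]) ** 2).sum() + ((x[1:, :] - x[:-1, :]) ** 2).sum()
    n_pairs = x[:, 1:].size + x[1:, :].size
    # Each unordered pair appears twice in both the double sum and W,
    # so the factors of two cancel and leave a single 2 in the denominator.
    return (n - 1) * sq_diff / (2 * n_pairs * variance_sum)

# A checkerboard has dissimilar neighbours; a smooth ramp has similar ones.
checker = [[0, 1, 0], [1, 0, 1], [0, 1, 0]]
ramp = [[0, 0, 0], [1, 1, 1], [2, 2, 2]]
c_checker = gearys_c(checker)
c_ramp = gearys_c(ramp)
```

Values below 1 indicate positive spatial autocorrelation (similar neighbours), and values above 1 indicate negative autocorrelation, so the ramp scores below 1 and the checkerboard above it.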