Most machine learning tasks can be categorized as classification or regression problems. Regression and classification models are normally used to extract useful geographic information from observed or measured spatial data, in tasks such as land cover classification, spatial interpolation, and quantitative parameter retrieval. This paper reviews the progress of four advanced machine learning methods for spatial data handling, namely, support vector machine (SVM)-based kernel learning, semi-supervised and active learning, ensemble learning, and deep learning. These four methods are representative because they improve learning performance from different perspectives: feature space transformation and decision function (SVM), optimized use of samples (semi-supervised and active learning), and enhanced learning models and capabilities (ensemble learning and deep learning). For machine-learning-based spatial data handling, which these four methods can improve, the three key elements are learning algorithms, training samples, and input features. To apply machine learning methods to spatial data handling successfully, a four-level strategy is suggested: experimenting with and evaluating the applicability, extending the algorithms by embedding spatial properties, optimizing the parameters for better performance, and enhancing the algorithm by multiple means. Firstly, the advances of SVM are reviewed to demonstrate the merits of novel machine learning methods for spatial data, tracing the line from direct use and comparison with traditional classifiers to targeted improvements that address multi-class problems, optimize SVM parameters, and use spatial and spectral features. Semi-supervised learning and active learning methods are then reviewed as ways to deal with insufficient labeled samples, showing the potential of learning from small training sets. 
Furthermore, considering the poor generalization capacity and instability of individual machine learning algorithms, ensemble learning is introduced to integrate the advantages of multiple learners and to enhance the generalization capacity. The typical research lines, including the combination of multiple classifiers, advanced ensemble classifiers, and spatial interpolation, are presented. Finally, deep learning, one of the most popular branches of machine learning, is reviewed with specific examples for scene classification and urban structural type recognition from high-resolution remote sensing images. The review concludes that machine learning methods are very effective for spatial data handling and have wide application potential in the big data era.
Convolutional neural networks (CNNs) have exhibited enormous potential in hyperspectral image (HSI) classification owing to their excellent local modeling ability. Although CNN-based methods have performed well, their internal network backbone still has some limitations. On the one hand, the inability to model long-distance context dependencies is an inherent defect, which leads to a limited receptive field and insufficient feature capture in HSI. On the other hand, CNN-based methods usually need varied sample distributions for training and cannot infer dynamically, so they may not capture the inherent changes of HSI data well. To overcome these issues, we propose a novel local transformer with spatial partition restore network (SPRLT-Net) for HSI classification. Firstly, a local transformer is introduced to obtain the spatial attention weights dynamically by measuring the similarity between related pixel pairs. Secondly, a spatial partition restore (SPR) module is designed to split the input patch into several overlapping continuous sub-patches that form a sequence. With the obtained attention weights at hand, the SPR module restores the sequence to the original patch. Finally, a fully connected layer is used for classification. SPRLT-Net can capture global context dependencies, and the dynamic attention weights can adapt to the inherent changes of HSI spatial pixels. Experimental results based on spatially disjoint samples and randomly selected samples of five benchmark data sets demonstrate that SPRLT-Net outperforms the other state-of-the-art methods in terms of classification accuracy, generalization performance, and computational complexity.
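The partition and restore steps described above can be illustrated with a minimal NumPy sketch. It is not the SPRLT-Net implementation: the function names `split_patch` and `restore_patch`, the square sub-patch size, the stride, and the choice to restore by plain averaging of overlaps (rather than with learned attention weights, as the abstract describes) are all assumptions made for illustration.

```python
import numpy as np

def split_patch(patch, size, stride):
    """Split an (H, W, C) patch into overlapping size x size sub-patches.

    Assumes (H - size) and (W - size) are multiples of stride so that
    every pixel is covered by at least one sub-patch.
    """
    h, w = patch.shape[:2]
    subs, corners = [], []
    for r in range(0, h - size + 1, stride):
        for c in range(0, w - size + 1, stride):
            subs.append(patch[r:r + size, c:c + size].copy())
            corners.append((r, c))
    return subs, corners

def restore_patch(subs, corners, shape):
    """Put sub-patches back at their corners, averaging where they overlap.

    Averaging is a stand-in (assumption) for the attention-weighted
    restore described in the abstract.
    """
    acc = np.zeros(shape, dtype=float)
    count = np.zeros(shape[:2], dtype=float)
    for sub, (r, c) in zip(subs, corners):
        s = sub.shape[0]
        acc[r:r + s, c:c + s] += sub
        count[r:r + s, c:c + s] += 1
    # Add trailing axes so the counts broadcast over the spectral bands.
    count = count.reshape(count.shape + (1,) * (acc.ndim - 2))
    return acc / count

# Hypothetical 5 x 5 patch with 3 bands, split into 3 x 3 sub-patches.
patch = np.arange(5 * 5 * 3, dtype=float).reshape(5, 5, 3)
subs, corners = split_patch(patch, size=3, stride=2)
restored = restore_patch(subs, corners, patch.shape)
```

With unmodified sub-patches the round trip returns the input patch; in a real model the sub-patches would first be transformed by the attention step before being restored.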
Monitoring mangroves is critical to protecting coastal ecosystems. Some studies have used remote sensing to construct mangrove indices (MIs). However, existing MIs still have drawbacks. On the one hand, it remains difficult to distinguish mangroves from non-mangrove vegetation and from non-vegetated areas at the same time. On the other hand, existing MIs have not fully utilized phenological trajectories, which can greatly help to distinguish mangroves from other land covers. To overcome these issues, we built a novel mangrove index, the generalized composite mangrove index (GCMI), by compositing vegetation indices (VIs) and water indices (WIs) based on Sentinel-2 time series data. Firstly, to determine the optimal indices, a similarity trend distance (ST distance) measure was proposed based on the Pearson correlation coefficient and dynamic time warping (DTW). Secondly, to optimize the weights of the selected indices, a population reconstruction genetic algorithm (PRGA) was designed. Finally, mangroves were mapped by feeding the time series of GCMI into a random forest (RF) classifier. Experiments conducted over three areas along the southern coast of China demonstrate that: 1) GCMI enhances the separability between mangroves and other land covers compared to the existing VIs, WIs, and MIs, with an average overall accuracy (OA) of 91.45%; 2) ST distance outperforms Euclidean distance, Cosine distance, Pearson correlation coefficient, and DTW in optimizing the weights of GCMI; and 3) PRGA greatly improves the probability of attaining a globally optimal result. The innovation lies in the presented GCMI considering both the vegetation trajectory information and water inundation using time series.
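The two ingredients named for the ST distance, the Pearson correlation coefficient and DTW, can be sketched for a pair of index time series as follows. The abstract does not state how the two are combined, so this sketch only computes each ingredient separately; the function names, the absolute-difference local cost in DTW, and the sample values are assumptions for illustration.

```python
import numpy as np

def pearson(a, b):
    """Pearson correlation coefficient between two equal-length series.

    Note: undefined (NaN) if either series is constant.
    """
    a = np.asarray(a, dtype=float)
    b = np.asarray(b, dtype=float)
    return float(np.corrcoef(a, b)[0, 1])

def dtw_distance(a, b):
    """Classic dynamic time warping distance with |x - y| as local cost."""
    a = np.asarray(a, dtype=float)
    b = np.asarray(b, dtype=float)
    n, m = len(a), len(b)
    cost = np.full((n + 1, m + 1), np.inf)
    cost[0, 0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = abs(a[i - 1] - b[j - 1])
            # Extend the cheapest of the three admissible warping steps.
            cost[i, j] = d + min(cost[i - 1, j],
                                 cost[i, j - 1],
                                 cost[i - 1, j - 1])
    return float(cost[n, m])

# Hypothetical index time series for two pixels.
series_a = [0.2, 0.4, 0.6, 0.5]
series_b = [0.2, 0.4, 0.4, 0.6, 0.5]
print(dtw_distance(series_a, series_b))
print(pearson([1.0, 2.0, 3.0], [2.0, 4.0, 6.0]))
```

Pearson correlation captures agreement in trend regardless of magnitude, while DTW tolerates shifts in timing between the two trajectories, which is presumably why a measure built on both suits phenological time series.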
Soil moisture (SM) is a critical parameter in maintaining the balance of the water cycle and energy budgets between the climate system and the Earth's environment. The generalized regression neural network (GRNN) has been substantially verified as a powerful model for SM estimation because it can capture complex, non-linear relationships between predictors and responses. However, GRNN builds a full adjacency matrix using a Gaussian kernel, which is computationally expensive and may ignore the local structure. In addition, it is laborious to optimize the "spread" parameter. To overcome these issues, we propose an enhanced generalized regression neural network (EGRNN) for SM estimation, in which two main adaptations are made. On the one hand, the City block distance is used instead of the Euclidean distance for building the Gaussian kernel. On the other hand, k-nearest neighbors (k-NN) is adopted to yield an empirically sparse adjacency matrix. As the key advantage, the proposed EGRNN is less sensitive to outliers, because the Euclidean distance weights large differences more heavily than the City block distance does. Another advantage is that EGRNN models more local and discriminant information in the pattern layer, since k-NN connects only the data points within a neighborhood. Experiments conducted in the Qinghai-Tibet Plateau (QTP) demonstrate that: 1) EGRNN outperforms the other four neural network models, with R = 0.9485 and RMSE = 0.0325 cm³/cm³; 2) it captures spatial-temporal dynamics well and has higher consistency with the in-situ measurements; and 3) it adapts well to different in-situ networks and has better generalization performance.
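The two adaptations can be illustrated with a minimal NumPy sketch of a GRNN-style predictor that uses the City block distance inside a Gaussian kernel and keeps only the k nearest training points. This is a sketch under stated assumptions, not the authors' implementation: the function name `egrnn_predict`, the kernel form exp(-d² / (2·spread²)), and the default values of `k` and `spread` are all hypothetical.

```python
import numpy as np

def egrnn_predict(X_train, y_train, X_query, k=3, spread=1.0):
    """Kernel-weighted average over the k nearest training points.

    Distances are City block (L1); the kernel form and defaults are
    assumptions made for this illustration.
    """
    X_train = np.asarray(X_train, dtype=float)
    y_train = np.asarray(y_train, dtype=float)
    X_query = np.atleast_2d(np.asarray(X_query, dtype=float))

    # City block distances, shape (n_query, n_train).
    dist = np.abs(X_query[:, None, :] - X_train[None, :, :]).sum(axis=2)

    # Keep only the k nearest neighbors of each query (sparse connections).
    idx = np.argsort(dist, axis=1)[:, :k]
    d_k = np.take_along_axis(dist, idx, axis=1)

    # Gaussian kernel weights on the retained neighbors.
    w = np.exp(-d_k ** 2 / (2.0 * spread ** 2))
    denom = np.maximum(w.sum(axis=1), np.finfo(float).tiny)  # avoid 0/0
    return (w * y_train[idx]).sum(axis=1) / denom

# Hypothetical 1-D predictor with one outlying training sample at x = 10.
X = [[0.0], [1.0], [2.0], [10.0]]
y = [0.0, 1.0, 2.0, 100.0]
pred = egrnn_predict(X, y, [[1.0]], k=3, spread=1.0)
```

In this toy case the outlying sample at x = 10 is not among the three nearest neighbors of the query, so it has no influence on the estimate, which mirrors the locality argument made for k-NN above.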