Earth observation (EO) has immense potential as an enabling tool for mapping the spatial characteristics of the topsoil layer. Recently, deep learning algorithms and cloud computing infrastructure have become available, with great potential to revolutionize the processing of EO data. This paper presents a novel EO-based soil monitoring approach that leverages open-access Copernicus Sentinel data and the Google Earth Engine platform. Building on key results from existing data mining approaches for extracting bare soil reflectance values, the current study delivers insights on the synergistic use of open-access optical and radar images. The proposed framework is driven by the need to eliminate the influence of ambient factors and to evaluate how effectively a convolutional neural network (CNN) can combine the complementary information contained in the pool of optical and radar spectral information with that from auxiliary geographical coordinates, mainly for soil. We developed and calibrated our multi-input CNN model on soil samples from the LUCAS database (calibration = 80%, validation = 20%) and then applied it to predict soil clay content. A promising prediction performance (R² = 0.60, ratio of performance to the interquartile range (RPIQ) = 2.02, n = 6136) was achieved by including both types of observations (synthetic aperture radar (SAR) and laboratory visible near infrared–short wave infrared (VNIR-SWIR) multispectral) in the CNN model, demonstrating an improvement of more than 5.5% in RMSE compared with using the multi-year median optical composite and current state-of-the-art nonlinear machine learning methods such as random forest (RF; R² = 0.55, RPIQ = 1.91, n = 6136) and artificial neural network (ANN; R² = 0.44, RPIQ = 1.71, n = 6136).
Moreover, we examined post-hoc techniques to interpret the CNN model and thus understand the relationships that the model identified between the spectral information and the soil target. Looking to the future, the proposed approach can be adapted to forthcoming hyperspectral orbital sensors to expand the current capabilities of the EO component by estimating more soil attributes with higher predictive performance.
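The accuracy figures quoted in this abstract (R², RMSE, RPIQ) can be reproduced for any set of predictions with a few lines of NumPy. The following is a minimal sketch, assuming RPIQ is taken in its usual sense as the interquartile range of the observed values divided by the RMSE of prediction; the function names and the toy clay values are illustrative and do not come from the paper.

```python
import numpy as np

def rmse(y_true, y_pred):
    """Root mean squared error of prediction."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    return float(np.sqrt(np.mean((y_true - y_pred) ** 2)))

def r2_score(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    ss_res = np.sum((y_true - y_pred) ** 2)
    ss_tot = np.sum((y_true - y_true.mean()) ** 2)
    return float(1.0 - ss_res / ss_tot)

def rpiq(y_true, y_pred):
    """Assumed definition: IQR of the observed values divided by RMSE."""
    q1, q3 = np.percentile(np.asarray(y_true, dtype=float), [25, 75])
    return float((q3 - q1) / rmse(y_true, y_pred))

# Toy observed and predicted clay contents (made-up values, for illustration only).
observed = [1.0, 2.0, 3.0, 4.0, 5.0]
predicted = [1.0, 2.0, 3.0, 4.0, 6.0]
print(r2_score(observed, predicted), rmse(observed, predicted), rpiq(observed, predicted))
```

Because RPIQ scales the error by the spread of the observations, a higher value indicates a better model, unlike RMSE, where lower is better.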
It has been demonstrated that diffuse reflectance spectroscopy in the visible and near‐infrared (vis–NIR) can be exploited to predict chemical and physical soil properties. Immense soil spectral libraries (SSL) are being developed; therefore, more elaborate tools that capitalize on contemporary knowledge and techniques need to be established to provide accurate predictions. In this paper, we propose a novel genetic algorithm‐based stacking model that makes synergetic use of multiple models developed from different preprocessed spectral sources (termed L1 models). This is a form of ensemble learning where multiple hypotheses are combined to create a more robust and more accurate ensemble hypothesis. The genetic algorithm automatically defines the configuration of the stacked model, by selecting the best cooperating subset of the initial models. Our methodology was tested on the newly developed GEO‐CRADLE SSL to predict soil organic matter (SOM). Results showed that the accuracy of prediction of the proposed method (R² = 0.76, and ratio of performance to interquartile range [RPIQ] = 2.22) was better than the one attained by the best L1 model (R² = 0.65, RPIQ = 1.93). This approach can thus be effectively utilized to enhance the predictions of soil properties in small and large soil spectral libraries alike.
Highlights
A novel model stacking algorithm is proposed, combining spectral models from different sources and algorithms.
Use of a genetic algorithm is examined, accounting for the large number of possible model permutations.
The methodology was applied to the GEO‐CRADLE vis–NIR soil spectral library.
Results indicate that model stacking creates more accurate and robust models.
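The idea of letting a genetic algorithm pick the best cooperating subset of L1 models can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: each chromosome is a binary mask over the L1 models, the meta-learner is an ordinary least-squares fit on the selected L1 predictions, fitness is the validation RMSE, and the operators (truncation selection, uniform crossover, bit-flip mutation) as well as all names and the synthetic data are hypothetical choices.

```python
import numpy as np

def stack_rmse(mask, P_train, y_train, P_val, y_val):
    """Fit a linear meta-learner on the selected L1 predictions; return validation RMSE."""
    if not mask.any():
        return np.inf  # an empty ensemble is not a valid solution
    A_train = np.column_stack([P_train[:, mask], np.ones(len(y_train))])
    A_val = np.column_stack([P_val[:, mask], np.ones(len(y_val))])
    coef, *_ = np.linalg.lstsq(A_train, y_train, rcond=None)
    return float(np.sqrt(np.mean((A_val @ coef - y_val) ** 2)))

def ga_select(P_train, y_train, P_val, y_val,
              pop_size=20, generations=30, p_mut=0.1, seed=0):
    """Search over binary masks of L1 models with a simple genetic algorithm."""
    rng = np.random.default_rng(seed)
    n_models = P_train.shape[1]

    def evaluate(population):
        return np.array([stack_rmse(ind, P_train, y_train, P_val, y_val)
                         for ind in population])

    pop = rng.random((pop_size, n_models)) < 0.5  # random initial masks
    for _ in range(generations):
        order = np.argsort(evaluate(pop))
        parents = pop[order[: pop_size // 2]]  # truncation selection: keep the best half
        children = []
        while len(children) < pop_size - len(parents):
            i, j = rng.integers(0, len(parents), size=2)
            cross = rng.random(n_models) < 0.5  # uniform crossover
            child = np.where(cross, parents[i], parents[j])
            child ^= rng.random(n_models) < p_mut  # bit-flip mutation
            children.append(child)
        pop = np.vstack([parents, np.array(children)])
    fitness = evaluate(pop)
    return pop[np.argmin(fitness)], float(fitness.min())

# Synthetic stand-in for six L1 models: the target plus noise of increasing size.
rng = np.random.default_rng(42)
y = rng.normal(size=200)
noise_sd = np.array([0.3, 0.4, 0.5, 0.6, 0.7, 0.8])
P = y[:, None] + rng.normal(size=(200, 6)) * noise_sd

best_mask, best_rmse = ga_select(P[:120], y[:120], P[120:], y[120:])
print("selected L1 models:", np.flatnonzero(best_mask), "validation RMSE:", best_rmse)
```

Since the validation set drives the selection in this sketch, the reported RMSE is optimistic; an independent test set would be needed for an unbiased estimate of the stacked model's accuracy.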