Understanding the spatial distribution of soil organic carbon (SOC) content over different climatic regions will enhance our knowledge of carbon gains and losses due to climatic change. However, little is known about the SOC content in the contrasting arid and sub-humid regions of Iran, whose complex SOC–landscape relationships pose a challenge to spatial analysis. Machine learning (ML) models within a digital soil mapping framework can model such complex relationships. Current research focusses on ensemble ML models to increase the accuracy of prediction. The usual ensemble method is boosting or weighted averaging. This study proposes a novel ensemble technique: the stacking of multiple ML models through a meta-learning model. In addition, we tested the ensemble through rescanning the covariate space to maximize the prediction accuracy. We first applied six state-of-the-art ML models (i.e., Cubist, random forests (RF), extreme gradient boosting (XGBoost), classical artificial neural network models (ANN), neural network ensemble based on model averaging (AvNNet), and deep learning neural networks (DNN)) to predict and map the spatial distribution of SOC content at six soil depth intervals for both regions. In addition, the stacking of multiple ML models through a meta-learning model, with and without rescanning the covariate space, was tested and applied to maximize the prediction accuracy. Out of the six ML models, the DNN resulted in the best modeling accuracies, followed by RF, XGBoost, AvNNet, ANN, and Cubist. Importantly, the stacking of models indicated a significant improvement in the prediction of SOC content, especially when combined with rescanning the covariate space. For instance, the RMSE values for SOC content prediction in the upper 0–5 cm of the soil profiles at the arid site and the sub-humid site by the proposed stacking approaches were 17% and 9% lower, respectively, than those obtained by the DNN models, the best individual model. This indicates that rescanning the original covariate space by a meta-learning model can extract more information and improve the SOC content prediction accuracy. Overall, our results suggest that the stacking of diverse sets of models could be used to more accurately estimate the spatial distribution of SOC content in different climatic regions.
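The stacking idea described above can be sketched as follows. This is a minimal illustration, not the authors' implementation: it assumes that "rescanning the covariate space" means the meta-learner receives the original covariates alongside the base-model predictions, it swaps the six ML models for two simple stand-ins (a linear model and a k-nearest-neighbour average), and all function names and data are hypothetical.

```python
import numpy as np

def fit_linear(X, y):
    """Least-squares fit with an intercept; returns the coefficient vector."""
    A = np.column_stack([np.ones(len(X)), X])
    coef, *_ = np.linalg.lstsq(A, y, rcond=None)
    return coef

def predict_linear(coef, X):
    return np.column_stack([np.ones(len(X)), X]) @ coef

def knn_predict(X_train, y_train, X_new, k=5):
    """Average of the k nearest training targets (Euclidean distance)."""
    d = np.linalg.norm(X_new[:, None, :] - X_train[None, :, :], axis=2)
    idx = np.argsort(d, axis=1)[:, :k]
    return y_train[idx].mean(axis=1)

def out_of_fold_predictions(X, y, n_folds=5, seed=0):
    """Base-model predictions for each sample, made by models that never saw it."""
    rng = np.random.default_rng(seed)
    folds = np.array_split(rng.permutation(len(X)), n_folds)
    oof = np.zeros((len(X), 2))
    for test_idx in folds:
        train_idx = np.setdiff1d(np.arange(len(X)), test_idx)
        coef = fit_linear(X[train_idx], y[train_idx])
        oof[test_idx, 0] = predict_linear(coef, X[test_idx])
        oof[test_idx, 1] = knn_predict(X[train_idx], y[train_idx], X[test_idx])
    return oof

def fit_stack(X, y, rescan=False):
    """Fit a linear meta-learner on base predictions, optionally plus covariates."""
    oof = out_of_fold_predictions(X, y)
    meta_X = np.column_stack([oof, X]) if rescan else oof
    return fit_linear(meta_X, y)

# Hypothetical data: 60 sites, 3 covariates, a synthetic SOC-like target.
rng = np.random.default_rng(1)
X = rng.normal(size=(60, 3))
y = X[:, 0] + np.sin(X[:, 1]) + 0.1 * rng.normal(size=60)

coef_plain = fit_stack(X, y)                # intercept + 2 base-model weights
coef_rescan = fit_stack(X, y, rescan=True)  # additionally 3 covariate weights
```

Fitting the meta-learner on out-of-fold predictions keeps it from simply rewarding whichever base model overfits the training data most.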
The recent conversion of grasslands to cropland is degrading soil quality (SQ) and affecting soil erosion and crop productivity in the West Corn Belt (WCB) of the USA. The current study was conducted to estimate the spatial distribution of soil erosion at the scale of the Big Sioux River (BSR) watershed using the Geographical Information System (GIS)-enabled Revised Universal Soil Loss Equation (RUSLE). The spatial data used to assess soil erosion, such as weather, a digital elevation model (DEM), land use maps and soils, were downloaded from readily available online sources. Data showed that the share of grassland fell by about 7 percentage points, from 24% in 2008 to 17% in 2015, whereas the share of cropland rose by about 7.4 percentage points, from 64.6% in 2008 to 72% in 2015, in the BSR watershed. This grassland conversion to cropland increased the soil erosion (estimated using the RUSLE model) from 12.2 T ha−1 year−1 in 2008 to 14.8 T ha−1 year−1 in 2015. The present study concludes that grassland conversion to cropland in the BSR watershed increased soil erosion; therefore, management practices need to be applied to reduce the erosion risk and safeguard various other ecosystem services.
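For orientation, RUSLE estimates mean annual soil loss as the product A = R · K · LS · C · P (rainfall erosivity, soil erodibility, slope length and steepness, cover management, and support practice). The sketch below applies that product cell by cell to small rasters. All factor values are hypothetical placeholders and are not taken from the study; only the two watershed-scale erosion estimates at the end come from the abstract.

```python
import numpy as np

def rusle_soil_loss(R, K, LS, C, P):
    """Cell-wise RUSLE product A = R * K * LS * C * P (t ha^-1 yr^-1)."""
    return (np.asarray(R, float) * np.asarray(K, float) * np.asarray(LS, float)
            * np.asarray(C, float) * np.asarray(P, float))

# Hypothetical 2x2 factor rasters (illustrative values only).
R = np.full((2, 2), 100.0)                    # rainfall erosivity
K = np.array([[0.30, 0.25], [0.30, 0.20]])    # soil erodibility
LS = np.array([[1.2, 0.8], [2.0, 1.0]])       # slope length and steepness
P = np.ones((2, 2))                           # no support practice
C_grass = np.full((2, 2), 0.01)               # assumed cover factor, grassland
C_crop = np.full((2, 2), 0.20)                # assumed cover factor, cropland

loss_grass = rusle_soil_loss(R, K, LS, C_grass, P)
loss_crop = rusle_soil_loss(R, K, LS, C_crop, P)

# Relative increase in the estimates reported in the abstract (12.2 -> 14.8).
relative_increase = (14.8 - 12.2) / 12.2 * 100   # about 21%
```

Because only C differs between the two scenarios, the ratio of the two loss rasters equals the ratio of the cover factors, which is why land-use change shows up so directly in RUSLE estimates.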
Digital soil maps can be used to depict the ability of soil to fulfill certain functions. Digital maps offer reliable information that can be used in spatial planning programs. Several broad types of data mining approaches through Digital Soil Mapping (DSM) have been tested. The usual approach is to select the model that produces the best validation statistics. However, instead of choosing the best model, it is possible to combine all models, recognizing their strengths and weaknesses. We applied seven different techniques for the prediction of soil classes based on 194 sites located in the Isfahan region. The mapping exercise aims to produce a soil class map that can be used for better understanding and management of soil resources. The models used in this study include Multinomial Logistic Regression (MnLR), Artificial Neural Networks (ANN), Support Vector Machine (SVM), Decision Tree (DT), Random Forest (RF), Bayesian Networks (BN), and Sparse Multinomial Logistic Regression (SMnLR). Two ensemble models based on majority votes (Ensemble.1) and MnLR (Ensemble.2) were implemented for integrating the optimal aspects of the individual techniques. The overall accuracy (OA), Cohen's kappa coefficient index (κ) and the area under the curve (AUC) were calculated based on 10-fold cross-validation with 100 repeats at four soil taxonomic levels. The Ensemble.2 model achieved larger OA, κ coefficient and AUC values than the best performing individual model (i.e., RF). Results of the ensemble model showed a decreasing trend in OA from Order (0.90) to Subgroup (0.53). This was also the case for the κ statistic, which was the largest for the Order (0.66) and smallest for the Subgroup (0.43). The same decrease was observed for AUC from Order (0.81) to Subgroup (0.67). The improvement in κ was substantial (43 to 60%) at all soil taxonomic levels, except the Order level. We conclude that the application of the ensemble model using the MnLR was optimal, as it provided a highly accurate prediction for all soil taxonomic levels over and above the individual models. It also used information from all models, and thus this method can be recommended for improved soil class modelling. Soil maps created by this DSM approach showed soils that are prone to degradation and need to be carefully managed and conserved to avoid further land degradation.
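The majority-vote ensemble (Ensemble.1) and the κ statistic used above can be illustrated with a short sketch. It is an assumption-laden toy: the three "models", their class labels, and the function names are hypothetical, ties are broken in favour of the lowest class label, and the MnLR-based Ensemble.2 is not reproduced here.

```python
import numpy as np

def majority_vote(predictions):
    """predictions: (n_models, n_sites) integer labels; returns one label per site."""
    predictions = np.asarray(predictions)
    n_classes = int(predictions.max()) + 1
    # counts[c, s] = number of models voting for class c at site s
    counts = np.stack([(predictions == c).sum(axis=0) for c in range(n_classes)])
    return counts.argmax(axis=0)  # ties go to the lowest class label

def cohen_kappa(y_true, y_pred):
    """Agreement corrected for chance: (p_o - p_e) / (1 - p_e)."""
    y_true, y_pred = np.asarray(y_true), np.asarray(y_pred)
    p_o = np.mean(y_true == y_pred)
    classes = np.union1d(y_true, y_pred)
    p_e = sum(np.mean(y_true == c) * np.mean(y_pred == c) for c in classes)
    return float((p_o - p_e) / (1.0 - p_e))

# Hypothetical soil classes at six sites and three individual classifiers.
observed = np.array([0, 0, 1, 1, 2, 2])
model_preds = np.array([
    [0, 0, 1, 2, 2, 2],
    [0, 1, 1, 1, 2, 0],
    [0, 0, 0, 1, 2, 2],
])

ensemble = majority_vote(model_preds)
kappa_first_model = cohen_kappa(observed, model_preds[0])
kappa_ensemble = cohen_kappa(observed, ensemble)
```

In this toy case each classifier errs at a different site, so the vote corrects every individual error; real ensembles gain less when the models' errors are correlated.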
To manage soil salinity effectively, it is necessary to understand the origin and the spatial distribution of salinity. There are about 120 salt dome outcrops in southern Iran, and little is known about their contribution as potential sources of salts or about the spatial pattern of salts around them. Six machine learning algorithms were applied to model topsoil electrical conductivity (EC) and sodium adsorption ratio (SAR) in the Darab Plain (surrounded by six salt domes), Fars Province. Decision trees (DT), k‐nearest neighbours (kNN), support vector machines (SVM), Cubist, random forests (RF) and extreme gradient boosting (XGBoost) were used as primary models, and the Granger–Ramanathan (GR) method was used to combine the predictions of these models. The results showed that remotely sensed data contributed more to the prediction of EC and SAR than terrain‐based data. In terms of root mean square error (RMSE), Cubist, followed by the RF model, tended to give the best estimates of EC, whereas for SAR, RF performed best and was followed closely by SVM and Cubist. Compared to the primary models, the GR method on average resulted in a decrease of 6.1% and 3.9% in RMSE and an increase of 10% and 10.9% in R² for EC and SAR, respectively. The spatial pattern of SAR and EC suggested that the contribution of salt domes to soil salinization varied significantly according to their hydraulic behaviour in relation to adjacent aquifers and their activity. In general, the model averaging approach showed the potential to improve the estimates of EC and SAR.
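Granger–Ramanathan averaging combines models by regressing the observations on the individual model predictions and using the fitted coefficients as weights. The sketch below assumes the unconstrained variant with an intercept, since the abstract does not say which variant was used; the two "models", the EC-like data, and the function names are hypothetical.

```python
import numpy as np

def granger_ramanathan_weights(predictions, observed):
    """OLS of observations on model predictions; returns [intercept, w_1, ..., w_m]."""
    P = np.asarray(predictions, float)
    A = np.column_stack([np.ones(len(P)), P])
    w, *_ = np.linalg.lstsq(A, np.asarray(observed, float), rcond=None)
    return w

def combine(predictions, w):
    """Weighted combination of the model predictions plus the intercept."""
    return w[0] + np.asarray(predictions, float) @ w[1:]

def rmse(pred, obs):
    return float(np.sqrt(np.mean((np.asarray(pred) - np.asarray(obs)) ** 2)))

# Hypothetical EC-like observations and two noisy primary models.
rng = np.random.default_rng(0)
truth = rng.uniform(1.0, 20.0, size=50)
pred_a = truth + rng.normal(0.0, 2.0, size=50)
pred_b = truth + rng.normal(0.0, 3.0, size=50)
preds = np.column_stack([pred_a, pred_b])

weights = granger_ramanathan_weights(preds, truth)
combined = combine(preds, weights)
```

On the data used to fit the weights, the combined RMSE cannot exceed that of any single model, because each model is a special case of the regression. A fair comparison would therefore fit the weights on cross-validated predictions and evaluate on held-out sites.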