Sina Mallah scite author profile

Sina Mallah

3 Publications

4 Citation Statements Received

0 Citation Statements Given

Affiliations

Agricultural Research & Education Organization, Soil Conservation and Watershed Management Research, University of Tehran

Publications

Predicting Soil Textural Classes Using Random Forest Models: Learning from Imbalanced Dataset

et al. 2022

Soil provides a key interface between the atmosphere and the lithosphere and plays an important role in food production, ecosystem services, and biodiversity. Recently, demand for applying machine learning (ML) methods to improve the knowledge and understanding of soil behavior has increased. Because real-world datasets are inherently imbalanced, ML models tend to overestimate the majority classes and underestimate the minority ones. The aim of this study was to investigate the effects of imbalance in training data on the performance of a random forest (RF) model. The original (imbalanced) dataset included 6100 soil texture observations from the surface layer of agricultural fields in northern Iran. The synthetic minority oversampling technique (SMOTE) was employed to create a balanced dataset from the original data. Bioclimatic and remotely sensed data, distance, and terrain attributes were used as environmental covariates to model and map soil textural classes. Based on mean minimal depth (MMD), distance and annual mean precipitation were the important covariates when imbalanced data were used, whereas terrain attributes and remotely sensed data played a key role in predicting soil texture when balanced data were employed. Balancing the data also improved the overall accuracy from 44% to 59% and the kappa value from 0.30 to 0.52. Similar increasing trends were observed for the recall and F-scores. It is concluded that, in modeling soil texture classes using RF models through a digital soil mapping approach, data should be balanced before modeling.

Towards a global soil taxonomy and classification tool for predicting multi-level soil hierarchy

Mallah, Bodaghabadi 2021

Model. Earth Syst. Environ.

Deep Insight on Land Use/Land Cover Geospatial Assessment through Internet-Based Validation Tool in Upper Karkheh River Basin (KRB), South-West Iran

Mallah, Gorji, Balali, et al. 2023

Land

Recently, the demand for high-quality land use/land cover (LULC) information for near-real-time crop type mapping, in particular for multi-relief landscapes, has increased. Because the LULC classes are inherently imbalanced, the statistics generally overestimate the majority classes and underestimate the minority ones. Therefore, the aim of this study was to assess the classes of the 10 m European Space Agency (ESA) WorldCover 2020 land use/land cover product with the support of the Google Earth Engine (GEE) in the Honam sub-basin, west Iran, using the LACOVAL (validation tool for regional-scale land cover and land cover change) online platform. The effect of imbalanced ground truth was also explored. Four sampling schemes were employed on a total of 720 collected ground truth points over approximately 14,100 ha. Grassland and cropland together covered 94% of the study area, while barren land, shrubland, trees and built-up areas covered the rest. The validation results showed that the equalized sampling scheme was more realistically successful than the others, yielding roughly the same values for overall accuracy (91.6%), mean user’s accuracy (91.6%), mean producer’s accuracy (91.9%), mean partial portmanteau (91.9%) and kappa (0.9). The product was statistically improved to 93.5% ± 0.04 by the assembling approach and segmented with the help of supplementary datasets and visual interpretation. The findings confirmed that, in mapping LULC, the class data should be balanced before accuracy assessment. It is concluded that the product is a reliable dataset for environmental modeling at the regional scale but needs some modifications for the bare land and grassland classes in mountainous semi-arid regions of the globe.
