Abstract. Machine learning (ML) and data-driven approaches are increasingly used in
many research areas. Extreme gradient boosting (XGBoost) is a tree boosting method that has evolved into
a state-of-the-art approach for many ML challenges. However, it has rarely
been used to simulate land-use change so far. Xilingol, a typical
region for research on serious grassland degradation and its drivers, was
selected as a case study to test whether XGBoost can provide alternative
insights that conventional land-use models are unable to generate. A set of
20 drivers was analysed using XGBoost with four alternative
sampling strategies, and SHAP (Shapley additive explanations) was used to interpret
the results of this purely data-driven approach. The results indicated that,
with three of the sampling strategies (over-balanced, balanced, and
imbalanced), XGBoost achieved similar and robust simulation results. SHAP
values were useful for analysing the complex relationships between the
different drivers of grassland degradation. Four drivers accounted for
99 % of the grassland degradation dynamics in Xilingol. These four drivers
were spatially allocated, and a risk map of further degradation was
produced. The limitations of using XGBoost to predict future land-use change
are discussed.
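
The attribution idea behind SHAP can be illustrated with a minimal sketch that computes exact Shapley values for a toy scoring function. Everything in it is an assumption made for illustration: the driver names (grazing, precipitation, road_distance), the scoring function toy_risk, and its coefficients are hypothetical and are not taken from the study. In practice SHAP is applied to a fitted tree model with efficient tree-specific algorithms rather than by the brute-force enumeration shown here.

```python
from itertools import combinations
from math import factorial


def shapley_values(value_fn, features):
    """Exact Shapley values by enumerating all coalitions (feasible only for few features)."""
    n = len(features)
    phi = {}
    for f in features:
        others = [g for g in features if g != f]
        total = 0.0
        for k in range(n):  # k = size of the coalition that f joins
            weight = factorial(k) * factorial(n - k - 1) / factorial(n)
            for subset in combinations(others, k):
                s = frozenset(subset)
                # Marginal contribution of f when added to coalition s
                total += weight * (value_fn(s | {f}) - value_fn(s))
        phi[f] = total
    return phi


def toy_risk(present):
    """Hypothetical degradation score given the set of drivers that are 'switched on'."""
    score = 0.0
    if "grazing" in present:
        score += 0.4
    if "precipitation" in present:
        score += 0.2
    if "grazing" in present and "precipitation" in present:
        score += 0.1  # interaction term, shared equally between the two drivers
    return score  # "road_distance" deliberately has no effect


drivers = ["grazing", "precipitation", "road_distance"]
phi = shapley_values(toy_risk, drivers)
for name, value in phi.items():
    print(f"{name}: {value:.3f}")
```

In this toy case the interaction term is split evenly, so grazing receives about 0.45, precipitation about 0.25, and road_distance 0. The attributions sum to the difference between the score with all drivers present and the score with none, which is the property that allows a statement such as "a few drivers account for most of the dynamics" to be read directly from the values.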