Abstract. Soil organic carbon (SOC) strongly influences chemical, physical,
and biological soil properties and functions. To better understand how soil
management affects SOC content, precise monitoring of SOC in long-term
field experiments (LTFEs) is needed. Visible and
near-infrared (Vis–NIR) reflectance spectrometry provides an inexpensive and
fast way to complement conventional SOC analysis and has often been
used to predict SOC. For this study, 100 soil samples were collected at an
LTFE in central Germany using two different sampling designs. SOC values ranged
between 1.5 % and 2.9 %. Regression models were built using partial least
squares regression (PLSR). In order to build robust models, a nested repeated
5-fold group cross-validation (CV) approach was used, which comprised model
tuning and evaluation. Various aspects that influence the obtained error
measure were analysed and discussed. Four pre-processing methods were
compared in order to extract information regarding SOC from the spectra.
Without considering error propagation, the best model achieved a mean
RMSE_MV of 0.12 % SOC (R² = 0.86). This model performance was impaired by
ΔRMSE_MV = 0.04 % SOC (ΔR² = 0.09) when input data uncertainties were
considered, and by ΔRMSE_MV = 0.12 % SOC (ΔR² = 0.17) when an inappropriate
pre-processing method was used. The effect of the sampling design amounted
to a ΔRMSE_MV of 0.02 % SOC (ΔR² = 0.05). Overall,
we emphasize the need for transparent and precise documentation of the
measurement protocol, model building, and validation procedure so that
model performance can be assessed comprehensively and compared between
publications. The consideration of uncertainty
propagation is essential when applying Vis–NIR spectrometry for soil
monitoring.
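
The nested repeated 5-fold group CV mentioned above can be illustrated with a minimal sketch of the splitting logic only. It is not the authors' implementation: the function names, the seed, and the grouping variable (here a hypothetical plot identifier with four samples per plot) are assumptions, and the PLSR fitting itself is left out. The inner splits would serve model tuning (for example, choosing the number of PLSR components), while the outer test fold is kept aside for evaluation.

```python
import random
from collections import defaultdict


def group_k_fold(groups, k, rng):
    """Split sample positions into k folds so that no group spans two folds."""
    by_group = defaultdict(list)
    for idx, g in enumerate(groups):
        by_group[g].append(idx)
    unique = sorted(by_group)
    if len(unique) < k:
        raise ValueError("need at least k distinct groups")
    rng.shuffle(unique)
    folds = [[] for _ in range(k)]
    # Assign each group to the currently smallest fold to keep sizes balanced.
    for g in unique:
        min(folds, key=len).extend(by_group[g])
    return folds


def nested_repeated_group_cv(groups, k=5, repeats=2, seed=0):
    """Yield (repeat, outer_fold, inner_splits, test_idx).

    inner_splits is a list of (train, validation) index pairs drawn only from
    the outer training samples (for tuning); test_idx is the held-out outer
    fold (for evaluation). Group labels are hypothetical, e.g. plot IDs.
    """
    rng = random.Random(seed)
    for r in range(repeats):
        outer = group_k_fold(groups, k, rng)
        for i, test_idx in enumerate(outer):
            train_idx = [j for f, fold in enumerate(outer) if f != i for j in fold]
            train_groups = [groups[j] for j in train_idx]
            # Inner folds hold positions within train_idx; map them back below.
            inner = group_k_fold(train_groups, k, rng)
            inner_splits = []
            for v, val_pos in enumerate(inner):
                val = [train_idx[p] for p in val_pos]
                tr = [train_idx[p]
                      for w, fold in enumerate(inner) if w != v
                      for p in fold]
                inner_splits.append((tr, val))
            yield r, i, inner_splits, test_idx


# Hypothetical layout: 100 samples, 4 samples per plot (25 groups).
groups = [i // 4 for i in range(100)]
splits = list(nested_repeated_group_cv(groups, k=5, repeats=2, seed=1))
```

Keeping all samples of a group in the same fold prevents closely related samples from appearing in both training and test data, which would otherwise make the error measure look better than it is.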