Chlorophyll‐a (Chl‐a) is one of the most important indicators of the trophic status of inland waters, and its continued monitoring is essential. The recently operational Sentinel‐2 MSI satellite provides high‐spatial‐resolution images for remote water quality monitoring. In this study, we tested the performance of three well‐known machine learning (ML) models (random forest [RF], support vector machine [SVM], and Gaussian process [GP]) and two novel ML models (extreme gradient boosting [XGB] and CatBoost [CB]) for estimating a wide range of Chl‐a concentrations (10.1–798.7 μg/L) using Sentinel‐2 MSI data and in situ water quality measurements in the Tri An Reservoir (TAR), Vietnam. GP was the most reliable model for predicting Chl‐a from water quality parameters (R² = 0.85, root‐mean‐square error [RMSE] = 56.65 μg/L, Akaike's information criterion [AIC] = 575.10, and Bayesian information criterion [BIC] = 595.24). With water surface reflectance as the model input, CB was the superior model for Chl‐a retrieval (R² = 0.84, RMSE = 46.28 μg/L, AIC = 229.18, and BIC = 238.50). Our results indicate that GP and CB are the two best models for predicting Chl‐a in TAR. Overall, Sentinel‐2 MSI imagery coupled with ML algorithms is a reliable, inexpensive, and accurate tool for monitoring Chl‐a in inland waters.
Practitioner points
Machine learning algorithms were applied to both remote sensing data and in situ water quality measurements.
The performance of five machine learning models (three well‐known and two novel) was tested.
Gaussian process was the most reliable model for predicting Chl‐a from water quality parameters.
CatBoost was the best model for Chl‐a retrieval from water surface reflectance.
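The four comparison metrics reported above (R², RMSE, AIC, and BIC) can be illustrated with a minimal sketch. The function name `fit_metrics`, the parameter count `n_params`, and the sample values are hypothetical, and the AIC and BIC expressions assume the common least‐squares (Gaussian error) form; the formulation actually used in this study is not stated here and may differ.

```python
import numpy as np


def fit_metrics(y_true, y_pred, n_params):
    """Return R2, RMSE, AIC and BIC for one model's predictions.

    Assumption: AIC and BIC use the least-squares form
    n*ln(RSS/n) + penalty, which may not match the study's definition.
    """
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    n = y_true.size

    rss = float(np.sum((y_true - y_pred) ** 2))         # residual sum of squares
    tss = float(np.sum((y_true - y_true.mean()) ** 2))  # total sum of squares

    r2 = 1.0 - rss / tss
    rmse = float(np.sqrt(rss / n))
    aic = float(n * np.log(rss / n) + 2 * n_params)
    bic = float(n * np.log(rss / n) + n_params * np.log(n))
    return {"r2": r2, "rmse": rmse, "aic": aic, "bic": bic}


if __name__ == "__main__":
    # Made-up Chl-a values (ug/L) for illustration only, not study data.
    observed = [12.0, 85.5, 240.0, 410.3, 690.0]
    predicted = [20.1, 70.2, 265.4, 380.0, 655.8]
    print(fit_metrics(observed, predicted, n_params=3))
```

Under this form, a lower AIC or BIC indicates a better trade‐off between fit and model complexity, and values are only comparable between models fitted to the same set of observations.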