Elevated groundwater nitrate poses a risk to ecosystems and human
health, and delineating the extent of elevated groundwater nitrate
risk is essential for effective groundwater management and public
health safety. Here, using machine learning models (Random Forest,
Boosted Regression Tree, and Logistic Regression) on a large in situ
dataset, we present the first nationwide prediction of the extent of
groundwater nitrate contamination risk (concentration >45 mg/L) across India.
We also aimed to delineate the intrinsic (e.g., climate, geomorphic,
hydrogeologic) and extraneous (e.g., anthropogenic input) predictors
for identifying groundwater pollution risk. Of these models, Random
Forest performed best and was used to develop the final prediction
map of groundwater nitrate at 1 km² resolution. Climate
variables such as precipitation and aridity, and anthropogenic influences
such as fertilizer application and population density, were identified
as the most important variables for predicting groundwater nitrate
risk. Arid and semiarid regions in the west, south, and central
parts of the country contained the majority of high-risk areas. Predictions
suggested that about 37% of India’s areal extent and 380 million
people were exposed to elevated nitrate. The prediction model performed
satisfactorily on the validation dataset, indicating its predictive
ability at the local scale. The study aims to provide
an effective approach for identifying elevated groundwater nitrate
risk and to aid in raising awareness and developing strategies to
safeguard public health.
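
As a minimal illustration of the two bookkeeping steps implied above (labeling samples against the 45 mg/L threshold, then summarizing exposure from a gridded risk map), the following Python sketch may help. It is not the study's code: the function names, the 0.5 probability cutoff, the assumption of equal-area grid cells, and all input values are hypothetical and chosen only for illustration.

```python
# Threshold for elevated nitrate as stated in the text (mg/L).
NITRATE_LIMIT_MG_L = 45.0


def label_exceedance(concentrations_mg_l, limit=NITRATE_LIMIT_MG_L):
    """Return 1 where a measured concentration exceeds the limit, else 0.

    These binary labels are the kind of target a classifier would be
    trained on; the classifier itself is not shown here.
    """
    return [1 if c > limit else 0 for c in concentrations_mg_l]


def summarize_exposure(cells, cutoff=0.5):
    """Summarize a gridded risk map.

    cells: iterable of (predicted_probability, population) pairs, one per
           grid cell. Cells are ASSUMED to have equal area.
    cutoff: ASSUMED probability above which a cell counts as high risk.

    Returns (fraction of cells at high risk, population in those cells).
    """
    cells = list(cells)
    if not cells:
        raise ValueError("at least one grid cell is required")
    high_risk = [(p, pop) for p, pop in cells if p >= cutoff]
    area_fraction = len(high_risk) / len(cells)
    exposed_population = sum(pop for _, pop in high_risk)
    return area_fraction, exposed_population


if __name__ == "__main__":
    # Made-up toy inputs, not data from the study.
    print(label_exceedance([10.0, 45.0, 45.1, 120.0]))
    print(summarize_exposure([(0.9, 100), (0.2, 50), (0.5, 30), (0.1, 20)]))
```

Note that a concentration of exactly 45 mg/L is labeled 0 here, following the strict ">45 mg/L" wording; a real analysis would also need to account for varying cell areas if the grid is not equal-area.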