Cyanobacterial blooms are considered a major threat to global water security, with documented impacts on lake ecosystems and public health. Because cyanobacteria possess highly adaptive traits that allow them to prevail under different and often complex stressor regimes, predicting their abundance is challenging. A dataset from 822 Northern European lakes was used to determine which variables best explain the variation in cyanobacteria biomass (CBB) by means of stepwise multiple linear regression. Chlorophyll-a (Chl-a) and total nitrogen (TN) provided the best model structure for the entire dataset, while for the subsets of shallow and deep lakes, Chl-a, mean depth, TN and TN/TP explained part of the variance in CBB. Path analysis corroborated these findings. Finally, CBB was converted to a categorical variable according to the risk levels for human health associated with the recreational use of lakes. Several machine learning methods, namely Decision Tree, K-Nearest Neighbors, Support Vector Machine and Random Forest, were applied and showed a remarkable ability to predict the risk. After its parameters were tuned and optimized, Random Forest achieved an accuracy of 95.81%, exceeding the performance of all other machine learning methods tested. A confusion matrix analysis was performed for all machine learning methods to identify the potential of each method to correctly predict CBB risk levels and to assess the extent of false alarms; Random Forest clearly outperformed the other methods, with very promising results.
Cyanobacterial abundance in lakes has been the focus of several past and current studies that have highlighted different hydrological, climatic and human-oriented conditions.
Cyanobacterial blooms are not a modern phenomenon and have been reported in the scientific literature for more than 130 years [9]; however, they have appeared much more frequently in recent decades, mainly due to anthropogenic activities that tend to change the global climatic and environmental regime. Examples of such activities include changes in hydrological flow pathways, excessive use of fertilizers and the gradual removal of natural buffering zones between terrestrial and freshwater ecosystems [10]. Conversely, some anthropogenic changes, such as flooding and flushing, tend to reduce the growth of cyanobacteria more than that of other algae [11].
Empirical modeling has recognized the fundamental effect of phosphorus and nitrogen on the fluctuation of cyanobacterial biomass, implicating over-enrichment of lakes with nutrients as a major driver of cyanobacterial blooms [12][13][14][15][16]. In addition, high air or water temperature [11,17], calm weather (low wind speed) [18], high water residence time [19,20], low nitrogen-to-phosphorus ratios [21,22] and low light availability [23,24] are documented as significant factors and possible predictors that determine the dominance of cyanobacteria. However, predicting the concentration of cyanobacterial biomass in lakes is a complex and challen...