This study focuses on the analysis of the distribution, both spatial and temporal, of the PM10 (particulate matter with a diameter of 10 µm or less) concentrations recorded in nine EMEP (European Monitoring and Evaluation Programme) background stations distributed throughout mainland Spain between 2001 and 2019. A study of hierarchical clusters was used to classify the stations into three main groups with similarities in yearly concentrations: GC (coastal location), GNC (north–central location), and GSE (southeastern location). The highest PM10 concentrations were registered in summer. Annual evolution showed statistically significant decreasing trends in PM10 concentration in all the stations covering a range from −0.21 to −0.50 µg m−3/year for Barcarrota and Víznar, respectively. Through the Lamb classification, the weather types were defined during the study period, and those associated with high levels of pollution were identified. Finally, the values exceeding the limits established by the legislation were analyzed for every station assessed in the study.
Modelling non-native marine species distributions is still a challenging activity. This study aims to predict the global distribution of five widespread introduced seaweed species by focusing on two mains aspects of the ensemble modeling process: (1) Does the enforcement of less complex models (in terms of number of predictors) help in obtaining better predictions? (2) What are the implications of tuning the configuration of individual algorithms in terms of ecological realism? Regarding the first aspect, two datasets with different number of predictors were created. Regarding the second aspect, four algorithms and three configurations were tested. Models were evaluated using common evaluation metrics (AUC, TSS, Boyce index and TSS-derived sensitivity) and ecological realism. Finally, a stepwise procedure for model selection was applied to build the ensembles. Models trained with the large predictor dataset generally performed better than models trained with the reduced dataset, but with some exceptions. Regarding algorithms and configurations, Random Forest (RF) and Generalized Boosting Models (GBM) scored the highest metric values in average, even though, RF response curves were the most unrealistic and non-smooth and GBM showed overfitting for some species. Generalized Linear Models (GLM) and MAXENT, despite their lower scores, fitted smoother curves (especially at intermediate complexity levels). Reliable and biologically meaningful predictions were achieved. Inspecting the number of predictors to include in final ensembles and the selection of algorithms and its complexity have been demonstrated to be crucial for this purpose. Additionally, we highlight the importance of combining quantitative (based on multiple evaluation metrics) and qualitative (based on ecological realism) methods for selecting optimal configurations.
scite is a Brooklyn-based organization that helps researchers better discover and understand research articles through Smart Citations–citations that display the context of the citation and describe whether the article provides supporting or contrasting evidence. scite is used by students and researchers from around the world and is funded in part by the National Science Foundation and the National Institute on Drug Abuse of the National Institutes of Health.
hi@scite.ai
10624 S. Eastern Ave., Ste. A-614
Henderson, NV 89052, USA
Copyright © 2024 scite LLC. All rights reserved.
Made with 💙 for researchers
Part of the Research Solutions Family.