Thousands of deaths associated with air pollution each year could be prevented by forecasting the behavior and geographical distribution of factors that pose risks to people's health. Proximity to pollution sources, degree of urbanization, and population density are some of the factors whose spatial distribution makes it possible to identify their potential influence on the presence of respiratory diseases (RD). Currently, Bogotá is among the cities with the poorest air quality in Latin America. Specifically, the locality of Kennedy is one of the zones in the city with the highest recorded concentrations of local pollutants over the last 10 years. From 2009 to 2016, there were 8619 deaths associated with respiratory and cardiovascular diseases in the locality. Given these characteristics, this study set out to identify and analyze the areas in which the primary socio-economic and environmental conditions contribute to the presence of symptoms associated with RD. To this end, information collected in the field through georeferenced surveys was analyzed with geostatistical and machine learning tools to carry out cluster and pattern analyses. Random forests and AdaBoost were applied to establish hotspots where RD could occur, given the combination of predictor variables in the micro-territory. It was found that random forests outperformed AdaBoost, with an AUC of 0.63. In particular, this study's approach applies to densely populated municipalities with high levels of air pollution. By using these tools, municipalities can anticipate environmental health situations and reduce the cost of respiratory disease treatments.
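As a rough illustration of the metric used to compare the two classifiers, the sketch below computes AUC from predicted scores using the pairwise-ranking definition (the probability that a randomly chosen positive case is scored above a randomly chosen negative one). It does not reproduce the study's models or data: the labels, the score lists, and the names `auc`, `scores_model_a`, and `scores_model_b` are hypothetical and exist only for this example.

```python
def auc(labels, scores):
    """Area under the ROC curve via pairwise ranking of positives vs. negatives."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative case")
    # Count positive/negative pairs ranked correctly; ties count as half.
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Hypothetical survey labels (1 = symptoms associated with RD reported)
# and hypothetical scores from two classifiers; not data from the study.
y_true = [1, 0, 1, 1, 0, 0, 1, 0]
scores_model_a = [0.9, 0.3, 0.6, 0.4, 0.5, 0.2, 0.7, 0.1]
scores_model_b = [0.6, 0.5, 0.4, 0.7, 0.6, 0.3, 0.5, 0.4]

auc_a = auc(y_true, scores_model_a)
auc_b = auc(y_true, scores_model_b)
print(f"model A AUC = {auc_a:.3f}, model B AUC = {auc_b:.3f}")
```

An AUC of 0.5 corresponds to random ranking and 1.0 to perfect separation, so the model with the higher value ranks cases better on the evaluated sample.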
Different studies have been carried out to evaluate the progress made by countries and cities towards achieving sustainability and to compare its evolution. However, the micro-territorial level, which encompasses a community perspective, has not been examined through a comprehensive method for forecasting sustainability categories with machine learning tools. This study aims to establish a method to forecast the sustainability levels of an urban ecosystem through supervised modeling. To this end, it was necessary to establish a set of indicators that characterize the dimensions of sustainable development, consistent with the Sustainable Development Goals. Normalizing the data and combining it into the different dimensions made it possible to identify the sustainability level of the urban zone for each year from 2009 to 2017. The resulting information was the basis for the supervised classification. It was found that the sustainability level in the micro-territory improved from a low level in 2009 to a medium level in the subsequent years. The sustainability levels of the zone were forecast using decision trees, neural networks, and support vector machines, with 70% of the data used to train the machine learning tools and the remaining 30% used for validation. According to the performance metrics, decision trees outperformed the other two tools.
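The labeling and splitting steps described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the study's procedure: the abstract does not name the normalization technique, the aggregation rule, or the level thresholds, so min-max scaling, equal-weight averaging, and cut points at 1/3 and 2/3 are assumed here, and all indicator values and names are hypothetical.

```python
import random


def min_max(values):
    """Scale a series to [0, 1] (assumed normalization technique)."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


def level(score, low_cut=1 / 3, high_cut=2 / 3):
    """Map a composite score in [0, 1] to a category (assumed thresholds)."""
    if score < low_cut:
        return "low"
    if score < high_cut:
        return "medium"
    return "high"


def split_70_30(rows, seed=0):
    """Shuffle rows and split them 70% training / 30% validation."""
    rng = random.Random(seed)
    shuffled = list(rows)
    rng.shuffle(shuffled)
    cut = round(0.7 * len(shuffled))
    return shuffled[:cut], shuffled[cut:]


# Hypothetical yearly values, one indicator per dimension.
years = [2009, 2010, 2011, 2012]
indicators = {
    "environmental": [40.0, 35.0, 30.0, 20.0],  # e.g. a pollutant level
    "social": [0.60, 0.65, 0.70, 0.80],
    "economic": [10.0, 12.0, 11.0, 14.0],
}
lower_is_better = {"environmental"}

# Normalize each dimension, inverting those where lower values are better.
normalized = {}
for name, series in indicators.items():
    scaled = min_max(series)
    if name in lower_is_better:
        scaled = [1.0 - v for v in scaled]
    normalized[name] = scaled

# Equal-weight average across dimensions (assumed aggregation rule).
composite = [
    sum(normalized[name][i] for name in normalized) / len(normalized)
    for i in range(len(years))
]
levels = [level(score) for score in composite]

# The labeled rows would then feed a supervised classifier.
rows = list(zip(years, composite, levels))
train, validation = split_70_30(rows)
```

With real data, `train` would be used to fit each classifier and `validation` to compute the performance metrics used to compare them.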