Landslide risk is increasing personal losses and material damage in more and more areas of the world. These natural disasters are related to geological and extreme meteorological phenomena (e.g., earthquakes, hurricanes) occurring in regions that have already suffered similar natural catastrophes. Therefore, to mitigate landslide risk effectively, new methodologies must better identify and understand landslide hazards so that they can be properly managed. Among these methodologies, those based on landslide susceptibility assessment improve the prediction of the areas where a landslide is most likely to occur. In recent years, much research has used machine learning algorithms to assess susceptibility from different sources of information, such as remote sensing data, spatial databases, or geological catalogues. This study presents the first attempt to develop a methodology based on an automatic machine learning (AutoML) framework. These frameworks are intended to facilitate the development of machine learning models, so that researchers can focus on data analysis. The methodology is tested and validated in the central and southern region of Guerrero (Mexico), where we compare the performance of 16 machine learning algorithms. The best result is achieved by extra trees, with an area under the curve (AUC) of 0.983. This methodology yields better results than other similar methods because using an AutoML framework allows researchers to focus on the treatment of the data, to better understand the input variables, and to acquire greater knowledge about the processes involved in landslides.
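As a rough illustration of how candidate models can be ranked by AUC, the sketch below computes the AUC as the fraction of (positive, negative) sample pairs that a model orders correctly, counting ties as half. The function names `auc` and `rank_models`, the model names, and the toy labels and scores are all hypothetical; they are not the study's data, its 16 algorithms, or its AutoML framework.

```python
def auc(labels, scores):
    """AUC as the probability that a positive sample outscores a negative one."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative sample")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0      # pair ordered correctly
            elif p == n:
                wins += 0.5      # tie counts as half
    return wins / (len(pos) * len(neg))


def rank_models(labels, model_scores):
    """Return (model name, AUC) pairs sorted from best to worst AUC."""
    results = [(name, auc(labels, s)) for name, s in model_scores.items()]
    return sorted(results, key=lambda item: item[1], reverse=True)


# Hypothetical toy data: 1 = landslide cell, 0 = stable cell.
labels = [1, 1, 0, 0, 1, 0]
model_scores = {
    "model_a": [0.9, 0.8, 0.3, 0.2, 0.7, 0.1],
    "model_b": [0.9, 0.4, 0.5, 0.2, 0.7, 0.1],
}
ranking = rank_models(labels, model_scores)
```

The pairwise loop is quadratic in the number of samples, which is fine for a small illustration; a rank-based computation would be the usual choice for large susceptibility maps.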
Several performance metrics are currently available to evaluate Machine Learning (ML) models in classification problems. ML models are usually assessed using a single measure because it facilitates the comparison between several models. However, there is no silver bullet, since each performance metric emphasizes a different aspect of the classification. Thus, the choice depends on the particular requirements and characteristics of the problem. An additional difficulty arises in multi-class classification, since most of the well-known metrics are only directly applicable to binary classification problems. In this paper, we propose the General Performance Score (GPS), a methodological approach to build performance metrics for binary and multi-class classification problems. The basic idea behind GPS is to combine a set of individual metrics, penalising low values in any of them. Thus, users can combine several performance metrics that are relevant to their particular problem, according to their preferences, and obtain a conservative combination. Different GPS-based performance metrics are compared with alternatives in classification problems using real and simulated datasets. The metrics built using the proposed method improve the stability and explainability of the usual performance metrics. Finally, the GPS brings benefits in both new research lines and practical usage, where performance metrics tailored to each particular problem are considered.
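The idea of combining metrics while penalising low values can be sketched with a harmonic mean. This is an assumption made for illustration: the text above does not give the GPS formula, and the function name `conservative_score` is hypothetical. A harmonic mean is pulled towards its smallest input, so one poor metric drags the combined score down; applied to precision and recall alone, it is the familiar F1 score.

```python
def conservative_score(metrics):
    """Harmonic mean of metric values in [0, 1]; low values dominate the result.

    Illustrative stand-in for a GPS-style combination, not the paper's definition.
    """
    values = list(metrics)
    if not values:
        raise ValueError("at least one metric value is required")
    if any(v < 0.0 or v > 1.0 for v in values):
        raise ValueError("metric values must lie in [0, 1]")
    if any(v == 0.0 for v in values):
        return 0.0  # a zero in any metric collapses the combined score
    return len(values) / sum(1.0 / v for v in values)


# Hypothetical metric values for one classifier.
balanced = conservative_score([0.8, 0.8, 0.8])
unbalanced = conservative_score([1.0, 1.0, 0.4])
```

Both example inputs have the same arithmetic mean (0.8), yet the unbalanced one receives a lower combined score, which is the conservative behaviour the abstract describes.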