Political campaigns in Brazilian elections are financed mostly with public money. Every candidate must submit detailed accountability reports to the legal authorities, which have to be analyzed within a short time frame in search of possible fraud or suspicious transactions. In this work we compiled a real data set from the 2016 Brazilian elections for all city councils in São Paulo state and used it to propose a framework for data segmentation analysis and validation. An exploratory data analysis is performed to determine the feature distributions and to define the required feature pre-processing tasks. A clustering analysis using the DBSCAN method is applied to a subset of the original data, focusing on segmenting the spending data on contracts with car fuel providers and on detecting potential outliers. Three clusters were identified, and a ridge regression model was used to evaluate which features were most important in defining the clusters. One cluster was related to candidates who received zero votes, and the remaining two distinguished suppliers according to whether or not their contracts were almost exclusively related to candidate spending on car fuel. The hyperparameters of the clustering analysis were validated using a bootstrap method, and a null hypothesis of randomness in the data set structure was rejected using a Monte Carlo approach.
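The validation steps named in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the data are synthetic two-dimensional points, the `eps` and `min_samples` values are arbitrary, the DBSCAN routine is a small standard-library reimplementation, and the helper names (`bootstrap_stability`, `monte_carlo_p_value`) are hypothetical. The choice of test statistic (fraction of points assigned to a cluster) and of null model (uniform points in the bounding box of the data) is likewise an assumption made for illustration, and the ridge regression step is omitted.

```python
import math
import random

NOISE = -1

def dbscan(points, eps, min_samples):
    """Plain DBSCAN; returns one label per point, with -1 marking noise."""
    n = len(points)
    # Precompute eps-neighbourhoods (each point counts as its own neighbour).
    neighbours = [
        [j for j in range(n) if math.dist(points[i], points[j]) <= eps]
        for i in range(n)
    ]
    labels = [None] * n
    cluster = 0
    for i in range(n):
        if labels[i] is not None:
            continue
        if len(neighbours[i]) < min_samples:
            labels[i] = NOISE  # may later be claimed as a border point
            continue
        labels[i] = cluster
        queue = list(neighbours[i])
        while queue:
            j = queue.pop()
            if labels[j] == NOISE:
                labels[j] = cluster  # border point: joins, but is not expanded
            if labels[j] is not None:
                continue
            labels[j] = cluster
            if len(neighbours[j]) >= min_samples:
                queue.extend(neighbours[j])  # core point: keep growing
        cluster += 1
    return labels

def n_clusters(labels):
    return len(set(labels) - {NOISE})

def clustered_fraction(labels):
    return sum(label != NOISE for label in labels) / len(labels)

def make_synthetic(rng):
    """Three Gaussian blobs plus a few scattered points (made-up data)."""
    centres = [(0.0, 0.0), (5.0, 5.0), (0.0, 5.0)]
    pts = [(rng.gauss(cx, 0.3), rng.gauss(cy, 0.3))
           for cx, cy in centres for _ in range(40)]
    pts += [(rng.uniform(-2, 7), rng.uniform(-2, 7)) for _ in range(5)]
    return pts

def bootstrap_stability(points, eps, min_samples, n_boot, rng):
    """Share of bootstrap resamples that recover the full-data cluster count."""
    target = n_clusters(dbscan(points, eps, min_samples))
    hits = 0
    for _ in range(n_boot):
        sample = rng.choices(points, k=len(points))  # resample with replacement
        hits += n_clusters(dbscan(sample, eps, min_samples)) == target
    return hits / n_boot

def monte_carlo_p_value(points, eps, min_samples, n_sim, rng):
    """Compare the observed clustered fraction with structureless uniform data."""
    observed = clustered_fraction(dbscan(points, eps, min_samples))
    lows = [min(c) for c in zip(*points)]
    highs = [max(c) for c in zip(*points)]
    exceed = 0
    for _ in range(n_sim):
        null = [tuple(rng.uniform(lo, hi) for lo, hi in zip(lows, highs))
                for _ in points]
        exceed += clustered_fraction(dbscan(null, eps, min_samples)) >= observed
    return (1 + exceed) / (1 + n_sim)

rng = random.Random(0)
points = make_synthetic(rng)
EPS, MIN_SAMPLES = 0.5, 5  # illustrative values, not taken from the paper
labels = dbscan(points, EPS, MIN_SAMPLES)
stability = bootstrap_stability(points, EPS, MIN_SAMPLES, n_boot=30, rng=rng)
p_value = monte_carlo_p_value(points, EPS, MIN_SAMPLES, n_sim=99, rng=rng)
print(n_clusters(labels), round(stability, 2), p_value)
```

A high stability share suggests that the chosen hyperparameters yield the same segmentation under resampling, and a small p-value indicates that uniform random data rarely looks as clustered as the observed data. On real spending data the features would first need the scaling and pre-processing that the exploratory analysis identifies.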