Marcos M. Raimundo scite author profile

Work requiring taxonomic knowledge faces several challenges, such as the arduous identification of many taxa and an insufficient number of taxonomists to identify the large number of collected organisms. Machine learning tools, particularly convolutional neural networks (CNNs), are therefore well suited to automatically generating high-performance classifiers from available data. Supported by the image datasets available at the largest online database on ant biology, AntWeb (www.antweb.org), we propose here an ensemble of CNNs to identify ant genera directly from the head, profile and dorsal perspectives of ant images. Transfer learning is also considered to improve the individual performance of the CNN classifiers. The performance achieved by the classifiers is diverse enough to promote a reduction in the overall classification error when they are combined in an ensemble, achieving an accuracy rate of over 80% on top-1 classification and an accuracy of over 90% on top-3 classification.
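As a rough illustration of how per-view classifiers might be combined and scored, the sketch below averages class probabilities from three views and computes top-k accuracy. The averaging rule, the function names, and the probability values are all assumptions made for illustration; the abstract does not state which combination rule the authors used.

```python
def ensemble_average(view_probs):
    """Average per-class probabilities across views (one list per view)."""
    n_views = len(view_probs)
    n_classes = len(view_probs[0])
    return [sum(p[c] for p in view_probs) / n_views for c in range(n_classes)]


def top_k(probs, k):
    """Indices of the k highest-probability classes, best first."""
    return sorted(range(len(probs)), key=lambda c: probs[c], reverse=True)[:k]


def top_k_accuracy(all_probs, labels, k):
    """Fraction of samples whose true label is among the k best classes."""
    hits = sum(1 for probs, y in zip(all_probs, labels) if y in top_k(probs, k))
    return hits / len(labels)


# Hypothetical softmax outputs of three CNNs for a single specimen
# (three candidate genera); the numbers are invented for illustration.
head = [0.5, 0.3, 0.2]
profile = [0.2, 0.6, 0.2]
dorsal = [0.3, 0.5, 0.2]

combined = ensemble_average([head, profile, dorsal])
best_genus = top_k(combined, 1)[0]
```

Here the head view alone would pick class 0, while the averaged ensemble picks class 1, which is the kind of disagreement between views that an ensemble can exploit.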

CriPAV: Street-Level Crime Patterns Analysis and Visualization

Garcia-Zanabria

IEEE Trans. Visual. Comput. Graphics

Poco

et al. 2022

Extracting and analyzing crime patterns in big cities is a challenging spatiotemporal problem. The difficulty of the problem is linked to two main factors: the sparse nature of crime activity and its spread over large spatial areas. Sparseness prevents most crime time series comparison methods from working properly, while the handling of large urban areas tends to render the computational costs of such methods impractical. Visualizing the different patterns hidden in crime time series data is another issue in this context, mainly due to the number of patterns that can show up in the time series analysis. In this paper, we present a new methodology to deal with the issues above, enabling the analysis of spatiotemporal crime patterns at a street level of detail. Our approach is made up of two main components designed to handle the spatial sparsity and spreading of crimes in large areas of the city. The first component relies on a stochastic mechanism from which one can visually analyze probable × intensive crime hotspots. Such analysis reveals important patterns that cannot be observed in the typical intensity-based hotspot visualization. The second component builds upon a deep learning mechanism to embed crime time series in Cartesian space. From the embedding, one can identify spatial locations where the crime time series have similar behavior. The two components have been integrated into a web-based analytical tool called CriPAV (Crime Pattern Analysis and Visualization), which enables a global as well as a street-level view of crime patterns. Developed in close collaboration with domain experts, CriPAV has been validated through a set of case studies with real crime data in São Paulo, Brazil. The provided experiments and case studies reveal the effectiveness of CriPAV in identifying patterns such as locations where crimes are not intense but highly probable to occur, as well as locations that are far apart from each other but bear similar crime patterns.

An extension of the non-inferior set estimation algorithm for many objectives

European Journal of Operational Research

Ferreira

Zuben

2020

Mining Pareto-optimal counterfactual antecedents with a branch-and-bound model-agnostic algorithm

Nonato

Poco

2022

Data Min Knowl Disc

Mining counterfactual antecedents has become a valuable tool to discover knowledge and explain machine learning models. It consists of generating synthetic samples from an original sample to achieve the desired outcome in a machine learning model, thus helping to understand the prediction. An insightful methodology would explore a broader set of counterfactual antecedents to reveal multiple possibilities while operating on any classifier. Thus, we create a tree-based search that requires monotonicity from the objective functions (a.k.a. cost functions); it allows pruning branches that will not improve the objective functions. Since monotonicity is only required for the objective function, this method can be used for any family of classifiers (e.g., linear models, neural networks, decision trees). However, additional classifier properties speed up the tree search when it foresees branches that will not result in feasible actions. Moreover, the proposed optimization generates a diverse set of Pareto-optimal counterfactual antecedents by relying on multi-objective concepts. The results show an algorithm with working guarantees that enumerates a wide range of counterfactual antecedents. It helps the decision-maker understand the machine learning decision and find alternatives to achieve the desired outcome. The user can inspect these multiple counterfactual antecedents to find the most suitable one and gain a broader understanding of the prediction.
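The pruning idea described above can be illustrated with a toy tree search. This is a minimal sketch under stated assumptions, not the authors' algorithm: the two objectives (number of changed features and total non-negative change cost), the `options` encoding, and the stand-in classifier are all hypothetical. Because both objectives can only grow as more features are changed, any partial action already dominated by a found counterfactual can be discarded together with its whole subtree.

```python
def dominates(a, b):
    """True if a is no worse than b in every objective and better in one."""
    return all(x <= y for x, y in zip(a, b)) and any(x < y for x, y in zip(a, b))


def search(sample, options, predict, target):
    """Enumerate Pareto-optimal counterfactuals of `sample`.

    options[i] lists (new_value, cost) alternatives for feature i;
    costs are assumed non-negative so both objectives are monotone.
    """
    front = []  # list of (objectives, candidate) pairs

    def visit(i, candidate, n_changed, cost):
        objs = (n_changed, cost)
        # Monotonicity: descendants never have lower objectives, so a
        # dominated node cannot lead to a non-dominated solution.
        if any(dominates(f, objs) for f, _ in front):
            return
        if predict(candidate) == target:
            # Keep only solutions not dominated by the new one.
            front[:] = [(f, c) for f, c in front if not dominates(objs, f)]
            front.append((objs, tuple(candidate)))
            return  # further changes would only add cost
        if i == len(sample):
            return
        visit(i + 1, candidate, n_changed, cost)  # leave feature i as is
        for value, c in options[i]:
            changed = candidate[:i] + [value] + candidate[i + 1:]
            visit(i + 1, changed, n_changed + 1, cost + c)

    visit(0, list(sample), 0, 0.0)
    return front


# Hypothetical black-box classifier: positive when the features sum to >= 4.
predict = lambda x: int(x[0] + x[1] >= 4)
options = [[(2, 1.0), (3, 3.0)], [(2, 1.0), (3, 5.0)]]
front = search([1, 1], options, predict, target=1)
```

In this toy case the search returns two trade-offs: changing only the first feature at a higher cost, or changing both features at a lower total cost. The classifier is only queried as a black box, which mirrors the model-agnostic property claimed in the abstract.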

Mining Pareto-Optimal Counterfactual Antecedents With A Branch-And-Bound Model-Agnostic Algorithm

Nonato

Poco

2021

Preprint

Mining counterfactual antecedents has become a valuable tool to discover knowledge and explain machine learning models. It consists of generating synthetic samples from an original sample to achieve the desired outcome in a machine learning model, thus helping to understand the prediction. An insightful methodology would explore a broader set of counterfactual antecedents to reveal multiple possibilities while operating on any classifier. Thus, we create a tree-based search that requires monotonicity from the objective functions (a.k.a. cost functions); it allows pruning branches that will not improve the objective functions. Since monotonicity is only required for the objective function, this method can be used for any family of classifiers (e.g., linear models, neural networks, decision trees). However, additional classifier properties speed up the tree search when it foresees branches that will not result in feasible actions. Moreover, the proposed optimization generates a diverse set of Pareto-optimal counterfactual antecedents by relying on multi-objective concepts. The results show an algorithm with working guarantees that enumerates a wide range of counterfactual antecedents. It helps the decision-maker understand the machine learning decision and find alternatives to achieve the desired outcome. The user can inspect these multiple counterfactual antecedents to find the most suitable one and gain a broader understanding of the prediction.

Enforcing fairness using ensemble of diverse Pareto-optimal models

Guardieiro

Raimundo

Poco

2023

Data Min Knowl Disc

Investigating multiobjective methods in multitask classification

Zuben

2018

Multi-criteria analysis involving Pareto-optimal misclassification tradeoffs on imbalanced datasets

Zuben

2020