Works requiring taxonomic knowledge face several challenges, such as arduous identification of many taxa and an insufficient number of taxonomists to identify a great deal of collected organisms. Machine learning tools, particularly convolutional neural networks (CNNs), are then welcome to automatically generate high-performance classifiers from available data. Supported by the image datasets available at the largest online database on ant biology, the AntWeb (www.antweb.org), we propose here an ensemble of CNNs to identify ant genera directly from the head, profile and dorsal perspectives of ant images. Transfer learning is also considered to improve the individual performance of the CNN classifiers. The performance achieved by the classifiers is diverse enough to promote a reduction in the overall classification error when they are combined in an ensemble, achieving an accuracy rate of over 80% on top-1 classification and an accuracy of over 90% on top-3 classification.
Extracting and analyzing crime patterns in big cities is a challenging spatiotemporal problem. The hardness of the problem is linked to two main factors, the sparse nature of the crime activity and its spread in large spatial areas. Sparseness hampers most time series (crime time series) comparison methods from working properly, while the handling of large urban areas tends to render the computational costs of such methods impractical. Visualizing different patterns hidden in crime time series data is another issue in this context, mainly due to the number of patterns that can show up in the time series analysis. In this paper, we present a new methodology to deal with the issues above, enabling the analysis of spatiotemporal crime patterns in a street-level of detail. Our approach is made up of two main components designed to handle the spatial sparsity and spreading of crimes in large areas of the city. The first component relies on a stochastic mechanism from which one can visually analyze probable×intensive crime hotspots. Such analysis reveals important patterns that can not be observed in the typical intensity-based hotspot visualization. The second component builds upon a deep learning mechanism to embed crime time series in Cartesian space. From the embedding, one can identify spatial locations where the crime time series have similar behavior. The two components have been integrated into a web-based analytical tool called CriPAV (Crime Pattern Analysis and Visualization), which enables global as well as a street-level view of crime patterns. Developed in close collaboration with domain experts, CriPAV has been validated through a set of case studies with real crime data in S ão Paulo -Brazil. The provided experiments and case studies reveal the effectiveness of CriPAV in identifying patterns such as locations where crimes are not intense but highly probable to occur as well as locations that are far apart from each other but bear similar crime patterns.
Mining counterfactual antecedents became a valuable tool to discover knowledge and explain machine learning models. It consists of generating synthetic samples from an original sample to achieve the desired outcome in a machine learning model thus helping to understand the prediction. An insightful methodology would explore a broader set of counterfactual antecedents to reveal multiple possibilities while operating on any classifier. Thus, we create a treebased search that requires monotonicity from the objective functions (a.k.a. cost functions); it allows pruning branches that will not improve the objective functions. Since monotonicity is only required for the objective function, this method can be used for any family of classifiers (e.g., linear models, neural networks, decision trees). However, additional classifier properties speed up the tree-search when it foresees branches that will not result in feasible actions. Moreover, the proposed optimization generates a diverse set of Pareto-optimal counterfactual antecedents by relying on multi-objective concepts. The results show an algorithm with working guarantees that enumerates a wide range of counterfactual antecedents. It helps the decision-maker understand the machine learning decision and finds alternatives to achieve the desired outcome. The user can inspect these multiple counterfactual antecedents to find the most suitable one and have a broader understanding of the prediction.
Mining counterfactual antecedents became a valuable tool to discover knowledge and explain machine learning models. It consists of generating synthetic samples from an original sample to achieve the desired outcome in a machine learning model thus helping to understand the prediction. An insightful methodology would explore a broader set of counterfactual antecedents to reveal multiple possibilities while operating on any classifier. Thus, we create a tree-based search that requires monotonicity from the objective functions (a.k.a. cost functions); it allows pruning branches that will not improve the objective functions. Since monotonicity is only required for the objective function, this method can be used for any family of classifiers (e.g., linear models, neural networks, decision trees). However, additional classifier properties speed up the tree-search when it foresees branches that will not result in feasible actions. Moreover, the proposed optimization generates a diverse set of Pareto-optimal counterfactual antecedents by relying on multi-objective concepts. The results show an algorithm with working guarantees that enumerates a wide range of counterfactual antecedents. It helps the decision-maker understand the machine learning decision and finds alternatives to achieve the desired outcome. The user can inspect these multiple counterfactual antecedents to find the most suitable one and have a broader understanding of the prediction.
scite is a Brooklyn-based organization that helps researchers better discover and understand research articles through Smart Citations–citations that display the context of the citation and describe whether the article provides supporting or contrasting evidence. scite is used by students and researchers from around the world and is funded in part by the National Science Foundation and the National Institute on Drug Abuse of the National Institutes of Health.