Recently there has been a growing concern in academia, industrial research labs and the mainstream commercial media about the phenomenon dubbed machine bias, where trained statistical models, unbeknownst to their creators, grow to reflect controversial societal asymmetries, such as gender or racial bias. A significant number of Artificial Intelligence tools have recently been suggested to be harmfully biased towards some minority, with reports of racist criminal behavior predictors, Apple's iPhone X failing to differentiate between two distinct Asian people, and the now infamous case of Google Photos mistakenly classifying black people as gorillas. Although a systematic study of such biases can be difficult, we believe that automated translation tools can be exploited through gender-neutral languages to yield a window into the phenomenon of gender bias in AI. In this paper, we start with a comprehensive list of job positions from the U.S. Bureau of Labor Statistics (BLS) and use it to build sentences of the form "He/She is an Engineer" (where "Engineer" is replaced by the job position of interest) in 12 different gender-neutral languages such as Hungarian, Chinese and Yoruba, among others. We translate these sentences into English using the Google Translate API and collect statistics about the frequency of female, male and gender-neutral pronouns in the translated output. We then show that Google Translate exhibits a strong tendency towards male defaults, in particular for fields typically associated with unbalanced gender distributions or stereotypes, such as STEM (Science, Technology, Engineering and Mathematics) jobs. We also compare these statistics with BLS data on the frequency of female participation in each job position, showing that Google Translate fails to reproduce the real-world distribution of female workers. In summary, we provide experimental evidence that, even if one does not in principle expect a 50:50 pronominal gender distribution, Google Translate yields male defaults much more frequently than would be expected from demographic data alone. We believe that our study can shed further light on the phenomenon of machine bias, and we hope it will ignite a debate about the need to augment current statistical translation tools with debiasing techniques, which can already be found in the scientific literature.
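For concreteness, here is a minimal sketch of the pronoun-counting pipeline described above. The `translate_to_english` hook, the single Hungarian template ("ő" is the gender-neutral third-person pronoun) and the three example jobs are illustrative placeholders: the study itself uses the Google Translate API, 12 languages and the full BLS occupation list.

```python
import re
from collections import Counter

# Hypothetical translation hook: in the study this role is played by the
# Google Translate API; any client that returns the English translation
# of `text` can be plugged in here.
def translate_to_english(text: str, source_lang: str) -> str:
    raise NotImplementedError("plug in a translation API client here")

# Illustrative Hungarian template and jobs (engineer, nurse, teacher).
TEMPLATE_HU = "ő egy {job}."  # "he/she is a(n) {job}"
JOBS = ["mérnök", "ápoló", "tanár"]

def pronoun_counts(jobs, template, source_lang="hu"):
    """Translate each templated sentence and tally the pronoun gender."""
    counts = Counter()
    for job in jobs:
        english = translate_to_english(template.format(job=job), source_lang)
        tokens = re.findall(r"[a-z']+", english.lower())
        if "he" in tokens:
            counts["male"] += 1
        elif "she" in tokens:
            counts["female"] += 1
        elif "they" in tokens or "it" in tokens:
            counts["neutral"] += 1
        else:
            counts["other"] += 1
    return counts
```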
Graph Neural Networks (GNNs) are a promising technique for bridging differentiable programming and combinatorial domains. GNNs employ trainable modules which can be assembled in different configurations that reflect the relational structure of each problem instance. In this paper, we show that GNNs can learn to solve, with very little supervision, the decision variant of the Traveling Salesperson Problem (TSP), a highly relevant NP-Complete problem. Our model is trained to function as an effective message-passing algorithm in which edges (embedded with their weights) communicate with vertices for a number of iterations, after which the model is asked to decide whether a route with cost < C exists. We show that such a network can be trained with sets of dual examples: given the optimal tour cost C*, we produce one decision instance with target cost x% smaller and one with target cost x% larger than C*. We obtain 80% accuracy when training with -2%/+2% deviations, and the same trained model generalizes to more relaxed deviations with increasing performance. We also show that the model is capable of generalizing to larger problem sizes. Finally, we provide a method for predicting the optimal route cost within a 2% deviation from the ground truth. In summary, our work shows that Graph Neural Networks are powerful enough to solve NP-Complete problems which combine symbolic and numeric data.
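The dual-example construction, and one plausible realization of the cost-prediction step, can be sketched as follows. Here `accepts(instance, c)` stands for the trained GNN's yes/no answer to "does a route with cost < c exist?" and is a hypothetical wrapper; the bisection is a natural reading of how a decision model yields a cost estimate, not a verbatim reproduction of the paper's method.

```python
def dual_instances(instance, optimal_cost, deviation=0.02):
    """Dual training examples: with target cost (1 - x) * C* no qualifying
    tour exists (label 0); with (1 + x) * C* the optimal tour qualifies
    (label 1)."""
    return [
        (instance, (1 - deviation) * optimal_cost, 0),
        (instance, (1 + deviation) * optimal_cost, 1),
    ]

def estimate_optimal_cost(accepts, instance, lo, hi, tol=0.02):
    """Bisect on the target cost using the trained decision model.
    The bracket [lo, hi] is assumed to contain the optimal cost C*."""
    while (hi - lo) / hi > tol:
        mid = (lo + hi) / 2
        if accepts(instance, mid):
            hi = mid  # model says a cheaper tour exists below mid
        else:
            lo = mid  # no tour below mid, so C* lies above it
    return (lo + hi) / 2
```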
Complex networks are ubiquitous across several Computer Science domains. Centrality measures are an important mechanism for uncovering vital elements of complex networks. However, these metrics have high computational costs and requirements that hinder their application in large real-world networks. In this tutorial, we explain how neural network learning algorithms can make these metrics applicable to complex networks of arbitrary size. Moreover, the tutorial describes how to identify the best neural network configuration for such learning tasks and presents an easy way to generate and acquire training data. We do so by means of a general methodology, using complex network models adaptable to any application. We show that the regression model produced by the neural network successfully approximates the metric values and is therefore a robust, effective alternative for real-world applications. The methodology and the proposed machine learning model require only a fraction of the time needed by other approximation algorithms, which is crucial in complex network applications.
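A minimal sketch of this pipeline in Python, assuming networkx and scikit-learn, betweenness centrality as the expensive target metric, and two cheap local features as inputs; the actual feature set, target metric, network models and sizes are configurable choices that the tutorial discusses.

```python
import networkx as nx
import numpy as np
from sklearn.neural_network import MLPRegressor

def vertex_samples(graph):
    """Pair cheap per-vertex features with the expensive target metric."""
    target = nx.betweenness_centrality(graph)  # expensive to compute exactly
    degree = nx.degree_centrality(graph)       # cheap local features
    clustering = nx.clustering(graph)
    X = [[degree[v], clustering[v]] for v in graph]
    y = [target[v] for v in graph]
    return X, y

# Training data from synthetic network models, as the tutorial suggests.
X, y = [], []
for seed in range(20):
    for g in (nx.barabasi_albert_graph(200, 3, seed=seed),
              nx.erdos_renyi_graph(200, 0.05, seed=seed)):
        Xi, yi = vertex_samples(g)
        X += Xi
        y += yi

model = MLPRegressor(hidden_layer_sizes=(32, 32), max_iter=2000)
model.fit(np.array(X), np.array(y))
# model.predict now approximates betweenness from cheap features alone,
# so it can be applied to networks too large for exact computation.
```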
Measures of complex network analysis, such as vertex centrality, have the potential to unveil existing network patterns and behaviors. They contribute to the understanding of networks and their components by analyzing their structural properties, which makes them useful in several computer science domains and applications. Unfortunately, there is a large number of distinct centrality measures and little is known about their common characteristics in practice. By means of an empirical analysis, we aim at a clear understanding of the main centrality measures available, unveiling their similarities and differences across a large number of distinct social networks. Our experiments show that the vertex centrality measures known as information, eigenvector, subgraph, walk betweenness and betweenness can distinguish vertices in all kinds of networks with a granularity performance of 95%, while other metrics achieve considerably lower results. In addition, we demonstrate that several pairs of metrics evaluate vertices in a very similar way, i.e., their correlation coefficients are above 0.7. This was unexpected, considering that each metric rests on quite distinct theoretical and algorithmic foundations. Our work thus contributes towards the development of a methodology for principled network analysis and evaluation.
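As an illustration of the kind of pairwise comparison involved, the following sketch computes Pearson correlations between a few centrality measures on a single example network (Zachary's karate club, bundled with networkx); the study itself spans many social networks and additional measures such as information, subgraph and walk betweenness centrality.

```python
import networkx as nx
import numpy as np

G = nx.karate_club_graph()
measures = {
    "degree": nx.degree_centrality(G),
    "betweenness": nx.betweenness_centrality(G),
    "eigenvector": nx.eigenvector_centrality(G, max_iter=1000),
    "closeness": nx.closeness_centrality(G),
}

# One value per vertex, in a fixed vertex order, for each measure.
vectors = {name: np.array([vals[v] for v in G]) for name, vals in measures.items()}

names = list(vectors)
for i, a in enumerate(names):
    for b in names[i + 1:]:
        r = np.corrcoef(vectors[a], vectors[b])[0, 1]
        flag = "  <- strongly correlated (r > 0.7)" if r > 0.7 else ""
        print(f"{a} vs {b}: r = {r:.2f}{flag}")
```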