Fairness is an increasingly important concern as machine learning models are used to support decision making in high-stakes applications such as mortgage lending, hiring, and prison sentencing. This paper introduces a new open source Python toolkit for algorithmic fairness, AI Fairness 360 (AIF360), released under an Apache v2.0 license (https://github.com/ibm/aif360). The main objectives of this toolkit are to facilitate the transition of fairness research algorithms to use in an industrial setting and to provide a common framework for fairness researchers to share and evaluate algorithms. The package includes a comprehensive set of fairness metrics for datasets and models, explanations for these metrics, and algorithms to mitigate bias in datasets and models. It also includes an interactive Web experience (https://aif360.mybluemix.net) that provides a gentle introduction to the concepts and capabilities for line-of-business users, as well as extensive documentation, usage guidance, and industry-specific tutorials to enable data scientists and practitioners to incorporate the most appropriate tool for their problem into their work products. The architecture of the package has been engineered to conform to a standard paradigm used in data science, thereby further improving usability for practitioners. This architectural design and its abstractions enable researchers and developers to extend the toolkit with their new algorithms and improvements, and to use it for performance benchmarking. A built-in testing infrastructure maintains code quality.
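To make the idea of a dataset-level fairness metric concrete, the sketch below computes two widely used group fairness measures, statistical parity difference and disparate impact, in plain Python. It is an illustration only and does not use or reproduce the AIF360 API; the function names, the binary label encoding (1 = favorable outcome), and the toy data are all assumptions made for this example.

```python
from typing import Hashable, Sequence


def favorable_rate(labels: Sequence[int], groups: Sequence[Hashable],
                   group_value: Hashable) -> float:
    """Fraction of favorable outcomes (label == 1) within one group."""
    selected = [y for y, g in zip(labels, groups) if g == group_value]
    if not selected:
        raise ValueError(f"no samples found for group {group_value!r}")
    return sum(selected) / len(selected)


def statistical_parity_difference(labels, groups, unprivileged, privileged) -> float:
    """Favorable rate of the unprivileged group minus that of the privileged group.

    A value of 0 means both groups receive favorable outcomes at the same rate.
    """
    return (favorable_rate(labels, groups, unprivileged)
            - favorable_rate(labels, groups, privileged))


def disparate_impact(labels, groups, unprivileged, privileged) -> float:
    """Ratio of the unprivileged favorable rate to the privileged favorable rate.

    A value of 1 means both groups receive favorable outcomes at the same rate.
    """
    privileged_rate = favorable_rate(labels, groups, privileged)
    if privileged_rate == 0:
        raise ZeroDivisionError("privileged group has no favorable outcomes")
    return favorable_rate(labels, groups, unprivileged) / privileged_rate


# Hypothetical toy data: group "a" is treated as privileged for illustration.
toy_labels = [1, 1, 1, 0, 1, 0, 0, 0]
toy_groups = ["a", "a", "a", "a", "b", "b", "b", "b"]

# Favorable rates are 0.75 for "a" and 0.25 for "b", so the difference is -0.5.
print(statistical_parity_difference(toy_labels, toy_groups, "b", "a"))
print(disparate_impact(toy_labels, toy_groups, "b", "a"))
```

A negative difference, or a ratio below 1, indicates that the unprivileged group receives the favorable outcome less often in this toy data.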
Accuracy is an important concern for suppliers of artificial intelligence (AI) services, but considerations beyond accuracy, such as safety (which includes fairness and explainability), security, and provenance, are also critical elements to engender consumers' trust in a service. Many industries use transparent, standardized, but often not legally required documents called supplier's declarations of conformity (SDoCs) to describe the lineage of a product along with the safety and performance testing it has undergone. SDoCs may be considered multi-dimensional fact sheets that capture and quantify various aspects of the product and its development to make it worthy of consumers' trust. Inspired by this practice, we propose FactSheets to help increase trust in AI services. We envision such documents to contain purpose, performance, safety, security, and provenance information to be completed by AI service providers for examination by consumers. We suggest a comprehensive set of declaration items tailored to AI and provide examples for two fictitious AI services in the appendix of the paper. * A. Olteanu's work was done while at IBM Research. Author is currently affiliated with Microsoft Research.
The goal of image stitching is to create natural-looking mosaics free of artifacts that may occur due to relative camera motion, illumination changes, and optical aberrations. In this paper, we propose a novel stitching method that uses a smooth stitching field over the entire target image, while accounting for all the local transformation variations. Computing the warp is fully automated and uses a combination of local homography and global similarity transformations, both of which are estimated with respect to the target. We mitigate the perspective distortion in the non-overlapping regions by linearizing the homography and slowly changing it to the global similarity. The proposed method is easily generalized to multiple images, and allows one to automatically obtain the best perspective in the panorama. It is also more robust to parameter selection, and hence more automated compared with state-of-the-art methods. The benefits of the proposed approach are demonstrated using a variety of challenging cases.
We propose a sparse representation approach for classifying different targets in Synthetic Aperture Radar (SAR) images. Unlike other feature-based approaches, the proposed method does not require explicit pose estimation or any preprocessing. The dictionary used in this setup is the collection of the normalized training vectors itself. Computing a sparse representation for the test data using this dictionary corresponds to finding a locally linear approximation with respect to the underlying class manifold. SAR images obtained from the Moving and Stationary Target Acquisition and Recognition (MSTAR) public database were used in the classification setup. Results show that the performance of the algorithm is superior to that of a support vector machine-based approach with similar assumptions. A significant reduction in complexity is obtained by reducing the dimensions of the data using random projections, at only a small loss in performance.
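The classification idea above can be sketched in a few lines of plain Python. The dictionary is the set of normalized training vectors, a sparse code for the test vector is computed over it, and the class whose atoms give the smallest reconstruction error is chosen. The abstract does not say which sparse solver or decision rule is used, so this sketch makes two assumptions: it uses greedy matching pursuit as the solver and a class-wise residual as the decision rule. All function names and the toy data are hypothetical.

```python
import math
import random


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def normalize(v):
    norm = math.sqrt(dot(v, v))
    if norm == 0:
        raise ValueError("cannot normalize a zero vector")
    return [x / norm for x in v]


def matching_pursuit(atoms, signal, n_iters):
    """Greedy sparse code of `signal` over unit-norm `atoms` (assumed solver)."""
    coeffs = [0.0] * len(atoms)
    residual = list(signal)
    for _ in range(n_iters):
        corr = [dot(atom, residual) for atom in atoms]
        j = max(range(len(atoms)), key=lambda i: abs(corr[i]))
        if abs(corr[j]) < 1e-12:  # residual is already explained
            break
        coeffs[j] += corr[j]
        residual = [r - corr[j] * a for r, a in zip(residual, atoms[j])]
    return coeffs


def classify(train_vectors, train_labels, test_vector, n_iters=5):
    """Pick the class whose atoms best reconstruct the test vector."""
    atoms = [normalize(v) for v in train_vectors]  # dictionary = training set
    signal = normalize(test_vector)
    coeffs = matching_pursuit(atoms, signal, n_iters)
    best_label, best_error = None, float("inf")
    for label in sorted(set(train_labels)):
        recon = [0.0] * len(signal)
        for c, atom, atom_label in zip(coeffs, atoms, train_labels):
            if atom_label == label:
                recon = [r + c * a for r, a in zip(recon, atom)]
        error = math.sqrt(sum((s - r) ** 2 for s, r in zip(signal, recon)))
        if error < best_error:
            best_label, best_error = label, error
    return best_label


def random_projection(vectors, out_dim, seed=0):
    """Project all vectors with one shared Gaussian random matrix."""
    rng = random.Random(seed)
    in_dim = len(vectors[0])
    matrix = [[rng.gauss(0.0, 1.0) / math.sqrt(out_dim) for _ in range(in_dim)]
              for _ in range(out_dim)]
    return [[dot(row, v) for row in matrix] for v in vectors]


# Hypothetical toy data: two classes living in different coordinates.
train = [[1, 0, 0, 0], [0.9, 0.1, 0, 0], [0, 0, 1, 0], [0, 0, 0.9, 0.1]]
labels = ["A", "A", "B", "B"]
print(classify(train, labels, [0.95, 0.05, 0, 0]))
```

To mimic the dimensionality reduction step, training and test vectors would be passed together through `random_projection` so that they share the same projection matrix, and `classify` would then run on the projected vectors.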
Topological data analysis is becoming a popular way to study high dimensional feature spaces without any contextual clues or assumptions. This paper concerns itself with one popular topological feature, which is the number of d-dimensional holes in the dataset, also known as the Betti-d number. The persistence of the Betti numbers over various scales is encoded into a persistence diagram (PD), which indicates the birth and death times of these holes as scale varies. A common way to compare PDs is by a point-to-point matching, which is given by the n-Wasserstein metric. However, a big drawback of this approach is the need to solve correspondence between points before computing the distance; for n points, the complexity grows according to O(n^3). Instead, we propose to use an entirely new framework built on Riemannian geometry that models PDs as 2D probability density functions that are represented in the square-root framework on a Hilbert Sphere. The resulting space is much more intuitive with closed form expressions for common operations. The distance metric is 1) correspondence-free and also 2) independent of the number of points in the dataset. The complexity of computing distance between PDs now grows according to O(K^2), for a K × K discretization of [0, 1]^2. This also enables the use of existing machinery in differential geometry towards statistical analysis of PDs such as computing the mean, geodesics, classification etc. We report competitive results with the Wasserstein metric, at a much lower computational load, indicating the favorable properties of the proposed approach. arXiv:1605.08912v1 [math.AT]
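A minimal sketch of the square-root representation follows, assuming the density is built by placing an isotropic Gaussian kernel on each (birth, death) point and evaluating it on a K × K grid over [0, 1]^2. The kernel, its bandwidth `sigma`, the grid size, and the function names are assumptions for illustration; the abstract only states that PDs are modeled as 2D probability density functions. Once the density is normalized to sum to one, its element-wise square root is a unit vector, and the geodesic distance on the unit sphere is the arc cosine of the inner product.

```python
import math


def pd_to_sqrt_density(points, grid_size=20, sigma=0.1):
    """Map a persistence diagram to the square root of a discretized density.

    points: iterable of (birth, death) pairs scaled to [0, 1]^2.
    Returns a flat list of K*K values with unit Euclidean norm.
    """
    points = list(points)
    if not points:
        raise ValueError("persistence diagram must contain at least one point")
    centers = [(i + 0.5) / grid_size for i in range(grid_size)]
    density = []
    for x in centers:
        for y in centers:
            # Assumed kernel: unnormalized isotropic Gaussian around each point.
            value = sum(
                math.exp(-((x - b) ** 2 + (y - d) ** 2) / (2.0 * sigma ** 2))
                for b, d in points
            )
            density.append(value)
    total = sum(density)
    # After normalization the density sums to 1, so its square root has norm 1.
    return [math.sqrt(v / total) for v in density]


def sphere_distance(psi_1, psi_2):
    """Geodesic distance between two unit vectors on the sphere."""
    inner = sum(a * b for a, b in zip(psi_1, psi_2))
    # Clamp to guard against rounding slightly outside [-1, 1].
    return math.acos(max(-1.0, min(1.0, inner)))


# Hypothetical diagrams with different numbers of points.
diagram_a = [(0.2, 0.6), (0.3, 0.9)]
diagram_b = [(0.7, 0.8)]
psi_a = pd_to_sqrt_density(diagram_a)
psi_b = pd_to_sqrt_density(diagram_b)
print(sphere_distance(psi_a, psi_b))
```

Note that `sphere_distance` touches each of the K^2 grid cells once and needs no matching between points, so the two diagrams may contain different numbers of points.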
In complex visual recognition tasks it is typical to adopt multiple descriptors that describe different aspects of the images, in order to obtain improved recognition performance. Descriptors that have diverse forms can be fused into a unified feature space in a principled manner using kernel methods. Sparse models that generalize well to the test data can be learned in the unified kernel space, and appropriate constraints can be incorporated for application in supervised and unsupervised learning. In this paper, we propose to perform sparse coding and dictionary learning in the multiple kernel space, where the weights of the ensemble kernel are tuned based on graph-embedding principles such that class discrimination is maximized. In our proposed algorithm, dictionaries are inferred using multiple levels of 1-D subspace clustering in the kernel space, and the sparse codes are obtained using a simple level-wise pursuit scheme. Empirical results for object recognition and image clustering show that our algorithm outperforms existing sparse coding based approaches, and compares favorably to other state-of-the-art methods.