Deep neural networks have achieved near-human accuracy levels in various types of classification and prediction tasks including images, text, speech, and video data. However, the networks continue to be treated mostly as black-box function approximators, mapping a given input to a classification output. The next step in this human-machine evolutionary process, incorporating these networks into mission-critical processes such as medical diagnosis, planning and control, requires a level of trust association with the machine output. Typically, statistical metrics are used to quantify the uncertainty of an output. However, the notion of trust also depends on the visibility that a human has into the working of the machine. In other words, the neural network should provide human-understandable justifications for its output, leading to insights about its inner workings. We call such models interpretable deep networks. Interpretability is not a monolithic notion. In fact, the subjectivity of an interpretation, due to different levels of human understanding, implies that there must be a multitude of dimensions that together constitute interpretability. In addition, the interpretation itself can be provided either in terms of the low-level network parameters, or in terms of input features used by the model. In this paper, we outline some of the dimensions that are useful for model interpretability, and categorize prior work along those dimensions. In the process, we perform a gap analysis of what needs to be done to improve model interpretability.
The rapid identification of drugs of abuse in airports is of critical importance. In this study we demonstrate the viability of Raman spectroscopy for the rapid identification of illicit substances in their containers in an airport environment. Raman spectra of drugs of abuse in situ were collected using portable Raman spectrometers; this technique offers distinct advantages to government agencies, first responders and forensic scientists working in the security field. We have demonstrated that the spectrometers are able to collect the spectra of suspect powders, including cocaine HCl and d-amphetamine sulphate with unknown constituents, rapidly and with a high degree of discrimination.
Saliency maps are a popular approach to creating post-hoc explanations of image classifier outputs. These methods produce estimates of the relevance of each pixel to the classification output score, which can be displayed as a saliency map that highlights important pixels. Despite a proliferation of such methods, little effort has been made to quantify how good these saliency maps are at capturing the true relevance of the pixels to the classifier output (i.e. their “fidelity”). We therefore investigate existing metrics for evaluating the fidelity of saliency methods (i.e. saliency metrics). We find that there is little consistency in the literature in how such metrics are calculated, and show that such inconsistencies can have a significant effect on the measured fidelity. Further, we apply measures of reliability developed in the psychometric testing literature to assess the consistency of saliency metrics when applied to individual saliency maps. Our results show that saliency metrics can be statistically unreliable and inconsistent, indicating that comparative rankings between saliency methods generated using such metrics can be untrustworthy.
There is general consensus that it is important for artificial intelligence (AI) and machine learning systems to be explainable and/or interpretable. However, there is no general consensus over what is meant by 'explainable' and 'interpretable'. In this paper, we argue that this lack of consensus is due to there being several distinct stakeholder communities. We note that, while the concerns of the individual communities are broadly compatible, they are not identical, which gives rise to different intents and requirements for explainability/interpretability. We use the software engineering distinction between validation and verification, and the epistemological distinctions between knowns/unknowns, to tease apart the concerns of the stakeholder communities and highlight the areas where their foci overlap or diverge. It is not the purpose of the authors of this paper to 'take sides' (we count ourselves as members, to varying degrees, of multiple communities) but rather to help disambiguate what stakeholders mean when they ask 'Why?' of an AI.
Local field potentials (LFPs) sampled with extracellular electrodes are frequently used as a measure of population neuronal activity. However, relating such measurements to underlying neuronal behaviour and connectivity is non-trivial. To help study this link, we developed the Virtual Electrode Recording Tool for EXtracellular potentials (VERTEX). We first identified a reduced neuron model that retained the spatial and frequency filtering characteristics of extracellular potentials from neocortical neurons. We then developed VERTEX as an easy-to-use Matlab tool for simulating LFPs from large populations (>100,000 neurons). A VERTEX-based simulation successfully reproduced features of the LFPs from an in vitro multi-electrode array recording of macaque neocortical tissue. Our model, with virtual electrodes placed anywhere in 3D, allows direct comparisons with the in vitro recording setup. We envisage that VERTEX will stimulate experimentalists, clinicians, and computational neuroscientists to use models to understand the mechanisms underlying measured brain dynamics in health and disease. Electronic supplementary material: The online version of this article (doi:10.1007/s00429-014-0793-x) contains supplementary material, which is available to authorized users.
Artificial intelligence (AI) systems hold great promise as decision-support tools, but we must be able to identify and understand their inevitable mistakes if they are to fulfill this potential. This is particularly true in domains where the decisions are high-stakes, such as law, medicine, and the military. In this Perspective, we describe the particular challenges for AI decision support posed in military coalition operations. These include having to deal with limited, low-quality data, which inevitably compromises AI performance. We suggest that these problems can be mitigated by taking steps that allow rapid trust calibration so that decision makers understand the AI system's limitations and likely failures and can calibrate their trust in its outputs appropriately. We propose that AI services can achieve this by being both interpretable and uncertainty-aware. Creating such AI systems poses various technical and human factors challenges. We review these challenges and recommend directions for future research.
Several strands of research have aimed to bridge the gap between artificial intelligence (AI) and human decision-makers in AI-assisted decision-making, where humans are the consumers of AI model predictions and the ultimate decision-makers in high-stakes applications. However, people's perception and understanding are often distorted by their cognitive biases, such as confirmation bias, anchoring bias, and availability bias, to name a few. In this work, we use knowledge from the field of cognitive science to account for cognitive biases in the human-AI collaborative decision-making setting, and mitigate their negative effects on collaborative performance. To this end, we mathematically model cognitive biases and provide a general framework through which researchers and practitioners can understand the interplay between cognitive biases and human-AI accuracy. We then focus specifically on anchoring bias, a bias commonly encountered in human-AI collaboration. We implement a time-based de-anchoring strategy and conduct our first user experiment, which validates its effectiveness in human-AI collaborative decision-making. With this result, we design a time allocation strategy for a resource-constrained setting that achieves optimal human-AI collaboration under some assumptions. We then conduct a second user experiment, which shows that our time allocation strategy with explanation can effectively de-anchor the human and improve collaborative performance when the AI model has low confidence and is incorrect.