Deep neural networks (DNNs) dominate many tasks in computer vision, but it remains difficult to understand and interpret the information contained within these networks. To gain better insight into how a network learns and operates, there is a strong need to visualize these complex structures, and this is an important research direction. In this paper, we address how the interactive display of DNNs in a virtual reality (VR) setup can be used for general understanding and architectural assessment. We compiled a static library that exposes the Caffe framework as a plugin to the Unity game engine, and used its routines to create and visualize a VR-based AlexNet architecture for an image classification task. Our layered interactive model allows the user to freely navigate back and forth within the network during visual exploration. To make the DNN model even more accessible, the user can select individual connections to understand the activity flow at a particular neuron. Our VR setup also lets users hide the activation maps/filters or interactively occlude certain features of an image in real time. Furthermore, we added an interpretation module and reframed Shapley values to give a deeper understanding of the different layers. This novel tool thus offers more direct access to network structures and results, and its immersive operation is especially instructive for both novices and experts in the field of DNNs.
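The interactive occlusion described above is implemented inside the Unity/Caffe pipeline; as a rough, framework-agnostic illustration of the underlying idea only, the following Python sketch slides a neutral patch over an image and records how the target-class score drops. The Keras-style `model.predict` interface, the HWC float image layout, and all function names here are our assumptions, not part of the paper:

```python
import numpy as np

def occlusion_sensitivity(model, image, target_class, patch=32, stride=16):
    """Slide a gray patch over the image and record how much the
    target-class score drops; large drops mark regions the network
    relies on. Assumes `image` is an HWC float array in [0, 1] and
    `model.predict` takes a batch and returns class scores."""
    h, w, _ = image.shape
    baseline = model.predict(image[None])[0][target_class]
    heatmap = np.zeros(((h - patch) // stride + 1,
                        (w - patch) // stride + 1))
    for i, y in enumerate(range(0, h - patch + 1, stride)):
        for j, x in enumerate(range(0, w - patch + 1, stride)):
            occluded = image.copy()
            occluded[y:y + patch, x:x + patch, :] = 0.5  # neutral gray block
            score = model.predict(occluded[None])[0][target_class]
            heatmap[i, j] = baseline - score  # sensitivity at this location
    return heatmap
```

In the VR setting the same measurement is driven interactively, with the user placing the occluder by hand instead of a scripted raster scan.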
Machine learning and its associated algorithms involving deep neural networks have gained widespread adoption in the computer vision domain. Significant progress has been made in automating certain application-dependent tasks, especially in the fields of medicine, autonomous driving and robotics, and considerable work is underway to make automated systems secure and robust against failure. Nonetheless, researchers are still struggling to find ways to give reasons and explanations for why a machine learning model made a certain decision. Deep neural networks in particular are considered "black boxes" in this regard, because their distributed encoding of information makes their decision-making especially challenging to interpret.

In view of these challenges, this dissertation aims to establish methods to visualize and interpret the decisions of these complex machine learning models in an image classification task. We opt for three types of post hoc methods, i.e., global, hybrid and local interpretability, to understand and assess the reasons for a decision and the kinds of image features that are vital to it. Hence, we call our approach "visualizing and interpreting the decisions of deep neural networks".

On a global level, we investigate and assess the deep network architecture as a whole, with a view to the internal connections between adjacent layers, the filters, and the functioning of the different hidden layers. We propose a visualization method in the form of a Caffe2Unity plugin that constructs and visualizes a complete AlexNet architecture in a virtual reality environment. This novel approach allows users to become part of the virtual network and gives them the freedom to explore and visualize its internal states. Exploring the network in a virtual environment supports its global assessment and understanding and benefits both novices and experts in our target audience.

Using a hybrid approach, we embed a local interpretability module within our global virtual model that allows the user to visualize and interpret the network in real time. The user can place an occlusion block on an image, visualize the results, and verify the network's decision via our reframed integrated Shapley values approach. In this way, we achieve our goal of determining which parts of the image the network considers important for making its decision.

At the local interpretable level, we propose a layer-wise approach that uses influence scores to gain deeper insight into a pre-trained model's decision-making. The layer-wise influence score reveals what each layer has learned and which training data are most influential in a decision. By contrasting the influential images with the network's decision, we also identify a bias of the network towards the texture of the images.
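The abstract does not spell out the "reframed integrated Shapley values" themselves; as a hedged sketch of the generic idea they build on, the snippet below estimates Shapley values for image regions by Monte Carlo sampling over random orderings. The region list, the neutral baseline image, and the `model.predict` interface are all our assumptions:

```python
import numpy as np

def shapley_region_attribution(model, image, target_class, regions,
                               n_samples=200, rng=None):
    """Monte Carlo estimate of Shapley values for image regions: each
    region's value is its average marginal contribution to the
    target-class score over random orderings. `regions` is a list of
    (y0, y1, x0, x1) boxes; `image` is an HWC float array in [0, 1]."""
    rng = rng or np.random.default_rng(0)
    baseline = np.full_like(image, 0.5)          # neutral reference image
    phi = np.zeros(len(regions))

    def score(present):
        x = baseline.copy()
        for k in present:                        # reveal the chosen regions
            y0, y1, x0, x1 = regions[k]
            x[y0:y1, x0:x1] = image[y0:y1, x0:x1]
        return model.predict(x[None])[0][target_class]

    for _ in range(n_samples):
        order = rng.permutation(len(regions))
        present = []
        prev = score(present)
        for k in order:
            present.append(k)
            cur = score(present)
            phi[k] += cur - prev                 # marginal contribution of region k
            prev = cur
    return phi / n_samples
```

Exact Shapley values require summing over all subsets, which is exponential in the number of regions; permutation sampling is the standard tractable approximation.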
Understanding deep neural network decisions rests on the interpretability of the model, which provides explanations that are understandable to human beings and helps avoid biases in model predictions. This study investigates and interprets the model output based on images from the training dataset, i.e., it debugs the results of a network model in relation to the training dataset. Our objective was to understand the behavior (specifically, the class prediction) of deep learning models through the analysis of perturbations of the loss function. We calculated influence scores for the VGG16 network at different hidden layers across three types of disturbances in the original images of the ImageNet dataset: texture, style, and background elimination. The global and layer-wise influence scores allowed us to identify the most influential training images for the given test set. We illustrate our findings by highlighting the types of disturbances that bias the network's predictions. According to our results, layer-wise influence analysis pairs well with local interpretability methods such as Shapley values to demonstrate significant differences between disturbed image subgroups. Particularly in an image classification task, our layer-wise interpretability approach plays a pivotal role in identifying classification bias in pre-trained convolutional neural networks, thus providing useful insights for retraining specific hidden layers.
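The exact layer-wise influence computation is not given in this abstract. A common first-order simplification of influence functions scores a training image by the dot product of its loss gradient with the test image's loss gradient, restricted to one layer's parameters; the PyTorch sketch below illustrates that simplification only (it drops the inverse-Hessian factor of full influence functions, and all names are ours, not the study's):

```python
import torch

def layerwise_influence(model, loss_fn, train_examples, test_example, layer_name):
    """Simplified influence score restricted to one layer's parameters:
    g_test . g_train, the dot product of per-example loss gradients.
    Higher scores mark training images that push the test prediction
    in the same direction at this layer. `train_examples` is an
    iterable of (image, label) tensor pairs."""
    params = [p for n, p in model.named_parameters()
              if n.startswith(layer_name)]       # e.g. "features.28" for VGG16

    def grads(x, y):
        model.zero_grad()
        loss = loss_fn(model(x), y)
        return torch.autograd.grad(loss, params)

    x_t, y_t = test_example
    g_test = grads(x_t.unsqueeze(0), y_t.unsqueeze(0))

    scores = []
    for x, y in train_examples:                  # one training image at a time
        g_train = grads(x.unsqueeze(0), y.unsqueeze(0))
        scores.append(sum((gt * g).sum()
                          for gt, g in zip(g_test, g_train)).item())
    return scores
```

Restricting `params` to a single layer is what makes the analysis layer-wise: repeating the computation for each hidden layer shows where in the network a disturbed training image exerts its influence.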