With the increasingly widespread use of Transformer-based models for NLU/NLP tasks, there is growing interest in understanding the inner workings of these models, why they are so effective at a wide range of tasks, and how they can be further tuned and improved. To contribute towards this goal of enhanced explainability and comprehension, we present InterpreT, an interactive visualization tool for interpreting Transformer-based models. In addition to providing various mechanisms for investigating general model behaviours, novel contributions made in InterpreT include the ability to track and visualize token embeddings through each layer of a Transformer, highlight distances between certain token embeddings through illustrative plots, and identify task-related functions of attention heads by using new metrics. InterpreT is a task-agnostic tool, and its functionalities are demonstrated through the analysis of model behaviours for two disparate tasks: Aspect Based Sentiment Analysis (ABSA) and the Winograd Schema Challenge (WSC).
As modern neural networks have grown to billions of parameters, meeting tight latency budgets has become increasingly challenging. Approaches like compression, sparsification and network pruning have proven effective at tackling this problem, but they rely on modifications of the underlying network. In this paper, we look at a complementary approach of optimizing how tensors are mapped to on-chip memory in an inference accelerator while leaving the network parameters untouched. Since different memory components trade off capacity for bandwidth differently, a sub-optimal mapping can result in high latency. We introduce evolutionary graph reinforcement learning (EGRL), a method combining graph neural networks, reinforcement learning (RL) and evolutionary search, that aims to find the optimal mapping to minimize latency. Furthermore, a set of fast, stateless policies guide the evolutionary search to improve sample efficiency. We train and validate our approach directly on the Intel NNP-I chip for inference using a batch size of 1. EGRL outperforms policy-gradient, evolutionary search and dynamic programming baselines on BERT, ResNet-101 and ResNet-50. We achieve 28-78% speed-up compared to the native NNP-I compiler on all three workloads.
Background: Deep learning techniques can accurately detect and grade inflammatory findings on images from capsule endoscopy (CE) in Crohn’s disease (CD). However, the predictive utility of deep learning on CE in CD for disease outcomes has not been examined. Objectives: We aimed to develop a deep learning model that can predict the need for biological therapy based on complete CE videos of newly-diagnosed CD patients. Design: This was a retrospective cohort study. The study cohort included treatment-naïve CD patients who underwent CE (SB3, Medtronic) within 6 months of diagnosis. Complete small bowel videos were extracted using the RAPID Reader software. Methods: CE videos were scored using the Lewis score (LS). Clinical, endoscopic, and laboratory data were extracted from electronic medical records. Machine learning analysis was performed using the TimeSformer computer vision algorithm, developed to capture spatiotemporal characteristics for video analysis. Results: The cohort included 101 patients. The median duration of follow-up was 902 (354–1626) days. Biological therapy was initiated by 37 (36.6%) of 101 patients. The TimeSformer algorithm achieved training and testing accuracies of 82% and 81%, respectively, with an area under the ROC curve (AUC) of 0.86 for predicting the need for biological therapy. In comparison, the AUC was 0.70 for the LS and 0.74 for fecal calprotectin. Conclusion: Spatiotemporal analysis of complete CE videos of newly-diagnosed CD patients achieved accurate prediction of the need for biological therapy. The accuracy was superior to that of the human reader index or fecal calprotectin. Following future validation studies, this approach will allow for fast and accurate personalization of treatment decisions in CD.