Andy Shih scite author profile

AI is undergoing a paradigm shift with the rise of models (e.g., BERT, DALL-E, GPT-3) that are trained on broad data at scale and are adaptable to a wide range of downstream tasks. We call these models foundation models to underscore their critically central yet incomplete character. This report provides a thorough account of the opportunities and risks of foundation models, ranging from their capabilities (e.g., language, vision, robotics, reasoning, human interaction) and technical principles (e.g., model architectures, training procedures, data, systems, security, evaluation, theory) to their applications (e.g., law, healthcare, education) and societal impact (e.g., inequity, misuse, economic and environmental impact, legal and ethical considerations). Though foundation models are based on standard deep learning and transfer learning, their scale results in new emergent capabilities, and their effectiveness across so many tasks incentivizes homogenization. Homogenization provides powerful leverage but demands caution, as the defects of the foundation model are inherited by all the adapted models downstream. Despite the impending widespread deployment of foundation models, we currently lack a clear understanding of how they work, when they fail, and what they are even capable of due to their emergent properties. To tackle these questions, we believe much of the critical research on foundation models will require deep interdisciplinary collaboration commensurate with their fundamentally sociotechnical nature.

show abstract

A Symbolic Approach to Explaining Bayesian Network Classifiers

Shih

2018

View full text Add to dashboard Cite

We propose an approach for explaining Bayesian network classifiers, which is based on compiling such classifiers into decision functions that have a tractable and symbolic form. We introduce two types of explanations for why a classifier may have classified an instance positively or negatively and suggest algorithms for computing these explanations. The first type of explanation identifies a minimal set of the currently active features that is responsible for the current classification, while the second type of explanation identifies a minimal set of features whose current state (active or not) is sufficient for the classification. We consider in particular the compilation of Naive and Latent-Tree Bayesian network classifiers into Ordered Decision Diagrams (ODDs), providing a context for evaluating our proposal using case studies and experiments based on classifiers from the literature.

show abstract

Compiling Bayesian Network Classifiers into Decision Graphs

Shih

Choi

Darwiche

2019

AAAI

View full text Add to dashboard Cite

We propose an algorithm for compiling Bayesian network classifiers into decision graphs that mimic the input and output behavior of the classifiers. In particular, we compile Bayesian network classifiers into ordered decision graphs, which are tractable and can be exponentially smaller in size than decision trees. This tractability facilitates reasoning about the behavior of Bayesian network classifiers, including the explanation of decisions they make. Our compilation algorithm comes with guarantees on the time of compilation and the size of compiled decision graphs. We apply our compilation algorithm to classifiers from the literature and discuss some case studies in which we show how to automatically explain their decisions and verify properties of their behavior.

show abstract

On Tractable Representations of Binary Neural Networks

Shi

Shih

Darwiche

et al. 2020

View full text Add to dashboard Cite

We consider the compilation of a binary neural network’s decision function into tractable representations such as Ordered Binary Decision Diagrams (OBDDs) and Sentential Decision Diagrams (SDDs). Obtaining this function as an OBDD/SDD facilitates the explanation and formal veriﬁcation of a neural network’s behavior. First, we consider the task of verifying the robustness of a neural network, and show how we can compute the expected robustness of a neural network, given an OBDD/SDD representation of it. Next, we consider a more efﬁcient approach for compiling neural networks, based on a pseudo-polynomial time algorithm for compiling a neuron. We then provide a case study in a handwritten digits dataset, highlighting how two neural networks trained from the same dataset can have very high accuracies, yet have very different levels of robustness. Finally, in experiments, we show that it is feasible to obtain compact representations of neural networks as SDDs.

show abstract

Using Slit-Lamp Images for Deep Learning-Based Identification of Bacterial and Fungal Keratitis: Model Development and Validation with Different Convolutional Neural Networks

et al. 2021

View full text Add to dashboard Cite

In this study, we aimed to develop a deep learning model for identifying bacterial keratitis (BK) and fungal keratitis (FK) by using slit-lamp images. We retrospectively collected slit-lamp images of patients with culture-proven microbial keratitis between 1 January 2010 and 31 December 2019 from two medical centers in Taiwan. We constructed a deep learning algorithm consisting of a segmentation model for cropping cornea images and a classification model that applies different convolutional neural networks (CNNs) to differentiate between FK and BK. The CNNs included DenseNet121, DenseNet161, DenseNet169, DenseNet201, EfficientNetB3, InceptionV3, ResNet101, and ResNet50. The model performance was evaluated and presented as the area under the curve (AUC) of the receiver operating characteristic curves. A gradient-weighted class activation mapping technique was used to plot the heat map of the model. By using 1330 images from 580 patients, the deep learning algorithm achieved the highest average accuracy of 80.0%. Using different CNNs, the diagnostic accuracy for BK ranged from 79.6% to 95.9%, and that for FK ranged from 26.3% to 65.8%. The CNN of DenseNet161 showed the best model performance, with an AUC of 0.85 for both BK and FK. The heat maps revealed that the model was able to identify the corneal infiltrations. The model showed a better diagnostic accuracy than the previously reported diagnostic performance of both general ophthalmologists and corneal specialists.

show abstract

scite is a Brooklyn-based organization that helps researchers better discover and understand research articles through Smart Citations–citations that display the context of the citation and describe whether the article provides supporting or contrasting evidence. scite is used by students and researchers from around the world and is funded in part by the National Science Foundation and the National Institute on Drug Abuse of the National Institutes of Health.

Contact Info

hi@scite.ai

10624 S. Eastern Ave., Ste. A-614

Henderson, NV 89052, USA

Blog Terms and Conditions API Terms Privacy Policy Contact Cookie Preferences Do Not Sell or Share My Personal Information

Made with 💙 for researchers

Part of the Research Solutions Family.

Andy Shih

On the Opportunities and Risks of Foundation Models

A Symbolic Approach to Explaining Bayesian Network Classifiers

Compiling Bayesian Network Classifiers into Decision Graphs

On Tractable Representations of Binary Neural Networks

Using Slit-Lamp Images for Deep Learning-Based Identification of Bacterial and Fungal Keratitis: Model Development and Validation with Different Convolutional Neural Networks

Contact Info

Product

Resources

About