Explaining in Style: Training a GAN to explain a classifier in StyleSpace

Lang, Oran; Gandelsman, Yossi; Yarom, Michal; Wald, Yoav; Elidan, Gal; Hassidim, Avinatan; Freeman, William T.; Isola, Phillip; Globerson, Amir; Irani, Michal; Mosseri, Inbar

doi:10.1109/iccv48922.2021.00073

Cited by 82 publications

(38 citation statements)

References 18 publications


“…The majority of work on interpretability so far has focused on (i), providing post-hoc explanations for a given prediction model. These include pixel attribution methods [Simonyan et al, 2014, Bach et al, 2015, Selvaraju et al, 2017], counterfactual explanations [Chang et al, 2019, Antoran et al, 2021], explanations based on pre-defined concepts [Kazhdan et al, 2020, Yeh et al, 2020], and recently developed StyleGANs [Wu et al, 2021, Lang et al, 2021]. Post-hoc methods have a number of shortcomings given our desired objectives: First, it is unclear whether post-hoc explanations indeed reflect the black-box model's true "reasoning" [Rudin, 2018].…”

Section: Post-hoc Methods (mentioning)

confidence: 99%

Provable concept learning for interpretable predictions using variational autoencoders

Taeb, Ruggeri, Schnuck et al. 2022

Preprint


In safety-critical applications, practitioners are reluctant to trust neural networks when no interpretable explanations are available. Many attempts to provide such explanations revolve around pixel-level attributions or use previously known concepts. In this paper we aim to provide explanations by provably identifying high-level, previously unknown concepts. To this end, we propose a probabilistic modeling framework to derive (C)oncept (L)earning and (P)rediction (CLAP), a VAE-based classifier that uses visually interpretable concepts as linear predictors. Assuming that the data generating mechanism involves predictive concepts, we prove that our method is able to identify them while attaining optimal classification accuracy. We use synthetic experiments for validation, and also show that on real-world (PlantVillage and ChestXRay) datasets, CLAP effectively discovers interpretable factors for classifying diseases.


“…This semantic map can then be used to perform local editing over an image, guided by a target reference image. Lang et al [2021] propose to not only exploit the emerging disentanglement properties of a pretrained StyleGAN, but to train a StyleGAN model for a specific disentangled axis. Through a clever training scheme, combining training StyleGAN along with a classifier for binary or multi-class recognition (e.g., a cat vs. dog classifier), they drive the latent space to capture classifier-specific attributes.…”

Section: Discriminative Applications (mentioning)

confidence: 99%

State-of-the-Art in the Architecture, Methods and Applications of StyleGAN

Bermano, Gal, Alaluf et al. 2022

Preprint


Generative Adversarial Networks (GANs) have established themselves as a prevalent approach to image synthesis. Of these, StyleGAN offers a fascinating case study, owing to its remarkable visual quality and an ability to support a large array of downstream tasks. This state-of-the-art report covers the StyleGAN architecture, and the ways it has been employed since its conception, while also analyzing its severe limitations. It aims to be of use for both newcomers, who wish to get a grasp of the field, and for more experienced readers that might benefit from seeing current research trends and existing tools laid out. Among StyleGAN's most interesting aspects is its learned latent space. Despite being learned with no supervision, it is surprisingly well-behaved and remarkably disentangled. Combined with StyleGAN's visual quality, these properties gave rise to unparalleled editing capabilities. However, the control offered by StyleGAN is inherently limited to the generator's learned distribution, and can only be applied to images generated by StyleGAN itself. Seeking to bring StyleGAN's latent control to real-world scenarios, the study of GAN inversion and latent space embedding has quickly gained in popularity. Meanwhile, this same study has helped shed light on the inner workings and limitations of StyleGAN. We map out StyleGAN's impressive story through these investigations, and discuss the details that have made StyleGAN the go-to generator. We further elaborate on the visual priors StyleGAN constructs, and discuss their use in downstream discriminative tasks. Looking forward, we point out StyleGAN's limitations and speculate on current trends and promising directions for future research, such as task and target specific fine-tuning.


“…Related work focuses on understanding if neural networks encode and use concepts (Lucieri et al, 2020; Kim et al, 2018; McGrath et al, 2021), or generate counterfactual explanations to understand model behavior (Ghandeharioun et al, 2021; Abid et al, 2021; Akula et al, 2020). These works mostly use a set of human-specified concepts to analyze model behavior, however, there is an increasing interest in automatically discovering the concepts that are used by a model (Yeh et al, 2020; Ghorbani et al, 2019; Lang et al, 2021).…”

Section: Related Work (mentioning)

confidence: 99%

Post-hoc Concept Bottleneck Models

Yüksekgönül, Wang, Zou 2022

Preprint


Concept Bottleneck Models (CBMs) map the inputs onto a set of interpretable concepts ("the bottleneck") and use the concepts to make predictions. A concept bottleneck enhances interpretability since it can be investigated to understand what concepts the model "sees" in an input and which of these concepts are deemed important. However, CBMs are restrictive in practice as they require concept labels in the training data to learn the bottleneck and do not leverage strong pretrained models. Moreover, CBMs often do not match the accuracy of an unrestricted neural network, reducing the incentive to deploy them in practice. In this work, we address the limitations of CBMs by introducing Post-hoc Concept Bottleneck models (PCBMs). We show that we can turn any neural network into a PCBM without sacrificing model performance while still retaining interpretability benefits. When concept annotation is not available on the training data, we show that PCBM can transfer concepts from other datasets or from natural language descriptions of concepts. PCBM also enables users to quickly debug and update the model to reduce spurious correlations and improve generalization to new (potentially different) data. Through a model-editing user study, we show that editing PCBMs via concept-level feedback can provide significant performance gains without using any data from the target domain or model retraining.
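The bottleneck idea in the abstract above (inputs mapped to concept scores, predictions made only from those scores) can be illustrated with a minimal sketch. This is not the PCBM implementation: the function names, the toy concept vectors, and the hand-set weights are all hypothetical, and projecting an embedding onto fixed concept directions is an assumed stand-in for however a real model obtains its concept scores.

```python
from typing import List, Sequence, Tuple


def dot(u: Sequence[float], v: Sequence[float]) -> float:
    """Plain dot product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))


def concept_scores(embedding: Sequence[float],
                   concept_vectors: Sequence[Sequence[float]]) -> List[float]:
    """The bottleneck: one score per concept (assumed: projection onto a direction)."""
    return [dot(embedding, c) for c in concept_vectors]


def predict(embedding: Sequence[float],
            concept_vectors: Sequence[Sequence[float]],
            class_weights: Sequence[Sequence[float]],
            class_biases: Sequence[float]) -> Tuple[int, List[float], List[float]]:
    """Linear classifier that sees only the concept scores, never the raw embedding."""
    scores = concept_scores(embedding, concept_vectors)
    logits = [dot(w, scores) + b for w, b in zip(class_weights, class_biases)]
    label = max(range(len(logits)), key=logits.__getitem__)
    return label, scores, logits


# Hypothetical toy setup: 3-d embedding, two concepts, two classes.
concepts = [[1.0, 0.0, 0.0],   # hypothetical concept 0
            [0.0, 1.0, 0.0]]   # hypothetical concept 1
weights = [[2.0, -1.0],        # class 0 relies on concept 0
           [-1.0, 2.0]]        # class 1 relies on concept 1
biases = [0.0, 0.0]

label, scores, logits = predict([0.9, 0.1, 0.5], concepts, weights, biases)

# Concept-level editing in this sketch: zero the weights on a concept judged
# spurious, with no retraining of anything upstream of the bottleneck.
edited_weights = [[w[0], 0.0] for w in weights]
edited_label, _, edited_logits = predict([0.9, 0.1, 0.5], concepts,
                                         edited_weights, biases)
```

Because the classifier is linear in the concept scores, each weight can be read as how much a concept contributes to a class, which is what makes inspecting or editing a single concept possible in this toy setting.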

