Vikash Sehwag scite author profile

Image diffusion models such as DALL-E 2, Imagen, and Stable Diffusion have attracted significant attention due to their ability to generate high-quality synthetic images. In this work, we show that diffusion models memorize individual images from their training data and emit them at generation time. With a generate-and-filter pipeline, we extract over a thousand training examples from stateof-the-art models, ranging from photographs of individual people to trademarked company logos. We also train hundreds of diffusion models in various settings to analyze how different modeling and data decisions affect privacy. Overall, our results show that diffusion models are much less private than prior generative models such as GANs, and that mitigating these vulnerabilities may require new advances in privacy-preserving training.

show abstract

Fast-Convergent Federated Learning

Nguyen

Sehwag

Hosseinalipour

et al. 2021

IEEE J. Select. Areas Commun.

162

View full text Add to dashboard Cite

RobustBench: a standardized adversarial robustness benchmark

Croce¹,

Andriushchenko²,

Sehwag³

et al. 2020

Preprint

View full text Add to dashboard Cite

Evaluation of adversarial robustness is often error-prone leading to overestimation of the true robustness of models. While adaptive attacks designed for a particular defense are a way out of this, there are only approximate guidelines on how to perform them. Moreover, adaptive evaluations are highly customized for particular models, which makes it difficult to compare different defenses. Our goal is to establish a standardized benchmark of adversarial robustness, which as accurately as possible reflects the robustness of the considered models within a reasonable computational budget. This requires to impose some restrictions on the admitted models to rule out defenses that only make gradient-based attacks ineffective without improving actual robustness. We evaluate robustness of models for our benchmark with AutoAttack, an ensemble of white-and black-box attacks which was recently shown in a large-scale study to improve almost all robustness evaluations compared to the original publications. Our leaderboard, hosted at http://robustbench.github.io/, aims at reflecting the current state of the art on a set of well-defined tasks in ∞ -and 2 -threat models with possible extensions in the future. Additionally, we open-source the library http://github.com/RobustBench/robustbench that provides unified access to state-of-the-art robust models to facilitate their downstream applications. Finally, based on the collected models, we analyze general trends in p -robustness and its impact on other tasks such as robustness to various distribution shifts and out-of-distribution detection.

show abstract

SSD: A Unified Framework for Self-Supervised Outlier Detection

Sehwag¹,

Chiang²,

Mittal³

2021

Preprint

View full text Add to dashboard Cite

Analyzing the Robustness of Open-World Machine Learning

Sehwag

Bhagoji

Song

et al. 2019

View full text Add to dashboard Cite

When deploying machine learning models in real-world applications, an open-world learning framework is needed to deal with both normal in-distribution inputs and undesired out-of-distribution (OOD) inputs. Open-world learning frameworks include OOD detectors that aim to discard input examples which are not from the same distribution as the training data of machine learning classifiers. However, our understanding of current OOD detectors is limited to the setting of benign OOD data, and an open question is whether they are robust in the presence of adversaries. In this paper, we present the first analysis of the robustness of open-world learning frameworks in the presence of adversaries by introducing and designing OOD adversarial examples. Our experimental results show that current OOD detectors can be easily evaded by slightly perturbing benign OOD inputs, revealing a severe limitation of current open-world learning frameworks. Furthermore, we find that OOD adversarial examples also pose a strong threat to adversarial training based defense methods in spite of their effectiveness against in-distribution adversarial attacks. To counteract these threats and ensure the trustworthy detection of OOD inputs, we outline a preliminary design for a robust open-world machine learning framework. CCS CONCEPTS • Computing methodologies → Machine learning; Neural networks; • Security and privacy → Intrusion/anomaly detection and malware mitigation;

show abstract

scite is a Brooklyn-based organization that helps researchers better discover and understand research articles through Smart Citations–citations that display the context of the citation and describe whether the article provides supporting or contrasting evidence. scite is used by students and researchers from around the world and is funded in part by the National Science Foundation and the National Institute on Drug Abuse of the National Institutes of Health.

Contact Info

hi@scite.ai

10624 S. Eastern Ave., Ste. A-614

Henderson, NV 89052, USA

Blog Terms and Conditions API Terms Privacy Policy Contact Cookie Preferences Do Not Sell or Share My Personal Information

Made with 💙 for researchers

Part of the Research Solutions Family.

Vikash Sehwag

Extracting Training Data from Diffusion Models

Fast-Convergent Federated Learning

RobustBench: a standardized adversarial robustness benchmark

SSD: A Unified Framework for Self-Supervised Outlier Detection

Analyzing the Robustness of Open-World Machine Learning

Contact Info

Product

Resources

About