2022
DOI: 10.48550/arxiv.2212.03860
Preprint

Diffusion Art or Digital Forgery? Investigating Data Replication in Diffusion Models

Cited by 15 publications (20 citation statements)
References 0 publications
“…Memorized images. We select eight image memorization examples from the recent works [64,10], four of which are shown in Figure 9. It also shows the sample generations before and after fine-tuning.…”
Section: Comparisons and Main Results
confidence: 99%
“…Current methods can synthesize high-quality images with remarkable generalization ability, capable of composing different instances, styles, and concepts in unseen contexts. However, as these models are often trained on copyrighted images, they learn to mimic various artists' styles [64,61] and other copyrighted content [10].…”
Section: Related Work
confidence: 99%
“…Specifically, they proposed a simple and efficient method for extracting verbatim sequences from a language model's training set using only black-box query access. Recently, in the vision domain, Somepalli et al. [246] showed that the data replication problem exists in diffusion models, where generated images are close to the training data in terms of semantic similarity. To disclose worst-case privacy risk, Carlini et al. [247] further explored the privacy vulnerabilities of state-of-the-art diffusion models by leveraging a generate-and-filter pipeline to extract over a thousand training examples from the models.…”
Section: Privacy
confidence: 99%
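The similarity check described in the excerpt above can be illustrated with a minimal sketch. This is not the cited authors' actual pipeline: the `flag_replications` helper is hypothetical, and the feature vectors are assumed to come from some off-the-shelf image encoder. The idea is simply to flag generations whose nearest training neighbour exceeds a cosine-similarity threshold, in the spirit of a generate-and-filter approach.

```python
import numpy as np

def flag_replications(gen_embeddings: np.ndarray,
                      train_embeddings: np.ndarray,
                      threshold: float = 0.95):
    """Flag generated images whose closest training image exceeds a
    cosine-similarity threshold.

    gen_embeddings:   (n_gen, d) feature vectors for generated images
    train_embeddings: (n_train, d) feature vectors for training images
    threshold:        illustrative cutoff above which a generation is
                      treated as a likely replication
    """
    # L2-normalise so that a dot product equals cosine similarity.
    gen = gen_embeddings / np.linalg.norm(gen_embeddings, axis=1, keepdims=True)
    train = train_embeddings / np.linalg.norm(train_embeddings, axis=1, keepdims=True)

    # Similarity of every generation to every training image,
    # then keep the closest training match per generation.
    sims = gen @ train.T              # (n_gen, n_train)
    nearest = sims.max(axis=1)        # best match score per generation
    nearest_idx = sims.argmax(axis=1) # index of that training image

    flagged = np.where(nearest >= threshold)[0]
    return [(int(i), int(nearest_idx[i]), float(nearest[i])) for i in flagged]
```

In practice, the choice of encoder and threshold determines what counts as "close"; the returned (generation index, training index, score) triples would then be inspected manually, much as the generate-and-filter pipeline filters candidates before human verification.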
“…For example, in social networks, participation is usually public; recovering privately shared photos or messages from a model trained on social network data is the privacy violation. These kinds of attacks are referred to as training data reconstruction attacks, and have been successfully demonstrated against a number of machine learning models including language models (Carlini et al., 2021; Mireshghallah et al., 2022), generative models (Somepalli et al., 2022), and image classifiers (Balle et al., 2022; Haim et al., 2022). Recent work (Bhowmick et al., 2018; Balle et al., 2022; Guo et al., 2022a; Stock et al., 2022) has begun to provide evidence that if one is willing to forgo protection against membership inference, then the regime that protects against training data reconstruction is far larger, as predicted by the intuitive reasoning that successful reconstruction requires a significant number of bits about an individual example to be leaked by the model.…”
Section: Introduction
confidence: 99%