2022
DOI: 10.48550/arxiv.2210.08933
Preprint

DiffuSeq: Sequence to Sequence Text Generation with Diffusion Models

Abstract: Recently, diffusion models have emerged as a new paradigm for generative models. Despite their success in domains with continuous signals such as vision and audio, adapting diffusion models to natural language is difficult due to the discrete nature of text. We tackle this challenge by proposing DIFFUSEQ: a diffusion model designed for sequence-to-sequence (SEQ2SEQ) text generation tasks. Upon extensive evaluation over a wide range of SEQ2SEQ tasks, we find DIFFUSEQ achieving comparable or even better performance…
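The full paper's central device for conditioning a continuous diffusion process on discrete source text is partial noising: source and target tokens are embedded into one continuous sequence, and the forward process corrupts only the target span while the source stays clean at every step. Below is a minimal PyTorch sketch of that forward step; the function name, tensor shapes, and the src_mask convention are illustrative assumptions, not the paper's code.

```python
import math
import torch

def partial_noising(z_0: torch.Tensor, src_mask: torch.Tensor,
                    alpha_bar_t: float) -> torch.Tensor:
    """One forward-diffusion draw z_t ~ q(z_t | z_0), i.e.
    N(sqrt(alpha_bar_t) * z_0, (1 - alpha_bar_t) * I), applied only to
    the target half of a concatenated [source; target] embedding.

    z_0         : (batch, seq_len, dim) word embeddings of [source; target]
    src_mask    : (batch, seq_len) bool, True where a token belongs to the source
    alpha_bar_t : cumulative noise-schedule coefficient for step t, in (0, 1)
    """
    noise = torch.randn_like(z_0)
    z_t = math.sqrt(alpha_bar_t) * z_0 + math.sqrt(1.0 - alpha_bar_t) * noise
    # Partial noising: keep the conditioning (source) tokens clean, so the
    # denoiser always sees an un-noised copy of the input sequence.
    return torch.where(src_mask.unsqueeze(-1), z_0, z_t)
```

Because the condition is carried inside the sequence rather than through a separate encoder, a single transformer can denoise the target span while attending to the clean source span.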

Cited by 15 publications (57 citation statements). References 15 publications.
“…2, DINOISER surpasses or closely approaches CMLM even when MBR=1, while the vanilla DiffusionLM heavily relies on a large number of candidates used for MBR decoding. Besides, DINOISER necessitates much fewer NFEs to achieve strong performance, e.g., only 20 steps, resulting in only 1% to 10% computational consumption and latency compared to previous works (Gong et al., 2022; Dieleman et al., 2022). This manifests that DINOISER is more accurate yet efficient compared to previous diffusion-based sequence learning models.…”
Section: Results
confidence: 95%
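MBR (minimum Bayes risk) decoding, which the statement above says vanilla DiffusionLM leans on heavily, draws many candidates from the model and returns the one that agrees most with the rest of the sample, so its cost grows with both the candidate count and the NFEs (denoising network calls) per candidate. A minimal sketch of the selection step follows; utility stands for any pairwise metric such as sentence-level BLEU, and all names here are illustrative.

```python
from typing import Callable, List

def mbr_select(candidates: List[str],
               utility: Callable[[str, str], float]) -> str:
    """Return the candidate with the highest average utility against all
    other candidates, treating the sampled pool itself as a Monte Carlo
    approximation of the model's output distribution."""
    best, best_score = candidates[0], float("-inf")
    for y in candidates:
        # Average agreement with every other sample (self excluded).
        score = sum(utility(y, y_ref) for y_ref in candidates
                    if y_ref is not y) / max(len(candidates) - 1, 1)
        if score > best_score:
            best, best_score = y, score
    return best
```

With a single candidate (MBR=1) the selection is vacuous, which is why a model that still performs well there, as DINOISER reportedly does, avoids this sampling overhead entirely.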
“We consider IWSLT14 DE↔EN (160K pairs), WMT14 EN↔DE (4.0M pairs), and WMT14 EN↔RO (610K pairs), six machine translation tasks with varying sizes of training data. Additionally, we experiment on two of the datasets introduced by DiffuSeq (Gong et al., 2022), including Wiki (Jiang et al., 2020) for text simplification and QQP for paraphrasing.…”
Section: Methods
confidence: 99%