Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics 2020
DOI: 10.18653/v1/2020.acl-main.564
Data Manipulation: Towards Effective Instance Learning for Neural Dialogue Generation via Learning to Augment and Reweight

Abstract: Current state-of-the-art neural dialogue models learn from human conversations following the data-driven paradigm. As such, a reliable training corpus is the crux of building a robust and well-behaved dialogue model. However, due to the open-ended nature of human conversations, the quality of user-generated training data varies greatly, and effective training samples are typically insufficient while noisy samples frequently appear. This impedes the learning of those data-driven neural dialogue models. Therefor…

Cited by 42 publications (43 citation statements)
References 25 publications (30 reference statements)
“…Second, our framework additionally contains a weighting module to reform the generated utterances. Our work is also inspired by Cai et al. (2020), which proposes a framework to augment the IND data, while our framework aims to generate OOD data.…”
Section: Weighting Module
Mentioning confidence: 99%
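The reweighting idea this statement refers to can be sketched as a per-instance weighted loss. Everything below (the function name, the normalization choice, the example numbers) is an illustrative assumption, not the cited framework's actual implementation:

```python
def weighted_nll(log_probs, weights):
    """Per-instance weighted negative log-likelihood.

    log_probs: log-probability the model assigns to each target utterance.
    weights:   per-instance weights (e.g., produced by a weighting module);
               normalized here so the loss scale is comparable across batches.
    """
    total = sum(weights)
    norm = [w / total for w in weights]
    return -sum(w * lp for w, lp in zip(norm, log_probs))

# A noisy instance (here the second one) is down-weighted, so it
# contributes less to the batch loss.
loss = weighted_nll([-0.2, -3.0, -0.5], [1.0, 0.1, 0.9])
```

In this sketch the weights simply rescale each instance's contribution; a learned weighting module would produce them from features of the instance itself.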
“…Recently, there has been increased interest in applying data augmentation techniques to sentence-level and sentence-pair natural language processing (NLP) tasks, such as text classification (Wei and Zou, 2019; Xie et al., 2019), natural language inference (Min et al., 2020), and machine translation. Augmentation methods explored for these tasks either create augmented instances by manipulating a few words in the original instance, such as word replacement (Zhang et al., 2015; Wang and Yang, 2015; Cai et al., 2020), random deletion (Wei and Zou, 2019), or word position swap (Şahin and Steedman, 2018; Min et al., 2020); or create entirely artificial instances via generative models, such as variational auto-encoders (Yoo et al., 2019; Mesbah et al., 2019) or back-translation models (Yu et al., 2018; Iyyer et al., 2018).…”
Section: Introduction
Mentioning confidence: 99%
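The word-level manipulations listed in this statement (random deletion, word position swap) can be sketched in a few lines. The function names, probabilities, and seeding below are illustrative assumptions, not the cited methods' exact implementations:

```python
import random

def random_deletion(tokens, p=0.1, rng=None):
    """Drop each token independently with probability p (EDA-style deletion)."""
    rng = rng or random.Random(0)
    kept = [t for t in tokens if rng.random() > p]
    return kept or list(tokens)  # never return an empty sentence

def random_swap(tokens, n=1, rng=None):
    """Swap n randomly chosen pairs of token positions."""
    rng = rng or random.Random(0)
    out = list(tokens)
    for _ in range(n):
        i, j = rng.randrange(len(out)), rng.randrange(len(out))
        out[i], out[j] = out[j], out[i]
    return out
```

Both operations keep the augmented instance close to the original; generative approaches (auto-encoders, back-translation) instead synthesize entirely new surface forms.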
“…The backbone of our model is the transformer-based sequence-to-sequence model (Vaswani et al., 2017), and most hyper-parameters follow Cai et al. (2020). Specifically, the encoder and decoder each contain 6 layers.…”
Section: Implementation Details
Mentioning confidence: 99%
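A minimal sketch of such a setup using PyTorch's `nn.Transformer`; only the 6-layer encoder/decoder depth comes from the statement above, while the remaining dimensions are assumed defaults, not the cited configuration:

```python
import torch.nn as nn

# Hypothetical instantiation: a Transformer encoder-decoder with 6 layers
# on each side. d_model and nhead are assumptions for illustration.
model = nn.Transformer(
    d_model=512,
    nhead=8,
    num_encoder_layers=6,
    num_decoder_layers=6,
)
```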