2020
DOI: 10.48550/arxiv.2009.07543
Preprint

Group-wise Contrastive Learning for Neural Dialogue Generation

Abstract: Neural dialogue response generation has gained much popularity in recent years. The Maximum Likelihood Estimation (MLE) objective is widely adopted in existing dialogue model learning. However, models trained with the MLE objective are plagued by the low-diversity issue when it comes to the open-domain conversational setting. Inspired by the observation that humans not only learn from positive signals but also benefit from correcting behaviors of undesirable actions, in this work we introduce contrastive learning…
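The truncated abstract points at the core idea: instead of relying on MLE alone, the dialogue model is trained to assign higher likelihood than a reference model to preferred context-response pairs and lower likelihood to undesirable ones. Below is a minimal PyTorch-style sketch of one such group-wise contrastive objective; the function name, tensor layout, and loss form are hypothetical illustrations of the general idea, not the paper's exact formulation.

```python
import torch.nn.functional as F

def groupwise_contrastive_loss(target_logp, reference_logp, is_positive, tau=1.0):
    """Hypothetical sketch of a group-wise contrastive objective.

    target_logp:    (B,) log-likelihoods of context-response pairs under the trained model
    reference_logp: (B,) log-likelihoods of the same pairs under a frozen reference model
    is_positive:    (B,) bool mask, True for positive (human) pairs, False for sampled negatives
    """
    # Score each pair by how much more (or less) likely the target model finds it
    # compared to the frozen reference model.
    margin = (target_logp - reference_logp) / tau

    # Positives should become relatively more likely than under the reference,
    # negatives relatively less likely; logistic losses on the signed margin do both.
    pos_loss = F.softplus(-margin[is_positive]).mean() if is_positive.any() else margin.new_zeros(())
    neg_loss = F.softplus(margin[~is_positive]).mean() if (~is_positive).any() else margin.new_zeros(())
    return pos_loss + neg_loss
```

In practice the positive group would hold the observed response (and possibly well-matched retrievals), while the negative group holds sampled distractors, as discussed in the citation statements below.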

Cited by 8 publications (10 citation statements)
References 33 publications
“…We plan to cover the following: Contrastive Data Augmentation for NLP (Shen et al, 2020; Qu et al, 2021); Text Classification (Fang et al, 2020; Kachuee et al, 2020; Suresh and Ong, 2021; Du et al, 2021; Carlsson et al, 2021; Qiu et al, 2021; Klein and Nabi, 2021); Sentence Embeddings (Sedghamiz et al, 2021) including Quick-Thought (Logeswaran and Lee, 2018), Sentence-BERT (Reimers and Gurevych, 2019), Info-Sentence BERT (Zhang et al, 2020a), SimCSE (Gao et al, 2021b), DeCLUTR (Giorgi et al, 2020), ConSERT (Yan et al, 2021b), DialogueCSE (Liu et al, 2021a). We will also cover discourse analysis (Iter et al, 2020; Kiyomaru and Kurohashi, 2021); Information Extraction (Qin et al, 2020); Machine Translation (Pan et al, 2021; Vamvas and Sennrich, 2021); Question Answering (Karpukhin et al, 2020; You et al, 2021; Yue et al, 2021); Summarization (Duan et al, 2019) including faithfulness (Cao and Wang, 2021), summary evaluation (Wu et al, 2020a), multilingual summarization, and dialogue summarization; Text Generation (Chai et al, 2021; Lee et al, 2021b) including logic-consistent text generation (Shu et al, 2021), paraphrase generation (Yang et al, 2021a), grammatical error correction (Cao et al, 2021), dialogue generation (Cai et al, 2020), x-ray report generation (Liu et al, 2021b;…”
Section: Contrastive Learning for NLP (mentioning)
confidence: 99%
“…Learning with negative samples has been explored in many natural language tasks, such as dialogue generation (Cai et al, 2020), word embeddings (Mikolov et al, 2013), language modeling (Noji and Takamura, 2020), etc., and in computer vision tasks such as image captioning (Dai and Lin, 2017); approaches in question answering (Yeh and Chen, 2019) and image classification (Hjelm et al, 2018) try to decrease the mutual information between positive and negative samples.…”
Section: Related Work (mentioning)
confidence: 99%
“…In our case, the multi-turn dialogue data setup allows us to further utilize the context-response relationship, and conduct hard negative sampling by using context-response matching models. Following (Cai et al, 2020), we consider training a Multi-hop Selector Network (MSN) (Yuan et al, 2019), which provides matching scores between the context and response inputs. Specifically, we construct a dialogue dataset in which each context input c is paired with one positive response sample x and multiple randomly sampled distractor response samples x_j.…”
Section: Improved CL With Hard Negative Sampling (mentioning)
confidence: 99%
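The excerpt above describes hard negative sampling with a context-response matching model: each context keeps its gold response and several random distractors, and the matcher's scores decide which distractors count as hard negatives. A short sketch of that selection step follows, where `matching_score(context, response)` is a hypothetical stand-in for a trained selector such as MSN.

```python
import random

def sample_hard_negatives(context, gold_response, response_pool, matching_score,
                          num_candidates=20, num_hard=5):
    """Pick the distractors that a context-response matcher finds most plausible.

    matching_score(context, response) -> float is assumed to come from a trained
    context-response matching model (e.g. a multi-hop selector); higher scores
    mean the distractor looks more like a valid reply, i.e. a harder negative.
    """
    # Randomly sampled distractors, excluding the gold response itself.
    candidates = random.sample([r for r in response_pool if r != gold_response],
                               num_candidates)

    # Keep the candidates the matcher scores highest: they are fluent and
    # on-topic, so they make the most informative negatives for contrastive training.
    ranked = sorted(candidates, key=lambda r: matching_score(context, r), reverse=True)
    return ranked[:num_hard]
```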
“…For auto-regressive models, we consider DialoGPT (Zhang et al, 2019), a GPT-2 based model that is specifically designed for dialogue response generation. For dialogue response generation models that use contrastive learning, we include group-wise contrastive learning (GCL) (Cai et al, 2020), which conducts CL between the target dialogue model and a pretrained reference model. PLATO (Bao et al, 2019) is another model that uses a transformer-based architecture while including a discrete latent variable to tackle the one-to-many mapping problem.…”
Section: Baseline Models (mentioning)
confidence: 99%
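The last baseline in this excerpt addresses the one-to-many nature of dialogue with a discrete latent variable: the decoder is conditioned on one of K latent codes, so sampling different codes for the same context can produce different valid responses. A schematic sketch of that conditioning is below; the module and argument names are hypothetical and only illustrate the mechanism, not PLATO's actual architecture.

```python
import torch
import torch.nn as nn

class LatentConditionedDecoder(nn.Module):
    """Schematic: condition a response decoder on a discrete latent code z in {0..K-1}."""

    def __init__(self, decoder, hidden_size, num_latent=20):
        super().__init__()
        self.decoder = decoder                    # any decoder that consumes encoder states
        self.z_embedding = nn.Embedding(num_latent, hidden_size)
        self.num_latent = num_latent

    def forward(self, context_states, response_ids, z=None):
        # Sampling different z values at inference time lets one context map to
        # several plausible responses (the one-to-many behaviour).
        if z is None:
            z = torch.randint(0, self.num_latent, (context_states.size(0),),
                              device=context_states.device)
        z_emb = self.z_embedding(z).unsqueeze(1)  # (batch, 1, hidden)
        # Prepend the latent embedding to the encoded context before decoding.
        conditioned = torch.cat([z_emb, context_states], dim=1)
        return self.decoder(conditioned, response_ids)
```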