“…By reconstructing the original dialogues with discrete latent variable models, we can extract a structure representing the transitions among the variables. In this direction, prior work has explored Hidden Markov Models (Chotimongkol, 2008; Ritter et al., 2010; Zhai and Williams, 2014), Variational Auto-Encoders (VAEs) (Kingma and Welling, 2013), and their recurrent extension, Variational Recurrent Neural Networks (VRNNs) (Chung et al., 2015; Shi et al., 2019; Qiu et al., 2020). Building on the success of pre-trained language representations (McCann et al., 2017; Howard and Ruder, 2018; Peters et al., 2018; Devlin et al., 2019), the Transformer architecture (Vaswani et al., 2017) can be trained on generic corpora and adapted to specific downstream tasks. In dialogue systems, Wu et al. (2020) pre-trained the BERT model (Devlin et al., 2019) on task-oriented dialogues for intent recognition, dialogue state tracking, dialogue act prediction, and response selection.…”
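To make the first idea concrete, below is a minimal sketch of extracting a transition structure over discrete dialogue states. It substitutes k-means discretization of turn embeddings for the learned latent variables of the cited HMM/VRNN models, so it illustrates the general recipe rather than reproducing any of those methods; the function name, the toy data, and the use of scikit-learn are assumptions made for this example.

```python
# Sketch: assign each dialogue turn a discrete state, then read off a
# transition structure by counting state-to-state transitions. K-means over
# turn embeddings stands in for the learned discrete latent variables of an
# HMM or VRNN (an assumption for illustration, not the cited methods).
import numpy as np
from sklearn.cluster import KMeans

def extract_transition_structure(dialogues, n_states=5, seed=0):
    """dialogues: list of dialogues, each an array of turn embeddings with
    shape (n_turns, embed_dim). Returns (state labels, transition matrix)."""
    all_turns = np.vstack(dialogues)
    # Discretize every turn into one of n_states latent states.
    km = KMeans(n_clusters=n_states, n_init=10, random_state=seed)
    labels = km.fit_predict(all_turns)

    # Count transitions between consecutive turns within each dialogue.
    counts = np.zeros((n_states, n_states))
    offset = 0
    for d in dialogues:
        seq = labels[offset:offset + len(d)]
        for s, t in zip(seq[:-1], seq[1:]):
            counts[s, t] += 1
        offset += len(d)

    # Row-normalize counts into transition probabilities.
    transitions = counts / counts.sum(axis=1, keepdims=True).clip(min=1)
    return labels, transitions

# Toy usage with random vectors in place of real turn encodings.
rng = np.random.default_rng(0)
toy_dialogues = [rng.normal(size=(8, 16)) for _ in range(20)]
_, T = extract_transition_structure(toy_dialogues, n_states=4)
print(np.round(T, 2))  # each row sums to ~1: a dialogue-state transition graph
```

In the cited work the discrete states come from a generative model trained to reconstruct the dialogues; only the final counting step, which turns state sequences into a transition graph, is shared with this simplified sketch.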