Proceedings of the 2020 Conference on Empirical Methods in Natural Language Processing (EMNLP)
DOI: 10.18653/v1/2020.emnlp-main.512
Online Conversation Disentanglement with Pointer Networks

Abstract: Huge amounts of textual conversations occur online every day, where multiple conversations take place concurrently. Interleaved conversations lead to difficulties in not only following the ongoing discussions but also extracting relevant information from simultaneous messages. Conversation disentanglement aims to separate intermingled messages into detached conversations. However, existing disentanglement methods rely mostly on handcrafted features that are dataset specific, which hinders generalization and ad…
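The abstract is truncated, but the pointer-network idea it names can be sketched minimally: each incoming message is scored against all earlier messages and "points" to its most likely reply-to parent. The dot-product scorer below is an illustrative stand-in for the learned attention scores, not the paper's model.

```python
import numpy as np

def point_to_parent(msg_vec, history_vecs):
    """Score an incoming message against all earlier messages and
    return the index of the most likely 'reply-to' parent.
    A real pointer network learns these scores; dot-product
    similarity is used here purely as a stand-in."""
    scores = np.array([h @ msg_vec for h in history_vecs])
    probs = np.exp(scores - scores.max())
    probs /= probs.sum()                 # softmax over candidate parents
    return int(np.argmax(probs)), probs

# Toy example: the new message is closest to history message 0.
history = [np.array([1.0, 0.0]), np.array([0.0, 1.0])]
parent, probs = point_to_parent(np.array([0.9, 0.1]), history)
```

Selecting a parent per message yields a forest of reply-to links, from which the detached conversations can be read off.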

Cited by 25 publications (25 citation statements) · References 27 publications (39 reference statements)
“…This is particularly interesting, especially when compared with models employing contextual embeddings. For the cluster scores, the best model is the pointer network of Yu and Joty (2020a), which is nonetheless within 0.5% of the best contextual model and within 2.5% of our model. The difference arises mainly from recall and corresponds to an absolute difference of fewer than 10 true-positive clusters on the test set.…”
Section: Results Discussion (mentioning)
confidence: 64%
“…The two-step methods (Elsner and Charniak, 2008, 2010, 2011; Chen et al., 2017; Jiang et al., 2018; Kummerfeld et al., 2019) first retrieve the relations between pairs of messages, e.g., "reply-to" relations (Guo et al., 2018), and then apply a clustering algorithm to construct individual sessions. The end-to-end models (Tan et al., 2019; Yu and Joty, 2020), instead, perform disentanglement in an end-to-end manner, where the context of the detached sessions is exploited to classify a message into a session. End-to-end models tend to outperform two-step models, but both often need large amounts of annotated data to be fully trained, which is expensive to obtain and thus motivates the demand for unsupervised algorithms.…”
Section: Related Work (mentioning)
confidence: 99%
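The two-step pipeline described above can be sketched as follows. The word-overlap scorer, the threshold, and the union-find grouping are illustrative assumptions, not the cited systems:

```python
def disentangle_two_step(messages, score_pair, threshold=0.5):
    """Two-step disentanglement sketch: (1) link each message to its
    best-scoring earlier message if the score clears a threshold,
    (2) group linked messages into sessions via union-find."""
    parent = list(range(len(messages)))          # union-find forest

    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]        # path halving
            i = parent[i]
        return i

    # Step 1: retrieve pairwise "reply-to" relations.
    for j in range(1, len(messages)):
        cands = [(score_pair(messages[i], messages[j]), i) for i in range(j)]
        best_score, best_i = max(cands)
        # Step 2 (union): a message below threshold starts a new session.
        if best_score >= threshold:
            parent[find(j)] = find(best_i)

    return [find(i) for i in range(len(messages))]

# Toy scorer: two messages "reply" when they share words.
def overlap(a, b):
    return len(set(a.split()) & set(b.split())) / max(len(set(b.split())), 1)

sessions = disentangle_two_step(
    ["hi bob", "server down again", "hi alice", "rebooted the server"],
    overlap, threshold=0.3)
```

Here the greetings and the server discussion fall into two separate sessions; an end-to-end model would instead score each message against whole session contexts rather than individual messages.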
“…In the two-step methods (Elsner and Charniak, 2008, 2011; Jiang et al., 2018), a model first retrieves the "local" relations between pairs of messages using either feature-engineering approaches or deep learning methods, and then a clustering algorithm divides the entire conversation into separate sessions based on these message-pair relations. In contrast, end-to-end methods (Tan et al., 2019; Yu and Joty, 2020) capture the "global" information contained in the context of the detached sessions and compute the matching degree between a session and a message in an end-to-end manner.…”
Section: Introduction (mentioning)
confidence: 99%
“…All previous work (Shen et al., 2006; Elsner and Charniak, 2008; Wang and Oard, 2009; Elsner and Charniak, 2011; Jiang et al., 2018; Kummerfeld et al., 2018; Yu and Joty, 2020) treats the task as a sequence of multiple-choice problems, each consisting of a sliding window of n utterances.…”
Section: Related Work (mentioning)
confidence: 99%
“…It would be ideal if we could design an algorithm to automatically organize an entangled conversation into its constituent threads. This is referred to as the task of dialogue disentanglement (Shen et al., 2006; Elsner and Charniak, 2008; Wang and Oard, 2009; Elsner and Charniak, 2011; Jiang et al., 2018; Kummerfeld et al., 2018; Yu and Joty, 2020). Training data for the dialogue disentanglement task is difficult to acquire due to the need for manual annotation. Typically, the data is annotated in the reply-to links format, i.e.…”
Section: Introduction (mentioning)
confidence: 99%
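The snippet above mentions data annotated as reply-to links. A small sketch of recovering thread assignments from such links, treating linked messages as connected components (the function name and data layout are assumptions for illustration):

```python
from collections import defaultdict

def threads_from_reply_links(n_messages, links):
    """Turn annotated reply-to links (child, parent) message-index
    pairs into thread assignments: messages connected by links
    share a thread id."""
    graph = defaultdict(list)
    for child, parent in links:
        graph[child].append(parent)
        graph[parent].append(child)

    thread_of = {}
    next_id = 0
    for start in range(n_messages):
        if start in thread_of:
            continue
        stack = [start]                # flood-fill one connected component
        thread_of[start] = next_id
        while stack:
            node = stack.pop()
            for nb in graph[node]:
                if nb not in thread_of:
                    thread_of[nb] = next_id
                    stack.append(nb)
        next_id += 1
    return [thread_of[i] for i in range(n_messages)]

# Messages 0-1-3 form one thread; 2-4 another.
threads = threads_from_reply_links(5, [(1, 0), (3, 1), (4, 2)])
```

This conversion is also what cluster-level evaluation metrics operate on: gold reply-to links and predicted links are each grouped into threads before comparison.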