Proceedings of the 2017 Conference on Empirical Methods in Natural Language Processing 2017
DOI: 10.18653/v1/d17-1139

Learning to Rank Semantic Coherence for Topic Segmentation

Abstract: Topic segmentation plays an important role for discourse parsing and information retrieval. Due to the absence of training data, previous work mainly adopts unsupervised methods to rank semantic coherence between paragraphs for topic segmentation. In this paper, we present an intuitive and simple idea to automatically create a "quasi" training dataset, which includes a large amount of text pairs from the same or different documents with different semantic coherence. With the training corpus, we design a symmet…
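
The "quasi" training set described in the abstract can be illustrated with a small sketch. It pairs paragraphs drawn from the same document (assumed here to be more coherent, label 1) with paragraphs drawn from different documents (assumed less coherent, label 0). The function name `make_quasi_pairs`, the binary labels, and the sampling scheme are illustrative assumptions, not the authors' published procedure.

```python
import random

def make_quasi_pairs(documents, pairs_per_doc=2, seed=0):
    """Build (text_a, text_b, label) triples from a list of documents.

    `documents` is a list of documents, each a list of paragraph strings.
    Label 1 marks a pair taken from the same document, label 0 a pair
    taken from two different documents. This labelling is an assumption
    made for illustration only.
    """
    rng = random.Random(seed)
    pairs = []
    for i, doc in enumerate(documents):
        # Indices of the other non-empty documents, used for negative pairs.
        others = [j for j in range(len(documents)) if j != i and documents[j]]
        for _ in range(pairs_per_doc):
            if len(doc) >= 2:
                a, b = rng.sample(doc, 2)  # two distinct paragraphs, same document
                pairs.append((a, b, 1))
            if doc and others:
                a = rng.choice(doc)
                b = rng.choice(documents[rng.choice(others)])
                pairs.append((a, b, 0))
    return pairs

docs = [["a1", "a2", "a3"], ["b1", "b2"]]
pairs = make_quasi_pairs(docs)
```

A ranking model could then be trained to score label-1 pairs above label-0 pairs; the model itself is outside the scope of this sketch.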

Cited by 12 publications (19 citation statements)
References 15 publications
“…For automatic text segmentation, multitude of approaches such as lexical overlap, bayesian learning or dynamic programming (Hearst, 1997;Choi, 2000;Utiyama and Isahara, 2001;Eisenstein and Barzilay, 2008;Du et al, 2013) have been proposed. The recent works rely on neural network models to learn different aspects of text segmentation such as coherence and cohesion (Wang et al, 2017;Sehikh et al, 2017;Bahdanau et al, 2016a;Arnold et al, 2019).…”
Section: Related Work (mentioning)
confidence: 99%
“…Recently, proposed SegBot, a bidirectional RNN coupled with a pointer network that addresses both topic segmentation and EDU. Also, LSTM or CNN based approaches have been proposed, for instance through bidirectional layers (Sheikh et al, 2017), sentence embedding-based with four layers bidirectional LSTM (Koshorek et al, 2018) or through two symmetric CNN (Wang et al, 2017), etc. Finally, Arnold et al (2019) proposed Sector, the first LSTM-based architecture that combines topical (latent semantic content) and structural information (segmentation) as a mutual task.…”
Section: Related Work (mentioning)
confidence: 99%
“…On the contrary, if their similarity is below a certain threshold, a shift is determined (Hearst, 1997;Riedl and Biemann, 2012). When sufficient topically annotated training data are available, deep neural approaches based on CNN (Wang et al, 2017) or LSTM (Koshorek et al, 2018) can be efficiently applied Arnold et al, 2019). Until now, text segmentation methods have exclusively addressed data sets lying within the scope of narrative and expository texts or user dialogues texts and sometimes artificially generated data (Choi, 2000;Jeong and Titov, 2010;Glavaš et al, 2016;Koshorek et al, 2018).…”
Section: Introduction (mentioning)
confidence: 99%
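
The threshold rule mentioned in the statement above (a topic shift is declared when the similarity of adjacent text units falls below a threshold) can be sketched as follows. The bag-of-words cosine similarity, the function names, and the default threshold are assumptions chosen for illustration; the cited works use their own similarity measures and smoothing steps.

```python
import math
from collections import Counter

def cosine(a, b):
    """Cosine similarity between two word-count Counters."""
    dot = sum(a[w] * b[w] for w in a if w in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0

def find_shifts(paragraphs, threshold=0.1):
    """Return indices i such that a topic shift is placed before paragraph i.

    A shift is declared when the similarity between paragraph i-1 and
    paragraph i is below `threshold` (an illustrative default).
    """
    bags = [Counter(p.lower().split()) for p in paragraphs]
    return [
        i + 1
        for i in range(len(bags) - 1)
        if cosine(bags[i], bags[i + 1]) < threshold
    ]

paragraphs = [
    "cats chase mice",
    "mice fear cats",
    "stocks fell sharply",
    "stocks rose sharply",
]
shifts = find_shifts(paragraphs)  # the only shift falls before paragraph 2
```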
“…In another line of research, Wang et al (2017) combined learning to rank and a convolutional neural network to learn a coherence function between text pairs; higher-ranked pairs are likely to be segments. Despite a promising approach, state-of-the-art results were not achieved.…”
Section: Related Work (mentioning)
confidence: 99%