Topic segmentation in ASR transcripts using bidirectional RNNs for change detection

Sehikh, Imran; Fohr, Dominique; Illina, Irina

doi:10.1109/asru.2017.8268979

Cited by 26 publications

(30 citation statements)

References 24 publications

“…More recently, Sehikh et al (2017) utilized long short-term memory (LSTM) networks and showed that cohesion between bidirectional layers can be leveraged to predict topic changes. In contrast to our method, the authors focused on segmenting speech recognition transcripts on word level without explicit topic labels.…”

Section: Related Work (mentioning)

confidence: 99%

SECTOR: A Neural Model for Coherent Topic Segmentation and Classification

Arnold

Schneider

Cudré-Mauroux

et al. 2019

Transactions of the Association for Computational Linguistics

When searching for information, a human reader first glances over a document, spots relevant sections and then focuses on a few sentences for resolving her intention. However, the high variance of document structure complicates the identification of the salient topic of a given section at a glance. To tackle this challenge, we present SECTOR, a model to support machine reading systems by segmenting documents into coherent sections and assigning topic labels to each section. Our deep neural network architecture learns a latent topic embedding over the course of a document. This can be leveraged to classify local topics from plain text and segment a document at topic shifts. In addition, we contribute WikiSection, a publicly available dataset with 242k labeled sections in English and German from two distinct domains: diseases and cities. From our extensive evaluation of 20 architectures, we report a highest score of 71.6% F1 for the segmentation and classification of 30 topics from the English city domain, scored by our SECTOR LSTM model with Bloom filter embeddings and bidirectional segmentation. This is a significant improvement of 29.5 points F1 compared to state-of-the-art CNN classifiers with baseline segmentation. Our source code is available under the Apache License 2.0 at https

“…These context-dependent models assume that utterances with a similar semantic distribution share the same topic. More recent methods leverage the deep architectures, such as recurrent neural networks (RNNs) (Sehikh et al, 2017) and convolutional neural networks (CNNs) (Wang et al, 2016) to semantically encode the utterance into a vector space. Treating the topic segmentation as a sequence labeling problem, labels (i.e., topics) are then assigned to every utterance.…”

Section: Topic Segmentation (mentioning)

confidence: 99%

Topic-Based Measures of Conversation for Detecting Mild Cognitive Impairment

Chen

Dodge

Asgari

2020

Proceedings of the First Workshop on Natural Language Processing for Medical Conversations

Conversation is a complex cognitive task that engages multiple aspects of cognitive functions to remember the discussed topics, monitor the semantic and linguistic elements, and recognize others' emotions. In this paper, we propose a computational method based on the lexical coherence of consecutive utterances to quantify topical variations in semistructured conversations of older adults with cognitive impairments. Extracting the lexical knowledge of conversational utterances, our method generates a set of novel conversational measures that indicate underlying cognitive deficits among subjects with mild cognitive impairment (MCI). Our preliminary results verify the utility of the proposed conversation-based measures in distinguishing MCI from healthy controls.

“…Recently, proposed SegBot, a bidirectional RNN coupled with a pointer network that addresses both topic segmentation and EDU. Also, LSTM or CNN based approaches have been proposed, for instance through bidirectional layers (Sheikh et al, 2017), sentence embedding-based with four layers bidirectional LSTM (Koshorek et al, 2018) or through two symmetric CNN (Wang et al, 2017), etc. Finally, Arnold et al (2019) proposed Sector, the first LSTM-based architecture that combines topical (latent semantic content) and structural information (segmentation) as a mutual task.…”

Section: Related Work (mentioning)

confidence: 99%

Hierarchical Text Segmentation for Medieval Manuscripts

Hazem

Daille

Stutzmann

et al. 2020

Proceedings of the 28th International Conference on Computational Linguistics

In this paper, we address the segmentation of books of hours, Latin devotional manuscripts of the late Middle Ages, that exhibit challenging issues: a complex hierarchical entangled structure, variable content, noisy transcriptions with no sentence markers, and strong correlations between sections for which topical information is no longer sufficient to draw segmentation boundaries. We show that the main state-of-the-art segmentation methods are either inefficient or inapplicable for books of hours and propose a bottom-up greedy approach that considerably enhances the segmentation results. We stress the importance of such hierarchical segmentation of books of hours for historians to explore their overarching differences underlying conception about Church.
