1997 IEEE Workshop on Automatic Speech Recognition and Understanding Proceedings
DOI: 10.1109/asru.1997.658992

Automatic detection of discourse structure for speech recognition and understanding

Authors: …, Carol Van Ess-Dykema (DoD)

Abstract: We describe a new approach for statistical modeling and detection of discourse structure for natural conversational speech. Our model is based on 42 'Dialog Acts' (DAs), such as question, answer, backchannel, agreement, disagreement, and apology. We labeled 1155 conversations from the Switchboard (SWBD) database (Godfrey et al. 1992) of human-to-human telephone conversations with these 42 types and trained a Dialog Act detector based on three distinct knowledge sources: sequences of wor…

Citations: cited by 85 publications (82 citation statements)
References: 15 publications
“…Words Inflected Form The word form is used as a baseline lexical feature in most modern lexicalized natural language processing approaches [11,44,32,33]. In our case, sentence segmentation is known but capitalization of the first word of the sentence is removed, which decreases the total number of features in our model without impacting accuracy, thanks to the insertion of a special "start-of-utterance" word.…”
Section: Baseline Features (mentioning)
confidence: 99%
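
The preprocessing described in this statement can be sketched roughly as follows. This is a minimal illustration, not the citing authors' code: the function name `utterance_tokens`, the whitespace tokenization, and the marker string `<start>` are all assumptions made for the example.

```python
# Hypothetical start-of-utterance marker; the citing paper does not name it.
START = "<start>"


def utterance_tokens(utterance: str) -> list[str]:
    """Return word-form features for one pre-segmented utterance.

    Only the first word is lowercased, so later capitalized words
    (for example proper nouns) keep their original form.
    """
    tokens = utterance.split()  # assumption: whitespace tokenization
    if tokens:
        tokens[0] = tokens[0].lower()
    # The marker preserves the "this word opened the utterance" signal
    # that the dropped capitalization used to carry.
    return [START] + tokens


print(utterance_tokens("Do you like Paris"))
```

Lowercasing the first word merges "Do" and "do" into one feature, while the marker keeps utterance-initial position available to the model.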
“…Some cue words and phrases can also serve as explicit indicators of dialogue structure [43]. For example, 88.4% of the trigrams "<start> do you" occur in English in yes/no questions [44].…”
Section: Related Work (mentioning)
confidence: 99%
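
A statistic like the one quoted above (the share of a cue trigram's occurrences that fall in one dialog act) could be computed along these lines. The sketch below uses a tiny invented corpus, so its numbers have nothing to do with the 88.4% figure; the function name, label strings, and `<start>` marker are assumptions.

```python
from collections import Counter


def trigram_label_shares(utterances, trigram):
    """For one trigram, return the fraction of its occurrences per label.

    `utterances` is an iterable of (tokens, label) pairs. A "<start>"
    marker is prepended so utterance-initial trigrams can be matched.
    """
    counts = Counter()
    for tokens, label in utterances:
        padded = ["<start>"] + [t.lower() for t in tokens]
        for i in range(len(padded) - 2):
            if tuple(padded[i:i + 3]) == trigram:
                counts[label] += 1
    total = sum(counts.values())
    if total == 0:
        return {}
    return {label: n / total for label, n in counts.items()}


# Invented toy data, for illustration only.
toy = [
    (["Do", "you", "like", "jazz"], "yes-no-question"),
    (["Do", "you", "have", "kids"], "yes-no-question"),
    (["Do", "you", "know"], "statement"),
    (["I", "do"], "answer"),
]
shares = trigram_label_shares(toy, ("<start>", "do", "you"))
print(shares)
```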
“…2 Related work Jurafsky et al (1997a) and Reithinger and Klesen (1997) used n-gram language modeling on the Switchboard and Verbmobil corpora respectively to classify dialog acts. Grau et al (2004) uses a Bayesian approach with n-grams to categorize dialog acts.…”
Section: Introduction (mentioning)
confidence: 99%
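
The general idea of classifying dialog acts with n-gram language models can be sketched as one bigram model per dialog act, with the utterance assigned to the act whose model scores it highest. This is a generic illustration under stated assumptions (bigrams, add-one smoothing, a label prior, invented training data); it is not the specific method of any of the cited papers.

```python
import math
from collections import Counter, defaultdict


class BigramDAClassifier:
    """One add-one-smoothed bigram model per dialog act label (a sketch)."""

    def __init__(self):
        self.bigrams = defaultdict(Counter)   # label -> (w1, w2) counts
        self.contexts = defaultdict(Counter)  # label -> w1 counts
        self.label_counts = Counter()
        self.vocab = set()

    @staticmethod
    def _pad(tokens):
        return ["<start>"] + [t.lower() for t in tokens] + ["<end>"]

    def fit(self, data):
        for tokens, label in data:
            padded = self._pad(tokens)
            self.label_counts[label] += 1
            self.vocab.update(padded)
            for a, b in zip(padded, padded[1:]):
                self.bigrams[label][(a, b)] += 1
                self.contexts[label][a] += 1
        return self

    def log_score(self, tokens, label):
        padded = self._pad(tokens)
        v = len(self.vocab) + 1  # one extra slot for unseen words
        total = sum(self.label_counts.values())
        score = math.log(self.label_counts[label] / total)  # label prior
        for a, b in zip(padded, padded[1:]):
            num = self.bigrams[label][(a, b)] + 1  # add-one smoothing
            den = self.contexts[label][a] + v
            score += math.log(num / den)
        return score

    def predict(self, tokens):
        return max(self.label_counts,
                   key=lambda lab: self.log_score(tokens, lab))


# Invented toy training data, for illustration only.
train = [
    (["do", "you", "like", "jazz"], "question"),
    (["do", "you", "have", "kids"], "question"),
    (["are", "you", "sure"], "question"),
    (["i", "like", "jazz"], "statement"),
    (["we", "have", "two", "kids"], "statement"),
    (["i", "am", "sure"], "statement"),
]
clf = BigramDAClassifier().fit(train)
print(clf.predict(["do", "you", "like", "kids"]))
```

Scoring in log space avoids underflow on longer utterances, and the smoothing keeps unseen bigrams from zeroing out a label entirely.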
“…Early approaches start with using the language models [15,16], and also include the use of generative models such as the source-channel model [17], hidden Markov models (HMM) [18,19,20], and the hidden vector state model [21]. Even though discriminative models do not model the joint distribution of features and labels, it is known that they often outperform generative models in classification tasks, since they relax the independence assumption, and enable arbitrary features to be included in the model.…”
Section: Introduction (mentioning)
confidence: 99%