2011 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP) 2011
DOI: 10.1109/icassp.2011.5947628

Simultaneous dialog act segmentation and classification from human-human spoken conversations

Abstract: Accurate identification of dialog acts (DAs), which represent the illocutionary aspect of communication, is essential to support the understanding of human conversations. This requires 1) the segmentation of human-human dialogs into turns, 2) the intra-turn segmentation into DA boundaries, and 3) the classification of each segment according to a DA tag. This process is particularly challenging when both segmentation and tagging are automated and utterance hypotheses derive from the erroneous output of automatic speech recognition (ASR). In …

Cited by 31 publications (28 citation statements); References 5 publications
“…Also note that in some scenarios, for example, speech conversations where transcripts are from speech recognition systems, DA segmentation is also needed. This problem has been addressed in some previous work, for example, (Lendvai, 2007; Quarteroni et al., 2011; Ang et al., 2005), which often uses a classification or sequence labeling setup for the segmentation task, or performs joint DA segmentation and classification. We use pre-segmented utterances and focus just on the DA classification task in this work.…”
Section: Related Work
confidence: 99%
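In the sequence-labeling view of DA segmentation described in the excerpt above, each token receives a boundary label and segments are recovered by grouping consecutive tokens. A minimal sketch of that decoding step, assuming a toy B/I labeling scheme (the function name and label set are illustrative, not from the paper):

```python
def bio_to_segments(tokens, labels):
    """Group tokens into DA segments from boundary labels.

    'B' marks the first token of a new segment, 'I' a continuation;
    this mirrors the sequence-labeling setup for DA segmentation.
    """
    segments = []
    for tok, lab in zip(tokens, labels):
        if lab == "B" or not segments:
            segments.append([tok])      # open a new segment
        else:
            segments[-1].append(tok)    # extend the current one
    return [" ".join(seg) for seg in segments]
```

In practice the B/I labels would come from a trained classifier or sequence model over lexical and prosodic features rather than being given.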
“…(Ji and Bilmes, 2005; Dielmann and Renals, 2008) used DBN for sequence decoding and examined both the generative and the conditional modeling approaches. CRF, as a powerful sequence labeling method, has also been widely used to incorporate context information for DA classification (Kim et al., 2010; Quarteroni et al., 2011; Chen and Eugenio, 2013; Dielmann and Renals, 2008). It is worth noting that (Ribeiro et al., 2015) used different configurations to capture information from previous context in the SVM classifiers, such as n-grams or DA predictions.…”
Section: Related Work
confidence: 99%
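Sequence models such as the CRFs and HMMs cited above incorporate the previous label as context through transition scores and are decoded with the Viterbi algorithm. A minimal pure-Python Viterbi sketch in the log domain (the tag set and score tables in the usage example are illustrative assumptions, not from any of the cited papers):

```python
def viterbi(obs_scores, trans_scores, tags):
    """First-order Viterbi decoding over log-domain scores.

    obs_scores:   per-position dict tag -> log score (a CRF feature sum
                  or an HMM log emission probability).
    trans_scores: dict (prev_tag, tag) -> log transition score.
    Returns the highest-scoring tag sequence.
    """
    best = {t: obs_scores[0][t] for t in tags}
    backptrs = []
    for scores in obs_scores[1:]:
        new_best, ptrs = {}, {}
        for t in tags:
            # best predecessor for tag t, combining path and transition scores
            prev = max(tags, key=lambda p: best[p] + trans_scores[(p, t)])
            new_best[t] = best[prev] + trans_scores[(prev, t)] + scores[t]
            ptrs[t] = prev
        best = new_best
        backptrs.append(ptrs)
    last = max(tags, key=lambda t: best[t])
    path = [last]
    for ptrs in reversed(backptrs):   # follow back-pointers
        path.append(ptrs[path[-1]])
    return path[::-1]
```

A toy usage, with two DA tags and transition scores that reward a question being followed by an answer:

```python
tags = ["Q", "A"]
obs = [{"Q": 1.0, "A": 0.0}, {"Q": 0.0, "A": 0.5}]
trans = {("Q", "Q"): -1.0, ("Q", "A"): 0.5, ("A", "Q"): 0.0, ("A", "A"): 0.0}
viterbi(obs, trans, tags)  # -> ["Q", "A"]
```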
“…In this section we present a set of baseline models: a generative model based on Hidden Markov Models (HMM) with an n-gram language model over DA sequences (a model similar to that proposed in [10]); a discriminative model based on Conditional Random Fields (CRF) [65], such as that used in [66]; and a sequential model that uses CRF for segmentation and Support Vector Machines (SVM) for DA assignment, i.e., one model is applied for segmentation and then another for DA assignment on the resulting segments, which is a usual option [71]. The performance of the NGT model for annotation in the unsegmented case will be compared against these models.…”
Section: Baseline System
confidence: 99%
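The sequential baseline in the excerpt above first segments the input and then tags each resulting segment independently, so the two stages can be developed and swapped separately. A toy sketch of that cascade, where the rule-based segmenter and classifier are hypothetical stand-ins for the CRF and SVM:

```python
def cascade(tokens, segmenter, classifier):
    """Sequential pipeline: segment first, then classify each segment
    independently (segmenter/classifier stand in for e.g. CRF and SVM)."""
    return [(seg, classifier(seg)) for seg in segmenter(tokens)]

def rule_segmenter(tokens):
    # Toy stand-in: split after sentence-final punctuation tokens.
    segments, current = [], []
    for tok in tokens:
        current.append(tok)
        if tok in {".", "?", "!"}:
            segments.append(current)
            current = []
    if current:
        segments.append(current)
    return segments

def rule_classifier(segment):
    # Toy stand-in for a trained DA classifier.
    return "question" if segment[-1] == "?" else "statement"
```

For example, `cascade(["are", "you", "sure", "?", "yes"], rule_segmenter, rule_classifier)` yields one question segment and one statement segment; a joint model would instead score segmentation and tagging decisions together, which is the contrast drawn in the surrounding excerpts.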
“…They use word sequence and pause duration as features. The authors of [30] exploit a Switching Dynamic Bayesian Network for segmentation, cascaded with a Conditional Random Field for dialogue act classification, while [31] jointly segments and tags with a single model.…”
Section: Related Work
confidence: 99%
“…These include Hidden Markov Models [11], Bayesian Networks [32], Discriminative Dynamic Bayesian Networks [33], BayesNet [28], Memory-based [34] and Transformation-based Learning [35], Decision Trees [36], Neural Networks [37], but also more advanced approaches such as Boosting [38], Latent Semantic Analysis [39], Hidden Backoff Models [40], Maximum Entropy Models [41], Conditional Random Fields [31,30] and Triangular-chain CRF [42].…”
Section: Related Work
confidence: 99%