2018 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)
DOI: 10.1109/icassp.2018.8461371

Lexico-Acoustic Neural-Based Models for Dialog Act Classification

Abstract: Recent works have proposed neural models for dialog act classification in spoken dialogs. However, they have not explored the role and the usefulness of acoustic information. We propose a neural model that processes both lexical and acoustic features for classification. Our results on two benchmark datasets reveal that acoustic features are helpful in improving the overall accuracy. Finally, a deeper analysis shows that acoustic features are valuable in three cases: when a dialog act has sufficient data, when …
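
The abstract describes a model that combines lexical and acoustic features for dialog act classification. Below is a minimal sketch of such a late-fusion design, assuming PyTorch; the CNN lexical encoder, the layer sizes, and the concatenation-based fusion are illustrative assumptions, not the paper's reported architecture.

```python
# Minimal sketch of a lexico-acoustic dialog-act classifier (PyTorch).
# The specific layers and sizes are assumptions for illustration only.
import torch
import torch.nn as nn


class LexicoAcousticClassifier(nn.Module):
    def __init__(self, vocab_size, num_classes, emb_dim=100,
                 acoustic_dim=13, hidden_dim=128):
        super().__init__()
        # Lexical branch: word embeddings, a 1-D convolution over the
        # utterance, and max-pooling into a fixed-size vector.
        self.embedding = nn.Embedding(vocab_size, emb_dim, padding_idx=0)
        self.conv = nn.Conv1d(emb_dim, hidden_dim, kernel_size=3, padding=1)
        # Acoustic branch: a small feed-forward layer over per-utterance
        # acoustic features (e.g., pooled MFCCs).
        self.acoustic_fc = nn.Linear(acoustic_dim, hidden_dim)
        # Fusion: concatenate both representations, then classify.
        self.classifier = nn.Linear(2 * hidden_dim, num_classes)

    def forward(self, token_ids, acoustic_feats):
        # token_ids: (batch, seq_len); acoustic_feats: (batch, acoustic_dim)
        emb = self.embedding(token_ids).transpose(1, 2)      # (batch, emb_dim, seq_len)
        lex = torch.relu(self.conv(emb)).max(dim=2).values   # (batch, hidden_dim)
        aco = torch.relu(self.acoustic_fc(acoustic_feats))   # (batch, hidden_dim)
        return self.classifier(torch.cat([lex, aco], dim=1)) # logits over DA labels
```

As a usage note, `token_ids` would hold padded word indices for one utterance and `acoustic_feats` a fixed-size vector such as frame-averaged MFCCs; both names are hypothetical. A common alternative to concatenation is to train the two branches separately and combine their scores; the sketch keeps the simplest fusion for clarity.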

Cited by 17 publications (20 citation statements)
References 18 publications
“…One of the main differences between MTs and ATs is that the latter has no punctuation. In [7], it was shown that punctuation provides strong lexical cues. Therefore, we retrained the model on MRDA's MTs without punctuation.…”
Section: Experiments On Automatic Transcriptions (mentioning)
confidence: 99%
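
The retraining step described in the excerpt above (removing punctuation from MRDA manual transcripts so they better match automatic transcriptions) amounts to a simple preprocessing pass. A minimal sketch, assuming utterances are plain strings; the helper name and the lowercasing choice are illustrative assumptions.

```python
# Sketch of the punctuation-stripping preprocessing mentioned above,
# assuming utterances are plain text strings. The helper name and the
# decision to lowercase are illustrative assumptions.
import string

_PUNCT_TABLE = str.maketrans("", "", string.punctuation)

def strip_punctuation(utterance: str) -> str:
    """Remove punctuation so manual transcripts resemble ASR output."""
    return utterance.translate(_PUNCT_TABLE).lower()

print(strip_punctuation("Okay, so... shall we start the meeting?"))
# -> "okay so shall we start the meeting"
```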
“…Automatic DA classification is a crucial preprocessing step for language understanding and dialog systems. This task has been approached using traditional statistical algorithms, for instance hidden Markov models (HMMs) [3] and conditional random fields (CRFs) [4], and more recently with deep learning (DL) models such as convolutional neural networks (CNNs) [5], recurrent neural networks (RNNs) [6,7], and attention mechanisms (AMs) [8,7], which achieve state-of-the-art results.…”
Section: Introduction (mentioning)
confidence: 99%
“…Some authors found that considering the context explicitly in RNN models helps dialog act classification (Ortega and Vu, 2017; Liu et al., 2017a; Kumar et al., 2018; Raheja and Tetreault, 2019; Dai et al., 2020). It has also been shown that incorporating acoustic/prosodic features helps to some extent (Ortega and Vu, 2018; Si et al., 2020). Colombo et al. (2020) report the best result to date for SWDA classification: an accuracy of 85%, obtained by a sequence-to-sequence (seq2seq) GRU model with guided attention.…”
Section: Dialog Act Classification (mentioning)
confidence: 99%
“…The Switchboard annotators originally used the DAMSL labeling scheme (Core and Allen, 1997) with 220 dialog acts and clustered them after annotation into a reduced label set. There seems to be no consensus on the reduced label set size: some of the studies using a 42-label set are Quarteroni et al. (2011); Liu et al. (2017a); Ortega and Vu (2018); Kumar et al. (2018), while others use a 43-label set (Ortega and Vu, 2017; Raheja and Tetreault, 2019; Zhao and Kawahara, 2019; …”
Section: Switchboard Dialog Act (mentioning)
confidence: 99%
“…Tang et al. worked on question detection using 65 low-level descriptors (LLDs) with an RNN-based model [4]. Ortega and Vu classified dialog acts by combining lexical features with 13-dimensional Mel-frequency cepstral coefficients (MFCCs), and their results showed that acoustic features are helpful for recognizing questions [5]. Arsikere et al. proposed a number of new statistical acoustic features for dialog act classification [6].…”
Section: Introduction (mentioning)
confidence: 99%
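
The excerpt above mentions combining lexical features with 13-dimensional MFCCs. Below is a minimal sketch of extracting such a per-utterance acoustic vector, assuming librosa is available; mean-pooling the frame-level MFCCs into a single vector is an illustrative choice, and frame-level features could instead be fed to a sequence model.

```python
# Sketch of extracting 13-dimensional MFCC features for one utterance,
# assuming librosa. Mean-pooling frames into a single utterance-level
# vector is an illustrative assumption, not necessarily what the cited
# work uses.
import librosa
import numpy as np

def utterance_mfcc(wav_path: str, sr: int = 16000) -> np.ndarray:
    """Return a (13,) vector of frame-averaged MFCCs for one utterance."""
    audio, sr = librosa.load(wav_path, sr=sr)
    mfcc = librosa.feature.mfcc(y=audio, sr=sr, n_mfcc=13)  # (13, n_frames)
    return mfcc.mean(axis=1)
```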