Interspeech 2020
DOI: 10.21437/interspeech.2020-1783

Modeling ASR Ambiguity for Neural Dialogue State Tracking

Abstract: Spoken dialogue systems typically use one or several (top-N) ASR sequence(s) for inferring the semantic meaning and tracking the state of the dialogue. However, ASR graphs, such as confusion networks (confnets), provide a compact representation of a richer hypothesis space than a top-N ASR list. In this paper, we study the benefits of using confusion networks with a neural dialogue state tracker (DST). We encode the 2-dimensional confnet into a 1-dimensional sequence of embeddings using a confusion network enco…
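
The encoding step named in the abstract, turning a 2-dimensional confnet into a 1-dimensional sequence of embeddings, can be illustrated with a minimal sketch. This is not the paper's encoder: the function name `encode_confnet`, the toy embeddings, and the choice to use the log ASR posterior as the attention score are all assumptions made for illustration. A learned encoder would presumably compute the scores from trainable parameters instead.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def encode_confnet(confnet, embed):
    """Collapse each set of parallel arcs into a single vector.

    confnet: list of positions; each position is a list of
             (word, asr_posterior) pairs with asr_posterior > 0.
    embed:   dict mapping word -> embedding (list of floats).
    Returns one pooled embedding per confnet position.
    """
    sequence = []
    for arcs in confnet:
        # Assumption: attention logits are the log ASR posteriors,
        # so the weights reduce to the normalized posteriors.
        weights = softmax([math.log(p) for _, p in arcs])
        dim = len(embed[arcs[0][0]])
        pooled = [0.0] * dim
        for (word, _), w in zip(arcs, weights):
            for i, x in enumerate(embed[word]):
                pooled[i] += w * x
        sequence.append(pooled)
    return sequence


# Toy confnet with two positions; the first has two competing words.
toy_confnet = [[("cheap", 0.7), ("chip", 0.3)], [("restaurant", 1.0)]]
toy_embed = {
    "cheap": [1.0, 0.0],
    "chip": [0.0, 1.0],
    "restaurant": [1.0, 1.0],
}
toy_sequence = encode_confnet(toy_confnet, toy_embed)
```

With these toy inputs the first position pools to approximately [0.7, 0.3], a posterior-weighted mix of the two competing word embeddings, and the result is a plain sequence that any sequence-based tracker could consume.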

Cited by 5 publications (3 citation statements)
References 16 publications
“…[19] shows a two-step approach to generate the best path from confusion network to improve slot filling task. A more recent work [20] studies the approach of using the confusion network in a neural dialogue state tracker (DST), where the authors propose an attentional confusion network encoder that can be used in any DST. On the other hand, there has been work on re-scoring ASR nbest by exploring the morphological, lexical, and syntactic features [21,22].…”
Section: Effect Of Multiple CNNs (mentioning)
confidence: 99%
“…Word lattices from ASR were first used by [1] over ASR top-1 hypothesis for tasks such as named-entity extraction and call classification. Word confusion networks have been recently used by [4] for intent classification in dialogue systems and by [2,10] for dialogue state tracking (DST). [2] show that confusion network gives comparable performance to top-N hypotheses of ASR while [10] show that using confusion network improves performance in both in time and accuracy.…”
Section: Related Work (mentioning)
confidence: 99%
“…Word confusion networks have been recently used by [4] for intent classification in dialogue systems and by [2,10] for dialogue state tracking (DST). [2] show that confusion network gives comparable performance to top-N hypotheses of ASR while [10] show that using confusion network improves performance in both in time and accuracy. Another related task in SLU is that of Spoken Question Answering.…”
Section: Related Workmentioning
confidence: 99%