Interspeech 2019
DOI: 10.21437/Interspeech.2019-2366

End-to-End Spoken Language Understanding: Bootstrapping in Low Resource Scenarios

Abstract: End-to-end Spoken Language Understanding (SLU) systems, which skip speech-to-text conversion, are more promising in low-resource scenarios. They can be more effective when there is not enough labeled data to train reliable speech recognition and language understanding systems, or when running SLU on the edge is preferred over cloud-based services. In this paper, we present an approach for bootstrapping end-to-end SLU in low-resource scenarios. We show that incorporating layers extracted from pre-trained acoustic mod…
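The truncated abstract already names the core idea: transplant lower layers from a pre-trained acoustic model into a speech-to-intent model and fine-tune on a small labeled set. The paper's exact architecture is not given here, so the following is only a minimal PyTorch sketch of that bootstrapping pattern; the `AcousticEncoder`/`SLUModel` classes, layer sizes, and checkpoint path are illustrative assumptions, not the authors' implementation.

```python
import torch
import torch.nn as nn

class AcousticEncoder(nn.Module):
    """Stand-in for an acoustic model pre-trained on a large ASR corpus
    (e.g., to predict phoneme targets). Only its lower layers are reused."""
    def __init__(self, n_mels=80, hidden=256, n_phones=42):
        super().__init__()
        self.lower = nn.LSTM(n_mels, hidden, num_layers=2, batch_first=True)
        self.phone_head = nn.Linear(hidden, n_phones)  # discarded after pre-training

    def forward(self, feats):
        out, _ = self.lower(feats)
        return self.phone_head(out)

class SLUModel(nn.Module):
    """End-to-end speech-to-intent model bootstrapped from the pre-trained
    encoder's lower layers; only a small classifier is trained on top."""
    def __init__(self, encoder, hidden=256, n_intents=10, freeze=True):
        super().__init__()
        self.lower = encoder.lower            # transplanted pre-trained layers
        if freeze:                            # optional in the low-resource setting
            for p in self.lower.parameters():
                p.requires_grad = False
        self.intent_head = nn.Linear(hidden, n_intents)

    def forward(self, feats):
        out, _ = self.lower(feats)
        pooled = out.mean(dim=1)              # average over time frames
        return self.intent_head(pooled)

# Usage: pre-train the encoder on ASR data (not shown), then fine-tune
# the SLU model on a small intent-labeled set.
encoder = AcousticEncoder()
# encoder.load_state_dict(torch.load("acoustic_pretrained.pt"))  # hypothetical checkpoint
model = SLUModel(encoder)
logits = model(torch.randn(4, 120, 80))       # batch of 4 utterances, 120 frames
print(logits.shape)                           # torch.Size([4, 10])
```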

Cited by 32 publications (23 citation statements) · References 17 publications · Citing publications: 2019–2023
“…Go and colleagues [11] showed that the POS tags caused reduced performance, although POS tags can be strong indicators of emotions in text and serve as a helpful feature in opinion or sentiment analysis [18]. Moreover, bootstrapping approaches, which rely on a seed list of opinion or emotion words to find other such words in a large corpus, are becoming more popular and have proven effective [20, 21, 22, 23]. Mihalcea, Banea, and Wiebe [23] divided methods for bootstrapping subjectivity lexicons into two types: dictionary-based and corpus-based.…”
Section: Introduction (mentioning)
confidence: 99%
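To make the bootstrapping idea in this excerpt concrete: a seed list of opinion words is expanded by scanning a corpus for words that some cheap cue links to already-known opinion words, then iterating. The sketch below uses a simple "X and Y" conjunction cue; the toy corpus, seed list, and stopping rule are illustrative assumptions, not the specific methods of [20-23].

```python
def bootstrap_lexicon(sentences, seeds, rounds=3):
    """Corpus-based bootstrapping sketch: in 'X and Y', if X is a known
    opinion word, treat Y as a candidate opinion word (and vice versa).
    Repeat until no new words are found or the round budget is spent."""
    lexicon = set(seeds)
    for _ in range(rounds):
        added = set()
        for sent in sentences:
            toks = sent.lower().split()
            for i, tok in enumerate(toks):
                if tok == "and" and 0 < i < len(toks) - 1:
                    left, right = toks[i - 1], toks[i + 1]
                    if left in lexicon and right not in lexicon:
                        added.add(right)
                    if right in lexicon and left not in lexicon:
                        added.add(left)
        if not added:        # fixed point reached
            break
        lexicon |= added
    return lexicon

# Toy corpus (hypothetical) showing the expansion loop.
corpus = [
    "the film was happy and delightful",
    "a delightful and uplifting story",
    "sad and gloomy scenes throughout",
]
print(bootstrap_lexicon(corpus, seeds={"happy"}))
# round 1 adds 'delightful', round 2 adds 'uplifting'
```

Real systems add filters (e.g., POS constraints, frequency thresholds) to keep the expansion from drifting; the loop structure is the same.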
“…Using unaligned data provides the flexibility to infer slot labels from imperfect transcriptions. Hence, in this work, the NLU module was a seq2seq attention-based model.…”
Section: Pipeline SLU (mentioning)
confidence: 99%
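For readers unfamiliar with the setup, here is a minimal sketch of an attention-based seq2seq tagger of the kind the excerpt describes: slot labels are decoded as their own sequence, so they need not align one-to-one with the (possibly errorful) transcript tokens. This is an illustrative PyTorch model, not the cited system; all names and sizes are assumptions.

```python
import torch
import torch.nn as nn

class Seq2SeqSlotTagger(nn.Module):
    """Encoder over transcript tokens; decoder with dot-product attention
    emits a slot-label sequence of independent length."""
    def __init__(self, vocab, n_labels, emb=64, hid=128):
        super().__init__()
        self.src_emb = nn.Embedding(vocab, emb)
        self.tgt_emb = nn.Embedding(n_labels, emb)
        self.encoder = nn.LSTM(emb, hid, batch_first=True)
        self.decoder = nn.LSTMCell(emb + hid, hid)
        self.out = nn.Linear(2 * hid, n_labels)

    def forward(self, src, tgt_in):
        enc, (h, c) = self.encoder(self.src_emb(src))        # enc: (B, S, H)
        h, c = h.squeeze(0), c.squeeze(0)
        ctx = torch.zeros_like(h)                            # initial context
        logits = []
        for t in range(tgt_in.size(1)):                      # teacher forcing
            x = torch.cat([self.tgt_emb(tgt_in[:, t]), ctx], dim=-1)
            h, c = self.decoder(x, (h, c))
            scores = torch.bmm(enc, h.unsqueeze(-1)).squeeze(-1)  # dot attention
            attn = torch.softmax(scores, dim=-1)
            ctx = torch.bmm(attn.unsqueeze(1), enc).squeeze(1)    # weighted sum
            logits.append(self.out(torch.cat([h, ctx], dim=-1)))
        return torch.stack(logits, dim=1)                    # (B, T, n_labels)

model = Seq2SeqSlotTagger(vocab=1000, n_labels=20)
src = torch.randint(0, 1000, (2, 7))      # two 7-token transcripts
tgt = torch.randint(0, 20, (2, 5))        # teacher-forced label prefix
print(model(src, tgt).shape)              # torch.Size([2, 5, 20])
```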
“…of the pipeline approach [1, 2]. The main motivation for applying E2E SLU is that word-by-word recognition is not necessary to infer slots and intents, and that the ASR phoneme dictionary and language model (LM) become optional.…”
Section: Introduction (mentioning)
confidence: 99%
“…[7,11] address this problem using a curriculum and transfer learning approach whereby the model is gradually trained on increasingly relevant data until it is fine-tuned on the actual domain data. Similarly, [5,12] advocate pre-training an ASR model on a large amount of transcribed speech data to initialize a speech-to-intent model that is then trained on a much smaller training set with both transcripts and intent labels.…”
Section: Introduction (mentioning)
confidence: 99%
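A compact way to picture the curriculum described in [7,11]: the same model is trained in stages on datasets ordered from broadly relevant to in-domain, ending on the small labeled SLU set. The sketch below uses a toy linear model and random data as stand-ins; the stage schedule, learning rates, and loaders are placeholders, and an ASR pre-training stage as in [5,12] would use its own output head and loss (e.g., CTC) rather than the shared classification interface shown here.

```python
import torch
import torch.nn as nn
from torch.utils.data import DataLoader, TensorDataset

def train_stage(model, loader, epochs, lr):
    """One curriculum stage: plain supervised training on one dataset."""
    opt = torch.optim.Adam(model.parameters(), lr=lr)
    loss_fn = nn.CrossEntropyLoss()
    for _ in range(epochs):
        for feats, labels in loader:
            opt.zero_grad()
            loss = loss_fn(model(feats), labels)
            loss.backward()
            opt.step()

# Toy stand-ins: pooled features -> intent logits, random "datasets"
# ordered from broad to in-domain (hypothetical; real loaders replace these).
model = nn.Linear(80, 10)
def toy_loader(n):
    return DataLoader(TensorDataset(torch.randn(n, 80),
                                    torch.randint(0, 10, (n,))), batch_size=8)

stages = [(toy_loader(64), 2, 1e-3),   # broad, loosely relevant data
          (toy_loader(32), 2, 5e-4),   # closer to the target domain
          (toy_loader(16), 4, 1e-4)]   # small in-domain SLU set
for loader, epochs, lr in stages:
    train_stage(model, loader, epochs, lr)
```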