Interspeech 2021
DOI: 10.21437/interspeech.2021-1826
End-to-End Spoken Language Understanding for Generalized Voice Assistants

Abstract: End-to-end (E2E) spoken language understanding (SLU) systems predict utterance semantics directly from speech using a single model. Previous work in this area has focused on targeted tasks in fixed domains, where the output semantic structure is assumed a priori and the input speech is of limited complexity. In this work we present our approach to developing an E2E model for generalized SLU in commercial voice assistants (VAs). We propose a fully differentiable, transformer-based, hierarchical system that can …
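For orientation, the following is a minimal, assumed sketch of what a transformer-based E2E SLU model with hierarchical outputs (utterance-level intent plus frame-level slots) might look like in PyTorch. The module names, dimensions, and head structure are illustrative assumptions, not the authors' implementation.

```python
# Minimal sketch (assumed architecture, not the paper's code): a transformer
# encoder over acoustic features feeding two heads, one for utterance-level
# intent and one for frame-level slot tags.
import torch
import torch.nn as nn

class E2ESLUSketch(nn.Module):
    def __init__(self, feat_dim=80, d_model=256, n_heads=4, n_layers=6,
                 n_intents=64, n_slots=128):
        super().__init__()
        self.proj = nn.Linear(feat_dim, d_model)           # acoustic features -> model dim
        layer = nn.TransformerEncoderLayer(d_model, n_heads, batch_first=True)
        self.encoder = nn.TransformerEncoder(layer, n_layers)
        self.intent_head = nn.Linear(d_model, n_intents)   # utterance-level intent logits
        self.slot_head = nn.Linear(d_model, n_slots)       # frame-level slot logits

    def forward(self, feats):                    # feats: (batch, time, feat_dim)
        h = self.encoder(self.proj(feats))       # contextual frame representations
        intent_logits = self.intent_head(h.mean(dim=1))    # mean-pool over time
        slot_logits = self.slot_head(h)
        return intent_logits, slot_logits

model = E2ESLUSketch()
feats = torch.randn(2, 200, 80)                  # two utterances of log-mel features
intents, slots = model(feats)                    # shapes: (2, 64) and (2, 200, 128)
```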

Cited by 7 publications (3 citation statements). References 34 publications (67 reference statements).

Citation statements (ordered by relevance):
“…Simple word- and n-gram-level approaches have proven surprisingly capable of characterizing dataset difficulty a priori (McKenna et al., 2020) and producing difficult test sets (Saxon et al., 2021) in diverse language domains such as SLU. Gardner et al. (2021) show how such purely frequentist approaches can identify word-level spurious correlations with respect to label class, which drive in part the shortcut features for classes of "competency problems" such as NLI.…”
Section: Related Work
confidence: 99%
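As a concrete illustration of the kind of purely frequentist check described above (a generic sketch, not the exact procedure of McKenna et al. or Gardner et al.), one can flag words whose conditional label distribution deviates strongly from the overall label prior:

```python
# Generic sketch: score each word by how far p(label | word) deviates from the
# overall label prior p(label); frequent, high-scoring words are candidate
# spurious correlations / shortcut features.
from collections import Counter, defaultdict

def word_label_skew(examples, min_count=20):
    """examples: list of (tokens, label) pairs, e.g. (["play", "jazz"], "PlayMusic")."""
    examples = list(examples)
    label_counts = Counter(label for _, label in examples)
    total = sum(label_counts.values())
    word_label = defaultdict(Counter)
    for tokens, label in examples:
        for tok in set(tokens):                  # count each word once per example
            word_label[tok][label] += 1
    skew = {}
    for tok, counts in word_label.items():
        n = sum(counts.values())
        if n < min_count:                        # ignore rare words
            continue
        # largest absolute gap between p(label | word) and p(label)
        skew[tok] = max(abs(counts[lab] / n - label_counts[lab] / total)
                        for lab in label_counts)
    return sorted(skew.items(), key=lambda kv: -kv[1])
```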
“…Table 4: Accuracy on FSC dataset.

System                            Accuracy
Alexa                             0.987
FSC-baseline [29]                 0.988
Cao et al. [33]                   0.990
FANS [34]                         0.990
Reptile [35]                      0.992
Finstreder (Quartznet)            0.992
Saxon et al. [36]                 0.994
AT-AT [26]                        0.995
Finstreder (Conformer)            0.995
Borgholt et al. [37]              0.996
Seo et al. [38]                   0.997
Qian et al. [39]                  0.997
Kim et al. [32]                   0.997
Finstreder (Quartznet) + AMT      0.997…”
Section: English / French
confidence: 99%
“…With the recent advances of neural networks, there is growing popularity in designing SLU systems in the end-to-end (E2E) fashion [4,5,6], where the ASR and NLU components are integrated into a single network and optimised with a joint loss function. The E2E SLU systems generally adopt the encoder-decoder-based sequence-to-sequence (Seq2Seq) framework, which has been widely employed in several areas including neural machine translation [7,8] and ASR [9,10].…”
Section: Introduction
confidence: 99%
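To make that Seq2Seq formulation concrete, here is a minimal, assumed sketch of an encoder-decoder E2E SLU model that maps acoustic features directly to a sequence of semantic tokens and is optimised with a single joint loss. The architecture, dimensions, and vocabulary size are illustrative assumptions, not taken from any of the cited systems.

```python
# Assumed sketch: encoder-decoder (Seq2Seq) E2E SLU in which the decoder emits
# semantic tokens (intent and slot symbols) directly from speech features,
# trained end-to-end with one cross-entropy loss.
import torch
import torch.nn as nn

class Seq2SeqSLU(nn.Module):
    def __init__(self, feat_dim=80, d_model=256, sem_vocab=1000):
        super().__init__()
        self.proj = nn.Linear(feat_dim, d_model)     # acoustic features -> model dim
        self.embed = nn.Embedding(sem_vocab, d_model)
        self.transformer = nn.Transformer(d_model, nhead=4,
                                          num_encoder_layers=4,
                                          num_decoder_layers=2,
                                          batch_first=True)
        self.out = nn.Linear(d_model, sem_vocab)

    def forward(self, feats, sem_tokens):
        # feats: (B, T, feat_dim); sem_tokens: (B, L) teacher-forced decoder input
        src = self.proj(feats)
        tgt = self.embed(sem_tokens)
        tgt_mask = self.transformer.generate_square_subsequent_mask(tgt.size(1))
        h = self.transformer(src, tgt, tgt_mask=tgt_mask)
        return self.out(h)                           # (B, L, sem_vocab) logits

model = Seq2SeqSLU()
feats = torch.randn(2, 200, 80)
targets = torch.randint(0, 1000, (2, 12))
logits = model(feats, targets)
loss = nn.CrossEntropyLoss()(logits.transpose(1, 2), targets)   # single joint loss
```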