Interspeech 2020
DOI: 10.21437/interspeech.2020-2215
Confidence Measures in Encoder-Decoder Models for Speech Recognition

Cited by 14 publications (21 citation statements)
References 0 publications
“…A good model shows a lower EER value and higher AUC/NCE values. The details of these metrics can be found in [18,26,27,40].…”
Section: Experimental Settings
confidence: 99%
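The EER, AUC, and NCE metrics mentioned above all score a confidence estimator against binary correct/incorrect token labels. A minimal sketch of all three, assuming confidence scores in [0, 1] without ties (tie handling and threshold interpolation are simplified here):

```python
import numpy as np

def auc_roc(scores, labels):
    """Area under the ROC curve via the rank-sum (Mann-Whitney U) identity."""
    scores, labels = np.asarray(scores, float), np.asarray(labels, int)
    order = np.argsort(scores)
    ranks = np.empty(len(scores))
    ranks[order] = np.arange(1, len(scores) + 1)
    n_pos, n_neg = labels.sum(), len(labels) - labels.sum()
    return (ranks[labels == 1].sum() - n_pos * (n_pos + 1) / 2) / (n_pos * n_neg)

def nce(scores, labels):
    """Normalized cross entropy: share of label entropy removed by the scores."""
    scores = np.clip(np.asarray(scores, float), 1e-12, 1 - 1e-12)
    labels = np.asarray(labels, float)
    h_cond = -np.mean(labels * np.log2(scores)
                      + (1 - labels) * np.log2(1 - scores))
    p = labels.mean()
    h_prior = -(p * np.log2(p) + (1 - p) * np.log2(1 - p))
    return (h_prior - h_cond) / h_prior

def eer(scores, labels):
    """Equal error rate: operating point where false-accept rate
    (wrong tokens accepted) meets false-reject rate (correct tokens rejected)."""
    scores, labels = np.asarray(scores, float), np.asarray(labels, int)
    best = 1.0
    for t in np.unique(scores):
        far = np.mean(scores[labels == 0] >= t)
        frr = np.mean(scores[labels == 1] < t)
        best = min(best, max(far, frr))
    return best
```

A perfect estimator gives AUC = 1, EER = 0, and NCE approaching 1; an uninformative one gives AUC ≈ 0.5, EER ≈ 0.5, and NCE ≤ 0.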
“…In [22], a lightweight CEM that uses internal features of a seq2seq model was proposed to mitigate overconfidence. In [23], softmax temperature values for each token were predicted to adjust overconfident probabilities. In the CTC-based ASR models used in this study, confidence scores can be obtained with the forward-backward algorithm [35], which was reported to perform well [24].…”
Section: Confidence Estimation
confidence: 99%
“…The proposed rescoring method is closely related to confidence estimation, or the ASR error detection task. Confidence estimation assesses the quality of ASR predictions [18,19,20,21,22,23,24,25], which is useful for many downstream ASR applications such as voice assistants. We demonstrate that our models for rescoring can be applied to confidence estimation without any additional architectural changes or training.…”
Section: Introduction
confidence: 99%
“…Different network structures like feed-forward network (FFN) [22,23], recurrent neural network (RNN) [14,16] and self-attention Transformer [6] could be applied to realize the NCM module. In this study, a residual FFN with three hidden layers is adopted as the classification model.…”
Section: Predictor Features
confidence: 99%
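The residual FFN described above maps per-token predictor features to a single correctness probability. A numpy forward-pass sketch of that architecture, where the hidden width, ReLU activation, and initialization are assumptions (the cited study does not specify them here):

```python
import numpy as np

rng = np.random.default_rng(0)

def layer(dim_in, dim_out):
    """He-initialized weight matrix and zero bias (illustrative values)."""
    return rng.normal(0, np.sqrt(2 / dim_in), (dim_in, dim_out)), np.zeros(dim_out)

class ResidualFFNNCM:
    """Residual feed-forward binary classifier for neural confidence
    measure (NCM) estimation: three hidden layers with skip connections."""

    def __init__(self, dim_in, hidden=256):
        self.proj = layer(dim_in, hidden)       # project features to hidden width
        self.blocks = [layer(hidden, hidden) for _ in range(3)]
        self.out = layer(hidden, 1)             # single logit: P(token correct)

    def __call__(self, x):
        w, b = self.proj
        h = np.maximum(x @ w + b, 0)            # ReLU
        for w, b in self.blocks:
            h = h + np.maximum(h @ w + b, 0)    # residual connection
        w, b = self.out
        logit = h @ w + b
        return 1 / (1 + np.exp(-logit))         # sigmoid -> confidence in (0, 1)
```

In practice such a model would be trained with binary cross-entropy against per-token correct/incorrect labels obtained by aligning hypotheses to references.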
“…However, the softmax probability was found to be unreliable and might perform poorly due to the overconfident behaviour of E2E models [21,22]. To alleviate the problem of unreliability, a neural network can be trained independently to predict a softmax temperature value to re-distribute the original output probabilities at each time step of decoding [23]. In [22], a lightweight neural network was used to estimate neural confidence measure (NCM), which was shown to be more reliable than directly using the Fig.…”
Section: Introduction
confidence: 99%
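The temperature re-distribution mentioned above is a one-line change to the softmax: dividing the logits by a temperature T > 1 flattens an overconfident distribution without changing the argmax. A minimal sketch with illustrative logit values (in [23] the temperature is predicted per decoding step rather than fixed):

```python
import numpy as np

def softmax(logits, temperature=1.0):
    """Temperature-scaled softmax; T > 1 flattens overconfident distributions."""
    z = np.asarray(logits, float) / temperature
    z -= z.max()                      # subtract max for numerical stability
    e = np.exp(z)
    return e / e.sum()

logits = np.array([8.0, 2.0, 1.0, 0.5])   # hypothetical token logits
p_raw = softmax(logits)                   # near one-hot: overconfident
p_cal = softmax(logits, temperature=4.0)  # flatter, better-calibrated scores
```

The re-scaled top probability is a more honest confidence score, while the ranking of tokens is preserved.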