2019
DOI: 10.1016/j.procs.2019.01.069

Incorporating label dependency for ASR error detection via RNN

Cited by 7 publications (7 citation statements)
References 13 publications
“…Both V-RNN and MLP models consist of a single layer of 2048 units with a ReLU [19] activation function, as described in [8]. The ULSTM consists of one hidden layer of 2048 stacked LSTM units.…”
Section: Methods
confidence: 99%
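As a rough illustration of the architectures this quote describes, here is a minimal PyTorch sketch of the MLP and ULSTM classifiers; the input feature size and number of output labels are assumptions (the cited papers fix only the 2048-unit hidden layer), and the V-RNN variant itself is defined in [8], not reproduced here.

```python
import torch
import torch.nn as nn

# Hypothetical dimensions; the quoted setup specifies only the hidden size.
FEAT_DIM = 300      # per-word feature vector size (assumed)
NUM_LABELS = 2      # correct vs. erroneous ASR word

class MLP(nn.Module):
    """Single hidden layer of 2048 ReLU units, as in the quoted setup."""
    def __init__(self, feat_dim=FEAT_DIM, hidden=2048, n_labels=NUM_LABELS):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(feat_dim, hidden),
            nn.ReLU(),
            nn.Linear(hidden, n_labels),
        )

    def forward(self, x):          # x: (batch, feat_dim)
        return self.net(x)

class ULSTM(nn.Module):
    """One hidden layer of 2048 stacked LSTM units over the word sequence."""
    def __init__(self, feat_dim=FEAT_DIM, hidden=2048, n_labels=NUM_LABELS):
        super().__init__()
        self.lstm = nn.LSTM(feat_dim, hidden, num_layers=1, batch_first=True)
        self.out = nn.Linear(hidden, n_labels)

    def forward(self, x):          # x: (batch, seq_len, feat_dim)
        h, _ = self.lstm(x)
        return self.out(h)         # per-word label scores
```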
“…So, the conditional independence assumption is not satisfied in this application. This fact motivated us to propose a new Variant of Recurrent Neural Network (V-RNN) as the classifier for ASR error detection for the first time in [8].…”
Section: Classifiers
confidence: 99%
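The label-dependency idea this quote alludes to can be illustrated by feeding the previous output label back into the recurrence, so each word-level decision conditions on the preceding one. The sketch below shows that generic mechanism only; it is not the exact V-RNN of [8], and every dimension and the start-symbol handling are assumptions.

```python
import torch
import torch.nn as nn

class LabelDependentRNN(nn.Module):
    """Generic sketch: the previous label's embedding is concatenated with
    the current features, so each decision depends on the one before it."""
    def __init__(self, feat_dim=300, label_emb=16, hidden=2048, n_labels=2):
        super().__init__()
        self.start = n_labels                              # extra index = start symbol
        self.label_embed = nn.Embedding(n_labels + 1, label_emb)
        self.cell = nn.GRUCell(feat_dim + label_emb, hidden)
        self.out = nn.Linear(hidden, n_labels)

    def forward(self, x):                                  # x: (batch, seq_len, feat_dim)
        batch, seq_len, _ = x.shape
        h = x.new_zeros(batch, self.cell.hidden_size)
        prev = torch.full((batch,), self.start, dtype=torch.long, device=x.device)
        logits = []
        for t in range(seq_len):
            inp = torch.cat([x[:, t], self.label_embed(prev)], dim=-1)
            h = self.cell(inp, h)
            step = self.out(h)
            logits.append(step)
            prev = step.argmax(dim=-1)                     # greedy; teacher forcing at train time
        return torch.stack(logits, dim=1)                  # (batch, seq_len, n_labels)
```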
“…The recurrent neural network (RNN) deep-learning algorithm is a special neural network structure that uses the partial derivative of the loss function to adjust the weight of each unit and shares parameters across the input, output, and hidden-state connections at every step. This sharing mechanism can significantly lower the complexity of the model, shorten the training time, and maintain high accuracy [23][24][25][26]; hence, it has clear advantages over other ANNs. Additionally, in the current CLQ evaluation research, many studies have focused only on evaluating a single aspect of CLQ (such as the evaluation of natural grades) while ignoring the multifunctional evaluation of CLQ (such as the evaluation of utilization grades and economic grades).…”
Section: Introduction
confidence: 99%
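A short sketch of the parameter sharing this quote describes: an RNN applies the same weight matrices at every time step, so its parameter count is fixed regardless of sequence length (the dimensions here are arbitrary).

```python
import torch
import torch.nn as nn

rnn = nn.RNN(input_size=32, hidden_size=64, num_layers=1, batch_first=True)

short = torch.randn(1, 5, 32)     # 5 time steps
long = torch.randn(1, 500, 32)    # 100x longer sequence

# The same weights are reused at every step, so both sequences pass
# through a model with an identical parameter count.
n_params = sum(p.numel() for p in rnn.parameters())
rnn(short)
rnn(long)
print(n_params)                   # unchanged regardless of sequence length
```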
“…Our objective of promoting correct ASR hypotheses for unlabeled speech data is similar to Tong et al.'s; however, we focus on limited-data scenarios and use an error detector to bias the supervision towards correct words in both matched and mismatched scenarios. Modern ASR error detectors [13][14][15] are more powerful than the lattice posterior-based confidence scores used to weight per-frame gradients [6] or to discard erroneous words [16,17] or utterances [18,19] in most semi-supervised neural AM training studies. Yet, to the best of our knowledge, they have not been leveraged for semi-supervised LF-MMI training so far.…”
Section: Introduction
confidence: 99%
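To make the biasing idea in this quote concrete, here is a hypothetical sketch of down-weighting the training loss on words an error detector flags as likely wrong. The weighting scheme, floor value, and tensor shapes are all assumptions for illustration, not the cited papers' actual recipe.

```python
import torch

def weighted_supervision_loss(per_word_loss, p_correct, floor=0.1):
    """Scale each word's training loss by the error detector's probability
    that the hypothesized word is correct. 'floor' keeps a small gradient
    on uncertain words instead of discarding them outright (assumed scheme).
    per_word_loss: (batch, seq_len) losses from the AM training objective
    p_correct:     (batch, seq_len) detector confidence in [0, 1]
    """
    weights = p_correct.clamp(min=floor)
    return (weights * per_word_loss).sum() / weights.sum()

# Toy usage with made-up numbers: the low-confidence third word
# contributes far less to the total loss.
loss = torch.tensor([[1.2, 0.4, 2.0]])
conf = torch.tensor([[0.95, 0.30, 0.05]])
print(weighted_supervision_loss(loss, conf))
```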