2018 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP) 2018
DOI: 10.1109/icassp.2018.8461751

ASR Performance Prediction on Unseen Broadcast Programs Using Convolutional Neural Networks

Abstract: In this paper, we address a relatively new task: prediction of ASR performance on unseen broadcast programs. We first propose a heterogeneous French corpus dedicated to this task. Two prediction approaches are compared: a state-of-the-art performance prediction based on regression (engineered features) and a new strategy based on convolutional neural networks (learnt features). We particularly focus on the combination of both textual (ASR transcription) and signal inputs. While the joint use of textual and sig…

Cited by 6 publications (26 citation statements)
References: 13 publications
“…In this section, we attempt to understand what our best ASR performance prediction system (Elloumi et al, 2018) learned. We analyze the text and speech representations obtained by our architecture.…”
Section: Evaluating Learned Representations 41 Methodology
Citation type: mentioning
confidence: 99%
“…In (Elloumi et al, 2018), we proposed a new approach using convolution neural networks (CNNs) to predict ASR performance from a collection of heterogeneous broadcast programs (both radio and TV). We particularly focused on the combination of text (ASR transcription) and signal (raw speech) inputs which both proved useful for CNN prediction.…”
Section: ASR Performance Prediction System
Citation type: mentioning
confidence: 99%
“…ASR system performance 3.3.1 ASR system. The ASR system used in this work is described in [13]. It uses the KALDI toolkit [25], following a standard Kaldi recipe.…”
Section: Gender Bias Evaluation Procedures Of An
Citation type: mentioning
confidence: 99%