2013 IEEE International Conference on Acoustics, Speech and Signal Processing 2013
DOI: 10.1109/icassp.2013.6639006

Effectiveness of discriminative training and feature transformation for reverberated and noisy speech

Abstract: Automatic speech recognition in the presence of non-stationary interference and reverberation remains a challenging problem. The 2nd Annual Speech Separation and Recognition Challenge introduces a new and difficult task with time-varying reverberation and non-stationary interference including natural background speech, home noises, or music. This paper establishes baselines using state-of-the-art ASR techniques such as discriminative training and various feature transformation on the middle-vocabulary sub-task…

Citations: cited by 14 publications (12 citation statements)
References: 25 publications (15 reference statements)
“…Large improvements can be obtained even with a clean recognizer back-end; furthermore, in unseen acoustic conditions the data-based method achieves notable performance compared to the model-based method. Future work will concentrate on the integration of discriminative methods both in the ASR back-end training and in the DAE training, which have proven effective for reverberated speech [28]. Furthermore, for better integration with the ASR back-end, we will investigate improved cost functions in DAE training taking account parameters of the ASR back-end instead of just optimizing distances in the spectral domain.…”
Section: Discussion (mentioning)
confidence: 97%
“…Yuuki Tachioka et al [12] verified the effectiveness of discriminate training under clean, reverberant and noisy speech using MFCC features, MFCC+LDA+MLLT features and PLP features separately. The feature transformation are effective on the non-stationary interference and reverberation.…”
Section: Kaldi (mentioning)
confidence: 99%
“…In [13], a model of the noise is estimated from observed data by considering the late reverberation as additive noise, and then the feature vector is enhanced by applying vector Taylor series. A feature transformation based on discriminative training criterion inspired on Maximum Mutual Information is suggested in [14]. Additional features related to the amount of diffuse noise in each frequency bin and frame are employed in [15] to improve deep neural network-based ASR accuracy in noisy and reverberant environments.…”
Section: Distant-talking ASR (mentioning)
confidence: 99%