2015
DOI: 10.1016/j.specom.2014.11.004

Robust speech recognition in reverberant environments by using an optimal synthetic room impulse response model

Abstract: This paper presents a practical technique for automatic speech recognition (ASR) in multiple reverberant environments based on multi-model selection. Multiple ASR models are trained with artificial synthetic room impulse responses (IRs), i.e. simulated room IRs, with different reverberation times ($T_{60}^{\mathrm{Model}}$) and tested on real room IRs with varying $T_{60}^{\mathrm{Room}}$ values. To apply our method, the biggest challenge is to choose a proper artificial room IR model for training ASR models. In this paper, a generalised stati…
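To make the training setup in the abstract concrete, here is a minimal sketch of one common way to simulate a room IR with a chosen reverberation time: white Gaussian noise shaped by an exponential envelope whose level falls by 60 dB at t = T60. This is an assumption for illustration only and is not necessarily the model proposed in the paper. The function names `synthetic_rir` and `reverberate`, the 16 kHz sampling rate, and the 1.5 x T60 length are all hypothetical choices.

```python
import numpy as np


def synthetic_rir(t60, fs=16000, length_s=None, seed=0):
    """Return an exponentially decaying white-noise impulse response.

    Assumed model (not taken from the paper): h(t) = n(t) * exp(-d * t),
    with d = 3 * ln(10) / t60 so that the amplitude envelope is 60 dB
    below its starting level at t = t60.
    """
    if length_s is None:
        length_s = 1.5 * t60  # hypothetical choice: keep the tail down to about -90 dB
    n = int(round(length_s * fs))
    t = np.arange(n) / fs
    rng = np.random.default_rng(seed)
    decay = 3.0 * np.log(10.0) / t60  # 20*log10(exp(-decay*t60)) = -60 dB
    h = rng.standard_normal(n) * np.exp(-decay * t)
    return h / np.max(np.abs(h))  # peak-normalise


def reverberate(speech, rir):
    """Convolve clean speech with a room IR, keeping the original length."""
    return np.convolve(speech, rir)[: len(speech)]


# One simulated IR per training condition (T60 values in seconds are illustrative).
training_rirs = {t60: synthetic_rir(t60) for t60 in (0.2, 0.4, 0.6, 0.8)}
```

Each entry of `training_rirs` could then be passed to `reverberate` together with clean training utterances to build one reverberant training set per T60 condition.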

Cited by 12 publications (4 citation statements)
References 27 publications
“…Additionally, several room acoustic parameters have been applied in different dereverberation methods to suppress the reverberation in the signal. $C_{50}$ is used in [9] [10] and $T_{60}$ in [11] [12] to select the ASR acoustic model that better represents the reverberant conditions of the input utterance. In [13] $T_{60}$ is used to add to the current hidden Markov model state the contribution of previous states by applying a piecewise energy decay curve that is separated in early reflections and late reverberation contributions.…”
Section: Introduction (mentioning)
confidence: 99%
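The model-selection idea in this statement can be sketched as a nearest-neighbour lookup: given an estimate of the room's T60, pick the acoustic model trained at the closest T60. This is a simplified illustration, not the procedure of the cited papers. The function `select_model`, the model identifiers, and the training T60 values are hypothetical, and the T60 estimate is assumed to come from some external estimator.

```python
def select_model(t60_estimate, models):
    """Return the model whose training T60 is closest to the estimate.

    models: dict mapping training T60 in seconds to a model identifier
    (hypothetical placeholders here; any model object would do).
    """
    if not models:
        raise ValueError("at least one trained model is required")
    nearest_t60 = min(models, key=lambda t60: abs(t60 - t60_estimate))
    return models[nearest_t60]


# Hypothetical bank of acoustic models, one per training T60 (seconds).
model_bank = {
    0.2: "asr_t60_0.2",
    0.4: "asr_t60_0.4",
    0.6: "asr_t60_0.6",
    0.8: "asr_t60_0.8",
}

chosen = select_model(0.47, model_bank)  # 0.4 s is the closest training condition
```

The same lookup would work with another room acoustic parameter, such as C50, as the key, provided the estimate and the training labels use the same parameter.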
“…These results are an important extension from our previous work in static SSL and support the robustness of the system to the sound dynamics in real-world environments. Furthermore, our system can be easily integrated with recent methods to enhance ASR in reverberant environments [55]-[57] without adding computational cost. This is the intrinsic advantage of embodied embedded cognition.…”
Section: Discussion (mentioning)
confidence: 99%
“…The image model method, first proposed in [19], is the most widespread among the latter. Alternatively, statistical methods [20] or methods based on geometric acoustics and ray tracing [21] can be used. To create realistic sound signals in this work, the image model method was used in the implementation of Lehman, Johansson and Nordholm [22, 23].…”
Section: Methods (mentioning)
confidence: 99%