2012
DOI: 10.1109/msp.2012.2205029

Making Machines Understand Us in Reverberant Rooms: Robustness Against Reverberation for Automatic Speech Recognition

Cited by 206 publications (126 citation statements)
References 38 publications
“…The late reverberation part of the room impulse response is often modeled as an exponentially damped Gaussian noise process and treated as additive noise. Hence, the observed reverberant signal x(t) can be written by using the notation in [1] as…”
Section: Speech Enhancement Using DNN (mentioning)
confidence: 99%
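
The model named in the statement above can be illustrated with a minimal sketch. It is not the equation from the citing paper, which is truncated in the excerpt. The sketch assumes the common convention that the whole room impulse response is white Gaussian noise under an exponential envelope set by the reverberation time T60, and that the early/late boundary sits at 50 ms. The function names, the 50 ms boundary, and all parameter values are hypothetical choices for illustration.

```python
import math
import random


def decay_rate(t60):
    # Amplitude decay rate (1/s) such that the energy envelope
    # exp(-2 * rate * t) has fallen by 60 dB at t = t60.
    return 3.0 * math.log(10.0) / t60


def synth_rir(t60, fs, length_s, seed=0):
    # Assumed model: impulse response = Gaussian noise * exponential decay.
    rng = random.Random(seed)
    delta = decay_rate(t60)
    n = int(round(length_s * fs))
    return [rng.gauss(0.0, 1.0) * math.exp(-delta * k / fs) for k in range(n)]


def split_early_late(h, fs, boundary_s=0.05):
    # Zero-pad both parts to the full length so that early + late == h.
    # The 50 ms default boundary is an assumption, not taken from the source.
    nb = min(len(h), int(round(boundary_s * fs)))
    early = h[:nb] + [0.0] * (len(h) - nb)
    late = [0.0] * nb + h[nb:]
    return early, late


def convolve(x, h):
    # Direct-form linear convolution; fine for short illustrative signals.
    y = [0.0] * (len(x) + len(h) - 1)
    for i, xi in enumerate(x):
        for j, hj in enumerate(h):
            y[i + j] += xi * hj
    return y


def reverberate(s, h, fs, boundary_s=0.05):
    # Observed signal = early-reflection part + late-reverberation part.
    # The late part is the term that is treated as additive noise.
    early, late = split_early_late(h, fs, boundary_s)
    early_part = convolve(s, early)
    late_part = convolve(s, late)
    observed = [e + l for e, l in zip(early_part, late_part)]
    return observed, early_part, late_part
```

Because convolution is linear, the sum of the two parts equals the source convolved with the full impulse response, which is why the late part can be viewed as an additive term on top of the early-reflection signal.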
“…Automatic speech recognition from distant microphones is a challenging task, because the speech signals to be recognized are degraded by the presence of interfering signals and reverberation due to large speaker-to-microphone distance [1]. The conventional multichannel enhancement techniques, such as beamforming, are widely employed to suppress noise and reverberation from the desired speech when multiple microphones (e.g., microphone arrays) are used to capture audio signals [2,3].…”
Section: Introduction (mentioning)
confidence: 99%
“…We employ the STFT-domain dereverberation algorithm that was first proposed in [12] for a two-microphone one-output case and generalized later in [14]. The single-channel version is briefly described in [11] and in the following.…”
Section: Front Ends (mentioning)
confidence: 99%
“…In particular, we employ a single distant microphone (SDM) setup, where only speech data from a single table-top microphone are available. As a result of the large distance between the microphone and the speakers, speech signals are contaminated by reverberation, thus making transcription very challenging [11]. To combat the reverberant distortion, we employ one exemplary dereverberation method proposed in [12] and experimentally investigate how it can affect the performance of DNN-based acoustic models for both speaker independent (SI) and speaker adaptive training (SAT) scenarios.…”
Section: Introduction (mentioning)
confidence: 99%
“…The performance of existing models trained with anechoic speech signals can deteriorate when the person talking to the robot is located a few metres away [6]. Thus far, many algorithms for ASR in reverberant rooms have been developed with a focus mainly on spectrum enhancement, feature enhancement, hidden Markov model (HMM) adaptation and reverberant modeling during speech recognition [7]. Existing research uses multiple channel input [8][9] to deal with background noise or simultaneous speech.…”
Section: Introduction (mentioning)
confidence: 99%