2022
DOI: 10.1016/j.csl.2021.101277

Dereverberation of autoregressive envelopes for far-field speech recognition

Abstract: The task of speech recognition in far-field environments is adversely affected by reverberant artifacts that manifest as temporal smearing of the sub-band envelopes. In this paper, we develop a neural model for speech dereverberation using the long-term sub-band envelopes of speech. The sub-band envelopes are derived using frequency domain linear prediction (FDLP), which performs an autoregressive estimation of the Hilbert envelopes. The neural dereverberation model estimates the envelope gain which when …
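The FDLP step named in the abstract can be illustrated with a minimal sketch. This is an illustration under stated assumptions, not the authors' implementation: it works on the full band rather than on sub-bands, it uses only the Python standard library, and the function names (`dct_ii`, `levinson_durbin`, `fdlp_envelope`) and the model order are hypothetical choices made for the example. The idea is to take the DCT of a signal segment, fit an autoregressive (linear prediction) model to the DCT coefficients, and read the model's power response as a smooth estimate of the squared temporal (Hilbert) envelope. The neural gain estimation described in the paper is not covered here.

```python
import math

def dct_ii(x):
    """Unnormalised DCT-II of a real sequence (O(N^2), fine for short segments)."""
    n = len(x)
    return [sum(x[t] * math.cos(math.pi * (t + 0.5) * k / n) for t in range(n))
            for k in range(n)]

def levinson_durbin(r, order):
    """Solve the autocorrelation normal equations.

    Returns (a, err) where a = [1, a1, ..., ap] are the coefficients of the
    prediction-error filter A(z) and err is the final prediction error power.
    """
    a = [1.0] + [0.0] * order
    err = r[0]
    for i in range(1, order + 1):
        acc = r[i] + sum(a[j] * r[i - j] for j in range(1, i))
        k = -acc / err                      # reflection coefficient
        new_a = a[:]
        for j in range(1, i):
            new_a[j] = a[j] + k * a[i - j]
        new_a[i] = k
        a = new_a
        err *= (1.0 - k * k)
    return a, err

def fdlp_envelope(x, order=10):
    """Autoregressive estimate of the squared temporal envelope of x.

    Linear prediction is applied to the DCT of x; the AR power response,
    evaluated at angle pi*(t+0.5)/N, approximates the envelope at sample t.
    """
    y = dct_ii(x)
    n = len(y)
    # Autocorrelation of the DCT sequence for lags 0..order.
    r = [sum(y[k] * y[k + m] for k in range(n - m)) for m in range(order + 1)]
    a, err = levinson_durbin(r, order)
    env = []
    for t in range(n):
        theta = math.pi * (t + 0.5) / n
        re = sum(a[j] * math.cos(j * theta) for j in range(order + 1))
        im = -sum(a[j] * math.sin(j * theta) for j in range(order + 1))
        env.append(err / (re * re + im * im))
    return env

if __name__ == "__main__":
    # Toy input (an assumption for the demo): a tone with a Gaussian burst at t = 180.
    N = 256
    x = [(0.05 + math.exp(-0.5 * ((t - 180) / 12.0) ** 2)) * math.cos(0.9 * t)
         for t in range(N)]
    env = fdlp_envelope(x, order=10)
    peak = max(range(N), key=lambda t: env[t])
    print("envelope peak at sample", peak)
```

A low model order gives a smoother envelope, and a higher order follows finer temporal detail. In a sub-band setting, the same fit would be applied to a windowed slice of the DCT coefficients for each band.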

Citations: cited by 7 publications (4 citation statements)
References: 42 publications (59 reference statements)
“…Recently, end-to-end models with attention based modeling have also been explored on the REVERB challenge dataset [21,22]. Previously, we had proposed a convolutional neural network model to perform dereverberation of speech [10,11]. In the current work, we extend this prior work for E2E transformer based ASR system.…”
Section: Literature Review (mentioning)
confidence: 94%
“…Our previous work [10,11] explored the use of dereverberation of sub-band envelopes for hybrid speech recognition systems. The sub-band envelopes are extracted using the autoregressive modeling framework of frequency domain linear prediction [12,13].…”
Section: Introduction (mentioning)
confidence: 99%
“…One is to improve the pickup equipment or pickup method, such as using a microphone array instead of a single microphone, that is, multichannel speech signal acquisition. The second is to use some signal processing methods to improve the quality of far-field speech. In this paper, from the perspective of front-end pickup, a flexible graphene sensor is proposed to detect vocal cord vibration signals to improve speech recognition performance in the far-field environment. Like other physiological signals, such as breathing signals and pulse signals, vocal fold vibration signals are relatively weak signals.…”
Section: Introduction (mentioning)
confidence: 99%
“…The second is to use some signal processing methods to improve the quality of far-field speech [13,14]. In this paper, from the perspective of front-end pickup, a flexible graphene sensor is proposed to detect short vowels in human pronunciation using graphene-like MXene to detect vocal vibration signals. However, the preparation process of this material is complex and the cost is high, and the stability of the MXene material at room temperature is poor.…”
Section: Introduction (mentioning)
confidence: 99%