2020
DOI: 10.3390/s20205751

Wearable Hearing Device Spectral Enhancement Driven by Non-Negative Sparse Coding-Based Residual Noise Reduction

Abstract: This paper proposes a novel technique to improve a spectral statistical filter for speech enhancement, to be applied in wearable hearing devices such as hearing aids. The proposed method is implemented considering a 32-channel uniform polyphase discrete Fourier transform filter bank, for which the overall algorithm processing delay is 8 ms in accordance with the hearing device requirements. The proposed speech enhancement technique, which exploits the concepts of both non-negative sparse coding (NNSC) and spec…

Citations: cited by 6 publications (11 citation statements)
References: 29 publications
“…As a fundamental loss function for calculating $\mathcal{L}_{P}$ and $\mathcal{L}_{m}$, the scale-invariant signal-to-noise ratio (SI-SNR) is utilized, which is known for its effective denoising performance [4, 5]. Here, $\mathcal{L}_{P}$ is defined as follows: $$\mathcal{L}_{P} = 10\log_{10}\left(\|\mathbf{s}^{target}\|_{2}^{2} / \|\mathbf{e}^{noise}\|_{2}^{2}\right),$$ where $\mathbf{s}^{target} = \frac{\langle \hat{\mathbf{s}}(\mathbf{n}-\tau), \mathbf{s}(\mathbf{n}-\tau) \rangle\, \mathbf{s}(\mathbf{n}-\tau)}{\|\mathbf{s}(\mathbf{n}-\tau)\|_{2}^{2}}$ and $\mathbf{e}^{noise} = \hat{\mathbf{s}}(\mathbf{n}-\tau) - \mathbf{s}^{target}$.…”
Section: Proposed Auditory FB Denoising Methods
mentioning
confidence: 99%
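
The SI-SNR definition quoted above can be sketched in a few lines of Python. This is a minimal illustration, not the citing authors' implementation: the function name `si_snr`, the `eps` guard, and the assumption that both signals are already time-aligned (so the delay $\tau$ is dropped) are all hypothetical choices made here.

```python
import math


def si_snr(estimate, reference, eps=1e-12):
    """Scale-invariant SNR in dB, following the quoted definition.

    Assumes `estimate` and `reference` are equal-length, time-aligned
    sequences of floats (the delay tau in the formula is taken as handled).
    `eps` is a hypothetical guard against division by zero and log(0).
    """
    if len(estimate) != len(reference):
        raise ValueError("estimate and reference must have the same length")

    # s_target: projection of the estimate onto the reference signal.
    dot = sum(e * r for e, r in zip(estimate, reference))
    ref_energy = sum(r * r for r in reference)
    scale = dot / (ref_energy + eps)
    target = [scale * r for r in reference]

    # e_noise: the part of the estimate not explained by the reference.
    noise = [e - t for e, t in zip(estimate, target)]

    target_energy = sum(t * t for t in target)
    noise_energy = sum(n * n for n in noise)
    return 10.0 * math.log10((target_energy + eps) / (noise_energy + eps))


# Small demonstration with synthetic signals (illustrative values only).
reference = [math.sin(0.1 * n) for n in range(200)]
estimate = [r + 0.1 * math.cos(0.37 * n) for n, r in enumerate(reference)]
print(round(si_snr(estimate, reference), 2))
```

Because the target is a projection onto the reference, rescaling the estimate leaves the value unchanged, which is what "scale-invariant" refers to. When a quantity like this is used as a training loss, its negative is commonly the value that is minimized.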
“…Monaural auditory device algorithms, including filterbanks (FBs), should not introduce delays exceeding 10 ms, and computational complexity should be low to accommodate the limited computational capacity and battery power of portable devices [3]. To meet these criteria, the polyphase discrete Fourier transform (DFT) FB strategy is predominantly utilized, which is implemented through the integrated structure of fast Fourier transform (FFT)-based short-term Fourier transform (STFT) analysis and overlap-add techniques [3,4].…”
mentioning
confidence: 99%
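
The analysis and overlap-add structure described in the statement above can be sketched as follows. This is a simplified illustration under stated assumptions, not the filter bank of the cited work: a naive DFT stands in for the FFT, and the frame length of 32, the 50% overlap, the square-root Hann window, and the optional per-bin `gain` are hypothetical choices.

```python
import cmath
import math


def dft(frame):
    """Naive O(N^2) DFT; an FFT would be used in practice."""
    n = len(frame)
    return [sum(frame[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n))
            for k in range(n)]


def idft(spec):
    """Naive inverse DFT, keeping only the real part of the result."""
    n = len(spec)
    return [sum(spec[k] * cmath.exp(2j * math.pi * k * t / n) for k in range(n)).real / n
            for t in range(n)]


def sqrt_hann(n):
    """Square root of a periodic Hann window.

    Used for both analysis and synthesis, the squared window sums to 1
    across frames at 50% overlap.
    """
    return [math.sqrt(0.5 - 0.5 * math.cos(2 * math.pi * i / n)) for i in range(n)]


def analysis_synthesis(signal, frame_len=32, gain=None):
    """Window, transform, optionally scale each bin, then overlap-add.

    `gain` is a hypothetical list of per-bin real factors; it should be
    symmetric (gain[k] == gain[N - k]) for the output to stay meaningful,
    since the imaginary part is discarded in idft.
    """
    hop = frame_len // 2
    win = sqrt_hann(frame_len)
    out = [0.0] * len(signal)
    for start in range(0, len(signal) - frame_len + 1, hop):
        frame = [signal[start + i] * win[i] for i in range(frame_len)]
        spec = dft(frame)
        if gain is not None:
            spec = [g * s for g, s in zip(gain, spec)]
        rec = idft(spec)
        for i in range(frame_len):
            out[start + i] += rec[i] * win[i]  # synthesis window + overlap-add
    return out


signal = [math.sin(0.2 * n) for n in range(128)]
restored = analysis_synthesis(signal)
# Samples covered by two overlapping frames are reconstructed closely.
print(max(abs(a - b) for a, b in zip(signal[16:112], restored[16:112])))
```

In a structure like this, each output sample depends on a full frame of input, so the algorithmic delay scales with the frame length divided by the sampling rate. That is the quantity a delay limit such as the 10 ms mentioned above constrains.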
“…Microphone arrays and speech enhancement components are built into MCSE that processes multiple channels of audio signals in noisy environments such as outdoor environments (Palla et al., 2017; Pauline, Samiappan & Kumar, 2021). For example, a spectral statistics filter is applied to hearing aids to handle stationary noise environments (Gaussian noise) and unsteady noise environments (factories, babble, and car noises) from −5 dB to 20 dB (Kim, 2020). The current performance rate at low SNR are 2.16 PESQ score with babble noise, 2.20 considered as low quality of signal with Gaussian noise, 2.13 considered as low quality of signal with factory noise and 3.67 PESQ score considered as a medium quality of signal with car noise on an average of −5 dB to 10 dB SNR levels (Kim, 2020).…”
Section: Research Background
mentioning
confidence: 99%
“…The existing MCSE provides speech recognition at a 71% Word Recognition Rate (WRR) at 10 dB SNR compared to a single microphone (Xu et al., 2004; Stupakov et al., 2012). These multi-channel algorithms (beamforming, adaptive noise reduction and voice activity detection algorithms) suffer from the low performance of recognition rate when SNR is low (−15 dB, −10 dB, −5 dB, 0 dB) (Pauline, Samiappan & Kumar, 2021; Kim, 2020). The existing algorithms developed for MCSE systems were only tested for white Gaussian stationary noise at 0 to 60 dB SNRs and were never tested for non-stationary environmental noises.…”
Section: Introduction
mentioning
confidence: 99%
“…Microphone array and speech enhancement are the components embedded in wearable speech enhancement (WSE) that process speech signals in multichannel under noisy environments such as the outdoor environments [3]. For example, a spectral statistical filter is applied in wearable hearing devices for handling stationary noise environment (Gaussian noise) and non-stationary noise environments (babble, factory, and car) at −5 dB to 20 dB [4].…”
Section: Introduction
mentioning
confidence: 99%