A reverberation-time-aware DNN approach leveraging spatial information for microphone array dereverberation

Yang, Minglei; Li, Kehuang; Huang, Zhen; Siniscalchi, Sabato Marco; Wang, Tong; Lee, Chin‐Hui

doi:10.1186/s13634-017-0516-6

Cited by 10 publications (2 citation statements)

References 41 publications (49 reference statements)

“…However, the DNN-based vector-to-vector regression, which is the focus in this work, mainly aims at single-channel speech enhancement and is not simply generalized to multi-channel speech enhancement. As shown in Figure 1, a traditional approach to dealing with an array of microphones is exploited spatial information at the input level by concatenating speech vectors from multiple microphones into a single high dimensional vector, e.g., [12,2]. Thus the vector-to-vector regression approach can still be employed for speech enhancement by appending multichannel feature vectors together into a high-dimensional vector and mapping it to a vector extracted from the reference vector.…”

Section: Introduction (mentioning)

confidence: 99%

Tensor-To-Vector Regression for Multi-Channel Speech Enhancement Based on Tensor-Train Network

Wang et al. 2020

ICASSP 2020 - 2020 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)

Self Cite

We propose a tensor-to-vector regression approach to multi-channel speech enhancement in order to address the issue of input size explosion and hidden-layer size expansion. The key idea is to cast the conventional deep neural network (DNN) based vector-to-vector regression formulation under a tensor-train network (TTN) framework. TTN is a recently emerged solution for compact representation of deep models with fully connected hidden layers. Thus TTN maintains DNN's expressive power yet involves a much smaller amount of trainable parameters. Furthermore, TTN can handle a multi-dimensional tensor input by design, which exactly matches the desired setting in multi-channel speech enhancement. We first provide a theoretical extension from DNN to TTN based regression. Next, we show that TTN can attain speech enhancement quality comparable with that for DNN but with much fewer parameters, e.g., a reduction from 27 million to only 5 million parameters is observed in a single-channel scenario. TTN also improves PESQ over DNN from 2.86 to 2.96 by slightly increasing the number of trainable parameters. Finally, in 8-channel conditions, a PESQ of 3.12 is achieved using 20 million parameters for TTN, whereas a DNN with 68 million parameters can only attain a PESQ of 3.06. Code is available online.
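The parameter savings described in the abstract can be illustrated with a rough count. The sketch below assumes the standard tensor-train matrix layout, in which core k has shape (r_{k-1}, m_k, n_k, r_k) and the boundary ranks are 1. The function names, the factor shapes, and the ranks are hypothetical and are not taken from the paper; biases are ignored.

```python
from math import prod


def dense_params(in_dims, out_dims):
    """Weights in a fully connected layer mapping prod(in_dims) -> prod(out_dims)."""
    return prod(in_dims) * prod(out_dims)


def tt_params(in_dims, out_dims, ranks):
    """Weights in the same layer stored as a tensor-train (TT) matrix.

    Core k has shape (ranks[k], in_dims[k], out_dims[k], ranks[k + 1]),
    and the boundary ranks are fixed to 1.
    """
    d = len(in_dims)
    if len(out_dims) != d or len(ranks) != d + 1:
        raise ValueError("need one in/out factor per core and d + 1 ranks")
    if ranks[0] != 1 or ranks[-1] != 1:
        raise ValueError("boundary TT ranks must be 1")
    return sum(ranks[k] * in_dims[k] * out_dims[k] * ranks[k + 1] for k in range(d))


# Hypothetical factorisation of a 1024 x 1024 layer (not taken from the paper).
in_dims = [4, 8, 8, 4]
out_dims = [4, 8, 8, 4]
ranks = [1, 4, 4, 4, 1]

print(dense_params(in_dims, out_dims))      # 1048576
print(tt_params(in_dims, out_dims, ranks))  # 2176
```

With these illustrative shapes the dense layer stores about a million weights while the TT cores store a few thousand. The actual ratio in any real model depends on the chosen factorisation and ranks.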

“…In this work, an improved calibration method, which takes into account the generalisation performance and robustness of geometric parameter correction, is introduced to enhance the essential positioning accuracy of robots. DNN has received substantial attention in both the signal processing field and the machine learning field with its strong regression capabilities [29][30][31]. The design of DNN architecture must be optimised to make the DNN demonstrate the best predictive capacity.…”

Section: Introduction (mentioning)

confidence: 99%

Evolutionary Robot Calibration and Nonlinear Compensation Methodology Based on GA-DNN and an Extra Compliance Error Model

Chen, Zhang, Sun 2020

Mathematical Problems in Engineering

This study addresses the problem of nonlinear error predictive compensation to achieve high positioning accuracy for advanced industrial applications. An improved calibration method based on the generalisation performance evaluation is proposed to enhance the stability and accuracy of robot calibration. With the development of technology, a deep neural network (DNN) optimised by a genetic algorithm (GA) is applied to predict the nonlinear error of the calibrated robot. To address the change of external payload, an extra compliance error model is established with a linear piecewise method. A global compensation method combining the GA-DNN nonlinear regression prediction model and the compliance error model is then proposed to achieve the robot’s high-precision positioning performance under any external payload. Experimental results obtained on a Staubli RX160L robot with a FARO laser tracker are introduced to demonstrate the effectiveness and benefits of our proposed methodology. The enhanced positioning accuracy can reach 0.22 mm with 98% probability (i.e., the maximum positioning error in all test data).

A GRU-Based Late Reverberation Suppression Method for Single-Channel Speech Dereverberation

Zhang et al. 2022

2022 IEEE 22nd International Conference on Communication Technology (ICCT)
