ICASSP 2020 - 2020 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP) 2020
DOI: 10.1109/icassp40776.2020.9054643

DNN-based Distributed Multichannel Mask Estimation for Speech Enhancement in Microphone Arrays

Abstract: Multichannel processing is widely used for speech enhancement, but several limitations appear when trying to deploy these solutions in the real world. Distributed sensor arrays that consider several devices with a few microphones each are a viable solution, which allows for exploiting the many microphone-equipped devices that we use in our everyday life. In this context, we propose to extend the distributed adaptive node-specific signal estimation approach to a neural network framework. At each node, a…

Cited by 8 publications (12 citation statements)
References: 24 publications
“…Similarly to DNN-BF, several existing works integrate the time-frequency masks and the multichannel beamformer. These works either employ a single-channel DNN model [40]-[42], which exploit the spectral information only, or employ a multichannel DNN model [43]-[45], which exploit both the spectral and spatial information of the microphone signals. These various types of DNN models can also be used in the proposed method.…”
Section: Discussion (mentioning)
confidence: 99%
“…Multi-channel approaches typically use time-frequency masks estimated by the DNN model to construct a spatial filter to enhance the target sound [40]-[45]. Extensions of this idea [46], [47] estimate the coefficients of the filter directly from the multi-channel data, which however require a large amount of training data simulated in a variety of scenarios.…”
Section: Introduction (mentioning)
confidence: 99%
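
The statement above describes using time-frequency masks to construct a spatial filter. A minimal sketch of that idea for a single frequency bin follows, assuming NumPy and a multichannel Wiener filter as the spatial filter (the cited works may use other beamformers). The function names `masked_covariance` and `mwf_weights` are hypothetical and are not taken from any of the cited papers; the mask is assumed to come from some DNN and is simply passed in as an array.

```python
import numpy as np

def masked_covariance(stft_bin, mask, eps=1e-8):
    """Mask-weighted spatial covariance for one frequency bin.

    stft_bin: complex array of shape (channels, frames).
    mask:     real array of shape (frames,), values in [0, 1].
    Returns sum_t mask[t] * y_t y_t^H / sum_t mask[t], shape (channels, channels).
    """
    weighted = stft_bin * mask  # mask broadcasts over the channel axis
    return (weighted @ stft_bin.conj().T) / max(mask.sum(), eps)

def mwf_weights(R_s, R_n, ref=0):
    """Multichannel Wiener filter for a reference microphone.

    Solves (R_s + R_n) w = R_s e_ref, where e_ref selects channel `ref`.
    """
    e_ref = np.zeros(R_s.shape[0])
    e_ref[ref] = 1.0
    return np.linalg.solve(R_s + R_n, R_s @ e_ref)

def apply_filter(w, stft_bin):
    """Filter output w^H y_t for every frame, shape (frames,)."""
    return w.conj() @ stft_bin
```

In use, the speech covariance would be estimated with the predicted speech mask and the noise covariance with its complement (`1 - mask`), after which `apply_filter` produces the enhanced bin. This is an illustration of the general mask-driven approach under the stated assumptions, not a reproduction of any cited method.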
“…In previous work [18], we replaced the oracle VAD used in DANSE by a TF mask predicted by a convolutional recurrent neural network (CRNN), in a similar manner as [7], [8]. We showed that the compressed signals sent to compute the filter of Equation (2) could also help to improve the mask prediction at the second step by a multi-node DNN.…”
Section: DNN-based Distributed Multichannel Wiener Filter (mentioning)
confidence: 99%
“…One way to reduce the computational cost of the DNN-based methodologies while exploiting spatial information is to use ad-hoc microphone arrays and to distribute the processing over all the devices of the array. In a previous article, we introduced a solution that proved to efficiently process multichannel data in a distributed microphone array in the context of speech enhancement [14]. This approach was based on a two-step version of the distributed adaptive node-specific signal estimation (DANSE) algorithm by Bertrand and Moonen, where so-called compressed signals are sent among the devices [15].…”
Section: Introduction (mentioning)
confidence: 99%
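
The two-step scheme mentioned in the statements above, in which each device sends a compressed signal to the others, can be sketched roughly as follows for a single frequency bin. This is a simplified illustration assuming NumPy, a mask-driven multichannel Wiener filter at each node, and masks supplied as plain arrays. The names `mwf_output` and `two_step_danse` are hypothetical, and the sketch reuses the same mask in both steps, whereas the cited work reportedly refines the mask at the second step with a multi-node DNN.

```python
import numpy as np

def mwf_output(Y, mask, ref=0, eps=1e-8):
    """Mask-driven multichannel Wiener filter output for one frequency bin.

    Y:    complex array (channels, frames).
    mask: real array (frames,), speech presence in [0, 1].
    """
    # Speech and noise covariances estimated from mask-weighted frames.
    R_s = (Y * mask) @ Y.conj().T / max(mask.sum(), eps)
    R_n = (Y * (1 - mask)) @ Y.conj().T / max((1 - mask).sum(), eps)
    # Small diagonal loading keeps the linear system well conditioned.
    R_y = R_s + R_n + eps * np.eye(Y.shape[0])
    w = np.linalg.solve(R_y, R_s[:, ref])
    return w.conj() @ Y  # w^H y_t for every frame

def two_step_danse(node_signals, node_masks):
    """Step 1: local filtering yields one compressed signal per node.
    Step 2: each node filters its own channels stacked with the
    compressed signals received from all other nodes."""
    z = [mwf_output(Y, m) for Y, m in zip(node_signals, node_masks)]
    outputs = []
    for k, (Y, m) in enumerate(zip(node_signals, node_masks)):
        received = [z[j][None, :] for j in range(len(z)) if j != k]
        stacked = np.vstack([Y] + received)
        outputs.append(mwf_output(stacked, m))
    return z, outputs
```

Each node transmits only one channel (its compressed signal) regardless of how many microphones it has, which is the bandwidth saving that motivates the distributed formulation.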