Interspeech 2020
DOI: 10.21437/interspeech.2020-2493

U-Net Based Direct-Path Dominance Test for Robust Direction-of-Arrival Estimation

Abstract: It has been noted that the identification of the time-frequency bins dominated by the contribution from the direct propagation of the target speaker can significantly improve the robustness of the direction-of-arrival estimation. However, the correct extraction of the direct-path sound is challenging especially in adverse environments. In this paper, a U-net based direct-path dominance test method is proposed. Exploiting the efficient segmentation capability of the U-net architecture, the direct-path informati…

Cited by 4 publications
(1 citation statement)
References 20 publications
“…Although this approach has been previously applied to beamforming and source separation [14], it has not been extensively studied for DOA estimation. The study in [21] used the logarithmic magnitude spectrogram of a single-channel noisy signal as the input of a NN that classifies direct-path TF bins, and then estimated the DOA from these bins using the MUSIC algorithm [5] or the steered response power-phase transform (SRP-PHAT) algorithm [22]. While this approach displayed good performance, its disadvantages are the typically large number of the features defined by the short-time Fourier transform (STFT) size, and the long time and large amount of data required for training.…”
Section: Introduction
Citation type: mentioning
confidence: 99%
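
The pipeline summarized in the citation statement (select direct-path TF bins, then run SRP-PHAT on those bins only) can be illustrated with a minimal sketch. It is not the authors' implementation: the mask here is a hand-made oracle rather than a U-net output, the array is a hypothetical two-microphone far-field line array, and the names `masked_srp_phat` and `plane_wave` are invented for illustration.

```python
import numpy as np

def masked_srp_phat(X, mask, mic_pos, freqs, angles_deg, c=343.0):
    """Mask-weighted SRP-PHAT over a grid of candidate angles (assumed setup).

    X: (M, F, T) complex STFT, mask: (F, T) weights in [0, 1],
    mic_pos: (M,) positions along the array axis in metres, freqs: (F,) in Hz.
    """
    angles = np.deg2rad(np.asarray(angles_deg, dtype=float))
    power = np.zeros(angles.shape[0])
    n_mics = X.shape[0]
    for i in range(n_mics):
        for j in range(i + 1, n_mics):
            cross = X[i] * np.conj(X[j])
            cross = cross / (np.abs(cross) + 1e-12)   # PHAT: keep phase only
            weighted = (mask * cross).sum(axis=1)     # keep selected bins, shape (F,)
            # Far-field delay difference of the pair for every candidate angle.
            tau = (mic_pos[i] - mic_pos[j]) * np.cos(angles) / c
            steer = np.exp(-2j * np.pi * np.outer(tau, freqs))  # (A, F)
            power += np.real(steer @ weighted)
    return float(np.asarray(angles_deg)[np.argmax(power)])

def plane_wave(S, theta_deg, mic_pos, freqs, c=343.0):
    """Far-field plane wave with source STFT S (F, T) seen by each microphone."""
    shift = np.cos(np.deg2rad(theta_deg)) / c
    phase = np.exp(2j * np.pi * mic_pos[:, None, None] * freqs[None, :, None] * shift)
    return phase * S[None]

# Synthetic scene: 5 frames dominated by the direct path (60 degrees),
# 15 frames dominated by a strong reflection (120 degrees).
rng = np.random.default_rng(0)
freqs = np.linspace(100.0, 3000.0, 64)
mic_pos = np.array([0.0, 0.05])
S = rng.standard_normal((64, 20)) + 1j * rng.standard_normal((64, 20))
X = plane_wave(S, 120.0, mic_pos, freqs)
X[:, :, :5] = plane_wave(S[:, :5], 60.0, mic_pos, freqs)

mask = np.zeros((64, 20))
mask[:, :5] = 1.0                      # oracle direct-path mask (assumption)
angles = np.arange(0, 181, 5)

est_masked = masked_srp_phat(X, mask, mic_pos, freqs, angles)
est_unmasked = masked_srp_phat(X, np.ones_like(mask), mic_pos, freqs, angles)
print(est_masked, est_unmasked)
```

In this toy scene the masked estimate recovers the direct-path direction, while the unmasked estimate is pulled toward the reflection, which is the motivation for testing direct-path dominance before localization.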