2017 25th European Signal Processing Conference (EUSIPCO) 2017
DOI: 10.23919/eusipco.2017.8081648

Blind spatial sound source clustering and activity detection using uncalibrated microphone array

Abstract: This paper presents a method for estimating the number, as well as the activity periods, of spatially distributed sound sources using an uncalibrated microphone array. This methodology is applied for the purposes of speaker diarization. In general, speaker diarization has difficulty with: 1) estimating the number of sound sources (speakers), and 2) activity detection of multiple sound sources including overlap of utterances. Several microphone array based techniques have already tackled these challenge…

Citations: cited by 6 publications (2 citation statements)
References: 30 publications
“…The speakers are assumed to be intermittent which is not the case in most realistic scenarios where source separation has to be applied. Another method proposed recently uses generalized cross-correlation with phase transform (GCC-PHAT) and time difference of arrival (TDOA) together with a clustering algorithm to estimate steering vectors for the speakers, allowing for estimation of speaker activity patterns [9]. However, they did not apply their solution to beamforming for source separation.…”
Section: Introduction (citation type: mentioning)
confidence: 99%
“…In this paper, we consider the diarization of audio recordings using spatial features alone. Several solutions have been proposed utilizing spatial features, which use the time-difference-of-arrival (TDOA) features [4,5,6,7]. However, the estimation of TDOA is sensitive to reverberation and noise.…”
Section: Introduction (citation type: mentioning)
confidence: 99%
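
The citation statements above refer to GCC-PHAT and TDOA features without showing what they compute. As a rough illustration only, the following Python sketch estimates the time difference of arrival between two microphone signals with a generic textbook GCC-PHAT. It is not the cited paper's method (which also involves clustering and steering-vector estimation), and the function name `gcc_phat_tdoa`, its parameters, and the small regularization constant are assumptions made for this example.

```python
import numpy as np


def gcc_phat_tdoa(x, y, fs, max_tau=None):
    """Estimate the delay of y relative to x, in seconds, using GCC-PHAT.

    Hypothetical helper for illustration: x and y are 1-D arrays from two
    microphones, fs is the sampling rate in Hz, and max_tau optionally
    bounds the search range in seconds.
    """
    # Zero-pad so the circular correlation behaves like a linear one.
    n = len(x) + len(y)
    X = np.fft.rfft(x, n=n)
    Y = np.fft.rfft(y, n=n)

    # Cross-power spectrum, whitened so only phase information remains (PHAT).
    cross = Y * np.conj(X)
    cross /= np.abs(cross) + 1e-12  # assumed small constant to avoid 0/0

    cc = np.fft.irfft(cross, n=n)

    # Restrict the search to physically plausible lags if max_tau is given.
    max_shift = n // 2
    if max_tau is not None:
        max_shift = max(1, min(int(fs * max_tau), max_shift))

    # Reorder so the lags run from -max_shift to +max_shift.
    cc = np.concatenate((cc[-max_shift:], cc[:max_shift + 1]))
    lag = int(np.argmax(np.abs(cc))) - max_shift
    return lag / fs
```

A positive return value means the second signal lags the first. In a multi-microphone setting, such pairwise delays could serve as the spatial features that the statements describe being clustered, though how the cited work does this is not specified here.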