2016
DOI: 10.1007/s40595-016-0085-x

Source separation employing beamforming and SRP-PHAT localization in three-speaker room environments

Abstract: This paper presents a new blind speech separation algorithm using a beamforming technique that is capable of extracting each individual speech signal from a mixture of three speech sources in a room. The speech separation algorithm utilizes the steered response power phase transform for obtaining a localization estimate for each individual speech source in the frequency domain. Based on those estimates, each desired speech signal is extracted from the speech mixture using an optimal beamforming technique. To solv…

Cited by 7 publications (2 citation statements)
References 24 publications (32 reference statements)
“…The time delay difference localization method [18,19] is computationally efficient and easy to implement, but the accuracy of the algorithm is highly dependent on the topology of the microphone array, and the accuracy of time delay estimation affects the position estimation. The controllable beamforming method [20][21][22][23], with the SRP (steered response power) algorithm being the most famous example [24,25], is based on the idea of weighting the output of each element and summing them together while directing the array beam to the same direction at the same time to give the direction where the expected signal achieves the maximum output power, thus achieving sound source localization. However, this method has a large computational cost and is not suitable for real-time localization.…”
Section: Introduction; citation type: mentioning
confidence: 99%
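
The steered-beamforming idea in the statement above (weight each element's output, sum, steer over candidate directions, pick the direction of maximum output power) can be illustrated with a small sketch. It is not the cited paper's algorithm: it assumes a far-field source, a linear microphone array, and PHAT whitening of each channel, and the names `srp_doa`, `mic_x`, and `angles_deg` are hypothetical.

```python
import numpy as np

def srp_doa(frames, mic_x, fs, angles_deg, c=343.0):
    """Steered-response-power direction estimate (illustrative sketch).

    frames:     array (n_mics, n_samples), one time-domain frame per microphone
    mic_x:      microphone positions along the array axis, in meters (assumed linear array)
    angles_deg: candidate arrival angles, measured from the array axis
    Returns (best_angle_deg, power_per_candidate).
    """
    angles_deg = np.asarray(angles_deg, dtype=float)
    n_samples = frames.shape[1]
    X = np.fft.rfft(frames, axis=1)
    X = X / (np.abs(X) + 1e-12)              # PHAT: keep phase, discard magnitude
    freqs = np.fft.rfftfreq(n_samples, d=1.0 / fs)
    power = np.empty(len(angles_deg))
    for i, ang in enumerate(np.deg2rad(angles_deg)):
        # Far-field arrival time at each microphone for a plane wave from angle `ang`
        arrival = -mic_x * np.cos(ang) / c
        # Phase shifts that undo those arrival times, so the channels add coherently
        steer = np.exp(2j * np.pi * freqs[None, :] * arrival[:, None])
        beam = np.sum(X * steer, axis=0)      # sum the aligned channels
        power[i] = np.sum(np.abs(beam) ** 2)  # output power for this steering direction
    return float(angles_deg[np.argmax(power)]), power
```

The loop over candidate directions is where the computational cost mentioned in the statement comes from: each additional candidate requires another full pass over all microphones and frequency bins.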
“…Currently, generalized cross-correlation methods are generally used to extract time-difference information. They include generalized cross-correlation (GCC), Roth impulse response (RIR), smoothed correlation transform (SCOT) [12,13], phase transformation (PHAT), Hassab-Boucher (HB) weighting method and Hannan–Thomson (HT) [14,15,16]. These types of methods mainly measure the time of the two signal peaks as a way to determine the time difference and are suitable for extracting the information of the far-field time difference of the underground blasting.…”
Section: Introduction; citation type: mentioning
confidence: 99%
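
As a rough illustration of how the PHAT-weighted member of this family turns a correlation peak into a time difference, the following sketch estimates the delay between two signals. It is a generic textbook-style formulation under stated assumptions, not code from any of the cited works; the function name `gcc_phat` and its parameters are hypothetical.

```python
import numpy as np

def gcc_phat(sig, ref, fs, max_tau=None):
    """Estimate the delay of `sig` relative to `ref`, in seconds (illustrative sketch)."""
    n = len(sig) + len(ref)                  # zero-pad so the correlation does not wrap around
    SIG = np.fft.rfft(sig, n=n)
    REF = np.fft.rfft(ref, n=n)
    cross = SIG * np.conj(REF)               # cross-power spectrum
    cross = cross / (np.abs(cross) + 1e-12)  # PHAT weighting: keep only the phase
    cc = np.fft.irfft(cross, n=n)            # generalized cross-correlation
    max_shift = n // 2
    if max_tau is not None:
        max_shift = min(int(fs * max_tau), max_shift)
    # Reorder so that lags run from -max_shift to +max_shift
    cc = np.concatenate((cc[n - max_shift:], cc[:max_shift + 1]))
    shift = int(np.argmax(np.abs(cc))) - max_shift
    return shift / fs
```

A positive return value means `sig` lags `ref`. Swapping the PHAT line for a different weighting of the cross-power spectrum would give other members of the family listed above.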