2006
DOI: 10.1109/9780470043387

Computational Auditory Scene Analysis

Abstract: We present a computational auditory scene analysis system for separating and recognizing target speech in the presence of competing speech or noise. We estimate, in two stages, the ideal binary time-frequency (T-F) mask which retains the mixture in a local T-F unit if and only if the target is stronger than the interference within the unit. In the first stage, we use harmonicity to segregate the voiced portions of individual sources in each time frame based on multipitch tracking. Additionally, unvoiced portio…

Cited by 509 publications (126 citation statements)
References 11 publications
“…Several computational auditory scene analysis (CASA) techniques were proposed in the literature modeling the above two-stage segregation process (Wang and Brown, 2006). The goal of CASA techniques was to segregate only the target signal, rather than all interfering sources, from the sound mixtures, and the means suggested for achieving this goal was the ideal T-F binary mask (Wang, 2005).…”
Section: Introduction (mentioning)
confidence: 99%
“…Computational auditory scene analysis (CASA) is one of the popular speech separation methods that exploits human perceptual processing in computational systems (Wang & Brown, 2006). Human beings have shown great success in speech separation using our inborn capability.…”
Section: Single-microphone Speech Separation (mentioning)
confidence: 99%
“…Masks are applied to spectrograms of mixed sounds. If the value of 1 is applied for a t-f unit in which the target energy is stronger than the total interference energy, and the value of 0 otherwise, the mask is called ideal binary mask (Wang, Brown, 2006;Brungart et al, 2009).…”
Section: Introduction (mentioning)
confidence: 99%
“…They are collectively referred to as Computational Auditory Stream Analysis (CASA, for a review, see Wang and Brown, 2006). …”
Section: Introduction (mentioning)
confidence: 99%
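
The ideal binary mask described in the abstract and in the citation statements above can be illustrated with a minimal sketch. It assumes the per-unit energies of the target and the interference are known before mixing, which is what makes the mask "ideal"; the system in the abstract has to estimate the mask from the mixture instead. The function names `ideal_binary_mask` and `apply_mask`, and all the numbers, are hypothetical and chosen only for illustration.

```python
def ideal_binary_mask(target_energy, interference_energy):
    """Return 1 for each T-F unit where the target is stronger, else 0.

    Both arguments are lists of rows (frequency channels x time frames)
    holding the premixed energy of each source in every T-F unit.
    """
    if len(target_energy) != len(interference_energy):
        raise ValueError("inputs must have the same number of channels")
    mask = []
    for t_row, i_row in zip(target_energy, interference_energy):
        if len(t_row) != len(i_row):
            raise ValueError("inputs must have the same number of frames")
        # Strict comparison: a tie does not count as "target stronger".
        mask.append([1 if t > i else 0 for t, i in zip(t_row, i_row)])
    return mask


def apply_mask(mixture, mask):
    """Keep the mixture in units where the mask is 1, zero it elsewhere."""
    return [[x * m for x, m in zip(x_row, m_row)]
            for x_row, m_row in zip(mixture, mask)]


# Hypothetical energies: 2 frequency channels x 3 time frames.
target = [[4.0, 0.5, 2.0], [0.25, 3.0, 1.0]]
interference = [[1.0, 2.0, 2.0], [0.5, 0.5, 4.0]]
mixture = [[5.0, 2.5, 4.0], [0.75, 3.5, 5.0]]

mask = ideal_binary_mask(target, interference)
masked_mixture = apply_mask(mixture, mask)
```

With these values, only the first unit of the first channel and the second unit of the second channel are retained; the unit where both energies equal 2.0 is discarded because the target is not strictly stronger.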