Computational Auditory Scene Analysis

Wang, DeLiang; Brown, Guy J.

doi:10.1109/9780470043387

Cited by 509 publications

(126 citation statements)

References 11 publications

Supporting

Mentioning

122

Contrasting

Unclassified


“…Several computational auditory scene analysis (CASA) techniques were proposed in the literature modeling the above two-stage segregation process (Wang and Brown, 2006). The goal of CASA techniques was to segregate only the target signal, rather than all interfering sources, from the sound mixtures, and the means suggested for achieving this goal was the ideal T-F binary mask (Wang, 2005).…”

Section: Introduction (mentioning)

confidence: 99%

Factors influencing intelligibility of ideal binary-masked speech: Implications for noise reduction

Loizou

2008

The Journal of the Acoustical Society of America

221

195


The application of the ideal binary mask to an auditory mixture has been shown to yield substantial improvements in intelligibility. This mask is commonly applied to the time-frequency (T-F) representation of a mixture signal and eliminates portions of a signal below a signal-to-noise-ratio (SNR) threshold while allowing others to pass through intact. The factors influencing intelligibility of ideal binary-masked speech are not well understood and are examined in the present study. Specifically, the effects of the local SNR threshold, input SNR level, masker type, and errors introduced in estimating the ideal mask are examined. Consistent with previous studies, intelligibility of binary-masked stimuli is quite high even at -10 dB SNR for all maskers tested. Performance was affected the most when masker-dominated T-F units were wrongly labeled as target-dominated T-F units. Performance plateaued near 100% correct for SNR thresholds ranging from -20 dB to 5 dB. We believe the existence of the plateau region suggests that it is the pattern of the ideal binary mask that matters the most rather than the local SNR of each T-F unit. This pattern directs the listener's attention to where the target is and enables them to segregate speech effectively in multi-talker environments.
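The masking rule described in this abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation. It assumes the target and masker energies are available separately before mixing (which is what makes the mask "ideal"), and the names `ideal_binary_mask`, `apply_mask`, and `lc_db` are hypothetical, chosen only for this illustration.

```python
import math


def ideal_binary_mask(target_energy, masker_energy, lc_db=0.0):
    """Return a 0/1 mask with one entry per T-F unit.

    target_energy, masker_energy: 2-D lists (frequency x time) of
    non-negative energies of the premixed target and masker
    (assumed inputs, not taken from the source).
    lc_db: local SNR threshold in dB (assumed name). At 0 dB a unit
    is kept when the target is stronger than the masker.
    """
    mask = []
    for t_row, m_row in zip(target_energy, masker_energy):
        row = []
        for t, m in zip(t_row, m_row):
            if t <= 0.0:
                snr_db = -math.inf  # no target energy: always discard
            elif m <= 0.0:
                snr_db = math.inf   # no masker energy: always keep
            else:
                snr_db = 10.0 * math.log10(t / m)
            row.append(1 if snr_db > lc_db else 0)
        mask.append(row)
    return mask


def apply_mask(mixture, mask):
    """Zero out the T-F units of the mixture that the mask discards."""
    return [[x * b for x, b in zip(x_row, b_row)]
            for x_row, b_row in zip(mixture, mask)]


# Toy 2x2 example with made-up energies.
target = [[4.0, 1.0], [0.5, 9.0]]
masker = [[1.0, 2.0], [5.0, 1.0]]
mixture = [[5.0, 3.0], [5.5, 10.0]]

mask_0db = ideal_binary_mask(target, masker, lc_db=0.0)
masked = apply_mask(mixture, mask_0db)
```

Lowering `lc_db` retains more units: with the toy values above, a threshold of -12 dB keeps all four units, while 0 dB keeps only the two where the target dominates.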

“…Computational auditory scene analysis (CASA) is one of the popular speech separation methods that exploits human perceptual processing in computational systems (Wang & Brown, 2006). Human beings have shown great success in speech separation using our inborn capability.…”

Section: Single-microphone Speech Separation (mentioning)

confidence: 99%

Single-Microphone Speech Separation: The use of Speech Models

Lee

2011

Speech Technologies


“…Masks are applied to spectrograms of mixed sounds. If the value of 1 is applied for a t-f unit in which the target energy is stronger than the total interference energy, and the value of 0 otherwise, the mask is called ideal binary mask (Wang, Brown, 2006;Brungart et al, 2009).…”

Section: Introduction (mentioning)

confidence: 99%

“…They are collectively referred to as Computational Auditory Stream Analysis (CASA, for a review, see Wang and Brown, 2006). …”

Section: Introduction (mentioning)

confidence: 99%

Multimodal Ultrasonic Imaging for Breast Cancer Detection

Camacho, Medina, Cruza, et al.

2012

Archives of Acoustics


Ultrasound is used for breast cancer detection as a technique complementary to mammography, the standard screening method. Current practice is based on reflectivity images obtained with conventional instruments by an operator who positions the ultrasonic transducer by hand over the patient's body. It is a non-ionizing, pain-free, and inexpensive technique that provides a higher contrast than mammography to discriminate between fluid-filled cysts and solid masses, especially for dense breast tissue. However, results are quite dependent on the operator's skills, images are difficult to reproduce, and state-of-the-art instruments have limited resolution and contrast to show micro-calcifications and to discriminate between lesions and the surrounding tissue. In spite of its advantages, these factors have precluded the use of ultrasound for screening. This work approaches the ultrasound-based early detection of breast cancer with a different concept. A ring array with many elements covering 360° around a hanging breast allows repeatable and operator-independent coronal slice images to be obtained. Such an arrangement is well suited for multi-modal imaging that includes reflectivity, compounded, tomography, and phase coherence images for increased specificity in breast cancer detection. Preliminary work carried out with a mechanical emulation of the ring array and a standard breast phantom shows high resolution and contrast, with an artifact-free capability provided by phase coherence processing.
