2008
DOI: 10.1121/1.2932823

Speech perception in noise with binary gains

Abstract: For a given mixture of speech and noise, an ideal binary time-frequency mask is constructed by whether SNR within individual time-frequency units exceeds a local SNR criterion (LC). With linear filters, co-reducing mixture SNR and LC does not alter the ideal binary mask. Taking this manipulation to the limit by setting both mixture SNR and LC to minus infinity produces an output that contains only noise with no target speech at all. This particular output corresponds to turning on or off the filtered noise acc…
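The mask construction described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes the speech and noise energies in each time-frequency unit are already available as small nested lists, and the function names `ideal_binary_mask` and `scale_energy` and the toy energy values are hypothetical. It also assumes that lowering the mixture SNR scales the speech energy linearly in every unit, which is the condition the abstract attaches to linear filters.

```python
import math


def ideal_binary_mask(speech_energy, noise_energy, lc_db):
    """Return a mask with 1 where local SNR (dB) exceeds lc_db, else 0.

    speech_energy and noise_energy are equally shaped nested lists of
    non-negative energies, one value per time-frequency unit (assumed layout).
    """
    mask = []
    for s_row, n_row in zip(speech_energy, noise_energy):
        row = []
        for s, n in zip(s_row, n_row):
            if s <= 0:
                local_snr_db = -math.inf  # no speech energy in this unit
            elif n <= 0:
                local_snr_db = math.inf   # no noise energy in this unit
            else:
                local_snr_db = 10.0 * math.log10(s / n)
            row.append(1 if local_snr_db > lc_db else 0)
        mask.append(row)
    return mask


def scale_energy(energy, gain_db):
    """Scale every energy value by gain_db decibels (linear scaling)."""
    gain = 10.0 ** (gain_db / 10.0)
    return [[e * gain for e in row] for row in energy]


# Toy energies, invented for illustration only.
speech = [[4.0, 0.5, 9.0], [0.1, 2.0, 0.3]]
noise = [[1.0, 2.0, 3.0], [0.5, 1.0, 4.0]]

mask = ideal_binary_mask(speech, noise, lc_db=0.0)

# Lower the mixture SNR by 12 dB (attenuate speech) and lower LC by 12 dB.
shifted = ideal_binary_mask(scale_energy(speech, -12.0), noise, lc_db=-12.0)

print(mask)
print(shifted == mask)
```

In this sketch, lowering the speech level and LC by the same number of decibels leaves every unit's comparison unchanged, which is the invariance the abstract states. The limiting case of minus infinity for both quantities is not modeled here.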

Cited by 12 publications (17 citation statements)
References 0 publications
“…As with the algorithm, it was found that the IBM improved recognition of isolated phonemes, thus extending results observed previously for sentences (Anzalone et al, 2006; Brungart et al, 2006; Li and Loizou, 2008; Wang et al, 2008, 2009; Cao et al, 2011; Sinex, 2013). This result should not be surprising, given the effectiveness of algorithm processing observed here and the fact that the algorithm aimed to estimate the IBM.…”
Section: Algorithm and IBM Processing
supporting
confidence: 73%
“…In the Ideal Binary Mask (IBM), the premixed speech and noise signals are known, and the mask corresponds to ideal classification. IBM processing produces remarkable speech-intelligibility improvements in noise for both HI and NH listeners, even at extremely low SNRs (Anzalone et al, 2006; Brungart et al, 2006; Li and Loizou, 2008; Wang et al, 2008, 2009; Cao et al, 2011; Sinex, 2013).…”
Section: Introduction
mentioning
confidence: 99%
“…The IBM is a binary matrix that classifies the time-frequency (T-F) representation of noisy speech into target-dominated (reliable) and masker-dominated (unreliable) T-F units. Several studies have shown that the IBM can be used to reconstruct a target signal from a noisy mixture, resulting in large improvements of speech intelligibility (e.g., Brungart et al, 2006; Wang et al, 2008). However, the construction of the IBM requires a priori knowledge about the target and the masker, which is typically not available in practice.…”
Section: Introduction
mentioning
confidence: 99%
“…Quality is a property of speech that corresponds to its realism and naturalness, characteristics that are not necessary for intelligibility. [3] and [7] have shown that a frequency-dependent gating or modulation of noise, which has low quality, can be highly intelligible. Thus, while ASR performance can be predicted by quality to some extent, the relationship is imperfect and indirect [8], [9].…”
mentioning
confidence: 99%
“…While human intelligibility has been used to evaluate the quality of ground truth masking-based source separation [2], [3], such evaluations are expensive and time consuming and must be rerun for every variant of a source separation algorithm. ASR performance, on the other hand, requires only computational resources.…”
mentioning
confidence: 99%