Objective Intelligibility Measures Based on Mutual Information for Speech Subjected to Speech Enhancement Processing

Taghia, Jalal; Martin, Rainer

doi:10.1109/tasl.2013.2281574

Cited by 43 publications

(36 citation statements)

References 32 publications


“…A typical scale is the ERB (equivalent rectangular bandwidth) scale, e.g., [18], [19]. It is natural, e.g., [15], to consider the auditory-domain signal to have one independent component signal per ERB. Auditory models provide a manner of deriving such component signals.…”

Section: A Model With Production and Interpretation Noise

mentioning

confidence: 99%
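As a rough illustration of the "one component signal per ERB" idea in the statement above, the sketch below counts how many ERBs span a frequency range. It assumes the Glasberg and Moore (1990) ERB formulas, which the quoted text does not specify. The function names and the 100 Hz to 8 kHz range are hypothetical choices made for this example.

```python
import math


def erb_bandwidth_hz(f_hz):
    """Equivalent rectangular bandwidth (Hz) at centre frequency f_hz.

    Assumption: Glasberg and Moore (1990) approximation.
    """
    return 24.7 * (4.37 * f_hz / 1000.0 + 1.0)


def erb_number(f_hz):
    """Position of f_hz on the ERB-number scale (same assumed model)."""
    return 21.4 * math.log10(4.37 * f_hz / 1000.0 + 1.0)


def num_erb_components(f_low_hz, f_high_hz):
    """Number of ERBs between two frequencies.

    Under the one-component-per-ERB view, this approximates how many
    independent component signals the range would contain.
    """
    return erb_number(f_high_hz) - erb_number(f_low_hz)


if __name__ == "__main__":
    # Hypothetical range chosen only for illustration.
    print(round(num_erb_components(100.0, 8000.0), 1))
```

The count is fractional in general; an auditory model would typically round it or place one filter per ERB step.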

“…The interpretation process for speech is also noisy: speech signals that are ambiguous in their pronunciation may be interpreted in various ways. Information theoretical concepts have been used in the analysis of human hearing [14] and for the definition of measures of intelligibility [15]. These models do not have the notion of production noise, but the model of [14] considers sensory noise, which corresponds to our interpretation noise.…”

mentioning

confidence: 99%

“…These models do not have the notion of production noise, but the model of [14] considers sensory noise, which corresponds to our interpretation noise. The models of [14] and [15] appear not to have been used for optimizing intelligibility.…”

mentioning

confidence: 99%


A Simple Model of Speech Communication and its Application to Intelligibility Enhancement

Kleijn

Hendriks

2015

IEEE Signal Process. Lett.


Abstract: We introduce a model of communication that includes noise inherent in the message production process as well as noise inherent in the message interpretation process. The production and interpretation noise processes have a fixed signal-to-noise ratio. The resulting system is a simple but effective model of human communication. The model naturally leads to a method to enhance the intelligibility of speech rendered in a noisy environment. State-of-the-art experimental results confirm the practical value of the model.




“…Recently, information theory (IT) has been proposed as a new paradigm for speech intelligibility prediction [13,14,15]. This is a natural approach to take given that the fundamental goal of speech communication is to transfer information from a talker to a listener.…”

Section: Introduction

mentioning

confidence: 99%

An intelligibility metric based on a simple model of speech communication

Kuyk

Kleijn

Hendriks

2016

2016 IEEE International Workshop on Acoustic Signal Enhancement (IWAENC)


Instrumental measures of speech intelligibility typically produce an index between 0 and 1 that is monotonically related to listening test scores. As such, these measures are dimensionless and do not represent physical quantities. In this paper, we propose a new instrumental intelligibility metric that describes speech intelligibility using bits per second. The proposed metric builds upon an existing intelligibility metric that was motivated by information theory. Our main contribution is that we use a statistical model of speech communication that accounts for noise inherent in the speech production process. Experiments show that the proposed metric performs at least as well as existing state-of-the-art intelligibility metrics.


“…The STOI measure is based on the sum of the correlation between the envelopes of the clean speech signal and the corrupted speech measured with 15 1/3-octave frequency bands starting at 150 Hz. More recently, using the same frequency bands, it has been shown that a mutual information-based measure can perform better than STOI (Taghia and Martin, 2014).…”

Section: Objective Intelligibility Measures

mentioning

confidence: 99%
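The band layout and envelope-correlation idea described in the statement above can be sketched as follows. This is a simplified illustration, not the full STOI algorithm: it omits the short-time segmentation, normalization, and clipping steps, and it averages rather than sums the per-band correlations. The function names and the (bands, frames) array layout are assumptions made for this example.

```python
import numpy as np


def third_octave_centers(f_min=150.0, num_bands=15):
    """Centre frequencies spaced one third of an octave apart."""
    return f_min * 2.0 ** (np.arange(num_bands) / 3.0)


def mean_envelope_correlation(clean_env, degraded_env):
    """Average per-band Pearson correlation between two envelope sets.

    Both inputs are assumed to have shape (bands, frames).
    """
    clean = np.asarray(clean_env, dtype=float)
    degraded = np.asarray(degraded_env, dtype=float)
    # Remove each band's mean so the dot product becomes a covariance.
    clean = clean - clean.mean(axis=1, keepdims=True)
    degraded = degraded - degraded.mean(axis=1, keepdims=True)
    num = (clean * degraded).sum(axis=1)
    den = np.linalg.norm(clean, axis=1) * np.linalg.norm(degraded, axis=1)
    # Bands with a constant envelope have zero norm; score them as 0.
    corr = np.divide(num, den, out=np.zeros_like(num), where=den > 0)
    return float(corr.mean())
```

A mutual-information-based measure, as proposed in the cited work, would replace the correlation step with an estimate of the mutual information between the clean and degraded envelopes in each band.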

The third ‘CHiME’ speech separation and recognition challenge: Analysis and outcomes

Barker

Marxer

Vincent

et al. 2017

Computer Speech & Language

101


This paper presents the design and outcomes of the CHiME-3 challenge, the first open speech recognition evaluation designed to target the increasingly relevant multichannel, mobile-device speech recognition scenario. The paper serves two purposes. First, it provides a definitive reference for the challenge, including full descriptions of the task design, data capture and baseline systems along with a description and evaluation of the 26 systems that were submitted. The best systems re-engineered every stage of the baseline resulting in reductions in word error rate from 33.4% to as low as 5.8%. By comparing across systems, techniques that are essential for strong performance are identified. Second, the paper considers the problem of drawing conclusions from evaluations that use speech directly recorded in noisy environments. The degree of challenge presented by the resulting material is hard to control and hard to fully characterise. We attempt to dissect the various 'axes of difficulty' by correlating various estimated signal properties with typical system performance on a per session and per utterance basis. We find strong evidence of a dependence on signal-to-noise ratio and channel quality. Systems are less sensitive to variations in the degree of speaker motion. The paper concludes by discussing the outcomes of CHiME-3 in relation to the design of future mobile speech recognition evaluations.

