2021
DOI: 10.1109/taslp.2020.3039929
Speech Intelligibility Prediction Using Spectro-Temporal Modulation Analysis

Cited by 16 publications (11 citation statements)
References 74 publications (143 reference statements)

“…STM-based representations of acoustical stimuli have also successfully been applied to computational models of speech representation [38, 39]. Furthermore, the ability to detect STM has been shown to be related to speech understanding in listeners with normal pure-tone detection thresholds [40, 41, 42], in listeners with cochlear damage [3, 4, 43], and in listeners who use cochlear implants [44, 45, 46].…”
Section: Introduction; citation type: mentioning; confidence: 99%
“…The other category is derived based on the observation that reverberation and/or additive noise tends to reduce the modulation depth of the distorted signal, compared with the clean reference signal. Well-known approaches of this category include the speech transmission index (STI) [25], spectro-temporal modulation index (STMI), normalized-covariance measure (NCM) [26], short-time objective intelligibility (STOI) [28], extended STOI (eSTOI) [29], polynomial measure (SOPM) [32], and weighted spectro-temporal modulation index (wSTMI) [33]. To avoid the necessity for clean reference speech, several non-intrusive approaches have been proposed.…”
Section: Introduction; citation type: mentioning; confidence: 99%
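
The modulation-depth observation in the statement above can be illustrated numerically. The following is a minimal sketch, assuming the standard noise-only modulation transfer relation used in STI-style analyses: stationary additive noise scales the modulation index of the intensity envelope by 1 / (1 + 10^(-SNR/10)). The function name is hypothetical, and reverberation is not modelled.

```python
def modulation_reduction_factor(snr_db: float) -> float:
    """Factor by which stationary additive noise scales the modulation
    index of a signal's intensity envelope.

    Assumption: noise intensity is constant over time, so it raises the
    mean of the envelope without adding modulation. A clean modulation
    index m then becomes m * factor in the noisy signal.
    """
    noise_to_signal = 10.0 ** (-snr_db / 10.0)  # intensity ratio I_noise / I_signal
    return 1.0 / (1.0 + noise_to_signal)


if __name__ == "__main__":
    # Lower SNR gives a smaller factor, i.e. a shallower modulation depth.
    for snr in (20.0, 10.0, 0.0, -10.0):
        print(f"SNR {snr:+5.1f} dB -> factor {modulation_reduction_factor(snr):.3f}")
```

At 0 dB SNR the noise intensity equals the signal intensity, so the modulation depth is halved.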
“…This neural network is based on U-net [50], which consists of multiple layers of convolution and transpose convolution at encoder and decoder, respectively. Differently from neural network, wSTMI [51] proposes a linear model, whose weight parameters are optimized through training. wSTMI is based on STMI [52], which predicts speech intelligibility by calculating the correlation coefficient between the spectro-temporal modulation spectrograms of the clean and the degraded signals.…”
Section: Speech Intelligibility Prediction; citation type: mentioning; confidence: 99%
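
The correlation idea described in the statement above can be sketched as follows. This is only an illustration under stated assumptions, not the published STMI or wSTMI: a plain log-magnitude STFT stands in for the auditory front end, a 2-D FFT of the spectrogram stands in for the spectro-temporal modulation filterbank, and no trained weights are applied. All function names and parameter values are hypothetical.

```python
import numpy as np


def log_spectrogram(x, frame_len=256, hop=128):
    """Log-magnitude STFT spectrogram (frequency x time) with a Hann window."""
    window = np.hanning(frame_len)
    n_frames = 1 + (len(x) - frame_len) // hop
    frames = np.stack(
        [x[i * hop : i * hop + frame_len] * window for i in range(n_frames)]
    )
    magnitude = np.abs(np.fft.rfft(frames, axis=1)).T
    return np.log(magnitude + 1e-8)  # small floor avoids log(0)


def modulation_magnitude(spec):
    """Magnitude of the 2-D FFT of the mean-removed spectrogram.

    Rows of the result index spectral modulation, columns index temporal
    modulation. This is a crude stand-in for a modulation filterbank.
    """
    return np.abs(np.fft.fft2(spec - spec.mean()))


def modulation_correlation(clean, degraded):
    """Pearson correlation between the modulation representations of a
    clean reference and a degraded signal of the same length."""
    a = modulation_magnitude(log_spectrogram(clean)).ravel()
    b = modulation_magnitude(log_spectrogram(degraded)).ravel()
    return float(np.corrcoef(a, b)[0, 1])
```

A usage example would pass a clean signal and its noisy version as equal-length 1-D arrays; a higher value indicates that more of the clean modulation structure survives in the degraded signal. A weighted variant in the spirit of wSTMI would combine per-channel correlations using learned weights, which this sketch omits.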
“…In [41], where SIIB and SIIB^Gauss are compared with ESTOI, HASPI, and the other nine SIPs, SIIB and SIIB^Gauss perform best among 13 data sets. In [51], the latest developed wSTMI was compared with SIIB, SIIB^Gauss, ESTOI, HASPI, and the other nine SIPs. By using some new data sets, wSTMI performs best, with ESTOI ranked second, HASPI ranked third, and SIIB^Gauss ranked fourth.…”
Section: Speech Intelligibility Prediction; citation type: mentioning; confidence: 99%