In an attempt to predict the outcomes of matrix sentence tests in different languages and various noise conditions for native listeners, the simulation framework for auditory discrimination experiments (FADE) and the extended Speech Intelligibility Index (eSII) are employed. FADE uses an automatic speech recognition system to simulate recognition experiments and reports the highest achievable performance as the outcome; this approach has yielded good predictions for the German matrix test in noise. The eSII is based on the short-time analysis of weighted signal-to-noise ratios in different frequency bands. In contrast to many other approaches, including the eSII, FADE uses no empirical reference. In this work, the FADE approach is evaluated for predictions of the German, Polish, Russian, and Spanish matrix tests in stationary and fluctuating noise conditions. The FADE-based predictions yield a high correlation (Pearson's R² = 0.94) with the empirical data and a root-mean-square (RMS) prediction error of 1.9 dB, outperforming the eSII-based predictions (R² = 0.78, RMS = 4.2 dB). FADE can also predict the data of subgroups with only stationary or only fluctuating noises, while the eSII cannot. The FADE-based predictions seem to generalize over different languages and noise conditions.
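The eSII-style computation described above (short-time, band-weighted signal-to-noise ratios) can be sketched as follows. This is a simplified illustration of the general SII logic, not the published eSII procedure; the clipping range, dB-to-audibility mapping, and function name are assumptions.

```python
import numpy as np

def band_snr_index(speech_bands, noise_bands, weights):
    """Toy short-time SNR-weighting index in the spirit of the (e)SII.

    speech_bands, noise_bands: arrays of shape (n_bands, n_frames) holding
    per-band, per-frame signal powers; weights: band-importance weights
    summing to 1. Simplified illustration, not the published eSII.
    """
    snr_db = 10.0 * np.log10(speech_bands / noise_bands)
    # Clip the SNR to a usable range and map it linearly to [0, 1],
    # mirroring the audibility function of the classic SII.
    audibility = (np.clip(snr_db, -15.0, 15.0) + 15.0) / 30.0
    # Average over time frames, then weight by band importance.
    return float(np.sum(weights * audibility.mean(axis=1)))
```

At 0 dB SNR in every band the index is 0.5; at very high SNR it saturates at 1.0, reflecting the clipped audibility mapping.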
The benefit in speech-recognition performance due to the compensation of a hearing loss can vary between listeners, even if unaided performance and hearing thresholds are similar. To accurately predict the individual performance benefit due to a specific hearing device, a prediction model is proposed which takes into account hearing thresholds and a frequency-dependent suprathreshold component of impaired hearing. To test the model, the German matrix sentence test was performed in unaided and individually aided conditions in quiet and in noise by 18 listeners with different degrees of hearing loss. The outcomes were predicted by an individualized automatic speech-recognition system where the individualization parameter for the suprathreshold component of hearing loss was inferred from tone-in-noise detection thresholds. The suprathreshold component was implemented as a frequency-dependent multiplicative noise (mimicking level uncertainty) in the feature-extraction stage of the automatic speech-recognition system. Its inclusion improved the root-mean-square prediction error of individual speech-recognition thresholds (SRTs) from 6.3 dB to 4.2 dB and of individual benefits in SRT due to common compensation strategies from 5.1 dB to 3.4 dB. The outcome predictions are highly correlated with both the corresponding observed SRTs (R² = .94) and the benefits in SRT (R² = .89) and hence might help to better understand, and eventually mitigate, the perceptual consequences of as yet unexplained hearing problems, also discussed in the context of hidden hearing loss.
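The "multiplicative noise mimicking level uncertainty" described above can be sketched as a per-channel random perturbation of log-domain features. The function name, the Gaussian parameterization, and the feature layout are assumptions for illustration, not the paper's exact implementation:

```python
import numpy as np

def apply_level_uncertainty(log_features, uncertainty_db, rng=None):
    """Add frequency-dependent 'level uncertainty' to log-domain features.

    log_features: array of shape (n_frames, n_channels) in dB;
    uncertainty_db: per-channel standard deviation in dB, shape (n_channels,).
    Hypothetical sketch of the suprathreshold degradation described above.
    """
    rng = np.random.default_rng(rng)
    # Multiplicative noise on linear amplitudes is additive noise in the
    # log/dB domain, so the per-channel jitter is simply added here.
    jitter = rng.normal(0.0, uncertainty_db, size=log_features.shape)
    return log_features + jitter
```

With zero uncertainty the features pass through unchanged, so the unimpaired model is recovered as a special case of the individualized one.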
Sound onsets provide particularly valuable cues for musical instrument identification by human listeners. It has remained unclear whether this onset advantage is due to enhanced perceptual encoding or the richness of acoustical information during onsets. Here this issue was approached by modeling a recent study on instrument identification from tone excerpts [Siedenburg (2019). J. Acoust. Soc. Am. 145(2), 1078–1087]. A simple Hidden Markov Model classifier with separable Gabor filterbank features simulated human performance and replicated the onset advantage observed previously for human listeners. These results provide evidence that the onset advantage may be driven by the distinct acoustic qualities of onsets.
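A separable Gabor filterbank feature, as mentioned above, applies a 1-D Gabor kernel along the time axis of a spectrogram and another along the frequency axis. The sketch below shows this idea in minimal form; the kernel sizes, parameters, and function names are illustrative assumptions, not the classifier's actual configuration:

```python
import numpy as np

def gabor_kernel_1d(n, omega, sigma):
    """1-D Gabor kernel: a complex sinusoid under a Gaussian window."""
    t = np.arange(n) - (n - 1) / 2.0
    return np.exp(1j * omega * t) * np.exp(-(t**2) / (2.0 * sigma**2))

def separable_gabor_feature(spectrogram, k_time, k_freq):
    """Separable spectro-temporal Gabor feature (magnitude response).

    spectrogram: array of shape (n_freqs, n_frames). Filtering along time
    and frequency with 1-D kernels is equivalent to one separable 2-D
    Gabor filter, which keeps the computation cheap.
    """
    # Filter along time (axis 1), then along frequency (axis 0).
    tmp = np.apply_along_axis(
        lambda r: np.convolve(r, k_time, mode="same"), 1, spectrogram)
    out = np.apply_along_axis(
        lambda c: np.convolve(c, k_freq, mode="same"), 0, tmp)
    return np.abs(out)
```

Frame-wise features of this kind can then be fed to a sequence classifier such as an HMM, which scores each instrument class on the excerpt's feature trajectory.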
In many applications in which speech is played back via a sound reinforcement system such as public address systems and mobile phones, speech intelligibility is degraded by additive environmental noise. A possible solution to maintain high intelligibility in noise is to pre-process the speech signal based on the estimated noise power at the position of the listener. The previously proposed AdaptDRC algorithm [Schepker, Rennies, and Doclo (2015). J. Acoust. Soc. Am. 138, 2692-2706] applies both frequency shaping and dynamic range compression under an equal-power constraint, where the processing is adaptively controlled by short-term estimates of the speech intelligibility index. Previous evaluations of the algorithm have focused on normal-hearing listeners. In this study, the algorithm was extended with an adaptive gain stage under an equal-peak-power constraint, and evaluated with eleven normal-hearing and ten mildly to moderately hearing-impaired listeners. For normal-hearing listeners, average improvements in speech reception thresholds of about 4 and 8 dB compared to the unprocessed reference condition were measured for the original algorithm and its extension, respectively. For hearing-impaired listeners, the average improvements were about 2 and 6 dB, indicating that the relative improvement due to the proposed adaptive gain stage was larger for these listeners than the benefit of the original processing stages.
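Dynamic range compression under a power constraint, the core idea behind the processing described above, can be sketched as follows. This is a toy static broadband compressor with renormalization to equal output power; the AdaptDRC algorithm itself is adaptive, frequency-dependent, and SII-controlled, none of which is modeled here, and the threshold and ratio values are arbitrary:

```python
import numpy as np

def compress_equal_power(x, threshold=0.1, ratio=4.0):
    """Toy instantaneous compression plus an equal-power constraint.

    Samples whose magnitude exceeds `threshold` are attenuated according
    to the compression ratio; the result is rescaled so that output power
    equals input power. Illustrative only, not the AdaptDRC algorithm.
    """
    mag = np.abs(x) + 1e-12
    # Above the threshold, a ratio of 4 maps every 4 dB of input level
    # increase to 1 dB of output level increase.
    gain = np.where(mag > threshold,
                    (mag / threshold) ** (1.0 / ratio - 1.0), 1.0)
    y = x * gain
    # Rescale so the output carries the same power as the input.
    y *= np.sqrt(np.sum(x**2) / np.sum(y**2))
    return y
```

Because power is held fixed, compression lowers the crest factor: loud portions are attenuated, quiet portions are effectively boosted, which is what raises the audibility of low-level speech cues in noise.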