A speech pause detection algorithm is an important and sensitive part of most single-microphone noise reduction schemes for the enhancement of speech signals corrupted by additive noise, as an estimate of the background noise is usually determined while speech is absent. An algorithm is proposed that detects speech pauses by adaptively tracking minima in the power envelope of the noisy signal, both for the broadband signal and for its high-pass and low-pass filtered components. At poor signal-to-noise ratios (SNRs), the proposed algorithm maintains a low false-alarm rate in the detection of speech pauses, whereas the standardized ITU G.729 algorithm shows an increasing false-alarm rate in unfavorable conditions. These characteristics are found with different types of noise and indicate that the proposed algorithm is better suited for noise estimation in noise reduction algorithms, as speech deterioration may thus be kept at a low level. It is further shown that, in connection with the Ephraim-Malah noise reduction scheme [1], the speech pause detection performance can be increased by using the noise-reduced signal instead of the noisy signal as input to the speech pause decision unit.
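The minima-tracking idea can be sketched as follows. This is an illustrative simplification, not the proposed algorithm itself: the frame length, smoothing constant, window length, and decision margin are hypothetical, and the sketch tracks only the broadband envelope, omitting the high-pass and low-pass branches. A frame is flagged as a speech pause when its smoothed power stays close to the tracked minimum of the power envelope, which follows the noise floor.

```python
import numpy as np

def pause_detect(frames, alpha=0.9, win=100, margin=2.0):
    """Illustrative minima-tracking pause detector (hypothetical parameters).

    frames: 2-D array (n_frames, frame_len) of the noisy signal.
    alpha:  recursive smoothing constant for the power envelope.
    win:    number of past frames searched for the envelope minimum.
    margin: a frame counts as a pause if its smoothed power is below
            margin times the tracked minimum (the noise-floor estimate).
    """
    power = np.mean(frames ** 2, axis=1)          # short-time power envelope
    smoothed = np.empty_like(power)
    pauses = np.zeros(len(power), dtype=bool)
    p = power[0]
    for i, x in enumerate(power):
        p = alpha * p + (1 - alpha) * x           # recursive smoothing
        smoothed[i] = p
        lo = max(0, i - win + 1)
        local_min = smoothed[lo:i + 1].min()      # tracked envelope minimum
        pauses[i] = p < margin * local_min        # pause if near the minimum
    return pauses
```

When speech energy is added on top of the background noise, the smoothed power rises well above the tracked minimum and the pause flag is cleared; during pauses the power falls back toward the minimum.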
In noisy and reverberant environments, the performance of automatic speech recognition systems drops below acceptable levels. It has been shown before that using a psychoacoustical model of peripheral auditory processing, as introduced by Dau [1], yields much higher recognition rates in noisy background situations than standard pre-processing methods [2]. To further improve the robustness of a speaker-independent digit recognition system, single-microphone noise reduction procedures [3] have been combined with the auditory model feature extractor and continuous Hidden Markov Models as well as locally recurrent neural networks. The study shows that the recognition rates improve significantly for certain noise conditions, while the performance for clean speech is not negatively affected.
A single-channel noise suppression algorithm based on the Ephraim–Malah suppression scheme [Y. Ephraim and D. Malah, IEEE Trans. Acoust. Speech Signal Process. 32, 1109–1121 (1984)] was tested with hearing-impaired subjects under different noise conditions. Significant benefits could be demonstrated for hearing-impaired subjects regarding reductions in listener fatigue and in the mental effort needed to listen to speech in noise over longer periods of time. However, an improvement of speech reception thresholds measured with the Göttingen sentence test could not be shown. Hence, a combination of the single-channel noise suppression scheme with the directional filter binaural noise reduction algorithm [T. Wittkop and V. Hohmann, this conference] is investigated. While the directional filter is assumed to suppress distinct noise sources, such as jammer talkers located outside the desired direction, the monaural noise suppression algorithm reduces the diffuse stationary noise floor. Different combination strategies are being studied and evaluated with hearing-impaired subjects with regard to speech intelligibility, ease of listening, and subjective quality. The benefits as well as the shortcomings of the combined noise suppression algorithms for hearing-impaired subjects will be described and discussed. Implications for future developments will be drawn.
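The single-channel suppression step can be sketched in spectral-gain form. This is a simplified stand-in, not the full Ephraim–Malah MMSE-STSA estimator: it uses the decision-directed a priori SNR update from that framework, but applies a plain Wiener gain instead of the full MMSE gain (which additionally involves Bessel functions). The parameters and the assumption of a fixed noise power estimate are illustrative.

```python
import numpy as np

def suppress(noisy_mag, noise_psd, alpha=0.98, gmin=0.1):
    """Frame-by-frame spectral suppression (illustrative sketch).

    noisy_mag: (n_frames, n_bins) STFT magnitudes of the noisy signal.
    noise_psd: (n_bins,) noise power estimate, e.g. obtained during
               detected speech pauses.
    alpha:     decision-directed smoothing constant.
    gmin:      spectral floor, limits musical-noise artifacts.
    Returns the enhanced STFT magnitudes.
    """
    gains = np.empty_like(noisy_mag)
    prev_clean_pow = np.zeros(noisy_mag.shape[1])
    for t, mag in enumerate(noisy_mag):
        gamma = np.maximum(mag ** 2 / noise_psd, 1e-10)   # a posteriori SNR
        # decision-directed a priori SNR: mix of the previous clean-speech
        # power estimate and the instantaneous SNR excess
        xi = alpha * prev_clean_pow / noise_psd \
             + (1 - alpha) * np.maximum(gamma - 1.0, 0.0)
        g = np.maximum(xi / (1.0 + xi), gmin)             # Wiener gain, floored
        gains[t] = g
        prev_clean_pow = (g * mag) ** 2                   # feedback for next frame
    return gains * noisy_mag
```

Bins dominated by speech receive a gain near one, while bins at the noise floor are attenuated down to the spectral floor; the recursive a priori SNR estimate is what keeps the residual noise smooth rather than "musical".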