We propose a novel application of the vocoder postfilter to increase the perceived loudness of clean speech without increasing signal energy or degrading intelligibility. The critical band concept in auditory theory states that the perceived loudness of a narrow-band signal increases when its bandwidth grows beyond a critical band, even though the energy remains constant. Our postfilter technique applies formant bandwidth expansion to the vowel regions of speech, without changing the vowel power, to elevate perceived loudness. Vowels contain the highest energy, have a smooth spectral envelope, and are sustained over long durations, which makes them suitable targets for a loudness enhancement technique. ISO-532B loudness analysis patterns and listening tests are provided to demonstrate a perceptual loudness improvement corresponding to a 2 dB power gain.
A warped filter is presented as a new speech enhancement method to adjust formant bandwidths on a critical band scale. The warped filter enhances perceived loudness without adding signal energy by exploiting the psychoacoustic nature of the auditory system. The critical band concept in auditory theory states that when the energy in a signal remains constant, loudness increases when the energy spreads beyond a critical bandwidth. The warped filter is developed to elevate the perceived loudness of clean speech by applying non-linear bandwidth expansion to the formant regions of vowels in accordance with the critical band scale. The filter is motivated by the biological representation of loudness in the peripheral auditory system and by the critical band concept of hearing.

BACKGROUND

Loudness is intimately related to the critical band concept of hearing. The critical band concept states that spectral components separated far enough in frequency to fall into different auditory channels are processed separately. It also states that when the energy in a signal remains constant, loudness increases once the signal bandwidth exceeds a critical band. This provides a compelling motivation for increasing speech loudness through formant bandwidth expansion, without adding energy to the signal. Such a method is of practical interest for power-limited devices with high audio output requirements, such as cell phones or hearing aids.

Vowels are proposed as candidates for formant bandwidth expansion since they are high in energy, resonant, and spectrally smooth. Because vowel formant bandwidths increase with increasing frequency, the filter should elevate speech loudness by applying bandwidth expansion on a critical band scale.
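To make the phrase "on a critical band scale" concrete, the following minimal sketch evaluates one common analytic approximation of the Bark scale and of the critical bandwidth (the Zwicker and Terhardt formulas). The text does not say which approximation the authors use, so these formulas are an assumption, as are the helper `expansion_hz` and the illustrative formant frequencies.

```python
import math


def hz_to_bark(f_hz):
    """Approximate critical-band rate (Bark) for a frequency in Hz."""
    f_khz = f_hz / 1000.0
    return 13.0 * math.atan(0.76 * f_khz) + 3.5 * math.atan((f_khz / 7.5) ** 2)


def critical_bandwidth_hz(f_hz):
    """Approximate critical bandwidth in Hz around a centre frequency in Hz."""
    f_khz = f_hz / 1000.0
    return 25.0 + 75.0 * (1.0 + 1.4 * f_khz ** 2) ** 0.69


def expansion_hz(formant_hz, fraction_of_band):
    """Hypothetical helper: a bandwidth increment equal to a fixed fraction
    of the local critical band, so higher formants are widened by more Hz."""
    return fraction_of_band * critical_bandwidth_hz(formant_hz)


if __name__ == "__main__":
    # Illustrative formant frequencies only, not values from the paper.
    for formant_hz in (500.0, 1500.0, 2500.0):
        print(
            f"{formant_hz:7.1f} Hz  "
            f"{hz_to_bark(formant_hz):5.2f} Bark  "
            f"critical band {critical_bandwidth_hz(formant_hz):6.1f} Hz  "
            f"expansion {expansion_hz(formant_hz, 0.5):6.1f} Hz"
        )
```

Because the critical bandwidth grows with frequency, an expansion that is uniform on this scale is non-uniform in Hz, which is the behaviour the warped filter is meant to provide.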
The LPC pole displacement technique is applied in the warped domain to increase formant bandwidths on a critical band scale. The bandwidth adjustment technique has been used in spectral distortion measures [1], to sharpen formant bandwidths [2], as a postfilter [3], and recently to increase vowel loudness on a linear frequency scale [4]. The authors believe they are the first to apply the technique in the warped domain to adjust formants on a critical band scale to elevate perceived loudness [5]. The technique is used within the context of a Warped Linear Prediction Coefficient (WLPC) vocoder structure to provide the necessary degree of freedom for critical bandwidth adjustment [6]. In this paper we show how an off-axis radius term is incorporated in the linear transformation of the warped coefficient set.

Funding for this research was provided by the iDEN Technology Group and Product Development Group of Motorola.

POLE DISPLACEMENT MODEL

A technique used to alter formant bandwidth is shown in Eq. (1) [7] and demonstrated in Fig. 1. This provides a way to evaluate the Z transform on a circle with radius r greater than or less than that of the unit circle. For 0 < r < 1 the evaluation is on a circle closer to the poles and the contribution of the poles has effectively increased, thus sh...
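The effect of the radius r described above can be sketched for an ordinary (unwarped) LPC polynomial. Eq. (1) is not reproduced in this excerpt, so the sketch assumes the usual form: evaluating A(z) = Σ a_k z^(-k) at z = r·e^(jω) scales coefficient a_k by r^(-k) and moves each pole p to p/r. The function names, the two-pole test resonator, and the numeric values are hypothetical, and the warped-domain version discussed in the paper is not shown.

```python
import math


def displace_poles(a, r):
    """Evaluate A(z) = sum_k a[k] * z**-k on the circle |z| = r.

    Substituting z = r * e^{jw} scales coefficient k by r**(-k), which is
    equivalent to moving every pole of 1/A(z) from p to p / r.
    """
    if r <= 0.0:
        raise ValueError("r must be positive")
    return [ak * r ** (-k) for k, ak in enumerate(a)]


def resonator(pole_radius, freq_hz, fs_hz):
    """Denominator of a two-pole resonator (hypothetical test signal model)."""
    theta = 2.0 * math.pi * freq_hz / fs_hz
    return [1.0, -2.0 * pole_radius * math.cos(theta), pole_radius ** 2]


def bandwidth_hz(pole_radius, fs_hz):
    """Standard approximation of the 3 dB bandwidth of a pole pair."""
    return -math.log(pole_radius) * fs_hz / math.pi


if __name__ == "__main__":
    fs_hz = 8000.0
    a = resonator(0.95, 500.0, fs_hz)
    for r in (0.98, 1.0, 1.05):
        a_new = displace_poles(a, r)
        radius = math.sqrt(a_new[2])  # pole radius of the two-pole section
        print(f"r = {r:4.2f}  pole radius = {radius:.4f}  "
              f"bandwidth = {bandwidth_hz(radius, fs_hz):6.1f} Hz")
```

Under this assumption, r < 1 pushes the poles toward the unit circle and narrows the formant, while r > 1 pulls them inward and widens it. The filter stays stable only while r exceeds the largest pole radius.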