This paper describes the application of pitch and tone information in the parametric trajectory model. Pitch, as a dynamic feature, and its contour (tone), as a segment-level feature, have particular characteristics that fit the parametric trajectory model better than MFCC and energy do. We present an improved pitch extraction algorithm and two integration methods, "total" and "parallel", for combining this information with the base model. In a Mandarin connected-digit recognition experiment, the two methods achieve error reductions of 22.87% and 33.54%, respectively; combining the two methods yields a 38.72% error reduction.
Abstract: The ability to extract a primary speech signal from an environment with multiple speakers is an important issue in speech enhancement [1,2]. This paper presents a method for incorporating multiple parallel beamformers with a Wiener filter. By iteratively improving the spectral magnitude estimates of each speech source, substantial improvement in overall signal separation can be obtained. The performance of the algorithm is illustrated using a simulated multiple speaker environment with resulting SNR and sSNR plots.
The ability to extract and enhance a primary speech signal from an environment with multiple speakers is an important issue [B. Widrow, "A Microphone Array for Hearing Aids," IEEE Circuits & Systems Magazine, vol. 1, no. 2 (2001)]. While methods exist for a variety of beamforming techniques [M. Brandstein and D. Ward, Microphone Arrays: Signal Processing Techniques and Applications, Springer, New York (2001)] as well as for multi-source filtering in stationary noise [H. Saruwatari, et al., "Speech Enhancement Using Nonlinear Microphone Array With Noise Adaptive Complementary Beamforming," Proc. of IEEE ICASSP, 1049-1052 (2000)], the theory has yet to be developed for integrating spatial filtering with additional enhancement methods to deal with the non-stationary interference from competing talkers. This paper presents a novel method for incorporating multiple parallel beamformers with traditional speech enhancement algorithms, particularly the Wiener filter and spectral subtraction. By iteratively improving the spectral magnitude estimates of each speech source, substantial improvement in overall signal separation can be obtained. The performance of the algorithm is illustrated using a simulated multiple speaker environment with resulting SNR and sSNR plots. [Work supported by DOE GAANN Fellowship.]
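The abstract does not give the algorithm's details, so the following is only a minimal sketch of two generic building blocks it mentions: a Wiener-style gain computed from per-source spectral magnitude estimates, and the SNR and segmental SNR metrics (reading "sSNR" as segmental SNR is an assumption). The function names, the `floor` regularizer, and the [-10, 35] dB clamp on per-frame SNR are illustrative choices, not taken from the paper.

```python
import math


def wiener_gains(magnitudes, floor=1e-12):
    """Wiener-style gains for one time-frequency bin.

    magnitudes: estimated spectral magnitudes, one per source (for example,
    one per beamformer output). Each gain is that source's estimated power
    divided by the total estimated power, so the gains lie in [0, 1].
    """
    powers = [m * m for m in magnitudes]
    total = sum(powers) + floor  # floor avoids division by zero
    return [p / total for p in powers]


def snr_db(clean, estimate, floor=1e-12):
    """Global SNR in dB between a clean signal and its estimate."""
    signal = sum(c * c for c in clean)
    noise = sum((c - e) ** 2 for c, e in zip(clean, estimate))
    return 10.0 * math.log10((signal + floor) / (noise + floor))


def segmental_snr_db(clean, estimate, frame_len, lo=-10.0, hi=35.0):
    """Mean of per-frame SNRs over non-overlapping frames.

    Per-frame values are clamped to [lo, hi] dB (an assumed convention)
    so that silent or perfectly matched frames do not dominate the mean.
    """
    starts = range(0, len(clean) - frame_len + 1, frame_len)
    values = []
    for i in starts:
        frame_snr = snr_db(clean[i:i + frame_len], estimate[i:i + frame_len])
        values.append(min(hi, max(lo, frame_snr)))
    if not values:
        raise ValueError("signal is shorter than one frame")
    return sum(values) / len(values)
```

In an iterative scheme of the kind the abstract outlines, gains like these would presumably be recomputed each time the per-source magnitude estimates are refined; how the paper actually performs that update is not stated here.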