Recognition of broadcast data, such as TV and radio programs, is a topic of great interest. One problem with such data is the frequent presence of background music, which degrades the performance of speech recognition systems. In this paper we examine the effects of different kinds of music on automatic speech recognition systems by comparing them with the relatively well-known effects of white noise on these systems. We also examine the extent to which compensation algorithms that have been successfully applied to noisy speech also improve recognition accuracy for speech corrupted by music. We hope that these experimental comparisons will lead to a better understanding of how to compensate for the effects of background music.
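A comparison between music and white noise is usually only meaningful when both interferers are added to the speech at the same level. The abstract does not say how the corrupted speech was produced, so the following is only a minimal sketch under the assumption that the interferer is scaled to a target signal-to-noise ratio (SNR) before mixing. The function names `mix_at_snr` and `measured_snr_db` are hypothetical and do not come from the paper.

```python
import math
import random


def power(signal):
    """Mean squared amplitude of a sequence of samples."""
    return sum(x * x for x in signal) / len(signal)


def mix_at_snr(speech, interferer, snr_db):
    """Add an interferer (music or noise) to speech at a target SNR in dB.

    Assumption: SNR is defined as the ratio of average speech power to
    average interferer power over the whole utterance.
    """
    if len(speech) != len(interferer):
        raise ValueError("speech and interferer must have the same length")
    p_speech = power(speech)
    p_interferer = power(interferer)
    if p_speech == 0 or p_interferer == 0:
        raise ValueError("both signals must have non-zero power")
    # Gain that makes p_speech / (gain^2 * p_interferer) equal 10^(snr_db / 10).
    gain = math.sqrt(p_speech / (p_interferer * 10 ** (snr_db / 10)))
    return [s + gain * n for s, n in zip(speech, interferer)]


def measured_snr_db(speech, mixed):
    """Recover the SNR of a mixture when the clean speech is known."""
    residual = [m - s for m, s in zip(mixed, speech)]
    return 10 * math.log10(power(speech) / power(residual))


if __name__ == "__main__":
    rng = random.Random(0)
    # Stand-ins for real audio: a tone as "speech", Gaussian samples as white noise.
    speech = [math.sin(2 * math.pi * 440 * t / 16000) for t in range(1600)]
    white_noise = [rng.gauss(0.0, 1.0) for _ in range(1600)]
    mixed = mix_at_snr(speech, white_noise, snr_db=10.0)
    print(round(measured_snr_db(speech, mixed), 3))
```

Passing a music excerpt in place of `white_noise` with the same `snr_db` would give the two conditions equal average interferer power, which is one plausible basis for the comparison the abstract describes.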
Acoustic talker direction finders have potential applications to camera pointing for teleconferencing and to microphone array beam steering for audio communication and voice processing systems. This paper describes a laboratory setup and computer interface that was developed for testing talker direction finder algorithms in the Bell Labs Varechoic chamber, a room with computer-controlled absorbing panels. The procedure exploits the full capability of the facility by automatically stepping through a sequence of room panel configurations, outputting a digital speech signal, running the processor, and collecting the data. The advantage of this technique is that it allows for testing under a multitude of different acoustic conditions in the same physical location, thereby enabling a general characterization of the algorithm under evaluation. As an example of the technique, we have implemented the Fischell–Coker talker direction finder algorithm in real-time C code running on an SGI workstation, the same machine that orchestrates the automatic testing procedure.
Acoustic talker direction finders have potential applications to camera pointing for teleconferencing and microphone array beam steering for suppressing reverberation in all types of communication and voice processing systems. This paper describes a laboratory setup and computer interface that was developed for testing talker direction finder algorithms in the Bell Labs varechoic chamber, which is a room with computer-controlled absorbing panels. The procedure utilizes the full capability of the facility by automatically stepping through a sequence of room panel configurations, outputting a digital speech signal, running the processor, and collecting the data. The advantage of this technique is that it allows for testing under a multitude of different acoustic conditions in the same physical location, thereby enabling a general characterization of the algorithm under evaluation.
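The automated procedure described above (set a panel configuration, play the speech signal, run the direction finder, collect the result) can be sketched as a simple loop. This is an illustrative outline only, not the authors' implementation: the chamber interface, the playback and recording routine, and the direction finder are passed in as hypothetical callables (`set_panels`, `play_and_record`, `estimate_direction`), because the abstract does not describe their actual interfaces.

```python
from dataclasses import dataclass
from typing import Callable, Iterable, List, Sequence, Tuple


@dataclass
class TrialResult:
    panel_config: Tuple[int, ...]   # panel states used for this trial
    estimated_angle_deg: float      # output of the direction finder
    error_deg: float                # signed error, wrapped to [-180, 180)


def angular_error(estimate_deg: float, truth_deg: float) -> float:
    """Signed angular difference wrapped into [-180, 180) degrees."""
    return (estimate_deg - truth_deg + 180.0) % 360.0 - 180.0


def run_sweep(
    panel_configs: Iterable[Sequence[int]],
    set_panels: Callable[[Tuple[int, ...]], None],
    play_and_record: Callable[[], Sequence[float]],
    estimate_direction: Callable[[Sequence[float]], float],
    true_angle_deg: float,
) -> List[TrialResult]:
    """Step through room configurations and collect one result per configuration."""
    results = []
    for config in panel_configs:
        config = tuple(config)
        set_panels(config)                        # change the room acoustics
        recording = play_and_record()             # output speech, capture microphones
        estimate = estimate_direction(recording)  # algorithm under evaluation
        results.append(
            TrialResult(config, estimate, angular_error(estimate, true_angle_deg))
        )
    return results


if __name__ == "__main__":
    # Stub hardware so the sketch runs without a chamber (all values are made up).
    applied = []
    results = run_sweep(
        panel_configs=[(0, 0), (1, 0), (1, 1)],
        set_panels=applied.append,
        play_and_record=lambda: [0.0] * 8,
        estimate_direction=lambda recording: 32.0,
        true_angle_deg=30.0,
    )
    for r in results:
        print(r.panel_config, r.estimated_angle_deg, r.error_deg)
```

Passing the hardware-facing steps as callables keeps the sweep logic independent of the facility, so different direction finder algorithms could be evaluated against the same sequence of room configurations.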