This hybrid review-case study introduces three-dimensional (3-D) virtual worlds and their educational potential to medical/health librarians and educators.
In this paper, we report on our recent efforts to collect a corpus of spoken lecture material that will enable research directed towards fast, accurate, and easy access to lecture content. Thus far, we have collected a corpus of 270 hours of speech from a variety of undergraduate courses and seminars. We report on an initial analysis of the spontaneous speech phenomena present in these data and the vocabulary usage patterns across three courses. Finally, we examine the perplexities of language models trained on written and spoken materials, and describe an initial recognition experiment on one course.
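To make the perplexity comparison mentioned above concrete, here is a minimal sketch. It assumes an add-one smoothed unigram model and three tiny invented token lists (`written`, `spoken`, `test`); none of these come from the paper, whose models and data are not described in this abstract. It only illustrates how perplexity on spoken-style test text can differ depending on the training material.

```python
import math
from collections import Counter

def train_unigram(tokens, vocab):
    """Add-one smoothed unigram probabilities over a fixed vocabulary."""
    counts = Counter(t for t in tokens if t in vocab)
    total = sum(counts.values()) + len(vocab)
    return {w: (counts[w] + 1) / total for w in vocab}

def perplexity(model, tokens):
    """exp of the average negative log-probability per token."""
    log_prob = sum(math.log(model[t]) for t in tokens)
    return math.exp(-log_prob / len(tokens))

# Hypothetical toy data, invented for illustration only.
written = "the algorithm computes the result of the function".split()
spoken = "um so the the function you know it computes uh the result".split()
test = "so um the function uh computes the result".split()

# Shared closed vocabulary so both models assign a probability to every test token.
vocab = set(written) | set(spoken) | set(test)

ppl_written = perplexity(train_unigram(written, vocab), test)
ppl_spoken = perplexity(train_unigram(spoken, vocab), test)
print(f"trained on written text: {ppl_written:.2f}")
print(f"trained on spoken text:  {ppl_spoken:.2f}")
```

On this toy data the model trained on spoken-style text gives the lower perplexity, because it has seen the filler words that the written-style text lacks. Real experiments would use higher-order n-gram or neural models and far larger corpora.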
The use of segment-based features and segmentation networks in a segment-based speech recognizer complicates the probabilistic modeling because it alters the sample space of all possible segmentation paths and the feature observation space. This paper describes a novel Baum-Welch training algorithm for segment-based speech recognition which addresses these issues by an innovative use of finite-state transducers. This procedure has the desirable property of not requiring the initial seed models that were needed by the Viterbi training procedure we have used previously. On the PhoneBook telephone-based corpus of read, isolated words, the Baum-Welch training algorithm obtained a relative error reduction of 37% on the training set and a relative error reduction of 5% on the test set, compared to Viterbi-trained models. When combined with a duration model and a more flexible segmentation network, the Baum-Welch trained models obtain an overall word error rate of 7.6%, which is the best result we have seen published for the 8,000-word task.
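The difference between Viterbi and Baum-Welch training can be illustrated on a much simpler model than the one in the paper. The sketch below assumes a conventional frame-based discrete HMM with invented parameters (`pi`, `A`, `B`, `obs`); it is not the segment-based, finite-state-transducer formulation the abstract describes. Viterbi training would collect statistics from the single best state path, while Baum-Welch weights every state at every frame by its posterior probability, computed here with the forward-backward recursion.

```python
def forward_backward(pi, A, B, obs):
    """Return per-frame state posteriors gamma[t][i] for a discrete HMM."""
    n, T = len(pi), len(obs)
    alpha = [[0.0] * n for _ in range(T)]
    beta = [[1.0] * n for _ in range(T)]
    for i in range(n):
        alpha[0][i] = pi[i] * B[i][obs[0]]
    for t in range(1, T):
        for j in range(n):
            alpha[t][j] = B[j][obs[t]] * sum(
                alpha[t - 1][i] * A[i][j] for i in range(n))
    for t in range(T - 2, -1, -1):
        for i in range(n):
            beta[t][i] = sum(
                A[i][j] * B[j][obs[t + 1]] * beta[t + 1][j] for j in range(n))
    likelihood = sum(alpha[T - 1])
    return [[alpha[t][i] * beta[t][i] / likelihood for i in range(n)]
            for t in range(T)]

def viterbi(pi, A, B, obs):
    """Return the single most likely state path (hard alignment)."""
    n, T = len(pi), len(obs)
    delta = [pi[i] * B[i][obs[0]] for i in range(n)]
    back = []
    for t in range(1, T):
        # Best predecessor for each state j, then the updated path scores.
        prev = [max(range(n), key=lambda i: delta[i] * A[i][j]) for j in range(n)]
        delta = [delta[prev[j]] * A[prev[j]][j] * B[j][obs[t]] for j in range(n)]
        back.append(prev)
    state = max(range(n), key=lambda i: delta[i])
    path = [state]
    for prev in reversed(back):
        state = prev[state]
        path.append(state)
    return path[::-1]

# Hypothetical two-state model and observation sequence, for illustration only.
pi = [0.6, 0.4]
A = [[0.7, 0.3], [0.4, 0.6]]
B = [[0.9, 0.1], [0.2, 0.8]]
obs = [0, 0, 1, 1]

gamma = forward_backward(pi, A, B, obs)   # soft counts used by Baum-Welch
path = viterbi(pi, A, B, obs)             # hard alignment used by Viterbi training
print(path)
for t, row in enumerate(gamma):
    print(t, [round(p, 3) for p in row])
```

A full Baum-Welch iteration would additionally accumulate transition posteriors and re-estimate `pi`, `A`, and `B` from these soft counts; that step is omitted here to keep the sketch short.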
The SAPPHIRE system is a powerful, extensible, object-oriented toolkit allowing researchers to rapidly build and configure customized speech analysis tools. Implemented in Tcl/Tk and C, the current version of SAPPHIRE provides a wide range of functionality, including the ability to configure and run the SUMMIT speech recognition system. We now use SAPPHIRE widely in almost all aspects of our speech analysis and recognition research.
MOTIVATION
Investing in the development of tools and other forms of infrastructure is a critical role that all research institutions must periodically undertake. Unfortunately, such an investment is time-consuming, and the benefits are not always apparent nor readily measurable. As a result, one can easily be so distracted by day-to-day affairs that such activities are indefinitely deferred.
Powerful tools are particularly necessary for research and development of technologies involving human language, since the processes of signal modelling, algorithm development, and system evaluation rely largely on empirical evidence. The situation is confounded by the fact that, more often than not, research activities involve not a single individual but a team of collaborators, each responsible for a specific aspect of the problem. A comprehensive set of research tools will greatly enhance researchers' ability to collectively explore, formulate, and test new ideas with minimum effort.
There are several existing tools geared towards the development of human language technologies. In the mid-eighties our group developed SPIRE [1], a set of Lisp-machine based signal analysis and display tools that enjoyed limited distribution. Another speech-related tool is the ISP package designed by Kopec [2]. Perhaps the most widely used is the signal processing and display package from Entropic, ESPS/Waves [3].
Recently, the CSLU at OGI has also developed and disseminated a set of tools, based on Tcl/Tk, that allow researchers to rapidly configure a dialogue-based speech interface [4].
Over the past few years, our group has been actively developing human language technologies, and embedding these technologies in conversational systems. In this process, we have become increasingly aware of the multiple roles played by tools, and the important properties that they must possess. First, the tools must be intuitive. A toolkit that requires significant training on the part of the user will not likely gain wide usage by people with varying computing skills, and will, in all likelihood, have minimal impact. Second, they must be comprehensive. Since the life cycle of research and development involves exploratory studies, signal modelling, algorithm development, system evaluation, and error analysis, the toolkit must support all these activities and more. Third, they must be customizable. For example, a user should be able to change the computing and display parameters easily ...
1 This research was supported by DARPA under contract N66001-94-C-6040, monitored through Naval Command, Control and Ocean Surveillance Center.