Speech interaction is valuable in a wide range of applications: it is a natural form of interaction and easy for most people to use. Developing speech-enabled applications is a major challenge, and it grows when several languages are required, a common scenario in, for example, Europe. Tackling this challenge requires methods and tools that ease the deployment of speech features and give developers versatile means to include speech interaction in their applications. In addition, only a small variety of voices is available (sometimes only one per language), which makes it hard to satisfy user preferences and hinders deeper exploration of how well voices suit specific applications and users. In this article, we present some of our contributions to these issues: (a) our generic modality, which encapsulates the technical details of using speech synthesis; (b) the process followed to create four new voices, two young-adult and two elderly; and (c) some initial results exploring user preferences regarding the created voices. The preliminary studies targeted groups of both young and older adults and addressed: (a) evaluation of the intrinsic properties of each voice; (b) observation of users while using speech-enabled interfaces, and elicitation of qualitative impressions regarding the chosen voice and the impact of speech interaction on user satisfaction; and (c) ranking of voices according to preference. The collected results, albeit preliminary, yield some evidence of the positive impact speech interaction has on users, at different levels. Additionally, the results show interesting differences in voice preferences between age groups and between genders.
The benefits of using interactive computer software in education have been discussed for some time. This approach can improve cognitive capacity, enhance learning and, above all, make information acquisition easier. This work presents the development of a software package called SISFISIO, an "exercise and practice" system designed for teaching Physiology in biomedical courses. It was developed using Delphi and Macromedia Flash; the Flash components were integrated with Delphi using an ActiveX control. Internally, the software uses two linear linked lists (a linked list is one of the fundamental data structures used in computer programming): one for the exercises and another for the questions of each exercise. The information is stored in a text file that the instructor fills in. SISFISIO loads this file and populates the data structures described above. Upon completion of a Physiology module, the student's answers can be immediately verified, the scores tallied and the duration of the exercise measured. SISFISIO was incorporated into biomedicine classes at USS. The students' evaluation, based on a 5-point Likert questionnaire and spontaneous comments, indicated that the software facilitated the learning of physiological concepts and was a very stimulating activity.
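The two-list structure described in the SISFISIO abstract can be illustrated with a minimal sketch. It is written in Python for brevity, not in the Delphi used by the authors, and it is not their implementation. The text-file format (a `#` line starts an exercise, every other line is `question|answer`), the class and function names, and the case-insensitive scoring rule are all assumptions made for illustration; the abstract does not specify any of them.

```python
from dataclasses import dataclass
from typing import Iterator, Optional


@dataclass
class QuestionNode:
    """One node of the per-exercise question list."""
    prompt: str
    answer: str
    next: Optional["QuestionNode"] = None


@dataclass
class ExerciseNode:
    """One node of the exercise list; it owns its own question list."""
    title: str
    first_question: Optional[QuestionNode] = None
    next: Optional["ExerciseNode"] = None


def walk(node) -> Iterator:
    """Yield each node of a singly linked list, starting at `node`."""
    while node is not None:
        yield node
        node = node.next


def load_exercises(text: str) -> Optional[ExerciseNode]:
    """Build both linked lists from the hypothetical instructor file format."""
    head = None
    last_exercise = None
    last_question = None
    for raw in text.splitlines():
        line = raw.strip()
        if not line:
            continue
        if line.startswith("#"):
            # Header line: append a new exercise to the exercise list.
            exercise = ExerciseNode(title=line[1:].strip())
            if last_exercise is None:
                head = exercise
            else:
                last_exercise.next = exercise
            last_exercise, last_question = exercise, None
        else:
            if last_exercise is None:
                raise ValueError("question found before any exercise header")
            # Question line: append to the current exercise's question list.
            prompt, _, answer = line.partition("|")
            question = QuestionNode(prompt.strip(), answer.strip())
            if last_question is None:
                last_exercise.first_question = question
            else:
                last_question.next = question
            last_question = question
    return head


def score(exercise: ExerciseNode, answers: list[str]) -> int:
    """Count submitted answers that match the stored ones, ignoring case."""
    questions = walk(exercise.first_question)
    return sum(q.answer.lower() == a.strip().lower()
               for q, a in zip(questions, answers))


# Hypothetical instructor file content, used only to exercise the sketch.
SAMPLE = """# Cardiac cycle
Which chamber pumps blood into the aorta?|left ventricle
Which valve separates the left atrium and left ventricle?|mitral valve
# Respiration
Which muscle is the main driver of quiet inspiration?|diaphragm
"""

exercises = load_exercises(SAMPLE)
for exercise in walk(exercises):
    print(exercise.title, len(list(walk(exercise.first_question))))
```

Keeping a tail pointer for each list (`last_exercise`, `last_question`) makes every append constant-time while preserving the order in which the instructor wrote the file.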
Large speech corpora with word-level transcriptions annotated for noises and disfluent speech are necessary for training automatic speech recognisers. Crowdsourcing is a lower-cost, faster-turnaround, highly scalable alternative to expert transcription and annotation. In this paper, we showcase our three-step crowdsourcing approach, motivated by the importance of accurate transcriptions and annotations.