Word prediction systems can reduce the number of keystrokes required to form a message in a letter-based AAC system. It has been questioned, however, whether such savings translate into an enhanced communication rate due to the additional cognitive load (e.g., shifting of focus and scanning of a prediction list) required in using such a system. Our hypothesis is that word prediction has great potential for enhancing communication rate, but that the amount depends on the accuracy of the word prediction system. Due to significant user interface variations in AAC systems and significant differences between communication rates achieved by different users on even the same device, this hypothesis is difficult to verify. We present a study of communication rate and word prediction systems using "pseudo-impaired" participants and two different word prediction systems compared against letter-by-letter entry. We find that word prediction systems can in fact speed communication rate, and that a more accurate word prediction system can raise communication rate higher than is explained by the additional accuracy of the system alone.
Word prediction can be used to enhance the communication ability of persons with speech and language impairments. In this work, we explore two methods of adapting a language model to the topic of conversation, and apply these methods to the prediction of fringe words.
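A common way to adapt a language model to the current topic is to interpolate a small topic-specific model with a large general model. The abstract does not name the paper's two adaptation methods, so the following is only an illustrative sketch of linear interpolation over unigram counts; the corpora and the mixing weight `lam` are hypothetical.

```python
from collections import Counter

def interpolated_prob(word, topic_counts, general_counts, lam=0.5):
    """P(word) as a mix of a topic-specific and a general unigram model.

    Illustrative only: real word prediction systems use higher-order
    n-grams and tuned interpolation weights.
    """
    topic_total = sum(topic_counts.values()) or 1
    general_total = sum(general_counts.values()) or 1
    p_topic = topic_counts[word] / topic_total
    p_general = general_counts[word] / general_total
    return lam * p_topic + (1 - lam) * p_general

# Toy corpora: a general model and counts from the current conversation topic.
general = Counter("the cat sat on the mat".split())
topic = Counter("cat vet cat food".split())
p = interpolated_prob("cat", topic, general, lam=0.5)
```

Because "cat" is frequent in the topic counts, its interpolated probability rises above its general-model probability, which is exactly the effect topic adaptation aims for when ranking prediction candidates.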
Word prediction systems can reduce the number of keystrokes required to form a message in a letter-based AAC system. It has been questioned, however, whether such savings translate into an enhanced communication rate due to the additional overhead (e.g., shifting of focus and repeated scanning of a prediction list) required in using such a system. Our hypothesis is that word prediction has high potential for enhancing AAC communication rate, but that the amount depends in a complex way on the accuracy of the predictions. Due to significant user interface variations in AAC systems and the potential bias of prior word prediction experience on existing devices, this hypothesis is difficult to verify. We present a study of two different word prediction methods compared against letter-by-letter entry at simulated AAC communication rates. We find that word prediction systems can in fact speed communication rate (an advanced system gave a 58.6% improvement), and that a more accurate word prediction system can raise the communication rate higher than is explained by the additional accuracy of the system alone, due to better utilization (93.6% utilization for the advanced system versus 78.2% for the basic one).
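The keystroke-savings measure underlying both abstracts can be made concrete: it is the fraction of keystrokes avoided when a prediction is selected instead of the word being typed out. The word and keystroke counts below are hypothetical examples, not figures from the study.

```python
def keystroke_savings(keys_without, keys_with):
    """Percent of keystrokes avoided by using word prediction."""
    return 100 * (keys_without - keys_with) / keys_without

# Hypothetical example: "communication " takes 14 keystrokes letter by
# letter; with prediction, 3 typed letters plus 1 list selection = 4 keys.
ks = keystroke_savings(14, 4)
```

Note that this measures only the keying cost; the studies above ask the further question of whether the time spent scanning the prediction list erodes these savings, which is why utilization (how often offered predictions are actually taken) matters separately from accuracy.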
We will be demonstrating the ModelTalker Voice Creation System, which allows users to create a personalized synthetic voice with an unrestricted vocabulary. The system includes a tool for recording a speech inventory and a program that converts the recorded inventory into a synthetic voice for the ModelTalker TTS engine. The entire system can be downloaded for use on a home PC or in a clinical setting, and the resulting synthetic voices can be used with any SAPI-compliant system. We will demonstrate the recording process, and convert the recordings to a mini-database with a limited vocabulary for participants to hear.
We will demonstrate the ModelTalker Voice Recorder (MT Voice Recorder), an interface system that lets individuals record and bank a speech database for the creation of a synthetic voice. The system guides users through an automatic calibration process that sets pitch, amplitude, and silence. The system then prompts users with both visual (text-based) and auditory prompts. Each recording is screened for pitch, amplitude, and pronunciation, and users are given immediate feedback on the acceptability of each recording. Users can then rerecord an unacceptable utterance. Recordings are automatically labeled and saved, and a speech database is created from these recordings. The system's intention is to make the process of recording a corpus of utterances relatively easy for those inexperienced in linguistic analysis. Ultimately, the recorded corpus and the resulting speech database are used for concatenative speech synthesis, thus allowing individuals at home or in clinics to create a synthetic voice in their own voice. The interface may prove useful for other purposes as well. The system facilitates the recording and labeling of large corpora of speech, making it useful for speech and linguistic research, and it provides immediate feedback on pronunciation, thus making it useful as a clinical learning tool.
F0, amplitude, and durational cues are considered the primary acoustic correlates of focus or sentence-level stress. However, questions remain regarding: (a) the degree to which each of these cues is necessary or sufficient for signaling focus; (b) the relative importance of each of these cues; and (c) how they interact in the perception of focus. Natural productions of the sentence ‘‘Bob bought Bogg’s box,’’ in which focus was varied over each of the four words of the sentence, were altered to produce prosodic-cue-neutralized versions. The alterations were applied singly and in all possible combinations to form eight experimental versions of each original sentence (the original and seven cue-neutralized versions). The sentences were presented to listeners with the task of identifying the focused item in the sentence. Results indicated that: (a) neutralizing any acoustic cue produced some degradation in performance, but even with all cues nullified, performance remained above chance for at least some words; (b) overall, F0 was more important than either amplitude or duration in signaling focus; and (c) F0 was more important early in the sentence, while amplitude and duration played relatively more prominent roles late in the sentence. [Work supported by NIDRR and Nemours.]
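One of the neutralizations described, removing amplitude as a cue, amounts to scaling each word token to a common energy level. The study's actual resynthesis procedure is not described in the abstract; the following is only a minimal sketch of RMS equalization, with the target level chosen arbitrarily.

```python
import numpy as np

def neutralize_amplitude(signal, target_rms=0.1):
    """Scale a waveform to a fixed RMS level, removing overall amplitude
    as a prosodic cue. Illustrative sketch, not the study's procedure."""
    rms = np.sqrt(np.mean(signal ** 2))
    return signal * (target_rms / rms)

# Two tokens at very different levels end up with identical RMS.
loud = 0.9 * np.sin(np.linspace(0.0, 2 * np.pi, 1000))
quiet = 0.05 * np.sin(np.linspace(0.0, 2 * np.pi, 1000))
loud_eq = neutralize_amplitude(loud)
quiet_eq = neutralize_amplitude(quiet)
```

Analogous manipulations for F0 (flattening the pitch contour) and duration (time-scaling each word to a reference length) complete the set of single-cue neutralizations the abstract combines.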
Digital recordings of children producing the names ‘‘Rhonda’’ and ‘‘Wanda,’’ and/or ‘‘Toto’’ and ‘‘Coco’’ were made using the microphone input to a Toshiba laptop computer (16-bit samples, 22.05-kHz sampling rate) with an AKG C410/B head-mounted condenser microphone. These names were associated with animated characters in a mock video game running on the laptop under the control of a Speech Language Pathologist. The children, ranging in age from four to six years, were undergoing speech therapy at the Alfred I. duPont Hospital for Children for one or both of two common articulation errors: /w/ substituted for /r/; and/or /t/ substituted for /k/. The initial segment in each recorded utterance was classified by laboratory staff as either r/w or t/k, and assigned a goodness rating. Discrete Hidden Markov phoneme Models (DHMMs) trained using data recorded from normally articulating children were then used to classify the same utterances and results of the automatic classification were compared to the human classification. Results indicate that appropriately trained DHMMs can provide accurate classification of utterances from children in speech therapy. This technology could support articulation drill on home computer systems as an adjunct to speech therapy. [Work supported by Nemours Research Programs.]
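Classification with discrete HMMs of the kind described works by scoring the observation sequence under a model per phoneme and choosing the higher-likelihood model. The sketch below implements the standard scaled forward algorithm over a toy two-state topology; all parameter values and the /r/ vs. /w/ models are hypothetical, not the trained models from the study.

```python
import numpy as np

def forward_loglik(obs, pi, A, B):
    """Log-likelihood of a discrete observation sequence under an HMM,
    computed with the (scaled) forward algorithm.

    pi: initial state distribution, A: state transitions,
    B[state, symbol]: discrete emission probabilities.
    """
    alpha = pi * B[:, obs[0]]
    loglik = 0.0
    for t in range(1, len(obs)):
        scale = alpha.sum()
        loglik += np.log(scale)
        alpha = (alpha / scale) @ A * B[:, obs[t]]
    return loglik + np.log(alpha.sum())

# Toy left-to-right topology shared by both phoneme models.
pi = np.array([1.0, 0.0])
A = np.array([[0.7, 0.3],
              [0.0, 1.0]])
B_r = np.array([[0.8, 0.2],   # hypothetical /r/ emission probabilities
                [0.3, 0.7]])
B_w = np.array([[0.2, 0.8],   # hypothetical /w/ emission probabilities
                [0.7, 0.3]])

obs = [0, 0, 1]  # a quantized feature sequence (e.g., VQ codebook indices)
label = "r" if forward_loglik(obs, pi, A, B_r) > forward_loglik(obs, pi, A, B_w) else "w"
```

In the study's setting, the same comparison over models trained on normally articulating children yields the automatic r/w (or t/k) judgment that was compared against the clinicians' labels.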