Adrian Skilling scite author profile

Adrian Skilling

5 Publications

33 Citation Statements Received

50 Citation Statements Given

Affiliations

Apple (Israel)

Publications

Applications of automatic speech recognition to speech and language development in young children

Russell, Brown, Skilling et al.

Since 1990 the DRA Speech Research Unit has conducted research into applications of speech recognition technology to speech and language development for young children. This has been done in collaboration with Hereford and Worcester County Council Education Department (HWCC) and, more recently, with Sherston Software Limited, one of the UK's leading independent educational software publishers.

An initial project, known as STAR (Speech Training Aid Research), was prompted by HWCC's awareness of a requirement by teachers for a computerised 'Speech Training Aid' tool to aid young children in the development of a range of communications and language skills. The goal was to develop a computer-based system which was able to distinguish between 'good' and 'poor' pronunciations of a word, spoken by a child in response to a textual, pictorial or verbal prompt, from a 1000 word children's vocabulary.

The same speech recognition technology has subsequently been integrated into Sherston Software's commercially successful range of animated 'Talking Books', which use stored digitised speech to enable the computer to read words out loud to a child. This converts them into 'Talking & Listening Books' which, in addition to the existing functions, are able to 'listen' to a child reading and indicate words which have been read incorrectly.

THE CHALLENGE

The use of automatic speech recognition in computer-based tools for speech and language development in children has enormous potential. While such tools are unlikely to be a substitute for the human interaction which occurs when a teacher or parent helps a child learn to read, they could vastly increase the individual assistance which a child receives, and allow valuable time with the teacher or parent to be used more effectively.

Given these advantages, and the economic importance of literacy, it is not surprising that this problem is receiving attention from the speech technology research community (see, for example, [1]).

From the perspective of speech technology, the question posed by HWCC was whether automatic speech recognition can be used to distinguish between 'good' and 'poor' pronunciations of a known word spoken by an unknown child. This raises the emotive question of what constitutes a 'good' or 'poor' pronunciation. Jones [2] defines 'poor' speech as a way of talking which it is difficult for most people to understand, caused by mumbling or the lack of definiteness of utterance. By contrast, 'good' pronunciation will enable a child to participate confidently in public, cultural and working life, and will aid accurate reading and spelling. 'Good' pronunciation occurs within the context of a variety of regional accents, and is clearly not the same as Received Pronunciation ('BBC English'). Factors such as a child's confidence in speaking are also relevant.

Assuming that 'good' and 'poor' pronunciation can be identified, there remains the question of whether current speech pattern processing techniques are sufficiently accurate to make the required distinction. This compliments ...

The STAR system: an interactive pronunciation tutor for young children

Russell, Series, Wallace et al. 2000

Computer Speech & Language

Neural Network-Based Modeling of Phonetic Durations

Wei, Hunt, Skilling 2019

A deep neural network (DNN)-based model has been developed to predict non-parametric distributions of durations of phonemes in specified phonetic contexts and used to explore which factors influence durations most. Major factors in US English are pre-pausal lengthening, lexical stress, and speaking rate. The model can be used to check that text-to-speech (TTS) training speech follows the script and words are pronounced as expected. Duration prediction is poorer with training speech for automatic speech recognition (ASR) because the training corpus typically consists of single utterances from many speakers and is often noisy or casually spoken. Low probability durations in ASR training material nevertheless mostly correspond to non-standard speech, with some having disfluencies. Children's speech is disproportionately present in these utterances, since children show much more variation in timing.

Applications of automatic speech recognition to speech and language development in young children

Russell, Brown, Skilling et al. 1996

Neural Network-Based Modeling of Phonetic Durations

Wei, Hunt, Skilling 2019

Preprint
