6th International Conference on Spoken Language Processing (ICSLP 2000) 2000
DOI: 10.21437/icslp.2000-541

A flexible, scalable finite-state transducer architecture for corpus-based concatenative speech synthesis

Cited by 17 publications (6 citation statements)
References 9 publications
“…The dialogue manager receives the context-resolved semantic frame and communicates with the database and language generation [3] services to provide an appropriate reply to the user. This response is then audibly realized by the text-to-speech server [52].…”
Section: Results (mentioning)
confidence: 99%
“…Anecdotally, users of the flight domain game system [2] did not like the quality of speech produced by the general-purpose synthesizer. To improve the synthesis quality, in our current system we utilize the ENVOICE synthesizer [12], a concatenative text-to-speech engine with a scalable finite-state transducer implementation for unit selection. The costs of concatenation and substitution are calculated based on local phonetic context.…”
Section: Technical Components (mentioning)
confidence: 99%
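The statement above mentions unit selection driven by concatenation and substitution costs. As a rough illustration only, the following sketch finds the cheapest unit sequence with a plain dynamic-programming search. It is not the ENVOICE finite-state transducer implementation; the function name `select_units`, the `(utterance, position)` unit encoding, and all cost values are hypothetical and chosen just to make the example runnable.

```python
from typing import Callable, Hashable, Sequence, TypeVar

Unit = TypeVar("Unit", bound=Hashable)


def select_units(
    candidates: Sequence[Sequence[Unit]],
    substitution_cost: Callable[[int, Unit], float],
    concatenation_cost: Callable[[Unit, Unit], float],
) -> tuple[list[Unit], float]:
    """Return the cheapest path that picks one candidate unit per target position."""
    # best[u] holds (total cost, path) for the cheapest path ending in unit u.
    best = {u: (substitution_cost(0, u), [u]) for u in candidates[0]}
    for i in range(1, len(candidates)):
        step = {}
        for u in candidates[i]:
            # Cheapest way to reach u from any unit kept at the previous position.
            prev_cost, prev_path = min(
                ((c + concatenation_cost(p, u), path) for p, (c, path) in best.items()),
                key=lambda t: t[0],
            )
            step[u] = (prev_cost + substitution_cost(i, u), prev_path + [u])
        best = step
    cost, path = min(best.values(), key=lambda t: t[0])
    return path, cost


# Hypothetical toy corpus: a unit is (utterance id, position in that utterance).
toy_candidates = [
    [("u1", 0), ("u2", 5)],
    [("u1", 1), ("u3", 2)],
    [("u3", 3), ("u2", 7)],
]
# Hypothetical penalties for units whose phonetic context matches the target poorly.
toy_penalties = {("u2", 5): 0.5, ("u3", 2): 0.25, ("u2", 7): 0.5}


def toy_substitution(position, unit):
    return toy_penalties.get(unit, 0.0)


def toy_concatenation(left, right):
    # Units that were adjacent in the same recording join for free.
    return 0.0 if left[0] == right[0] and left[1] + 1 == right[1] else 1.0


path, cost = select_units(toy_candidates, toy_substitution, toy_concatenation)
print(path, cost)
```

In this toy setup the search prefers units that were contiguous in the original recordings, since those joins cost nothing, and pays a fixed penalty for every splice.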
“…In addition to this intuitive connection between SPEECHBUILDER domains and these technology components, there are several other reasons why this approach has been selected. First, significant effort has been devoted in the past at MIT to improving technology in dialogue system architecture [27], speech recognition [11], language understanding [24], language generation [1], discourse and dialogue [28], and, most recently speech synthesis [36]. Employing these HLT components minimizes duplication of effort, and maximizes SPEECHBUILDER'S flexibility to adopt technical advances made in these areas, which may be achieved in efforts entirely disjoint from…”
Section: Approach (mentioning)
confidence: 99%
“…In addition, an instance of the SLSInfo domain has been manually modified to use the ENVOICE concatenative speech synthesizer that is being developed at MIT [36]. This is encouraging for eventually being able to give developers the option of using ENVOICE as an optional synthesizer for SPEECHBUILDER domains (see Section 7.5).…”
Section: Lcsinfo and Slsinfo (mentioning)
confidence: 99%