Abstract. We present a flexible rule compiler developed for a text-to-speech (TTS) system. The compiler converts a set of rules into a finite-state transducer (FST). The input and output of the FST are subject to parameterization, so that the system can be applied to strings and sequences of feature structures. The resulting transducer is guaranteed to realize a function (as opposed to a relation), and therefore can be implemented as a deterministic device (either a deterministic FST or a bimachine).

Motivation

Implementations of TTS systems are often based on operations transforming one sequence of symbols or objects into another. Starting from the input string, the system creates a sequence of tokens which are subject to part-of-speech tagging, homograph disambiguation rules, lexical lookup and grapheme-to-phoneme conversion. The resulting phonetic transcriptions are also transformed by syllabification rules, post-lexical reductions, etc.

The character of the above transformations suggests finite-state transducers (FSTs) as a modelling framework (Mohri, 1997). However, this is not always straightforward, for two reasons.

Firstly, the transformations are more often expressed by rules than encoded directly in finite-state networks. To overcome this difficulty, we need an adequate compiler converting the rules into an FST.

Secondly, finite-state machines require a finite alphabet of symbols, while it is often more adequate to encode linguistic information using structured representations (e.g. feature structures), the inventory of which might be infinite. Thus, the compilation method must be able to reduce the infinite set of feature structures to a finite FST input alphabet.

In this paper, we show how these two problems have been solved in rVoice, a speech synthesis system developed at Rhetorical Systems.
Definitions and Notation

A deterministic finite-state automaton (acceptor, DFSA) over a finite alphabet Σ is a quintuple A = (Σ, Q, q₀, δ, F) such that: Q is a finite set of states, and q₀ ∈ Q is the initial state of A; δ : Q × Σ → Q is the transition function of A; F ⊂ Q is a non-empty set of final states.
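The quintuple definition above maps directly onto a small data structure. The following is a minimal Python sketch, not code from the paper: the class name `DFSA`, its field names, and the example automaton `even_a` (which accepts strings over {a, b} containing an even number of a's) are all assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass
class DFSA:
    """Hypothetical container for A = (Sigma, Q, q0, delta, F)."""
    alphabet: frozenset   # Sigma
    states: frozenset     # Q
    initial: str          # q0, a member of Q
    delta: dict           # total map (state, symbol) -> state
    finals: frozenset     # F, a non-empty subset of Q

    def accepts(self, word):
        """Run the automaton over `word` and report whether it ends in F."""
        state = self.initial
        for symbol in word:
            if symbol not in self.alphabet:
                raise ValueError(f"symbol {symbol!r} is not in the alphabet")
            state = self.delta[(state, symbol)]
        return state in self.finals


# Illustrative automaton (an assumption, not from the source):
# tracks the parity of the number of 'a' symbols read so far.
even_a = DFSA(
    alphabet=frozenset("ab"),
    states=frozenset({"even", "odd"}),
    initial="even",
    delta={
        ("even", "a"): "odd",
        ("even", "b"): "even",
        ("odd", "a"): "even",
        ("odd", "b"): "odd",
    },
    finals=frozenset({"even"}),
)
```

Because δ is a total function, each input symbol determines exactly one next state, so running the automaton takes one dictionary lookup per symbol.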
This paper describes a novel method of compiling ranked tagging rules into a deterministic finite-state device called a bimachine. The rules are formulated in the framework of regular rewrite operations and allow unrestricted regular expressions in both left and right rule contexts. The compiler is illustrated by an application within a speech synthesis system.
We describe a novel method of compiling ranked tagging rules into a "bimachine", i.e. a deterministic finite state device composed of two finite automata: a left-to-right one and a right-to-left one. The actual compilation is based on algorithms for finite state acceptors rather than transducers, which guarantees determinizability and the efficiency of compilation. The compiler has been used in a number of applications within a speech synthesis system.

Motivation

Many NLP applications are formalised as operations that add information to sequences of linguistic descriptions such as graphemes, phonemes, syllables, words or phrases. This process can be performed by a formal machine that reads a sequence of objects and outputs a sequence of actions to be performed on these objects. Finite state transducers (FSTs) are an obvious choice for the formalisation of such mappings. However, their use is not always straightforward: linguistic operations are often formulated as rule systems rather than stated explicitly as FSTs. Therefore, an adequate FST compiler is required.

A number of compilation methods have been proposed (Kaplan and Kay, 1994; Mohri and Sproat, 1996; Hetherington, 2001; Skut et al., 2004). They typically expect rules in the format φ → ψ / λ _ ρ, meaning "rewrite φ as ψ if it is preceded by λ and followed by ρ", φ, λ and ρ being regular expressions. Such a rule is converted into a number of FSTs, which are then combined using the composition operation. If the goal is to create a single deterministic device implementing the interaction of the rules, the rule transducers Tᵢ also have to be composed, yielding the desired transducer T = T₁ ∘ T₂ ∘ ⋯ ∘ Tₙ.
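To make the bimachine idea concrete, the sketch below shows how a left-to-right automaton and a right-to-left automaton can jointly decide the output at each position. It is an illustration only and does not reproduce the compilation algorithm of the paper: the function names `run_states` and `bimachine_apply`, and the toy rule a → b / c _ d with single-symbol contexts, are assumptions (the paper allows unrestricted regular expressions as contexts).

```python
def run_states(delta, start, symbols):
    """Return, for each symbol, the state the automaton is in before reading it."""
    states = []
    state = start
    for symbol in symbols:
        states.append(state)
        state = delta(state, symbol)
    return states


def bimachine_apply(word, left_delta, left_start, right_delta, right_start, output):
    """Apply a bimachine: the output at position i depends on the left
    automaton's state after the prefix, the symbol itself, and the right
    automaton's state after the (reversed) suffix."""
    left = run_states(left_delta, left_start, word)
    # Run the second automaton from the end of the word, then realign
    # its states with the original left-to-right positions.
    right = run_states(right_delta, right_start, reversed(word))[::-1]
    return "".join(output(l, x, r) for l, x, r in zip(left, word, right))


# Toy rule (hypothetical): a -> b / c _ d
# Left state: was the previous symbol 'c'?  Right state: is the next symbol 'd'?
left_delta = lambda state, symbol: symbol == "c"
right_delta = lambda state, symbol: symbol == "d"
output = lambda l, x, r: "b" if (x == "a" and l and r) else x

result = bimachine_apply("cadcab", left_delta, False, right_delta, False, output)
```

Each automaton is deterministic and the word is scanned once in each direction, so the whole device runs in time linear in the input length, even though the decision at each position looks both left and right.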
In this paper we describe a neural network approach to generation. The task is to generate sentences containing hotel information from a structured database. The system is inspired by Karen Kukich's ANA, but expands on it by adding generality in the form of language independence in representations and lexical look-up.