Kathrine Hammervold scite author profile

2004

Abstract. We present a flexible rule compiler developed for a text-to-speech (TTS) system. The compiler converts a set of rules into a finite-state transducer (FST). The input and output of the FST are subject to parameterization, so that the system can be applied to strings and sequences of feature structures. The resulting transducer is guaranteed to realize a function (as opposed to a relation), and can therefore be implemented as a deterministic device (either a deterministic FST or a bimachine).

Motivation

Implementations of TTS systems are often based on operations transforming one sequence of symbols or objects into another. Starting from the input string, the system creates a sequence of tokens which are subject to part-of-speech tagging, homograph disambiguation rules, lexical lookup and grapheme-to-phoneme conversion. The resulting phonetic transcriptions are also transformed by syllabification rules, post-lexical reductions, etc.

The character of the above transformations suggests finite-state transducers (FSTs) as a modelling framework (Mohri, 1997). However, this is not always straightforward, for two reasons.

Firstly, the transformations are more often expressed by rules than encoded directly in finite-state networks. To overcome this difficulty, we need an adequate compiler converting the rules into an FST.

Secondly, finite-state machines require a finite alphabet of symbols, while it is often more adequate to encode linguistic information using structured representations (e.g. feature structures), the inventory of which might be potentially infinite. Thus, the compilation method must be able to reduce the infinite set of feature structures to a finite FST input alphabet.

In this paper, we show how these two problems have been solved in rVoice, a speech synthesis system developed at Rhetorical Systems.

Definitions and Notation

A deterministic finite-state automaton (acceptor, DFSA) over a finite alphabet Σ is a quintuple A = (Σ, Q, q₀, δ, F) such that: Q is a finite set of states, and q₀ ∈ Q is the initial state of A; δ : Q × Σ → Q is the transition function of A; F ⊂ Q is a non-empty set of final states.
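
The DFSA definition above can be illustrated with a minimal Python sketch. It is not taken from the paper: the class name `DFSA`, the method `accepts` and the example automaton `ends_with_b` are hypothetical, and the sketch assumes that a missing transition means rejection, whereas the definition treats δ as a total function.

```python
from dataclasses import dataclass


@dataclass
class DFSA:
    """Quintuple (alphabet, states, initial, delta, finals); names are illustrative."""
    alphabet: frozenset
    states: frozenset
    initial: str
    delta: dict      # maps (state, symbol) -> state
    finals: frozenset

    def accepts(self, word):
        """Follow delta from the initial state; accept if we stop in a final state."""
        state = self.initial
        for symbol in word:
            key = (state, symbol)
            if key not in self.delta:
                # Assumption: an undefined transition rejects the input.
                return False
            state = self.delta[key]
        return state in self.finals


# Hypothetical example: strings over {a, b} whose last symbol is "b".
ends_with_b = DFSA(
    alphabet=frozenset("ab"),
    states=frozenset({"q0", "q1"}),
    initial="q0",
    delta={
        ("q0", "a"): "q0",
        ("q0", "b"): "q1",
        ("q1", "a"): "q0",
        ("q1", "b"): "q1",
    },
    finals=frozenset({"q1"}),
)
```

Because δ maps each (state, symbol) pair to exactly one state, recognition takes one dictionary lookup per input symbol.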

A bimachine compiler for ranked tagging rules

Skut, Ulrich

2004

A Generic Finite State Compiler for Tagging Rules

Skut, Ulrich

2003

Mach Translat

We describe a novel method of compiling ranked tagging rules into a "bimachine", i.e. a deterministic finite-state device composed of two finite automata: a left-to-right one and a right-to-left one. The actual compilation is based on algorithms for finite-state acceptors rather than transducers, which guarantees determinizability and efficient compilation. The compiler has been used in a number of applications within a speech synthesis system.

Motivation

Many NLP applications are formalised as operations that add information to sequences of linguistic descriptions such as graphemes, phonemes, syllables, words or phrases. This process can be performed by a formal machine that reads a sequence of objects and outputs a sequence of actions to be performed on these objects. Finite-state transducers (FSTs) are an obvious choice for the formalisation of such mappings. However, their use is not always straightforward: linguistic operations are often formulated as rule systems rather than stated explicitly as FSTs. Therefore, an adequate FST compiler is required.

A number of compilation methods have been proposed (Kaplan and Kay, 1994; Mohri and Sproat, 1996; Hetherington, 2001; Skut et al., 2004). They typically expect rules in the format φ → ψ / λ _ ρ, meaning "rewrite φ as ψ if it is preceded by λ and followed by ρ", φ, λ and ρ being regular expressions. Such a rule is converted into a number of FSTs, which are then combined using the composition operation. If the goal is to create a single deterministic device implementing the interaction of the rules, the rule transducers Tᵢ also have to be composed, yielding the desired transducer T = T₁ ∘ T₂ ∘ ⋯ ∘ Tₙ.
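
To make the bimachine idea concrete, the following Python sketch hand-codes one for a single toy rule, a → b / c _ d. It is an illustration only, not the compilation method described in the abstract: all function names are hypothetical, the automata are written by hand rather than compiled from rules, and the output function emits symbols rather than tagging actions.

```python
def states_before_each_symbol(delta, start, symbols):
    """Run a deterministic automaton; record the state held before each symbol."""
    states = []
    state = start
    for symbol in symbols:
        states.append(state)
        state = delta(state, symbol)
    return states


def left_delta(state, symbol):
    # Left-to-right automaton: remembers whether the symbol just read was "c".
    return "after_c" if symbol == "c" else "other"


def right_delta(state, symbol):
    # Right-to-left automaton: remembers whether the symbol just read was "d".
    return "before_d" if symbol == "d" else "other"


def output(left_state, symbol, right_state):
    # Output function: rewrite "a" as "b" only between "c" and "d".
    if symbol == "a" and left_state == "after_c" and right_state == "before_d":
        return "b"
    return symbol


def apply_bimachine(word):
    """Combine both passes: one left-to-right, one right-to-left."""
    left = states_before_each_symbol(left_delta, "other", word)
    right = states_before_each_symbol(right_delta, "other", word[::-1])[::-1]
    return "".join(output(l, s, r) for l, s, r in zip(left, word, right))
```

For example, `apply_bimachine("cad")` yields `"cbd"`, while `apply_bimachine("cab")` leaves the input unchanged. Each position is decided by a single lookup on the pair of automaton states and the current symbol, so the device is deterministic even though the rule looks at right context.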

Improving the accuracy of pronunciation prediction for unit selection TTS

Fackrell, Skut

2003

Sentence generation and neural networks

Hammervold

2000