In this paper we discuss a common problem in diphone synthesis: the occurrence of audible discontinuities at diphone boundaries. Informal observations suggest that spectral mismatch is the most likely cause of this phenomenon. We first set out to find an objective spectral measure of discontinuity. To this end, several spectral distance measures are related to the results of a listening experiment. We then studied the feasibility of extending the diphone database with context-sensitive diphones to reduce the occurrence of audible discontinuities. The number of additional diphones is limited by clustering consonant contexts that have a similar effect on the surrounding vowels, on the basis of the best-performing distance measure. A listening experiment showed that the addition of these context-sensitive diphones significantly reduces the number of audible discontinuities.
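The abstract does not spell out the distance measures or the clustering procedure, so the following is only an illustrative sketch: it uses a plain Euclidean distance between cepstral vectors (one measure such a comparison might include) and a greedy single-link merge of consonant contexts whose mean vowel-boundary spectra lie within a threshold. The function names, the threshold, and the toy centroids are all assumptions, not the paper's actual method.

```python
from math import dist

def euclidean_cepstral_distance(a, b):
    """Euclidean distance between two cepstral vectors, a simple
    candidate measure of spectral mismatch at a diphone joint."""
    return dist(a, b)

def cluster_contexts(centroids, threshold):
    """Greedily merge consonant contexts whose centroid spectra
    (mean cepstral vectors at the vowel boundary) are closer than
    `threshold` under single-link distance.

    centroids: dict mapping context label -> cepstral vector.
    Returns a list of clusters (lists of context labels); each
    cluster would then share one context-sensitive diphone.
    """
    clusters = [[label] for label in centroids]
    merged = True
    while merged:
        merged = False
        for i in range(len(clusters)):
            for j in range(i + 1, len(clusters)):
                d = min(
                    euclidean_cepstral_distance(centroids[x], centroids[y])
                    for x in clusters[i] for y in clusters[j]
                )
                if d < threshold:
                    clusters[i].extend(clusters[j])
                    del clusters[j]
                    merged = True
                    break
            if merged:
                break
    return clusters
```

Contexts that color the neighbouring vowel similarly end up in one cluster, which is how the number of additional diphones stays limited: one new diphone per cluster rather than one per consonant context.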
We present a data-to-speech system called D2S, which can be used for the creation of data-to-speech systems in different languages and domains. The most important characteristic of a data-to-speech system is that it combines language and speech generation: language generation is used to produce a natural-language text expressing the system's input data, and speech generation is used to make this text audible. In D2S, this combination is exploited by using linguistic information available in the language generation module for the computation of prosody. This allows us to achieve better prosodic output quality than can be achieved in a plain text-to-speech system. For language generation in D2S, the use of syntactically enriched templates is guided by knowledge of the discourse context, while for speech generation pre-recorded phrases are combined in a prosodically sophisticated manner. This combination of techniques makes it possible to create linguistically sound yet efficient systems with high-quality language and speech output.
Recent advances in neural TTS have led to models that can produce high-quality synthetic speech. However, these models typically require large amounts of training data, which can make it costly to produce a new voice with the desired quality. Although multi-speaker modeling can reduce the data requirements for a new voice, this approach is usually not viable for many low-resource languages, for which abundant multi-speaker data is not available. In this paper, we therefore investigated to what extent multilingual multi-speaker modeling can be an alternative to monolingual multi-speaker modeling, and explored how data from foreign languages may best be combined with low-resource language data. We found that multilingual modeling can increase the naturalness of low-resource language speech, showed that multilingual models can produce speech with a naturalness comparable to monolingual multi-speaker models, and saw that the target language naturalness was affected by the strategy used to add foreign language data.
Superpositional models of intonation typically propose decomposing fundamental frequency (F0) contours into phrase curves and accent curves, aligned with phrases and left-headed feet, respectively. Extracting these component curves from F0 contours without making undue assumptions is challenging. We propose a novel method for decomposing pitch curves, based on the assumption that accent curves can be described by combining skewed normal distributions and sigmoid functions. In contrast to an earlier pitch decomposition algorithm ("PRISM"), this allows for simple joint optimization of phrase and accent curve parameters, using fewer parameters. The proposed method was evaluated on three speech corpora containing: (1) synthetically generated pitch curves, (2) all-sonorant utterances, and (3) utterances containing both sonorant and non-sonorant speech sounds. The root weighted mean squared error is small, and, on the corpus for which comparable data are available, is significantly smaller than for PRISM.
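The abstract names the ingredients (skewed normal distributions, sigmoid functions, a superposed phrase curve) but not the exact parameterization, so the following is a hedged sketch of what such a superpositional F0 model could look like: a sigmoid phrase curve interpolating between two levels, plus scaled skew-normal accent gestures. The parameter layout and function signatures are assumptions for illustration, not the paper's actual formulation.

```python
from math import erf, exp, pi, sqrt

def skew_normal(t, loc, scale, shape):
    """Skew-normal density 2*phi(z)*Phi(shape*z)/scale with
    z = (t - loc)/scale; shape = 0 recovers the normal density."""
    z = (t - loc) / scale
    phi = exp(-0.5 * z * z) / sqrt(2.0 * pi)
    cdf = 0.5 * (1.0 + erf(shape * z / sqrt(2.0)))
    return 2.0 * phi * cdf / scale

def sigmoid(t, midpoint, slope):
    """Smooth 0-to-1 transition centered at `midpoint`."""
    return 1.0 / (1.0 + exp(-slope * (t - midpoint)))

def f0_model(t, phrase, accents):
    """Superpositional F0 value at time t.

    phrase:  (start, end, midpoint, slope) -- a sigmoid phrase curve
             declining (or rising) from `start` to `end`.
    accents: list of (amplitude, loc, scale, shape) skew-normal
             accent curves added on top of the phrase curve.
    """
    start, end, midpoint, slope = phrase
    f0 = start + (end - start) * sigmoid(t, midpoint, slope)
    for amplitude, loc, scale, shape in accents:
        f0 += amplitude * skew_normal(t, loc, scale, shape)
    return f0
```

Because every component is a smooth closed-form function of a handful of parameters, phrase and accent parameters can in principle be optimized jointly against an observed pitch contour, which is the advantage over PRISM that the abstract highlights.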