Corpus-Based Hidden Markov Modelling of the Fundamental Frequency of Lithuanian

Vaičiūnas, Airenas; Raškinis, Gailius; Kazlauskienė, Asta

doi:10.15388/informatica.2016.105

Cited by 3 publications

(3 citation statements)

References 15 publications

“…The availability of Lithuanian speech corpora for investigative purposes is satisfactory. Speech corpora represent Lithuanian acoustic space, usually are of about 20 hours of duration, precisely annotated by human at phonemic level, and usually comprise spoken words, phrases, syllables, names of cities or persons (Kazlauskienė and Raškinis, 2013; Vaičiūnas et al, 2016). A corpus of large extent would not usually have qualities, which could be obtained only by manual work, and this lack of quality results in a possible impairment of scientific investigations.…”

Section: Related Work (mentioning)

confidence: 99%

Lithuanian Speech Corpus Liepa for Development of Human-Computer Interfaces Working in Voice Recognition and Synthesis Mode

Laurinčiukaitė, Telksnys, Kasparaitis et al. 2018

Informatica

The problem of speech corpus for design of human-computer interfaces working in voice recognition and synthesis mode is investigated. Specific requirements of speech corpus for speech recognizers and synthesizers were accented. It has been discussed that in order to develop above mentioned speech corpus, it has to consist of two parts. One part of speech corpus should be presented for the needs of Lithuanian text-to-speech synthesizers, another part of speech corpus, for the needs of Lithuanian speech recognition engines. It has been determined that the part of speech corpus designed for speech recognition engines has to ensure the availability to present language specificity by the use of different sets of phonemes. According to the research results, the speech corpus Liepa, which consists of two parts, was developed. This speech corpus opens possibilities for cost-effective and flexible development of human-computer interfaces working in voice recognition and synthesis mode.

“…Despite the fact that duration models of Lithuanian sounds (Norkevičius and Raškinis, 2008), (Kasparaitis and Beniušė, 2016) and the intonation model of Lithuanian sentences (Vaičiūnas et al, 2016) have been developed in recent years, they will not be used in this work because only the phoneme-based synthesizer has the duration model implemented at the moment. Phonemes and diphones will be cut out of the recordings without any modifications.…”

Section: The Problem Of Missing Diphones (mentioning)

confidence: 99%

“…The unit selection speech synthesis method still remains one of the most popular methods, although other methods are gaining popularity, e.g., hidden Markov models (HMM) (Tokuda et al, 2013) or deep neural networks (DNN), recently proposed by Google's DeepMind (van den Oord et al 2016) and Baidu (Arik et al, 2017). HMM method still has certain drawbacks, e.g. somewhat buzzy sound and over-smoothing, while DNN require huge computational power, so we decided to continue our research on well-proven unit selection method.…”

Section: Introduction (mentioning)

confidence: 99%

Phoneme vs. Diphone in Unit Selection TTS of Lithuanian

Kasparaitis, Kancys 2018

BJMC

Abstract. The present paper deals with choosing the base type for the unit selection speech synthesis method of the Lithuanian language. Phoneme and diphone units have been examined. Besides, two different methods of joining costs calculation were employed in a diphone synthesizer: one was based on the spectral similarity and the other was based on phonological classes of the sounds to be joined. Synthesizers were evaluated according to their performance, algorithm complexity, the number of joins in a synthesized speech and the human listeners' subjective judgment. Experimental testing showed that the diphone synthesizer based on phonological classes was much more acceptable to the listeners than the one based on the spectral similarity. The diphone synthesizer based on phonological classes outperformed the phonemic synthesizer in terms of performance and the number of joins though it was somewhat less acceptable to human listeners.

An Overview of Lithuanian Intonation: A Linguistic and Modelling Perspective

Melnik-Leroy, Bernatavičienė, Korvel et al. 2022

Informatica

Intonation is a complex suprasegmental phenomenon essential for speech processing. However, it is still largely understudied, especially in the case of under-resourced languages, such as Lithuanian. The current paper focuses on intonation in Lithuanian, a Baltic pitch-accent language with free stress and tonal variations on accented heavy syllables. Due to historical circumstances, the description and analysis of Lithuanian intonation were carried out within different theoretical frameworks and in several languages, which makes them hardly accessible to the international research community. This paper is the first attempt to gather research on Lithuanian intonation from both the Lithuanian and the Western traditions, the structuralist and generativist points of view, and the linguistic and modelling perspectives. The paper identifies issues in existing research that require special attention and proposes directions for future investigations both in linguistics and modelling.
