Building text-to-speech (TTS) synthesisers is a difficult task, especially for low-resource languages, as language-specific modules need to be developed for system building. End-to-end speech synthesis has become a popular paradigm, as a TTS can be trained using only pairs of text and the corresponding speech.
Most Indians are inherently bilingual or multilingual owing to the diverse linguistic culture in India. As a result, code-switching is quite common in conversational speech. The objective of this work is to train good quality text-to-speech (TTS) synthesisers that can seamlessly handle code-switching. To achieve this, bilingual TTSes that are capable of handling phonotactic variations across languages are trained using combinations of monolingual data in a unified framework. In addition to segmenting Indic speech data using signal processing cues in tandem with hidden Markov model-deep neural network (HMM-DNN) based alignment, we propose to segment Indian English data using the same approach after NIST syllabification. Then, bilingual HTS-STRAIGHT based systems are trained by randomising the order of the data so that the systematic interactions between the two languages are captured better. Experiments are conducted on three language pairs: Hindi+English, Tamil+English and Hindi+Tamil. The code-switched systems are evaluated on monolingual, code-mixed and code-switched texts. The degradation mean opinion score (DMOS) for monolingual sentences shows marginal degradation over that of an equivalent monolingual TTS system, while the DMOS for bilingual sentences is significantly better than that of the corresponding monolingual TTS systems.
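The data-pooling step described above — training a bilingual system on combined monolingual corpora in randomised order — can be sketched as follows. This is a minimal illustration only; `mix_corpora`, the seed, and the utterance labels are hypothetical names, not part of the original system:

```python
import random

def mix_corpora(corpus_a, corpus_b, seed=0):
    """Pool two monolingual corpora and randomise the utterance order,
    so that a bilingual model sees both languages interleaved during
    training rather than one language block followed by the other."""
    mixed = list(corpus_a) + list(corpus_b)
    # seeded shuffle keeps the training order reproducible across runs
    random.Random(seed).shuffle(mixed)
    return mixed
```

The shuffle is seeded so that the same interleaving can be reproduced across training runs.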
Indian languages are broadly classified as Indo-Aryan or Dravidian. The basic set of phones is more or less the same, varying mostly in the phonotactics across languages. There has also been borrowing of sounds and words across languages over time due to intermixing of cultures. Since syllables are fundamental units of speech production and Indian languages are characterised by syllable-timed rhythm, acoustic analysis of syllables has been carried out. In this paper, instances of common and most frequent syllables in continuous speech have been studied across six Indian languages, from both Indo-Aryan and Dravidian language groups. The distributions of acoustic features have been compared across these languages. This kind of analysis is useful for developing speech technologies in a multilingual scenario. Owing to similarities in the languages, text-to-speech (TTS) synthesisers have been developed by segmenting speech data at the phone level using hidden Markov models (HMM) from other languages as initial models. Degradation mean opinion scores and word error rates indicate that the quality of synthesised speech is comparable to that of TTSes developed by segmenting the data using language-specific HMMs.
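The cross-lingual syllable analysis described above starts from the most frequent syllables shared between languages. A minimal sketch of that selection step, assuming syllabified transcriptions are already available (all function names and the toy data are illustrative, not from the paper):

```python
from collections import Counter

def top_syllables(syllable_seqs, n=5):
    """Return the n most frequent syllables across a list of
    syllabified utterances (each a list of syllable strings)."""
    counts = Counter(s for utt in syllable_seqs for s in utt)
    return [s for s, _ in counts.most_common(n)]

def shared_top_syllables(lang_a, lang_b, n=5):
    """Syllables appearing in the top-n lists of both languages:
    candidates for comparing acoustic feature distributions."""
    return sorted(set(top_syllables(lang_a, n)) & set(top_syllables(lang_b, n)))
```

Instances of each shared syllable would then be excised from continuous speech in each language and their acoustic feature distributions compared.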
Question Answering (QA), as a research field, has primarily focused on either knowledge bases (KBs) or free text as a source of knowledge. These two sources have historically shaped the kinds of questions that are asked over these sources, and the methods developed to answer them. In this work, we look towards a practical use case of QA over user-instructed knowledge that uniquely combines elements of both structured QA over knowledge bases, and unstructured QA over narrative, introducing the task of multirelational QA over personal narrative. As a first step towards this goal, we make three key contributions: (i) we generate and release TEXTWORLDSQA, a set of five diverse datasets, where each dataset contains dynamic narrative that describes entities and relations in a simulated world, paired with variably compositional questions over that knowledge, (ii) we perform a thorough evaluation and analysis of several state-of-the-art QA models and their variants at this task, and (iii) we release a lightweight Python-based framework we call TEXTWORLDS for easily generating arbitrary additional worlds and narrative, with the goal of allowing the community to create and share a growing collection of diverse worlds as a test-bed for this task.
In this paper, new efforts to build text-to-speech (TTS) synthesis systems for Indian languages are presented. The synthesisers are built around both concatenative speech synthesis and statistical parametric speech synthesis frameworks. Text-to-speech synthesis systems require accurate segmentation, and obtaining accurate segmentation at the phone level is a difficult task: manual segmentation leads to human errors, while automatic segmentation using statistical approaches (hidden Markov model based approaches) leads to poor boundary information when the amount of data used for training is small. A group delay based semi-automatic syllable segmentation tool is discussed. The tool is semi-automatic because some of the boundaries obtained are inaccurate and have to be manually corrected. Next, a segmentation algorithm that uses both HMM based segmentation and group delay based segmentation is used to obtain accurate boundaries automatically. The boundaries obtained are used in the syllable-based synthesiser for unit selection. In the statistical phone-based synthesiser, embedded re-estimation is performed at the phone level. Syllable-based and penta-phone based HMMs are used for building the synthesiser. TTS systems for 12 Indian languages, namely Tamil, Hindi, Marathi, Malayalam, Telugu, Rajasthani, Bengali, Odia, Assamese, Manipuri, Kannada and Gujarati, are built using semi-automatic segmentation, and synthesisers have been built for 7 Indian languages using automatic segmentation. Evaluation of the semi-automatic segmentation systems indicates that the mean opinion score (MOS) is above 3.0 for most of the languages. Pair comparison tests on semi-automatic vs. automatic segmentation show that automatic segmentation is preferred.
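The group-delay-based segmentation mentioned above processes the short-time energy contour of speech to locate syllable boundaries; the actual group-delay formulation is more involved than can be shown here. As a deliberately simplified stand-in (all function names, frame sizes, and smoothing parameters are illustrative assumptions), boundary candidates can be taken at valleys of a smoothed short-time energy contour, since syllable nuclei are high-energy and boundaries tend to fall in low-energy regions:

```python
import numpy as np

def short_time_energy(x, frame_len=400, hop=160):
    """Frame-wise short-time energy of a mono signal
    (defaults: 25 ms frames, 10 ms hop at 16 kHz)."""
    n_frames = 1 + (len(x) - frame_len) // hop
    return np.array([np.sum(x[i * hop : i * hop + frame_len] ** 2)
                     for i in range(n_frames)])

def syllable_boundaries(x, frame_len=400, hop=160, smooth=15):
    """Return frame indices of candidate syllable boundaries,
    taken at local minima (valleys) of a smoothed energy contour.
    A rough stand-in for group-delay-based boundary detection."""
    e = short_time_energy(x, frame_len, hop)
    kernel = np.ones(smooth) / smooth
    e_s = np.convolve(e, kernel, mode="same")  # moving-average smoothing
    # a frame is a valley if strictly lower than both neighbours
    return [i for i in range(1, len(e_s) - 1)
            if e_s[i] < e_s[i - 1] and e_s[i] < e_s[i + 1]]
```

In the papers' pipeline such signal-derived boundaries are used to correct or constrain HMM-based boundaries, rather than being used on their own.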