Text-to-speech (TTS) synthesis is the artificial production of human speech. This paper reviews recent research advances in the field of speech synthesis related to the statistical parametric approach based on hidden Markov models (HMMs). Within this approach, HMM-based text-to-speech synthesis (HTS) is reviewed in brief. HTS is based on the generation of an optimal parameter sequence from subword HMMs. The quality of the HTS framework relies on an accurate description of the phone set. The most attractive feature of an HTS system is that the prosodic characteristics of the voice can be modified simply by varying the HMM parameters, which reduces the large storage requirement.
Text-to-speech (TTS) synthesis is the production of artificial speech by a machine from text given as input. Speech synthesis can be achieved by concatenative and hidden Markov model (HMM) techniques, and the voices synthesized by these techniques should be evaluated for quality. This study presents a comparative analysis of the quality of speech synthesized using the HMM and unit selection approaches. The quality of the synthesized speech is analyzed subjectively using the mean opinion score (MOS) and objectively using the mean square score and the peak signal-to-noise ratio (PSNR). The quality is also assessed using Mel-frequency cepstral coefficient (MFCC) features of the synthesized speech. The experimental analysis shows that the unit selection method produces a better synthesized voice than the HMM method.
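The measures named in this abstract can be illustrated with a minimal Python sketch. It is not the authors' evaluation code: it assumes that "mean square score" refers to the conventional mean squared error, that the reference and synthesized signals are already time-aligned, equal in length, and normalized so that the peak amplitude is 1.0, and that MOS is the plain average of listener ratings. All function names are hypothetical.

```python
import math


def mean_squared_error(reference, synthesized):
    """Average squared sample difference between two aligned signals."""
    if not reference or len(reference) != len(synthesized):
        raise ValueError("signals must be non-empty and of equal length")
    total = sum((r - s) ** 2 for r, s in zip(reference, synthesized))
    return total / len(reference)


def psnr_db(reference, synthesized, peak=1.0):
    """Peak signal-to-noise ratio in dB: 10 * log10(peak^2 / MSE).

    `peak` is the assumed maximum amplitude of the signal (1.0 for
    normalized floating-point audio).
    """
    mse = mean_squared_error(reference, synthesized)
    if mse == 0:
        return math.inf  # identical signals: no distortion
    return 10.0 * math.log10(peak ** 2 / mse)


def mean_opinion_score(ratings):
    """Arithmetic mean of listener ratings (commonly on a 1-5 scale)."""
    if not ratings:
        raise ValueError("at least one rating is required")
    return sum(ratings) / len(ratings)


if __name__ == "__main__":
    # Toy signals, chosen only to show the calls; not real speech data.
    ref = [0.0, 0.5, -0.5, 1.0]
    syn = [0.0, 0.4, -0.6, 0.9]
    print(mean_squared_error(ref, syn))
    print(psnr_db(ref, syn))
    print(mean_opinion_score([4, 5, 3]))
```

In practice, natural and synthesized utterances differ in duration, so a sample-by-sample comparison like this is only meaningful after some alignment step, which the abstract does not describe.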
This paper describes the implementation of screen readers for Marathi on the Windows and Linux platforms using an unrestricted-domain Marathi text-to-speech system (MTTS) with Indian English (IE) support. The application integrates MTTS with the open-source screen readers NVDA and ORCA. MTTS is a syllable-based unit selection concatenative system built around the open-source Festival engine. IE support is provided for smooth navigation and for handling English words that occur while accessing the internet and other applications. The TTS is a concatenative system in which the syllable is the largest unit of concatenation. The TTS output resembles a natural human voice because it uses original speech segments for concatenation. Testing has been done with both normal and differently abled (DA) users. The system has been tuned to improve user friendliness based on feedback from the DA users. The system achieves a Mean Opinion Score of 86.4% when evaluated by a group of DA users.
Text-to-speech (TTS) synthesis is the production of artificial speech by a machine from text given as input. This field of study is known both as speech synthesis, that is, the "synthetic" (computer) generation of speech, and as text-to-speech or TTS. It is the process of converting written text into speech. Speech synthesis mainly uses two processing components: a natural language processing (NLP) module and a digital signal processing (DSP) module. Speech synthesis has many applications, such as reading for blind people, telecommunication services, language education, aids for handicapped persons, talking books and toys, and call center automation. The main aim of the project is to develop a TTS system that produces a voice with an Indian accent for the given input text. In this project, Festival is used in a Linux environment to convert text to speech. Festival is a general, pre-packaged tool for developing multi-language speech synthesis systems, and it supports most languages for text-to-speech conversion. Speech generation is performed using the Festival framework and the speech tools. The voice model is generated using the Festvox framework, Festival, and the speech tools. The speech data required for building the voice is recorded in a noise-free environment. The voice models can be generated with either the unit selection or the Clustergen modules present in Festvox. It is observed from the generated voices that the Clustergen voices are better than the unit selection voices.