Zero-shot learning in image classification refers to the setting where images of some novel classes are absent from the training data, but other information about those classes, such as natural-language descriptions or attribute vectors, is available. This setting is important in practice, since one may not be able to obtain images of all possible classes at training time. While previous approaches have modeled the relationship between the class-attribute space and the image space via some transfer function, in order to infer the image-space representation of an unseen class, we take a different approach: we generate samples from the given attributes using a conditional variational autoencoder and use the generated samples to classify the unseen classes. Through extensive experiments on four benchmark datasets, we show that our model outperforms the state of the art, particularly in the more realistic generalized setting, where the training classes can also appear at test time alongside the novel classes.
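The core idea above (generate pseudo-samples for an unseen class from its attribute vector, then classify with them) can be sketched as follows. This is a toy illustration, not the paper's model: the trained CVAE decoder is replaced by a fixed linear map, and the attribute vectors and dimensions are made-up values.

```python
import numpy as np

rng = np.random.default_rng(0)

# Stand-in for a trained CVAE decoder mapping (latent z, attribute a) to a
# point in image-feature space. A fixed linear map for illustration only;
# in the paper this would be the learned decoder network.
W_z = rng.normal(size=(4, 8))   # latent dim 4 -> feature dim 8
W_a = rng.normal(size=(3, 8))   # attribute dim 3 -> feature dim 8

def decode(z, a):
    # Small latent contribution so class structure dominates in this toy.
    return 0.1 * (z @ W_z) + a @ W_a

# Hypothetical attribute vectors for two unseen classes.
attrs = {"zebra": np.array([1.0, 0.0, 1.0]),
         "horse": np.array([1.0, 0.0, 0.0])}

# Generate pseudo-samples per unseen class and build nearest-mean prototypes.
prototypes = {}
for name, a in attrs.items():
    z = rng.normal(size=(200, 4))            # z ~ N(0, I)
    prototypes[name] = decode(z, a).mean(axis=0)

def classify(x):
    return min(prototypes, key=lambda c: np.linalg.norm(x - prototypes[c]))

# A query feature generated from the "zebra" attribute lands near
# the zebra prototype.
query = decode(rng.normal(size=(1, 4)), attrs["zebra"])[0]
print(classify(query))
```

In the paper a full classifier (e.g. an SVM) would be trained on the generated samples; the nearest-prototype rule here only demonstrates the pipeline.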
In this paper we propose a scheme for developing a voice conversion system that converts the speech signal uttered by a source speaker to a speech signal having the voice characteristics of the target speaker. In particular, we address the issue of transformation of the vocal tract system features from one speaker to another. Formants are used to represent the vocal tract system features and a formant vocoder is used for synthesis. The scheme consists of a formant analysis phase, followed by a learning phase in which the implicit formant transformation is captured by a neural network. The transformed formants together with the pitch contour modified to suit the average pitch of the target speaker are used to synthesize speech with the desired vocal tract system characteristics.
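The learning phase described above, in which a neural network captures the formant transformation between speakers, can be illustrated with a minimal sketch. Everything here is assumed: synthetic formant data, a hypothetical affine source-to-target warp standing in for real parallel utterances, and a tiny one-hidden-layer network trained with plain gradient descent.

```python
import numpy as np

rng = np.random.default_rng(1)

# Toy data: 3 formant frequencies (Hz) per frame for the source speaker, and
# a hypothetical target mapping (a fixed affine warp) the network must learn.
# A real system would use formants measured from parallel utterances.
F_src = rng.uniform([300, 900, 2200], [900, 2300, 3200], size=(256, 3))
F_tgt = F_src * np.array([0.85, 0.9, 0.95]) + np.array([50.0, -40.0, 30.0])

# Normalize so the tanh hidden layer operates in a sensible range.
mu, sd = F_src.mean(0), F_src.std(0)
X = (F_src - mu) / sd
Y = (F_tgt - mu) / sd

# One-hidden-layer network, trained with full-batch gradient descent on MSE.
W1 = rng.normal(scale=0.5, size=(3, 16)); b1 = np.zeros(16)
W2 = rng.normal(scale=0.5, size=(16, 3)); b2 = np.zeros(3)
lr = 0.05
for _ in range(5000):
    H = np.tanh(X @ W1 + b1)           # hidden activations
    P = H @ W2 + b2                    # predicted target formants
    G = 2 * (P - Y) / len(X)           # dMSE/dP
    GH = (G @ W2.T) * (1 - H ** 2)     # backprop through tanh
    W2 -= lr * H.T @ G;  b2 -= lr * G.sum(0)
    W1 -= lr * X.T @ GH; b1 -= lr * GH.sum(0)

mse = float(((P - Y) ** 2).mean())
```

After training, transformed formants (de-normalized predictions) would drive the formant vocoder along with the pitch-modified contour described in the abstract.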
Abstract
Traditionally, the information in speech signals is represented in terms of features derived from short-time Fourier analysis. In this analysis, only the features extracted from the magnitude of the Fourier transform (FT) are considered, ignoring the phase component. Although the significance of the FT phase was highlighted in several studies over the past three decades, the features of the FT phase were not fully exploited, due to the difficulty of computing and processing the phase function. The information in the short-time FT phase can be extracted by processing the derivative of the FT phase, i.e., the group delay function. In this paper, the properties of group delay functions are reviewed, highlighting the importance of the FT phase for representing information in the speech signal. Methods to process the group delay function are discussed, to capture the characteristics of the vocal-tract system in the form of formants or through a modified group delay function. Applications of group delay functions in speech processing are discussed in some detail. They include segmentation of speech at syllable boundaries, exploiting the additive and high-resolution properties of group delay functions. The effectiveness of speech segmentation, and of the features derived from the modified group delay function, is demonstrated in applications such as language identification, speech recognition, and speaker recognition. The paper thus demonstrates the need to exploit the potential of group delay functions for the development of speech systems.
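The group delay function mentioned above, the negative derivative of the FT phase, can be computed without explicit phase unwrapping via a standard DFT identity. A minimal sketch, with the function and signal names chosen here for illustration:

```python
import numpy as np

def group_delay(x, nfft=512):
    """Group delay tau(w) = -d(phase)/dw of the signal x, computed without
    phase unwrapping via the identity
        tau(w) = (X_R*Y_R + X_I*Y_I) / |X(w)|^2,
    where X is the FT of x[n] and Y is the FT of n*x[n]."""
    n = np.arange(len(x))
    X = np.fft.rfft(x, nfft)
    Y = np.fft.rfft(n * x, nfft)
    return (X.real * Y.real + X.imag * Y.imag) / (np.abs(X) ** 2)

# Sanity check: a pure delay of n0 samples has constant group delay n0.
x = np.zeros(64)
x[5] = 1.0
tau = group_delay(x)
print(np.allclose(tau, 5.0))  # → True
```

The modified group delay function discussed in the paper additionally tames the spikes that arise when |X(w)|^2 approaches zero (e.g., by replacing the denominator with a cepstrally smoothed spectrum); the plain form above suffices to show the computation.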