Pranay Dighe scite author profile

Pranay Dighe

3 Publications

73 Citation Statements Received

81 Citation Statements Given

Affiliations

Apple (United Kingdom), Idiap Research Institute, École Polytechnique Fédérale de Lausanne

Publications

Audio event detection from acoustic unit occurrence patterns

Kumar, Dighe, Singh, et al. 2012

In most real-world audio recordings, we encounter several types of audio events. In this paper, we develop a technique for detecting signature audio events that is based on identifying patterns of occurrences of automatically learned atomic units of sound, which we call Acoustic Unit Descriptors or AUDs. Experiments show that the methodology works as well for detection of individual events and their boundaries in complex recordings.

Exploiting low-dimensional structures to enhance DNN based acoustic modeling in speech recognition

Dighe, Luyet, Asaei, et al. 2016

We propose to model the acoustic space of deep neural network (DNN) class-conditional posterior probabilities as a union of low-dimensional subspaces. To that end, the training posteriors are used for dictionary learning and sparse coding. Sparse representation of the test posteriors using this dictionary enables projection to the space of training data. Relying on the fact that the intrinsic dimensions of the posterior subspaces are indeed very small and the matrix of all posteriors belonging to a class has a very low rank, we demonstrate how low-dimensional structures enable further enhancement of the posteriors and rectify the spurious errors due to mismatch conditions. The enhanced acoustic modeling method leads to improvements in a continuous speech recognition task using the hybrid DNN-HMM (hidden Markov model) framework in both clean and noisy conditions, where up to 15.4% relative reduction in word error rate (WER) is achieved.

Sparse modeling of neural network posterior probabilities for exemplar-based speech recognition

Dighe, Asaei, Bourlard 2016, Speech Communication

In this paper, a compressive sensing (CS) perspective to exemplar-based speech processing is proposed. Relying on an analytical relationship between CS formulation and statistical speech recognition (hidden Markov models, HMMs), the automatic speech recognition (ASR) problem is cast as recovery of high-dimensional sparse word representation from the observed low-dimensional acoustic features. The acoustic features are exemplars obtained from (deep) neural network sub-word conditional posterior probabilities. Low-dimensional word manifolds are learned using these sub-word posterior exemplars and exploited to construct a linguistic dictionary for sparse representation of word posteriors. Dictionary learning has been found to be a principled way to alleviate the need for a huge collection of exemplars as required in conventional exemplar-based approaches, while still improving the performance. Context appending and collaborative hierarchical sparsity are used to exploit the sequential and group structure underlying word sparse representation. This formulation leads to a posterior-based sparse modeling approach to speech recognition. The potential of the proposed approach is demonstrated on isolated word (Phonebook corpus) and continuous speech (Numbers corpus) recognition tasks.
