Deblin Bagchi scite author profile

Deblin Bagchi

5 Publications

85 Citation Statements Received

87 Citation Statements Given

How they've been cited: 104

How they cite others: 121

Affiliations

The Ohio State University

Publications


Spectral Feature Mapping with MIMIC Loss for Robust Speech Recognition

Bagchi, Plantinga, Stiff, et al. 2018


For the task of speech enhancement, local learning objectives are agnostic to phonetic structures helpful for speech recognition. We propose to add a global criterion to ensure de-noised speech is useful for downstream tasks like ASR. We first train a spectral classifier on clean speech to predict senone labels. Then, the spectral classifier is joined with our speech enhancer as a noisy speech recognizer. This model is taught to imitate the output of the spectral classifier alone on clean speech. This mimic loss is combined with the traditional local criterion to train the speech enhancer to produce de-noised speech. Feeding the de-noised speech to an off-the-shelf Kaldi training recipe for the CHiME-2 corpus shows significant improvements in WER.
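The combined objective described in this abstract can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation. It assumes that both the local criterion and the mimic loss are mean squared errors. The single linear layer standing in for the frozen spectral classifier, the `mimic_weight` parameter, and all array shapes are hypothetical.

```python
import numpy as np

def mse(a, b):
    """Mean squared error between two arrays of the same shape."""
    return float(np.mean((a - b) ** 2))

def classifier_logits(features, weights, bias):
    # Hypothetical stand-in for the frozen spectral classifier:
    # one linear layer mapping spectral frames to senone scores.
    return features @ weights + bias

def joint_loss(denoised, clean, weights, bias, mimic_weight=1.0):
    # Local criterion: distance between de-noised and clean features.
    local = mse(denoised, clean)
    # Mimic criterion (assumed squared error): distance between the
    # classifier's outputs on de-noised speech and on clean speech.
    mimic = mse(classifier_logits(denoised, weights, bias),
                classifier_logits(clean, weights, bias))
    return local + mimic_weight * mimic, local, mimic

rng = np.random.default_rng(0)
clean = rng.normal(size=(4, 8))                   # 4 frames, 8 spectral bins
denoised = clean + 0.1 * rng.normal(size=(4, 8))  # imperfect enhancer output
weights = rng.normal(size=(8, 3))                 # 3 hypothetical senone classes
bias = np.zeros(3)

total, local, mimic = joint_loss(denoised, clean, weights, bias)
print(f"local={local:.4f} mimic={mimic:.4f} total={total:.4f}")
```

In a training setup along these lines, only the enhancer's parameters would be updated while the classifier stays fixed, so the mimic term acts purely as a feedback signal.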


Combining spectral feature mapping and multi-channel model-based source separation for noise-robust automatic speech recognition

Bagchi, Mandel, Wang, et al. 2015


Deep neural network based spectral feature mapping for robust speech recognition

Han, Bagchi, et al. 2015


An Exploration of Mimic Architectures for Residual Network Based Spectral Mapping

Plantinga, Bagchi, Fosler‐Lussier 2018


Spectral mapping uses a deep neural network (DNN) to map directly from noisy speech to clean speech. Our previous study [1] found that the performance of spectral mapping improves greatly when using helpful cues from an acoustic model trained on clean speech. The mapper network learns to mimic the input favored by the spectral classifier and cleans the features accordingly. In this study, we explore two innovations: we replace a DNN-based spectral mapper with a residual network that is more attuned to the goal of predicting clean speech. We also examine how integrating long-term context in the mimic criterion (via wide-residual BiLSTM networks) affects the performance of spectral mapping compared to DNNs. Our goal is to derive a model that can be used as a preprocessor for any recognition system; the features derived from our model are passed through the standard Kaldi ASR pipeline and achieve a WER of 9.3%, which is the lowest recorded word error rate for the CHiME-2 dataset using only feature adaptation.


Phonetic Feedback for Speech Enhancement with and Without Parallel Speech Data

Plantinga, Bagchi, Fosler‐Lussier 2020


While deep learning systems have gained significant ground in speech enhancement research, these systems have yet to make use of the full potential of deep learning systems to provide high-level feedback. In particular, phonetic feedback is rare in speech enhancement research even though it includes valuable top-down information. We use the technique of mimic loss to provide phonetic feedback to an off-the-shelf enhancement system, and find gains in objective intelligibility scores on CHiME-4 data. This technique takes a frozen acoustic model trained on clean speech to provide valuable feedback to the enhancement model, even in the case where no parallel speech data is available. Our work is one of the first to show intelligibility improvement for neural enhancement systems without parallel speech data, and we show phonetic feedback can improve a state-of-the-art neural enhancement system trained with parallel speech data.
