Jürgen T. Geiger scite author profile

Eliminating the negative effect of non-stationary environmental noise is a long-standing research topic for automatic speech recognition that still remains an important challenge. Data-driven supervised approaches, including ones based on deep neural networks, have recently emerged as potential alternatives to traditional unsupervised approaches and, with sufficient training, can alleviate the shortcomings of the unsupervised methods in various real-life acoustic environments. In this light, we review recently developed, representative deep learning approaches for tackling non-stationary additive and convolutional degradation of speech, with the aim of providing guidelines for those involved in the development of environmentally robust speech recognition systems. We separately discuss single- and multi-channel techniques developed for the front-end and back-end of speech recognition systems, as well as joint front-end and back-end training frameworks.

The TUM Gait from Audio, Image and Depth (GAID) database: Multimodal recognition of subjects and traits

Hofmann, Geiger, Bachmann, et al. 2014

Journal of Visual Communication and Image Representation

Depth-First Search of Random Trees, and Poisson Point Processes

Geiger, Kersting 1997

Large-scale audio feature extraction and SVM for acoustic scene classification

Geiger, Schuller, Rigoll 2013

Feature enhancement by deep LSTM networks for ASR in reverberant multisource environments

Weninger, Geiger, Wöllmer, et al. 2014

Computer Speech & Language

Deep Learning for Environmentally Robust Speech Recognition
