Air Navigation Service Providers (ANSPs) are replacing paper flight strips with various digital solutions, which make the commands issued by air traffic controllers (ATCos) available in computer-readable form. However, these systems require manual controller inputs, i.e., they increase ATCos' workload. The Active Listening Assistant (AcListant®) project has shown that Assistant Based Speech Recognition (ABSR) is a potential solution for reducing this additional workload. However, developing an ABSR application for a specific target domain usually requires a large amount of manually transcribed audio data to achieve task-sufficient recognition accuracy. The MALORCA project developed an initial basic ABSR system and semi-automatically tailored its recognition models to both Prague and Vienna approach control by machine learning from automatically transcribed audio data. Command recognition error rates were reduced from 7.9% to under 0.6% for Prague and from 18.9% to 3.2% for Vienna.
In this paper, we address the problem of effectively self-training neural networks in a low-resource setting. Self-training is frequently used to increase the amount of training data automatically. In a low-resource scenario, however, it is less effective because the annotations created by self-labeling unlabeled data are unreliable. We propose to combine self-training with noise handling on the self-labeled data. Directly estimating the noise on the combined clean training set and self-labeled data can corrupt the clean data and therefore performs worse. We thus propose the Clean and Noisy Label Neural Network, which trains on clean and noisy self-labeled data simultaneously by explicitly modeling clean and noisy labels separately. In our experiments on chunking and named entity recognition (NER), this approach performs more robustly than the baselines. Complementary to this explicit approach, noise can also be handled implicitly with the help of an auxiliary learning task. Combined with such a complementary approach, our method is more beneficial than the other baseline methods, and together they provide the best overall performance.
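A minimal PyTorch sketch of the dual-branch idea described above, assuming a shared base classifier whose clean predictions pass through a learned confusion (noise) matrix for the self-labeled branch; the class and function names are illustrative, not the paper's code:

```python
# Illustrative sketch: one shared classifier, two output paths.
# The clean path is trained with ordinary cross-entropy; the noisy
# (self-labeled) path is trained through a learned noise channel,
# shielding the base classifier from label noise.
import torch
import torch.nn as nn
import torch.nn.functional as F

class CleanNoisyModel(nn.Module):
    def __init__(self, input_dim, num_classes):
        super().__init__()
        # Shared base classifier: predicts the (clean) label distribution.
        self.base = nn.Sequential(nn.Linear(input_dim, 128), nn.ReLU(),
                                  nn.Linear(128, num_classes))
        # Learned noise channel, initialised to the identity: a per-class
        # confusion matrix mapping clean probabilities to noisy ones.
        self.noise = nn.Parameter(torch.eye(num_classes))

    def forward(self, x):
        clean_logits = self.base(x)
        clean_probs = F.softmax(clean_logits, dim=-1)
        # Row-normalise the noise matrix so each row is a distribution.
        channel = F.softmax(self.noise, dim=-1)
        noisy_probs = clean_probs @ channel
        return clean_logits, noisy_probs

def loss_fn(model, x_clean, y_clean, x_noisy, y_noisy):
    clean_logits, _ = model(x_clean)
    _, noisy_probs = model(x_noisy)
    l_clean = F.cross_entropy(clean_logits, y_clean)
    l_noisy = F.nll_loss(torch.log(noisy_probs + 1e-8), y_noisy)
    return l_clean + l_noisy
```

Keeping the two objectives separate is what prevents the noise estimate from leaking into, and corrupting, the clean training signal.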
Traditionally, short-range language models (LMs) such as conventional n-gram models have been used for language model adaptation. Recent work has improved performance on such tasks using adapted long-span models such as Recurrent Neural Network LMs (RNNLMs). With the first pass performed using a large background n-gram LM, the adapted RNNLMs are mostly used to rescore lattices or N-best lists in a second decoding step. Ideally, these adapted RNNLMs would be applied directly in first-pass decoding. We therefore introduce two ways of applying adapted long short-term memory (LSTM) based RNNLMs in first-pass decoding. Using available techniques to convert LSTMs into approximated versions suitable for first-pass decoding, we compare approximated LSTMs adapted in a Fast Marginal Adaptation (FMA) framework against an approximated version of an architecture-based adaptation of the LSTM. On a conversational speech recognition task, these differently approximated and adapted LSTMs, combined with a trigram LM, outperform other adapted and unadapted LMs. The architecture-adapted LSTM combination obtains a 35.9% word error rate (WER) and is outperformed by the FMA-based LSTM combination, which obtains the overall lowest WER of 34.4%.
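As a rough sketch of the FMA rescaling step mentioned above, under the standard fast-marginal-adaptation formulation: the background LM's conditional probabilities are scaled by an exponentiated ratio of in-domain to background unigram probabilities and then renormalised. The LM interfaces and parameter names here are hypothetical:

```python
# Hypothetical sketch of Fast Marginal Adaptation (FMA).
# p_bg(w, history): background conditional LM probability (callable)
# p_uni_adapt, p_uni_bg: in-domain and background unigram dicts
# beta: tuning exponent, typically set on held-out data
def fma_adapt(p_bg, p_uni_adapt, p_uni_bg, history, vocab, beta=0.5):
    scores = {
        w: (p_uni_adapt[w] / p_uni_bg[w]) ** beta * p_bg(w, history)
        for w in vocab
    }
    z = sum(scores.values())  # normalisation constant Z(history)
    return {w: s / z for w, s in scores.items()}
```

Because only unigram statistics need to be estimated from the adaptation data, the rescaling is cheap enough to apply to an approximated LSTM used in the first pass.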
Recurrent neural networks (RNNs) are a recent technique for modeling long-range dependencies in natural language. They have clearly outperformed trigrams and other more advanced language modeling techniques by modeling long-range dependencies non-linearly. An alternative is log-linear interpolation of skip models (i.e., skip bigrams and skip trigrams), a method that has been published previously. In this paper we investigate the impact of different smoothing techniques on the skip models and thereby on their overall performance. One option is to use automatically trained distance clusters (both hard and soft) to increase robustness and to combat sparseness in the skip models. We also investigate alternative smoothing techniques at the word level. For skip bigrams that skip a small number of words, Kneser-Ney (KN) smoothing is advantageous; when a larger number of words is skipped, Dirichlet smoothing performs better. To exploit the advantages of both KN and Dirichlet smoothing, we propose a new unified smoothing technique. Experiments are performed on four Babel languages: Cantonese, Pashto, Tagalog, and Turkish. RNNs and log-linearly interpolated skip models are on par if the skip models are trained with standard smoothing techniques. Using the improved smoothing of the skip models together with distance clusters, we clearly outperform RNNs by about 8-11% in perplexity across all four languages.
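A brief sketch of the two building blocks named above, a Dirichlet-smoothed skip-k bigram and log-linear interpolation of component models; the counting scheme and interfaces are assumptions for illustration, not the authors' implementation:

```python
# Hypothetical illustration: Dirichlet-smoothed skip-k bigram plus
# log-linear interpolation of several component distributions.
import math
from collections import Counter

def dirichlet_skip_bigram(tokens, k, mu, p_unigram):
    """Skip-k bigram p(w_n | w_{n-k}) with a Dirichlet (unigram) prior."""
    pair = Counter(zip(tokens, tokens[k:]))  # counts of (w_{n-k}, w_n) pairs
    ctx = Counter(tokens[:-k])               # counts of the context word

    def prob(w, context):
        # Counts are interpolated with the unigram prior; mu controls
        # how strongly the prior is trusted for sparse contexts.
        return (pair[(context, w)] + mu * p_unigram.get(w, 0.0)) \
               / (ctx[context] + mu)

    return prob

def loglinear_combine(dists, lambdas):
    """p(w|h) proportional to prod_i p_i(w|h)^lambda_i, renormalised."""
    scores = {w: math.prod(d[w] ** l for d, l in zip(dists, lambdas))
              for w in dists[0]}
    z = sum(scores.values())
    return {w: s / z for w, s in scores.items()}
```

The unified smoothing proposed in the paper would replace the fixed Dirichlet prior here; this sketch only shows the standard variants being compared.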