Kaisheng Yao scite author profile

Using Recurrent Neural Networks for Slot Filling in Spoken Language Understanding

Semantic slot filling is one of the most challenging problems in spoken language understanding (SLU). In this paper, we propose to use recurrent neural networks (RNNs) for this task, and present several novel architectures designed to efficiently model past and future temporal dependencies. Specifically, we implemented and compared several important RNN architectures, including Elman, Jordan, and hybrid variants. To facilitate reproducibility, we implemented these networks with the publicly available Theano neural network toolkit and completed experiments on the well-known airline travel information system (ATIS) benchmark. In addition, we compared the approaches on two custom SLU data sets from the entertainment and movies domains. Our results show that the RNN-based models outperform the conditional random field (CRF) baseline by 2% in absolute error reduction on the ATIS benchmark. We improve the state of the art by 0.5% in the entertainment domain and 6.7% in the movies domain.

Index Terms: Recurrent neural network (RNN), slot filling, spoken language understanding (SLU), word embedding.
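The difference between the Elman and Jordan architectures named in this abstract can be illustrated with a minimal sketch. It assumes the textbook definitions (an Elman network feeds the previous hidden state back into the hidden layer, a Jordan network feeds back the previous output distribution) and uses untrained random weights. The names `make_params` and `tag_sequence`, the layer sizes, and the sigmoid/softmax choices are illustrative assumptions, not the paper's Theano implementation.

```python
import math
import random


def matvec(matrix, vector):
    """Multiply a list-of-rows matrix by a vector."""
    return [sum(w * v for w, v in zip(row, vector)) for row in matrix]


def sigmoid_vec(values):
    return [1.0 / (1.0 + math.exp(-v)) for v in values]


def softmax(values):
    peak = max(values)
    exps = [math.exp(v - peak) for v in values]
    total = sum(exps)
    return [e / total for e in exps]


def random_matrix(rows, cols, rng):
    return [[rng.uniform(-0.1, 0.1) for _ in range(cols)] for _ in range(rows)]


def make_params(n_in, n_hidden, n_labels, feedback, rng):
    """Hypothetical helper: the recurrent matrix shape depends on what is fed back."""
    n_feedback = n_hidden if feedback == "elman" else n_labels
    return (
        random_matrix(n_hidden, n_in, rng),        # input -> hidden
        random_matrix(n_hidden, n_feedback, rng),  # feedback -> hidden
        random_matrix(n_labels, n_hidden, rng),    # hidden -> slot labels
    )


def tag_sequence(embeddings, params, feedback="elman"):
    """Assign one slot-label index per input word vector."""
    w_in, w_rec, w_out = params
    prev_hidden = [0.0] * len(w_in)
    prev_output = [0.0] * len(w_out)
    labels = []
    for x in embeddings:
        # Elman: recur on the hidden state. Jordan: recur on the output layer.
        fed_back = prev_hidden if feedback == "elman" else prev_output
        pre_activation = [a + b for a, b in zip(matvec(w_in, x), matvec(w_rec, fed_back))]
        hidden = sigmoid_vec(pre_activation)
        output = softmax(matvec(w_out, hidden))
        labels.append(max(range(len(output)), key=output.__getitem__))
        prev_hidden, prev_output = hidden, output
    return labels


rng = random.Random(0)
sentence = [[rng.uniform(-1.0, 1.0) for _ in range(4)] for _ in range(5)]  # 5 toy word vectors
elman_labels = tag_sequence(sentence, make_params(4, 6, 3, "elman", rng), "elman")
jordan_labels = tag_sequence(sentence, make_params(4, 6, 3, "jordan", rng), "jordan")
```

Both variants emit one label per word; with untrained weights the labels themselves are meaningless, and only the feedback wiring differs between the two calls.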

Recent advances in deep learning for speech research at Microsoft

Deng

Huang

et al. 2013

647

325

KL-divergence regularized deep neural network adaptation for improved large vocabulary speech recognition

Yao

et al. 2013

369

228

We propose a novel regularized adaptation technique for context-dependent deep neural network hidden Markov models (CD-DNN-HMMs). The CD-DNN-HMM has a large output layer and many large hidden layers, each with thousands of neurons. The huge number of parameters in the CD-DNN-HMM makes adaptation a challenging task, especially when the adaptation set is small. The technique developed in this paper adapts the model conservatively by forcing the senone distribution estimated from the adapted model to be close to that from the unadapted model. This constraint is realized by adding Kullback-Leibler divergence (KLD) regularization to the adaptation criterion. We show that applying this regularization is equivalent to changing the target distribution in the conventional backpropagation algorithm. Experiments on Xbox voice search, short message dictation, and Switchboard and lecture speech transcription tasks demonstrate that the proposed adaptation technique can provide 2%-30% relative error reduction against the already very strong speaker-independent CD-DNN-HMM systems using different adaptation sets under both supervised and unsupervised adaptation setups.
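The claim that KLD regularization amounts to changing the backpropagation target can be sketched numerically. The example assumes the usual interpolated form, where the new target is a mix of the hard label and the unadapted model's posterior with a weight `rho`. The weight name, the three-class toy logits, and the function names are illustrative assumptions, not values or code from the paper.

```python
import math


def softmax(logits):
    peak = max(logits)
    exps = [math.exp(z - peak) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def kld_regularized_target(one_hot, unadapted_posterior, rho):
    """Interpolate the hard label with the unadapted model's posterior.

    rho = 0 recovers plain cross-entropy training on the labels;
    rho = 1 keeps the adapted model pinned to the unadapted one.
    """
    return [(1.0 - rho) * t + rho * p for t, p in zip(one_hot, unadapted_posterior)]


def cross_entropy(target, predicted):
    return -sum(t * math.log(p) for t, p in zip(target, predicted) if t > 0.0)


# Toy three-class "senone" example (all numbers are made up for illustration).
unadapted_posterior = softmax([2.0, 1.0, 0.1])  # output of the unadapted model
adapted_posterior = softmax([1.0, 2.0, 0.1])    # output of the model being adapted
one_hot = [0.0, 1.0, 0.0]                       # label from the adaptation data
rho = 0.5

target = kld_regularized_target(one_hot, unadapted_posterior, rho)
loss_with_new_target = cross_entropy(target, adapted_posterior)

# Same quantity written as a weighted sum of two terms: fit the labels, and
# stay close to the unadapted model. The KL term differs from the second
# cross-entropy only by a constant that does not depend on the adapted model.
loss_as_weighted_sum = (
    (1.0 - rho) * cross_entropy(one_hot, adapted_posterior)
    + rho * cross_entropy(unadapted_posterior, adapted_posterior)
)
```

Because the two losses coincide, an ordinary cross-entropy training loop can apply the regularization simply by swapping in the interpolated target.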

Spoken language understanding using long short-term memory neural networks

et al. 2014

Incorporating Structural Alignment Biases into an Attentional Neural Translation Model

et al. 2016

Neural encoder-decoder models of machine translation have achieved impressive results, rivalling traditional translation models. However, their modelling formulation is overly simplistic and omits several key inductive biases built into traditional models. In this paper we extend the attentional neural translation model to include structural biases from word-based alignment models, including positional bias, Markov conditioning, fertility and agreement over translation directions. We show improvements over a baseline attentional model and a standard phrase-based model across several language pairs, evaluating on difficult languages in a low-resource setting.

Adaptation of context-dependent deep neural networks for automatic speech recognition

Yao

Seide

et al. 2012

178

Recurrent neural networks for language understanding

Yao

Zweig

Hwang

et al. 2013

206

Recurrent conditional random field for language understanding

et al. 2014
