Analysis Methods in Neural Language Processing: A Survey

The field of natural language processing has seen impressive progress in recent years, with neural network models replacing many of the traditional systems. A plethora of new models have been proposed, many of which are thought to be opaque compared to their feature-rich counterparts. This has led researchers to analyze, interpret, and evaluate neural networks in novel and more fine-grained ways. In this survey paper, we review analysis methods in neural language processing, categorize them according to prominent research trends, highlight existing limitations, and point to potential directions for future work.

What do Neural Machine Translation Models Learn about Morphology?

Belinkov, Durrani, Dalvi, et al. 2017

Neural machine translation (MT) models obtain state-of-the-art performance while maintaining a simple, end-to-end architecture. However, little is known about what these models learn about source and target languages during the training process. In this work, we analyze the representations learned by neural MT models at various levels of granularity and empirically evaluate the quality of the representations for learning morphology through extrinsic part-of-speech and morphological tagging tasks. We conduct a thorough investigation along several parameters: word-based vs. character-based representations, depth of the encoding layer, the identity of the target language, and encoder vs. decoder representations. Our data-driven, quantitative evaluation sheds light on important aspects in the neural MT system and its ability to capture word structure.

Unsupervised Pattern Discovery in Speech

Park, Glass, 2008. IEEE Trans. Audio Speech Lang. Process.

Unsupervised spoken keyword spotting via segmental DTW on Gaussian posteriorgrams

Zhang, Glass, 2009

In this paper, we present an unsupervised learning framework to address the problem of detecting spoken keywords. Without any transcription information, a Gaussian Mixture Model is trained to label speech frames with a Gaussian posteriorgram. Given one or more spoken examples of a keyword, we use segmental dynamic time warping to compare the Gaussian posteriorgrams between keyword samples and test utterances. The keyword detection result is then obtained by ranking the distortion scores of all the test utterances. We examine the TIMIT corpus as a development set to tune the parameters in our system, and the MIT Lecture corpus for more substantial evaluation. The results demonstrate the viability and effectiveness of our unsupervised learning framework on the keyword spotting task.

Jointly Discovering Visual Objects and Spoken Words from Raw Sensory Input

et al. 2018

Speech database development at MIT: Timit and beyond

Zue, Seneff, Glass, 1990. Speech Communication

An Unsupervised Autoregressive Model for Speech Representation Learning

et al. 2019

This paper proposes a novel unsupervised autoregressive neural model for learning generic speech representations. In contrast to other speech representation learning methods that aim to remove noise or speaker variabilities, ours is designed to preserve information for a wide range of downstream tasks. In addition, the proposed model does not require any phonetic or word boundary labels, allowing the model to benefit from large quantities of unlabeled data. Speech representations learned by our model significantly improve performance on both phone classification and speaker verification over the surface features and other supervised and unsupervised approaches. Further analysis shows that different levels of speech information are captured by our model at different layers. In particular, the lower layers tend to be more discriminative for speakers, while the upper layers provide more phonetic content.

Highway long short-term memory RNNs for distant speech recognition

Zhang, Chen, et al. 2016

In this paper, we extend the deep long short-term memory (DLSTM) recurrent neural networks by introducing gated direct connections between memory cells in adjacent layers. These direct links, called highway connections, enable unimpeded information flow across different layers and thus alleviate the gradient vanishing problem when building deeper LSTMs. We further introduce the latency-controlled bidirectional LSTMs (BLSTMs) which can exploit the whole history while keeping the latency under control. Efficient algorithms are proposed to train these novel networks using both frame and sequence discriminative criteria. Experiments on the AMI distant speech recognition (DSR) task indicate that we can train deeper LSTMs and achieve better improvement from sequence training with highway LSTMs (HLSTMs). Our novel model obtains 43.9/47.7% WER on AMI (SDM) dev and eval sets, outperforming all previous works. It beats the strong DNN and DLSTM baselines with 15.7% and 5.3% relative improvement, respectively.

James Glass

Analysis Methods in Neural Language Processing: A Survey
