2018 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)
DOI: 10.1109/icassp.2018.8461608

High Order Recurrent Neural Networks for Acoustic Modelling

Abstract: Vanishing long-term gradients are a major issue in training standard recurrent neural networks (RNNs), which can be alleviated by long short-term memory (LSTM) models with memory cells. However, the extra parameters associated with the memory cells mean an LSTM layer has four times as many parameters as an RNN with the same hidden vector size. This paper addresses the vanishing gradient problem using a high order RNN (HORNN) which has additional connections from multiple previous time steps. Speech recognition…
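
The HORNN described in the abstract augments the standard first-order RNN recursion with direct connections from several earlier time steps, shortening the gradient paths through time. The following is only a minimal illustrative sketch of such a forward recursion, not the authors' implementation: it assumes an order-3 model, tanh activations, and hypothetical weight names (W_x, W_h_list, b).

```python
import numpy as np

def hornn_forward(x_seq, W_x, W_h_list, b, order=3):
    """Minimal sketch of a high order RNN (HORNN) forward pass.

    Each hidden state depends on the current input and on the hidden
    states from the previous `order` time steps, which is intended to
    shorten gradient paths compared with a first-order RNN.
    x_seq:    (T, input_dim) input sequence
    W_x:      (hidden_dim, input_dim) input weight matrix
    W_h_list: list of `order` (hidden_dim, hidden_dim) recurrent matrices
    b:        (hidden_dim,) bias
    """
    T = x_seq.shape[0]
    hidden_dim = b.shape[0]
    h = [np.zeros(hidden_dim) for _ in range(order)]  # zero-padded history
    outputs = []
    for t in range(T):
        # Sum contributions from the last `order` hidden states.
        recurrent = sum(W_h_list[k] @ h[-(k + 1)] for k in range(order))
        h_t = np.tanh(W_x @ x_seq[t] + recurrent + b)
        h.append(h_t)
        outputs.append(h_t)
    return np.stack(outputs)

# Example usage with random weights (shapes are illustrative only).
rng = np.random.default_rng(0)
x = rng.standard_normal((5, 4))
W_x = rng.standard_normal((8, 4)) * 0.1
W_h_list = [rng.standard_normal((8, 8)) * 0.1 for _ in range(3)]
b = np.zeros(8)
print(hornn_forward(x, W_x, W_h_list, b).shape)  # (5, 8)
```

Setting order=1 recovers the standard RNN update, so the extra cost relative to a vanilla RNN is only the additional recurrent matrices, far fewer parameters than the four gate matrices of an LSTM layer of the same hidden size.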

Cited by 19 publications (17 citation statements)
References 29 publications
“…Figure 6 shows 1) all RNNs with memory or gated structure outperform the vanilla RNN and the high-order RNN by a large margin, which indicates the advantages of memory and gated structure for controlling information flow; 2) the high-order RNN performs better than the vanilla RNN, which implies the necessity of the non-local operations, since high-order connections can be considered as a simple non-local operation in a local area. This is also consistent with existing conclusions [37,50]; 3) our NRNM outperforms LSTM significantly, which demonstrates the superiority of our model over the standard LSTM.…”
Section: Investigation on NRNM (supporting)
confidence: 92%
“…Notable examples are the clockwork RNN (Koutnik et al, 2014), gated feedback RNN (Chung et al, 2015), hierarchical multi-scale RNN (Chung et al, 2016), fast-slow RNN (Mujika et al, 2017), and higher order RNNs (HORNNs) (Soltani and Jiang, 2016). These modern RNN architectures have found various applications in motion classification (Neverova et al, 2016;Yan et al, 2018), speech synthesis (Wu and King, 2016;Achanta and Gangashetty, 2017;Zhang and Woodland, 2018), recognition (Chan et al, 2016), and other related areas (Liu et al, 2015;Krause et al, 2017;Kurata et al, 2017). These applications of hierarchical RNN architectures further confirm the relevance of hierarchically organized sequence generators for capturing complex dynamics in our everyday environments.…”
Section: A Hierarchy of Time Scales: Machine Learning (mentioning)
confidence: 68%
“…Weight decay factors were carefully tuned to maximise the performance of each system. More details about the LSTM implementation and training configuration can be found in [42,43].…”
Section: Methods (mentioning)
confidence: 99%