Sliced Recurrent Neural Networks
Preprint, 2018. DOI: 10.48550/arxiv.1807.02291

Cited by 6 publications (7 citation statements). References 2 publications.
“…For recurrent neural networks, more advanced architectures help deal with the gradient vanishing or explosion issue and accelerate the execution of RNNs. However, how to parallelize RNNs is still an open problem under active investigation [187].…”
Section: State-of-the-art Deep Architectures (mentioning, confidence: 99%)
“…Reference [24] presented an effective approach to training RNNs on multiple GPUs, where parallelized stochastic gradient descent (SGD) was applied and achieved a 3.4x speedup on 4 GPUs over a single GPU. More recently, [25] introduced sliced recurrent neural networks (SRNNs), which can be parallelized by slicing the sequences into many subsequences. SRNNs are able to obtain high-level information through multiple layers with few extra parameters.…”
Section: Related Work (mentioning, confidence: 99%)
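To make the slicing idea described in the quote above concrete, here is a minimal NumPy sketch of a two-level sliced recurrence: the sequence is split into subsequences whose bottom-level recurrences are mutually independent (hence parallelizable), and a short top-level recurrence combines the per-slice states. The toy tanh step, the two-level depth, and all parameter names are illustrative assumptions, not the authors' implementation.

# Minimal sketch of the slicing idea behind SRNNs (Yu & Liu, 2018).
# Shapes, the toy step function, and all names are assumptions.
import numpy as np

def rnn_step(h, x, W_h, W_x):
    # One toy RNN step: h' = tanh(W_h h + W_x x).
    return np.tanh(W_h @ h + W_x @ x)

def run_rnn(xs, W_h, W_x, hidden):
    # Run the toy RNN over a sequence, return the final hidden state.
    h = np.zeros(hidden)
    for x in xs:
        h = rnn_step(h, x, W_h, W_x)
    return h

def sliced_rnn(xs, n_slices, W_h0, W_x0, W_h1, W_x1, hidden):
    # Slice the sequence into n_slices subsequences.
    slices = np.array_split(xs, n_slices)
    # Bottom level: the per-slice recurrences are independent of each
    # other, so they could run in parallel across devices or cores.
    slice_states = [run_rnn(s, W_h0, W_x0, hidden) for s in slices]
    # Top level: a short recurrence over n_slices states instead of
    # len(xs) steps, which is where the speedup comes from.
    return run_rnn(slice_states, W_h1, W_x1, hidden)

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    T, d, hidden, n_slices = 64, 8, 16, 4
    xs = rng.standard_normal((T, d))
    W_h0, W_h1 = rng.standard_normal((2, hidden, hidden)) * 0.1
    W_x0 = rng.standard_normal((hidden, d)) * 0.1
    W_x1 = rng.standard_normal((hidden, hidden)) * 0.1
    print(sliced_rnn(xs, n_slices, W_h0, W_x0, W_h1, W_x1, hidden).shape)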
“…Fast Inference. AQM+'s time complexity can be decreased further by changing the structure of aprxAgen. Specifically, diverse methods can be applied, such as skipping the update of hidden states in some steps, using convolutional or self-attention networks (Vaswani et al., 2017), replacing the matrix multiplication in the hidden state update with a weighted addition (Yu & Liu, 2018), and inferring information gain directly from the neural networks (Belghazi et al., 2018).…”
Section: Toward Practical Applications (mentioning, confidence: 99%)
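For the "weighted addition" hidden-state update mentioned in the last quote, a hedged sketch of what such a step could look like: the recurrent matrix multiply is replaced by an element-wise blend of the previous state and the current input. The exact gating form below is an assumption for illustration, not a formula from the cited paper.

import numpy as np

def weighted_add_step(h_prev, x_t, alpha):
    # Element-wise blend: no hidden-to-hidden matrix multiplication,
    # so each step costs O(hidden) instead of O(hidden^2).
    # alpha may be a scalar or a per-unit vector in [0, 1] (assumed form).
    return (1.0 - alpha) * h_prev + alpha * x_t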