SLIM LSTMs

Salem, Fathi M.

doi:10.48550/arxiv.1812.11391

Cited by 4 publications

(9 citation statements)

References 12 publications

“…This variant form is close to the so-called basic Recurrent Neural Network (bRNN), see [19], [21] for analysis and details.…”

Section: LSTM (mentioning)

confidence: 91%

“…Different variants have been introduced earlier [3], [21]. For LSTM 6, the gating signals are set at constant values as follows:…”

Section: LSTM (mentioning)

confidence: 99%

“…We have introduced numerous, computationally simpler, LSTM variants by aggressively eliminating some of the adaptive parameters, see [1], [2], [3], [14], [16], [20], [21]. In this study we shall focus on one of the simplest variant forms, namely the slim LSTM 6 and LSTM C6 [2], [21].…”

Section: Introduction (mentioning)

confidence: 99%

Slim LSTM NETWORKS: LSTM_6 and LSTM_C6

Akandeh

Salem

2019

2019 IEEE 62nd International Midwest Symposium on Circuits and Systems (MWSCAS)

Self Cite

We have shown previously that our parameter-reduced variants of Long Short-Term Memory (LSTM) Recurrent Neural Networks (RNN) are comparable in performance to the standard LSTM RNN on the MNIST dataset. In this study, we show that this is also the case for two diverse benchmark datasets, namely, the review sentiment IMDB and the 20 Newsgroup datasets. Specifically, we focus on two of the simplest variants, namely LSTM 6 (i.e., standard LSTM with three constant fixed gates) and LSTM C6 (i.e., LSTM 6 with further reduced cell body input block). We demonstrate that these two aggressively reduced-parameter variants are competitive with the standard LSTM when hyper-parameters, e.g., learning parameter, number of hidden units and gate constants are set properly. These architectures enable speeding up training computations and hence, these networks would be more suitable for online training and inference onto portable devices with relatively limited computational resources.
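As a rough illustration of what "three constant fixed gates" could mean in practice, the NumPy sketch below replaces the input, forget, and output gates of a standard LSTM step with fixed scalars, so that only the input block keeps trainable weights. The function names, the gate values (1.0, 0.5, 1.0), and the weight layout are assumptions made for illustration and are not taken from the paper; the further reduction used in LSTM C6 is not modeled.

```python
import numpy as np


def lstm6_step(x_t, h_prev, c_prev, W, U, b,
               gate_i=1.0, gate_f=0.5, gate_o=1.0):
    """One time step of an LSTM-like cell with constant gates.

    gate_i, gate_f, gate_o are hypothetical fixed scalars standing in for
    the input, forget, and output gates of a standard LSTM.
    """
    # Input block: the only part with trainable parameters in this sketch.
    z = np.tanh(W @ x_t + U @ h_prev + b)
    # Memory cell update with constant forget and input gates.
    c_t = gate_f * c_prev + gate_i * z
    # Hidden state with a constant output gate.
    h_t = gate_o * np.tanh(c_t)
    return h_t, c_t


def param_counts(n_in, n_hidden):
    """Trainable parameters per layer, with and without trained gates."""
    # One block = input weights + recurrent weights + bias.
    block = n_hidden * (n_in + n_hidden + 1)
    return {
        "standard_lstm": 4 * block,   # three gates plus the input block
        "constant_gates": block,      # input block only
    }
```

Under these assumptions, a layer with constant gates carries one quarter of the trainable parameters of a standard LSTM layer of the same size, which is consistent with the abstract's point about faster training on limited hardware.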

“…More recently, a host of new variants with aggressive reduction of parameters of the LSTM layer have shown reasonable initial success, see [6]-[10]. This mosaic of variants is referred to as SLIM LSTMs [11].…”

Section: B. SLIM LSTM Variants Overview (mentioning)

confidence: 99%

“…The overall equations of this standard LSTM layer are described in [2], and the references therein. Here, we follow the presentation in [10], [11], where one splits the 3 gating equations from the memory cell and the "input block" equations for suitability of the development in the next sections. The 3 gating equations are:…”

Section: Introduction A. LSTM Architecture Overview (mentioning)

confidence: 99%

Performance of Three Slim Variants of The Long Short-Term Memory (LSTM) Layer

Kent

Salem

2019

2019 IEEE 62nd International Midwest Symposium on Circuits and Systems (MWSCAS)

The Long Short-Term Memory (LSTM) layer is an important advancement in the field of neural networks and machine learning, allowing for effective training and impressive inference performance. LSTM-based neural networks have been successfully employed in various applications such as speech processing and language translation. The LSTM layer can be simplified by removing certain components, potentially speeding up training and runtime with limited change in performance. In particular, the recently introduced variants, called SLIM LSTMs, have shown success in initial experiments to support this view. Here, we perform computational analysis of the validation accuracy of a convolutional plus recurrent neural network architecture, comparing the standard LSTM layer with three SLIM LSTM layers. We have found that some realizations of the SLIM LSTM layers can potentially perform as well as the standard LSTM layer for our considered architecture.

ELSTM: An improved long short‐term memory network language model for sequence learning

Wang

et al. 2022

Expert Systems

The gated structure of the long short‐term memory (LSTM) alleviates the defects of gradient disappearance and explosion in the recurrent neural network (RNN). It has received widespread attention in sequence learning such as text analysis. Although LSTM has good performance in handling remote dependencies, information loss often occurs in long‐distance transmission. We propose a new model called ELSTM based on the computational complexity and gradient dispersion in the traditional LSTM model. This model simplifies the input gate of LSTM, reduces some time complexity by reducing some components, and improves the output gate. By introducing the exponential linear unit activation layer, the problem of gradient dispersion is alleviated. Comparing the new model with multiple existing models, when predicting language sequences, the time used by the model has been greatly reduced, and the language confusion has been reduced, showing good performance.
