2019 IEEE 62nd International Midwest Symposium on Circuits and Systems (MWSCAS)
DOI: 10.1109/mwscas.2019.8884912

Slim LSTM NETWORKS: LSTM_6 and LSTM_C6

Abstract: We have shown previously that our parameter-reduced variants of Long Short-Term Memory (LSTM) Recurrent Neural Networks (RNN) are comparable in performance to the standard LSTM RNN on the MNIST dataset. In this study, we show that this is also the case for two diverse benchmark datasets, namely the IMDB review-sentiment and 20 Newsgroups datasets. Specifically, we focus on two of the simplest variants, namely LSTM_6 (i.e., standard LSTM with three constant fixed gates) and LSTM_C6 (i.e., LSTM_6 with further…
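The abstract's description of LSTM_6 (a standard LSTM whose three gates are frozen to constants) can be made concrete with a short sketch. This is a minimal illustration, not the authors' implementation: the gate constants (0.5, 0.5, 1.0), the layer sizes, and the initialization are assumptions; only the candidate-memory path keeps trainable weights.

```python
import numpy as np

class LSTM6Cell:
    """Sketch of an LSTM_6-style cell: the input, forget, and output gates are
    replaced by constant scalars, so only the candidate-memory path keeps
    trainable weights.  The constants and initialization are illustrative
    assumptions, not values taken from the paper."""

    def __init__(self, input_size, hidden_size,
                 i_const=0.5, f_const=0.5, o_const=1.0, seed=0):
        rng = np.random.default_rng(seed)
        self.W_c = rng.normal(0.0, 0.1, (hidden_size, input_size))   # input-to-candidate weights
        self.U_c = rng.normal(0.0, 0.1, (hidden_size, hidden_size))  # recurrent-to-candidate weights
        self.b_c = np.zeros(hidden_size)
        self.i, self.f, self.o = i_const, f_const, o_const

    def step(self, x_t, h_prev, c_prev):
        # The candidate memory is the only learned transformation left.
        c_tilde = np.tanh(self.W_c @ x_t + self.U_c @ h_prev + self.b_c)
        c_t = self.f * c_prev + self.i * c_tilde   # constant forget and input gates
        h_t = self.o * np.tanh(c_t)                # constant output gate
        return h_t, c_t
```

Because the three gate weight matrices and biases disappear, a cell like this has roughly a quarter of the parameters of a standard LSTM cell of the same size.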

Cited by 14 publications (4 citation statements). References 13 publications.
“…This would be useful for user authentication because it is likely that the same child would be using a mobile device over a given bout of time and training the model with behavior from earlier in the bout would enhance model performance. However, LSTM models are computationally intensive [68], which is why more parsimonious techniques were used in this preliminary study.…”
Section: Discussion
confidence: 99%
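The "computationally intensive" point is easy to quantify: a standard LSTM layer trains four weight blocks (input, forget, and output gates plus the candidate memory), each with input weights, recurrent weights, and a bias. The sizes in this back-of-the-envelope count are hypothetical, not taken from the cited study.

```python
def lstm_param_count(input_size: int, hidden_size: int) -> int:
    # Four blocks (three gates + candidate memory), each with input weights,
    # recurrent weights, and a bias vector.
    return 4 * (hidden_size * (input_size + hidden_size) + hidden_size)

print(lstm_param_count(128, 256))  # 394240 trainable parameters in a single layer
```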
“…LSTM is a type of RNN layer that can effectively process sequential data, such as time series or text [57]. The bidirectional variant of LSTM processes the input sequence in both forward and backward directions, allowing the model to capture dependencies in both past and future contexts.…”
Section: Methods
confidence: 99%
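As a concrete illustration of the bidirectional variant described above, the PyTorch sketch below runs an LSTM over a batch of embedded sequences in both directions and concatenates the forward and backward hidden states. The tensor sizes are arbitrary placeholders, not values from the cited study.

```python
import torch
import torch.nn as nn

# Bidirectional LSTM over a batch of token-embedding sequences.
bilstm = nn.LSTM(input_size=100, hidden_size=64,
                 batch_first=True, bidirectional=True)

x = torch.randn(8, 50, 100)      # (batch, time steps, embedding dim)
outputs, (h_n, c_n) = bilstm(x)
print(outputs.shape)             # torch.Size([8, 50, 128]): forward and backward states concatenated
```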
“…The main function of LSTM is to solve the vanishing gradient problem, in which the gradient keeps getting smaller until the weight values no longer change, so the model never reaches a better result or converges. Conversely, an increasing gradient causes the weight values in several layers to grow as well, so the optimization algorithm diverges, which is called an exploding gradient [2], [14]-[17].…”
Section: Fig. 1 Recurrent Neural Network (RNN) Concept
confidence: 99%
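The vanishing/exploding behaviour described in this statement can be seen with a toy calculation: back-propagating through T time steps of a plain RNN multiplies the gradient by the recurrent factor T times, so it either decays toward zero or grows without bound. The step count and weights below are arbitrary assumptions chosen for illustration.

```python
# Toy illustration of vanishing vs. exploding gradients through T time steps.
T = 50
for w in (0.9, 1.1):
    grad = 1.0
    for _ in range(T):
        grad *= w                # repeated multiplication by the recurrent factor
    print(f"factor {w}: gradient scale after {T} steps = {grad:.3e}")
# 0.9 -> ~5.2e-03 (vanishing), 1.1 -> ~1.2e+02 (exploding)
```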