2020
DOI: 10.1109/access.2020.2985418

NEWLSTM: An Optimized Long Short-Term Memory Language Model for Sequence Prediction

Abstract: The long short-term memory (LSTM) model, trained on the universal language modeling task, overcomes the vanishing-gradient bottleneck of the traditional recurrent neural network (RNN) and performs well across many natural language processing tasks. Although LSTM effectively alleviates the vanishing gradient problem of the RNN, information is still heavily lost during long-distance transmission, and some limitations remain in practical use. In this paper, …
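For readers who want a concrete picture of the baseline being optimized, the sketch below is a plain LSTM language model for next-token prediction written with PyTorch. It is not the NEWLSTM architecture proposed in the paper; the vocabulary size, embedding width, hidden size, and layer count are illustrative assumptions.

# A minimal sketch (not the paper's NEWLSTM): a standard LSTM language model in PyTorch.
# Vocabulary size, embedding width, hidden size, and layer count are illustrative assumptions.
import torch
import torch.nn as nn

class LSTMLanguageModel(nn.Module):
    def __init__(self, vocab_size=10000, embed_dim=256, hidden_dim=512, num_layers=2):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, embed_dim)
        self.lstm = nn.LSTM(embed_dim, hidden_dim, num_layers, batch_first=True)
        self.proj = nn.Linear(hidden_dim, vocab_size)

    def forward(self, tokens, state=None):
        # tokens: (batch, seq_len) integer ids; the model predicts the next token at each position
        out, state = self.lstm(self.embed(tokens), state)
        return self.proj(out), state      # logits: (batch, seq_len, vocab_size)

model = LSTMLanguageModel()
x = torch.randint(0, 10000, (8, 35))      # a dummy batch of token ids
logits, _ = model(x)
# next-token cross-entropy: shift targets by one position
loss = nn.functional.cross_entropy(logits[:, :-1].reshape(-1, 10000), x[:, 1:].reshape(-1))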

Cited by 19 publications (9 citation statements)
References 17 publications
“…Similar to all recurrent neural network-based methods, the MLP-LSTM [39] still suffers from the long-term dependency problem [40][41][42]. As a result, the low-quality temporal features it extracts degraded the performance of MLP-LSTM [39]. However, the HAMA achieved the highest performance by using the attention mechanism to extract high-quality temporal features.…”
Section: Performance Comparison
Citation type: mentioning
confidence: 99%
“…A recurrent neural network (RNN) was trained on the same training dataset and compared to the ANN. An RNN can easily map sequences to sequences whenever the alignment between the inputs and outputs is known ahead of time [33]. An RNN with an identical architecture of 12 input layers, 12 output layers, and 12 hidden layers was trained.…”
Section: B. Machine Learning for Regression Problem
Citation type: mentioning
confidence: 99%
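To make the cited setup concrete, here is a rough sketch of an RNN used for a regression problem in PyTorch; the 12/12/12 sizes echo the excerpt above, while the sequence length, batch size, and loss function are assumptions added for illustration.

# Rough sketch of an RNN regressor echoing the excerpt above (12 inputs, 12 hidden units,
# 12 outputs). Sequence length, batch size, and the MSE loss are illustrative assumptions.
import torch
import torch.nn as nn

rnn = nn.RNN(input_size=12, hidden_size=12, batch_first=True)
head = nn.Linear(12, 12)                  # map each hidden state to 12 regression outputs

x = torch.randn(4, 20, 12)                # (batch, time steps, 12 input features)
hidden_seq, _ = rnn(x)                    # hidden_seq: (batch, time steps, 12)
y_pred = head(hidden_seq)                 # one 12-dimensional prediction per time step
loss = nn.functional.mse_loss(y_pred, torch.randn_like(y_pred))   # dummy targets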
“…One of them is referred to as a keyframe, so a single timestep feeds several time-series items into the RNN over the same period. LSTM uses gate structures to control information selectively, thereby alleviating the vanishing-gradient defect of the traditional RNN (Wang et al., 2020). Among these, the input gate and output gate control the input and output of the unit, respectively, while the forget gate controls the state of the unit.…”
Section: Long Short-Term Memory
Citation type: mentioning
confidence: 99%
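The gate mechanism described in this excerpt can be spelled out step by step; the sketch below is one step of a generic LSTM cell (not the paper's NEWLSTM variant), with weight shapes chosen purely for illustration.

# One step of a generic LSTM cell, making the input, forget, and output gates explicit.
# Weight shapes are illustrative; this is the textbook formulation, not the NEWLSTM variant.
import torch

def lstm_cell_step(x, h_prev, c_prev, W, U, b):
    # x: (batch, d_in); h_prev, c_prev: (batch, d_hid)
    # W: (4*d_hid, d_in); U: (4*d_hid, d_hid); b: (4*d_hid,)
    gates = x @ W.T + h_prev @ U.T + b
    i, f, o, g = gates.chunk(4, dim=-1)
    i, f, o = torch.sigmoid(i), torch.sigmoid(f), torch.sigmoid(o)   # gate activations in (0, 1)
    g = torch.tanh(g)                      # candidate cell update
    c = f * c_prev + i * g                 # forget gate scales the old state, input gate admits the new
    h = o * torch.tanh(c)                  # output gate controls what leaves the cell
    return h, c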