Proceedings of the 2015 Conference on Empirical Methods in Natural Language Processing
DOI: 10.18653/v1/d15-1280

Multi-Timescale Long Short-Term Memory Neural Network for Modelling Sentences and Documents

Abstract: Neural network based methods have made great progress on a variety of natural language processing tasks. However, it remains a challenging task to model long texts, such as sentences and documents. In this paper, we propose a multi-timescale long short-term memory (MT-LSTM) neural network to model long texts. MT-LSTM partitions the hidden states of the standard LSTM into several groups. Each group is activated at different time periods. Thus, MT-LSTM can model very long documents as well as short sentences.…
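A minimal sketch of the grouping idea the abstract describes, assuming a NumPy implementation; the group count, the update periods, and the plain-LSTM gate formulation below are illustrative assumptions rather than the paper's exact equations.

import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def mt_lstm_step(x, h_prev, c_prev, W, U, b, periods, t):
    """One step of a multi-timescale LSTM sketch.

    The hidden and cell states are split into len(periods) equal groups;
    group k is overwritten only when t is a multiple of periods[k],
    otherwise it keeps its previous value (illustrative assumption,
    not necessarily the paper's exact update rule).
    """
    H = h_prev.shape[0]
    gs = H // len(periods)            # group size (assumes H divisible by the group count)

    # Standard LSTM pre-activations over the full state.
    z = W @ x + U @ h_prev + b        # shape (4H,)
    i = sigmoid(z[0:H])
    f = sigmoid(z[H:2*H])
    o = sigmoid(z[2*H:3*H])
    g = np.tanh(z[3*H:4*H])

    c_full = f * c_prev + i * g
    h_full = o * np.tanh(c_full)

    # Only overwrite the groups scheduled to fire at step t.
    c, h = c_prev.copy(), h_prev.copy()
    for k, period in enumerate(periods):
        if t % period == 0:
            sl = slice(k * gs, (k + 1) * gs)
            c[sl] = c_full[sl]
            h[sl] = h_full[sl]
    return h, c

# Example: hidden size 8 split into 4 groups with assumed periods 1, 2, 4, 8.
rng = np.random.default_rng(0)
D, H = 5, 8
W, U, b = rng.normal(size=(4*H, D)), rng.normal(size=(4*H, H)), np.zeros(4*H)
h, c = np.zeros(H), np.zeros(H)
for t, x in enumerate(rng.normal(size=(10, D)), start=1):
    h, c = mt_lstm_step(x, h, c, W, U, b, periods=[1, 2, 4, 8], t=t)

At each step, only the groups whose period divides t are overwritten; the remaining groups carry their previous state forward, which lets slow groups retain information over long documents while fast groups track local context.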

Cited by 151 publications (86 citation statements) | References 17 publications
“…There also exist some studies on RNN based methods that draw insight from modeling short-term and long-term contexts, e.g., Multi-timescale RNN [43] [19] and Clockwork RNN [16]. Based on a hierarchical RNN structure [9], these methods model short-term dependencies and long-term dependencies separately with multiple RNNs.…”
Section: Neural Network Based Methods (mentioning)
confidence: 99%
“…However, these RNN structures aim to better capture long-term dependencies in sequences by incorporating larger timescales in some of the many RNNs. Although they can indeed achieve better performance compared with conventional structures in some applications [43] [16] [19], they still model input elements in sequential order within an RNN structure. Accordingly, they cannot overcome the drawback of RNNs that temporal dependency changes monotonically.…”
Section: Neural Network Based Methods (mentioning)
confidence: 99%
“…PV slightly surpasses MTLE on IMDB (91.7 against 91.3), as sentences from IMDB are much longer than those in SST and MDSD, which requires stronger long-term dependency learning. In this paper, we mainly focus on the idea and effects of integrating label embedding with multi-task learning, so we simply apply (Graves 2013) to realize Le I and Le L, which can be further implemented with other, more effective sentence learning models (Liu et al 2015a; Chen et al 2015) to produce better performance.…”
Section: Comparisons With State-of-the-art Models (mentioning)
confidence: 99%
“…The other one tries to use multiple time-scales to distinguish different states (El Hihi and Bengio, 1995; Koutnik et al., 2014; Liu et al., 2015). They partition the hidden states into several groups and each group is activated and updated at different frequencies (e.g.
Section: Memory Augmented Recurrent Models (mentioning)
confidence: 99%
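The statement above describes groups that are activated and updated at different frequencies. The snippet below sketches one common scheduling choice, exponentially increasing periods as in Clockwork RNN; the specific periods are an assumption, not necessarily those used by MT-LSTM.

# Illustrative schedule: group g fires every 2**g steps (an assumption;
# the cited models may choose periods differently).
periods = [2 ** g for g in range(4)]      # [1, 2, 4, 8]

for t in range(1, 9):
    active = [g for g, p in enumerate(periods) if t % p == 0]
    print(f"t={t}: update groups {active}")

Fast groups (short periods) update at every step and capture short-term dependencies, while slow groups change rarely and can preserve information across long spans.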