Proceedings of the 2015 Conference on Empirical Methods in Natural Language Processing
DOI: 10.18653/v1/d15-1280

Multi-Timescale Long Short-Term Memory Neural Network for Modelling Sentences and Documents

Abstract: Neural network based methods have made great progress on a variety of natural language processing tasks. However, it remains a challenging task to model long texts, such as sentences and documents. In this paper, we propose a multi-timescale long short-term memory (MT-LSTM) neural network to model long texts. MT-LSTM partitions the hidden states of the standard LSTM into several groups. Each group is activated at different time periods. Thus, MT-LSTM can model very long documents as well as short sentences.…
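A minimal sketch of the grouping idea the abstract describes, assuming a NumPy implementation; the group count, the update periods, and the plain-LSTM gate formulation below are illustrative assumptions rather than the paper's exact equations.

import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def mt_lstm_step(x, h_prev, c_prev, W, U, b, periods, t):
    """One step of a multi-timescale LSTM sketch.

    The hidden and cell states are split into len(periods) equal groups;
    group k is overwritten only when t is a multiple of periods[k],
    otherwise it keeps its previous value (illustrative assumption,
    not necessarily the paper's exact update rule).
    """
    H = h_prev.shape[0]
    gs = H // len(periods)            # group size (assumes H divisible by the group count)

    # Standard LSTM pre-activations over the full state.
    z = W @ x + U @ h_prev + b        # shape (4H,)
    i = sigmoid(z[0:H])
    f = sigmoid(z[H:2*H])
    o = sigmoid(z[2*H:3*H])
    g = np.tanh(z[3*H:4*H])

    c_full = f * c_prev + i * g
    h_full = o * np.tanh(c_full)

    # Only overwrite the groups scheduled to fire at step t.
    c, h = c_prev.copy(), h_prev.copy()
    for k, period in enumerate(periods):
        if t % period == 0:
            sl = slice(k * gs, (k + 1) * gs)
            c[sl] = c_full[sl]
            h[sl] = h_full[sl]
    return h, c

# Example: hidden size 8 split into 4 groups with assumed periods 1, 2, 4, 8.
rng = np.random.default_rng(0)
D, H = 5, 8
W, U, b = rng.normal(size=(4*H, D)), rng.normal(size=(4*H, H)), np.zeros(4*H)
h, c = np.zeros(H), np.zeros(H)
for t, x in enumerate(rng.normal(size=(10, D)), start=1):
    h, c = mt_lstm_step(x, h, c, W, U, b, periods=[1, 2, 4, 8], t=t)

At each step, only the groups whose period divides t are overwritten; the remaining groups carry their previous state forward, which lets slow groups retain information over long documents while fast groups track local context.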

Cited by 151 publications (86 citation statements) | References 17 publications
“…There also exist some studies on RNN based methods that draw insight from modeling short-term and long-term contexts, e.g., Multi-timescale RNN [43] [19] and Clockwork RNN [16]. Based on a hierarchical RNN structure [9], these methods model short-term dependencies and long-term dependencies separately with multiple RNNs.…”
Section: Neural Network Based Methods (mentioning)
confidence: 99%
“…However, these RNN structures aim to better capture long-term dependencies in sequences by incorporating larger timescales in some of the many RNNs. Although they can indeed achieve better performance compared with conventional structures in some applications [43] [16] [19], they still model input elements in sequential order within an RNN structure. Accordingly, they cannot overcome the drawback of RNNs that temporal dependency changes monotonically.…”
Section: Neural Network Based Methods (mentioning)
confidence: 99%
“…PV slightly surpasses MTLE on IMDB (91.7 against 91.3), as sentences from IMDB are much longer than those in SST and MDSD, which requires stronger long-term dependency learning. In this paper, we mainly focus on the idea and effects of integrating label embedding with multi-task learning, so we simply apply (Graves 2013) to realize Le I and Le L, which can be further implemented with other, more effective sentence learning models (Liu et al 2015a; Chen et al 2015) to produce better performance.…”
Section: Comparisons With State-of-the-art Models (mentioning)
confidence: 99%
“…The other one tries to use multiple time-scales to distinguish different states (El Hihi and Bengio, 1995; Koutnik et al., 2014; Liu et al., 2015). They partition the hidden states into several groups and each group is activated and updated at different frequencies (e.g.
Section: Memory Augmented Recurrent Models (mentioning)
confidence: 99%
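The statement above describes groups that are activated and updated at different frequencies. The snippet below sketches one common scheduling choice, exponentially increasing periods as in Clockwork RNN; the specific periods are an assumption, not necessarily those used by MT-LSTM.

# Illustrative schedule: group g fires every 2**g steps (an assumption;
# the cited models may choose periods differently).
periods = [2 ** g for g in range(4)]      # [1, 2, 4, 8]

for t in range(1, 9):
    active = [g for g, p in enumerate(periods) if t % p == 0]
    print(f"t={t}: update groups {active}")

Fast groups (short periods) update at every step and capture short-term dependencies, while slow groups change rarely and can preserve information across long spans.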