Proceedings of the 54th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers) 2016
DOI: 10.18653/v1/p16-1125

Larger-Context Language Modelling with Recurrent Neural Network

Abstract: In this work, we propose a novel method to incorporate corpus-level discourse information into language modelling. We call this the larger-context language model. We introduce a late fusion approach to a recurrent language model based on long short-term memory units (LSTM), which helps the LSTM unit keep intra-sentence dependencies and inter-sentence dependencies separate from each other. Through the evaluation on four corpora (IMDB, BBC, Penn TreeBank, and Fil9), we demonstrate that the proposed model improves per…
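The late-fusion idea summarized in the abstract can be sketched roughly as follows. This is a minimal PyTorch sketch under assumed shapes: the gating form, the class name LateFusionLM, and how the preceding-sentence context vector is produced are illustrative assumptions, not the paper's exact equations.

```python
import torch
import torch.nn as nn

class LateFusionLM(nn.Module):
    def __init__(self, vocab_size, emb_dim=128, hid_dim=256, ctx_dim=256):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, emb_dim)
        self.cell = nn.LSTMCell(emb_dim, hid_dim)               # intra-sentence recurrence
        self.ctx_gate = nn.Linear(hid_dim + ctx_dim, hid_dim)   # gate controlling inter-sentence flow
        self.ctx_proj = nn.Linear(ctx_dim, hid_dim)
        self.out = nn.Linear(hid_dim, vocab_size)

    def forward(self, tokens, context):
        """tokens: (batch, T) word ids of the current sentence;
        context: (batch, ctx_dim) summary vector of the preceding sentences (assumed given)."""
        batch, T = tokens.shape
        h = torch.zeros(batch, self.cell.hidden_size, device=tokens.device)
        c = torch.zeros_like(h)
        logits = []
        for t in range(T):
            x = self.embed(tokens[:, t])
            h, c = self.cell(x, (h, c))                         # ordinary LSTM step
            # Late fusion: the inter-sentence context enters only after the memory
            # update, so it does not overwrite the intra-sentence cell state.
            gate = torch.sigmoid(self.ctx_gate(torch.cat([h, context], dim=-1)))
            fused = h + gate * torch.tanh(self.ctx_proj(context))
            logits.append(self.out(fused))
        return torch.stack(logits, dim=1)                       # (batch, T, vocab_size)
```

The point of fusing after the LSTM update, rather than at the input, is that inter-sentence information only modulates the output used for prediction, leaving the cell state free to track intra-sentence dependencies.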

Cited by 50 publications (44 citation statements)
References 35 publications
“…Comparison to Exploiting Auxiliary Contexts in Language Modeling: A thread of work in language modeling (LM) attempts to exploit auxiliary sentence-level or document-level context in an RNN LM (Mikolov and Zweig, 2012; Ji et al., 2015; Wang and Cho, 2016). Independent of our work, Wang and Cho (2016) propose "early fusion" models of RNNs where additional information from an inter-sentence context is "fused" with the input to the RNN. Closely related to Wang and Cho (2016), our approach aims to dynamically control the contributions of required source and target contexts for machine translation, while theirs focuses on integrating auxiliary corpus-level contexts for language modelling to better approximate the corpus-level probability.…”
Section: Related Work
confidence: 99%
“…A larger context language model that incorporates context from preceding sentences (Wang and Cho, 2016), by treating the preceding sentence as a bag of words, and using an attentional mechanism when predicting the next word. An additional hyper-parameter in lclm is the number of preceding sentences to incorporate, which we tune based on a development set (to 4 sentences in each case).…”
Section: lclm
confidence: 99%
“…Finally, we use late fusion (Wang and Cho, 2015) to combine the output of the attention mechanism with the current position in the B-RNN without interfering with its memory.…”
Section: The Neural Network Model
confidence: 99%