Proceedings of the 2016 Conference on Empirical Methods in Natural Language Processing (EMNLP 2016)
DOI: 10.18653/v1/d16-1123

Convolutional Neural Network Language Models

Abstract: Convolutional Neural Networks (CNNs) have been shown to yield very strong results in several Computer Vision tasks. Their application to language has received much less attention, and it has mainly focused on static classification tasks, such as sentence classification for Sentiment Analysis or relation extraction. In this work, we study the application of CNNs to language modeling, a dynamic, sequential prediction task that requires models to capture local as well as long-range dependency information. Our contributio…
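To make the modeling setup concrete, below is a minimal sketch of a convolutional language model that convolves over the embeddings of the preceding words and predicts the next word. It assumes PyTorch; the layer sizes, context length, pooling choice, and single-convolution design are illustrative assumptions, not the authors' exact architecture.

```python
# Minimal sketch of a convolutional language model (illustrative, not the
# paper's exact configuration).
import torch
import torch.nn as nn


class ConvLM(nn.Module):
    def __init__(self, vocab_size=10_000, emb_dim=128, n_filters=256,
                 kernel_size=3):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, emb_dim)
        # Filters slide over the time axis of the context-word embeddings,
        # acting as learned n-gram detectors shared across positions.
        self.conv = nn.Conv1d(emb_dim, n_filters, kernel_size)
        self.out = nn.Linear(n_filters, vocab_size)

    def forward(self, ctx_ids):
        # ctx_ids: (batch, context_len) indices of the preceding words
        x = self.embed(ctx_ids)          # (batch, context_len, emb_dim)
        x = x.transpose(1, 2)            # (batch, emb_dim, context_len)
        h = torch.relu(self.conv(x))     # (batch, n_filters, context_len - k + 1)
        h = h.max(dim=2).values          # max-pool over time
        return self.out(h)               # logits for the next word


# Usage: predict the next word from a window of 16 preceding word ids.
model = ConvLM()
ctx = torch.randint(0, 10_000, (4, 16))   # batch of 4 contexts
next_word_logits = model(ctx)             # shape (4, 10000)
```

Because the convolution shares its filters across all positions in the context window, the same local-pattern detectors apply anywhere in the history, which is what distinguishes this setup from a plain feed-forward n-gram language model.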

Cited by 35 publications (28 citation statements). References 20 publications.
“…Other methods for analyzing NLP models include (i) inspecting the mechanisms a model uses to encode information, e.g. attention weights (Voita et al., 2018; Raganato and Tiedemann, 2018; Voita et al., 2019b; Clark et al., 2019; Kovaleva et al., 2019) or individual neurons (Karpathy et al., 2015; Pham et al., 2016; Bau et al., 2019), (ii) looking at model predictions using manually defined templates, either evaluating sensitivity to specific grammatical errors (Linzen et al., 2016; Gulordava et al., 2018; Tran et al., 2018; Marvin and Linzen, 2018) or understanding what language models know when applying them as knowledge bases or in QA settings (Radford et al., 2019; Petroni et al., 2019; Poerner et al., 2019; Jiang et al., 2019). An information-theoretic view on the analysis of NLP models was previously attempted in Voita et al. (2019a) when explaining how representations in the Transformer evolve between layers under different training objectives.…”
Section: Related Work
confidence: 99%
“…Meng et al. (2015) and Tu et al. (2015) applied convolutional models to score phrase pairs of traditional phrase-based and dependency-based translation models. Convolutional architectures have also been successful in language modeling but have so far failed to outperform LSTMs (Pham et al., 2016).…”
Section: Related Work
confidence: 99%
“…Pham et al. [20] used a CNN as a language model built on a feed-forward neural network (FFNN). Experimental results demonstrated that the CNN language model performs better than a standard FFNN language model.…”
Section: Background and Prior Research
confidence: 99%