Proceedings of the 56th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers) 2018
DOI: 10.18653/v1/p18-1041

Baseline Needs More Love: On Simple Word-Embedding-Based Models and Associated Pooling Mechanisms

Abstract: Many deep learning architectures have been proposed to model the compositionality in text sequences, requiring a substantial number of parameters and expensive computations. However, there has not been a rigorous evaluation regarding the added value of sophisticated compositional functions. In this paper, we conduct a point-by-point comparative study between Simple Word-Embedding-based Models (SWEMs), consisting of parameter-free pooling operations, relative to word-embedding-based RNN/CNN models. Surprisingly,…
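To make the parameter-free pooling concrete, the following is a minimal sketch (not the authors' released code) of average and max pooling over pretrained word embeddings; the shapes, function names, and random toy document are illustrative assumptions.

```python
# Minimal sketch of SWEM-style parameter-free pooling.
# `doc` stands in for a sequence of pretrained word embeddings
# (e.g. GloVe vectors looked up for each token); values are random here.
import numpy as np

def swem_aver(word_vectors: np.ndarray) -> np.ndarray:
    """Average pooling over the sequence axis: one d-dim vector per text."""
    return word_vectors.mean(axis=0)

def swem_max(word_vectors: np.ndarray) -> np.ndarray:
    """Max pooling: keep the largest value of each embedding dimension."""
    return word_vectors.max(axis=0)

doc = np.random.randn(5, 300)      # toy "document": 5 tokens, 300-dim embeddings
aver_repr = swem_aver(doc)         # shape (300,)
max_repr = swem_max(doc)           # shape (300,)
```

Because neither operation has trainable parameters, producing the document representation costs only the embedding lookup plus a reduction over the sequence axis, which is the contrast with RNN/CNN compositional functions drawn in the abstract.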

Cited by 275 publications (182 citation statements) | References 30 publications

Citation statements (ordered by relevance):
“…• SWEM [12] is a simple word embedding model that has been shown to be effective in text classification approaches. We adapt the SWEM model for toxin classification.…”
Section: Methods For Comparison
confidence: 99%
“…Surprisingly, deep learning methods have been proposed and successfully applied to text classification. Mikolov et al. [17] focus on models based on word embeddings, and recently Shen et al. [21] conducted a comparative study of Simple Word-Embedding-based Models, which shows the effectiveness of word embeddings. At the same time, the principles of deep learning models such as CNN [15] and RNN [10] are employed for text classification.…”
Section: Related Work
confidence: 99%
“…Additionally, we explore the effectiveness of our dual-attention GCN by comparing the results with our own model variants, and experiment on the hop K to determine what value is appropriate.

Dataset    Train   Words   Test   Nodes   Classes
20ng       11314   42757   7532   61603   20
mr          7108   18764   3554   29426    2
ohsumed     3357   14157   4043   21557   23
R52         6532    8892   2568   17992   52
R8          5485    7688   2189   15362    8

We compare our proposed dual-attention GCN with multiple state-of-the-art text classification and embedding methods by following , including TF-IDF+LR [26], CNN [12], LSTM [16], Bi-LSTM, PV-DBOW [14], PV-DM [14], PTE [22], fastText [11], SWEM [21], LEAM [19], Graph-CNN-C [6], Graph-CNN-S [4], Graph-CNN-F [9], and TextGCN. TF-IDF+LR is the bag-of-words model with term frequency-inverse document frequency weights and a Logistic Regression classifier.…”
Section: Datasets
confidence: 99%
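For reference, the TF-IDF+LR baseline named in the quote above can be sketched with scikit-learn as below; the toy texts and labels are placeholders, not the 20ng/mr/ohsumed/R52/R8 data.

```python
# Sketch of a TF-IDF + Logistic Regression text-classification baseline.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

train_texts = ["the movie was great", "terrible plot and acting"]   # toy data
train_labels = [1, 0]

clf = make_pipeline(
    TfidfVectorizer(),                   # bag-of-words with TF-IDF weights
    LogisticRegression(max_iter=1000),   # linear classifier on top
)
clf.fit(train_texts, train_labels)
print(clf.predict(["great acting"]))
```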
“…• SWEM-concat (Shen et al., 2018): This model is based on a neural network model with simple pooling operations (i.e., average and max pooling) over pretrained word embeddings. Despite its simplicity, it outperformed many neural network-based models such as the word-based CNN model (Kim, 2014) and the RNN model with LSTM units (Shen et al., 2018).…”
Section: Baselines
confidence: 99%
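As a companion to the quote above, here is a rough sketch of what SWEM-concat computes: the average- and max-pooled embedding vectors are concatenated into a single document feature. The linear layer on top is an illustrative assumption, not necessarily the exact classifier of Shen et al. (2018).

```python
# Sketch of the SWEM-concat variant: concatenate average- and max-pooled
# embeddings, then feed the result to a (hypothetical) linear classifier.
import numpy as np

def swem_concat(word_vectors: np.ndarray) -> np.ndarray:
    """Concatenate average and max pooling -> a 2*d-dim document vector."""
    return np.concatenate([word_vectors.mean(axis=0), word_vectors.max(axis=0)])

d = 300
doc = np.random.randn(7, d)          # toy document: 7 tokens, d-dim embeddings
features = swem_concat(doc)          # shape (2*d,)

# Illustrative linear layer producing class logits from the pooled features.
num_classes = 2
W, b = np.random.randn(num_classes, 2 * d), np.zeros(num_classes)
logits = W @ features + b
```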