Proceedings of the 56th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers) 2018
DOI: 10.18653/v1/p18-1041

Baseline Needs More Love: On Simple Word-Embedding-Based Models and Associated Pooling Mechanisms

Abstract: Many deep learning architectures have been proposed to model the compositionality in text sequences, requiring a substantial number of parameters and expensive computations. However, there has not been a rigorous evaluation regarding the added value of sophisticated compositional functions. In this paper, we conduct a point-by-point comparative study between Simple Word-Embedding-based Models (SWEMs), consisting of parameter-free pooling operations, relative to word-embedding-based RNN/CNN models. Surprisingly,…
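To make the parameter-free pooling concrete, the following is a minimal sketch (not the authors' released code) of average and max pooling over pretrained word embeddings; the shapes, function names, and random toy document are illustrative assumptions.

```python
# Minimal sketch of SWEM-style parameter-free pooling.
# `doc` stands in for a sequence of pretrained word embeddings
# (e.g. GloVe vectors looked up for each token); values are random here.
import numpy as np

def swem_aver(word_vectors: np.ndarray) -> np.ndarray:
    """Average pooling over the sequence axis: one d-dim vector per text."""
    return word_vectors.mean(axis=0)

def swem_max(word_vectors: np.ndarray) -> np.ndarray:
    """Max pooling: keep the largest value of each embedding dimension."""
    return word_vectors.max(axis=0)

doc = np.random.randn(5, 300)      # toy "document": 5 tokens, 300-dim embeddings
aver_repr = swem_aver(doc)         # shape (300,)
max_repr = swem_max(doc)           # shape (300,)
```

Because neither operation has trainable parameters, producing the document representation costs only the embedding lookup plus a reduction over the sequence axis, which is the contrast with RNN/CNN compositional functions drawn in the abstract.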

Cited by 275 publications (182 citation statements) | References 30 publications

Citation statements (ordered by relevance):
“…• SWEM [12] is a simple word embedding model that has been shown to be effective in text classification approaches. We adapt the SWEM model for toxin classification.…”
Section: Methods For Comparison
confidence: 99%
“…Surprisingly, deep learning methods have been proposed and successfully applied to text classification. Mikolov et al. [17] focus on models based on word embeddings, and recently Shen et al. [21] conducted a comparative study of Simple Word-Embedding-based Models, which shows the effectiveness of word embeddings. At the same time, the principles of deep learning models such as CNN [15] and RNN [10] are employed for text classification.…”
Section: Related Work
confidence: 99%
“…Additionally, we explore the effectiveness of our dual-attention GCN by comparing the results with our own model variants, and experiment on the hop K to determine what value is appropriate.

Dataset    Train   Words   Test   Nodes   Classes
20ng       11314   42757   7532   61603   20
mr          7108   18764   3554   29426    2
ohsumed     3357   14157   4043   21557   23
R52         6532    8892   2568   17992   52
R8          5485    7688   2189   15362    8

We compare our proposed dual-attention GCN with multiple state-of-the-art text classification and embedding methods by following , including TF-IDF+LR [26], CNN [12], LSTM [16], Bi-LSTM, PV-DBOW [14], PV-DM [14], PTE [22], fastText [11], SWEM [21], LEAM [19], Graph-CNN-C [6], Graph-CNN-S [4], Graph-CNN-F [9], and TextGCN. TF-IDF+LR is the bag-of-words model with term frequency-inverse document frequency weights and a Logistic Regression classifier.…”
Section: Datasets
confidence: 99%
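For reference, the TF-IDF+LR baseline named in the quote above can be sketched with scikit-learn as below; the toy texts and labels are placeholders, not the 20ng/mr/ohsumed/R52/R8 data.

```python
# Sketch of a TF-IDF + Logistic Regression text-classification baseline.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

train_texts = ["the movie was great", "terrible plot and acting"]   # toy data
train_labels = [1, 0]

clf = make_pipeline(
    TfidfVectorizer(),                   # bag-of-words with TF-IDF weights
    LogisticRegression(max_iter=1000),   # linear classifier on top
)
clf.fit(train_texts, train_labels)
print(clf.predict(["great acting"]))
```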
“…• SWEM-concat (Shen et al., 2018): This model is based on a neural network model with simple pooling operations (i.e., average and max pooling) over pretrained word embeddings. Despite its simplicity, it outperformed many neural network-based models such as the word-based CNN model (Kim, 2014) and the RNN model with LSTM units (Shen et al., 2018).…”
Section: Baselines
confidence: 99%
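As a companion to the quote above, here is a rough sketch of what SWEM-concat computes: the average- and max-pooled embedding vectors are concatenated into a single document feature. The linear layer on top is an illustrative assumption, not necessarily the exact classifier of Shen et al. (2018).

```python
# Sketch of the SWEM-concat variant: concatenate average- and max-pooled
# embeddings, then feed the result to a (hypothetical) linear classifier.
import numpy as np

def swem_concat(word_vectors: np.ndarray) -> np.ndarray:
    """Concatenate average and max pooling -> a 2*d-dim document vector."""
    return np.concatenate([word_vectors.mean(axis=0), word_vectors.max(axis=0)])

d = 300
doc = np.random.randn(7, d)          # toy document: 7 tokens, d-dim embeddings
features = swem_concat(doc)          # shape (2*d,)

# Illustrative linear layer producing class logits from the pooled features.
num_classes = 2
W, b = np.random.randn(num_classes, 2 * d), np.zeros(num_classes)
logits = W @ features + b
```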