2019
DOI: 10.1371/journal.pone.0220976

Word2vec convolutional neural networks for classification of news articles and tweets

Abstract: Big web data from sources such as online news and Twitter are good resources for investigating deep learning. However, collected news articles and tweets almost certainly contain data that is unnecessary for learning and that hinders accurate learning. This paper explores the performance of word2vec Convolutional Neural Networks (CNNs) in classifying news articles and tweets as related or unrelated. Using the two word-embedding algorithms of word2vec, Continuous Bag-of-Words (CBOW) and Skip-gram, we constructed CN…
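The abstract outlines the general word2vec-CNN pipeline: embed tokens with CBOW or Skip-gram, then convolve over the embedded sequence to classify documents as related or unrelated. Below is a minimal sketch of that pipeline, assuming gensim and PyTorch; the toy documents, labels, and hyperparameters are hypothetical and not the authors' setup.

```python
import numpy as np
import torch
import torch.nn as nn
from gensim.models import Word2Vec

# Hypothetical toy corpus standing in for crawled news/tweets;
# label 1 = "related", 0 = "unrelated".
docs = [
    ["fed", "raises", "interest", "rates", "again"],
    ["team", "wins", "championship", "game"],
    ["stocks", "fall", "after", "rate", "decision"],
    ["new", "movie", "breaks", "box", "office", "record"],
]
labels = torch.tensor([1.0, 0.0, 1.0, 0.0])

EMBED_DIM, MAX_LEN = 50, 8

# Train word2vec embeddings; sg=1 selects Skip-gram, sg=0 would be CBOW.
w2v = Word2Vec(docs, vector_size=EMBED_DIM, window=2, min_count=1, sg=1)

vocab = {w: i + 1 for i, w in enumerate(w2v.wv.index_to_key)}  # 0 = padding
emb = np.zeros((len(vocab) + 1, EMBED_DIM), dtype=np.float32)
for w, i in vocab.items():
    emb[i] = w2v.wv[w]

def encode(doc):
    ids = [vocab[w] for w in doc][:MAX_LEN]
    return ids + [0] * (MAX_LEN - len(ids))  # right-pad to MAX_LEN

X = torch.tensor([encode(d) for d in docs])

class W2VCNN(nn.Module):
    def __init__(self):
        super().__init__()
        # Frozen pretrained word2vec embeddings feed a 1-D convolution.
        self.embed = nn.Embedding.from_pretrained(torch.from_numpy(emb),
                                                  freeze=True)
        self.conv = nn.Conv1d(EMBED_DIM, 32, kernel_size=3)
        self.fc = nn.Linear(32, 1)

    def forward(self, x):
        e = self.embed(x).permute(0, 2, 1)               # (batch, dim, seq)
        h = torch.relu(self.conv(e)).max(dim=2).values   # global max-pool
        return self.fc(h).squeeze(1)                     # one logit per doc

model = W2VCNN()
opt = torch.optim.Adam(model.parameters(), lr=1e-3)
loss_fn = nn.BCEWithLogitsLoss()
for _ in range(20):  # tiny illustrative training loop
    opt.zero_grad()
    loss = loss_fn(model(X), labels)
    loss.backward()
    opt.step()
```

Freezing the pretrained embeddings (freeze=True) keeps the word2vec geometry fixed while the convolutional filters learn; fine-tuning them instead is the usual alternative.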

Cited by 139 publications (70 citation statements)
References 48 publications
“…The SG model was considered in this study based on its suitability for small- to medium-sized datasets. Jang et al [36] stated that the SG model is advantageous over CBOW when the data size is not too large.…”
Section: Methods
confidence: 99%
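For readers who want to see the CBOW/Skip-gram choice this citing paper refers to, gensim exposes it as a single flag; the sentences below are made-up stand-ins, so this only illustrates the mechanics, not a result.

```python
from gensim.models import Word2Vec

sentences = [["breaking", "news", "today"],
             ["tweet", "about", "news"],
             ["sports", "score", "update"]]

# sg=1 selects Skip-gram, the variant the citing paper favors for
# small- to medium-sized corpora; sg=0 (the default) selects CBOW.
skipgram = Word2Vec(sentences, vector_size=50, window=2, min_count=1, sg=1)
cbow = Word2Vec(sentences, vector_size=50, window=2, min_count=1, sg=0)

print(skipgram.wv.most_similar("news", topn=2))
```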
“…On the other hand, many Natural Language Processing (NLP) studies with deep learning models have included learning word vector representations. The word vectors are represented in a dense form known as word embedding, in which words that are semantically and syntactically related are close to each other in the embedding space [13, 38–40]. Word embedding has been used efficiently in many NLP tasks [41].…”
Section: Related Work
confidence: 99%
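The "close in the embedding space" property this statement describes is usually measured with cosine similarity. A small sketch follows, again with gensim and an invented corpus (far too tiny for the similarities to be meaningful, but enough to show the mechanics):

```python
import numpy as np
from gensim.models import Word2Vec

corpus = [["king", "rules", "the", "kingdom"],
          ["queen", "rules", "the", "kingdom"],
          ["dog", "chases", "the", "ball"]]

model = Word2Vec(corpus, vector_size=50, window=2, min_count=1, sg=1, seed=1)

def cosine(a, b):
    # Cosine similarity: near 1.0 means nearby directions in embedding space.
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

# Words sharing contexts ("king"/"queen") should end up closer than
# unrelated pairs ("king"/"ball") once trained on a realistic corpus.
print(cosine(model.wv["king"], model.wv["queen"]))
print(cosine(model.wv["king"], model.wv["ball"]))
```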
“…As previously mentioned, it was proven by Mikolov et al [63] that the performance of supervised methods, which rely on gold annotated treebanks, decays dramatically when applied to other domains or other languages. Mikolov et al [63] also mentioned that the generation of distributed representations of textual units would be adopted in NLP owing to the improvements they provide to different NLP tasks (e.g., the text classification task by Jang et al [64]).…”
Section: PLOS ONE
confidence: 99%