2018
DOI: 10.1016/j.eswa.2018.06.022

A comparative evaluation of pre-processing techniques and their interactions for twitter sentiment analysis

citations
Cited by 180 publications
(134 citation statements)
references
References 24 publications
“…The various benchmark datasets used in the past decade were WePS-3 [27], SemEval [30,52,54,55,73,75,76,85], tweets prepared by Stanford University [34,45,46,75], SNAP [40], Sanders Twitter Sentiment Corpus (denoted as Sanders) [44,55,75,79], 2008 Presidential Debate Corpus [44,75,79], Sentiment140 [51], RepLab 2012 [53], RepLab 2013 [53], STS-manual [55], Gold Standard personality labeled Twitter dataset [59], Cleveland Heart Disease data [69], STS-Gold [73] [Figure 6: Distribution of papers in accordance to the digital libraries (expressed in percentages)] Many reported researches were carried on the tweets fetched directly from Twitter using its API. The tweets were from a variety of domains, topics and time period (referred as topic specific/topic oriented tweets).…”
Section: Widely Used Datasets and Domains In Which The Studies For
mentioning
confidence: 99%
“…Accuracy (A): It is defined as proximity of a measurement to its true value. It is calculated as a proportion of TP and true negatives (TN) among total inspected cases [17,29,40,43-45,48-50,52,55,57,60,64,66,68-70,72-75,77-79,85,81-83].…”
mentioning
confidence: 99%
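The accuracy definition quoted above can be illustrated with a minimal sketch. It assumes that "total inspected cases" means the sum of true positives, true negatives, false positives, and false negatives; the function name `accuracy` and the `fp`/`fn` parameters are illustrative and do not appear in the quoted statement.

```python
def accuracy(tp: int, tn: int, fp: int, fn: int) -> float:
    """Proportion of correctly classified cases (TP + TN) among all inspected cases.

    Assumption: the inspected cases are exactly TP + TN + FP + FN.
    """
    total = tp + tn + fp + fn
    if total == 0:
        raise ValueError("accuracy is undefined when no cases were inspected")
    return (tp + tn) / total


# Hypothetical confusion-matrix counts, chosen only for illustration.
print(accuracy(tp=40, tn=45, fp=5, fn=10))
```

With these made-up counts, 85 of the 100 inspected cases are classified correctly.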
“…1. Pre-processing: the procedure of cleaning and preparing the texts that will be classified [16]. It also aims to reduce the volume of data [12,17].…”
Section: Sentiment Analysis
unclassified
“…Some pre-processing techniques include removing symbols and non-textual characters (characteristic of unstructured text), expanding abbreviations, replacing contractions, removing numbers, removing stopwords (prepositions, articles, and connectives that link one word to another and carry no meaning in the sentence [17]), and reducing each word to its stem (stemming) [18], thereby reducing the variations of the same word (plural, gerund, inflected verbs, augmentative, diminutive, nouns, among others). [20,9], Linear discriminant analysis (LDA) [9,21], Naïve Bayes (NB) [16,22,23], Random Forest [24,25], k-nearest neighbors (KNN) [22], Multi-layer Perceptron (MLP) [16,26,13]. Sohrabi and Hemmatian [12] proposed a system that uses SVM and ANN for polarity recognition.…”
Section: Sentiment Analysis
unclassified
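The pre-processing steps listed in the statement above could be chained roughly as in the sketch below. It is not the cited authors' pipeline: the lookup tables (`CONTRACTIONS`, `ABBREVIATIONS`, `STOPWORDS`), the suffix list, and the helper `naive_stem` are all hypothetical and deliberately tiny, and the suffix stripper is only a stand-in for a real stemming algorithm.

```python
import re
from typing import List

# Hypothetical, deliberately tiny lookup tables for illustration only.
CONTRACTIONS = {"can't": "cannot", "won't": "will not", "it's": "it is"}
ABBREVIATIONS = {"u": "you", "gr8": "great", "thx": "thanks"}
STOPWORDS = {"a", "an", "the", "of", "to", "and", "in", "on", "is", "it", "i"}
SUFFIXES = ("ing", "ed", "es", "s")  # naive stand-in for a real stemmer


def naive_stem(word: str) -> str:
    """Strip at most one common suffix, keeping at least three characters."""
    for suffix in SUFFIXES:
        if word.endswith(suffix) and len(word) - len(suffix) >= 3:
            return word[: -len(suffix)]
    return word


def preprocess(text: str) -> List[str]:
    # Remove symbols and non-textual characters: keep only runs of
    # lowercase letters, digits, and apostrophes (needed for contractions).
    tokens = re.findall(r"[a-z0-9']+", text.lower())
    expanded: List[str] = []
    for token in tokens:
        token = token.strip("'")
        # Replace contractions, then expand abbreviations.
        token = CONTRACTIONS.get(token, token)
        token = ABBREVIATIONS.get(token, token)
        # An expansion may yield several words; an empty token yields none.
        expanded.extend(token.split())
    # Remove pure numbers and stopwords.
    kept = [t for t in expanded if not t.isdigit() and t not in STOPWORDS]
    # Reduce each remaining word to a crude stem.
    return [naive_stem(t) for t in kept]


print(preprocess("I can't wait 4 the 2 new movies, thx!!! #excited"))
```

Abbreviations are expanded before numbers are removed so that a token such as "gr8" is mapped to a word rather than dropped. In practice the tables and the stemmer would be replaced by language-specific resources.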