ALGA: Adaptive lexicon learning using genetic algorithm for sentiment analysis of microblogs

Keshavarz, Hamidreza; Abadeh, Mohammad Saniee

doi:10.1016/j.knosys.2017.01.028

Cited by 88 publications

(48 citation statements)

References 36 publications

“…The various benchmark datasets used in the past decade were WePS-3 [27], SemEval [30,52,54,55,73,75,76,85], tweets prepared by Stanford University [34,45,46,75], SNAP [40], Sanders Twitter Sentiment Corpus (denoted as Sanders) [44,55,75,79], 2008 Presidential Debate Corpus [44,75,79], Sentiment140 [51], RepLab 2012 [53], RepLab 2013 [53], STS-manual [55], Gold Standard personality labeled Twitter dataset [59], Cleveland Heart Disease data [69], STS-Gold [73] [Figure 6: Distribution of papers in accordance to the digital libraries (expressed in percentages)] Many reported researches were carried on the tweets fetched directly from Twitter using its API. The tweets were from a variety of domains, topics and time period (referred as topic specific/topic oriented tweets).…”

Section: Widely Used Datasets and Domains In Which The Studies For (mentioning)

confidence: 99%

“…The tweets were from a variety of domains, topics and time period (referred as topic specific/topic oriented tweets). These prominently included tweets from or about elite personalities like actors; singers; sportsperson; comedians; politicians, authors, idols; entertainers [28,34,37,40,54,55,67,70,73,75,79] etc, news and commemoratives [17,30,48,58,59,64,67,79], health and fitness [31,56,57,69,74,75,78,79], stock market exchanges [29,34,63,82], companies like AT&T; Amazon;…”

Section: Widely Used Datasets and Domains In Which The Studies For (mentioning)

confidence: 99%

Systematic literature review of sentiment analysis on Twitter using soft computing techniques

Kumar; Jaiswal

2019

Concurrency and Computation

119

Sentiment detection and classification is the latest fad for social analytics on Web. With the array of practical applications in healthcare, finance, media, consumer markets, and government, distilling the voice of public to gain insight to target information and reviews is non-trivial. With a marked increase in the size, subjectivity, and diversity of social web-data, the vagueness, uncertainty and imprecision within the information has increased manifold. Soft computing techniques have been used to handle this fuzziness in practical applications. This work is a study to understand the feasibility, scope and relevance of this alliance of using Soft computing techniques for sentiment analysis on Twitter. We present a systematic literature review to collate, explore, understand and analyze the efforts and trends in a well-structured manner to identify research gaps defining the future prospects of this coupling. The contribution of this paper is significant because firstly the primary focus is to study and evaluate the use of soft computing techniques for sentiment analysis on Twitter and secondly as compared to the previous reviews we adopt a systematic approach to identify, gather empirical evidence, interpret results, critically analyze, and integrate the findings of all relevant high-quality studies to address specific research questions pertaining to the defined research domain.

KEYWORDS: machine learning, review, sentiment analysis, soft computing, Twitter

INTRODUCTION

The incessantly evolving dynamics of the Web in terms of the volume, velocity and variety of opinion-rich information accessible online, has made research in the domain of Sentiment Analysis (SA) a trend for many practical applications which facilitate decision support and deliver targeted information to domain analysts. Interestingly, the buzzing term ''big data'' which is estimated to be 90% unstructured [1] further makes it crucial to tap and analyze information using contemporary tools.

Text mining models define the process to transform and substitute this unstructured data into a structured one for knowledge discovery. Use of classification algorithms to intelligently mine text has been studied extensively across literature [2,3]. SA, established as a typical text classification task [4], is defined as the computational study of people's opinions, attitudes and emotions towards an entity [5,6]. It offers a technology-based solution to understand people's reactions, views and opinion polarities (positive, negative or neutral) in textual content available over social media sources.

Research studies and practical applications in the field of SA have escalated in the past decade with the transformation and expansion of Web from passive provider of content to an active socially-aware distributor of collective intelligence. This new collaborative Web (called Web 2.0) [7], extended by Web-based technologies like comments, blogs and wikis, social media portals like Twitter or Facebook, that allow to build social networks based on professional relationship, i...

“…A methodology proposed for corpus and lexicon based approaches mainly is used to create the text documents. This approach is used for classifying sentiments and polarity [27]. A genetic algorithm is proposed for optimization problem and is used for finding lexicons in the opinionated text.…”

Section: Hybrid Based Approaches (mentioning)

confidence: 99%

“…An approach for sentiment classification in micro blogs is proposed. Genetic algorithm, sentiment lexicon, meta-level features, Bing Liu's lexicon and n-gram features are used in the framework [27]. It requires concentrating on creation of the lexicon to reduce the time-consuming and sentiment score in other domains.…”

Section: Open Issues and Research Gaps (mentioning)

confidence: 99%

Perspectives of the performance metrics in lexicon and hybrid based approaches: a review

Rani; Sumathy

2017

IJET

Online social media and social networking services experience a drastic development in the present scenario. Contents generated by hundreds of millions of users are used for communication in general. Users mark their opinion and review in various applications such as Twitter, Facebook, YouTube, Weibo, Flicker, LinkedIn, Online-e commerce sites, Microblogging sites, etc. User generated text is spread rapidly on the web, and it has become tedious to analyze the opinionated text in order to arrive at a decision. Sentiment analysis, a subcategory of text mining is the major active research domain in current era due to greater quantity of opinionated text present in the Internet. Semantic detection is the sub-class in the sentiment analysis which is used for measuring the sentiment orientation in any text. Opinionated text is used for analyzing and making the decision simple. This interdisciplinary field draws various techniques from data mining, machine learning, natural language processing, lexicon based and hybrid based approaches. This paper provides a broad perspective with the highlight of the current state-of art techniques emphasizing the various research challenges and gaps present. The performance metrics in terms of detection rate, precision, recall, f-measure/score, average mean, auto-Pearson correlation, cosine similarity and ratio of time on various algorithms is discussed in detail. An analysis of the text mining approaches in different domains is presented.

“…Results of the research demonstrated that is improved efficiency of feature selection of classifiers in the classification of opinions. A novel GA was proposed by Keshavarz&Abadeh [13] in solving optimization issues and to find lexicon to classify text. Adaptive sentiment lexicons were generated through this algorithm and on its basis, which was used along with Bing Liu's lexicon and ngram features.…”

Section: Related Work (mentioning)

confidence: 99%

Hybrid optimization for feature selection in opinion mining

Moorthi; Mathivanan

2017

IJET

A sub-discipline of Information Retrieval (IR) is opinion mining and the lexicon of computers is not concerned of the subject of the document, but about the opinion expressed. It has caused a large impact in the arena of academics and industry as it has a wide area of research and the applications are widespread. Feature selection is a vital step in opinion mining, as its individual feature decides the opinions expressed by the customers. Feature selection reduces the dimensionality of data by avoiding non-relevant features; it can be considered as a necessary and excellent process for data mining applications. In this study, feature subset is optimized through Particle Swarm Optimization (PSO) algorithm, Cuckoo Search (CS) algorithm and hybridized PSO-CS algorithm. Classification is done through Naïve bayes and K-Nearest Neighbours (KNN) classifiers. Feature extraction has its basis on Term Frequency-Inverse Document Frequency (TF-IDF). The accuracy of classification precision is increased by the reduction in size of feature subset and computational complexity.
