Sentiment analysis from text consists of extracting information about opinions, sentiments, and even emotions conveyed by writers towards topics of interest. It is often equated to opinion mining, but it should also encompass emotion mining. Opinion mining involves the use of natural language processing and machine learning to determine the attitude of a writer towards a subject. Emotion mining uses similar technologies but is concerned with detecting and classifying writers' emotions toward events or topics. Textual emotion-mining methods have various applications, including gaining information about customer satisfaction, helping to select teaching materials in e-learning, recommending products based on users' emotions, and even predicting mental-health disorders. Surveys on sentiment analysis, which are often old or incomplete, understate the strong link between opinion mining and emotion mining. This motivates the need for a new perspective on the sentiment analysis literature, with a focus on emotion mining. We present the state-of-the-art methods and propose the following contributions: (1) a taxonomy of sentiment analysis; (2) a survey on polarity classification methods and resources, especially those related to emotion mining; (3) a complete survey on emotion theories and emotion-mining research; and (4) some useful resources, including lexicons and datasets.
Abstract. Typical association rules consider only items enumerated in transactions. Such rules are referred to as positive association rules. Negative association rules consider the same items, but in addition consider negated items (i.e., items absent from transactions). Negative association rules are useful in market-basket analysis to identify products that conflict with each other or products that complement each other. They are also very convenient for associative classifiers, that is, classifiers that build their classification model from association rules. Many other applications would benefit from negative association rules if it were not for the expensive process of discovering them. Indeed, mining for such rules requires examining an exponentially large search space. Despite their usefulness, and although they have been referred to in many publications, very few algorithms to mine them have been proposed to date. In this paper we propose an algorithm that extends the support-confidence framework with a sliding correlation coefficient threshold. In addition to finding confident positive rules that have a strong correlation, the algorithm discovers negative association rules with strong negative correlation between the antecedents and consequents.
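As a rough illustration of the idea of pairing confidence with a correlation test, the sketch below computes the phi correlation coefficient between two itemsets and labels a candidate rule as positive or negative. It is not the algorithm proposed in the paper: the function names (`support`, `phi`, `classify_rule`), the fixed thresholds `min_corr` and `min_conf`, and the toy transactions are all assumptions made for illustration, and the paper's sliding threshold is replaced here by a single fixed value.

```python
from math import sqrt


def support(transactions, items):
    """Fraction of transactions that contain every item in `items`."""
    items = set(items)
    return sum(1 for t in transactions if items <= set(t)) / len(transactions)


def phi(transactions, x, y):
    """Phi correlation coefficient between itemsets x and y."""
    sx = support(transactions, x)
    sy = support(transactions, y)
    sxy = support(transactions, set(x) | set(y))
    denom = sqrt(sx * (1 - sx) * sy * (1 - sy))
    if denom == 0:
        # One itemset occurs in all or none of the transactions:
        # correlation is undefined, so treat it as no correlation.
        return 0.0
    return (sxy - sx * sy) / denom


def classify_rule(transactions, x, y, min_corr=0.5, min_conf=0.6):
    """Return ("X -> Y", conf), ("X -> not Y", conf), or None.

    The thresholds are hypothetical fixed values, not a sliding threshold.
    """
    r = phi(transactions, x, y)
    sx = support(transactions, x)
    sxy = support(transactions, set(x) | set(y))
    if r >= min_corr:
        conf = sxy / sx  # confidence of X -> Y
        if conf >= min_conf:
            return ("X -> Y", conf)
    elif r <= -min_corr:
        conf = (sx - sxy) / sx  # confidence of X -> not Y
        if conf >= min_conf:
            return ("X -> not Y", conf)
    return None


# Toy data (made up): tea and coffee never appear together.
transactions = [
    {"tea", "milk"}, {"tea"}, {"coffee", "milk"},
    {"coffee"}, {"tea", "milk"}, {"coffee"},
]
negative_rule = classify_rule(transactions, {"tea"}, {"coffee"})
weak_rule = classify_rule(transactions, {"tea"}, {"milk"})
```

On this toy data, tea and coffee are perfectly negatively correlated, so the sketch reports a negative rule, while tea and milk fall below the correlation threshold and yield no rule.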
Abstract. Group work is widespread in education. The growing use of online tools supporting group work generates huge amounts of data. We aim to exploit this data to support mirroring: presenting useful high-level views of information about the group, together with desired patterns characterizing the behaviour of strong groups. The goal is to enable the groups and their facilitators to see relevant aspects of the group's operation, and to provide feedback on whether these are more likely to be associated with positive or negative outcomes and where the problems are. We explore how useful mirror information can be extracted via a theory-driven approach and a range of clustering and sequential pattern mining techniques. The context is a senior software development project where students use the collaboration tool TRAC. We extract patterns distinguishing the better from the weaker groups and gain insight into the success factors. The results point to the importance of leadership and group interaction, and give promising indications of whether they are occurring. Patterns indicating good individual practices were also identified. We found that some key measures can be mined from early data. The results are promising for advising groups at the start and for early identification of effective and poor practices, in time for remediation.
Background: Data measuring airborne pollutants, public health and environmental factors are increasingly being stored and merged. These big datasets offer great potential, but also challenge traditional epidemiological methods. This has motivated the exploration of alternative methods to make predictions, find patterns and extract information. To this end, data mining and machine learning algorithms are increasingly being applied to air pollution epidemiology.
Methods: We conducted a systematic literature review on the application of data mining and machine learning methods in air pollution epidemiology. We carried out our search process in PubMed, the MEDLINE database and Google Scholar. Research articles applying data mining and machine learning methods to air pollution epidemiology were queried and reviewed.
Results: Our search queries resulted in 400 research articles. Our fine-grained analysis employed our inclusion/exclusion criteria to reduce the results to 47 articles, which we separate into three primary areas of interest: 1) source apportionment; 2) forecasting/prediction of air pollution/quality or exposure; and 3) generating hypotheses. Early applications had a preference for artificial neural networks. In more recent work, decision trees, support vector machines, k-means clustering and the APRIORI algorithm have been widely applied. Our survey shows that the majority of the research has been conducted in Europe, China and the USA, and that data mining is becoming an increasingly common tool in environmental health. As potential new directions, we have identified deep learning and geospatial pattern mining as two burgeoning areas of data mining that have good potential for future applications in air pollution epidemiology.
Conclusions: We carried out a systematic review identifying the current trends, challenges and new directions to explore in the application of data mining methods to air pollution epidemiology. This work shows that data mining is increasingly being applied in air pollution epidemiology. The potential to support air pollution epidemiology continues to grow with advancements in data mining related to temporal and geospatial mining, and deep learning. This is further supported by new sensors and storage media that enable larger, better-quality datasets. This suggests that many more fruitful applications can be expected in the future.