In recent years, opinion-based postings on social media have helped to reshape business and public sentiment, and emotions have come to affect our social and political systems. Opinions are central to almost all human activities because they are key influencers of our behaviour. Whenever we need to make a decision, we generally want to know others' opinions. Every organization and business wants to know customer or public opinion about its products and services. It is therefore necessary to collect and study the opinions on the Web. However, finding and monitoring sites on the Web and distilling their reviews remains a formidable task: each site typically contains a huge volume of opinion text, and the average human reader will have difficulty identifying the polarity of each review and summarizing the opinions in it. Automated sentiment analysis is therefore needed to compute a polarity score and classify reviews as positive or negative. This article uses NLTK, TextBlob, and the VADER sentiment analysis tool to classify movie reviews from the website www.rottentomatoes.com, as provided by Cornell University, and compares these tools to find the most efficient one for sentiment classification. The experimental results indicate that VADER outperforms TextBlob.
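The idea of computing a polarity score and thresholding it into positive or negative can be illustrated with a minimal, standard-library-only sketch. This is not VADER, TextBlob, or NLTK: the word valences in `LEXICON`, the negation rule, and the function names are all hypothetical and chosen only for illustration.

```python
# Toy lexicon of word valences. These values are made up for illustration
# and are NOT taken from the VADER or TextBlob lexicons.
LEXICON = {
    "good": 1.0, "great": 2.0, "brilliant": 2.5,
    "bad": -1.0, "boring": -1.5, "awful": -2.5,
}
NEGATIONS = {"not", "never", "no"}


def polarity_score(text):
    """Sum the valences of known words; a negation flips the next known word."""
    tokens = [t.strip(".,!?;:").lower() for t in text.split()]
    total = 0.0
    negate = False
    for tok in tokens:
        if tok in NEGATIONS:
            negate = True
            continue
        valence = LEXICON.get(tok, 0.0)
        if valence != 0.0:
            total += -valence if negate else valence
            negate = False  # the negation has been consumed
    return total


def classify(text, threshold=0.0):
    """Label a review 'positive' if its score exceeds the threshold, else 'negative'."""
    return "positive" if polarity_score(text) > threshold else "negative"


def accuracy(reviews, labels):
    """Fraction of reviews whose predicted label matches the gold label."""
    correct = sum(classify(r) == y for r, y in zip(reviews, labels))
    return correct / len(reviews)


reviews = ["A great, brilliant film!", "Not good, just boring.", "Awful and boring."]
labels = ["positive", "negative", "negative"]
print(accuracy(reviews, labels))
```

Comparing tools, as the article does, would amount to running the same `accuracy` computation with each tool's scoring function in place of `polarity_score` on the same labelled reviews.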
Learners with reading difficulties typically face significant challenges in understanding text-based learning materials. An assistive summary can help such learners approach learning documents with minimal difficulty. An important issue in extractive summarization is extracting a cohesive summary from the text; existing summarization approaches focus mostly on informative sentences rather than cohesive ones. We considered several existing features, including sentence location, cardinality, title similarity, and keywords, to extract important sentences. In addition, learner-dependent readability-related features, such as average sentence length, percentage of trigger words, percentage of polysyllabic words, and percentage of noun entity occurrences, are considered for summarization. The objective of this work is to extract the optimal combination of sentences that increases readability through sentence cohesion, using a genetic algorithm. The results show that summary extraction using our proposed approach performs better in F-measure, readability, and cohesion than the baseline approach (lead) and the corpus-based approach. The task-based evaluation shows the effect of summary-assisted reading in enhancing readability for learners with reading difficulties.
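Selecting a combination of sentences with a genetic algorithm might look roughly like the sketch below. It is an illustration under stated assumptions, not the authors' method: the fitness function (per-sentence scores plus word-overlap cohesion between consecutive selected sentences), the crossover and mutation operators, and all names and parameter values are hypothetical. The lead baseline (the first k sentences) is seeded into the initial population, so with the elitist selection used here the result is never worse than lead under this fitness.

```python
import random


def jaccard(a, b):
    """Word-overlap similarity between two sentences (a stand-in cohesion measure)."""
    wa, wb = set(a.lower().split()), set(b.lower().split())
    return len(wa & wb) / len(wa | wb) if wa | wb else 0.0


def fitness(selection, sentences, scores, cohesion_weight=1.0):
    """Informativeness of the chosen sentences plus cohesion between neighbours."""
    informativeness = sum(scores[i] for i in selection)
    cohesion = sum(jaccard(sentences[a], sentences[b])
                   for a, b in zip(selection, selection[1:]))
    return informativeness + cohesion_weight * cohesion


def crossover(p1, p2, k, rng):
    """Child draws k distinct indices from the union of both parents."""
    pool = sorted(set(p1) | set(p2))
    return tuple(sorted(rng.sample(pool, k)))


def mutate(selection, n, rng, rate=0.2):
    """With probability `rate`, swap one chosen sentence for an unused one."""
    unused = [i for i in range(n) if i not in selection]
    if rng.random() >= rate or not unused:
        return selection
    chosen = set(selection)
    chosen.remove(rng.choice(sorted(chosen)))
    chosen.add(rng.choice(unused))
    return tuple(sorted(chosen))


def select_summary(sentences, scores, k, generations=50, pop_size=20, seed=0):
    """Return a sorted tuple of k sentence indices found by a simple elitist GA."""
    rng = random.Random(seed)
    n = len(sentences)

    def score(sel):
        return fitness(sel, sentences, scores)

    lead = tuple(range(k))  # lead baseline: the first k sentences
    population = [lead] + [tuple(sorted(rng.sample(range(n), k)))
                           for _ in range(pop_size - 1)]
    for _ in range(generations):
        population.sort(key=score, reverse=True)
        parents = population[: pop_size // 2]  # elitism: the top half survives
        children = [mutate(crossover(*rng.sample(parents, 2), k, rng), n, rng)
                    for _ in range(pop_size - len(parents))]
        population = parents + children
    return max(population, key=score)


# Hypothetical document and per-sentence importance scores.
sentences = [
    "Plants need light to grow.",
    "The weather was cold last week.",
    "Light gives plants the energy to grow.",
    "Plants use this energy to make food.",
    "Many people like gardening.",
    "Without light plants cannot make food.",
]
scores = [0.9, 0.1, 0.8, 0.7, 0.2, 0.6]
best = select_summary(sentences, scores, k=3)
print([sentences[i] for i in best])
```

In a fuller version, the per-sentence scores would come from the features the abstract lists (sentence location, title similarity, readability-related features, and so on) rather than from hand-assigned numbers.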
Depression affects over 322 million people and is the most common source of disability worldwide. The speech processing literature has shown that speech can be used to detect depression: depressed individuals exhibit different acoustic characteristics from non-depressed individuals. We developed a four-stage machine learning classification system to investigate acoustic parameters for detecting depression. Stage one uses speech recordings from DAIC-WOZ, a publicly available and clinically validated dataset. In stage two, the baseline acoustic feature vector, eGeMAPS, is extracted from the dataset. Adaptive synthetic (ADASYN) sampling is performed along with data preprocessing to overcome the class imbalance. In stage three, we conducted feature selection (FS) using three techniques: Boruta FS, recursive feature elimination using a support vector machine (SVM-RFE), and Fisher score-based FS. In stage four, we experimented with several machine learning base classifiers: Gaussian naïve Bayes (GNB), support vector machine (SVM), k-nearest neighbors (KNN), logistic regression (LR), and random forest (RF). The hyperparameters of the classifiers are tuned using GridSearchCV within 10-fold stratified cross-validation (CV). We then employed multiple dynamic ensemble selection (DES) algorithms with k=3 and k=5, using the pool of aforementioned four base classifiers, to improve accuracy. We present a comparative study of the base classifiers and the DES classifiers on eGeMAPS features. Our results on the DAIC-WOZ benchmark dataset suggest that K-Nearest Oracles Union (KNORA-U) DES with k=3, using a subset of 15 features selected by Fisher score-based FS, achieves higher accuracy than the individual base classifiers.
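The KNORA-U idea can be sketched in a few lines of standard-library Python. This is a simplified illustration, not the pipeline used in the study: the three rule-based classifiers, the two-feature validation set, and the fallback to a plain majority vote when no classifier is competent are all assumptions made for the example, and k is taken to be the number of nearest validation samples that define the region of competence. Each classifier receives one vote for every neighbour it labels correctly, and the votes are cast for that classifier's prediction on the query.

```python
import math
from collections import Counter


def knora_u_predict(x, pool, dsel_X, dsel_y, k=3):
    """Predict the label of x with a simplified KNORA-Union rule."""
    # Region of competence: indices of the k validation samples closest to x.
    neighbours = sorted(range(len(dsel_X)),
                        key=lambda i: math.dist(x, dsel_X[i]))[:k]
    votes = Counter()
    for clf in pool:
        # One vote per neighbour that this classifier labels correctly.
        weight = sum(1 for i in neighbours if clf(dsel_X[i]) == dsel_y[i])
        if weight:
            votes[clf(x)] += weight
    if not votes:
        # Assumed fallback: no classifier is competent here, so use the whole pool.
        votes = Counter(clf(x) for clf in pool)
    return votes.most_common(1)[0][0]


# Hypothetical base classifiers (simple rules standing in for trained models).
def clf_a(x):
    return int(x[0] > 0.5)


def clf_b(x):
    return int(x[1] > 0.5)


def clf_c(x):
    return 1  # always predicts class 1


pool = [clf_a, clf_b, clf_c]

# Hypothetical validation set: the true label follows the first feature.
dsel_X = [(0.1, 0.9), (0.2, 0.8), (0.3, 0.7), (0.9, 0.1), (0.8, 0.2), (0.7, 0.3)]
dsel_y = [0, 0, 0, 1, 1, 1]

query = (0.15, 0.85)
print(knora_u_predict(query, pool, dsel_X, dsel_y, k=3))
```

For this query, a plain majority vote over the pool would predict class 1, because two of the three classifiers say so. The dynamic selection predicts class 0 instead, since only `clf_a` is correct on the neighbouring validation samples.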