Topic2features: a novel framework to classify noisy and sparse textual data using LDA topic distributions

Wahid, Junaid Abdul; Shi, Lei; Gao, Yufei; Yang, Beifang; Tao, Yongcai; Wei, Lin; Hussain, Shabir

doi:10.7717/peerj-cs.677

Cited by 10 publications

(6 citation statements)

References 38 publications


“…The three categories into which sentiment analysis can be divided are the machine learning technique, the lexicon-based approach, and the hybrid strategy that combines the previous two approaches [10]. Nowadays, computational technologies are being used in various domains of life, including healthcare [14], security [15] [21] [25] and also in safety purposes [16], disaster [17], and situational awareness [19] [26] [27] in the educational domain [18] as well. Sentiment analysis is a prominent research topic in demand under the category of NLP [20].…”

Section: Literature Review (mentioning)

confidence: 99%

Transfer Learning-Based Framework for Sentiment Classification of Cosmetics Products Reviews

Sahar, Ayoub, Hussain, et al. 2022. PakJET

The exponential growth in online reviews and recommendations availability drives sentiment classification, an interesting topic in industrial research. There is a vital requirement for organizations to explore client behaviour to assess the competitive business environment. This study aspires to examine and predict customer reviews using Transfer learning (TL) approaches. Reviews can span so many domains that it is challenging to gather annotated training data for all of them. Hence, this paper proposed an annotation algorithm to label a large unlabeled dataset. These reviews must be pulled and examined to predict the sentiment polarity, whether the review is positive, neutral, or negative. We propose a deep learning-based approach that learns to extract a meaningful representation for each review in an unsupervised fashion. Sentiment classifiers trained with this high-level feature representation outperform state-of-the-art methods on a benchmark of reviews of cosmetics brands on Amazon or other platforms. Using the BERT for sentiment analysis, we achieved the highest accuracy of 93.21% compared to previous studies.

Section: Literature Review (mentioning)

confidence: 99%

“…No methodology to extract sentiments from customer feedback [15] Enhanced text mining to understand subjective and objective knowledge from text…”

Section: Literature Review (mentioning)

confidence: 99%

Transfer Learning-Based Framework for Sentiment Classification of Cosmetics Products Reviews

Sahar, Ayoub, Hussain, et al. 2022. PakJET

“…Their model approached an accuracy of 95.97 percent for AM-FED+ Dataset, 94.89 percent for the AFEW dataset, and 91.14 percent for MELD. In the same contrast, deep learning is currently used in most common image recognition tools [22], natural language processing (NLP) [23] and speech recognition software. These tools are starting to appear in applications as diverse as self-driving cars and language translation services.…”

Section: A Local Binary Pattern Approach (mentioning)

confidence: 99%

Deep Learning based Framework for Emotion Recognition using Facial Expression

Bukhari, Hussain, Ayoub, et al. 2022. PakJET

Humans convey their messages in different forms, and expressing emotions and moods through facial expressions is one of them. In this work, to avoid the traditional process of feature extraction (geometry-based, template-based, and appearance-based methods), a CNN model is used as a feature extractor for emotion detection from facial expressions. In this study we also used three pre-trained models: VGG-16, ResNet-50, and Inception-V3. The experiments were run on the FER-2013 facial expression dataset and the Extended Cohn-Kanade (CK+) dataset. On FER-2013, the accuracy rates for CNN, ResNet-50, VGG-16, and Inception-V3 are 76.74%, 85.71%, 85.78%, and 97.93%, respectively. Similarly, on CK+ the accuracy rates for CNN, ResNet-50, VGG-16, and Inception-V3 are 84.18%, 92.91%, 91.07%, and 73.16%, respectively. The best results were obtained by Inception-V3 with 97.93% on FER-2013 and by ResNet-50 with 91.92% on CK+.

“…Few TM alternatives are also available such as latent semantic analysis (LSA), probabilistic LSA (pLSA), or LDA. Given the information and the length of text in these charts, we selected LDA for use as it has greater accuracy and is easy to interpret the results as it provides a more efficient representation of results (30,31).…”

Section: Latent Dirichlet Allocation (mentioning)

confidence: 99%

An Application of Machine Learning Techniques to Analyze Patient Information to Improve Oral Health Outcomes

Ameli, Gibson, Khanna, et al. 2022. Front. Dent. Med.

Objective: Various health-related fields have applied machine learning (ML) techniques such as text mining, topic modeling (TM), and artificial neural networks (ANN) to automate tasks otherwise completed by humans and so enhance patient care. However, research in dentistry on integrating these techniques into the clinical arena does not yet exist. Thus, the purpose of this study was to introduce a method of automating the review of patient chart information using ML, provide a step-by-step description of how it was conducted, and demonstrate this method's potential to identify predictive relationships between patient chart information and important oral health-related contributors.

Methods: A secondary data analysis was conducted to demonstrate the approach on a set of anonymized patient charts collected from a dental clinic. Two ML applications for patient chart review were demonstrated: (1) text mining and Latent Dirichlet Allocation (LDA) were used to preprocess, model, and cluster data in a narrative format and extract common topics for further analysis; (2) ordinal logistic regression (OLR) and ANN were used to determine predictive relationships between the extracted patient chart data topics and oral health-related contributors. All analysis was conducted in R and SPSS (IBM, SPSS, statistics 22).

Results: Data from 785 patient charts were analyzed. Preprocessing of raw data (data cleaning and categorizing) identified 66 variables, of which 45 were included for analysis. Using LDA, 10 radiographic findings topics and 8 treatment planning topics were extracted from the data. OLR showed that caries risk, occlusal risk, biomechanical risk, gingival recession, periodontitis, gingivitis, assisted mouth opening, and muscle tenderness were highly predictable using the extracted radiographic and treatment planning topics and chart information. Using the statistically significant predictors obtained from OLR, ANN analysis showed that the model can correctly predict >72% of all variables except for bruxism and tooth crowding (63.1 and 68.9%, respectively).

Conclusion: Our study presents a novel approach to address the need for data-enabled innovations in the field of dentistry and creates new areas of research in dental analytics. Utilizing ML methods and their application in dental practice has the potential to improve clinicians' and patients' understanding of the major factors that contribute to oral health diseases and conditions.
