2020
DOI: 10.22266/ijies2020.0831.29

A Two-Stepped Feature Engineering Process for Topic Modeling using Batchwise LDA with Stochastic Variational Inference Model

Cited by 3 publications
(5 citation statements)
“…1815 Each * provided by the medical professionals specified an extra weightage for the consolidated point mentioned, where * has least weightage and ***** has highest weightage, respectively. Feature Engineering through TF-IDF+forward scan trigrams [5] and removal of weak features through Feature Hashing has helped improve the model's performance by 12% in terms of coherence scores. The coherence score was used in the experimentation process for assessing the quality of the identified topics.…”
Section: And Discussion
Citation type: mentioning
confidence: 99%
“…After extracting the tweets from Twitter, natural language tool kit (NLTK) 3.1 version is used for initial data preprocessing. Then the first level of improvised feature engineering (weighted TF-IDF in combination with Forward Scan Trigrams [5]) is applied to create an efficient VSM. This VSM is input to an enhanced K-means clustering algorithm to yield clusters based on the similarity of the data elements.…”
Section: Methods
Citation type: mentioning
confidence: 99%
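
The vector space model (VSM) step described in the statement above can be illustrated with a minimal sketch. It assumes plain, unweighted TF-IDF and ordinary contiguous word trigrams; the cited paper's weighted TF-IDF and "Forward Scan Trigrams" procedure are not specified here, so this is a stand-in, not the authors' method. The function names `tokens_with_trigrams` and `tfidf_vectors` are hypothetical.

```python
import math
from collections import Counter


def tokens_with_trigrams(text):
    """Lowercase, split on whitespace, and append contiguous word trigrams.

    Assumption: ordinary sliding-window trigrams, used here only as a
    stand-in for the paper's "Forward Scan Trigrams".
    """
    words = text.lower().split()
    trigrams = [" ".join(words[i:i + 3]) for i in range(len(words) - 2)]
    return words + trigrams


def tfidf_vectors(docs):
    """Return (vocabulary, dense TF-IDF rows) for a list of strings.

    Assumption: standard TF-IDF, tf = count / document length and
    idf = log(N / df), not the weighted variant the statement mentions.
    """
    tokenized = [tokens_with_trigrams(d) for d in docs]
    vocab = sorted({t for toks in tokenized for t in toks})
    n_docs = len(docs)
    # Document frequency: number of documents that contain each term.
    df = Counter(t for toks in tokenized for t in set(toks))
    rows = []
    for toks in tokenized:
        counts = Counter(toks)
        total = len(toks)
        rows.append([
            (counts[t] / total) * math.log(n_docs / df[t]) if total else 0.0
            for t in vocab
        ])
    return vocab, rows
```

Each row of the result is one document's vector over the shared vocabulary of words and trigrams; rows of this kind are what a clustering step such as K-means would take as input.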