2020
DOI: 10.30534/ijatcse/2020/231942020

Topic Modeling Coherence: A Comparative Study between LDA and NMF Models using COVID’19 Corpus

Abstract: Topic modeling is a method for finding abstract topics in a large collection of documents. With it, it is possible to discover the mixture of hidden or "latent" topics that varies from document to document in a given corpus. As an unsupervised machine learning approach, topic models are not easy to evaluate since there is no labelled "ground truth" data to compare with. However, since topic modeling typically requires defining some parameters beforehand (first and foremost the number of topics k to be discover…

Cited by 42 publications (22 citation statements)
References 4 publications
“…This model requires a fixed number of topics that is not specified accurately for a corpus. Accordingly, we chose an optimal number of topics for implementing the topic modelling technique following the study by Mifrah and Benlahmar (2020). In this respect, we calculated the topic coherence score for each number of topics to identify the most efficient one.…”
Section: Step 2: Topic Construction (mentioning)
confidence: 99%
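
The selection procedure this statement describes (score each candidate number of topics by coherence, then keep the best) can be sketched as follows. This is a minimal illustration under stated assumptions, not the citing authors' code: it uses the simpler UMass coherence rather than the c_v measure, and the toy corpus `docs`, the hand-written `candidates` topic lists, and all function names are hypothetical. In practice the topic lists would come from LDA or NMF models fitted for each k.

```python
import math

def umass_coherence(topic_words, docs):
    """UMass coherence of one topic; higher (closer to 0 or above) is better."""
    doc_sets = [set(d) for d in docs]

    def doc_freq(*words):
        # Number of documents containing all of the given words.
        return sum(1 for s in doc_sets if all(w in s for w in words))

    score = 0.0
    # topic_words is ordered from most to least probable; each word is
    # paired with every higher-ranked word.
    for m in range(1, len(topic_words)):
        for l in range(m):
            d_l = doc_freq(topic_words[l])
            if d_l == 0:
                raise ValueError(f"{topic_words[l]!r} does not occur in the corpus")
            d_ml = doc_freq(topic_words[m], topic_words[l])
            score += math.log((d_ml + 1) / d_l)
    return score

def mean_coherence(topics, docs):
    """Average coherence over all topics of one model."""
    return sum(umass_coherence(t, docs) for t in topics) / len(topics)

def best_num_topics(candidates, docs):
    """Return the k with the highest mean coherence, plus all scores."""
    scores = {k: mean_coherence(topics, docs) for k, topics in candidates.items()}
    return max(scores, key=scores.get), scores

# Hypothetical toy corpus (already tokenized).
docs = [
    ["virus", "vaccine", "trial"],
    ["virus", "vaccine", "dose"],
    ["lockdown", "economy", "jobs"],
    ["lockdown", "economy", "market"],
]

# Hypothetical top words per topic for each candidate k
# (stand-ins for the output of fitted topic models).
candidates = {
    1: [["virus", "economy"]],
    2: [["virus", "vaccine"], ["lockdown", "economy"]],
    3: [["virus", "vaccine"], ["lockdown", "economy"], ["trial", "market"]],
}

best_k, scores = best_num_topics(candidates, docs)
print(best_k, scores)
```

Every topic here has the same number of top words, because UMass sums over word pairs and scores for topics of different lengths are not directly comparable.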
“…It is important to clarify that c_v is a measure that helps to determine the optimal number of topics, but this measurement requires human interpretability to determine if the T topics selected are appropriate. For more extensive information on Topic Coherence measures, we encourage the reader to see [29,30].…”
Section: Determining T Optimal Number Of Topics (mentioning)
confidence: 99%
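
For context on the c_v measure named in this statement: in its usual formulation, c_v builds on the normalized pointwise mutual information (NPMI) between pairs of a topic's top words. A sketch of that building block follows, where P denotes a probability estimated from co-occurrence counts and ε is a small smoothing constant. This notation is assumed here for illustration and is not taken from the citing paper.

```latex
\mathrm{NPMI}(w_i, w_j) =
  \frac{\log \dfrac{P(w_i, w_j) + \epsilon}{P(w_i)\, P(w_j)}}
       {-\log\bigl(P(w_i, w_j) + \epsilon\bigr)}
```

Values near 1 indicate words that almost always co-occur, and values near 0 indicate independence. Because the score reflects co-occurrence statistics only, a high value does not by itself guarantee that the topics are meaningful to a human reader.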
“…The results of the work indicate that, for better coherence among topics, LDA was the better approach. The work of [Mifrah and Benlahmar 2020] also addresses Covid-19 and the LDA and NMF models, and relies solely on the C_v metric. The authors conclude that NMF performed better for some topics but that, when the topics are evaluated globally, LDA produces more coherent and concise topics.…”
Section: Related Work (unclassified)