Digital data are all around us and occur in various forms, such as videos, pictures, and texts. Digital documents, for example e-news and social media contributions, represent the vast majority of such data. They can contain useful information, but because of their volume, finding the information relevant to a particular company or person is time-consuming. For that reason, there is a need for their automatic analysis. One of the areas that deals with textual data analysis is topic modeling. It offers a new way to automatically browse, search, and summarize data in an organization. Topic modeling can be useful for time-based analysis of crises, elections, news feeds, launches of new products on the market, and other tasks that lead to decision support. In this paper, we survey and compare topic modeling methods and propose a web application that visualizes topics extracted with the topic modeling method called Latent Dirichlet Allocation (LDA). The selected standard topic modeling methods were compared experimentally on two textual datasets (20 Newsgroups and Reuters) using a standard evaluation metric. The proposed web application uses LDA to extract topic models from datasets of textual documents, visualize them, and show their evolution over time.
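To make the idea of extracting topics with LDA concrete, the following is a minimal sketch of LDA fitted by collapsed Gibbs sampling, written with the Python standard library only. It is an illustration under stated assumptions, not the implementation used in the proposed web application: the function names `fit_lda` and `top_words`, the hyperparameter values, and the four toy documents are all hypothetical, and the inference scheme (Gibbs sampling) is one common choice that the abstract does not specify.

```python
import random


def fit_lda(docs, num_topics, alpha=0.1, beta=0.01, iterations=100, seed=0):
    """Fit LDA to tokenized documents with collapsed Gibbs sampling.

    Hypothetical helper for illustration; alpha and beta are symmetric
    Dirichlet priors chosen arbitrarily for this toy example.
    """
    rng = random.Random(seed)
    vocab = sorted({word for doc in docs for word in doc})
    word_id = {word: i for i, word in enumerate(vocab)}
    vocab_size = len(vocab)

    # Count tables maintained by the sampler.
    doc_topic = [[0] * num_topics for _ in docs]
    topic_word = [[0] * vocab_size for _ in range(num_topics)]
    topic_total = [0] * num_topics

    # Assign every token to a random topic to start.
    assignments = []
    for d, doc in enumerate(docs):
        doc_assignments = []
        for word in doc:
            k = rng.randrange(num_topics)
            doc_assignments.append(k)
            doc_topic[d][k] += 1
            topic_word[k][word_id[word]] += 1
            topic_total[k] += 1
        assignments.append(doc_assignments)

    topics = range(num_topics)
    for _ in range(iterations):
        for d, doc in enumerate(docs):
            for i, word in enumerate(doc):
                v = word_id[word]
                k = assignments[d][i]
                # Remove the token's current assignment from the counts.
                doc_topic[d][k] -= 1
                topic_word[k][v] -= 1
                topic_total[k] -= 1
                # Conditional topic weights for this token given all others.
                weights = [
                    (doc_topic[d][t] + alpha)
                    * (topic_word[t][v] + beta)
                    / (topic_total[t] + vocab_size * beta)
                    for t in topics
                ]
                k = rng.choices(topics, weights=weights)[0]
                assignments[d][i] = k
                doc_topic[d][k] += 1
                topic_word[k][v] += 1
                topic_total[k] += 1

    # Smoothed estimates: phi is topic-word, theta is document-topic.
    phi = [
        [(topic_word[k][v] + beta) / (topic_total[k] + vocab_size * beta)
         for v in range(vocab_size)]
        for k in topics
    ]
    theta = [
        [(doc_topic[d][k] + alpha) / (len(doc) + num_topics * alpha)
         for k in topics]
        for d, doc in enumerate(docs)
    ]
    return vocab, phi, theta


def top_words(vocab, phi, n=3):
    """Return the n most probable words of each topic."""
    return [
        [vocab[v] for v in sorted(range(len(vocab)), key=lambda v: -row[v])[:n]]
        for row in phi
    ]


# Toy corpus, invented for this example only.
DOCS = [
    "election vote party candidate vote".split(),
    "party election campaign candidate".split(),
    "product launch market price product".split(),
    "market price launch company".split(),
]

vocab, phi, theta = fit_lda(DOCS, num_topics=2)
for k, words in enumerate(top_words(vocab, phi)):
    print(f"topic {k}: {', '.join(words)}")
```

Each row of `phi` is a probability distribution over the vocabulary and each row of `theta` is a distribution over topics for one document. Averaging the rows of `theta` over documents grouped by publication date is one simple way to obtain the kind of topic-over-time view the abstract mentions. For real corpora such as 20 Newsgroups, an established library implementation would be preferable to this sketch.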