Topic modeling is a state-of-the-art technique for analyzing text corpora. It uses a statistical model, most commonly Latent Dirichlet Allocation (LDA), to discover abstract topics that occur in a document collection. However, LDA-based topic modeling starts from a randomly selected initial configuration and depends on a number of parameter values that need to be chosen. Both introduce uncertainty into the topic modeling results, and visualization methods should convey this uncertainty during the analysis process. We propose a visual, uncertainty-aware approach to topic modeling analysis. We capture the uncertainty by computing topic modeling ensembles and propose measures for estimating topic modeling uncertainty from the ensemble. We then enhance state-of-the-art topic modeling visualization methods to convey the uncertainty in the topic modeling process. We visualize the entire ensemble of topic modeling results at different levels for topic and document analysis. We apply our visualization methods to a text corpus to document the impact of uncertainty on the analysis.
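To make the idea of estimating uncertainty from an ensemble concrete, the following is a minimal sketch and not the measures proposed in this work, which the abstract does not specify. It assumes each ensemble member (for example, one LDA run with a different random seed or parameter setting) is already summarized as a list of topics, each given by its top words. The names `jaccard` and `topic_uncertainty`, and the choice of Jaccard similarity with best-match pairing, are illustrative assumptions.

```python
from typing import List, Sequence

Topic = Sequence[str]  # top words of one topic
Run = Sequence[Topic]  # all topics produced by one ensemble member


def jaccard(a: Topic, b: Topic) -> float:
    """Jaccard similarity between the word sets of two topics."""
    set_a, set_b = set(a), set(b)
    union = set_a | set_b
    return len(set_a & set_b) / len(union) if union else 1.0


def topic_uncertainty(ensemble: Sequence[Run], reference: int = 0) -> List[float]:
    """Hypothetical uncertainty score per topic of the reference run.

    For each reference topic, find its most similar topic in every other
    run and average those similarities. The score is 1 minus that mean:
    0 means the topic reappears unchanged in every run, 1 means no other
    run contains a topic sharing any of its top words.
    """
    others = [run for i, run in enumerate(ensemble) if i != reference]
    if not others:
        raise ValueError("the ensemble needs at least two runs")
    scores = []
    for topic in ensemble[reference]:
        best = [max((jaccard(topic, cand) for cand in run), default=0.0)
                for run in others]
        scores.append(1.0 - sum(best) / len(best))
    return scores


# Toy ensemble of three runs with two topics each (made-up words).
ensemble = [
    [["gene", "cell", "protein"], ["market", "stock", "price"]],
    [["stock", "market", "price"], ["cell", "gene", "dna"]],
    [["protein", "gene", "cell"], ["price", "bank", "loan"]],
]
scores = topic_uncertainty(ensemble)
```

Here the first reference topic is matched closely in both other runs, while the second is reproduced in only one of them, so it receives the higher score. A per-topic score of this kind is one quantity a visualization could encode, for instance as opacity or a glyph attached to each topic.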