“…As Blei (2012) notes, topics and topical decompositions are not in a sense 'definitive.' Fitting a model to any collection will yield patterns regardless of whether they exist in a true sense in the corpus.…”
This paper explores a variety of methods for applying the Latent Dirichlet Allocation (LDA) automated topic modeling algorithm to the modeling of the structure and behavior of virtual organizations found within modern social media and social networking environments. As the field of Big Data reveals, an increase in the scale of available social data presents new challenges that cannot be tackled by merely scaling up hardware and software. Rather, it necessitates new methods and, indeed, new areas of expertise. Natural language processing provides one such method. This paper applies LDA to the study of scientific virtual organizations whose members employ social technologies. Because of the vast data footprint in these virtual platforms, we found that natural language processing was needed to 'unlock' and render visible latent, previously unseen conversational connections across large textual corpora (spanning profiles, discussion threads, forums, and other social media incarnations). We introduce variants of LDA and ultimately argue that natural language processing is a critical interdisciplinary methodology for making better sense of social 'Big Data.' Using LDA, we were able to successfully model nested discussion topics from forums and blog posts. Importantly, we found that LDA can move us beyond the state of the art in conventional Social Network Analysis techniques.
“…If we can combine the results of this paper with expert opinions, we can expect a more accurate and valid result for sustainable technology analysis between competitors. Thus, in our future work, we will apply opinion mining [56], sentiment analysis [57], and topic models [58,59] to our methodology for sustainable technology analysis. This paper dealt with a more efficient way of finding sustainability in a specific technology field by introducing a new time concept that was not covered in the existing quantitative analysis methods for selecting sustainable technologies.…”
Abstract: The technology of three-dimensional (3D) printing was commercialized in the late 1980s. Since then, the development of this technology has been increasing dramatically. Moreover, 3D printing technology has been used in many different fields, such as electronics and medical appliances, because 3D printing is a technological convergence based on precision instruments, chemical materials, and electrical equipment. The technological impact of 3D printing is so powerful that we need to analyze 3D printing technology to understand the 3D printing industry. In addition, we want more analytical results for understanding the sustainability of 3D printing technology. Thus, we compare the technologies of 3D printing competitors to find their technological innovations and evolution from a technological sustainability perspective. To analyze 3D printing technology, we propose a new methodology of statistical technology analysis combining social network analysis with time series clustering. In our case study, we make a comparison between "3D Systems" and "Stratasys", two major 3D printing companies, because they have been leading the sustainable technologies of 3D printing in the market. We illustrate how the proposed methodology can be applied to practical problems from the case study. This paper contributes to sustainable technology management, and our research can be extended to other competitors in diverse technological fields beyond 3D printing.
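The abstract's core idea of combining social network analysis with time series clustering can be sketched in miniature: compute a centrality measure on a keyword co-occurrence network for each time period, then cluster the resulting per-keyword centrality trajectories. Everything below — the toy data, the keyword names, and the nearest-centroid clustering step — is an illustrative assumption, not the paper's actual method or data.

```python
# Sketch: SNA (degree centrality on keyword co-occurrence networks per year)
# combined with time series clustering of centrality trajectories.
from collections import defaultdict
import math

def degree_centrality(edges):
    """Normalized degree centrality for an undirected co-occurrence network."""
    deg = defaultdict(int)
    nodes = set()
    for u, v in edges:
        nodes.update((u, v))
        deg[u] += 1
        deg[v] += 1
    n = len(nodes)
    return {v: (deg[v] / (n - 1) if n > 1 else 0.0) for v in nodes}

def centrality_series(snapshots, keyword):
    """Track one keyword's centrality across yearly network snapshots."""
    return [degree_centrality(edges).get(keyword, 0.0) for edges in snapshots]

def euclidean(a, b):
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def cluster_by_nearest(series_by_kw, centroids):
    """Assign each keyword's trajectory to its nearest reference centroid."""
    return {kw: min(centroids, key=lambda c: euclidean(s, centroids[c]))
            for kw, s in series_by_kw.items()}

# Toy yearly co-occurrence snapshots (hypothetical patent keywords).
snapshots = [
    [("extrusion", "nozzle"), ("nozzle", "resin")],
    [("extrusion", "nozzle"), ("extrusion", "resin"), ("nozzle", "resin")],
    [("extrusion", "resin")],
]
series = {kw: centrality_series(snapshots, kw)
          for kw in ("extrusion", "nozzle", "resin")}
labels = cluster_by_nearest(series, {"rising": [0.0, 0.5, 1.0],
                                     "fading": [1.0, 0.5, 0.0]})
```

In a real study, the snapshots would come from patent keyword co-occurrences per period, and the clustering would use a proper time series method rather than two fixed reference centroids.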
“…The LSA space was built using the stochastic SVD decomposition from Apache Mahout [26], applied to the term-document matrix weighted with log-entropy, across 300 dimensions. LDA made use of parallel Gibbs sampling implemented in Mallet [27], and the model was created with 100 topics, as suggested by Blei [28]. A manual inspection of the top 100 words from each LDA topic suggested that the space was adequately constructed, because the most representative words from each topic were semantically related to one another.…”
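The collapsed Gibbs sampling scheme the snippet attributes to Mallet can be sketched in pure Python. The corpus, topic count, and hyperparameters below are toy assumptions; Mallet's actual implementation is parallelized and heavily optimized, and this sketch only illustrates the sampling step itself.

```python
# Minimal collapsed Gibbs sampler for LDA (illustrative, not Mallet's code).
import random
from collections import defaultdict

def lda_gibbs(docs, n_topics, alpha=0.1, beta=0.01, iters=200, seed=0):
    rng = random.Random(seed)
    vocab = sorted({w for d in docs for w in d})
    V = len(vocab)
    # Count tables: doc-topic, topic-word, topic totals.
    ndk = [[0] * n_topics for _ in docs]
    nkw = [defaultdict(int) for _ in range(n_topics)]
    nk = [0] * n_topics
    # Random initial topic assignment for every token.
    z = [[rng.randrange(n_topics) for _ in d] for d in docs]
    for di, d in enumerate(docs):
        for wi, w in enumerate(d):
            k = z[di][wi]
            ndk[di][k] += 1; nkw[k][w] += 1; nk[k] += 1
    for _ in range(iters):
        for di, d in enumerate(docs):
            for wi, w in enumerate(d):
                # Remove this token's current assignment from the counts...
                k = z[di][wi]
                ndk[di][k] -= 1; nkw[k][w] -= 1; nk[k] -= 1
                # ...then resample: p(k) ∝ (n_dk + α)(n_kw + β)/(n_k + Vβ).
                weights = [(ndk[di][t] + alpha) * (nkw[t][w] + beta)
                           / (nk[t] + V * beta) for t in range(n_topics)]
                k = rng.choices(range(n_topics), weights=weights)[0]
                z[di][wi] = k
                ndk[di][k] += 1; nkw[k][w] += 1; nk[k] += 1
    # Return top words per topic, mirroring the manual inspection described.
    return [sorted(nkw[t], key=nkw[t].get, reverse=True)[:3]
            for t in range(n_topics)]

docs = [["network", "graph", "edge"], ["topic", "word", "corpus"],
        ["graph", "edge", "network"], ["corpus", "topic", "word"]]
top_words = lda_gibbs(docs, n_topics=2)
```

The log-entropy weighting mentioned for the LSA space is a separate preprocessing step applied before SVD and is not shown here.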
Section: The NLP Processing Pipeline for Dutch Language
Abstract. Automated Essay Scoring has gained wider applicability and usage with the integration of advanced Natural Language Processing techniques, which enable in-depth analyses of discourse in order to capture the specificities of written texts. In this paper, we introduce a novel Automated Essay Scoring method for the Dutch language, built within the ReaderBench framework, which encompasses a wide range of textual complexity indices as well as an automated segmentation approach. Our method was evaluated on a corpus of 173 technical reports automatically split into sections and subsections, thus forming a hierarchical structure on which textual complexity indices were subsequently applied. The stepwise regression model explained 30.5% of the variance in students' scores, while a Discriminant Function Analysis predicted with substantial accuracy (75.1%) whether students were high- or low-performing.
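To make the notion of "textual complexity indices" concrete, here is a tiny sketch computing three classic surface-level indices. The function name and the specific formulas are assumptions for illustration; ReaderBench's actual index set is far richer (lexical, syntactic, semantic, and discourse-level measures).

```python
# Illustrative surface-level textual complexity indices (not ReaderBench's).
import re

def complexity_indices(text):
    words = re.findall(r"[A-Za-z']+", text)
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    return {
        # Mean number of words per sentence.
        "avg_sentence_length": len(words) / max(len(sentences), 1),
        # Mean word length in characters.
        "avg_word_length": sum(len(w) for w in words) / max(len(words), 1),
        # Lexical diversity: unique words over total words.
        "type_token_ratio": len({w.lower() for w in words}) / max(len(words), 1),
    }

report = "The model was trained. It scored the essays automatically."
idx = complexity_indices(report)
```

In the pipeline described above, indices like these would be computed per section and subsection of each report and then fed into the regression and discriminant analyses.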