Sven Teresniak scite author profile

Sven Teresniak

5 Publications

46 Citation Statements Received

65 Citation Statements Given

Affiliations

Leipzig University

Publications

Towards Automatic Detection and Tracking of Topic Change

Holz

Teresniak

2010

Abstract. We present an approach for the automatic detection of topic change. The approach analyzes statistical features of topics in time-sliced corpora and their dynamics over time. Processing large amounts of time-annotated news text, we identify new facets in a stream of topics drawn from the latest news of public interest. Designed as an addition to the well-known task of topic detection and tracking, the approach aims to boil a daily news stream down to its novelty. To that end, we examine the contextual shift of concepts across time slices. To quantify the amount of change, we adopt the volatility measure from econometrics and propose a new algorithm for frequency-independent detection of topic drift and change of meaning. The proposed measure relies not on plain word frequency but on the mixture of a word's co-occurrences. The analysis is therefore largely independent of absolute word frequencies and works across the whole frequency spectrum, including low-frequency words. Aggregating the computed time-related data for the terms makes it possible to build overview illustrations of the most strongly evolving terms over a whole time span.
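The idea of measuring change through co-occurrences rather than raw frequency can be illustrated with a minimal sketch. It is not the authors' algorithm: the function names, the rank-based scoring (mean coefficient of variation of co-occurrence ranks), and the handling of words missing from a slice are all assumptions made for illustration.

```python
from statistics import mean, pstdev


def rank_cooccurrences(counts):
    """Map each co-occurring word to its rank (1 = strongest) in one time slice."""
    ordered = sorted(counts, key=lambda w: (-counts[w], w))
    return {w: i + 1 for i, w in enumerate(ordered)}


def volatility(slices):
    """Mean coefficient of variation of co-occurrence ranks across time slices.

    `slices` holds one dict per time slice, mapping a co-occurring word to
    its co-occurrence strength with the target term (hypothetical input format).
    """
    ranked = [rank_cooccurrences(s) for s in slices]
    vocab = set().union(*ranked)
    scores = []
    for w in vocab:
        # Assumption: a word absent from a slice gets the rank just past the last one.
        ranks = [r.get(w, len(r) + 1) for r in ranked]
        scores.append(pstdev(ranks) / mean(ranks))
    return mean(scores) if scores else 0.0
```

Because only ranks enter the score, multiplying all strengths in a slice by a constant leaves the result unchanged, which mirrors the frequency independence the abstract describes.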

Two-stage framework for a topology-based projection and visualization of classified document collections

Oesterling

Scheuermann

Teresniak

et al. 2010

Figure 1: Island-like visualization of a document point cloud's topological structure. By sharing similar dimensions, documents accumulate in subspaces of the high dimensional information space. Considering dimensions as words, clusters are assumed to describe topics, i.e., islands, in the final visualization.

Abstract. During the last decades, electronic textual information has become the world's largest and most important information source available. People have added a variety of daily newspapers, books, scientific and governmental publications, blogs and private messages to this wellspring of endless information and knowledge. Since neither the existing nor the new information can be read in its entirety, computers are used to extract and visualize meaningful or interesting topics and documents from this huge information clutter.

In this paper, we extend, improve and combine existing individual approaches into an overall framework that supports topological analysis of high dimensional document point clouds given by the well-known tf-idf document-term weighting method. We show that traditional distance-based approaches fail in very high dimensional spaces, and we describe an improved two-stage method for topology-based projections from the original high dimensional information space to both two dimensional (2-D) and three dimensional (3-D) visualizations. To show the accuracy and usability of this framework, we compare it to methods introduced recently and apply it to complex document and patent collections.
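For readers unfamiliar with the tf-idf weighting mentioned in the abstract, the following minimal sketch shows one common variant (relative term frequency times the natural log of inverse document frequency). The function name and the exact formula are assumptions for illustration; the paper may use a different normalization.

```python
import math
from collections import Counter


def tfidf(docs):
    """Return one {term: weight} dict per tokenized document.

    weight = (term count / document length) * ln(N / document frequency)
    """
    n = len(docs)
    # Document frequency: number of documents containing each term.
    df = Counter(term for doc in docs for term in set(doc))
    vectors = []
    for doc in docs:
        tf = Counter(doc)
        vectors.append(
            {t: (c / len(doc)) * math.log(n / df[t]) for t, c in tf.items()}
        )
    return vectors
```

Each distinct term becomes one dimension of the document point cloud, so a term that appears in every document receives weight zero and does not help separate topics.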

Disentangling from Babylonian Confusion – Unsupervised Language Identification

Biemann

Teresniak

2005

An Evaluation Measure for Distributed Information Retrieval Systems

Witschel

Holz

Heinrich

et al.

22. P2P-based Communication

Heyer

Holz

Teresniak

2012

scite is a Brooklyn-based organization that helps researchers better discover and understand research articles through Smart Citations, which display the context of each citation and describe whether the article provides supporting or contrasting evidence. scite is used by students and researchers from around the world and is funded in part by the National Science Foundation and the National Institute on Drug Abuse of the National Institutes of Health.

Contact Info

hi@scite.ai

10624 S. Eastern Ave., Ste. A-614

Henderson, NV 89052, USA
