Klemens Böhm scite author profile

HiCS: High Contrast Subspaces for Density-Based Outlier Ranking

Outlier mining is a major task in data analysis. Outliers are objects that highly deviate from regular objects in their local neighborhood. Density-based outlier ranking methods score each object based on its degree of deviation. In many applications, these ranking methods degenerate to random listings due to low contrast between outliers and regular objects. Outliers do not show up in the scattered full space; they are hidden in multiple high contrast subspace projections of the data. Measuring the contrast of such subspaces for outlier rankings is an open research challenge. In this work, we propose a novel subspace search method that selects high contrast subspaces for density-based outlier ranking. It is designed as a pre-processing step to outlier ranking algorithms. It searches for high contrast subspaces with a significant amount of conditional dependence among the subspace dimensions. With our approach, we propose a first measure for the contrast of subspaces. Thus, we enhance the quality of traditional outlier rankings by computing outlier scores in high contrast projections only. The evaluation on real and synthetic data shows that our approach outperforms traditional dimensionality reduction techniques, naive random projections, and state-of-the-art subspace search techniques, and provides enhanced quality for outlier ranking.

Towards Concise Models of Grid Stability

Arzamasov

Böhm

Jochem

2018

Mining Edge-Weighted Call Graphs to Localise Software Bugs

Eichinger

Böhm

Huber

An important problem in software engineering is the automated discovery of noncrashing occasional bugs. In this work we address this problem and show that mining of weighted call graphs of program executions is a promising technique. We mine weighted graphs with a combination of structural and numerical techniques. More specifically, we propose a novel reduction technique for call graphs which introduces edge weights. Then we present an analysis technique for such weighted call graphs based on graph mining and on traditional feature selection schemes. The technique generalises previous graph mining approaches as it allows for an analysis of weights. Our evaluation shows that our approach finds bugs which previous approaches cannot detect so far. Our technique also doubles the precision of finding bugs which existing techniques can already localise in principle.

FAS — a Freshness-Sensitive Coordination Middleware for a Cluster of OLAP Components

Röhm

Böhm

Schek

et al. 2002

CMI: An Information-Theoretic Contrast Measure for Enhancing Subspace Cluster and Outlier Detection

Nguyen

Müller

Vreeken

et al. 2013

In many real world applications data is collected in multi-dimensional spaces, with the knowledge hidden in subspaces (i.e., subsets of the dimensions). It is an open research issue to select meaningful subspaces without any prior knowledge about such hidden patterns. Standard approaches, such as pairwise correlation measures, or statistical approaches based on entropy, do not solve this problem; due to their restrictive pairwise analysis and loss of information in discretization they are bound to miss subspaces with potential clusters and outliers. In this paper, we focus on finding subspaces with strong mutual dependency in the selected dimension set. Chosen subspaces should provide a high discrepancy between clusters and outliers and enhance detection of these patterns. To measure this, we propose a novel contrast score that quantifies mutual correlations in subspaces by considering their cumulative distributions without having to discretize the data. In our experiments, we show that these high contrast subspaces provide enhanced quality in cluster and outlier detection for both synthetic and real world data.

Trading Quality for Time with Nearest-Neighbor Search

Weber

Böhm

2000

Towards Efficient Processing of General-Purpose Joins in Sensor Networks

Stern

Buchmann

Böhm

2009

Join processing in wireless sensor networks is difficult: As the tuples can be arbitrarily distributed within the network, matching pairs of tuples is communication intensive and costly in terms of energy. Current solutions only work well with specific placements of the nodes and/or make restrictive assumptions. In this paper, we present SENS-Join, an efficient general-purpose join method for sensor networks. To obtain efficiency, SENS-Join does not ship tuples that do not join, based on a filtering step. Our main contribution is the design of this filtering step which is highly efficient in order not to exhaust the potential savings. We demonstrate the performance of SENS-Join experimentally: The overall energy consumption can be reduced by more than 80%, as compared to the state-of-the-art approach. The per node energy consumption of the most loaded nodes can be reduced by more than an order of magnitude.

Ranking outlier nodes in subspaces of attributed graphs

Müller

Sánchez

Mülle

et al. 2013
