SD-Map – A Fast Algorithm for Exhaustive Subgroup Discovery

Atzmueller, Martin; Puppe, Frank

doi:10.1007/11871637_6

Cited by 114 publications

(110 citation statements)

References 11 publications

(17 reference statements)

Citation statements by type: 110 mentioning

“…Such an extended quality function could be defined as $q_a(sd) = |ext(sd)|^a \cdot (t - t_0) \cdot |u(sd)|$, where $|u(sd)|$ is the user count for images in the respective subgroup. Unfortunately, such interestingness measures are not supported by efficient exhaustive algorithms for subgroup discovery, e.g., SD-Map [10] or BSD [11]. On the other hand, more basic algorithms, for example exhaustive depth-first search without a specialized data structure scale not very well for the problem setting of this paper, with thousands of tags as descriptions and possibly millions of instances in an interactive setting.…”

Section: Avoiding User Bias: User-resource Weighting (mentioning)

confidence: 99%

Describing Locations Using Tags and Images: Explorative Pattern Mining in Social Media

Lemmerich

Atzmueller

2012

Modeling and Mining Ubiquitous Social Media

Self Cite

Abstract. This paper presents an approach for explorative pattern mining in social media for describing image media based on tagging information and collaborative geo-reference annotations. We utilize pattern mining techniques for obtaining sets of tags that are specific for the specified point, landmark, or region of interest. Next, we show how these candidate patterns can be presented and visualized for interactive exploration using a combination of general pattern mining visualizations and views specialized on geo-referenced tagging data. We present a case study using publicly available data from the Flickr photo sharing platform.
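
The user-weighted quality function quoted in the citation statement above can be illustrated with a minimal Python sketch. It assumes the usual reading of the symbols, which the excerpt itself does not spell out: `t` is the target share inside the subgroup, `t0` is the target share in the whole dataset, and `a` is a size exponent. The function and parameter names are hypothetical and are not taken from the cited paper.

```python
def weighted_quality(ext_size, target_share, base_share, user_count, a=0.5):
    """Sketch of q_a(sd) = |ext(sd)|^a * (t - t0) * |u(sd)|.

    ext_size     -- |ext(sd)|, number of instances covered by the subgroup
    target_share -- t, target share inside the subgroup (assumed meaning)
    base_share   -- t0, target share in the whole dataset (assumed meaning)
    user_count   -- |u(sd)|, number of distinct users in the subgroup
    a            -- size exponent (hypothetical default)
    """
    return (ext_size ** a) * (target_share - base_share) * user_count


# Example: 16 covered images from 3 distinct users.
score = weighted_quality(16, 0.75, 0.25, 3, a=0.5)
```

Multiplying by the user count favours subgroups backed by many contributors, so a single prolific user cannot dominate the ranking on their own.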

“…The Dpsubgroup algorithm is only beaten by Algorithm 1 for sufficiently large differences in the search space. This behavior is due to the sophisticated data structures (fptrees [2,11]) Dpsubgroup uses in contrast to our algorithm. A further noteworthy fact is that unless Algorithm 1 ran out of memory (the oom entries) it always outperforms LCM/greedy.…”

Section: Empirical Evaluation (mentioning)

confidence: 99%

“…Subgroup discovery [2,12,17] is a local pattern discovery task: descriptions of subpopulations of a database are evaluated against some real-valued quality function, and those descriptions exceeding some given minimum quality are returned to the user. The quality functions commonly used in this course like Piatetsky-Shapiro, binomial test, or Gini-index (see [12] for a list) are functions of the extension of a subgroup description.…”

Section: Introduction (mentioning)

confidence: 99%

Non-redundant Subgroup Discovery Using a Closure System

Boley

Großkreutz

2009

Machine Learning and Knowledge Discovery in Databases

Abstract. Subgroup discovery is a local pattern discovery task, in which descriptions of subpopulations of a database are evaluated against some quality function. As standard quality functions are functions of the described subpopulation, we propose to search for equivalence classes of descriptions with respect to their extension in the database rather than individual descriptions. These equivalence classes have unique maximal representatives forming a closure system. We show that minimum cardinality representatives of each equivalence class can be found during the enumeration process of that closure system without additional cost, while finding a minimum representative of a single equivalence class is NP-hard. With several real-world datasets we demonstrate that search space and output are significantly reduced by considering equivalence classes instead of individual descriptions and that the minimum representatives constitute a family of subgroup descriptions that is of same or better expressive power than those generated by traditional methods.
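
The task definition quoted above (evaluate each description against a quality function and return those exceeding a minimum quality) can be sketched as a naive exhaustive search. This is not SD-Map and not the closure-system method of the citing paper; it is a brute-force illustration that assumes binary targets, attribute-value selectors, and the Piatetsky-Shapiro quality in its common form n * (p - p0). All names are hypothetical.

```python
from itertools import combinations


def piatetsky_shapiro(n, p, p0):
    """Piatetsky-Shapiro quality: subgroup size times target-share gain."""
    return n * (p - p0)


def discover(rows, target, selectors, min_quality, max_len=2):
    """Enumerate conjunctions of selectors and keep those above min_quality.

    rows      -- list of dicts, one per instance
    target    -- key of the binary (0/1) target attribute
    selectors -- list of (attribute, value) pairs
    """
    p0 = sum(r[target] for r in rows) / len(rows)  # target share overall
    results = []
    for k in range(1, max_len + 1):
        for desc in combinations(selectors, k):
            # Extension: instances matching every selector of the description.
            ext = [r for r in rows if all(r[attr] == val for attr, val in desc)]
            if not ext:
                continue
            p = sum(r[target] for r in ext) / len(ext)
            q = piatetsky_shapiro(len(ext), p, p0)
            if q > min_quality:
                results.append((desc, q))
    return sorted(results, key=lambda item: -item[1])


rows = [
    {"a": 1, "b": 1, "target": 1},
    {"a": 1, "b": 0, "target": 1},
    {"a": 0, "b": 1, "target": 0},
    {"a": 0, "b": 0, "target": 0},
]
found = discover(rows, "target", [("a", 1), ("b", 1)], min_quality=0.4)
```

Because the quality depends only on the extension, two descriptions with the same extension always receive the same score, which is the redundancy the abstract above proposes to remove.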

“…In [29], a recent review describing the SD task, the quality measures used, the approaches and the applications can be found. The SD task is somehow between descriptive and predictive induction, and different algorithms adapting classical algorithms of both classification - as CN2-SD [38] - and association rule learning - as Apriori-SD [33] or SD-MAP [8] - have been proposed. Nowadays, one of the most important aspect in SD is the measures to be used to evaluate the quality of the subgroups extracted.…”

Section: Introduction (mentioning)

confidence: 99%

Genetic lateral tuning for subgroup discovery with fuzzy rules using the algorithm NMEEF-SD

Carmona

González

Gacto

et al.

2012

IJCIS

The main objective of subgroup discovery is to discover interesting and interpretable patterns with respect to a specific property. The use of evolutionary fuzzy systems provides good algorithms to approach this problem. In this sense, the NMEEF-SD algorithm, one of the most representative evolutionary fuzzy systems for subgroup discovery, obtains precise and interpretable subgroups. However, in the majority of evolutionary fuzzy systems, the membership functions of the linguistic labels are fixed to static values and the partitions are not adapted to the context of each variable. In this paper, a post-processing tuning step to improve the results of the subgroup discovery algorithm NMEEF-SD is proposed, allowing the partitions to be adapted to the context of the variables. The application of this tuning step is a novelty in subgroup discovery and consists of a genetic algorithm which allows the lateral displacement of the membership functions of a label considering a unique parameter, using the 2-tuples linguistic representation. The results obtained using different data sets of the KEEL repository show the improvement in the performance of the NMEEF-SD algorithm with lateral displacement. The study is supported by statistical tests to improve the analysis performed.
