2012
DOI: 10.1007/s10618-012-0273-y

Diverse subgroup set discovery

Abstract: Large data is challenging for most existing discovery algorithms, for several reasons. First of all, such data leads to enormous hypothesis spaces, making exhaustive search infeasible. Second, many variants of essentially the same pattern exist, due to (numeric) attributes of high cardinality, correlated attributes, and so on. This causes top-k mining algorithms to return highly redundant result sets, while ignoring many potentially interesting results. These problems are particularly apparent with subgroup di…

citations
Cited by 92 publications
(96 citation statements)
references
References 40 publications
(53 reference statements)
“…[14,16]. The aim is to find groups of objects, called subgroups, for which the distribution over the labels is statistically different from that of the entire set of objects.…”
Section: Subgroup Discovery
Citation type: mentioning
confidence: 99%
“…IDSD builds upon Diverse Subgroup Set Discovery (DSSD) [25]. DSSD was proposed in an attempt to eliminate redundancy by using a diverse beam search.…”
Section: Integrating Interaction Into Search
Citation type: mentioning
confidence: 99%
“…For the setting without interaction, DSSD [25] was used with its default parameter settings (Table 1(a)). The results suffer from two severe problems.…”
Section: Case Study: Sports Analytics
Citation type: mentioning
confidence: 99%
“…Recent works [12,9,17] propose approaches to solve the redundancy and trivial patterns issues in frequent pattern mining. For instance, the first two works focus on finding a relevant or concise representation of sets of frequent patterns.…”
Section: Related Work
Citation type: mentioning
confidence: 99%