2012
DOI: 10.1145/2382577.2382580

Summarizing data succinctly with the most informative itemsets

Abstract: Knowledge discovery from data is an inherently iterative process. That is, what we know about the data greatly determines our expectations, and therefore, what results we would find interesting and/or surprising. Given new knowledge about the data, our expectations will change. Hence, in order to avoid redundant results, knowledge discovery algorithms ideally should follow such an iterative updating procedure. With this in mind, we introduce a well-founded approach for succinctly summarizing data wit…

citations
Cited by 49 publications
(57 citation statements)
references
References 44 publications
“…Cues may be taken from recent developments in pattern set mining, where algorithms have been proposed that can mine high-quality results directly from data [Smets and Vreeken 2012;Akoglu et al 2012;Mampaey et al 2012]. …”
Section: Discussion (mentioning)
confidence: 99%
“…All typically return more, and in particular more specific patterns than BMF. Wang and Parthasarathy [2006] and Mampaey et al [2012] propose algorithms for summarizing data with sets of itemsets and frequencies. To this end, they construct a probabilistic model for the rows of the data by the maximum entropy principle, and iteratively mine itemsets that maximize the likelihood of the data under the model, while controlling complexity through BIC or MDL scores.…”
Section: Pattern-based Summarization (mentioning)
confidence: 99%
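
The procedure described in the statement above (a maximum entropy model over rows, itemsets added one at a time, complexity controlled by BIC) can be illustrated with a small sketch. This is not the cited authors' implementation: the function names `fit_maxent`, `bic`, and `greedy_summary` are hypothetical, the model is fitted by brute-force iterative scaling over all 2^n possible rows (only feasible for a handful of items), the candidate search is exhaustive rather than heuristic, and the BIC form used (negative log-likelihood plus k/2 · log N) is an assumption.

```python
import itertools
import math


def fit_maxent(n_items, constraints, n_iter=200):
    """Fit a maximum entropy distribution over all binary rows of length
    n_items such that each itemset's probability matches its target
    frequency. Uses iterative scaling; assumes every target is strictly
    between 0 and 1."""
    space = list(itertools.product([0, 1], repeat=n_items))
    p = {t: 1.0 / len(space) for t in space}
    for _ in range(n_iter):
        for itemset, target in constraints:
            current = sum(v for t, v in p.items() if all(t[i] for i in itemset))
            up = target / current                  # rescale rows containing the itemset
            down = (1.0 - target) / (1.0 - current)  # rescale all other rows
            for t in space:
                p[t] *= up if all(t[i] for i in itemset) else down
    return p


def bic(data, p, k):
    """Assumed BIC form: negative log-likelihood plus k/2 * log(#rows)."""
    return -sum(math.log(p[row]) for row in data) + 0.5 * k * math.log(len(data))


def greedy_summary(data, n_items):
    """Repeatedly add the itemset that lowers BIC the most; stop when
    no candidate improves the score."""
    n = len(data)
    candidates = []
    for size in range(1, n_items + 1):
        for items in itertools.combinations(range(n_items), size):
            freq = sum(all(row[i] for i in items) for row in data) / n
            if 0.0 < freq < 1.0:  # skip degenerate constraints
                candidates.append((frozenset(items), freq))

    summary = []
    best = bic(data, fit_maxent(n_items, summary), 0)
    while True:
        best_cand = None
        for cand in candidates:
            if cand in summary:
                continue
            trial = summary + [cand]
            score = bic(data, fit_maxent(n_items, trial), len(trial))
            if score < best - 1e-9:
                best, best_cand = score, cand
        if best_cand is None:
            return summary, best
        summary.append(best_cand)


# Toy data: items 0 and 1 always co-occur, item 2 is independent.
data = [(1, 1, 0), (1, 1, 1), (1, 1, 0), (1, 1, 1),
        (0, 0, 0), (0, 0, 1), (0, 0, 0), (0, 0, 1)]
summary, score = greedy_summary(data, 3)
print([sorted(s) for s, _ in summary], round(score, 2))
```

On this toy data the pair {0, 1} is the first itemset selected, because it is the only single constraint that changes the uniform model enough to pay for its BIC penalty.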
“…Note that, following our generalised anomaly score for class 1 anomalies, any method that provides a probability for a transaction can be used. Examples based on pattern sets are those of Wang and Parthasarathy [28] and Mampaey et al [16].…”
Section: Related Work (mentioning)
confidence: 99%
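
The point that any method providing a probability for a transaction can serve as an anomaly scorer can be sketched as follows. The score used here, the negative base-2 log-probability, is an assumption made for illustration and is not taken from the citing paper; the independence model is a deliberately simple stand-in for a pattern-based model, and the names `independence_model` and `anomaly_score` are hypothetical.

```python
import math


def independence_model(data, n_items):
    """Stand-in probability model: items are treated as independent,
    with Laplace-smoothed frequencies so no transaction gets probability 0."""
    n = len(data)
    freqs = [(sum(row[i] for row in data) + 1) / (n + 2) for i in range(n_items)]

    def prob(transaction):
        return math.prod(f if x else 1.0 - f for f, x in zip(freqs, transaction))

    return prob


def anomaly_score(prob, transaction):
    """Assumed score: the less probable a transaction, the higher its score."""
    return -math.log2(prob(transaction))


# Item 0 is present in every training row, so a row lacking it looks unusual.
data = [(1, 0), (1, 1), (1, 0), (1, 1)]
prob = independence_model(data, 2)
print(anomaly_score(prob, (1, 0)), anomaly_score(prob, (0, 0)))
```

Any other model exposing a `prob(transaction)` function, including one built from a set of itemsets, could be passed to `anomaly_score` unchanged.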
“…KRIMP [27] and SLIM [24] are two deterministic algorithms that heuristically optimise this score. Other pattern set mining techniques, especially those that mine patterns characteristic for the data such as [16,9,28], are also meaningful choices to be used with UPC.…”
Section: Related Work (mentioning)
confidence: 99%
“…One has to realize, however, that enumerating the pattern space can actually in itself already be infeasible. To decrease redundancy in pattern collections even more, pattern set mining algorithms became increasingly important [1,20,17]. The goal of pattern set mining techniques is to find a small collection of patterns that are interesting together, rather than on their own.…”
Section: Introduction (mentioning)
confidence: 99%