2010
DOI: 10.1007/s10618-010-0188-4
Using background knowledge to rank itemsets

Abstract: Assessing the quality of discovered results is an important open problem in data mining. Such assessment is particularly vital when mining itemsets, since many of the discovered patterns can commonly be easily explained by background knowledge. The simplest approach to screening uninteresting patterns is to compare the observed frequency against the independence model. Since the parameters of the independence model are the column margins, we can view such screening as a way of using the column margins as background knowledge.
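The screening idea in the abstract can be sketched as follows: under the independence model, an itemset's expected frequency is the product of its items' column margins, and an observed frequency well above that expectation is not explained by the margins alone. This is an illustrative sketch, not the paper's actual ranking method; the dataset, function name, and lift score are assumptions for the example.

```python
from functools import reduce

def independence_score(transactions, itemset):
    """Compare an itemset's observed frequency to its expected
    frequency under the independence model, whose parameters are
    the column (item) margins."""
    n = len(transactions)
    # Column margins: marginal probability of each individual item.
    margins = {i: sum(i in t for t in transactions) / n for i in itemset}
    # Expected frequency if the items occurred independently.
    expected = reduce(lambda a, b: a * b, margins.values(), 1.0)
    # Observed frequency of the full itemset.
    observed = sum(itemset <= t for t in transactions) / n
    # Lift > 1 means the pattern exceeds what the margins predict.
    lift = observed / expected if expected > 0 else float("inf")
    return observed, expected, lift

# Toy data: {"a", "b"} co-occurs slightly more often than independence predicts.
data = [{"a", "b"}, {"a", "b"}, {"a"}, {"b"}, set()]
obs, exp, lift = independence_score(data, {"a", "b"})
```

Here each item margin is 3/5 = 0.6, so the independence model predicts a joint frequency of 0.36, while the observed frequency is 0.4, giving a lift slightly above 1.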

Cited by 24 publications (18 citation statements)
References 21 publications (24 reference statements)
“…Approaches that take account of background knowledge provide an important and closely related field of research [Jaroszewicz et al 2009;Tatti and Mampaey 2010;De Bie 2011].…”
Section: Other Related Approaches (mentioning)
Confidence: 99%
“…Indeed, most of pattern-based classification techniques focus on the sequential behavior and omit to take contextual and external knowledge into account. Though, recently, a new trend in the data mining field tries to incorporate expert knowledge in the process to improve the result quality [21,22,23]. Our approach clearly comes within this scope and we experimentally show that adding as much as available non-sequential information would lead to increase the classification performances.…”
Section: Genericity Of MSPC (mentioning)
Confidence: 75%
“…However, the number of mined frequent itemsets is typically very large, because it contains a lot of redundant or potentially irrelevant patterns. To generate a more compact set of frequent itemsets representing most significant yet non-redundant knowledge hidden in the analyzed data many research efforts have been made (e.g., [28], [29], [20], [30]). Given a minimum support threshold minsup and a maximum itemset model size K, we extract the top-K most interesting and non-redundant itemsets according to the entropy-based heuristics proposed in [20].…”
Section: B. Entropy-Based Itemset Mining (mentioning)
Confidence: 99%