A survey of itemset mining

Fournier‐Viger, Philippe; Lin, Jerry Chun‐Wei; Vo, Bay; Truong, Tin; Zhang, Ji; Lê, Hoài Bắc

doi:10.1002/widm.1207

Cited by 193 publications

(156 citation statements)

References 111 publications

Supporting

Mentioning

137

Contrasting

Unclassified


“…The aim of this article is to show the improvements addressed during the last 25 years, that is, since the FIM task was first described (Agrawal et al, ). While some reviews have been already proposed in literature (Chee, Jaafar, Aziz, Hasan, & Yeoh, ; Fournier‐Viger et al, ), they are mainly focused on sequential exhaustive search approaches and on describing the algorithms for nonexpert users. In this sense, it is our understanding that an analysis from an expert point of view that involves any existing methodology (exhaustive and nonexhaustive search) on any architecture (centralized and parallel) is necessary to comprehend which improvements have been proposed over time.…”

Section: Lesson Learned (mentioning)

confidence: 99%

“…Since 1993, when the first FIM algorithm was released (Agrawal et al, 1993), a special attention has been given to the performance of novel algorithms in this field (Borgelt, 2012;Fournier-Viger, Lin, Vo, Truong et al, 2017). Nowadays, 25 years later, extremely large datasets can be analyzed in a few seconds and this is not only a matter of novel architectures and hardware progresses but also a consequence of the proposed algorithmic solutions.…”

Section: Introduction (mentioning)

confidence: 99%


Frequent itemset mining: A 25 years review

Luna

Fournier-Viger

Ventura

2019

WIREs Data Min & Knowl

Self Cite

170


Frequent itemset mining (FIM) is an essential task within data analysis since it is responsible for extracting frequently occurring events, patterns, or items in data. Insights from such pattern analysis offer important benefits in decision‐making processes. However, algorithmic solutions for mining such kind of patterns are not straightforward since the computational complexity exponentially increases with the number of items in data. This issue, together with the significant memory consumption that is present in the mining process, makes it necessary to propose extremely efficient solutions. Since the FIM problem was first described in the early 1990s, multiple solutions have been proposed by considering centralized systems as well as parallel (shared or nonshared memory) architectures. Solutions can also be divided into exhaustive search and nonexhaustive search models. Many of such approaches are extensions of other solutions and it is therefore necessary to analyze how this task has been considered during the last decades. This article is categorized under: Algorithmic Development > Association Rules; Technologies > Association Rules
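To make the abstract's point about exponential growth concrete, here is a minimal sketch of exhaustive frequent itemset mining. It is a naive illustration, not one of the algorithms surveyed; the function name `frequent_itemsets`, the `min_support` parameter, and the toy transactions are all assumptions introduced for this example.

```python
from itertools import combinations


def frequent_itemsets(transactions, min_support):
    """Naive exhaustive search: test every non-empty itemset.

    With n distinct items there are 2**n - 1 candidates, which is
    why the cost grows exponentially with the number of items.
    """
    items = sorted(set().union(*transactions))
    result = {}
    for size in range(1, len(items) + 1):
        for candidate in combinations(items, size):
            # Support = number of transactions containing the candidate.
            support = sum(1 for t in transactions if set(candidate) <= t)
            if support >= min_support:
                result[candidate] = support
    return result


# Hypothetical toy database of four transactions.
transactions = [
    {"bread", "milk"},
    {"bread", "butter", "milk"},
    {"butter", "milk"},
    {"bread", "butter"},
]
found = frequent_itemsets(transactions, min_support=2)
```

With three distinct items the search already visits seven candidates; each additional item doubles the candidate space, which is the motivation for the pruning strategies the reviewed algorithms rely on.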


“…3. It follows breadth-first search approach which is quite costly in terms of memory utilization [41].…”

Section: Utility Of Itemset (U) = Internal Utility (I) * External Uti (mentioning)

confidence: 99%

A Survey on High Utility Itemsets Mining

Ninoria

Thakur

2017

IJCA


“…In many real‐world applications, data mining techniques are used to extract interesting patterns from databases, to support crucial decision‐making. Two fundamental tasks for revealing interesting relationships between items in transactional databases are frequent itemset mining (FIM) and association rule mining (ARM) (Agrawal, Imielinski, & Swami, ; Chen, Han, & Yu, ; Fournier‐Viger et al, ). The most well‐known ARM algorithms are Apriori (Agrawal & Srikant, ) and FP‐Growth (Han, Pei, Yin, & Mao, ).…”

Section: Introduction (mentioning)

confidence: 99%

“…To the best of our knowledge, this is the first survey on the mining task of incremental high‐utility itemset mining. The methods discussed in this article are not only important for iHUIM (Ahmed et al, ; Fournier‐Viger et al, ; Lin et al, ), but can also serve as inspiration for other data mining tasks (Fournier‐Viger et al, ), including incremental data mining (Hong et al, ) and dynamic data mining (Lin et al, ). The major contributions of this paper are threefold. A taxonomy of the most common approaches for mining HUIs in static databases, including Apriori‐based, tree‐based, projection‐based, hybrid, and other approaches, is presented.…”

Section: Introduction (mentioning)

confidence: 99%

A survey of incremental high‐utility itemset mining

Gan

Lin

Fournier-Viger

et al. 2018

WIREs Data Min & Knowl

Self Cite

130


Traditional association rule mining has been widely studied. But it is unsuitable for real‐world applications where factors such as unit profits of items and purchase quantities must be considered. High‐utility itemset mining (HUIM) is designed to find highly profitable patterns by considering both the purchase quantities and unit profits of items. However, most HUIM algorithms are designed to be applied to static databases. But in real‐world applications such as market basket analysis and business decision‐making, databases are often dynamically updated by inserting new data such as customer transactions. Several researchers have proposed algorithms to discover high‐utility itemsets (HUIs) in dynamically updated databases. Unlike batch algorithms, which always process a database from scratch, incremental high‐utility itemset mining (iHUIM) algorithms incrementally update and output HUIs, thus reducing the cost of discovering HUIs. This paper provides an up‐to‐date survey of the state‐of‐the‐art iHUIM algorithms, including Apriori‐based, tree‐based, and utility‐list‐based approaches. To the best of our knowledge, this is the first survey on the mining task of incremental high‐utility itemset mining. The paper also identifies several important issues and research challenges for iHUIM. WIREs Data Mining Knowl Discov 2018, 8:e1242. doi: 10.1002/widm.1242. This article is categorized under: Algorithmic Development > Association Rules; Application Areas > Data Mining Software Tools; Fundamental Concepts of Data and Knowledge > Knowledge Representation
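The utility notion described in this abstract (purchase quantity combined with unit profit) can be sketched as follows. This is a minimal illustration under assumed data, not the method of any surveyed algorithm; `itemset_utility`, the item names, and the profit values are hypothetical.

```python
def itemset_utility(itemset, transactions, unit_profit):
    """Sum quantity * unit profit over every transaction that
    contains all items of the itemset."""
    total = 0
    for quantities in transactions:
        if all(item in quantities for item in itemset):
            total += sum(quantities[item] * unit_profit[item] for item in itemset)
    return total


# Hypothetical unit profits (external utility) per item.
unit_profit = {"pen": 1, "notebook": 3, "stapler": 10}

# Hypothetical transactions mapping item -> purchased quantity (internal utility).
transactions = [
    {"pen": 5, "notebook": 2},
    {"pen": 1, "stapler": 1},
    {"pen": 2, "notebook": 1, "stapler": 2},
]

pen_notebook = itemset_utility(("pen", "notebook"), transactions, unit_profit)
stapler_only = itemset_utility(("stapler",), transactions, unit_profit)
```

In this toy data the itemset {pen, notebook} appears in two transactions yet earns less than {stapler} alone, which is the kind of pattern that frequency-based mining would rank the other way around.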

