Mining top-rank-k frequent weighted itemsets using WN-list structures and an early pruning strategy

Vo, Bay; Bui, Huong Mai; Vo, Thanh; Le, Tuong

doi:10.1016/j.knosys.2020.106064

Cited by 15 publications

(6 citation statements)

References 46 publications


“…Therefore, Lee et al [6] used two new prefix tree structures, FWI-tree W and FWI-tree T, to propose two algorithms, FWI*WSD and FWI*TCD, respectively, for mining FWPs effectively. Later, using the N-list-based structure, Bui et al [7] proposed the algorithm NFWI for mining FWPs, Le et al [8] presented TFWIN+ for mining top-rank-k FWPs, and Bui et al [9] developed NFWCI for mining frequent weighted closed patterns (FWCPs).…”

Section: A. Mining Frequent Weighted Patterns (mentioning)

confidence: 99%


A Sliding Window-Based Approach for Mining Frequent Weighted Patterns Over Data Streams

Bui, Nguyen-Hoang, et al. 2021

IEEE Access

Self Cite


The mining of frequent weighted patterns (FWPs), which takes into account the different semantic significance (weight) of items, is more suitable for practice than the mining of frequent patterns. Therefore, it plays a vital role in real-world scenarios. However, there are several limitations when applying methods for mining FWPs designed for static data to growing datasets, especially data streams. Hence, this study proposes an algorithm for mining FWPs over data streams. First, we introduce the concept of mining FWPs over data streams via a sliding window model. Then, we introduce a modification of the weighted node tree (WN-tree), named SWN-tree, that can maintain information over data streams. Next, this study develops a method for mining FWPs over data streams employing a sliding window model based on SWN-tree. This method is called the FWPODS (Frequent Weighted Patterns Over Data Stream) algorithm. Finally, we conduct empirical experiments to compare the performance of our approach with that of the state-of-the-art algorithm (NFWI) for mining FWPs over data streams. The experimental results indicate that our approach outperforms the NFWI algorithm when running in batch mode in a sliding window.

INDEX TERMS: pattern mining, data streams, frequent weighted patterns, sliding window model.



“…The results of experiments in this study confirmed that NFWI performs better than the existing approaches for mining FWPs. Later, using the WN-list structure combined with an early pruning strategy, Le et al [8] and Bui et al [9] proposed TFWIN+ and NFWCI for mining top-rank-k FWPs and FWCPs, respectively.…”

Section: N-list-based Structures (mentioning)

confidence: 99%

A Sliding Window-Based Approach for Mining Frequent Weighted Patterns Over Data Streams

Bui, Nguyen-Hoang, et al. 2021

IEEE Access

Self Cite


“…The state-of-the-art algorithm (NFWI) was presented for this purpose. This structure was also employed by Vo et al [10] along with tidsets and diffsets to mine top-rank-k frequent weighted itemsets. This paper also uses threshold raising and early pruning strategies to amplify the efficacy of extracting top-rank-k frequent weighted items.…”

Section: Introduction (mentioning)

confidence: 99%

“…The proposed pruning approach relies on the construction and traversal of a set enumeration tree that adds to the memory consumption.

NL-ITP [8]: Uses N-list data structure to extract itemsets. Reduces the search space significantly to generate FITPs. Shows limited improvement in runtime on sparse datasets.

TFWIN+ [10]: Combining mining and ranking phases into one. Uses Tidset, Diffset, and WN-list structures to extract the required itemsets. Proposes threshold raising strategy and early pruning to effectively extract top rank-k frequent weighted items…”

Section: Introduction (mentioning)

confidence: 99%

A multithreaded hybrid framework for mining frequent itemsets

Poovan, Acharya, Reddy 2022

IJECE


Mining frequent itemsets is an area of data mining that has beguiled several researchers in recent years. Varied data structures such as Nodesets, DiffNodesets, NegNodesets, N-lists, and Diffsets are among a few that were employed to extract frequent items. However, most of these approaches fell short either in respect of run time or memory. Hybrid frameworks were formulated to repress these issues that encompass the deployment of two or more data structures to facilitate effective mining of frequent itemsets. Such an approach aims to exploit the advantages of either of the data structures while mitigating the problems of relying on either of them alone. However, limited efforts have been made to reinforce the efficiency of such frameworks. To address these issues this paper proposes a novel multithreaded hybrid framework comprising of NegNodesets and N-list structure that uses the multicore feature of today’s processors. While NegNodesets offer a concise representation of itemsets, N-lists rely on List intersection thereby speeding up the mining process. To optimize the extraction of frequent items a hash-based algorithm has been designed here to extract the resultant set of frequent items which further enhances the novelty of the framework.


“…Tao et al (2003) determined transaction weights by calculating the average weight of the items present in a transaction. That is to say, all methods of calculating item weights and their variants can be combined with this concept to obtain the transaction weights (Cengiz et al, 2019; Bui et al, 2018; Vo et al, 2020; Datta et al, 2021). The second category is the calculation of the transaction weight by establishing a relationship between the transaction weight and several indicators that can reflect the importance of the transaction.…”

Section: Introduction (mentioning)

confidence: 99%
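
The averaging rule described in the statement above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names, the item weights, and the toy transactions are hypothetical, and the weighted-support ratio (total weight of the transactions containing an itemset divided by the total weight of all transactions) is a definition commonly used in the frequent weighted itemset literature, not one quoted on this page.

```python
from typing import Dict, FrozenSet, List


def transaction_weight(transaction: FrozenSet[str],
                       item_weights: Dict[str, float]) -> float:
    """Average weight of the items in one transaction (the rule attributed to Tao et al)."""
    if not transaction:
        raise ValueError("transaction must contain at least one item")
    return sum(item_weights[item] for item in transaction) / len(transaction)


def weighted_support(itemset: FrozenSet[str],
                     transactions: List[FrozenSet[str]],
                     item_weights: Dict[str, float]) -> float:
    """Assumed definition: weight of the transactions containing the itemset,
    divided by the weight of all transactions."""
    weights = [transaction_weight(t, item_weights) for t in transactions]
    total = sum(weights)
    # A transaction counts toward the itemset when the itemset is a subset of it.
    covered = sum(w for t, w in zip(transactions, weights) if itemset <= t)
    return covered / total


# Hypothetical toy data, chosen only for illustration.
item_weights = {"A": 0.5, "B": 0.25, "C": 1.0}
transactions = [frozenset({"A", "B"}), frozenset({"A", "C"}), frozenset({"B"})]

if __name__ == "__main__":
    for t in transactions:
        print(sorted(t), transaction_weight(t, item_weights))
    print("ws({A}) =", weighted_support(frozenset({"A"}), transactions, item_weights))
```

With these toy values the three transaction weights are 0.375, 0.75, and 0.25, so any itemset's weighted support is its covered weight divided by 1.375.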

A novel consumer preference mining method based on improved weclat algorithm

Xin et al. 2021

JEC


Purpose: Conventional frequent itemsets mining ignores the fact that the relative benefits or significance of “transactions” belonging to different customers are different in most of the relevant applied studies, which leads to failure to obtain some association rules with lower support but from higher-value consumers. Because not all customers are financially attractive to firms, it is necessary that their values be determined and that transactions be weighted. The purpose of this study is to propose a novel consumer preference mining method based on conventional frequent itemsets mining, which can discover more rules from the high-value consumers.

Design/methodology/approach: In this study, the authors extend the conventional association rule problem by associating the “annual purchase amount” – “price preference” (AP) weight with a consumer to reflect the consumer’s contribution to a market. Furthermore, a novel consumer preference mining method, the AP-weclat algorithm, is proposed by introducing the AP weight into the weclat algorithm for discovering frequent itemsets with higher values.

Findings: The experimental results from the survey data revealed that compared with the weclat algorithm, the AP-weclat algorithm can make some association rules with low support but a large contribution to a market pass the screening by assigning different weights to consumers in the process of frequent itemsets generation. In addition, some valuable preference combinations can be provided for related practitioners to refer to.

Originality/value: This study is the first to introduce the AP-weclat algorithm for discovering frequent itemsets from transactions through considering AP weight. Moreover, the AP-weclat algorithm can be considered for application in other markets.

