2018
DOI: 10.22266/ijies2018.1031.21

Performance Improvement of PrePost Algorithm Based on Hadoop for Big Data

Abstract: With the explosive growth of data, using data mining techniques to mine association rules and uncover useful information hidden in large datasets has become increasingly important. Many existing data mining techniques derive association rules, and the knowledge they carry, by mining frequent itemsets, but with the rapid arrival of the big data era, traditional data mining algorithms can no longer meet the analysis needs of large data. Recently, the PrePost algorithm has been proposed, a new algor…

Cited by 6 publications (5 citation statements)
References 18 publications
“…Because the named node has a single failure problem, when the named node fails, the backup named node can assume the job of the named node [17].…”
Section: Hadoop Big Data Processing (mentioning)
confidence: 99%
“…Because the named node has a single failure problem, when the named node fails, the backup named node can assume the job of the named node [17].…”
Section: Hadoop Big Data Processing (mentioning)
confidence: 99%
“…PrePost algorithm (9) based on the concept of N-lists is used for association rule mining which presents a data structure named N-list, for storing the information related to Association rule mining. PrePost scans the database twice to construct a tree which generates the N-list of frequent 1-itemsets.…”
Section: Background Study (mentioning)
confidence: 99%
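
The two-scan construction described in the statement above can be sketched on a single machine as follows. This is a minimal illustration, not the cited paper's Hadoop implementation: the names `Node` and `build_nlists`, the alphabetical tie-breaking between equally frequent items, and the zero-based pre-order and post-order numbering are all assumptions made for this example. The first scan counts item supports; the second inserts each transaction's frequent items into a prefix tree, after which one traversal assigns pre-order and post-order ranks and collects, for every frequent item, its N-list of `(pre, post, count)` triples.

```python
from collections import Counter, defaultdict
from itertools import count


class Node:
    """One node of the prefix tree (hypothetical helper class)."""

    def __init__(self, item):
        self.item = item
        self.count = 0
        self.children = {}


def build_nlists(transactions, min_support):
    """Return {item: [(pre, post, count), ...]} for frequent 1-itemsets."""
    # Scan 1: count the support of every item.
    support = Counter(item for t in transactions for item in set(t))
    frequent = {i: c for i, c in support.items() if c >= min_support}
    # Order items by descending support; ties broken by name (assumption).
    ordered = sorted(frequent, key=lambda i: (-frequent[i], i))
    rank = {item: r for r, item in enumerate(ordered)}

    # Scan 2: insert the frequent items of each transaction into the tree.
    root = Node(None)
    for t in transactions:
        node = root
        for item in sorted((i for i in set(t) if i in rank), key=rank.get):
            node = node.children.setdefault(item, Node(item))
            node.count += 1

    # One traversal assigns pre-order and post-order ranks to every node.
    pre_counter, post_counter = count(), count()
    nlists = defaultdict(list)

    def visit(node):
        pre = next(pre_counter)
        for child in node.children.values():
            visit(child)
        post = next(post_counter)
        if node.item is not None:
            nlists[node.item].append((pre, post, node.count))

    visit(root)
    # Each N-list is kept sorted by ascending pre-order rank.
    return {item: sorted(codes) for item, codes in nlists.items()}


if __name__ == "__main__":
    db = [["a", "b", "c"], ["a", "b"], ["a", "c"], ["b"]]
    for item, codes in sorted(build_nlists(db, 2).items()):
        print(item, codes)
```

In this sketch the counts in an item's N-list sum to that item's support, which is what lets the supports of longer itemsets be derived by combining N-lists rather than rescanning the database.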
“…In 2018, we presented, HPrePostPlus algorithm [26], a better version of PrePost, based on Hadoop, which uses a HashMap to traverse efficiently through the PPC tree and enhance the N-list creation process. The HPrePostPlus algorithm is very powerful and surpasses the state-of-the-art algorithms, such as PrePost [6], MRPrePost [23], PFP [18], and negFIN [27].…”
Section: Related Work (mentioning)
confidence: 99%
“…In this section, the DisPrePost algorithm has been compared to two advanced algorithms, HPrePostPlus [26] and the well-known HFIM [29]. DisPrePost is the first implementation of the PrePost algorithm in the Spark framework, HPrePostPlus is a recent implementation of the Hadoop-based PrePost parallel algorithm [26] with good results, and HFIM is a typical implementation of the Spark-based Apriori parallel algorithm [29] with good performance. We evaluated speed performance by analyzing runtime and scalability.…”
Section: Performance Evaluation (mentioning)
confidence: 99%