2018
DOI: 10.1016/j.compeleceng.2017.10.008

Performance optimization of MapReduce-based Apriori algorithm on Hadoop cluster

Abstract: Many techniques have been proposed to implement the Apriori algorithm on MapReduce framework but only a few have focused on performance improvement. FPC (Fixed Passes Combined-counting) and DPC (Dynamic Passes Combined-counting) algorithms combine multiple passes of Apriori in a single MapReduce phase to reduce the execution time. In this paper, we propose improved MapReduce based Apriori algorithms VFPC (Variable Size based Fixed Passes Combined-counting) and ETDPC (Elapsed Time based Dynamic Passes Combined-…
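
To make the idea of combining passes concrete, here is a minimal single-process sketch. It is not the paper's implementation and it does not use Hadoop: the function names (`apriori_gen`, `combined_pass`) and the parameters `passes` and `min_count` are hypothetical, chosen only for illustration. It assumes that candidates for later passes are generated from the previous level's candidates, because their true support is not yet known, and that all levels are then counted in one scan of the transactions.

```python
from collections import Counter
from itertools import combinations


def apriori_gen(itemsets, k):
    """Join (k-1)-itemsets into k-itemsets whose (k-1)-subsets are all present."""
    out = set()
    for a in itemsets:
        for b in itemsets:
            union = a | b
            if len(union) == k and all(
                frozenset(s) in itemsets for s in combinations(union, k - 1)
            ):
                out.add(union)
    return out


def combined_pass(transactions, frequent_prev, k, passes, min_count):
    """Count candidates of sizes k .. k+passes-1 in a single scan (illustrative)."""
    # Build several candidate levels up front. Levels beyond the first are
    # derived from candidates rather than from confirmed frequent itemsets.
    levels = []
    base = frequent_prev
    for size in range(k, k + passes):
        base = apriori_gen(base, size)
        if not base:
            break
        levels.append(base)
    candidates = set().union(*levels)

    # "Map" step: one scan over the data counts every candidate level at once.
    counts = Counter()
    for transaction in transactions:
        items = frozenset(transaction)
        for candidate in candidates:
            if candidate <= items:
                counts[candidate] += 1

    # "Reduce" step: keep only itemsets that reach the minimum support count.
    return {c: n for c, n in counts.items() if n >= min_count}
```

The trade-off this sketch exposes is the one the abstract alludes to: fewer scans of the data, at the cost of counting some candidates that a strict level-wise Apriori would already have pruned.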

Cited by 60 publications (32 citation statements)
References 14 publications (31 reference statements)
“…As to the two mining algorithms, we can know that the most time-consuming part is to derive fuzzy large itemsets. To deal with this problem, MapReduce-based algorithms can be employed to improve the efficiency [25], [31]. For example, Martín et al presented a generic MapReduce framework for rule discovery [25], and Singh et al proposed a MapReduce-based Apriori algorithm for performance optimization on a Hadoop cluster [31].…”
Section: E Discussion (mentioning)
confidence: 99%
“…In other words, if {A} is not frequent, then {AB} is not frequent; if {AB} is frequent, then {A} and {B} are frequent. So, the non-relevant sets are removed early in the search space [43]. In our database, we have about 100,000 real values measured over two years, and at every hour (about 17,280 transactions).…”
Section: Principle Of Anti-monotony (mentioning)
confidence: 99%
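
The anti-monotony principle quoted above can be sketched in a few lines. This is an illustrative example only, not code from the cited work; the function names `has_infrequent_subset` and `generate_candidates` are hypothetical. A size-k candidate is discarded as soon as any of its (k-1)-subsets is missing from the set of frequent itemsets.

```python
from itertools import combinations


def has_infrequent_subset(candidate, frequent_prev):
    """Return True if any (k-1)-subset of the candidate is not frequent."""
    k = len(candidate)
    return any(
        frozenset(subset) not in frequent_prev
        for subset in combinations(candidate, k - 1)
    )


def generate_candidates(frequent_prev, k):
    """Join frequent (k-1)-itemsets into k-itemsets and prune by anti-monotony."""
    candidates = set()
    for a in frequent_prev:
        for b in frequent_prev:
            union = a | b
            if len(union) == k and not has_infrequent_subset(union, frequent_prev):
                candidates.add(union)
    return candidates
```

For instance, if {A, B} and {A, C} are frequent but {B, C} is not, the candidate {A, B, C} is pruned without ever being counted against the data.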
“…There is a small class of MapReduce-based Apriori algorithms [17,22,28,36] that are distinct from all of the above. Each aims to improve the performance over the traditional level-wise sequential or parallel Apriori but because they are focused in different aspects of the development (e.g., cloud storage, intelligent search), they have never been compared to realize their common property.…”
Section: Apriori Algorithms: Background and Remarks (mentioning)
confidence: 99%