2018
DOI: 10.1017/s0269888918000127

Review and comparison of Apriori algorithm implementations on Hadoop-MapReduce and Spark

Abstract: Several Apriori algorithm implementations for mining association rules have been proposed in the literature using the Hadoop-MapReduce framework and, more recently, Spark. However, none of the works have made a detailed assessment of its performance, for example, comparing it with other implementations in various characteristics of data sets. In this work, we present a review of the main algorithms proposed for Hadoop-MapReduce and compared their implementations in a single environment under several different …

Cited by 12 publications
(6 citation statements)
References 18 publications
“…The proposed algorithm reduces the number of transactions and the time spent processing the dataset. Castro et al [33] compared alternative Apriori [35], and it succeeded in reducing the communication cost during the shuffle process by allocating the tasks across cluster nodes in a fair and effective way using search space division strategy. It showed better performance in running time, memory usage, and scalability.…”
Section: Related Work
confidence: 99%
“…In another work, the authors used the Apriori algorithm in three different execution approaches IMRAprioriAcc (Improved MapReduce Apriori Accelerated), DPC (Dynamic Passes Combined-Counting) and CPA (Complete Parallel Apriori) along with their adaption on Spark with different size datasets and varying cluster configuration [21]. Four performance metrics runtime, speed-up, size-up and scale-up are used for the performance evaluation of the Hadoop MapReduce and Spark.…”
Section: 1
confidence: 99%
“…Table 1 summing up the platforms comparison of the big data. Each platform has its advantages over the other, therefore, selecting the best platform depends on the big data characteristics and requirements [18] [13] [61] [25].…”
Section: Big Data Clustering Platforms
confidence: 99%