2023
DOI: 10.1109/tpds.2022.3221210

A Utility-Based Distributed Pattern Mining Algorithm With Reduced Shuffle Overhead

Cited by 3 publications (2 citation statements)
References 54 publications
“…The proposed algorithm reduces the number of transactions and the time spent processing the dataset. Castro et al [33] compared alternative Apriori [35], and it succeeded in reducing the communication cost during the shuffle process by allocating the tasks across cluster nodes in a fair and effective way using search space division strategy. It showed better performance in running time, memory usage, and scalability.…”
Section: Related Work (mentioning)
confidence: 99%
“…Li et al [12] studied the data skew in the Shuffle stage and proposed a Shuffle phase dynamic balance partitioning method based on reservoir sampling to sample and preprocess the intermediate data, predict the overall data skew and provide the overall partitioning strategy for application implementation, thus reducing the impact of data skew on the Spark performance. Kumar et al [13] studied the search space partitioning strategy of data parallelism. Based on the communication cost-effectiveness pattern mining algorithm, tasks can be allocated fairly and effectively among cluster nodes to reduce the communication cost generated during Shuffle.…”
Section: Related Work (mentioning)
confidence: 99%