2018
DOI: 10.1587/transfun.e101.a.778

Naive Bayes Classifier Based Partitioner for MapReduce

Abstract: MapReduce is an effective framework for processing large datasets in parallel over a cluster. Data locality and data skew on the reduce side are two essential issues in MapReduce. Improving data locality can decrease network traffic by moving reduce tasks to the nodes where the reducer input data is located. Data skew will lead to load imbalance among reducer nodes. Partitioning is an important feature of MapReduce because it determines the reducer nodes to which map output results will be sent. Therefore, an …
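
To make the partitioning step in the abstract concrete, here is a minimal Python sketch of a plain hash partitioner and the reducer load it produces on skewed input. It is not the Naive Bayes based partitioner the paper proposes, whose details are not given in the truncated abstract. The function names `hash_partition` and `reducer_loads`, the sample keys, and the choice of CRC32 as the hash are all assumptions made for illustration.

```python
import zlib


def hash_partition(key: str, num_reducers: int) -> int:
    """Map a key to a reducer index with a stable hash modulo the reducer count.

    CRC32 is used here only because it is deterministic across runs;
    it is an illustrative choice, not what any particular framework uses.
    """
    return zlib.crc32(key.encode("utf-8")) % num_reducers


def reducer_loads(map_output, num_reducers: int) -> list:
    """Count how many (key, value) records each reducer would receive."""
    loads = [0] * num_reducers
    for key, _value in map_output:
        loads[hash_partition(key, num_reducers)] += 1
    return loads


# Hypothetical skewed map output: one key accounts for 90% of the records.
map_output = [("the", 1)] * 90 + [("skew", 1)] * 5 + [("locality", 1)] * 5

loads = reducer_loads(map_output, num_reducers=3)
print(loads)  # one reducer receives at least 90 of the 100 records
```

Because every record with the same key must go to the same reducer, a hash partitioner cannot spread a single hot key, so one reducer ends up with most of the work. This is the load imbalance the abstract refers to as data skew.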

Cited by 3 publications (1 citation statement)
References 15 publications
“…In Dekermanjian et al study, random forest was used as the model to train the classifier. And in this paper, four common models, Random Forest 26 – 28 , Plain Bayes 29 , 30 , XGBoost 31 – 33 and BP Neural Network 34 , 35 are compared for training. Random Forest (RF): Random Forest is a powerful machine learning method for classification by constructing multiple decision trees and integrating their predictions.…”
Section: Methods (mentioning)
confidence: 99%