2020
DOI: 10.21203/rs.3.rs-54646/v2
Preprint

CatBoost for Big Data: an Interdisciplinary Review

Abstract: Gradient Boosted Decision Trees (GBDTs) are a powerful tool for classification and regression tasks in Big Data. Researchers should be familiar with the strengths and weaknesses of current implementations of GBDTs in order to use them effectively and make successful contributions. CatBoost is a member of the family of GBDT machine learning ensemble techniques. Since its debut in late 2018, researchers have successfully used CatBoost for machine learning studies involving Big Data. We take this opportunity to r…


Cited by 25 publications (27 citation statements: 0 supporting, 27 mentioning, 0 contrasting)
References 54 publications (131 reference statements)
“…Trees (GBDT's) machine learning ensemble techniques [56]. All analysis was performed using R statistical language with Caret, XGBoost, SHAPforxgboost and CatBoost libraries.…”
Section: Discussion (citation type: mentioning)
confidence: 99%
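The study quoted above names its R toolchain (caret, xgboost, SHAPforxgboost, catboost) but not its data or settings. As a rough illustration only, the following Python sketch mirrors the two steps of that pipeline, fitting a gradient boosted tree model and computing SHAP feature attributions, using the xgboost and shap packages; the dataset and hyperparameters here are placeholders, not those of the cited work.

```python
import shap
import xgboost
from sklearn.datasets import load_breast_cancer
from sklearn.model_selection import train_test_split

# Stand-in dataset; the cited study's data is not specified here.
X, y = load_breast_cancer(return_X_y=True, as_frame=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

# Fit a gradient boosted tree classifier (the role xgboost plays in the R pipeline).
model = xgboost.XGBClassifier(n_estimators=200, max_depth=4)
model.fit(X_train, y_train)

# Per-prediction feature attributions (the role SHAPforxgboost plays in R).
explainer = shap.TreeExplainer(model)
shap_values = explainer.shap_values(X_test)
print(shap_values.shape)  # (n_samples, n_features)
```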
“…LightGBM is developed from the Gradient boosted decision trees (GBDT) model to create a better-performance model. The CatBoost model has emerged with the development of the Gradient Boosting model for high cardinality categorical variables [19].…”
Section: Methods (citation type: mentioning)
confidence: 99%
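The point about high-cardinality categorical variables is central to CatBoost's design: categorical columns are encoded with ordered target statistics rather than one-hot vectors, so they can be passed to the model directly. A minimal sketch of that interface using CatBoost's Python API; the toy rows and settings below are illustrative, not from the cited paper.

```python
from catboost import CatBoostClassifier, Pool

# Hypothetical toy rows: column 0 (a user id) is a high-cardinality categorical.
train_data = [["u1042", 3.2], ["u7781", 1.5], ["u0093", 2.7],
              ["u1042", 0.9], ["u5530", 4.1], ["u7781", 0.3]]
labels = [1, 0, 1, 0, 1, 0]

# cat_features marks categorical columns so CatBoost applies its ordered
# target-statistics encoding instead of requiring one-hot preprocessing.
train_pool = Pool(train_data, label=labels, cat_features=[0])

model = CatBoostClassifier(iterations=50, depth=3, learning_rate=0.1,
                           verbose=False)
model.fit(train_pool)
print(model.predict([["u1042", 2.0]]))
```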
“…These rankings are followed by a report of which features are in each of the 4 Agree, 5 Agree, 6 Agree, and 7 Agree datasets (Tables 47, 48, 49, 50, 51, 52, 53, 54, 55, 56, 57, 58). …for CatBoost default hyperparameter values we refer the reader to the CatBoost documentation, and for Light GBM default hyperparameter values, please consult their documentation.…”
Section: Appendix B (citation type: mentioning)
confidence: 99%
“…In all experiments, we employ the following eight learners: CatBoost [7], Light GBM [8], XGBoost [9], RF [10], DT [11], LR [12], NB [13], and a MLP [14]. To gauge the performance of these classifiers, the AUC and AUPRC metrics are used.…”
Citation type: mentioning
confidence: 99%
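The experiment quoted above lists its eight learners and two metrics but not its datasets or settings. The sketch below simply shows how the two reported metrics, AUC and AUPRC, are computed with scikit-learn for one of the listed learners (a random forest) on synthetic, imbalanced data; all names and parameters here are placeholders, not the cited study's configuration.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import average_precision_score, roc_auc_score
from sklearn.model_selection import train_test_split

# Synthetic, class-imbalanced stand-in data (not the cited study's datasets).
X, y = make_classification(n_samples=2000, weights=[0.9, 0.1], random_state=0)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, stratify=y, random_state=0)

# RF is one of the eight listed learners; default hyperparameters as placeholders.
clf = RandomForestClassifier(random_state=0).fit(X_train, y_train)
scores = clf.predict_proba(X_test)[:, 1]  # probability of the positive class

print("AUC:  ", roc_auc_score(y_test, scores))
# average_precision_score is the usual estimator of area under the
# precision-recall curve (AUPRC).
print("AUPRC:", average_precision_score(y_test, scores))
```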