2021
DOI: 10.3390/ijerph18168530

Comparison of Random Forest and Gradient Boosting Machine Models for Predicting Demolition Waste Based on Small Datasets and Categorical Variables

Abstract: Construction and demolition waste (DW) generation information has been recognized as a useful tool for waste management. Recently, numerous researchers have actively utilized artificial intelligence technology to establish accurate waste generation information. This study investigated the development of machine learning predictive models that can achieve predictive performance on small datasets composed of categorical variables. To this end, the random forest (RF) and gradient boosting…
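As a rough illustration of the modeling setup the abstract describes (small sample, categorical-only predictors, RF versus GBM), the sketch below assumes a scikit-learn workflow with one-hot encoding and leave-one-out scoring. All feature names and data are invented placeholders, not the study's demolition-waste dataset.

```python
# Hedged sketch, not the authors' code: RF vs. GBM regressors on a small,
# categorical-only dataset, scored with leave-one-out cross-validation.
import numpy as np
import pandas as pd
from sklearn.compose import ColumnTransformer
from sklearn.ensemble import GradientBoostingRegressor, RandomForestRegressor
from sklearn.model_selection import LeaveOneOut, cross_val_score
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import OneHotEncoder

rng = np.random.default_rng(0)
n = 50  # deliberately small, as in the paper's setting
X = pd.DataFrame({
    # Placeholder categorical predictors, not the study's actual variables.
    "structure_type": rng.choice(["masonry", "concrete", "timber"], n),
    "usage": rng.choice(["residential", "commercial"], n),
    "roof_type": rng.choice(["flat", "pitched"], n),
})
y = rng.gamma(2.0, 50.0, n)  # placeholder waste-generation target

encoder = ColumnTransformer(
    [("onehot", OneHotEncoder(handle_unknown="ignore"), list(X.columns))]
)
models = {
    "RF": RandomForestRegressor(n_estimators=200, random_state=0),
    "GBM": GradientBoostingRegressor(n_estimators=200, learning_rate=0.05,
                                     random_state=0),
}
for name, model in models.items():
    pipe = Pipeline([("encode", encoder), ("model", model)])
    # Leave-one-out CV: every observation serves as the test set once.
    scores = cross_val_score(pipe, X, y, cv=LeaveOneOut(),
                             scoring="neg_mean_absolute_error")
    print(f"{name}: LOOCV MAE = {-scores.mean():.2f}")
```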

Cited by 57 publications (40 citation statements) | References 40 publications
“…The GBM is a forward learning ensemble methodology, under the rationale that good predictive results can be obtained through increasingly refined approximations, and it builds regression trees on all factors assuming that each tree is built in parallel. Like ours, some studies have shown the excellent predictive performance of the GBM as compared with other models [33,34]. Besides the identification of the best machine learning model, the key finding of this survey was the selection of a minimal number of factors according to their important contributions, including age of children, eating speed, number of relatives with obesity, sweet drinking, and paternal education.…”
Section: Discussion
confidence: 85%
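The stagewise logic described in the quote can be made concrete with a short sketch: under squared-error loss, each new tree fits the residuals of the current ensemble, so the approximation is refined round by round. Note that the boosting stages themselves are inherently sequential; the "parallel" in the quoted description can only refer to parallelizing the construction of an individual tree. The function names below are hypothetical, and this is a from-scratch illustration, not the cited paper's implementation.

```python
# From-scratch sketch of GBM's stagewise refinement under squared-error loss.
import numpy as np
from sklearn.tree import DecisionTreeRegressor

def gbm_fit(X, y, n_stages=100, learning_rate=0.1, max_depth=2):
    y = np.asarray(y, dtype=float)
    f0 = y.mean()                    # stage 0: constant prediction
    pred = np.full(len(y), f0)
    trees = []
    for _ in range(n_stages):
        residual = y - pred          # negative gradient of squared error
        tree = DecisionTreeRegressor(max_depth=max_depth).fit(X, residual)
        pred += learning_rate * tree.predict(X)  # shrunken corrective step
        trees.append(tree)
    return f0, trees

def gbm_predict(X, f0, trees, learning_rate=0.1):
    return f0 + learning_rate * sum(t.predict(X) for t in trees)

# Tiny demo on synthetic data.
rng = np.random.default_rng(0)
X = rng.normal(size=(100, 3))
y = X[:, 0] ** 2 + rng.normal(scale=0.1, size=100)
f0, trees = gbm_fit(X, y)
print("train MSE:", np.mean((gbm_predict(X, f0, trees) - y) ** 2))
```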
“…To verify model performance, we applied leave-one-out cross-validation, a special case of k-fold cross-validation; this method can achieve more stable results than k-fold cross-validation for small datasets because it uses all samples for testing and training to ensure sufficient sample sizes [50,51,52,53,54,55].…”
Section: Methods
confidence: 99%
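In scikit-learn terms, the procedure the quote describes is the k-fold special case with k equal to the sample size. A minimal sketch, with a placeholder dataset and model:

```python
# Hedged sketch of leave-one-out CV: each of the n samples is held out once,
# so all data are used for both training and testing across the n folds.
import numpy as np
from sklearn.datasets import make_regression
from sklearn.ensemble import RandomForestRegressor
from sklearn.model_selection import LeaveOneOut

X, y = make_regression(n_samples=30, n_features=5, noise=10.0, random_state=0)
loo = LeaveOneOut()  # equivalent to KFold(n_splits=len(X))

abs_errors = []
for train_idx, test_idx in loo.split(X):
    model = RandomForestRegressor(n_estimators=100, random_state=0)
    model.fit(X[train_idx], y[train_idx])
    abs_errors.append(abs(model.predict(X[test_idx])[0] - y[test_idx][0]))

print(f"LOOCV MAE over {len(abs_errors)} folds: {np.mean(abs_errors):.2f}")
```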
“…Consequently, the risk of selecting misclassified samples for the training set rises, and a greater proportion of instances are correctly classified [40], [41]. However, boosting is a continual process of constructing classifiers enhanced by the weights of weak classifiers from previous rounds, which contributes to reducing dataset volatility and variability.…”
Section: Figure 2 Workflows of Boosting and Bagging Technique
confidence: 99%
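The contrast the quote draws can be sketched as follows: bagging draws independent bootstrap samples and trains members in isolation, while boosting reweights the data each round so that misclassified samples carry more weight in the next one. The AdaBoost-style update below is one common instantiation of that weighting idea, used here for illustration; it is not necessarily the exact scheme of [40], [41].

```python
# Sketch: bagging (uniform bootstrap resampling) vs. boosting (round-by-round
# reweighting that emphasizes misclassified samples), with decision stumps.
import numpy as np
from sklearn.datasets import make_classification
from sklearn.tree import DecisionTreeClassifier

X, y = make_classification(n_samples=200, random_state=0)
n = len(y)
rng = np.random.default_rng(0)

# Bagging: independent bootstrap samples, all points weighted equally.
bag = []
for _ in range(10):
    idx = rng.integers(0, n, n)  # sample with replacement
    bag.append(DecisionTreeClassifier(max_depth=1).fit(X[idx], y[idx]))
bag_pred = (np.mean([t.predict(X) for t in bag], axis=0) > 0.5).astype(int)

# Boosting: sequential rounds; misclassified samples gain weight.
w = np.full(n, 1.0 / n)
ensemble = []
for _ in range(10):
    stump = DecisionTreeClassifier(max_depth=1).fit(X, y, sample_weight=w)
    miss = stump.predict(X) != y
    err = w[miss].sum()                                # weighted error rate
    alpha = 0.5 * np.log((1 - err) / max(err, 1e-10))  # stump's vote weight
    w *= np.exp(np.where(miss, alpha, -alpha))         # upweight mistakes
    w /= w.sum()
    ensemble.append((alpha, stump))

# Weighted vote: alpha-weighted sum of +/-1 stump outputs, then threshold.
votes = sum(a * np.where(s.predict(X) == 1, 1.0, -1.0) for a, s in ensemble)
boost_pred = (votes > 0).astype(int)
print("bagging accuracy:", (bag_pred == y).mean())
print("boosting accuracy:", (boost_pred == y).mean())
```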