2021
DOI: 10.1016/j.autcon.2021.103896
Integrating feature engineering, genetic algorithm and tree-based machine learning methods to predict the post-accident disability status of construction workers

Cited by 58 publications (33 citation statements)
References 90 publications
“…According to the analysis results, the optimum hyperparameters were identified as 350, 3, 0.1, 5, 0.6, and 42, while searching the parameters with step sizes of 50, 1, 0.01, 1, 0.1, and 1, respectively. Figure 3 also illustrates the changes in the AUROC values with the number of trees, which is one of the most influential parameters of tree-based ML methods [31]. The AUROC curve of the proposed SGB model shows that the model achieved an AUROC value of 0.741 (Fig.…”
Section: Results (mentioning; confidence: 99%)
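The step-wise grid search described in this excerpt can be sketched as follows. Only the step sizes and the reported optimum come from the quote; the search ranges, the parameter names, and the `cv_auroc` scorer are illustrative assumptions:

```python
from itertools import product

# Step-wise grids mirroring the quoted step sizes (50, 1, 0.01, 1, 0.1);
# the ranges themselves are assumptions. The sixth parameter (random
# seed, reported optimum 42) is omitted from the grid for brevity.
grid = {
    "n_trees":       list(range(50, 401, 50)),         # step 50
    "max_depth":     list(range(1, 6)),                # step 1
    "learning_rate": [i / 100 for i in range(1, 11)],  # step 0.01
    "min_samples":   list(range(1, 11)),               # step 1
    "subsample":     [i / 10 for i in range(1, 11)],   # step 0.1
}

candidates = list(product(*grid.values()))

def cv_auroc(params):
    """Hypothetical scorer: fit an SGB model with `params` and return
    its cross-validated AUROC. Model fitting is out of scope here."""
    raise NotImplementedError

# The reported optimum (350, 3, 0.1, 5, 0.6) is one point of this grid.
print((350, 3, 0.1, 5, 0.6) in candidates)  # True
```

In practice the scorer would be evaluated at every grid point (or via a genetic algorithm, as in the cited article's title) and the best AUROC kept.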
“…The reason for applying the RUS algorithm instead of other commonly considered data resampling methods, such as random over-sampling (ROS) or the synthetic minority over-sampling technique (SMOTE), is its high performance, as reported in a comparative study [17]. Besides, artificial cases are not generated by the RUS method, unlike the ROS or SMOTE algorithms [31]. It is important to state that the resampling algorithm was not applied to the testing set, so that the model's performance could be examined under the imbalanced class distribution commonly observed in real-life conditions [32].…”
Section: Data Preprocessing (mentioning; confidence: 99%)
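Random under-sampling (RUS) as described in the excerpt can be sketched in pure NumPy; this is a minimal illustration, not the cited implementation. Note that it only discards majority-class rows (no artificial cases are generated) and, as the excerpt stresses, it would be applied to the training set only:

```python
import numpy as np

rng = np.random.default_rng(42)

# Toy imbalanced training data: 90 majority (0) vs 10 minority (1) rows.
y_train = np.array([0] * 90 + [1] * 10)
X_train = rng.normal(size=(100, 3))

def random_undersample(X, y, rng):
    """Minimal RUS sketch: keep all minority rows and a random subset
    of majority rows of equal size. Unlike ROS or SMOTE, no synthetic
    cases are created."""
    classes, counts = np.unique(y, return_counts=True)
    minority = classes[np.argmin(counts)]
    n_min = counts.min()
    keep = []
    for c in classes:
        idx = np.flatnonzero(y == c)
        if c != minority:
            idx = rng.choice(idx, size=n_min, replace=False)
        keep.append(idx)
    keep = np.concatenate(keep)
    return X[keep], y[keep]

# Balance the training split; the test split is left untouched.
X_bal, y_bal = random_undersample(X_train, y_train, rng)
print(np.bincount(y_bal))  # [10 10]
```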
“…The dataset after step (1) contains 34 categorical features, such as passenger gender, flight cabin, etc. Since machine learning models perform better on numerical features, the categorical features should be transformed into their numerical counterparts by encoding [15], [16]. Among them, the flight cabin is subjected to label encoding [17] due to its ordinal values.…”
Section B: Data Cleaning and Encoding (mentioning; confidence: 99%)
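Label encoding of an ordinal feature, as applied above to the flight cabin, amounts to a simple ordered mapping. The cabin levels and their order below are assumptions for illustration, not values from the cited paper:

```python
# Assumed ordinal cabin hierarchy (illustrative only).
cabin_order = ["economy", "premium_economy", "business", "first"]
encoding = {level: i for i, level in enumerate(cabin_order)}

# Label encoding preserves the order, unlike one-hot encoding,
# which would be the usual choice for nominal features like gender.
cabins = ["business", "economy", "first", "economy"]
encoded = [encoding[c] for c in cabins]
print(encoded)  # [2, 0, 3, 0]
```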
“…In the second stage, the dataset is resampled to restore class balance. In the related literature, random undersampling, random oversampling, or the synthetic minority oversampling technique (SMOTE) [15] are often used for resampling. Considering that the former two approaches carry the risk of data loss or model overfitting, SMOTE [20], based on the K-nearest-neighbor idea, is applied to balance the dataset.…”
Section: The Incomplete Data Processing Layer (mentioning; confidence: 99%)
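The K-nearest-neighbor idea behind SMOTE can be sketched in pure NumPy; this is a minimal illustration, not the cited implementation. Each synthetic case is interpolated at a random point on the segment between a minority sample and one of its k nearest minority neighbours:

```python
import numpy as np

rng = np.random.default_rng(0)

def smote(X_min, n_new, k, rng):
    """Minimal SMOTE sketch over the minority class only: pick a
    sample, pick one of its k nearest minority neighbours, and
    interpolate between the two."""
    synthetic = []
    for _ in range(n_new):
        i = rng.integers(len(X_min))
        d = np.linalg.norm(X_min - X_min[i], axis=1)
        neighbours = np.argsort(d)[1 : k + 1]  # skip the point itself
        j = rng.choice(neighbours)
        gap = rng.random()                     # position along the segment
        synthetic.append(X_min[i] + gap * (X_min[j] - X_min[i]))
    return np.array(synthetic)

# Generate 15 synthetic minority cases from 10 real ones.
X_minority = rng.normal(size=(10, 2))
X_new = smote(X_minority, n_new=15, k=3, rng=rng)
print(X_new.shape)  # (15, 2)
```

Because synthetic points lie between real minority samples, SMOTE avoids the exact-duplicate overfitting risk of random oversampling that the excerpt mentions.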
“…It is possible to classify the techniques as either image processing or machine learning. Filters, morphological analysis, statistical approaches, and percolation techniques are used in image-processing methods for crack detection [7] [8], and no model training process is necessary. With machine learning, however, a dataset of images is gathered and fed into the chosen machine learning model during the training phase.…”
Section: Introduction (mentioning; confidence: 99%)
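The training-free image-processing route (a global threshold followed by a morphological step) can be sketched in pure NumPy on a toy patch. The threshold value, the 3x3 structuring element, and the toy image are illustrative assumptions, not the cited pipelines:

```python
import numpy as np

def dilate(mask, iterations=1):
    """3x3 binary dilation via shifted ORs: a pixel becomes True if
    any of its 8 neighbours (or itself) was True."""
    out = mask.copy()
    h, w = out.shape
    for _ in range(iterations):
        padded = np.pad(out, 1)
        acc = np.zeros_like(out)
        for dy in (-1, 0, 1):
            for dx in (-1, 0, 1):
                acc |= padded[1 + dy : 1 + dy + h, 1 + dx : 1 + dx + w]
        out = acc
    return out

# Toy grayscale patch: dark pixels (low intensity) are crack candidates.
img = np.full((8, 8), 200, dtype=np.uint8)
img[3, 1:7] = 30                 # a thin dark "crack"
crack_mask = img < 100           # simple global threshold
cleaned = dilate(crack_mask)     # morphological step joins fragments
print(crack_mask.sum(), cleaned.sum())  # 6 24
```

No model is fit at any point, which is the contrast with the machine-learning route the excerpt describes.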