2023
DOI: 10.11591/ijece.v13i3.pp3359-3366

Multivariate sample similarity measure for feature selection with a resemblance model

Abstract: Feature selection improves the classification performance of machine learning models. It also identifies the important features and eliminates those with little significance. Furthermore, feature selection reduces the dimensionality of training and testing data points. This study proposes a feature selection method that uses a multivariate sample similarity measure. The method selects features with significant contributions using a machine-learning model. The multivariate sample similarity measure is evaluated…
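The abstract is truncated, so the paper's exact multivariate sample similarity measure is not shown here. As a minimal sketch of the general idea — ranking features by how similar each one is to the target and keeping the top-k — the snippet below uses absolute Pearson correlation as a stand-in similarity score; the function name and the choice of correlation are illustrative assumptions, not the paper's method:

```python
import numpy as np

def select_features_by_similarity(X, y, k=5):
    """Rank features by the absolute Pearson correlation between each
    feature column and the target, then keep the top-k column indices.

    A generic similarity-based filter used only to illustrate the idea;
    it is not the paper's multivariate sample similarity measure.
    """
    scores = np.array([
        abs(np.corrcoef(X[:, j], y)[0, 1]) for j in range(X.shape[1])
    ])
    # Indices of the k highest-scoring features, best first.
    return np.argsort(scores)[::-1][:k]

# Toy example: feature 0 drives the target, features 1-2 are noise.
rng = np.random.default_rng(0)
X = rng.normal(size=(100, 3))
y = 2.0 * X[:, 0] + 0.1 * rng.normal(size=100)
selected = select_features_by_similarity(X, y, k=1)
```

On this toy data the filter keeps only the informative feature, which mirrors the abstract's claim that selecting significant features reduces dimensionality before training.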

Cited by 9 publications (8 citation statements) · References 20 publications
“…Furthermore, authors' findings aligned with previous research where XGBoost was effectively applied in socioeconomical aspects namely medicine [33], [34], economy [35], cybersecurity [36], language processing [37] and environmental applications [38]. Regarding medical applications, feature selection and XGBoost was considered the most effective solution for heart disease classification with 99.6% accuracy [39] improving the solution of [40] where the proposed decision trees provided 97.75% accuracy. Additionally, a similar framework was implemented in [41] for diabetes prediction with the presented approach resulting in an area under curve (AUC) of 82%.…”
Section: Results (supporting, confidence: 77%)
“…Even though the study has achieved higher accuracy compared to the previous work, the result still has scope for improvement toward more accurate prediction of HD risk. The LRM proves to have an accuracy score of 86.11% for coronary heart disease risk prediction [15], [16]. The accuracy of the LRM model improves when trained on features that are highly correlated to HD risk.…”
Section: Introduction (mentioning, confidence: 93%)
“…The experimental results obtained in different studies [22]-[26] suggest that the performance of the machine learning method improves with feature selection and preprocessing. During preprocessing, the missing values are replaced or removed, and the class distribution of the dataset is examined.…”
Section: Introduction (mentioning, confidence: 99%)
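The preprocessing steps the quote above describes — replacing missing values and examining the class distribution — can be sketched minimally as follows. The mean-imputation choice and function name are illustrative assumptions, not the cited studies' exact procedure:

```python
import numpy as np
from collections import Counter

def preprocess(X, y):
    """Replace missing values (NaN) with the column mean and report the
    class distribution, mirroring the two preprocessing steps described
    in the quoted statement (imputation is one way to 'replace' values).
    """
    X = X.copy()
    col_means = np.nanmean(X, axis=0)          # per-column mean, NaNs ignored
    nan_rows, nan_cols = np.where(np.isnan(X)) # locations of missing entries
    X[nan_rows, nan_cols] = col_means[nan_cols]
    return X, Counter(y.tolist())              # class label -> count

# Toy example: one missing value, balanced binary labels.
X = np.array([[1.0, np.nan], [3.0, 4.0]])
y = np.array([0, 1])
X_clean, dist = preprocess(X, y)
```

Examining `dist` before training flags class imbalance, which — as several of the cited studies note — affects classifier performance as much as the imputation itself.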
“…Correspondingly, another research by Assegie et al [6] found that feature selection improves the effectiveness of the extreme gradient boosting (XGBoost) model for matching patterns between the predictor and predicted feature in the CHD dataset. The result shows that the XGBoost model achieves 99.6% accuracy in generalizing the presence or absence of CHD.…”
Section: ISSN: 2252-8938 (mentioning, confidence: 99%)