2020
DOI: 10.3390/e22030335

Weighted Mean Squared Deviation Feature Screening for Binary Features

Abstract: In this study, we propose a novel model-free feature screening method for ultrahigh dimensional binary features of binary classification, called weighted mean squared deviation (WMSD). Compared to Chi-square statistic and mutual information, WMSD provides more opportunities to the binary features with probabilities near 0.5. In addition, the asymptotic properties of the proposed method are theoretically investigated under the assumption log p = o(n). The number of features is practically selected by…

Cited by 4 publications (3 citation statements)
References 18 publications

“…However, the recent availability of more informative databases obtained from EHR opens up new research opportunities. These current databases containing records of hundreds of thousands of appointments allow the use of modern predictive techniques such as deep neural networks or novel binary classification algorithms for high-dimensional settings, such as [65, 66]. A second research line consists of developing and incorporating strategies that reduce the negative effects of class imbalance.…”
Section: Discussion (mentioning)
confidence: 99%
“…The SIS and ISIS are routinely being applied in ultra-high dimensional applications and have also been extended to more complex models [11–20]. However, one major drawback of the SIS or ISIS is their non-robust nature against data contamination as indicated already in the discussion of the original paper itself. This issue can be crucial when applying the method for screening of important genes from large-scale Omics data, which are often prone to at least a few outliers.…”
Section: Introduction (mentioning)
confidence: 99%
“…More specifically, by applying a certain condition, methods, or criteria to rank the features, then order them in a descending order based on the rank calculated while selecting the features highest in the order to represent the rest. The work in [7,8] and [9] are examples of filter methods to select features. On the other hand, wrapper methods relies on the continuous selection of various subsets of features from the feature space, and utilize them each to train a machine learning model and infer which subsets to choose and which to eliminate according to the resulting performance of the model.…”
Section: Introduction (mentioning)
confidence: 99%
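
The filter-style screening described in the last citation statement (score every feature, sort in descending order, keep the top of the ranking) can be illustrated with a minimal sketch. It uses the Pearson Chi-square statistic, one of the baselines named in the abstract, and not WMSD itself, whose formula is not given on this page. The function names `chi_square_scores` and `top_k_features` and the cutoff `k` are hypothetical choices made for illustration only.

```python
import numpy as np


def chi_square_scores(X, y):
    """Pearson chi-square statistic of each binary column of X against binary y.

    Illustrative baseline only; this is NOT the WMSD statistic from the paper.
    """
    X = np.asarray(X)
    y = np.asarray(y)
    n = X.shape[0]
    scores = np.zeros(X.shape[1])
    # Sum (observed - expected)^2 / expected over the four cells of each 2x2 table.
    for xv in (0, 1):
        for yv in (0, 1):
            observed = ((X == xv) & (y[:, None] == yv)).sum(axis=0)
            expected = (X == xv).sum(axis=0) * (y == yv).sum() / n
            # Cells with zero expected count (e.g. a constant feature) add nothing.
            safe = np.where(expected > 0, expected, 1.0)
            scores += np.where(expected > 0, (observed - expected) ** 2 / safe, 0.0)
    return scores


def top_k_features(X, y, k):
    """Rank features by score in descending order and keep the first k indices."""
    scores = chi_square_scores(X, y)
    order = np.argsort(-scores, kind="stable")
    return order[:k], scores
```

Called as `selected, scores = top_k_features(X, y, k=10)` on an n-by-p 0/1 matrix `X` and a 0/1 label vector `y`, this returns the indices of the ten highest-scoring features together with all p scores. The fixed `k` is a placeholder; how the paper chooses the number of retained features is cut off in the abstract above.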