2020
DOI: 10.1016/j.knosys.2020.105745

An information theoretic approach to quantify the stability of feature selection and ranking algorithms

Citations: cited by 13 publications (6 citation statements)
References: 49 publications
“…As examples for the applicability of Theorem 10, one can e.g. combine each of the divergence generators ϕ of Table 1 (except for the 9th row) with any of the optimization problems (8), (10), (11), (14), (16), (17); the needed distributions Π…”
Section: Condition | Citation type: mentioning
confidence: 99%
“…There is a vast literature on recent applications of the Jensen-Shannon divergence, for instance it appears exemplarily in Kvitsiani et al [208] for finding connections between the circuit-level function of different interneuron types in regulating the flow of information and the behavioural functions served by the cortical circuits, in Xu et al (2014) for browsing and exploration of video sequences, in Jenkinson et al [168] for the fundamental understanding of the epigenome that leads to a powerful approach for studying its role in disease and aging, in Martin et al [250] for the implementation of an evolutionary-based global localization filter for mobile robots, in Suo et al [354] for the revelation of critical regulators of cell identity in mice, in Abante et al [2] for the detection of biologically significant differences in DNA methylation between alleles associated with local changes in genetic sequences - for a better understanding of the mechanism of complex human diseases, in Afek et al [5] for revealing mechanisms by which mismatches can recruit transcription factors for modulating replication and repair activities in cells, in Alaiz-Rodriguez & Parnell [10] for the quantification of stability in feature selection and ranking algorithms, in Biau et al [53] for generative adversarial networks (GANs) in artificial intelligence and machine learning, in Carre et al [74] for the standardization of brain magnetic resonance (MR) images, in Chakraborty et al [75] for hierarchical clustering in foreign exchange FOREX markets (e.g. in periods of major international crises), in Chong et al [87] as part of a web-based platform for comprehensive analysis of microbiome data outputs, in Cui et al [101] for modelling latent friend recommendation in online social media, in Gholami & Hodtani [134] for refinements of safety-and-security-targeted location verification systems in wireless communication networks (e.g. in Intelligent Transportation Systems (ITSs) and vehicular technology), in Guo & Yuan [146] for accurate abnormality classification in semi-supervised Wireless Capsule Endoscopy (WCE) for digestive system cancer diagnosis, in Jiang et al [169] for the training of deep neural discriminative and generative networks used for designing and evaluating photonic devices, in Kartal et al [186] for uncovering the relationship between some genomic features and cell type-specific methylome diversity, in Laszlovszky et al [210] for investigating mechanisms of basal forebrain neurons which modulate synaptic plasticity, cortical processing, brain states and oscillations, in Lawson et al [211] for the improved understanding of some genetic circuit…”
Section: We Obtain | Citation type: mentioning
confidence: 99%
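
For readers unfamiliar with the quantity named in the statement above, the following is a minimal sketch of the Jensen-Shannon divergence between two discrete distributions, applied to two feature rankings. The helper `ranking_to_distribution` and its 1/rank weighting are hypothetical choices made only for illustration; they are not taken from the cited paper, and this is not presented as its stability measure.

```python
import math

def kl_divergence(p, q):
    """Kullback-Leibler divergence D(p || q) in bits; terms with p_i == 0 contribute 0."""
    return sum(pi * math.log2(pi / qi) for pi, qi in zip(p, q) if pi > 0)

def jensen_shannon(p, q):
    """Jensen-Shannon divergence in bits; with base-2 logs it lies in [0, 1]."""
    m = [(pi + qi) / 2 for pi, qi in zip(p, q)]  # mixture distribution
    return 0.5 * kl_divergence(p, m) + 0.5 * kl_divergence(q, m)

def ranking_to_distribution(ranking):
    """Hypothetical mapping (an assumption, not the paper's): the feature placed
    at rank r (1 = best) receives weight proportional to 1/r.
    `ranking` lists feature indices 0..n-1 ordered best-first."""
    weights = [0.0] * len(ranking)
    for position, feature in enumerate(ranking):
        weights[feature] = 1.0 / (position + 1)
    total = sum(weights)
    return [w / total for w in weights]

# Two rankings of the same three features, e.g. from two resampled datasets.
run_a = ranking_to_distribution([0, 1, 2])
run_b = ranking_to_distribution([2, 1, 0])
divergence = jensen_shannon(run_a, run_b)  # 0 would mean identical rankings
```

Under this sketch, identical rankings give a divergence of 0, and larger values indicate that the two runs disagree more about which features matter.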
“…Hence, some traditional FS methods have received considerable interest due to their ability to evaluate feature importance and select a certain number of top-ranked features. These methods include statistical test (e.g., analysis of variance (ANOVA) [8, 9] and Chi-Squared (CHI2) [10, 11]), correlation criteria (e.g., Pearson [12], Spearman [13, 14], Kendall [15, 16]), and information theory (e.g., symmetrical uncertainty (SU) [17], mutual information (MI) [18, 19], and entropy [20]). However, the statistical test and correlation criteria techniques only consider the correlation between features and labels, and the feature subsets are not appropriate because some highly correlated but redundant features are selected.…”
Section: Introduction | Citation type: mentioning
confidence: 99%
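
The redundancy problem described in the statement above can be seen in a small sketch of a correlation-based filter that keeps the top-k features by absolute Pearson correlation with the label. The function names and the toy data are illustrative assumptions, not drawn from the citing paper.

```python
import math

def pearson(x, y):
    """Pearson correlation coefficient; returns 0.0 if either input is constant."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    sd_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (sd_x * sd_y) if sd_x > 0 and sd_y > 0 else 0.0

def rank_features(columns, labels, k):
    """Score each feature column by |Pearson correlation| with the labels
    and return the indices of the k highest-scoring features."""
    scores = [abs(pearson(col, labels)) for col in columns]
    order = sorted(range(len(columns)), key=lambda j: scores[j], reverse=True)
    return order[:k]

# Toy data (hypothetical): feature 1 is just feature 0 doubled, so it adds no
# new information, yet it scores as highly as feature 0.
columns = [
    [1, 2, 3, 4],   # feature 0: correlated with the label
    [2, 4, 6, 8],   # feature 1: redundant copy of feature 0
    [1, 4, 2, 3],   # feature 2: uncorrelated with the label
]
labels = [0, 0, 1, 1]
top_two = rank_features(columns, labels, k=2)
```

Because each feature is scored only against the label, the filter selects both feature 0 and its redundant copy, which is the weakness the quoted passage points out.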
“…Meanwhile, the wrapped method is a model-based feature selection, which optimizes the search subset by evaluating the objective function of the model to obtain the best subset [12]. The advantage of the filtered approach is that it is computationally efficient while not relying on a specific classification algorithm; the disadvantage is that the performance is slightly worse [13]. Wrapped methods have the advantage of good performance but require a search of the feature space and face the problem of a long search time [14].…”
Section: Introduction | Citation type: mentioning
confidence: 99%