2008
DOI: 10.2174/157340908784533238

Variable Selection in QSAR Models for Drug Design

Cited by 17 publications
(7 citation statements)
References 0 publications
“…Feature selection techniques have been successfully applied in many real-world applications, such as large-scale biological data analysis [24, 25, 26], text classification [27], information retrieval [28], near-infrared spectroscopy [29], mass spectroscopy data analysis [30], drug design [31, 32], and especially the quantitative structure-activity relationship (QSAR) modeling [33, 34]. In cancer research community, feature selection has also been widely applied in different omics data analyses: mRNA data [9, 35], miRNA data [36, 37], whole exome sequencing data [38], DNA-methylation data [39, 40], and proteomics data [41, 42].…”
Section: Feature Selection Techniques
Citation type: mentioning
confidence: 99%
“…A high value of the statistical feature (R² > 0.5) in the cross-validations is considered proof of the high predictive ability of a model. Within the data analysis stage, the partial least squares (PLS), the multivariate linear regression (MLR), and the artificial neural network (ANN) are the techniques used for the selection of a subset of the most relevant molecular descriptors [24].…”
Section: Introduction
Citation type: mentioning
confidence: 99%
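
The statement above treats a cross-validated R² above 0.5 as evidence of predictive ability. As a rough illustration only, the sketch below computes a leave-one-out cross-validated R² (often written q²) for a single-descriptor linear model. The data, the function names `fit_line` and `loo_q2`, and the choice of the full-sample mean in the denominator are all assumptions made for this example; they are not taken from the cited papers.

```python
from statistics import mean


def fit_line(xs, ys):
    """Ordinary least squares fit of y = intercept + slope * x (one descriptor)."""
    mx, my = mean(xs), mean(ys)
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return my - slope * mx, slope


def loo_q2(xs, ys):
    """Leave-one-out cross-validated R^2: 1 - PRESS / total sum of squares.

    Assumption: the total sum of squares uses the mean of all observations,
    which is one common convention among several.
    """
    press = 0.0
    for i in range(len(xs)):
        # Refit the model with compound i held out, then predict compound i.
        train_x = xs[:i] + xs[i + 1:]
        train_y = ys[:i] + ys[i + 1:]
        intercept, slope = fit_line(train_x, train_y)
        press += (ys[i] - (intercept + slope * xs[i])) ** 2
    my = mean(ys)
    ss_tot = sum((y - my) ** 2 for y in ys)
    return 1.0 - press / ss_tot


# Hypothetical toy data: one descriptor value and one activity per compound.
descriptor = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
activity = [1.2, 1.9, 3.2, 3.8, 5.1, 6.1]

q2 = loo_q2(descriptor, activity)
print(f"LOO q2 = {q2:.3f}; above the 0.5 threshold: {q2 > 0.5}")
```

In practice a QSAR model would use many descriptors and a library routine for the regression; the single-descriptor case is used here only to keep the cross-validation loop easy to follow.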
“…Assessment of descriptor importance with respect to the response variable is well established within the QSAR community as a method to provide mechanistic insight [2]. ML methods such as partial least-squares [3], random forest [4], and artificial neural networks [5], as well as entropy (Gini index) and near neighbor based (ReliefF) methods, are used to rank the descriptor importance for the full data set. However, such global ranking of descriptors in a structurally diverse data set might not be accurate for individual compounds predicted by a nonlinear ML algorithm, and ultimately the structure activity relationship (SAR) should be understood locally for each scaffold.…”
Section: Introduction
Citation type: mentioning
confidence: 99%