2012
DOI: 10.1002/minf.201100142

Benchmarking Variable Selection in QSAR

Abstract: Some equations in the paper contain errors and some are unclear. This erratum is provided for clarification.

I. The equation in Section 2.3.2 on page 175 of the paper should read

$$\hat{\beta} = \arg\min_{\beta} \sum_{i=1}^{n} (y_i - \beta_0 - \beta x_i)^2 + \lambda \|\beta\|_1 \qquad (1)$$

II. The first equation in the left column on page 177 should read

$$\hat{q}_k = \arg\max_{q_k} \frac{1}{5} \sum_{h=1}^{5} \mathrm{AUC}\big(y_{h,-k},\; \hat{y}_{h,-k}(q_k, X_{h,-k}, D_{-h,-k})\big) \qquad (2)$$

where $\hat{y}_{h,-k}(q_k, X_{h,-k}, D_{-h,-k})$ are the predictions for the left-out $y$ of the $h$th partition with the $k$th subset previou…
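The lasso objective in Equation (1) of the erratum can be illustrated with a minimal scikit-learn sketch. The data below are a hypothetical toy set, not the paper's benchmark data, and the penalty value is an arbitrary choice for illustration:

```python
# Sketch of the lasso objective from Equation (1): the coefficient
# vector minimizes squared error plus an L1 penalty, which shrinks
# some coefficients exactly to zero (i.e. performs variable selection).
import numpy as np
from sklearn.linear_model import Lasso

rng = np.random.default_rng(0)
X = rng.normal(size=(100, 10))
# Only the first two descriptors carry signal; the rest are noise.
y = 3.0 * X[:, 0] - 2.0 * X[:, 1] + rng.normal(scale=0.1, size=100)

model = Lasso(alpha=0.1).fit(X, y)       # alpha plays the role of lambda
selected = np.flatnonzero(model.coef_)   # indices of surviving variables
print(selected)
```

Note that scikit-learn scales the squared-error term by 1/(2n), so its `alpha` corresponds to the paper's λ only up to that factor.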

Cited by 25 publications
(30 citation statements)
References 30 publications
“…In Eklund et al.,2 we found MARS and lasso to be the feature selection methods that performed best among the methods included in the benchmarking experiments. Therefore, we use these feature selection methods here (or rather, we use a generalization of lasso: the elastic nets).…”
Section: Methods (mentioning)
confidence: 93%
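The "generalization of lasso" named in the excerpt, the elastic net, blends the L1 penalty with a ridge (L2) penalty. A minimal sketch on hypothetical toy data, with assumed parameter values:

```python
# Elastic net = lasso's L1 penalty blended with a ridge (L2) penalty.
# l1_ratio=1.0 recovers the lasso; l1_ratio=0.0 recovers ridge regression.
import numpy as np
from sklearn.linear_model import ElasticNet

rng = np.random.default_rng(1)
X = rng.normal(size=(80, 6))
y = 2.0 * X[:, 0] + 2.0 * X[:, 1] + rng.normal(scale=0.1, size=80)

enet = ElasticNet(alpha=0.1, l1_ratio=0.5).fit(X, y)
print(np.flatnonzero(enet.coef_))  # indices of retained descriptors
```

The L2 component stabilizes selection when descriptors are correlated, which is the usual motivation for preferring elastic nets over plain lasso in QSAR descriptor pools.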
“…27 Our study showed that every combination model can be improved with a tremendous reduction of the number of descriptors. For example, the transparency of optimal RF models for the eight data sets ranges from 0.04 (MRP2) to 0.53 (BCPR), which means that as much as 96% of variables could be removed. (Table footnote: for one specific data set, bold italic type marks the variable number and associated transparency with the best performance among the four modeling methods.)…”
Section: Results (mentioning)
confidence: 96%
“…27 Transparency represents the ability of a variable selection algorithm to extract the key variables from a pool containing noisy information. Usually, transparency was calculated for the variable set that maximizes the predictive performance of a model.…”
Section: Materials and Methods (mentioning)
confidence: 99%
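Reading transparency as the excerpt uses it, i.e. the fraction of the original descriptor pool retained by the selection step (an assumed definition on our part), the metric is a one-liner:

```python
# Transparency as the excerpt uses it: selected variables / total pool.
# A value of 0.04 means 96% of the descriptors were discarded.
def transparency(n_selected: int, n_total: int) -> float:
    if n_total <= 0:
        raise ValueError("descriptor pool must be non-empty")
    return n_selected / n_total

print(transparency(8, 200))  # 0.04, matching the MRP2 figure in the excerpt
```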
“…Features are then removed one by one until a certain criterion is satisfied. [34]

1n-m Algorithm. The 1n-m algorithm combines FS with BE.…”
Section: Feature Selection Methods (mentioning)
confidence: 99%
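The backward-elimination (BE) loop the excerpt describes can be sketched generically. The scoring model and the stopping criterion (a fixed target subset size here) are illustrative assumptions; the cited work's own criterion is not reproduced:

```python
# Backward elimination: start from the full feature set and repeatedly
# drop the feature whose removal hurts a cross-validated score least,
# until a stopping criterion (here: a target subset size) is met.
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import cross_val_score

def backward_eliminate(X, y, target_size):
    features = list(range(X.shape[1]))
    while len(features) > target_size:
        scores = []
        for f in features:
            trial = [g for g in features if g != f]
            s = cross_val_score(LinearRegression(), X[:, trial], y, cv=5).mean()
            scores.append((s, f))
        best_score, worst_feature = max(scores)  # dropping this one costs least
        features.remove(worst_feature)
    return features

rng = np.random.default_rng(2)
X = rng.normal(size=(60, 5))
y = 4.0 * X[:, 3] + rng.normal(scale=0.1, size=60)
print(backward_eliminate(X, y, target_size=1))
```

Each pass refits once per remaining feature, so the loop is quadratic in the pool size; for the large descriptor pools typical of QSAR this is the main practical cost of BE.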
“…Forward selection (FS) begins with no independent features in the model; the independent features are subsequently added one by one according to the cross-validated predictive squared correlation coefficient on the training set ($Q^2_{cv}$) until the criteria are satisfied. In FS, as in the following two methods, the best feature subset is established when the value of $Q^2_{cv}$ has reached its maximum.…”
Section: Methods (mentioning)
confidence: 99%
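The forward-selection loop in the last excerpt can be sketched as follows. Approximating $Q^2_{cv}$ by scikit-learn's cross-validated R² is our assumption, and the data are a toy example:

```python
# Forward selection: start with no independent features and greedily add
# the feature that most improves cross-validated Q2 (estimated here as
# CV R2), stopping once no addition improves the score.
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import cross_val_score

def forward_select(X, y):
    remaining = list(range(X.shape[1]))
    chosen, best_q2 = [], -np.inf
    while remaining:
        scored = []
        for f in remaining:
            q2 = cross_val_score(LinearRegression(), X[:, chosen + [f]], y, cv=5).mean()
            scored.append((q2, f))
        q2, f = max(scored)
        if q2 <= best_q2:   # Q2 has reached its maximum -> stop
            break
        chosen.append(f)
        remaining.remove(f)
        best_q2 = q2
    return chosen

rng = np.random.default_rng(3)
X = rng.normal(size=(60, 5))
y = 2.0 * X[:, 0] - 3.0 * X[:, 2] + rng.normal(scale=0.1, size=60)
print(sorted(forward_select(X, y)))
```

Unlike backward elimination, FS never revisits a choice, so a feature useful only in combination with others can be missed; this greediness is what hybrid schemes such as the 1n-m algorithm above try to mitigate.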