2008
DOI: 10.1021/ci700099u

Impact of Benchmark Data Set Topology on the Validation of Virtual Screening Methods: Exploration and Quantification by Spatial Statistics

Abstract: A common finding of many reports evaluating ligand-based virtual screening methods is that validation results vary considerably with changing benchmark data sets. It is widely assumed that these data set specific effects are caused by the redundancy, self-similarity, and cluster structure inherent to those data sets. These phenomena manifest themselves in the data sets' representation in descriptor space, which is termed the data set topology. A methodology for the characterization of data set topology based o…

Citations: cited by 32 publications (50 citation statements)
References: 53 publications
“…This choice is critical since the performances of a virtual screening method can vary considerably with the benchmarking dataset used for the study 74,75 . The first element that can guide the selection of a benchmarking dataset is the nature of the virtual screening method that will be evaluated.…”
Section: Selection Of the Optimal Benchmarking Dataset According To T (mentioning)
confidence: 99%
“…The method to build MUV was the refined nearest neighbor analysis in spatial statistics [105]. First, 17 physicochemical properties were used for calculating pairwise Euclidean distances.…”
Section: Currently Available Benchmarking Sets (mentioning)
confidence: 99%
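
The statement above mentions pairwise Euclidean distances over 17 physicochemical properties as the input to a nearest neighbor analysis. The sketch below is only an illustration of that general idea, not the published procedure: it assumes compounds are already encoded as numeric property vectors, and the function names, the threshold grid, and the random example data are all hypothetical. It computes two empirical nearest-neighbor functions: the fraction of actives whose nearest other active lies within a distance t, and the fraction of decoys whose nearest active lies within t.

```python
import numpy as np


def pairwise_euclidean(a, b):
    """Euclidean distances between every row of a and every row of b."""
    diff = a[:, None, :] - b[None, :, :]
    return np.sqrt((diff ** 2).sum(axis=-1))


def nearest_neighbor_functions(actives, decoys, thresholds):
    """Empirical nearest-neighbor functions (hypothetical helper).

    g[i]: fraction of actives whose nearest *other* active is within thresholds[i].
    f[i]: fraction of decoys whose nearest active is within thresholds[i].
    """
    d_aa = pairwise_euclidean(actives, actives)
    np.fill_diagonal(d_aa, np.inf)  # ignore each active's distance to itself
    nn_active_active = d_aa.min(axis=1)
    nn_decoy_active = pairwise_euclidean(decoys, actives).min(axis=1)
    g = np.array([(nn_active_active <= t).mean() for t in thresholds])
    f = np.array([(nn_decoy_active <= t).mean() for t in thresholds])
    return g, f


if __name__ == "__main__":
    # Random stand-in data: 30 "actives" and 300 "decoys" with 17 properties each.
    rng = np.random.default_rng(0)
    actives = rng.normal(size=(30, 17))
    decoys = rng.normal(size=(300, 17))
    thresholds = np.linspace(0.0, 10.0, 21)
    g, f = nearest_neighbor_functions(actives, decoys, thresholds)
    # One possible summary (an assumption, not taken from the quoted text):
    # negative values suggest actives sit closer to each other than decoys sit to actives.
    print(float((f - g).sum()))
```

Comparing the two curves gives a rough picture of whether actives cluster among themselves more tightly than they mix with decoys. In practice the property columns would usually be scaled before distances are computed, which this sketch leaves out.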
“…Research Article [8,12,13]. The lack of robustness that results from such bias can seriously skew retrospective analyses and mislead researchers as to which method is likely to give the best prospective performance.…”
Section: Introduction (mentioning)
confidence: 99%