2002
DOI: 10.1021/ci010132r

Reoptimization of MDL Keys for Use in Drug Discovery

Abstract: For a number of years MDL products have exposed both 166 bit and 960 bit keysets based on 2D descriptors. These keysets were originally constructed and optimized for substructure searching. We report on improvements in the performance of MDL keysets which are reoptimized for use in molecular similarity. Classification performance for a test data set of 957 compounds was increased from 0.65 for the 166 bit keyset and 0.67 for the 960 bit keyset to 0.71 for a surprisal S/N pruned keyset containing 208 bits and 0…

Cited by 1,224 publications (996 citation statements)
References 27 publications
“…Consequently, we selected the dice score on MACCS fingerprints ad hoc, although there are of course other valid choices too. We started by generating MACCS fingerprints (Durant et al, 2002) for all query and database molecules in Table 1. Each fingerprint encodes the presence or absence of 166 predetermined chemical groups in the molecule as a binary string of the same size.…”
Section: A Simple Tf Methods To Estimate a Lower-bound For Performance (mentioning)
confidence: 99%
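
The Dice score on binary fingerprints mentioned in the statement above can be illustrated with a minimal sketch. It assumes each fingerprint is represented as the set of its "on" bit positions in a 166-bit string; the bit positions used here are hypothetical and do not correspond to actual MACCS key definitions, and the function name `dice_similarity` is chosen for illustration only.

```python
N_BITS = 166  # fingerprint length stated in the citing text


def dice_similarity(a: set[int], b: set[int]) -> float:
    """Dice coefficient 2|A n B| / (|A| + |B|) over sets of on-bit indices."""
    total = len(a) + len(b)
    if total == 0:
        # Assumed convention: two empty fingerprints score 0.0.
        return 0.0
    return 2 * len(a & b) / total


def to_bitstring(bits: set[int], n_bits: int = N_BITS) -> str:
    """Render a set of on-bit indices as a fixed-length binary string."""
    return "".join("1" if i in bits else "0" for i in range(n_bits))


# Hypothetical on-bit positions for a query and a database molecule.
query = {12, 45, 77, 160}
candidate = {12, 45, 90, 160, 165}

score = dice_similarity(query, candidate)  # 3 shared bits, 4 + 5 bits set
print(f"Dice score: {score:.3f}")
```

In practice the on-bit sets would come from a cheminformatics toolkit rather than being written by hand; the set representation is used here only to keep the arithmetic visible.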
“…In Step 3, compounds were clustered based on hierarchical clustering methods. Molecular fingerprint techniques have been developed, that is, PubChem (881 bits) [76], CDK (1024 bits) [77], Extended CDK (1024 bits) [78], MACCS (166 bits) [79], Klekota-Roth (4860 bits) [80], Substructure (307 bits) [81], Estate (79 bits) [82], and atom pairs (780 bits) [83]. Those molecular finger prints generally focuses on side-chain substructures of molecules.…”
Section: Data Set (mentioning)
confidence: 99%
“…fragment keys Description of the structure of a molecule as the presence or absence of each of a pre-determined set of molecular substructures, usually associated with a particular location in a bit string. Note: An example is the publically described ISIS keys [35].…”
Section: False Positive (Fp) (mentioning)
confidence: 99%