2004
DOI: 10.1021/jm049902r

Deriving Knowledge through Data Mining High-Throughput Screening Data

Abstract: Deriving general knowledge from high-throughput screening data is made difficult by the significant amount of noise, arising primarily from false positives, in the data. The paradigm established for screening an encoded combinatorial library on polymeric support, an ECLiPS library, has a significant amount of built-in redundancy. Because of this redundancy, the resulting data can be interpreted through a rigorous statistical analysis procedure, thereby significantly reducing the number of false positives. Here…

Citations: cited by 47 publications (40 citation statements)
References: 30 publications
“…For example, if only 2% of the compounds were active, classifying all compounds as inactive would yield an accuracy of 98%, but not be a useful model. The second issue has to do with the quality of data obtained from HTS being noisy primarily from false positives [51]. For these reasons we decided to train on the confirmatory assay (AID 846) because it is relatively balanced and false positives are not as likely because the IC50 determination was performed in triplicate.…”
Section: Construction Of Training and Test Set (mentioning)
confidence: 99%
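
The accuracy arithmetic in the statement above can be checked with a minimal sketch. The screen size (1,000 compounds) and the variable names are assumptions made for illustration; only the 2% active rate comes from the quoted text.

```python
# Hypothetical screen: 1,000 compounds, 2% of them truly active.
# The screen size is an assumption; the 2% rate is from the quoted statement.
n_total = 1000
n_active = 20

# Ground-truth labels: 1 = active, 0 = inactive.
labels = [1] * n_active + [0] * (n_total - n_active)

# Trivial "model" that classifies every compound as inactive.
predictions = [0] * n_total

# Accuracy: fraction of compounds labelled correctly.
correct = sum(p == y for p, y in zip(predictions, labels))
accuracy = correct / n_total

# Recall on actives: fraction of true actives the model recovers.
true_positives = sum(p == 1 and y == 1 for p, y in zip(predictions, labels))
recall = true_positives / n_active

print(f"accuracy = {accuracy:.2f}, recall = {recall:.2f}")
# prints "accuracy = 0.98, recall = 0.00"
```

The trivial classifier reaches 98% accuracy while recovering none of the actives, which is why accuracy alone is a poor guide on a heavily imbalanced screen.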
“…In a less cited article, Diller and Hobbs provide further support for the greater frequency of compounds with lead-like properties turning up as actives in HT screens [16]. The data was mined from the screening results obtained from 2 × 10^8 compounds assayed across 100 molecular targets.…”
Section: Design Considerations (mentioning)
confidence: 99%
“…Pardridge: blood brain barrier permeability (CNS penetration) [9]: HBD + HBA < 8; MW < 400; no acids
Lipinski: Ro5 (oral drug-likeness) [10]: cLogP > −2 to < 5; MW < 500; HBD ≤ 5; HBA ≤ 10; rotatable bonds ≤ 10
Kelder: polar surface area (PSA) oral bioavailability [11]: < 120 Å² (all drug types); < 60-70 Å² (CNS active)
Clark: blood brain barrier permeability (CNS penetration) [12,13] [14,15] MW 100-350; cLogP 1-3
Veber: oral bioavailability (rat) [17]: ≤ 10 rotatable bonds, ≤ 140 Å² PSA, or < 12 HBD + HBA independent of MW
Diller: parameters for highest hit rates in large lead discovery libraries [16]: cLogP 2-6; HBD 2; rotatable bonds 6; polar substituents able to make strong intermolecular contacts; ≤ 1 amide bond
… generally show larger relative hit rates (Fig.…”
Section: Physiochemical Property Analyses Of Leads and Drugs (mentioning)
confidence: 99%
“…[6-10] Hopkins et al. [6] reported that compound promiscuity decreases with higher molecular weight (MW) based on HTS data. Diller and Hobbs [11] published a statistical analysis of a company's historical HTS data, in which they examined the relationship between physicochemical properties, substructures, and likelihood of displaying biological activity, irrespective of target class.…”
Section: Introduction (mentioning)
confidence: 99%