2014
DOI: 10.1002/minf.201400024

The Calculation of Molecular Structural Similarity: Principles and Practice

Abstract: Measures of structural similarity play an important role in chemoinformatics for applications such as similarity searching, database clustering and molecular diversity analysis. A similarity measure comprises three components: a structure representation; a weighting scheme; and a similarity coefficient. The paper introduces these components and describes methods for comparing different measures. The use of similarity measures in chemoinformatics research is illustrated by recent projects in the author's labora…

Cited by 104 publications (88 citation statements)
References 89 publications
“…[31][32][33] Extended connectivity fingerprints (ECFPs) can be used to analyze scaffold structures. In ECFP_6, for instance, all structural features within the range of six bonds of each scaffold atom will be included in the fingerprint and can be interpreted as substructures. [34,35] Thus the fingerprint covers the complete molecular environment.…”
Section: Scaffolds and Scaffold Diversity
Citation type: mentioning
confidence: 99%
“…Drugs that are also metabolites were removed from the list of drugs. Using the MACCS encoding [61], the similarity of each substance is compared via a KNIME workflow [62] using the Tanimoto similarity coefficient [63,64], whose values are encoded according to the colours indicated. Drugs and metabolites are clustered using an agglomerative clustering algorithm.…”
Section: Consequences Of the Fact That Individual Drugs Must And Do U…
Citation type: mentioning
confidence: 99%
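
The workflow described in this statement (fingerprints compared by Tanimoto similarity, then agglomerative clustering) can be sketched in plain Python. The quoted text does not say which linkage rule or cut-off was used, so single linkage and a threshold of 0.5 are assumptions here, as are the toy fingerprints and the function names `tanimoto` and `single_linkage`. Real MACCS keys would come from a cheminformatics toolkit rather than hand-written sets.

```python
from itertools import combinations


def tanimoto(a: frozenset, b: frozenset) -> float:
    """Tanimoto similarity of two fingerprints given as sets of 'on' bit positions."""
    union = len(a | b)
    # Assumption: two empty fingerprints are treated as identical.
    return len(a & b) / union if union else 1.0


def single_linkage(fps: dict, threshold: float) -> list:
    """Agglomerative clustering: repeatedly merge the two most similar clusters.

    Cluster similarity is the highest pairwise Tanimoto value between members
    (single linkage, an assumption). Merging stops once the best pair falls
    below `threshold`.
    """
    clusters = [{name} for name in fps]
    while len(clusters) > 1:
        best_sim, best_pair = -1.0, None
        for i, j in combinations(range(len(clusters)), 2):
            sim = max(tanimoto(fps[x], fps[y])
                      for x in clusters[i] for y in clusters[j])
            if sim > best_sim:
                best_sim, best_pair = sim, (i, j)
        if best_sim < threshold:
            break
        i, j = best_pair          # i < j, so deleting j leaves i valid
        clusters[i] |= clusters[j]
        del clusters[j]
    return clusters


# Hypothetical toy fingerprints (not real MACCS keys).
fingerprints = {
    "drug_A": frozenset({1, 2, 3, 4}),
    "drug_B": frozenset({1, 2, 3, 5}),
    "metabolite_C": frozenset({10, 11, 12}),
    "metabolite_D": frozenset({10, 11, 13}),
}
clusters = single_linkage(fingerprints, threshold=0.5)
```

With these toy inputs the two drugs share 3 of 5 distinct bits (0.6) and the two metabolites share 2 of 4 (0.5), so each pair merges, while the drug and metabolite groups have no bits in common and stay apart.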
“…Although more complex comparisons are occasionally used (e.g. [249][250][251]), strings are typically compared on the basis of the number of bits they have in common relative to the total, a true metric (between 0 and 1) known as the Jaccard or Tanimoto similarity. Although this is a continuous function, Tanimoto similarities above 0.8 are commonly found (i) to be resistant to the precise encoding used, and (ii) indeed to have broadly similar effects in most cases (review at [213]).…”
Section: The Importance Of Qsars (Quantitative Structure-activity Rel…
Citation type: mentioning
confidence: 99%
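
The bit-string comparison this statement describes can be made concrete with a small example. For two binary fingerprints with a and b bits set and c bits set in both, the Tanimoto (Jaccard) coefficient is c / (a + b - c). The sketch below assumes fingerprints are equal-length strings of '0' and '1'; the function name `tanimoto_bits`, the example strings, and the choice to return 1.0 when both fingerprints are empty are illustrative assumptions.

```python
def tanimoto_bits(x: str, y: str) -> float:
    """Tanimoto coefficient c / (a + b - c) for two equal-length bit strings."""
    if len(x) != len(y):
        raise ValueError("fingerprints must have the same length")
    a = x.count("1")                                    # bits set in x
    b = y.count("1")                                    # bits set in y
    c = sum(1 for p, q in zip(x, y) if p == q == "1")   # bits set in both
    denom = a + b - c                                   # bits set in either
    # Assumption: two all-zero fingerprints count as identical.
    return c / denom if denom else 1.0


# Hypothetical 8-bit fingerprints: a = 4, b = 4, c = 3, so T = 3 / 5.
fp1 = "11010010"
fp2 = "11000011"
similarity = tanimoto_bits(fp1, fp2)
is_similar = similarity >= 0.8   # the 0.8 cut-off mentioned in the statement
```

Here the similarity is 0.6, below the 0.8 level quoted above, so this pair would not be counted as similar under that rule of thumb.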