This paper reports a comparison of several methods for measuring the degree of similarity between pairs of 3-D chemical structures that are represented by inter-atomic distance matrices. The methods that have been tested use the distance information in very different ways and have very different computational requirements. Experiments with 10 small datasets, for which both structural and biological activity data are available, suggest that the most cost-effective technique is based on a mapping procedure that tries to match pairs of atoms, one from each of the molecules that are being compared, that have neighbouring atoms at approximately the same distances.
This paper discusses algorithmic techniques for measuring the degree of similarity between pairs of three-dimensional (3-D) chemical molecules represented by interatomic distance matrices. A comparison of four methods for the calculation of 3-D structural similarity suggests that the most effective one is a procedure that identifies pairs of atoms, one from each of the molecules that are being compared, that lie at the center of geometrically-related volumes of 3-D space. This atom mapping method enables the calculation of a wide range of types of intermolecular similarity coefficient, including measures that are based on physicochemical data. Massively-parallel implementations of the method are discussed, using the AMT Distributed Array Processor, that achieve a substantial increase in performance when compared with a sequential implementation on a UNIX workstation. Current work involves the use of angular information and the extension of the method to field-based similarity searching. Similarity searching in 3-D macromolecules is effected by the use of a maximal common subgraph (MCS) isomorphism algorithm with a novel, graph-based representation of the tertiary structures of proteins. This algorithm is being used to identify similarities between the 3-D structures of proteins in the Brookhaven Protein Data Bank; its use is exemplified by searches involving the NAD-binding fold motif.
A BSTRA CT This paper discusses a procedure for culculating the degree of similarity between an input target molecule and each of the molecules in a database qf 3 0 chemical structures. A mapping procedure is used to identijy pairs ofutoms, one from each of the molecules that are being compared, that lie at the centre of topographically similar volumes of 30 space. E.yperiments ure reported to validate the signijicance of the similarities that result from the use of the procedure and to determine the eflect on the calculated similarities of the structural representation that is used to characterise the molecules in a database.
scite is a Brooklyn-based organization that helps researchers better discover and understand research articles through Smart Citations–citations that display the context of the citation and describe whether the article provides supporting or contrasting evidence. scite is used by students and researchers from around the world and is funded in part by the National Science Foundation and the National Institute on Drug Abuse of the National Institutes of Health.