The discovery of various protein/receptor targets from genomic research is expanding rapidly. Along with the automation of organic synthesis and biochemical screening, this is bringing a major change in the whole field of drug discovery research. In the traditional drug discovery process, the industry tests compounds in the thousands. With automated synthesis, the number of compounds to be tested could be in the millions. This two-dimensional expansion will lead to a major demand for resources, unless the chemical libraries are made wisely. The objective of this work is to provide both quantitative and qualitative characterization of known drugs which will help to generate "drug-like" libraries. In this work we analyzed the Comprehensive Medicinal Chemistry (CMC) database and seven different subsets belonging to different classes of drug molecules. These include some central nervous system active drugs and cardiovascular, cancer, inflammation, and infection disease states. A quantitative characterization based on computed physicochemical property profiles such as log P, molar refractivity, molecular weight, and number of atoms as well as a qualitative characterization based on the occurrence of functional groups and important substructures are developed here. For the CMC database, the qualifying range (covering more than 80% of the compounds) of the calculated log P is between -0.4 and 5.6, with an average value of 2.52. For molecular weight, the qualifying range is between 160 and 480, with an average value of 357. For molar refractivity, the qualifying range is between 40 and 130, with an average value of 97. For the total number of atoms, the qualifying range is between 20 and 70, with an average value of 48. Benzene is by far the most abundant substructure in this drug database, slightly more abundant than all the heterocyclic rings combined. Nonaromatic heterocyclic rings are twice as abundant as the aromatic heterocycles. 
Tertiary aliphatic amines, alcoholic OH and carboxamides are the most abundant functional groups in the drug database. The effective range of physicochemical properties presented here can be used in the design of drug-like combinatorial libraries as well as in developing a more efficient corporate medicinal chemistry library.
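As an illustration of how the qualifying ranges above might be applied when assembling a library, the following minimal Python sketch checks a compound's precomputed properties against the four CMC ranges quoted in the abstract. The `Properties` container, the function names, and the treatment of the range limits as inclusive are assumptions made for this example; the abstract does not say how boundary values are handled, and the property values themselves must come from whatever calculation method is in use.

```python
from dataclasses import dataclass

# Qualifying ranges quoted in the abstract for the CMC database
# (each covers more than 80% of the compounds).
QUALIFYING_RANGES = {
    "log_p": (-0.4, 5.6),
    "molecular_weight": (160.0, 480.0),
    "molar_refractivity": (40.0, 130.0),
    "atom_count": (20, 70),
}


@dataclass
class Properties:
    """Hypothetical container for precomputed properties of one compound."""
    log_p: float
    molecular_weight: float
    molar_refractivity: float
    atom_count: int


def out_of_range(props):
    """Return the names of properties that fall outside their qualifying range.

    Range limits are treated as inclusive, which is an assumption.
    """
    failures = []
    for name, (low, high) in QUALIFYING_RANGES.items():
        value = getattr(props, name)
        if not (low <= value <= high):
            failures.append(name)
    return failures


def is_drug_like(props):
    """True when every property lies inside its qualifying range."""
    return not out_of_range(props)
```

A compound whose properties equal the CMC averages reported above (log P 2.52, molecular weight 357, molar refractivity 97, 48 atoms) passes all four checks, while `out_of_range` reports which properties disqualify a compound that fails.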
Molecular hydrophobicity (lipophilicity), usually quantified as log P (the logarithm of the 1-octanol/water partition coefficient), is an important molecular characteristic in drug discovery. ALOGP and CLOGP are two of the most widely used methods for the estimation of log P. This work describes an extensive reparametrization of the atomic log P values and a detailed comparison of the performance of the ALOGP and CLOGP methods on the Pomona Medchem database. Only the “star list” compounds having precisely measured log P values were used in this analysis. While the overall results with both methods are similar, analysis shows that the CLOGP method is better for very small molecules in the range of 1–20 atoms. The two methods are almost comparable in the range of 21–45 atoms, while the ALOGP method has better accuracy for molecules with more than 45 atoms. Although the rms deviation and the correlation coefficient for CLOGP predictions were marginally better than those for corresponding ALOGP predictions, the latter showed a very stable performance for all classes of organic compounds analyzed. The ALOGP method can be used to compute log P estimates for most neutral organic compounds having C, H, O, N, S, Se, P, B, Si, and halogens. It also covers most zwitterionic compounds having amine and carboxylic acids and ammonium halide salts. The CLOGP method has improved considerably over the years to cover most neutral organic compounds, but it still has some undefined fragments. Finally, unlike CLOGP and other methods of predicting lipophilicity, the ALOGP method has multiple uses, such as the estimation of local hydrophobicity, the visualization of molecular hydrophobicity maps, and the evaluation of hydrophobic interactions in protein-ligand complexes.
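The size-dependent comparison described above can be sketched as follows: compounds are grouped into the three atom-count classes named in the abstract, and the rms deviation between measured and predicted log P is computed per class. This is only an illustrative sketch, not the authors' procedure. The function names and the record layout are assumptions, and the abstract does not say whether hydrogens are included in the atom count.

```python
import math


def size_class(atom_count):
    """Map an atom count to one of the size classes used in the abstract."""
    if atom_count < 1:
        raise ValueError("atom_count must be at least 1")
    if atom_count <= 20:
        return "1-20"
    if atom_count <= 45:
        return "21-45"
    return ">45"


def rms_deviation_by_size(records):
    """Compute the rms deviation of predicted log P within each size class.

    records: iterable of (atom_count, measured_log_p, predicted_log_p).
    Returns a dict {size class: rms deviation}; classes without data are omitted.
    """
    squared_errors = {}
    for atom_count, measured, predicted in records:
        label = size_class(atom_count)
        squared_errors.setdefault(label, []).append((predicted - measured) ** 2)
    return {
        label: math.sqrt(sum(errors) / len(errors))
        for label, errors in squared_errors.items()
    }
```

Running `rms_deviation_by_size` once per prediction method on the same set of measured values gives per-class figures that can be compared side by side.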
In an earlier paper (Ghose A. K.; Crippen, G. M. J. Comput. Chem. 1986, 7, 565) the need for atomic physicochemical properties in three-dimensional-structure-directed quantitative structure-activity relationships was demonstrated, and it was shown how atomic parameters can be developed to successfully evaluate the molecular water-1-octanol partition coefficient, which is a measure of hydrophobicity. In the present work the atomic values of molar refractivity are reported. Carbon, hydrogen, oxygen, nitrogen, sulfur, and halogens are divided into 110 atom types, of which 93 atomic values are evaluated from 504 molecules by using a constrained least-squares technique. These values gave a standard deviation of 1.269 and a correlation coefficient of 0.994. The parameters were used to predict the molar refractivities of 78 compounds. The predicted values have a standard deviation of 1.614 and a correlation coefficient of 0.994. The closeness of the linear relationship between the atomic water-1-octanol partition coefficients and the atomic molar refractivities was checked by computing the correlation coefficient over the 89 atom types used for both properties, which was found to be 0.322. This low value indicates that the two properties are largely independent of each other, which suggests that both can be used to model the intermolecular interaction. The origin of these physicochemical properties and the types of interaction that can be modeled by these properties have been critically analyzed.
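Two calculations underlie the abstract above: a molecular property evaluated from atomic values, and a correlation coefficient between two sets of atomic parameters. The sketch below shows both in plain Python, assuming the usual additive form in which the molecular value is the sum of the atomic values weighted by how often each atom type occurs. The function names and the dictionary layout are assumptions for illustration; the atom-type definitions and fitted values are not reproduced here and must be supplied by the caller.

```python
import math


def additive_property(atom_type_counts, atomic_values):
    """Evaluate a molecular property as the sum over atom types of count * atomic value.

    atom_type_counts: {atom type label: number of such atoms in the molecule}
    atomic_values:    {atom type label: fitted atomic contribution}
    Raises KeyError if the molecule contains an atom type with no fitted value.
    """
    return sum(
        count * atomic_values[atom_type]
        for atom_type, count in atom_type_counts.items()
    )


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length with at least 2 values")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    covariance = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return covariance / math.sqrt(var_x * var_y)
```

Calling `pearson_r` on the atomic hydrophobicity and atomic refractivity values of the atom types shared by both parameter sets is the kind of check the abstract reports; a value far from ±1 means neither parameter set can stand in for the other.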