A Novel Subshape Molecular Descriptor

Putta, Santosh; Eksterowicz, John; Lemmen, Christian; Stanton, Robert V.

doi:10.1021/ci0256384

Cited by 36 publications

(38 citation statements)

References 16 publications

“…Today, a wide variety of descriptors has been reported to use in QSAR analysis. [36][37][38][39][40] Recent progress in computational hardware and the development of efficient algorithms have assisted the routine development of molecular quantum chemical calculations. Quantum chemical calculations can, in principle, express all of the electronic and geometric properties of molecules and their interactions.…”

Section: Introduction (mentioning)

confidence: 99%

Accurate prediction of the blood–brain partitioning of a large set of solutes using ab initio calculations and genetic neural network modeling

et al. 2006

A genetic algorithm-based artificial neural network model has been developed for the accurate prediction of the blood-brain barrier partitioning (in logBB scale) of chemicals. A data set of 123 logBB (115 old molecules and 8 new molecules) of a diverse set of chemicals was chosen in this study. The optimum 3D geometry of the molecules was estimated by the ab initio calculations at the level of RHF/STO-3G, and consequently, different electronic descriptors were calculated for each molecule. Indeed, logP as a measure of hydrophobicity and different topological indices were also calculated. A three-layered artificial neural network with backpropagation of an error-learning algorithm was employed to process the nonlinear relationship between the calculated descriptors and logBB data. Genetic algorithm was used as a feature selection method to select the most relevant set of descriptors as the input of the network. Modeling of the logBB data by the only quantum descriptors produced a 5:4:1 ANN structure with RMS error of validation and crossvalidation equal to 0.224 and 0.227, respectively. Better nonlinear model (RMS(V) and RMS(CV) equals to 0.097 and 0.099, respectively) was obtained by the incorporation of the logP and the principal components of the topological indices to electronic descriptors. The ultimate performances of the models were obtained by the application of the models to predict the logBB of 23 molecules that did not have contribution in the steps of model development. The best model produced RMS error of prediction 0.140, and could predict about 98% of variances in the logBB data.

“…Fingerprints used in this study include those employed in activity coefficient and vapour pressure predictive techniques provided by the UManSysProp package (Zuend et al, 2011; Nannoolal et al, 2008), alongside more general fingerprints, including the MACCS keys and FP4 keys (Putta et al, 2003). It is difficult to find information on the provenance behind these latter generic fingerprints (Putta et al, 2003), other than that they are designed to cover a set of molecular features that would be used across using SMARTS notation, and each molecule using the SMILES format.…”

Section: Methods (mentioning)

confidence: 99%

“…It is difficult to find information on the provenance behind these latter generic fingerprints (Putta et al, 2003), other than that they are designed to cover a set of molecular features that would be used across using SMARTS notation, and each molecule using the SMILES format. The matrix of keys used to fit each method is constructed by systematically parsing each molecule.…”

Section: Methods (mentioning)

confidence: 99%

“…Within atmospheric science, it is desirable to develop models for secondary organic aerosol (SOA) formation based on a given set of precursors and photochemical processing. Within most global and regional models, often-used techniques include modelling representative photochemical yields from specific precursors and tuning accordingly (Spracklen et al, 2011) or employing a parametric model such as the volatility basis set (Robinson et al, 2007). While both of these approaches can deliver realistic absolute concentrations, because they are not based on explicit physical processes, their predictive skill is always subject to question (Hallquist et al, 2009; Bergström et al, 2012).…”

Section: Introduction (mentioning)

confidence: 99%

STRAPS v1.0: evaluating a methodology for predicting electron impact ionisation mass spectra for the aerosol mass spectrometer

et al. 2017

Abstract. Our ability to model the chemical and thermodynamic processes that lead to secondary organic aerosol (SOA) formation is thought to be hampered by the complexity of the system. While there are fundamental models now available that can simulate the tens of thousands of reactions thought to take place, validation against experiments is highly challenging. Techniques capable of identifying individual molecules such as chromatography are generally only capable of quantifying a subset of the material present, making it unsuitable for a carbon budget analysis. Integrative analytical methods such as the Aerosol Mass Spectrometer (AMS) are capable of quantifying all mass, but because of their inability to isolate individual molecules, comparisons have been limited to simple data products such as total organic mass and the O : C ratio. More detailed comparisons could be made if more of the mass spectral information could be used, but because a discrete inversion of AMS data is not possible, this activity requires a system of predicting mass spectra based on molecular composition.

In this proof-of-concept study, the ability to train supervised methods to predict electron impact ionisation (EI) mass spectra for the AMS is evaluated. Supervised Training Regression for the Arbitrary Prediction of Spectra (STRAPS) is not built from first principles. A methodology is constructed whereby the presence of specific mass-to-charge ratio (m/z) channels is fitted as a function of molecular structure before the relative peak height for each channel is similarly fitted using a range of regression methods. The widely used AMS mass spectral database is used as a basis for this, using unit mass resolution spectra of laboratory standards.

Key to the fitting process is choice of structural information, or molecular fingerprint. Our approach relies on using supervised methods to automatically optimise the relationship between spectral characteristics and these molecular fingerprints. Therefore, any internal mechanisms or instrument features impacting on fragmentation are implicitly accounted for in the fitted model. Whilst one might expect a collection of keys specifically designed according to EI fragmentation principles to offer a robust basis, the suitability of a range of commonly available fingerprints is evaluated.

Using available fingerprints in isolation, initial results suggest the generic public "MACCS" fingerprints provide the most accurate trained model when combined with both decision trees and random forests, with median cosine angles of 0.94-0.97 between modelled and measured spectra. There is some sensitivity to choice of fingerprint, but most sensitivity is in choice of regression technique. Support vector machines perform the worst, with median values of 0.78-0.85 and lower ranges approaching 0.4, depending on the fingerprint used. More detailed analysis of modelled versus mass spectra demonstrates important composition-dependent sensitivities on a compound-by-compound basis. This is further demonstrated when we apply...

“…Various methods have been applied to construct QSAR models including linear and nonlinear regression methods. Multiple linear regression (MLR) and artificial neural networks (ANN) have been extensively employed in QSAR studies owing to their outstanding linear and nonlinear mapping capability, respectively [13,14]. Appropriate application of the structural and physicochemical features of molecules is an essential key to achieve successful QSAR models [12].…”

Section: Introduction (mentioning)

confidence: 99%

Computer-aided design of novel antibacterial 3-hydroxypyridine-4-ones: application of QSAR methods based on the MOLMAP approach

Sabet, Fassihi, Hemmateenejad, et al. 2012

J Comput Aided Mol Des

3-Hydroxypyridine-4-one derivatives have shown good inhibitory activity against bacterial strains. In this work we report the application of MOLMAP descriptors based on empirical physicochemical properties with genetic algorithm partial least squares (GA-PLS) and counter propagation artificial neural networks (CP-ANN) methods to propose some novel 3-hydroxypyridine-4-one derivatives with improved antibacterial activity against Staphylococcus aureus. A large collection of 302 novel derivatives of this chemical scaffold was selected for this purpose. The activity classes of these compounds were determined using the two quantitative structure activity relationships models. To evaluate the predictability and accuracy of the obtained models, nineteen compounds belonging to all three activity classes were prepared and the activity of them was determined against S. aureus. Comparing the experimental results and the predicted activity classes revealed the accuracy of the obtained models. Seventeen of the nineteen synthesized molecules were correctly predicted by GA-PLS model according to the antimicrobial evaluation method. Molecules 5f and 5h proved to be moderately active and active experimentally, but were predicted as inactive and moderately active compounds, respectively by this model. The CP-ANN based prediction was correct for sixteen out of the nineteen synthesized molecules. 5a, 5h and 5q were moderately active and active based on the antimicrobial assays, but they were introduced as members of inactive, moderately active and inactive classes of compounds, respectively according to CP-ANN model.
