2008
DOI: 10.1002/qsar.200810072

Prediction of hERG Potassium Channel Blockade Using kNN‐QSAR and Local Lazy Regression Methods

Abstract: We have collated hERG inhibition data for 165 compounds from the literature and employed two regression procedures, Local Lazy Regression (LLR) and k-Nearest Neighbor (kNN)-QSAR, in combination with Genetic Algorithms (GAs) to select significant and independent molecular descriptors and to build robust predictive models. This methodology helped us derive four optimal 2D- and 3D-QSPR models, M1–M4, based on five descriptors. Extensive validation tests using the leave-one-out method and 61 c…
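As a companion to the abstract, here is a minimal sketch of GA-driven descriptor selection of the kind described: a binary chromosome (one bit per candidate descriptor) scored by cross-validated kNN regression. The random data, population size, operators, and fitness function are illustrative assumptions, not the authors' published GA configuration.

```python
# Hypothetical GA descriptor selection: each chromosome is a 0/1 mask over
# candidate descriptors; fitness is the cross-validated R^2 of a kNN regressor.
import numpy as np
from sklearn.neighbors import KNeighborsRegressor
from sklearn.model_selection import cross_val_score

rng = np.random.default_rng(0)
X = rng.normal(size=(165, 20))   # 165 compounds, 20 candidate descriptors (made up)
y = rng.normal(size=165)         # hypothetical activity values (e.g. pIC50)

def fitness(mask):
    if mask.sum() == 0:
        return -np.inf                       # empty descriptor set is invalid
    model = KNeighborsRegressor(n_neighbors=5)
    return cross_val_score(model, X[:, mask.astype(bool)], y,
                           cv=5, scoring="r2").mean()

pop = rng.integers(0, 2, size=(30, X.shape[1]))  # random initial population
for gen in range(25):
    scores = np.array([fitness(m) for m in pop])
    parents = pop[np.argsort(scores)[-10:]]      # elitism: keep the 10 fittest
    children = []
    while len(children) < len(pop) - len(parents):
        a, b = parents[rng.integers(10, size=2)]
        cut = rng.integers(1, X.shape[1])        # one-point crossover
        child = np.concatenate([a[:cut], b[cut:]])
        flip = rng.random(X.shape[1]) < 0.05     # bit-flip mutation
        children.append(np.where(flip, 1 - child, child))
    pop = np.vstack([parents, children])

best = pop[np.argmax([fitness(m) for m in pop])]
print("selected descriptors:", np.flatnonzero(best))
```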

Cited by 16 publications (15 citation statements) · References 81 publications
“…k-Nearest neighbor (kNN): kNN is a type of instance-based learning and is thought to be one of the simplest machine learning algorithms. In this case, the Manhattan distance was used to calculate the distance matrices, k was set to 28 to reduce the effect of noise, and the final result was an inverse-distance-weighted average of the k nearest neighbors.…”
Section: Methods
confidence: 99%
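The kNN setting quoted above translates directly into code. A minimal sketch, assuming a hypothetical descriptor matrix X and activity vector y: Manhattan (L1) distances, k = 28, and an inverse-distance-weighted average of the neighbors' activities as the prediction.

```python
import numpy as np

def knn_predict(X_train, y_train, x_query, k=28, eps=1e-9):
    """Inverse-distance-weighted kNN regression with Manhattan distance."""
    # Manhattan (L1) distance from the query to every training compound
    dists = np.abs(X_train - x_query).sum(axis=1)
    # Indices of the k nearest neighbors
    nn = np.argsort(dists)[:k]
    # Inverse-distance weights; eps guards against division by zero
    w = 1.0 / (dists[nn] + eps)
    return np.dot(w, y_train[nn]) / w.sum()

# Hypothetical usage with random descriptors
rng = np.random.default_rng(0)
X = rng.normal(size=(165, 5))    # 165 compounds, 5 descriptors
y = rng.normal(size=165)         # e.g. pIC50 values
print(knn_predict(X, y, X[0], k=28))
```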
“…The approach described in [12,13] is based on the k-nearest neighbors (kNN) method, which employs clustering and the generation of a separate regression model in each cluster.…”
Section: Introduction
confidence: 99%
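The cluster-then-regress idea paraphrased above can be sketched as follows; k-means clustering and per-cluster linear models are assumptions made here for illustration, not necessarily the exact setup of [12,13].

```python
# Partition the training set into clusters, fit one regression model per
# cluster, and predict a query with the model of its nearest cluster.
import numpy as np
from sklearn.cluster import KMeans
from sklearn.linear_model import LinearRegression

rng = np.random.default_rng(1)
X = rng.normal(size=(200, 5))                        # hypothetical descriptors
y = X @ rng.normal(size=5) + rng.normal(scale=0.1, size=200)

km = KMeans(n_clusters=4, n_init=10, random_state=0).fit(X)
models = [LinearRegression().fit(X[km.labels_ == c], y[km.labels_ == c])
          for c in range(km.n_clusters)]

def predict(x_query):
    c = km.predict(x_query.reshape(1, -1))[0]        # nearest cluster
    return models[c].predict(x_query.reshape(1, -1))[0]

print(predict(X[0]))
```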
“…36,37 Deciding the optimal number k of nearest neighbors to be used for the identification of the best model is of great relevance and vital for the prediction ability of local modeling. 38 When a prediction is required for a query point, LL proceeds by identifying a set of local model candidates with different polynomial degrees and different numbers of neighbors, letting k vary between k_min and k_max (the bandwidth, where k_min and k_max control the minimal and maximal number of neighbors used for identifying and validating models, respectively). The prediction ability of each model is commonly assessed through a local leave-one-out cross-validation (LOO-CV) procedure, [36][37][38][39] i.e. the model is built using the k_min resultant neighborhood, which is validated using LOO-CV to generate a Q²_LOO value or prediction error.…”
Section: Introduction
confidence: 99%
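The local lazy procedure described above, sketched under stated assumptions: for each query, candidate local polynomial models are enumerated over neighborhoods of size k_min to k_max, each scored by local leave-one-out cross-validation, and the best candidate supplies the prediction. The Euclidean neighborhood metric, degree range, and bandwidth values are illustrative choices, not the cited authors' exact configuration.

```python
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.preprocessing import PolynomialFeatures
from sklearn.model_selection import LeaveOneOut

def lazy_predict(X, y, x_query, k_min=5, k_max=15, degrees=(1, 2)):
    dists = np.linalg.norm(X - x_query, axis=1)       # Euclidean neighborhood
    order = np.argsort(dists)
    best = (np.inf, None)                             # (LOO error, prediction)
    for k in range(k_min, k_max + 1):
        Xn, yn = X[order[:k]], y[order[:k]]
        for d in degrees:
            poly = PolynomialFeatures(degree=d)
            Zn = poly.fit_transform(Xn)
            if Zn.shape[1] >= k:                      # underdetermined fit; skip
                continue
            # Local leave-one-out cross-validation over this neighborhood
            errs = []
            for tr, te in LeaveOneOut().split(Zn):
                m = LinearRegression().fit(Zn[tr], yn[tr])
                errs.append((m.predict(Zn[te])[0] - yn[te][0]) ** 2)
            loo = float(np.mean(errs))
            if loo < best[0]:                         # keep best candidate
                m = LinearRegression().fit(Zn, yn)
                pred = m.predict(poly.transform(x_query.reshape(1, -1)))[0]
                best = (loo, pred)
    return best[1]

# Hypothetical usage
rng = np.random.default_rng(2)
X = rng.normal(size=(100, 3))
y = np.sin(X[:, 0]) + 0.1 * rng.normal(size=100)
print(lazy_predict(X, y, X[0]))
```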