Hierarchical PLS Modeling for Predicting the Binding of a Comprehensive Set of Structurally Diverse Protein−Ligand Complexes

Lindström, Anton; Pettersson, Fredrik; Almqvist, Fredrik; Berglund, Anders; Kihlberg, Jan; Linusson, Anna

doi:10.1021/ci050323k

Cited by 30 publications

(33 citation statements)

References 59 publications

“…38,41 To study the comprehensive information of PLCs, the whole refined set was used in this work without any additional filter of PLCs, like other works. 34 Within the 1300 PLC data in the refined set, there are 493 data with the binding affinity of dissociation constant (Kd) value and 807 with inhibition constant (Ki) value. We used the negative logarithm of Kd and Ki values in this study (pKd and pKi).…”

Section: Data Sets (mentioning)

confidence: 99%
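
The negative-logarithm transform mentioned in the statement above can be sketched as follows. This is a minimal illustration, assuming the constants are expressed in mol/L; the function name `p_affinity` and the 10 nM example value are hypothetical and do not come from the cited work.

```python
import math


def p_affinity(value_molar: float) -> float:
    """Return -log10 of a dissociation or inhibition constant given in mol/L."""
    if value_molar <= 0:
        raise ValueError("the constant must be a positive concentration")
    return -math.log10(value_molar)


# Hypothetical example: a Kd of 10 nM, converted to mol/L before the transform.
kd_nanomolar = 10.0
pkd = p_affinity(kd_nanomolar * 1e-9)
```

The same function applies to Ki values; a smaller constant (tighter binding) gives a larger pKd or pKi.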

“…The external validation results are better than that in the references based on the same kind of dataset. 16,34,39 Besides, in order to further prove the generalization ability of our model, an overall 10-fold crossvalidation of this model on the whole datasets was also performed. Because the Q² result is not stable for an n-fold crossvalidation, we repeat this procedure for 10 times and get an average Q² for Kd and Ki datasets as 0.569 and 0.496.…”

Section: The External Validation Of Models (mentioning)

confidence: 99%
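
The repeated cross-validation described in the statement above (10-fold, repeated 10 times, with Q² averaged over the repeats) can be sketched as follows. This is only an illustration under stated assumptions: a one-variable least-squares line stands in for the citing authors' LS-SVM models, the data are synthetic, Q² is taken as 1 - PRESS / total sum of squares (other definitions exist), and all function names are hypothetical.

```python
import random
from statistics import mean


def fit_line(xs, ys):
    """Ordinary least squares for y = a + b*x (a stand-in for the real model)."""
    mx, my = mean(xs), mean(ys)
    sxx = sum((x - mx) ** 2 for x in xs)
    b = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / sxx
    return my - b * mx, b


def q2_kfold(xs, ys, k, rng):
    """One shuffled k-fold pass; Q2 = 1 - PRESS / total sum of squares."""
    idx = list(range(len(xs)))
    rng.shuffle(idx)
    press = 0.0
    for fold in range(k):
        test = idx[fold::k]  # every k-th shuffled index forms one fold
        held_out = set(test)
        train = [i for i in idx if i not in held_out]
        a, b = fit_line([xs[i] for i in train], [ys[i] for i in train])
        press += sum((ys[i] - (a + b * xs[i])) ** 2 for i in test)
    y_bar = mean(ys)
    return 1.0 - press / sum((y - y_bar) ** 2 for y in ys)


def repeated_q2(xs, ys, k=10, repeats=10, seed=0):
    """Average Q2 over several differently shuffled k-fold passes."""
    rng = random.Random(seed)
    return mean(q2_kfold(xs, ys, k, rng) for _ in range(repeats))


# Synthetic data (an assumption for illustration, not the PDBbind set).
data_rng = random.Random(1)
xs = [i / 10 for i in range(100)]
ys = [2.0 * x + 1.0 + data_rng.gauss(0, 0.5) for x in xs]
avg_q2 = repeated_q2(xs, ys)
```

Averaging over repeats reduces the dependence of Q² on any single random partition of the data into folds, which is the instability the statement refers to.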

“…Recently, as an alternative to widely used docking and scoring approach, some other in silico methods based on the structures of ligands and the relevant proteins are also proposed for the fast prediction of the binding affinity with some success (e.g. Hi-PLS 34 and novel geometrical descriptors-based method 35,36). These methods often use the molecular descriptors calculated from the structure of the ligand and protein as the inputs, and then, use some modeling methods to develop predictive models for binding affinity.…”

Section: Introduction (mentioning)

confidence: 99%

“…Many of the methods mentioned earlier used the refined set [2003 release, containing 800 protein-ligand complexes (PLCs)] of PDBbind, 37,38 including the methods of PMF, ChemScore, PLP, LUDI, GOLD, and X-Score, Hi-PLS, etc. 16,34,39 Second, some prediction models gave very good accuracy, but they were based on only a relatively small data set. For instance, the work by Zhang et al 35 used a data set containing 264 PLCs with binding affinities (pKd) and yielding the best R²ext of the models as 0.83.…”

Section: Introduction (mentioning)

confidence: 99%

A novel method for protein‐ligand binding affinity prediction and the related descriptors exploration

Wang et al. 2008, J Comput Chem

In this study, a novel method was developed to predict the binding affinity of protein-ligand based on a comprehensive set of structurally diverse protein-ligand complexes (PLCs). The 1300 PLCs with binding affinity (493 complexes with K(d) and 807 complexes with K(i)) from the refined dataset of PDBbind Database (release 2007) were studied in the predictive model development. In this method, each complex was described using calculated descriptors from three blocks: protein sequence, ligand structure, and binding pocket. Thereafter, the PLCs data were rationally split into representative training and test sets by full consideration of the validation of the models. The molecular descriptors relevant to the binding affinity were selected using the ReliefF method combined with least squares support vector machines (LS-SVMs) modeling method based on the training data set. Two final optimized LS-SVMs models were developed using the selected descriptors to predict the binding affinities of K(d) and K(i). The correlation coefficients (R) of training set and test set for K(d) model were 0.890 and 0.833. The corresponding correlation coefficients for the K(i) model were 0.922 and 0.742, respectively. The prediction method proposed in this work can give better generalization ability than other recently published methods and can be used as an alternative fast filter in the virtual screening of large chemical database.

“…In this case, the Chemogenomics-based data representation requires three ingredients of efforts. The first is the features for the targets (e.g., protein structures [Lindström et al, 2006], amino acid sequence [Jacob and Vert, 2008], binding site descriptors [Strömbergsson et al, 2008; Deng et al, 2004], etc.). This part is new from conventional single-target oriented SAR methods, but has already been well studied in structural biology independently, and thus in principle a good feature representation from their studies can be applied directly for chemogenomics.…”

Section: Data Representation For Chemogenomics-based SAR (mentioning)

confidence: 99%

In silico structure‐activity‐relationship (SAR) models from machine learning: a review

Ning, Karypis 2010, Drug Development Research

In this article, we review the recent development for in silico Structure-Activity-Relationship (SAR) models using machine-learning techniques. The review focuses on the following topics: machine-learning algorithms for computational SAR models, single-target-oriented SAR methodologies, Chemogenomics, and future trends. We try to provide the state-of-the-art SAR methods as well as the most up-to-date advancement, in order for the researchers to have a general overview at this area. Drug Dev Res 72:138-146, 2011.

Chemoinformatics Taking Biology into Account: Proteochemometrics

Wikberg, Spjuth, Eklund et al. 2011, Computational Approaches in Cheminformatics and Bioinformatics
