Improving Docking-Based Virtual Screening Ability by Integrating Multiple Energy Auxiliary Terms from Molecular Docking Scoring

Ye, Wenling; Shen, Chao; Xiong, Guo-Li; Ding, Junjie; Lu, Aiping; Hou, Tingjun; Cao, Dong‐Sheng

doi:10.1021/acs.jcim.9b00977

Cited by 42 publications

(31 citation statements)

References 68 publications

“…are surprising, considering the known fact that the performance of docking studies is highly dependent on the system under study, and the high false-positive rate associated with docking-based virtual screening [48][49][50][51][52]. Some of the reviewed papers arrive at assertive conclusions despite that no proper (either in silico or experimental) validation was performed [53][54][55]: 'the docking simulation results also indicate the synergistic interactions of 10 substances in the Melaleuca cajuputi essential oil exhibit the significant inhibition into the ACE2 and PDB6LU7 proteins.…”

Section: Balanced Expectations: An Example Of Covid-related Structure-based Virtual Screens (mentioning)

confidence: 99%

Can drug repurposing strategies be the solution to the COVID-19 crisis?

Bellera, Llanos, Gantner et al. 2020

Expert Opinion on Drug Discovery

Introduction: The COVID-19 pandemic resulted in disastrous human and economic costs, mainly due to the initial lack of specific treatments. Complementary to immunotherapies, drug repurposing is possibly the best option to arrive at COVID-19 treatments in the short term. Areas covered: Repurposing prospects undergoing clinical trials or with some level of evidence emerging from clinical studies are overviewed. The authors discuss some possible intellectual property and commercial barriers to drug repurposing, and strategies to facilitate equitable access to incoming therapeutic solutions, highlighting the importance of collaborative drug discovery models. Based on a critical analysis of the available literature about in silico screens against SARS-CoV-2 main protease, the authors illustrate how frequently overconfident conclusions are being drawn in COVID-19-related literature. Expert opinion: Most of the current clinical trials on potential COVID-19 treatments are, in fact, drug repurposing examples. In October 2020, the FDA approved a repurposed antiviral, remdesivir, as the first treatment for COVID-19. Considering the high expectations invested in approaching therapeutic solutions, the scientific community must be careful not to raise unrealistic expectations. Today more than ever, the conclusions drawn in scientific reports have to be fully supported by the level of evidence, avoiding any sort of unfounded speculation.

“…The features with the variance less than 0.01 were removed, followed by the standardization of the remaining features using the sklearn.preprocessing [66] module. Extreme gradient boosting (XGBoost) [67], a well-validated ML algorithm that has been widely used in the field of computer-aided drug design (CADD) [28, 29, 31], was utilized to construct the classification models. Some major hyper-parameters (Additional file 1: Table S1) were tuned with the hyperopt [68] package and determined by the AUROC statistic based on the fivefold cross-validation.…”

Section: Methods (mentioning)

confidence: 99%

The impact of cross-docked poses on performance of machine learning classifier for protein–ligand binding pose prediction

Shen, Gao et al. 2021

J Cheminform

Self Cite

Structure-based drug design depends on the detailed knowledge of the three-dimensional (3D) structures of protein–ligand binding complexes, but accurate prediction of ligand-binding poses is still a major challenge for molecular docking due to deficiency of scoring functions (SFs) and ignorance of protein flexibility upon ligand binding. In this study, based on a cross-docking dataset dedicatedly constructed from the PDBbind database, we developed several XGBoost-trained classifiers to discriminate the near-native binding poses from decoys, and systematically assessed their performance with/without the involvement of the cross-docked poses in the training/test sets. The calculation results illustrate that using Extended Connectivity Interaction Features (ECIF), Vina energy terms and docking pose ranks as the features can achieve the best performance, according to the validation through the random splitting or refined-core splitting and the testing on the re-docked or cross-docked poses. Besides, it is found that, despite the significant decrease of the performance for the threefold clustered cross-validation, the inclusion of the Vina energy terms can effectively ensure the lower limit of the performance of the models and thus improve their generalization capability. Furthermore, our calculation results also highlight the importance of the incorporation of the cross-docked poses into the training of the SFs with wide application domain and high robustness for binding pose prediction. The source code and the newly-developed cross-docking datasets can be freely available at https://github.com/sc8668/ml_pose_prediction and https://zenodo.org/record/5525936, respectively, under an open-source license. We believe that our study may provide valuable guidance for the development and assessment of new machine learning-based SFs (MLSFs) for the predictions of protein–ligand binding poses.
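The citation statement for this entry mentions three concrete steps: dropping features whose variance is below 0.01, standardizing the remaining features, and judging models by the AUROC statistic. The following is a minimal, dependency-free sketch of those three steps only. It is not the cited authors' code: they report using sklearn.preprocessing, XGBoost, and hyperopt, none of which is used here, and the function names (`drop_low_variance`, `standardize`, `auroc`) and the toy data are hypothetical, chosen for illustration.

```python
from statistics import fmean, pstdev, pvariance


def drop_low_variance(rows, threshold=0.01):
    """Remove feature columns whose population variance is below threshold.

    Returns the filtered rows and the indices of the columns that were kept.
    """
    columns = list(zip(*rows))
    kept = [j for j, col in enumerate(columns) if pvariance(col) >= threshold]
    return [[row[j] for j in kept] for row in rows], kept


def standardize(rows):
    """Scale every column to zero mean and unit (population) standard deviation."""
    columns = list(zip(*rows))
    means = [fmean(col) for col in columns]
    # A constant column has zero spread; divide by 1.0 to avoid a ZeroDivisionError.
    stds = [pstdev(col) or 1.0 for col in columns]
    return [[(v - m) / s for v, m, s in zip(row, means, stds)] for row in rows]


def auroc(labels, scores):
    """Area under the ROC curve, computed as the fraction of
    (positive, negative) pairs ranked correctly; ties count as half."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUROC needs at least one positive and one negative label")
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Toy feature matrix (hypothetical): the middle column is nearly constant.
X = [
    [1.0, 0.50, 10.0],
    [2.0, 0.50, 20.0],
    [3.0, 0.51, 30.0],
    [4.0, 0.50, 40.0],
]
filtered, kept = drop_low_variance(X)
print(kept)  # [0, 2]
scaled = standardize(filtered)
print(auroc([0, 0, 1, 1], [0.1, 0.4, 0.35, 0.8]))  # 0.75
```

In a fivefold cross-validation setting such as the one the statement describes, the filtering and scaling statistics would normally be computed on each training fold and then applied to the held-out fold, with the AUROC averaged over the five folds to compare hyper-parameter settings.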

“…Unlike traditional SFs, machine learning (ML)-based scoring functions (MLSFs) do not have particular theory-motivated functional forms, and they are developed by learning from very large volumes of protein-ligand structural and interaction data through ML algorithms, such as random forest (RF), support vector machine (SVM), artificial neural network (ANN), gradient boosting decision tree (GBDT), etc [3, 5, 6, 7, 8]. Consequently, MLSFs have the capability to capture the non-linear relationship between protein-ligand interaction features and binding mode that are difficult to be characterized by classical SFs, thus yielding better binding strength predictions [9, 10]. However, in order to develop an MLSF, we need to generate a set of features to characterize protein-ligand interactions, and furthermore we need to be familiar with ML algorithms, which may be a difficult task for non-experts.…”

Section: Introduction (mentioning)

confidence: 99%

ASFP (Artificial Intelligence based Scoring Function Platform): a web server for the development of customized scoring functions

et al. 2021

Self Cite

Virtual screening (VS) based on molecular docking has emerged as one of the mainstream technologies of drug discovery due to its low cost and high efficiency. However, the scoring functions (SFs) implemented in most docking programs are not always accurate enough and how to improve their prediction accuracy is still a big challenge. Here, we propose an integrated platform called ASFP, a web server for the development of customized SFs for structure-based VS. There are three main modules in ASFP: (1) the descriptor generation module that can generate up to 3437 descriptors for the modelling of protein–ligand interactions; (2) the AI-based SF construction module that can establish target-specific SFs based on the pre-generated descriptors through three machine learning (ML) techniques; (3) the online prediction module that provides some well-constructed target-specific SFs for VS and an additional generic SF for binding affinity prediction. Our methodology has been validated on several benchmark datasets. The target-specific SFs can achieve an average ROC AUC of 0.973 towards 32 targets and the generic SF can achieve the Pearson correlation coefficient of 0.81 on the PDBbind version 2016 core set. To sum up, the ASFP server is a powerful tool for structure-based VS.
