Machine‐learning scoring functions for structure‐based virtual screening

Li, Hongjian; Sze, Kam-Heung; Lü, Gang; Ballester, Pedro J.

doi:10.1002/wcms.1478

“…For instance, a retrospective analysis based on random decoys found that a ML-based SF (MIEC-SVM) was much more predictive than a classical SF (Autodock4.2) on the ALK target, which was exactly what was later observed prospectively [8]. This is not the only ML-based SF reporting excellent prospective SBVS results without any use of PM decoys [14,34]. It is important to note too that PM decoys are not required either to train or test QSAR models [35], despite predicting exactly the same in vitro potency/affinity endpoints as SFs (e.g.…”

Section: Selecting a Scoring Function Based On Your Own Evaluation (mentioning)

confidence: 67%

“…Intelligence (AI), have demonstrated remarkable accuracy on various drug design applications [9][10][11][12][13][14]. In particular, when re-scoring crystal structures with ligand-bound proteins, or even their redocked poses, SFs are now able to predict the affinities of these binding molecules with high accuracy on many targets (we just wrote a review [13] focusing on this problem and discussing many examples from the literature).…”

Section: SFs Built With Machine Learning (ML) Arguably the Most Deve… (mentioning)

confidence: 99%

“…By contrast, the application of SFs to SBVS is a harder problem, as SFs may struggle to simultaneously predict the affinities of binding and non-binding molecules. With that said, ML-based SFs have been found to provide accurate SBVS performance on many targets [14]. This is particularly true for those targets for which some of their known binders can be employed to train target-specific ML-based SFs [15,16]. There is however uncertainty as to which SF would be most appropriate for a given target.…”

Section: SFs Built With Machine Learning (ML) Arguably the Most Deve… (mentioning)

confidence: 99%

“…As further discussed elsewhere [14], current SBVS benchmarks do not mimic real test sets and hence their ability to anticipate prospective performance must at best be suboptimal. A prime example of this is MUV [21], whose test sets were built from High-Throughput Screening (HTS) datasets.…”

Section: Selecting a Scoring Function Based On Published Evaluations (mentioning)

confidence: 99%


Selecting machine-learning scoring functions for structure-based virtual screening

Ballester¹

2019

Drug Discovery Today: Technologies

Self Cite


Interest in docking technologies has grown in parallel with the ever-increasing number and diversity of 3D models for macromolecular therapeutic targets. Structure-Based Virtual Screening (SBVS) aims at leveraging these experimental structures to discover the necessary starting points for the drug discovery process. It is now established that Machine Learning (ML) can strongly enhance the predictive accuracy of scoring functions for SBVS by exploiting large datasets from targets, molecules and their associations. However, with greater choice, the question of which ML-based scoring function is the most suitable for prospective use on a given target has gained importance. Here we analyse two approaches to select an existing scoring function for the target, along with a third approach consisting of generating a scoring function tailored to the target. These analyses required discussing the limitations of popular SBVS benchmarks, the alternatives to benchmark scoring functions for SBVS, and how to generate or use them with freely available software.



Selecting Machine-Learning Scoring Functions for Structure-Based Virtual Screening

Ballester¹

2020

Preprint




ClassyPose: A Machine‐Learning Classification Model for Ligand Pose Selection Applied to Virtual Screening in Drug Discovery

Tran‐Nguyen, Camproux, Taboureau

2024

Advanced Intelligent Systems



Determining the target‐bound conformation of a drug‐like molecule is a crucial step in drug design, as it affects the outcome of virtual screening (VS), and paves the way for hit‐to‐lead and lead optimization. While most docking programs usually manage to produce at least a near‐native pose for a bioactive molecule inside its binding pocket, their integrated classical scoring functions (SFs) generally fail to prioritize this pose. Many studies have been carried out to tackle this SF problem, offering multiple pose refinement and/or classification methods, albeit with limitations. This study presents a new support vector machine model for pose classification, called “ClassyPose”, which predicts the probability that a receptor‐bound ligand conformation could be near‐native, without any additional pose optimization step. Trained on protein‐ligand extended connectivity features extracted from over 21 600 crystal and docking poses of diverse ligands, this model outperformed other machine‐learning algorithms and three existing SFs in terms of docking power, identifying the native ligand pose as top‐ranked solution for more than 90% of entries in two test sets. It also achieved high specificity (above 0.96), and improved VS performance when used for pose selection. This efficient, user‐friendly tool and all related data are available at https://github.com/vktrannguyen/Classy_Pose.


Machine‐learning scoring functions for structure‐based virtual screening

Cited by 122 publications

References 134 publications
