Abstract: Molecular docking predicts whether and how small molecules bind to a macromolecular target using a suitable 3D structure. Scoring functions for structure-based virtual screening primarily aim at discovering which molecules bind to the considered target when these form part of a library with a much higher proportion of non-binders. Classical scoring functions are essentially models building a linear mapping between the features describing a protein-ligand complex and its binding label. Machine learning, a major …
“…For instance, a retrospective analysis based on random decoys found that a ML-based SF (MIEC-SVM) was much more predictive than a classical SF (Autodock4.2) on the ALK target, which was exactly what was later observed prospectively [8]. This is not the only ML-based SF reporting excellent prospective SBVS results without any use of PM decoys [14,34]. It is important to note too that PM decoys are not required either to train or test QSAR models [35], despite predicting exactly the same in vitro potency/affinity endpoints as SFs (e.g.…”
Section: Selecting a Scoring Function Based On Your Own Evaluation (mentioning)
confidence: 67%
“…Intelligence (AI), have demonstrated remarkable accuracy on various drug design applications [9][10][11][12][13][14]. In particular, when re-scoring crystal structures with ligand-bound proteins, or even their redocked poses, SFs are now able to predict the affinities of these binding molecules with high accuracy on many targets (we just wrote a review [13] focusing on this problem and discussing many examples from the literature).…”
Section: SFs Built With Machine Learning (ML) Arguably the Most Deve… (mentioning)
confidence: 99%
“…By contrast, the application of SFs to SBVS is a harder problem, as SFs may struggle to simultaneously predict the affinities of binding and non-binding molecules. With that said, ML-based SFs have been found to provide accurate SBVS performance on many targets [14]. This is particularly true for those targets for which some of their known binders can be employed to train target-specific ML-based SFs [15,16]. There is however uncertainty as to which SF would be most appropriate for a given target.…”
Section: SFs Built With Machine Learning (ML) Arguably the Most Deve… (mentioning)
confidence: 99%
“…As further discussed elsewhere [14], current SBVS benchmarks do not mimic real test sets and hence their ability to anticipate prospective performance must at best be suboptimal. A prime example of this is MUV [21], whose test sets were built from High-Throughput Screening (HTS) datasets.…”
Section: Selecting a Scoring Function Based On Published Evaluations (mentioning)
Interest in docking technologies has grown in parallel with the ever-increasing number and diversity of 3D models for macromolecular therapeutic targets. Structure-Based Virtual Screening (SBVS) aims at leveraging these experimental structures to discover the necessary starting points for the drug discovery process. It is now established that Machine Learning (ML) can strongly enhance the predictive accuracy of scoring functions for SBVS by exploiting large datasets from targets, molecules and their associations. However, with greater choice, the question of which ML-based scoring function is the most suitable for prospective use on a given target has gained importance. Here we analyse two approaches to select an existing scoring function for the target, along with a third approach consisting of generating a scoring function tailored to the target. These analyses required discussing the limitations of popular SBVS benchmarks, the alternative ways to benchmark scoring functions for SBVS, and how to generate or use them with freely available software.
Determining the target‐bound conformation of a drug‐like molecule is a crucial step in drug design, as it affects the outcome of virtual screening (VS), and paves the way for hit‐to‐lead and lead optimization. While most docking programs usually manage to produce at least a near‐native pose for a bioactive molecule inside its binding pocket, their integrated classical scoring functions (SFs) generally fail to prioritize this pose. Many studies have been carried out to tackle this SF problem, offering multiple pose refinement and/or classification methods, albeit with limitations. This study presents a new support vector machine model for pose classification, called “ClassyPose”, which predicts the probability that a receptor‐bound ligand conformation could be near‐native, without any additional pose optimization step. Trained on protein‐ligand extended connectivity features extracted from over 21 600 crystal and docking poses of diverse ligands, this model outperformed other machine‐learning algorithms and three existing SFs in terms of docking power, identifying the native ligand pose as top‐ranked solution for more than 90% of entries in two test sets. It also achieved high specificity (above 0.96), and improved VS performance when used for pose selection. This efficient, user‐friendly tool and all related data are available at https://github.com/vktrannguyen/Classy_Pose.
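The two evaluation measures named in the abstract above, docking power (how often the top-ranked pose is the native or near-native one) and specificity, can be illustrated with a minimal Python sketch. This is not the ClassyPose code: the function names, the tuple layout, the 0.5 probability threshold, and the example entries are all hypothetical, and each pose is assumed to arrive already labeled as near-native or not.

```python
# Illustrative only: toy metrics for pose classification, not the ClassyPose implementation.
# Each pose is (entry_id, predicted_probability, is_near_native); all values are made up.

def top1_success_rate(poses):
    """Fraction of entries whose highest-scored pose is labeled near-native."""
    best = {}  # entry_id -> (score, is_near_native) of the best pose seen so far
    for entry_id, score, is_near_native in poses:
        if entry_id not in best or score > best[entry_id][0]:
            best[entry_id] = (score, is_near_native)
    if not best:
        raise ValueError("no poses supplied")
    hits = sum(1 for _, near_native in best.values() if near_native)
    return hits / len(best)


def specificity(poses, threshold=0.5):
    """True negatives / (true negatives + false positives) over non-native poses.

    A pose is predicted near-native when its score is >= threshold
    (the threshold is an assumption made for this sketch).
    """
    true_neg = false_pos = 0
    for _, score, is_near_native in poses:
        if is_near_native:
            continue  # specificity only considers actual negatives
        if score >= threshold:
            false_pos += 1
        else:
            true_neg += 1
    if true_neg + false_pos == 0:
        raise ValueError("no non-native poses supplied")
    return true_neg / (true_neg + false_pos)


# Hypothetical entries: three complexes with a handful of scored poses each.
poses = [
    ("entry_a", 0.91, True), ("entry_a", 0.40, False), ("entry_a", 0.12, False),
    ("entry_b", 0.35, True), ("entry_b", 0.62, False),
    ("entry_c", 0.88, True), ("entry_c", 0.20, False),
]

print(f"top-1 success rate: {top1_success_rate(poses):.2f}")
print(f"specificity: {specificity(poses):.2f}")
```

In this toy set the classifier ranks a non-native pose first for one of the three entries, so both measures fall short of the levels reported in the abstract; the sketch only shows how such figures are typically computed.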