Infinite Physical Monkey: Do Deep Learning Methods Really Perform Better in Conformation Generation?

Zhang, Haotian; Zhang, Jintu; Zhao, Haijian; Jiang, Dejun; Deng, Yafeng

doi:10.1101/2023.03.08.531607

Cited by 5 publications

(9 citation statements)

References 22 publications

(29 reference statements)


“…This speculation is supported by reports that resampling RDKit ensembles using clustering can substantially improve its coverage metric on GEOM-QM9 and GEOM-Drugs. 34 Energy Minimization Is a Valuable Postprocessing Step. Energy minimizing-generated conformers generally improve their ability to recapitulate bioactive conformers (Figures 3 and 4), and selecting the lowest-energy conformer generally performs better than a random conformer (Figures 5, S6, 7, and 11).…”

Section: Discussion (mentioning)

confidence: 99%

“…33 Deep generative models significantly outperform RDKit at this particular task, but an extended sampling and clustering approach using RDKit achieves a highly competitive performance. 34 It is not clear that it is a fair comparison to compare methods that utilize different amounts of sampling, 35 so here we evaluate RDKit and a deep generative model using identical sampling and ensemble formation criteria. As the direct molecular conformation generation (DMCG) 14 was found to perform best at the task of reconstituting the ensembles of the GEOM-Drugs subset of GEOM, we evaluate it here at the task of bioactive conformation recovery.…”

Section: Introduction (mentioning)

confidence: 99%

“…DMCG is an end-to-end generative model with a variational encoder/decoder architecture that learns all network parameters from the training data distribution, GEOM-Drugs. 31 However, our main goal is not to extend previous evaluations 26,27,34 but to explore the impact of various choices made in the conformer generation process, such as the size of the ensemble, the criteria for including conformers in the ensemble, and the use (or not) of energy minimization, on the ultimate end point of the common structure-based tasks of pharmacophore search and molecular docking.…”

Section: Introduction (mentioning)

confidence: 99%


Conformer Generation for Structure-Based Drug Design: How Many and How Good?

McNutt, Bisiriyu, Song et al. 2023

J. Chem. Inf. Model.


Conformer generation, the assignment of realistic 3D coordinates to a small molecule, is fundamental to structure-based drug design. Conformational ensembles are required for rigid-body matching algorithms, such as shape-based or pharmacophore approaches, and even methods that treat the ligand flexibly, such as docking, are dependent on the quality of the provided conformations due to not sampling all degrees of freedom (e.g., only sampling torsions). Here, we empirically elucidate some general principles about the size, diversity, and quality of the conformational ensembles needed to get the best performance in common structure-based drug discovery tasks. In many cases, our findings may parallel “common knowledge” well-known to practitioners of the field. Nonetheless, we feel that it is valuable to quantify these conformational effects while reproducing and expanding upon previous studies. Specifically, we investigate the performance of a state-of-the-art generative deep learning approach versus a more classical geometry-based approach, the effect of energy minimization as a postprocessing step, the effect of ensemble size (maximum number of conformers), and construction (filtering by root-mean-square deviation for diversity) and how these choices influence the ability to recapitulate bioactive conformations and perform pharmacophore screening and molecular docking.


“…We leave the details of molecular conformation generation to Appendix C.4, as paper [57] pointed out that the current benchmark for molecular conformation generation could be wrong.…”

Section: Molecular Conformation Generation (mentioning)

confidence: 99%

Uni-Mol: A Universal 3D Molecular Representation Learning Framework

Zhou, Gao, Ding et al. 2023

Preprint



Molecular representation learning (MRL) has gained tremendous attention due to its critical role in learning from limited supervised data for applications like drug design. In most MRL methods, molecules are treated as 1D sequential tokens or 2D topology graphs, limiting their ability to incorporate 3D information for downstream tasks and, in particular, making it almost impossible for 3D geometry prediction/generation. In this paper, we propose a universal 3D MRL framework, called Uni-Mol, that significantly enlarges the representation ability and application scope of MRL schemes. Uni-Mol contains two pretrained models with the same SE(3) Transformer architecture: a molecular model pretrained by 209M molecular conformations; a pocket model pretrained by 3M candidate protein pocket data. Besides, Uni-Mol contains several finetuning strategies to apply the pretrained models to various downstream tasks. By properly incorporating 3D information, Uni-Mol outperforms SOTA in 14/15 molecular property prediction tasks. Moreover, Uni-Mol achieves superior performance in 3D spatial tasks, including protein-ligand binding pose prediction, molecular conformation generation, etc. The code, model, and data are made publicly available at https://github.com/dptech-corp/Uni-Mol.


“…[5,12,13] only compares their method with other GNN methods), or the task is not on QSAR modeling (such as quantum mechanics dataset QM9 used in [9]). Recently, some authors expressed doubts about deep learning performance over traditional methods in molecular tasks [14,15]. Ultimately, it remains a mystery whether GNNs are consistently better than methods that rely on traditional descriptor in CADD [16,17].…”

Section: Introduction (mentioning)

confidence: 99%

Integrating Expert Knowledge with Deep Learning Improves QSAR Models for CADD Modeling

Liu, Moretti, Wang et al. 2023

Preprint


In recent years several applications of graph neural networks (GNNs) to molecular tasks have emerged. Whether GNNs outperform the traditional descriptor-based methods in the quantitative structure activity relationship (QSAR) modeling in early computer-aided drug discovery (CADD) remains an open question. This paper introduces a simple yet effective strategy to boost the predictive power of QSAR deep learning models. The strategy proposes to train GNNs together with traditional descriptors, combining the strengths of both methods. The enhanced model consistently outperforms vanilla descriptors or GNN methods on nine well-curated high throughput screening datasets over diverse therapeutic targets.
