“…Additionally, clashes with the protein were observed due to SILVR’s model not incorporating protein information. This is further compounded by the fact that our base model, MolDiff [15], surpasses the base model that was used for SILVR, EDM [9], primarily because it models and diffuses the bonds of the molecule, resulting in the generation of molecules with better validity and Synthetic Accessibility [40].…”
Section: Results (mentioning)
confidence: 99%
“…As our base model we use MolDiff [15] as it is among the top-performing models for molecule generation [15]. MolDiff was pre-trained on the GEOM-Drug dataset [31].…”
Section: Methods (mentioning)
confidence: 99%
“…In the context of the diffusion model framework, following the framework introduced in the MolDiff paper [15], during the reverse process, the Markov chain is reversed to reconstruct the true sample. This involves using E(3)-equivariant neural networks to parameterize the transition $p_\theta(M_{t-1} \mid M_t)$ from prior distributions, where and .…”
Section: Methods (mentioning)
confidence: 99%
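The reverse process in the excerpt above can be illustrated with a toy sketch. This is not MolDiff or MolSnapper code: it denoises scalar values rather than atoms, positions and bonds, and an analytic noise predictor for Gaussian data stands in for the E(3)-equivariant network. The names `make_schedule`, `toy_noise_predictor` and `reverse_sample`, the linear noise schedule, and all parameter values are assumptions made only for illustration.

```python
import numpy as np

def make_schedule(num_steps=200, beta_min=1e-4, beta_max=0.05):
    """Linear variance schedule (an assumed choice, not taken from the paper)."""
    betas = np.linspace(beta_min, beta_max, num_steps)
    alphas = 1.0 - betas
    alpha_bars = np.cumprod(alphas)  # cumulative signal retention up to each step
    return betas, alphas, alpha_bars

def toy_noise_predictor(x_t, alpha_bar, mu, sigma):
    """Exact E[eps | x_t] when the clean data is N(mu, sigma^2).

    Hypothetical stand-in for a learned network: for Gaussian data the
    optimal noise prediction has this closed form.
    """
    var_t = alpha_bar * sigma**2 + (1.0 - alpha_bar)  # marginal variance of x_t
    return np.sqrt(1.0 - alpha_bar) * (x_t - np.sqrt(alpha_bar) * mu) / var_t

def reverse_sample(num_samples, mu=2.0, sigma=0.5, seed=0):
    """Run the reversed Markov chain from the prior N(0, 1) down to step 0."""
    rng = np.random.default_rng(seed)
    betas, alphas, alpha_bars = make_schedule()
    x = rng.standard_normal(num_samples)  # sample from the prior distribution
    for t in range(len(betas) - 1, -1, -1):
        eps_hat = toy_noise_predictor(x, alpha_bars[t], mu, sigma)
        # Mean of the transition p(x_{t-1} | x_t) in the standard DDPM form
        mean = (x - betas[t] / np.sqrt(1.0 - alpha_bars[t]) * eps_hat) / np.sqrt(alphas[t])
        # No noise is added on the final step
        noise = rng.standard_normal(num_samples) if t > 0 else 0.0
        x = mean + np.sqrt(betas[t]) * noise
    return x

samples = reverse_sample(5000)
print(samples.mean(), samples.std())
```

With these assumed settings the sample mean and standard deviation should land close to the target values of 2.0 and 0.5, since each step moves the samples from the prior toward the data distribution.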
“…To address the limitations of autoregressive models, recent studies [9,15–18] have turned to diffusion models [19]. These models iteratively denoise data points sampled from a prior distribution to generate samples.…”
Section: Introduction (mentioning)
confidence: 99%
“…The limited volume of experimentally determined structures of protein-ligand complexes often leads models to learn dataset biases rather than grasping the true biophysical principles underlying ligand-protein interactions [21]. Diffusion methods have also been developed and trained on the far larger dataset of just molecules which allows better coverage of drug-like space potentially leading to more viable and synthesizable drug candidates [15].…”
Generative models have emerged as potentially powerful methods for molecular design, yet challenges persist in generating molecules that effectively bind to the intended target. The ability to control the design process and incorporate prior knowledge would be highly beneficial for better tailoring molecules to fit specific binding sites. In this paper, we introduce MolSnapper, a novel tool that is able to condition diffusion models for structure-based drug design by seamlessly integrating expert knowledge in the form of 3D pharmacophores. We demonstrate through comprehensive testing on both the CrossDocked and Binding MOAD datasets that our method generates molecules better tailored to fit a given binding site, achieving high structural and chemical similarity to the original molecules. Compared to alternative methods, it also yields approximately twice as many valid molecules.