Alexander Derry scite author profile

The task of protein sequence design is central to nearly all rational protein engineering problems, and enormous effort has gone into the development of energy functions to guide design. Here, we investigate the capability of a deep neural network model to automate design of sequences onto protein backbones, having learned directly from crystal structure data and without any human-specified priors. The model generalizes to native topologies not seen during training, producing experimentally stable designs. We evaluate the generalizability of our method to a de novo TIM-barrel scaffold. The model produces novel sequences, and high-resolution crystal structures of two designs show excellent agreement with in silico models. Our findings demonstrate the tractability of an entirely learned method for protein sequence design.

ATOM3D: Tasks On Molecules in Three Dimensions

Townshend, Vögele, Suriana et al. 2020

Preprint

Computational methods that operate directly on three-dimensional molecular structure hold large potential to solve important questions in biology and chemistry. In particular, deep neural networks have recently gained significant attention. In this work we present ATOM3D, a collection of both novel and existing datasets spanning several key classes of biomolecules, to systematically assess such learning methods. We develop three-dimensional molecular learning networks for each of these tasks, finding that they consistently improve performance relative to one- and two-dimensional methods. The specific choice of architecture proves to be critical for performance, with three-dimensional convolutional networks excelling at tasks involving complex geometries, while graph networks perform well on systems requiring detailed positional information. Furthermore, equivariant networks show significant promise. Our results indicate many molecular problems stand to gain from three-dimensional molecular learning. All code and datasets can be accessed via https://www.atom3d.ai.

Protein Sequence Design with a Learned Potential

Anand, Eguchi, Mathews et al. 2020

Preprint

The primary challenge of fixed-backbone protein sequence design is to find a distribution of sequences that fold to the backbone of interest. In practice, state-of-the-art protocols often find viable but highly convergent solutions. In this study, we propose a novel method for fixed-backbone protein sequence design using a learned deep neural network potential. We train a convolutional neural network (CNN) to predict a distribution over amino acids at each residue position conditioned on the local structural environment around the residues. Our method for sequence design involves iteratively sampling from this conditional distribution. We demonstrate that this approach is able to produce feasible, novel designs with quality on par with the state-of-the-art, while achieving greater design diversity. In terms of generalizability, our method produces plausible and variable designs for a de novo TIM-barrel structure, showcasing its practical utility in design applications for which there are no known native structures.
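As a rough illustration of the iterative sampling described in the abstract above, the following Python sketch resamples one position at a time from a per-position conditional distribution. It is not the authors' implementation: `toy_conditional` is a hypothetical stand-in for the trained CNN and looks only at neighboring residues rather than the three-dimensional structural environment, and `design_sequence`, `n_sweeps`, and `seed` are names assumed for this example.

```python
import random

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard amino acids


def toy_conditional(sequence, position):
    """Hypothetical stand-in for a trained network.

    Returns one probability per amino acid for `position`, given the rest
    of the sequence. This toy rule simply down-weights residues identical
    to an immediate sequence neighbor; a real model would condition on
    the local structural environment instead.
    """
    neighbors = {
        sequence[i]
        for i in (position - 1, position + 1)
        if 0 <= i < len(sequence)
    }
    weights = [0.2 if aa in neighbors else 1.0 for aa in AMINO_ACIDS]
    total = sum(weights)
    return [w / total for w in weights]


def design_sequence(length, n_sweeps, conditional, seed=0):
    """Iteratively resample each position from its conditional distribution."""
    rng = random.Random(seed)
    # Start from a uniformly random sequence.
    sequence = [rng.choice(AMINO_ACIDS) for _ in range(length)]
    for _ in range(n_sweeps):
        # Visit every position once per sweep, in random order.
        for position in rng.sample(range(length), length):
            probs = conditional(sequence, position)
            sequence[position] = rng.choices(AMINO_ACIDS, weights=probs, k=1)[0]
    return "".join(sequence)


if __name__ == "__main__":
    print(design_sequence(length=12, n_sweeps=5, conditional=toy_conditional))
```

Running the sampler with different seeds yields different sequences, which is the sense in which sampling, as opposed to taking the most likely residue at each position, can give more diverse designs.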

A Literature-Based Knowledge Graph Embedding Method for Identifying Drug Repurposing Opportunities in Rare Diseases

Sosa, Derry, Guo et al. 2019

Preprint

One in ten people are affected by rare diseases, and three out of ten children with rare diseases will not live past age five. However, the small market size of individual rare diseases, combined with the time and capital requirements of pharmaceutical R&D, has hindered the development of new drugs for these cases. A promising alternative is drug repurposing, whereby existing FDA-approved drugs might be used to treat diseases different from their original indications. In order to generate drug repurposing hypotheses in a systematic and comprehensive fashion, it is essential to integrate information from across the literature of pharmacology, genetics, and pathology. To this end, we leverage a newly developed knowledge graph, the Global Network of Biomedical Relationships (GNBR). GNBR is a large, heterogeneous knowledge graph comprising drug, disease, and gene (or protein) entities linked by a small set of semantic themes derived from the abstracts of biomedical literature. We apply a knowledge graph embedding method that explicitly models the uncertainty associated with literature-derived relationships and uses link prediction to generate drug repurposing hypotheses. This approach achieves high performance on a gold-standard test set of known drug indications (AUROC = 0.89) and is capable of generating novel repurposing hypotheses, which we independently validate using external literature sources and protein interaction networks. Finally, we demonstrate the ability of our model to produce explanations of its predictions.
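To make the link-prediction and AUROC evaluation mentioned in the abstract above more concrete, here is a minimal Python sketch. It does not reproduce the paper's uncertainty-aware embedding method; it substitutes a simple TransE-style distance score, and the function names and all embedding vectors and scores are made-up values assumed purely for illustration.

```python
import math


def transe_score(head, relation, tail):
    """TransE-style plausibility of a (head, relation, tail) triple.

    The score is the negative Euclidean distance between head + relation
    and tail, so higher (closer to zero) means more plausible. This is a
    swapped-in scoring function, not the method used in the paper.
    """
    squared = sum((h + r - t) ** 2 for h, r, t in zip(head, relation, tail))
    return -math.sqrt(squared)


def auroc(positive_scores, negative_scores):
    """AUROC as the fraction of (positive, negative) pairs ranked correctly.

    Ties count as half a correct pair.
    """
    correct = 0.0
    for p in positive_scores:
        for n in negative_scores:
            if p > n:
                correct += 1.0
            elif p == n:
                correct += 0.5
    return correct / (len(positive_scores) * len(negative_scores))


if __name__ == "__main__":
    # Hypothetical 2-D embeddings for one drug, one "treats" relation,
    # and two diseases.
    drug = [0.0, 0.0]
    treats = [1.0, 0.0]
    disease_a = [1.0, 0.1]   # close to drug + treats: plausible link
    disease_b = [-1.0, 2.0]  # far from drug + treats: implausible link
    print(transe_score(drug, treats, disease_a))
    print(transe_score(drug, treats, disease_b))
    # Made-up scores for known indications versus sampled non-indications.
    print(auroc([0.9, 0.8], [0.1, 0.85]))
```

In this setup, candidate drug-disease pairs are ranked by score, and AUROC measures how often a known indication outranks a pair assumed to be a non-indication.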

Flexible and stretchable nanowire-coated fibers for optoelectronic probing of spinal cord circuits

Protein sequence design with a learned potential
