Highlights
• Graph neural network generates new proteins with predetermined topologies
• Probabilities assigned to individual amino acids correlate with stability of mutants
• Probabilities assigned to amino acid sequences correlate with stability of designs
• Orders of magnitude faster than traditional approaches
Blocking the association between the severe acute respiratory syndrome coronavirus 2
(SARS-CoV-2) spike protein receptor-binding domain (RBD) and the human
angiotensin-converting enzyme 2 (ACE2) is an attractive therapeutic approach to prevent
the virus from entering human cells. While antibodies and other modalities have been
developed to this end, D-amino acid peptides offer unique advantages, including
serum stability, low immunogenicity, and low cost of production. Here, we designed
potent novel D-peptide inhibitors that mimic the ACE2 α1-binding helix by
searching a mirror-image version of the PDB. The two best designs bound the RBD with
affinities of 29 and 31 nM and blocked the infection of Vero cells by SARS-CoV-2 with
IC50 values of 5.76 and 6.56 μM, respectively. Notably, both
D-peptides neutralized infection by two variants of concern, B.1.1.7 and B.1.351,
with similar potency in vitro. These potent D-peptide inhibitors are
promising lead candidates for developing SARS-CoV-2 prophylactic or therapeutic
treatments.
Consensus-designed tetratricopeptide repeat proteins are highly stable, modular proteins that are strikingly amenable to rational engineering. They therefore have tremendous potential as building blocks for biomaterials and biomedicine. Here, we explore the possibility of extending the loops between repeats to enable further diversification, and we investigate how this modification affects stability and folding cooperativity. We find that extending a single loop by up to 25 residues does not disrupt the overall protein structure, but, strikingly, the effect on stability is highly context-dependent: in a two-repeat array, destabilization is relatively small and can be accounted for purely in entropic terms, whereas extending a loop in the middle of a large array is much more costly because of weakening of the interaction between the repeats. Our findings provide important and, to our knowledge, new insights that increase our understanding of the structure, folding, and function of natural repeat proteins and the design of artificial repeat proteins in biotechnology.
Protein structure and function are determined by the arrangement of the linear sequence of amino acids in 3D space. Despite substantial advances, precisely designing sequences that fold into a predetermined shape (the "protein design" problem) remains difficult. We show that a deep graph neural network, ProteinSolver, can solve protein design by phrasing it as a constraint satisfaction problem (CSP). To sidestep the considerable issue of optimizing the network architecture, we first develop a network that can accurately solve the related and straightforward problem of Sudoku puzzles. Recognizing that each protein design CSP has many solutions, we train this network on millions of real protein sequences corresponding to thousands of protein structures. We show that our method rapidly designs novel protein sequences, and we perform a variety of in silico and in vitro validations suggesting that our designed proteins adopt the predetermined structures.
One Sentence Summary: A neural network optimized using Sudoku puzzles designs protein sequences that adopt predetermined structures.
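To make the constraint satisfaction framing concrete, the following is a minimal sketch of Sudoku expressed as a CSP: each cell is a variable with domain 1 to 9, and each row, column, and 3x3 box imposes an all-different constraint. This is an illustration only, not the ProteinSolver network or its training code; the function names `violations` and `is_solution` are hypothetical. In the protein design analogy, residues would play the role of cells and spatial contacts the role of constraints.

```python
from itertools import product

def sudoku_units():
    """Return the 27 constraint groups (rows, columns, 3x3 boxes) as lists of (row, col)."""
    rows = [[(r, c) for c in range(9)] for r in range(9)]
    cols = [[(r, c) for r in range(9)] for c in range(9)]
    boxes = [
        [(br + dr, bc + dc) for dr, dc in product(range(3), repeat=2)]
        for br, bc in product(range(0, 9, 3), repeat=2)
    ]
    return rows + cols + boxes

def violations(grid):
    """Count duplicated values across all units; 0 marks an unassigned cell."""
    count = 0
    for unit in sudoku_units():
        values = [grid[r][c] for r, c in unit if grid[r][c] != 0]
        count += len(values) - len(set(values))
    return count

def is_solution(grid):
    """A grid solves the CSP when every cell is assigned and no constraint is violated."""
    fully_assigned = all(v != 0 for row in grid for v in row)
    return fully_assigned and violations(grid) == 0

# A valid grid built from a standard shifting pattern (illustrative data only).
solved = [[(3 * r + r // 3 + c) % 9 + 1 for c in range(9)] for r in range(9)]
```

A solver, whether a search procedure or a learned model, proposes assignments for the unassigned cells; a checker like `is_solution` only verifies that a proposal satisfies every constraint.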