While deep learning has revolutionized protein structure prediction, almost all experimentally characterized de novo protein designs have been generated using physically based approaches such as Rosetta. Here we describe a deep learning–based protein sequence design method, ProteinMPNN, with outstanding performance in both in silico and experimental tests. On native protein backbones, ProteinMPNN has a sequence recovery of 52.4%, compared to 32.9% for Rosetta. The amino acid sequence at different positions can be coupled between single or multiple chains, enabling application to a wide range of current protein design challenges. We demonstrate the broad utility and high accuracy of ProteinMPNN using X-ray crystallography, cryoEM and functional studies by rescuing previously failed designs, made using Rosetta or AlphaFold, of protein monomers, cyclic homo-oligomers, tetrahedral nanoparticles, and target binding proteins.
While deep learning has revolutionized protein structure prediction, almost all experimentally characterized de novo protein designs have been generated using physically based approaches such as Rosetta. Here we describe a deep learning-based protein sequence design method, ProteinMPNN, with outstanding performance in both in silico and experimental tests. The amino acid sequence at different positions can be coupled between single or multiple chains, enabling application to a wide range of current protein design challenges. On native protein backbones, ProteinMPNN has a sequence recovery of 52.4%, compared to 32.9% for Rosetta. Incorporation of noise during training improves sequence recovery on protein structure models and produces sequences that more robustly encode their structures, as assessed using structure prediction algorithms. We demonstrate the broad utility and high accuracy of ProteinMPNN using X-ray crystallography, cryoEM and functional studies by rescuing previously failed designs, made using Rosetta or AlphaFold, of protein monomers, cyclic homo-oligomers, tetrahedral nanoparticles, and target binding proteins.
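The sequence recovery figures quoted above (52.4% versus 32.9%) are commonly understood as the fraction of positions at which a designed sequence reproduces the native residue. The following is a minimal Python sketch of that common definition, not the authors' evaluation code: the function name `sequence_recovery` and the toy sequences are hypothetical, and the abstract does not state how per-protein values were aggregated.

```python
def sequence_recovery(native: str, designed: str) -> float:
    """Fraction of aligned positions where the designed residue equals the native one.

    Assumes both sequences describe the same backbone, so they have equal length
    and position i in one corresponds to position i in the other.
    """
    if len(native) != len(designed):
        raise ValueError("sequences must have the same length")
    if not native:
        raise ValueError("sequences must be non-empty")
    matches = sum(a == b for a, b in zip(native, designed))
    return matches / len(native)


# Toy example (made-up sequences): 3 of 4 positions agree.
recovery = sequence_recovery("ACDE", "ACDF")
print(f"{recovery:.1%}")
```

A benchmark figure would typically be a summary of this per-sequence value over many backbones; which summary statistic was used is not given here.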
RNA has enormous potential as a therapeutic, yet its successful application depends on efficient delivery strategies. In this study, we demonstrate that a designed artificial viral coat protein, which self-assembles with DNA to form rod-shaped virus-like particles (VLPs), also encapsulates and protects mRNA encoding enhanced green fluorescent protein (EGFP) and luciferase, and yields cellular expression of these proteins. The artificial viral coat protein consists of an oligolysine (K) for binding to the oligonucleotide, a silk protein-like midblock S = (GAGAGAGQ) that self-assembles into stiff rods, and a long hydrophilic random coil block C that shields the nucleic acid cargo from its environment. With mRNA, the C-S-K protein coassembles to form rod-shaped VLPs, each encapsulating about one to five mRNA molecules. Inside the rod-shaped VLPs, the mRNAs are protected against degradation by RNases, and the VLPs also maintain their shape following incubation with serum. Despite the lack of cationic surface charge, the mRNA VLPs transfect cells with both EGFP and luciferase, although with a much lower efficiency than that obtained with a lipoplex transfection reagent. The VLPs have negligible toxicity and minimal hemolytic activity. Our results demonstrate that VLPs yield efficient packaging and shielding of mRNA and create the basis for implementation of additional virus-like functionalities to improve transfection and cell specificity, such as targeting functionalities.
We propose to exploit multivalent binding of solid-binding peptides (SBPs) for the physical attachment of antifouling polypeptide brushes on solid surfaces. Using a silica-binding peptide as a model SBP, we find that both tandem-repeated SBPs and SBPs repeated in branched architectures implemented via a multimerization domain work very well to improve the binding strength of polypeptide brushes, as compared to earlier designs with a single SBP. At the same time, for many of the designed sequences, either the solubility or the yield of recombinant production is low. For a single design, with the domain structure B-M-E, both solubility and yield of recombinant production were high. In this design, B is a silica-binding peptide, M is a highly thermostable, de novo-designed trimerization domain, and E is a hydrophilic elastin-like polypeptide. We show that the B-M-E triblock polypeptide rapidly assembles into highly stable polypeptide brushes on silica surfaces, with excellent antifouling properties against high concentrations of serum albumin. Given that SBPs attaching to a wide range of materials have been identified, the B-M-E triblock design provides a template for the development of polypeptides for coating many other materials such as metals or plastics.
The design of novel protein-protein interfaces using physics-based design methods such as Rosetta requires substantial computational resources and manual refinement by expert structural biologists. A new generation of deep learning methods promises to simplify protein-protein interface design and enable its application to a wide variety of problems by researchers from various scientific disciplines. Here we test the ability of a deep learning method for protein sequence design, ProteinMPNN, to design two-component tetrahedral protein nanomaterials and benchmark its performance against Rosetta. ProteinMPNN had a similar success rate to Rosetta, yielding 13 new experimentally confirmed assemblies, but required orders of magnitude less computation and no manual refinement. The interfaces designed by ProteinMPNN were substantially more polar than those designed by Rosetta, which facilitated in vitro assembly of the designed nanomaterials from independently purified components. Crystal structures of several of the assemblies confirmed the accuracy of the design method at high resolution. Our results showcase the potential of deep learning-based methods to unlock the widespread application of designed protein-protein interfaces and self-assembling protein nanomaterials in biotechnology.
We analyze modularity for a B-M-E triblock protein designed to self-assemble into antifouling coatings. Previously, we have shown that the design performs well on silica surfaces when B is taken to be a silica-binding peptide, M is a thermostable trimer domain, and E is the uncharged elastin-like polypeptide (ELP), E = (GSGVP)_40. Here, we demonstrate that we can modulate the nature of the substrate on which the coatings form by choosing different solid-binding peptides as binding domain B and that we can modulate antifouling properties by choosing a different hydrophilic block E. Specifically, to arrive at antifouling coatings for gold surfaces, as binding block B we use the gold-binding peptide GBP1 (with the sequence MHGKTQATSGTIQS), while we replace the antifouling blocks E by zwitterionic ELPs of different lengths, E^Z_n = (GDGVP-GKGVP)_{n/2}, with n = 20, 40, or 80. We find that even the B-M-E proteins with the shortest E blocks make coatings on gold surfaces with excellent antifouling against 1% human serum (HS) and reasonable antifouling against 10% HS. This suggests that the B-M-E triblock protein can be easily adapted to form antifouling coatings on any substrate for which solid-binding peptide sequences are available.
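The repeat formulas above expand mechanically into amino acid sequences. The sketch below does this in Python under stated assumptions: the function names are hypothetical, the hyphen in (GDGVP-GKGVP) is read as simple concatenation of the two pentapeptides, and only the B and E blocks are built, since the abstract gives no sequence for the trimer domain M or for any linkers.

```python
GBP1 = "MHGKTQATSGTIQS"  # gold-binding peptide sequence quoted in the abstract


def uncharged_elp(repeats: int = 40) -> str:
    """Uncharged ELP block E: the pentapeptide GSGVP repeated `repeats` times."""
    if repeats <= 0:
        raise ValueError("repeats must be positive")
    return "GSGVP" * repeats


def zwitterionic_elp(n: int) -> str:
    """Zwitterionic ELP block: the decapeptide GDGVP+GKGVP repeated n/2 times.

    n counts pentapeptides, so it must be a positive even number.
    """
    if n <= 0 or n % 2 != 0:
        raise ValueError("n must be a positive even number")
    return ("GDGVP" + "GKGVP") * (n // 2)


for n in (20, 40, 80):
    block = zwitterionic_elp(n)
    # Equal numbers of acidic (D) and basic (K) residues: one of each per decapeptide.
    print(n, len(block), block.count("D"), block.count("K"))
```

Each zwitterionic block carries one aspartate and one lysine per ten residues, so the D and K counts are equal at every length.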
Attaining molecular-level control over solidification processes is a crucial aspect of materials science. To control ice formation, organisms have evolved bewildering arrays of ice-binding proteins (IBPs) but these have poorly understood structure-activity relationships. We propose that reverse engineering using de novo computational protein design can shed light on structure-activity relationships of IBPs. We hypothesized that the model alpha-helical winter flounder antifreeze protein (wfAFP) uses an unusual under-twisting of its alpha-helix to align its putative ice-binding threonine residues in exactly the same direction. We test this hypothesis by designing a series of straight three-helix bundles with an ice-binding helix projecting threonines and two supporting helices constraining the twist of the ice-binding helix. We find that ice recrystallization inhibition by the designed proteins increases with the degree of designed under-twisting, thus validating our hypothesis and opening up new avenues for the computational design of ice-binding proteins.
Interfacing biomaterials with solids is a crucial aspect of bionanotechnology and important for applications such as biosensing. Because of their potentially large contact area, flat solenoidal proteins are ideal scaffolds for designing proteins binding to surfaces of man‐made solids such as minerals, metals, and plastics. To explore this opportunity, a naturally occurring flat solenoidal protein, the antifreeze protein from the insect Rhagium inquisitor, is re‐designed. By mutating 4, 6, and 10 out of its 4 × 5 array of threonines into arginines, the silica‐binding proteins RiSiBP‐4, RiSiBP‐6, and RiSiBP‐10 are obtained. Variants with increasing numbers of arginines bind more strongly to silica, but are also less stable and increasingly difficult to produce. It is found that the RiSiBP‐6 variant binds strongly to silica yet still has good stability and is easy to produce. It is shown that sfGFP‐RiSiBP‐6 fusions allow for the functional display of a monolayer of sfGFP cargo on silica surfaces, suggesting the general usefulness of flat solenoidal proteins as scaffolds for designing solid‐binding proteins.