There are 20^200 possible amino-acid sequences for a 200-residue protein, of which the natural evolutionary process has sampled only an infinitesimal subset. De novo protein design explores the full sequence space, guided by the physical principles that underlie protein folding. Computational methodology has advanced to the point that a wide range of structures can be designed from scratch with atomic-level accuracy. Almost all protein engineering so far has involved the modification of naturally occurring proteins; it should now be possible to design new functional proteins from the ground up to tackle current challenges in biomedicine and nanotechnology.
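The size of the sequence space quoted above (20 raised to the power 200) can be checked with a few lines of arithmetic. The following is a minimal sketch, assuming the standard 20-letter amino-acid alphabet; the function names are illustrative and do not come from the cited work.

```python
import math

# Assumption: the standard alphabet of 20 proteinogenic amino acids.
AMINO_ACIDS = 20


def sequence_space_size(length: int, alphabet: int = AMINO_ACIDS) -> int:
    """Exact number of distinct sequences of the given length."""
    return alphabet ** length


def log10_sequence_space(length: int, alphabet: int = AMINO_ACIDS) -> float:
    """Base-10 logarithm of the sequence-space size, useful for comparing magnitudes."""
    return length * math.log10(alphabet)


if __name__ == "__main__":
    size = sequence_space_size(200)
    # Python integers are arbitrary precision, so the count is exact.
    print(f"digits in 20^200: {len(str(size))}")
    print(f"log10(20^200): {log10_sequence_space(200):.3f}")
```

The count is roughly 1.6 × 10^260. For scale, the number of atoms in the observable universe is commonly estimated at around 10^80, so no natural or experimental process could enumerate more than a negligible fraction of this space.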
Vaccine development to induce broadly neutralizing antibodies (bNAbs) against HIV-1 is a global health priority. Potent VRC01-class bNAbs against the CD4 binding site of HIV gp120 have been isolated from HIV-1-infected individuals; however, such bNAbs have not been induced by vaccination. Wild-type gp120 proteins lack detectable affinity for predicted germline precursors of VRC01-class bNAbs, making them poor immunogens to prime a VRC01-class response. We employed computation-guided, in vitro screening to engineer a germline-targeting gp120 outer domain immunogen that binds to multiple VRC01-class bNAbs and their germline precursors. When multimerized on nanoparticles, this immunogen (eOD-GT6) activates both germline and mature VRC01-class B cells. Thus, eOD-GT6 nanoparticles have promise as a vaccine prime candidate. In principle, similar germline-targeting strategies can be applied to other epitopes and pathogens.
The HIV envelope (Env) protein gp120 is protected from antibody recognition by a dense glycan shield. However, several of the recently identified PGT broadly neutralizing antibodies appear to interact directly with the HIV glycan coat. Crystal structures of Fabs PGT 127 and 128 with Man9 at 1.65 and 1.29 Å resolution, respectively, and glycan binding data delineate a specific high mannose binding site. Fab PGT 128 complexed with a fully glycosylated gp120 outer domain at 3.25 Å reveals that the antibody penetrates the glycan shield and recognizes two conserved glycans as well as a short β-strand segment of the gp120 V3 loop, accounting for its high binding affinity and broad specificity. Furthermore, our data suggest that the high neutralization potency of PGT 127 and 128 IgGs may be mediated by cross-linking Env trimers on the viral surface.
Despite decades of work by structural biologists, there are still ~5200 protein families with unknown structure outside the range of comparative modeling. We show that Rosetta structure prediction guided by residue-residue contacts inferred from evolutionary information can accurately model proteins that belong to large families, and that metagenome sequence data more than triples the number of protein families with sufficient sequences for accurate modeling. We then integrate metagenome data, contact-based structure matching, and Rosetta structure calculations to generate models for 614 protein families with currently unknown structures; 206 are membrane proteins and 137 have folds not represented in the PDB. This approach provides the representative models for large protein families originally envisioned as the goal of the Protein Structure Initiative at a fraction of the cost.
The Rosetta software suite for macromolecular modeling, docking, and design is widely used in pharmaceutical, industrial, academic, non-profit, and government laboratories. Despite its broad modeling capabilities, Rosetta remains consistently among leading software suites when compared to other methods created for highly specialized protein modeling and design tasks. Developed for over two decades by a global community of over 60 laboratories, Rosetta has undergone multiple refactorings, and now comprises over three million lines of code. Here we discuss methods developed in the last five years in Rosetta, involving the latest protocols for structure prediction; protein-protein and protein-small molecule docking; protein structure and interface design; loop modeling; the incorporation of various types of experimental data; modeling of peptides, antibodies and proteins in the immune system, nucleic acids, non-standard chemistries, carbohydrates, and membrane proteins. We briefly discuss improvements to the energy function, user interfaces, and usability of the software. Rosetta is available at www.rosettacommons.org.