Determination of side‐chain conformations is an important step in protein structure prediction and protein design. Many such methods have been presented, although only a small number are in widespread use. SCWRL is one such method, and the SCWRL3 program (2003) has remained popular because of its speed, accuracy, and ease‐of‐use for the purpose of homology modeling. However, higher accuracy at comparable speed is desirable. This has been achieved in a new program SCWRL4 through: (1) a new backbone‐dependent rotamer library based on kernel density estimates; (2) averaging over samples of conformations about the positions in the rotamer library; (3) a fast anisotropic hydrogen bonding function; (4) a short‐range, soft van der Waals atom–atom interaction potential; (5) fast collision detection using k‐discrete oriented polytopes; (6) a tree decomposition algorithm to solve the combinatorial problem; and (7) optimization of all parameters by determining the interaction graph within the crystal environment using symmetry operators of the crystallographic space group. Accuracies as a function of electron density of the side chains demonstrate that side chains with higher electron density are easier to predict than those with low‐electron density and presumed conformational disorder. For a testing set of 379 proteins, 86% of χ1 angles and 75% of χ1+2 angles are predicted correctly within 40° of the X‐ray positions. Among side chains with higher electron density (25–100th percentile), these numbers rise to 89 and 80%. The new program maintains its simple command‐line interface, designed for homology modeling, and is now available as a dynamic‐linked library for incorporation into other software programs. Proteins 2009. © 2009 Wiley‐Liss, Inc.
Rotamer libraries are used in protein structure determination, structure prediction, and design. The backbone-dependent rotamer library consists of rotamer frequencies and their mean dihedral angles and variances as a function of the backbone dihedral angles ϕ and ψ. Previous versions of this rotamer library were not developed with smoothness in mind, although some structure prediction and protein design methods would strongly benefit from smoothing. A new version of the backbone-dependent rotamer library has been developed using adaptive kernel density estimates for the rotamer frequencies and adaptive kernel regression for the mean dihedral angles and variances. The formulation presented allows for evaluation of the rotamer probabilities, mean angles and variances at any ϕ, ψ point, i.e. as a continuous function of ϕ and ψ. Continuous probability density estimates for the non-rotameric degrees of freedom of amides, carboxylates, and aromatic side chains have been modeled as a function of the backbone dihedral angles and rotamers of the remaining degrees of freedom. New backbone-dependent rotamer libraries at varying levels of smoothing are available from http://dunbrack.fccc.edu.
Over the past decade, the Rosetta biomolecular modeling suite has informed diverse biological questions and engineering challenges ranging from interpretation of low-resolution structural data to design of nanomaterials, protein therapeutics, and vaccines. Central to Rosetta’s success is the energy function: a model parameterized from small molecule and X-ray crystal structure data used to approximate the energy associated with each biomolecule conformation. This paper describes the mathematical models and physical concepts that underlie the latest Rosetta Energy Function, REF15. Applying these concepts, we explain how to use Rosetta energies to identify and analyze the features of biomolecular models. Finally, we discuss the latest advances in the energy function that extend capabilities from soluble proteins to also include membrane proteins, peptides containing non-canonical amino acids, small molecules, carbohydrates, nucleic acids, and other macromolecules.
Over the past decade, the Rosetta biomolecular modeling suite has informed diverse biological questions and engineering challenges ranging from interpretation of low-resolution structural data to design of nanomaterials, protein therapeutics, and vaccines. Central to Rosetta's success is the energy function: a model parameterized from small molecule and X-ray crystal structure data used to approximate the energy associated with each biomolecule conformation. This paper describes the mathematical models and physical concepts that underlie the latest Rosetta energy function, beta_nov15. Applying these concepts, we explain how to use Rosetta energies to identify and analyze the features of biomolecular models. Finally, we discuss the latest advances in the energy function that extend capabilities from soluble proteins to also include membrane proteins, peptides containing non-canonical amino acids, carbohydrates, nucleic acids, and other macromolecules.
Distributions of the backbone dihedral angles of proteins have been studied for over 40 years. While many statistical analyses have been presented, only a handful of probability densities are publicly available for use in structure validation and structure prediction methods. The available distributions differ in a number of important ways, which determine their usefulness for various purposes. These include: 1) input data size and criteria for structure inclusion (resolution, R-factor, etc.); 2) filtering of suspect conformations and outliers using B-factors or other features; 3) secondary structure of input data (e.g., whether helix and sheet are included; whether beta turns are included); 4) the method used for determining probability densities ranging from simple histograms to modern nonparametric density estimation; and 5) whether they include nearest neighbor effects on the distribution of conformations in different regions of the Ramachandran map. In this work, Ramachandran probability distributions are presented for residues in protein loops from a high-resolution data set with filtering based on calculated electron densities. Distributions for all 20 amino acids (with cis and trans proline treated separately) have been determined, as well as 420 left-neighbor and 420 right-neighbor dependent distributions. The neighbor-independent and neighbor-dependent probability densities have been accurately estimated using Bayesian nonparametric statistical analysis based on the Dirichlet process. In particular, we used hierarchical Dirichlet process priors, which allow sharing of information between densities for a particular residue type and different neighbor residue types. The resulting distributions are tested in a loop modeling benchmark with the program Rosetta, and are shown to improve protein loop conformation prediction significantly. The distributions are available at http://dunbrack.fccc.edu/hdp.
scite is a Brooklyn-based organization that helps researchers better discover and understand research articles through Smart Citations–citations that display the context of the citation and describe whether the article provides supporting or contrasting evidence. scite is used by students and researchers from around the world and is funded in part by the National Science Foundation and the National Institute on Drug Abuse of the National Institutes of Health.
hi@scite.ai
10624 S. Eastern Ave., Ste. A-614
Henderson, NV 89052, USA
Copyright © 2024 scite LLC. All rights reserved.
Made with 💙 for researchers
Part of the Research Solutions Family.