malgorzata.kotulska@pwr.edu.pl.
Supplementary data are available at Bioinformatics online.
Contact sites between amino acids characterize important structural features of a protein. We investigated the characteristics of contact sites in a representative set of proteins and their relation to protein class and topology. For this purpose, we used a non-redundant set of 5872 protein domains that are categorized identically by the CATH and SCOP databases. The proteins represented the alpha, beta, and alpha+beta classes. Contact maps of protein structures were obtained for a selected set of physical distances in the main backbone and separations in the protein sequence. For each set, the dependency between contact degree and the distance parameters was quantified. We identified the residues that form contact sites most frequently and the unique amino acid pairs that create contact sites most often within each structural class. Contact characteristics of specific topologies were compared to the characteristics of their protein classes, revealing protein groups with distinctive contact characteristics. We showed that our results could be used to improve the performance of a recent top-performing contact predictor, direct coupling analysis. Our work provides values of contact site propensities that can be incorporated into bioinformatic databases.
Electronic supplementary material: The online version of this article (doi:10.1007/s00894-014-2497-9) contains supplementary material, which is available to authorized users.
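To make the notions of contact map and contact degree concrete, here is a minimal sketch rather than the authors' pipeline. It assumes one backbone coordinate per residue (for example the Cα atom), and the names `contact_map`, `contact_degree`, `max_dist` and `min_sep`, as well as the 8 Å cutoff and the toy coordinates, are illustrative assumptions not taken from the abstract.

```python
import math

def contact_map(coords, max_dist=8.0, min_sep=1):
    """Return the set of residue index pairs (i, j), i < j, that are in contact.

    coords   -- one (x, y, z) backbone coordinate per residue (assumed: C-alpha)
    max_dist -- distance cutoff in angstroms (illustrative default)
    min_sep  -- minimum separation |i - j| along the sequence
    """
    contacts = set()
    n = len(coords)
    for i in range(n):
        for j in range(i + min_sep, n):
            if math.dist(coords[i], coords[j]) <= max_dist:
                contacts.add((i, j))
    return contacts

def contact_degree(contacts, n):
    """Number of contacts each of the n residues takes part in."""
    degree = [0] * n
    for i, j in contacts:
        degree[i] += 1
        degree[j] += 1
    return degree

# Toy example: five residues on a straight line, 3.8 angstroms apart.
coords = [(3.8 * k, 0.0, 0.0) for k in range(5)]
contacts = contact_map(coords, max_dist=8.0, min_sep=2)
degrees = contact_degree(contacts, len(coords))
```

Repeating the calculation over a grid of `max_dist` and `min_sep` values is one way the dependency between contact degree and the distance parameters could be explored.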
Knowledge about the three-dimensional structure of proteins is crucial in order to learn about their behavior, stability, or role as a target in drug design. Unfortunately, traditional experimental methods used in structure determination, such as X-ray crystallography and NMR, are costly and time-consuming. Therefore, computational methods that allow for protein structure reconstruction from sequence only are greatly desired. One of these is the recently developed direct coupling analysis (DCA) method [1, 2], which achieves the best results in residue-residue contact prediction from multiple sequence alignments only. Predicted contacts are used as restraints in the reconstruction of the three-dimensional structure of a protein. Unfortunately, the accuracy of DCA methods is on the order of 40% among the 100 strongest predicted contacts, which is insufficient for ab initio protein structure reconstruction. However, the results of DCA can support protein structure reconstruction in a different way. Our results show that DCA can indicate the best protein structure among its structural variants by the prediction of residue-residue contacts [3]. We counted the number of correctly predicted contacts within the strongest 100 DCA predictions for a set of obsolete PDB entries and their successors and for 22 proteins for which the Decoys 'R' Us database [4] provided properly folded and misfolded structures. These numbers were related to structure similarity scores, such as RMSD or TM-score [5]. DCA correctly predicts significantly more contacts for properly folded structures than for misfolded ones. Our method works much better for structures determined with X-ray crystallography than with NMR spectroscopy [3]. The method will not detect misfolded proteins per se, but when a protein structure experimentalist needs to choose between alternative folds for the same protein, DCA can help.
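The counting step described above can be sketched as follows. This is an illustrative outline under stated assumptions, not the published implementation: the function name `count_correct_top_n`, the input layout (a list of `((i, j), score)` tuples and a set of contacts observed in a candidate structure), and the toy scores are all hypothetical.

```python
def count_correct_top_n(scored_pairs, structure_contacts, n=100):
    """Count how many of the n strongest predicted contacts occur in a structure.

    scored_pairs       -- list of ((i, j), score); higher score = stronger prediction
    structure_contacts -- set of (i, j) pairs with i < j observed in the structure
    """
    ranked = sorted(scored_pairs, key=lambda item: item[1], reverse=True)[:n]
    return sum(1 for pair, _score in ranked
               if tuple(sorted(pair)) in structure_contacts)

# Toy data (hypothetical): four predictions, two alternative structures.
predictions = [((0, 2), 0.9), ((1, 3), 0.8), ((0, 4), 0.7), ((2, 4), 0.1)]
variant_a = {(0, 2), (1, 3), (2, 4)}   # contacts in one structural variant
variant_b = {(0, 4)}                   # contacts in an alternative variant

hits_a = count_correct_top_n(predictions, variant_a, n=3)
hits_b = count_correct_top_n(predictions, variant_b, n=3)
preferred = "a" if hits_a > hits_b else "b"
```

In this reading, the variant that agrees with more of the strongest predictions is the one the contact predictions favor; the comparison ranks alternatives and does not by itself certify that either fold is correct.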
We introduce a database containing peptides related to diseases arising from protein aggregation. The general database AmyLoad includes all experimentally studied protein fragments that could be involved in erroneous protein folding, leading to amyloid formation. Since its first release, the database has been extended with new instances of peptides or their fragments. Moreover, information on related diseases has been added to all entries, whenever available. Currently the database includes all available peptides tested for their potential amyloid properties, obtained from diverse resources, creating the largest dataset available in one place. This enables comparison between the properties of amyloid and non-amyloid peptides. We could also select candidates for the most pathogenic peptides, involved in several diseases related to protein aggregation. We also discuss the need for sub-databases of different structures, such as those related to βγ-crystallins, a protein family occurring in the eye lens. Misfolding of these proteins may lead to various forms of cataract. These freely available internet services can facilitate finding the link between a protein sequence, its propensity to aggregation and the resulting disease, as well as support research on their pharmacological treatment and prevention.