Motivation: Local protein structure is usually described by assigning each peptide to a unique class from a set of pre-defined structures. These classifications may differ in the number of structural classes, the length of peptides, or the class attribution criteria. Most methods that predict the local structure of a protein from its sequence first rely on some classification and only then proceed to assess the 3D conformation. However, most classification methods rely on the existence of homologous proteins, unavoidably lose information by attributing a peptide to a single class, or suffer from a suboptimal choice of representative classes.
Results: To alleviate these challenges, we propose a method that constructs a peptide’s structural representation from its sequence, reflecting its similarity to several basic representative structures. For 5-mer peptides and 16 representative structures, we achieved a Q16 classification accuracy of 67.9%, which is higher than what is currently reported in the literature. Our prediction method does not utilize information about protein homologues but relies only on the physicochemical properties of the amino acids and the statistics of resolved structures. We also show that the 3D coordinates of a peptide can be uniquely recovered from its structural coordinates, and give the conditions required for this under various geometric constraints.
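To make the Q16 figure concrete, the following minimal Python sketch shows one way such an accuracy could be computed from a multi-class structural representation. It is an illustration under stated assumptions, not the authors' implementation: the names `hard_class` and `q_accuracy` are hypothetical, each peptide is assumed to be represented by 16 similarity scores (one per representative structure), and a higher score is assumed to mean greater similarity.

```python
from typing import Sequence


def hard_class(similarities: Sequence[float]) -> int:
    """Return the index of the most similar representative structure.

    Assumption: higher score means more similar. If the representation
    stored distances instead, min() would be used in place of max().
    """
    return max(range(len(similarities)), key=lambda k: similarities[k])


def q_accuracy(predicted: Sequence[Sequence[float]],
               true_classes: Sequence[int]) -> float:
    """Fraction of peptides whose top-scoring class equals the true class.

    With 16 representative structures this is a Q16-style accuracy.
    """
    if len(predicted) != len(true_classes):
        raise ValueError("predicted and true_classes must have equal length")
    if not true_classes:
        raise ValueError("at least one peptide is required")
    hits = sum(hard_class(p) == t for p, t in zip(predicted, true_classes))
    return hits / len(true_classes)
```

The soft representation keeps all 16 scores for each peptide; collapsing it to a single class with `hard_class` is needed only to compare against methods that report a single-class accuracy.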
Motivation: Local protein structure is usually described by assigning each peptide to a unique element from a set of pre-defined structures. These so-called structural alphabets may differ in the number of structures or the length of peptides. Most methods that predict the local structure of a protein from its sequence rely on this kind of classification. However, since all peptides assigned to the same class are indistinguishable, such an approach may not be sufficient to model protein folding with high accuracy.
Results: We developed a method that predicts the structural representation of a peptide from its sequence. For 5-mer peptides, we achieved a Q16 classification accuracy of 67.9%, which is higher than what is currently reported in the literature. Importantly, our prediction method does not utilize information about protein homologues; it relies on a comprehensive feature-generation procedure based only on the protein sequence, the physicochemical properties of the amino acids, and the statistics of resolved structures. We also show that the 3D coordinates of a peptide can be uniquely recovered from its structural coordinates, and give the conditions required for this under various geometric constraints.
Availability: The online implementation of the method is provided freely at http://pbpred.eimb.ru
Contact: milch@eimb.ru or vmilchev@uni-koeln.de
Supplementary information: Supplementary data are available online at http://pbpred.eimb.ru/S/index.html