Structural coordinates: A novel approach to predict protein backbone conformation

Milchevskaya, Vladislava; Nikitin, Alexei; Lukshin, Sergey A.; Filatov, I. V.; Kravatsky, Yuri V.; Tumanyan, V. G.; Esipova, N. G.; Milchevskiy, Yury V.

doi:10.1371/journal.pone.0239793

Cited by 3 publications

(9 citation statements)

References 28 publications

“…In our recent work [16], we proposed using the same protein blocks as cluster centers, but we assigned cluster labels based on the root mean square deviation (RMSD) [17] distance instead of the RMSDA. The RMSD for protein blocks can be written as follows:…”

Section: Introduction (mentioning)

confidence: 99%

“…., 3M denote the Cartesian coordinates of the backbone atoms N, Cα, and C, respectively, of the M-residue protein block PB_k (M = 5 for the PBs as in (1)), whereas the minimum is taken over by all of the spatial colocalization of the two protein blocks PB_1 and PB_2. By definition, an RMSD is close to zero only when two structures are identical in three dimensions, and using the RMSD may therefore be preferable to using the RMSDA, which does not always satisfy this criterion [16]. However, RMSD calculations require assessing all possible alignments between two query fragments, and thus, they have higher computational costs than RMSDA calculations.…”

Section: Introduction (mentioning)

confidence: 99%
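
The statement above describes the RMSD as a minimum over all spatial superpositions of two protein blocks. As a rough illustration only, the sketch below computes that minimum with the Kabsch algorithm (a closed-form optimal rotation via SVD). This is a stand-in technique chosen for the example; the quoted papers do not say which superposition procedure they use, and the function name `kabsch_rmsd` and the (atoms, 3) array layout are assumptions.

```python
import numpy as np


def kabsch_rmsd(P, Q):
    """Minimum RMSD between two (n_atoms, 3) arrays over rigid-body superpositions.

    Illustrative sketch using the Kabsch algorithm; not taken from the cited papers.
    For a 5-residue protein block with backbone atoms N, CA, C, n_atoms would be 15.
    """
    P = np.asarray(P, dtype=float)
    Q = np.asarray(Q, dtype=float)
    if P.shape != Q.shape or P.ndim != 2 or P.shape[1] != 3:
        raise ValueError("P and Q must both have shape (n_atoms, 3)")

    # Remove translation by centering both fragments on their centroids.
    Pc = P - P.mean(axis=0)
    Qc = Q - Q.mean(axis=0)

    # Covariance matrix and its singular value decomposition.
    H = Pc.T @ Qc
    U, _, Vt = np.linalg.svd(H)

    # Force a proper rotation (determinant +1) so mirror images are not matched.
    d = 1.0 if np.linalg.det(Vt.T @ U.T) >= 0 else -1.0
    R = Vt.T @ np.diag([1.0, 1.0, d]) @ U.T

    # Rotate P onto Q and measure the residual deviation.
    diff = Pc @ R.T - Qc
    return float(np.sqrt((diff ** 2).sum() / len(P)))
```

Under this sketch, a fragment and any rotated and translated copy of it give an RMSD of zero up to rounding, which is the property the statement contrasts with the RMSDA.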

“…In the context of 3D coordinate reconstruction, cluster centers are treated as basic structures, and distances from the query fragment to these basic structures are referred to as "structural coordinates". We previously demonstrated that, given a large enough set of basic structures, one can reconstruct the coordinates of the backbone atoms of a protein fragment [16]. The number of basic structures required for such a reconstruction depends on the number of degrees of freedom (i.e., bond angles, bond lengths, and dihedral angles).…”

Section: Introduction (mentioning)

confidence: 99%

“…For instance, for a five-residue fragment with all bond lengths and bond angles considered to be fixed, one would need at least twelve basic structures to reconstruct the conformation of the backbone. We previously developed and implemented an algorithm [16] that could unambiguously reconstruct a protein's backbone coordinates from its "structural coordinates" representation.…”

Section: Introduction (mentioning)

confidence: 99%
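
One plausible reading of the figure of twelve quoted above: an unbranched chain of n atoms has n - 3 torsion (dihedral) angles, and a five-residue backbone of N, Cα, and C atoms has 15 atoms. The quoted text does not spell out this derivation, so the sketch below is an assumption about where the number comes from, and `backbone_torsion_count` is a hypothetical helper.

```python
def backbone_torsion_count(n_residues, atoms_per_residue=3):
    """Number of torsion angles in an unbranched backbone chain.

    Assumes bond lengths and bond angles are fixed, so the torsion angles
    are the only remaining internal degrees of freedom. A chain of n atoms
    has n - 1 bonds, n - 2 bond angles, and n - 3 torsion angles.
    """
    if n_residues < 0 or atoms_per_residue < 1:
        raise ValueError("n_residues must be >= 0 and atoms_per_residue >= 1")
    n_atoms = n_residues * atoms_per_residue
    # Fewer than four atoms cannot define a torsion angle.
    return max(n_atoms - 3, 0)


# Five residues with N, CA, C per residue: 15 atoms.
five_residue_dof = backbone_torsion_count(5)
```

With these assumptions the count for a five-residue fragment is 15 - 3 = 12, which agrees with the minimum number of basic structures given in the statement.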

“…In our previous work [16], we did not utilize evolutionary information. Instead, our feature encoding relied solely on amino acids' physicochemical properties and resolved structures' statistics.…”

Section: Introduction (mentioning)

confidence: 99%

Effective Local and Secondary Protein Structure Prediction by Combining a Neural Network-Based Approach with Extensive Feature Design and Selection without Reliance on Evolutionary Information

Milchevskiy, Milchevskaya, Nikitin et al. 2023. IJMS. Self-citation.

Protein structure prediction continues to pose multiple challenges despite outstanding progress that is largely attributable to the use of novel machine learning techniques. One of the widely used representations of local 3D structure—protein blocks (PBs)—can be treated in a similar way to secondary structure classes. Here, we present a new approach for predicting local conformation in terms of PB classes solely from amino acid sequences. We apply the RMSD metric to ensure unambiguous future 3D protein structure recovery. The selection of statistically assessed features is a key component of the proposed method. We suggest that ML input features should be created from the statistically significant predictors that are derived from the amino acids’ physicochemical properties and the resolved structures’ statistics. The statistical significance of the suggested features was assessed using a stepwise regression analysis that permitted the evaluation of the contribution and statistical significance of each predictor. We used the set of 380 statistically significant predictors as a learning model for the regression neural network that was trained using the PISCES30 dataset. When using the same dataset and metrics for benchmarking, our method outperformed all other methods reported in the literature for the CB513 nonredundant dataset (for the PBs, Q16 = 81.01%, and for the DSSP, Q3 = 85.99% and Q8 = 79.35%).
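
The Q16, Q3, and Q8 figures in the abstract above are per-residue accuracies over 16 protein-block classes and 3 or 8 DSSP classes. As a minimal sketch, assuming the usual definition (percentage of residues whose predicted class equals the reference class), the calculation looks like this. The function name `q_accuracy` and the one-letter-per-residue label strings are illustrative, not taken from the paper.

```python
def q_accuracy(true_labels, predicted_labels):
    """Percentage of positions where the predicted class matches the reference.

    Works for any class alphabet: 3 or 8 DSSP states, or 16 protein blocks.
    Inputs are equal-length sequences (strings or lists) with one label per residue.
    """
    if len(true_labels) != len(predicted_labels):
        raise ValueError("label sequences must have the same length")
    if len(true_labels) == 0:
        raise ValueError("label sequences must not be empty")
    correct = sum(t == p for t, p in zip(true_labels, predicted_labels))
    return 100.0 * correct / len(true_labels)


# Toy three-state example (H = helix, E = strand, C = coil); labels are made up.
toy_q3 = q_accuracy("HHEEC", "HHECC")
```

The same function applies unchanged to a 16-letter protein-block alphabet, since only label equality is used.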

Method to Generate Complex Predictive Features for Machine Learning-Based Prediction of the Local Structure and Functions of Proteins

Milchevskiy, Milchevskaya, Kravatsky 2023. Mol Biol.

A Method to Generate Complex Predictive Features for ML-Based Prediction of the Local Protein Structure

Milchevskiy, Milchevskaya, Kravatsky 2023. Molekulârnaâ biologiâ.

Recently, the prediction of protein structure and function from its sequence underwent a rapid increase in performance. It is primarily due to the application of machine learning methods, many of which rely on the predictive features supplied to them. It is thus crucial to retrieve the information encoded in the amino acid sequence of a protein. Here, we propose a method to generate a set of complex yet interpretable predictors, which aids in revealing factors that influence protein conformation. The proposed method allows us to generate predictive features and test them for significance in two scenarios: for a general description of the protein structures and functions, as well as for highly specific predictive tasks. Having generated an exhaustive set of predictors, we narrow it down to a smaller curated set of informative features using feature selection methods, which increases the performance of subsequent predictive modelling. We illustrate the effectiveness of the proposed methodology by applying it in the context of local protein structure prediction, where the rate of correct prediction for DSSP Q3 (three-class classification) is 81.3%. The method is implemented in C++ for command line use and can be run on any operating system. The source code is released on GitHub: https://github.com/Milchevskiy/protein-encoding-projects.
