2015
DOI: 10.1371/journal.pone.0143166

Machine Learning: How Much Does It Tell about Protein Folding Rates?

Abstract: The prediction of protein folding rates is a necessary step towards understanding the principles of protein folding. Due to the increasing amount of experimental data, numerous protein folding models and predictors of protein folding rates have been developed in the last decade. The problem has also attracted the attention of scientists from computational fields, which led to the publication of several machine learning-based models to predict the rate of protein folding. Some of them claim to predict the logar…

Cited by 18 publications (18 citation statements)
References 36 publications (66 reference statements)
“…To summarize, the theoretical and semi-empirical methods (that use such meaningful parameters as the chain length [51,69,70], protein globule cross-section [71], α-helical content [72], locality of contacts [73], contact order [74,75], etc., but do not use or use a very small number of adjustable parameters) show better predictive power and correlation with experiment than the current machine learning techniques that use too many adjustable parameters (provided that correlations are obtained on testing and not training sets) [99]. Given the still relatively low number of experimental points, the purely statistical and machine learning techniques can be currently useful only for fine-tuning small second-order corrections to the existing rough but physically or biologically meaningful estimates, or for finding relatively small corrections for parameters already known to play a physically or biologically meaningful role in folding [83].…”
Section: Discussion | Citation type: mentioning
confidence: 99%
“…To study this phenomenon, which, in principle, can lead to drastically worse results obtained for the testing sets than those reported for the training sets, Corrales et al checked three machine learning methods by applying them to new data, data not used when building the models [99]. It turned out that for all three considered machine learning methods the obtained correlations were significantly worse than those declared in the original publications.…”
Section: Refinement Of Existing Estimates Of Protein Folding Times | Citation type: mentioning
confidence: 98%
“…2ABD and 1ST7 are respectively, the bovine and yeast structures of the extensively studied four helix bundle ACBP (acyl-coenzyme A-binding protein), whose folding pathway could challenge our infeasible SSU restriction (section 2.5). ACBP is also interesting because, depending on experimental conditions, it can be classified as a multi-state folder [15,33]. 1QYS is a de novo protein with < 100 residues but exhibits non-cooperative behavior, and has a very stable intermediate structure [38].…”
Section: Protein Datasets | Citation type: mentioning
confidence: 99%