2020
DOI: 10.1093/nargab/lqaa090

Machine learning a model for RNA structure prediction

Abstract: RNA function crucially depends on its structure. Thermodynamic models currently used for secondary structure prediction rely on computing the partition function of folding ensembles, and can thus estimate minimum free-energy structures and ensemble populations. These models sometimes fail in identifying native structures unless complemented by auxiliary experimental data. Here, we build a set of models that combine thermodynamic parameters, chemical probing data (DMS and SHAPE) and co-evolutionary data (direct…

Cited by 26 publications (27 citation statements)
References 46 publications
“…Similarly, another paper [137] reported that the F score of ContextFold also lowered by 24% when testing on a set of structurally dissimilar RNAs to the training set. Although most DL-based methods take many precautions to alleviate overfitting by many techniques (such as using regularization [100], enlarging dataset [97], adding constraints [99], or combining Turner’s nearest neighbor free energy parameters), the concerns about overfitting remain.…”
Section: Discussion
Type: mentioning
confidence: 99%
“…In addition to the encoded RNA sequences being used as the input, other information can also be incorporated into the DL model. Calonaci and colleagues [100] trained an ensemble model based on a combination of SHAPE data, co-evolutionary data (DCA), and RNA sequence data. Their model consists of a convolutional neural network (CNN) subnetwork and an MLP subnetwork to predict penalties based on SHAPE and DCA data, respectively, with an RNAfold [17] module to generate structures using RNA sequences and penalties.…”
Section: ML-based Methods
Type: mentioning
confidence: 99%
“…Given the impact of structure on RNA functionality, the accurate computational prediction of the secondary and tertiary structure of RNA is an ongoing area of great interest in the computational biology community (Calonaci et al, 2020; Cruz et al, 2012; Wayment-Steele et al, 2020).…”
Section: Introduction
Type: mentioning
confidence: 99%
“…Data driven approaches have also started to become popular that use machine learning to evaluate structures rather than FE or kinetic models. These include ContraFold, DMfold, and structure prediction with neural networks (Do et al, 2006; Wang et al, 2019; Calonaci et al, 2020). Furthermore, in the attempt to achieve advantages of model or data driven approaches, methods have been proposed that attempt to aggregate multiple information sources to get more accurate secondary structure prediction.…”
Section: Introduction
Type: mentioning
confidence: 99%