2022
DOI: 10.1093/bioinformatics/btac415

Deep learning models for RNA secondary structure prediction (probably) do not generalize across families

Abstract: Motivation: The secondary structure of RNA is of importance to its function. Over the last few years, several papers attempted to use machine learning to improve de novo RNA secondary structure prediction. Many of these papers report impressive results for intra-family predictions, but seldom address the much more difficult (and practical) inter-family problem. Results: We demonstrate that it is nearly trivial with convolutiona…

Cited by 45 publications (60 citation statements)
References 66 publications
“…SPOT-RNA [36], MXfold2 [37], UFold [38]). However, there is a risk of overtraining for some of these deep learning techniques, which would make some methods to perform poorly for unseen RNA families [39]. Thus, caution must be exercised when using these deep learning techniques.…”
Section: Discussion (mentioning)
confidence: 99%
“…Recently, more and more ML-based methods of RNA secondary structure prediction have been proposed. This is because the number of RNAs with known secondary structures is much larger than that of known 3D structures.…”
Section: Recent Advances in RNA 3D Structure Prediction (mentioning)
confidence: 99%
“…It even has had a profound impact on the field of structural biology. Recently, different ML models have also been applied to solve 73 This is because the number of RNAs with known secondary structures is much larger than that of known 3D structures. For example, the RNA secondary structure database bpRNA-1m has 102,348 RNA sequences.…”
Section: Recent Advances in RNA 3D Structure Prediction (mentioning)
confidence: 99%
“…A number of highly successful de novo DL models have been reported, such as 2dRNA [21], ATTfold [22], DMfold [23], E2Efold [24], MXfold2 [25], SPOT-RNA [26], and Ufold [27], among others [28][29][30][31]. These DL models markedly outperform traditional algorithms, with even close-to-perfect predictions in some cases, though questions on the training vs. test similarity have been raised [32,33] and discussed below. It is worth noting that DL models have also been developed for base-level prediction tasks such as the pairing probability of each base [34].…”
Section: Introduction (mentioning)
confidence: 99%