2024
DOI: 10.1101/2024.01.30.578025
Preprint

RNA3DB: A structurally-dissimilar dataset split for training and benchmarking deep learning models for RNA structure prediction

Marcell Szikszai,
Marcin Magnus,
Siddhant Sanghi
et al.

Abstract: With advances in protein structure prediction thanks to deep learning models like AlphaFold, RNA structure prediction has recently received increased attention from deep learning researchers. RNAs introduce substantial challenges due to the sparser availability and lower structural diversity of the experimentally resolved RNA structures in comparison to protein structures. These challenges are often poorly addressed by the existing literature, many of which report inflated performance due to using training and…

citations
Cited by 2 publications
(2 citation statements)
references
References 50 publications
“…For instance, physics-based simulations utilize principles of molecular mechanics and dynamics to simulate folding pathways, while homology modeling (e.g., algorithms such as PSI-BLAST, HHblits, and HMMER) leverages evolutionary relationships between proteins to infer structures [13-20]. Of recent further interest, machine learning techniques, particularly deep learning, have emerged as powerful tools for predicting protein structures by learning patterns from large datasets [4, 21-30]. Recent advancements in deep learning, exemplified by AlphaFold, have revolutionized protein structure prediction.…”
Section: Introduction (mentioning)
confidence: 99%
“…However, based on assessment of the CASP15 RNA modeling challenge and the metrics used therein, RNA structure prediction using deep learning approaches has not reached human-tailored model performance, and human modeling of RNA structure is still not at the level of protein structure prediction [10-13]. A fundamental weakness in RNA modeling is the state of RNA sequence, structural, and phenotypic databases available for training deep learning models [12, 14]. Rfam, the closest analogue to Pfam, provides curated seed sequences, alignments and homology models for thousands of RNA families [15].…”
Section: Introduction (mentioning)
confidence: 99%