The availability of sizeable RNA structure databases and powerful deep learning (DL) frameworks has spurred the recent development of DL models for RNA secondary structure prediction. Taking RNA sequences as the only input, the class of de novo DL models has demonstrated far superior performance to traditional algorithms. However, key questions remain over the statistical underpinnings of such DL models, which make no use of co-evolutionary information or the physical laws of RNA folding. Here we present a quantitative study of the capacity and generalizability of a series of de novo DL models, built on a minimal two-module architecture with no post-processing, under varied sequence distributions of seen and unseen datasets. Excellent performance is observed from our models with as few as 16K parameters, affirming their remarkable learning capacity. Our DL models generalize well over non-identical unseen sequences, but generalizability degrades rapidly as the sequence distributions of the seen and unseen datasets grow dissimilar. Examinations of family-specific behaviors reveal not only disparate model performances across RNA families but also substantial generalization gaps within the same family. We further quantify, via pairwise sequence alignment, how model generalization deteriorates with decreasing sequence similarity, providing quantitative insight into the limits of statistical learning. Model generalizability thus poses a major hurdle for the practical use of current single-sequence-based DL models, and we discuss avenues for future advances of such de novo DL models for RNA secondary structure prediction.
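The abstract measures generalization against sequence similarity computed by pairwise sequence alignment. As a minimal, self-contained sketch (not the authors' pipeline), the Python snippet below computes the fraction of identical positions in an optimal global (Needleman-Wunsch) alignment of two RNA sequences; the match/mismatch/gap scores and the helper name `global_align_identity` are illustrative assumptions, not parameters taken from the paper.

```python
# Illustrative sketch: percent identity from a global (Needleman-Wunsch)
# alignment. Scoring parameters are assumptions, not the paper's settings.

def global_align_identity(a: str, b: str,
                          match: int = 1, mismatch: int = -1,
                          gap: int = -2) -> float:
    """Return the fraction of identical positions (0.0-1.0) in an
    optimal global alignment of sequences a and b."""
    n, m = len(a), len(b)
    # DP tables: alignment scores and traceback pointers.
    score = [[0] * (m + 1) for _ in range(n + 1)]
    back = [[None] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        score[i][0] = i * gap
        back[i][0] = "up"
    for j in range(1, m + 1):
        score[0][j] = j * gap
        back[0][j] = "left"
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            diag = score[i - 1][j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            up = score[i - 1][j] + gap
            left = score[i][j - 1] + gap
            best = max(diag, up, left)
            score[i][j] = best
            back[i][j] = "diag" if best == diag else ("up" if best == up else "left")
    # Trace back, counting identical aligned positions and alignment length.
    i, j, identical, length = n, m, 0, 0
    while i > 0 or j > 0:
        move = back[i][j]
        if move == "diag":
            identical += a[i - 1] == b[j - 1]
            i, j = i - 1, j - 1
        elif move == "up":
            i -= 1
        else:
            j -= 1
        length += 1
    return identical / length

# Example: two short RNA fragments differing by one deletion
# align with 9 identities over 10 columns, i.e. identity ~0.90.
print(f"{global_align_identity('GGGAAACUCC', 'GGGAAAUCC'):.2f}")
```

In a study like this one, such an identity score would be computed between each unseen test sequence and its nearest neighbor in the training (seen) set, and model accuracy would then be binned by that similarity to expose the generalization trend the abstract describes.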