2020
DOI: 10.48550/arxiv.2002.05810
Preprint

RNA Secondary Structure Prediction By Learning Unrolled Algorithms

Abstract: In this paper, we propose an end-to-end deep learning model, called E2Efold, for RNA secondary structure prediction which can effectively take into account the inherent constraints in the problem. The key idea of E2Efold is to directly predict the RNA base-pairing matrix, and use an unrolled algorithm for constrained programming as the template for deep architectures to enforce constraints. With comprehensive experiments on benchmark datasets, we demonstrate the superior performance of E2Efold: it predicts sig…

Cited by 25 publications (27 citation statements)
References 28 publications
“…Therefore, we compared SPOT-RNA2 with both single-sequence and alignment-based secondary structure predictors. Single-sequence based predictors includes recent deep learning based predictor SPOT-RNA (Singh et al, 2019) (available at https://github.com/jaswindersingh2/SPOT-RNA), mxfold2 (Sato et al, 2021) version-0.1.0 (available at https://github.com/keio-bioinformatics/mxfold2/releases/), Ufold (Fu et al, 2021) (available at https://ufold.ics.uci.edu/), 2dRNA (Mao et al, 2020) (available at http://biophy.hust.edu.cn/new/2dRNA/), and E2Efold (Chen et al, 2020) (available at https://github.com/ml4bio/e2efold), heuristic approach based LinearPartition (Zhang et al, 2020a) (available at https://github.com/LinearFold/LinearPartition) and LinearFold (Huang et al, 2019) (available at https://github.com/LinearFold/LinearFold), machine learning based CONTRAfold (Do et al, 2006) version 2.02 (available at http://contra.stanford.edu/contrafold/download.html) and mxfold (Akiyama et al, 2018) version-0.0.2 (available at https://github.com/keio-bioinformatics/mxfold/releases/), integer programming based IPknot (Sato et al, 2011) version 0.0.4 (available at https://github.com/satoken/ipknot/releases), maximum expected accuracy (MEA) prediction from partition function based Probknot (Bellaousov and Mathews, 2010) (from RNAstructure package version 6.2, available at http://rna.…”
Section: Methods Comparison (mentioning)
confidence: 99%
“…For instance, a visual SUDOKU puzzle can be solved using a neural module to perceive the digits followed by a quadratic optimization module to maximize a logic satisfiability objective [1]. The RNA folding problem can be tackled by a neural energy model to capture pairwise relations between RNA bases and a constrained optimization module to minimize the energy, with additional pairing constraints, to obtain a folding [2]. In a broader context, MAML [28,29] also has a neural module for joint initialization and a reasoning module that performs optimization steps for task-specific adaptation.…”
Section: Summary Of Results (mentioning)
confidence: 99%
“…Very often these reasoning modules can be implemented as unrolled iterative algorithms, which can solve more sophisticated tasks with carefully designed and interpretable operations. For instance, SATNet [1] integrated a satisfiability solver into its deep model as a reasoning module; E2Efold [2] used a constrained optimization algorithm on top of a neural energy network to predict and reason about RNA structures, while [3] used optimal transport algorithm as a reasoning module for learning to sort. Other algorithms such as ADMM [4,5], Langevin dynamics [6], inductive logic programming [7], DP [8], k-means clustering [9], message passing [10,11], power iterations [12] are also used as differentiable reasoning modules in deep models for various learning tasks.…”
Section: Inverse Design (mentioning)
confidence: 99%
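
The two statements above describe E2Efold as a constrained optimization step, implemented as an unrolled iterative algorithm, placed on top of a neural scoring network. The idea can be sketched in plain Python as below. This is not the E2Efold implementation: the names `pairing_mask` and `unrolled_fold`, the step size, the penalty weight, and the particular constraints (canonical and wobble pairs only, a minimum distance of 4 between paired positions, a soft penalty on bases that pair more than once) are assumptions chosen for illustration, and the score matrix is a stand-in for a network's output.

```python
# Illustrative sketch only; NOT the E2Efold code. All names are hypothetical.

# Assumed constraint: only A-U, G-C and G-U pairs are allowed.
ALLOWED_PAIRS = {("A", "U"), ("U", "A"), ("G", "C"),
                 ("C", "G"), ("G", "U"), ("U", "G")}


def pairing_mask(seq, min_loop=4):
    """Boolean matrix: True where positions i and j are allowed to pair."""
    n = len(seq)
    return [[(seq[i], seq[j]) in ALLOWED_PAIRS and abs(i - j) >= min_loop
             for j in range(n)] for i in range(n)]


def unrolled_fold(scores, mask, steps=20, lr=0.1, penalty=1.0):
    """Run a fixed number of projected-gradient steps on a soft pairing matrix.

    Each step raises entries with high scores, pushes down entries in rows
    whose total pairing weight exceeds 1, clips to [0, 1], zeroes forbidden
    pairs, and symmetrizes. The loop length is fixed, so the whole procedure
    is a finite stack of identical "layers".
    """
    n = len(scores)
    a = [[0.0] * n for _ in range(n)]
    for _ in range(steps):
        # How far each base is over the "at most one partner" budget.
        excess = [max(sum(row) - 1.0, 0.0) for row in a]
        new = [[0.0] * n for _ in range(n)]
        for i in range(n):
            for j in range(n):
                if mask[i][j]:
                    grad = scores[i][j] - penalty * (excess[i] + excess[j])
                    new[i][j] = min(1.0, max(0.0, a[i][j] + lr * grad))
        # Pairing is symmetric: if i pairs with j, then j pairs with i.
        a = [[0.5 * (new[i][j] + new[j][i]) for j in range(n)]
             for i in range(n)]
    return a


seq = "GGGAAAUCCC"
mask = pairing_mask(seq)
scores = [[1.0] * len(seq) for _ in seq]  # placeholder for network output
soft_pairs = unrolled_fold(scores, mask)
```

In a learned version of this scheme, gradients would flow back through every step of the loop, so the scoring network is trained against the output of the final iteration rather than against the raw scores.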
“…However, only a tiny fraction (<0.001%) of the structure-known ncRNAs has been determined by experiments [43] due to the high cost of wet-lab experiments and RNA structural instability. To tackle this problem, more and more computational approaches [28,30-32] have been proposed for RNA structure prediction. We investigate RNA-FM's performance on several structure prediction tasks, including secondary structure prediction, 3D closeness prediction, and RNA map distance prediction.…”
Section: Methods (mentioning)
confidence: 99%
“…On the other hand, with more RNA data available, several deep learning approaches are recently developed in the community to improve the accuracy of RNA secondary structure prediction. For example, SPOT-RNA [28], E2Efold [30], MXfold2 [31], and UFold [32] are shown to be able to improve the prediction accuracy significantly on different datasets. Nevertheless, the generalization capability of such DL-based methods still remains a problem, as the model architecture is explicitly designed for corresponding tasks and cannot generalize well to unknown RNA types [30].…”
Section: Introduction (mentioning)
confidence: 99%