2016
DOI: 10.1002/wrna.1334

New insights from cluster analysis methods for RNA secondary structure prediction

Abstract: A widening gap exists between the best practices for RNA secondary structure prediction developed by computational researchers and the methods used in practice by experimentalists. Minimum free energy (MFE) predictions, although broadly used, are outperformed by methods which sample from the Boltzmann distribution and data mine the results. In particular, moving beyond the single structure prediction paradigm yields substantial gains in accuracy. Furthermore, the largest improvements in accuracy and precision …

Cited by 15 publications (16 citation statements)
References 71 publications (121 reference statements)
“…We statistically sampled 1000 secondary structures for each allele sequence, removed duplicates, and combined samples into one set of >100 structures. To account for inter-sample variation at the base-pair level, which is characteristic of statistical samples [24, 26], we considered 10 independent samples in all analyses. Although DMS-MaPseq can report multiple modifications per read by way of mutational profiling, it is also possible to generate an equivalent truncation data set in silico, where each read reports up to one modification (see Methods for details and Supplementary Fig.…”
Section: Results (mentioning)
confidence: 99%
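The pooling step described in the statement above (sample, remove duplicates, combine) can be illustrated with a minimal sketch. It assumes structures are given as dot-bracket strings without pseudoknots; the toy samples and the helper names `pool_unique`, `base_pairs`, and `pair_frequencies` are hypothetical and are not taken from the citing paper. Comparing per-sample base-pair frequencies is one simple way to see inter-sample variation at the base-pair level.

```python
from collections import Counter

def pool_unique(samples):
    """Merge several samples of dot-bracket structures into one list,
    dropping duplicates while keeping first-seen order."""
    seen = {}
    for sample in samples:
        for structure in sample:
            seen.setdefault(structure, None)
    return list(seen)

def base_pairs(structure):
    """Return the set of (i, j) base pairs in a pseudoknot-free
    dot-bracket string, using a stack to match parentheses."""
    stack, pairs = [], set()
    for i, ch in enumerate(structure):
        if ch == "(":
            stack.append(i)
        elif ch == ")":
            pairs.add((stack.pop(), i))
    return pairs

def pair_frequencies(sample):
    """Fraction of structures in one sample that contain each base pair."""
    counts = Counter()
    for structure in sample:
        counts.update(base_pairs(structure))
    return {bp: c / len(sample) for bp, c in counts.items()}

# Toy stand-ins for two independent statistical samples (hypothetical data).
sample_a = ["((..))", "((..))", "(....)", "......"]
sample_b = ["((..))", ".(..).", "......"]

pooled = pool_unique([sample_a, sample_b])
freq_a = pair_frequencies(sample_a)
freq_b = pair_frequencies(sample_b)
```

Here the pair (1, 4) occurs in half of `sample_a` but in two thirds of `sample_b`, which is the kind of sample-to-sample difference that motivates analysing several independent samples.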
“…Otherwise, a two-step approach may be more appropriate, where LASSO is applied to limit the number of selected structures, followed by NNLS for accurate abundance estimation [54]. Also, one may cluster either the input structures or the structures selected by NNLS using methods specialized to RNA structure [4, 24, 26]. However, clustering might eliminate features that may be informative to downstream analysis.…”
Section: Results (mentioning)
confidence: 99%
“…Overall, these results are not surprising as an ensemble-based approach considers many competing alternative structures while an MFE approach encapsulates structural dynamics into a single structure [69]. Nevertheless, our results also indicate that PATTERNA performs surprisingly well, given that no sequence information is used, and can even outperform thermodynamics modeling on occasions.…”
Section: Results (mentioning)
confidence: 99%
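The contrast between a single MFE structure and the ensemble can be made concrete with a small sketch of Boltzmann weighting, where each structure s has probability exp(-E(s)/RT)/Z. The free energies below are hypothetical toy values, not output from any folding program, and the four-structure "ensemble" is an assumption made purely for illustration.

```python
import math

# Gas constant in kcal/(mol*K) times 37 degrees C in kelvin.
RT = 0.0019872 * 310.15

def boltzmann_probabilities(energies):
    """Map each structure to exp(-E/RT) / Z, where Z sums over all
    structures supplied (a toy ensemble, not a full partition function)."""
    weights = {s: math.exp(-e / RT) for s, e in energies.items()}
    z = sum(weights.values())
    return {s: w / z for s, w in weights.items()}

# Hypothetical free energies in kcal/mol for four competing structures.
toy_energies = {
    "((..))": -2.0,
    "(....)": -1.8,
    ".(..).": -1.7,
    "......": 0.0,
}

probs = boltzmann_probabilities(toy_energies)
mfe_structure = min(toy_energies, key=toy_energies.get)
```

With these toy values the MFE structure is the single most probable one, yet it holds less than half of the total probability, so reporting it alone hides the near-optimal alternatives that an ensemble-based analysis would retain.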
“…In addition, long RNAs (>1000 nt) generally require deeper sampling of the ensemble, compared to short RNAs, to ensure that all biologically relevant structures are represented in the sampled pool and that sample frequencies reliably approximate their theoretical counterparts. Therefore, this approach is typically reserved for small-scale studies of small RNAs [69], or alternatively, for small targeted regions within transcripts (e.g., 100 nt long), as these are easily folded computationally. MFE prediction suffers a similar drawback given its complexity is also (Additional file 2: Figure S3) and, while faster compared to ensemble sampling, it would still require about a month for a typical transcriptome-wide data set (Additional file 2: Table S7).…”
Section: Results (mentioning)
confidence: 99%