2019
DOI: 10.1186/s12864-019-5965-x

Impact of sequencing depth and technology on de novo RNA-Seq assembly

Abstract: Background: RNA-Seq data is inherently nonuniform for different transcripts because of differences in gene expression. This makes it challenging to decide how much data should be generated from each sample. How much should one spend to recover the less expressed transcripts? The sequencing technology used is another consideration, as there are inevitably always biases against certain sequences. To investigate these effects, we first looked at high-depth libraries from a set of well-annotated organi…

Cited by 56 publications (38 citation statements)
References 34 publications
“…De novo assembling for new "raw" transcriptomes. It is well established that more input reads do not produce better assemblies 31,32 . According to this observation, 30 Illumina and 1 GS-FLX libraries were adopted as the most informative ones from the original dataset of 111 libraries 7 following criteria detailed in "Methods" section.…”
Section: Results (mentioning)
confidence: 99%
“…These new pipelines have to deal with errors due to the high number of reads, weakly expressed or 'uncommon' transcripts, circular RNAs, gene fusions, and isoform discrimination. It was well established that assembly quality improves with longer paired-end reads 28 but not increasing the input sequencing dataset 31 due to, for example, the increase in low abundance transcripts and a plethora of unannotated transcripts, most of them containing intronic regions 32 . Hence, small sequencing datasets are desirable to decrease computational requirements and produce robust transcriptomes, since manual curation is excluded, even though it has been proved to render the best results 37 .…”
Section: Discussion (mentioning)
confidence: 99%
“…This is a similar distribution of BUSCO scores to those in the recently published 1KP project (One Thousand Plant Transcriptomes Initiative, 2019) and other studies (Blande et al, 2017; Evkaikina et al, 2017; Pokorn et al, 2017; Weisberg et al, 2017). Like many transcriptome assemblies (Johnson et al, 2012; Carpenter et al, 2019; Patterson et al, 2019), these assemblies also contain a large number of small scaffolds (<300 bp). Small scaffolds are likely artifacts of library amplification and sequencing, considering that most did not translate to a known plant protein sequence.…”
Section: Discussion (mentioning)
confidence: 99%
“…For example, the MGISEQ-2000 currently generates 1.44 TB of sequence data in a single run with a running cost of 10 USD/GB. Several recent studies have compared the performance of BGI sequencers with Illumina’s sequencers and showed that the BGI sequencers produced high-quality sequence data at lower or similar prices in studies of whole-exome [2,3], whole-genome [4][1], transcriptome [5,6], single-cell transcriptome [2,78], metagenome [9], and small RNA sequencing [10].…”
Section: Introduction (mentioning)
confidence: 99%