2019
DOI: 10.1186/s12859-019-3182-x

VARUS: sampling complementary RNA reads from the sequence read archive

Abstract: Background: Vast amounts of next generation sequencing RNA data has been deposited in archives, accompanying very diverse original studies. The data is readily available also for other purposes such as genome annotation or transcriptome assembly. However, selecting a subset of available experiments, sequencing runs and reads for this purpose is a nontrivial task and complicated by the inhomogeneity of the data. Results: This article presents the software VARUS that selects, downloads and aligns reads from NCBI’s Se…

Citations: cited by 12 publications (11 citation statements)
References: 17 publications
“… The test sets were (All) all annotated multi-exon genes and (Reliable) all annotated complete multi-exon genes having all introns supported by mapped RNA-seq reads, the ones sampled by VARUS (36). …”
Section: Results (mentioning)
confidence: 99%
“…We used the OrthoDB database (34) as a source of protein data. RNA-seq data used in runs of BRAKER1 was sampled from the Sequence Read Archive (35) by VARUS (36). To determine to which degree both predicted and annotated genes covered the sets of universal single copy genes identified by the BUSCO protein families, we used the BUSCO database v4 (37).…”
Section: Methods (mentioning)
confidence: 99%
“…We observed that if annotated genes of S. lycopersicum genome were supported by RNA-Seq, they were significantly better predicted by GeneMark-EP+ (Table S4). To generate intron hints from RNA-Seq we used VARUS (29). We divided annotated tomato genes into two groups: a/ genes with all introns predicted by VARUS and b/ all other genes.…”
Section: Assessment of Accuracy of GeneMark-EP -EP+ (mentioning)
confidence: 99%
“…BRAKER1 has been run on the genomes of A. thaliana, C. elegans, and D. melanogaster with hints originating from RNA-Seq reads sampled by VARUS [34] from the NCBI Sequence Read Archive [33]. VARUS used HISAT2 [41] for mapping RNA-Seq reads to genomic sequences (Supplementary Materials, section 1.10).…”
Section: Selection of Protein Data Sets and Test Sets of Annotated Ge… (mentioning)
confidence: 99%