2020
DOI: 10.1038/s41592-020-0905-x

Cumulus provides cloud-based data analysis for large-scale single-cell and single-nucleus RNA-seq

Abstract: Massively parallel single-cell and single-nucleus RNA-seq (sc/snRNA-seq) have opened the way to systematic tissue atlases in health and disease, but as the scale of data generation is growing, so does the need for computational pipelines for scaled analysis. Here, we developed Cumulus, a cloud-based framework for analyzing large scale sc/snRNA-seq datasets. Cumulus combines the power of cloud computing with improvements in algorithm implementations to achieve high scalability, low cost, user-friendliness, and …

Cited by 148 publications (121 citation statements)
References 58 publications
“…The total run times per dataset of the Salmon-Alevin-fry programs were consistently much higher than that of kallisto-bustools, and the memory requirements were much higher as well (Figure 3). Salmon-Alevin-fry was on average 3 times slower than kallisto-bustools, a result consistent with the benchmarks in (Li et al 2020). Importantly, while kallisto-bustools ran in under 4Gb of RAM for all datasets except the human-mouse mixed samples, Salmon-Alevin-fry required up to 18.8 Gb of RAM for some of those datasets.…”
Section: Results
Citation type: supporting
confidence: 82%
“…Cell Ranger mkfastq (10x Genomics) was used to demultiplex the raw sequencing reads, and Cell Ranger count on Terra using the cellranger_workflow in Cumulus 33 was used to align sequencing reads and generate a counts matrix. Reads were aligned to a custom-built Human GRCh38 and SARS-CoV-2 RNA reference.…”
Section: Methods
Citation type: mentioning
confidence: 99%
“…Resolution 1.3 was sufficient for coarse annotations of clusters. Clustering was performed using Pegasus 33 with default parameters (except the resolution as mentioned above).…”
Section: Methods
Citation type: mentioning
confidence: 99%
“…We generated tSNE plots per compartment from NMF loading matrices, with a perplexity value of 30 and the Barnes-Hut approximation method 81. A global tSNE of all cells was generated using Pegasus with the default parameters and using SVD for the preliminary embedding (v0.17.0, 82).…”
Section: Experimental Methods
Citation type: mentioning
confidence: 99%