2022
DOI: 10.1101/2022.04.11.487796
Preprint

Multimodal single cell data integration challenge: results and lessons learned

Abstract: Biology has become a data-intensive science. Recent technological advances in single-cell genomics have enabled the measurement of multiple facets of cellular state, producing datasets with millions of single-cell observations. While these data hold great promise for understanding molecular mechanisms in health and disease, analysis challenges arising from sparsity, technical and biological variability, and high dimensionality of the data hinder the derivation of such mechanistic insights. To promote the innov…

Cited by 49 publications (90 citation statements)
References 22 publications
“…For the Multiome dataset, MultiVI's graph connectivity score is considerably lower for small sample sizes, while all models improve performance with an increasing number of cells. With the best performing models reaching a score of almost 1 for higher cell numbers in the case of the Multiome dataset, this is in line with the scores achieved by the models of the NeurIPS competition [30]. For the CITE-seq dataset, the performance of TotalVI increased with increasing cell numbers, achieving the highest graph connectivity score for 5,000 and 10,000 cells.…”
Section: Removing Technical Effects (supporting)
confidence: 79%
“…We compare our results with the metric values achieved by the models of the NeurIPS 2021 competition for the integration of the Multiome dataset (data points were extracted via WebPlotDigitizer-4.5 [44] from Supplementary Figure 6 of [30]). However, as we merely used a subset of at most 10,000 cells of the original benchmark dataset, we expect our investigated algorithms to score higher for most metrics if they were to be subjected to the complete benchmark dataset.…”
Section: Results (mentioning)
confidence: 99%