2022
DOI: 10.1038/s42003-022-03808-9

Precise identification of cancer cells from allelic imbalances in single cell transcriptomes

Abstract: A fundamental step of tumour single cell mRNA analysis is separating cancer and non-cancer cells. We show that the common approach to separation, using shifts in average expression, can lead to erroneous biological conclusions. By contrast, allelic imbalances representing copy number changes directly detect the cancer genotype and accurately separate cancer from non-cancer cells. Our findings provide a definitive approach to identifying cancer cells from single cell mRNA sequencing data.
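
To make the idea of an allelic imbalance concrete, the following is a minimal sketch in Python. It is not the authors' method or the alleleIntegrator package. It assumes that reads covering phased heterozygous SNPs in one genomic region have already been counted per allele for a cell or cluster, and it tests whether the two counts depart from the 1:1 ratio expected without a copy number change. The function names, the `alpha` threshold, and the example counts are all hypothetical.

```python
from math import comb


def binom_two_sided_p(k: int, n: int, p: float = 0.5) -> float:
    """Exact two-sided binomial p-value.

    Sums the probability of every outcome that is no more likely than
    the observed count k under Binomial(n, p).
    """
    if n == 0:
        return 1.0  # no reads, so no evidence of imbalance
    pmf = [comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(n + 1)]
    threshold = pmf[k] * (1 + 1e-9)  # small tolerance for float ties
    return min(1.0, sum(q for q in pmf if q <= threshold))


def allelic_imbalance(maternal: int, paternal: int, alpha: float = 0.05):
    """Return (allele fraction, p-value, imbalanced?) for one region.

    maternal, paternal: hypothetical read counts supporting each phased
    allele, summed over the heterozygous SNPs in the region.
    """
    n = maternal + paternal
    fraction = maternal / n if n else float("nan")
    p_value = binom_two_sided_p(maternal, n)
    return fraction, p_value, p_value < alpha


# Hypothetical counts: a balanced region and a strongly skewed one.
print(allelic_imbalance(10, 10))
print(allelic_imbalance(18, 2))
```

A skewed allele fraction with a small p-value is the kind of signal consistent with a copy number change in that region. Because single cell counts are typically sparse, a real analysis would likely pool reads across many SNPs or cells and correct for multiple testing, neither of which this sketch attempts.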

Cited by 14 publications (11 citation statements)
References 18 publications
“…To further analyze the effects of dataset imbalance in realistic scenarios, we considered the pancreatic ductal adenocarcinoma (PDAC) dataset of 8 batches comprising tumor samples across 8 different biopsies [20]. One major challenge in the analysis of PDAC data is accurate annotation of tumor cells, and being able to separate these from normal non-cancerous epithelial cells [38, 39]. As both acinar and ductal epithelial cells have been proposed as cell of origin candidates in PDAC across numerous studies [40, 41], reliably classifying tumor cells from these normal epithelial cell-types in scRNA-seq data remains a major computational challenge.…”
Section: Results (mentioning)
confidence: 99%
“…To further analyze the effects of dataset imbalance in realistic scenarios, we considered the pancreatic ductal adenocarcinoma (PDAC) dataset of 8 batches comprising tumor samples across 8 different biopsies [20]. One major challenge in the analysis of PDAC data is accurate annotation of tumor cells, and being able to separate these from normal non-cancerous epithelial cells [38,39].…”
Section: Perturbation Analysis In PDAC Samples Reveals Tumor Compartm... (mentioning)
confidence: 99%
“…Next, we consider single cell transcriptomes from 5 cancers, where the cancer cell transcriptomes have been previously identified [9][10][11][12]. As cancers are derived from a single cell, cancer cells must have the same X-inactivation status.…”
Section: Evaluation Of the Methods (mentioning)
confidence: 99%
“…To validate the inactiveXX method, we used three samples with single cell transcriptomics, bulk DNA, and parental DNA to define a gold standard. To do this, we identified heterozygous SNPs in the bulk DNA, then phased them using the DNA from the parents, using the alleleIntegrator package [12]. This defined phased heterozygous SNPs on the X chromosome.…”
Section: Validation Of inactiveXX Methods (mentioning)
confidence: 99%
“…Trinh et al. (2022) and Gao et al. (2022) have demonstrated the feasibility of distinguishing normal and cancer cells using BAF signal. Here, we propose a convenient special case of our model that effectively detects the cluster of normal cells.…”
mentioning
confidence: 99%