Recent advances in single cell analysis technologies now provide insights into genomic DNA content, RNA expression and protein surface markers. In bulk assays, the effect of genetic variation on gene expression is masked by the heterogeneity inherent in tumor cells. Additionally, for cancer immunotherapy studies that rely on gene editing, single cell resolution is necessary to minimize possible off-target effects. However, simultaneous interrogation of multiple intracellular analytes, such as genomic DNA and RNA, has proved difficult to implement in a high throughput single cell workflow. We report the development of a complete solution that enables both targeted genomic DNA and RNA sequencing from individual cells. This workflow relies on the Tapestri microfluidic droplet platform, where up to 20,000 cells can be sequenced in each run. By leveraging proprietary cell barcoding, novel primer design strategies and enzymatic manipulation of cellular contents, multiplex targeted DNA and RNA sequencing panels provide independently barcoded sequence information from both overlapping mRNA and the corresponding genomic DNA regions. In addition, amplification primers can be designed to target separate gDNA regions and non-overlapping RNA transcripts. Sequencing is followed by an integrated analysis solution that assigns the reads from both the gDNA and RNA to each cell. Feasibility of the targeted nucleic acid workflow has been shown with inputs from mixed cancer cell lines. A targeted sequencing panel with overlapping mRNA and gDNA regions was designed covering oncogenes, tumor suppressor genes, and known fusions. Expected SNVs and indels were detected and gene expression was measured for thousands of cells per run with high cell recovery. This complete solution for single cell multiomics on the Tapestri platform has the power to quantitatively and unambiguously link genotypic and phenotypic data, giving insight into cancer progression.
Citation Format: Dalia Dhingra, Kaustubh Gokhale, Nianzhen Li, Pedro Mendez, Shu Wang, Manimozhi Manivannan, Adam Sciambi, Keith Jones, Charlie Silver, Dennis Eastburn, David Ruff. A complete solution for high throughput single cell targeted multiomic DNA and RNA sequencing for cancer research [abstract]. In: Proceedings of the American Association for Cancer Research Annual Meeting 2019; 2019 Mar 29-Apr 3; Atlanta, GA. Philadelphia (PA): AACR; Cancer Res 2019;79(13 Suppl):Abstract nr 3540.
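The read-assignment step mentioned in the abstract above (reads from both gDNA and RNA assigned to each cell) can be pictured with a minimal sketch. It is not the Tapestri analysis software: the read tuple layout, the function name `assign_reads_to_cells`, the barcode strings and the gene names are all hypothetical and chosen only for illustration.

```python
from collections import defaultdict


def assign_reads_to_cells(reads):
    """Group (barcode, analyte, target) reads into per-cell gDNA and RNA counts.

    The tuple layout is an assumption made for this sketch, not a real file format.
    """
    cells = defaultdict(lambda: {"gDNA": defaultdict(int), "RNA": defaultdict(int)})
    for barcode, analyte, target in reads:
        if analyte not in ("gDNA", "RNA"):
            raise ValueError(f"unknown analyte: {analyte}")
        # Reads sharing a cell barcode are counted for the same cell,
        # whichever analyte they came from.
        cells[barcode][analyte][target] += 1
    return cells


# Made-up reads for two hypothetical cells.
reads = [
    ("CELL_A", "gDNA", "TP53"),
    ("CELL_A", "RNA", "TP53"),
    ("CELL_A", "RNA", "TP53"),
    ("CELL_B", "gDNA", "KRAS"),
]
cells = assign_reads_to_cells(reads)
```

Keying both analytes on the same barcode is what lets genotype (gDNA reads) and expression (RNA reads) be compared within a single cell.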
Background: To realize the promise of precision medicine for cancer, it is essential to assess the genetic variation present in rare cells and to understand the role these cells play in tumor progression. High throughput single cell targeted DNA sequencing enables detection of rare mutations in cells and identification of subclones defined by co-occurrence of mutations. A major challenge with multiplex sequencing at the single cell level is non-uniform amplification of the targeted regions during PCR. This results in inadequate coverage of mutations of interest in the panel and hence makes genotyping challenging. To address this challenge, we developed a machine learning engine that optimizes amplicon design for uniform amplification by reliably predicting amplicon performance. Methods: Multiple panels of various sizes were designed with amplicons spanning a wide range of design properties, such as amplicon GC content, length, predicted secondary structure and primer specificity. These panels were synthesized and processed through the Tapestri single cell DNA platform. The tested amplicons were classified as low-performers, OK-performers or high-flyers based on their normalized reads-per-cell value. The design properties of the amplicons and the distribution of those properties across the panel served as features. We used a random forest classifier to calculate feature importance, then analyzed the range of the top features for each class and the significance of their variance between classes. These ranges were then used as parameters in the assay design pipeline. Next, we trained machine learning models on the performance data to develop a performance prediction engine. Results: To test the performance of the design pipeline with the new parameters, we designed small (31 amplicons), medium (128 amplicons) and large (287 amplicons) panels. Multiple runs were conducted for each panel with different cell types. We achieved high panel performance of 97%, 92% and 88% across the three panels. The new parameters resulted in a ~10-20% improvement in panel uniformity. We are working on further optimizing the performance prediction engine by using different ML classification models with K-fold cross validation, training on a larger group of amplicons and optimizing features using combinations of properties.
Citation Format: Shu Wang, Saurabh Gulati, Dong Kim, Sombeet Sahu, Saurabh Parikh, Nianzhen Li, Manimozhi Manivannan, Nigel Beard. Amplicon design algorithm for single cell targeted DNA sequencing using machine learning [abstract]. In: Proceedings of the Annual Meeting of the American Association for Cancer Research 2020; 2020 Apr 27-28 and Jun 22-24. Philadelphia (PA): AACR; Cancer Res 2020;80(16 Suppl):Abstract nr 2109.
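The labeling step described in the Methods above (three performance classes derived from normalized reads-per-cell) might look roughly like the sketch below. The abstract does not give the class cutoffs or define panel uniformity, so `LOW_CUTOFF`, `HIGH_CUTOFF`, the normalization by panel mean and the `uniformity` metric are assumptions made here for illustration, and the amplicon names and read values are made up.

```python
from statistics import mean

# Hypothetical cutoffs on normalized reads-per-cell; the abstract does not state them.
LOW_CUTOFF = 0.2
HIGH_CUTOFF = 2.0


def normalize(reads_per_cell):
    """Divide each amplicon's mean reads-per-cell by the panel-wide mean (assumed scheme)."""
    panel_mean = mean(reads_per_cell.values())
    return {amp: value / panel_mean for amp, value in reads_per_cell.items()}


def classify(normalized):
    """Label each amplicon as low-performer, OK-performer or high-flyer."""
    labels = {}
    for amp, value in normalized.items():
        if value < LOW_CUTOFF:
            labels[amp] = "low-performer"
        elif value > HIGH_CUTOFF:
            labels[amp] = "high-flyer"
        else:
            labels[amp] = "OK-performer"
    return labels


def uniformity(normalized):
    """Fraction of amplicons at or above LOW_CUTOFF times the panel mean (assumed metric)."""
    passing = sum(1 for value in normalized.values() if value >= LOW_CUTOFF)
    return passing / len(normalized)


# Made-up reads-per-cell values for five hypothetical amplicons.
example = {"AMP1": 50.0, "AMP2": 45.0, "AMP3": 5.0, "AMP4": 150.0, "AMP5": 50.0}
normalized = normalize(example)
labels = classify(normalized)
panel_uniformity = uniformity(normalized)
```

Labels of this kind, paired with each amplicon's design properties, are the sort of training table a classifier such as the random forest named in the abstract would consume. The classifier itself is left out of this sketch.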