Highlights
• Tumor organoid cultures from >1,000 patients reveal genomic/transcriptomic fidelity
• Establishment of chemically defined minimal media for each solid tumor type
• Pan-cancer neural network predicts drug response from label-free light microscopy
Background: With the introduction of DNA-damaging therapies into standard-of-care cancer treatment, there is a growing need for predictive diagnostics that assess homologous recombination deficiency (HRD) status across tumor types. Following the strong clinical evidence for the utility of DNA-sequencing-based HRD testing in ovarian cancer, and growing evidence in breast cancer, we present analytical validation of the Tempus HRD-DNA test. We further developed, validated, and explored the Tempus HRD-RNA model, which uses gene expression data from 16,750 RNA-seq samples to predict HRD status from formalin-fixed paraffin-embedded tumor samples across numerous cancer types.

Methods: Genomic and transcriptomic profiling was performed using next-generation sequencing from Tempus xT, Tempus xO, Tempus xE, Tempus RS, and Tempus RS.v2 assays on 48,843 samples. Samples were labeled based on their BRCA1, BRCA2, and selected homologous recombination repair pathway gene (CDK12, PALB2, RAD51B, RAD51C, RAD51D) mutational status to train and validate HRD-DNA, a genome-wide loss-of-heterozygosity biomarker, and HRD-RNA, a logistic regression model trained on gene expression.

Results: In a sample of 2,058 breast and 1,216 ovarian tumors, BRCA status was predicted by HRD-DNA with F1-scores of 0.98 and 0.96, respectively. In an independent set of 1,363 samples spanning solid tumor types, the HRD-RNA model was predictive of BRCA status in prostate, pancreatic, and non-small cell lung cancer, with F1-scores of 0.88, 0.69, and 0.62, respectively.

Conclusions: We predict HRD-positive patients across many cancer types and believe both HRD models may generalize to other mechanisms of HRD outside of BRCA loss. HRD-RNA complements DNA-based HRD detection methods, especially for indications with low prevalence of BRCA alterations.
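The F1-scores reported above combine precision and recall into a single number. As a minimal sketch of how such a score is computed from a confusion matrix, the following uses hypothetical counts that are not taken from the study; the function name `f1_score` is likewise an illustrative choice.

```python
def f1_score(true_pos: int, false_pos: int, false_neg: int) -> float:
    """Return the F1-score, the harmonic mean of precision and recall."""
    if true_pos == 0:
        # With no true positives, precision and recall are both zero
        # (or undefined), so the score is reported as 0.
        return 0.0
    precision = true_pos / (true_pos + false_pos)
    recall = true_pos / (true_pos + false_neg)
    return 2 * precision * recall / (precision + recall)


# Hypothetical counts for illustration only (not data from the abstract):
# 90 altered tumors called positive, 5 wild-type tumors called positive,
# 5 altered tumors missed.
example_f1 = f1_score(true_pos=90, false_pos=5, false_neg=5)
print(round(example_f1, 3))
```

Because F1 ignores true negatives, it remains informative when the positive class is rare, which is relevant for indications with low prevalence of BRCA alterations.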
INTRODUCTION: We performed a retrospective analysis of longitudinal real-world data (RWD) from breast cancer patients to replicate results from clinical studies and demonstrate the feasibility of generating real-world evidence (RWE). We also assessed the value of transcriptome profiling as a complementary tool for determining molecular subtypes.

PATIENTS AND METHODS: De-identified, longitudinal data were analyzed after abstraction from U.S. breast cancer patient records structured and stored in the Tempus database. Demographics, clinical characteristics, molecular subtype, treatment history, and survival outcomes were assessed according to strict qualitative criteria. RNA sequencing and clinical data were used to predict molecular subtypes and signaling pathway enrichment.

RESULTS: The clinical abstraction cohort (n=4,000) mirrored U.S. breast cancer demographics and clinical characteristics, indicating feasibility for RWE generation. Among HER2+ patients, 74.2% received anti-HER2 therapy, with ~70% starting within 3 months of a positive test result. Most non-treated patients were early stage. In this RWD set, 31.7% of patients with HER2+ IHC had discordant FISH results recorded. Among patients with multiple HER2 IHC results at diagnosis, 18.6% exhibited intra-test discordance. Through development of a whole-transcriptome model to predict IHC receptor status in the molecularly sequenced cohort (n=400), molecular subtypes were resolved for all patients (n=36) with equivocal HER2 statuses from abstracted test results. Receptor-related signaling pathways were differentially enriched between clinical molecular subtypes.

CONCLUSION: RWD in the Tempus database mirrors the overall U.S. breast cancer population. These results suggest real-time RWD analyses are feasible in a large, highly heterogeneous database. Furthermore, molecular data may help resolve deficiencies and discrepancies observed in breast cancer RWD.
Laboratories conducting high volumes of RNA sequencing must be extremely wary of technical batch effects if samples are to be compared across extended time periods, which is imperative for the most well-powered analyses of cancer transcriptomes. Changes in reagents, protocols, or technologies used in nucleic acid extraction, library preparation, and sequencing can alter transcriptomes in ways that invalidate or complicate comparisons of samples from different batches, necessitating continuous monitoring. This monitoring can be particularly difficult when analyzing samples from distinct tissue sites, as tumor type is the major biological determinant of transcriptome variance in cancer. Brain and liver cancer transcriptomes, for example, are expected to differ so drastically that their comparison is not informative for batch effect detection. Detection methods must also be robust to disparate batch effects, which can manifest as minor changes in expression among many genes or major changes in a subset of genes, making ad hoc detection infeasible. To overcome these challenges, we developed MaCoBED (matched cohort batch effect detection), a novel method that evaluates technical batch effects in a set of transcriptome samples (e.g., a flow cell) by pooling them with a set of validated reference samples matched by cancer type and tissue site. This pooled set of transcriptomes is then subjected to low-dimensional embedding using Uniform Manifold Approximation and Projection (UMAP), and each component is tested for deviation from the reference set using a Wilcoxon test. Matching new and legacy samples by cancer type and tissue site ensures that any differences in UMAP clustering are not driven by known biological contributions. We found that UMAP was preferable to Principal Components Analysis (PCA). UMAP can capture variability in just two dimensions, accentuating modest but consistent transcriptome differences among batches that would otherwise be spread across multiple minor principal components, making batch effects more obvious and readily detectable. This approach was able to detect a number of simulated batch effects with high specificity and sensitivity relative to randomly sampled validated legacy samples. Thus, we propose MaCoBED as a simple and rapid approach for batch effect monitoring of high-throughput RNA sequencing datasets that is versatile in detecting distinct kinds of batch effects, easily automatable, readily interpretable upon visualization, and extensible to small or large batch sizes.

Citation Format: Joshua Drews, Joshua Bell, Wesley Munson, Saksham Saini, Benjamin Leibowitz, Jackson Michuda, Calvin McCarter, Lee Langer, Catherine Igartua, Kevin White. Robust detection of sequencing batch effects in RNA through low dimensional embedding with subtype-matched reference samples [abstract]. In: Proceedings of the Annual Meeting of the American Association for Cancer Research 2020; 2020 Apr 27-28 and Jun 22-24. Philadelphia (PA): AACR; Cancer Res 2020;80(16 Suppl):Abstract nr 5466.
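The per-component testing step described in the abstract can be sketched as follows. This is an illustrative sketch, not the authors' implementation: it assumes the pooled samples have already been embedded (for example by a UMAP library) and takes the resulting coordinates as plain tuples. The function names, the normal approximation to the rank-sum test (without tie or continuity correction), the significance level, and the Bonferroni adjustment across components are all assumptions made for the example.

```python
import math
from typing import Dict, List, Sequence, Tuple


def average_ranks(values: Sequence[float]) -> List[float]:
    """Assign 1-based ranks, giving tied values the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1  # mean of ranks i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def rank_sum_p_value(new: Sequence[float], ref: Sequence[float]) -> float:
    """Two-sided rank-sum p-value via the normal approximation.

    Assumption: no tie or continuity correction is applied, which is
    adequate for a sketch on continuous embedding coordinates.
    """
    n_new, n_ref = len(new), len(ref)
    ranks = average_ranks(list(new) + list(ref))
    u_new = sum(ranks[:n_new]) - n_new * (n_new + 1) / 2
    mean_u = n_new * n_ref / 2
    sd_u = math.sqrt(n_new * n_ref * (n_new + n_ref + 1) / 12)
    if sd_u == 0:
        return 1.0
    z = (u_new - mean_u) / sd_u
    return math.erfc(abs(z) / math.sqrt(2))


def flag_batch(
    new_embedding: Sequence[Tuple[float, ...]],
    ref_embedding: Sequence[Tuple[float, ...]],
    alpha: float = 0.05,  # hypothetical threshold, not from the abstract
) -> Dict[str, object]:
    """Test each embedding component of a new batch against the reference."""
    n_components = len(ref_embedding[0])
    p_values = [
        rank_sum_p_value(
            [row[c] for row in new_embedding],
            [row[c] for row in ref_embedding],
        )
        for c in range(n_components)
    ]
    # Bonferroni adjustment across components (an assumption of this sketch).
    flagged = any(p < alpha / n_components for p in p_values)
    return {"p_values": p_values, "flagged": flagged}


# Toy coordinates for illustration: the new batch is shifted along the
# first component only, mimicking a simulated batch effect.
reference = [(i * 0.1, i * 0.2) for i in range(20)]
shifted_batch = [(x + 10.0, y) for x, y in reference]
print(flag_batch(shifted_batch, reference))
```

In practice the reference rows would be the validated legacy samples matched to the new batch by cancer type and tissue site, so that a flagged component points to a technical rather than a biological difference.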