Why pediatric cancers differ from adult-onset tumors of the same type is not completely clear and is not fully explained by their genomes. In this study, we used an optimized multilevel RNA clustering approach to derive molecular definitions for most childhood cancers. Applying this method to 13,313 transcriptomes, we constructed a pediatric cancer atlas to explore age-associated changes. Tumor entities were sometimes unexpectedly grouped together because of shared lineages, drivers or stemness profiles. Some established entities were divided into subgroups that predicted outcome better than current diagnostic approaches. These definitions account for inter-tumoral and intra-tumoral heterogeneity and could enable reproducible, quantifiable diagnostics. As a whole, childhood tumors had more transcriptional diversity than adult tumors, maintaining greater expression flexibility. To apply these insights, we designed an ensemble convolutional neural network classifier. We show that this tool was able to match or clarify the diagnosis for 85% of childhood tumors in a prospective cohort. If further validated, this framework could be extended to derive molecular definitions for all cancer types.
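The abstract does not say how the members of the ensemble classifier are combined. As a minimal sketch, assuming soft voting (averaging each member's class probabilities and taking the top class), the idea can be illustrated as follows. The function name `soft_vote`, the cluster labels, and the probability values are all hypothetical stand-ins, not the authors' implementation or data.

```python
from statistics import fmean


def soft_vote(member_probs, labels):
    """Average per-class probabilities across ensemble members.

    member_probs: one probability vector per ensemble member (assumed input).
    labels: class names, in the same order as each probability vector.
    Returns the winning label and its mean probability.
    """
    n_classes = len(labels)
    for probs in member_probs:
        if len(probs) != n_classes:
            raise ValueError("each member must give one probability per label")
    # Mean probability of each class over all ensemble members.
    mean_probs = [fmean(p[i] for p in member_probs) for i in range(n_classes)]
    # Index of the class with the highest mean probability.
    best = max(range(n_classes), key=mean_probs.__getitem__)
    return labels[best], mean_probs[best]


# Hypothetical transcriptional clusters and made-up member outputs.
labels = ["cluster_A", "cluster_B", "cluster_C"]
member_probs = [
    [0.70, 0.20, 0.10],
    [0.40, 0.50, 0.10],
    [0.60, 0.30, 0.10],
]
label, confidence = soft_vote(member_probs, labels)
print(label, round(confidence, 3))
```

Averaging probabilities rather than counting hard votes lets a confident member outweigh a marginal one, and the mean probability can double as a rough confidence score for flagging uncertain calls.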
Objectives: Sarcomas are mesodermal cancers of bone and soft tissue of which there are >60 malignant varieties, many of which can be difficult to diagnose or subtype using traditional histopathology. A universal molecular definition of sarcoma types would therefore be an invaluable tool to the diagnostic pathologist. RNA has the potential to offer a complementary perspective to cytogenetic- and methylation-based diagnostics, as it represents the active state of the disease at sampling and better reveals its phenotype. Recognizing the potential for RNA-based classification, we set out to create a first-generation transcriptional atlas of sarcoma.

Methods: To develop transcriptional definitions of cancers with the potential to further subclassify tumor types, we designed a self-optimizing and scale-adaptive unsupervised method (RACCOON), which groups samples into hierarchically organized clusters. We used this approach on the UCSC Treehouse Childhood Cancer Compendium, a set of 2,178 pediatric and 9,400 adult tumors, 1,130 of which are sarcomas, as well as 1,735 non-neoplastic samples. We then trained an ensemble of convolutional neural networks to assign tumors to these transcriptional clusters. We have now added 624 more sarcoma samples from Toronto centers and international collaborators to better represent the breadth of sarcoma. We are actively sequencing 500 additional samples in partnership with the Gabriella Miller Kids First Research Program to yield an expanded cohort of >2,200 uniformly processed and analyzed sarcomas.

Results: Sarcomas organize into two clusters at the highest hierarchical level: one characterized by entities which occur primarily in adults and resemble mature tissue, the other by primarily pediatric entities which exhibit high stemness and resemble embryonic tissue. Several included entities are not bona fide sarcomas but originate from the mesoderm (e.g., Wilms tumor), signifying a common transcriptional identity for mesodermal neoplasms. Additionally, we demonstrate the first transcriptional subtypes of central osteosarcoma, reflecting its major histotypes and representing divergent clinical courses. We also determine Ewing Sarcoma (ES) to be a distinct entity which clusters separately from all other cancers, raising questions of its origin and affinity to sarcoma. When classifying ongoing patients to the atlas, we correctly classified >85% of tumors and corrected the diagnosis of 7%. We find 14% of ES in our dataset were likely misdiagnosed CIC- or BCOR-driven sarcomas. Critically, assigned subtypes are consistent between primary and relapse pairs.

Conclusion: RNA-seq is a promising tool for both subtype discovery and classifying sarcoma in ongoing patients. We have already included this tool in tumor boards to help inform patient care. Our method reveals the overarching organization of sarcoma for the first time and specifies its underlying biology. This atlas is ever-growing and is open to community contributions.

Citation Format: Joshua O. Nash, Federico Comitani, Rose Chami, Sarah Cohen-Gogo, Astra Chang-Schwertschkow, Yael Babichev, Jodi Lees, Noa Alon, Nalan Gokgoz, Stephen Man Yu, Kyoko Yuki, Miranda Lorenti, Zhanqin Liu, Alaina McGoey, Famida Spatare, Bernarld Castro, Kim Tsoi, Hagit Peretz Soroka, Jack Brzezinski, Anita Villani, Albiruni Razak, Abha Gupta, Elizabeth Demicco, Gino Somers, Brendan C. Dickson, Jay S. Wunder, Irene L. Andrulis, David Malkin, Rebecca A. Gladdy, Adam Shlien. The development of a multiscale transcriptional atlas of sarcoma [abstract]. In: Proceedings of the AACR Special Conference: Sarcomas; 2022 May 9-12; Montreal, QC, Canada. Philadelphia (PA): AACR; Clin Cancer Res 2022;28(18_Suppl):Abstract nr B027.
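The cohort sizes quoted in the Methods can be tallied with a short arithmetic check. This sketch assumes the pediatric, adult, and non-neoplastic groups are disjoint, that the 1,130 sarcomas are a subset of the tumor samples, and that the 624 added and 500 in-progress samples do not overlap with them; the variable names are illustrative only.

```python
# Sample counts as stated in the Methods section.
pediatric_tumors = 2_178
adult_tumors = 9_400
non_neoplastic = 1_735

# Total compendium size, assuming the three groups do not overlap.
compendium_total = pediatric_tumors + adult_tumors + non_neoplastic

# Sarcoma counts: initial compendium subset, newly added, and being sequenced.
sarcomas_initial = 1_130
sarcomas_added = 624
sarcomas_in_sequencing = 500

# Projected sarcoma cohort once sequencing completes (assumes no duplicates).
sarcoma_total = sarcomas_initial + sarcomas_added + sarcomas_in_sequencing

print(compendium_total)  # 13313
print(sarcoma_total)     # 2254
```

Under these assumptions the compendium total equals the 13,313 transcriptomes quoted in the first abstract, and the projected sarcoma cohort of 2,254 is consistent with the stated ">2,200".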