Primary liver cancers show a consistent increase in global disease burden and mortality and therefore need more effective diagnostics and treatments. Yet integrative research efforts to identify the cell-of-origin for these cancers using human specimen data have been poorly established. To this end, we analyzed previously published whole-genome sequencing data for 384 tumor and progenitor tissues, along with 423 publicly available normal tissue epigenomic features and single cell RNA-seq data from human livers, to assess correlation patterns, and extended this information to conduct in-silico prediction of the cell-of-origin for primary liver cancer subtypes. Despite its mixed histological features, the mixed hepatocellular carcinoma/intrahepatic cholangiocarcinoma subtype was predominantly predicted to be of hepatocytic origin. Individual sample-level predictions also revealed hepatocytes as one of the major predicted cells-of-origin for intrahepatic cholangiocarcinoma, implying a trans-differentiation process during cancer progression. Additional analyses of the whole-genome sequencing data of hepatic progenitor cells suggest that these cells may not be a direct cell-of-origin for liver cancers. These results provide novel insights into the nature and potential contributors of the cell-of-origin for primary liver cancers.
We present here COOBoostR, a computational method designed for the putative prediction of the tissue- or cell-of-origin of various cancer types. COOBoostR feeds regional somatic mutation density information and chromatin mark features into an extreme gradient boosting-based machine-learning algorithm. COOBoostR ranks chromatin marks from various tissue and cell types by how well they explain the somatic mutation density landscape of any sample of interest. The tissue or cell type matching the chromatin mark feature with the highest explanatory power is designated as a potential tissue- or cell-of-origin. By integrating ChIP-seq based chromatin data with regional somatic mutation density data derived from normal cells/tissue, precancerous lesions, and cancer types, we show that COOBoostR outperforms existing random forest-based methods in prediction speed, with comparable or better tissue- or cell-of-origin prediction performance (prediction accuracy: normal cells/tissue, 76.99%; precancerous lesions, 95.65%; cancer cells, 89.39%). In addition, our results suggest a dynamic somatic mutation accumulation at the normal tissue or cell stage, which could be intertwined with changes in open chromatin marks and enhancer sites. These results further indicate that chromatin marks shape the somatic mutation landscape at the early stage of mutation accumulation, possibly even before the initiation of precancerous lesions or neoplasia.
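The ranking idea described above can be sketched in a few lines. This is a minimal illustration, not the COOBoostR implementation: it swaps extreme gradient boosting for plain least-squares gradient boosting over depth-1 stumps, ranks features by their accumulated squared-error reduction, and runs on synthetic data. The feature names (`hepatocyte_mark`, `cholangiocyte_mark`, `unrelated_mark`), the number of rounds, and the learning rate are all assumptions made for the example.

```python
import random


def fit_stump(X, residuals):
    """Return the one-feature threshold split with the largest squared-error reduction."""
    n = len(residuals)
    total = sum(residuals)
    best = None  # (gain, feature index, threshold, left mean, right mean)
    for j in range(len(X[0])):
        order = sorted(range(n), key=lambda i: X[i][j])
        left_sum = 0.0
        for k in range(1, n):
            left_sum += residuals[order[k - 1]]
            lower, upper = X[order[k - 1]][j], X[order[k]][j]
            if lower == upper:
                continue  # cannot split between tied values
            right_sum = total - left_sum
            # Reduction in the sum of squared errors achieved by this split.
            gain = left_sum ** 2 / k + right_sum ** 2 / (n - k) - total ** 2 / n
            if best is None or gain > best[0]:
                best = (gain, j, lower, left_sum / k, right_sum / (n - k))
    return best


def boost_and_rank(X, y, feature_names, n_rounds=50, learning_rate=0.1):
    """Fit boosted stumps to y and rank features by accumulated gain."""
    n = len(y)
    pred = [sum(y) / n] * n
    importance = {name: 0.0 for name in feature_names}
    for _ in range(n_rounds):
        residuals = [y[i] - pred[i] for i in range(n)]
        stump = fit_stump(X, residuals)
        if stump is None:
            break  # no feature offers a valid split
        gain, j, threshold, left_mean, right_mean = stump
        importance[feature_names[j]] += gain
        for i in range(n):
            step = left_mean if X[i][j] <= threshold else right_mean
            pred[i] += learning_rate * step
    return sorted(importance.items(), key=lambda kv: kv[1], reverse=True)


# Synthetic example: mutation density per region is driven by the first feature only.
rng = random.Random(0)
names = ["hepatocyte_mark", "cholangiocyte_mark", "unrelated_mark"]  # hypothetical
X = [[rng.random() for _ in names] for _ in range(200)]
y = [10.0 - 3.0 * row[0] + rng.gauss(0.0, 0.3) for row in X]

ranking = boost_and_rank(X, y, names)
print("top-ranked feature:", ranking[0][0])
```

In this toy setting the cell type attached to the top-ranked feature would be read as the candidate cell-of-origin; with real data, each row would be a genomic region and each column a chromatin mark profiled in a given tissue or cell type.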
We here present COOBoostR (https://github.com/SWJ9385/COOBoostR), a computational method designed for the putative prediction of the tissue- or cell-of-origin of various cancer types. COOBoostR feeds regional somatic mutation density information and chromatin mark features into an extreme gradient boosting-based machine-learning algorithm. COOBoostR ranks chromatin marks from various tissue and cell types by how well they explain the somatic mutation density landscape of any sample of interest. By integrating either ChIP-seq based chromatin data or bulk/single cell chromatin accessibility data with regional somatic mutation density data derived from normal cells/tissue, precancerous lesions, and cancer types, we show that COOBoostR outperforms existing random forest-based methods in prediction speed, with comparable or better tissue- or cell-of-origin prediction performance. In addition, our results suggest a dynamic somatic mutation accumulation at the normal tissue or cell stage, which could be intertwined with changes in open chromatin marks and enhancer sites. These results further indicate that chromatin marks shape the somatic mutation landscape at the early stage of mutation accumulation, possibly even before the initiation of precancerous lesions or neoplasia.
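The "regional somatic mutation density" input mentioned above can be illustrated by counting mutations in fixed-size genomic windows. The sketch below is only an assumption about how such a table could be built: the abstract does not state a window size, so the 1 Mb default, the 1-based coordinate convention, and the function name `regional_mutation_density` are all hypothetical.

```python
from collections import Counter


def regional_mutation_density(mutations, window_size=1_000_000):
    """Count mutations per (chromosome, window index).

    `mutations` is an iterable of (chromosome, position) pairs with 1-based
    positions; the window size is an assumed parameter, not taken from the source.
    """
    counts = Counter()
    for chrom, pos in mutations:
        counts[(chrom, (pos - 1) // window_size)] += 1
    return dict(counts)


# Toy input: three mutations on chr1 and one on chr2.
example = [("chr1", 1), ("chr1", 1_000_000), ("chr1", 1_000_001), ("chr2", 500)]
density = regional_mutation_density(example)
print(density)
```

Each window's count would then serve as the response value for that region, paired with the chromatin mark signal measured over the same window.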
Background: Primary liver cancers display a consistent increase in global disease burden and mortality. Identifying the cell-of-origin for primary liver cancers is necessary to expand the options for designing relevant therapeutics and preventive medicine for these cancer types. Previous reports on the cell-of-origin for primary liver cancers were mainly from animal studies, and integrative research using human specimen data has been poorly established.
Methods: We analyzed a whole-genome sequencing data set for a total of 363 tumor and progenitor tissues along with 423 normal tissue epigenomic marks to predict the cell-of-origin for primary liver cancer subtypes.
Results: Despite its mixed histological features, the mixed hepatocellular carcinoma/intrahepatic cholangiocarcinoma subtype was uniformly predicted to be of hepatocytic origin. Individual sample-level prediction revealed different levels of cell-of-origin heterogeneity depending on the primary liver cancer type, with more heterogeneity observed in intrahepatic cholangiocarcinomas. Additional analyses of the whole-genome sequencing data of hepatic progenitor cells suggest that these progenitor cells might not be a direct cell-of-origin for liver cancers.
Conclusions: These results provide novel insights into the heterogeneous nature and potential contributors of cell-of-origin predictions for primary liver cancers.

Background
Primary liver cancers (PLCs) are among the major cancer types with increasing global disease burden over the years, reaching incidence and mortality of over 900,000 per year (1, 2). The high morbidity and mortality of PLCs are due to the complex nature of the disease and the lack of effective diagnostics and treatments besides multi-kinase inhibitors, which strongly emphasizes the importance of research on early diagnosis and extensive drug development. In line with this, several studies have sought to identify suitable diagnostic markers and targeted therapy-based treatments for PLCs, including whole-genome and exome-level profiling (3). Recent comprehensive efforts to investigate the genomics of PLCs have revealed novel insights into the major mutation signatures, sub-classifications, and recurrent somatic mutations in coding regions (TERT, TP53, CTNNB1, KRAS, IDH1/2, etc.) and noncoding regions (NEAT1 and MALAT1), some of which are driver mutations and may be associated with clinical outcomes (4, 5). More investigations are underway to fully unveil the mechanisms and processes behind the progression of PLCs.
One of the complex, unanswered questions associated with the progression of PLCs is the possible cells-of-origin (COOs) corresponding to the various subtypes. PLC not only represents the classical hepatocellular carcinoma (HCC) subtype, comprising ~90% of PLCs, but also includes mixed hepatocellular and cholangiocarcinoma (Mixed) and intrahepatic cholang...