The backbone of all colorectal cancer classifications, including the consensus molecular subtypes (CMS), highlights microsatellite instability (MSI) as a key molecular pathway. Although mucinous histology (generally defined as >50% extracellular mucin-to-tumor area) is a “typical” feature of MSI, it is not limited to this subgroup. Here, we investigate the association between CMS classification and mucin-to-tumor area quantified using a deep learning algorithm, and the value of specific mucin expression in predicting CMS groups and clinical outcome. A weakly supervised segmentation method was developed to quantify extracellular mucin-to-tumor area in H&E images. Performance was compared to two pathologists’ scores, and the method was then applied to two cohorts: (1) TCGA (n = 871 slides/412 patients), used for mucin-CMS group correlation, and (2) Bern (n = 775 slides/517 patients), used for histopathological correlations and next-generation Tissue Microarray construction. TCGA and CPTAC (n = 85 patients) were used to further validate mucin detection and CMS classification by gene and protein expression analysis for MUC2, MUC4, MUC5AC and MUC5B. Excellent agreement between the pathologists’ scores and the algorithm was obtained (ICC = 0.92). In TCGA, mucinous tumors were predominantly CMS1 (25.7%), CMS3 (24.6%) and CMS4 (16.2%). Average mucin in CMS2 was 1.8%, indicating negligible amounts. RNA and protein expression of MUC2, MUC4, MUC5AC and MUC5B were low-to-absent in CMS2. MUC5AC protein expression correlated with aggressive tumor features, e.g., distant metastases (p = 0.0334), BRAF mutation (p < 0.0001) and mismatch repair deficiency (p < 0.0001), and with unfavorable 5-year overall survival (44% versus 65% for positive versus negative staining). MUC2 expression showed the opposite trend, correlating with less lymphatic (p = 0.0096) and venous vessel invasion (p = 0.0023), and had no impact on survival. The absence of mucin-expressing tumors in CMS2 provides an important phenotype-genotype correlation. Together with MSI, mucinous histology may help predict CMS classification using only histopathology and should be considered in future image classifiers of molecular subtypes.
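The >50% mucin-to-tumor area criterion mentioned above can be illustrated with a minimal sketch. This is not the study's algorithm: the label codes, the function names, and the choice of denominator (tumor pixels plus mucin pixels) are assumptions made purely for illustration, and the mask would in practice come from a segmentation model rather than being written by hand.

```python
from typing import List

# Hypothetical label codes for a per-pixel segmentation mask
# (assumed for illustration; not taken from the study).
BACKGROUND, TUMOR, MUCIN = 0, 1, 2


def mucin_to_tumor_fraction(mask: List[List[int]]) -> float:
    """Return mucin area / (tumor area + mucin area).

    The denominator is an assumption; returns 0.0 when the mask
    contains neither tumor nor mucin pixels.
    """
    mucin = sum(row.count(MUCIN) for row in mask)
    tumor = sum(row.count(TUMOR) for row in mask)
    total = mucin + tumor
    return mucin / total if total else 0.0


def is_mucinous(mask: List[List[int]], threshold: float = 0.5) -> bool:
    """Apply the conventional >50% cut-off to the area fraction."""
    return mucin_to_tumor_fraction(mask) > threshold


# Toy 3x4 mask: 6 mucin pixels, 3 tumor pixels, 3 background pixels.
example_mask = [
    [0, 1, 1, 2],
    [0, 1, 2, 2],
    [0, 2, 2, 2],
]

if __name__ == "__main__":
    print(round(mucin_to_tumor_fraction(example_mask), 3))
    print(is_mucinous(example_mask))
```

On the toy mask the fraction is 6/9, so the slide would be called mucinous under these assumptions; a different definition of the tumor area would change the denominator and possibly the call.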
Today, computational molecular evolution and bioinformatics are vibrant research areas that flourish on large amounts of complex data generated by new generation technologies – from full genomes and proteomes to microbiomes, metabolomes and epigenomes. Yet the foundations for successful mining and analysis of such data were established long before the structure of DNA was discovered. Darwin’s theory of evolution by means of natural selection not only remains relevant today, but also provides solid ground for computational research with a variety of applications. The size and complexity of the data require empirical scientists to work in close collaboration with experts in computational science, modeling and statistics, as Sir R. Fisher beautifully demonstrated in the early 20th century. In particular, modern computational methods for evaluating selection in molecular sequences are very useful for generating biological hypotheses and candidate gene sets for follow-up experiments. Evolutionary analyses of selective pressures in genomic data have high potential for applications, since natural selection is a leading force in function conservation and in adaptation to emerging pathogens and new environments, and plays a key role in immune and resistance systems. At this stage, pharma and biotech industries can successfully use this potential, taking the initiative to enhance their research and development with state-of-the-art bioinformatics approaches. This mini-review provides a quick “why-and-how” guide to the current approaches that apply the evolutionary principles of natural selection to real-life problems – from drug target validation, vaccine design and protein engineering to applications in agriculture, ecology and conservation.
Motivation: Automatically extracting relationships among multiple types of entities from biomedical texts is an essential task in biomedical natural language processing with numerous applications, such as drug development or repurposing, precision medicine, and other biomedical tasks requiring knowledge discovery. Current Relation Extraction (RE) systems mostly use one set of features, either as text or, more recently, as graph structures. State-of-the-art systems often use resource-intensive, and hence slow, algorithms and largely work for a particular type of relationship. However, a simple yet agile system that learns from different sets of features has the advantage of adapting to different relationship types without the extra burden of system re-design. Results: We model RE as a classification task and propose a new multi-channel deep neural network designed to process textual and graph structures in separate input channels. We extend a Recurrent Neural Network (RNN) with a Convolutional Neural Network (CNN) to process three sets of features, namely, tokens, types, and graphs. We demonstrate that entity type and ontology graph structure provide better representations than simple token-based representations for RE. We also experiment with various sources of knowledge, including data resources in the Unified Medical Language System (UMLS), to test our hypothesis. Extensive experiments on four well-studied biomedical benchmarks with different relationship types show that our system outperforms earlier ones. Thus, our system has state-of-the-art performance and allows processing millions of full-text scientific articles in a few days on one typical machine.