Background One of the most challenging tasks for bladder cancer diagnosis is to histologically differentiate two early stages, non-invasive Ta and superficially invasive T1, the latter of which is associated with a significantly higher risk of disease progression. Indeed, in a considerable number of cases, Ta and T1 tumors look very similar under microscope, making the distinction very difficult even for experienced pathologists. Thus, there is an urgent need for a favoring system based on machine learning (ML) to distinguish between the two stages of bladder cancer. Methods A total of 1177 images of bladder tumor tissues stained by hematoxylin and eosin were collected by pathologists at University of Rochester Medical Center, which included 460 non-invasive (stage Ta) and 717 invasive (stage T1) tumors. Automatic pipelines were developed to extract features for three invasive patterns characteristic to the T1 stage bladder cancer (i.e., desmoplastic reaction, retraction artifact, and abundant pinker cytoplasm), using imaging processing software ImageJ and CellProfiler. Features extracted from the images were analyzed by a suite of machine learning approaches. Results We extracted nearly 700 features from the Ta and T1 tumor images. Unsupervised clustering analysis failed to distinguish hematoxylin and eosin images of Ta vs. T1 tumors. With a reduced set of features, we successfully distinguished 1177 Ta or T1 images with an accuracy of 91–96% by six supervised learning methods. By contrast, convolutional neural network (CNN) models that automatically extract features from images produced an accuracy of 84%, indicating that feature extraction driven by domain knowledge outperforms CNN-based automatic feature extraction. Further analysis revealed that desmoplastic reaction was more important than the other two patterns, and the number and size of nuclei of tumor cells were the most predictive features. Conclusions We provide a ML-empowered, feature-centered, and interpretable diagnostic system to facilitate the accurate staging of Ta and T1 diseases, which has a potential to apply to other types of cancer.
Background The topological landscape of gene interaction networks provides a rich source of information for inferring functional patterns of genes or proteins. However, it is still a challenging task to aggregate heterogeneous biological information such as gene expression and gene interactions to achieve more accurate inference for prediction and discovery of new gene interactions. In particular, how to generate a unified vector representation to integrate diverse input data is a key challenge addressed here. Results We propose a scalable and robust deep learning framework to learn embedded representations to unify known gene interactions and gene expression for gene interaction predictions. These low- dimensional embeddings derive deeper insights into the structure of rapidly accumulating and diverse gene interaction networks and greatly simplify downstream modeling. We compare the predictive power of our deep embeddings to the strong baselines. The results suggest that our deep embeddings achieve significantly more accurate predictions. Moreover, a set of novel gene interaction predictions are validated by up-to-date literature-based database entries. Conclusion The proposed model demonstrates the importance of integrating heterogeneous information about genes for gene network inference. GNE is freely available under the GNU General Public License and can be downloaded from GitHub ( https://github.com/kckishan/GNE ). Electronic supplementary material The online version of this article (10.1186/s12918-019-0694-y) contains supplementary material, which is available to authorized users.
The topological landscape of gene interaction networks provides a rich source of information for inferring functional patterns of genes or proteins. However, it is still a challenging task to aggregate heterogeneous biological information such as gene expression and gene interactions to achieve more accurate inference for prediction and discovery of new gene interactions. In particular, how to generate a unified vector representation to integrate diverse input data is a key challenge addressed here. We propose a scalable and robust deep learning framework to learn embedded representations to unify known gene interactions and gene expression for gene interaction predictions. These low-dimensional embeddings derive deeper insights into the structure of rapidly accumulating and diverse gene interaction networks and greatly simplify downstream modeling. We compare the predictive power of our deep embeddings to the strong baselines. The results suggest that our deep embeddings achieve significantly more accurate predictions. Moreover, a set of novel gene interaction predictions are validated by up-to-date
Biomedical interaction networks have incredible potential to be useful in the prediction of biologically meaningful interactions, identification of network biomarkers of disease, and the discovery of putative drug targets. Recently, graph neural networks have been proposed to effectively learn representations for biomedical entities and achieved state-of-the-art results in biomedical interaction prediction. These methods only consider information from immediate neighbors but cannot learn a general mixing of features from neighbors at various distances. In this paper, we present a higher-order graph convolutional network (HOGCN) to aggregate information from the higher-order neighborhood for biomedical interaction prediction. Specifically, HOGCN collects feature representations of neighbors at various distances and learns their linear mixing to obtain informative representations of biomedical entities. Experiments on four interaction networks, including protein-protein, drug-drug, drug-target, and gene-disease interactions, show that HOGCN achieves more accurate and calibrated predictions. HOGCN performs well on noisy, sparse interaction networks when feature representations of neighbors at various distances are considered. Moreover, a set of novel interaction predictions are validated by literature-based case studies.
scite is a Brooklyn-based organization that helps researchers better discover and understand research articles through Smart Citations–citations that display the context of the citation and describe whether the article provides supporting or contrasting evidence. scite is used by students and researchers from around the world and is funded in part by the National Science Foundation and the National Institute on Drug Abuse of the National Institutes of Health.
hi@scite.ai
10624 S. Eastern Ave., Ste. A-614
Henderson, NV 89052, USA
Copyright © 2024 scite LLC. All rights reserved.
Made with 💙 for researchers
Part of the Research Solutions Family.