The prediction of microsatellite instability (MSI) using deep learning (DL) techniques could have significant benefits, including reducing cost and increasing MSI testing of colorectal cancer (CRC) patients. Nonetheless, batch effects or systematic biases are not well characterized in digital histology models and lead to overoptimistic estimates of model performance. Methods that not only palliate but directly abrogate biases are needed. We present a multiple bias rejecting DL system based on adversarial networks for the prediction of MSI in CRC from tissue microarrays (TMAs), trained and validated on 1788 patients from EPICOLON and HGUA. The system consists of an end-to-end image preprocessing module that tiles samples at multiple magnifications and a tissue classification module linked to the bias-rejecting MSI predictor. We detected three biases associated with the learned representations of a baseline model: the project of origin of samples, the patient’s spot and the TMA glass where each spot was placed. The system was trained to directly avoid learning the batch effects of those variables. The learned features from the bias-ablated model achieved maximum discriminative power with respect to the task and minimal statistical mean dependence with the biases. We analyze the impact of different magnifications and tissue types, as well as model performance at tile versus patient level. The AUC at tile level, including all three selected tissues (tumor epithelium, mucin and lymphocytic regions) and 4 magnifications, was 0.87 ± 0.03 and increased to 0.9 ± 0.03 at patient level. To the best of our knowledge, this is the first work that incorporates a multiple bias ablation technique into the DL architecture in digital pathology, and the first to use TMAs for the MSI prediction task.
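The difference between tile-level and patient-level AUC can be illustrated with a small sketch. It assumes, as a simplification, that a patient's score is the mean of that patient's tile scores and that every tile inherits the patient's MSI label; the abstract does not state the aggregation rule, and the function names and toy data below are hypothetical.

```python
from collections import defaultdict


def auc(labels, scores):
    """Rank-based AUC: fraction of (positive, negative) pairs ranked correctly.

    Ties between a positive and a negative score count as half a win.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def to_patient_level(patient_ids, tile_scores, tile_labels):
    """Average tile scores per patient (ASSUMPTION: mean pooling)."""
    sums = defaultdict(float)
    counts = defaultdict(int)
    label_of = {}
    for pid, score, label in zip(patient_ids, tile_scores, tile_labels):
        sums[pid] += score
        counts[pid] += 1
        label_of[pid] = label  # tiles inherit the patient's label
    pids = sorted(sums)
    scores = [sums[p] / counts[p] for p in pids]
    labels = [label_of[p] for p in pids]
    return pids, scores, labels


# Hypothetical toy data: four patients, two tiles each (1 = MSI, 0 = MSS).
patient_ids = ["A", "A", "B", "B", "C", "C", "D", "D"]
tile_labels = [1, 1, 0, 0, 1, 1, 0, 0]
tile_scores = [0.9, 0.4, 0.5, 0.1, 0.8, 0.7, 0.6, 0.2]

tile_auc = auc(tile_labels, tile_scores)
_, patient_scores, patient_labels = to_patient_level(
    patient_ids, tile_scores, tile_labels
)
patient_auc = auc(patient_labels, patient_scores)
print(tile_auc, patient_auc)
```

On this toy data the tile-level AUC is 0.875 and the patient-level AUC is 1.0: averaging dampens the single low-scoring tile of patient A, which is one plausible reason a patient-level metric can exceed the tile-level one.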
The prediction of microsatellite instability (MSI) in colorectal cancer (CRC) using deep learning (DL) techniques directly from hematoxylin and eosin (H&E) stained slides has been shown to be feasible by independent works. Nonetheless, when available, relevant information from clinical, oncological and family history could be used to further inform DL predictions. The present work analyzes the effects of leveraging multimodal inputs and multitask supervision in a previously published DL system for the prediction of MSI in CRC (xDEEP-MSI). xDEEP-MSI was a multiple bias rejecting DL system based on adversarial networks, trained and validated on 1788 patients from a total of 25 participating centers from the EPICOLON and HGUA projects. In the present work, xDEEP-MSI is further enriched with weakly supervised learning on multiple molecular alterations (MSI status, K-RAS and BRAF mutations and Lynch Syndrome confirmed by germline mutations), adapted to multimodal inputs with a variable degree of completeness (image, age, gender, localization of CRC, revised Bethesda criteria, Amsterdam II criteria and additional oncological history), and extended with self-supervised multiple instance learning that integrates multiple image tiles to obtain patient-level predictions. The AUC, including all three selected tissues (tumor epithelium, mucin and lymphocytic regions) and 5 magnifications, increases from 0.9 ± 0.03 to 0.94 ± 0.02. Sensitivity and specificity reach 92.5% (95% CI 79.6-98.4%) and 93.4% (95% CI 90.0-95.8%), respectively. To the best of our knowledge, this is the first work that jointly uses multimodal inputs, multiple instance learning and multiple molecular supervision for the prediction of MSI in CRC from H&E, demonstrating their gains in performance. Prospective validation in an external independent dataset is still required.
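One common way to feed a model clinical inputs "with a variable degree of completeness" is to pair each value with a presence flag, so a missing field is distinguishable from a true zero. The sketch below illustrates that idea only; the abstract does not describe the encoding actually used, and the field names, the assumption that values are already numeric and scaled, and the function name are all hypothetical.

```python
from typing import Dict, List, Optional, Sequence

# Hypothetical field names, loosely following the inputs listed in the abstract.
CLINICAL_FIELDS = ("age", "gender", "bethesda", "amsterdam_ii")


def encode_clinical(
    record: Dict[str, Optional[float]],
    fields: Sequence[str] = CLINICAL_FIELDS,
) -> List[float]:
    """Encode a partially filled record as [value, present, value, present, ...].

    ASSUMPTION: values are already numeric (e.g. age scaled to [0, 1],
    criteria as 0/1). A missing or None field becomes (0.0, 0.0), so the
    presence flag tells the model the zero is a placeholder.
    """
    encoded: List[float] = []
    for name in fields:
        value = record.get(name)
        if value is None:
            encoded.extend([0.0, 0.0])
        else:
            encoded.extend([float(value), 1.0])
    return encoded


# Patient with known age and Bethesda status, but unknown gender and
# Amsterdam II status.
partial = {"age": 0.63, "bethesda": 1}
vector = encode_clinical(partial)
print(vector)
```

The resulting fixed-length vector could then be concatenated with image-derived features, so patients with incomplete clinical history remain usable at training and inference time.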