Background: Identifying which individuals will develop tuberculosis (TB) remains an unresolved problem because few animal models and computational approaches effectively address its heterogeneity. To address these shortcomings, we show that Diversity Outbred (DO) mice reflect human-like genetic diversity and develop human-like lung granulomas when infected with Mycobacterium tuberculosis (M.tb). Methods: Following M.tb infection, a "supersusceptible" phenotype, characterized by rapid morbidity and mortality within 8 weeks, develops in approximately one-third of DO mice. These supersusceptible DO mice develop lung granuloma patterns akin to those in humans. This led us to use deep learning to identify supersusceptibility from hematoxylin & eosin (H&E) lung tissue sections using only clinical outcomes (supersusceptible or not-supersusceptible) as labels. Findings: The proposed machine learning model diagnosed supersusceptibility with high accuracy (91.50 ± 4.68%) compared to two expert pathologists using H&E stained lung sections (94.95% and 94.58%). Two nonexperts used the imaging biomarker to diagnose supersusceptibility with high accuracy (88.25% and 87.95%) and agreement (96.00%). A board-certified veterinary pathologist (GB) examined the imaging biomarker and determined that the model was making diagnostic decisions using a form of granuloma necrosis (karyorrhectic and pyknotic nuclear debris). This was corroborated by one other board-certified veterinary pathologist. Finally, the imaging biomarker was quantified, providing a novel means to convert visual patterns within granulomas to data suitable for statistical analyses. Implications: Overall, our results have translatable implications for improving our understanding of TB and for the broader field of computational pathology, in which clinical outcomes alone can drive automatic identification of interpretable imaging biomarkers, knowledge discovery, and validation of existing clinical biomarkers.
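The accuracy and agreement figures reported above can be illustrated with a minimal sketch. It assumes that "agreement" means simple percent agreement between two raters, which the abstract does not state; the function names and the toy labels are hypothetical and are not the study's data.

```python
def accuracy(truth, calls):
    """Percentage of calls that match the reference labels."""
    if len(truth) != len(calls) or not truth:
        raise ValueError("inputs must be non-empty and of equal length")
    matches = sum(t == c for t, c in zip(truth, calls))
    return 100.0 * matches / len(truth)


def percent_agreement(rater_a, rater_b):
    """Percentage of cases on which two raters made the same call.

    Assumption: plain percent agreement, not a chance-corrected statistic.
    """
    return accuracy(rater_a, rater_b)


# Toy labels (hypothetical): 1 = supersusceptible, 0 = not-supersusceptible.
outcome = [1, 0, 1, 1]
rater_a = [1, 0, 0, 1]
rater_b = [1, 0, 0, 1]

print(accuracy(outcome, rater_a))            # accuracy of rater A vs outcome
print(percent_agreement(rater_a, rater_b))   # agreement between the raters
```

Two raters can agree with each other perfectly while both being wrong on the same case, which is why accuracy and agreement are reported separately.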
More humans have died of tuberculosis (TB) than of any other infectious disease, and millions still die each year. Experts advocate for blood-based, serum protein biomarkers to help diagnose TB, which afflicts millions of people in high-burden countries. However, the protein biomarker pipeline is small. Here, we used the Diversity Outbred (DO) mouse population to address this gap, identifying five protein biomarker candidates. One protein biomarker, serum CXCL1, met the World Health Organization’s Targeted Product Profile for a triage test to distinguish active TB from latent M.tb infection (LTBI), non-TB lung disease, and normal sera in HIV-negative adults from South Africa and Vietnam. To find the biomarker candidates, we quantified seven immune cytokines and four inflammatory proteins corresponding to highly expressed genes unique to progressor DO mice. Next, we applied statistical and machine learning methods to the data, i.e., 11 proteins in lungs from 453 infected and 29 non-infected mice. After searching all combinations of five algorithms and 239 protein subsets, validating, and testing the findings on independent data, two combinations accurately diagnosed progressor DO mice: Logistic Regression using MMP8, and Gradient Tree Boosting using a panel of four proteins (CXCL1, CXCL2, TNF, IL-10). Of those five protein biomarker candidates, two (MMP8 and CXCL1) were crucial for classifying DO mice, were above the limit of detection in most human serum samples, and had not been widely assessed for diagnostic performance in humans before. In patient sera, CXCL1 exceeded the triage diagnostic test criteria (>90% sensitivity; >70% specificity), while MMP8 did not. Using Area Under the Curve analyses, CXCL1 averaged 94.5% sensitivity and 88.8% specificity for active pulmonary TB (ATB) vs LTBI; 90.9% sensitivity and 71.4% specificity for ATB vs non-TB; and 100.0% sensitivity and 98.4% specificity for ATB vs normal sera.
Our findings overall show that the DO mouse population can discover diagnostic-quality, serum protein biomarkers of human TB.
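The triage criteria quoted above (>90% sensitivity; >70% specificity) can be checked with a short sketch. The function names are hypothetical, and the thresholds are taken only from the abstract's wording; this is an illustration, not the authors' analysis code.

```python
def sensitivity_specificity(y_true, y_pred):
    """Return (sensitivity, specificity) for binary labels, 1 = active TB."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    if tp + fn == 0 or tn + fp == 0:
        raise ValueError("need at least one positive and one negative case")
    return tp / (tp + fn), tn / (tn + fp)


def meets_triage_criteria(sensitivity, specificity,
                          min_sensitivity=0.90, min_specificity=0.70):
    """True when both values exceed the thresholds quoted in the abstract."""
    return sensitivity > min_sensitivity and specificity > min_specificity


# CXCL1 averages reported in the abstract, expressed as fractions.
reported = {
    "ATB vs LTBI": (0.945, 0.888),
    "ATB vs non-TB": (0.909, 0.714),
    "ATB vs normal sera": (1.000, 0.984),
}
for comparison, (sens, spec) in reported.items():
    print(comparison, meets_triage_criteria(sens, spec))
```

Both thresholds must be exceeded at once, so a marker with excellent specificity but sensitivity below 90% would still fail the triage check.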
Background: Machine learning has been applied successfully to many diagnostic and prognostic problems in computational histopathology. Yet, few efforts have been made to model gene expression from histopathology. This study proposes a methodology that predicts selected gene expression values (microarray) from haematoxylin and eosin whole-slide images as an intermediate data modality to identify fulminant-like pulmonary tuberculosis ('supersusceptible') in an experimentally infected cohort of Diversity Outbred mice (n = 77). Methods: Gradient-boosted trees were used as a novel feature selector to identify gene transcripts predictive of fulminant-like pulmonary tuberculosis. A novel attention-based multiple instance learning model for regression was used to predict the selected genes' expression from whole-slide images. The predicted gene expression values were shown to be accurate enough to identify supersusceptible mice using gradient-boosted trees trained on ground truth gene expression data. Findings: The model was accurate, showing high positive correlations with ground truth gene expression on both cross-validation (n = 77, 0.63 ≤ ρ ≤ 0.84) and external testing sets (n = 33, 0.65 ≤ ρ ≤ 0.84). The sensitivity and specificity for gene expression predictions to identify supersusceptible mice (n = 77) were 0.88 and 0.95, respectively, and for an external set of mice (n = 33) 0.88 and 0.93, respectively. Implications: Our methodology maps histopathology to gene expression with sufficient accuracy to predict a clinical outcome. The proposed methodology exemplifies a computational template for gene expression panels, in which relatively inexpensive and widely available tissue histopathology may be mapped to specific genes' expression to serve as a diagnostic or prognostic tool. Funding: National Institutes of Health and American Lung Association.
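The idea of attention-based multiple instance learning for regression can be sketched as follows. The abstract does not describe the model's architecture, so this assumes the commonly used form in which each image patch (instance) receives a softmax attention weight, the weighted average forms a slide-level embedding, and a linear head outputs one expression value. All parameter names (`V`, `w`, `u`, `b`) and the toy numbers are hypothetical.

```python
import math


def attention_mil_regress(instances, V, w, u, b=0.0):
    """Attention pooling over a bag of instance features, then a linear head.

    instances: list of feature vectors, one per image patch
    V:         hidden-by-feature matrix for the attention scorer (assumed)
    w:         hidden-size vector that turns tanh(V h) into a scalar score
    u, b:      weights and bias of the linear regression head (assumed)
    """
    # Unnormalised attention score per instance: w . tanh(V h)
    scores = [
        sum(w_j * math.tanh(sum(v * x for v, x in zip(row, h)))
            for w_j, row in zip(w, V))
        for h in instances
    ]
    # Numerically stable softmax over the instances in the bag
    top = max(scores)
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    weights = [e / total for e in exps]
    # Bag embedding: attention-weighted average of the instance features
    dim = len(instances[0])
    bag = [sum(a * h[i] for a, h in zip(weights, instances)) for i in range(dim)]
    prediction = sum(u_i * z_i for u_i, z_i in zip(u, bag)) + b
    return prediction, weights


# Toy bag of two patches with two features each (hypothetical values).
patches = [[1.0, 2.0], [3.0, 4.0]]
pred, attn = attention_mil_regress(patches, V=[[0.5, -0.5]], w=[0.0], u=[1.0, 1.0])
print(pred, attn)
```

The attention weights are what make such a model inspectable: patches with high weight are the regions that contributed most to the slide-level prediction.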
The World Health Organization (WHO) has clear guidelines regarding the use of the Ki67 index in defining the proliferative rate and assigning grade for pancreatic neuroendocrine tumor (NET). WHO mandates the quantification of the Ki67 index by counting at least 500 positive tumor cells in a hotspot. Unfortunately, the Ki67 antibody may stain both tumor and non-tumor cells as positive depending on the phase of the cell cycle. Likewise, the counterstain labels both tumor and non-tumor cells as negative. This non-specific nature of the Ki67 stain and counterstain therefore hinders the exact quantification of the Ki67 index. To address this problem, we present a deep learning method to automatically differentiate between NET and non-tumor regions based on images of Ki67-stained biopsies. Transfer learning was employed to recognize and apply relevant knowledge from previous learning experiences to differentiate between tumor and non-tumor regions. Transfer learning exploits a rich set of features previously used to successfully categorize non-pathology data into 1,000 classes. The method was trained and validated on a set of whole-slide images including 33 NETs subject to Ki67 immunohistochemical staining using leave-one-out cross-validation. When applied to 30 high power fields (HPF) and assessed against a gold standard (evaluation by two expert pathologists), the method achieved a high sensitivity of 97.8% and a specificity of 88.8%. The deep learning method has the potential to reduce pathologists’ workload by directly identifying tumor boundaries on images of Ki67-stained slides. Moreover, it has the potential to replace the sophisticated and expensive imaging methods that have recently been developed for identifying tumor boundaries in images of Ki67-stained NETs.
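Why tumor versus non-tumor separation matters for the Ki67 index can be shown with a small worked example. It assumes the index is the percentage of Ki67-positive cells among tumor cells only; the function name, the per-cell tuple format, and the cell counts are hypothetical.

```python
def ki67_index(cells, tumor_only=True):
    """Percentage of Ki67-positive cells.

    cells:      iterable of (is_tumor, is_ki67_positive) pairs, one per cell
    tumor_only: when True, non-tumor cells are excluded before counting
    """
    stains = [positive for is_tumor, positive in cells
              if is_tumor or not tumor_only]
    if not stains:
        raise ValueError("no cells left to count")
    return 100.0 * sum(stains) / len(stains)


# Hypothetical field: 3 positive tumor cells, 7 negative tumor cells,
# and 5 positive non-tumor cells that the stain cannot tell apart.
field = ([(True, True)] * 3) + ([(True, False)] * 7) + ([(False, True)] * 5)

print(ki67_index(field))                     # tumor cells only
print(ki67_index(field, tumor_only=False))   # naive count over every cell
```

In this toy field the naive count is inflated by the positive non-tumor cells, which is the kind of error a tumor-region classifier is meant to remove.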