Objective
Despite 90% of glioblastoma (GBM) recurrences occurring in the peritumoral brain zone (PBZ), its contribution to patient survival is poorly understood. The current study leverages computerized texture (i.e., radiomic) analysis to evaluate the efficacy of PBZ features from preoperative MRI in predicting long-term (>18 months) versus short-term (<7 months) survival in GBM.
Methods
Sixty-five patient exams (29 short-term, 36 long-term) with gadolinium-contrast T1w, FLAIR, and T2w sequences from the Cancer Imaging Archive were employed. An expert manually segmented each study into three regions: enhancing lesion, PBZ, and tumour necrosis. A total of 402 radiomic features (capturing co-occurrence, gray-level dependence, and directional gradients) were obtained for each region. Evaluation was performed using 3-fold cross-validation, such that one subset of studies was used to select the most predictive features and the remaining subset was used to evaluate their efficacy in predicting survival.
Results
A subset of 10 radiomic “peritumoral” MRI features, suggestive of intensity heterogeneity and textural patterns, was found to be predictive of survival (p = 1.47 × 10⁻⁵), as compared to features from enhancing tumour, necrotic regions, and known clinical factors.
Conclusion
Our preliminary analysis suggests that radiomic features from the PBZ on routine preoperative MRI may be predictive of long- versus short-term survival in GBM.
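The evaluation protocol described in the Methods above (select features on one subset of studies, test them on the held-out subset, repeat over 3 folds) can be sketched as follows. This is a minimal illustration, not the study's implementation: the abstract does not state how features were ranked or which classifier was used, so the mean-difference ranking, the nearest-centroid classifier, and the synthetic data below are all assumptions made for the example.

```python
import random
from statistics import mean

def k_fold_indices(n, k, seed=0):
    """Shuffle study indices and deal them into k folds."""
    idx = list(range(n))
    random.Random(seed).shuffle(idx)
    return [idx[i::k] for i in range(k)]

def select_features(X, y, train, top_n):
    """Rank features on the TRAINING studies only (assumed criterion:
    absolute difference between the two class means)."""
    def score(j):
        long_term = [X[i][j] for i in train if y[i] == 1]
        short_term = [X[i][j] for i in train if y[i] == 0]
        return abs(mean(long_term) - mean(short_term))
    return sorted(range(len(X[0])), key=score, reverse=True)[:top_n]

def nearest_centroid_predict(X, y, train, test, feats):
    """Assumed classifier: assign each test study to the closer class centroid."""
    centroid = {}
    for label in (0, 1):
        rows = [X[i] for i in train if y[i] == label]
        centroid[label] = [mean(r[j] for r in rows) for j in feats]
    preds = []
    for i in test:
        dist = {
            label: sum((X[i][j] - c) ** 2 for j, c in zip(feats, centroid[label]))
            for label in (0, 1)
        }
        preds.append(min(dist, key=dist.get))
    return preds

def cross_validate(X, y, k=3, top_n=10, seed=0):
    """Feature selection happens inside each fold, so held-out studies
    never influence which features are chosen."""
    folds = k_fold_indices(len(X), k, seed)
    correct = 0
    for f in range(k):
        test = folds[f]
        train = [i for g in range(k) if g != f for i in folds[g]]
        feats = select_features(X, y, train, top_n)
        preds = nearest_centroid_predict(X, y, train, test, feats)
        correct += sum(p == y[i] for p, i in zip(preds, test))
    return correct / len(X)

# Synthetic stand-in data: feature 0 separates the classes, the rest are noise.
rng = random.Random(1)
y = [i % 2 for i in range(60)]
X = [[rng.gauss(5.0 * y[i], 1.0)] + [rng.gauss(0.0, 1.0) for _ in range(4)]
     for i in range(60)]
accuracy = cross_validate(X, y, k=3, top_n=1)
```

Selecting features inside each fold, rather than once on all studies, is what keeps the held-out estimate from being optimistically biased.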
Hypoxia, a characteristic trait of glioblastoma (GBM), is known to cause resistance to chemo-radiation treatment and is linked with poor survival. There is hence an urgent need to non-invasively characterize tumor hypoxia to improve GBM management. We hypothesized that (a) radiomic texture descriptors can capture tumor heterogeneity manifested as a result of molecular variations in tumor hypoxia on routine treatment-naïve MRI, and (b) these imaging-based texture surrogate markers of hypoxia can discriminate GBM patients as short-term (STS), mid-term (MTS), and long-term survivors (LTS). A total of 115 studies (33 STS, 41 MTS, 41 LTS) with gadolinium-enhanced T1-weighted (Gd-T1w), T2-weighted (T2w), and FLAIR MRI protocols and the corresponding RNA sequences were obtained. After expert segmentation of necrotic, enhancing, and edematous/nonenhancing tumor regions for every study, 30 radiomic texture descriptors were extracted from every region across every MRI protocol. Using the expression profile of 21 hypoxia-associated genes, a hypoxia enrichment score (HES) was obtained for the training cohort of 85 cases. A mutual information score was used, within 3-fold cross-validation, to identify the subset of radiomic features most informative of HES; these features were then used to categorize studies as STS, MTS, and LTS. When validated on an additional cohort of 30 studies (11 STS, 9 MTS, 10 LTS), our results revealed that the most discriminative features of HES were also able to distinguish STS from LTS (p = 0.003).
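The mutual-information ranking step mentioned above can be illustrated with a small sketch. The abstract does not say how mutual information was estimated, so this example assumes the simplest discrete estimator: each radiomic feature and the HES are binarized at their medians, and mutual information is computed from the resulting counts. The feature names and values are hypothetical.

```python
from collections import Counter
from math import log
from statistics import median

def binarize(values):
    """Assumed discretization: 1 if above the median, else 0."""
    m = median(values)
    return [1 if v > m else 0 for v in values]

def mutual_information(a, b):
    """Mutual information (in nats) between two discrete sequences:
    sum over (x, y) of p(x, y) * log(p(x, y) / (p(x) * p(y)))."""
    n = len(a)
    pa, pb, pab = Counter(a), Counter(b), Counter(zip(a, b))
    return sum((c / n) * log(c * n / (pa[x] * pb[y]))
               for (x, y), c in pab.items())

def rank_features_by_mi(features, hes, top_n):
    """Return the names of the top_n features most informative of HES."""
    hes_bin = binarize(hes)
    scores = {name: mutual_information(binarize(vals), hes_bin)
              for name, vals in features.items()}
    return sorted(scores, key=scores.get, reverse=True)[:top_n]

# Hypothetical toy cohort: texture_a tracks HES, texture_b does not.
hes = [0.1, 0.9, 0.2, 0.8, 0.3, 0.7, 0.15, 0.85]
features = {
    "texture_a": [1.0, 3.0, 1.2, 2.8, 1.1, 2.9, 0.9, 3.1],
    "texture_b": [1.0, 1.1, 3.0, 3.1, 0.9, 1.2, 2.9, 2.8],
}
top = rank_features_by_mi(features, hes, top_n=1)
```

In a cross-validated setting, this ranking would be recomputed on the training portion of each fold so that held-out cases do not inform the selection.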
Purpose: To evaluate the trustworthiness of saliency maps for abnormality localization in medical imaging.
Materials and Methods: Using two large publicly available radiology datasets (SIIM-ACR Pneumothorax Segmentation and RSNA Pneumonia Detection), we quantified the performance of eight commonly used saliency map techniques with regard to their 1) localization utility (segmentation and detection), 2) sensitivity to model weight randomization, 3) repeatability, and 4) reproducibility. We compared their performance against baseline methods and localization network architectures, using area under the precision-recall curve (AUPRC) and structural similarity index (SSIM) as metrics.
Results: All eight saliency map techniques failed at least one of the criteria and were inferior in performance to localization networks. For pneumothorax segmentation, the AUPRC ranged from 0.024 to 0.224, while a U-Net achieved a significantly superior AUPRC of 0.404 (p<0.005). For pneumonia detection, the AUPRC ranged from 0.160 to 0.519, while a RetinaNet achieved a significantly superior AUPRC of 0.596 (p<0.005). Five and two saliency methods (out of eight) failed the model randomization test on the segmentation and detection datasets, respectively, suggesting that these methods are not sensitive to changes in model parameters. The repeatability and reproducibility of the majority of the saliency methods were worse than those of localization networks for both the segmentation and detection datasets.
Conclusion: We suggest that the use of saliency maps in the high-risk domain of medical imaging warrants additional scrutiny and recommend that detection or segmentation models be used if localization is the desired output of the network.
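To make the localization metric above concrete, here is a minimal sketch of how an AUPRC could be computed for one saliency map against a ground-truth mask. It uses the step-wise "average precision" estimate of the area under the precision-recall curve and ignores tied scores; whether the study used this estimator or another interpolation is not stated, so treat that choice, and the toy arrays, as assumptions.

```python
def flatten(grid):
    """Flatten a 2D list (image or mask) into a 1D list of pixels."""
    return [v for row in grid for v in row]

def average_precision(scores, labels):
    """Step-wise estimate of the area under the precision-recall curve.
    Pixels are ranked by saliency; precision is accumulated at every rank
    where a true-positive pixel is recovered. Ties are not specially handled."""
    n_pos = sum(labels)
    if n_pos == 0:
        raise ValueError("mask contains no positive pixels")
    order = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    true_pos = 0
    total = 0.0
    for rank, i in enumerate(order, start=1):
        if labels[i]:
            true_pos += 1
            total += true_pos / rank  # precision at this recall step
    return total / n_pos

# Toy 2x2 example: saliency is highest exactly where the mask is positive.
saliency = [[0.9, 0.1],
            [0.8, 0.2]]
mask = [[1, 0],
        [1, 0]]
auprc = average_precision(flatten(saliency), flatten(mask))
```

AUPRC suits this setting because abnormal pixels are typically a small fraction of the image, and precision-recall summaries are less flattered by the large number of true-negative pixels than ROC-based ones.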
The growing teratoma syndrome should be defined not only as a growing mediastinal mass but also with secondary cardiopulmonary deterioration precluding safe completion of planned chemotherapy in the presence of declining serum tumor markers. Prompt recognition of this syndrome, discontinuation of chemotherapy, and surgical intervention can result in cure.
Saliency maps have become a widely used method to make deep learning models more interpretable by providing post-hoc explanations of classifiers through identification of the most pertinent areas of the input medical image. They are increasingly being used in medical imaging to provide clinically plausible explanations for the decisions the neural network makes. However, the utility and robustness of these visualization maps have not yet been rigorously examined in the context of medical imaging. We posit that trustworthiness in this context requires 1) localization utility, 2) sensitivity to model weight randomization, 3) repeatability, and 4) reproducibility. Using the localization information available in two large public radiology datasets, we quantify the performance of eight commonly used saliency map approaches for the above criteria using area under the precision-recall curve (AUPRC) and structural similarity index (SSIM), comparing their performance to various baseline measures. Using our framework to quantify the trustworthiness of saliency maps, we show that all eight saliency map techniques fail at least one of the criteria and are, in most cases, less trustworthy when compared to the baselines. We suggest that their usage in the high-risk domain of medical imaging warrants additional scrutiny and recommend that detection or segmentation models be used if localization is the desired output of the network.
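As an illustration of how SSIM can quantify the randomization, repeatability, and reproducibility criteria above, the sketch below compares two saliency maps with a simplified, single-window SSIM computed over the whole image. Standard SSIM averages the same statistic over local windows, so this global variant is an assumption made to keep the example short, as are the toy maps. The constants follow the usual convention of (0.01·L)² and (0.03·L)² for data range L.

```python
from statistics import mean

def global_ssim(x, y, data_range=1.0):
    """Single-window SSIM between two flattened maps of equal length.
    Simplification: statistics are taken over the whole image rather
    than averaged over local windows."""
    c1 = (0.01 * data_range) ** 2
    c2 = (0.03 * data_range) ** 2
    mx, my = mean(x), mean(y)
    var_x = mean((a - mx) ** 2 for a in x)
    var_y = mean((b - my) ** 2 for b in y)
    cov = mean((a - mx) * (b - my) for a, b in zip(x, y))
    return ((2 * mx * my + c1) * (2 * cov + c2)) / (
        (mx ** 2 + my ** 2 + c1) * (var_x + var_y + c2)
    )

# Toy maps: one from a trained model, one from a hypothetical model whose
# weights were randomized and whose saliency pattern is inverted.
map_trained = [0.0, 0.2, 0.8, 1.0]
map_randomized = [1.0, 0.8, 0.2, 0.0]

same = global_ssim(map_trained, map_trained)
changed = global_ssim(map_trained, map_randomized)
```

Under the randomization criterion, a saliency method that still yields a high SSIM after the model weights are randomized is not reflecting the learned parameters; for repeatability and reproducibility, a higher SSIM between maps from comparably trained models is the desirable outcome.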
Objective: We developed deep learning algorithms to automatically assess BI-RADS breast density.
Methods: Using a large multi-institution patient cohort of 108,230 digital screening mammograms from the Digital Mammographic Imaging Screening Trial, we investigated the effect of data, model, and training parameters on overall model performance and provided crowdsourcing evaluation from the attendees of the ACR 2019 Annual Meeting.