Machine learning (ML) is increasingly being used in image retrieval systems for medical decision making. One application of ML is to retrieve visually similar medical images from past patients (e.g. tissue from biopsies) to reference when making a medical decision with a new patient. However, no algorithm can perfectly capture an expert's ideal notion of similarity for every case: an image that is algorithmically determined to be similar may not be medically relevant to a doctor's specific diagnostic needs. In this paper, we identified the needs of pathologists when searching for similar images retrieved using a deep learning algorithm, and developed tools that empower users to cope with the search algorithm on-the-fly, communicating what types of similarity are most important at different moments in time. In two evaluations with pathologists, we found that these refinement tools increased the diagnostic utility of images found and increased user trust in the algorithm. The tools were preferred over a traditional interface, without a loss in diagnostic accuracy. We also observed that users adopted new strategies when using refinement tools, re-purposing them to test and understand the underlying algorithm and to disambiguate ML errors from their own errors. Taken together, these findings inform future human-ML collaborative systems for expert decision-making.
CCS CONCEPTS • Human-centered computing → Human computer interaction (HCI)
KEYWORDS Human-AI interaction; machine learning; clinical health
Figure 1: Medical images contain a wide range of clinical features, such as cellular (1) and glandular morphology (2), interaction between components (3), processing artifacts (4), and many more. It can be difficult for a similar-image search algorithm to perfectly capture an expert's notion of similarity,
Deriving interpretable prognostic features from deep-learning-based prognostic histopathology models remains a challenge. In this study, we developed a deep learning system (DLS) for predicting disease-specific survival for stage II and III colorectal cancer using 3652 cases (27,300 slides). When evaluated on two validation datasets containing 1239 cases (9340 slides) and 738 cases (7140 slides), respectively, the DLS achieved a 5-year disease-specific survival AUC of 0.70 (95% CI: 0.66–0.73) and 0.69 (95% CI: 0.64–0.72), and added significant predictive value to a set of nine clinicopathologic features. To interpret the DLS, we explored the ability of different human-interpretable features to explain the variance in DLS scores. We observed that clinicopathologic features such as T-category, N-category, and grade explained a small fraction of the variance in DLS scores (R² = 18% in both validation sets). Next, we generated human-interpretable histologic features by clustering embeddings from a deep-learning-based image-similarity model and showed that they explained the majority of the variance (R² of 73–80%). Furthermore, the clustering-derived feature most strongly associated with high DLS scores was also highly prognostic in isolation. With a distinct visual appearance (poorly differentiated tumor cell clusters adjacent to adipose tissue), this feature was identified by annotators with 87.0–95.5% accuracy. Our approach can be used to explain predictions from a prognostic deep learning model and uncover potentially novel prognostic features that can be reliably identified by people for future validation studies.
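The "fraction of variance explained" reported in the abstract above can be illustrated with a small sketch. This is not the study's pipeline: it assumes, purely for illustration, that each case has been reduced to a single dominant cluster label, and it computes the R² obtained when each case's score is predicted by the mean score of its cluster. The function name `r_squared_by_group` and the toy numbers are hypothetical.

```python
from collections import defaultdict
from typing import Sequence


def r_squared_by_group(scores: Sequence[float], groups: Sequence[int]) -> float:
    """R^2 when each score is predicted by the mean score of its group.

    Equivalent to regressing the score on one-hot group indicators.
    Assumption: one group (cluster) label per case, which is a
    simplification chosen for this sketch.
    """
    if len(scores) != len(groups) or not scores:
        raise ValueError("scores and groups must be non-empty and equal length")

    overall_mean = sum(scores) / len(scores)
    ss_total = sum((s - overall_mean) ** 2 for s in scores)
    if ss_total == 0.0:
        raise ValueError("scores have zero variance; R^2 is undefined")

    # Collect the scores that fall into each group.
    by_group = defaultdict(list)
    for score, group in zip(scores, groups):
        by_group[group].append(score)

    # Residual sum of squares around each group's own mean.
    ss_residual = 0.0
    for members in by_group.values():
        group_mean = sum(members) / len(members)
        ss_residual += sum((s - group_mean) ** 2 for s in members)

    return 1.0 - ss_residual / ss_total


# Toy data (made up): cluster labels that track the score closely...
toy_scores = [0.1, 0.2, 0.8, 0.9]
informative_r2 = r_squared_by_group(toy_scores, [0, 0, 1, 1])
# ...versus labels that are nearly unrelated to the score.
uninformative_r2 = r_squared_by_group(toy_scores, [0, 1, 0, 1])
```

In this toy setup the informative labels explain almost all of the variance and the uninformative labels almost none, which mirrors the contrast the abstract draws between feature sets with low and high R².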
The increasing availability of large institutional and public histopathology image datasets is enabling the searching of these datasets for diagnosis, research, and education. Although these datasets typically have associated metadata such as diagnosis or clinical notes, even carefully curated datasets rarely contain annotations of the location of regions of interest on each image. As pathology images are extremely large (up to 100,000 pixels in each dimension), further laborious visual search of each image may be needed to find the feature of interest. In this paper, we introduce a deep-learning-based reverse image search tool for histopathology images: Similar Medical Images Like Yours (SMILY). We assessed SMILY’s ability to retrieve search results in two ways: using pathologist-provided annotations, and via prospective studies where pathologists evaluated the quality of SMILY search results. As a negative control in the second evaluation, pathologists were blinded to whether search results were retrieved by SMILY or randomly. In both types of assessments, SMILY was able to retrieve search results with similar histologic features, organ site, and prostate cancer Gleason grade compared with the original query. SMILY may be a useful general-purpose tool in the pathologist’s arsenal, to improve the efficiency of searching large archives of histopathology images, without the need to develop and implement specific tools for each application.
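The abstract above does not describe how search results are retrieved. A common design for reverse image search, assumed here only for illustration, is to represent each image patch as an embedding vector and return the stored patches closest to the query embedding. The names `top_k_similar` and `toy_index`, the patch identifiers, and the choice of Euclidean distance are all hypothetical.

```python
import math
from typing import Dict, List, Sequence, Tuple


def top_k_similar(
    query: Sequence[float],
    index: Dict[str, Sequence[float]],
    k: int = 3,
) -> List[Tuple[str, float]]:
    """Return the k indexed patches closest to the query embedding.

    Assumption: embeddings are precomputed by some image model and
    Euclidean distance is an acceptable proxy for visual similarity.
    """
    if k < 1:
        raise ValueError("k must be at least 1")
    scored = [
        (patch_id, math.dist(query, embedding))
        for patch_id, embedding in index.items()
    ]
    # Smaller distance means more similar, so sort ascending.
    scored.sort(key=lambda item: item[1])
    return scored[:k]


# Toy index with made-up three-dimensional embeddings.
toy_index = {
    "patch_a": [1.0, 0.0, 0.0],
    "patch_b": [0.9, 0.1, 0.0],
    "patch_c": [0.0, 1.0, 0.0],
}
results = top_k_similar([1.0, 0.0, 0.0], toy_index, k=2)
```

A brute-force scan like this is fine for a toy index; an archive of the scale the abstract describes would more plausibly need an approximate nearest-neighbor structure, which is beyond this sketch.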
Histologic grading of breast cancer involves review and scoring of three well-established morphologic features: mitotic count, nuclear pleomorphism, and tubule formation. Taken together, these features form the basis of the Nottingham Grading System, which is used to inform breast cancer characterization and prognosis. In this study, we develop deep learning models to perform histologic scoring of all three components using digitized hematoxylin and eosin-stained slides containing invasive breast carcinoma. We first evaluate model performance using pathologist-based reference standards for each component. To complement this typical approach to evaluation, we further evaluate the deep learning models via prognostic analyses. The individual component models perform at or above published benchmarks for algorithm-based grading approaches, achieving high concordance rates with pathologist grading. Further, prognostic performance using deep learning-based grading is on par with that of pathologists performing review of matched slides. By providing scores for each component feature, the deep learning-based approach also provides the potential to identify the grading components contributing most to prognostic value. This may enable optimized prognostic models, opportunities to improve access to consistent grading, and approaches to better understand the links between histologic features and clinical outcomes in breast cancer.
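For readers unfamiliar with how the three component scores combine, the sketch below applies the standard Nottingham rule: each component is scored 1 to 3, and the sum maps to grade 1 (total 3 to 5), grade 2 (total 6 to 7), or grade 3 (total 8 to 9). This rule comes from the grading system itself, not from the abstract, and the function name `nottingham_grade` is hypothetical. The sketch does not reproduce the study's deep learning models, which produce the component scores.

```python
def nottingham_grade(
    mitotic_count_score: int,
    nuclear_pleomorphism_score: int,
    tubule_formation_score: int,
) -> int:
    """Combine three component scores (each 1-3) into an overall grade (1-3)."""
    components = (
        mitotic_count_score,
        nuclear_pleomorphism_score,
        tubule_formation_score,
    )
    for score in components:
        if score not in (1, 2, 3):
            raise ValueError("each component score must be 1, 2, or 3")

    total = sum(components)  # ranges from 3 to 9
    if total <= 5:
        return 1
    if total <= 7:
        return 2
    return 3


# Example: moderate mitotic count and pleomorphism, poor tubule formation.
example_grade = nottingham_grade(2, 2, 3)
```

Keeping the component scores separate until this final step is what makes it possible, as the abstract notes, to ask which component contributes most to prognostic value.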