Provenance is information describing the lineage of an object, such as a dataset or biological material. Since these objects can be passed between organizations, each organization can document only part of an object's life cycle. As a result, interconnecting the distributed provenance parts forms distributed provenance chains. Depending on the actual provenance content, complete provenance chains can provide traceability and contribute to the reproducibility and FAIRness of research objects. In this paper, we define a lightweight provenance model based on W3C PROV that enables the generation of distributed provenance chains in complex, multi-organizational environments. The application of the model is demonstrated with a use case spanning several steps of a real-world research pipeline: starting with the acquisition of a specimen, its processing and storage, histological examination, and the generation and collection of associated data (images, annotations, clinical data), and ending with training an AI model for tumor detection in the images. The proposed model has become the open conceptual foundation of the currently developed ISO 23494 standard on provenance for the biotechnology domain.
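The abstract's core idea, reconstructing a complete lineage by following derivation links across organizational boundaries, can be sketched in a few lines. This is a rough, hypothetical illustration only, not the W3C PROV data model or the ISO 23494 model; the names `Bundle`, `trace`, and all entity identifiers are invented for the sketch.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional, Tuple

@dataclass
class Bundle:
    """One organization's locally documented provenance part (hypothetical
    shape): maps each entity it records to the identifier of the entity it
    was derived from (None marks the origin of the chain)."""
    org: str
    derivations: Dict[str, Optional[str]]

def trace(bundles: List[Bundle], entity_id: str) -> List[Tuple[str, str]]:
    """Walk backwards across organizational bundles, reconstructing the
    distributed provenance chain for one entity."""
    lineage = []
    current: Optional[str] = entity_id
    while current is not None:
        # Find the bundle (organization) that documented this entity.
        bundle = next(b for b in bundles if current in b.derivations)
        lineage.append((bundle.org, current))
        current = bundle.derivations[current]
    return lineage
```

Given bundles from, say, a hospital (specimen), a biobank (slide derived from the specimen), and an AI lab (image derived from the slide), `trace` would return the full chain even though no single organization holds it.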
The diagnosis of solid tumors of epithelial origin (carcinomas) represents a major part of the workload in clinical histopathology. Carcinomas consist of malignant epithelial cells arranged in more or less cohesive clusters of variable size and shape, together with stromal cells, extracellular matrix, and blood vessels. Distinguishing stroma from epithelium is a critical component of artificial intelligence (AI) methods developed to detect and analyze carcinomas. In this paper, we propose a novel automated workflow that enables large-scale guidance of AI methods to identify the epithelial component. The workflow is based on re-staining existing hematoxylin and eosin (H&E) formalin-fixed paraffin-embedded sections by immunohistochemistry for cytokeratins, cytoskeletal components specific to epithelial cells. In contrast to existing methods, clinically available H&E sections are reused, and no additional material, such as consecutive slides, is needed. We developed a simple and reliable method for automatic alignment that generates masks denoting cytokeratin-rich regions, using the positions of cell nuclei visible in both the original and the re-stained slide. We compared the registration method to state-of-the-art methods for the alignment of consecutive slides and show that, despite being simpler, it provides similar accuracy and is more robust. We also demonstrate how the automatically generated masks can be used to train a modern AI image-segmentation model based on U-Net, resulting in reliable detection of epithelial regions in previously unseen H&E slides. Because it trains on real-world material available in clinical laboratories, this approach has widespread applications toward achieving AI-assisted tumor assessment directly from scanned H&E sections. In addition, the re-staining method will facilitate further automated quantitative studies of tumor-cell and stromal-cell phenotypes.
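The abstract does not give the registration algorithm in detail, but the core step of aligning two slides from matched landmark positions is a standard least-squares problem. As a minimal sketch (not the paper's method), the rigid transform between nuclei centroids can be estimated with the Kabsch algorithm; the sketch assumes the correspondence between nuclei in the two slides has already been established, which is itself a nontrivial step.

```python
import numpy as np

def rigid_align(src, dst):
    """Least-squares rigid alignment (Kabsch algorithm) of matched 2-D
    point sets: returns R, t such that dst ~= src @ R.T + t."""
    src = np.asarray(src, dtype=float)
    dst = np.asarray(dst, dtype=float)
    mu_s, mu_d = src.mean(axis=0), dst.mean(axis=0)
    H = (src - mu_s).T @ (dst - mu_d)        # 2x2 cross-covariance matrix
    U, _, Vt = np.linalg.svd(H)
    d = np.sign(np.linalg.det(Vt.T @ U.T))   # guard against reflections
    R = Vt.T @ np.diag([1.0, d]) @ U.T       # optimal rotation
    t = mu_d - R @ mu_s                      # optimal translation
    return R, t
```

A landmark-based rigid fit like this stays well-posed even when the stain appearance changes completely between the H&E and re-stained scans, since it depends only on nuclei positions, not on image intensities.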
Diagnostic histopathology is facing increasing demands due to aging populations and expanding healthcare programs. Semi-automated diagnostic systems employing deep learning methods are one approach to alleviating this pressure, with promising results for many routine diagnostic procedures. However, one major issue with deep learning approaches is their lack of interpretability: after adequate training they perform their assigned tasks admirably, but do not explain how they reach their conclusions. Knowing how a given method achieves high sensitivity and specificity would help identify the key features responsible for a diagnosis and should, in turn, allow fine-tuning of deep learning approaches. This paper presents a deep learning-based system for carcinoma detection in whole slide images of prostate core biopsies, achieving state-of-the-art performance: 100% area under the curve and a sensitivity of 0.978 at an average of 8 detected false positives per slide. Furthermore, we investigated various methods to extract the key features used by the neural network for classification. Of these, the technique called occlusion, adapted to whole slide images, analyzes the sensitivity of the detection system to changes in the input images. It produces heatmaps indicating which parts of the image have the strongest impact on the system's output, which a histopathologist can examine to identify the network's reasoning behind a given classification. Reassuringly, the heatmaps identified several prevailing histomorphological features characteristic of carcinoma, e.g., single-layered epithelium, presence of small lumina, and hyperchromatic nuclei with halos. A convincing finding was the recognition of their mimickers in non-neoplastic tissue. The results show that the neural network's approach to recognizing prostatic cancer is similar to that taken by a human pathologist at medium optical resolution.
The use of explainability heatmaps adds value to automated digital pathology by enabling deep learning systems to be analyzed and fine-tuned, and it improves trust in computer-based decisions.
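The occlusion technique mentioned above has a simple generic form: mask one patch of the input at a time and record how much the model's score drops. The following is a minimal sketch under that generic definition, not the paper's whole-slide adaptation; `predict` is a placeholder for any scoring function, and a real WSI implementation would work tile-by-tile at gigapixel scale.

```python
import numpy as np

def occlusion_heatmap(image, predict, patch=8, stride=8, fill=0.0):
    """Occlusion sensitivity: slide a masking patch over the image and
    record how much the model's score drops at each position; large drops
    mark regions the model relies on for its decision."""
    h, w = image.shape[:2]
    base = predict(image)                     # score for the intact image
    rows = (h - patch) // stride + 1
    cols = (w - patch) // stride + 1
    heat = np.zeros((rows, cols))
    for i, y in enumerate(range(0, h - patch + 1, stride)):
        for j, x in enumerate(range(0, w - patch + 1, stride)):
            occluded = image.copy()
            occluded[y:y + patch, x:x + patch] = fill
            heat[i, j] = base - predict(occluded)
    return heat
```

Upsampling `heat` back to the image resolution gives the kind of overlay a histopathologist can inspect; regions with the largest score drop are the ones driving the classification.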