Variations in the shape and appearance of anatomical structures in medical images are often relevant radiological signs of disease, and automatic tools can help quantify these variations, which are otherwise assessed manually. This paper presents a cloud-based evaluation framework, including results of benchmarking current state-of-the-art medical imaging algorithms for anatomical structure segmentation and landmark detection: the VISCERAL Anatomy benchmarks. Participants implement their algorithms in virtual machines in the cloud, where they can access only the training data; the benchmark administrators then run the algorithms privately on an unseen common test set, so that performance can be compared objectively. Overall, 120 computed tomography and magnetic resonance patient volumes were manually annotated to create a standard Gold Corpus containing a total of 1295 structures and 1760 landmarks. Ten participants contributed automatic algorithms for the organ segmentation task, and three for the landmark localization task. Different algorithms obtained the best scores in the four available imaging modalities and for subsets of anatomical structures. The annotation framework, resulting data set, evaluation setup, results and performance analysis from the three VISCERAL Anatomy benchmarks are presented in this article. Both the VISCERAL data set and the Silver Corpus, generated by fusing the outputs of the participant algorithms on a larger set of non-manually-annotated medical images, are available to the research community.
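Segmentation benchmarks such as these typically score submissions by overlap with the expert annotation, most commonly the Dice coefficient. As an illustration only (the abstract does not specify the exact evaluation metrics used), a minimal Dice computation over binary masks might look like this:

```python
import numpy as np

def dice_coefficient(seg_a, seg_b):
    """Dice overlap between two binary segmentation masks."""
    seg_a = seg_a.astype(bool)
    seg_b = seg_b.astype(bool)
    intersection = np.logical_and(seg_a, seg_b).sum()
    total = seg_a.sum() + seg_b.sum()
    # Two empty masks agree perfectly by convention
    return 2.0 * intersection / total if total > 0 else 1.0

# Toy example: two overlapping 2D masks (stand-ins for 3D volumes)
a = np.zeros((4, 4), dtype=int)
b = np.zeros((4, 4), dtype=int)
a[1:3, 1:3] = 1   # 4 voxels
b[1:3, 1:4] = 1   # 6 voxels, 4 of them overlapping with a
print(dice_coefficient(a, b))  # 2*4 / (4+6) = 0.8
```

The same function applies unchanged to 3D NumPy arrays, which is how volumetric CT or MR segmentations would be compared.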
Increasingly large medical imaging data sets are becoming available for research and are analysed by a range of algorithms that segment anatomical structures automatically and interactively. While these algorithms provide segmentations at a much larger scale than expert annotators can achieve, they are typically less accurate than experts. We present and compare approaches to estimate segmentations on large imaging data sets based on a small number of expert-annotated examples and on algorithmic segmentations of a much larger data set. Results demonstrate that combining algorithmic segmentations reliably outperforms the average individual algorithm. Furthermore, injecting organ-specific reliability assessments of algorithms, derived from expert annotations, improves accuracy compared to standard label fusion algorithms. The proposed methods are particularly relevant for putting the results of large image analysis algorithm benchmarks to long-term use.
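The idea of weighting each algorithm by an organ-specific reliability estimate before fusing can be sketched as a weighted voting scheme. This is a simplified illustration, not the chapter's actual method; the weights here are hypothetical and would in practice come from, e.g., each algorithm's Dice score against expert annotations on a small held-out set:

```python
import numpy as np

def weighted_vote_fusion(segmentations, weights):
    """Fuse binary segmentations by per-algorithm weighted voting.

    segmentations: list of binary masks of the same shape
    weights: per-algorithm reliability estimates (assumed known)
    """
    stack = np.stack([s.astype(float) * w
                      for s, w in zip(segmentations, weights)])
    # Fraction of total reliability mass voting "foreground" per voxel
    score = stack.sum(axis=0) / sum(weights)
    return (score >= 0.5).astype(int)

# Three hypothetical algorithm outputs along one row of voxels
segs = [np.array([1, 1, 0, 0]),
        np.array([1, 0, 1, 0]),
        np.array([1, 1, 0, 1])]
weights = [0.9, 0.5, 0.7]   # assumed reliabilities for this organ
print(weighted_vote_fusion(segs, weights))  # [1 1 0 0]
```

With equal weights this reduces to plain majority voting; unequal weights let a demonstrably reliable algorithm overrule several weaker ones, which is the intuition behind reliability-informed fusion.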
This chapter describes the annotation of the medical image data used in the VISCERAL project. Annotating regions in 3D images is nontrivial, and tools had to be chosen that limit the manual work and make semi-automated annotation available. For this purpose, several tools available free of charge or at limited cost were tested and compared. Based on this detailed analysis, the GeoS tool was chosen for the annotation, allowing for efficient and effective annotations; 3D Slicer was chosen to complement it for smaller structures with low contrast. A detailed quality control process was also put in place, including an automatic tool that assigns organs and volumes to specific annotators and then compares the results. This made it possible to judge the confidence in specific annotators and to iteratively refine the annotation instructions, limiting the subjectivity of the task as much as possible. For several structures some subjectivity remains, and it was measured via double annotations of those structures; this measure provides a baseline for judging the quality of automatic segmentations.
In the VISCERAL project, several Gold Corpus datasets containing medical imaging data and corresponding manual expert annotations have been created. These datasets were used for training and evaluating participant algorithms in the VISCERAL Benchmarks. In addition to the Gold Corpus datasets, the VISCERAL architecture enables the creation of Silver Corpus annotations of far larger datasets, generated by the collective ensemble of submitted algorithms. In this chapter, the three Gold Corpus datasets created for the VISCERAL Anatomy, Detection and Retrieval Benchmarks are described. Additionally, we present two datasets that were created as a result of the Anatomy and Retrieval Benchmarks.