Motivation
While deep-learning algorithms have demonstrated outstanding performance in semantic image segmentation tasks, large annotation datasets are needed to create accurate models. Annotation of histology images is challenging due to the effort and experience required to carefully delineate tissue structures, and difficulties related to sharing and markup of whole-slide images.
Results
We recruited 25 participants, ranging in experience from senior pathologists to medical students, to delineate tissue regions in 151 breast cancer slides using the Digital Slide Archive. Inter-participant discordance was systematically evaluated, revealing low discordance for tumor and stroma, and higher discordance for more subjectively defined or rare tissue classes. Feedback provided by senior participants enabled the generation and curation of 20 000+ annotated tissue regions. Fully convolutional networks trained using these annotations were highly accurate (mean AUC=0.945), and the scale of annotation data provided notable improvements in image classification accuracy.
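The abstract does not say how inter-participant discordance was measured. As a minimal sketch, assuming a per-class Dice overlap between two annotators' label masks (a common choice, not necessarily the authors' metric), discordance for one tissue class could be computed as below. The function name `dice_discordance` and the toy masks are hypothetical.

```python
def dice_discordance(mask_a, mask_b, label):
    """Return 1 - Dice overlap for one class between two equally sized label masks.

    Assumption: masks are 2D lists of class labels, one per pixel.
    """
    flat_a = [p for row in mask_a for p in row]
    flat_b = [p for row in mask_b for p in row]
    if len(flat_a) != len(flat_b):
        raise ValueError("masks must have the same number of pixels")
    count_a = sum(1 for p in flat_a if p == label)
    count_b = sum(1 for p in flat_b if p == label)
    both = sum(1 for p, q in zip(flat_a, flat_b) if p == label and q == label)
    if count_a + count_b == 0:
        return 0.0  # neither annotator used this class: treat as full agreement
    return 1.0 - 2.0 * both / (count_a + count_b)


# Toy 2x2 masks from two hypothetical annotators (illustrative only)
mask_a = [["tumor", "tumor"], ["stroma", "stroma"]]
mask_b = [["tumor", "stroma"], ["stroma", "stroma"]]

for tissue in ("tumor", "stroma"):
    print(tissue, dice_discordance(mask_a, mask_b, tissue))
```

Averaging this value over all annotator pairs and regions would give one discordance figure per tissue class, which is the kind of comparison the abstract reports for tumor, stroma, and rarer classes.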
Availability and Implementation
The dataset is freely available at: https://goo.gl/cNM4EL.
Supplementary information
Supplementary data are available at Bioinformatics online.
Background: Renal cell carcinoma (RCC) is a common malignancy in the United States, and its incidence and mortality have changed over time for many reasons. We provide a thorough investigation of RCC incidence and mortality trends in the US using the Surveillance, Epidemiology, and End Results (SEER) database. Methods: The SEER 13 registries were accessed for RCC cases diagnosed between 1992 and 2015. Incidence and mortality rates were calculated by demographic and tumor characteristics, and annual percent changes (APC) of these rates were computed. Rates are expressed per 100,000 person-years. Results: A total of 104,584 RCC cases were reviewed, with 47,561 deaths. The overall incidence was 11.281 per 100,000 person-years. Incidence increased by 2.421% per year (95% CI, 2.096 to 2.747; P < .001) but has been stable since 2008. However, the incidence of the clear-cell subtype continued to increase (1.449%; 95% CI, 0.216 to 2.697; P = .024). Overall RCC mortality rates have been declining since 2001. However, mortality associated with distant RCC only started to decrease in 2012, with an APC of −18.270% (95% CI, −28.775 to −6.215; P = .006).
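The abstract does not state how the APC values were estimated. Under the usual definition, a log-linear model ln(rate) = a + b·year is fitted and APC = 100·(e^b − 1). The sketch below assumes an ordinary least-squares fit over a single time segment; the function name and the synthetic series are illustrative and are not SEER data.

```python
import math


def annual_percent_change(years, rates):
    """Estimate APC from a least-squares fit of ln(rate) = a + b * year.

    Returns 100 * (exp(b) - 1). Assumes all rates are positive.
    """
    if len(years) != len(rates) or len(years) < 2:
        raise ValueError("need at least two (year, rate) pairs of equal length")
    logs = [math.log(r) for r in rates]
    n = len(years)
    mean_x = sum(years) / n
    mean_y = sum(logs) / n
    # Centered sums keep the arithmetic stable for calendar-year values
    sxx = sum((x - mean_x) ** 2 for x in years)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(years, logs))
    slope = sxy / sxx
    return 100.0 * (math.exp(slope) - 1.0)


# Synthetic rates growing by exactly 2% per year (illustrative values only)
years = list(range(2000, 2010))
rates = [10.0 * 1.02 ** (y - 2000) for y in years]
print(round(annual_percent_change(years, rates), 3))
```

A trend that changes direction, such as incidence rising and then stabilizing after 2008, would need a separate fit for each segment; choosing the breakpoints is outside the scope of this sketch.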
Background
Deep learning enables accurate high-resolution mapping of cells and tissue structures that can serve as the foundation of interpretable machine-learning models for computational pathology. However, generating adequate labels for these structures is a critical barrier, given the time and effort required from pathologists.
Results
This article describes a novel collaborative framework for engaging crowds of medical students and pathologists to produce quality labels for cell nuclei. We used this approach to produce the NuCLS dataset, containing >220,000 annotations of cell nuclei in breast cancers. This builds on prior work labeling tissue regions to produce an integrated tissue region- and cell-level annotation dataset for training that is the largest such resource for multi-scale analysis of breast cancer histology. This article presents data and analysis results for single and multi-rater annotations from both non-experts and pathologists. We present a novel workflow that uses algorithmic suggestions to collect accurate segmentation data without the need for laborious manual tracing of nuclei. Our results indicate that even noisy algorithmic suggestions do not adversely affect pathologist accuracy and can help non-experts improve annotation quality. We also present a new approach for inferring truth from multiple raters and show that non-experts can produce accurate annotations for visually distinctive classes.
Conclusions
This study is the most extensive systematic exploration of the large-scale use of wisdom-of-the-crowd approaches to generate data for computational pathology applications.