Nuclear segmentation in digital microscopic tissue images can enable extraction of high-quality features for nuclear morphometrics and other analyses in computational pathology. Conventional image processing techniques, such as Otsu thresholding and watershed segmentation, do not work effectively on challenging cases, such as chromatin-sparse and crowded nuclei. In contrast, machine learning-based segmentation can generalize across various nuclear appearances. However, training machine learning algorithms requires data sets of images in which a vast number of nuclei have been annotated. Publicly accessible and annotated data sets, along with widely agreed-upon metrics to compare techniques, have catalyzed tremendous innovation and progress on other image classification problems, particularly in object recognition. Inspired by their success, we introduce a large publicly accessible data set of hematoxylin and eosin (H&E)-stained tissue images with more than 21,000 painstakingly annotated nuclear boundaries, whose quality was validated by a medical doctor. Because our data set is taken from multiple hospitals and includes a diversity of nuclear appearances from several patients, disease states, and organs, techniques trained on it are likely to generalize well and work out of the box on other H&E-stained images. We also propose a new metric to evaluate nuclear segmentation results that penalizes object- and pixel-level errors in a unified manner, unlike previous metrics that penalize only one type of error. Finally, we propose a segmentation technique based on deep learning that places special emphasis on identifying the nuclear boundaries, including those between touching or overlapping nuclei, and works well on a diverse set of test images.
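The abstract does not spell out how the proposed metric is computed. As a rough illustration of how object-level and pixel-level errors can be penalized in one number, the sketch below implements an aggregated Jaccard-style score over labeled masks. The function name `aggregated_jaccard`, the label-map convention (0 for background, positive integers for nuclei), and the matching rule are assumptions made for this example and may differ from the published definition.

```python
import numpy as np

def aggregated_jaccard(gt, pred):
    """Aggregated Jaccard-style score for two 2-D integer label maps.

    Assumed convention: 0 = background, k > 0 = pixels of nucleus k.
    Each ground-truth nucleus is matched to the predicted nucleus with the
    highest IoU; intersections and unions are summed over all matches.
    Missed nuclei and unmatched predictions only enlarge the union, so
    object-level and pixel-level errors both lower the score.
    """
    used = set()
    inter_sum = 0
    union_sum = 0
    for i in np.unique(gt):
        if i == 0:
            continue
        g = gt == i
        # Defaults describe a missed nucleus: no intersection, union = its area.
        best_iou, best_j, best_inter, best_union = 0.0, None, 0, int(g.sum())
        # Only predictions that overlap this nucleus can have non-zero IoU.
        for j in np.unique(pred[g]):
            if j == 0:
                continue
            p = pred == j
            inter = int(np.logical_and(g, p).sum())
            union = int(np.logical_or(g, p).sum())
            if inter / union > best_iou:
                best_iou, best_j = inter / union, int(j)
                best_inter, best_union = inter, union
        inter_sum += best_inter
        union_sum += best_union
        if best_j is not None:
            used.add(best_j)
    # Predicted nuclei never chosen as a best match count as false positives.
    for j in np.unique(pred):
        if j != 0 and int(j) not in used:
            union_sum += int((pred == j).sum())
    return inter_sum / union_sum if union_sum else 1.0
```

Under these assumptions a perfect prediction scores 1.0, while a missed nucleus or a spurious detection lowers the score even when every detected nucleus is outlined exactly.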
Staining and scanning of tissue samples for microscopic examination are fraught with undesirable color variations arising from differences in raw materials and manufacturing techniques of stain vendors, staining protocols of labs, and color responses of digital scanners. When comparing tissue samples, color normalization and stain separation of the tissue images can be helpful for both pathologists and software. Techniques that are used for natural images fail to utilize structural properties of stained tissue samples and produce undesirable color distortions. The stain concentration cannot be negative. Tissue samples are stained with only a few stains, and most tissue regions are characterized by at most one effective stain. We model these physical phenomena that define the tissue structure by first decomposing images in an unsupervised manner into stain density maps that are sparse and non-negative. For a given image, we combine its stain density maps with the stain color basis of a pathologist-preferred target image, thus altering only its color while preserving its structure described by the maps. Stain density correlation with ground truth and preference by pathologists were higher for images normalized using our method when compared to other alternatives. We also propose a computationally faster extension of this technique for large whole-slide images that selects an appropriate patch sample instead of using the entire image to compute the stain color basis.
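The recombination step described above (keep an image's stain density maps, swap in the target's stain color basis) can be sketched as follows. This is a minimal illustration, not the authors' implementation: it assumes the stain color bases are already known, works in Beer-Lambert optical density space, and replaces the unsupervised sparse non-negative decomposition with a plain least-squares fit clipped at zero. All function names and the `background` intensity of 255 are assumptions made for the example.

```python
import numpy as np

def rgb_to_od(rgb, background=255.0):
    """Convert RGB intensities in [0, 255] to optical density (Beer-Lambert)."""
    rgb = np.maximum(np.asarray(rgb, dtype=np.float64), 1.0)  # avoid log(0)
    return -np.log(rgb / background)

def od_to_rgb(od, background=255.0):
    """Inverse of rgb_to_od."""
    return background * np.exp(-od)

def stain_densities(od_pixels, basis):
    """Estimate density maps for a known stain basis.

    od_pixels: (n_pixels, 3) optical densities.
    basis:     (n_stains, 3), one stain color vector per row.
    Simplification: ordinary least squares followed by clipping to >= 0,
    standing in for the sparse non-negative factorization in the text.
    """
    dens, *_ = np.linalg.lstsq(basis.T, od_pixels.T, rcond=None)
    return np.clip(dens.T, 0.0, None)

def recolor(rgb, source_basis, target_basis):
    """Keep the image's stain densities but render them with the target colors."""
    h, w, _ = rgb.shape
    od = rgb_to_od(rgb).reshape(-1, 3)
    dens = stain_densities(od, source_basis)
    out = od_to_rgb(dens @ target_basis)
    return out.reshape(h, w, 3), dens.reshape(h, w, -1)
```

Because only the color basis changes, the density maps returned by `recolor` are the same for the input and output images, which is the sense in which structure is preserved while color is altered.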
Context: Color normalization techniques for histology have not been empirically tested for their utility in computational pathology pipelines.
Aims: We compared two contemporary techniques for achieving a common intermediate goal: epithelial-stromal classification.
Settings and Design: Expert-annotated regions of epithelium and stroma were treated as ground truth for comparing classifiers on original and color-normalized images.
Materials and Methods: Epithelial and stromal regions were annotated on thirty diverse-appearing H&E-stained prostate cancer tissue microarray cores. Corresponding sets of thirty images each were generated using the two color normalization techniques. Color metrics were compared for original and color-normalized images. Separate epithelial-stromal classifiers were trained and compared on test images. Main analyses were conducted using a multiresolution segmentation (MRS) approach; comparative analyses using two other classification approaches (convolutional neural network [CNN], Wndchrm) were also performed.
Statistical Analysis: For the main MRS method, which relied on classification of super-pixels, the number of variables used was reduced using backward elimination without compromising accuracy, and test areas under the curve (AUCs) were compared for original and normalized images. For CNN and Wndchrm, pixel classification test AUCs were compared.
Results: The Khan method reduced color saturation, while the Vahadane method reduced hue variance. Super-pixel-level test AUC for MRS was 0.010–0.025 (95% confidence interval limits ± 0.004) higher for the two normalized image sets compared to the original in the 10–80 variable range. Improvement in pixel classification accuracy was also observed for CNN and Wndchrm for color-normalized images.
Conclusions: Color normalization can give a small incremental benefit when a super-pixel-based classification method is used with features that perform implicit color normalization, while the gain is higher for patch-based classification methods for classifying epithelium versus stroma.
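For readers unfamiliar with the test AUC comparison used above, the sketch below computes an AUC directly from classifier scores as the fraction of positive/negative pairs ranked correctly, with ties counted as half. The function name `auc` and the example scores are hypothetical and are not taken from the study; the study's own software and confidence-interval procedure are not described in the abstract.

```python
import numpy as np

def auc(scores, labels):
    """Area under the ROC curve via pairwise comparison.

    scores: classifier outputs, higher means "more likely positive".
    labels: 1 for the positive class (e.g. epithelium), 0 for the negative.
    Quadratic in the number of samples, which is fine for a small sketch.
    """
    scores = np.asarray(scores, dtype=float)
    labels = np.asarray(labels, dtype=bool)
    pos = scores[labels]
    neg = scores[~labels]
    diff = pos[:, None] - neg[None, :]
    # Correctly ordered pairs count 1, tied pairs count 0.5.
    return float(((diff > 0).sum() + 0.5 * (diff == 0).sum()) / diff.size)

# Hypothetical scores for the same four test samples, before and after
# color normalization; the difference is the kind of gain reported above.
labels = [0, 0, 1, 1]
auc_original = auc([0.1, 0.4, 0.35, 0.8], labels)
auc_normalized = auc([0.1, 0.3, 0.35, 0.8], labels)
gain = auc_normalized - auc_original
```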