This paper presents a deep learning approach for automatic detection and visual analysis of invasive ductal carcinoma (IDC) tissue regions in whole slide images (WSI) of breast cancer (BCa). Deep learning (DL) approaches are learn-from-data methods involving computational modeling of the learning process. This approach is similar to how the human brain works, using different interpretation levels or layers of the most representative and useful features, resulting in a hierarchical learned representation. These methods have been shown to outperform traditional approaches on the most challenging problems in several areas, such as speech recognition and object detection. Invasive breast cancer detection is a time-consuming and challenging task, primarily because it involves a pathologist scanning large swathes of benign regions to ultimately identify the areas of malignancy. Precise delineation of IDC in WSI is crucial to the subsequent grading of tumor aggressiveness and prediction of patient outcome. DL approaches are particularly adept at handling these types of problems, especially if a large number of samples are available for training, which would also ensure the generalizability of the learned features and classifier. The DL framework in this paper extends a number of convolutional neural networks (CNN) for visual semantic analysis of tumor regions for diagnosis support. The CNN is trained over a large number of image patches (tissue regions) from WSI to learn a hierarchical part-based representation. The method was evaluated over a WSI dataset from 162 patients diagnosed with IDC; 113 slides were selected for training and 49 slides were held out for independent testing. Ground truth for quantitative evaluation was provided via delineation of the region of cancer by an expert pathologist on the digitized slides. The experimental evaluation was designed to measure classifier accuracy in detecting IDC tissue regions in WSI.
Our method yielded the best quantitative results for automatic detection of IDC regions in WSI in terms of F-measure and balanced accuracy (71.80%, 84.23%), compared with an approach that combines handcrafted image features (color, texture and edges, nuclear texture and architecture) with a Random Forest classifier for invasive tumor classification. The best performing handcrafted features were fuzzy color histogram (67.53%, 78.74%) and RGB histogram (66.64%, 77.24%). Our results also suggest that at least some of the tissue classification mistakes (false positives and false negatives) were due less to any fundamental problem with the approach than to the inherent limitations in obtaining a highly granular annotation of the diseased area of interest by an expert pathologist.
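As a minimal sketch of the two metrics reported above, the following Python computes an F-measure and a balanced accuracy from patch-level confusion counts. It assumes the F-measure is the usual F1 score (harmonic mean of precision and recall) and that balanced accuracy is the mean of sensitivity and specificity; the abstract does not state the exact definitions, and the function names and counts are illustrative only.

```python
def f_measure(tp, fp, fn):
    """F1 score: harmonic mean of precision and recall (assumed definition)."""
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


def balanced_accuracy(tp, fp, tn, fn):
    """Mean of sensitivity and specificity (assumed definition)."""
    sensitivity = tp / (tp + fn) if (tp + fn) else 0.0
    specificity = tn / (tn + fp) if (tn + fp) else 0.0
    return (sensitivity + specificity) / 2


if __name__ == "__main__":
    # Illustrative counts, not taken from the paper.
    print(f_measure(tp=80, fp=20, fn=20))
    print(balanced_accuracy(tp=80, fp=20, tn=880, fn=20))
```

Balanced accuracy is useful here because benign patches typically far outnumber IDC patches, so plain accuracy would be dominated by the negative class.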
With the increasing ability to routinely and rapidly digitize whole slide images with slide scanners, there has been interest in developing computerized image analysis algorithms for automated detection of disease extent from digital pathology images. The manual identification of presence and extent of breast cancer by a pathologist is critical for patient management for tumor staging and assessing treatment response. However, this process is tedious and subject to inter- and intra-reader variability. For computerized methods to be useful as decision support tools, they need to be resilient to data acquired from different sources, different staining and cutting protocols, and different scanners. The objective of this study was to evaluate the accuracy and robustness of a deep learning-based method to automatically identify the extent of invasive tumor on digitized images. Here, we present a new method that employs a convolutional neural network for detecting the presence of invasive tumor on whole slide images. Our approach involves training the classifier on nearly 400 exemplars from multiple sites and scanners, and then independently validating on almost 200 cases from The Cancer Genome Atlas. Our approach yielded a Dice coefficient of 75.86%, a positive predictive value of 71.62% and a negative predictive value of 96.77% in terms of pixel-by-pixel evaluation compared to manually annotated regions of invasive ductal carcinoma.
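The pixel-by-pixel evaluation described above can be illustrated with a short Python sketch that compares a predicted binary mask with a manually annotated one and derives the Dice coefficient, positive predictive value, and negative predictive value. The nested-list mask representation and the function name `pixel_metrics` are assumptions made for illustration; the paper's actual evaluation code is not given.

```python
def pixel_metrics(pred, truth):
    """Compare two equally sized binary masks (lists of rows of 0/1)."""
    tp = fp = tn = fn = 0
    for pred_row, truth_row in zip(pred, truth):
        for p, t in zip(pred_row, truth_row):
            if p and t:
                tp += 1      # tumor predicted, tumor annotated
            elif p:
                fp += 1      # tumor predicted, not annotated
            elif t:
                fn += 1      # tumor missed
            else:
                tn += 1      # background agreed

    def ratio(num, den):
        # Undefined ratios (empty denominator) are reported as None.
        return num / den if den else None

    return {
        "dice": ratio(2 * tp, 2 * tp + fp + fn),
        "ppv": ratio(tp, tp + fp),
        "npv": ratio(tn, tn + fn),
    }


if __name__ == "__main__":
    # Toy 3x3 masks, not real data.
    pred = [[1, 1, 0], [0, 1, 0], [0, 0, 0]]
    truth = [[1, 0, 0], [0, 1, 1], [0, 0, 0]]
    print(pixel_metrics(pred, truth))
```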
The identification of phenotypic changes in breast cancer (BC) histopathology on account of corresponding molecular changes is of significant clinical importance in predicting disease outcome. One such example is the presence of lymphocytic infiltration (LI) in histopathology, which has been correlated with nodal metastasis and distant recurrence in HER2+ BC patients. In this paper, we present a computer-aided diagnosis (CADx) scheme to automatically detect and grade the extent of LI in digitized HER2+ BC histopathology. Lymphocytes are first automatically detected by a combination of region growing and Markov random field algorithms. Using the centers of individual detected lymphocytes as vertices, three graphs (Voronoi diagram, Delaunay triangulation, and minimum spanning tree) are constructed and a total of 50 image-derived features describing the arrangement of the lymphocytes are extracted from each sample. A nonlinear dimensionality reduction scheme, graph embedding (GE), is then used to project the high-dimensional feature vector into a reduced 3-D embedding space. A support vector machine classifier is used to discriminate samples with high and low LI in the reduced dimensional embedding space. A total of 41 HER2+ hematoxylin-and-eosin-stained images obtained from 12 patients were considered in this study. For more than 100 three-fold cross-validation trials, the architectural feature set successfully distinguished samples of high and low LI levels with a classification accuracy greater than 90%. The popular unsupervised Varma-Zisserman texton-based classification scheme was used for comparison and yielded a classification accuracy of only 60%. Additionally, the projection of the 50 image-derived features for all 41 tissue samples into a reduced dimensional space via GE allowed for the visualization of a smooth manifold that revealed a continuum between low, intermediate, and high levels of LI. 
Since it is known that extent of LI in BC biopsy specimens is a prognostic indicator, our CADx scheme will potentially help clinicians determine disease outcome and allow them to make better therapy recommendations for patients with HER2+ BC.
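To make the graph-based architectural features above more concrete, here is a minimal Python sketch that builds a minimum spanning tree over lymphocyte centers and summarizes its edge lengths. It covers only the minimum spanning tree, not the Voronoi diagram or Delaunay triangulation, and the three statistics shown are hypothetical examples; the abstract does not list the 50 features actually used.

```python
import math
import statistics


def mst_edge_lengths(points):
    """Edge lengths of the Euclidean minimum spanning tree (Prim's algorithm)."""
    n = len(points)
    if n < 2:
        return []
    in_tree = [False] * n
    best = [math.inf] * n   # cheapest known connection of each point to the tree
    best[0] = 0.0           # start the tree at the first point
    lengths = []
    for step in range(n):
        u = min((i for i in range(n) if not in_tree[i]), key=lambda i: best[i])
        in_tree[u] = True
        if step > 0:        # the root is not attached by an edge
            lengths.append(best[u])
        for v in range(n):
            if not in_tree[v]:
                d = math.dist(points[u], points[v])
                if d < best[v]:
                    best[v] = d
    return lengths


def mst_features(points):
    """Hypothetical architectural descriptors derived from MST edge lengths."""
    lengths = mst_edge_lengths(points)
    if not lengths:
        raise ValueError("need at least two points")
    return {
        "mean_edge": statistics.fmean(lengths),
        "std_edge": statistics.pstdev(lengths),
        "min_max_ratio": min(lengths) / max(lengths),
    }


if __name__ == "__main__":
    # Toy centroid coordinates in pixels, not real detections.
    centers = [(0, 0), (1, 0), (1, 1), (0, 1), (4, 0)]
    print(mst_features(centers))
```

Densely clustered lymphocytes give short, uniform edges, while sparse infiltration gives longer and more variable edges, which is the intuition behind using such statistics to separate high and low LI.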
Breast cancer (BCa) grading plays an important role in predicting disease aggressiveness and patient outcome. A key component of BCa grade is the mitotic count, which involves quantifying the number of cells in the process of dividing (i.e., undergoing mitosis) at a specific point in time. Currently, mitosis counting is done manually by a pathologist looking at multiple high power fields (HPFs) on a glass slide under a microscope, an extremely laborious and time consuming process. The development of computerized systems for automated detection of mitotic nuclei, while highly desirable, is confounded by the highly variable shape and appearance of mitoses. Existing methods use either handcrafted features that capture certain morphological, statistical, or textural attributes of mitoses or features learned with convolutional neural networks (CNN). Although handcrafted features are inspired by the domain and the particular application, the data-driven CNN models tend to be domain agnostic and attempt to learn additional feature bases that cannot be represented through any of the handcrafted features. On the other hand, CNN is computationally more complex and needs a large number of labeled training instances. Since handcrafted features attempt to model domain pertinent attributes and CNN approaches are largely supervised feature generation methods, there is an appeal in attempting to combine these two distinct classes of feature generation strategies to create an integrated set of attributes that can potentially outperform either class of feature extraction strategies individually. We present a cascaded approach for mitosis detection that intelligently combines a CNN model and handcrafted features (morphology, color, and texture features). 
By employing a light CNN model, the proposed approach is far less demanding computationally, and the cascaded strategy of combining handcrafted features and CNN-derived features makes it possible to maximize performance by leveraging the disconnected feature sets. Evaluation on the public ICPR12 mitosis dataset that has 226 mitoses annotated on 35 HPFs ([Formula: see text] magnification) by several pathologists and 15 testing HPFs yielded an F-measure of 0.7345. Our approach is accurate, fast, and requires fewer computing resources compared to existing methods, making it feasible for clinical use.
Modified Bloom–Richardson (mBR) grading is known to have prognostic value in breast cancer (BCa), yet its use in clinical practice has been limited by intra- and interobserver variability. The development of a computerized system to distinguish mBR grade from entire estrogen receptor-positive (ER+) BCa histopathology slides will help clinicians identify grading discrepancies and improve overall confidence in the diagnostic result. In this paper, we isolate salient image features characterizing tumor morphology and texture to differentiate entire hematoxylin and eosin (H and E) stained histopathology slides based on mBR grade. The features are used in conjunction with a novel multifield-of-view (multi-FOV) classifier, a whole-slide classifier that extracts features from a multitude of FOVs of varying sizes, to identify important image features at different FOV sizes. Image features utilized include those related to the spatial arrangement of cancer nuclei (i.e., nuclear architecture) and the textural patterns within nuclei (i.e., nuclear texture). Using slides from 126 ER+ patients (46 low, 60 intermediate, and 20 high mBR grade), our grading system was able to distinguish low versus high, low versus intermediate, and intermediate versus high grade patients with area under curve values of 0.93, 0.72, and 0.74, respectively. Our results suggest that the multi-FOV classifier is able to 1) successfully discriminate low, intermediate, and high mBR grade and 2) identify specific image features at different FOV sizes that are important for distinguishing mBR grade in H and E stained ER+ BCa histology slides.
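The area under curve values quoted above summarize how well classifier scores separate two grade groups. As a hedged illustration, the Python sketch below computes the area under the ROC curve through its rank interpretation: the probability that a randomly chosen positive case scores higher than a randomly chosen negative case, with ties counted as one half. The function name and scores are hypothetical, and the abstract does not say how the paper computed its values.

```python
def auc(pos_scores, neg_scores):
    """Area under the ROC curve via pairwise comparison of scores."""
    if not pos_scores or not neg_scores:
        raise ValueError("both groups must be non-empty")
    wins = 0.0
    for p in pos_scores:
        for n in neg_scores:
            if p > n:
                wins += 1.0   # positive ranked above negative
            elif p == n:
                wins += 0.5   # ties count as half
    return wins / (len(pos_scores) * len(neg_scores))


if __name__ == "__main__":
    # Toy classifier scores, e.g. high grade (positive) vs low grade (negative).
    high_grade = [0.9, 0.8, 0.4]
    low_grade = [0.7, 0.3, 0.2]
    print(auc(high_grade, low_grade))
```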
Digital histopathology slides have many sources of variance, and while pathologists typically do not struggle with them, computer aided diagnostic algorithms can perform erratically. This manuscript presents Stain Normalization using Sparse AutoEncoders (StaNoSA) for use in standardizing the color distributions of a test image to that of a single template image. We show how sparse autoencoders can be leveraged to partition images into tissue sub-types, so that color standardization for each can be performed independently. StaNoSA was validated on three experiments and compared against five other color standardization approaches and shown to have either comparable or superior results.
The presence of lymphocytic infiltration (LI) has been correlated with nodal metastasis and tumor recurrence in HER2+ breast cancer (BC). The ability to automatically detect and quantify extent of LI on histopathology imagery could potentially result in the development of an image based prognostic tool for human epidermal growth factor receptor-2 (HER2+) BC patients. Lymphocyte segmentation in hematoxylin and eosin (H&E) stained BC histopathology images is complicated by the similarity in appearance between lymphocyte nuclei and other structures (e.g., cancer nuclei) in the image. Additional challenges include biological variability, histological artifacts, and high prevalence of overlapping objects. Although active contours are widely employed in image segmentation, they are limited in their ability to segment overlapping objects and are sensitive to initialization. In this paper, we present a new segmentation scheme, expectation-maximization (EM) driven geodesic active contour with overlap resolution (EMaGACOR), which we apply to automatically detecting and segmenting lymphocytes on HER2+ BC histopathology images. EMaGACOR utilizes the expectation-maximization algorithm for automatically initializing a geodesic active contour (GAC) and includes a novel scheme based on heuristic splitting of contours via identification of high concavity points for resolving overlapping structures. EMaGACOR was evaluated on a total of 100 HER2+ breast biopsy histology images and was found to have a detection sensitivity of over 86% and a positive predictive value of over 64%. By comparison, the EMaGAC model (without overlap resolution) and GAC model yielded corresponding detection sensitivities of 42% and 19%, respectively. Furthermore, EMaGACOR was able to correctly resolve over 90% of overlaps between intersecting lymphocytes. 
Hausdorff distance (HD) and mean absolute distance (MAD) for EMaGACOR were found to be 2.1 and 0.9 pixels, respectively, and significantly better compared to the corresponding performance of the EMaGAC and GAC models. EMaGACOR is an efficient, robust, reproducible, and accurate segmentation technique that could potentially be applied to other biomedical image analysis problems.
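The two boundary-error measures reported above can be sketched in Python as follows, with each contour given as a list of (x, y) pixel coordinates. The Hausdorff distance is the standard symmetric form. The mean absolute distance is assumed here to be the average distance from each point of the segmented contour to its nearest ground-truth contour point, which is one common convention; the abstract does not give the exact definition.

```python
import math


def _nearest(point, contour):
    """Distance from a point to the closest point of a contour."""
    return min(math.dist(point, q) for q in contour)


def hausdorff_distance(a, b):
    """Symmetric Hausdorff distance between two point sets."""
    d_ab = max(_nearest(p, b) for p in a)
    d_ba = max(_nearest(q, a) for q in b)
    return max(d_ab, d_ba)


def mean_absolute_distance(segmented, reference):
    """Mean nearest-point distance from segmented to reference (assumed form)."""
    return sum(_nearest(p, reference) for p in segmented) / len(segmented)


if __name__ == "__main__":
    # Toy contours in pixel coordinates, not real segmentations.
    seg = [(0, 0), (1, 0), (2, 0)]
    ref = [(0, 1), (1, 1), (2, 3)]
    print(hausdorff_distance(seg, ref))
    print(mean_absolute_distance(seg, ref))
```

The Hausdorff distance reflects the single worst boundary disagreement, while the mean absolute distance reflects typical disagreement, so reporting both gives a fuller picture of segmentation quality.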
Precise detection of invasive cancer on whole-slide images (WSI) is a critical first step in digital pathology tasks of diagnosis and grading. Convolutional neural networks (CNNs) are the most popular representation learning method for computer vision tasks and have been successfully applied in digital pathology, including tumor and mitosis detection. However, CNNs are typically only tenable with relatively small image sizes (200 × 200 pixels). Only recently have fully convolutional networks (FCN) been able to deal with larger image sizes (500 × 500 pixels) for semantic segmentation. Hence, the direct application of CNNs to WSI is not computationally feasible because, for a WSI, a CNN would require billions or trillions of parameters. To alleviate this issue, this paper presents a novel method, High-throughput Adaptive Sampling for whole-slide Histopathology Image analysis (HASHI), which involves: i) a new efficient adaptive sampling method based on probability gradient and quasi-Monte Carlo sampling, and ii) a powerful representation learning classifier based on CNNs. We applied HASHI to automated detection of invasive breast cancer on WSI. HASHI was trained and validated using three different data cohorts involving nearly 500 cases and then independently tested on 195 studies from The Cancer Genome Atlas. The results show that (1) the adaptive sampling method is an effective strategy to deal with WSI without compromising prediction accuracy, obtaining results comparable to dense sampling (∼6 million samples in 24 hours) with far fewer samples (∼2,000 samples in 1 minute), and (2) on an independent test dataset, HASHI is effective and robust to data from multiple sites, scanners, and platforms, achieving an average Dice coefficient of 76%.
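As an illustration of the quasi-Monte Carlo part of the sampling strategy described above, the Python sketch below generates low-discrepancy tile locations over a slide using a Halton sequence. The choice of a Halton sequence with bases 2 and 3 is an assumption (the abstract names only "quasi-Monte Carlo sampling"), the function names are hypothetical, and the probability-gradient refinement step is not shown.

```python
def halton(index, base):
    """Radical inverse of a positive integer index in the given base, in [0, 1)."""
    result, fraction = 0.0, 1.0
    while index > 0:
        fraction /= base
        result += fraction * (index % base)
        index //= base
    return result


def halton_points(n, width, height):
    """First n 2-D Halton points (bases 2 and 3) scaled to slide dimensions."""
    return [
        (halton(i, 2) * width, halton(i, 3) * height)
        for i in range(1, n + 1)
    ]


if __name__ == "__main__":
    # Hypothetical slide size in pixels; each point would be a tile center to classify.
    for x, y in halton_points(5, width=100_000, height=80_000):
        print(round(x), round(y))
```

Compared with uniform random sampling, a low-discrepancy sequence covers the slide more evenly for the same number of samples, which is why a few thousand well-placed tiles can approximate a dense scan.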