Stain normalization is an important preprocessing task for computer-aided diagnosis (CAD) systems in modern digital pathology. This task reduces the color and intensity variations present in stained images from different laboratories. Consequently, stain normalization typically increases the prediction accuracy of CAD systems. However, the normalization step must overcome computational challenges, especially for real-time applications: the memory and run-time bottlenecks associated with processing high-resolution images, e.g., at 40X magnification. Moreover, stain normalization can be sensitive to the quality of the input images, e.g., when they contain stain spots or dirt; in such cases, the algorithm may fail to accurately estimate the stain vectors. We present a high-performance system for stain normalization using a state-of-the-art unsupervised method based on stain-vector estimation. Using a highly optimized normalization engine, our architecture enables high-speed and large-scale processing of high-resolution whole-slide images. This optimized engine integrates an automated thresholding technique to determine the useful pixels and uses a novel pixel-sampling method that significantly reduces the processing time of the normalization algorithm. We demonstrate the performance of our architecture using measurements from images of different sizes and scanner formats belonging to four different datasets. The results show that our optimizations achieve up to a 58x speedup compared to a baseline implementation. We also demonstrate the scalability of our system by showing that the processing time scales almost linearly with the number of tissue pixels in the image. Furthermore, we show that the output of the normalization algorithm can be adversely affected when the input images include artifacts. To address this issue, we enhance the stain normalization pipeline with a parameter cross-checking technique that automatically detects distortion of the algorithm's critical parameters. To assess the robustness of the proposed method, we employ a machine learning (ML) pipeline that classifies images for the detection of prostate cancer. The results show that the enhanced normalization algorithm increases the classification accuracy of the ML pipeline in the presence of poor-quality input images. For an exemplary ML pipeline, our method increases the accuracy on an unseen dataset from 0.79 to 0.87.
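The two optimizations the abstract highlights, thresholding in optical-density space to isolate useful tissue pixels and sampling a pixel subset before stain-vector estimation, can be sketched compactly. The snippet below is a minimal illustration only, assuming a Macenko-style eigenvector approach to unsupervised stain-vector estimation; it does not reproduce the paper's optimized engine, and the function name, threshold, sample size, and percentile values are all illustrative assumptions.

```python
# Illustrative sketch only: assumes a Macenko-style approach; all names
# and parameter values here are hypothetical, not the paper's engine.
import numpy as np

def estimate_stain_vectors(rgb, od_threshold=0.15, n_samples=100_000, seed=0):
    """Estimate two stain vectors from an RGB image (H x W x 3, uint8)."""
    # Convert to optical density (OD); +1 avoids log(0) on saturated pixels.
    od = -np.log((rgb.astype(np.float64) + 1.0) / 256.0).reshape(-1, 3)

    # Automated-thresholding stand-in: keep only "useful" tissue pixels,
    # i.e., those whose OD magnitude exceeds a background threshold.
    tissue = od[np.linalg.norm(od, axis=1) > od_threshold]

    # Pixel sampling: estimate on a random subset instead of every pixel,
    # which bounds the cost of the estimation step on whole-slide images.
    rng = np.random.default_rng(seed)
    if len(tissue) > n_samples:
        tissue = tissue[rng.choice(len(tissue), n_samples, replace=False)]

    # Macenko-style estimation: project onto the plane of the two largest
    # eigenvectors of the OD covariance and take robust extreme angles
    # as the two stain directions.
    _, eigvecs = np.linalg.eigh(np.cov(tissue.T))
    plane = eigvecs[:, [2, 1]]                  # 3 x 2, two largest
    angles = np.arctan2(tissue @ plane[:, 1], tissue @ plane[:, 0])
    lo, hi = np.percentile(angles, [1, 99])
    stains = np.stack([plane @ np.array([np.cos(a), np.sin(a)])
                       for a in (lo, hi)])
    return stains / np.linalg.norm(stains, axis=1, keepdims=True)
```

Capping the number of sampled pixels keeps the cost of the estimation step roughly constant regardless of image resolution, which is consistent with the near-linear scaling in tissue pixels reported above.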
Simulators and empirical profiling data are often used to understand how suitable a specific hardware architecture is for an application. However, simulators can be slow, and empirical profiling-based methods can only provide insights about the existing hardware on which the applications are executed. While the insights obtained in this way are valuable, they cannot be used to evaluate a large number of system designs efficiently. In this paper, we introduce an alternative to these approaches that is hardware-agnostic and enables fast design-space exploration. We propose a framework based on the LLVM compiler infrastructure that is capable of analyzing the inherent instruction-level parallelism and memory access patterns in sequential and parallel applications. The analysis is performed per thread or process during application execution. To illustrate the potential of the framework, we provide a detailed characterization of a representative benchmark for graph-based analytics, Graph 500. In addition to showcasing our tool's capabilities, our study complements previous Graph 500 characterization work by providing an analysis of the intrinsic properties of various implementations of the benchmark. Finally, we show how analytical models can leverage the software properties measured by our framework to enable fast system design-space exploration.
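As a conceptual illustration of what "inherent instruction-level parallelism" means in such an analysis, consider a limit-study-style calculation over a dynamic instruction trace: each instruction is scheduled one step after its last producer, ignoring resource limits, and the ILP is the ratio of trace length to the dependence critical path. The sketch below is a hypothetical stand-in for that idea, not the paper's LLVM-based implementation; the trace format and function name are assumptions.

```python
# Hypothetical illustration of the "inherent ILP" idea: replay a dynamic
# instruction trace, schedule each instruction one step after its last
# producer, and compare trace length to the dependence critical path.
# The actual framework is LLVM-based and far richer; the trace format and
# names here are assumptions for illustration only.

def inherent_ilp(trace):
    """trace: list of (dest_reg, [src_regs]) tuples in execution order."""
    ready = {}          # register -> cycle at which its value is available
    critical_path = 0
    for dest, srcs in trace:
        # An instruction can issue once all its source operands are ready,
        # ignoring resource limits and memory latency (a limit study).
        cycle = 1 + max((ready.get(s, 0) for s in srcs), default=0)
        ready[dest] = cycle
        critical_path = max(critical_path, cycle)
    return len(trace) / critical_path if critical_path else 0.0

# Example: two independent chains of two dependent instructions each.
trace = [("r1", []), ("r2", ["r1"]), ("r3", []), ("r4", ["r3"])]
print(inherent_ilp(trace))  # 4 instructions / 2-cycle critical path = 2.0
```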