George Teodoro scite author profile

Real-Time Three-Dimensional Cell Segmentation in Large-Scale Microscopy Data of Developing Embryos

We present the Real-time Accurate Cell-shape Extractor (RACE), a high-throughput image analysis framework for automated three-dimensional cell segmentation in large-scale images. RACE is 55-330 times faster and 2-5 times more accurate than state-of-the-art methods. We demonstrate the generality of RACE by extracting cell-shape information from entire Drosophila, zebrafish, and mouse embryos imaged with confocal and light-sheet microscopes. Using RACE, we automatically reconstructed cellular-resolution tissue anisotropy maps across developing Drosophila embryos and quantified differences in cell-shape dynamics in wild-type and mutant embryos. We furthermore integrated RACE with our framework for automated cell lineaging and performed joint segmentation and cell tracking in entire Drosophila embryos. RACE processed these terabyte-sized datasets on a single computer within 1.4 days. RACE is easy to use, as it requires adjustment of only three parameters, takes full advantage of state-of-the-art multi-core processors and graphics cards, and is available as open-source software for Windows, Linux, and Mac OS.

Machine-Based Morphologic Analysis of Glioblastoma Using Whole-Slide Pathology Images Uncovers Clinically Relevant Molecular Correlates

Kong et al. 2013

Pathologic review of tumor morphology in histologic sections is the traditional method for cancer classification and grading, yet human review has limitations that can result in low reproducibility and inter-observer agreement. Computerized image analysis can partially overcome these shortcomings due to its capacity to quantitatively and reproducibly measure histologic structures on a large scale. In this paper, we present an end-to-end image analysis and data integration pipeline for large-scale morphologic analysis of pathology images and demonstrate the ability to correlate phenotypic groups with molecular data and clinical outcomes. We demonstrate our method in the context of glioblastoma (GBM), with specific focus on the degree of the oligodendroglioma component. Over 200 million nuclei in digitized pathology slides from 117 GBMs in the Cancer Genome Atlas were quantitatively analyzed, followed by multiplatform correlation of nuclear features with molecular and clinical data. For each nucleus, a Nuclear Score (NS) was calculated based on the degree of oligodendroglioma appearance, using a regression model trained from the optimal feature set. Using the frequencies of neoplastic nuclei in low and high NS intervals, we were able to cluster patients into three well-separated disease groups that contained low, medium, or high Oligodendroglioma Component (OC). We showed that machine-based classification of GBMs with high oligodendroglioma component uncovered a set of tumors with strong associations with PDGFRA amplification, proneural transcriptional class, and expression of the oligodendrocyte signature genes MBP, HOXD1, PLP1, MOBP and PDGFRA. Quantitative morphologic features within the GBMs that correlated most strongly with oligodendrocyte gene expression were high nuclear circularity and low eccentricity. These findings highlight the potential of high-throughput morphologic analysis to complement and inform human-based pathologic review.
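
The abstract names nuclear circularity and eccentricity but does not define them. As a rough illustration only, the sketch below uses two common textbook conventions, which are assumptions here and may differ from the paper's feature definitions: circularity as 4πA/P² (1 for a perfect circle) and eccentricity of a fitted ellipse as sqrt(1 - (b/a)²) (0 for a circle). The function names are hypothetical.

```python
import math


def circularity(area, perimeter):
    """Assumed definition: 4*pi*A / P**2, equal to 1.0 for a perfect circle."""
    return 4.0 * math.pi * area / perimeter ** 2


def eccentricity(semi_major, semi_minor):
    """Assumed definition: eccentricity of an ellipse fitted to the nucleus."""
    return math.sqrt(1.0 - (semi_minor / semi_major) ** 2)


# A round nucleus (radius 3) versus an elongated one (semi-axes 5 and 3).
round_circ = circularity(math.pi * 3 ** 2, 2 * math.pi * 3)
round_ecc = eccentricity(3.0, 3.0)
elongated_ecc = eccentricity(5.0, 3.0)
```

Under these assumed definitions, a round nucleus scores high on circularity and low on eccentricity, which is the combination the abstract associates with oligodendrocyte gene expression.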

Comparative Performance Analysis of Intel (R) Xeon Phi (TM), GPU, and CPU: A Case Study from Microscopy Image Analysis

Teodoro, Kurç, Kong et al. 2014

We study and characterize the performance of operations in an important class of applications on GPUs and Many Integrated Core (MIC) architectures. Our work is motivated by applications that analyze low-dimensional spatial datasets captured by high resolution sensors, such as image datasets obtained from whole slide tissue specimens using microscopy scanners. Common operations in these applications involve the detection and extraction of objects (object segmentation), the computation of features of each extracted object (feature computation), and characterization of objects based on these features (object classification). In this work, we identify the data access and computation patterns of operations in the object segmentation and feature computation categories. We systematically implement and evaluate the performance of these operations on modern CPUs, GPUs, and MIC systems for a microscopy image analysis application. Our results show that the performance on a MIC of operations that perform regular data access is comparable to, and sometimes better than, that on a GPU. On the other hand, GPUs are significantly more efficient than MICs for operations that access data irregularly. This is a result of the low performance of MICs when it comes to random data access. We have also examined the coordinated use of MICs and CPUs. Our experiments show that using a performance-aware task strategy for scheduling application operations improves performance by about 1.29× over a first-come-first-served strategy. This allows applications to obtain high performance efficiency on CPU-MIC systems: the example application attained an efficiency of 84% on 192 nodes (3072 CPU cores and 192 MICs).

Efficient irregular wavefront propagation algorithms on hybrid CPU–GPU machines

et al. 2013

We address the problem of efficient execution of a computation pattern, referred to here as the irregular wavefront propagation pattern (IWPP), on hybrid systems with multiple CPUs and GPUs. The IWPP is common in several image processing operations. In the IWPP, data elements in the wavefront propagate waves to their neighboring elements on a grid if a propagation condition is satisfied. Elements receiving the propagated waves become part of the wavefront. This pattern results in irregular data accesses and computations. We develop and evaluate strategies for efficient computation and propagation of wavefronts using a multi-level queue structure. This queue structure improves the utilization of fast memories in a GPU and reduces synchronization overheads. We also develop a tile-based parallelization strategy to support execution on multiple CPUs and GPUs. We evaluate our approaches on a state-of-the-art GPU-accelerated machine (equipped with 3 GPUs and 2 multicore CPUs) using the IWPP implementations of two widely used image processing operations: morphological reconstruction and Euclidean distance transform. Our results show significant performance improvements on GPUs. The use of multiple CPUs and GPUs cooperatively attains speedups of 50× and 85× with respect to single-core CPU executions for morphological reconstruction and Euclidean distance transform, respectively.
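
To make the wavefront pattern concrete, here is a minimal sequential sketch of grayscale morphological reconstruction written as a queue-driven propagation. It is not the paper's implementation: it uses a single FIFO queue on one CPU core rather than the multi-level GPU queue, and the 4-connected neighborhood, the all-pixels initial wavefront, and the function name are assumptions made for illustration.

```python
from collections import deque


def morph_reconstruct(marker, mask):
    """Queue-based grayscale reconstruction of `mask` from `marker` (2D lists)."""
    rows, cols = len(mask), len(mask[0])
    # Clamp the marker under the mask so the result never exceeds the mask.
    out = [[min(m, k) for m, k in zip(mrow, krow)]
           for mrow, krow in zip(marker, mask)]
    # Initial wavefront: every pixel (a simple, conservative choice).
    queue = deque((r, c) for r in range(rows) for c in range(cols))
    while queue:
        r, c = queue.popleft()
        for dr, dc in ((-1, 0), (1, 0), (0, -1), (0, 1)):
            nr, nc = r + dr, c + dc
            if 0 <= nr < rows and 0 <= nc < cols:
                # Propagation condition: the neighbor can be raised by the
                # current element without exceeding the mask at the neighbor.
                candidate = min(out[r][c], mask[nr][nc])
                if candidate > out[nr][nc]:
                    out[nr][nc] = candidate
                    queue.append((nr, nc))  # neighbor joins the wavefront
    return out


mask = [[5, 5, 0, 7],
        [5, 5, 0, 7]]
marker = [[5, 0, 0, 0],
          [0, 0, 0, 0]]
result = morph_reconstruct(marker, mask)
```

In this toy input the seed value spreads through the connected region of 5s but cannot cross the column of zeros, so the 7s on the right stay at 0. Which elements enter the queue depends on the data, which is the source of the irregular accesses the abstract describes.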

Coordinating the use of GPU and CPU for improving performance of compute intensive applications

Teodoro, Sachetto, Sertel et al. 2009

Accelerating Large Scale Image Analyses on Parallel, CPU-GPU Equipped Systems

Teodoro, Kurç, Pan et al. 2012

The past decade has witnessed a major paradigm shift in high performance computing with the introduction of accelerators as general purpose processors. These computing devices make available very high parallel computing power at low cost and power consumption, transforming current high performance platforms into heterogeneous CPU-GPU equipped systems. Although the theoretical performance achieved by these hybrid systems is impressive, taking practical advantage of this computing power remains a very challenging problem. Most applications are still deployed to either GPU or CPU, leaving the other resource under- or un-utilized. In this paper, we propose, implement, and evaluate a performance aware scheduling technique along with optimizations to make efficient collaborative use of CPUs and GPUs on a parallel system. In the context of feature computations in large scale image analysis applications, our evaluations show that intelligently co-scheduling CPUs and GPUs can significantly improve performance over GPU-only or multi-core CPU-only approaches.
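
The abstract does not spell out the scheduling rule, so the following is only a toy simulation of one plausible reading: each task carries an estimated CPU and GPU run time, an idle GPU takes the pending task with the highest estimated GPU speedup, and an idle CPU takes the one with the lowest. The task times, the one-CPU-one-GPU setup, and all function names are hypothetical.

```python
def simulate(tasks, policy):
    """Return the makespan of running `tasks` on one CPU and one GPU."""
    pending = list(tasks)
    free_at = {"cpu": 0.0, "gpu": 0.0}  # time at which each device is idle
    while pending:
        # The device that becomes idle first picks next (ties go to the CPU).
        device = min(free_at, key=free_at.get)
        task = policy(pending, device)
        pending.remove(task)
        free_at[device] += task[device]
    return max(free_at.values())


def fcfs(pending, device):
    """First-come-first-served: take the oldest pending task."""
    return pending[0]


def performance_aware(pending, device):
    """Assumed rule: GPU takes the highest speedup, CPU the lowest."""
    def speedup(task):
        return task["cpu"] / task["gpu"]
    if device == "gpu":
        return max(pending, key=speedup)
    return min(pending, key=speedup)


# Hypothetical run-time estimates (arbitrary time units).
tasks = [
    {"cpu": 8, "gpu": 1},
    {"cpu": 4, "gpu": 2},
    {"cpu": 8, "gpu": 1},
    {"cpu": 4, "gpu": 2},
]
fcfs_makespan = simulate(tasks, fcfs)
aware_makespan = simulate(tasks, performance_aware)
```

On this toy input the first-come-first-served run finishes at 8 time units because the CPU is handed a task that the GPU would run eight times faster, while the performance-aware run finishes at 4. The numbers only illustrate the idea and say nothing about the gains reported in the paper.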

CUDAlign 4.0: Incremental Speculative Traceback for Exact Chromosome-Wide Alignment in GPU Clusters

Sandes, Miranda, Martorell et al. 2016

IEEE Trans. Parallel Distrib. Syst.

This paper proposes and evaluates CUDAlign 4.0, a parallel strategy to obtain the optimal alignment of huge DNA sequences in multi-GPU platforms, using the exact Smith-Waterman (SW) algorithm. In the first phase of CUDAlign 4.0, a huge Dynamic Programming (DP) matrix is computed by multiple GPUs, which asynchronously communicate border elements to the right neighbor in order to find the optimal score. After that, the traceback phase of SW is executed. The efficient parallelization of the traceback phase is very challenging because of the high amount of data dependency, which particularly impacts the performance and limits the application scalability. In order to obtain a multi-GPU highly parallel traceback phase, we propose and evaluate a new parallel traceback algorithm called Incremental Speculative Traceback (IST), which pipelines the traceback phase, speculating incrementally over the values calculated so far, producing results in advance. With CUDAlign 4.0, we were able to calculate SW matrices with up to 60 Peta cells, obtaining the optimal local alignments of all Human and Chimpanzee homologous chromosomes, whose sizes range from 26 million base pairs (MBP) up to 249 MBP. As far as we know, this is the first time such a comparison was made with the SW exact method. We also show that the IST algorithm is able to reduce the traceback time by a factor of 2.15× up to 21.03×, when compared with the baseline traceback algorithm. The human × chimpanzee chromosome 5 comparison (180 MBP × 183 MBP) attained 10,370.00 GCUPS (Billions of Cells Updated per Second) using 384 GPUs, with a speculation hit ratio of 98.2%.
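
For readers unfamiliar with the first phase, the sketch below computes the optimal Smith-Waterman local alignment score on a single CPU thread. It is a simplification and not CUDAlign's method: it uses a linear gap penalty, illustrative scoring values, and no traceback, all of which are assumptions. It keeps only the previous row of the DP matrix, which hints at why the score can be found without storing the full matrix.

```python
def sw_best_score(a, b, match=2, mismatch=-1, gap=-2):
    """Best local alignment score of strings a and b (linear gap penalty)."""
    prev = [0] * (len(b) + 1)  # DP row for the previous character of a
    best = 0
    for ca in a:
        curr = [0]  # column 0 is always 0 for local alignment
        for j, cb in enumerate(b, start=1):
            diag = prev[j - 1] + (match if ca == cb else mismatch)
            up = prev[j] + gap        # gap in b
            left = curr[j - 1] + gap  # gap in a
            score = max(0, diag, up, left)  # 0 restarts the alignment
            curr.append(score)
            best = max(best, score)
        prev = curr
    return best


identical = sw_best_score("ACGT", "ACGT")
disjoint = sw_best_score("AAAA", "TTTT")
shared_core = sw_best_score("TTACGTT", "GGACGGG")
```

Each cell depends on its left, upper, and upper-left neighbors, so cells on the same anti-diagonal are independent. That dependency structure is what allows the matrix to be split into column blocks across devices, with border values passed to the right-hand neighbor as the abstract describes.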

Anthill: A Scalable Run-Time Environment for Data Mining Applications

Ferreira, Meira, Guedes et al.
