Perhaad Mistry scite author profile

Optical Quadrature Microscopy (OQM) is a process which uses phase data to capture information about the sample being studied. OQM is part of an imaging framework developed by the Optical Science Laboratory at Northeastern University. In one particular application of interest, the framework is used to extract phase information from the image of an embryo to determine embryo viability.

Phase Unwrapping is the process of reconstructing the real phase shift (propagation delay) of a sample from the measured "wrapped" representation, which lies between −π and +π. Unwrapping can be done using the Minimum Lp Norm Phase Unwrap algorithm. Images are first preprocessed using an Affine Transform before they are unwrapped. Both of these steps are time consuming and would benefit greatly from parallelization and acceleration. Faster processing would lower many research barriers (in terms of throughput and performance) present when using OQM.

In this paper we report on accelerating Phase Unwrapping and Affine Transformations using NVIDIA's CUDA programming model. We also run elementary noise removal on the GPU using NVIDIA's CUBLAS (CUDA Basic Linear Algebra Subprograms) library. We integrate GPU execution into a Matlab environment to seamlessly interface to the pre-existing image acquisition system. By mapping the unwrap and noise removal to a GPU, and by also reducing the amount of I/O overhead, we are able to accelerate the end-to-end
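To make the idea of a "wrapped" phase concrete, here is a minimal one-dimensional sketch in Python. It is not the Minimum Lp Norm algorithm named in the abstract, and it is not the authors' CUDA code. It uses the simpler approach of summing wrapped differences between neighbouring samples, which only works when the true phase changes by less than π from one sample to the next. The function names `wrap` and `unwrap_1d` and the test signal are assumptions chosen for illustration.

```python
import math


def wrap(phase):
    """Map a phase value into the interval [-pi, pi)."""
    return (phase + math.pi) % (2 * math.pi) - math.pi


def unwrap_1d(wrapped):
    """Recover a continuous phase from wrapped samples.

    Illustrative only: integrates the wrapped sample-to-sample
    differences, so it assumes neighbouring true phase values
    differ by less than pi.
    """
    if not wrapped:
        return []
    out = [wrapped[0]]
    for prev, cur in zip(wrapped, wrapped[1:]):
        # The wrapped difference equals the true difference when |true diff| < pi.
        out.append(out[-1] + wrap(cur - prev))
    return out


# Hypothetical test signal: a phase ramp that exceeds pi several times.
true_phase = [0.5 * k for k in range(20)]
measured = [wrap(p) for p in true_phase]   # what the instrument would report
recovered = unwrap_1d(measured)            # continuous phase again
```

Real images are two-dimensional and noisy, which is why more robust methods such as the norm-minimizing algorithm mentioned in the abstract are used in practice.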

OpenCL Extensions

Gaster, Howes, Kaeli et al. 2012

Data transformations enabling loop vectorization on multithreaded data parallel architectures

Jang, Mistry, Schaa et al. 2010

Loop vectorization, a key feature exploited to obtain high performance on Single Instruction Multiple Data (SIMD) vector architectures, is significantly hindered by irregular memory access patterns in the data stream. This paper describes data transformations that allow us to vectorize loops targeting massively multithreaded data parallel architectures. We present a mathematical model that captures loop-based memory access patterns and computes the most appropriate data transformations in order to enable vectorization. Our experimental results show that the proposed data transformations can significantly increase the number of loops that can be vectorized and enhance the data-level parallelism of applications. Our results also show that the overhead associated with our data transformations can be easily amortized as the size of the input data set increases. For the set of high performance benchmark kernels studied, we achieve consistent and significant performance improvements (up to 11.4X) by applying vectorization using our data transformation approach.
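As a rough illustration of the kind of data transformation described above, the following Python sketch regroups interleaved (strided) data so that each field is stored contiguously. This is a generic layout change, not the mathematical model or the transformations from the paper. The function name `deinterleave` and the sample data are assumptions made for the example.

```python
def deinterleave(data, stride):
    """Regroup interleaved data so each field is contiguous.

    Input layout:  [x0, y0, z0, x1, y1, z1, ...]  (stride 3)
    Output layout: [[x0, x1, ...], [y0, y1, ...], [z0, z1, ...]]
    """
    if stride <= 0 or len(data) % stride != 0:
        raise ValueError("data length must be a positive multiple of stride")
    return [data[offset::stride] for offset in range(stride)]


# Hypothetical interleaved records with three fields each.
interleaved = [1.0, 10.0, 100.0, 2.0, 20.0, 200.0, 3.0, 30.0, 300.0]

# Before: a loop over the first field touches indices 0, 3, 6 (stride 3).
strided_sum = sum(interleaved[i] for i in range(0, len(interleaved), 3))

# After: the same field is unit-stride, the access pattern vector hardware prefers.
xs, ys, zs = deinterleave(interleaved, 3)
contiguous_sum = sum(xs)
```

The regrouping pass is a one-time cost, which is consistent with the abstract's point that transformation overhead is amortized as the input grows.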

OpenCL Device Architectures

Gaster, Howes, Kaeli et al. 2013

Perhaad Mistry

Exploiting Memory Access Patterns to Improve Memory Performance in Data-Parallel Architectures

Introduction to OpenCL

Multi2Sim

Analyzing program flow within a many-kernel OpenCL application

Accelerating phase unwrapping and affine transformations for optical quadrature microscopy using CUDA

OpenCL Extensions

Data transformations enabling loop vectorization on multithreaded data parallel architectures

OpenCL Device Architectures
