A major goal of a genome-wide association study (GWAS) is to find associations between genetic variations, such as single-nucleotide polymorphisms (SNPs), and the risk of developing a complex disease, such as cancer or schizophrenia. Logic Feature Selection (logicFS) is a technique that searches for interactions between SNPs that may increase the risk of developing a particular disease. Composed of several hundred processors, the Graphics Processing Unit (GPU) has become a very attractive platform for computationally demanding tasks on massive data. Its special hierarchy of processors and fast memory units allows very powerful and efficient parallelization, but also demands novel parallel algorithms. In this paper, we formulate LogicFS-GPU, an algorithm particularly suited to data-parallel architectures such as GPUs. For this purpose, we employ low-level (device-level) and high-level data-parallel primitives, e.g. map, compaction, parallel prefix sum (scan) and parallel reduction. The primary idea of our algorithm is to let the parallel threads cooperatively develop their own private, high-quality binary interaction models that predict the affection status of subjects. We demonstrate (1) how to formulate the parallel LogicFS-GPU algorithm so that it exploits most of the potential parallelism hidden in the base logicFS algorithm and (2) how to utilize the special memory and processor architecture of a modern GPU to share this information among threads in an optimal way. As a perspective, LogicFS-GPU is not limited to examining SNP interactions, but can also be applied to any problem in which interactions of multivariate binary predictors are to be associated with observations. Furthermore, the target architecture of LogicFS-GPU is not restricted to GPUs, and it may be possible to port our formulation to any other data-parallel architecture.
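
To make the primitives named above concrete, the following is a minimal sequential sketch in Python of how map, reduction, scan and compaction can be combined to score candidate binary interaction models and keep the good ones. It is not the LogicFS-GPU implementation: the toy data, the restriction to two-SNP conjunctions, the function names (`exclusive_scan`, `compact`, `accuracy`) and the 0.75 threshold are all assumptions made for illustration, and on a GPU each loop would correspond to one parallel step.

```python
from functools import reduce
from itertools import accumulate


def exclusive_scan(flags):
    """Exclusive prefix sum: out[i] is the sum of flags[0..i-1]."""
    return list(accumulate(flags, initial=0))[:-1]


def compact(items, flags):
    """Stream compaction: keep items whose flag is 1, preserving order.

    The scan yields each kept item's output slot, so on a GPU every
    thread could write its own item without coordinating with others.
    """
    offsets = exclusive_scan(flags)
    out = [None] * sum(flags)
    for i, flag in enumerate(flags):
        if flag:
            out[offsets[i]] = items[i]
    return out


def accuracy(model, snps, status):
    """Map each subject to a hit (1) or miss (0), then reduce by summing."""
    a, b = model  # hypothetical model: the conjunction SNP_a AND SNP_b
    hits = map(lambda pair: int((pair[0][a] & pair[0][b]) == pair[1]),
               zip(snps, status))
    return reduce(lambda x, y: x + y, hits, 0) / len(status)


# Toy data (assumed): 4 subjects, 3 binary SNP predictors each.
snps = [
    [1, 1, 0],
    [1, 0, 1],
    [0, 1, 1],
    [1, 1, 1],
]
status = [1, 0, 0, 1]  # affection status per subject

models = [(0, 1), (0, 2), (1, 2)]                   # candidate SNP pairs
scores = [accuracy(m, snps, status) for m in models]
flags = [int(s >= 0.75) for s in scores]            # assumed threshold
kept = compact(models, flags)

print(scores)
print(kept)
```

The scan-then-scatter form of `compact` is chosen because it mirrors how compaction is typically expressed on data-parallel hardware, where output positions must be known before any thread writes.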