Executing a large number of independent tasks, or tasks that perform minimal inter-task communication, in parallel is a common requirement in many domains. In this paper, we present our experience in applying two new Microsoft technologies, Dryad and Azure, to three bioinformatics applications. We also compare with traditional MPI and Apache Hadoop MapReduce implementations in one example. The applications are an EST (Expressed Sequence Tag) sequence assembly program, the PhyloD statistical package to identify HLA-associated viral evolution, and a pairwise Alu gene alignment application. We give a detailed performance discussion on a 768-core Windows HPC Server cluster and an Azure cloud. All the applications start with a "doubly data parallel step" involving independent data chosen from two similar (EST, Alu) or two different databases (PhyloD). The final stages have a different structure in each application.
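As a rough illustration of the "doubly data parallel step", the following minimal Python sketch scores every pair drawn from two input collections as an independent task. It is not the implementation used in this work: the function names, the toy mismatch score standing in for a real alignment kernel, and the use of a local thread pool in place of Dryad, Azure, Hadoop, or MPI are all assumptions made for illustration.

```python
from concurrent.futures import ThreadPoolExecutor
from itertools import product


def mismatch_count(pair):
    """Hypothetical stand-in for a real pairwise kernel (e.g. an alignment)."""
    a, b = pair
    # Count differing positions over the shared prefix, plus the length gap.
    return sum(x != y for x, y in zip(a, b)) + abs(len(a) - len(b))


def doubly_data_parallel(set_a, set_b, workers=4):
    """Score every (a, b) pair independently; tasks never communicate."""
    pairs = list(product(set_a, set_b))  # one task per pair of inputs
    with ThreadPoolExecutor(max_workers=workers) as pool:
        scores = list(pool.map(mismatch_count, pairs))  # order is preserved
    return dict(zip(pairs, scores))


# Toy inputs: two small collections of sequences (illustrative only).
result = doubly_data_parallel(["ACGT", "ACGA"], ["ACGT", "TCG"])
```

Because each pair is scored without reference to any other, the same task list could in principle be handed to any of the runtimes compared here; only the application-specific final stage would differ.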