2017
DOI: 10.1186/s12859-017-1902-7

Scaling bioinformatics applications on HPC

Abstract: Background: Recent breakthroughs in molecular biology and next-generation sequencing technologies have led to the exponential growth of sequence databases. Researchers use BLAST for processing these sequences. However, traditional software parallelization techniques (threads, message passing interface) applied in newer versions of BLAST are not adequate for processing these sequences in a timely manner. Methods: A new method for array job parallelization has been developed which offers O(T) theoretical speed-up in co…

Citations: cited by 9 publications (12 citation statements)
References: 9 publications
“…All MC simulations were performed using a supercomputing cluster comprising 94 computing nodes with 8 CPUs and 24 GB of RAM per node. 55 A two-dimensional (2D) x − z plane in the computed 3D energy deposition map at ∼9 mm offset from the beam center in y direction was extracted to calculate initial pressure within the image plane. The offset plane was used to mimic the real-world experimental setups, in which the ultrasound transducer is offset from the optical source.…”
Section: Monte Carlo Simulation (mentioning)
confidence: 99%
“…The computation segmentation technique described in this paper generated partial results every half hour. 34 This technique can be adapted to scale a wide variety of bioinformatics applications, as described by Mikailov et al 35…”
Section: Speedup Using the Computation Segmentation Technique (mentioning)
confidence: 99%
“…In contrast, parallelization using data partitioning provides a simple way to immediately utilize the available resources on an HPC cluster to speed up the latest version of the original BLAST+ without modifying the algorithm. For example, the “dual segmentation” method [12] divides the database and query into m and n subsets, respectively, on a cluster with nodes. The pairs of database-query subsets are then processed in parallel using nodes.…”
Section: Introduction (mentioning)
confidence: 99%
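
The pairing described in the statement above can be illustrated with a short sketch. It assumes, purely for illustration, that each database-query pair is handled by one task of a scheduler array job and that tasks are numbered from 0; the function names `task_to_pair` and `all_pairs` are hypothetical and do not come from the cited papers.

```python
from typing import List, Tuple


def task_to_pair(task_id: int, m: int, n: int) -> Tuple[int, int]:
    """Map a 0-based task index to (database subset, query subset).

    Assumption: the database is split into m subsets and the query
    into n subsets, giving m * n independent tasks.
    """
    if m <= 0 or n <= 0:
        raise ValueError("m and n must be positive")
    if not 0 <= task_id < m * n:
        raise ValueError("task_id must be in the range [0, m * n)")
    # Row-major layout: consecutive tasks walk through the query
    # subsets first, then move on to the next database subset.
    return divmod(task_id, n)


def all_pairs(m: int, n: int) -> List[Tuple[int, int]]:
    """List every database-query pair in task order."""
    return [task_to_pair(t, m, n) for t in range(m * n)]


if __name__ == "__main__":
    # Example: 3 database subsets and 4 query subsets give 12 tasks.
    for task_id, (db_part, query_part) in enumerate(all_pairs(3, 4)):
        print(f"task {task_id}: database subset {db_part}, query subset {query_part}")
```

Each task would then search its own query subset against its own database subset, and the partial results would be merged afterwards. How m and n are chosen is left open here, in line with the statements quoted on this page.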
“…This value is important for calculating the expect value, or E-value, which is a measure of the statistical significance of the matches found. Experiment results showed reduction in runtime from 27 days to less than one day on a homogeneous HPC cluster with 500+ nodes [12]. However, selection of the optimal m and n values was not explored.…”
Section: Introduction (mentioning)
confidence: 99%