2015
DOI: 10.1007/978-3-319-24462-4_22

Data-Intensive Computing Infrastructure Systems for Unmodified Biological Data Analysis Pipelines

Abstract: Biological data analysis is typically implemented using a deep pipeline that combines a wide array of tools and databases. These pipelines must scale to very large datasets, and consequently require parallel and distributed computing. It is therefore important to choose a hardware platform and underlying data management and processing systems well suited for processing large datasets. There are many infrastructure systems for such data-intensive computing. However, in our experience, most biological …

Cited by 3 publications (4 citation statements)
References 30 publications
“…The main advantage is improved performance and scalability for I/O bound jobs. The main disadvantage is that applications may need to be modified to utilize such a platform fully [30,25,26].…”
Section: Hardware Platforms
Type: mentioning
confidence: 99%
“…The main advantage is improved performance and scalability for I/O bound jobs. The main disadvantage is that applications may need to be modified to utilize such a platform fully [30,25,26].…”
Section: Hardware Platforms
Type: mentioning
confidence: 99%
“…Systems such as Troilkatt [26,30] are designed to execute workflows on clusters built for data-intensive computing [31]. Compared to HPC clusters these have storage distributed on the compute nodes, and data processing systems that utilize such distributed storage.…”
Section: Hardware Platformsmentioning
confidence: 99%
“…We discuss the limitations and lessons learned utilizing data-intensive systems for biological data processing in [18].…”
Section: Related Workmentioning
confidence: 99%