Martin Ernstsen scite author profile

Martin Ernstsen

5 Publications

13 Citation Statements Received

63 Citation Statements Given

Affiliations

UiT The Arctic University of Norway, Kongsberg Satellite Services (Norway)

Publications

Integrating Data-Intensive Computing Systems with Biological Data Analysis Frameworks

Pedersen, Raknes, Ernstsen et al. 2015

Biological data analysis is typically implemented using a pipeline that combines many data analysis tools and meta-databases. These pipelines must scale to very large datasets, and therefore often require parallel and distributed computing. There are many infrastructure systems for data-intensive computing. However, most biological data analysis pipelines do not leverage these systems. An important challenge is therefore to integrate biological data analysis frameworks with data-intensive computing infrastructure systems. In this paper, we describe how we have extended data-intensive computing systems to support unmodified biological data analysis tools. We also describe four approaches for integrating the extended systems with biological data analysis frameworks, and discuss challenges for such integration on production platforms. Our results demonstrate how biological data analysis pipelines can benefit from infrastructure systems for data-intensive computing.

Data-Intensive Computing Infrastructure Systems for Unmodified Biological Data Analysis Pipelines

Bongo, Pedersen, Ernstsen 2015

Biological data analysis is typically implemented using a deep pipeline that combines a wide array of tools and databases. These pipelines must scale to very large datasets, and consequently require parallel and distributed computing. It is therefore important to choose a hardware platform and underlying data management and processing systems well suited for processing large datasets. There are many infrastructure systems for such data-intensive computing. However, in our experience, most biological data analysis pipelines do not leverage these systems. We give an overview of data-intensive computing infrastructure systems, and describe how we have leveraged these for: (i) scalable fault-tolerant computing for large-scale biological data; (ii) incremental updates to reduce the resource usage required to update large-scale compendia; and (iii) interactive data analysis and exploration. We provide lessons learned and describe problems we have encountered during development and deployment. We also provide a literature survey on the use of data-intensive computing systems for biological data processing. Our results show how unmodified biological data analysis tools can benefit from infrastructure systems for data-intensive computing.

Mario: Interactive Tuning of Biological Analysis Pipelines Using Iterative Processing

Ernstsen, Kjærner-Semb, Willassen et al. 2014

META-pipe - Pipeline Annotation, Analysis and Visualization of Marine Metagenomic Sequence Data

Robertsen, Kahlke, Raknes et al. 2016

Preprint

Norway

Ernstsen 2016
