Sebastian Dorok scite author profile

Sebastian Dorok

4 Publications

10 Citation Statements Received

64 Citation Statements Given

Affiliations

University Hospital Magdeburg, Bayer (Germany)

Publications

Toward Efficient Variant Calling Inside Main-Memory Database Systems

Dorok, Breß, Saake, 2014

Abstract: Mutations in genomes indicate predisposition for diseases or effects on efficacy of drugs. A variant calling algorithm determines possible mutations in sample genomes. Afterwards, scientists have to decide about the impact of these mutations. Certainly, many different variant calling algorithms exist that generate different outputs due to different sequence alignments as input and parameterizations of variant calling algorithms. Thus, a combination of variant calling results is necessary to provide a more complete set of mutations than single algorithm runs can provide. Therefore, a system is required that facilitates the integration and parameterization of different variant calling algorithms and processing of different sequence alignments. Moreover, against the backdrop of ever increasing amounts of available genome sequencing data, such a system must provide matured database management capabilities to enable flexible and efficient analyses while keeping data consistent. In this paper, we present a first approach to integrate variant calling into a main-memory database management system that allows for calling variants via SQL.

Efficiently Storing and Analyzing Genome Data in Database Systems

et al. 2017

Toward efficient and reliable genome analysis using main-memory database systems

Dorok, Breß, Läpple, et al. 2014

Improvements in DNA sequencing technologies make it possible to sequence complete human genomes in a short time and at acceptable cost. Hence, the vision of genome analysis as a standard procedure to support and improve medical treatment becomes reachable. In this vision paper, we describe important data-management challenges that have to be met to make this vision come true. Besides genome-analysis performance, data-management capabilities such as data provenance and data integrity become increasingly important to enable comprehensible and reliable genome analysis. We argue that these challenges can be met by using main-memory database technologies, which combine fast processing capabilities with extensive data-management capabilities. Finally, we discuss possibilities of integrating genome-analysis tasks into DBMSs and derive new research questions.

The Relational Way To Dam The Flood Of Genome Data

Dorok, 2015

Mutations in genomes can indicate a predisposition for diseases such as cancer or cardiovascular disorder. Genome analysis is an established procedure to determine mutations and deduce their impact on living organisms. The first step in genome analysis is DNA sequencing that makes the biochemically stored hereditary information in DNA digitally readable. The cost and time to sequence a whole genome decreases rapidly and leads to an increase of available raw genome data that must be stored and integrated to be analyzed. Damming this flood of genome data requires efficient and effective analysis as well as data management solutions. State-of-the-art in genome analysis are flat-file-based storage and analysis solutions. Consequently, every analysis application is responsible to manage data on its own, which leads to implementation and process overhead. Database systems have already shown their ability to reduce data management overhead for analysis applications in various domains. However, current approaches using relational database systems for genome-data management lack scalable performance on increasing amounts of genome data. In this thesis, we investigate the capabilities of relational main-memory database systems to store and query genome data efficiently, while enabling flexible data access.
