Felix Droop scite author profile

Felix Droop

3Publications

4Citation Statements Received

74Citation Statements Given

How they've been cited

How they cite others

Affiliations

Freie Universität Berlin

Publications

Order By: Most citations

Hierarchical Interleaved Bloom Filter: Enabling ultrafast, approximate sequence queries

Mehringer

Seiler

Droop

et al. 2022

Preprint

View full text Add to dashboard Cite

Searching sequences in large, distributed databases is the most widely used bioinformatics analysis done. This basic task is in dire need for solutions that deal with the exponential growth of sequence repositories and perform approximate queries very fast. In this paper, we present a novel data structure: the Hierarchical Interleaved Bloom Filter (HIBF). It is extremely fast and space efficient, yet so general that it has the potential to serve as the underlying engine for many applications. We show that the HIBF is superior in build time, index size and search time while achieving a comparable or better accuracy compared to other state-of-the art tools (Mantis and Bifrost). The HIBF builds an index up to 211 times faster, using up to 14 times less space and can answer approximate membership queries faster by a factor of up to 129. This can be considered a quantum leap that opens the door to indexing complete sequence archives like the European Nucleotide Archive or even larger metagenomics data sets.

show abstract

Hierarchical Interleaved Bloom Filter: enabling ultrafast, approximate sequence queries

et al. 2023

View full text Add to dashboard Cite

We present a novel data structure for searching sequences in large databases: the Hierarchical Interleaved Bloom Filter (HIBF). It is extremely fast and space efficient, yet so general that it could serve as the underlying engine for many applications. We show that the HIBF is superior in build time, index size, and search time while achieving a comparable or better accuracy compared to other state-of-the-art tools. The HIBF builds an index up to 211 times faster, using up to 14 times less space, and can answer approximate membership queries faster by a factor of up to 129.

show abstract

A mathematical programming approach for resource allocation of data analysis workflows on heterogeneous clusters

et al. 2023

View full text Add to dashboard Cite

Scientific communities are motivated to schedule their large-scale data analysis workflows in heterogeneous cluster environments because of privacy and financial issues. In such environments containing considerably diverse resources, efficient resource allocation approaches are essential for reaching high performance. Accordingly, this research addresses the scheduling problem of workflows with bag-of-task form to minimize total runtime (makespan). To this aim, we develop a mixed-integer linear programming model (MILP). The proposed model contains binary decision variables determining which tasks should be assigned to which nodes. Also, it contains linear constraints to fulfill the tasks requirements such as memory and scheduling policy. Comparative results show that our approach outperforms related approaches in most cases. As part of the post-optimality analysis, some secondary preferences are imposed on the proposed model to obtain the most preferred optimal solution. We analyze the relaxation of the makespan in the hope of significantly reducing the number of consumed nodes.

show abstract

scite is a Brooklyn-based organization that helps researchers better discover and understand research articles through Smart Citations–citations that display the context of the citation and describe whether the article provides supporting or contrasting evidence. scite is used by students and researchers from around the world and is funded in part by the National Science Foundation and the National Institute on Drug Abuse of the National Institutes of Health.

Contact Info

hi@scite.ai

10624 S. Eastern Ave., Ste. A-614

Henderson, NV 89052, USA

Blog Terms and Conditions API Terms Privacy Policy Contact Cookie Preferences Do Not Sell or Share My Personal Information

Made with 💙 for researchers

Part of the Research Solutions Family.

Felix Droop

Hierarchical Interleaved Bloom Filter: Enabling ultrafast, approximate sequence queries

Hierarchical Interleaved Bloom Filter: enabling ultrafast, approximate sequence queries

A mathematical programming approach for resource allocation of data analysis workflows on heterogeneous clusters

Contact Info

Product

Resources

About