Sebastian Schmidt scite author profile

Sebastian Schmidt

4 Publications

76 Citation Statements Received

282 Citation Statements Given


Affiliations

University of Helsinki, University Hospital Magdeburg, University of Calgary

Publications


A set of multi-touch graph interaction techniques

Schmidt

Nacenta

Dachselt

et al. 2010


Interactive node-link diagrams are useful for describing and exploring data relationships in many domains such as network analysis and transportation planning. We describe a multi-touch interaction technique set (IT set) that focuses on edge interactions for node-link diagrams. The set includes five techniques (TouchPlucking, TouchPinning, TouchStrumming, TouchBundling and PushLens) and provides the flexibility to combine them in either sequential or simultaneous actions in order to address edge congestion.


Matchtigs: minimum plain text representation of kmer sets

Schmidt

Khan

Alanko

et al. 2021

Preprint


Kmer-based methods are widely used in bioinformatics, which raises the question of what is the smallest practically usable representation (i.e. plain text) of a set of kmers. We propose a polynomial algorithm computing a minimum such representation (which was previously posed as a potentially NP-hard open problem), as well as an efficient near-minimum greedy heuristic. When compressing genomes of large model organisms, read sets thereof or bacterial pangenomes, with only a minor runtime increase, we decrease the size of the representation by up to 60% over unitigs and 27% over previous work. Additionally, the number of strings is decreased by up to 97% over unitigs and 91% over previous work. Finally we show that a small representation has advantages in downstream applications, as it speeds up queries on the popular kmer indexing tool Bifrost by 1.66× over unitigs and 1.29× over previous work.

Availability: https://github.com/algbio/matchtigs
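To make the notion of a plain text representation of a kmer set concrete, the following is a minimal sketch and not the authors' algorithm. It assumes a single-stranded setting (reverse complements are ignored), and the helper names `kmer_set`, `is_representation` and `cumulative_length` are hypothetical, chosen only for this illustration. It checks that a set of strings contains exactly the target kmers and compares the total number of characters of two valid representations.

```python
def kmer_set(strings, k):
    """Collect every length-k substring of the given strings."""
    return {s[i:i + k] for s in strings for i in range(len(s) - k + 1)}


def is_representation(strings, target, k):
    """True if the strings contain exactly the target kmers (repeats allowed)."""
    return kmer_set(strings, k) == target


def cumulative_length(strings):
    """Total number of characters, used here as the size of the representation."""
    return sum(len(s) for s in strings)


k = 3
target = kmer_set(["ACGTAC"], k)   # toy kmer set: ACG, CGT, GTA, TAC

one_per_kmer = sorted(target)      # trivial representation: one string per kmer
merged = ["ACGTAC"]                # overlapping kmers merged into one string

for name, rep in [("one string per kmer", one_per_kmer), ("merged", merged)]:
    assert is_representation(rep, target, k)
    print(name, len(rep), "strings,", cumulative_length(rep), "characters")
```

On this toy input both representations are valid, but the merged one needs fewer strings and fewer characters, which is the quantity the abstract reports as being reduced.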


Matchtigs: minimum plain text representation of k-mer sets

Schmidt

Khan

Alanko

et al. 2023

Genome Biol


We propose a polynomial algorithm computing a minimum plain-text representation of k-mer sets, as well as an efficient near-minimum greedy heuristic. When compressing read sets of large model organisms or bacterial pangenomes, with only a minor runtime increase, we shrink the representation by up to 59% over unitigs and 26% over previous work. Additionally, the number of strings is decreased by up to 97% over unitigs and 90% over previous work. Finally, a small representation has advantages in downstream applications, as it speeds up SSHash-Lite queries by up to 4.26× over unitigs and 2.10× over previous work.


Eulertigs: minimum plain text representation of k-mer sets without repetitions in linear time

Schmidt

Alanko

2022

Preprint


A fundamental operation in computational genomics is to reduce the input sequences to their constituent k-mers. For maximum performance of downstream applications, it is important to store the k-mers in small space, while keeping the representation easy and efficient to use (i.e. without k-mer repetitions and in plain text). Recently, heuristics were presented to compute a near-minimum such representation. We present an algorithm to compute a minimum representation in optimal (linear) time and use it to evaluate the existing heuristics. For that, we present a formalisation of arc-centric bidirected de Bruijn graphs and carefully prove that it accurately models the k-mer spectrum of the input. Our algorithm first constructs the de Bruijn graph in linear time in the length of the input strings (for a fixed-size alphabet). Then it uses an Eulerian-cycle-based algorithm to compute the minimum representation, in time linear in the size of the de Bruijn graph.
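The connection between Eulerian traversals and repetition-free k-mer representations can be illustrated with a textbook sketch. This is not the paper's algorithm: it assumes a single-stranded (not bidirected) arc-centric de Bruijn graph that already admits an Eulerian path, and it uses the classical Hierholzer algorithm. The function names `de_bruijn_arcs`, `eulerian_path` and `spell` are hypothetical.

```python
from collections import defaultdict


def de_bruijn_arcs(kmers):
    """Arc-centric graph: each k-mer is an arc from its (k-1)-prefix to its (k-1)-suffix."""
    graph = defaultdict(list)
    for kmer in sorted(kmers):
        graph[kmer[:-1]].append(kmer[1:])
    return graph


def eulerian_path(graph):
    """Hierholzer's algorithm. Assumes (without checking) that an Eulerian path exists."""
    in_deg = defaultdict(int)
    for successors in graph.values():
        for v in successors:
            in_deg[v] += 1
    # Start at the node with one surplus outgoing arc, else at any node (Eulerian cycle).
    start = next(
        (u for u, vs in graph.items() if len(vs) - in_deg[u] == 1),
        next(iter(graph)),
    )
    remaining = {u: list(vs) for u, vs in graph.items()}
    stack, path = [start], []
    while stack:
        u = stack[-1]
        if remaining.get(u):
            stack.append(remaining[u].pop())  # follow an unused arc
        else:
            path.append(stack.pop())          # dead end: node is final in the path
    return path[::-1]


def spell(path):
    """Spell the string of a node path: first node plus the last character of each later node."""
    return path[0] + "".join(node[-1] for node in path[1:])


kmers = {"ACG", "CGT", "GTA", "TAC"}
text = spell(eulerian_path(de_bruijn_arcs(kmers)))
print(text)  # ACGTAC
```

Because every arc is traversed exactly once, the spelled string contains each k-mer exactly once. Inputs whose graph has no single Eulerian path need several strings, and handling that case (and reverse complements) is outside this sketch.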


