Pelin Icer Baykal scite author profile

Aligning sequencing reads onto a reference is an essential step of the majority of genomic analysis pipelines. Computational algorithms for read alignment have evolved in accordance with technological advances, leading to today’s diverse array of alignment methods. We provide a systematic survey of algorithmic foundations and methodologies across 107 alignment methods, for both short and long reads. We provide a rigorous experimental evaluation of 11 read aligners to demonstrate the effect of these underlying algorithms on speed and efficiency of read alignment. We discuss how general alignment algorithms have been tailored to the specific needs of various domains in biology.

Unlocking capacities of genomics for the COVID-19 response and future pandemics

Knyazev, Chhugani, Sarwal, et al. 2022. Nat Methods.

Global transmission network of SARS-CoV-2: from outbreak to pandemic

Skums, Kirpich, Baykal, et al. 2020. Preprint.

Background. The COVID-19 pandemic caused by the severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) is straining health systems around the world. Although the Chinese government implemented a number of severe restrictions on people's movement in an attempt to contain its local and international spread, the virus had already reached many areas of the world, in part due to its potent transmissibility and the fact that a substantial fraction of infected individuals develop few or no symptoms. Following its emergence, the virus started to generate sustained transmission in neighboring countries in Asia, Western Europe, Australia, Canada and the United States, and finally in South America and Africa. As the virus continues its global spread, a clear and evidence-based understanding of the properties and dynamics of the global transmission network of SARS-CoV-2 is essential to design and put in place efficient and globally coordinated interventions.

Methods. We employ molecular surveillance data of SARS-CoV-2 epidemics for inference and comprehensive analysis of its global transmission network before the pandemic declaration. Our goal was to characterize the spatial-temporal transmission pathways that led to the establishment of the pandemic. We exploited a network-based approach specifically tailored to emerging outbreak settings. Specifically, it traces the accumulation of mutations in viral genomic variants via mutation trees, which are then used to infer transmission networks, revealing an up-to-date picture of the spread of SARS-CoV-2 between and within countries and geographic regions.

Results and Conclusions. The analysis suggests multiple introductions of SARS-CoV-2 into the majority of world regions by means of heterogeneous transmission pathways. The transmission network is scale-free, with a few genomic variants responsible for the majority of possible transmissions. The network structure is in line with the available temporal information represented by sample collection times and suggests an expected sampling time difference of a few days between potential transmission pairs. The inferred network structural properties, transmission clusters and pathways, and virus introduction routes emphasize the extent of the global epidemiological linkage and demonstrate the importance of internationally coordinated public health measures.
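To make the idea of tracing accumulated mutations concrete, here is a toy sketch in Python. It is not the method used in the preprint: it assumes each genomic variant is just a set of mutation labels, and it links every variant to the sampled variant whose mutations form its largest proper subset. The variant names, the mutation labels (`m1`, `m2`, ...), and the function `infer_parent_links` are all hypothetical. Counting how often each variant appears as a parent gives a crude view of how concentrated the possible transmissions are.

```python
from collections import Counter


def infer_parent_links(variants):
    """Link each variant to its closest ancestor candidate.

    variants: dict mapping a variant name to a frozenset of mutation labels.
    Returns a dict child -> parent, where the parent is the variant whose
    mutation set is a proper subset of the child's and differs from it by
    the fewest mutations (ties are broken alphabetically by name).
    """
    links = {}
    for child, child_muts in variants.items():
        candidates = [
            (len(child_muts - parent_muts), parent)
            for parent, parent_muts in variants.items()
            if parent != child and parent_muts < child_muts  # proper subset
        ]
        if candidates:
            links[child] = min(candidates)[1]
    return links


# Hypothetical toy data: "ref" carries no mutations relative to the reference.
variants = {
    "ref": frozenset(),
    "A": frozenset({"m1"}),
    "B": frozenset({"m1", "m2"}),
    "C": frozenset({"m1", "m3"}),
    "D": frozenset({"m4"}),
    "E": frozenset({"m1", "m2", "m5"}),
}

links = infer_parent_links(variants)

# Out-degree: how many other variants each variant is the inferred parent of.
out_degree = Counter(links.values())
```

In a scale-free network, a few variants would have a very high out-degree while most have none. Six variants are far too few to show such a distribution, so the example only illustrates the bookkeeping.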

Tracking SARS-CoV-2 genomic variants in wastewater sequencing data with LolliPop

Dreifuss, Topolsky, Baykal, et al. 2022. Preprint.

During the COVID-19 pandemic, wastewater-based epidemiology has progressively taken a central role as a pathogen surveillance tool. Tracking viral loads and variant outbreaks in sewage offers advantages over clinical surveillance methods by providing unbiased estimates and enabling early detection. However, wastewater-based epidemiology poses new computational research questions that need to be solved in order for this approach to be implemented broadly and successfully. Here, we address the variant deconvolution problem, where we aim to estimate the relative abundances of genomic variants from next-generation sequencing data of a mixed wastewater sample. We introduce LolliPop, a computational method that solves the variant deconvolution problem by simultaneously solving least squares problems and applying kernel-based smoothing to relative variant abundances from wastewater time series sequencing data. We derive multiple approaches to compute confidence bands, and demonstrate the application of our method to data from the Swiss wastewater surveillance efforts.
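The variant deconvolution problem described above can be illustrated with a minimal NumPy sketch. This is not LolliPop's implementation. It assumes a binary variant definition matrix `X` (mutations by variants), a matrix `Y` of observed mutation frequencies per time point, and a Gaussian kernel; the function names and the `bandwidth` parameter are hypothetical. Because `X` is the same at every time point, a kernel-weighted least squares fit at time `t` has the same solution as an ordinary least squares fit to the kernel-weighted average of the observations, and the sketch uses that shortcut. Non-negativity and normalization are imposed afterwards by clipping and rescaling, which is a simplification.

```python
import numpy as np


def gaussian_kernel_weights(times, t, bandwidth):
    """Normalized Gaussian kernel weights of all time points relative to t."""
    w = np.exp(-0.5 * ((times - t) / bandwidth) ** 2)
    return w / w.sum()


def deconvolve(X, Y, times, bandwidth=1.0):
    """Estimate relative variant abundances over time.

    X: (m, k) array, X[i, j] = 1 if mutation i defines variant j.
    Y: (T, m) array of observed mutation frequencies per time point.
    times: length-T sequence of sampling times.
    Returns a (T, k) array whose rows are non-negative and sum to 1.
    """
    times = np.asarray(times, dtype=float)
    k = X.shape[1]
    abundances = np.empty((len(times), k))
    for i, t in enumerate(times):
        w = gaussian_kernel_weights(times, t, bandwidth)
        y_smooth = w @ Y  # kernel-weighted average of the observations
        b, *_ = np.linalg.lstsq(X, y_smooth, rcond=None)
        b = np.clip(b, 0.0, None)  # crude non-negativity constraint
        total = b.sum()
        abundances[i] = b / total if total > 0 else np.full(k, 1.0 / k)
    return abundances


# Hypothetical noise-free toy data: two variants, three mutations, five days.
X = np.array([[1.0, 0.0],   # mutation 0 is specific to variant 0
              [0.0, 1.0],   # mutation 1 is specific to variant 1
              [1.0, 1.0]])  # mutation 2 is shared by both variants
times = np.arange(5)
truth = np.column_stack([np.linspace(0.9, 0.1, 5), np.linspace(0.1, 0.9, 5)])
Y = truth @ X.T

estimates = deconvolve(X, Y, times, bandwidth=1.0)
```

A larger `bandwidth` pools more neighboring time points, which trades responsiveness for stability on noisy samples. With a very small bandwidth the fit reduces to an independent least squares problem per time point.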
