2021
DOI: 10.1101/2021.03.23.436571
Preprint

Comparative genome analysis using sample-specific string detection in accurate long reads

Abstract: Motivation: Comparative genome analysis of two or more whole-genome sequenced (WGS) samples is at the core of most applications in genomics. These include the discovery of genomic differences segregating in a population, case-control analysis in common diseases, and rare disorders. With the current progress of accurate long-read sequencing technologies (e.g., circular consensus sequencing from PacBio sequencers), we can dive into studying repeat regions of the genome (e.g., segmental duplications) and hard-to-detect vari…

citations
Cited by 2 publications
(8 citation statements)
references
References 56 publications
“…The second result consists in the design of an algorithm to compute all the occurrences in a single sequence T of its target-specific factors against a reference R. The algorithm runs in real-time on the target sequence, independently of the number of occurrences of target-specific factors, after a standard processing of the reference. This improves on the result in [15], where the running time of the main algorithm depends on the number of occurrences of sought factors.…”
Section: Introduction (mentioning)
confidence: 73%
“…The motivation comes from the analysis of genomic sequences as done for example by Khorsand et al in [15] in which authors introduce the notion of sample-specific strings. To avoid alignments but however to extract interesting elements that differentiate the target from the reference, the chosen specific fragments are minimal forbidden factors, also called minimal absent factors.…”
Section: Introduction (mentioning)
confidence: 99%
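
To make the idea in the statement above concrete, here is a minimal brute-force sketch of finding substrings of a target that are absent from a reference and minimal, in the sense that their longest proper prefix and suffix both occur in the reference. It is an illustration only: the function name `minimal_specific_substrings` and the toy sequences are assumptions, and the quadratic scan is not the indexed, real-time algorithm the citing paper describes, nor the method of the cited preprint.

```python
def minimal_specific_substrings(target: str, reference: str) -> list[tuple[int, str]]:
    """Return (start, substring) pairs for substrings of `target` that are
    absent from `reference` while their longest proper prefix and suffix
    both occur in `reference` (hypothetical helper, brute force)."""
    hits = []
    n = len(target)
    for start in range(n):
        for end in range(start + 1, n + 1):
            w = target[start:end]
            if w in reference:
                continue  # still shared with the reference; keep extending
            # w is the shortest absent substring starting at `start`, so its
            # prefix w[:-1] occurs in the reference (or is empty).
            # It is minimal only if its suffix w[1:] occurs there too.
            if w[1:] in reference:
                hits.append((start, w))
            break  # any longer substring from `start` contains w
    return hits


if __name__ == "__main__":
    # Toy sequences chosen for illustration; they differ by one substitution.
    print(minimal_specific_substrings("ACGAACGT", "ACGTACGT"))
```

On the toy input, the single substitution produces two short target-specific substrings that flank the changed base, which is the kind of alignment-free signal the statement refers to.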
“…Since being able to detect dissimilarities is also important in sequence comparison, here we also present a novel notion aiming at discovering differences among similar sequences. This is the notion of sample specific string (SFS) [27]. We show applications of both notions in facing problems motivated by computational pangenomics [13,20].…”
Section: Introduction (mentioning)
confidence: 99%