Biocomputing 2023 2022
DOI: 10.1142/9789811270611_0012

Selecting Clustering Algorithms for Identity-By-Descent Mapping

Abstract: Groups of distantly related individuals who share a short segment of their genome identical-by-descent (IBD) can provide insights about rare traits and diseases in massive biobanks using IBD mapping. Clustering algorithms play an important role in finding these groups accurately and at scale. We set out to analyze the fitness of commonly used, fast and scalable clustering algorithms for IBD mapping applications. We designed a realistic benchmark for local IBD graphs and utilized it to compare the statistical p…

Citations: cited by 2 publications (3 citation statements)
References: 27 publications (33 reference statements)
“…Let haplotypes be nodes and IBD segments be edges in the context of a network graph. Because IBD segments longer than 1.0 cM are infrequent in a population sample, this graph should contain many disconnected haplotype clusters [48]. We also impose the rule that a node in a cluster may only be three edges away from its highest degree node (the haplotype sharing the greatest number of IBD segments).…”
Section: Methods; citation type: mentioning
confidence: 99%
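
The graph construction quoted above can be illustrated with a minimal sketch. It is not the citing authors' code: the function names, the input format (pairs of haplotype labels, one pair per IBD segment), and the decision to simply report the nodes that satisfy the three-edge rule are all assumptions made for illustration. The statement does not say what happens to nodes farther from the hub.

```python
from collections import defaultdict, deque


def build_graph(segments):
    """Adjacency sets from (haplotype_a, haplotype_b) pairs (assumed input format)."""
    adj = defaultdict(set)
    for a, b in segments:
        adj[a].add(b)
        adj[b].add(a)
    return dict(adj)


def connected_components(adj):
    """Split the graph into disconnected haplotype clusters with a breadth-first search."""
    seen, components = set(), []
    for start in adj:
        if start in seen:
            continue
        seen.add(start)
        component, queue = {start}, deque([start])
        while queue:
            node = queue.popleft()
            for neighbor in adj[node]:
                if neighbor not in seen:
                    seen.add(neighbor)
                    component.add(neighbor)
                    queue.append(neighbor)
        components.append(component)
    return components


def within_radius_of_hub(adj, component, max_edges=3):
    """Return the highest-degree node and the nodes at most max_edges away from it.

    Ties in degree are broken by label order, which is an arbitrary choice.
    """
    hub = max(sorted(component), key=lambda n: len(adj[n]))
    dist, queue = {hub: 0}, deque([hub])
    while queue:
        node = queue.popleft()
        if dist[node] == max_edges:
            continue  # do not expand past the allowed radius
        for neighbor in adj[node]:
            if neighbor not in dist:
                dist[neighbor] = dist[node] + 1
                queue.append(neighbor)
    return hub, set(dist)


# Hypothetical toy data: h6 sits four edges from the hub h2.
segments = [("h1", "h2"), ("h2", "h3"), ("h3", "h4"), ("h4", "h5"),
            ("h5", "h6"), ("h2", "h7"), ("h8", "h9")]
adj = build_graph(segments)
clusters = connected_components(adj)
largest = max(clusters, key=len)
hub, kept = within_radius_of_hub(adj, largest)
```

In this toy example the largest cluster has seven haplotypes, and only `h6` lies outside the three-edge radius of the hub `h2`.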
“…Let haplotypes be nodes and IBD segments be edges in the context of a network graph. Because extended IBD is infrequent in a population sample, this graph should contain many disconnected haplotype clusters [80]. Sizes of these haplotype clusters C_1, C_2, … amount to an empirical distribution for IBD cluster size.…”
Section: Methods; citation type: mentioning
confidence: 99%
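
As a rough illustration of how cluster sizes yield an empirical distribution, the sketch below groups haplotypes with a union-find structure and tabulates the fraction of clusters at each size. The function name and input format are hypothetical. Counting haplotypes with no IBD segment as clusters of size 1 is also an assumption, since the statement does not say how singletons are treated.

```python
from collections import Counter


def cluster_size_distribution(haplotypes, segments):
    """Map each cluster size to the fraction of clusters having that size."""
    parent = {h: h for h in haplotypes}

    def find(x):
        # Follow parent pointers to the root, halving the path on the way.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    # Each IBD segment merges the clusters of the two haplotypes it joins.
    for a, b in segments:
        root_a, root_b = find(a), find(b)
        if root_a != root_b:
            parent[root_a] = root_b

    sizes = Counter(find(h) for h in haplotypes)  # root -> cluster size
    counts = Counter(sizes.values())              # size -> number of clusters
    total = sum(counts.values())
    return {size: n / total for size, n in sorted(counts.items())}


# Hypothetical toy data: clusters of size 3, 2, 1 and 1.
haplotypes = ["h1", "h2", "h3", "h4", "h5", "h6", "h7"]
segments = [("h1", "h2"), ("h2", "h3"), ("h4", "h5")]
distribution = cluster_size_distribution(haplotypes, segments)
# distribution == {1: 0.5, 2: 0.25, 3: 0.25}
```

Union-find is used here only because it avoids building adjacency lists. Any connected-components routine would give the same sizes.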
“…Previous approaches to determining multi-individual IBD have generally looked for highly-connected clusters of pairwise IBD, and then added or removed some pairwise IBD to obtain IBD transitivity [6; 32; 33]. This approach scales quadratically with cluster size. In this work, we solve the problem by trimming a fixed genetic distance (e.g.…”
Section: Introduction; citation type: mentioning
confidence: 99%