2020
DOI: 10.1038/s41467-020-19588-x

Identity-by-descent detection across 487,409 British samples reveals fine scale population structure and ultra-rare variant associations

Abstract: Detection of Identical-By-Descent (IBD) segments provides a fundamental measure of genetic relatedness and plays a key role in a wide range of analyses. We develop FastSMC, an IBD detection algorithm that combines a fast heuristic search with accurate coalescent-based likelihood calculations. FastSMC enables biobank-scale detection and dating of IBD segments within several thousands of years in the past. We apply FastSMC to 487,409 UK Biobank samples and detect ~214 billion IBD segments transmitted by shared a…

citations
Cited by 85 publications
(93 citation statements)
references
References 95 publications
“…In addition, we used a minimum length threshold of 3 cM for IBD estimation across our analyses in this manuscript because using hash-based algorithms such as iLASH, GERMLINE, and RaPID, to estimate short IBD segments (<1 cM) can result in an excess of false-positive edges with short segments, e.g., [36]. However, Saada et al have recently proposed FastSMC [37], an IBD detection algorithm that uses coalescent based likelihood estimates to assess the validity of shorter segments. Development of a scalable pipeline where candidate matches found by iLASH’s LSH algorithm are evaluated and scored via the FastSMC method would yield a fast and more accurate approach for short segments.…”
Section: Discussion (mentioning)
confidence: 99%
“…GERMLINE creates a hash table between short, exact matches of haplotypes and extending into longer, fuzzy (i.e., allowing for small SNP mismatches or genotype errors) IBD segments. This "seed and extend" paradigm, leveraging the inherent efficiency of short hashing functions for speedup beyond standard pairwise comparisons has been adopted by subsequent detection algorithms (Shemirani et al, 2019; Nait Saada et al, 2020), and improved efficiency over hidden Markov model (HMM)-based algorithms or simpler string matching approaches. The computational efficiency garnered by GERMLINE allows computational time to scale approximately linearly with the number of samples and genotyped variants.…”
Section: Overview Of Methods (mentioning)
confidence: 99%
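
The "seed and extend" idea described in the statement above can be illustrated with a minimal sketch. This is a toy under stated assumptions, not GERMLINE's or FastSMC's actual implementation: haplotypes are plain strings of '0'/'1' alleles of equal length, seeds are non-overlapping fixed-length windows, and the names `seed_and_extend`, `_extend`, `seed_len`, and `max_mismatch` are hypothetical.

```python
from collections import defaultdict
from itertools import combinations


def _extend(ha, hb, pos, step, budget):
    """Walk from pos in direction step (+1 or -1), tolerating up to
    `budget` mismatching sites; return the last index reached."""
    last = pos
    nxt = pos + step
    while 0 <= nxt < len(ha):
        if ha[nxt] != hb[nxt]:
            if budget == 0:
                break
            budget -= 1
        last = nxt
        nxt += step
    return last


def seed_and_extend(haplotypes, seed_len=4, max_mismatch=0):
    """Toy seed-and-extend matcher (illustrative only).

    haplotypes: dict mapping a sample name to an allele string.
    Returns sorted (name_a, name_b, start, end) tuples, end exclusive.
    """
    n_sites = len(next(iter(haplotypes.values())))

    # Seed step: hash each non-overlapping window, so that haplotypes
    # with an identical window land in the same bucket.
    seeds = set()
    for start in range(0, n_sites - seed_len + 1, seed_len):
        buckets = defaultdict(list)
        for name, hap in haplotypes.items():
            buckets[hap[start:start + seed_len]].append(name)
        for names in buckets.values():
            for a, b in combinations(sorted(names), 2):
                seeds.add((a, b, start))

    # Extend step: grow each seed in both directions. The mismatch budget
    # is applied per direction here, which is a simplification.
    segments = set()
    for a, b, start in seeds:
        ha, hb = haplotypes[a], haplotypes[b]
        left = _extend(ha, hb, start, -1, max_mismatch)
        right = _extend(ha, hb, start + seed_len - 1, +1, max_mismatch)
        segments.add((a, b, left, right + 1))
    return sorted(segments)


# Made-up example data: A and B differ only at the last site,
# and C is the complement of A, so it shares no seed window.
toy_haplotypes = {
    "A": "0101100110",
    "B": "0101100111",
    "C": "1010011001",
}
exact_segments = seed_and_extend(toy_haplotypes, seed_len=4, max_mismatch=0)
fuzzy_segments = seed_and_extend(toy_haplotypes, seed_len=4, max_mismatch=1)
```

Only pairs that share a bucket are ever compared site by site, which is where the speedup over all-pairs comparison comes from in this sketch. Raising `max_mismatch` corresponds loosely to the "fuzzy" extension mentioned in the statement.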
“…Another novel algorithmic extension that builds on IBD detection and that shows high performance in accuracy as well as speed is FastSMC (Nait Saada et al, 2020). FastSMC builds upon the hash table GERMLINE method as a first identification step by also including a validation step that uses an approximate coalescent HMM (Palamara et al, 2018).…”
Section: Overview Of Methods (mentioning)
confidence: 99%
“…Later, the program provided the platform for other uses including the assessment of runs of homozygosity as a measure of autozygosity [152]. Peripolli et al [152] viewed Plink as superior to other programs such as GERMLINE [153] and BEAGLE [154], especially with regard to the analysis of runs of homozygosity. In fact, Plink is suitable for calculating most of the parameters described and it is replacing the use of a wide variety of other programs [155].…”
Section: Diversity Parameters and Software Programs (mentioning)
confidence: 99%