2021
DOI: 10.1038/s41467-021-22910-w

Rapid detection of identity-by-descent tracts for mega-scale datasets

Abstract: The ability to identify segments of genomes identical-by-descent (IBD) is a part of standard workflows in both statistical and population genetics. However, traditional methods for finding local IBD across all pairs of individuals scale poorly, leading to a lack of adoption in very large-scale datasets. Here, we present iLASH, an algorithm based on similarity detection techniques that shows equal or improved accuracy in simulations compared to current leading methods and speeds up analysis by several orders of …

citations
Cited by 24 publications
(26 citation statements)
references
References 41 publications
“…The merged dataset was then statistically phased using Shapeit4 [81]. IBD was called using iLASH using default parameters [82]. For downstream analysis, IBD segments were summed between individuals to create an adjacency matrix, where each row represented a pair of individuals, and each column represented the total genome-wide IBD between those two individuals.…”
Section: Methods
Citation type: mentioning
confidence: 99%
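
The summation step described in the statement above can be sketched as follows. This is a minimal illustration, not the citing authors' code: the input layout (tuples of two sample IDs and a segment length in cM), the function name `total_ibd_per_pair`, and the sample IDs are all assumptions made for the example.

```python
from collections import defaultdict


def total_ibd_per_pair(segments):
    """Sum IBD segment lengths (cM) for each unordered pair of individuals.

    `segments` is a hypothetical iterable of (id_a, id_b, length_cm) tuples;
    it is not meant to mirror any specific tool's output format.
    """
    totals = defaultdict(float)
    for id_a, id_b, length_cm in segments:
        # Sort the IDs so (a, b) and (b, a) accumulate into the same entry.
        pair = tuple(sorted((id_a, id_b)))
        totals[pair] += length_cm
    return dict(totals)


# Toy data: two segments shared by ind1/ind2, one shared by ind1/ind3.
segments = [
    ("ind1", "ind2", 3.5),
    ("ind2", "ind1", 4.0),
    ("ind1", "ind3", 5.25),
]

pair_totals = total_ibd_per_pair(segments)
for pair, total in sorted(pair_totals.items()):
    print(pair, total)
```

Each key of `pair_totals` corresponds to one row (a pair of individuals) and each value to that pair's total genome-wide IBD, matching the row and column layout the statement describes.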
“…They can be divided into two distinct groups: those that need phased data and those that do not. In the first group there is GERMLINE [51] that deals with errors in the genotypes and iLASH [52], RaPID [53], hap-IBD [54], FastSMC [55] and fastIBD [56], all of them reporting improved speed. Also in this group, there is RefinedIBD [57], which does not allow for genotype errors but uses the GERMLINE algorithm for the identification of shared haplotypes exceeding a threshold length.…”
Section: Enhance Your GWAS: The Different Ways to Exploit SNP Array Data
Citation type: mentioning
confidence: 99%
“…To define our IBD clusters, we called pairwise IBD between all ATLAS participants and reference individuals sourced from the 1000 Genomes Project [26], the Simons Genome Diversity Project [27], and the Human Genome Diversity Project [28]. IBD segments were estimated using iLASH [29], identifying in total, over 95 million shared IBD segments. All participants in the biobank had at least one IBD segment detected, with the mean amount of IBD sharing being 14.80cM (IQR: 3.84-21.57cM).…”
Section: Identity By Descent Clustering
Citation type: mentioning
confidence: 99%
“…IBD was called using iLASH [29] with the following parameters: slice_size 350, step_size 350, perm_count 20, shingle_size 15, shingle_overlap 0, bucket_count 5, max_thread 20, match_threshold 0.99, interest_threshold 0.70, min_length 2.9, auto_slice 1, slice_length 2.9, cm_overlap 1, minhash_threshold 55. IBD was called for one chromosome at a time.…”
Section: IBD Calling and Processing
Citation type: mentioning
confidence: 99%
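
For readers who want to reuse the parameter list quoted in the statement above, the sketch below parses it into a dictionary. The names and values are copied from that statement; the helper `parse_params` and the variable names are hypothetical, and nothing here is an assumption about how iLASH itself reads its configuration.

```python
def parse_params(text):
    """Parse a comma-separated list of 'name value' pairs into a dict.

    Values containing a decimal point become floats; all others become ints.
    """
    params = {}
    for item in text.split(","):
        name, value = item.split()
        params[name] = float(value) if "." in value else int(value)
    return params


# Parameter string as quoted in the citation statement.
PARAMS_TEXT = (
    "slice_size 350, step_size 350, perm_count 20, shingle_size 15, "
    "shingle_overlap 0, bucket_count 5, max_thread 20, match_threshold 0.99, "
    "interest_threshold 0.70, min_length 2.9, auto_slice 1, slice_length 2.9, "
    "cm_overlap 1, minhash_threshold 55"
)

params = parse_params(PARAMS_TEXT)
for name, value in params.items():
    print(name, value)
```

Keeping the settings in one dictionary makes it easier to record them alongside results when, as in the statement, IBD is called for one chromosome at a time.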