2023
DOI: 10.1101/2023.03.09.531600
Preprint

A deep population reference panel of tandem repeat variation

Abstract: Tandem repeats (TRs) represent one of the largest sources of genetic variation in humans and are implicated in a range of phenotypes. Here we present a deep characterization of TR variation based on high-coverage whole-genome sequencing from 3,550 diverse individuals from the 1000 Genomes Project and H3Africa cohorts. We develop a method, EnsembleTR, to integrate genotypes from four separate methods resulting in high-quality genotypes at more than 1.7 million TR loci. Our catalog reveals novel sequence feature…

Cited by 12 publications (9 citation statements)
References 83 publications
“…In fact, a CAG repeat in ZFHX3 was already investigated within the German SCA4 family 20 years ago [8]. As the genome harbors approximately 1.7 million polymorphic STR loci [28], using even more recent bioinformatic tools such as ExpansionHunter [17] requires filtering of variants by type before outlier detection analysis to simplify and manage the number of STR loci reviewed. Thus, adaptive sampling allows for a flexible, computationally directed, variant agnostic approach to target candidate regions of high linkage and represents an exciting and important method in the discovery of unsolved causes of inherited neurological disorders.…”
Section: Discussion (mentioning)
confidence: 99%
“…For example, many non-coding variants located in and around the oculocutaneous albinism II (OCA2) gene contribute to hair pigmentation variation, and cannot be identified in the sole study of coding regions (41). Finally, TR variant calling is challenging, with the dozens of available methods yielding unique benefits and drawbacks (42–44). For example, the GangSTR method used herein has an elevated error rate in the context of AC/TG dinucleotide motifs that is not present in other methods that have different shortcomings.…”
Section: Discussion (mentioning)
confidence: 99%
“…Several studies have demonstrated that while overall patterns of repeat variations are highly similar across populations, there are notable exceptions with population-specific patterns (Ziaei Jam et al 2023). For instance, common "CAG" expansions in an intronic repeat within the gene CA10 were found predominantly in African individuals, and the motif usage of an intronic repeat in gene PCBP3 showed substantial differences across modern superpopulations (Ziaei Jam et al 2023; Course et al 2021). Another example is the BEAN1 repeat, which showed population-specific pathogenic expansions (Figure 5) (Ishikawa and Nagai 2019).…”
Section: Discussion (mentioning)
confidence: 99%