2014
DOI: 10.1101/gr.177774.114

The landscape of human STR variation

Abstract: Short tandem repeats are among the most polymorphic loci in the human genome. These loci play a role in the etiology of a range of genetic diseases and have been frequently utilized in forensics, population genetics, and genetic genealogy. Despite this plethora of applications, little is known about the variation of most STRs in the human population. Here, we report the largest-scale analysis of human STR variation to date. We collected information for nearly 700,000 STR loci across more than 1000 individuals …

Citations: cited by 239 publications (301 citation statements)
References: 72 publications (83 reference statements)
“…We focused on 311 European individuals whose LCL expression profiles were measured using RNA-sequencing by the gEUVADIS [9] project and whose whole genomes were sequenced by the 1000 Genomes Project [41]. The STR genotypes were obtained in our previous study [42] in which we created a catalog of STR variation as part of the 1000 Genomes Project using lobSTR, a specialized algorithm for profiling STR variations from high throughput sequencing data [43]. Briefly, lobSTR identifies reads with repetitive sequences that are flanked by non-repetitive segments.…”
Section: Results (mentioning)
confidence: 99%
“…Finally, lobSTR aggregates aligned reads and employs a model of STR-specific sequencing errors to report the maximum likelihood genotype at each locus. lobSTR recovered most (r² = 0.71) of the variation in STR locus lengths in the 1000 Genomes datasets based on large-scale validation using 5,000 STR genotype calls obtained by capillary electrophoresis, the gold standard for STR genotyping [42]. The majority of genotype errors were from dropout of one allele at heterozygote sites due to low sequencing coverage.…”
Section: Results (mentioning)
confidence: 99%
“…Calls based on so few reads may not be accurate even for homozygous germline alleles. Calling heterozygous STR genotypes remains difficult with the modest coverage of most available whole-genome-sequencing data, such as found in the 1000 Genomes Project [12], which becomes even more challenging when potential somatic mutations contribute to a heterogeneous sample population. To illustrate this challenge, consider a heterozygous ~30 bp-STR locus and whole-genome sequencing with 101 bp-reads at 5× coverage – this scenario is likely to yield just three STR-spanning reads (Figure 2).…”
Section: Analytical Tools and Genotyping Methods Continue To Struggle (mentioning)
confidence: 99%
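The arithmetic behind the scenario in the statement above can be sketched roughly as follows. This is a back-of-the-envelope estimate under assumptions that are not in the cited text: reads are sampled uniformly along the genome, a read counts as "STR-spanning" only if it contains the whole repeat plus a hypothetical minimum flank (`min_flank`) on each side, and each read at a heterozygous site comes from either allele with equal probability. The function names are illustrative and do not belong to any genotyping tool.

```python
def expected_spanning_reads(coverage, read_len, str_len, min_flank=0):
    """Expected number of reads that contain the whole STR plus
    `min_flank` bases on each side, assuming uniform read sampling."""
    # Under uniform sampling, reads start at a rate of coverage / read_len per base.
    starts_per_base = coverage / read_len
    # Count the start positions at which a read covers the STR and both flanks.
    spanning_starts = read_len - str_len - 2 * min_flank + 1
    return starts_per_base * max(spanning_starts, 0)


def dropout_probability(n_reads):
    """Probability that all n reads at a heterozygous site come from the
    same allele, assuming each allele is sampled with probability 0.5."""
    if n_reads < 1:
        return 1.0  # no reads at all: neither allele is observed
    return 2 * 0.5 ** n_reads


if __name__ == "__main__":
    # Scenario from the statement: ~30 bp STR, 101 bp reads, 5x coverage.
    # The 5 bp flank requirement is an assumption made for this sketch.
    n = expected_spanning_reads(coverage=5, read_len=101, str_len=30, min_flank=5)
    print(f"expected spanning reads: {n:.2f}")
    print(f"dropout probability with 3 reads: {dropout_probability(3):.2f}")
```

With a 5 bp flank on each side the estimate is 5 × 62 / 101, or roughly three spanning reads, in line with the figure quoted above. Under the equal-sampling assumption, three reads at a heterozygous site all come from one allele a quarter of the time, which illustrates why low coverage makes heterozygous calls difficult.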
“…Three percent of the human genome consists of STRs [9] and 6% of human coding regions are estimated to contain STR variation [10,11]. Recently, the first catalog of genome-wide population-scale human STR variation has appeared [12], opening up new possibilities for understanding the contribution of STRs to human genetic diseases. This catalog, and similar data sources [13], have appeared decades after initial calls for the assessment of the role of STRs in phenotypic variation [14], lagging behind surveys of other genomic elements.…”
Section: The ‘Missing Heritability’ Of Complex Diseases and Str Varia (mentioning)
confidence: 99%