2021
DOI: 10.1371/journal.pcbi.1009254

Using de novo assembly to identify structural variation of eight complex immune system gene regions

Abstract: Driven by the necessity to survive environmental pathogens, the human immune system has evolved exceptional diversity and plasticity, to which several factors contribute including inheritable structural polymorphism of the underlying genes. Characterizing this variation is challenging due to the complexity of these loci, which contain extensive regions of paralogy, segmental duplication and high copy-number repeats, but recent progress in long-read sequencing and optical mapping techniques suggests this proble…

Citations: cited by 24 publications (23 citation statements)
References: 68 publications
“…However, it exhibits a highly skewed allele balance (centered on 0.2, in contrast to the expectation of 0.5) – a symptom of reference bias in read mapping ( Chen et al, 2019 ) – as well as low genotyping rates (31% for CDX in 1KGP, compared with 85% in our study). These biases result in lower estimates of the insertion’s allele frequency in East Asian populations (AF = 0.73 in CDX) and underscore the technical challenges of genotyping at this locus with traditional short-read approaches ( Chin et al, 2020 ; Zhang et al, 2021 ).…”
Section: Results (mentioning)
confidence: 99%
“…The variants identified by Browning et al, 2018 are located in a subregion of the broader IGH locus, downstream of the introgressed SVs, and segregate at high allele frequency in East Asian, European, and American populations ( Figure 4—figure supplement 3 ). The southeast Asian-specific haplotype we identify, which includes the IGHG4 insertion and nearby deletion, may have been challenging to discover due to the difficulties of short-read alignment and genotyping in this region of the genome ( Zhang et al, 2021 ). Indeed, 80.7% of the sequence in the broader IGH locus was filtered out by Browning et al, 2018 through strict masking of aDNA genotypes to remove low-coverage, poorly mapping, or repeat-associated reads ( Figure 4—figure supplement 4 ).…”
Section: Results (mentioning)
confidence: 99%
“…The extent of this under-representation has recently been surveyed by Mikocziova et al. [24]. Recent efforts to generate accurate and complete assemblies of immunoglobulin, T-cell receptor, and natural killer cell receptors in a single individual have exploited state of the art long-read and other mapping and assembly approaches [25]. They have concluded that a full accurate and complete assembly of variation in some of these regions in a single individual is a technical challenge that has not yet been overcome and will require further methodological improvements. These problematic regions of the genome, and others of perhaps similar complexity and biological importance, underscore that the notion that SNP-based GWAS is a comprehensive tool for uncovering genetic variation contributing to disease traits is one in need of re-assessment.…”
Section: GWAS SNP Arrays Fail to Adequately Assess Genetic Variation in Immune Receptors (mentioning)
confidence: 99%