Compared to its predecessors, the Telomere-to-Telomere CHM13 genome adds nearly 200 Mbp of sequence, corrects thousands of structural errors, and unlocks the most complex regions of the human genome to clinical and functional study. Here we demonstrate how the new reference universally improves read mapping and variant calling for 3,202 and 17 globally diverse samples sequenced with short and long reads, respectively. We identify hundreds of thousands of novel variants per sample, a new frontier for evolutionary and biomedical discovery. Simultaneously, the new reference eliminates tens of thousands of spurious variants per sample, including up to a 12-fold reduction of false positives in 269 medically relevant genes. The vast improvement in variant discovery, coupled with population and functional genomic resources, positions T2T-CHM13 to replace GRCh38 as the prevailing reference for human genetics.
Large genomic insertions and deletions are a potent source of functional variation, but are challenging to resolve with short-read sequencing, limiting knowledge of the role of such structural variants (SVs) in human evolution. Here, we used a graph-based method to genotype long-read-discovered SVs in short-read data from diverse human genomes. We then applied an admixture-aware method to identify 220 SVs exhibiting extreme patterns of frequency differentiation—a signature of local adaptation. The top two variants traced to the immunoglobulin heavy chain locus, tagging a haplotype that swept to near fixation in certain Southeast Asian populations, but is rare in other global populations. Further investigation revealed evidence that the haplotype traces to gene flow from Neanderthals, corroborating the role of immune-related genes as prominent targets of adaptive introgression. Our study demonstrates how recent technical advances can help resolve signatures of key evolutionary events that remained obscured within technically challenging regions of the genome.
Gayal (Bos frontalis), also known as mithan or mithun, is a large endangered semi-domesticated bovine that has a limited geographical distribution in the hill-forests of China, Northeast India, Bangladesh, Myanmar, and Bhutan. Many questions about the gayal such as its origin, population history, and genetic basis of local adaptation remain largely unresolved. De novo sequencing and assembly of the whole gayal genome provides an opportunity to address these issues. We report high-depth sequencing, de novo assembly, and annotation of a female Chinese gayal genome. Based on the Illumina genomic sequencing platform, we have generated 350.38 Gb of raw data from 16 different insert-size libraries. A total of 276.86 Gb of clean data is retained after quality control. The assembled genome is about 2.85 Gb with scaffold and contig N50 sizes of 2.74 Mb and 14.41 kb, respectively. Repetitive elements account for 48.13% of the genome. Gene annotation has yielded 26 667 protein-coding genes, of which 97.18% have been functionally annotated. BUSCO assessment shows that our assembly captures 93% (3183 of 4104) of the core eukaryotic genes and 83.1% of vertebrate universal single-copy orthologs. We provide the first comprehensive de novo genome of the gayal. This genetic resource is integral for investigating the origin of the gayal and performing comparative genomic studies to improve understanding of the speciation and divergence of bovine species. The assembled genome could be used as a reference in future population genetic studies of the gayal.
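The scaffold and contig N50 values quoted above are standard contiguity summaries: the N50 is the length of the shortest sequence in the smallest set of longest sequences that together cover at least half of the total assembly length. The following is a minimal sketch of that definition, not the pipeline used by the authors; the function name `n50` and the example lengths are illustrative assumptions.

```python
def n50(lengths):
    """Return the N50 of a collection of contig or scaffold lengths.

    Illustrative helper (hypothetical name): sequences are sorted from
    longest to shortest, and the N50 is the length at which the running
    total first reaches half of the summed length.
    """
    if not lengths:
        raise ValueError("lengths must not be empty")
    ordered = sorted(lengths, reverse=True)
    half_total = sum(ordered) / 2
    running = 0
    for length in ordered:
        running += length
        if running >= half_total:
            return length


# Toy example with made-up lengths (not data from the gayal assembly).
toy_lengths = [2, 2, 2, 3, 3, 4, 8, 8]
toy_n50 = n50(toy_lengths)
```

Because the N50 depends only on the sorted lengths, the same function applies to contigs and to scaffolds; only the input list differs.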
Extra or missing chromosomes, a phenomenon termed aneuploidy, frequently arise during human meiosis and embryonic mitosis and are the leading cause of pregnancy loss, including in the context of in vitro fertilization (IVF). While meiotic aneuploidies affect all cells and are deleterious, mitotic errors generate mosaicism, which may be compatible with healthy live birth. Large-scale abnormalities such as triploidy and haploidy also contribute to adverse pregnancy outcomes, but remain hidden from standard sequencing-based approaches to preimplantation genetic testing for aneuploidy (PGT-A). The ability to reliably distinguish meiotic and mitotic aneuploidies, as well as abnormalities in genome-wide ploidy, may thus prove valuable for enhancing IVF outcomes. Here, we describe a statistical method for distinguishing these forms of aneuploidy based on analysis of low-coverage whole-genome sequencing data, which is the current standard in the field. Our approach overcomes the sparse nature of the data by leveraging allele frequencies and linkage disequilibrium (LD) measured in a population reference panel. The method, which we term LD-informed PGT-A (LD-PGTA), retains high accuracy down to coverage as low as 0.05× and, at higher coverage, can also distinguish between meiosis I and meiosis II errors based on signatures spanning the centromeres. LD-PGTA provides fundamental insight into the origins of human chromosome abnormalities, as well as a practical tool with the potential to improve genetic testing during IVF.
Residual feed intake (RFI) is a measure of feed efficiency. Pigs with low RFI have reduced feed costs without compromising their growth. For marker-assisted selection, it is helpful to identify genes or genetic markers associated with RFI in animals with improved feed efficiency at an early age. Using Illumina's PorcineSNP60 BeadChip, we performed a pilot genome-wide association study of 217 Junmu No. 1 white male pigs phenotyped for RFI. Two-step and one-step methods were used separately to identify associated SNPs. Both methods obtained similar results. Twelve SNPs were identified as significantly associated with RFI at a Bonferroni adjusted P-level < 9.7 × 10 , and 204 were found to have suggestive (moderately significant) association with RFI at P < 5 × 10 . NMBR, KCTD16, ASGR1, PRKCQ, PITRM1, TIAM1 and RND3 were identified as candidate genes for RFI.
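A Bonferroni-adjusted significance level, as referred to above, divides the family-wise error rate by the number of tests performed. The sketch below illustrates that arithmetic only; the function names, the error rate of 0.05, and the SNP count of 50,000 are hypothetical placeholders and are not values taken from the study.

```python
def bonferroni_threshold(alpha, n_tests):
    """Per-test significance threshold under Bonferroni correction."""
    if n_tests <= 0:
        raise ValueError("n_tests must be positive")
    return alpha / n_tests


def is_significant(p_values, alpha, n_tests):
    """Flag each p-value that falls below the Bonferroni threshold."""
    threshold = bonferroni_threshold(alpha, n_tests)
    return [p < threshold for p in p_values]


# Hypothetical inputs for illustration only (not the study's numbers).
ASSUMED_ALPHA = 0.05
ASSUMED_N_SNPS = 50_000
example_threshold = bonferroni_threshold(ASSUMED_ALPHA, ASSUMED_N_SNPS)
example_flags = is_significant([5e-7, 2e-6], ASSUMED_ALPHA, ASSUMED_N_SNPS)
```

A "suggestive" level is usually a looser, study-specific cutoff rather than a quantity derived from this formula.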
The dominant white coat colour of farmed blue fox is inherited as a monogenic autosomal dominant trait and is suggested to be embryonic lethal in the homozygous state. In this study, the transcripts of KIT were identified by RT-PCR for a dominant white fox and a normal blue fox. Sequence analysis showed that the KIT transcript in the normal blue fox contained the full-length coding sequence of 2919 bp (GenBank Acc. No KF530833), but in the dominant white individual, a truncated isoform lacking the entire exon 12 was specifically co-expressed with the normal transcript. Genomic DNA sequencing revealed that a single nucleotide polymorphism (c.1867+1G>T) in intron 12 appeared only in the dominant white individuals, whereas a 1-bp ins/del polymorphism in the same intron was present in individuals of both coat colours. Genotyping of the SNP by PCR-RFLP in 185 individuals showed that all 90 normal blue foxes were homozygous for the G allele and all dominant white individuals were heterozygous. Because the truncated protein has a deletion of 35 amino acids and an amino acid replacement (p.Pro623Ala) located in the conserved ATP binding domain, we propose that the mutant receptor lacks tyrosine kinase activity. These findings reveal that the base substitution at the first nucleotide of intron 12 of the KIT gene, resulting in skipping of exon 12, is a causative mutation responsible for the dominant white phenotype of blue fox.
Additional supporting information may be found in the online version of this article. Figure S1. The white spotting phenotype in a Weimaraner puppy at 6 weeks (A, B) and 7 weeks of age (C), compared with the littermates (D).