Cristian Groza scite author profile

The Human Pangenome Reference Consortium (HPRC) presents a first draft human pangenome reference. The pangenome contains 47 phased, diploid assemblies from a cohort of genetically diverse individuals. These assemblies cover more than 99% of the expected sequence and are more than 99% accurate at the structural and base-pair levels. Based on alignments of the assemblies, we generated a draft pangenome that captures known variants and haplotypes, reveals novel alleles at structurally complex loci, and adds 119 million base pairs of euchromatic polymorphic sequence and 1,529 gene duplications relative to the existing reference, GRCh38. Roughly 90 million of the additional base pairs derive from structural variation. Using our draft pangenome to analyze short-read data reduces errors when discovering small variants by 34% and boosts the detected structural variants per haplotype by 104% compared to GRCh38-based workflows, and by 34% compared to using previous diversity sets of genome assemblies.

A draft human pangenome reference

Liao

Asri

Ebler

et al. 2023

Nature

343

202

Here the Human Pangenome Reference Consortium presents a first draft of the human pangenome reference. The pangenome contains 47 phased, diploid assemblies from a cohort of genetically diverse individuals. These assemblies cover more than 99% of the expected sequence in each genome and are more than 99% accurate at the structural and base pair levels. Based on alignments of the assemblies, we generate a draft pangenome that captures known variants and haplotypes and reveals new alleles at structurally complex loci. We also add 119 million base pairs of euchromatic polymorphic sequences and 1,115 gene duplications relative to the existing reference GRCh38. Roughly 90 million of the additional base pairs are derived from structural variation. Using our draft pangenome to analyse short-read data reduced small variant discovery errors by 34% and increased the number of structural variants detected per haplotype by 104% compared with GRCh38-based workflows, which enabled the typing of the vast majority of structural variant alleles per sample.

Epigenetic variation impacts ancestry-associated differences in the transcriptional response to influenza infection

Aracena

Lin

Luo

et al. 2022

Preprint

Summary: Humans display remarkable inter-individual variation in immune response when exposed to identical immune challenges. Yet, our understanding of the genetic and epigenetic factors contributing to such variation remains limited. Here we carried out in-depth genetic, epigenetic, and transcriptional profiling on primary macrophages derived from a panel of European and African-ancestry individuals before and after infection with influenza A virus (IAV). We show that baseline epigenetic profiles are strongly predictive of the transcriptional response to IAV across individuals, and that ancestry-associated differences in gene expression are tightly coupled with variation in enhancer activity. Quantitative trait locus (QTL) mapping revealed highly coordinated genetic effects on gene regulation with many cis-acting genetic variants impacting concomitantly gene expression and multiple epigenetic marks. These data reveal that ancestry-associated differences in the epigenetic landscape are genetically controlled, even more so than variation in gene expression. Lastly, we show that among QTL variants that colocalized with immune-disease loci, only 7% were gene expression QTL, the remaining corresponding to genetic variants that impact one or more epigenetic marks, which stresses the importance of considering molecular phenotypes beyond gene expression in disease-focused studies.

Pangenome graph construction from genome alignments with Minigraph-Cactus

et al. 2023

Personalized and graph genomes reveal missing signal in epigenomic data

et al. 2020

Background: Epigenomic studies that use next generation sequencing experiments typically rely on the alignment of reads to a reference sequence. However, because of genetic diversity and the diploid nature of the human genome, we hypothesize that using a generic reference could lead to incorrectly mapped reads and bias downstream results. Results: We show that accounting for genetic variation using a modified reference genome or a de novo assembled genome can alter histone H3K4me1 and H3K27ac ChIP-seq peak calls either by creating new personal peaks or by the loss of reference peaks. Using permissive cutoffs, modified reference genomes are found to alter approximately 1% of peak calls while de novo assembled genomes alter up to 5% of peaks. We also show statistically significant differences in the amount of reads observed in regions associated with the new, altered, and unchanged peaks. We report that short insertions and deletions (indels), followed by single nucleotide variants (SNVs), have the highest probability of modifying peak calls. We show that using a graph personalized genome represents a reasonable compromise between modified reference genomes and de novo assembled genomes. We demonstrate that altered peaks have a genomic distribution typical of other peaks. Conclusions: Analyzing epigenomic datasets with personalized and graph genomes allows the recovery of new peaks enriched for indels and SNVs. These altered peaks are more likely to differ between individuals and, as such, could be relevant in the study of various human phenotypes.

