Background
Pistachio (Pistacia vera), one of the most important commercial nut crops worldwide, is highly adaptable to abiotic stresses and is tolerant to drought and salt stresses.
Results
Here, we provide a draft de novo genome of pistachio as well as large-scale genome resequencing. Comparative genomic analyses reveal stress adaptation of pistachio is likely attributable to the expanded cytochrome P450 and chitinase gene families. Particularly, a comparative transcriptomic analysis shows that the jasmonic acid (JA) biosynthetic pathway plays an important role in salt tolerance in pistachio. Moreover, we resequence 93 cultivars and 14 wild P. vera genomes and 35 closely related wild Pistacia genomes, to provide insights into population structure, genetic diversity, and domestication. We find that frequent genetic admixture occurred among the different wild Pistacia species. Comparative population genomic analyses reveal that pistachio was domesticated about 8000 years ago and suggest that key genes for domestication related to tree and seed size experienced artificial selection.
Conclusions
Our study provides insight into genetic underpinning of local adaptation and domestication of pistachio. The Pistacia genome sequences should facilitate future studies to understand the genetic basis of agronomically and environmentally related traits of desert crops.
Electronic supplementary material
The online version of this article (10.1186/s13059-019-1686-3) contains supplementary material, which is available to authorized users.
The geographic origin and migration of the brown rat (Rattus norvegicus) remain subjects of considerable debate. In this study, we sequenced whole genomes of 110 wild brown rats with a diverse world-wide representation. We reveal that brown rats migrated out of southern East Asia, rather than northern Asia as formerly suggested, into the Middle East and then to Europe and Africa, thousands of years ago. Comparison of genomes from different geographical populations reveals that many genes involved in the immune system experienced positive selection in the wild brown rat.
The Y chromosome plays key roles in male fertility and reflects the evolutionary history of paternal lineages. Here, we present a de novo genome assembly of the Hu sheep with the first draft assembly of the ovine Y chromosome (oMSY), using nanopore sequencing and Hi-C technologies. The oMSY that we generated spans 10.6 Mb, from which 775 Y-SNPs were identified by applying a large panel of whole-genome sequences from worldwide sheep and wild Iranian mouflons. Three major paternal lineages (HY1a, HY1b and HY2) were defined across domestic sheep, of which HY2 was newly detected. Surprisingly, HY2 forms a monophyletic clade with the Iranian mouflons and is highly divergent from both HY1a and HY1b. Demographic analysis of Y chromosomes, mitochondrial and nuclear genomes confirmed that HY2 and the maternal counterpart of lineage C represented a distinct wild mouflon population in Iran that diverged from the direct ancestor of domestic sheep, the wild mouflons in Southeastern Anatolia. Our results suggest that wild Iranian mouflons had introgressed into domestic sheep and thereby introduced this Iranian mouflon-specific lineage carrying HY2 to both East Asian and African sheep populations.
Structural variations (SVs) are a major contributor to genetic diversity and phenotypic variation; however, their prevalence and functions in domestic animals are largely unexplored. Here, we assembled 26 haplotype-resolved genome assemblies from 13 genetically diverse sheep breeds using PacBio HiFi sequencing. We then constructed an ovine graph pan-genome and demonstrated its advantage in discovering 142,593 biallelic SVs (insertions and deletions), 7,028 divergent alleles and 13,419 multiallelic variations with high accuracy and sensitivity. To link the SVs to genotypes, we genotyped the SVs in 687 resequenced individuals of domestic and wild sheep using a graph-based approach and identified numerous population-stratified variants, of which expression-associated SVs were detected by integrating RNA-seq data. Taking the varying sheep tail morphology as an example, we located a putative causative insertion in the HOXB13 gene responsible for the long tail and reported multiple large SVs associated with the fat tail. Beyond generating a benchmark resource for ovine structural variants, our study also highlighted that population genetic analysis based on a graph pan-genome rather than a reference genome will greatly benefit animal genetic research.
Structural variations (SVs) are a major contributor to genetic diversity and phenotypic variations, but their prevalence and functions in domestic animals are largely unexplored. Here we generated high-quality genome assemblies for 15 individuals from genetically diverse sheep breeds using Pacific Biosciences (PacBio) high-fidelity sequencing, discovering 130.3 Mb nonreference sequences, from which 588 genes were annotated. A total of 149,158 biallelic insertions/deletions, 6,531 divergent alleles, and 14,707 multiallelic variations with precise breakpoints were discovered. The SV spectrum is characterized by an excess of derived insertions compared to deletions (94,422 vs. 33,571), suggesting recent active LINE expansions in sheep. Nearly half of the SVs display low to moderate linkage disequilibrium with surrounding single-nucleotide polymorphisms (SNPs) and most SVs cannot be tagged by SNP probes from the widely used ovine 50K SNP chip. We identified 865 population-stratified SVs including 122 SVs possibly derived in the domestication process among 690 individuals from sheep breeds worldwide. A novel 168-bp insertion in the 5′ untranslated region (5′ UTR) of HOXB13 is found at high frequency in long-tailed sheep. Further genome-wide association study and gene expression analyses suggest that this mutation is causative for the long-tail trait. In summary, we have developed a panel of high-quality de novo assemblies and present a catalog of structural variations in sheep. Our data capture abundant candidate functional variations that were previously unexplored and provide a fundamental resource for understanding trait biology in sheep.