Structural variations (SVs) are a major contributor to genetic diversity and phenotypic variation; however, their prevalence and functions in domestic animals are largely unexplored. Here, we generated 26 haplotype-resolved genome assemblies from 13 genetically diverse sheep breeds using PacBio HiFi sequencing. We then constructed an ovine graph pan-genome and demonstrated its advantage in discovering 142,593 biallelic SVs (insertions and deletions), 7,028 divergent alleles and 13,419 multiallelic variations with high accuracy and sensitivity. To link the SVs to genotypes, we genotyped the SVs in 687 resequenced individuals of domestic and wild sheep using a graph-based approach and identified numerous population-stratified variants, among which expression-associated SVs were detected by integrating RNA-seq data. Taking the varying sheep tail morphology as an example, we located a putative causative insertion in the HOXB13 gene responsible for the long tail and reported multiple large SVs associated with the fat tail. Beyond generating a benchmark resource for ovine structural variants, our study also highlights that population genetic analysis based on a graph pan-genome rather than a reference genome will greatly benefit animal genetic research.
Structural variations (SVs) are a major contributor to genetic diversity and phenotypic variation, but their prevalence and functions in domestic animals are largely unexplored. Here we generated high-quality genome assemblies for 15 individuals from genetically diverse sheep breeds using Pacific Biosciences (PacBio) high-fidelity sequencing, discovering 130.3 Mb of nonreference sequence, from which 588 genes were annotated. A total of 149,158 biallelic insertions/deletions, 6,531 divergent alleles, and 14,707 multiallelic variations with precise breakpoints were discovered. The SV spectrum is characterized by an excess of derived insertions compared to deletions (94,422 vs. 33,571), suggesting recent active LINE expansions in sheep. Nearly half of the SVs display low to moderate linkage disequilibrium with surrounding single-nucleotide polymorphisms (SNPs), and most SVs cannot be tagged by SNP probes from the widely used ovine 50K SNP chip. We identified 865 population-stratified SVs, including 122 SVs possibly derived in the domestication process, among 690 individuals from sheep breeds worldwide. A novel 168-bp insertion in the 5′ untranslated region (5′ UTR) of HOXB13 is found at high frequency in long-tailed sheep. Further genome-wide association study and gene expression analyses suggest that this mutation is causative for the long-tail trait. In summary, we have developed a panel of high-quality de novo assemblies and present a catalog of structural variations in sheep. Our data capture abundant candidate functional variations that were previously unexplored and provide a fundamental resource for understanding trait biology in sheep.
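The abstract above reports that many SVs show only low to moderate linkage disequilibrium with nearby SNPs. As an illustration only, the following is a minimal sketch of the standard r² statistic between two biallelic loci, assuming phased haplotypes coded 0/1. The function name `ld_r2` and the toy haplotypes are hypothetical, and the study's own LD pipeline is not described here.

```python
import math


def ld_r2(hap_a, hap_b):
    """Return r^2 between two biallelic loci from phased haplotypes coded 0/1.

    hap_a and hap_b are equal-length sequences: one entry per sampled
    chromosome, 1 for the alternate allele (e.g. SV present), 0 otherwise.
    """
    if len(hap_a) != len(hap_b) or len(hap_a) == 0:
        raise ValueError("haplotype vectors must be non-empty and equal length")
    n = len(hap_a)
    p_a = sum(hap_a) / n  # alternate-allele frequency at locus A
    p_b = sum(hap_b) / n  # alternate-allele frequency at locus B
    # Frequency of chromosomes carrying the alternate allele at both loci.
    p_ab = sum(1 for a, b in zip(hap_a, hap_b) if a == 1 and b == 1) / n
    d = p_ab - p_a * p_b  # coefficient of linkage disequilibrium
    denom = p_a * (1 - p_a) * p_b * (1 - p_b)
    if denom == 0:
        return math.nan  # undefined when either locus is monomorphic
    return d * d / denom


# Hypothetical toy data: one SV and two neighbouring SNPs on 8 chromosomes.
sv = [1, 1, 0, 0, 1, 0, 1, 0]
snp_tagging = [1, 1, 0, 0, 1, 0, 1, 0]  # identical pattern, tags the SV
snp_weak = [1, 0, 1, 0, 1, 0, 1, 0]     # partially correlated pattern

r2_tagging = ld_r2(sv, snp_tagging)
r2_weak = ld_r2(sv, snp_weak)
```

An SV whose best r² with every chip SNP stays low cannot be imputed or "tagged" from that chip, which is the situation the abstract describes for most SVs.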
Background: Sheep were among the first animals to be domesticated. They are raised all over the world, produce a major share of animal-based protein for human consumption, and play an important role in the agricultural economy. Iran is one of the important locations for sheep genetic resources in the world. Here, we compared Illumina Ovine SNP50 BeadChip data from three Iranian local breeds (Moghani, Afshari and Gezel), a population that has not yet undergone artificial breeding programs, with data from five other sheep breeds, namely East Friesian white, East Friesian brown, Lacaune, Dorset Horn and Texel, to detect genetic mechanisms underlying economic traits and adaptation to harsh environments in sheep. Results: To identify genomic regions that have been targeted by positive selection, we used the fixation index (Fst) and nucleotide diversity (Pi) statistics. Further analysis indicated candidate genes involved in several important traits: wool production, including wool crimp (PTPN3, NBEA and KRTAP20-2 genes), fiber diameter (PIK3R4 gene), hair follicle development (LHX2 gene) and fiber growth and development (COL17A1 gene); adaptation to hot arid environments (CORIN gene); adaptation to water deficit (CPQ gene); and heat stress (PLCB4, FAM107B, NBEA, PIK3C2B and USP43 genes). Conclusions: We detected several candidate genes related to wool production traits and adaptation to hot arid environments in sheep that may be applicable to inbreeding goals. Our findings are not only consistent with the results of previous research but also identify a number of novel candidate genes related to the studied traits. However, more work will be needed to confirm the phenotype-genotype relationships of the genes identified in our study.
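The selection scan above rests on two statistics, Fst and Pi. The sketch below shows one simple way to compute them at a single biallelic SNP, assuming Wright's heterozygosity-based definition of Fst for two equally weighted populations and an unbiased per-site estimate of Pi. The abstract does not state which estimator or window size the authors used, so the function names and input values here are purely illustrative.

```python
import math


def heterozygosity(p):
    """Expected heterozygosity 2p(1-p) at a biallelic site with frequency p."""
    return 2.0 * p * (1.0 - p)


def fst_two_pops(p1, p2):
    """Wright's Fst = (H_T - H_S) / H_T for two equally weighted populations.

    p1 and p2 are the frequencies of the same allele in each population.
    """
    h_s = (heterozygosity(p1) + heterozygosity(p2)) / 2.0  # mean within-pop
    p_bar = (p1 + p2) / 2.0
    h_t = heterozygosity(p_bar)  # heterozygosity of the pooled population
    if h_t == 0.0:
        return math.nan  # site is monomorphic in both populations
    return (h_t - h_s) / h_t


def site_pi(p, n_chromosomes):
    """Unbiased per-site nucleotide diversity from allele frequency p.

    n_chromosomes is the number of sampled chromosomes (2 x individuals).
    """
    if n_chromosomes < 2:
        raise ValueError("need at least two chromosomes")
    return n_chromosomes / (n_chromosomes - 1) * heterozygosity(p)


# Hypothetical allele frequencies at one SNP in two breeds.
fst_divergent = fst_two_pops(0.9, 0.1)  # strongly differentiated site
fst_shared = fst_two_pops(0.3, 0.3)     # identical frequencies
pi_example = site_pi(0.5, 10)
```

In a genome scan these per-site values are usually averaged over sliding windows, and windows with high Fst together with low Pi in one group are flagged as candidate regions under positive selection.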
Ovulation rate and prolificacy are the most important reproductive traits and have a major impact on the efficiency of lamb meat production. Here, we compared the whole genomes of Romanov sheep, known as one of the highly prolific breeds, and four other sheep breeds, namely Assaf, Awassi, Cambridge and British du cher, to identify genetic mechanisms underlying prolificacy in sheep. Selection signature analysis revealed 637 and 477 protein-coding genes under positive selection from FST and nucleotide diversity (Pi) statistics, respectively. Further analysis showed that several candidate genes, including LEPR, PDGFRL and KLF5, are involved in sheep prolificacy. The candidate genes identified in the selected regions are novel, provide new insights into the genetic mechanisms underlying prolificacy in sheep, and can be useful in sheep breeding programmes to develop improved breeds with high reproductive efficiency. Keywords: comparative genomics, positive selection, reproductive traits, Romanov sheep, whole genome sequence
The picture of dog mtDNA diversity obtained from wide geographical sampling, but with a small number of individuals per region or breed, displayed little geographical correlation and a high degree of haplotype sharing between very distant breeds. For a clearer picture, we extensively surveyed Iranian native dogs (n = 305) in comparison with published European (n = 443) and Southwest Asian (n = 195) dogs. Twelve haplotypes related to haplogroups A, B and C were shared by Iranian, European, Southwest Asian and East Asian dogs. In Iran, haplotype and nucleotide diversities were highest in the east, southeast and northwest populations, while the western population had the lowest. Sarabi and Saluki dog populations can be assigned to haplogroups A, B, C and D; Qahderijani and Kurdi to haplogroups A, B and C; Torkaman to haplogroups A, B and D; and Sangsari and Fendo to haplogroups A and B, respectively. Evaluation of population differentiation using pairwise FST generally revealed no clear population structure in most Iranian dog populations. The genetic signal of a recent demographic expansion was detected in the East and Southeast populations. Further, in accordance with previous studies on dog-wolf hybridization as the origin of haplogroup d2, the highest number of d2 haplotypes in Iranian dogs compared with other areas of the Mediterranean basin suggests Iran as the probable center of its origin. Historical evidence shows that the Silk Road linked Iran to countries in South East Asia and other parts of the world, which probably influenced effective gene flow within Iran and these regions. The medium nucleotide diversity observed in Iranian dogs calls for appropriate management techniques to increase effective population size.
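The haplotype and nucleotide diversities compared across regions above are commonly computed with Nei's estimators. The following is a minimal sketch under that assumption, using aligned sequences of equal length with no gaps. The function names and the four toy sequences are hypothetical and are not data from the study.

```python
from collections import Counter
from itertools import combinations


def haplotype_diversity(seqs):
    """Nei's haplotype diversity: n/(n-1) * (1 - sum of squared frequencies)."""
    n = len(seqs)
    if n < 2:
        raise ValueError("need at least two sequences")
    counts = Counter(seqs)  # each distinct sequence is one haplotype
    sum_sq = sum((c / n) ** 2 for c in counts.values())
    return n / (n - 1) * (1.0 - sum_sq)


def nucleotide_diversity(seqs):
    """Mean number of pairwise differences per site across all sequence pairs."""
    n = len(seqs)
    if n < 2:
        raise ValueError("need at least two sequences")
    length = len(seqs[0])
    if length == 0 or any(len(s) != length for s in seqs):
        raise ValueError("sequences must be aligned, non-empty, equal length")
    total_diffs = sum(
        sum(a != b for a, b in zip(s, t)) for s, t in combinations(seqs, 2)
    )
    n_pairs = n * (n - 1) // 2
    return total_diffs / n_pairs / length


# Hypothetical aligned mtDNA fragments from four dogs.
toy_seqs = ["ACGT", "ACGT", "ACGA", "TCGA"]
h = haplotype_diversity(toy_seqs)
pi = nucleotide_diversity(toy_seqs)
```

Haplotype diversity only asks whether two sampled sequences differ, while nucleotide diversity also weighs how many sites differ, so a population can score high on the first and modest on the second.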