The decrease in sequencing cost and increased sophistication of assembly algorithms for short-read platforms have resulted in a sharp increase in the number of species with genome assemblies. However, these assemblies are highly fragmented, with many gaps, ambiguities, and errors, impeding downstream applications. We demonstrate the current state of the art for de novo assembly using the domestic goat (Capra hircus), based on long reads for contig formation, short reads for consensus validation, and scaffolding by optical and chromatin interaction mapping. These combined technologies produced the most continuous de novo mammalian assembly to date, with chromosome-length scaffolds and only 649 gaps. Our assembly represents a ~400-fold improvement in continuity due to properly assembled gaps compared to the previously published C. hircus assembly, and better resolves repetitive structures longer than 1 kb, representing the largest repeat family and immune gene complex ever produced for an individual of a ruminant species.
Goats and sheep are versatile domesticates that have been integrated into diverse environments and production systems. Natural and artificial selection have shaped the variation in the two species, but natural selection has played the major role among indigenous flocks. To investigate signals of natural selection, we analyzed genotype data generated using the caprine and ovine 50K SNP BeadChips from Barki goats and sheep that are indigenous to a hot arid environment in Egypt's Coastal Zone of the Western Desert. We identified several candidate regions under selection that spanned 119 genes. A majority of the genes were involved in multiple signaling and signal transduction pathways in a wide variety of cellular and biochemical processes. In particular, we identified selection signatures spanning several genes that directly or indirectly influence traits for adaptation to hot arid environments, such as thermo-tolerance (melanogenesis) (FGF2, GNAI3, PLCB1), body size and development (BMP2, BMP4, GJA3, GJB2), energy and digestive metabolism (MYH, TRHDE, ALDH1A3), and nervous and autoimmune response (GRIA1, IL2, IL7, IL21, IL1R1). We also identified eight common candidate genes under selection in the two species and a shared selection signature that spanned a segment on caprine chromosome 12 and ovine chromosome 10 that is syntenic to bovine chromosome 12, most likely providing evidence for selection in a common environment in two different but closely related species. Our study highlights the importance of indigenous livestock as model organisms for investigating selection sweeps and genome-wide association mapping.
The success of Genome Wide Association Studies in the discovery of sequence variation linked to complex traits in humans has increased interest in high throughput SNP genotyping assays in livestock species. Primary goals are QTL detection and genomic selection. The purpose here was to design a 50–60,000 SNP chip for goats. The success of a moderate density SNP assay depends on reliable bioinformatic SNP detection procedures, the technological success rate of the SNP design, even spacing of SNPs on the genome and selection of Minor Allele Frequencies (MAF) suitable for use in diverse breeds. Through the federation of three SNP discovery projects consolidated as the International Goat Genome Consortium, we have identified approximately twelve million high quality SNP variants in the goat genome, stored in a database together with their biological and technical characteristics. These SNPs were identified within and between six breeds (meat, milk and mixed): Alpine, Boer, Creole, Katjang, Saanen and Savanna, comprising a total of 97 animals. Whole genome and Reduced Representation Library sequences were aligned on >10 kb scaffolds of the de novo goat genome assembly. The 60,000 selected SNPs, evenly spaced on the goat genome, were submitted for oligo manufacturing (Illumina, Inc) and published in dbSNP along with flanking sequences and map position on goat assemblies (i.e. scaffolds and pseudo-chromosomes), sheep genome V2 and cattle UMD3.1 assembly. Ten breeds were then used to validate the SNP content, and 52,295 loci could be successfully genotyped and used to generate a final cluster file. The combined strategy of using mainly whole genome Next Generation Sequencing and mapping on a contig genome assembly, complemented with Illumina design tools, proved to be efficient in producing this GoatSNP50 chip. Advances in use of molecular markers are expected to accelerate goat genomic studies in coming years.
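The two selection criteria named above, minor allele frequency and even spacing, can be illustrated with a minimal Python sketch. This is not the consortium's actual pipeline: the fixed-window binning, the function names, the window size and the MAF threshold are all assumptions chosen for illustration.

```python
def minor_allele_frequency(ref_count, alt_count):
    """Return the frequency of the rarer allele at a biallelic site."""
    total = ref_count + alt_count
    if total == 0:
        return 0.0
    return min(ref_count, alt_count) / total


def select_evenly_spaced(snps, window, min_maf):
    """Keep at most one SNP per fixed-size window (hypothetical strategy).

    snps: iterable of (position, maf) pairs on one chromosome or scaffold.
    Within each window, the SNP with the highest MAF that passes
    the min_maf threshold is retained.
    """
    best = {}
    for pos, maf in snps:
        if maf < min_maf:
            continue  # too rare to be informative across breeds
        bin_index = pos // window
        if bin_index not in best or maf > best[bin_index][1]:
            best[bin_index] = (pos, maf)
    return [best[b] for b in sorted(best)]


# Toy candidates (made-up positions and frequencies)
candidates = [(1000, 0.05), (20000, 0.30), (30000, 0.45),
              (60000, 0.20), (95000, 0.10)]
selected = select_evenly_spaced(candidates, window=50000, min_maf=0.1)
```

With these toy values the sketch keeps one SNP in each 50 kb window. A real design would also weigh the assay design score and the quality of flanking sequence, which this sketch ignores.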
Goats (Capra hircus) are an important farm animal species. Copy number variation (CNV) represents a major source of genomic structural variation. We investigated the diversity of CNV distribution in goats using CaprineSNP50 genotyping data generated by the ADAPTmap Project. We identified 6286 putative CNVs in 1023 samples from 50 goat breeds using PennCNV. These CNVs were merged into 978 CNV regions, spanning ~262 Mb of total length and corresponding to ~8.96% of the goat genome. We then divided the samples into six subgroups according to geographic distribution and constructed a comparative CNV map. Our results revealed a population differentiation in CNV across different geographical areas, including Western Asia, Eastern Mediterranean, Alpine & Northern Europe, Madagascar, Northwestern Africa, and Southeastern Africa groups. The results of a cluster heatmap analysis based on the CNV count per individual across different groups were generally consistent with those generated from the SNP data, likely reflecting the population history of different goat breeds. We sought to determine the gene content of these CNV events and found several important CNV-overlapping genes (e.g. EDNRA, ADAMTS20, ASIP, KDM5B, ADAM8, DGAT1, CHRNB1, CLCN7, and EXOSC4), which are involved in local adaptations such as coat color, muscle development, metabolic processes, osteopetrosis, and embryonic development. Therefore, this research generated an extensive CNV map in the worldwide population of goats, which offers novel insight into the goat genome and its functional annotation.
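The step of merging individual CNV calls into CNV regions can be sketched as a per-chromosome interval merge. The abstract does not state the merging rule used, so the overlap criterion here (any overlap or direct adjacency, with half-open coordinates) is an assumption, as are the function names and the toy data.

```python
from collections import defaultdict


def merge_cnvs(calls):
    """Merge overlapping CNV calls into CNV regions.

    calls: iterable of (chrom, start, end) with half-open coordinates.
    Calls on the same chromosome that overlap or touch are merged
    (an assumed rule, not necessarily the one used in the study).
    """
    by_chrom = defaultdict(list)
    for chrom, start, end in calls:
        by_chrom[chrom].append((start, end))

    regions = []
    for chrom in sorted(by_chrom):
        intervals = sorted(by_chrom[chrom])
        cur_start, cur_end = intervals[0]
        for start, end in intervals[1:]:
            if start <= cur_end:
                # Overlaps or touches the current region: extend it
                cur_end = max(cur_end, end)
            else:
                regions.append((chrom, cur_start, cur_end))
                cur_start, cur_end = start, end
        regions.append((chrom, cur_start, cur_end))
    return regions


def covered_fraction(regions, genome_length):
    """Fraction of the genome spanned by the merged regions."""
    total = sum(end - start for _, start, end in regions)
    return total / genome_length


# Toy calls (made-up coordinates)
calls = [("chr1", 100, 500), ("chr1", 400, 900),
         ("chr1", 2000, 2500), ("chr2", 50, 150)]
regions = merge_cnvs(calls)
```

As a consistency check on the reported figures, ~262 Mb at ~8.96% implies a reference length of roughly 262 / 0.0896 ≈ 2,900 Mb, which is a derived estimate rather than a number given in the abstract.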