High-quality and complete reference genome assemblies are fundamental for the application of genomics to biology, disease, and biodiversity conservation. However, such assemblies are only available for a few non-microbial species 1-4. To address this issue, the international Genome 10K (G10K) consortium 5,6 has worked over a five-year period to evaluate and develop cost-effective methods for assembling the most accurate and complete reference genomes to date. Here we summarize these developments, introduce a set of quality standards, and present lessons learned from sequencing and assembling 16 species representing major vertebrate lineages (mammals, birds, reptiles, amphibians, teleost fishes and cartilaginous fishes). We confirm that long-read sequencing technologies are essential for maximizing genome quality and that unresolved complex repeats and haplotype heterozygosity are major sources of error in assemblies. Our new assemblies identify and correct substantial errors in some of the best historical reference genomes. Adopting these lessons, we have embarked on the Vertebrate Genomes Project (VGP), an effort to generate high-quality, complete reference genomes for all ~70,000 extant vertebrate species and help enable a new era of discovery across the life sciences.
High-quality and complete reference genome assemblies are fundamental for the application of genomics to biology, disease, and biodiversity conservation. However, such assemblies are available for only a few non-microbial species1–4. To address this issue, the international Genome 10K (G10K) consortium5,6 has worked over a five-year period to evaluate and develop cost-effective methods for assembling highly accurate and nearly complete reference genomes. Here we present lessons learned from generating assemblies for 16 species that represent six major vertebrate lineages. We confirm that long-read sequencing technologies are essential for maximizing genome quality, and that unresolved complex repeats and haplotype heterozygosity are major sources of assembly error when not handled correctly. Our assemblies correct substantial errors, add missing sequence in some of the best historical reference genomes, and reveal biological discoveries. These include the identification of many false gene duplications, increases in gene sizes, chromosome rearrangements that are specific to lineages, a repeated independent chromosome breakpoint in bat genomes, and a canonical GC-rich pattern in protein-coding genes and their regulatory regions. Adopting these lessons, we have embarked on the Vertebrate Genomes Project (VGP), an international effort to generate high-quality, complete reference genomes for all of the roughly 70,000 extant vertebrate species and to help to enable a new era of discovery across the life sciences.
Background: The history of African indigenous cattle and their adaptation to environmental and human selection pressure is at the root of their remarkable diversity. Characterization of this diversity is an essential step towards understanding the genomic basis of productivity and adaptation to survival under African farming systems. Results: We analyze patterns of African cattle genetic variation by sequencing 48 genomes from five indigenous populations and comparing them to the genomes of 53 commercial taurine breeds. We find the highest genetic diversity among African zebu and sanga cattle. Our search for genomic regions under selection reveals signatures of selection for environmental adaptive traits. In particular, we identify signatures of selection including genes and/or pathways controlling anemia and feeding behavior in the trypanotolerant N’Dama, coat color and horn development in Ankole, and heat tolerance and tick resistance across African cattle, especially in zebu breeds. Conclusions: Our findings unravel, at the genome-wide level, the unique adaptive diversity of African cattle while emphasizing the opportunities for sustainable improvement of livestock productivity on the continent. Electronic supplementary material: The online version of this article (doi:10.1186/s13059-017-1153-y) contains supplementary material, which is available to authorized users.
Non-alcoholic fatty liver disease (NAFLD) is one of the most frequent causes of liver disease and its prevalence is a serious and growing clinical problem. Caloric restriction (CR) is commonly recommended for improvement of obesity-related diseases such as NAFLD. However, the effects of CR on hepatic metabolism remain unknown. We investigated the effects of CR on metabolic dysfunction in the liver of obese diabetic db/db mice. We found that CR of db/db mice reverted insulin resistance, hepatic steatosis, body weight and adiposity to those of db/m mice. 1H-NMR- and UPLC-QTOF-MS-based metabolite profiling data showed significant metabolic alterations related to lipogenesis, ketogenesis, and inflammation in db/db mice. Moreover, western blot analysis showed that lipogenesis pathway enzymes in the liver of db/db mice were reduced by CR. In addition, CR reversed ketogenesis pathway enzymes and the enhanced autophagy, mitochondrial biogenesis, collagen deposition and endoplasmic reticulum stress in db/db mice. In particular, hepatic inflammation-related proteins including lipocalin-2 in db/db mice were attenuated by CR. Hepatic metabolomic studies yielded multiple pathological mechanisms of NAFLD. Also, these findings showed that CR has a therapeutic effect by attenuating the deleterious effects of obesity and diabetes-induced multiple complications.
Adipocytes function mainly as energy storage and endocrine cells. Adipose tissues differ biologically and genetically according to their depot, and these differences may be influenced by the inherent genetic programming for adipogenesis. We used RNA-seq to investigate the transcriptomes of three bovine adipose tissues: omental (O), subcutaneous (S) and intramuscular (I) fat. Sequence reads were obtained from an Illumina HiSeq2000 and mapped to the bovine genome using TopHat2. Differentially expressed genes (DEGs) between adipose tissues were detected with edgeR. We identified 5797, 2156, and 5455 DEGs in the OI, OS, and IS comparisons, respectively, and 5657 DEGs in the comparison between intramuscular fat and the combined omental and subcutaneous fats (C) (FDR<0.01). The numbers of depot-specific up- and down-regulated DEGs were 853 in S, 48 in I, and 979 in O. The numbers of DEGs and the functional annotation studies suggested that I has a genetic profile different from that of the other two adipose tissues. In I, DEGs involved in developmental processes (e.g. EGR2, FAS, and KLF7) were up-regulated and those in immune system processes were down-regulated. Many DEGs from the adipose tissues were enriched in various GO terms for developmental processes, and KEGG pathway analysis showed that ECM-receptor interaction was one of the pathways commonly enriched in all three adipose tissues and also functioned as a sub-pathway of other enriched pathways. However, genes involved in ECM-receptor interaction were differentially regulated depending on the depot. Collagens, the main ECM constituents, were significantly up-regulated in S, and integrins, which are transmembrane receptors, were up-regulated in I. Different laminins were up-regulated in the different depots.
This comparative transcriptome analysis of three adipose tissues suggested that the interactions between ECM components and transmembrane receptors of fat cells depend on the depot specific adipogenesis.
Background: Animal domestication involved drastic phenotypic changes driven by strong artificial selection and also resulted in new populations of breeds, established by humans. This study aims to identify genes that show evidence of recent artificial selection during pig domestication. Results: Whole-genome resequencing of 30 individual pigs from domesticated breeds, Landrace and Yorkshire, and 10 Asian wild boars at ~16-fold coverage was performed, resulting in over 4.3 million SNPs for 19,990 genes. We constructed a comprehensive genome map of directional selection by detecting selective sweeps using an FST-based approach that detects directional selection in lineages leading to the domesticated breeds and using a haplotype-based test that detects ongoing selective sweeps within the breeds. We show that candidate genes under selection are significantly enriched for loci implicated in quantitative traits important to pig reproduction and production. The candidate gene with the strongest signals of directional selection belongs to group III of the metabotropic glutamate receptors, known to affect brain functions associated with eating behavior, suggesting that loci under strong selection include loci involved in behavioral traits in domesticated pigs, including tameness. Conclusions: We show that a significant proportion of selection signatures coincide with loci that were previously inferred to affect phenotypic variation in pigs. We further identify functional enrichment related to behavior, such as signal transduction and neuronal activities, for those targets of selection during domestication in pigs. Electronic supplementary material: The online version of this article (doi:10.1186/s12864-015-1330-x) contains supplementary material, which is available to authorized users.
Background: Modern sequencing technologies should make the assembly of the relatively small mitochondrial genomes an easy undertaking. However, few tools exist that address mitochondrial assembly directly. Results: As part of the Vertebrate Genomes Project (VGP) we develop mitoVGP, a fully automated pipeline for similarity-based identification of mitochondrial reads and de novo assembly of mitochondrial genomes that incorporates both long (> 10 kbp, PacBio or Nanopore) and short (100–300 bp, Illumina) reads. Our pipeline leads to successful complete mitogenome assemblies of 100 vertebrate species of the VGP. We observe that tissue type and library size selection have considerable impact on mitogenome sequencing and assembly. Comparing our assemblies to purportedly complete reference mitogenomes based on short-read sequencing, we identify errors, missing sequences, and incomplete genes in those references, particularly in repetitive regions. Our assemblies also identify novel gene region duplications. The presence of repeats and duplications in over half of the species herein assembled indicates that their occurrence is a principle of mitochondrial structure rather than an exception, shedding new light on mitochondrial genome evolution and organization. Conclusions: Our results indicate that even in the “simple” case of vertebrate mitogenomes the completeness of many currently available reference sequences can be further improved, and caution should be exercised before claiming the complete assembly of a mitogenome, particularly from short reads alone.
Background: Abalones are large marine snails in the family Haliotidae and the genus Haliotis belonging to the class Gastropoda of the phylum Mollusca. The family Haliotidae contains only one genus, Haliotis, and this single genus is known to contain several species of abalone. With 18 additional subspecies, the most comprehensive treatment of Haliotidae considers 56 species valid [1]. Abalone is an economically important fishery and aquaculture animal that is considered a highly prized seafood delicacy. The total global supply of abalone has increased 5-fold since the 1970s and farm production increased explosively from 50 mt to 103 464 mt in the past 40 years. Additionally, researchers have recently focused on abalone given their reported tumor suppression effect. However, despite the valuable features of this marine animal, no genomic information is available for the Haliotidae family and related research is still limited. To construct the H. discus hannai genome, a total of 580 Gb of sequence was generated on Illumina and PacBio platforms, giving 322-fold coverage of the 1.8-Gb genome size estimated for H. discus hannai by flow cytometry. The final genome assembly consisted of 1.86 Gb in 35 450 scaffolds (>2 kb). The GC content was 40.51%, and the N50 length of the assembled scaffolds was 211 kb. We identified 29 449 genes using Evidence Modeler based on gene information from ab initio prediction, protein homology with known genes, and transcriptome evidence from RNA-seq. Here we present the first Haliotidae genome, H. discus hannai, with sequencing data, assembly, and gene annotation information. This will help to resolve the lack of genomic information in the Haliotidae family and provide more opportunities for understanding gastropod evolution.
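The coverage and N50 figures quoted in the abalone abstract follow from two simple calculations, sketched below in Python. This is an illustrative sketch only, not the authors' pipeline: the function names `fold_coverage` and `n50` are hypothetical, and the scaffold lengths in the toy example are made up, because the abstract does not list the real scaffold lengths.

```python
def fold_coverage(total_bases: float, genome_size: float) -> float:
    """Average sequencing depth: sequenced bases divided by genome size."""
    return total_bases / genome_size


def n50(lengths):
    """Length L such that scaffolds of length >= L hold at least half of all bases."""
    ordered = sorted(lengths, reverse=True)
    if not ordered:
        raise ValueError("need at least one scaffold length")
    half_total = sum(ordered) / 2
    running = 0
    for length in ordered:
        running += length
        if running >= half_total:
            return length


# Figures from the abstract: 580 Gb of sequence, 1.8-Gb estimated genome size.
print(round(fold_coverage(580e9, 1.8e9)))  # 322

# Toy scaffold lengths (hypothetical, not from the paper).
print(n50([100, 80, 60, 40, 20]))  # 80
```

In the toy example the scaffolds total 300 bases, and the two longest (100 + 80) already cover half of that, so the N50 is 80. The reported 211-kb N50 would be obtained in the same way from the real scaffold lengths.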