Whole transcriptome sequencing (RNA-seq) has become a standard for cataloguing and monitoring RNA populations. One of the main bottlenecks, however, is to correctly identify the different classes of RNAs among the plethora of reconstructed transcripts, particularly those that will be translated (mRNAs) from the class of long non-coding RNAs (lncRNAs). Here, we present FEELnc (FlExible Extraction of LncRNAs), an alignment-free program that accurately annotates lncRNAs based on a Random Forest model trained with general features such as multi k-mer frequencies and relaxed open reading frames. Benchmarking versus five state-of-the-art tools shows that FEELnc achieves similar or better classification performance on GENCODE and NONCODE data sets. The program also provides specific modules that enable the user to fine-tune classification accuracy, to formalize the annotation of lncRNA classes and to identify lncRNAs even in the absence of a training set of non-coding RNAs. We used FEELnc on a real data set comprising 20 canine RNA-seq samples produced by the European LUPA consortium to substantially expand the canine genome annotation to include 10 374 novel lncRNAs and 58 640 mRNA transcripts. FEELnc moves beyond conventional coding potential classifiers by providing a standardized and complete solution for annotating lncRNAs and is freely available at https://github.com/tderrien/FEELnc.
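As a rough illustration of the "multi k-mer frequencies" feature mentioned in the abstract above, the following minimal sketch computes relative k-mer frequencies for several values of k and concatenates them into one feature vector. It is not FEELnc's code: the function names `kmer_frequencies` and `feature_vector`, the choice of k = 1, 2, 3, and the handling of non-ACGT characters are all assumptions made for illustration.

```python
from collections import Counter
from itertools import product


def kmer_frequencies(seq, k):
    """Relative frequency of every ACGT k-mer in seq (hypothetical helper)."""
    seq = seq.upper()
    # Count every overlapping window of length k.
    counts = Counter(seq[i:i + k] for i in range(len(seq) - k + 1))
    # Enumerate all 4**k possible k-mers so the output has a fixed size.
    kmers = ["".join(p) for p in product("ACGT", repeat=k)]
    # Windows containing other characters (e.g. N) are ignored (assumption).
    total = sum(counts[m] for m in kmers)
    return {m: (counts[m] / total if total else 0.0) for m in kmers}


def feature_vector(seq, ks=(1, 2, 3)):
    """Concatenate k-mer frequencies for several k into one flat list."""
    features = []
    for k in ks:
        freqs = kmer_frequencies(seq, k)
        features.extend(freqs[m] for m in sorted(freqs))
    return features
```

A vector like this, computed per transcript, is the kind of input a Random Forest classifier could be trained on; the ORF-based features described in the abstract would be added as further columns.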
Stature is affected by many polymorphisms of small effect in humans. In contrast, variation in dogs, even within breeds, has been suggested to be largely due to variants in a small number of genes. Here we use data from cattle to compare the genetic architecture of stature to those in humans and dogs. We conducted a meta-analysis for stature using 58,265 cattle from 17 populations with 25.4 million imputed whole-genome sequence variants. Results showed that the genetic architecture of stature in cattle is similar to that in humans, as the lead variants in 163 significantly associated genomic regions (P < 5 × 10⁻⁸) explained at most 13.8% of the phenotypic variance. Most of these variants were noncoding, including variants that were also expression quantitative trait loci (eQTLs) and in ChIP-seq peaks. There was significant overlap in loci for stature with humans and dogs, suggesting that a set of common genes regulates body size in mammals.
Summary: The domestic dog serves as an excellent model to investigate the genetic basis of disease. More than 400 heritable traits analogous to human diseases have been described in dogs. To further canine medical genetics research, we established the Dog Biomedical Variant Database Consortium (DBVDC) and present a comprehensive list of functionally annotated genome variants that were identified with whole genome sequencing of 582 dogs from 126 breeds and eight wolves. The genomes used in the study have a minimum coverage of 10× and an average coverage of ~24×. In total, we identified 23 133 692 single-nucleotide variants (SNVs) and 10 048 038 short indels, including 93% undescribed variants. On average, each individual dog genome carried ~4.1 million single-nucleotide and ~1.4 million short-indel variants with respect to the reference genome assembly. About 2% of the variants were located in coding regions of annotated genes and loci. Variant effect classification showed 247 141 SNVs and 99 562 short indels having moderate or high impact on 11 267 protein-coding genes. On average, each genome contained heterozygous loss-of-function variants in 30 potentially embryonic lethal genes and 97 genes associated with developmental disorders. More than 50 inherited disorders and traits have been unravelled using the DBVDC variant catalogue, enabling genetic testing for breeding and diagnostics. This resource of annotated variants and their corresponding genotype frequencies constitutes a highly useful tool for the identification of potential variants causative for rare inherited disorders in dogs.
The genomic changes underlying both early and late stages of horse domestication remain largely unknown. We examined the genomes of 14 early domestic horses from the Bronze and Iron Ages, dating to between ~4.1 and 2.3 thousand years before present. We find early domestication selection patterns supporting the neural crest hypothesis, which provides a unified developmental origin for common domestic traits. Within the past 2.3 thousand years, horses lost genetic diversity and archaic DNA tracts introgressed from a now-extinct lineage. They accumulated deleterious mutations later than expected under the cost-of-domestication hypothesis, probably because of breeding from limited numbers of stallions. We also reveal that Iron Age Scythian steppe nomads implemented breeding strategies involving no detectable inbreeding and selection for coat-color variation and robust forelimbs.
Summary: Przewalski’s horses (PHs, Equus ferus ssp. przewalskii) were discovered in the Asian steppes in the 1870s and represent the last remaining true wild horses. PHs became extinct in the wild in the 1960s but survived in captivity, thanks to major conservation efforts. The current population is still endangered, with just 2,109 individuals, one-quarter of which are in Chinese and Mongolian reintroduction reserves [1]. These horses descend from a founding population of 12 wild-caught PHs and possibly up to four domesticated individuals [2–4]. With a stocky build, an erect mane, and striped, short legs, they are phenotypically and behaviorally distinct from domesticated horses (DHs, Equus caballus). Here, we sequenced the complete genomes of 11 PHs, representing all founding lineages, and five historical specimens dated to 1878–1929 CE, including the Holotype. These were compared to the hitherto-most-extensive genome dataset characterized for horses, comprising 21 new genomes. We found that loci showing the most genetic differentiation with DHs were enriched in genes involved in metabolism, cardiac disorders, muscle contraction, reproduction, behavior, and signaling pathways. We also show that DH and PH populations split ~45,000 years ago and have remained connected by gene-flow thereafter. Finally, we monitor the genomic impact of ~110 years of captivity, revealing reduced heterozygosity, increased inbreeding, and variable introgression of domestic alleles, ranging from non-detectable to as much as 31.1%. This, together with the identification of ancestry informative markers and corrections to the International Studbook, establishes a framework for evaluating the persistence of genetic variation in future reintroduced populations.
The Y chromosome directly reflects male genealogies, but the extremely low Y chromosome sequence diversity in horses has prevented the reconstruction of stallion genealogies [1, 2]. Here, we resolve the first Y chromosome genealogy of modern horses by screening 1.46 Mb of the male-specific region of the Y chromosome (MSY) in 52 horses from 21 breeds. Based on highly accurate pedigree data, we estimated the de novo mutation rate of the horse MSY and showed that various modern horse Y chromosome lineages split much later than the domestication of the species. Apart from a few private northern European haplotypes, all modern horse breeds clustered together in a roughly 700-year-old haplogroup that was transmitted to Europe by the import of Oriental stallions. The Oriental horse group consisted of two major subclades: the Original Arabian lineage and the Turkoman horse lineage. We show that the English Thoroughbred MSY was derived from the Turkoman lineage and that English Thoroughbred sires are largely responsible for the predominance of this haplotype in modern horses.
Yakutia, Sakha Republic, in the Siberian Far East, represents one of the coldest places on Earth, with winter record temperatures dropping below −70°C. Nevertheless, Yakutian horses survive all year round in the open air due to striking phenotypic adaptations, including compact body conformations, extremely hairy winter coats, and acute seasonal differences in metabolic activities. The evolutionary origins of Yakutian horses and the genetic basis of their adaptations remain, however, contentious. Here, we present the complete genomes of nine present-day Yakutian horses and two ancient specimens dating from the early 19th century and ∼5,200 y ago. By comparing these genomes with the genomes of two Late Pleistocene, 27 domesticated, and three wild Przewalski's horses, we find that contemporary Yakutian horses do not descend from the native horses that populated the region until the mid-Holocene, but were most likely introduced following the migration of the Yakut people a few centuries ago. Thus, they represent one of the fastest cases of adaptation to the extreme temperatures of the Arctic. We find cis-regulatory mutations to have contributed more than nonsynonymous changes to their adaptation, likely due to the comparatively limited standing variation within gene bodies at the time the population was founded. Genes involved in hair development, body size, and metabolic and hormone signaling pathways represent an essential part of the Yakutian horse adaptive genetic toolkit. Finally, we find evidence for convergent evolution with native human populations and woolly mammoths, suggesting that only a few evolutionary strategies are compatible with survival in extremely cold environments. The Yakutian horse is the most northerly distributed horse on the planet and certainly the most resistant to cold.
Significance: Yakutia is among the coldest regions in the Northern Hemisphere, showing ∼40% of its territory above the Arctic Circle. Native horses are particularly adapted to this environment, with body sizes and thick winter coats minimizing heat loss. We sequenced complete genomes of two ancient and nine present-day Yakutian horses to elucidate their evolutionary origins. We find that the contemporary population descends from domestic livestock, likely brought by early horse-riders who settled in the region a few centuries ago. The metabolic, anatomical, and physiological adaptations of these horses therefore emerged on very short evolutionary time scales. We show the relative importance of regulatory changes in the adaptive process and identify genes independently selected in cold-adapted human populations and woolly mammoths.
The molecular regulation of horn growth in ruminants is still poorly understood. To investigate this process, we collected 1019 hornless (polled) animals from different cattle breeds. High-density SNP genotyping confirmed the presence of two different polled-associated haplotypes in Simmental and Holstein cattle, co-localized on BTA 1. We refined the critical region of the Simmental polled mutation to 212 kb and identified an overlapping region of 932 kb containing the Holstein polled mutation. Subsequently, whole genome sequencing of polled Simmental and Holstein cows was used to determine polled-associated genomic variants. By genotyping larger cohorts of animals with known horn status, we found a single perfectly associated insertion/deletion variant in Simmental and other beef cattle, confirming the recently published possible Celtic polled mutation. We identified a total of 182 sequence variants as candidate mutations for polledness in Holstein cattle, including an 80 kb genomic duplication and three SNPs reported before. For the first time, we showed that hornless cattle with scurs are obligate heterozygotes for one of the polled mutations. This is in contrast to published complex inheritance models for the bovine scurs phenotype. Studying differential expression of the annotated genes and loci within the mapped region on BTA 1 revealed a locus (LOC100848215), known in cow and buffalo only, which is more highly expressed in fetal tissue of wildtype horn buds than in tissue of polled fetuses. This implies that the presence of this long noncoding RNA is a prerequisite for horn bud formation. In addition, both transcripts associated with polledness in goat and sheep (FOXL2 and RXFP2) are overexpressed in horn buds, confirming their importance during horn development in cattle.