The genetic changes underlying the initial steps of animal domestication are still poorly understood. We generated a high-quality reference genome for rabbit and compared it to resequencing data from populations of wild and domestic rabbits. We identified over 100 selective sweeps specific to domestic rabbits, but only a relatively small number of fixed (or nearly fixed) SNPs for derived alleles. SNPs with marked allele frequency differences between wild and domestic rabbits were enriched for conserved non-coding sites. Enrichment analyses suggest that genes affecting brain and neuronal development have often been targeted during domestication. We propose that due to a truly complex genetic background, tame behavior in rabbits and other domestic animals evolved by shifts in allele frequencies at many loci, rather than by critical changes at only a few ‘domestication loci’.
We used 20 de novo genome assemblies to probe the speciation history and architecture of gene flow in rapidly radiating Heliconius butterflies. Our tests to distinguish incomplete lineage sorting from introgression indicate that gene flow has obscured several ancient phylogenetic relationships in this group over large swathes of the genome. Introgressed loci are underrepresented in low-recombination and gene-rich regions, consistent with the purging of foreign alleles more tightly linked to incompatibility loci. Here, we identify a hitherto unknown inversion that traps a color pattern switch locus. We infer that this inversion was transferred between lineages by introgression and is convergent with a similar rearrangement in another part of the genus. These multiple de novo genome sequences enable improved understanding of the importance of introgression and selective processes in adaptive radiation.
Mouse chromosome 10 harbors several loci associated with hearing loss, including waltzer (v), modifier-of-deaf waddler (mdfw) and Age-related hearing loss (Ahl). The human region that is orthologous to the mouse 'waltzer' region is located at 10q21-q22 and contains the human deafness loci DFNB12 and USH1D. Numerous mutations at the waltzer locus have been documented causing erratic circling and hearing loss. Here we report the identification of a new gene mutated in v. The 10.5-kb Cdh23 cDNA encodes a very large, single-pass transmembrane protein that we have called otocadherin. It has an extracellular domain that contains 27 repeats; these show significant homology to the cadherin ectodomain. In v(6J), a G→T transversion creates a premature stop codon. In v(Alb), a C→T exchange generates an ectopic donor splice site, effecting deletion of 119 nucleotides of exonic sequence. In v(2J), a G→A transition abolishes the donor splice site, leading to aberrant splice forms. All three alleles are predicted to cause loss of function. We demonstrate Cdh23 expression in the neurosensory epithelium and show that during early hair-cell differentiation, stereocilia organization is disrupted in v(2J) homozygotes. Our data indicate that otocadherin is a critical component of hair bundle formation. Mutations in human CDH23 cause Usher syndrome type 1D and thus establish waltzer as the mouse model for USH1D.
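The abstract above describes the three waltzer alleles as a G-to-T transversion, a C-to-T exchange and a G-to-A transition. As a minimal illustrative sketch (not part of the original study), the standard purine/pyrimidine rule behind those terms can be written as follows; the function name `classify_substitution` is hypothetical and chosen here only for illustration.

```python
# Illustrative sketch only: classify a single-base substitution using the
# textbook rule (purine<->purine or pyrimidine<->pyrimidine = transition,
# purine<->pyrimidine = transversion). Names here are hypothetical.
PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}


def classify_substitution(ref: str, alt: str) -> str:
    """Return 'transition' or 'transversion' for a DNA base change."""
    ref, alt = ref.upper(), alt.upper()
    valid = PURINES | PYRIMIDINES
    if ref not in valid or alt not in valid or ref == alt:
        raise ValueError(f"not a valid substitution: {ref}>{alt}")
    same_class = {ref, alt} <= PURINES or {ref, alt} <= PYRIMIDINES
    return "transition" if same_class else "transversion"


# The three base changes named in the abstract.
for allele, (ref, alt) in {"v(6J)": ("G", "T"),
                           "v(Alb)": ("C", "T"),
                           "v(2J)": ("G", "A")}.items():
    print(allele, f"{ref}>{alt}", classify_substitution(ref, alt))
```

Under this rule the v(Alb) C-to-T exchange is also a transition, although the abstract does not label it as such.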
Advances in genome sequencing and assembly technologies are generating many high-quality genome sequences, but assemblies of large, repeat-rich polyploid genomes, such as that of bread wheat, remain fragmented and incomplete. We have generated a new wheat whole-genome shotgun sequence assembly using a combination of optimized data types and an assembly algorithm designed to deal with large and complex genomes. The new assembly represents >78% of the genome with a scaffold N50 of 88.8 kb and has high fidelity to the input data. Our new annotation combines strand-specific Illumina RNA-seq and Pacific Biosciences (PacBio) full-length cDNAs to identify 104,091 high-confidence protein-coding genes and 10,156 noncoding RNA genes. We confirmed three known genome rearrangements and identified one novel rearrangement. Our approach enables the rapid and scalable assembly of wheat genomes, the identification of structural variants, and the definition of complete gene models, all powerful resources for trait analysis and breeding of this key global crop.
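The scaffold N50 quoted above is a contiguity statistic: the length L such that scaffolds of length L or longer together contain at least half of the assembled bases. A minimal sketch of the usual definition follows; the function name `n50` and the toy lengths are assumptions made for illustration and do not come from the wheat assembly.

```python
# Illustrative sketch of the conventional N50 definition; the input
# lengths below are made-up toy values, not data from the wheat assembly.
def n50(lengths):
    """Return the N50 of a non-empty collection of sequence lengths."""
    if not lengths:
        raise ValueError("need at least one sequence length")
    ordered = sorted(lengths, reverse=True)
    half_total = sum(ordered) / 2
    running = 0
    # Walk from the longest scaffold down until half the bases are covered.
    for length in ordered:
        running += length
        if running >= half_total:
            return length


toy_scaffolds_kb = [100, 80, 60, 40, 20]  # hypothetical lengths in kb
print(n50(toy_scaffolds_kb))  # 100 + 80 = 180 >= 150, so this prints 80
```

Because N50 depends only on the length distribution, it says nothing about correctness, which is presumably why the abstract reports fidelity to the input data separately.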
The koala, the only extant species of the marsupial family Phascolarctidae, is classified as 'vulnerable' due to habitat loss and widespread disease. We sequenced the koala genome, producing a complete and contiguous marsupial reference genome, including centromeres. We reveal that the koala's ability to detoxify eucalypt foliage may be due to expansions within a cytochrome P450 gene family, and its ability to smell, taste and moderate ingestion of plant secondary metabolites may be due to expansions in the vomeronasal and taste receptors. We characterized novel lactation proteins that protect young in the pouch and annotated immune genes important for response to chlamydial disease. Historical demography showed a substantial population crash coincident with the decline of Australian megafauna, while contemporary populations had biogeographic boundaries and increased inbreeding in populations affected by historic translocations. We identified genetically diverse populations that require habitat corridors and the institution of translocation programs to aid the koala's survival in the wild.
The domestic dog, Canis familiaris, is a well-established model system for mapping trait and disease loci. While the original draft sequence was of good quality, gaps were abundant, particularly in promoter regions of the genome, negatively impacting the annotation and study of candidate genes. Here, we present an improved genome build, canFam3.1, which includes 85 Mb of novel sequence and now covers 99.8% of the euchromatic portion of the genome. We also present multiple RNA-Sequencing data sets from 10 different canine tissues to catalog ∼175,000 expressed loci. While about 90% of the coding genes previously annotated by EnsEMBL have measurable expression in at least one sample, the number of transcript isoforms detected by our data expands the EnsEMBL annotations by a factor of four. Syntenic comparison with the human genome revealed an additional ∼3,000 loci that are characterized as protein coding in human and were also expressed in the dog, suggesting that those were previously not annotated in the EnsEMBL canine gene set. In addition to ∼20,700 high-confidence protein coding loci, we found ∼4,600 antisense transcripts overlapping exons of protein coding genes; ∼7,200 intergenic multi-exon transcripts without coding potential, likely candidates for long intergenic non-coding RNAs (lincRNAs); and ∼11,000 transcripts that were reported by two different library construction methods but did not fit any of the above categories. Of the lincRNAs, about 6,000 have no annotated orthologs in human or mouse. Functional analysis of two novel transcripts with shRNA in a mouse kidney cell line altered cell morphology and motility. All in all, we provide a much-improved annotation of the canine genome and suggest regulatory functions for several of the novel non-coding transcripts.
We here pioneer a low-cost assembly strategy for 20 Heliconiini genomes to characterize the evolutionary history of the rapidly radiating genus Heliconius. A bifurcating tree provides a poor fit to the data, and we therefore explore a reticulate phylogeny for Heliconius. We probe the genomic architecture of gene flow, and develop a new method to distinguish incomplete lineage sorting from introgression. We find that most loci with non-canonical histories arose through introgression, and are strongly underrepresented in regions of low recombination and high gene density. This is expected if introgressed alleles are more likely to be purged in such regions due to tighter linkage with incompatibility loci. Finally, we identify a hitherto unrecognized inversion, and show it is a convergent structural rearrangement that captures a known color pattern switch locus within the genus. Our multi-genome assembly approach enables an improved understanding of adaptive radiation.
Satsuma is part of the Spines software package, implemented in C++ on Linux. The latest version of Spines can be freely downloaded under the LGPL license from http://www.broadinstitute.org/science/programs/genome-biology/spines/.