Centromeres and large-scale structural variants evolve and contribute to genome diversity during vertebrate speciation. Here, we perform de novo long-read genome assembly of three inbred medaka strains that are derived from geographically isolated subpopulations and undergo speciation. Using single-molecule real-time (SMRT) sequencing, we obtain three chromosome-mapped genomes of length ~734, ~678, and ~744 Mbp with a resource of twenty-two centromeric regions of length 20–345 kbp. Centromeres are positionally conserved among the three strains and even between four pairs of chromosomes that were duplicated by the teleost-specific whole-genome duplication 320–350 million years ago. The centromeres do not all evolve at a similar pace; rather, centromeric monomers in non-acrocentric chromosomes evolve significantly faster than those in acrocentric chromosomes. Using methylation-sensitive SMRT reads, we find that centromeres are mostly hypermethylated but have hypomethylated sub-regions that acquire unique sequence compositions independently. These findings reveal the potential of non-acrocentric centromere evolution to contribute to speciation.
Genebanks provide access to diverse materials for crop improvement. To utilize and evaluate them effectively, core collections, such as the World Rice Core Collection (WRC) in the Genebank at the National Agriculture and Food Research Organization, have been developed. Because the WRC consists of 69 accessions with a high degree of genetic diversity, it has been used for >300 projects. To allow deeper investigation of existing WRC data and to further promote research using Genebank rice accessions, we performed whole-genome resequencing of these 69 accessions, examining their sequence variation by mapping against the Oryza sativa ssp. japonica Nipponbare genome. We obtained a total of 2,805,329 single nucleotide polymorphisms (SNPs) and 357,639 insertion–deletions. Based on the principal component analysis and population structure analysis of these data, the WRC can be classified into three major groups. We applied TASUKE, a multiple genome browser, to visualize the different WRC genome sequences, and classified haplotype groups of genes affecting seed characteristics and heading date. TASUKE thus provides access to WRC genotypes as a tool for reverse genetics. We examined the suitability of the compact WRC population for genome-wide association studies (GWASs). Heading date, affected by a large number of quantitative trait loci (QTLs), was not associated with known genes, but several seed-related phenotypes were associated with known genes. Thus, for QTLs of strong effect, the compact WRC performed well in GWAS. This information enables us to understand genetic diversity in 37,000 rice accessions maintained in the Genebank and to find genes associated with different phenotypes. The sequence data have been deposited in the DNA Data Bank of Japan Sequence Read Archive (DRA) (Supplementary Table S1).
Mobile genetic elements (e.g., transposable elements and viruses) display significant diversity with various life cycles, but how novel elements emerge remains obscure. Here, we report a giant (180-kb long) transposon, Teratorn, originally identified in the genome of medaka, Oryzias latipes. Teratorn belongs to the piggyBac superfamily and retains transposition activity. Remarkably, Teratorn is largely derived from a herpesvirus of the Alloherpesviridae family that could infect fish and amphibians. A genomic survey of Teratorn-like elements reveals that some of them exist as a fused form of a piggyBac transposon and a herpesvirus genome in teleosts, implying the generality of transposon-herpesvirus fusion. We propose that Teratorn was created by a unique fusion of a DNA transposon and a herpesvirus, leading to a life cycle shift. Our study supports the idea that recombination is the key event in the generation of novel mobile genetic elements.
Chromatin looping plays an important role in genome regulation. However, because ChIP-seq and loop-resolution Hi-C (DNA-DNA proximity ligation) are extremely challenging in mammalian early embryos, the developmental stage at which cohesin-mediated loops form remains unknown. Here, we study early development in medaka (the Japanese killifish, Oryzias latipes) at 12 time points before, during, and after gastrulation (the onset of cell differentiation) and characterize transcription, protein binding, and genome architecture. We find that gastrulation is associated with drastic changes in genome architecture, including the formation of the first loops between sites bound by the insulator protein CTCF and a large increase in the size of contact domains. In contrast, the binding of CTCF is fixed throughout embryogenesis. Loops form long after genome-wide transcriptional activation, and long after domain formation seen in mouse embryos. These results suggest that, although loops may play a role in differentiation, they are not required for zygotic transcription. When we repeated our experiments in zebrafish, loops did not emerge until gastrulation, that is, well after zygotic genome activation. We observe that loop positions are highly conserved in synteny blocks of medaka and zebrafish, indicating that the 3D genome architecture has been maintained for >110–200 million years of evolution.
Loss of pod shattering is one of the most important domestication-related traits in legume crops. Non-shattering phenotypes have been achieved either by disturbed formation of the abscission layer between the valves, or by loss of the helical tension in the sclerenchyma of the endocarp that splits open the pods to disperse the seeds. During domestication, azuki bean (Vigna angularis) and yard-long bean (Vigna unguiculata cv-gr. Sesquipedalis) have reduced or lost the sclerenchyma and thus the shattering behavior of seed pods. Here we performed fine-mapping with backcrossed populations and narrowed the candidate genomic region down to 4 kbp in azuki bean and 13 kbp in yard-long bean. Among the genes located in these regions, we found that MYB26 genes encoded truncated proteins in azuki bean, yard-long bean, and even cowpea. As such, our findings indicate that independent domestication of the two legumes has selected the same locus for the same traits. We also argue that MYB26 could be a target gene for improving the shattering phenotype in other legumes, such as soybean.
Summary: Because an enormous amount of sequence data is being collected, a method to effectively display sequence variation information is urgently needed. tasuke is a web application that visualizes large-scale resequencing data generated by next-generation sequencing technologies and is suitable for rapid data release to the public on the web. The variation and read depths of multiple genomes, as well as annotations, can be shown simultaneously at various scales. We demonstrate the use of TASUKE by applying it to 50 rice and 100 human genome resequencing datasets. Availability and implementation: The tasuke program package and user manual are available from http://tasuke.dna.affrc.go.jp/. Contact: taitoh@affrc.go.jp
Gene targeting (GT) is a technique used to modify endogenous genes in target genomes precisely via homologous recombination (HR). Although GT plants are produced using genetic transformation techniques, if the difference between the endogenous and the modified gene is limited to point mutations, GT crops can be considered equivalent to non-genetically modified mutant crops generated by conventional mutagenesis techniques. However, it is difficult to guarantee the non-incorporation of DNA fragments from Agrobacterium in GT plants created by Agrobacterium-mediated GT despite screening with conventional Southern blot and/or PCR techniques. Here, we report a comprehensive analysis of herbicide-tolerant rice plants generated by inducing point mutations in the rice ALS gene via Agrobacterium-mediated GT. We performed genome comparative genomic hybridization (CGH) array analysis and whole-genome sequencing to evaluate the molecular composition of GT rice plants. Thus far, no integration of Agrobacterium-derived DNA fragments has been detected in GT rice plants. However, >1,000 single nucleotide polymorphisms (SNPs) and insertions/deletions (InDels) were found in GT plants. Among these mutations, 20–100 variants might have some effect on expression levels and/or protein function. Information about additive mutations should be useful in clearing out unwanted mutations by backcrossing.
Recent revolutionary advancements in sequencing technologies have made it possible to obtain mass quantities of genome-scale sequence data in a cost-effective manner and have drastically altered molecular biological studies. To utilize these sequence data, genome-wide association studies (GWASs) have become increasingly important. Hence, there is an urgent need to develop a visualization tool that enables efficient data retrieval, integration of GWAS results with diverse information and rapid public release of such large-scale genotypic and phenotypic data. We developed a web-based genome browser TASUKE+ (https://tasuke.dna.affrc.go.jp/), which is equipped with the following functions: (i) interactive GWAS results visualization with genome resequencing data and annotation information, (ii) PCR primer design, (iii) phylogenetic tree reconstruction and (iv) data sharing via the web. GWAS results can be displayed in parallel with polymorphism data, read depths and annotation information in an interactive and scalable manner. Users can design PCR primers for polymorphic sites of interest. In addition, a molecular phylogenetic tree of any region can be reconstructed so that the overall relationship among the examined genomes can be understood intuitively at a glance. All functions are implemented through user-friendly web-based interfaces so that researchers can easily share data with collaborators in remote places without extensive bioinformatics knowledge.