We report the draft genome of the black cottonwood tree, Populus trichocarpa. Integration of shotgun sequence assembly with genetic mapping enabled chromosome-scale reconstruction of the genome. More than 45,000 putative protein-coding genes were identified. Analysis of the assembled genome revealed a whole-genome duplication event; about 8000 pairs of duplicated genes from that event survived in the Populus genome. A second, older duplication event is indistinguishably coincident with the divergence of the Populus and Arabidopsis lineages. Nucleotide substitution, tandem gene duplication, and gross chromosomal rearrangement appear to proceed substantially more slowly in Populus than in Arabidopsis. Populus has more protein-coding genes than Arabidopsis, ranging on average from 1.4 to 1.6 putative Populus homologs for each Arabidopsis gene. However, the relative frequency of protein domains in the two genomes is similar. Overrepresented exceptions in Populus include genes associated with lignocellulosic wall biosynthesis, meristem development, disease resistance, and metabolite transport.
We report an improved draft nucleotide sequence of the 2.3-gigabase genome of maize, an important crop plant and model for biological research. Over 32,000 genes were predicted, of which 99.8% were placed on reference chromosomes. Nearly 85% of the genome is composed of hundreds of families of transposable elements, dispersed nonuniformly across the genome. These were responsible for the capture and amplification of numerous gene fragments and affect the composition, sizes, and positions of centromeres. We also report on the correlation of methylation-poor regions with Mu transposon insertions and recombination, and copy number variants with insertions and/or deletions, as well as how uneven gene losses between duplicated regions were involved in returning an ancient allotetraploid to a genetically diploid state. These analyses inform and set the stage for further investigations to improve our understanding of the domestication and agricultural improvements of maize.
Transposable elements are the single largest component of the genetic material of most eukaryotes. The recent availability of large quantities of genomic sequence has led to a shift from the genetic characterization of single elements to genome-wide analysis of enormous transposable-element populations. Nowhere is this shift more evident than in plants, in which transposable elements were first discovered and where they are still actively reshaping genomes.
The publication of draft sequences for the two subspecies of Oryza sativa (rice), japonica (cv. Nipponbare) and indica (cv. 93-11), provides a unique opportunity to study the dynamics of transposable elements in this important crop plant. Here we report the use of these sequences in a computational approach to identify the first active DNA transposons from rice and the first active miniature inverted-repeat transposable element (MITE) from any organism. A sequence classified as a Tourist-like MITE of 430 base pairs, called miniature Ping (mPing), was present in about 70 copies in Nipponbare and in about 14 copies in 93-11. These mPing elements, which are all nearly identical, transpose actively in an indica cell-culture line. Database searches identified a family of related transposase-encoding elements (called Pong), which also transpose actively in the same cells. Virtually all new insertions of mPing and Pong elements were into low-copy regions of the rice genome. Since the domestication of rice, mPing MITEs have been amplified preferentially in cultivars adapted to environmental extremes, a situation that is reminiscent of the genomic shock theory for transposon activation.
Mutator-like transposable elements (MULEs) are found in many eukaryotic genomes and are especially prevalent in higher plants. In maize, rice and Arabidopsis a few MULEs were shown to carry fragments of cellular genes. These chimaeric elements are called Pack-MULEs in this study. The abundance of MULEs in rice and the availability of most of the genome sequence permitted a systematic analysis of the prevalence and nature of Pack-MULEs in an entire genome. Here we report that there are over 3,000 Pack-MULEs in rice containing fragments derived from more than 1,000 cellular genes. Pack-MULEs frequently contain fragments from multiple chromosomal loci that are fused to form new open reading frames, some of which are expressed as chimaeric transcripts. About 5% of the Pack-MULEs are represented in collections of complementary DNA. Functional analysis of amino acid sequences and proteomic data indicate that some captured gene fragments might be functional. Comparison of the cellular genes and Pack-MULE counterparts indicates that fragments of genomic DNA have been captured, rearranged and amplified over millions of years. Given the abundance of Pack-MULEs in rice and the widespread occurrence of MULEs in all characterized plant genomes, gene fragment acquisition by Pack-MULEs might represent an important new mechanism for the evolution of genes in higher plants.
High-copy-number transposable elements comprise the majority of eukaryotic genomes, where they are major contributors to gene and genome evolution. However, it remains unclear how a host genome can survive a rapid burst of hundreds or thousands of insertions because such bursts are exceedingly rare in nature and therefore difficult to observe in real time. In a previous study we reported that in a few rice strains the DNA transposon mPing was increasing its copy number by approximately 40 per plant per generation. Here we exploit the completely sequenced rice genome to determine 1,664 insertion sites using high-throughput sequencing of 24 individual rice plants and assess the impact of insertion on the expression of 710 genes by comparative microarray analysis. We find that the vast majority of transposable element insertions either upregulate or have no detectable effect on gene transcription. This modest impact reflects a surprising avoidance of exon insertions by mPing and a preference for insertion into 5' flanking sequences of genes. Furthermore, we document the generation of new regulatory networks by a subset of mPing insertions that render adjacent genes stress inducible. As such, this study provides evidence for models proposed previously for the involvement of transposable elements and other repetitive sequences in genome restructuring and gene regulation.
Previous studies have suggested that the R locus of maize is responsible for determining the temporal and spatial pattern of anthocyanin pigmentation in the plant. In this report we demonstrate that three members of the R gene family, P, S, and Lc, encode homologous transcripts 2.5 kilobases in length. The structure of one R gene, Lc, was determined by sequencing cDNA and genomic clones. The putative Lc protein, deduced from the cDNA sequence, is composed of 610 amino acids and has homology to the helix-loop-helix DNA-binding/dimerization motif found in the L-myc gene product and other regulatory proteins. It also contains a large acidic domain that may be involved in transcriptional activation. Consistent with its proposed role as a transcriptional activator is our finding that a functional R gene is required for the accumulation of transcripts of at least two genes in the anthocyanin biosynthetic pathway. We discuss the possibility that the diverse patterns of anthocyanin pigmentation conditioned by different R genes reflect differences in the R gene promoters rather than their gene products. The anthocyanin biosynthetic pathway of maize has proven to be an ideal system for understanding genetic interactions between regulatory and structural genes (for review see ref.
Miniature inverted-repeat transposable elements (MITEs) are a special type of Class 2 non-autonomous transposable element (TE) that is abundant in the non-coding regions of the genes of many plant and animal species. The accurate identification of MITEs has been a challenge for existing programs because they lack coding sequences and, as such, evolve very rapidly. Because of their importance to gene and genome evolution, we developed MITE-Hunter, a program pipeline that can identify MITEs as well as other small Class 2 non-autonomous TEs from genomic DNA data sets. The output of MITE-Hunter is composed of consensus TE sequences grouped into families that can be used as a library file for homology-based TE detection programs such as RepeatMasker. MITE-Hunter was evaluated by searching the rice genomic database and comparing the output with known rice TEs. It discovered most of the previously reported rice MITEs (97.6%) and found sixteen new elements. MITE-Hunter was also compared with two other MITE discovery programs, FINDMITE and MUST. Unlike MITE-Hunter, neither of these programs can search large genomic data sets, including whole genome sequences. More importantly, MITE-Hunter is significantly more accurate than either FINDMITE or MUST, as the vast majority of their outputs are false positives.