Aegilops tauschii is the diploid progenitor of the D genome of hexaploid wheat [1] (Triticum aestivum, genomes AABBDD) and an important genetic resource for wheat [2][3][4]. The large size and highly repetitive nature of the Ae. tauschii genome have until now precluded the development of a reference-quality genome sequence [5]. Here we use an array of advanced technologies, including ordered-clone genome sequencing, whole-genome shotgun sequencing, and BioNano optical genome mapping, to generate a reference-quality genome sequence for Ae. tauschii ssp. strangulata accession AL8/78, which is closely related to the wheat D genome. We show that compared to other sequenced plant genomes, including a much larger conifer genome, the Ae. tauschii genome contains unprecedented amounts of very similar repeated sequences. Our genome comparisons reveal that the Ae. tauschii genome has a greater number of dispersed duplicated genes than other sequenced genomes and its chromosomes have been structurally evolving an order of magnitude faster than those of other grass genomes.
Background: The size and complexity of conifer genomes have, until now, prevented full genome sequencing and assembly. The large research community and economic importance of loblolly pine, Pinus taeda L., made it an early candidate for reference sequence determination.
Results: We develop a novel strategy to sequence the genome of loblolly pine that combines unique aspects of pine reproductive biology and genome assembly methodology. We use a whole genome shotgun approach relying primarily on next generation sequence generated from a single haploid seed megagametophyte from a loblolly pine tree, 20-1010, that has been used in industrial forest tree breeding. The resulting sequence and assembly were used to generate a draft genome spanning 23.2 Gbp and containing 20.1 Gbp with an N50 scaffold size of 66.9 kbp, making it a significant improvement over available conifer genomes. The long scaffold lengths allow the annotation of 50,172 gene models with intron lengths averaging over 2.7 kbp and sometimes exceeding 100 kbp in length. Analysis of orthologous gene sets identifies gene families that may be unique to conifers. We further characterize and expand the existing repeat library based on the de novo analysis of the repetitive content, estimated to encompass 82% of the genome.
Conclusions: In addition to its value as a resource for researchers and breeders, the loblolly pine genome sequence and assembly reported here demonstrates a novel approach to sequencing the large and complex genomes of this important group of plants that can now be widely applied.
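The N50 scaffold size quoted above is a standard assembly contiguity statistic: the length L such that scaffolds of length L or longer together contain at least half of the assembled bases. The following is a minimal sketch of that calculation, not the authors' pipeline; the function name `n50` and the toy scaffold lengths are illustrative assumptions.

```python
def n50(lengths):
    """Return the N50 of a collection of scaffold (or contig) lengths.

    N50 is the length of the scaffold at which the running total,
    taken from longest to shortest, first reaches half of the
    total assembled length.
    """
    ordered = sorted(lengths, reverse=True)
    half_total = sum(ordered) / 2
    running = 0
    for length in ordered:
        running += length
        if running >= half_total:
            return length
    # Only reached when no lengths were supplied.
    raise ValueError("no scaffold lengths given")


# Hypothetical toy assembly (lengths in kbp), not data from the paper.
toy_scaffolds = [80, 70, 50, 40, 30, 20]
toy_n50 = n50(toy_scaffolds)
```

For the toy lengths the total is 290, so the running sum first reaches the halfway point of 145 at the second-longest scaffold. N50 rewards a few long scaffolds, so it is usually read alongside the total assembled length.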
Because of the huge size of the common wheat (Triticum aestivum L., 2n = 6x = 42, AABBDD) genome of 17,300 Mb, sequencing and mapping of the expressed portion is a logical first step for gene discovery. Here we report mapping of 7104 expressed sequence tag (EST) unigenes by Southern hybridization into a chromosome bin map using a set of wheat aneuploids and deletion stocks. Each EST detected a mean of 4.8 restriction fragments and 2.8 loci. More loci were mapped in the B genome (5774) than in the A (5173) or D (5146) genomes. The EST density was significantly higher for the D genome than for the A or B. In general, EST density increased relative to the physical distance from the centromere. The majority of EST-dense regions are in the distal parts of chromosomes. Most of the agronomically important genes are located in EST-dense regions. The chromosome bin map of ESTs is a unique resource for SNP analysis, comparative mapping, structural and functional analysis, and polyploid evolution, as well as providing a framework for constructing a sequence-ready, BAC-contig map of the wheat genome.
The current limitations in genome sequencing technology require the construction of physical maps for high-quality draft sequences of large plant genomes, such as that of Aegilops tauschii, the wheat D-genome progenitor. To construct a physical map of the Ae. tauschii genome, we fingerprinted 461,706 bacterial artificial chromosome clones, assembled contigs, designed a 10K Ae. tauschii Infinium SNP array, constructed a 7,185-marker genetic map, and anchored on the map contigs totaling 4.03 Gb. Using whole genome shotgun reads, we extended the SNP marker sequences and found 17,093 genes and gene fragments. We showed that collinearity of the Ae. tauschii genes with Brachypodium distachyon, rice, and sorghum decreased with phylogenetic distance and that structural genome evolution rates have been high across all investigated lineages in subfamily Pooideae, including that of Brachypodieae. We obtained additional information about the evolution of the seven Triticeae chromosomes from 12 ancestral chromosomes and uncovered a pattern of centromere inactivation accompanying nested chromosome insertions in grasses. We showed that the density of noncollinear genes along the Ae. tauschii chromosomes positively correlates with recombination rates, suggested a cause, and showed that new genes, exemplified by disease resistance genes, are preferentially located in high-recombination chromosome regions.
(2), and 90% of its genome was estimated to be repetitive DNA (3). The Ae. tauschii genome and the D genome of hexaploid wheat are closely related due to the recent origin of hexaploid wheat (4). Ae. tauschii is therefore an important resource for wheat breeding, and its genome is an invaluable reference for wheat genomics, as illustrated by the utility of its sequences in the analysis of the wheat gene space (5). The utility of Ae. tauschii for wheat genetics and genomics would be further enhanced by a high-quality draft sequence of its genome.
With current technology, the only approach to produce a high-quality de novo draft sequence for a genome of this size and complexity is the ordered-clone sequencing approach, which requires a physical map. Physical map construction necessitates fingerprinting multiple genome equivalents of bacterial artificial chromosome (BAC) clones, assembling them into contigs, and anchoring the contigs on a genetic map (6-8). Great strides have been made in BAC fingerprinting techniques (7, 9-12) and software for fingerprint editing and contig assembly (13-16). It is now possible with these technological advances to fingerprint and assemble contigs from hundreds of thousands of BAC clones (7, 8, 17-19). In contrast, contig anchoring remains a weakness in physical mapping of large plant genomes because of their low gene density, extensive gene duplication, and abundance of repetitive DNA. BAC end sequences (BESs) are an effective means of contig anchoring in small genomes (11). In large genomes, however, hundreds of thousands of BESs are needed. DNA hybridization and PCR-based anchoring (6, 7, 20, 21)...
Genes detected by wheat expressed sequence tags (ESTs) were mapped into chromosome bins delineated by breakpoints of 159 overlapping deletions. These data were used to assess the organizational and evolutionary aspects of wheat genomes. Relative gene density and recombination rate increased with the relative distance of a bin from the centromere. Single-gene loci present once in the wheat genomes were found predominantly in the proximal, low-recombination regions, while multigene loci tended to be more frequent in distal, high-recombination regions. One-quarter of all gene motifs within wheat genomes were represented by two or more duplicated loci (paralogous sets). For 40 such sets, ancestral loci and loci derived from them by duplication were identified. Loci derived by duplication were most frequently located in distal, high-recombination chromosome regions whereas ancestral loci were most frequently located proximal to them. It is suggested that recombination has played a central role in the evolution of wheat genome structure and that gradients of recombination rates along chromosome arms promote more rapid rates of genome evolution in distal, high-recombination regions than in proximal, low-recombination regions.
Single-nucleotide polymorphism was used in the construction of an expressed sequence tag map of Aegilops tauschii, the diploid source of the wheat D genome. Comparisons of the map with the rice and sorghum genome sequences revealed 50 inversions and translocations; 2, 8, and 40 were assigned respectively to the rice, sorghum, and Ae. tauschii lineages, showing greatly accelerated genome evolution in the large Triticeae genomes. The reduction of the basic chromosome number from 12 to 7 in the Triticeae has taken place by a process during which an entire chromosome is inserted by its telomeres into a break in the centromeric region of another chromosome. The original centromere-telomere polarity of the chromosome arms is maintained in the new chromosome. An intrachromosomal telomere-telomere fusion resulting in a pericentric translocation of a chromosome segment or an entire arm accompanied or preceded the chromosome insertion in some instances. Insertional dysploidy has been recorded in three grass subfamilies and appears to be the dominant mechanism of basic chromosome number reduction in grasses. A total of 64% and 66% of Ae. tauschii genes were syntenic with sorghum and rice genes, respectively. Synteny was reduced in the vicinity of the termini of modern Ae. tauschii chromosomes but not in the vicinity of the ancient termini embedded in the Ae. tauschii chromosomes, suggesting that the dependence of synteny erosion on gene location along the centromere-telomere axis either evolved recently in the Triticeae phylogenetic lineage or its evolution was recently accelerated.
Keywords: dysploidy | linkage map | rice | sorghum | wheat
Background: A genome-wide assessment of nucleotide diversity in a polyploid species must minimize the inclusion of homoeologous sequences into diversity estimates and reliably allocate individual haplotypes into their respective genomes. The same requirements complicate the development and deployment of single nucleotide polymorphism (SNP) markers in polyploid species. We report here a strategy that satisfies these requirements and deploy it in the sequencing of genes in cultivated hexaploid wheat (Triticum aestivum, genomes AABBDD) and wild tetraploid wheat (Triticum turgidum ssp. dicoccoides, genomes AABB) from the putative site of wheat domestication in Turkey. Data are used to assess the distribution of diversity among and within wheat genomes and to develop a panel of SNP markers for polyploid wheat.
Results: Nucleotide diversity was estimated in 2114 wheat genes and was similar between the A and B genomes and reduced in the D genome. Within a genome, diversity was diminished on some chromosomes. Low diversity was always accompanied by an excess of rare alleles. A total of 5,471 SNPs was discovered in 1791 wheat genes. Totals of 1,271, 1,218, and 2,203 SNPs were discovered in 488, 463, and 641 genes of wheat putative diploid ancestors, T. urartu, Aegilops speltoides, and Ae. tauschii, respectively. A public database containing genome-specific primers, SNPs, and other information was constructed. A total of 987 genes with nucleotide diversity estimated in one or more of the wheat genomes was placed on an Ae. tauschii genetic map, and the map was superimposed on wheat deletion-bin maps. The agreement between the maps was assessed.
Conclusions: In a young polyploid, exemplified by T. aestivum, ancestral species are the primary source of genetic diversity. Low effective recombination due to self-pollination and a genetic mechanism precluding homoeologous chromosome pairing during polyploid meiosis can lead to the loss of diversity from large chromosomal regions. The net effect of these factors in T. aestivum is large variation in diversity among genomes and chromosomes, which impacts the development of SNP markers and their practical utility. Accumulation of new mutations in older polyploid species, such as wild emmer, results in increased diversity and its more uniform distribution across the genome.
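The abstract above does not say which estimator of nucleotide diversity was used. As an illustration only, the sketch below computes one common estimator, the average number of pairwise differences per site (often written as pi), for a set of aligned haplotypes from a single genome. The function name, the toy sequences, and the assumption of equal-length alignments with no gaps or missing data are all hypothetical.

```python
from itertools import combinations


def nucleotide_diversity(sequences):
    """Average pairwise differences per site among aligned sequences.

    Assumes all sequences are aligned, of equal length, and free of
    gaps or missing data (a simplification for illustration).
    """
    if len(sequences) < 2:
        raise ValueError("need at least two sequences")
    length = len(sequences[0])
    if length == 0 or any(len(s) != length for s in sequences):
        raise ValueError("sequences must be non-empty and of equal length")

    pairs = list(combinations(sequences, 2))
    # Count mismatched sites for every unordered pair of sequences.
    total_differences = sum(
        sum(a != b for a, b in zip(first, second))
        for first, second in pairs
    )
    return total_differences / (len(pairs) * length)


# Hypothetical toy haplotypes, not data from the study.
toy_haplotypes = ["ACGT", "ACGA", "ACGT"]
toy_pi = nucleotide_diversity(toy_haplotypes)
```

In a polyploid, such a calculation is only meaningful once haplotypes have been allocated to their respective genomes, which is the requirement the abstract emphasizes; mixing homoeologous sequences would inflate the estimate.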