High-density single nucleotide polymorphism (SNP) genotyping arrays are a powerful tool for studying genomic patterns of diversity, inferring ancestral relationships between individuals in populations and studying marker–trait associations in mapping experiments. We developed a genotyping array including about 90 000 gene-associated SNPs and used it to characterize genetic variation in allohexaploid and allotetraploid wheat populations. The array includes a significant fraction of common genome-wide distributed SNPs that are represented in populations of diverse geographical origin. We used density-based spatial clustering algorithms to enable high-throughput genotype calling in complex data sets obtained for polyploid wheat. We show that these model-free clustering algorithms provide accurate genotype calling in the presence of multiple clusters including clusters with low signal intensity resulting from significant sequence divergence at the target SNP site or gene deletions. Assays that detect low-intensity clusters can provide insight into the distribution of presence–absence variation (PAV) in wheat populations. A total of 46 977 SNPs from the wheat 90K array were genetically mapped using a combination of eight mapping populations. The developed array and cluster identification algorithms provide an opportunity to infer detailed haplotype structure in polyploid wheat and will serve as an invaluable resource for diversity studies and investigating the genetic basis of trait variation in wheat.
An annotated reference sequence representing the hexaploid bread wheat genome in 21 pseudomolecules has been analyzed to identify the distribution and genomic context of coding and noncoding elements across the A, B, and D subgenomes. With an estimated coverage of 94% of the genome and containing 107,891 high-confidence gene models, this assembly enabled the discovery of tissue- and developmental stage–related coexpression networks by providing a transcriptome atlas representing major stages of wheat development. Dynamics of complex gene families involved in environmental adaptation and end-use quality were revealed at subgenome resolution and contextualized to known agronomic single-gene or quantitative trait loci. This community resource establishes the foundation for accelerating wheat research and application through improved understanding of wheat biology and genomics-assisted breeding.
An ordered draft sequence of the 17-gigabase hexaploid bread wheat (Triticum aestivum) genome has been produced by sequencing isolated chromosome arms. We have annotated 124,201 gene loci distributed nearly evenly across the homeologous chromosomes and subgenomes. Comparative gene analysis of wheat subgenomes and extant diploid and tetraploid wheat relatives showed that high sequence similarity and structural conservation are retained, with limited gene loss, after polyploidization. However, across the genomes there was evidence of dynamic gene gain, loss, and duplication since the divergence of the wheat lineages. A high degree of transcriptional autonomy and no global dominance was found for the subgenomes. These insights into the genome biology of a polyploid crop provide a springboard for faster gene isolation, rapid genetic marker development, and precise breeding to meet the needs of increasing food demand worldwide.
We produced a reference sequence of the 1-gigabase chromosome 3B of hexaploid bread wheat. By sequencing 8452 bacterial artificial chromosomes in pools, we assembled a sequence of 774 megabases carrying 5326 protein-coding genes, 1938 pseudogenes, and 85% of transposable elements. The distribution of structural and functional features along the chromosome revealed partitioning correlated with meiotic recombination. Comparative analyses indicated high wheat-specific inter- and intrachromosomal gene duplication activities that are potential sources of variability for adaption. In addition to providing a better understanding of the organization, function, and evolution of a large and polyploid genome, the availability of a high-quality sequence anchored to genetic maps will accelerate the identification of genes underlying important agronomic traits.
Transposable elements (TEs) are mobile, repetitive DNA sequences that are almost ubiquitous in prokaryotic and eukaryotic genomes. They have a large impact on genome structure, function and evolution. With the recent development of high-throughput sequencing methods, many genome sequences have become available, making possible comparative studies of TE dynamics at an unprecedented scale. Several methods have been proposed for the de novo identification of TEs in sequenced genomes. Most begin with the detection of genomic repeats, but the subsequent steps for defining TE families differ. High-quality TE annotations are available for the Drosophila melanogaster and Arabidopsis thaliana genome sequences, providing a solid basis for the benchmarking of such methods. We compared the performance of specific algorithms for the clustering of interspersed repeats and found that only a particular combination of algorithms detected TE families with good recovery of the reference sequences. We then applied a new procedure for reconciling the different clustering results and classifying TE sequences. The whole approach was implemented in a pipeline using the REPET package. Finally, we show that our combined approach highlights the dynamics of well defined TE families by making it possible to identify structural variations among their copies. This approach makes it possible to annotate TE families and to study their diversification in a single analysis, improving our understanding of TE dynamics at the whole-genome scale and for diverse species.
As the staple food for 35% of the world's population, wheat is one of the most important crop species. To date, sequence-based tools to accelerate wheat improvement are lacking. As part of the international effort to sequence the 17-billion-base-pair hexaploid bread wheat genome (2n = 6x = 42 chromosomes), we constructed a bacterial artificial chromosome (BAC)-based integrated physical map of the largest chromosome, 3B, that alone is 995 megabases. A chromosome-specific BAC library was used to assemble 82% of the chromosome into 1036 contigs that were anchored with 1443 molecular markers, providing a major resource for genetic and genomic studies. This physical map establishes a template for the remaining wheat chromosomes and demonstrates the feasibility of constructing physical maps in large, complex, polyploid genomes with a chromosome-based approach.
To improve our understanding of the organization and evolution of the wheat (Triticum aestivum) genome, we sequenced and annotated 13-Mb contigs (18.2 Mb) originating from different regions of its largest chromosome, 3B (1 Gb), and produced a 2x chromosome survey by shotgun Illumina/Solexa sequencing. All regions carried genes irrespective of their chromosomal location. However, gene distribution was not random, with 75% of them clustered into small islands containing three genes on average. A twofold increase of gene density was observed toward the telomeres likely due to high tandem and interchromosomal duplication events. A total of 3222 transposable elements were identified, including 800 new families. Most of them are complete but showed a highly nested structure spread over distances as large as 200 kb. A succession of amplification waves involving different transposable element families led to contrasted sequence compositions between the proximal and distal regions. Finally, with an estimate of 50,000 genes per diploid genome, our data suggest that wheat may have a higher gene number than other cereals. Indeed, comparisons with rice (Oryza sativa) and Brachypodium revealed that a high number of additional noncollinear genes are interspersed within a highly conserved ancestral grass gene backbone, supporting the idea of an accelerated evolution in the Triticeae lineages.
The grass family comprises the most important cereal crops and is a good system for studying, with comparative genomics, mechanisms of evolution, speciation, and domestication. Here, we identified and characterized the evolution of shared duplications in the rice (Oryza sativa) and wheat (Triticum aestivum) genomes by comparing 42,654 rice gene sequences with 6426 mapped wheat ESTs using improved sequence alignment criteria and statistical analysis. Intraspecific comparisons identified 29 interchromosomal duplications covering 72% of the rice genome and 10 duplication blocks covering 67.5% of the wheat genome. Using the same methodology, we assessed orthologous relationships between the two genomes and detected 13 blocks of colinearity that represent 83.1 and 90.4% of the rice and wheat genomes, respectively. Integration of the intraspecific duplications data with colinearity relationships revealed seven duplicated segments conserved at orthologous positions. A detailed analysis of the length, composition, and divergence time of these duplications and comparisons with sorghum (Sorghum bicolor) and maize (Zea mays) indicated common and lineage-specific patterns of conservation between the different genomes. This allowed us to propose a model in which the grass genomes have evolved from a common ancestor with a basic number of five chromosomes through a series of whole genome and segmental duplications, chromosome fusions, and translocations.
scite is a Brooklyn-based organization that helps researchers better discover and understand research articles through Smart Citations–citations that display the context of the citation and describe whether the article provides supporting or contrasting evidence. scite is used by students and researchers from around the world and is funded in part by the National Science Foundation and the National Institute on Drug Abuse of the National Institutes of Health.
hi@scite.ai
10624 S. Eastern Ave., Ste. A-614
Henderson, NV 89052, USA
Copyright © 2024 scite LLC. All rights reserved.
Made with 💙 for researchers
Part of the Research Solutions Family.