Gossypium hirsutum has proven difficult to sequence owing to its complex allotetraploid (AtDt) genome. Here we produce a draft genome using 181-fold paired-end sequences assisted by fivefold BAC-to-BAC sequences and a high-resolution genetic map. In our assembly, 88.5% of the 2,173-Mb scaffolds, which cover 89.6–96.7% of the AtDt genome, are anchored and oriented to 26 pseudochromosomes. Comparison of this G. hirsutum AtDt genome with the already sequenced diploid Gossypium arboreum (AA) and Gossypium raimondii (DD) genomes revealed conserved gene order. Repeated sequences account for 67.2% of the AtDt genome, and transposable elements (TEs) originating from Dt seem more active than those originating from At. Reduction in the AtDt genome size occurred after allopolyploidization. The A or At genome may have undergone positive selection for fiber traits. Concerted evolution of different regulatory mechanisms for Cellulose synthase (CesA) and 1-Aminocyclopropane-1-carboxylic acid oxidase1 and 3 (ACO1,3) may be important for enhanced fiber production in G. hirsutum.
The ancestors of Gossypium arboreum and Gossypium herbaceum provided the A subgenome for the modern cultivated allotetraploid cotton. Here, we upgraded the G. arboreum genome assembly by integrating different technologies. We resequenced 243 G. arboreum and G. herbaceum accessions to generate a map of genome variations and found that they are equally diverged from Gossypium raimondii. Independent analysis suggested that Chinese G. arboreum originated in South China and was subsequently introduced to the Yangtze and Yellow River regions. Most accessions with domestication-related traits experienced geographic isolation. Genome-wide association study (GWAS) identified 98 significant peak associations for 11 agronomically important traits in G. arboreum. A nonsynonymous substitution (cysteine-to-arginine substitution) of GaKASIII seems to confer substantial fatty acid composition (C16:0 and C16:1) changes in cotton seeds. Resistance to fusarium wilt disease is associated with activation of GaGSTF9 expression. Our work represents a major step toward understanding the evolution of the A genome of cotton.
Cultivated cotton is one of the most economically important crop plants in the world. The allotetraploid Upland cotton, G. hirsutum (n = 2x = 26, (AD)1), currently dominates the world's cotton commerce (refs. 1,2). Hybridization between the Old World A-genome progenitor and a New World D-genome ancestor, followed by chromosome doubling, formed the allopolyploid cotton ~1–2 million years ago (Ma) (refs. 3,4). Uncertainty regarding the actual A-genome donor of the widely cultivated allotetraploid cotton G. hirsutum has persisted (refs. 5-13). A1 (n = x = 13) and A2 (n = x = 13), commonly known as African and Asiatic cotton, respectively, are the only two extant diploid A-genome species in the world (ref. 14). Stephens first proposed in Nature, using genetic and morphological evidence, that A2 was the A-genome donor of present-day allopolyploid cottons (ref. 6). Gerstel argued via cytogenetic studies that A1 was more closely related to the A-genome in the allopolyploids than A2 (ref. 8). Despite recent efforts to sequence the cotton genomes, including Gossypium raimondii (D5; refs. 15,16), A2 (refs. 17,18), (AD)1 (refs. 10,19-21) and Gossypium barbadense ((AD)2, a much less cultivated tetraploid cotton; refs. 10,21), the origin history of the A-genome donor for the tetraploid (AD)1-genome (refs. 5,11,13) and the extent of divergence between the A-genomes remain elusive (refs. 22,23). Abundant studies support a Gossypium species resembling D5 as the D-genome donor (ref. 13), but currently there is no solid evidence to establish whether the actual A-genome donor of tetraploid cottons is A2 (refs. 6,7,10,19) or A1 (refs. 8,9,11-13), as has been suggested. In this study, we assembled A1 variety africanum for the first time and reassembled high-quality A2 cultivar Shixiya1 and (AD)1 genetic standard Texas Marker-1 (TM-1) genomes on the basis of PacBio long reads, paired-end sequencing and high-throughput chromosome conformation capture (Hi-C) technologies.
Upon assembling and updating cotton genomes, we revealed the origin of cotton A-genomes, the occurrence of several transposable element (TE) bursts and the genetic divergence of diploid A-genomes worldwide. We also identified abundant structural variations (SVs) that have affected the expression of neighboring genes and help explain phenotypic differences among the cotton species.
Results
Sequencing and assembly of three high-quality cotton genomes. Here we sequenced the A1-genome var. africanum for the first time by generating ~225 gigabases (Gb) of PacBio single-molecule real-time (SMRT) long reads, at 138-fold genome coverage and with a read N50 (the minimum read length such that reads of at least that length cover 50% of the total length) of 13 kilobases (kb). We generated an assembly that captured 1,556 megabases (Mb) of genome sequence, consisting of 1,781 contigs with a contig N50 of 1,915 kb (Table 1). The initial assemblies were then corrected by using highly accurate Illumina paired-end reads (Supplementary Table 1). Finally, 95.69% of total contigs spanning 1,489 Mb were categorized and ordered into 13 chromosome-scale scaffold...
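The N50 statistic quoted for the reads and contigs can be illustrated with a minimal sketch. It assumes only a plain list of sequence lengths; the function name `n50` and the toy lengths are hypothetical and are not taken from the study's data or pipeline.

```python
def n50(lengths):
    """Return the N50 of a collection of sequence lengths.

    N50 is the length L such that sequences of length >= L
    together cover at least 50% of the total length.
    """
    if not lengths:
        raise ValueError("lengths must be non-empty")
    # Walk from the longest sequence downwards, accumulating length.
    ordered = sorted(lengths, reverse=True)
    half_total = sum(ordered) / 2
    running = 0
    for length in ordered:
        running += length
        if running >= half_total:
            return length


# Hypothetical toy contig lengths (arbitrary units), not real data.
toy_contigs = [100, 80, 60, 40, 20]
print(n50(toy_contigs))  # 80: 100 + 80 = 180 covers at least half of 300
```

Sorting in descending order means the loop stops at the shortest sequence still needed to reach half of the total, which is why a few very long contigs raise N50 sharply.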