Allotetraploid cotton species (Gossypium hirsutum and Gossypium barbadense) have long been cultivated worldwide for natural renewable textile fibers. Draft genome sequences of both species are available, but they are highly fragmented and incomplete 1-4. Here we report reference-grade genome assemblies and annotations for G. hirsutum accession Texas Marker-1 (TM-1) and G. barbadense accession 3-79, produced by integrating single-molecule real-time sequencing, BioNano optical mapping and high-throughput chromosome conformation capture techniques. Compared with previously assembled draft genomes 1,3, these genome sequences show considerable improvements in contiguity and completeness for repeat-rich regions such as centromeres. Comparative genomics analyses identify extensive structural variations that probably occurred after polyploidization, highlighted by large paracentric/pericentric inversions in 14 chromosomes. We constructed an introgression line population to introduce favorable chromosome segments from G. barbadense into G. hirsutum, allowing us to identify 13 quantitative trait loci associated with superior fiber quality. These resources will accelerate evolutionary and functional genomic studies in cotton and inform future breeding programs for fiber improvement.
Cotton represents the largest source of natural textile fibers in the world. Over 90% of annual fiber production comes from allotetraploid cotton (G. hirsutum and G. barbadense), which originated from an allopolyploidization event approximately 1-2 million years ago, followed by millennia of asymmetric subgenome selection 5,6. G. hirsutum is cultivated all over the world because of its high yield, and G. barbadense is prized for its superior fiber quality. One approach to breeding G. hirsutum that produces longer, finer and stronger fibers is to introduce the superior fiber traits of G. barbadense into G. hirsutum.
A genomics-enabled breeding strategy requires a detailed and robust understanding of genomic organization.
[Table: genomic features of G. hirsutum and G. barbadense; only the column headers survive]
[...] Dt, 0.56 × 10⁻³) (Fig. 1d and Supplementary Fig. 3). This shows that a large amount [...] is associated with the development of the long fiber trait in cultivated cotton (Fig. 3b).
Domestication has led to the transformation of cotton fiber from brown to white. To understand this phenomenon, we examined two homoeologous gene pairs subjected to domestication selection only in the Dt, 4-COUMARATE:COA LIGASE (4CL) and CHALCONE SYNTHASE (CHS), which encode enzymes involved in the phenylpropanoid metabolic pathway (Fig. 3c and Supplementary Fig. 6 [...] Fig. 3c). These SNPs display reductions in nucleotide diversity that occurred during domestication (Fig. 3c). Interestingly, we found that the two SNPs in the [...] Fig. 8) 42.
We identified a total of 188,360 DNase I-hypersensitive sites (DHSs) in cotton leaves and fibers, of which ca. 47% are common to both tissues (Fig. 4a). DHSs were preferentially identified in chromosomal arms, and approximately half were detected in promoter and intergenic regions (Fig. 4b and Supplementary Fig. 9). We found that DHSs are hypo-methylated, consistent with previous studies 42 (Fig. 4c) [...] H3K4me1 and inactive H3K9me2 (Fig. 4d). Intergenic DHSs were also found to exhibit an enrichment of H3K4me3 and H3K27me3, but depletion of H3K9me2 and no enrichment of H3K4me1 (Fig. 4e). As predicted, the patterns of chromatin modification marks in cotton differ between genic and TE regions (Supplementary Fig. 10). In addition, genes with promoter DHSs are generally expressed at a higher level in both tissues than those without promoter DHSs (Fig. 4f), and tissue-specific promoter DHSs corresponded to higher levels of gene expression (Fig. 4g) [...]
Hi-C analysis was carried out using the TM-1 accession to characterize global chromatin interactions. We generated 1.1 billion Hi-C paired-end reads, of which ca. [...] possible Hi-C bias, HindIII fragments of less than 2 kb were merged to obtain 305,682 chromosomal anchor regions (Fig. 5a). On the basis of a high-quality genome assembly of TM-1 (Supplementary Fig. 11), we used the Hi-C data to characterize the cotton chromatin interactome (Supplementary Fig. 12) and [...] (Fig. 5b), but many topologically associated domain-like (TAD-like) regions were identified (Fig. 5c, Supplementary Fig. 13 and Supplementary [...] are less frequent at regions marked by H3K9me2 (Fig. 5d). [...] (Fig. 5g). We...
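The merging of short HindIII fragments into anchor regions described above can be illustrated with a minimal sketch. The exact merging rule is not given in the text, so the greedy strategy here is an assumption: consecutive fragments are accumulated until the running anchor spans at least 2 kb, and a short trailing remainder is folded into the preceding anchor. The function name `merge_short_fragments` and the example coordinates are hypothetical.

```python
from typing import List, Tuple

Fragment = Tuple[int, int]  # (start, end) on one chromosome, half-open


def merge_short_fragments(fragments: List[Fragment],
                          min_len: int = 2000) -> List[Fragment]:
    """Greedily merge consecutive fragments into anchors of >= min_len bp.

    Assumption (not from the source): fragments are sorted, adjacent and
    non-overlapping, and a short leftover tail joins the previous anchor.
    """
    anchors: List[Fragment] = []
    cur_start = None
    for start, end in fragments:
        if cur_start is None:
            cur_start = start          # open a new anchor
        if end - cur_start >= min_len:
            anchors.append((cur_start, end))
            cur_start = None           # anchor is long enough, close it
    if cur_start is not None:          # leftover short tail
        last_end = fragments[-1][1]
        if anchors:
            prev_start, _ = anchors.pop()
            anchors.append((prev_start, last_end))
        else:
            anchors.append((cur_start, last_end))
    return anchors


# Hypothetical restriction fragments on one chromosome
example = [(0, 500), (500, 1200), (1200, 2600), (2600, 6000), (6000, 6300)]
print(merge_short_fragments(example))
```

With these made-up coordinates, the three leading short fragments collapse into one anchor and the 300 bp tail is absorbed by the last anchor. A real pipeline would apply the same step per chromosome to the in silico HindIII digest.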
Summary Gossypium hirsutum L. represents the largest source of textile fibre, and China is one of the largest cotton‐producing and cotton‐consuming countries in the world. To investigate the genetic architecture of the agronomic traits of upland cotton in China, a diverse and nationwide population containing 503 G. hirsutum accessions was collected for a genome‐wide association study (GWAS) on 16 agronomic traits. The accessions were planted in four places from 2012 to 2013 for phenotyping. The CottonSNP63K array and a published high‐density map based on this array were used for genotyping. The 503 G. hirsutum accessions were divided into three subpopulations based on 11 975 quantified polymorphic single‐nucleotide polymorphisms (SNPs). By comparing the genetic structure and phenotypic variation among three genetic subpopulations, seven geographic distributions and four breeding periods, we found that geographic distribution and breeding period were not the determinants of genetic structure. In addition, no obvious phenotypic differentiations were found among the three subpopulations, even though they had different genetic backgrounds. A total of 324 SNPs and 160 candidate quantitative trait loci (QTL) regions were identified as significantly associated with the 16 agronomic traits. A network was established for multieffects in QTLs and interassociations among traits. Thirty‐eight associated regions had pleiotropic effects controlling more than one trait. One candidate gene, Gh_D08G2376, was speculated to control the lint percentage (LP). This GWAS is the first report using high‐resolution SNPs in upland cotton in China to comprehensively investigate agronomic traits, and it provides a fundamental resource for cotton genetic research and breeding.
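As a rough illustration of how pleiotropic regions and a trait network could be derived from association results, the sketch below flags regions associated with more than one trait and counts, for each pair of traits, how many regions they share. It is not the method of the study: the function names, region identifiers and the trait labels `trait_B` and `trait_C` are hypothetical, and only LP (lint percentage) comes from the abstract.

```python
from collections import defaultdict
from itertools import combinations
from typing import Dict, Set, Tuple


def pleiotropic_regions(region_traits: Dict[str, Set[str]]) -> Dict[str, Set[str]]:
    """Keep only regions associated with more than one trait."""
    return {r: t for r, t in region_traits.items() if len(t) > 1}


def trait_network(region_traits: Dict[str, Set[str]]) -> Dict[Tuple[str, str], int]:
    """Edge weight = number of regions associated with both traits."""
    edges: Dict[Tuple[str, str], int] = defaultdict(int)
    for traits in region_traits.values():
        # sorted() makes each trait pair a canonical (a, b) key
        for a, b in combinations(sorted(traits), 2):
            edges[(a, b)] += 1
    return dict(edges)


# Hypothetical association results: region -> associated traits
example = {
    "region_1": {"LP", "trait_B"},
    "region_2": {"LP"},
    "region_3": {"LP", "trait_B", "trait_C"},
}
print(sorted(pleiotropic_regions(example)))
print(trait_network(example))
```

In this toy input, two of the three regions are pleiotropic, and LP and `trait_B` are linked by two shared regions.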
Summary Alternative splicing (AS) is a crucial regulatory mechanism in eukaryotes, which acts by greatly increasing transcriptome diversity. The extent and complexity of AS has been revealed in model plants using high-throughput next-generation sequencing. However, this technique is less effective in accurately identifying transcript isoforms in polyploid species because of the high sequence similarity between coexisting subgenomes. Here we characterize AS in the polyploid species cotton. Using Pacific Biosciences single-molecule long-read isoform sequencing (Iso-Seq), we developed an integrated pipeline for Iso-Seq transcriptome data analysis (https://github.com/Nextomics/pipeline-for-isoseq). We identified 176 849 full-length transcript isoforms from 44 968 gene models and updated gene annotation. These data led us to identify 15 102 fibre-specific AS events and estimate that c. 51.4% of homoeologous genes produce divergent isoforms in each subgenome. We reveal that AS allows differential regulation of the same gene by miRNAs at the isoform level. We also show that nucleosome occupancy and DNA methylation play a role in defining exons at the chromatin level. This study provides new insights into the complexity and regulation of AS, and will enhance our understanding of AS in polyploid species. Our methodology for Iso-Seq data analysis will be a useful reference for the study of AS in other species.
The formation of polyploids significantly increases the complexity of transcriptional regulation, which is expected to be reflected in sophisticated higher-order chromatin structures. However, knowledge of three-dimensional (3D) genome structure and its dynamics during polyploidization remains poor. Here, we characterize 3D genome architectures for diploid and tetraploid cotton, and find the existence of A/B compartments and topologically associated domains (TADs). By comparing each subgenome in tetraploids with its extant diploid progenitor, we find that genome allopolyploidization has contributed to the switching of A/B compartments and the reorganization of TADs in both subgenomes. We also show that the formation of TAD boundaries during polyploidization preferentially occurs in open chromatin, coinciding with the deposition of active chromatin modification. Furthermore, analysis of inter-subgenomic chromatin interactions has revealed the spatial proximity of homoeologous genes, possibly associated with their coordinated expression. This study advances our understanding of chromatin organization in plants and sheds new light on the relationship between 3D genome evolution and transcriptional regulation.
Summary The cotton fibre serves as a valuable experimental system to study cell wall synthesis in plants, but our understanding of the genetic regulation of this process during fibre development remains limited. We performed a genome‐wide association study (GWAS) and identified 28 genetic loci associated with fibre quality in allotetraploid cotton. To investigate the regulatory roles of these loci, we sequenced fibre transcriptomes of 251 cotton accessions and identified 15 330 expression quantitative trait loci (eQTL). Analysis of local eQTL and GWAS data prioritised 13 likely causal genes for differential fibre quality in a transcriptome‐wide association study (TWAS). Characterisation of distal eQTL revealed unequal genetic regulation patterns between two subgenomes, highlighted by an eQTL hotspot (Hot216) that established a genome‐wide genetic network regulating the expression of 962 genes. The primary regulatory role of Hot216, and specifically the gene encoding a KIP‐related protein, was found to be the transcriptional regulation of genes responsible for cell wall synthesis, which contributes to fibre length by modulating the developmental transition from rapid cell elongation to secondary cell wall synthesis. This study uncovered the genetic regulation of fibre‐cell development and revealed the molecular basis of the temporal modulation of secondary cell wall synthesis during plant cell elongation.
RAD sequencing was performed using DH962 and Jimian5 as upland cotton mapping parents. Sequencing data for DH962 and Jimian5 were assembled into the genome sequences of ≈55.27 and ≈57.06 Mb, respectively. Analysing genome sequences of the two parents, 1,323 SSR, 3,838 insertion/deletion (InDel), and 9,366 single-nucleotide polymorphism (SNP) primer pairs were developed. All of the SSRs, 121 InDels, 441 SNPs, and other 6,747 primer pairs were screened in the two parents, and a total of 535 new polymorphic loci were identified. A genetic map including 1,013 loci was constructed using these results and 506 loci previously published for this population. Twenty-seven new QTLs for yield and fibre quality were identified, indicating that the efficiency of QTL detection was greatly improved by the increase in map density. Comparative genomics showed there to be considerable homology and collinearity between the AT and A2 genomes and between the DT and D5 genomes, although there were a few exchanges and introgressions among the chromosomes of the A2 genome. Here, the development of markers using parental RAD sequencing was effective, and a high-density intraspecific genetic map was constructed. This map can be used for molecular marker-assisted selection in cotton.
Summary Brown fibre cotton is an environmentally friendly resource that plays a key role in the textile industry. However, the fibre quality and yield of natural brown cotton are poor, and fundamental research on brown cotton is relatively scarce. To understand the genetic basis of brown fibre cotton, we constructed linkage and association populations to systematically examine brown fibre accessions. We fine‐mapped the brown fibre region, Lc 1, and dissected it into two loci, qBF‐A07‐1 and qBF‐A07‐2. The qBF‐A07‐1 locus mediates the initiation of brown fibre production, whereas the shade of the brown fibre is affected by the interaction between qBF‐A07‐1 and qBF‐A07‐2. Gh_A07G2341 and Gh_A07G0100 were identified as candidate genes for qBF‐A07‐1 and qBF‐A07‐2, respectively. Haploid analysis of the signals significantly associated with these two loci showed that most tetraploid modern brown cotton accessions exhibit the introgression signature of Gossypium barbadense. We identified 10 quantitative trait loci (QTLs) for fibre yield and 19 QTLs for fibre quality through a genome‐wide association study (GWAS) and found that qBF‐A07‐2 negatively affects fibre yield and quality through an epistatic interaction with qBF‐A07‐1. This study sheds light on the genetics of fibre colour and lint‐related traits in brown fibre cotton, which will guide the breeding of elite brown fibre cotton cultivars.