Cultivated strawberry emerged from the hybridization of two wild octoploid species, both descended from the merger of four diploid progenitor species into a single nucleus more than 1 million years ago. Here we report a near-complete chromosome-scale assembly for cultivated octoploid strawberry (Fragaria × ananassa) and uncover the origin and evolutionary processes that shaped this complex allopolyploid. We identified the extant relatives of each diploid progenitor species and provide support for the North American origin of octoploid strawberry. We examined the dynamics among the four subgenomes in octoploid strawberry and uncovered the presence of a single dominant subgenome with significantly greater gene content, gene expression abundance, and biased exchanges between homoeologous chromosomes, as compared with the other subgenomes. Pathway analysis showed that certain metabolomic and disease-resistance traits are largely controlled by the dominant subgenome. These findings and the reference genome should serve as a powerful platform for future evolutionary studies and enable molecular breeding in strawberry.
Significance: Worldwide, potato is the third most important crop grown for direct human consumption, but breeders have struggled to produce new varieties that outperform those released over a century ago, as evidenced by the most widely grown North American cultivar (Russet Burbank) released in 1876. Despite its importance, potato genetic diversity at the whole-genome level remains largely unexplored. Analysis of cultivated potato and its wild relatives using modern genomics approaches can provide insight into the genomic diversity of extant germplasm, reveal historic introgressions and hybridization events, and identify genes targeted during domestication that control variance for agricultural traits, all critical information to address food security in 21st century agriculture.
Allo-octoploid cultivated strawberry (Fragaria × ananassa) originated through a combination of polyploid and homoploid hybridization, domestication of an interspecific hybrid lineage, and continued admixture of wild species over the last 300 years. While genes appear to flow freely between the octoploid progenitors, the genome structures and diversity of the octoploid species remain poorly understood. The complexity and absence of an octoploid genome frustrated early efforts to study chromosome evolution, resolve subgenomic structure, and develop a single coherent linkage group nomenclature. Here, we show that octoploid Fragaria species harbor millions of subgenome-specific DNA variants. Their diversity was sufficient to distinguish duplicated (homoeologous and paralogous) DNA sequences and develop 50K and 850K SNP genotyping arrays populated with co-dominant, disomic SNP markers distributed throughout the octoploid genome. Whole-genome shotgun genotyping of an interspecific segregating population yielded 1.9M genetically mapped subgenome variants in 5,521 haploblocks spanning 3,394 cM in F. chiloensis subsp. lucida, and 1.6M genetically mapped subgenome variants in 3,179 haploblocks spanning 2,017 cM in F. × ananassa. These studies provide a dense genomic framework of subgenome-specific DNA markers for seamlessly cross-referencing genetic and physical mapping information and unifying existing chromosome nomenclatures. Using comparative genomics, we show that geographically diverse wild octoploids are effectively diploidized, nearly completely collinear, and retain strong macro-synteny with diploid progenitor species. The preservation of genome structure among allo-octoploid taxa is a critical factor in the unique history of garden strawberry, where unimpeded gene flow supported its origin and domestication through repeated cycles of interspecific hybridization.
Clonally reproducing plants have the potential to bear a significantly greater mutational load than sexually reproducing species. To investigate this possibility, we examined the breadth of genome-wide structural variation in a panel of monoploid/doubled monoploid clones generated from native populations of diploid potato (Solanum tuberosum), a highly heterozygous asexually propagated plant. As rare instances of purely homozygous clones, they provided an ideal set for determining the degree of structural variation tolerated by this species and deriving its minimal gene complement. Extensive copy number variation (CNV) was uncovered, impacting 219.8 Mb (30.2%) of the potato genome with nearly 30% of genes subject to at least partial duplication or deletion, revealing the highly heterogeneous nature of the potato genome. Dispensable genes (>7000) were associated with limited transcription and/or a recent evolutionary history, with lower deletion frequency observed in genes conserved across angiosperms. Association of CNV with plant adaptation was highlighted by enrichment in gene clusters encoding functions for environmental stress response, with gene duplication playing a part in species-specific expansions of stress-related gene families. This study revealed unique impacts of CNV in a species with asexual reproductive habits and how CNV may drive adaptation through evolution of key stress pathways.
Cultivated strawberry (Fragaria × ananassa) is one of our youngest domesticates, originating in early eighteenth-century Europe from spontaneous hybrids between wild allo-octoploid species (F. chiloensis and F. virginiana). The improvement of horticultural traits by 300 years of breeding has enabled the global expansion of strawberry production. Here, we describe the genomic history of strawberry domestication from the earliest hybrids to modern cultivars. We observed a significant increase in heterozygosity among interspecific hybrids and a decrease in heterozygosity among domesticated descendants of those hybrids. Selective sweeps were found across the genome in early and modern phases of domestication; 59-76% of the selectively swept genes originated in the three less dominant ancestral subgenomes. Contrary to the tenet that genetic diversity is limited in cultivated strawberry, we found that the octoploid species harbor massive allelic diversity and that Fragaria × ananassa harbors as much allelic diversity as either wild founder. We identified 41.8M subgenome-specific DNA variants among resequenced wild and domesticated individuals. Strikingly, 98% of common alleles and 73% of total alleles were shared between wild and domesticated populations. Moreover, genome-wide estimates of nucleotide diversity were virtually identical in F. chiloensis, F. virginiana, and Fragaria × ananassa (π = 0.0059-0.0060). We found, however, that nucleotide diversity and heterozygosity were significantly lower in modern Fragaria × ananassa populations that have experienced significant genetic gains and have produced numerous agriculturally important cultivars.
Fusarium wilt, a soil-borne disease caused by the fungal pathogen Fusarium oxysporum f. sp. fragariae, threatens strawberry (Fragaria × ananassa) production worldwide. The spread of the pathogen, coupled with disruptive changes in soil fumigation practices, has greatly increased disease pressure and the importance of developing resistant cultivars. While resistant and susceptible cultivars have been reported, a limited number of germplasm accessions have been analyzed, and contradictory conclusions have been reached in earlier studies to elucidate the underlying genetic basis of resistance. Here, we report the discovery of Fw1, a dominant gene conferring resistance to Fusarium wilt in strawberry. The Fw1 locus was uncovered in a genome-wide association study of 565 historically and commercially important strawberry accessions genotyped with 14,408 SNP markers. Fourteen SNPs in linkage disequilibrium with Fw1 physically mapped to a 2.3 Mb segment on chromosome 2 in a diploid F. vesca reference genome. Fw1 and 11 tightly linked GWAS-significant SNPs mapped to linkage group 2C in octoploid segregating populations. The most significant SNP explained 85% of the phenotypic variability and predicted resistance in 97% of the accessions tested; broad-sense heritability was 0.96. Several disease resistance and defense-related gene homologs, including a small cluster of genes encoding nucleotide-binding leucine-rich-repeat proteins, were identified in the 0.7 Mb genomic segment predicted to harbor Fw1. DNA variants and candidate genes identified in the present study should facilitate the development of high-throughput genotyping assays for accurately predicting Fusarium wilt phenotypes and applying marker-assisted selection.
The fruits of diploid and octoploid strawberry (Fragaria spp.) show substantial natural variation in color due to distinct anthocyanin accumulation and distribution patterns. Anthocyanin biosynthesis is controlled by a clade of R2R3 MYB transcription factors, among which MYB10 is the main activator in strawberry fruit. Here, we show that mutations in MYB10 cause most of the variation in anthocyanin accumulation and distribution observed in diploid woodland strawberry (F. vesca) and octoploid cultivated strawberry (F. × ananassa). Using a mapping-by-sequencing approach, we identified a gypsy-transposon in MYB10 that truncates the protein and knocks out anthocyanin biosynthesis in a white-fruited F. vesca ecotype. Two additional loss-of-function mutations in MYB10 were identified among geographically diverse white-fruited F. vesca ecotypes. Genetic and transcriptomic analyses of octoploid Fragaria spp. revealed that FaMYB10-2, one of three MYB10 homoeologs identified, regulates anthocyanin biosynthesis in developing fruit. Furthermore, independent mutations in MYB10-2 are the underlying cause of natural variation in fruit skin and flesh color in octoploid strawberry. We identified a CACTA-like transposon (FaEnSpm-2) insertion in the MYB10-2 promoter of red-fleshed accessions that was associated with enhanced expression. Our findings suggest that cis-regulatory elements in FaEnSpm-2 are responsible for enhanced MYB10-2 expression and anthocyanin biosynthesis in strawberry fruit flesh.
The PacBio® HiFi sequencing method yields highly accurate long-read sequencing datasets with read lengths averaging 10–25 kb and accuracies greater than 99.5%. These accurate long reads can be used to improve results for complex applications such as single nucleotide and structural variant detection, genome assembly, assembly of difficult polyploid or highly repetitive genomes, and assembly of metagenomes. Currently, there is a need for sample data sets to both evaluate the benefits of these long accurate reads as well as for development of bioinformatic tools including genome assemblers, variant callers, and haplotyping algorithms. We present deep coverage HiFi datasets for five complex samples including the two inbred model genomes Mus musculus and Zea mays, as well as two complex genomes, octoploid Fragaria × ananassa and the diploid anuran Rana muscosa. Additionally, we release sequence data from a mock metagenome community. The datasets reported here can be used without restriction to develop new algorithms and explore complex genome structure and evolution. Data were generated on the PacBio Sequel II System.