The domesticated sunflower, Helianthus annuus L., is a global oil crop that has promise for climate change adaptation, because it can maintain stable yields across a wide variety of environmental conditions, including drought 1. Even greater resilience is achievable through the mining of resistance alleles from compatible wild sunflower relatives 2,3, including numerous extremophile species 4. Here we report a high-quality reference for the sunflower genome (3.6 gigabases), together with extensive transcriptomic data from vegetative and floral organs. The genome mostly consists of highly similar, related sequences 5 and required single-molecule real-time sequencing technologies for successful assembly. Genome analyses enabled the reconstruction of the evolutionary history of the Asterids, further establishing the existence of a whole-genome triplication at the base of the Asterids II clade 6 and a sunflower-specific whole-genome duplication around 29 million years ago 7. An integrative approach combining quantitative genetics, expression and diversity data permitted development of comprehensive gene networks for two major breeding traits, flowering time and oil metabolism, and revealed new candidate genes in these networks. We found that the genomic architecture of flowering time has been shaped by the most recent whole-genome duplication, which suggests that ancient paralogues can remain in the same regulatory networks for tens of millions of years. This genome represents a cornerstone for future research programs aiming to exploit genetic diversity to improve biotic and abiotic stress resistance and oil production, while also considering agricultural constraints and human nutritional needs 8,9.

As the only major crop domesticated in North America, with its sun-like inflorescence that inspired artists, the sunflower is both a social icon and a major research focus for scientists.
In evolutionary biology, the Helianthus genus is a long-time model for hybrid speciation and adaptive introgression 10. In plant science, the sunflower is a model for understanding solar tracking 11 and inflorescence development 12.

Despite this broad interest, assembling its genome has been extremely difficult because it mainly consists of long and highly similar repeats. This complexity has challenged leading-edge assembly protocols for close to a decade 13.

To finally overcome this challenge, we generated 102× sequencing coverage of the genome of the inbred line XRQ using 407 single-molecule real-time (SMRT) cells on the PacBio RS II platform. Production of 32 million very long reads allowed us to generate a genome assembly that captures 3 gigabases (Gb) (80% of the estimated genome size) in 13,957 sequence contigs. Four high-density genetic maps were combined with a sequence-based physical map to build the sequences of the 17 pseudo-chromosomes that anchor 97% of the gene content (Fig.
SUMMARY

Flax (Linum usitatissimum) is an ancient crop that is widely cultivated as a source of fiber, oil and medicinally relevant compounds. To accelerate crop improvement, we performed whole-genome shotgun sequencing of the nuclear genome of flax. Seven paired-end libraries ranging in size from 300 bp to 10 kb were sequenced using an Illumina genome analyzer. A de novo assembly, composed exclusively of deep-coverage (approximately 94× raw, approximately 69× filtered) short-sequence reads (44-100 bp), produced a set of scaffolds with N50 = 694 kb, including contigs with N50 = 20.1 kb. The contig assembly contained 302 Mb of nonredundant sequence representing an estimated 81% genome coverage. Up to 96% of published flax ESTs aligned to the whole-genome shotgun scaffolds. However, comparisons with independently sequenced BACs and fosmids showed some mis-assembly of regions at the genome scale. A total of 43 384 protein-coding genes were predicted in the whole-genome shotgun assembly, and up to 93% of published flax ESTs, and 86% of A. thaliana genes aligned to these predicted genes, indicating excellent coverage and accuracy at the gene level. Analysis of the synonymous substitution rates (Ks) observed within duplicate gene pairs was consistent with a recent (5-9 MYA) whole-genome duplication in flax. Within the predicted proteome, we observed enrichment of many conserved domains (Pfam-A) that may contribute to the unique properties of this crop, including agglutinin proteins. Together these results show that de novo assembly, based solely on whole-genome shotgun short-sequence reads, is an efficient means of obtaining nearly complete genome sequence information for some plant species.
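The scaffold and contig N50 values quoted in the flax summary follow the usual assembly-statistics definition: the length L such that sequences of length L or longer contain at least half of the assembled bases. The following is a minimal Python sketch of that definition, not the pipeline used by the authors; the function name `n50` and the toy lengths are illustrative assumptions.

```python
def n50(lengths):
    """Return the N50 of a collection of sequence lengths.

    N50 is the length of the shortest sequence in the smallest set of
    longest sequences that together cover at least half of the total
    assembled bases. (Hypothetical helper, for illustration only.)
    """
    lengths = list(lengths)
    if not lengths:
        raise ValueError("need at least one sequence length")
    total = sum(lengths)
    running = 0
    # Walk from the longest sequence downwards, accumulating bases.
    for length in sorted(lengths, reverse=True):
        running += length
        # Integer comparison avoids floating-point issues with total / 2.
        if 2 * running >= total:
            return length


# Toy contig lengths (made-up numbers, not data from the flax assembly).
toy_contigs = [80, 70, 50, 40, 30, 20]
toy_n50 = n50(toy_contigs)
```

For the toy lengths the total is 290 bases; the two longest contigs (80 and 70) already cover 150 bases, so the N50 is 70. Because N50 depends only on the length distribution, it measures contiguity and says nothing about correctness, which is why the summary also reports comparisons against independently sequenced BACs and fosmids.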
Genomic studies of speciation often report the presence of highly differentiated genomic regions interspersed within a milieu of weakly diverged loci. The formation of these speciation islands is generally attributed to reduced inter-population gene flow near loci under divergent selection, but few studies have critically evaluated this hypothesis. Here, we report on transcriptome scans among four recently diverged pairs of sunflower (Helianthus) species that vary in the geographical context of speciation. We find that genetic divergence is lower in sympatric and parapatric comparisons, consistent with a role for gene flow in eroding neutral differences. However, genomic islands of divergence are numerous and small in all comparisons, and contrary to expectations, island number and size are not significantly affected by levels of interspecific gene flow. Rather, island formation is strongly associated with reduced recombination rates. Overall, our results indicate that the functional architecture of genomes plays a larger role in shaping genomic divergence than does the geography of speciation.
Populus trichocarpa is an ecologically important tree across western North America. We used a large population sample of 498 accessions over a wide geographical area genotyped with a 34K Populus SNP array to quantify geographical patterns of genetic variation in this species (landscape genomics). We present evidence that three processes contribute to the observed patterns: (1) introgression from the sister species P. balsamifera, (2) isolation by distance (IBD), and (3) natural selection. Introgression was detected only at the margins of the species' distribution. IBD was significant across the sampled area as a whole, but no evidence of restricted gene flow was detected in a core of drainages from southern British Columbia (BC). We identified a large number of FST outliers. Gene Ontology analyses revealed that FST outliers are overrepresented in genes involved in circadian rhythm and response to red/far-red light when the entire dataset is considered, whereas in southern BC heat response genes are overrepresented. We also identified strong correlations between geoclimate variables and allele frequencies at FST outlier loci that provide clues regarding the selective pressures acting at these loci.
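FST outlier scans rest on a per-locus measure of allele-frequency differentiation among populations. The abstract does not state which estimator was used, so the sketch below assumes the simplest textbook form for a biallelic locus in two populations, FST = (HT - HS) / HT, where HT is the expected heterozygosity of the pooled population and HS is the mean within-population expected heterozygosity. The function name `wright_fst` and its parameters are hypothetical.

```python
def wright_fst(p1, p2, n1=1, n2=1):
    """Wright's FST for one biallelic locus in two populations.

    p1, p2: frequency of the same allele in each population (0..1).
    n1, n2: relative population weights (e.g. sample sizes).
    Assumption: simple (HT - HS) / HT form, with no sample-size
    correction such as Weir and Cockerham's estimator.
    """
    for p in (p1, p2):
        if not 0.0 <= p <= 1.0:
            raise ValueError("allele frequencies must lie in [0, 1]")
    if n1 <= 0 or n2 <= 0:
        raise ValueError("weights must be positive")
    w1 = n1 / (n1 + n2)
    w2 = n2 / (n1 + n2)
    p_bar = w1 * p1 + w2 * p2                  # pooled allele frequency
    h_t = 2 * p_bar * (1 - p_bar)              # total expected heterozygosity
    h_s = w1 * 2 * p1 * (1 - p1) + w2 * 2 * p2 * (1 - p2)  # mean within-pop
    if h_t == 0:
        return 0.0  # locus is monomorphic overall; no differentiation
    return (h_t - h_s) / h_t


# Illustrative frequencies only, not values from the Populus dataset.
fst_example = wright_fst(0.2, 0.8)
```

With equal weights and allele frequencies of 0.2 and 0.8, HT is 0.5 and HS is 0.32, giving FST = 0.36. Loci whose FST falls far in the upper tail of the genome-wide distribution are the outliers treated as candidates for selection.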