Oilseed rape (Brassica napus L.) was formed ~7500 years ago by hybridization between B. rapa and B. oleracea, followed by chromosome doubling, a process known as allopolyploidy. Together with more ancient polyploidizations, this conferred an aggregate 72× genome multiplication since the origin of angiosperms and high gene content. We examined the B. napus genome and the consequences of its recent duplication. The constituent An and Cn subgenomes are engaged in subtle structural, functional, and epigenetic cross-talk, with abundant homeologous exchanges. Incipient gene loss and expression divergence have begun. Selection in B. napus oilseed types has accelerated the loss of glucosinolate genes, while preserving expansion of oil biosynthesis genes. These processes provide insights into allopolyploid evolution and its relationship with crop domestication and improvement. The Brassicaceae are a large eudicot family (1) and include the model plant Arabidopsis thaliana. Brassicas have a propensity for genome duplications (Fig. 1) and genome mergers (2). They are major contributors to the human diet and were among the earliest cultigens (3). B. napus (genome AnAnCnCn) was formed by recent allopolyploidy between ancestors of B. oleracea (Mediterranean cabbage, genome CoCo) and B. rapa (Asian cabbage or turnip, genome ArAr) and is polyphyletic (2, 4), with spontaneous formation regarded by Darwin as an example of unconscious selection (5). Cultivation began in Europe during the Middle Ages and spread worldwide. Diversifying selection gave rise to oilseed rape (canola), rutabaga, fodder rape, and kale morphotypes grown for oil, fodder, and food (4, 6). The homozygous B. napus genome of European winter oilseed cultivar 'Darmor-bzh' was assembled with long-read [>700 base pairs (bp)] 454 GS-FLX+ Titanium (Roche, Basel, Switzerland) and Sanger sequence (tables S1 to S5 and figs. S1 to S3) (7).
Correction and gap filling used 79 Gb of Illumina (San Diego, CA) HiSeq sequence. A final assembly of 849.7 Mb was obtained with SOAP (8) and Newbler (Roche), with 89% nongapped sequence (tables S2 and S3). Unique mapping of 5× nonassembled 454 sequences from B. rapa ('Chiifu') or B. oleracea ('TO1000') assigned most of the 20,702 B. napus scaffolds to either the An (8294) or the Cn (9984) subgenomes (tables S4 and S5 and fig. S3). The assembly covers ~79% of the 1130-Mb genome and includes 95.6% of Brassica expressed sequence tags (ESTs) (7). A single-nucleotide polymorphism (SNP) map (tables S6 to S9 and figs. S4 to S8) genetically anchored 712.3 Mb (84%) of the genome assembly, yielding pseudomolecules for the 19 chromosomes (table S10). The assembled Cn subgenome (525.8 Mb) is larger than the An subgenome (314.2 Mb), consistent with the relative sizes of the assembled Co genome of B. oleracea (540 Mb, 85% of the ~630-Mb genome) and the Ar genome of B. rapa (312 Mb, 59% of the ~530-Mb genome) (9-11). The B. napus assembly contains 34.8% transposable elements (TEs), less than the 40% estimated from raw reads (table...
The domesticated sunflower, Helianthus annuus L., is a global oil crop that has promise for climate change adaptation, because it can maintain stable yields across a wide variety of environmental conditions, including drought 1. Even greater resilience is achievable through the mining of resistance alleles from compatible wild sunflower relatives 2,3, including numerous extremophile species 4. Here we report a high-quality reference for the sunflower genome (3.6 gigabases), together with extensive transcriptomic data from vegetative and floral organs. The genome mostly consists of highly similar, related sequences 5 and required single-molecule real-time sequencing technologies for successful assembly. Genome analyses enabled the reconstruction of the evolutionary history of the Asterids, further establishing the existence of a whole-genome triplication at the base of the Asterids II clade 6 and a sunflower-specific whole-genome duplication around 29 million years ago 7. An integrative approach combining quantitative genetics, expression and diversity data permitted development of comprehensive gene networks for two major breeding traits, flowering time and oil metabolism, and revealed new candidate genes in these networks. We found that the genomic architecture of flowering time has been shaped by the most recent whole-genome duplication, which suggests that ancient paralogues can remain in the same regulatory networks for tens of millions of years. This genome represents a cornerstone for future research programs aiming to exploit genetic diversity to improve biotic and abiotic stress resistance and oil production, while also considering agricultural constraints and human nutritional needs 8,9. As the only major crop domesticated in North America, with its sun-like inflorescence that inspired artists, the sunflower is both a social icon and a major research focus for scientists.
In evolutionary biology, the Helianthus genus is a long-time model for hybrid speciation and adaptive introgression 10. In plant science, the sunflower is a model for understanding solar tracking 11 and inflorescence development 12. Despite this strong interest, assembling its genome has been extremely difficult because it mainly consists of long and highly similar repeats. This complexity has challenged leading-edge assembly protocols for close to a decade 13. To overcome this challenge, we generated 102× sequencing coverage of the genome of the inbred line XRQ using 407 single-molecule real-time (SMRT) cells on the PacBio RS II platform. Production of 32 million very long reads allowed us to generate a genome assembly that captures 3 gigabases (Gb) (80% of the estimated genome size) in 13,957 sequence contigs. Four high-density genetic maps were combined with a sequence-based physical map to build the sequences of the 17 pseudo-chromosomes that anchor 97% of the gene content (Fig.
SNP genotyping arrays have been useful for many applications that require a large number of molecular markers such as high-density genetic mapping, genome-wide association studies (GWAS), and genomic selection. We report the establishment of a large maize SNP array and its use for diversity analysis and high density linkage mapping. The markers, taken from more than 800,000 SNPs, were selected to be preferentially located in genes and evenly distributed across the genome. The array was tested with a set of maize germplasm including North American and European inbred lines, parent/F1 combinations, and distantly related teosinte material. A total of 49,585 markers, including 33,417 within 17,520 different genes and 16,168 outside genes, were of good quality for genotyping, with an average failure rate of 4% and rates up to 8% in specific germplasm. To demonstrate this array's use in genetic mapping and for the independent validation of the B73 sequence assembly, two intermated maize recombinant inbred line populations – IBM (B73×Mo17) and LHRF (F2×F252) – were genotyped to establish two high density linkage maps with 20,913 and 14,524 markers respectively. 172 mapped markers were absent in the current B73 assembly and their placement can be used for future improvements of the B73 reference sequence. Colinearity of the genetic and physical maps was mostly conserved with some exceptions that suggest errors in the B73 assembly. Five major regions containing non-colinearities were identified on chromosomes 2, 3, 6, 7 and 9, and are supported by both independent genetic maps. Four additional non-colinear regions were found on the LHRF map only; they may be due to a lower density of IBM markers in those regions or to true structural rearrangements between lines. Given the array's high quality, it will be a valuable resource for maize genetics and many aspects of maize breeding.
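The comparison of genetic and physical marker order described above can be illustrated with a small sketch. It is not the authors' procedure: it assumes each marker carries a physical position (Mb) and a genetic position (cM) on one chromosome, and it flags the smallest set of markers that must be set aside for the genetic order to agree with the physical order (the complement of a longest non-decreasing subsequence). The function name, the tuple layout, and the toy data are hypothetical.

```python
from typing import List, Tuple

# (marker name, physical position in Mb, genetic position in cM); layout is an assumption
Marker = Tuple[str, float, float]


def non_colinear_markers(markers: List[Marker]) -> List[str]:
    """Return markers outside a longest run whose genetic order follows physical order."""
    ordered = sorted(markers, key=lambda m: m[1])  # sort by physical position
    n = len(ordered)
    if n == 0:
        return []

    best = [1] * n   # best[i]: length of longest non-decreasing cM run ending at i
    prev = [-1] * n  # back-pointer used to rebuild that run
    for i in range(n):
        for j in range(i):
            if ordered[j][2] <= ordered[i][2] and best[j] + 1 > best[i]:
                best[i] = best[j] + 1
                prev[i] = j

    # Walk back from the end of the longest run to collect colinear markers.
    end = max(range(n), key=lambda i: best[i])
    colinear = set()
    while end != -1:
        colinear.add(end)
        end = prev[end]

    return [ordered[i][0] for i in range(n) if i not in colinear]


# Toy example (invented values): m3 maps far from where its physical position predicts.
toy = [
    ("m1", 1.0, 0.0),
    ("m2", 2.0, 5.0),
    ("m3", 3.0, 40.0),
    ("m4", 4.0, 12.0),
    ("m5", 5.0, 20.0),
]
flagged = non_colinear_markers(toy)
print(flagged)
```

A marker flagged in two independent maps is a stronger hint of an assembly error than one flagged in a single map, which may instead reflect low marker density or a true rearrangement between lines.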
Theobroma cacao L. is a diploid tree fruit species (2n = 2x = 20 (ref. 1)) endemic to the South American rainforests. Cocoa was domesticated approximately 3,000 years ago 2 in Central America 3. The Criollo cocoa variety, having a nearly unique and homozygous genotype, was among the first to be cultivated 4. Criollo is now one of the two cocoa varieties providing fine flavor chocolate. However, due to its poor agronomic performance and disease susceptibility, more vigorous hybrids created with foreign (Forastero) genotypes have been introduced. These hybrids, named Trinitario, are now widely cultivated 5. Here we report the sequence of a Belizean Criollo plant 6. Consumers have shown an increased interest in high-quality chocolate, and in dark chocolate, which contains a higher percentage of cocoa 7. Fine-cocoa production is nevertheless estimated to be less than 5% of the world cocoa production due to the low productivity and disease susceptibility of the traditional fine-flavor cocoa varieties. Therefore, breeding of improved Criollo varieties is important for sustainable production of fine-flavor cocoa. 3.7 million tons of cocoa are produced annually (see URLs). However, fungal, oomycete and viral diseases, as well as insect pests, are responsible for an estimated 30% of harvest losses (see URLs). As with many other tropical crops, knowledge of T. cacao genetics and genomics is limited. To accelerate progress in cocoa breeding and the understanding of its biochemistry, we sequenced and analyzed the genome
Pea (Pisum sativum L., 2n = 14) is the second most important grain legume in the world after common bean and is an important green vegetable with 14.3 t of dry pea and 19.9 t of green pea produced in 2016 (http://www.fao.org/faostat/). Pea belongs to the Leguminosae (or Fabaceae), which includes cool season grain legumes from the Galegoid clade, such as pea, lentil (Lens culinaris Medik.), chickpea (Cicer arietinum L.), faba bean (Vicia faba L.) and tropical grain legumes from the Milletoid clade, such as common bean (Phaseolus vulgaris L.), cowpea (Vigna unguiculata (L.) Walp.) and mungbean (Vigna radiata (L.) R. Wilczek). It provides significant ecosystem services: it is a valuable source of dietary proteins, mineral nutrients, complex starch and fibers with demonstrated health benefits 1-4 and its symbiosis with N-fixing soil bacteria reduces the need for applied N fertilizers, thereby mitigating greenhouse gas emissions 5-7. Pea was domesticated ~10,000 years
Rose is the world's most important ornamental plant, with economic, cultural and symbolic value. Roses are cultivated worldwide and sold as garden roses, cut flowers and potted plants. Roses are outbred and can have various ploidy levels. Our objectives were to develop a high-quality reference genome sequence for the genus Rosa by sequencing a doubled haploid, combining long and short reads, and anchoring to a high-density genetic map, and to study the genome structure and genetic basis of major ornamental traits. We produced a doubled haploid rose line ('HapOB') from Rosa chinensis 'Old Blush' and generated a rose genome assembly anchored to seven pseudo-chromosomes (512 Mb with N50 of 3.4 Mb and 564 contigs). The length of 512 Mb represents 90.1-96.1% of the estimated haploid genome size of rose. Of the assembly, 95% is contained in only 196 contigs. The anchoring was validated using high-density diploid and tetraploid genetic maps. We delineated hallmark chromosomal features, including the pericentromeric regions, through annotation of transposable element families and positioned centromeric repeats using fluorescent in situ hybridization. The rose genome displays extensive synteny with the Fragaria vesca genome, and we delineated only two major rearrangements. Genetic diversity was analysed using resequencing data of seven diploid and one tetraploid Rosa species selected from various sections of the genus. Combining genetic and genomic approaches, we identified potential genetic regulators of key ornamental traits, including prickle density and the number of flower petals. A rose APETALA2/TOE homologue is proposed to be the major regulator of petal number in rose. This reference sequence is an important resource for studying polyploidization, meiosis and developmental processes, as we demonstrated for flower and prickle development. 
It will also accelerate breeding through the development of molecular markers linked to traits, the identification of the genes underlying them and the exploitation of synteny across Rosaceae.
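The contiguity figures quoted for the rose assembly (an N50 of 3.4 Mb, and 95% of the assembly held in 196 contigs) are both instances of the same statistic. The sketch below shows one common way to compute it, assuming only a list of contig lengths; the helper name `nx` and the toy lengths are hypothetical and are not taken from the paper.

```python
from typing import List, Tuple


def nx(lengths: List[int], fraction: float) -> Tuple[int, int]:
    """Return (contig length, contig count) at which the longest contigs,
    taken in decreasing order, first cover `fraction` of the assembly."""
    if not lengths:
        raise ValueError("need at least one contig length")
    if not 0 < fraction <= 1:
        raise ValueError("fraction must be in (0, 1]")

    target = fraction * sum(lengths)
    running = 0
    for count, length in enumerate(sorted(lengths, reverse=True), start=1):
        running += length
        if running >= target:
            return length, count
    # Only reachable through floating-point rounding at fraction == 1.
    return min(lengths), len(lengths)


# Toy contig lengths (arbitrary units), invented for illustration.
toy_lengths = [2, 2, 2, 3, 3, 4, 8, 8]

n50_length, n50_count = nx(toy_lengths, 0.5)    # classic N50 and the matching contig count
n95_length, n95_count = nx(toy_lengths, 0.95)   # contigs needed to hold 95% of the assembly
print(n50_length, n50_count, n95_length, n95_count)
```

Reporting the contig count alongside the length mirrors the abstract's two statements: the length summarizes contiguity, while the count says how few contigs carry most of the sequence.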
A new version of the grapevine reference genome assembly (12X.v2) and of its annotation (VCost.v3
Grapevine is a very important crop species that is cultivated worldwide, mainly for fruit, wine and juice. Identification of the genetic bases of performance traits through association mapping studies requires a precise knowledge of the available diversity and how this diversity is structured and varies across the whole genome. An 18k SNP genotyping array was evaluated on a panel of Vitis vinifera cultivars and we obtained a data set with no missing values for a total of 10,207 SNPs and 783 different genotypes. The average inter-SNP spacing was ~47 kbp, the mean minor allele frequency (MAF) was 0.23 and the genetic diversity in the sample was high (He = 0.32). Fourteen SNPs, chosen from those with the highest MAF values, were sufficient to identify each genotype in the sample. Parentage analysis revealed 118 full parentages and 490 parent-offspring duos, thus confirming the close pedigree relationships within the cultivated grapevine. Structure analyses also confirmed the main divisions due to an eastern-western gradient and human usage (table vs. wine). Using a multivariate approach, we refined the structure and identified a total of eight clusters. Both the genetic diversity (He, 0.26–0.32) and linkage disequilibrium (LD, 28.8–58.2 kbp) varied between clusters. Despite the short-span LD, we also identified some non-recombining haplotype blocks that may complicate association mapping. Finally, we performed a genome-wide association study that confirmed previous work and also identified new regions for important performance traits such as acidity. Taken together, these results contribute to a better knowledge of the genetics of the cultivated grapevine.
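The two summary statistics quoted above, minor allele frequency (MAF) and expected heterozygosity (He), can be computed per SNP from genotype calls. The sketch below is a minimal illustration, not the study's pipeline. It assumes biallelic SNPs coded as the count of one allele (0, 1 or 2) in diploid samples with no missing values, and it uses the simple estimator He = 2p(1 - p) without a small-sample correction. The function name and the toy genotypes are hypothetical.

```python
from typing import List, Tuple


def maf_and_he(genotypes: List[int]) -> Tuple[float, float]:
    """Return (minor allele frequency, expected heterozygosity) for one biallelic SNP.

    `genotypes` holds the count (0, 1 or 2) of one allele in each diploid sample.
    """
    if not genotypes:
        raise ValueError("need at least one genotype")
    if any(g not in (0, 1, 2) for g in genotypes):
        raise ValueError("genotypes must be coded 0, 1 or 2")

    p = sum(genotypes) / (2 * len(genotypes))  # frequency of the counted allele
    maf = min(p, 1 - p)                        # frequency of the rarer allele
    he = 2 * p * (1 - p)                       # Hardy-Weinberg expected heterozygosity
    return maf, he


# Toy genotype vectors (invented): one balanced SNP and one with a rare allele.
balanced = maf_and_he([0, 1, 1, 2])
rare = maf_and_he([0, 0, 0, 1])

# Panel-level values such as a mean MAF are plain averages over SNPs.
mean_maf = (balanced[0] + rare[0]) / 2
print(balanced, rare, mean_maf)
```

Because He peaks at 0.5 when both alleles are equally frequent, SNPs with high MAF carry the most information per marker, which is consistent with the abstract's choice of high-MAF SNPs for identifying genotypes.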