Theobroma cacao L. is a diploid tree fruit species (2n = 2x = 20 (ref. 1)) endemic to the South American rainforests. Cocoa was domesticated approximately 3,000 years ago 2 in Central America 3. The Criollo cocoa variety, having a nearly unique and homozygous genotype, was among the first to be cultivated 4. Criollo is now one of the two cocoa varieties providing fine-flavor chocolate. However, due to its poor agronomic performance and disease susceptibility, more vigorous hybrids created with foreign (Forastero) genotypes have been introduced. These hybrids, named Trinitario, are now widely cultivated 5. Here we report the sequence of a Belizean Criollo plant 6.

Consumers have shown an increased interest in high-quality chocolate and in dark chocolate, which contains a higher percentage of cocoa 7. Fine-cocoa production is nevertheless estimated to be less than 5% of world cocoa production, due to the low productivity and disease susceptibility of the traditional fine-flavor cocoa varieties. Therefore, breeding of improved Criollo varieties is important for sustainable production of fine-flavor cocoa.

Annually, 3.7 million tons of cocoa are produced (see URLs). However, fungal, oomycete and viral diseases, as well as insect pests, are responsible for an estimated 30% of harvest losses (see URLs). As with many other tropical crops, knowledge of T. cacao genetics and genomics is limited. To accelerate progress in cocoa breeding and the understanding of its biochemistry, we sequenced and analyzed the genome
A large set of 26 new reference transcriptomes dedicated to comparative population genomics in crops and wild relatives
Accepted Article
This article has been accepted for publication and undergone full peer review but has not been through the copyediting, typesetting, pagination and proofreading process, which may lead to differences between this version and the Version of Record. Please cite this article as doi: 10.1111/1755-0998.12587. This article is protected by copyright. All rights reserved.
Abstract
We produced a large dataset of reference transcriptomes to obtain new knowledge about the evolution of plant genomes and crop domestication. For this purpose, we validated an RNA-Seq data assembly protocol for comparative population genomics. For the validation, we assessed and compared the quality of de novo Illumina short-read assemblies using data from two crops for which an annotated reference genome was available, namely grapevine and sorghum. We used the same protocol to release 26 new transcriptomes of crop plants and wild relatives, including still understudied crops such as yam, pearl millet and fonio. The species list has a wide taxonomic representation, with 15 monocots and 11 eudicots. All contigs were annotated using BLAST, prot4EST, and Blast2GO. A distinctive feature of the dataset is that each crop is associated with closely related species, which will permit whole-genome comparative evolutionary studies between crops and their wild relatives. This large resource will thus serve research communities working on both crops and model organisms. All the data are available at