Long read reference genome-free reconstruction of a full-length transcriptome from Astragalus membranaceus reveals transcript variants involved in bioactive compound biosynthesis

Li, Jun; Harata-Lee, Yuka; Denton, Matthew D.; Feng, Qi; Rathjen, Judith R.; Qu, Zhibo; Adelson, David L.

doi:10.1038/celldisc.2017.31

Cited by 104 publications

(89 citation statements)

References 37 publications


“…The average length of consensus sequences ranged from 925 bp to 1,438 kb (Figure S1). The number of consensus transcripts significantly exceeded the expected number of expressed genes, however, this is consistent with other reference genome-free IsoSeq analyses (Li et al, 2017; Kuang et al, 2019; Yan et al, 2019). Inflated numbers of consensus transcripts can result from sequencing of multiple alternatively spliced isoforms of the same gene, sequencing of incompletely processed mRNA molecules (Martin et al, 2014), high sequence error rates preventing multiple sequences from the same transcript being collapsed into a consensus, divergent haplotypes of the same locus present in our clonally propagated, wild collected, or partially inbred starting material, or contamination of the original samples with mRNA from non-target organisms.…”

Section: Results

supporting

confidence: 89%

IsoSeq transcriptome assembly of C₃ panicoid grasses provides tools to study evolutionary change in the Panicoideae

Carvalho

Schnable

2019

Preprint


The number of plant species with genomic and transcriptomic data has been increasing rapidly. The grasses - Poaceae - have been well represented among species with published reference genomes. However, as a result the genomes of wild grasses are less frequently targeted by sequencing efforts. Sequence data from wild relatives of crop species in the grasses can aid the study of domestication, gene discovery for breeding and crop improvement, and improve our understanding of the evolution of C4 photosynthesis. Here we used long read sequencing technology to characterize the transcriptomes of three C3 panicoid grass species: Dichanthelium oligosanthes, Chasmanthium laxum, and Hymenachne amplexicaulis. Based on alignments to the sorghum genome we estimate that assembled consensus transcripts from each species capture between 54.2 and 65.7% of the conserved syntenic gene space in grasses. Genes co-opted into C4 were also well represented in this dataset, despite concerns that, because these genes might play roles unrelated to photosynthesis in the target species, they would be expressed at low levels and missed by transcript-based sequencing. A combined analysis using syntenic orthologous genes from grasses with published reference genomes and consensus long read sequences from these wild species was consistent with previously published phylogenies. It is hoped that this data, targeting underrepresented classes of species within the PACMAD grasses - wild species and species utilizing C3 photosynthesis - will aid in future studies of domestication and C4 evolution by decreasing the evolutionary distance between C4 and C3 species within this clade, enabling more accurate comparisons associated…



“…While H. amplexicaulis exhibited the shortest consensus transcript length, this was not reflected in a reduced number of complete ORFs - those containing both an in-frame ATG and stop codon and occupying at least 60% of the total transcript length. The number of consensus transcripts significantly exceeded the expected number of expressed genes; however, this is consistent with other reference genome-free IsoSeq analyses (Kuang, Sun, Wei, Li, & Sun, 2019; Li et al, 2017; Yan et al, 2019). Inflated numbers of consensus transcripts can result from sequencing of multiple alternatively spliced isoforms of the same gene, sequencing of incompletely processed mRNA molecules (Martin et al, 2014), high sequence error rates preventing multiple sequences from the same transcript being collapsed into a consensus, divergent haplotypes of the same locus present in our clonally propagated, wild collected, or partially inbred starting material, or contamination of the original samples with mRNA from non-target organisms.…”

supporting

confidence: 89%

IsoSeq transcriptome assembly of C₃ panicoid grasses provides tools to study evolutionary change in the Panicoideae

2020


The number of plant species with genomic and transcriptomic data has been increasing rapidly. The grasses—Poaceae—have been well represented among species with published reference genomes. However, as a result the genomes of wild grasses are less frequently targeted by sequencing efforts. Sequence data from wild relatives of crop species in the grasses can aid the study of domestication, gene discovery for breeding and crop improvement, and improve our understanding of the evolution of C4 photosynthesis. Here, we used long‐read sequencing technology to characterize the transcriptomes of three C3 panicoid grass species: Dichanthelium oligosanthes, Chasmanthium laxum, and Hymenachne amplexicaulis. Based on alignments to the sorghum genome, we estimate that assembled consensus transcripts from each species capture between 54.2% and 65.7% of the conserved syntenic gene space in grasses. Genes co‐opted into C4 were also well represented in this dataset, despite concerns that because these genes might play roles unrelated to photosynthesis in the target species, they would be expressed at low levels and missed by transcript‐based sequencing. A combined analysis using syntenic orthologous genes from grasses with published reference genomes and consensus long‐read sequences from these wild species was consistent with previously published phylogenies. It is hoped that these data, targeting underrepresented classes of species within the PACMAD grasses—wild species and species utilizing C3 photosynthesis—will aid in future studies of domestication and C4 evolution by decreasing the evolutionary distance between C4 and C3 species within this clade, enabling more accurate comparisons associated with evolution of the C4 pathway.


“…Moreover, small laboratories require high sequencing costs due to the need for long reads and high-depth short read sequences to be accurate in de novo assembly. Plants with large genomes pose even more difficult as in, for example, the common soybean crop, which has a genome size of ∼1.1Gb [21]. To improve the comprehensive accuracy of gene prediction, there is a need to introduce a new approach, the “Isoform sequencing (Iso-Seq).” Thanks to its long-read technology, Iso-Seq facilitates identifying new isoforms with a high level of accuracy [22].…”

Section: Introduction

mentioning

confidence: 99%

De novo transcriptome sequence of Senna tora provides insights into anthraquinone biosynthesis

Kang

Lee

et al. 2019

Preprint


Senna tora is an annual herb with rich source of anthraquinones that have tremendous pharmacological properties. However, there is little mention of genetic information for this species, especially regarding the biosynthetic pathways of anthraquinones. To understand the key genes and regulatory mechanism of anthraquinone biosynthesis pathways, we performed spatial and temporal transcriptome sequencing of S. tora using short RNA sequencing (RNA-Seq) and long-read isoform sequencing (Iso-Seq) technologies, and generated two unigene sets composed of 118,635 and 39,364, respectively. A comprehensive functional annotation and classification with multiple public databases identified array of genes involved in major secondary metabolite biosynthesis pathways and important transcription factor (TF) families (MYB, MYB-related, AP2/ERF, C2C2-YABBY, and bHLH). Differential expression analysis indicated that the expression level of genes involved in anthraquinone biosynthetic pathway regulates differently depending on the degree of tissues and seeds development. Furthermore, we identified that the amount of anthraquinone compounds were greater in late seeds than early ones. In conclusion, these results provide a rich resource for understanding the anthraquinone metabolism in S. tora.
