In plants, the two-component systems (TCSs) play important roles in regulating diverse biological processes, including responses to environmental stress stimuli. Within the soybean genome, the TCSs consist of at least 21 histidine kinases, 13 authentic and pseudo-phosphotransfers and 18 type-A, 15 type-B, 3 type-C and 11 pseudo-response regulator proteins. Structural and phylogenetic analyses of soybean TCS members with their Arabidopsis and rice counterparts revealed similar architecture of their TCSs. We identified a large number of closely homologous soybean TCS genes, which likely resulted from genome duplication. Additionally, we analysed tissue-specific expression profiles of those TCS genes, whose data are available from public resources. To predict the putative regulatory functions of soybean TCS members, with special emphasis on stress-responsive functions, we performed comparative analyses from all the TCS members of soybean, Arabidopsis and rice and coupled these data with annotations of known abiotic stress-responsive cis-elements in the promoter region of each soybean TCS gene. Our study provides insights into the architecture and a solid foundation for further functional characterization of soybean TCS elements. In addition, we provide a new resource for studying the conservation and divergence among the TCSs within plant species and/or between plants and other organisms.
Chinese liquorice/licorice (Glycyrrhiza uralensis) is a leguminous plant species whose roots and rhizomes have been widely used as a herbal medicine and natural sweetener. Whole-genome sequencing is essential for gene discovery studies and molecular breeding in liquorice. Here, we report a draft assembly of the approximately 379-Mb whole-genome sequence of strain 308-19 of G. uralensis; this assembly contains 34 445 predicted protein-coding genes. Comparative analyses suggested well-conserved genomic components and collinearity of gene loci (synteny) between the genome of liquorice and those of other legumes such as Medicago and chickpea. We observed that three genes involved in isoflavonoid biosynthesis, namely, 2-hydroxyisoflavanone synthase (CYP93C), 2,7,4′-trihydroxyisoflavanone 4′-O-methyltransferase/isoflavone 4′-O-methyltransferase (HI4OMT) and isoflavone-7-O-methyltransferase (7-IOMT) formed a cluster on the scaffold of the liquorice genome and showed conserved microsynteny with Medicago and chickpea. Based on the liquorice genome annotation, we predicted genes in the P450 and UDP-dependent glycosyltransferase (UGT) superfamilies, some of which are involved in triterpenoid saponin biosynthesis, and characterised their gene expression with the reference genome sequence. The genome sequencing and its annotations provide an essential resource for liquorice improvement through molecular breeding and the discovery of useful genes for engineering bioactive components through synthetic biology approaches.
Sequence-specific DNA-binding transcription factors (TFs) are often termed ‘master regulators’ which bind to DNA and either activate or repress gene transcription. We have computationally analysed the soybean genome sequence data and constructed a proper set of TFs based on the Hidden Markov Model profiles of DNA-binding domain families. Within the soybean genome, we identified 4342 loci encoding 5035 TF models which grouped into 61 families. We constructed a database named SoybeanTFDB containing the full compilation of soybean TFs and significant information such as functional motifs, full-length cDNAs, domain alignments, promoter regions, genomic organization and putative regulatory functions based on annotations of gene ontology (GO) inferred by comparative analysis with Arabidopsis. With particular interest in abiotic stress signalling, we analysed the promoter regions for all of the TF encoding genes as a means to identify abiotic stress responsive cis-elements as well as all types of cis-motifs provided by the PLACE database. SoybeanTFDB enables scientists to easily access cis-element and GO annotations to aid in the prediction of TF function and selection of TFs with functions of interest. This study provides a basic framework and an important user-friendly public information resource which enables analyses of transcriptional regulation in soybean.
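As a rough illustration of the kind of promoter scan the abstract above describes, the following Python sketch searches a promoter sequence on both strands for short cis-element motifs. It is not the authors' pipeline: the two motif strings, the function names and the example sequence are assumptions chosen for illustration, and a real analysis would use the full motif set from a curated resource such as PLACE.

```python
import re

# Illustrative motifs only (assumed for this sketch, not taken from the abstract).
# IUPAC ambiguity code: R = A or G.
MOTIFS = {
    "ABRE-like core": "ACGTG",
    "DRE-like core": "RCCGAC",
}

# Minimal IUPAC-to-regex table covering the codes used above.
IUPAC = {"A": "A", "C": "C", "G": "G", "T": "T",
         "R": "[AG]", "Y": "[CT]", "N": "[ACGT]"}

COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of an uppercase DNA string."""
    return seq.translate(COMPLEMENT)[::-1]


def motif_to_regex(motif):
    """Compile a motif into a regex; the lookahead lets overlapping hits match."""
    return re.compile("(?=" + "".join(IUPAC[base] for base in motif) + ")")


def scan_promoter(seq):
    """List (motif name, strand, start) hits for every motif in MOTIFS.

    Minus-strand starts are 0-based offsets in the reverse-complemented
    sequence, not in the original orientation.
    """
    seq = seq.upper()
    hits = []
    for name, motif in MOTIFS.items():
        pattern = motif_to_regex(motif)
        for strand, strand_seq in (("+", seq), ("-", reverse_complement(seq))):
            for match in pattern.finditer(strand_seq):
                hits.append((name, strand, match.start()))
    return hits


if __name__ == "__main__":
    # Hypothetical promoter fragment, made up for this example.
    for hit in scan_promoter("ttacgtgtcagccgacaa"):
        print(hit)
```

Counting such hits per gene gives a simple table of candidate stress-responsive elements; whether a hit is functional still has to be tested experimentally.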
The Triticeae Full-Length CDS Database (TriFLDB) contains available information regarding full-length coding sequences (CDSs) of the Triticeae crops wheat (Triticum aestivum) and barley (Hordeum vulgare) and includes functional annotations and comparative genomics features. TriFLDB provides a search interface using keywords for gene function and related Gene Ontology terms and a similarity search for DNA and deduced translated amino acid sequences to access annotations of Triticeae full-length CDS (TriFLCDS) entries. Annotations consist of similarity search results against several sequence databases and domain structure predictions by InterProScan. The deduced amino acid sequences in TriFLDB are grouped with the proteome datasets for Arabidopsis (Arabidopsis thaliana), rice (Oryza sativa), and sorghum (Sorghum bicolor) by hierarchical clustering in stepwise thresholds of sequence identity, providing hierarchical clustering results based on full-length protein sequences. The database also provides sequence similarity results based on comparative mapping of TriFLCDSs onto the rice and sorghum genome sequences, which together with current annotations can be used to predict gene structures for TriFLCDS entries. To provide the possible genetic locations of full-length CDSs, TriFLCDS entries are also assigned to the genetically mapped cDNA sequences of barley and diploid wheat, which are currently accommodated in the Triticeae Mapped EST Database. These relational data are searchable from the search interfaces of both databases. The current TriFLDB contains 15,871 full-length CDSs from barley and wheat and includes putative full-length cDNAs for barley and wheat, which are publicly accessible. This informative content provides an informatics gateway for Triticeae genomics and grass comparative genomics. TriFLDB is publicly available at http://TriFLDB.psc.riken.jp/.
Cassava is an important crop that provides food security and income generation in many tropical countries and is known for its adaptability to various environmental conditions. Despite its global importance, the development of cassava microarray tools has not been well established. Here, we describe the development of a 60-mer oligonucleotide Agilent microarray representing ∼20 000 cassava genes and how it can be applied to expression profiling under drought stress using three cassava genotypes (MTAI16, MECU72 and MPER417-003). Our results identified about 1300 drought stress up-regulated genes in cassava and indicated that cassava has similar mechanisms for drought stress response and tolerance as other plant species. These results demonstrate that our microarray is a useful tool for analysing the cassava transcriptome and that it is applicable for various cassava genotypes.
A large collection of full-length cDNAs is essential for the correct annotation of genomic sequences and for the functional analysis of genes and their products. We obtained a total of 39 936 soybean cDNA clones (GMFL01 and GMFL02 clone sets) in a full-length-enriched cDNA library which was constructed from soybean plants that were grown under various developmental and environmental conditions. Sequencing from 5′ and 3′ ends of the clones generated 68 661 expressed sequence tags (ESTs). The EST sequences were clustered into 22 674 scaffolds involving 2580 full-length sequences. In addition, we sequenced 4712 full-length cDNAs. After removing overlaps, we obtained 6570 new full-length sequences of soybean cDNAs so far. Our data indicated that 87.7% of the soybean cDNA clones contain complete coding sequences in addition to 5′- and 3′-untranslated regions. All of the obtained data confirmed that our collection of soybean full-length cDNAs covers a wide variety of genes. Comparative analysis between the derived sequences from soybean and Arabidopsis, rice or other legumes data revealed that some specific genes were involved in our collection and a large part of them could be annotated to unknown functions. A large set of soybean full-length cDNA clones reported in this study will serve as a useful resource for gene discovery from soybean and will also aid a precise annotation of the soybean genome.
Medicinal and industrial properties of phytochemicals (e.g. glycyrrhizin) from the root of Glycyrrhiza uralensis (licorice plant) made it an attractive, multimillion-dollar trade item. Bioengineering is one of the solutions to overcome such high market demand and to protect plants from extinction. Unfortunately, limited genomic information on medicinal plants restricts their research and thus biosynthetic mechanisms of many important phytochemicals are still poorly understood. In this work we utilized the de novo (no reference genome sequence available) assembly of Illumina RNA-Seq data to study the transcriptome of the licorice plant. Our analysis is based on sequencing results of libraries constructed from samples belonging to different tissues (root and leaf) and collected in different seasons and from two distinct strains (low and high glycyrrhizin producers). We provide functional annotations and the expression profile of 43,882 assembled unigenes, which are suitable for various further studies. Here, we searched for G. uralensis-specific enzymes involved in isoflavonoid biosynthesis as well as elucidated putative cytochrome P450 enzymes and putative vacuolar saponin transporters involved in glycyrrhizin production in the licorice root. To disseminate the data and the analysis results, we constructed a publicly available G. uralensis database. This work will contribute to a better understanding of the biosynthetic pathways of secondary metabolites in licorice plants, and possibly in other medicinal plants, and will provide an important resource to further advance transcriptomic studies in legumes.
The Rubiaceae species, Ophiorrhiza pumila, accumulates camptothecin, an anti-cancer alkaloid with a potent DNA topoisomerase I inhibitory activity, as well as anthraquinones that are derived from the combination of the isochorismate and hemiterpenoid pathways. The biosynthesis of these secondary products is active in O. pumila hairy roots yet very low in cell suspension culture. Deep transcriptome analysis was conducted in O. pumila hairy roots and cell suspension cultures using the Illumina platform, yielding a total of 2 Gb of sequence for each sample. We generated a hybrid transcriptome assembly of O. pumila using the Illumina-derived short read sequences and conventional Sanger-derived expressed sequence tag clones derived from a full-length cDNA library constructed using RNA from hairy roots. Among 35,608 non-redundant unigenes, 3,649 were preferentially expressed in hairy roots compared with cell suspension culture. Candidate genes involved in the biosynthetic pathway for the monoterpenoid indole alkaloid camptothecin were identified; specifically, genes involved in post-strictosamide biosynthetic events and genes involved in the biosynthesis of anthraquinones and chlorogenic acid. Untargeted metabolomic analysis by Fourier transform ion cyclotron resonance mass spectrometry (FT-ICR-MS) indicated that most of the proposed intermediates in the camptothecin biosynthetic pathway accumulated in hairy roots in a preferential manner compared with cell suspension culture. In addition, a number of anthraquinones and chlorogenic acid preferentially accumulated in hairy roots compared with cell suspension culture. These results suggest that deep transcriptome and metabolome data sets can facilitate the identification of genes and intermediates involved in the biosynthesis of secondary products including camptothecin in O. pumila.