Interoperable genome annotation with GBOL, an extendable infrastructure for functional data mining

Dam, Jesse C. J. van; Koehorst, Jasper J.; Vik, Jon Olav; Schaap, Peter J.; Suárez-Diez, María

doi:10.1101/184747

Cited by 11 publications

(15 citation statements)

References 36 publications

(22 reference statements)

“…Gene predictions were directly stored in the SAPP semantic database (Koehorst et al, 2017). Structural feature description was done using the GBOL ontology (van Dam et al, 2017). Functional genome annotation was done with a standalone version of interproscan v5.24.63.0 (Zdobnov and Apweiler, 2001) in direct interaction with the SAPP database using the pfam31 (Bateman et al, 2004) database. The raw reads and full genome sequence are available from ENA (accession numbers PRJEB21769 and GCA_900248155).…”

Section: Methods (mentioning)

confidence: 99%

Forward Genetics by Genome Sequencing Uncovers the Central Role of the Aspergillus niger goxB Locus in Hydrogen Peroxide Induced Glucose Oxidase Expression

et al. 2018

Self Cite


Aspergillus niger is an industrially important source for gluconic acid and glucose oxidase (GOx), a secreted commercially important flavoprotein which catalyses the oxidation of β-D-glucose by molecular oxygen to D-glucolactone and hydrogen peroxide. Expression of goxC, the GOx encoding gene and the concomitant two step conversion of glucose to gluconic acid requires oxygen and the presence of significant amounts of glucose in the medium and is optimally induced at pH 5.5. The molecular mechanisms underlying regulation of goxC expression are, however, still enigmatic. Genetic studies aimed at understanding GOx induction have indicated the involvement of at least seven complementation groups, for none of which the molecular basis has been resolved. In this study, a mapping-by-sequencing forward genetics approach was used to uncover the molecular role of the goxB locus in goxC expression. Using the Illumina and PacBio sequencing platforms a hybrid high quality draft genome assembly of laboratory strain N402 was obtained and used as a reference for mapping of genomic reads obtained from the derivative NW103:goxB mutant strain. The goxB locus encodes a thioredoxin reductase. A deletion of the encoding gene in the N402 parent strain led to a high constitutive expression level of the GOx and the lactonase encoding genes required for the two-step conversion of glucose in gluconic acid and of the catR gene encoding catalase R. This high constitutive level of expression was observed to be irrespective of the carbon source and oxidative stress applied. A model clarifying the role of GoxB in the regulation of the expression of goxC involving hydrogen peroxide as second messenger is presented.

“…The genomic annotation (GFF3) and corresponding genomic sequence (FASTA) of N. gaditana were converted into a semantic framework using SAPP according to the GBOL ontology (Koehorst et al 2017; Van Dam et al 2017). Each RNA-seq dataset was mapped using the transcriptomics module using STAR 2.5 as the read mapping software (Dobin et al 2013).…”

Section: RNA-sequencing (mentioning)

confidence: 99%
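
The conversion quoted above (GFF3 annotation into a semantic framework) can be illustrated with a minimal sketch. It assumes nothing about SAPP's actual implementation: the namespace `http://example.org/sketch#`, the predicate names, and the function `gff3_line_to_triples` are hypothetical and are not the real GBOL vocabulary. It only shows how the nine tab-separated GFF3 columns might map onto subject-predicate-object triples.

```python
from urllib.parse import unquote

# Hypothetical namespace and predicate names, for illustration only;
# they are NOT the real GBOL vocabulary.
NS = "http://example.org/sketch#"


def gff3_line_to_triples(line):
    """Turn one GFF3 feature line into (subject, predicate, object) triples."""
    cols = line.rstrip("\n").split("\t")
    if len(cols) != 9:
        raise ValueError("a GFF3 feature line has exactly 9 tab-separated columns")
    seqid, source, ftype, start, end, score, strand, phase, attrs = cols

    # Column 9 holds semicolon-separated key=value pairs with URL-style escapes.
    attributes = {}
    for pair in attrs.split(";"):
        if pair:
            key, _, value = pair.partition("=")
            attributes[key] = unquote(value)

    # Fall back to a coordinate-based identifier when no ID attribute is given.
    feature_id = attributes.get("ID", f"{seqid}_{start}_{end}")
    subject = NS + feature_id

    triples = [
        (subject, NS + "type", NS + ftype),
        (subject, NS + "onSequence", NS + seqid),
        (subject, NS + "begin", int(start)),
        (subject, NS + "end", int(end)),
        (subject, NS + "strand", strand),
    ]
    if "Name" in attributes:
        triples.append((subject, NS + "name", attributes["Name"]))
    return triples


# Made-up feature line; the identifiers are placeholders, not real data.
example = "chr1\tsketch\tgene\t100\t900\t.\t+\t.\tID=gene0001;Name=hypothetical%20protein"
triples = gff3_line_to_triples(example)
for triple in triples:
    print(triple)
```

Each feature becomes one subject with several typed statements attached, which is what makes the data queryable by pattern rather than by column position.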

Time-dependent transcriptome profile of genes involved in triacylglycerol (TAG) and polyunsaturated fatty acid synthesis in Nannochloropsis gaditana during nitrogen starvation

et al. 2020

Self Cite


In this research, the gene expression of genes involved in lipid metabolism of the eustigmatophyte alga Nannochloropsis gaditana was measured by transcriptomic data. This microalga can be used as a source of triacylglycerol (TAG) and the omega-3 fatty acid eicosapentaenoic acid (EPA). Insight in TAG and EPA production and regulation are needed to improve their productivity. Nitrogen starvation induces TAG accumulation in N. gaditana. Previous research showed that during nitrogen starvation, EPA was translocated from the polar lipids to TAG and de novo synthesized in N. gaditana. Therefore, the expression levels of genes involved in fatty acid translocation and de novo TAG synthesis were measured. Furthermore, the genes involved in de novo EPA synthesis such as elongases and desaturases were studied. The expression levels were measured during the first hours of nitrogen starvation and the subsequent period of 14 days. One phospholipid:diacylglycerol acyltransferase (PDAT) gene involved in translocation of fatty acids from membrane lipids to TAG was upregulated. In addition, several lipases were upregulated, suggesting that these enzymes might be responsible for the translocation of EPA to TAG. Most desaturases and elongases involved in de novo EPA synthesis were downregulated during nitrogen starvation, except for Δ9 desaturase which was upregulated. This upregulation correlates with the increase in oleic acid. Due to the presence of many hypothetical genes, improvement in annotation is needed to increase our understanding of these pathways and their regulation.


“…A total of 5713 publicly available complete bacterial genomes were downloaded from the NCBI repository (November 2016) [40]. To prevent technical bias due to the use of different annotation tools and pipelines and different thresholds for assessing the significance of the inferred genetic elements, genomes were consistently structurally and functionally de-novo annotated using SAPP [22], an annotation platform implementing a strictly defined ontology [41].…”

Section: Genome Annotation (mentioning)

confidence: 99%

“…Genes were predicted using Prodigal (2.6.3) [43] and the identified proteins were functionally annotated using the Pfam library (version 30.0) within InterProScan (version 5.21-60.0) [25,44]. Annotations were automatically converted into RDF according to the GBOL ontology [41] and loaded into a semantic database for high-throughput annotation and analysis. For the retrieval of information, SPARQL was used (See supplementary file S5 for all queries used).…”

Section: Genome Annotation (mentioning)

confidence: 99%
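
The statement above describes loading annotations as RDF and retrieving them with SPARQL. The sketch below illustrates the underlying idea of triple-pattern matching using plain Python tuples instead of a real triple store. The predicate names (`encodes`, `hasDomain`), the gene and protein identifiers, and both helper functions are hypothetical; the `PF…` strings are used only as example values. None of this reproduces the queries in the cited supplementary file.

```python
# Hypothetical vocabulary and identifiers, for illustration only;
# these are NOT the real GBOL terms or real annotation results.
triples = {
    ("gene1", "encodes", "protein1"),
    ("gene2", "encodes", "protein2"),
    ("protein1", "hasDomain", "PF00001"),
    ("protein2", "hasDomain", "PF00002"),
    ("protein2", "hasDomain", "PF00001"),
}

# In SPARQL, the lookup implemented below would read roughly:
#   SELECT ?gene WHERE {
#     ?gene :encodes ?protein .
#     ?protein :hasDomain "PF00001" .
#   }


def match(pattern, store):
    """Yield every triple that fits the pattern; None acts as a wildcard."""
    for triple in store:
        if all(p is None or p == t for p, t in zip(pattern, triple)):
            yield triple


def genes_with_domain(domain, store):
    """Join two triple patterns: gene -> protein -> domain."""
    proteins = {s for s, _, _ in match((None, "hasDomain", domain), store)}
    return sorted(
        s for s, _, o in match((None, "encodes", None), store) if o in proteins
    )


result = genes_with_domain("PF00001", triples)
print(result)
```

A real SPARQL engine performs the same kind of join over shared variables, but with indexing and a standard query syntax, which is why a strictly defined ontology matters: the predicate names must be consistent across all loaded genomes for one query to work on every one of them.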

Expected and observed genotype complexity in prokaryotes: correlation between 16S-rRNA phylogeny and protein domain content

Koehorst, Schaap, Suárez-Diez 2018

Preprint

Self Cite


Background: The omnipresent 16S ribosomal RNA gene (16S-rRNA) is commonly used to identify and classify bacteria though it does not take into account the distinctive functional characteristics of taxa. We explored functional domain landscapes of over 5700 complete bacterial genomes, representing a wide coverage of the bacterial tree of life, and investigated to what extent the observed protein domain diversity correlates with the expected evolutionary diversity, using 16S-rRNA as metric for evolutionary distance. Results: Analysis of protein domains showed that 83% of the bacterial genes code for at least one of the 9722 domain classes identified. By comparing clade specific and global persistence scores, candidate horizontal gene transfer and signifying domains could be identified. 16S-rRNA and functional domain content distances were used to evaluate and compare species divergence and overall a sigmoid curve is observed. Already at close 16S-rRNA evolutionary distances, high levels of functional diversity can be observed. At a larger 16S-rRNA distance, functional differences accumulate at a relatively lower pace. Conclusions: Analysis of 16S-rRNA sequences in the same taxa suggests that, in many cases, additional means of classification are required to obtain reliable phylogenetic relationships. Whole genome protein domain class phylogenies correlate with, and complement 16S-rRNA sequence-based phylogenies. Moreover, domain-based phylogenies can be constructed over large evolutionary distances and provide an in-depth insight of the functional diversity within and among species and enables large scale functional comparisons. The increased granularity obtained paves way for new applications to better predict the relationships between genotype, physiology and ecology.
