The Gene Ontology Consortium (GOC) provides the most comprehensive resource currently available for computable knowledge regarding the functions of genes and gene products. Here, we report the advances of the consortium over the past two years. The new GO-CAM annotation framework was notably improved, and we formalized the model with a computational schema to check and validate the rapidly increasing repository of 2838 GO-CAMs. In addition, we describe the impacts of several collaborations to refine GO and report a 10% increase in the number of GO annotations, a 25% increase in annotated gene products, and over 9,400 new scientific articles annotated. As the project matures, we continue our efforts to review older annotations in light of newer findings and to maintain consistency with other ontologies. As a result, 20,000 annotations derived from experimental data were reviewed, corresponding to 2.5% of experimental GO annotations. The website (http://geneontology.org) was redesigned for quick access to documentation, downloads and tools. To maintain an accurate resource and support traceability and reproducibility, we have made available a historical archive covering the past 15 years of GO data with a consistent format and file structure for both the ontology and annotations.
We have investigated DNA segregation in E. coli by inserting multiple lac operator sequences into the chromosome near the origin of replication (oriC), in the hisC gene, a terminus marker, and into plasmids P1 and F. Expression of a GFP-LacI fusion protein allowed visualization of lac operator localization. oriC was shown to be specifically localized at or near the cell poles, and when duplicated, one copy moved to the site of new pole formation near the site of cell division. In contrast, P1 and F localized to the cell center and on duplication appeared to move rapidly to the quarter positions in the cell. Our analysis suggests that different active processes are involved in movement and localization of the chromosome and of the two plasmids during segregation.
The Gene Ontology (GO) Consortium (GOC, http://www.geneontology.org) is a community-based bioinformatics resource that classifies gene product function through the use of structured, controlled vocabularies. Over the past year, the GOC has implemented several processes to increase the quantity, quality and specificity of GO annotations. First, the number of manual, literature-based annotations has grown at an increasing rate. Second, as a result of a new ‘phylogenetic annotation’ process, manually reviewed, homology-based annotations are becoming available for a broad range of species. Third, the quality of GO annotations has been improved through a streamlined process for, and automated quality checks of, GO annotations deposited by different annotation groups. Fourth, the consistency and correctness of the ontology itself have increased through the use of automated reasoning tools. Finally, the GO has been expanded not only to cover new areas of biology through focused interaction with experts, but also to capture greater specificity in all areas of the ontology using tools for adding new combinatorial terms. The GOC works closely with other ontology developers to support integrated use of terminologies. The GOC supports its user community through the use of e-mail lists, social media and web-based resources.
The Gene Ontology (GO) knowledgebase (http://geneontology.org) is a comprehensive resource concerning the functions of genes and gene products (proteins and non-coding RNAs). GO annotations cover genes from organisms across the tree of life as well as viruses, though most gene function knowledge currently derives from experiments carried out in a relatively small number of model organisms. Here, we provide an updated overview of the GO knowledgebase, as well as the efforts of the broad, international consortium of scientists that develops, maintains and updates the GO knowledgebase. The GO knowledgebase consists of three components: 1) the Gene Ontology – a computational knowledge structure describing functional characteristics of genes; 2) GO annotations – evidence-supported statements asserting that a specific gene product has a particular functional characteristic; and 3) GO Causal Activity Models (GO-CAMs) – mechanistic models of molecular “pathways” (GO biological processes) created by linking multiple GO annotations using defined relations. Each of these components is continually expanded, revised and updated in response to newly published discoveries, and receives extensive QA checks, reviews and user feedback. For each of these components, we provide a description of the current contents, recent developments to keep the knowledgebase up to date with new discoveries, as well as guidance on how users can best make use of the data we provide. We conclude with future directions for the project.
Background: Manually annotated corpora are critical for the training and evaluation of automated methods to identify concepts in biomedical text. Results: This paper presents the concept annotations of the Colorado Richly Annotated Full-Text (CRAFT) Corpus, a collection of 97 full-length, open-access biomedical journal articles that have been annotated both semantically and syntactically to serve as a research resource for the biomedical natural-language-processing (NLP) community. CRAFT identifies all mentions of nearly all concepts from nine prominent biomedical ontologies and terminologies: the Cell Type Ontology, the Chemical Entities of Biological Interest ontology, the NCBI Taxonomy, the Protein Ontology, the Sequence Ontology, the entries of the Entrez Gene database, and the three subontologies of the Gene Ontology. The first public release includes the annotations for 67 of the 97 articles, reserving two sets of 15 articles for future text-mining competitions (after which these too will be released). Concept annotations were created based on a single set of guidelines, which has enabled us to achieve consistently high interannotator agreement. Conclusions: As the initial 67-article release contains more than 560,000 tokens (and the full set more than 790,000 tokens), our corpus is among the largest gold-standard annotated biomedical corpora. Unlike most others, the journal articles that comprise the corpus are drawn from diverse biomedical disciplines and are marked up in their entirety. Additionally, with a concept-annotation count of nearly 100,000 in the 67-article subset (and more than 140,000 in the full collection), the scale of conceptual markup is also among the largest of comparable corpora. The concept annotations of the CRAFT Corpus have the potential to significantly advance biomedical text mining by providing a high-quality gold standard for NLP systems.
The corpus, annotation guidelines, and other associated resources are freely available at http://bionlp-corpora.sourceforge.net/CRAFT/index.shtml.
The conditioning of culture medium by the production of growth-regulatory substances is a well-established phenomenon with eukaryotic cells. It has recently been shown that many prokaryotes are also capable of modulating growth, and in some cases sensing cell density, by production of extracellular signaling molecules, thereby allowing single-celled prokaryotes to function in some respects as multicellular organisms. As Escherichia coli shifts from exponential growth to stationary growth, many changes occur, including cell division leading to formation of short minicells and expression of numerous genes not expressed in exponential phase. An understanding of the coordination between the morphological changes associated with cell division and the physiological and metabolic changes is of fundamental importance to understanding regulation of the prokaryotic cell cycle. The ftsQA genes, which encode functions required for cell division in E. coli, are regulated by promoters P1 and P2, located upstream of the ftsQ gene. The P1 promoter is rpoS-stimulated and the second, P2, is regulated by a member of the LuxR subfamily of transcriptional activators, SdiA, exhibiting features characteristic of an autoinduction (quorum sensing) mechanism. The activity of SdiA is potentiated by N-acylhomoserine lactones, which are the autoinducers of luciferase synthesis in luminous marine bacteria as well as of pathogenesis functions in several pathogenic bacteria. A compound(s) produced by E. coli itself during growth in Luria Broth stimulates transcription from P2 in an SdiA-dependent process. Another substance(s) enhances transcription of rpoS and (perhaps indirectly) of ftsQA via promoter P1. It appears that this bimodal control mechanism may comprise a fail-safe system, such that transcription of the ftsQA genes may be properly regulated under a variety of different environmental and physiological conditions.
Summary: The phenomenon of cell-density-dependent control of gene expression, called autoinduction, has long been a subject of interest and investigation in bioluminescent marine bacteria. It is now becoming clear that many other bacteria, including animal and plant pathogens, use an autoinduction mechanism to regulate a variety of functions. Cell-density-dependent gene expression provides an excellent example of multicellular behaviour in the prokaryotic kingdom where a single cell is able to communicate and sense when a minimal population unit, a 'quorum' of bacteria, is achieved in order for certain behaviour of the population to be performed efficiently. Regulation of bacterial bioluminescence has been studied for many years and represents the best model system for understanding the mechanism of cell-density-dependent gene expression. This review will focus on transcriptional regulation of the Vibrio fischeri luminescence genes emphasizing the role of the transcriptional activator LuxR and possible autoinduction mechanisms that occur in E. coli. Alternative views and opinions regarding the molecular details of the autoinduction mechanism will be discussed. Overview: Vibrio fischeri is a marine bioluminescent bacterium which lives both as a symbiont of some marine fish and squid and as a free-living organism (for recent reviews see Dunlap and Greenberg, 1991; McFall-Ngai and Ruby, 1991; Ruby and McFall-Ngai, 1992). When free-living, V. fischeri exists at low cell densities and appears to be non-luminescent, ...
MicroRNA regulation of developmental and cellular processes is a relatively new field of study, and the available research data have not been organized to enable its inclusion in pathway and network analysis tools. The association of gene products with terms from the Gene Ontology is an effective method to analyze functional data, but until recently there has been no substantial effort dedicated to applying Gene Ontology terms to microRNAs. Consequently, when performing functional analysis of microRNA data sets, researchers have had to rely instead on the functional annotations associated with the genes encoding microRNA targets. In consultation with experts in the field of microRNA research, we have created comprehensive recommendations for the Gene Ontology curation of microRNAs. This curation manual will enable provision of a high-quality, reliable set of functional annotations for the advancement of microRNA research. Here we describe the key aspects of the work, including development of the Gene Ontology to represent this data, standards for describing the data, and guidelines to support curators making these annotations. The full microRNA curation guidelines are available on the GO Consortium wiki (http://wiki.geneontology.org/index.php/MicroRNA_GO_annotation_manual).