The quantity and complexity of data being generated and published in biology have increased substantially, but few methods exist for capturing knowledge about phenotypes derived from molecular interactions between diverse groups of species in a way that is amenable to data-driven biology and research. To improve access to this knowledge, we have constructed a framework for the curation of the scientific literature studying interspecies interactions, using data curated for the Pathogen-Host Interactions Database (PHI-base) as a case study. The framework provides a curation tool, a phenotype ontology and controlled vocabularies to curate pathogen-host interaction data (at the level of the host, pathogen, strain, gene and genotype). The concept of a multispecies genotype, the 'metagenotype', is introduced to capture changes in a pathogen's disease-causing ability, and in host resistance or susceptibility, that are observed following gene alterations. We report on this framework and describe PHI-Canto, a community curation tool for use by publication authors.
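One way to picture the 'metagenotype' is as a pairing of a pathogen genotype with a host genotype, to which a phenotype annotation is then attached. The Python sketch below is illustrative only: the class names, field names and example values are assumptions made for this example and do not reflect PHI-Canto's actual data model or PHI-base's schema.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Genotype:
    """A single-species genotype (hypothetical structure)."""
    species: str
    strain: str
    alleles: tuple[str, ...]  # gene alterations; empty tuple means wild type


@dataclass(frozen=True)
class Metagenotype:
    """A multispecies genotype: one pathogen genotype plus one host genotype."""
    pathogen: Genotype
    host: Genotype


@dataclass(frozen=True)
class PhenotypeAnnotation:
    """A phenotype observed for a metagenotype, tied to a publication."""
    metagenotype: Metagenotype
    phenotype_term: str  # placeholder for a phenotype ontology term
    publication: str     # placeholder for a publication identifier


# All values below are placeholders, not curated data.
wild_type_host = Genotype(
    species="host species (placeholder)", strain="strain H1", alleles=()
)
mutant_pathogen = Genotype(
    species="pathogen species (placeholder)",
    strain="strain P1",
    alleles=("geneX deletion",),
)
example = PhenotypeAnnotation(
    metagenotype=Metagenotype(pathogen=mutant_pathogen, host=wild_type_host),
    phenotype_term="reduced virulence (placeholder term)",
    publication="PMID:placeholder",
)
```

Keeping the pathogen and host genotypes as separate objects means that the same host genotype can be paired with several pathogen mutants, which is the kind of comparison the abstract describes.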
The Gene Ontology (GO) knowledgebase (http://geneontology.org) is a comprehensive resource concerning the functions of genes and gene products (proteins and non-coding RNAs). GO annotations cover genes from organisms across the tree of life as well as viruses, though most gene function knowledge currently derives from experiments carried out in a relatively small number of model organisms. Here, we provide an updated overview of the GO knowledgebase, as well as the efforts of the broad, international consortium of scientists that develops, maintains and updates the GO knowledgebase. The GO knowledgebase consists of three components: 1) the Gene Ontology – a computational knowledge structure describing functional characteristics of genes; 2) GO annotations – evidence-supported statements asserting that a specific gene product has a particular functional characteristic; and 3) GO Causal Activity Models (GO-CAMs) – mechanistic models of molecular “pathways” (GO biological processes) created by linking multiple GO annotations using defined relations. Each of these components is continually expanded, revised and updated in response to newly published discoveries, and receives extensive QA checks, reviews and user feedback. For each of these components, we provide a description of the current contents, recent developments to keep the knowledgebase up to date with new discoveries, as well as guidance on how users can best make use of the data we provide. We conclude with future directions for the project.
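The relationship between the second and third components can be sketched in a few lines of Python: an annotation is an evidence-supported statement about one gene product, and a causal model links several annotations with named relations. This is a minimal sketch under stated assumptions. The class names, methods, identifiers and the relation label are all hypothetical placeholders, and the code does not reproduce the GO Consortium's file formats or software.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Annotation:
    """An evidence-supported statement about one gene product (hypothetical)."""
    gene_product: str
    go_term: str    # placeholder for a GO term identifier
    evidence: str   # placeholder for an evidence code
    reference: str  # placeholder for a supporting publication


@dataclass
class CausalModel:
    """Annotations linked by named relations, in the spirit of a GO-CAM."""
    annotations: list[Annotation] = field(default_factory=list)
    edges: list[tuple[int, str, int]] = field(default_factory=list)

    def add(self, annotation: Annotation) -> int:
        """Store an annotation and return its index within the model."""
        self.annotations.append(annotation)
        return len(self.annotations) - 1

    def link(self, source: int, relation: str, target: int) -> None:
        """Connect two stored annotations with a named relation."""
        count = len(self.annotations)
        if not (0 <= source < count and 0 <= target < count):
            raise IndexError("source and target must be stored annotations")
        self.edges.append((source, relation, target))

    def downstream_of(self, source: int) -> list[Annotation]:
        """Return the annotations directly linked from the given one."""
        return [self.annotations[t] for s, _, t in self.edges if s == source]


# Placeholder values only; none of these are real identifiers.
model = CausalModel()
first = model.add(Annotation("geneA product", "GO:placeholder-1", "EXP", "ref-1"))
second = model.add(Annotation("geneB product", "GO:placeholder-2", "EXP", "ref-2"))
model.link(first, "upstream of (illustrative relation)", second)
```

Storing edges as index triples keeps each annotation usable on its own while still allowing it to take part in a larger model.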