John Zobolas scite author profile

Motivation: Combining multiple layers of information underlying biological complexity into a structured framework represents a challenge in systems biology. A key task is the formalization of such information in models describing how biological entities interact to mediate the response to external and internal signals. Several databases with signalling information focus on capturing, organizing and displaying signalling interactions by representing them as binary, causal relationships between biological entities. The curation efforts that build these individual databases demand a concerted effort to ensure interoperability among resources. Results: Aware of the enormous benefits of standardization efforts in the molecular interaction research field, representatives of the signalling network community agreed to extend the PSI-MI controlled vocabulary to include additional terms representing aspects of causal interactions. Here, we present a common standard for the representation and dissemination of signalling information: the PSI Causal Interaction tabular format (CausalTAB), which is an extension of the existing PSI-MI tab-delimited format, now designated PSI-MITAB 2.8. We define the new term ‘causal interaction’, and related child terms, which are children of the PSI-MI ‘molecular interaction’ term. The new vocabulary terms in this extended PSI-MI format will enable systems biologists to model large-scale signalling networks more precisely and with higher coverage than before. Availability and implementation: PSI-MITAB 2.8 format and the new reference implementation of PSICQUIC are available online (https://psicquic.github.io/ and https://psicquic.github.io/MITAB28Format.html). Supplementary information: Supplementary data are available at Bioinformatics online.

VSM-box: General-purpose Interface for Biocuration and Knowledge Representation

Vercruysse

Zobolas

Touré

et al. 2020

Preprint

VSM is a recently introduced method for entering and displaying any type of knowledge, in a form that is both semantically precise for computation and intuitive for human understanding. VSM is the combination of a new semantic model, and the design for a dedicated user interface to support it. Here we present the implementation of this user interface, as a sophisticated HTML-element, <vsm-box>, that can be embedded in any web-based curation app. We show how developers can use it for biocuration projects, customize it to particular end-user needs, and contribute to its growth. Vsm-box is open-source at https://github.com/vsmjs/vsm-box under the AGPL license, as a JavaScript (ES6) Vue.js web-component that runs in all modern web browsers. It is the capstone of the Vsmjs organization at https://github.com/vsmjs that groups its supporting modules. Extensive supplementary material on VSM and vsm-box is available at https://vsmjs.github.io.

UniBioDicts: Unified access to Biological Dictionaries

Zobolas

Touré

Kuiper

et al. 2020

Preprint

We present a set of software packages that provide uniform access to diverse biological vocabulary resources that are instrumental for current biocuration efforts and tools. The Unified Biological Dictionaries (UniBioDicts or UBDs) provide a single query-interface for accessing the online API services of leading biological data providers. Given a search string, UBDs return a list of matching term, identifier and metadata units from databases (e.g. UniProt), controlled vocabularies (e.g. PSI-MI), and ontologies (e.g. GO, via BioPortal). This can be coupled to, for instance, the ‘vsm-autocomplete’ module: an input field (user-interface component) that offers autocomplete lookup for these dictionaries. UBDs create a unified gateway for accessing life science concepts, helping curators find annotation terms across resources (based on descriptive metadata and unambiguous identifiers), and helping data users search and retrieve the right query terms.
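The single-query-interface idea described above can be illustrated with a minimal Python sketch. It is not the UniBioDicts API (the packages themselves are JavaScript modules): the class names `Entry`, `ToyDictionary` and `UnifiedLookup`, the in-memory data, and the `TOYP:`/`TOYO:` identifiers are all hypothetical and made up for illustration. The sketch only shows the general pattern of sending one search string to several vocabulary sources and merging the matching term, identifier and metadata units into one list.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Entry:
    """One match: a term, its identifier, and descriptive metadata."""
    term: str
    identifier: str
    source: str
    description: str = ""


class ToyDictionary:
    """Hypothetical in-memory stand-in for one vocabulary resource."""

    def __init__(self, source: str, entries: List[Entry]):
        self.source = source
        self.entries = entries

    def search(self, query: str) -> List[Entry]:
        # Case-insensitive prefix match, the kind an autocomplete field needs.
        q = query.lower()
        return [e for e in self.entries if e.term.lower().startswith(q)]


class UnifiedLookup:
    """Fans one search string out to every registered dictionary."""

    def __init__(self, dictionaries: List[ToyDictionary]):
        self.dictionaries = dictionaries

    def search(self, query: str) -> List[Entry]:
        matches: List[Entry] = []
        for dictionary in self.dictionaries:
            matches.extend(dictionary.search(query))
        # Shorter terms first, then alphabetical, so close matches lead.
        return sorted(matches, key=lambda e: (len(e.term), e.term.lower()))


# Made-up entries and identifiers; none of these come from a real database.
proteins = ToyDictionary("proteins", [
    Entry("kinase A", "TOYP:1", "proteins", "toy protein"),
    Entry("kinesin", "TOYP:2", "proteins", "toy protein"),
])
processes = ToyDictionary("processes", [
    Entry("kinase activity", "TOYO:1", "processes", "toy ontology term"),
    Entry("apoptosis", "TOYO:2", "processes", "toy ontology term"),
])

lookup = UnifiedLookup([proteins, processes])
results = lookup.search("kin")
for entry in results:
    print(entry.identifier, entry.term, entry.source)
```

Keeping each source behind the same `search` method is what lets a user-interface component stay unaware of which resource a match came from; the real packages additionally have to deal with remote API calls, which this sketch leaves out.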

CausalTab: PSI-MITAB 2.8 updated format for signaling data representation and dissemination

Perfetto

Acencio

Bradley

et al. 2018

Preprint

Combining multiple layers of information underlying biological complexity into a structured framework, and in particular deciphering the molecular mechanisms behind cellular phenotypes, represent two challenges in systems biology. A key task is the formalisation of such information in models describing how biological entities interact to mediate the response to external and internal signals. Several databases with signaling information, such as SIGNOR, SignaLink and IntAct, focus on capturing, organising and displaying signaling interactions by representing them as binary, causal relationships between biological entities. The curation efforts that build these individual databases demand a concerted effort to ensure interoperability among resources, through the development of a standardized exchange format, ontologies and controlled vocabularies supporting the domain of causal interactions. Aware of the enormous benefits of standardization efforts in the molecular interaction research field, representatives of the signalling network community agreed to extend the PSI-MI controlled vocabulary to include additional terms representing aspects of causal interactions. Here, we present a common standard for the representation and dissemination of signaling information: the PSI Causal Interaction tabular format (CausalTAB) which is an extension of the existing PSI-MI tab-delimited format, now designated MITAB2.8. We define the new term "causal interaction", and related child terms, which are children of the PSI-MI "molecular interaction" term. The new vocabulary terms in this extended PSI-MI format will enable systems biologists to model large-scale signaling networks more precisely and with higher coverage than before.
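As a rough illustration of what "binary, causal relationships" in a tab-delimited file look like to a modeller, the Python sketch below reads a toy three-column table into signed, directed edges. The three-column layout (source, target, effect), the entity names `A`, `B`, `C`, and the helper names are assumptions made for brevity; the actual PSI-MITAB 2.8 format has many more columns and is specified at the URL given by the authors, so this should not be read as a MITAB parser.

```python
import csv
import io

# Hypothetical toy table: source entity, target entity, causal effect.
# This is NOT the PSI-MITAB 2.8 column layout.
TOY_ROWS = (
    "A\tB\tup-regulates\n"
    "B\tC\tdown-regulates\n"
    "A\tC\tup-regulates\n"
)

# Map the illustrative effect labels to a sign for network modelling.
SIGN = {"up-regulates": 1, "down-regulates": -1}


def read_causal_edges(text):
    """Parse tab-delimited rows into (source, target, sign) tuples."""
    reader = csv.reader(io.StringIO(text), delimiter="\t")
    edges = []
    for row in reader:
        if not row:
            continue  # skip blank lines
        source, target, effect = row
        edges.append((source, target, SIGN[effect]))
    return edges


def regulators_of(edges, target):
    """Return {regulator: sign} for every edge pointing at `target`."""
    return {src: sign for src, tgt, sign in edges if tgt == target}


edges = read_causal_edges(TOY_ROWS)
print(edges)
print(regulators_of(edges, "C"))
```

A signed edge list like this is the minimal input for logical or Boolean network models; the controlled-vocabulary terms of the standard are what make the effect labels comparable across databases.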

Boolean function metrics can assist modelers to check and choose logical rules

Journal of Theoretical Biology

Monteiro

Kuiper

et al. 2022

CausalBuilder: Bringing the MI2CAST Causal Interaction Annotation Standard to the Curator

Touré

Kuiper

et al. 2020

Preprint

Molecular causal interactions are defined as regulatory connections between biological components. They are commonly retrieved from biological experiments, and can be used for connecting biological molecules into regulatory computational models that represent biological systems. However, including a molecular causal interaction in a model requires assessing its relevance to that model, based on detailed knowledge about the biomolecules, interaction type, and biological context. In order to standardize the representation of this knowledge in ‘causal statements’, we recently developed the MI2CAST guidelines. Here we introduce causalBuilder: an intuitive web-based curation interface for the annotation of molecular causal interactions that comply with the MI2CAST standard. The causalBuilder prototype essentially embeds the MI2CAST curation guidelines in its interface, and makes its rules easy for a curator to follow. In addition, causalBuilder serves as an original application of the VSM general-purpose curation technology, and provides both curators and tool developers with an interface that can be fully configured so that annotation focuses on selected MI2CAST concepts. After information is entered, the causalBuilder prototype produces genuine causal statements that can be exported in different formats.

IIb‐RAD‐sequencing coupled with random forest classification indicates regional population structuring and sex‐specific differentiation in salmon lice (Lepeophtheirus salmonis)

Guragain

Båtnes

et al. 2022

Ecology and Evolution

The aquaculture industry has been dealing with salmon lice problems forming serious threats to salmonid farming. Several treatment approaches have been used to control the parasite. Treatment effectiveness must be optimized, and the systematic genetic differences between subpopulations must be studied to monitor louse species and enhance targeted control measures. We have used IIb‐RAD sequencing in tandem with a random forest classification algorithm to detect the regional genetic structure of the Norwegian salmon lice and identify important markers for sex differentiation of this species. We identified 19,428 single nucleotide polymorphisms (SNPs) from 95 individuals of salmon lice. These SNPs, however, were not able to distinguish the differential structure of lice populations. Using the random forest algorithm, we selected 91 SNPs important for geographical classification and 14 SNPs important for sex classification. The geographically important SNP data substantially improved the genetic understanding of the population structure and classified regional demographic clusters along the Norwegian coast. We also uncovered SNP markers that could help determine the sex of the salmon louse. A large portion of the SNPs identified to be under directional selection was also ranked highly important by random forest. According to our findings, there is a regional population structure of salmon lice associated with the geographical location along the Norwegian coastline.

UniBioDicts: Unified access to Biological Dictionaries

Touré

Kuiper

et al. 2021

Summary: We present a set of software packages that provide uniform access to diverse biological vocabulary resources that are instrumental for current biocuration efforts and tools. The Unified Biological Dictionaries (UniBioDicts or UBDs) provide a single query-interface for accessing the online API services of leading biological data providers. Given a search string, UBDs return a list of matching term, identifier and metadata units from databases (e.g. UniProt), controlled vocabularies (e.g. PSI-MI) and ontologies (e.g. GO, via BioPortal). This functionality can be connected to input fields (user-interface components) that offer autocomplete lookup for these dictionaries. UBDs create a unified gateway for accessing life science concepts, helping curators find annotation terms across resources (based on descriptive metadata and unambiguous identifiers), and helping data users search and retrieve the right query terms. Availability and implementation: The UBDs are available through npm and the code is available in the GitHub organisation UniBioDicts (https://github.com/UniBioDicts) under the Affero GPL license. Supplementary information: Supplementary data are available at Bioinformatics online.