Motivation
Coiled-coil domains (CCDs) are widespread in all organisms and perform several crucial functions. Given their relevance, the computational detection of coiled-coil domains is important for protein functional annotation. State-of-the-art prediction methods cover the precise identification of coiled-coil domain boundaries, the annotation of the typical heptad repeat pattern along the coiled-coil helices, and the prediction of the oligomerization state.
Results
In this paper we describe CoCoNat, a novel method for predicting coiled-coil helix boundaries, residue-level register annotation, and oligomerization state. Our method encodes sequences by combining two state-of-the-art protein language models and implements a three-step deep learning procedure concatenated with a Grammatical-Restrained Hidden Conditional Random Field (GRHCRF) for CCD identification and refinement. A final neural network (NN) predicts the oligomerization state. When tested on a routinely adopted blind test set, CoCoNat outperforms the current state of the art in both residue-level and segment-level coiled-coil detection. CoCoNat also significantly outperforms the most recent state-of-the-art methods on register annotation and prediction of oligomerization states.
Availability
The CoCoNat web server is available at https://coconat.biocomp.unibo.it. A standalone version is available on GitHub at https://github.com/BolognaBiocomp/coconat.
The epidemic spread of new pathogens is a frequent event that affects not only humans but also animals and plants, specifically livestock and crops. In the last few years, many novel pathogenic viruses have threatened human life. Some were mutations of the traditional influenza viruses, and some were viruses that crossed the animal-human divide. In both cases, when a novel virus or bacterial strain is released for which there is no pre-existing immunity or vaccine, there is the possibility of an epidemic or even a pandemic event, such as the one we are experiencing today with COVID-19. In this context, we defined an ELIXIR Service Bundle for Epidemic Response: a set of tools and workflows to facilitate and speed up the study of new pathogens, viruses or bacteria. The final goal of the bundle is to provide tools and resources to collect and analyse data on new pathogens (bacteria and viruses) and their relation to hosts (humans, animals, plants).