Public archiving in structural biology is well established with the Protein Data Bank (PDB; wwPDB.org) catering for atomic models and the Electron Microscopy Data Bank (EMDB; emdb-empiar.org) for 3D reconstructions from cryo-EM experiments. Even before the recent rapid growth in cryo-EM, there was an expressed community need for a public archive of image data from cryo-EM experiments for validation, software development, testing and training. Concomitantly, the proliferation of 3D imaging techniques for cells, tissues and organisms using volume EM (vEM) and X-ray tomography (XT) led to calls from these communities to publicly archive such data as well. EMPIAR (empiar.org) was developed as a public archive for raw cryo-EM image data and for 3D reconstructions from vEM and XT experiments and now comprises over a thousand entries totalling over 2 petabytes of data. EMPIAR resources include a deposition system, entry pages, facilities to search, visualize and download datasets, and a REST API for programmatic access to entry metadata. The success of EMPIAR also poses significant challenges for the future in dealing with the very fast growth in the volume of data and in enhancing its reusability.
Flaviviruses are responsible for serious diseases such as dengue, yellow fever, and Zika fever. Their genomes encode a polyprotein which, after cleavage, yields three structural and seven non-structural proteins. Homologous proteins can be studied through conservation and coevolution analysis of multiple sequence alignments, which usually reports positions that are strictly necessary for the structure and/or function of all members of a protein family, or that are involved in a sub-class-specific feature requiring the coevolution of sets of residues. This study provides a complete conservation and coevolution analysis of all flavivirus non-structural proteins, with results mapped onto all available well-annotated sequences. A literature review of the residues found in the analysis enabled us to compile available information on their roles and distribution among different flaviviruses. We also provide the mapping of conserved and coevolved residues for all sequences currently in SwissProt as supplementary material, so that particularities of different viruses can be easily analyzed.
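To make the idea of per-position conservation in a multiple sequence alignment concrete, here is a minimal sketch that scores each column by one minus its normalized Shannon entropy. It is only an illustration under stated assumptions: the function name `column_conservation`, the toy alignment, and the choice of entropy as the conservation measure are all hypothetical and are not taken from the study, which may use a different scoring scheme.

```python
from collections import Counter
import math


def column_conservation(alignment):
    """Score each alignment column as 1 - (Shannon entropy / log2(20)).

    `alignment` is a list of equal-length aligned sequences; gaps ("-")
    are ignored. A fully conserved column scores 1.0. This scoring is an
    illustrative assumption, not the method used in the study.
    """
    length = len(alignment[0])
    scores = []
    for i in range(length):
        residues = [seq[i] for seq in alignment if seq[i] != "-"]
        if not residues:
            # A column made only of gaps carries no conservation signal.
            scores.append(0.0)
            continue
        counts = Counter(residues)
        total = len(residues)
        entropy = -sum(
            (c / total) * math.log2(c / total) for c in counts.values()
        )
        scores.append(1.0 - entropy / math.log2(20))
    return scores


# Hypothetical toy alignment, not real flavivirus data.
toy_alignment = ["GDSGG", "GDSGA", "GDSAG", "GDTGG"]
scores = column_conservation(toy_alignment)
for position, score in enumerate(scores):
    print(position, round(score, 3))
```

Columns in which every sequence carries the same residue score 1.0, while columns with substitutions score lower; a real analysis would also need sequence weighting and a principled treatment of gaps.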
Trypsin-like serine proteases are a group of homologous enzymes that play multiple roles in both vertebrate and invertebrate organisms. Key properties of these enzymes include their activation from an inactive zymogen form to the active form by cleavage of N-terminal residues, the presence of a conserved catalytic triad, and distinct patterns of substrate selectivity among the members of the family. In this article, we apply a computational method, the decomposition of residue coevolution networks, to find sets of residues related to some of these key properties, especially zymogen activation. Positive selection detection, normal mode analysis, and the calculation of thermal couplings between residues in the bovine trypsinogen and bovine trypsin structures yielded further information for understanding the zymogen activation process and highlighted the importance of some residues of the coevolved sets during these transitions.
Summary: CONAN is a web application developed to detect specificity determinants and function-related sites through the analysis of amino acid co-variation networks, with an emphasis on local co-evolutionary constraints. The software allows the characterization of structurally and functionally relevant groups of residues and their relationship with subsets of sequences by automatic cross-referencing with GO terms, UniProtKB annotations and InterPro. Availability: CONAN is free and open source, distributed under the terms of the GPLv3 licence. The software is available both as a web application and as a Python script and can be accessed at http://bioinfo.icb.ufmg.br/conan. We also provide running instructions, the source code and a user guide.
Motivation: Computational studies of molecular evolution are usually performed on a multiple alignment of homologous sequences, in which sequences descending from a common ancestor are aligned so that equivalent residues are placed in the same position. Residue frequency patterns in a full alignment, or in a subset of its sequences, can be highly useful for suggesting positions under selection. Most methods for mapping co-evolving or specificity-determinant sites focus on positions; they do not consider residues that are specificity determinants in one subclass but variable in others. In addition, many methods are impractical for very large alignments, such as those obtained from Pfam, or require a priori information about the subclasses to be analyzed. Results: In this paper we apply complex network theory, widely used to analyze co-affiliation systems in social and ecological contexts, to map groups of functionally related residues. This methodology was initially evaluated in simulated environments and then applied to datasets from four different protein families, in which several specificity-determinant sets and functional motifs were successfully detected. Availability and implementation: The algorithms and datasets used in the development of this project are available at http://www.biocomp.icb.ufmg.br/biocomp/software-and-databases/networkstats/. Supplementary information: Supplementary data are available at Bioinformatics online.
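The residue-centred, co-affiliation view described above can be illustrated with a small sketch. It treats each (position, amino acid) pair as a node "affiliated" with the sequences that carry it, and links two nodes when their sets of carrier sequences overlap strongly. This is not the authors' algorithm: the function name `residue_network`, the Jaccard index as the overlap measure, the thresholds, and the toy alignment are all assumptions chosen for illustration.

```python
from itertools import combinations


def residue_network(alignment, min_jaccard=0.8, min_count=2):
    """Build a toy residue co-occurrence network from an alignment.

    Nodes are (position, amino_acid) pairs; an edge joins two nodes at
    different positions when the Jaccard index of the sets of sequences
    carrying them is at least `min_jaccard`. All parameters are
    illustrative assumptions, not values from the paper.
    """
    carriers = {}
    for seq_index, seq in enumerate(alignment):
        for pos, aa in enumerate(seq):
            if aa != "-":
                carriers.setdefault((pos, aa), set()).add(seq_index)

    n_sequences = len(alignment)
    # Drop rare residues and residues present in every sequence: the
    # latter are fully conserved and say nothing about subclasses.
    nodes = {
        node: seqs
        for node, seqs in carriers.items()
        if min_count <= len(seqs) < n_sequences
    }

    edges = {}
    for a, b in combinations(sorted(nodes), 2):
        if a[0] == b[0]:
            continue  # residues at the same position cannot co-occur
        shared = len(nodes[a] & nodes[b])
        either = len(nodes[a] | nodes[b])
        jaccard = shared / either
        if jaccard >= min_jaccard:
            edges[(a, b)] = jaccard
    return edges


# Hypothetical alignment: the first three sequences form a subclass with
# D at position 1 and K at position 2; the other three carry E at
# position 1 but are variable at position 2.
toy_alignment = ["ADKL", "ADKL", "ADKM", "AERL", "AESM", "AETL"]
edges = residue_network(toy_alignment)
print(edges)
```

In this toy case the only edge links D at position 1 with K at position 2, even though position 2 is variable in the second subclass, which is the kind of residue-level (rather than position-level) signal the abstract refers to. Working on sets of carrier sequences also means no subclass labels are needed in advance.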