The aim of the UniProt Knowledgebase is to provide users with a comprehensive, high-quality and freely accessible set of protein sequences annotated with functional information. In this article, we describe significant updates that we have made over the last two years to the resource. The number of sequences in UniProtKB has risen to approximately 190 million, despite continued work to reduce sequence redundancy at the proteome level. We have adopted new methods of assessing proteome completeness and quality. We continue to extract detailed annotations from the literature to add to reviewed entries and supplement these in unreviewed entries with annotations provided by automated systems such as the newly implemented Association-Rule-Based Annotator (ARBA). We have developed a credit-based publication submission interface to allow the community to contribute publications and annotations to UniProt entries. We describe how UniProtKB responded to the COVID-19 pandemic through expert curation of relevant entries that were rapidly made available to the research community through a dedicated portal. UniProt resources are available under a CC-BY (4.0) license via the web at https://www.uniprot.org/.
Chemical shift variation in small-molecule ¹H NMR signals of biofluids complicates biomarker information recovery in metabonomic studies when using multivariate statistical and pattern recognition tools. Current peak realignment methods are generally time-consuming or align major peaks at the expense of minor peak shift accuracy. We present a novel recursive segment-wise peak alignment (RSPA) method to reduce variability in peak positions across the multiple ¹H NMR spectra used in metabonomic studies. The method refines a segmentation of reference and test spectra in a top-down fashion, sequentially subdividing the initial larger segments, as required, to improve the local spectral alignment. We also describe a general procedure that allows robust comparison of realignment quality of various available methods for a range of peak intensities. The RSPA method is illustrated with respect to 140 ¹H NMR rat urine spectra from a caloric restriction study and is compared with several other widely used peak alignment methods. We demonstrate the superior performance of the RSPA alignment over a wide range of peaks and its capacity to enhance interpretability and robustness of multivariate statistical tools. The approach is widely applicable for NMR-based metabolic studies and is potentially suitable for many other types of data sets such as chromatographic profiles and MS data.
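The top-down refinement described in this abstract can be illustrated with a minimal sketch. It is not the published RSPA implementation: it assumes that segments are split at their midpoint (the abstract does not say how segment boundaries are chosen), that alignment quality is scored by Pearson correlation, and that only integer point shifts are tried. The names `rspa_sketch`, `best_shift`, `shift_segment`, `corr`, `max_shift` and `min_len` are hypothetical.

```python
import numpy as np


def corr(a, b):
    """Pearson correlation of two equal-length vectors (0.0 if either is flat)."""
    a = a - a.mean()
    b = b - b.mean()
    denom = np.linalg.norm(a) * np.linalg.norm(b)
    return float(a @ b / denom) if denom > 0 else 0.0


def shift_segment(seg, s):
    """Shift seg by s points (positive = right), padding with the edge value."""
    if s == 0:
        return seg.copy()
    out = np.empty_like(seg)
    if s > 0:
        out[:s] = seg[0]
        out[s:] = seg[:-s]
    else:
        out[s:] = seg[-1]
        out[:s] = seg[-s:]
    return out


def best_shift(ref, seg, max_shift):
    """Integer shift of seg that maximizes correlation with ref.

    Shifts are tried in order of increasing magnitude, so ties favor
    the smallest displacement.
    """
    best, best_score = 0, -np.inf
    for s in sorted(range(-max_shift, max_shift + 1), key=abs):
        score = corr(ref, shift_segment(seg, s))
        if score > best_score:
            best, best_score = s, score
    return best


def rspa_sketch(ref, test, max_shift=10, min_len=16):
    """Align test to ref segment by segment, subdividing only when it helps.

    Assumption: segments are halved at the midpoint; a real method would
    choose boundaries that do not cut through peaks.
    """
    n = len(ref)
    limit = min(max_shift, n - 1)
    aligned = shift_segment(test, best_shift(ref, test, limit))
    if n < 2 * min_len:
        return aligned  # segment too short to subdivide further
    mid = n // 2
    left = rspa_sketch(ref[:mid], aligned[:mid], max_shift, min_len)
    right = rspa_sketch(ref[mid:], aligned[mid:], max_shift, min_len)
    candidate = np.concatenate([left, right])
    # Keep the finer segmentation only if it improves the local alignment.
    return candidate if corr(ref, candidate) > corr(ref, aligned) else aligned
```

Calling `rspa_sketch(ref, test)` on two one-dimensional NumPy arrays of equal length returns a realigned copy of `test`. Accepting a subdivision only when it raises the correlation mirrors the "as required" condition in the abstract, under the assumptions stated above.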
Motivation: The number of protein records in the UniProt Knowledgebase (UniProtKB: https://www.uniprot.org) continues to grow rapidly as a result of genome sequencing and the prediction of protein-coding genes. Providing functional annotation for these proteins presents a significant and continuing challenge. Results: In response to this challenge, UniProt has developed a method of annotation, known as UniRule, based on expertly curated rules, which integrates related systems (RuleBase, HAMAP, PIRSR, PIRNR) developed by the members of the UniProt consortium. UniRule uses protein family signatures from InterPro, combined with taxonomic and other constraints, to select sets of reviewed proteins which have common functional properties supported by experimental evidence. This annotation is propagated to unreviewed records in UniProtKB that meet the same selection criteria, most of which do not have (and are never likely to have) experimentally verified functional annotation. Release 2020_01 of UniProtKB contains 6,496 UniRule rules which provide annotation for 53 million proteins, accounting for 30% of the 178 million records in UniProtKB. UniRule provides scalable enrichment of annotation in UniProtKB. Availability: UniRule rules are integrated into UniProtKB and can be viewed at https://www.uniprot.org/unirule/. UniRule rules and the code required to run them are publicly available for researchers who wish to annotate their own sequences. The implementation used to run the rules is known as UniFIRE and is available at https://gitlab.ebi.ac.uk/uniprot-public/unifire.
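The selection-and-propagation idea in this abstract can be sketched in a few lines. This is an illustration only, not UniRule's rule format or the UniFIRE engine: the `Rule` and `Protein` classes, the `matches` and `propagate` functions, and all identifiers in the usage note are hypothetical, and a rule is reduced here to a set of required signatures plus a single taxonomic constraint.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Rule:
    """Hypothetical rule: signatures and a taxon that must all be satisfied."""
    rule_id: str
    required_signatures: frozenset  # family signature identifiers
    taxon: str                      # must appear in the protein's lineage
    annotations: tuple              # annotation strings to propagate


@dataclass
class Protein:
    """Hypothetical protein record."""
    accession: str
    signatures: set
    lineage: list
    reviewed: bool = False
    annotations: list = field(default_factory=list)


def matches(rule, protein):
    """True if the protein carries every required signature and the taxon."""
    return (rule.required_signatures <= protein.signatures
            and rule.taxon in protein.lineage)


def propagate(rules, proteins):
    """Copy rule annotations onto unreviewed proteins that meet the criteria."""
    for protein in proteins:
        if protein.reviewed:
            continue  # reviewed records keep their curated annotation
        for rule in rules:
            if matches(rule, protein):
                for note in rule.annotations:
                    if note not in protein.annotations:
                        protein.annotations.append(note)
    return proteins
```

For example, a rule built as `Rule("RULE_DEMO", frozenset({"SIG_A", "SIG_B"}), "Bacteria", ("example annotation",))` would annotate an unreviewed bacterial record carrying both signatures and leave records missing a signature, or already reviewed, unchanged.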
Multicellular organisms maintain the stability of their internal environment using metabolic and physiological regulatory mechanisms that are disrupted during disease. The loss of homeostatic control results in a complex set of disordered states that may lead to metabolic network failure and irreversible system damage. We have applied a new statistical entropy-based approach to quantify temporal systemic disorder (divergence of metabolic responses) in experimental patho-physiological states, via NMR-spectroscopy generated metabolic profiles of urine. A recovery (R-) potential metric has also been developed to evaluate the relative extent to which defined metabolic processes are perturbed in the context of a global system in terms of multiple changes in concentrations of biofluid components accompanying the disrupted functional activity. This approach is sensitive to physiological as well as pathological interventions. We show that global disruptions of metabolic processes, lesion reversibility, and disorder in metabolic responses to a stressor can be visualized via metabolic entropy metrics, giving insights into biological robustness and thus providing a new tool for assessing deviation from homeostatic regulation.