CATH (https://www.cathdb.info) identifies domains in protein structures from wwPDB and classifies these into evolutionary superfamilies, thereby providing structural and functional annotations. There are two levels: CATH-B, a daily snapshot of the latest domain structures and superfamily assignments, and CATH+, with additional derived data, such as predicted sequence domains, and functionally coherent sequence subsets (Functional Families or FunFams). The latest CATH+ release, version 4.3, significantly increases coverage of structural and sequence data, with an addition of 65 351 fully classified domain structures (+15%), providing 500 238 structural domains, and 151 million predicted sequence domains (+59%) assigned to 5481 superfamilies. The FunFam generation pipeline has been re-engineered to cope with the increased influx of data. Three times more sequences are captured in FunFams, with a concomitant increase in functional purity, information content and structural coverage. FunFam expansion increases the structural annotations provided for experimental GO terms (+59%). We also present CATH-FunVar web-pages displaying variations in protein sequences and their proximity to known or predicted functional sites. We present two case studies: (1) putative cancer drivers and (2) SARS-CoV-2 proteins. Finally, we have improved links to and from CATH, including SCOP, InterPro, Aquaria and 2DProt.
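The reported +15% growth in classified domains can be checked with a few lines of arithmetic. This is a minimal sketch that assumes the percentage is measured against the domain count before the new additions (the abstract does not state the baseline); the function name `relative_increase` is hypothetical.

```python
def relative_increase(added: int, total_after: int) -> float:
    """Percentage growth relative to the count before the additions.

    Assumption: the baseline is total_after - added, i.e. the previous release.
    """
    previous = total_after - added
    return 100.0 * added / previous


# Figures quoted in the abstract for CATH+ version 4.3.
added_domains = 65_351
total_domains = 500_238

print(f"previous total: {total_domains - added_domains}")
print(f"increase: {relative_increase(added_domains, total_domains):.1f}%")
```

Under that assumption the result rounds to the 15% quoted in the abstract.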
Gene3D (http://gene3d.biochem.ucl.ac.uk) is a database of globular domain annotations for millions of available protein sequences. Gene3D has previously featured in the Database issue of NAR and here we report a significant update to the Gene3D database. The current release, Gene3D v16, has significantly expanded its domain coverage over the previous version and now contains over 95 million domain assignments. We also report a new method for dealing with complex domain architectures that exist in Gene3D, arising from discontinuous domains. Amongst other updates, we have added visualization tools for exploring domain annotations in the context of other sequence features and in gene families. We also provide web-pages to visualize other domain families that co-occur with a given query domain family.
Background Surgery is the main modality of cure for solid cancers and was prioritised to continue during COVID-19 outbreaks. This study aimed to identify immediate areas for system strengthening by comparing the delivery of elective cancer surgery during the COVID-19 pandemic in periods of lockdown versus light restriction. Methods This international, prospective, cohort study enrolled 20 006 adult (≥18 years) patients from 466 hospitals in 61 countries with 15 cancer types, who had a decision for curative surgery during the COVID-19 pandemic and were followed up until the point of surgery or cessation of follow-up (Aug 31, 2020). Average national Oxford COVID-19 Stringency Index scores were calculated to define the government response to COVID-19 for each patient for the period they awaited surgery, and classified into light restrictions (index <20), moderate lockdowns (20–60), and full lockdowns (>60). The primary outcome was the non-operation rate (defined as the proportion of patients who did not undergo planned surgery). Cox proportional-hazards regression models were used to explore the associations between lockdowns and non-operation. Intervals from diagnosis to surgery were compared across COVID-19 government response index groups. This study was registered at ClinicalTrials.gov, NCT04384926. Findings Of eligible patients awaiting surgery, 2003 (10·0%) of 20 006 did not receive surgery after a median follow-up of 23 weeks (IQR 16–30), all of whom had a COVID-19-related reason given for non-operation. Light restrictions were associated with a 0·6% non-operation rate (26 of 4521), moderate lockdowns with a 5·5% rate (201 of 3646; adjusted hazard ratio [HR] 0·81, 95% CI 0·77–0·84; p<0·0001), and full lockdowns with a 15·0% rate (1775 of 11 827; HR 0·51, 0·50–0·53; p<0·0001).
In sensitivity analyses, including adjustment for SARS-CoV-2 case notification rates, moderate lockdowns (HR 0·84, 95% CI 0·80–0·88; p<0·001), and full lockdowns (0·57, 0·54–0·60; p<0·001), remained independently associated with non-operation. Surgery beyond 12 weeks from diagnosis in patients without neoadjuvant therapy increased during lockdowns (374 [9·1%] of 4521 in light restrictions, 317 [10·4%] of 3646 in moderate lockdowns, 2001 [23·8%] of 11 827 in full lockdowns), although there were no differences in resectability rates observed with longer delays. Interpretation Cancer surgery systems worldwide were fragile to lockdowns, with one in seven patients who were in regions with full lockdowns not undergoing planned surgery and experiencing longer preoperative delays. Although short-term oncological outcomes were not compromised in those selected for surgery, delays and non-operations might lead to long-term reductions in survival. During current and future periods of societal restriction, the resilience of elective surgery systems requires strengthening, which might include...
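The grouping rule and primary outcome described in the Methods can be illustrated with a short sketch. It assumes that the moderate band includes both endpoints (20 and 60), as the stated ranges suggest; the function names are hypothetical and this is not the study's analysis code. The counts are those quoted in the Findings.

```python
def classify_stringency(index: float) -> str:
    """Map an average Oxford COVID-19 Stringency Index score to a response group.

    Assumption: scores of exactly 20 and exactly 60 fall in the moderate band.
    """
    if index < 20:
        return "light restrictions"
    if index <= 60:
        return "moderate lockdowns"
    return "full lockdowns"


def non_operation_rate(not_operated: int, total: int) -> float:
    """Percentage of patients who did not undergo planned surgery."""
    return 100.0 * not_operated / total


# (patients without surgery, patients in group), as reported in the abstract.
groups = {
    "light restrictions": (26, 4521),
    "moderate lockdowns": (201, 3646),
    "full lockdowns": (1775, 11827),
}

for name, (not_operated, total) in groups.items():
    print(f"{name}: {non_operation_rate(not_operated, total):.1f}%")
```

The printed rates reproduce the unadjusted percentages in the abstract; the hazard ratios come from Cox regression models and are not recomputed here.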
SARS-CoV-2 has a zoonotic origin and was transmitted to humans via an undetermined intermediate host, leading to infections in humans and other mammals. To enter host cells, the viral spike protein (S-protein) binds to its receptor, ACE2, and is then processed by TMPRSS2. Whilst receptor binding contributes to the viral host range, S-protein:ACE2 complexes from other animals have not been investigated widely. To predict infection risks, we modelled S-protein:ACE2 complexes from 215 vertebrate species, calculated their relative energies, correlated these energies to COVID-19 infection data, and analysed structural interactions. We predict that known mutations are more detrimental in ACE2 than TMPRSS2. Finally, we demonstrate phylogenetically that human SARS-CoV-2 strains have been isolated in animals. Our results suggest that SARS-CoV-2 can infect a broad range of mammals, but not fish, birds or reptiles. Susceptible animals could serve as reservoirs of the virus, necessitating careful ongoing animal management and surveillance.
Gene3D (http://gene3d.biochem.ucl.ac.uk) is a database of domain annotations of Ensembl and UniProtKB protein sequences. Domains are predicted using a library of profile HMMs representing 2737 CATH superfamilies. Gene3D has previously featured in the Database issue of NAR and here we report updates to the website and database. The current Gene3D (v14) release has expanded its domain assignments to ∼20 000 cellular genomes and over 43 million unique protein sequences, more than doubling the number of protein sequences since our last publication. Amongst other updates, we have improved our Functional Family annotation method. We have also improved the quality and coverage of our 3D homology modelling pipeline of predicted CATH domains. Additionally, the structural models have been expanded to include an extra model organism (Drosophila melanogaster). We also document a number of additional visualization tools in the Gene3D website.
Salinity threat is estimated to reduce global rice production by 50%. Comprehensive analysis of the physiological and metabolite changes in rice plants under salinity stress (i.e. tolerant versus susceptible plants) is important to combat higher salinity conditions. In this study, we screened a total of 92 genotypes and selected the most salinity-tolerant line (SS1-14) and the most susceptible line (SS2-18) to conduct comparative physiological and metabolome inspections. We demonstrated that the tolerant line managed to maintain its water and chlorophyll content with a lower incidence of sodium ion accumulation. We also examined the antioxidant activities of these lines: production of ascorbate peroxidase (APX) and catalase (CAT) was significantly higher in the sensitive line, while superoxide dismutase (SOD) was higher in the tolerant line. Partial least squares discriminant analysis (PLS-DA) score plots show significantly different responses in the two lines after exposure to salinity stress. In the tolerant line, there was an upregulation of non-polar metabolites and production of sucrose, GABA and acetic acid, suggesting an important role in salinity adaptation. In contrast, glutamine and putrescine were noticeably high in the susceptible rice. The coordination of different strategies in the tolerant and susceptible lines shows that they responded differently after exposure to salt stress. These findings can assist crop development in terms of developing tolerance mechanisms for rice crops.
Deep-learning (DL) methods like DeepMind’s AlphaFold2 (AF2) have led to substantial improvements in protein structure prediction. We analyse confident AF2 models from 21 model organisms using a new classification protocol (CATH-Assign) which exploits novel DL methods for structural comparison and classification. Of ~370,000 confident models, 92% can be assigned to 3253 superfamilies in our CATH domain superfamily classification. The remaining models cluster into 2367 putative novel superfamilies. Detailed manual analysis of 618 of these, each having at least one human relative, reveals extremely remote homologies and further unusual features. Only 25 novel superfamilies could be confirmed. Although most models map to existing superfamilies, AF2 domains expand CATH by 67% and increase the number of unique ‘global’ folds by 36%, and will provide valuable insights on structure-function relationships. CATH-Assign will harness the huge expansion in structural data provided by DeepMind to rationalise evolutionary changes driving functional divergence.
Computational modelling of proteins has been a major catalyst in structural biology. Bioinformatics groups have exploited the repositories of known structures to predict high-quality structural models with high efficiency at low cost. This article provides an overview of comparative modelling, reviews recent developments and describes resources dedicated to large-scale comparative modelling of genome sequences. The value of subclustering protein domain superfamilies to guide the template-selection process is investigated. Some recent cases in which structural modelling has aided experimental work to determine very large macromolecular complexes are also cited.